
Proceedings of the conference. Association for Computational Linguistics. Meeting: Latest Publications

EchoGen: A New Benchmark Study on Generating Conclusions from Echocardiogram Notes.
Pub Date : 2022-05-01 DOI: 10.18653/v1/2022.bionlp-1.35
Liyan Tang, Shravan Kooragayalu, Yanshan Wang, Ying Ding, Greg Durrett, Justin F Rousseau, Yifan Peng

Generating a summary from findings has recently been explored (Zhang et al., 2018, 2020) in note types such as radiology reports, which are typically short. In this work, we focus on echocardiogram notes, which are longer and more complex than previously studied note types. We formally define the task of echocardiography conclusion generation (EchoGen) as generating a conclusion given the findings section, with emphasis on key cardiac findings. To promote the development of EchoGen methods, we present a new benchmark, which consists of two datasets collected from two hospitals. We further compare both standard and state-of-the-art methods on this new benchmark, with an emphasis on factual consistency. To accomplish this, we develop a tool to automatically extract concept-attribute tuples from the text. We then propose an evaluation metric, FactComp, to compare concept-attribute tuples between the human reference and generated conclusions. Both automatic and human evaluations show that there is still a significant gap between human-written and machine-generated conclusions on echo reports in terms of factuality and overall quality.
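The FactComp tool itself is not distributed with this abstract. As a minimal sketch, assuming concept-attribute tuples have already been extracted from both the reference and the generated conclusion, a tuple-overlap comparison of the kind the metric describes could look like the following; the example tuples and the exact matching rules are illustrative assumptions, not the paper's implementation.

```python
def tuple_f1(reference_tuples, generated_tuples):
    """Compare concept-attribute tuples between a reference and a generated
    conclusion via set overlap. Illustrative approximation, not FactComp."""
    ref = set(reference_tuples)
    gen = set(generated_tuples)
    if not ref or not gen:
        return 0.0
    true_positives = len(ref & gen)
    precision = true_positives / len(gen)
    recall = true_positives / len(ref)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Hypothetical tuples extracted from an echo report conclusion.
reference = [("left ventricle", "normal size"), ("ejection fraction", "55%")]
generated = [("left ventricle", "normal size"), ("ejection fraction", "40%")]
print(tuple_f1(reference, generated))  # 0.5: only one of two facts is preserved
```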

Citations: 3
Evaluating Biomedical Word Embeddings for Vocabulary Alignment at Scale in the UMLS Metathesaurus Using Siamese Networks.
Pub Date : 2022-05-01 DOI: 10.18653/v1/2022.insights-1.11
Goonmeet Bajaj, Vinh Nguyen, Thilini Wijesiriwardene, Hong Yung Yip, Vishesh Javangula, Srinivasan Parthasarathy, Amit Sheth, Olivier Bodenreider

Recent work uses a Siamese Network, initialized with BioWordVec embeddings (distributed word embeddings), for predicting synonymy among biomedical terms to automate a part of the UMLS (Unified Medical Language System) Metathesaurus construction process. We evaluate the use of contextualized word embeddings extracted from nine different biomedical BERT-based models for synonymy prediction in the UMLS by replacing BioWordVec embeddings with embeddings extracted from each biomedical BERT model using different feature extraction methods. Surprisingly, we find that Siamese Networks initialized with BioWordVec embeddings still outperform Siamese Networks initialized with embeddings extracted from the biomedical BERT models.
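As a rough sketch of the architecture family being compared, assuming pre-computed term vectors (for example, averaged BioWordVec or BERT-derived embeddings) as inputs, a Siamese synonymy classifier could be wired up as below; the layer sizes, the cosine-similarity head, and the 200-dimensional input are assumptions rather than the paper's configuration.

```python
import torch
import torch.nn as nn

class SiameseSynonymyNet(nn.Module):
    """Shared encoder applied to two term embeddings; their similarity is
    mapped to a synonym / non-synonym prediction. Illustrative only."""
    def __init__(self, embedding_dim=200, hidden_dim=128):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(embedding_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim),
        )
        self.classifier = nn.Linear(1, 2)  # similarity -> {non-synonym, synonym}

    def forward(self, term_a, term_b):
        encoded_a = self.encoder(term_a)
        encoded_b = self.encoder(term_b)
        similarity = nn.functional.cosine_similarity(encoded_a, encoded_b).unsqueeze(-1)
        return self.classifier(similarity)

# Two batches of hypothetical 200-dimensional term vectors.
model = SiameseSynonymyNet()
term_a = torch.randn(4, 200)
term_b = torch.randn(4, 200)
print(model(term_a, term_b).shape)  # torch.Size([4, 2])
```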

Citations: 4
GPT-D: Inducing Dementia-related Linguistic Anomalies by Deliberate Degradation of Artificial Neural Language Models.
Pub Date : 2022-05-01 DOI: 10.18653/v1/2022.acl-long.131
Changye Li, David Knopman, Weizhe Xu, Trevor Cohen, Serguei Pakhomov

Deep learning (DL) techniques involving fine-tuning large numbers of model parameters have delivered impressive performance on the task of discriminating between language produced by cognitively healthy individuals, and those with Alzheimer's disease (AD). However, questions remain about their ability to generalize beyond the small reference sets that are publicly available for research. As an alternative to fitting model parameters directly, we propose a novel method by which a Transformer DL model (GPT-2) pre-trained on general English text is paired with an artificially degraded version of itself (GPT-D), to compute the ratio between these two models' perplexities on language from cognitively healthy and impaired individuals. This technique approaches state-of-the-art performance on text data from a widely used "Cookie Theft" picture description task, and unlike established alternatives also generalizes well to spontaneous conversations. Furthermore, GPT-D generates text with characteristics known to be associated with AD, demonstrating the induction of dementia-related linguistic anomalies. Our study is a step toward better understanding of the relationships between the inner workings of generative neural language models, the language that they produce, and the deleterious effects of dementia on human speech and language characteristics.
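The degradation procedure that produces GPT-D is not reproduced here. The sketch below only illustrates the perplexity-ratio computation on a single picture-description transcript using the Hugging Face transformers library; the same stock GPT-2 checkpoint stands in for both the intact and the degraded model purely so the example runs, which is of course not the paper's setup.

```python
import math
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

def perplexity(model, tokenizer, text):
    """Token-level perplexity of `text` under a causal language model."""
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        loss = model(**inputs, labels=inputs["input_ids"]).loss
    return math.exp(loss.item())

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
control_model = GPT2LMHeadModel.from_pretrained("gpt2")
# In the paper the second model is a deliberately degraded copy (GPT-D);
# here the same checkpoint is reused only to keep the example runnable.
degraded_model = GPT2LMHeadModel.from_pretrained("gpt2")

transcript = "The boy is on the stool reaching for the cookie jar."
ratio = perplexity(control_model, tokenizer, transcript) / perplexity(
    degraded_model, tokenizer, transcript
)
print(ratio)  # values away from 1.0 would signal divergent model behaviour
```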

Citations: 0
Evaluating Factuality in Text Simplification
Pub Date : 2022-04-15 DOI: 10.48550/arXiv.2204.07562
Ashwin Devaraj, William Sheffield, Byron C. Wallace, Junyi Jessy Li
Automated simplification models aim to make input texts more readable. Such methods have the potential to make complex information accessible to a wider audience, e.g., providing access to recent medical literature which might otherwise be impenetrable for a lay reader. However, such models risk introducing errors into automatically simplified texts, for instance by inserting statements unsupported by the corresponding original text, or by omitting key information. Providing more readable but inaccurate versions of texts may in many cases be worse than providing no such access at all. The problem of factual accuracy (and the lack thereof) has received heightened attention in the context of summarization models, but the factuality of automatically simplified texts has not been investigated. We introduce a taxonomy of errors that we use to analyze both references drawn from standard simplification datasets and state-of-the-art model outputs. We find that errors often appear in both and are not captured by existing evaluation metrics, motivating the need for research into ensuring the factual accuracy of automated simplification models.
Citations: 28
Predicting pragmatic discourse features in the language of adults with autism spectrum disorder.
Pub Date : 2021-08-01 DOI: 10.18653/v1/2021.acl-srw.29
Christine Yang, Duanchen Liu, Qingyun Yang, Zoey Liu, Emily Prud'hommeaux

Individuals with autism spectrum disorder (ASD) experience difficulties in social aspects of communication, but the linguistic characteristics associated with deficits in discourse and pragmatic expression are often difficult to precisely identify and quantify. We are currently collecting a corpus of transcribed natural conversations produced in an experimental setting in which participants with and without ASD complete a number of collaborative tasks with their neurotypical peers. Using this dyadic conversational data, we investigate three pragmatic features - politeness, uncertainty, and informativeness - and present a dataset of utterances annotated for each of these features on a three-point scale. We then introduce ongoing work in developing and training neural models to automatically predict these features, with the goal of identifying the same between-groups differences that are observed using manual annotations. We find that the best-performing model for all three features is a feedforward neural network trained with BERT embeddings. Our models yield higher accuracy than those used in previous approaches for deriving these features, with F1 exceeding 0.82 for all three pragmatic features.
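As a minimal sketch of the model type named above (a feedforward network over BERT embeddings), assuming the utterance is represented by the [CLS] vector of bert-base-uncased and rated on a three-point scale for a single feature such as politeness; the pooling choice, layer sizes, and example utterance are assumptions, not the authors' configuration.

```python
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class PragmaticFeatureClassifier(nn.Module):
    """Feedforward head over a [CLS] BERT embedding, predicting one
    three-way pragmatic rating. Illustrative only."""
    def __init__(self, hidden_dim=256, num_classes=3):
        super().__init__()
        self.head = nn.Sequential(
            nn.Linear(768, hidden_dim),
            nn.ReLU(),
            nn.Dropout(0.1),
            nn.Linear(hidden_dim, num_classes),
        )

    def forward(self, cls_embedding):
        return self.head(cls_embedding)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
bert = AutoModel.from_pretrained("bert-base-uncased")
classifier = PragmaticFeatureClassifier()

utterance = "Could you maybe pass me that piece, please?"
inputs = tokenizer(utterance, return_tensors="pt")
with torch.no_grad():
    cls_embedding = bert(**inputs).last_hidden_state[:, 0, :]  # (1, 768)
print(classifier(cls_embedding).shape)  # torch.Size([1, 3])
```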

Citations: 2
Robust Knowledge Graph Completion with Stacked Convolutions and a Student Re-Ranking Network.
Pub Date : 2021-08-01 DOI: 10.18653/v1/2021.acl-long.82
Justin Lovelace, Denis Newman-Griffis, Shikhar Vashishth, Jill Fain Lehman, Carolyn Penstein Rosé

Knowledge Graph (KG) completion research usually focuses on densely connected benchmark datasets that are not representative of real KGs. We curate two KG datasets that include biomedical and encyclopedic knowledge and use an existing commonsense KG dataset to explore KG completion in the more realistic setting where dense connectivity is not guaranteed. We develop a deep convolutional network that utilizes textual entity representations and demonstrate that our model outperforms recent KG completion methods in this challenging setting. We find that our model's performance improvements stem primarily from its robustness to sparsity. We then distill the knowledge from the convolutional network into a student network that re-ranks promising candidate entities. This re-ranking stage leads to further improvements in performance and demonstrates the effectiveness of entity re-ranking for KG completion.
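Neither the convolutional scorer nor the student network is shown here; the sketch below only illustrates the two-stage ranking pattern the abstract describes, with placeholder scoring functions standing in for both models and random scores over hypothetical entity identifiers.

```python
import numpy as np

def rerank_candidates(query, entities, first_stage_score, student_score, k=10):
    """Two-stage KG completion: score every candidate entity with a cheap
    first-stage model, keep the top-k, then let a (presumably stronger)
    student model re-rank only that shortlist. Both scorers are placeholders."""
    initial = [(entity, first_stage_score(query, entity)) for entity in entities]
    shortlist = sorted(initial, key=lambda pair: pair[1], reverse=True)[:k]
    reranked = sorted(
        ((entity, student_score(query, entity)) for entity, _ in shortlist),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return reranked

# Toy stand-ins: random scores over hypothetical entity identifiers.
rng = np.random.default_rng(0)

def cheap_score(query, entity):
    return rng.random()

def strong_score(query, entity):
    return rng.random()

candidates = [f"entity_{i}" for i in range(1000)]
print(rerank_candidates(("aspirin", "treats", "?"), candidates, cheap_score, strong_score, k=5))
```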

Citations: 16
Topic-Based Measures of Conversation for Detecting Mild Cognitive Impairment.
Pub Date : 2020-07-01
Liu Chen, Hiroko H Dodge, Meysam Asgari

Conversation is a complex cognitive task that engages multiple aspects of cognitive functions to remember the discussed topics, monitor the semantic and linguistic elements, and recognize others' emotions. In this paper, we propose a computational method based on the lexical coherence of consecutive utterances to quantify topical variations in semi-structured conversations of older adults with cognitive impairments. Extracting the lexical knowledge of conversational utterances, our method generates a set of novel conversational measures that indicate underlying cognitive deficits among subjects with mild cognitive impairment (MCI). Our preliminary results verify the utility of the proposed conversation-based measures in distinguishing MCI from healthy controls.
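The paper's conversational measures are only summarized above. The sketch below shows one plausible reading of "lexical coherence of consecutive utterances", namely cosine similarity between bag-of-words vectors of adjacent turns; this particular formula and the toy conversation are assumptions rather than the authors' exact measures.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def consecutive_coherence(utterances):
    """Cosine similarity between bag-of-words vectors of adjacent utterances;
    a drop in the sequence of values would suggest an abrupt topic shift."""
    vectors = CountVectorizer().fit_transform(utterances)
    return [
        float(cosine_similarity(vectors[i], vectors[i + 1])[0, 0])
        for i in range(vectors.shape[0] - 1)
    ]

conversation = [
    "We went to the farmers market on Saturday morning.",
    "The market had fresh tomatoes and corn this week.",
    "My grandson just started his second year of college.",
]
print(consecutive_coherence(conversation))  # lower second value: topic change
```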

Citations: 0
Benchmark and Best Practices for Biomedical Knowledge Graph Embeddings.
Pub Date : 2020-07-01 DOI: 10.18653/v1/2020.bionlp-1.18
David Chang, Ivana Balažević, Carl Allen, Daniel Chawla, Cynthia Brandt, Richard Andrew Taylor

Much of biomedical and healthcare data is encoded in discrete, symbolic form such as text and medical codes. There is a wealth of expert-curated biomedical domain knowledge stored in knowledge bases and ontologies, but the lack of reliable methods for learning knowledge representation has limited their usefulness in machine learning applications. While text-based representation learning has significantly improved in recent years through advances in natural language processing, attempts to learn biomedical concept embeddings so far have been lacking. A recent family of models called knowledge graph embeddings have shown promising results on general domain knowledge graphs, and we explore their capabilities in the biomedical domain. We train several state-of-the-art knowledge graph embedding models on the SNOMED-CT knowledge graph, provide a benchmark with comparison to existing methods and in-depth discussion on best practices, and make a case for the importance of leveraging the multi-relational nature of knowledge graphs for learning biomedical knowledge representation. The embeddings, code, and materials will be made available to the community.
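The benchmark itself and the SNOMED-CT graph cannot be reproduced here. As a minimal illustration of what a knowledge graph embedding model optimizes, the sketch below implements the TransE scoring function on toy triples; treating TransE as one of the evaluated models, and the toy concept identifiers, are assumptions.

```python
import torch
import torch.nn as nn

class TransE(nn.Module):
    """TransE scores a triple (h, r, t) by -||h + r - t||; a higher (less
    negative) score means a more plausible triple. Toy illustration only."""
    def __init__(self, num_entities, num_relations, dim=50):
        super().__init__()
        self.entity = nn.Embedding(num_entities, dim)
        self.relation = nn.Embedding(num_relations, dim)

    def forward(self, heads, relations, tails):
        h = self.entity(heads)
        r = self.relation(relations)
        t = self.entity(tails)
        return -torch.norm(h + r - t, p=2, dim=-1)

# Hypothetical concept/relation ids standing in for SNOMED-CT identifiers.
model = TransE(num_entities=100, num_relations=10)
score = model(torch.tensor([3]), torch.tensor([1]), torch.tensor([42]))
print(score)  # higher (less negative) score = more plausible triple
```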

Citations: 24
Towards End-2-end Learning for Predicting Behavior Codes from Spoken Utterances in Psychotherapy Conversations.
Pub Date : 2020-07-01 DOI: 10.18653/v1/2020.acl-main.351
Karan Singla, Zhuohao Chen, David C Atkins, Shrikanth Narayanan

Spoken language understanding tasks usually rely on pipelines involving complex processing blocks such as voice activity detection, speaker diarization, and automatic speech recognition (ASR). We propose a novel framework for predicting utterance-level labels directly from speech features, thus removing the dependency on first generating transcripts and enabling transcription-free behavioral coding. Our classifier uses a pretrained Speech-2-Vector encoder as a bottleneck to generate word-level representations from speech features. This pretrained encoder learns to encode speech features for a word using an objective similar to Word2Vec. Our proposed approach uses only speech features and word segmentation information to predict spoken utterance-level target labels. We show that our model achieves results competitive with other state-of-the-art approaches that use transcribed text for the task of predicting psychotherapy-relevant behavior codes.
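The Speech-2-Vector encoder is not available from this abstract. The sketch below only mirrors the overall shape of the pipeline, with a single linear layer standing in for the pretrained bottleneck, mean pooling over acoustic frames per word, and a small classifier over the pooled utterance; every layer size and the number of behavior codes are assumptions.

```python
import torch
import torch.nn as nn

class UtteranceLabelClassifier(nn.Module):
    """Word-level speech representations -> utterance-level behavior code.
    The 'encoder' here is a stand-in for a pretrained Speech-2-Vector
    bottleneck; the real encoder is trained with a Word2Vec-like objective."""
    def __init__(self, feature_dim=40, word_dim=128, num_codes=4):
        super().__init__()
        self.encoder = nn.Linear(feature_dim, word_dim)  # stand-in bottleneck
        self.classifier = nn.Sequential(
            nn.Linear(word_dim, 64), nn.ReLU(), nn.Linear(64, num_codes)
        )

    def forward(self, word_frames):
        # word_frames: (num_words, frames_per_word, feature_dim)
        word_vectors = self.encoder(word_frames).mean(dim=1)  # one vector per word
        utterance_vector = word_vectors.mean(dim=0)           # pool over words
        return self.classifier(utterance_vector)

model = UtteranceLabelClassifier()
frames = torch.randn(7, 30, 40)  # 7 words, 30 acoustic frames each, 40-dim features
print(model(frames).shape)  # torch.Size([4])
```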

Citations: 0
Towards an ontology-based medication conversational agent for PrEP and PEP.
Pub Date : 2020-07-01 DOI: 10.18653/v1/2020.nlpmc-1.5
Muhammad Tuan Amith, Licong Cui, Kirk Roberts, Cui Tao

HIV (human immunodeficiency virus) can damage a human's immune system and cause Acquired Immunodeficiency Syndrome (AIDS), which can lead to severe outcomes, including death. While HIV infections have decreased over the last decade, a significant population is still affected. PrEP and PEP are two proven preventive measures that involve periodic dosing to prevent the onset of HIV infection. However, adherence rates for these medications are low, in part due to a lack of information about them. Several communication barriers prevent patient-provider communication from taking place. In this work, we present our ontology-based method for automating communication about these medications, which can be deployed in live conversational agents for PrEP and PEP. This method supports a model of automated conversation between the machine and the user and can also answer relevant questions.

Citations: 7