
Proceedings of the Fourth Social Media Mining for Health Applications (#SMM4H) Workshop & Shared Task: Latest Papers

Transfer Learning for Health-related Twitter Data
A. Dirkson, S. Verberne
Transfer learning is promising for many NLP applications, especially in tasks with limited labeled data. This paper describes the methods developed by team TMRLeiden for the 2019 Social Media Mining for Health Applications (SMM4H) Shared Task. Our methods use state-of-the-art transfer learning methods to classify, extract and normalise adverse drug effects (ADRs) and to classify personal health mentions from health-related tweets. The code and fine-tuned models are publicly available.
DOI: https://doi.org/10.18653/v1/W19-3212
Citations: 14
Extracting Kinship from Obituary to Enhance Electronic Health Records for Genetic Research
Kai He, Jialun Wu, Xiaoyong Ma, Chong Zhang, Ming Huang, Chen Li, Lixia Yao
Claims databases and electronic health record databases do not usually capture kinship or family relationship information, which is imperative for genetic research. We identify online obituaries as a new data source and propose a special named entity recognition and relation extraction solution to extract names and kinships from online obituaries. Built on 1,809 annotated obituaries and a novel tagging scheme, our joint neural model achieved macro-averaged precision, recall and F-measure of 72.69%, 78.54% and 74.93%, and micro-averaged precision, recall and F-measure of 95.74%, 98.25% and 96.98%, using 57 kinships with 10 or more examples in a 10-fold cross-validation experiment. The model performance improved dramatically when trained with 34 kinships with 50 or more examples. Leveraging additional information such as age, death date, birth date and residence mentioned in obituaries, we foresee a promising future of supplementing EHR databases with comprehensive and accurate kinship information for genetic research.
DOI: https://doi.org/10.18653/v1/W19-3201
Citations: 12
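The gap between the macro-averaged and micro-averaged scores reported in the kinship abstract above is typical of label sets with a few frequent classes and many rare ones. The sketch below illustrates the two averaging schemes in a simplified single-label setting; it is not the authors' evaluation code, and the labels, data and function names are invented for illustration.

```python
from collections import Counter

def per_label_counts(gold, pred):
    """Count true positives, false positives and false negatives per label."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # the predicted label gains a false positive
            fn[g] += 1  # the gold label gains a false negative
    return tp, fp, fn

def f1(tp, fp, fn):
    """F1 from raw counts; taken as 0.0 when there are no true positives."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def macro_micro_f1(gold, pred):
    """Return (macro F1, micro F1) over all labels seen in gold or pred."""
    tp, fp, fn = per_label_counts(gold, pred)
    labels = sorted(set(gold) | set(pred))
    # Macro: average the per-label F1, so every label weighs the same.
    macro = sum(f1(tp[l], fp[l], fn[l]) for l in labels) / len(labels)
    # Micro: pool the counts first, so frequent labels dominate.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    return macro, micro

# Toy data (hypothetical): one frequent label and two rare ones that
# the classifier never predicts.
gold = ["son"] * 8 + ["niece", "godfather"]
pred = ["son"] * 10
macro, micro = macro_micro_f1(gold, pred)
print(f"macro F1 = {macro:.3f}, micro F1 = {micro:.3f}")
```

In this toy case the rare labels pull the macro average far below the micro average, which is the same direction as the difference reported in the abstract.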
Detecting and Extracting of Adverse Drug Reaction Mentioning Tweets with Multi-Head Self Attention
Suyu Ge, Tao Qi, Chuhan Wu, Yongfeng Huang
This paper describes our system for the first and second shared tasks of the fourth Social Media Mining for Health Applications (SMM4H) workshop. We enhance tweet representation with a language model and distinguish the importance of different words with Multi-Head Self-Attention. In addition, transfer learning is exploited to make up for the data shortage. Our system achieved competitive results on both tasks with an F1-score of 0.5718 for task 1 and 0.653 (overlap) / 0.357 (strict) for task 2.
DOI: https://doi.org/10.18653/v1/W19-3214
Citations: 5
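The abstract above reports two F1 scores for task 2, one under overlapping and one under strict matching. As a rough sketch of why the two can differ so much, the code below scores predicted character spans under both criteria. The span representation (half-open `(start, end)` offsets), the function names and the toy data are assumptions made for illustration; the official shared-task scorer may define matches differently.

```python
def spans_overlap(a, b):
    """True if half-open character spans (start, end) share at least one character."""
    return a[0] < b[1] and b[0] < a[1]

def span_f1(gold, pred, match):
    """F1 over spans, where match(g, p) decides whether p counts as a hit for g."""
    if not gold or not pred:
        return 0.0
    # Precision: share of predicted spans that match some gold span.
    matched_pred = sum(any(match(g, p) for g in gold) for p in pred)
    # Recall: share of gold spans that are matched by some prediction.
    matched_gold = sum(any(match(g, p) for p in pred) for g in gold)
    precision = matched_pred / len(pred)
    recall = matched_gold / len(gold)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Toy spans (hypothetical): one exact hit, one partial hit, one spurious span.
gold = [(10, 18), (30, 42)]
pred = [(10, 18), (33, 42), (50, 55)]

strict = span_f1(gold, pred, lambda g, p: g == p)   # exact boundaries required
overlap = span_f1(gold, pred, spans_overlap)        # any shared character counts
print(f"strict F1 = {strict:.2f}, overlap F1 = {overlap:.2f}")
```

A prediction with slightly wrong boundaries is a hit under the overlap criterion but a miss under the strict one, so strict scores are never higher than overlap scores for the same output.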
Using Machine Learning and Deep Learning Methods to Find Mentions of Adverse Drug Reactions in Social Media
Pilar López Úbeda, Manuel Carlos Díaz Galiano, Maite Martin, L. Ureña López
Social networks are becoming very popular platforms for sharing health-related information. Social Media Mining for Health Applications (SMM4H) provides tasks such as those described in this document to help manage information in the health domain. This document describes the first participation of the SINAI group. We study approaches based on machine learning and deep learning to extract adverse drug reaction mentions from Twitter. The results obtained in the tasks are encouraging: we are close to the average of all participants and, in some cases, above it.
DOI: https://doi.org/10.18653/v1/W19-3216
Citations: 6
Identifying Adverse Drug Events Mentions in Tweets Using Attentive, Collocated, and Aggregated Medical Representation
Xinyan Zhao, D. Yu, V.G.Vinod Vydiswaran
Identifying mentions of medical concepts in social media is challenging because of high variability in free text. In this paper, we propose a novel neural network architecture, the Collocated LSTM with Attentive Pooling and Aggregated representation (CLAPA), that integrates a bidirectional LSTM model with attention and pooling strategy and utilizes the collocation information from training data to improve the representation of medical concepts. The collocation and aggregation layers improve the model performance on the task of identifying mentions of adverse drug events (ADE) in tweets. Using the dataset made available as part of the workshop shared task, we show that careful selection of neighborhood contexts can help uncover useful local information and improve the overall medical concept representation.
DOI: https://doi.org/10.18653/v1/W19-3209
Citations: 4
Passive Diagnosis Incorporating the PHQ-4 for Depression and Anxiety
Fionn Delahunty, R. Johansson, Mihael Arcan
Depression and anxiety are the two most prevalent mental health disorders worldwide, impacting the lives of millions of people each year. In this work, we develop and evaluate a multilabel, multidimensional deep neural network designed to predict PHQ-4 scores based on individuals' written text. Our system outperforms random baseline metrics and provides a novel approach to predicting psychometric scores from written text. Additionally, we explore how this architecture can be applied to analyse social media data.
DOI: https://doi.org/10.18653/v1/W19-3205
Citations: 6
MIDAS@SMM4H-2019: Identifying Adverse Drug Reactions and Personal Health Experience Mentions from Twitter
Debanjan Mahata, Sarthak Anand, Haimin Zhang, Simra Shahid, Laiba Mehnaz, Yaman Kumar Singla, R. Shah
In this paper, we present our approach and the system description for the Social Media Mining for Health Applications (SMM4H) Shared Tasks 1, 2 and 4 (2019). Our main contribution is to show the effectiveness of transfer learning approaches like BERT and ULMFiT, and how they generalize to classification tasks such as identifying adverse drug reaction mentions and reports of personal health problems in tweets. We show the use of stacked embeddings combined with a BLSTM+CRF tagger for identifying spans mentioning adverse drug reactions in tweets. We also show that these approaches perform well even with imbalanced datasets, in comparison to undersampling and oversampling.
DOI: https://doi.org/10.18653/v1/W19-3223
Citations: 10
Detection of Adverse Drug Reaction Mentions in Tweets Using ELMo
S. Sarabadani
This paper describes the models used by our team in the SMM4H 2019 shared task. We submitted results for subtasks 1 and 2. For task 1, which aims to detect tweets with adverse drug reaction (ADR) mentions, we used ELMo embeddings, a deep contextualized word representation able to capture both syntactic and semantic characteristics. For task 2, which focuses on extraction of ADR mentions, the same architecture as in task 1 was first used to identify whether or not a tweet contains an ADR. Then, for tweets positively classified as mentioning an ADR, the relevant text span was identified by similarity matching with 3 different lexicon sets.
DOI: https://doi.org/10.18653/v1/W19-3221
Citations: 6
Adverse Drug Effect and Personalized Health Mentions, CLaC at SMM4H 2019, Tasks 1 and 4
Parsa Bagherzadeh, Nadia Sheikh, S. Bergler
CLaC labs participated in Tasks 1 and 4 of SMM4H 2019. We pursued two main objectives in our submission. First, we tried to use some textual features in a deep net framework; second, we tested the potential use of more than one word embedding. The results seem positively affected by the proposed architectures.
DOI: https://doi.org/10.18653/v1/W19-3222
Citations: 2
Overview of the Fourth Social Media Mining for Health (SMM4H) Shared Tasks at ACL 2019
D. Weissenbacher, A. Sarker, A. Magge, A. Daughton, K. O’Connor, Michael J. Paul, G. Gonzalez-Hernandez
The number of users of social media continues to grow, with nearly half of adults worldwide and two-thirds of all American adults using social networking. Advances in automated data processing, machine learning and NLP present the possibility of utilizing this massive data source for biomedical and public health applications, if researchers address the methodological challenges unique to this medium. We present the Social Media Mining for Health Shared Tasks, co-located with ACL in Florence in 2019, which address these challenges for health monitoring and surveillance, utilizing state-of-the-art techniques for processing noisy, real-world, and substantially creative language expressions from social media users. For the fourth execution of this challenge, we proposed four different tasks. Task 1 asked participants to distinguish tweets reporting an adverse drug reaction (ADR) from those that do not. Task 2, a follow-up to Task 1, asked participants to identify the span of text in tweets reporting ADRs. Task 3 is an end-to-end task where the goal was to first detect tweets mentioning an ADR and then map the extracted colloquial mentions of ADRs in the tweets to their corresponding standard concept IDs in the MedDRA vocabulary. Finally, Task 4 asked participants to classify whether a tweet contains a personal mention of one’s health, a more general discussion of the health issue, or is an unrelated mention. A total of 34 teams from around the world registered and 19 teams from 12 countries submitted a system run. We summarize here the corpora for this challenge, which are freely available at https://competitions.codalab.org/competitions/22521, and present an overview of the methods and the results of the competing systems.
DOI: https://doi.org/10.18653/v1/W19-3203
Citations: 85