Identifying Medical Concepts and Semantic Types in Lay Vocabularies of Health Consumers Who are Concerned with Diabetes on Social Media Using the UMLS and NLP.
Adib Ahmed Anik, Paramita Basak Upama, Masud Rabbani, Shiyu Tian, Min Sook Park, Sheikh Iqbal Ahamed, Jake Luo, Hyunkyoung Oh
Proceedings: Annual International Computer Software and Applications Conference (COMPSAC), vol. 2024, pp. 862-869. Published 2024-07-01 (Epub 2024-08-26). DOI: https://doi.org/10.1109/compsac61105.2024.00119. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11619756/pdf/
Citations: 0
Abstract
This study proposes a way to use an existing medical ontology and natural language processing techniques to extract major medical concepts from the lay vocabulary of health consumers on social media and to group them by the semantic types defined in the ontology. Diabetes-related discussions on Tumblr were used to test the efficiency of spaCy and the Markov-Viterbi algorithm in mapping lay medical terms to the medical concepts defined in the UMLS. The system discussed in this paper can better analyze free text, handle word ambiguity, and extract lifestyle indicators from the daily-life discussions of people with diabetes on Tumblr. The findings of this study can contribute to developing health applications that track the health behavior of people living with chronic conditions such as diabetes. The approach can also assist researchers who process the lay language of health consumers in order to understand their health behavior.
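The abstract does not spell out how the Markov-Viterbi step resolves word ambiguity, so the following is only a minimal sketch of one plausible reading: Viterbi decoding over a hidden Markov model whose states are semantic-type labels. The label names, the probability tables, the `FLOOR` constant, and the `viterbi` function are all hypothetical and invented for illustration; they are not taken from the paper or from the UMLS.

```python
import math

# Illustrative semantic-type labels (assumption, not the paper's label set).
STATES = ["Finding", "Food", "Other"]

# Toy probabilities, invented for this example only.
START = {"Finding": 0.25, "Food": 0.35, "Other": 0.40}
TRANS = {
    "Finding": {"Finding": 0.5, "Food": 0.1, "Other": 0.4},
    "Food":    {"Finding": 0.1, "Food": 0.5, "Other": 0.4},
    "Other":   {"Finding": 0.3, "Food": 0.3, "Other": 0.4},
}
EMIT = {
    "Finding": {"sugar": 0.5, "spiked": 0.5},
    "Food":    {"sugar": 0.5, "cookies": 0.5},
    "Other":   {"after": 0.5, "my": 0.5},
}
FLOOR = 1e-6  # small probability for words a state has never emitted


def viterbi(tokens):
    """Return the most likely label sequence for tokens under the toy HMM."""
    if not tokens:
        return []

    def emit(state, token):
        return math.log(EMIT[state].get(token, FLOOR))

    # scores[t][s]: best log-probability of any path ending in state s at step t
    scores = [{s: math.log(START[s]) + emit(s, tokens[0]) for s in STATES}]
    backpointers = []
    for token in tokens[1:]:
        prev = scores[-1]
        current, pointer = {}, {}
        for s in STATES:
            # Pick the predecessor that maximizes the path score into s.
            best = max(STATES, key=lambda p: prev[p] + math.log(TRANS[p][s]))
            current[s] = prev[best] + math.log(TRANS[best][s]) + emit(s, token)
            pointer[s] = best
        scores.append(current)
        backpointers.append(pointer)

    # Trace back from the best final state.
    path = [max(STATES, key=lambda s: scores[-1][s])]
    for pointer in reversed(backpointers):
        path.append(pointer[path[-1]])
    return path[::-1]


if __name__ == "__main__":
    print(viterbi(["sugar"]))
    print(viterbi(["sugar", "spiked", "after", "cookies"]))
```

With these toy tables, "sugar" on its own is labeled as a food, while in "sugar spiked after cookies" the neighboring word "spiked" pulls it toward the finding label. That is the kind of context-driven disambiguation the abstract describes; a real system would estimate the tables from annotated data and draw candidate concepts from the UMLS.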