Lu Zhou, Chaohong Wu, Xi-Ting Wang, Shuangqiao Liu, Yizhuo Zhang, Yue-Meng Sun, Jian Cui, Caiyan Li, Hui-Min Yuan, Yan Sun, Feng-jie Zheng, Feng-qin Xu, Yuhang Li
World Journal of Traditional Chinese Medicine, 9 1, pp. 224-233. Published 2023-04-01. DOI: 10.4103/2311-8571.378171 (https://doi.org/10.4103/2311-8571.378171). Impact factor: 4.3; JCR Q1 (Integrative & Complementary Medicine).
Traditional Chinese medicine synonymous term conversion: A bidirectional encoder representations from transformers-based model for converting synonymous terms in traditional Chinese medicine
Background: Medical records in traditional Chinese medicine (TCM) contain many synonymous terms that describe the same concept in different words, which hinders computer-aided data mining of TCM. However, models for normalizing synonymous TCM terms are lacking. A synonymous term conversion (STC) model for normalizing these terms is therefore needed.

Methods: Four types of TCM STC models were designed on top of bidirectional encoder representations from transformers (BERT) neural networks: models based on BERT combined with text classification, text sequence generation, named entity recognition, and text matching. The best STC model was selected according to its performance in converting synonymous terms. In addition, three inconsistency-based misjudgment inspection methods were proposed to find incorrect term conversions in the STC model's output: random neuron deactivation, output comparison of multiple isomorphic models, and output comparison of multiple heterogeneous models (OCMH).

Results: The classification-based STC model outperformed the other STC task models, achieving F1 scores of 0.91, 0.91, and 0.83 on the symptom, pattern, and treatment STC tasks, respectively. The OCMH method performed best in misjudgment inspection, with wrong-conversion detection rates of 0.80, 0.84, and 0.90 in the term conversion results for symptoms, patterns, and treatments, respectively.

Conclusion: The classification-based TCM STC model achieved superior performance in converting synonymous terms for symptoms, patterns, and treatments. The OCMH-based misjudgment inspection method showed superior performance in identifying incorrect outputs.
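The inconsistency-based inspection described in the Methods can be illustrated with a minimal sketch. It is not the authors' implementation: the lookup-table "models", the toy English terms, the function names, and the "flag on any disagreement" rule are all assumptions made for illustration. In particular, the abstract does not define how the detection rate is computed; the sketch assumes it is the share of wrong conversions that the comparison flags.

```python
from typing import Callable, Dict, Sequence, Tuple

# Hypothetical stand-in for a trained STC model: maps a raw term to a
# normalized term. Real models would be BERT-based networks.
Model = Callable[[str], str]


def make_lookup_model(table: Dict[str, str], default: str = "UNKNOWN") -> Model:
    """Build a toy model from a lookup table (illustration only)."""
    return lambda term: table.get(term, default)


def inspect_conversion(
    term: str, primary: Model, others: Sequence[Model]
) -> Tuple[str, bool]:
    """Return the primary model's output and a 'suspect' flag.

    Assumed rule: the output is suspect if any comparison model disagrees.
    """
    predicted = primary(term)
    suspect = any(model(term) != predicted for model in others)
    return predicted, suspect


def flagged_share_of_errors(
    gold: Dict[str, str], primary: Model, others: Sequence[Model]
) -> float:
    """Assumed metric: fraction of wrong conversions that were flagged."""
    wrong = [t for t in gold if primary(t) != gold[t]]
    if not wrong:
        return 0.0
    flagged = sum(inspect_conversion(t, primary, others)[1] for t in wrong)
    return flagged / len(wrong)


# Toy data, invented for this sketch (not taken from the paper).
gold = {
    "head pain": "headache",
    "no appetite": "poor appetite",
    "cold limbs": "cold extremities",
}
primary = make_lookup_model(
    {"head pain": "headache", "no appetite": "poor appetite", "cold limbs": "fever"}
)
other_a = make_lookup_model(dict(gold))
other_b = make_lookup_model({"head pain": "headache", "no appetite": "poor appetite"})

result_ok = inspect_conversion("head pain", primary, [other_a, other_b])
result_bad = inspect_conversion("cold limbs", primary, [other_a, other_b])
rate = flagged_share_of_errors(gold, primary, [other_a, other_b])
```

Here the primary model's wrong conversion of "cold limbs" is flagged because the comparison models disagree with it, while the agreed-upon conversion of "head pain" passes. A stricter or looser rule (for example, majority voting) would trade missed errors against false alarms.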