Kyoungsu Oh, Min Kang, SeoHyun Oh, Do-hyoung Kim, Seokhwan Kang, Youngho Lee
2022 13th International Conference on Information and Communication Technology Convergence (ICTC). Published 2022-10-19. DOI: 10.1109/ICTC55196.2022.9952819 (https://doi.org/10.1109/ICTC55196.2022.9952819)
AB-XLNet: Named Entity Recognition Tool for Health Information Technology Standardization
We conducted a study to identify drug-related information in non-standardized discharge summaries using a pre-trained BERT-based model. The dataset was tokenized and labeled with the IOB tagging scheme, and the pre-trained BERT was then trained on the training data with the Random Insert technique. As a result, the F1-score of AB-XLNet was 3% higher than that of XLNet, and ADE and Form, which XLNet could not extract, were extracted. Future research will focus on presenting a generalized model using large amounts of data from multiple institutions.
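The IOB tagging step mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the helper `iob_tags`, the example sentence, and the `Drug`, `Strength`, and `Frequency` labels are assumptions made for illustration (only `ADE` and `Form` are named in the abstract), and the sketch assumes entity spans are given as token indices with an exclusive end.

```python
from typing import List, Tuple


def iob_tags(tokens: List[str], spans: List[Tuple[int, int, str]]) -> List[str]:
    """Return one IOB tag per token.

    spans holds (start, end_exclusive, label) token indices; this span
    format is an assumption of the sketch, not taken from the paper.
    """
    tags = ["O"] * len(tokens)          # every token starts Outside any entity
    for start, end, label in spans:
        tags[start] = f"B-{label}"      # first token of the entity: Begin
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"      # remaining tokens of the entity: Inside
    return tags


# Hypothetical discharge-summary fragment, already tokenized.
tokens = ["Patient", "took", "aspirin", "81", "mg", "tablet", "daily"]
spans = [(2, 3, "Drug"), (3, 5, "Strength"), (5, 6, "Form"), (6, 7, "Frequency")]

tags = iob_tags(tokens, spans)
for token, tag in zip(tokens, tags):
    print(f"{token}\t{tag}")
```

Each token receives exactly one tag, so a token-classification model such as the one described above can be trained to predict the tag sequence directly.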