Sawittree Jumpathong, Kanyanut Kriengket, P. Boonkwan, T. Supnithi
{"title":"Anomaly Detection in Lexical Definitions via One-Class Classification Techniques","authors":"Sawittree Jumpathong, Kanyanut Kriengket, P. Boonkwan, T. Supnithi","doi":"10.1109/iSAI-NLP54397.2021.9678166","DOIUrl":null,"url":null,"abstract":"It takes a long time to build vocabularies and their definitions because they must be approved only by the experts in the meeting of building vocabularies and the definitions are also unstructured. To save time, we applied three techniques of classification to the experiments that are one-class SVMs, isolation forests, and local outlier factors, and also observed how well the method can suggest word definition status via the accuracy. As a result, the local outlier factors obtained the highest accuracy when they used vectors that were produced by USE. They can recognize the boundary of the approved class better and there are several approved clusters and outliers are scattered among them. Also, it is found that the detected status of definitions is both identical and opposite to the reference one. For the patterns of definition writing, the approved definitions are always written in the logical order, and start with wide or general information, then is followed by specific details, examples, and references of English terms or examples. In case of the rejected definitions, they are not always written in the logical order, and their definition patterns are also various - only Thai translation, Thai translation with related entries, parts of speech (POS), Thai translation, related entries, and English term references followed by definitions, etc.","PeriodicalId":339826,"journal":{"name":"2021 16th International Joint Symposium on Artificial Intelligence and Natural Language Processing (iSAI-NLP)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-12-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 16th International Joint Symposium on Artificial Intelligence and Natural Language Processing (iSAI-NLP)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/iSAI-NLP54397.2021.9678166","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0
Abstract
It takes a long time to build vocabularies and their definitions because they must be approved only by the experts in the meeting of building vocabularies and the definitions are also unstructured. To save time, we applied three techniques of classification to the experiments that are one-class SVMs, isolation forests, and local outlier factors, and also observed how well the method can suggest word definition status via the accuracy. As a result, the local outlier factors obtained the highest accuracy when they used vectors that were produced by USE. They can recognize the boundary of the approved class better and there are several approved clusters and outliers are scattered among them. Also, it is found that the detected status of definitions is both identical and opposite to the reference one. For the patterns of definition writing, the approved definitions are always written in the logical order, and start with wide or general information, then is followed by specific details, examples, and references of English terms or examples. In case of the rejected definitions, they are not always written in the logical order, and their definition patterns are also various - only Thai translation, Thai translation with related entries, parts of speech (POS), Thai translation, related entries, and English term references followed by definitions, etc.