Research on Semantic Consistency Problem Based on Word2vec
Hongman Wang, Shaoshun Kang
2019 2nd International Conference on Safety Produce Informatization (IICSPI), published 2019-11-01
DOI: 10.1109/IICSPI48186.2019.9096015 (https://doi.org/10.1109/IICSPI48186.2019.9096015)
Citations: 1
Abstract
To ensure that data with the same semantics but different expressions converges effectively when collaborative business data is aggregated, this paper studies the semantic consistency problem. First, a word2vec-based word vector model is trained to obtain word vectors for the data, and the cosine similarity between these word vectors is calculated. Next, clustering based on the cosine similarity yields a preliminary core word set. Finally, rule-based correction produces the final core word set. Experiments and analysis show that this method can effectively meet business requirements and achieve high accuracy.
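
The similarity and clustering steps described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption, not taken from the paper: the toy three-dimensional vectors stand in for trained word2vec vectors, the greedy threshold clustering is one simple choice among many, and the names `cosine_similarity`, `cluster_by_similarity`, and the `0.95` threshold are hypothetical. The rule-based correction step is not sketched, because the abstract does not describe the rules.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)


def cluster_by_similarity(vectors, threshold=0.9):
    """Greedy clustering (an assumed stand-in for the paper's method).

    Each word joins the first cluster whose core word (the cluster's
    first member) is at least `threshold` similar to it; otherwise the
    word starts a new cluster and becomes its core word.
    """
    clusters = []
    for word, vec in vectors.items():
        for cluster in clusters:
            core = cluster[0]
            if cosine_similarity(vec, vectors[core]) >= threshold:
                cluster.append(word)
                break
        else:
            clusters.append([word])
    return clusters


# Hypothetical toy vectors; real ones would come from a trained word2vec model.
toy_vectors = {
    "customer": [0.9, 0.1, 0.0],
    "client": [0.85, 0.15, 0.05],
    "invoice": [0.1, 0.9, 0.2],
    "bill": [0.12, 0.88, 0.25],
    "warehouse": [0.0, 0.2, 0.95],
}

clusters = cluster_by_similarity(toy_vectors, threshold=0.95)
core_words = [cluster[0] for cluster in clusters]  # preliminary core word set
print(clusters)
print(core_words)
```

In this sketch the result depends on the order in which words are visited and on the threshold, which is one reason a later correction pass, such as the rule-based step the abstract mentions, can be useful.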