A Cross-lingual Patent Topics Model for Trend Analysis
Yu Tsou, Deng-Neng Chen, Chiayu Lai
2020 International Computer Symposium (ICS), published 2020-12-01
DOI: 10.1109/ICS51289.2020.00108 (https://doi.org/10.1109/ICS51289.2020.00108)
Citation count: 0
Abstract
Patent data is one of the most important innovation indicators for evaluating technological trends. With the rapid growth of business globalization in recent decades, managing an increasing volume of patent documents written in different languages has become increasingly important for identifying new technological trends and industrial innovations. However, because patent documents have a complex structure and diverse writing styles, translation may not reflect the actual proximity between patents. To address cross-lingual patent analysis, we propose a method that combines word embeddings with a latent Dirichlet allocation (LDA) model to identify cross-language technology trends, thereby avoiding the large parallel corpus that machine translation requires. We conduct a preliminary experiment to evaluate our model on English and Chinese patents. The results show that the proposed model can align the two languages effectively and identify technology trends through keywords and topics in a specific domain.
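The abstract does not spell out how word embeddings and LDA topics are combined, so the following is only a minimal sketch of one plausible reading: each monolingual topic is represented by the average embedding of its top keywords, and topics are matched across languages by cosine similarity. It assumes the English and Chinese words already live in a shared embedding space. The vectors, topic names, keyword lists, and the helper functions `topic_vector`, `cosine`, and `align_topics` are all hypothetical toy values chosen for illustration; they are not taken from the paper.

```python
import math

# Hypothetical toy vectors standing in for a shared cross-lingual
# embedding space (assumption: both languages are already aligned).
EMBEDDINGS = {
    "battery": [0.90, 0.10, 0.00],
    "electrode": [0.80, 0.20, 0.10],
    "antenna": [0.10, 0.90, 0.10],
    "signal": [0.00, 0.80, 0.20],
    "电池": [0.88, 0.12, 0.02],  # "battery"
    "电极": [0.79, 0.18, 0.12],  # "electrode"
    "天线": [0.12, 0.91, 0.08],  # "antenna"
    "信号": [0.02, 0.82, 0.18],  # "signal"
}


def topic_vector(keywords):
    """Average the embeddings of a topic's top keywords."""
    vectors = [EMBEDDINGS[word] for word in keywords]
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def align_topics(source_topics, target_topics):
    """Map each source topic to its most similar target topic."""
    alignment = {}
    for source_name, source_keywords in source_topics.items():
        source_vec = topic_vector(source_keywords)
        alignment[source_name] = max(
            target_topics,
            key=lambda name: cosine(source_vec, topic_vector(target_topics[name])),
        )
    return alignment


# Hypothetical top keywords, as two separately trained monolingual
# LDA models might produce them.
english_topics = {
    "en_energy": ["battery", "electrode"],
    "en_wireless": ["antenna", "signal"],
}
chinese_topics = {
    "zh_0": ["天线", "信号"],
    "zh_1": ["电池", "电极"],
}

alignment = align_topics(english_topics, chinese_topics)
print(alignment)
```

With these toy inputs, the energy-related English topic pairs with the Chinese topic containing 电池 and 电极, and the wireless topic pairs with the one containing 天线 and 信号. In practice the keyword lists would come from trained topic models and the vectors from a cross-lingual embedding model; a greedy best match like this one may also assign several source topics to the same target topic.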