Geodata supported classification of patent applications

J. Stutzki, Matthias Schubert
Proceedings of the Third International ACM SIGMOD Workshop on Managing and Mining Enriched Geo-Spatial Data, 2016-06-26
DOI: 10.1145/2948649.2948653 (https://doi.org/10.1145/2948649.2948653)

The automatic classification of patent applications into a given patent classification system remains a challenge with many practical applications. From a computer science point of view, the task is a multi-label hierarchical classification problem, i.e., each patent application may belong to multiple classes within the class hierarchy. The problem remains especially difficult for purely text-based classifiers because patents and patent applications are often formulated in a rather generic way. Thus, additional sources of information should be used to improve class prediction. We propose combining location information contained in the metadata of a patent application with text-based patent classification. We argue that certain technological areas often cluster in geographic regions. For example, space travel technology is often concentrated around Houston, Texas, due to the NASA facilities in this area. In many cases, the addresses of the inventors are correlated with the technological area of a given patent. Thus, the addresses can be exploited to provide additional information about the technological area. We present a geo-enriched classifier that joins established methods for text-based classification with location-based topic prediction. Since location-based prediction is not applicable in all cases, we provide a method to regulate the impact of the spatial predictor in such cases. Our experiments indicate that spatial prediction is applicable to a considerable number of patent applications and that combining spatial prediction with text-based classification significantly improves prediction accuracy.
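
The abstract does not spell out how the text-based and spatial predictors are combined or how the spatial impact is regulated, so the following is only a minimal sketch of one plausible scheme, not the authors' method. It assumes a per-region label frequency table as the spatial predictor, a weighted average as the combination rule, and a weight that grows with the number of training patents seen for a region. All function names, the `max_weight`, `saturation`, and `threshold` parameters, and the toy data are hypothetical.

```python
from collections import Counter, defaultdict


def train_spatial_predictor(training_records):
    """Count how often each class label occurs per region.

    training_records: iterable of (region, labels) pairs. The region is an
    assumed coarse location key (e.g. the inventor's city); labels is a set
    of class codes.
    """
    counts = defaultdict(Counter)
    totals = Counter()
    for region, labels in training_records:
        totals[region] += 1
        for label in labels:
            counts[region][label] += 1
    return counts, totals


def spatial_scores(region, counts, totals):
    """Return per-class relative frequencies for a region, or {} if unseen."""
    n = totals.get(region, 0)
    if n == 0:
        return {}
    return {label: c / n for label, c in counts[region].items()}


def spatial_weight(region, totals, max_weight=0.5, saturation=20):
    """Assumed regulation rule: trust the spatial predictor more as the
    number of training patents from the region grows, up to max_weight."""
    n = totals.get(region, 0)
    return max_weight * min(n, saturation) / saturation


def combine(text_scores, region, counts, totals, threshold=0.5):
    """Blend text and spatial scores; return labels at or above threshold.

    Multi-label output: every class whose blended score reaches the
    threshold is predicted, so zero, one, or several labels may result.
    """
    w = spatial_weight(region, totals)
    geo = spatial_scores(region, counts, totals)
    blended = {
        label: (1 - w) * text_scores.get(label, 0.0) + w * geo.get(label, 0.0)
        for label in set(text_scores) | set(geo)
    }
    predicted = {label for label, score in blended.items() if score >= threshold}
    return predicted, blended


# Toy data: class codes and regions are illustrative only.
records = [("Houston", {"B64G"})] * 20 + [("Munich", {"B60K"})] * 2
counts, totals = train_spatial_predictor(records)

# Hypothetical output of a text classifier that is unsure between two classes.
text_scores = {"B64G": 0.4, "G06F": 0.45}

houston_labels, _ = combine(text_scores, "Houston", counts, totals)
unknown_labels, _ = combine(text_scores, "Reykjavik", counts, totals)
print(houston_labels, unknown_labels)
```

In this toy setup the text classifier alone predicts nothing at the 0.5 threshold, while the strong regional signal for Houston lifts one class above it. For a region with no training data the weight drops to zero and the result falls back to the text scores alone, which is one simple way to limit the spatial predictor where it is not applicable.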