Automatic patent classification by a three-phase model with document frequency matrix and boosted tree
F. Shamsi, Z. Aung
2016 5th International Conference on Electronic Devices, Systems and Applications (ICEDSA), 2016-12-01
DOI: 10.1109/ICEDSA.2016.7818566
Citations: 7
Abstract
With the growth of patent databases in recent years, companies need automation to classify and identify innovative patents correctly and in a timely manner. Although many patent classification methods have been proposed, accuracy remains the most challenging factor in the success of a classification model. This paper presents an empirical study of automatic patent classification using a three-phase model. The three phases are patent query, text processing, and classification; a document frequency matrix and a boosted tree (BT) classifier are used to classify patents into two classes. Model validation, accuracy, and performance are evaluated to determine the effectiveness of the proposed model.
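The text processing and classification phases described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the authors' implementation: the toy documents and labels are invented, the "document frequency matrix" is interpreted as a document-by-term count matrix, and the boosted tree is approximated by AdaBoost over one-level decision stumps written with the Python standard library only. The patent query phase is omitted because the abstract does not name a data source.

```python
import math
import re
from collections import Counter

# Hypothetical stopword list; a real system would use a fuller one.
STOPWORDS = {"a", "an", "and", "for", "of", "the", "to", "with"}

def tokenize(text):
    """Lowercase, keep alphabetic tokens, drop a few stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]

def build_matrix(docs):
    """Return (vocab, matrix) where matrix[i][j] counts term j in document i."""
    counts = [Counter(tokenize(d)) for d in docs]
    vocab = sorted(set().union(*counts))
    return vocab, [[c[t] for t in vocab] for c in counts]

def vectorize(doc, vocab):
    """Map an unseen document onto the existing vocabulary."""
    c = Counter(tokenize(doc))
    return [c[t] for t in vocab]

def stump_predict(stump, row):
    """A stump votes `sign` when feature j exceeds thr, otherwise -sign."""
    _, j, thr, sign = stump
    return sign if row[j] > thr else -sign

def fit_stump(X, y, w):
    """Pick the (feature, threshold, sign) rule with the lowest weighted error."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for sign in (1, -1):
                cand = (0.0, j, thr, sign)
                err = sum(wi for wi, row, yi in zip(w, X, y)
                          if stump_predict(cand, row) != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, sign)
    return best

def fit_boosted(X, y, rounds=5):
    """AdaBoost: reweight examples so later stumps focus on earlier mistakes."""
    n = len(X)
    w = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        stump = fit_stump(X, y, w)
        err = min(max(stump[0], 1e-10), 1 - 1e-10)  # avoid log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, stump))
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for wi, yi, row in zip(w, y, X)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model

def predict(model, row):
    """Weighted vote of all stumps; returns +1 or -1."""
    score = sum(alpha * stump_predict(s, row) for alpha, s in model)
    return 1 if score >= 0 else -1

# Invented toy corpus: +1 = battery-related, -1 = networking-related.
docs = [
    "battery electrode with lithium anode",
    "lithium battery cell and cathode coating",
    "method for routing network packets",
    "network switch for packet forwarding",
]
labels = [1, 1, -1, -1]

vocab, X = build_matrix(docs)
model = fit_boosted(X, labels)
print(predict(model, vectorize("lithium battery pack", vocab)))
```

In practice a library such as scikit-learn would likely replace the hand-written matrix and booster; the standard-library version is used here only to keep the two-class flow visible end to end.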