A Semi-Supervised Learning Framework for TRIZ-Based Chinese Patent Classification

Proceedings of the 2020 6th International Conference on Computing and Artificial Intelligence. Pub Date: 2020-04-23. DOI: 10.1145/3404555.3404600

Lixiao Huang, Jiasi Yu, Yongjun Hu, Huiyou Chang


Citations: 1

Abstract



A Semi-Supervised Learning Framework for TRIZ-Based Chinese Patent Classification

Automatic patent classification based on the TRIZ inventive principles is essential for patent management and industrial analysis. However, acquiring the labels that deep learning methods require is extraordinarily difficult and costly. This paper proposes a new two-stage semi-supervised learning framework called TRIZ-ESSL (Enhanced Semi-Supervised Learning for TRIZ), which uses both labeled and unlabeled data to improve prediction performance. TRIZ-ESSL combines semi-supervised sequence learning with a mixed objective function made up of cross-entropy, entropy minimization, adversarial, and virtual adversarial loss functions. In the first stage, TRIZ-ESSL trains a recurrent language model on unlabeled data. In the second stage, it initializes the weights of the LSTM-based model with the pre-trained recurrent language model and then trains the text classification model with the mixed objective function on both the labeled and unlabeled sets. On three TRIZ-based classification tasks, TRIZ-ESSL outperforms currently popular semi-supervised training methods and BERT in terms of accuracy.
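The mixed objective named in the abstract can be illustrated with a small sketch. The abstract gives no formulas, weights, or hyperparameters, so everything below is an assumption made for illustration: a linear softmax classifier stands in for the LSTM-based model (which lets the input gradients be written in closed form), the four terms are added with hypothetical weights `lambdas`, cross-entropy and adversarial loss are averaged over the labeled examples, and entropy minimization and virtual adversarial loss are averaged over labeled plus unlabeled examples. It is not the authors' implementation.

```python
import math
import random

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def predict(W, x):
    # W holds one weight row per class; x is a feature vector
    # (an assumed stand-in for the LSTM's text representation).
    return softmax([sum(w * v for w, v in zip(row, x)) for row in W])

def cross_entropy(p, y):
    return -math.log(p[y] + 1e-12)

def entropy(p):
    return -sum(q * math.log(q + 1e-12) for q in p)

def kl(p, q):
    return sum(a * math.log((a + 1e-12) / (b + 1e-12)) for a, b in zip(p, q))

def unit(v):
    n = math.sqrt(sum(a * a for a in v)) or 1.0
    return [a / n for a in v]

def backproject(W, delta):
    # W^T delta: gradient with respect to the input of a linear softmax model.
    return [sum(W[k][j] * delta[k] for k in range(len(W))) for j in range(len(W[0]))]

def adversarial_loss(W, x, y, eps):
    # Perturb x along the gradient of the cross-entropy, scaled to norm eps.
    p = predict(W, x)
    delta = [p[k] - (1.0 if k == y else 0.0) for k in range(len(p))]
    r = [eps * g for g in unit(backproject(W, delta))]
    return cross_entropy(predict(W, [a + b for a, b in zip(x, r)]), y)

def virtual_adversarial_loss(W, x, eps, xi=0.1, rng=random):
    # One power-iteration step from a random direction; no label is needed.
    p = predict(W, x)
    d = unit([rng.gauss(0, 1) for _ in x])
    q = predict(W, [a + xi * b for a, b in zip(x, d)])
    g = backproject(W, [qk - pk for qk, pk in zip(q, p)])
    r = [eps * v for v in unit(g)]
    return kl(p, predict(W, [a + b for a, b in zip(x, r)]))

def mixed_objective(W, labeled, unlabeled, eps=0.5, lambdas=(1.0, 1.0, 1.0), rng=random):
    # lambdas = (entropy-min weight, adversarial weight, virtual-adversarial weight);
    # these values are hypothetical, not taken from the paper.
    xs_all = [x for x, _ in labeled] + list(unlabeled)
    ce = sum(cross_entropy(predict(W, x), y) for x, y in labeled) / len(labeled)
    at = sum(adversarial_loss(W, x, y, eps) for x, y in labeled) / len(labeled)
    em = sum(entropy(predict(W, x)) for x in xs_all) / len(xs_all)
    vat = sum(virtual_adversarial_loss(W, x, eps, rng=rng) for x in xs_all) / len(xs_all)
    l_em, l_at, l_vat = lambdas
    total = ce + l_em * em + l_at * at + l_vat * vat
    return total, {"ce": ce, "em": em, "at": at, "vat": vat}

# Toy data: 2 classes, 3 features (all values invented for the demo).
rng = random.Random(0)
W = [[0.5, -0.2, 0.1], [-0.3, 0.4, 0.2]]
labeled = [([1.0, 0.0, 0.5], 0), ([0.0, 1.0, -0.5], 1)]
unlabeled = [[0.3, 0.3, 0.0], [-0.2, 0.8, 0.1]]
total, parts = mixed_objective(W, labeled, unlabeled, rng=rng)
print(round(total, 4), parts)
```

In this toy setting the adversarial term is never smaller than the clean cross-entropy, because the cross-entropy of a linear softmax model is convex in the input. In a real two-stage setup, the classifier weights would first be initialized from the pre-trained language model and the combined loss would then be minimized by gradient descent.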
