Lukas Jonathan Weber, Alice Kirchheim, Axel Zimmermann
Computer Networks & Communications Trends, published 2022-02-19. DOI: 10.5121/csit.2022.120304 (https://doi.org/10.5121/csit.2022.120304)
W&G-Bert: A Concept for a Pre-Trained Automotive Warranty and Goodwill Language Representation Model for Warranty and Goodwill Text Mining
Demand for precise text mining applications that extract information from company-based automotive warranty and goodwill (W&G) data is steadily increasing. Progress in the analytical capability of text mining methods for information extraction rests, among other things, on developments and insights from deep learning techniques applied in natural language processing (NLP). Applying general-domain NLP architectures directly to automotive W&G text mining would lead to a significant performance loss, because word distributions differ between general-domain and W&G-specific corpora. Labelled W&G training datasets are therefore necessary to transform a general-domain language model into a domain-specific one and thereby increase performance on W&G text mining tasks.
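The claim about differing word distributions can be made concrete with a small sketch. The Python snippet below is an illustration only, not the authors' method: it builds unigram distributions for two toy corpora and compares them with the Jensen-Shannon divergence (0 for identical distributions, 1 for fully disjoint vocabularies when using base-2 logarithms). The example sentences and the function names `unigram_distribution` and `js_divergence` are hypothetical and do not come from the paper or from any real W&G dataset.

```python
import math
from collections import Counter


def unigram_distribution(texts):
    """Return relative word frequencies over a list of strings."""
    counts = Counter(word for text in texts for word in text.lower().split())
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two word distributions."""
    div = 0.0
    for word in set(p) | set(q):
        pw, qw = p.get(word, 0.0), q.get(word, 0.0)
        mw = 0.5 * (pw + qw)  # mixture distribution
        if pw > 0:
            div += 0.5 * pw * math.log2(pw / mw)
        if qw > 0:
            div += 0.5 * qw * math.log2(qw / mw)
    return div


# Toy, made-up sentences (assumption): stand-ins for a general-domain corpus
# and a warranty and goodwill corpus.
general_corpus = [
    "the weather was nice and the park was full of people",
    "she read the book on the train to work",
]
wg_corpus = [
    "customer reports rattling noise from front axle at low speed",
    "replaced defective fuel pump under warranty after customer complaint",
]

p_general = unigram_distribution(general_corpus)
p_wg = unigram_distribution(wg_corpus)
print(f"JS divergence: {js_divergence(p_general, p_wg):.3f}")
```

A value close to 1 indicates that the two corpora share little vocabulary mass, which is the kind of mismatch the abstract names as the reason a general-domain model would underperform on W&G text.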