Pengxiang Wang, Cong-Xuan Zhang, Dingqian Wang, Shaohua Zhang, Jun Wang, Xianzhi Wang, Lan Huang
Relation Extraction for Knowledge Graph Generation in the Agriculture Domain: A Case Study on Soybean Pests and Disease
Applied Engineering in Agriculture, published 2023-01-01. DOI: 10.13031/aea.15124 (https://doi.org/10.13031/aea.15124)
Citations: 0
Abstract
Highlights:
- With the aim of reducing the burden of acquiring expert knowledge and strengthening the connection between written knowledge and the field, this article investigated the problem of automatically extracting and organizing knowledge about soybean pests and disease from text.
- Entities and relations were extracted using multiple models with deep neural network structures. The performance of these models was compared and evaluated in detail.
- A knowledge graph was automatically constructed from the extracted information and made publicly available.

Precision agriculture is an emerging type of agriculture that intensively uses information technology to automate agricultural production. Soybean (Glycine max (L.) Merr.) is an important crop in China, with an annual demand of approximately 110 million tons. However, soybean production in China is threatened by more than 30 kinds of disease and 100 kinds of pests. With specialized information in the literature increasing rapidly, it is difficult for farmers to keep up. Relation extraction automatically identifies and extracts structured knowledge from natural language text and thus can help to alleviate the problem. In this study, we propose to employ relation extraction to systematically extract information from expert-written text and to generate a knowledge graph from the extracted information. This case study was planned in China; therefore, we mainly used Chinese texts. First, we carefully chose expert-written text on soybean pests and disease, labeled the entities, and classified their thematic relations into five categories. Then, we built and trained three relation extraction models using state-of-the-art deep learning architectures and evaluated their performance on our task. Finally, we constructed an example knowledge graph from the extracted information and demonstrated its potential use for automatic reasoning and solution recommendation in pest and disease prevention. In total, this study sampled 1038 entities and 1569 relation instances. Experimental results showed that our best model achieved an F1 score of 98.49% in identifying relations from text. Experimental results also showed the effectiveness of the example knowledge graph.

Keywords: Bidirectional encoder representation from transformers, Knowledge graph, Relation extraction, Soybean pests and disease.
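The last step of the pipeline, turning extracted relations into a graph that can be queried for recommendations, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not name the five relation categories, so `damages` and `controlled_by` are hypothetical labels, and the triples are invented examples rather than data from the study.

```python
from collections import defaultdict

# Hypothetical (head, relation, tail) triples standing in for the output of a
# relation extraction model. Neither the relation names nor the entities are
# taken from the paper.
triples = [
    ("soybean aphid", "damages", "soybean leaf"),
    ("soybean aphid", "controlled_by", "lady beetle"),
    ("soybean rust", "damages", "soybean leaf"),
    ("soybean rust", "controlled_by", "fungicide"),
]


def build_graph(triples):
    """Index triples as graph[head][relation] -> set of tail entities."""
    graph = defaultdict(lambda: defaultdict(set))
    for head, relation, tail in triples:
        graph[head][relation].add(tail)
    return graph


def recommend(graph, affected_part, harm="damages", remedy="controlled_by"):
    """Map each threat that harms `affected_part` to its sorted remedies."""
    result = {}
    for head, relations in graph.items():
        # .get() avoids creating empty entries in the defaultdict while iterating.
        if affected_part in relations.get(harm, set()):
            result[head] = sorted(relations.get(remedy, set()))
    return result


graph = build_graph(triples)
recommendations = recommend(graph, "soybean leaf")
```

Here `recommend` performs a simple two-hop lookup (affected part, then threat, then remedy). A real knowledge graph would more likely live in a graph database and be queried with a dedicated query language, but the underlying traversal has the same shape.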
About the journal:
This peer-reviewed journal publishes applications of engineering and technology research that address agricultural, food, and biological systems problems. Submissions must include results of practical experiences, tests, or trials presented in a manner and style that will allow easy adaptation by others; results of reviews or studies of installations or applications with substantially new or significant information not readily available in other refereed publications; or a description of successful methods or techniques of education, outreach, or technology transfer.