Sentence opinion mining model for fusing target entities in official government documents
Xiao Ma, Teng Yang, Feng Bai, Yunmei Shi
Electronic Research Archive, 2023. DOI: 10.3934/era.2023177 (https://doi.org/10.3934/era.2023177)
Abstract
When drafting official government documents, writers must firmly grasp the main idea and ensure that any positions stated in the text are consistent with those in previous documents. Given these demands, suitable text-mining techniques that harvest opinions from sentences in official government documents can significantly increase the efficiency of document writers. Most existing opinion mining approaches apply text classification directly to the sentence text of official government documents and disregard the influence that the objects described in the documents (i.e., the target entities) have on the sentence opinion categories. To address this issue, this study proposes a sentence opinion mining model that fuses the target entities within documents. Based on bidirectional long short-term memory (BiLSTM) and attention mechanisms, the model considers both the attention that a document's target entity gives to different words in the corresponding sentence and the dependencies between the words of the sentence. The model then fuses the two resulting representations through feature vector fusion to obtain the final semantic representation of the text, which is classified using a fully connected network and a softmax function. Experimental results on a dataset of official government documents show that the model significantly outperforms baseline models such as the text convolutional neural network (TextCNN), recurrent neural network (RNN), and BiLSTM.
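The pipeline described in the abstract (entity-to-word attention, fusion with a sentence-level representation, then a fully connected layer and softmax) can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not specify the attention scoring function or the fusion operator, so dot-product attention and concatenation are assumed here. The hidden states are hand-written stand-ins for BiLSTM outputs, the last hidden state stands in for the sentence-level representation, and all names and weights are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def entity_attention(entity_vec, hidden_states):
    """Attend from the target entity to each word's hidden state.

    Assumption: dot-product scoring; the paper may use a different scorer.
    Returns the attention weights and the weighted sum of hidden states.
    """
    weights = softmax([dot(entity_vec, h) for h in hidden_states])
    dim = len(hidden_states[0])
    context = [
        sum(w * h[i] for w, h in zip(weights, hidden_states))
        for i in range(dim)
    ]
    return weights, context


def fuse(entity_context, sentence_vec):
    """Assumption: feature vector fusion is plain concatenation."""
    return entity_context + sentence_vec


def classify(features, weight_rows, biases):
    """Fully connected layer followed by softmax over opinion classes."""
    logits = [dot(row, features) + b for row, b in zip(weight_rows, biases)]
    return softmax(logits)


# Toy inputs: three words with 2-dimensional hidden states standing in
# for BiLSTM outputs, and a 2-dimensional target-entity embedding.
hidden_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
entity_vec = [1.0, 0.0]

weights, context = entity_attention(entity_vec, hidden_states)
sentence_vec = hidden_states[-1]  # stand-in for the sentence representation
features = fuse(context, sentence_vec)

# Hypothetical, untrained weights for two opinion classes.
W = [[0.5, -0.5, 0.1, 0.2], [-0.3, 0.4, 0.2, -0.1]]
b = [0.0, 0.0]
probs = classify(features, W, b)
print(weights, probs)
```

In this toy input the first and third words align with the entity vector, so they receive more attention weight than the second. In a trained model, the hidden states, entity embedding, and classifier weights would all be learned from labeled sentences.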