Opinion subset selection via submodular maximization

Authors: Yang Zhao, Tommy W.S. Chow
Journal: Information Sciences, volume 560, pages 283-306
IF 8.1, Zone 1 (Computer Science), JCR category: COMPUTER SCIENCE, INFORMATION SYSTEMS
Pub Date: 2021-06-01 DOI: 10.1016/j.ins.2020.12.083
URL: https://www.sciencedirect.com/science/article/pii/S0020025521000141
Citations: 5

Abstract

Current research on subset selection for opinion analysis assumes that the opinions expressed in documents can be retrieved from general text features. However, such a relaxed assumption can hardly maintain analysis performance in opinion mining, especially under strict limits on the subset size. In this paper, we propose a framework for opinion subset selection. The framework selects a small set of instances from the original data to convey a subjective representation for opinion classification and regression. In contrast to our framework, conventional submodular-based subset selection approaches cannot capture the fine-grained opinion features expressed in the corpus. Specifically, we propose a monotone non-decreasing score function and a framework based on topic modeling and submodular maximization for filtering irrelevant information and selecting the subsets. We further introduce an opinion-sensitive algorithm that optimizes the proposed function for opinion subset construction. We perform extensive experiments and a comparative analysis of different subset selection methods. The experimental results show that the proposed framework can compress the original text training set while preserving classification and regression performance on the test set.
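To make the idea of selecting a subset by maximizing a monotone submodular score more concrete, the sketch below runs the standard greedy algorithm under a cardinality budget. It is not the authors' opinion-sensitive score function or algorithm, which the abstract does not specify. The facility-location objective, the function names `facility_location` and `greedy_select`, and the toy similarity matrix are all assumptions chosen for illustration.

```python
from typing import Callable, List, Sequence, Set


def facility_location(similarity: Sequence[Sequence[float]]) -> Callable[[Set[int]], float]:
    """Build f(S) = sum_i max_{j in S} similarity[i][j].

    For non-negative similarities this is a monotone non-decreasing
    submodular function. It is an illustrative stand-in, not the paper's score.
    """
    def score(subset: Set[int]) -> float:
        if not subset:
            return 0.0
        return sum(max(row[j] for j in subset) for row in similarity)

    return score


def greedy_select(n_items: int, score: Callable[[Set[int]], float], budget: int) -> List[int]:
    """Pick up to `budget` items, each time adding the largest marginal gain."""
    chosen: Set[int] = set()
    order: List[int] = []
    current = score(chosen)
    for _ in range(min(budget, n_items)):
        best_item, best_value = None, current
        for item in range(n_items):
            if item in chosen:
                continue
            value = score(chosen | {item})
            # Ties are broken in favor of the lowest index.
            if best_item is None or value > best_value:
                best_item, best_value = item, value
        if best_item is None:
            break
        chosen.add(best_item)
        order.append(best_item)
        current = best_value
    return order


# Hypothetical pairwise similarities between five documents:
# documents 0-1 form one group, documents 2-4 form another.
toy_similarity = [
    [1.0, 0.9, 0.1, 0.0, 0.1],
    [0.9, 1.0, 0.2, 0.1, 0.0],
    [0.1, 0.2, 1.0, 0.8, 0.5],
    [0.0, 0.1, 0.8, 1.0, 0.8],
    [0.1, 0.0, 0.5, 0.8, 1.0],
]
toy_score = facility_location(toy_similarity)
picked = greedy_select(len(toy_similarity), toy_score, budget=2)
print(picked, toy_score(set(picked)))
```

With a budget of two, the greedy pass first takes the document that best covers the larger group and then one from the other group, which is the kind of compression-with-coverage behavior the abstract describes. For monotone submodular objectives under a cardinality constraint, this greedy procedure is known to achieve at least a (1 - 1/e) fraction of the optimal score.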

Related literature
Greedy modality selection via approximate submodular maximization
IF 0 ArXiv Pub Date: 2022-10-22 DOI: 10.48550/arXiv.2210.12562
Runxiang Cheng, Gargi Balasubramaniam, Yifei He, Yao-Hung Hubert Tsai, Han Zhao
Subset Selection: k-Submodular Maximization
IF 0 Evolutionary Learning: Advances in Theories and Algorithms Pub Date: 1900-01-01 DOI: 10.1007/978-981-13-5956-9_15
Zhi-Hua Zhou, Yang Yu, Chao Qian
Source journal
Information Sciences
Category: Engineering and Technology, Computer Science: Information Systems
CiteScore: 14.00
Self-citation rate: 17.30%
Articles published: 1322
Review time: 10.4 months
Journal description: Informatics and Computer Science Intelligent Systems Applications is an esteemed international journal that focuses on publishing original and creative research findings in the field of information sciences. We also feature a limited number of timely tutorial and surveying contributions. Our journal aims to cater to a diverse audience, including researchers, developers, managers, strategic planners, graduate students, and anyone interested in staying up-to-date with cutting-edge research in information science, knowledge engineering, and intelligent systems. While readers are expected to share a common interest in information science, they come from varying backgrounds such as engineering, mathematics, statistics, physics, computer science, cell biology, molecular biology, management science, cognitive science, neurobiology, behavioral sciences, and biochemistry.
Latest articles from this journal
A stable framework-based modeling of the complex dynamical system using a double context layered with self-weighted output feedback loop Elman recurrent neural network
Adaptive fuzzy funnel control of nonlinear multi-agent systems via dual-channel event-triggered strategy
Knowledge-aware differential equation discovery with automated background knowledge extraction
Editorial Board
Semantic enhanced bi-syntactic graph convolutional network for aspect-based sentiment analysis