Latest articles in AI Open

ADAP: Adaptive & Dynamic Arc Padding for predicting seam profiles in Multi-Layer-Multi-Pass robotic welding
IF 14.8 Pub Date : 2025-01-01 DOI: 10.1016/j.aiopen.2025.10.003
He Wang , Sen Li , Xiaobo Liu , Chengxiao Dong , Fang Wan
Welding thick metal plates using Multi-Layer-Multi-Pass (MLMP) techniques demands precise control over the weld seam profile as it evolves during the cooling process. In MLMP welding, typically executed with Gas Metal Arc Welding (GMAW) and shielding gas protection, the continuous deposition of weld beads results in dynamic changes to the seam geometry. These dynamics challenge traditional robotic welding systems, which rely on static models. To ensure high-quality joints, real-time adaptation of welding paths requires accurate predictions of weld bead geometry, which in turn guide the estimation of welding positions for adaptive trajectory planning. In this study, we introduce the Adaptive & Dynamic Arc Padding (ADAP) framework, a data-driven approach that integrates deep learning with an arc-based representation of weld bead profiles. By representing the weld bead geometry through image-derived boundaries and primitive arc parameters (arc center and radius), ADAP establishes a direct link between welding parameters and the evolving weld seam profile. Utilizing datasets generated from Flow-3D simulations of the MLMP process, our framework achieves high-accuracy, real-time predictions: welding positions are estimated within 0.025 s (with an average error of approximately 1.5 mm), and weld seam profiles are predicted in 15 ms, with the arc-based geometric parameters accurately estimated (average errors of 0.73 mm in arc center position and 0.66 mm in radius). This practical approach enhances the efficiency and quality of MLMP robotic welding and contributes to advances in data-driven modeling and intelligent control in manufacturing, paving the way for autonomous welding systems.
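The abstract represents each weld bead by primitive arc parameters (center and radius) recovered from image-derived boundary points. The paper's actual recovery procedure is not given here; a minimal sketch of what such a step could look like is an algebraic least-squares circle fit, with all function and variable names illustrative rather than taken from the paper:

```python
import numpy as np

def fit_arc(points):
    """Least-squares circle fit (Kasa method): recover an arc's center
    and radius from 2-D boundary points, as in an arc-based bead profile."""
    x, y = points[:, 0], points[:, 1]
    # Circle model: x^2 + y^2 + a*x + b*y + c = 0, linear in (a, b, c).
    A = np.column_stack([x, y, np.ones_like(x)])
    rhs = -(x**2 + y**2)
    (a, b, c), *_ = np.linalg.lstsq(A, rhs, rcond=None)
    cx, cy = -a / 2, -b / 2
    r = np.sqrt(cx**2 + cy**2 - c)
    return (cx, cy), r

# Synthetic boundary points sampled from a known arc: center (2, 1), radius 3.
theta = np.linspace(0.2, 2.5, 50)
pts = np.column_stack([2 + 3 * np.cos(theta), 1 + 3 * np.sin(theta)])
(cx, cy), r = fit_arc(pts)
```

On noise-free points the fit recovers the generating arc exactly; on real boundary images the same linear system gives a robust initial estimate.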
AI Open, Volume 6, Pages 204–219. Citations: 0
LLMKG+: Systematically improving knowledge quality and coverage in KGs using LLMs – A case study in the medical domain
IF 14.8 Pub Date : 2025-01-01 DOI: 10.1016/j.aiopen.2025.11.003
Xincan Feng , Hejie Cui , Kazuki Hayashi , Huy Hien Vu , Kenta T. Suzuki , Noriki Nishida , Hidetaka Kamigaito , Yuji Matsumoto , Taro Watanabe , Carl Yang
Knowledge graphs (KGs) encode structured information about real-world entities and their relations, supporting core NLP tasks such as question answering and retrieval. Existing LLM-based methods for knowledge extraction and fusion often struggle to balance quality and coverage when adapting to emerging knowledge. We propose LLMKG+, a framework for KG expansion that integrates the generative strengths of LLMs with relevance verification. LLMKG+ features (1) a two-stage pipeline with retrieval-augmented generation followed by hierarchical expansion filtering, where the latter is the first to jointly assess semantic equivalence to eliminate triple-level redundancy while ensuring factual correctness, and (2) a novel KG Reconstruction Test that recognizes semantically equivalent triples to enable more accurate quality and coverage assessment. Evaluated on PubMed abstracts and the UMLS semantic network using eight state-of-the-art LLMs, LLMKG+ improves KG quality and coverage by 20.47%–73.71% over strong baselines. These results demonstrate that LLMKG+ offers an effective solution for KG expansion in domains requiring high quality, broad coverage, and continual knowledge growth. Code: https://github.com/xincanfeng/llmkg.
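As a rough illustration of the triple-level redundancy elimination described above, the sketch below filters candidate triples against an existing KG after mapping terms to canonical forms. A toy synonym table stands in for the paper's LLM-based semantic-equivalence check, and all names and example triples are hypothetical:

```python
def normalize(term, synonyms):
    """Map a surface form to a canonical concept (toy stand-in for an
    LLM-based semantic-equivalence judgment)."""
    key = term.lower().strip()
    return synonyms.get(key, key)

def filter_redundant(existing, candidates, synonyms):
    """Keep only candidate triples whose canonical form is not already
    in the KG: triple-level redundancy elimination."""
    seen = {tuple(normalize(t, synonyms) for t in trip) for trip in existing}
    kept = []
    for trip in candidates:
        canon = tuple(normalize(t, synonyms) for t in trip)
        if canon not in seen:
            seen.add(canon)
            kept.append(trip)
    return kept

synonyms = {"heart attack": "myocardial infarction", "asa": "aspirin"}
kg = [("aspirin", "treats", "myocardial infarction")]
new = [("ASA", "treats", "heart attack"),   # semantically equivalent: dropped
       ("aspirin", "treats", "fever")]      # genuinely new: kept
kept = filter_redundant(kg, new, synonyms)
```

The same seen-set logic applies whether equivalence comes from a lookup table, an embedding threshold, or an LLM verdict; only `normalize` changes.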
AI Open, Volume 6, Pages 299–313. Citations: 0
Advancing AI for science: From the revolution of tools to the tools for revolution
IF 14.8 Pub Date : 2025-01-01 DOI: 10.1016/j.aiopen.2025.11.002
Bowen Zhou , Ning Ding , Lei Bai , Hao Zhou
Scientific research is not a linear pipeline but a dynamic system built upon the ever-shifting interactions among three elements — research objects, tools, and researchers. Sustained progress depends on how quickly insights circulate within this network, not on optimizing a single node in isolation. With the impending arrival of more general artificial intelligence, we stand at a critical point in how AI might change scientific research in a systemic manner. Recent “AI for Science” achievements – from protein-structure prediction to accelerated climate simulations – have proven the value of task-level AI-driven solutions. Yet much of this potential remains unrealized when these advances are siloed in disciplinary “archipelagos”. This paper argues that the real prize is systemic: AI that simultaneously expands the research objects’ data landscape (AI for Data), rewires computational research tools (AI for Computation), and co-creates hypotheses with researchers (AI for Innovation). When these three pushes converge, AI stops being merely a revolution of tools and becomes the tool of revolution — a catalyst that raises the frequency, breadth, and depth of discovery across disciplines. By enhancing the full research triad rather than isolated nodes, AI can raise the overall tempo and scope of discovery in a measured, discipline-agnostic way.
AI Open, Volume 6, Pages 323–328. Citations: 0
Optimal RoPE extension via Bayesian Optimization for training-free length generalization
Pub Date : 2025-01-01 DOI: 10.1016/j.aiopen.2025.01.002
Xinrong Zhang , Shengding Hu , Weilin Zhao , Huadong Wang , Xu Han , Chaoqun He , Guoyang Zeng , Zhiyuan Liu , Maosong Sun
Transformers are designed to process input of variable length without resource constraints. However, their performance significantly deteriorates when the input surpasses a threshold slightly larger than the pre-training context window. This limitation on the effective context window constrains the application of the much-anticipated Transformer-based large language models (LLMs). Consequently, generalizing pre-trained LLMs to handle varying input lengths becomes a pivotal and formidable challenge. Previous research has endeavored to address this challenge by modifying the Rotary Position Embedding (RoPE), the primary factor responsible for disparities in handling different input lengths. These efforts have provided valuable insights, but they often lack a deep understanding of the root causes of performance degradation and rely heavily on manual parameter tuning. In response to these issues, we conduct a comprehensive analysis and identify two primary causes behind the performance drop: global distribution mismatch and local resolution degradation. In light of these challenges, we introduce an Optimal RoPE (ORoPE) extension using Bayesian Optimization (BO), which alleviates the need for additional model training. Our experiments demonstrate the efficacy of our approach, outperforming baselines by up to 21.9%, 32.1%, and 41.2% at evaluation lengths of 8K, 16K, and 32K, respectively. We will release all code and data when this paper is published.
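The abstract does not spell out which RoPE parameters ORoPE searches over, but two commonly tuned knobs in training-free RoPE extension, the frequency base and a position-interpolation scale, sketch the kind of search space a Bayesian optimizer could explore; the function below is illustrative, not the paper's parameterization:

```python
import numpy as np

def rope_angles(pos, dim, base=10000.0, scale=1.0):
    """Rotation angles applied to one position under RoPE.
    `base` sets the per-dimension frequency (NTK-style methods retune it);
    `scale` divides the position index (linear position interpolation).
    Both are typical knobs a Bayesian optimizer could search over."""
    inv_freq = base ** (-np.arange(0, dim, 2) / dim)
    return (pos / scale) * inv_freq

# With scale=4, position 8192 produces the same angles as position 2048
# unscaled, pulling out-of-window positions back into the trained range.
a = rope_angles(8192, dim=64, scale=4.0)
b = rope_angles(2048, dim=64)
```

The trade-off the paper's "local resolution degradation" cause points at is visible here: larger `scale` compresses all positions, so nearby tokens become harder to distinguish even as far positions come back in range.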
AI Open, Volume 6, Pages 1–11. Citations: 0
Symbolic learning enables self-evolving agents
IF 14.8 Pub Date : 2025-01-01 DOI: 10.1016/j.aiopen.2025.11.004
Yixin Ou , Wangchunshu Zhou , Shengwei Ding , Long Li , Jialong Wu , Tiannan Wang , Jiamin Chen , Shuai Wang , Xiaohua Xu , Ningyu Zhang , Huajun Chen , Yuchen Eleanor Jiang
The AI community has been exploring a pathway to artificial general intelligence (AGI) by developing “language agents”, which are complex large language model (LLM) workflows involving both prompting techniques and tool usage methods. While language agents have demonstrated impressive capabilities for many real-world tasks, a fundamental limitation of current language-agent research is that it is model-centric or engineering-centric. That is to say, the design of prompts, tools, and workflows of language agents requires substantial manual engineering effort from human experts rather than automatic learning from data. We believe the transition from model-centric, or engineering-centric, to data-centric, i.e., the ability of language agents to autonomously learn and evolve in environments, is the key for them to possibly achieve AGI.
In this work, we introduce agent symbolic learning, a systematic framework that enables language agents to optimize themselves on their own in a data-centric way using symbolic optimizers. Specifically, we consider agents as symbolic networks in which learnable weights are defined by prompts, tools, and the way they are stacked together. Agent symbolic learning is designed to optimize the symbolic network within language agents in a data-centric way by mimicking two fundamental algorithms in connectionist learning: back-propagation and gradient descent. Instead of dealing with numeric weights, agent symbolic learning works with text-based weights, loss, and gradients. We conduct proof-of-concept experiments on both standard benchmarks and complex real-world tasks and show substantial improvements over static agent frameworks and simple prompt/tool optimization methods. In addition, agent symbolic learning enables language agents to update themselves after being created and deployed in the wild, resulting in “self-evolving agents”. We will open-source the agent symbolic learning framework to facilitate future research on data-centric agent learning.
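The back-propagation analogy above can be made concrete with a minimal loop over one text-based weight. Here `critic` and `editor` are stand-ins for LLM calls (the paper's actual update prompts are not shown), and every name and string is illustrative:

```python
def symbolic_update(prompt, trajectory, critic, editor):
    """One 'training step' on a text-based weight (a prompt): the critic
    produces a textual loss describing what went wrong, and the editor
    applies a textual gradient step by revising the prompt accordingly."""
    loss = critic(trajectory)    # textual loss, analogous to a scalar loss
    return editor(prompt, loss)  # textual gradient step, analogous to SGD

# Stub "LLMs" so the data flow is runnable without a model call.
critic = lambda traj: "answer lacked citations" if "no-cite" in traj else "ok"
editor = lambda p, loss: p + " Always cite sources." if loss != "ok" else p

prompt = "You are a helpful research assistant."
prompt = symbolic_update(prompt, "no-cite run", critic, editor)
```

In the full framework the "weights" also include tool definitions and the workflow graph itself, but each one is updated by the same critic-then-edit pattern.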
AI Open, Volume 6, Pages 314–322. Citations: 0