
AI Open: Latest Publications

Restricted orthogonal gradient projection for continual learning
Pub Date : 2023-01-01 DOI: 10.1016/j.aiopen.2023.08.010
Zeyuan Yang, Zonghan Yang, Yichen Liu, Peng Li, Yang Liu

Continual learning aims to avoid catastrophic forgetting and to leverage learned experience effectively when mastering new knowledge. Existing gradient projection approaches impose hard constraints on the optimization space for new tasks to minimize interference, which simultaneously hinders forward knowledge transfer. To address this issue, recent methods reuse frozen parameters with a growing network, incurring high computational costs. Whether forward knowledge transfer can be improved for gradient projection approaches using a fixed network architecture thus remains an open challenge. In this work, we propose the Restricted Orthogonal Gradient prOjection (ROGO) framework. The basic idea is to adopt a restricted orthogonal constraint that allows parameters to be optimized in directions oblique to the whole frozen space, facilitating forward knowledge transfer while consolidating previous knowledge. Our framework requires neither data buffers nor extra parameters. Extensive experiments demonstrate the superiority of our framework over several strong baselines, and we provide theoretical guarantees for our relaxation strategy.
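To make the projection idea concrete, here is a minimal NumPy sketch of gradient projection against a frozen feature subspace, with a relaxation knob standing in for ROGO's restricted oblique direction. The function name, the `relax` parameter, and the way the restricted component is chosen are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np

def project_gradient(grad, frozen_basis, relax=0.1):
    """Hypothetical sketch of a restricted orthogonal projection.

    grad:         flattened gradient of one layer, shape (d,)
    frozen_basis: orthonormal columns spanning the frozen (old-task)
                  feature space, shape (d, k)
    relax:        fraction of the in-span component that is kept; 0.0
                  recovers strict orthogonal gradient projection, while a
                  small positive value makes the update oblique to the
                  frozen space, leaving room for forward transfer.
    """
    in_span = frozen_basis @ (frozen_basis.T @ grad)  # component inside the frozen space
    return grad - (1.0 - relax) * in_span

# toy usage
rng = np.random.default_rng(0)
d, k = 8, 3
frozen_basis, _ = np.linalg.qr(rng.standard_normal((d, k)))  # orthonormal basis
g = rng.standard_normal(d)
g_strict = project_gradient(g, frozen_basis, relax=0.0)
assert np.allclose(frozen_basis.T @ g_strict, 0.0)  # orthogonal to the frozen space
```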

{"title":"Restricted orthogonal gradient projection for continual learning","authors":"Zeyuan Yang ,&nbsp;Zonghan Yang ,&nbsp;Yichen Liu ,&nbsp;Peng Li ,&nbsp;Yang Liu","doi":"10.1016/j.aiopen.2023.08.010","DOIUrl":"https://doi.org/10.1016/j.aiopen.2023.08.010","url":null,"abstract":"<div><p>Continual learning aims to avoid catastrophic forgetting and effectively leverage learned experiences to master new knowledge. Existing gradient projection approaches impose hard constraints on the optimization space for new tasks to minimize interference, which simultaneously hinders forward knowledge transfer. To address this issue, recent methods reuse frozen parameters with a growing network, resulting in high computational costs. Thus, it remains a challenge whether we can improve forward knowledge transfer for gradient projection approaches <em>using a fixed network architecture</em>. In this work, we propose the Restricted Orthogonal Gradient prOjection (ROGO) framework. The basic idea is to adopt a restricted orthogonal constraint allowing parameters optimized in the direction oblique to the whole frozen space to facilitate forward knowledge transfer while consolidating previous knowledge. Our framework requires neither data buffers nor extra parameters. Extensive experiments have demonstrated the superiority of our framework over several strong baselines. We also provide theoretical guarantees for our relaxing strategy.</p></div>","PeriodicalId":100068,"journal":{"name":"AI Open","volume":"4 ","pages":"Pages 98-110"},"PeriodicalIF":0.0,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49732819","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Multi-grained hypergraph interest modeling for conversational recommendation
Pub Date : 2023-01-01 DOI: 10.1016/j.aiopen.2023.10.001
Chenzhan Shang, Yupeng Hou, Wayne Xin Zhao, Yaliang Li, Jing Zhang

Conversational recommender systems (CRS) interact with users through multi-turn dialogues in natural language, aiming to provide high-quality recommendations for users' instant information needs. Although great efforts have been made to develop effective CRS, most approaches still focus only on the contextual information from the current dialogue and usually suffer from data scarcity. Therefore, we consider leveraging historical dialogue data to enrich the limited context of the current dialogue session.

In this paper, we propose a novel multi-grained hypergraph interest modeling approach to capture user interest beneath intricate historical data from different perspectives. As the core idea, we employ hypergraphs to represent the complicated semantic relations underlying historical dialogues. In our approach, we first employ the hypergraph structure to model users' historical dialogue sessions, forming a session-based hypergraph that captures coarse-grained, session-level relations. Second, to alleviate data scarcity, we use an external knowledge graph and construct a knowledge-based hypergraph that considers fine-grained, entity-level semantics. We further conduct multi-grained hypergraph convolution on the two kinds of hypergraphs, and utilize the enhanced representations to develop an interest-aware CRS. Extensive experiments on two benchmarks, ReDial and TG-ReDial, validate the effectiveness of our approach on both recommendation and conversation tasks. Code is available at: https://github.com/RUCAIBox/MHIM.
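The propagation step behind such models can be illustrated with the standard HGNN-style hypergraph convolution, sketched below in NumPy. The normalization scheme and function name are assumptions for illustration; the paper's multi-grained variant applies this kind of step to both the session-based and the knowledge-based hypergraphs.

```python
import numpy as np

def hypergraph_conv(X, H, W=None):
    """One hypergraph convolution step (HGNN-style sketch).

    X: node features, shape (n, f)
    H: incidence matrix, H[v, e] = 1 if node v belongs to hyperedge e, shape (n, m)
    W: optional learnable projection, shape (f, f')
    """
    Dv = np.clip(H.sum(axis=1), 1, None)   # node degrees
    De = np.clip(H.sum(axis=0), 1, None)   # hyperedge degrees
    # normalized propagation: Dv^{-1/2} H De^{-1} H^T Dv^{-1/2} X W
    Xn = X / np.sqrt(Dv)[:, None]
    msg = (H / De) @ (H.T @ Xn)            # gather over hyperedges, scatter back to nodes
    out = msg / np.sqrt(Dv)[:, None]
    return out @ W if W is not None else out

# toy usage: 4 nodes, 2 hyperedges
H = np.array([[1, 0], [1, 0], [1, 1], [0, 1]], dtype=float)
X = np.eye(4)
print(hypergraph_conv(X, H).shape)  # (4, 4)
```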

{"title":"Multi-grained hypergraph interest modeling for conversational recommendation","authors":"Chenzhan Shang ,&nbsp;Yupeng Hou ,&nbsp;Wayne Xin Zhao ,&nbsp;Yaliang Li ,&nbsp;Jing Zhang","doi":"10.1016/j.aiopen.2023.10.001","DOIUrl":"https://doi.org/10.1016/j.aiopen.2023.10.001","url":null,"abstract":"<div><p>Conversational recommender system (CRS) interacts with users through multi-turn dialogues in natural language, which aims to provide high-quality recommendations for user’s instant information need. Although great efforts have been made to develop effective CRS, most of them still focus on the contextual information from the current dialogue, usually suffering from the data scarcity issue. Therefore, we consider leveraging historical dialogue data to enrich the limited contexts of the current dialogue session.</p><p>In this paper, we propose a novel multi-grained hypergraph interest modeling approach to capture user interest beneath intricate historical data from different perspectives. As the core idea, we employ <em>hypergraph</em> to represent complicated semantic relations underlying historical dialogues. In our approach, we first employ the hypergraph structure to model users’ historical dialogue sessions and form a <em>session-based hypergraph</em>, which captures <em>coarse-grained, session-level</em> relations. Second, to alleviate the issue of data scarcity, we use an external knowledge graph and construct a <em>knowledge-based hypergraph</em> considering <em>fine-grained, entity-level</em> semantics. We further conduct multi-grained hypergraph convolution on the two kinds of hypergraphs, and utilize the enhanced representations to develop interest-aware CRS. Extensive experiments on two benchmarks <span>ReDial</span> and <span>TG-ReDial</span> validate the effectiveness of our approach on both recommendation and conversation tasks. Code is available at: <span>https://github.com/RUCAIBox/MHIM</span><svg><path></path></svg>.</p></div>","PeriodicalId":100068,"journal":{"name":"AI Open","volume":"4 ","pages":"Pages 154-164"},"PeriodicalIF":0.0,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2666651023000177/pdfft?md5=845c75e23c419b9a9e76d0939d4efddc&pid=1-s2.0-S2666651023000177-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"92131677","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Improving task generalization via unified schema prompt
Pub Date : 2023-01-01 DOI: 10.1016/j.aiopen.2023.08.011
Wanjun Zhong, Yifan Gao, Ning Ding, Zhiyuan Liu, Ming Zhou, Jiahai Wang, Jian Yin, Nan Duan

Task generalization has been a long-standing challenge in Natural Language Processing (NLP). Recent research attempts to improve the task generalization ability of pre-trained language models by mapping NLP tasks into human-readable prompted forms. However, these approaches require laborious and inflexible manual collection of prompts, and different prompts for the same downstream task may yield unstable performance. We propose Unified Schema Prompt, a flexible and extensible prompting method that automatically customizes learnable prompts for each task according to the task's input schema. It models the knowledge shared between tasks while keeping the characteristics of each task schema, and thus enhances task generalization ability. The schema prompt uses the explicit data structure of each task to formulate prompts, so little human effort is involved. To test the task generalization ability of the schema prompt at scale, we conduct schema prompt-based multitask pre-training on a wide variety of general NLP tasks. The framework achieves strong zero-shot and few-shot generalization performance on 16 unseen downstream tasks from 8 task types (e.g., QA, NLI). Furthermore, comprehensive analyses demonstrate the effectiveness of each component of the schema prompt, its flexibility in task compositionality, and its ability to improve performance under a full-data fine-tuning setting.
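As a rough illustration of the schema-driven idea, the sketch below assembles a discrete prompt directly from a task's input schema, so no per-task template has to be written by hand. The marker format and the `schema_prompt` helper are hypothetical; the paper's method additionally learns shared and task-specific soft prompts keyed by schema components rather than relying on a fixed string format.

```python
# Hypothetical sketch of schema-driven prompt construction: the prompt is
# assembled from the task's input schema (key/value slots) plus a task-type
# marker, instead of a hand-written per-task template.
def schema_prompt(task_type: str, schema: dict[str, str]) -> str:
    slots = " ".join(f"[{key}] {value}" for key, value in schema.items())
    return f"[task: {task_type}] {slots} [answer]"

# usage: an NLI-style instance expressed through its schema
example = {
    "premise": "A man is playing a guitar on stage.",
    "hypothesis": "A musician is performing.",
}
print(schema_prompt("nli", example))
# [task: nli] [premise] A man is playing ... [hypothesis] A musician ... [answer]
```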

{"title":"Improving task generalization via unified schema prompt","authors":"Wanjun Zhong ,&nbsp;Yifan Gao ,&nbsp;Ning Ding ,&nbsp;Zhiyuan Liu ,&nbsp;Ming Zhou ,&nbsp;Jiahai Wang ,&nbsp;Jian Yin ,&nbsp;Nan Duan","doi":"10.1016/j.aiopen.2023.08.011","DOIUrl":"https://doi.org/10.1016/j.aiopen.2023.08.011","url":null,"abstract":"<div><p>Task generalization has been a long-standing challenge in Natural Language Processing (NLP). Recent research attempts to improve the task generalization ability of pre-trained language models by mapping NLP tasks into human-readable prompted forms. However, these approaches require laborious and inflexible manual collection of prompts, and different prompts on the same downstream task may receive unstable performance. We propose Unified Schema Prompt, a flexible and extensible prompting method, which automatically customizes the learnable prompts for each task according to the task input schema. It models the shared knowledge between tasks, while keeping the characteristics of different task schema, and thus enhances task generalization ability. The schema prompt takes the explicit data structure of each task to formulate prompts so that little human effort is involved. To test the task generalization ability of schema prompt at scale, we conduct schema prompt-based multitask pre-training on a wide variety of general NLP tasks. The framework achieves strong zero-shot and few-shot generalization performance on 16 unseen downstream tasks from 8 task types (e.g., QA, NLI, etc.). Furthermore, comprehensive analyses demonstrate the effectiveness of each component in the schema prompt, its flexibility in task compositionality, and its ability to improve performance under a full-data fine-tuning setting.</p></div>","PeriodicalId":100068,"journal":{"name":"AI Open","volume":"4 ","pages":"Pages 120-129"},"PeriodicalIF":0.0,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49710709","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Batch virtual adversarial training for graph convolutional networks
Pub Date : 2023-01-01 DOI: 10.1016/j.aiopen.2023.08.007
Zhijie Deng, Yinpeng Dong, Jun Zhu

We present batch virtual adversarial training (BVAT), a novel regularization method for graph convolutional networks (GCNs). BVAT addresses the issue that GCNs do not ensure the smoothness of the model's output distribution against local perturbations around the input node features. We propose two algorithms, sampling-based BVAT and optimization-based BVAT, which promote the output smoothness of GCN classifiers using virtual adversarial perturbations generated for either a subset of independent nodes or, via an elaborate optimization process, for all nodes. Extensive experiments on three citation network datasets, Cora, Citeseer and Pubmed, and a knowledge graph dataset, Nell, validate the efficacy of the proposed method on semi-supervised node classification tasks.
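At the heart of both variants is the virtual adversarial training (VAT) smoothness penalty: find the small input perturbation that most changes the predictive distribution, then penalize that change. A minimal PyTorch sketch of this loss follows; it is written for generic dense inputs, and the function name, defaults, and single power-iteration step are illustrative rather than BVAT's exact batch construction on graph nodes.

```python
import torch
import torch.nn.functional as F

def virtual_adversarial_loss(model, x, xi=1e-6, eps=1.0, n_power=1):
    """Sketch of a VAT-style smoothness term: KL divergence between the
    prediction at x and at x plus a worst-case small perturbation."""
    with torch.no_grad():
        p = F.softmax(model(x), dim=-1)            # current predictive distribution
    d = torch.randn_like(x)                        # random initial direction
    for _ in range(n_power):                       # power iteration for the worst direction
        d = xi * F.normalize(d.flatten(1), dim=1).view_as(x)
        d.requires_grad_(True)
        kl = F.kl_div(F.log_softmax(model(x + d), dim=-1), p, reduction="batchmean")
        d = torch.autograd.grad(kl, d)[0].detach()
    r_adv = eps * F.normalize(d.flatten(1), dim=1).view_as(x)
    kl = F.kl_div(F.log_softmax(model(x + r_adv), dim=-1), p, reduction="batchmean")
    return kl                                      # add to the supervised loss as a regularizer
```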

{"title":"Batch virtual adversarial training for graph convolutional networks","authors":"Zhijie Deng ,&nbsp;Yinpeng Dong ,&nbsp;Jun Zhu","doi":"10.1016/j.aiopen.2023.08.007","DOIUrl":"https://doi.org/10.1016/j.aiopen.2023.08.007","url":null,"abstract":"<div><p>We present batch virtual adversarial training (BVAT), a novel regularization method for graph convolutional networks (GCNs). BVAT addresses the issue that GCNs do not ensure the smoothness of the model’s output distribution against local perturbations around the input node features. We propose two algorithms, sampling-based BVAT and optimization-based BVAT, which promote the output smoothness of GCN classifiers based on the generated virtual adversarial perturbations for either a subset of independent nodes or all nodes via an elaborate optimization process. Extensive experiments on three citation network datasets <em>Cora</em>, <em>Citeseer</em> and <em>Pubmed</em> and a knowledge graph dataset <em>Nell</em> validate the efficacy of the proposed method in semi-supervised node classification tasks.</p></div>","PeriodicalId":100068,"journal":{"name":"AI Open","volume":"4 ","pages":"Pages 73-79"},"PeriodicalIF":0.0,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49761369","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 63
Sarcasm detection using news headlines dataset
Pub Date : 2023-01-01 DOI: 10.1016/j.aiopen.2023.01.001
Rishabh Misra, Prahal Arora

Sarcasm has been an elusive concept for humans. Owing to its interesting linguistic properties, sarcasm detection has gained traction in the Natural Language Processing (NLP) research community in the past few years. However, predicting sarcasm in a text remains a difficult task for machines as well, and there are limited insights into what makes a sentence sarcastic. Past studies in sarcasm detection use either large-scale datasets collected with tag-based supervision or small-scale manually annotated datasets. The former category of datasets is noisy in terms of labels and language, whereas the latter category, despite having high-quality labels, does not have enough instances to train deep learning models reliably. To overcome these shortcomings, we introduce a high-quality and relatively large-scale dataset consisting of news headlines from a sarcastic news website and a real news website. We describe the unique aspects of our dataset and compare its characteristics with other benchmark datasets in the sarcasm detection domain. Furthermore, we produce insights into what constitutes sarcasm in a text using a Hybrid Neural Network architecture. Since the dataset was first released in 2019, we dedicate a section to how the NLP research community has relied upon our contributions to push the state of the art further in the sarcasm detection domain. Lastly, we make the dataset as well as the framework implementation publicly available to facilitate continued research in this domain.
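For readers who want to try the corpus, the sketch below shows how such a headlines dataset is typically read, assuming the JSON-lines layout used in the public release of this dataset (one record per line with "headline", "is_sarcastic", and "article_link" fields); the file name and helper function are illustrative.

```python
import json

def load_headlines(path="Sarcasm_Headlines_Dataset.json"):
    """Yield (headline, label) pairs from a JSON-lines sarcasm corpus."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            record = json.loads(line)
            yield record["headline"], record["is_sarcastic"]

# usage: inspect the class balance of the corpus
# labels = [y for _, y in load_headlines()]
# print(sum(labels) / len(labels))  # fraction of sarcastic headlines
```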

{"title":"Sarcasm detection using news headlines dataset","authors":"Rishabh Misra ,&nbsp;Prahal Arora","doi":"10.1016/j.aiopen.2023.01.001","DOIUrl":"https://doi.org/10.1016/j.aiopen.2023.01.001","url":null,"abstract":"<div><p>Sarcasm has been an elusive concept for humans. Due to interesting linguistic properties, sarcasm detection has gained traction of the Natural Language Processing (NLP) research community in the past few years. However, the task of predicting sarcasm in a text remains a difficult one for machines as well, and there are limited insights into what makes a sentence sarcastic. Past studies in sarcasm detection either use large scale datasets collected using tag-based supervision or small scale manually annotated datasets. The former category of datasets are noisy in terms of labels and language, whereas the latter category of datasets do not have enough instances to train deep learning models reliably despite having high-quality labels. To overcome these shortcomings, we introduce a high-quality and relatively larger-scale dataset which is a collection of news headlines from a sarcastic news website and a real news website. We describe the unique aspects of our dataset and compare its various characteristics with other benchmark datasets in sarcasm detection domain. Furthermore, we produce insights into what constitute as sarcasm in a text using a Hybrid Neural Network architecture. First released in 2019, we dedicate a section on how the NLP research community has extensively relied upon our contributions to push the state of the art further in the sarcasm detection domain. Lastly, we make the dataset as well as framework implementation publicly available to facilitate continued research in this domain.</p></div>","PeriodicalId":100068,"journal":{"name":"AI Open","volume":"4 ","pages":"Pages 13-18"},"PeriodicalIF":0.0,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49732927","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 7
On the distribution alignment of propagation in graph neural networks
Pub Date : 2022-12-01 DOI: 10.1016/j.aiopen.2022.11.006
Qinkai Zheng, Xiao Xia, Kun Zhang, E. Kharlamov, Yuxiao Dong
{"title":"On the distribution alignment of propagation in graph neural networks","authors":"Qinkai Zheng, Xiao Xia, Kun Zhang, E. Kharlamov, Yuxiao Dong","doi":"10.1016/j.aiopen.2022.11.006","DOIUrl":"https://doi.org/10.1016/j.aiopen.2022.11.006","url":null,"abstract":"","PeriodicalId":100068,"journal":{"name":"AI Open","volume":"12 1","pages":"218-228"},"PeriodicalIF":0.0,"publicationDate":"2022-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81955757","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
BCA: Bilinear Convolutional Neural Networks and Attention Networks for legal question answering
Pub Date : 2022-11-01 DOI: 10.1016/j.aiopen.2022.11.002
Haiguang Zhang, Tongyue Zhang, Faxin Cao, Zhizheng Wang, Yuanyu Zhang, Yuanyuan Sun, Mark Anthony Vicente
{"title":"BCA: Bilinear Convolutional Neural Networks and Attention Networks for legal question answering","authors":"Haiguang Zhang, Tongyue Zhang, Faxin Cao, Zhizheng Wang, Yuanyu Zhang, Yuanyuan Sun, Mark Anthony Vicente","doi":"10.1016/j.aiopen.2022.11.002","DOIUrl":"https://doi.org/10.1016/j.aiopen.2022.11.002","url":null,"abstract":"","PeriodicalId":100068,"journal":{"name":"AI Open","volume":"5 1","pages":"172-181"},"PeriodicalIF":0.0,"publicationDate":"2022-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78994593","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 1
HSSDA: Hierarchical relation aided Semi-Supervised Domain Adaptation
Pub Date : 2022-11-01 DOI: 10.1016/j.aiopen.2022.11.001
Xiechao Guo, R. Liu, Dandan Song
{"title":"HSSDA: Hierarchical relation aided Semi-Supervised Domain Adaptation","authors":"Xiechao Guo, R. Liu, Dandan Song","doi":"10.1016/j.aiopen.2022.11.001","DOIUrl":"https://doi.org/10.1016/j.aiopen.2022.11.001","url":null,"abstract":"","PeriodicalId":100068,"journal":{"name":"AI Open","volume":"47 1","pages":"156-161"},"PeriodicalIF":0.0,"publicationDate":"2022-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78388787","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Optimized separable convolution: Yet another efficient convolution operator
Pub Date : 2022-10-01 DOI: 10.2139/ssrn.4245175
Tao Wei, Yonghong Tian, Yaowei Wang, Yun Liang, C. Chen
The convolution operation is the most critical component in the recent surge of deep learning research. A conventional 2D convolution needs O(C^2 K^2) parameters, where C is the channel size and K is the kernel size. This parameter count has become costly, as models have grown tremendously to meet the needs of demanding applications. Among the various implementations of convolution, separable convolution has proven more efficient at reducing model size. For example, depthwise separable convolution reduces the complexity to O(C·(C + K^2)), while spatially separable convolution reduces it to O(C^2 K). However, these are ad hoc designs that cannot, in general, guarantee optimal separation. In this research, we propose a novel and principled operator called optimized separable convolution, which, by optimally designing the internal number of groups and kernel sizes for general separable convolutions, achieves a complexity of O(C^{3/2} K). When the restriction on the number of separated convolutions is lifted, an even lower complexity of O(C·log(C K^2)) can be achieved. Experimental results demonstrate that the proposed optimized separable convolution achieves improved accuracy-vs-#Params trade-offs over conventional, depthwise separable, and spatially separable convolutions.
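The baseline complexities quoted above are easy to verify in code. The sketch below builds the three standard operators in PyTorch and counts their parameters; the channel and kernel sizes are illustrative, and the paper's optimized O(C^{3/2} K) operator itself is not reproduced here.

```python
import torch.nn as nn

def n_params(m):
    """Total learnable parameters of a module."""
    return sum(p.numel() for p in m.parameters() if p.requires_grad)

C, K = 64, 3  # channels and kernel size (illustrative)

# conventional 2D convolution: O(C^2 K^2) parameters
conv = nn.Conv2d(C, C, K, padding=K // 2, bias=False)

# depthwise separable: depthwise C*K^2 plus pointwise C^2 = O(C·(C + K^2))
depthwise = nn.Sequential(
    nn.Conv2d(C, C, K, padding=K // 2, groups=C, bias=False),
    nn.Conv2d(C, C, 1, bias=False),
)

# spatially separable: K x 1 followed by 1 x K, 2*C^2*K = O(C^2 K)
spatial = nn.Sequential(
    nn.Conv2d(C, C, (K, 1), padding=(K // 2, 0), bias=False),
    nn.Conv2d(C, C, (1, K), padding=(0, K // 2), bias=False),
)

print(n_params(conv), n_params(depthwise), n_params(spatial))
# 36864 (= C*C*K*K), 4672 (= C*K*K + C*C), 24576 (= 2*C*C*K)
```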
{"title":"Optimized separable convolution: Yet another efficient convolution operator","authors":"Tao Wei, Yonghong Tian, Yaowei Wang, Yun Liang, C. Chen","doi":"10.2139/ssrn.4245175","DOIUrl":"https://doi.org/10.2139/ssrn.4245175","url":null,"abstract":"The convolution operation is the most critical component in recent surge of deep learning research. Conventional 2D convolution needs O ( C 2 K 2 ) parameters to represent, where C is the channel size and K is the kernel size. The amount of parameters has become really costly considering that these parameters increased tremendously recently to meet the needs of demanding applications. Among various implementations of the convolution, separable convolution has been proven to be more efficient in reducing the model size. For example, depth separable convolution reduces the complexity to O ( C · ( C + K 2 )) while spatial separable convolution reduces the complexity to O ( C 2 K ) . However, these are considered ad hoc designs which cannot ensure that they can in general achieve optimal separation. In this research, we propose a novel and principled operator called optimized separable convolution by optimal design for the internal number of groups and kernel sizes for general separable convolutions can achieve the complexity of O ( C 32 K ) . When the restriction in the number of separated convolutions can be lifted, an even lower complexity at O ( C · log( CK 2 )) can be achieved. Experimental results demonstrate that the proposed optimized separable convolution is able to achieve an improved performance in terms of accuracy-#Params trade-offs over both conventional, depth-wise, and depth/spatial separable convolutions.","PeriodicalId":100068,"journal":{"name":"AI Open","volume":"40 1","pages":"162-171"},"PeriodicalIF":0.0,"publicationDate":"2022-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85236498","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 2
Deep learning for fake news detection: A comprehensive survey
Pub Date : 2022-10-01 DOI: 10.1016/j.aiopen.2022.09.001
Linmei Hu, Siqi Wei, Ziwang Zhao, Bin Wu
{"title":"Deep learning for fake news detection: A comprehensive survey","authors":"Linmei Hu, Siqi Wei, Ziwang Zhao, Bin Wu","doi":"10.1016/j.aiopen.2022.09.001","DOIUrl":"https://doi.org/10.1016/j.aiopen.2022.09.001","url":null,"abstract":"","PeriodicalId":100068,"journal":{"name":"AI Open","volume":"44 1","pages":"133-155"},"PeriodicalIF":0.0,"publicationDate":"2022-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91013077","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 18