
Latest Publications in AI Open

Domain generalization by class-aware negative sampling-based contrastive learning
Pub Date: 2022-01-01 | DOI: 10.1016/j.aiopen.2022.11.004
Mengwei Xie, Suyun Zhao, Hong Chen, Cuiping Li

Test data may differ from training data in style and background because of collection sources or privacy protection, so the feature distributions of the two sets differ; this is the transfer generalization problem. Contrastive learning, currently the most successful unsupervised learning method, provides good generalization performance across varied data distributions and can use labeled data more effectively without overfitting. This study demonstrates how contrast can enhance a model's ability to generalize, how joint contrastive and supervised learning can strengthen one another, and how this approach can be broadly applied across disciplines.
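
As a rough illustration of the class-aware negative sampling idea (a generic sketch rather than the authors' exact method; every name here is hypothetical), a supervised InfoNCE-style loss can restrict each anchor's negatives to samples of other classes:

```python
import torch
import torch.nn.functional as F

def class_aware_contrastive_loss(z, labels, temperature=0.1):
    """Supervised InfoNCE in which an anchor's positives share its class
    label and only other-class samples enter the denominator as negatives."""
    z = F.normalize(z, dim=1)                         # unit-length embeddings
    sim = z @ z.t() / temperature                     # pairwise similarity logits
    n = z.size(0)
    eye = torch.eye(n, dtype=torch.bool, device=z.device)
    same = labels.unsqueeze(0) == labels.unsqueeze(1)
    pos = same & ~eye                                 # positives: same class, not self
    # log-sum-exp over other-class negatives, one value per anchor row
    neg_lse = torch.logsumexp(sim.masked_fill(same, float('-inf')), dim=1, keepdim=True)
    # per pair: log( exp(s_ij) / (exp(s_ij) + sum over negatives) )
    log_prob = sim - torch.logaddexp(sim, neg_lse)
    loss = -(log_prob * pos).sum(1) / pos.sum(1).clamp(min=1)
    return loss.mean()
```

Keeping same-class samples out of the denominator avoids pushing apart examples that supervision says should stay close, which is plausibly where class-aware sampling pays off.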

AI Open, vol. 3 (2022), pp. 200-207. Open-access PDF: https://www.sciencedirect.com/science/article/pii/S2666651022000195/pdfft?md5=d1beea40105807161328cdcc4aa5b211&pid=1-s2.0-S2666651022000195-main.pdf
Citations: 0
StackVAE-G: An efficient and interpretable model for time series anomaly detection
Pub Date: 2022-01-01 | DOI: 10.1016/j.aiopen.2022.07.001
Wenkai Li, Wenbo Hu, Ting Chen, Ning Chen, Cheng Feng

Recent studies have shown that autoencoder-based models can achieve superior performance on anomaly detection tasks due to their excellent ability to fit complex data in an unsupervised manner. In this work, we propose a novel autoencoder-based model, named StackVAE-G, that brings significant efficiency and interpretability to multivariate time series anomaly detection. Specifically, we exploit the similarities across time series channels through stacking block-wise reconstruction with a weight-sharing scheme, which reduces the size of the learned model and alleviates overfitting to unknown noise in the training data. We also leverage a graph learning module to learn a sparse adjacency matrix that explicitly captures the stable interrelation structure among multiple time series channels, enabling interpretable pattern reconstruction of interrelated channels. Combining these two modules, we introduce the stacking block-wise VAE (variational autoencoder) with GNN (graph neural network) model for multivariate time series anomaly detection. Extensive experiments on three commonly used public datasets show that our model achieves performance comparable to (or better than) state-of-the-art models while requiring far less computation and memory. Furthermore, we demonstrate that the adjacency matrix learned by our model accurately captures the interrelations among multiple channels and can provide valuable information for failure-diagnosis applications.
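
A minimal sketch of the two ingredients described above — a block-wise VAE whose encoder/decoder weights are shared across all channels, and a graph module that learns a sparse adjacency from free channel embeddings. This is an illustration under assumed shapes and simple linear layers, not the authors' implementation:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SharedChannelVAE(nn.Module):
    """One VAE applied to every channel's window; weights are shared,
    so model size does not grow with the number of channels."""
    def __init__(self, window, latent):
        super().__init__()
        self.enc = nn.Linear(window, 2 * latent)   # emits mean and log-variance
        self.dec = nn.Linear(latent, window)

    def forward(self, x):                          # x: (batch, channels, window)
        mu, logvar = self.enc(x).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterization
        return self.dec(z), mu, logvar

class SparseGraphModule(nn.Module):
    """Learns channel embeddings and derives a sparse (top-k) adjacency
    capturing stable inter-channel relations."""
    def __init__(self, channels, dim=16, k=3):
        super().__init__()
        self.emb = nn.Parameter(torch.randn(channels, dim))
        self.k = k

    def forward(self):
        scores = self.emb @ self.emb.t()           # channel-to-channel affinity
        keep = torch.zeros_like(scores).scatter_(
            -1, scores.topk(self.k, dim=-1).indices, 1.0)
        return F.softmax(scores.masked_fill(keep == 0, float('-inf')), dim=-1)
```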

AI Open, vol. 3 (2022), pp. 101-110. Open-access PDF: https://www.sciencedirect.com/science/article/pii/S2666651022000110/pdfft?md5=1bdde12e6a6cbde8b1220840197923b8&pid=1-s2.0-S2666651022000110-main.pdf
Citations: 3
Augmented and challenging datasets with multi-step reasoning and multi-span questions for Chinese judicial reading comprehension
Pub Date: 2022-01-01 | DOI: 10.1016/j.aiopen.2022.12.001
Qingye Meng, Ziyue Wang, Hang Chen, Xianzhen Luo, Baoxin Wang, Zhipeng Chen, Yiming Cui, Dayong Wu, Zhigang Chen, Shijin Wang
AI Open, vol. 3 (2022), pp. 193-199. Open-access PDF: https://www.sciencedirect.com/science/article/pii/S2666651022000225/pdfft?md5=b1c460292acbffd5098c88c36eca4487&pid=1-s2.0-S2666651022000225-main.pdf
Citations: 1
Survey: Transformer based video-language pre-training
Pub Date: 2022-01-01 | DOI: 10.1016/j.aiopen.2022.01.001
Ludan Ruan, Qin Jin

Inspired by the success of transformer-based pre-training methods on natural language tasks and, more recently, computer vision tasks, researchers have begun to apply Transformers to video processing. This survey provides a comprehensive overview of transformer-based pre-training methods for video-language learning. We first briefly introduce the Transformer architecture as background knowledge, including the attention mechanism, positional encoding, and so on. We then describe the typical pre-training and fine-tuning paradigm for video-language processing in terms of proxy tasks, downstream tasks, and commonly used video datasets. Next, we categorize Transformer models into single-stream and multi-stream structures, highlight their innovations, and compare their performance. Finally, we analyze and discuss the current challenges and possible future research directions for video-language pre-training.
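
As background for the surveyed models, the core attention mechanism amounts to a few lines; this is the generic scaled dot-product attention, not code from the survey:

```python
import math
import torch

def scaled_dot_product_attention(q, k, v, mask=None):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))
    if mask is not None:
        scores = scores.masked_fill(mask == 0, float('-inf'))  # hide masked positions
    return torch.softmax(scores, dim=-1) @ v
```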

AI Open, vol. 3 (2022), pp. 1-13. Open-access PDF: https://www.sciencedirect.com/science/article/pii/S2666651022000018/pdfft?md5=d7b4ae16eb4b58434223ebe8ccf64030&pid=1-s2.0-S2666651022000018-main.pdf
Citations: 24
Learning towards conversational AI: A survey
Pub Date: 2022-01-01 | DOI: 10.1016/j.aiopen.2022.02.001
Tingchen Fu, Shen Gao, Xueliang Zhao, Ji-rong Wen, Rui Yan

Recent years have witnessed a surge of interest in the field of open-domain dialogue. Thanks to the rapid development of social media, large dialogue corpora from the Internet provide a fundamental premise for data-driven dialogue models. Breakthroughs in neural networks have also brought new ideas to researchers in AI and NLP, and a great number of new techniques and methods have come into being. In this paper, we review some of the most representative works of recent years and divide the prevailing frameworks for dialogue models into three categories. We further analyze the development trend of open-domain dialogue and summarize the goals of an open-domain dialogue system in two respects: being informative and being controllable. The methods reviewed in this paper are selected according to our own perspective and are by no means complete; rather, we hope this survey can benefit the NLP community in future research on open-domain dialogue.

AI Open, vol. 3 (2022), pp. 14-28. Open-access PDF: https://www.sciencedirect.com/science/article/pii/S2666651022000079/pdfft?md5=a8c5cdae822d93f7d82a0ff336415b53&pid=1-s2.0-S2666651022000079-main.pdf
Citations: 15
The road from MLE to EM to VAE: A brief tutorial
Pub Date: 2022-01-01 | DOI: 10.1016/j.aiopen.2021.10.001
Ming Ding

Variational Auto-Encoders (VAEs) have emerged as one of the most popular genres of generative models, which are learned to characterize the data distribution. The classic Expectation Maximization (EM) algorithm aims to learn models with hidden variables. Essentially, both iteratively optimize the evidence lower bound (ELBO) to maximize the likelihood of the observed data.

This short tutorial connects them along a single line of reasoning and offers a good way to thoroughly understand EM and VAE with minimal prerequisites. It is especially helpful to beginners and to readers with experience in machine learning applications but no statistics background.
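
For reference, the shared objective can be stated compactly (standard material rather than an excerpt from the tutorial):

```latex
\[
\log p_\theta(x)
  = \underbrace{\mathbb{E}_{q_\phi(z \mid x)}\!\left[\log \frac{p_\theta(x, z)}{q_\phi(z \mid x)}\right]}_{\text{ELBO}}
  + \mathrm{KL}\!\left(q_\phi(z \mid x) \,\middle\|\, p_\theta(z \mid x)\right)
  \;\ge\;
  \mathbb{E}_{q_\phi(z \mid x)}\!\left[\log p_\theta(x \mid z)\right]
  - \mathrm{KL}\!\left(q_\phi(z \mid x) \,\middle\|\, p(z)\right).
\]
```

EM alternates between setting q to the exact posterior (driving the KL term to zero) and maximizing the ELBO over the model parameters, while a VAE amortizes q with an encoder network and ascends the same bound by stochastic gradients.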

AI Open, vol. 3 (2022), pp. 29-34. Open-access PDF: https://www.sciencedirect.com/science/article/pii/S2666651021000279/pdfft?md5=8f78a90e4fd74243d885b738de1fe94e&pid=1-s2.0-S2666651021000279-main.pdf
Citations: 6
Optimized separable convolution: Yet another efficient convolution operator
Pub Date: 2022-01-01 | DOI: 10.1016/j.aiopen.2022.10.002
Tao Wei, Yonghong Tian, Yaowei Wang, Yun Liang, Chang Wen Chen

The convolution operation is the most critical component in the recent surge of deep learning research. Conventional 2D convolution needs O(C^2 K^2) parameters to represent, where C is the channel size and K is the kernel size. The parameter count has become costly, considering that these parameters have grown tremendously in recent models to meet the needs of demanding applications. Among the various implementations of convolution, separable convolution has proven more efficient at reducing model size. For example, depthwise separable convolution reduces the complexity to O(C·(C + K^2)), while spatially separable convolution reduces it to O(C^2 K). However, these are ad hoc designs that cannot guarantee optimal separation in general. In this research, we propose a novel and principled operator called optimized separable convolution: by optimally designing the internal number of groups and the kernel sizes of general separable convolutions, it achieves a complexity of O(C^{3/2} K). When the restriction on the number of separated convolutions is lifted, an even lower complexity of O(C·log(C K^2)) can be achieved. Experimental results demonstrate that the proposed optimized separable convolution achieves improved accuracy-vs-#Params trade-offs over conventional, depthwise separable, and spatially separable convolutions.
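
To make these counts concrete, here is a small sketch with hypothetical sizes comparing a conventional convolution against a depthwise separable one; the printed parameter counts match the O(C^2 K^2) and O(C·(C + K^2)) figures above:

```python
import torch.nn as nn

C, K = 64, 3  # channel and kernel sizes (hypothetical)

conv = nn.Conv2d(C, C, K, padding=K // 2, bias=False)          # C*C*K*K parameters
depthwise_separable = nn.Sequential(
    nn.Conv2d(C, C, K, padding=K // 2, groups=C, bias=False),  # depthwise: C*K*K
    nn.Conv2d(C, C, 1, bias=False),                            # pointwise: C*C
)

count = lambda m: sum(p.numel() for p in m.parameters())
print(count(conv))                 # 36864 = 64*64*3*3    -> O(C^2 K^2)
print(count(depthwise_separable))  # 4672  = 64*(64+3*3)  -> O(C*(C + K^2))
```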

AI Open, vol. 3 (2022), pp. 162-171. Open-access PDF: https://www.sciencedirect.com/science/article/pii/S2666651022000158/pdfft?md5=53825a8ab2de46247d122c455ee0622b&pid=1-s2.0-S2666651022000158-main.pdf
Citations: 0
A survey of transformers
Pub Date: 2022-01-01 | DOI: 10.1016/j.aiopen.2022.10.001
Tianyang Lin, Yuxin Wang, Xiangyang Liu, Xipeng Qiu

Transformers have achieved great success in many artificial intelligence fields, such as natural language processing, computer vision, and audio processing, so it is natural that they attract strong interest from academic and industry researchers. To date, a great variety of Transformer variants (a.k.a. X-formers) have been proposed; however, a systematic and comprehensive literature review of these variants is still missing. In this survey, we provide a comprehensive review of the various X-formers. We first briefly introduce the vanilla Transformer and then propose a new taxonomy of X-formers. Next, we introduce the various X-formers from three perspectives: architectural modification, pre-training, and applications. Finally, we outline some potential directions for future research.
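
As context for the taxonomy, a vanilla (post-LN) Transformer encoder layer — the baseline that X-formers modify — looks roughly like this sketch:

```python
import torch.nn as nn

class EncoderLayer(nn.Module):
    """Vanilla post-LN encoder layer: self-attention then a position-wise FFN,
    each wrapped in a residual connection and layer normalization."""
    def __init__(self, d_model=512, n_heads=8, d_ff=2048, dropout=0.1):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads,
                                          dropout=dropout, batch_first=True)
        self.ffn = nn.Sequential(nn.Linear(d_model, d_ff), nn.ReLU(),
                                 nn.Linear(d_ff, d_model))
        self.norm1, self.norm2 = nn.LayerNorm(d_model), nn.LayerNorm(d_model)
        self.drop = nn.Dropout(dropout)

    def forward(self, x, pad_mask=None):            # x: (batch, seq, d_model)
        a, _ = self.attn(x, x, x, key_padding_mask=pad_mask)
        x = self.norm1(x + self.drop(a))            # residual + LayerNorm
        return self.norm2(x + self.drop(self.ffn(x)))
```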

AI Open, vol. 3 (2022), pp. 111-132. Open-access PDF: https://www.sciencedirect.com/science/article/pii/S2666651022000146/pdfft?md5=802c180f3454a2e26d638dce462d3dff&pid=1-s2.0-S2666651022000146-main.pdf
Citations: 431
Debiased recommendation with neural stratification
Pub Date: 2022-01-01 | DOI: 10.1016/j.aiopen.2022.11.005
Quanyu Dai, Zhenhua Dong, Xu Chen

Debiased recommender models have recently attracted increasing attention from the academic and industry communities. Existing models are mostly based on the technique of inverse propensity scoring (IPS). However, in the recommendation domain, IPS can be hard to estimate given the sparse and noisy nature of the observed user-item exposure data. To alleviate this problem, we assume in this paper that user preference can be dominated by a small number of latent factors, and propose to cluster the users so that more accurate IPS can be computed from the increased exposure densities. This method is similar in spirit to stratification models in applied statistics. However, unlike previous heuristic stratification strategies, we learn the clustering criterion by representing the users with low-rank embeddings, which are further shared with the user representations in the recommender model. Finally, we find that our model has strong connections with the previous two types of debiased recommender models. We conduct extensive experiments on real-world datasets to demonstrate the effectiveness of the proposed method.
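
As a generic illustration of the IPS principle these models build on (not the paper's neural stratification; the popularity-based propensity estimate and all names are illustrative), each observed interaction is re-weighted by its inverse estimated exposure probability:

```python
import torch
import torch.nn.functional as F

def popularity_propensity(item_ids, n_items):
    """Naive propensity estimate: item exposure frequency in the logged data."""
    counts = torch.bincount(item_ids, minlength=n_items).float()
    return counts / counts.max()

def ips_weighted_loss(scores, labels, item_ids, propensity):
    """Inverse-propensity-scored BCE: rare exposures get larger weights,
    which debiases learning toward the full (unobserved) exposure matrix."""
    w = 1.0 / propensity[item_ids].clamp(min=1e-3)  # clip to control variance
    bce = F.binary_cross_entropy_with_logits(scores, labels, reduction='none')
    return (w * bce).mean()
```

Sparse data makes such per-item estimates noisy, which is exactly the motivation the abstract gives for estimating propensities at the cluster level instead.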

AI Open, vol. 3 (2022), pp. 213-217. Open-access PDF: https://www.sciencedirect.com/science/article/pii/S2666651022000201/pdfft?md5=1244b2c9319c988375fcebe6f3172caa&pid=1-s2.0-S2666651022000201-main.pdf
Citations: 0
BCA: Bilinear Convolutional Neural Networks and Attention Networks for legal question answering
Pub Date: 2022-01-01 | DOI: 10.1016/j.aiopen.2022.11.002
Haiguang Zhang, Tongyue Zhang, Faxin Cao, Zhizheng Wang, Yuanyu Zhang, Yuanyuan Sun, Mark Anthony Vicente

The National Judicial Examination of China is an essential examination for selecting legal practitioners. In recent years, people have tried to use machine learning algorithms to answer its questions. With the proposal of JEC-QA (Zhong et al., 2020), the judicial examination has become a distinct legal task. The examination data contains two types of questions, i.e., Knowledge-Driven questions and Case-Analysis questions. Both require complex reasoning and text comprehension, making it challenging for computers to answer judicial examination questions. In this paper we propose Bilinear Convolutional Neural Networks and Attention Networks (BCA), an improved version of the model our team proposed for the judicial examination task of the Challenge of AI in Law 2021. It has two essential modules: a Knowledge-Driven Module (KDM) for local feature extraction and a Case-Analysis Module (CAM) for clarifying the semantic differences between the question stem and the options. We also add a post-processing module to correct the results in the final stage. The experimental results show that our system achieves state-of-the-art performance on the offline test of the judicial examination task.
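
The "Bilinear" in BCA refers to bilinear pooling of two feature maps; a standard bilinear-CNN pooling sketch (an assumption about the flavor of the module, not the authors' exact code) is:

```python
import torch
import torch.nn.functional as F

def bilinear_pool(f1, f2):
    """Bilinear pooling: average the outer product of two feature maps over
    positions, then apply signed square-root and L2 normalization."""
    # f1: (batch, c1, n), f2: (batch, c2, n) with n flattened positions
    phi = torch.bmm(f1, f2.transpose(1, 2)) / f1.size(-1)   # (batch, c1, c2)
    phi = phi.flatten(1)
    phi = phi.sign() * phi.abs().sqrt()                     # signed sqrt
    return F.normalize(phi, dim=1)                          # L2 normalization
```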

AI Open, vol. 3 (2022), pp. 172-181. Open-access PDF: https://www.sciencedirect.com/science/article/pii/S2666651022000171/pdfft?md5=7fc8cf53d6ea6be2b3999607b407f336&pid=1-s2.0-S2666651022000171-main.pdf
Citations: 1