
Latest publications in AI Open

WuDaoCorpora: A super large-scale Chinese corpora for pre-training language models
Pub Date : 2021-01-01 DOI: 10.1016/j.aiopen.2021.06.001
Sha Yuan , Hanyu Zhao , Zhengxiao Du , Ming Ding , Xiao Liu , Yukuo Cen , Xu Zou , Zhilin Yang , Jie Tang

Using large-scale training data to build a pre-trained language model (PLM) with a larger volume of parameters can significantly improve downstream tasks. For example, OpenAI trained the GPT-3 model, with 175 billion parameters, on 570 GB of English training data, enabling downstream applications to be built with only a small number of samples. However, there is a lack of Chinese corpora to support large-scale PLMs. This paper introduces WuDaoCorpora, a super large-scale Chinese corpus containing about 3 TB of training data and 1.08 trillion Chinese characters. We also release the base version of WuDaoCorpora, containing about 200 GB of training data and 72 billion Chinese characters. As a baseline, we train a Transformer-XL model with 3 billion parameters on the base version to test the corpus's effect. The results show that models trained on this corpus achieve excellent performance on Chinese tasks. The data and model are available at https://data.wudaoai.cn and https://github.com/THUDM/Chinese-Transformer-XL, respectively.
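As a rough illustration of the character-count statistics quoted above (trillions of Chinese characters, as opposed to byte sizes), a corpus shard can be scanned for CJK characters. This is a minimal sketch; the regex range and the sample string are illustrative assumptions, not part of the WuDaoCorpora pipeline:

```python
import re

# Matches the basic CJK Unified Ideographs block only; extension
# blocks are omitted for brevity in this sketch.
CJK = re.compile(r'[\u4e00-\u9fff]')

def count_chinese_chars(text: str) -> int:
    """Count CJK ideographs in a text shard (punctuation excluded)."""
    return len(CJK.findall(text))

shard = "本文介绍了一个超大规模的中文语料库。It mixes English too."
print(count_chinese_chars(shard))  # 17
```

A real pipeline would stream shards from disk and aggregate counts, but the per-shard logic is the same.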

AI Open, Volume 2 (2021), Pages 65-68.
Citations: 54
Know what you don't need: Single-Shot Meta-Pruning for attention heads
Pub Date : 2021-01-01 DOI: 10.1016/j.aiopen.2021.05.003
Zhengyan Zhang , Fanchao Qi , Zhiyuan Liu , Qun Liu , Maosong Sun

Deep pre-trained Transformer models have achieved state-of-the-art results over a variety of natural language processing (NLP) tasks. By learning rich language knowledge with millions of parameters, these models are usually overparameterized and significantly increase the computational overhead in applications. It is intuitive to address this issue by model compression. In this work, we propose a method, called Single-Shot Meta-Pruning, to compress deep pre-trained Transformers before fine-tuning. Specifically, we focus on pruning unnecessary attention heads adaptively for different downstream tasks. To measure the informativeness of attention heads, we train our Single-Shot Meta-Pruner (SMP) with a meta-learning paradigm aiming to maintain the distribution of text representations after pruning. Compared with existing compression methods for pre-trained models, our method can reduce the overhead of both fine-tuning and inference. Experimental results show that our pruner can selectively prune 50% of attention heads with little impact on the performance on downstream tasks and even provide better text representations. The source code is available at https://github.com/thunlp/SMP.
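The final selection step of head pruning can be sketched as follows: given a per-head importance score, keep the top 50% and mask the rest. This shows only the masking stage under the assumption that scores are already available; SMP itself learns those scores with meta-learning, which is not reproduced here:

```python
import numpy as np

def prune_heads(scores: np.ndarray, keep_ratio: float = 0.5) -> np.ndarray:
    """Return a 0/1 mask keeping the top `keep_ratio` fraction of heads.

    `scores` has shape (layers, heads); higher means more informative.
    """
    k = int(round(scores.size * keep_ratio))
    flat = scores.ravel()
    keep = np.argsort(flat)[-k:]     # indices of the k highest-scoring heads
    mask = np.zeros_like(flat)
    mask[keep] = 1.0
    return mask.reshape(scores.shape)

# Hypothetical scores for a 2-layer, 4-head model.
scores = np.array([[0.9, 0.1, 0.4, 0.8],
                   [0.2, 0.7, 0.3, 0.6]])
print(prune_heads(scores))
# [[1. 0. 0. 1.]
#  [0. 1. 0. 1.]]
```

The resulting mask would typically be applied to the attention output before fine-tuning, so that the pruned heads never enter the computation.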

AI Open, Volume 2 (2021), Pages 36-42.
Citations: 19
Rule-based data augmentation for knowledge graph embedding
Pub Date : 2021-01-01 DOI: 10.1016/j.aiopen.2021.09.003
Guangyao Li , Zequn Sun , Lei Qian , Qiang Guo , Wei Hu

Knowledge graph (KG) embedding models suffer from the incompleteness issue of observed facts. Different from existing solutions that incorporate additional information or employ expressive and complex embedding techniques, we propose to augment KGs by iteratively mining logical rules from the observed facts and then using the rules to generate new relational triples. We incrementally train KG embeddings with the coming of new augmented triples, and leverage the embeddings to validate these new triples. To guarantee the quality of the augmented data, we filter out the noisy triples based on a propagation mechanism during the validation. The mined rules and rule groundings are human-understandable, and can make the augmentation procedure reliable. Our KG augmentation framework is applicable to any KG embedding models with no need to modify their embedding techniques. Our experiments on two popular embedding-based tasks (i.e., entity alignment and link prediction) show that the proposed framework can bring significant improvement to existing KG embedding models on most benchmark datasets.
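The triple-generation step can be illustrated with a single Horn rule r1(x, y) ∧ r2(y, z) ⇒ r3(x, z) applied to a toy KG. The rule, entities, and relations below are invented for illustration; the paper's rule mining, incremental embedding training, and propagation-based noise filtering are all omitted:

```python
def apply_rule(triples, body1, body2, head):
    """Generate new (h, r, t) triples entailed by one two-atom Horn rule.

    body1(x, y) and body2(y, z) imply head(x, z).
    """
    by_rel = {}
    for h, r, t in triples:
        by_rel.setdefault(r, []).append((h, t))
    new = set()
    for x, y in by_rel.get(body1, []):
        for y2, z in by_rel.get(body2, []):
            if y == y2 and (x, head, z) not in triples:
                new.add((x, head, z))
    return new

kg = {("Alice", "born_in", "Lyon"), ("Lyon", "city_of", "France")}
print(apply_rule(kg, "born_in", "city_of", "nationality"))
# {('Alice', 'nationality', 'France')}
```

In the full framework, candidates like this would then be validated by the current embeddings before being added to the training set.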

AI Open, Volume 2 (2021), Pages 186-196.
Citations: 6
AI-driven drug discovery: A boon against COVID-19?
Pub Date : 2020-01-01 DOI: 10.1016/j.aiopen.2020.07.001
Aman Chandra Kaushik , Utkarsh Raj

COVID-19 is an issue of international concern and a threat to public health, and there is an urgent need for drug/vaccine design. As of July 23, 2020, no vaccine or specific drug has been made for the coronavirus disease (COVID-19). Thus, patients can currently only be treated symptomatically. Quick identification of drugs for COVID-19 that have already been used in patients may provide a potential therapeutic medication to answer the present pandemic condition before it gets worse. In our view, an artificial intelligence (AI) based tool may predict drugs/peptides directly from the sequences of infected patients; such candidates might have better affinity with the target and contribute to vaccine design against COVID-19. Researchers across the world have proposed several vaccines/drugs for COVID-19 using AI-based approaches; however, testing of these proposed vaccines/drugs will be needed to verify their safety and feasibility for combating COVID-19.

AI Open, Volume 1 (2020), Pages 1-4.
Citations: 22
Extracting Events and Their Relations from Texts: A Survey on Recent Research Progress and Challenges
Pub Date : 2020-01-01 DOI: 10.1016/j.aiopen.2021.02.004
Kang Liu , Yubo Chen , Jian Liu , Xinyu Zuo , Jun Zhao

Event is a common but non-negligible knowledge type. How to identify events from texts, extract their arguments, and even analyze the relations between different events is important for many applications. This paper summarizes some constructed event-centric knowledge graphs and recent typical approaches for event and event-relation extraction, besides task descriptions, widely used evaluation datasets, and challenges. Specifically, in the event extraction task, we mainly focus on three recent important research problems: 1) how to learn textual semantic representations for events in sentence-level event extraction; 2) how to extract relations across sentences or at the document level; 3) how to acquire or augment labeled instances for model training. In event relation extraction, we focus on extraction approaches for three typical event relation types: coreference, causal, and temporal relations. Finally, we present our conclusions and potential future research issues.

AI Open, Volume 1 (2020), Pages 22-39.
Citations: 22
Graph neural networks: A review of methods and applications
Pub Date : 2020-01-01 DOI: 10.1016/j.aiopen.2021.01.001
Jie Zhou , Ganqu Cui , Shengding Hu , Zhengyan Zhang , Cheng Yang , Zhiyuan Liu , Lifeng Wang , Changcheng Li , Maosong Sun

Lots of learning tasks require dealing with graph data which contains rich relation information among elements. Modeling physics systems, learning molecular fingerprints, predicting protein interface, and classifying diseases demand a model to learn from graph inputs. In other domains such as learning from non-structural data like texts and images, reasoning on extracted structures (like the dependency trees of sentences and the scene graphs of images) is an important research topic which also needs graph reasoning models. Graph neural networks (GNNs) are neural models that capture the dependence of graphs via message passing between the nodes of graphs. In recent years, variants of GNNs such as graph convolutional network (GCN), graph attention network (GAT), graph recurrent network (GRN) have demonstrated ground-breaking performances on many deep learning tasks. In this survey, we propose a general design pipeline for GNN models and discuss the variants of each component, systematically categorize the applications, and propose four open problems for future research.
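The message-passing idea behind GNNs can be sketched as one simplified graph-convolution step: mean-aggregate neighbor features (with self-loops), then apply a linear map and ReLU. This is a toy, row-normalized variant for illustration, not the exact symmetric normalization used by GCN:

```python
import numpy as np

def gcn_layer(A: np.ndarray, H: np.ndarray, W: np.ndarray) -> np.ndarray:
    """One message-passing step: H' = ReLU(D^-1 (A + I) H W).

    A: (n, n) adjacency matrix, H: (n, d) node features,
    W: (d, d') learnable weight matrix.
    """
    A_hat = A + np.eye(A.shape[0])             # add self-loops
    D_inv = np.diag(1.0 / A_hat.sum(axis=1))   # row-normalize (mean aggregation)
    return np.maximum(D_inv @ A_hat @ H @ W, 0.0)

A = np.array([[0.0, 1.0], [1.0, 0.0]])  # two connected nodes
H = np.eye(2)                           # one-hot initial features
W = np.eye(2)                           # identity weights for the sketch
print(gcn_layer(A, H, W))
# [[0.5 0.5]
#  [0.5 0.5]]
```

Stacking such layers lets each node's representation absorb information from progressively larger neighborhoods, which is the core mechanism the survey categorizes across GCN, GAT, and GRN variants.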

AI Open, Volume 1 (2020), Pages 57-81.
Citations: 3407
AI OPEN inaugural editorial
Pub Date : 2020-01-01 DOI: 10.1016/j.aiopen.2021.04.001
Jie Tang (Professor and Associate Chair)
AI Open, Volume 1 (2020), Page A1.
Citations: 0
Neural machine translation: A review of methods, resources, and tools
Pub Date : 2020-01-01 DOI: 10.1016/j.aiopen.2020.11.001
Zhixing Tan , Shuo Wang , Zonghan Yang , Gang Chen , Xuancheng Huang , Maosong Sun , Yang Liu

Machine translation (MT) is an important sub-field of natural language processing that aims to translate natural languages using computers. In recent years, end-to-end neural machine translation (NMT) has achieved great success and has become the new mainstream method in practical MT systems. In this article, we first provide a broad review of the methods for NMT and focus on methods relating to architectures, decoding, and data augmentation. Then we summarize the resources and tools that are useful for researchers. Finally, we conclude with a discussion of possible future research directions.
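Among the decoding methods such a review covers, beam search is the standard at inference time: keep the `beam_size` highest-scoring partial translations at each step. A toy sketch follows; the vocabulary and probability table are invented for illustration, and a real NMT decoder would score continuations with the model's conditional distribution:

```python
import math

def beam_search(step_logprobs, beam_size=2, max_len=3, eos="</s>"):
    """Toy beam search. `step_logprobs(prefix)` -> {token: logprob}."""
    beams = [([], 0.0)]  # (token sequence, cumulative log-probability)
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            if seq and seq[-1] == eos:          # finished hypotheses pass through
                candidates.append((seq, score))
                continue
            for tok, lp in step_logprobs(seq).items():
                candidates.append((seq + [tok], score + lp))
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_size]
    return beams[0][0]

# Hypothetical next-token distributions for a tiny language.
table = {
    (): {"hello": math.log(0.6), "hi": math.log(0.4)},
    ("hello",): {"world": math.log(0.9), "</s>": math.log(0.1)},
    ("hi",): {"there": math.log(0.95), "</s>": math.log(0.05)},
    ("hello", "world"): {"</s>": math.log(1.0)},
    ("hi", "there"): {"</s>": math.log(1.0)},
}
print(beam_search(lambda seq: table[tuple(seq)]))
# ['hello', 'world', '</s>']
```

Production decoders add refinements such as length normalization, which this sketch leaves out.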

AI Open, Volume 1 (2020), Pages 5-21.
Citations: 60
User behavior modeling for Web search evaluation
Pub Date : 2020-01-01 DOI: 10.1016/j.aiopen.2021.02.003
Fan Zhang , Yiqun Liu , Jiaxin Mao , Min Zhang , Shaoping Ma

Search engines are widely used in our daily life. Batch evaluation of the performance of search systems for their users has always been an essential issue in the field of information retrieval. However, batch evaluation, which usually compares different search systems based on offline collections, cannot directly take users' perception of the systems into consideration. Recently, substantial studies have focused on proposing effective evaluation metrics that model user behavior, bringing human factors into the loop of Web search evaluation. In this survey, we comprehensively review the development of user behavior modeling for Web search evaluation and related work on different model-based evaluation metrics. From the overview of these metrics, we can see how the assumptions and modeling methods of user behavior have evolved over time. We also show methods to compare the performance of model-based evaluation metrics in terms of modeling user behavior and measuring user satisfaction. Finally, we briefly discuss some potential future research directions in this field.
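One classic example of a metric built on a user behavior model is Expected Reciprocal Rank (ERR), which assumes a cascade model: the user scans results top-down and stops at rank r with a probability derived from graded relevance. A minimal sketch, using the common gain mapping (2^g - 1) / 2^g_max (an assumption of this sketch, not a claim about the survey's formulation):

```python
def err(relevance_grades, max_grade=4):
    """Expected Reciprocal Rank under a cascade user model.

    The user stops at rank r with probability R_r * prod_{i<r}(1 - R_i),
    where R_i = (2^g_i - 1) / 2^max_grade, and the utility of stopping
    at rank r is 1/r.
    """
    score, p_reach = 0.0, 1.0  # p_reach: probability of reaching rank r
    for r, g in enumerate(relevance_grades, start=1):
        R = (2 ** g - 1) / (2 ** max_grade)   # stop probability at rank r
        score += p_reach * R / r
        p_reach *= (1 - R)
    return score

# A highly relevant first result dominates the score.
print(round(err([4, 0, 2]), 4))  # 0.9414
```

Metrics like this can then be compared against logged user behavior and satisfaction judgments, which is exactly the kind of meta-evaluation the survey describes.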

AI Open, Volume 1 (2020), Pages 40-56.
Citations: 7
Augmented and challenging datasets with multi-step reasoning and multi-span questions for Chinese judicial reading comprehension
Pub Date : 1900-01-01 DOI: 10.1016/j.aiopen.2022.12.001
Qingye Meng, Ziyue Wang, Hang Chen, Xianzhen Luo, Baoxin Wang, Zhipeng Chen, Yiming Cui, Dayong Wu, Zhigang Chen, Shijin Wang
AI Open, Pages 193-199.
Citations: 1