
Proceedings of COLING. International Conference on Computational Linguistics: Latest Publications

MaxMatch-Dropout: Subword Regularization for WordPiece
Pub Date: 2022-09-09 | DOI: 10.48550/arXiv.2209.04126 | Pages: 4864-4872
Tatsuya Hiraoka
We present a subword regularization method for WordPiece, which uses a maximum matching algorithm for tokenization. The proposed method, MaxMatch-Dropout, randomly drops words in a search using the maximum matching algorithm. It enables fine-tuning with subword regularization for popular pretrained language models such as BERT-base. The experimental results demonstrate that MaxMatch-Dropout improves the performance of text classification and machine translation tasks, as do other subword regularization methods. Moreover, we provide a comparative analysis of subword regularization methods: subword regularization with SentencePiece (Unigram), BPE-Dropout, and MaxMatch-Dropout.
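The core mechanism is easy to sketch. Below is a minimal, self-contained illustration of greedy maximum matching with random match dropping (not the authors' implementation); the WordPiece-style `##` continuation prefix and the toy vocabulary are assumptions for the example. With dropout=0.0 it reduces to ordinary deterministic MaxMatch.

```python
import random

def maxmatch_dropout(text, vocab, dropout=0.0, unk="[UNK]"):
    """Greedy longest-match-first (WordPiece-style) tokenization.

    With probability `dropout`, a multi-character match is rejected and a
    shorter candidate is tried instead, sampling alternative segmentations;
    dropout=0.0 recovers deterministic maximum matching."""
    tokens, i = [], 0
    while i < len(text):
        match = None
        for j in range(len(text), i, -1):  # longest candidate first
            piece = text[i:j] if i == 0 else "##" + text[i:j]
            if piece in vocab:
                if j - i > 1 and random.random() < dropout:
                    continue  # randomly drop this match
                match, i = piece, j
                break
        if match is None:  # no vocabulary entry covers this character
            match, i = unk, i + 1
        tokens.append(match)
    return tokens

vocab = {"un", "##aff", "##able", "u", "##n", "##a", "##f", "##b", "##l", "##e"}
print(maxmatch_dropout("unaffable", vocab))               # deterministic
print(maxmatch_dropout("unaffable", vocab, dropout=0.5))  # sampled variant
```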
Citations: 4
MICO: Selective Search with Mutual Information Co-training
Pub Date: 2022-09-09 | DOI: 10.48550/arXiv.2209.04378 | Pages: 1179-1192
Zhanyu Wang, Xiao Zhang, Hyokun Yun, C. Teo, Trishul M. Chilimbi
In contrast to traditional exhaustive search, selective search first clusters documents into several groups before all the documents are searched exhaustively by a query, limiting the search to one group or only a few groups. Selective search is designed to reduce the latency and computation in modern large-scale search systems. In this study, we propose MICO, a Mutual Information CO-training framework for selective search with minimal supervision using the search logs. After training, MICO not only clusters the documents but also routes unseen queries to the relevant clusters for efficient retrieval. In our empirical experiments, MICO significantly improves the performance on multiple metrics of selective search and outperforms a number of existing competitive baselines.
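The selective-search pipeline itself is simple to sketch. The toy example below clusters documents offline and routes each query to a few clusters at search time; the random embeddings and spherical k-means stand in for MICO's co-trained document and query encoders, which this sketch does not reproduce.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy unit-norm embeddings standing in for a trained encoder.
doc_emb = rng.normal(size=(1000, 64)).astype(np.float32)
doc_emb /= np.linalg.norm(doc_emb, axis=1, keepdims=True)

# Offline step: cluster documents into k groups (spherical k-means).
k = 8
centroids = doc_emb[rng.choice(len(doc_emb), k, replace=False)].copy()
for _ in range(10):
    assign = np.argmax(doc_emb @ centroids.T, axis=1)
    for c in range(k):
        members = doc_emb[assign == c]
        if len(members):
            centroids[c] = members.mean(axis=0)
            centroids[c] /= np.linalg.norm(centroids[c])

def selective_search(query_emb, top_clusters=2, top_docs=5):
    """Route the query to a few clusters, then search only within them."""
    routed = np.argsort(query_emb @ centroids.T)[::-1][:top_clusters]
    candidates = np.where(np.isin(assign, routed))[0]
    scores = doc_emb[candidates] @ query_emb
    return candidates[np.argsort(scores)[::-1][:top_docs]]

query = rng.normal(size=64).astype(np.float32)
print(selective_search(query / np.linalg.norm(query)))
```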
Citations: 0
Adapting to Non-Centered Languages for Zero-shot Multilingual Translation
Pub Date: 2022-09-09 | DOI: 10.48550/arXiv.2209.04138 | Pages: 5251-5265
Zhi Qu, Taro Watanabe
Multilingual neural machine translation can translate language pairs unseen during training, i.e., zero-shot translation. However, zero-shot translation is always unstable. Although prior works attributed the instability to the domination of a central language, e.g., English, we supplement this viewpoint with the strict dependence of non-centered languages on the central one. In this work, we propose a simple, lightweight yet effective language-specific modeling method that adapts to non-centered languages and combines shared information with language-specific information to counteract the instability of zero-shot translation. Experiments with Transformer on the IWSLT17, Europarl, TED talks, and OPUS-100 datasets show that our method not only performs better than strong baselines in centered data conditions but can also easily fit non-centered data conditions. By further investigating layer attribution, we show that our proposed method can disentangle the coupled representation in the correct direction.
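As a rough illustration of combining shared and language-specific information, the hypothetical PyTorch module below adds a small per-language projection on top of a shared one; the paper's actual architecture and injection points may differ.

```python
import torch
import torch.nn as nn

class SharedPlusLanguageSpecific(nn.Module):
    """Hypothetical layer mixing shared and language-specific features:
    one projection is shared by all languages, and a per-language
    projection is added on top, so non-centered languages are not forced
    through purely shared (center-dominated) parameters."""

    def __init__(self, d_model, num_langs):
        super().__init__()
        self.shared = nn.Linear(d_model, d_model)
        self.lang_specific = nn.ModuleList(
            nn.Linear(d_model, d_model, bias=False) for _ in range(num_langs)
        )

    def forward(self, x, lang_id):
        return self.shared(x) + self.lang_specific[lang_id](x)

layer = SharedPlusLanguageSpecific(d_model=16, num_langs=4)
h = torch.randn(2, 5, 16)          # (batch, seq_len, d_model)
print(layer(h, lang_id=2).shape)   # torch.Size([2, 5, 16])
```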
Citations: 2
OneEE: A One-Stage Framework for Fast Overlapping and Nested Event Extraction
Pub Date: 2022-09-06 | DOI: 10.48550/arXiv.2209.02693 | Pages: 1953-1964
H. Cao, Jingye Li, Fangfang Su, Fei Li, Hao Fei, Shengqiong Wu, Bobo Li, Liang Zhao, Donghong Ji
Event extraction (EE) is an essential task of information extraction, which aims to extract structured event information from unstructured text. Most prior work focuses on extracting flat events while neglecting overlapped or nested ones. The few models for overlapped and nested EE include several successive stages to extract event triggers and arguments, which suffer from error propagation. Therefore, we design a simple yet effective tagging scheme and model, called OneEE, that formulates EE as word-word relation recognition. The relations between trigger and argument words are recognized simultaneously in one stage with parallel grid tagging, yielding a very fast event extraction speed. The model is equipped with an adaptive event fusion module to generate event-aware representations and a distance-aware predictor to integrate relative distance information for word-word relation recognition, both of which are empirically demonstrated to be effective mechanisms. Experiments on 3 overlapped and nested EE benchmarks, namely FewFC, Genia11, and Genia13, show that OneEE achieves state-of-the-art (SOTA) results. Moreover, the inference speed of OneEE is faster than those of the baselines under the same conditions and can be further substantially improved since it supports parallel inference.
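The word-word grid behind the tagging scheme can be pictured with a toy example. Below, a hypothetical two-label inventory marks trigger-argument links in a single matrix; the real model predicts richer relation labels, but the point is that every cell is filled in one parallel pass.

```python
import numpy as np

words = ["Quake", "hit", "Tokyo", "on", "Friday"]
NONE, TRIG_ARG = 0, 1   # hypothetical label inventory for brevity

# One cell per word pair; all cells are predicted in the same forward
# pass, which is what makes the extraction one-stage and parallel.
grid = np.zeros((len(words), len(words)), dtype=int)
grid[1, 2] = TRIG_ARG   # trigger "hit" -> argument "Tokyo"
grid[1, 4] = TRIG_ARG   # trigger "hit" -> argument "Friday"

# Overlapping/nested events would simply occupy further cells of the
# same grid instead of requiring extra extraction stages.
for i, j in zip(*np.nonzero(grid)):
    print(f"{words[i]} -> {words[j]} (relation {grid[i, j]})")
```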
Citations: 15
CONCRETE: Improving Cross-lingual Fact-checking with Cross-lingual Retrieval
Pub Date: 2022-09-05 | DOI: 10.48550/arXiv.2209.02071 | Pages: 1024-1035
Kung-Hsiang Huang, Chengxiang Zhai, Heng Ji
Fact-checking has gained increasing attention due to the widespread dissemination of falsified information. Most fact-checking approaches focus only on claims made in English, due to the data scarcity issue in other languages. The lack of fact-checking datasets in low-resource languages calls for an effective cross-lingual transfer technique for fact-checking. Additionally, trustworthy information in different languages can be complementary and helpful in verifying facts. To this end, we present the first fact-checking framework augmented with cross-lingual retrieval that aggregates evidence retrieved from multiple languages through a cross-lingual retriever. Given the absence of cross-lingual information retrieval datasets with claim-like queries, we train the retriever with our proposed Cross-lingual Inverse Cloze Task (X-ICT), a self-supervised algorithm that creates training instances by translating the title of a passage. The goal of X-ICT is to learn cross-lingual retrieval in which the model learns to identify the passage corresponding to a given translated title. On the X-Fact dataset, our approach achieves a 2.23% absolute F1 improvement in the zero-shot cross-lingual setup over prior systems. The source code and data are publicly available at https://github.com/khuangaf/CONCRETE.
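The X-ICT data construction is straightforward to sketch: pair each passage with its title translated into another language and treat the pair as a positive retrieval example. The `translate` stub below is a placeholder for any MT system, and the field names are illustrative assumptions.

```python
def translate(text, src_lang, tgt_lang):
    """Placeholder for a real MT system; X-ICT only needs its output."""
    return f"[{tgt_lang}] {text}"

def make_xict_pairs(passages, tgt_langs):
    """Each (translated title, passage body) pair is a positive example
    for the cross-lingual retriever; other passages in the same batch
    can serve as in-batch negatives."""
    pairs = []
    for p in passages:
        for lang in tgt_langs:
            query = translate(p["title"], p["lang"], lang)
            pairs.append((query, p["body"]))
    return pairs

passages = [{"title": "Mount Fuji", "lang": "en",
             "body": "Mount Fuji is the highest mountain in Japan ..."}]
for query, body in make_xict_pairs(passages, ["de", "zh"]):
    print(query, "->", body[:40])
```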
Citations: 8
Rare but Severe Neural Machine Translation Errors Induced by Minimal Deletion: An Empirical Study on Chinese and English
Pub Date: 2022-09-05 | DOI: 10.48550/arXiv.2209.02145 | Pages: 5175-5180
Ruikang Shi, Alvin Grissom II, Duc Minh Trinh
We examine the inducement of rare but severe errors in English-Chinese and Chinese-English in-domain neural machine translation by minimal deletion of source text with character-based models. By deleting a single character, we can induce severe translation errors. We categorize these errors and compare the results of deleting single characters and single words. We also examine the effect of training data size on the number and types of pathological cases induced by these minimal perturbations, finding significant variation. We find that deleting a word hurts overall translation score more than deleting a character, but certain errors are more likely to occur when deleting characters, with language direction also influencing the effect.
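The perturbations themselves are easy to enumerate. The sketch below produces every input reachable by deleting one character or one word from a source sentence; each variant would then be translated and its output compared against the unperturbed translation to surface pathological cases.

```python
def single_deletions(sentence, unit="char"):
    """All inputs reachable by deleting exactly one character or one word."""
    if unit == "char":
        return [sentence[:i] + sentence[i + 1:] for i in range(len(sentence))]
    words = sentence.split()
    return [" ".join(words[:i] + words[i + 1:]) for i in range(len(words))]

src = "The cat sat on the mat"
for variant in single_deletions(src, unit="word"):
    print(variant)  # translate each variant and diff against the original output
```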
Citations: 1
Multi-Figurative Language Generation
Pub Date: 2022-09-05 | DOI: 10.48550/arXiv.2209.01835 | Pages: 5939-5954
Huiyuan Lai, M. Nissim
Figurative language generation is the task of reformulating a given text in the desired figure of speech while still being faithful to the original context. We take the first step towards multi-figurative language modelling by providing a benchmark for the automatic generation of five common figurative forms in English. We train mFLAG employing a scheme for multi-figurative language pre-training on top of BART, and a mechanism for injecting the target figurative information into the encoder; this enables the generation of text with the target figurative form from another figurative form without parallel figurative-figurative sentence pairs. Our approach outperforms all strong baselines. We also offer some qualitative analysis and reflections on the relationship between the different figures of speech.
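As a simplified stand-in for the paper's encoder-injection mechanism, the sketch below conditions generation by prepending a control token naming the target figure of speech. The token inventory is a hypothetical example, and mFLAG injects this signal inside the encoder rather than into the input text.

```python
# Hypothetical inventory of control tokens, one per target figure of speech.
FIGURES = {"literal", "metaphor", "simile", "hyperbole", "idiom", "sarcasm"}

def to_figurative_input(text, target_figure):
    """Condition generation on the desired figure of speech by prefixing
    a control token; the conditioned input is then fed to a seq2seq
    model such as BART."""
    assert target_figure in FIGURES, f"unknown figure: {target_figure}"
    return f"<{target_figure}> {text}"

print(to_figurative_input("He ran very fast.", "hyperbole"))
# -> "<hyperbole> He ran very fast."
```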
Citations: 1
Informative Language Representation Learning for Massively Multilingual Neural Machine Translation
Pub Date: 2022-09-04 | DOI: 10.48550/arXiv.2209.01530 | Pages: 5158-5174
Renren Jin, Deyi Xiong
In a multilingual neural machine translation model that fully shares parameters across all languages, an artificial language token is usually used to guide translation into the desired target language. However, recent studies show that prepending language tokens sometimes fails to steer the multilingual neural machine translation model in the right translation direction, especially for zero-shot translation. To mitigate this issue, we propose two methods, language embedding embodiment and language-aware multi-head attention, to learn informative language representations that channel translation in the right direction. The former embodies language embeddings at different critical switching points along the information flow from the source to the target, aiming at amplifying translation-direction-guiding signals. The latter exploits a matrix, instead of a vector, to represent a language in the continuous space. The matrix is chunked into multiple heads so as to learn language representations in multiple subspaces. Experimental results on two datasets for massively multilingual neural machine translation demonstrate that language-aware multi-head attention benefits both supervised and zero-shot translation and significantly alleviates the off-target translation issue. Further linguistic typology prediction experiments show that matrix-based language representations learned by our methods are capable of capturing rich linguistic typology features.
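A rough sketch of the matrix idea: each language owns a (num_heads x head_dim) matrix whose rows act as per-head biases, so every attention head sees its own slice of the language representation. The injection point (here, the attention keys) and the dimensions are assumptions for illustration, not the paper's exact design.

```python
import torch
import torch.nn as nn

class LanguageAwareHeads(nn.Module):
    """Represent each language as a matrix, chunk it into one vector per
    attention head, and add that vector to the per-head keys."""

    def __init__(self, num_langs, num_heads, head_dim):
        super().__init__()
        # One (num_heads x head_dim) matrix per language.
        self.lang_mat = nn.Parameter(
            0.02 * torch.randn(num_langs, num_heads, head_dim))

    def forward(self, keys, lang_id):
        # keys: (batch, num_heads, seq_len, head_dim)
        bias = self.lang_mat[lang_id]          # (num_heads, head_dim)
        return keys + bias[None, :, None, :]   # broadcast over batch and seq

mod = LanguageAwareHeads(num_langs=10, num_heads=8, head_dim=64)
keys = torch.randn(2, 8, 7, 64)
print(mod(keys, lang_id=3).shape)   # torch.Size([2, 8, 7, 64])
```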
Citations: 4
Exploiting Hybrid Semantics of Relation Paths for Multi-hop Question Answering over Knowledge Graphs
Pub Date: 2022-09-02 | DOI: 10.48550/arXiv.2209.00870 | Pages: 1813-1822
Zile Qiao, Wei Ye, Tong Zhang, Tong Mo, Weiping Li, Shikun Zhang
Answering natural language questions over knowledge graphs (KGQA) remains a great challenge in terms of understanding complex questions via multi-hop reasoning. Previous efforts usually exploit large-scale entity-related text corpora or knowledge graph (KG) embeddings as auxiliary information to facilitate answer selection. However, the rich semantics implied in off-the-shelf relation paths between entities is far from well explored. This paper proposes improving multi-hop KGQA by exploiting relation paths' hybrid semantics. Specifically, we integrate explicit textual information and implicit KG structural features of relation paths based on a novel rotate-and-scale entity link prediction framework. Extensive experiments on three existing KGQA datasets demonstrate the superiority of our method, especially in multi-hop scenarios. Further investigation confirms our method's systematic coordination between questions and relation paths to identify answer entities.
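The rotate-and-scale scoring idea can be written down compactly. Under the assumption that it extends RotatE-style rotation in complex space with an element-wise scaling term (the paper's exact parameterization may differ), a toy scoring function looks like this:

```python
import torch

def rotate_and_scale_score(head, rel_phase, rel_scale, tail):
    """Score a (head, relation, tail) triple: rotate the head entity in
    complex space by the relation phase (as in RotatE), additionally
    scale it element-wise, then take the negative distance to the tail."""
    rotation = torch.polar(rel_scale, rel_phase)      # scale * e^{i*phase}
    return -torch.linalg.vector_norm(head * rotation - tail)

d = 8
head = torch.randn(d, dtype=torch.cfloat)
tail = torch.randn(d, dtype=torch.cfloat)
rel_phase = 2 * torch.pi * torch.rand(d)   # rotation angles
rel_scale = 0.5 + torch.rand(d)            # positive scaling factors
print(rotate_and_scale_score(head, rel_phase, rel_scale, tail))
```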
Citations: 3
Multi-modal Contrastive Representation Learning for Entity Alignment
Pub Date: 2022-09-02 | DOI: 10.48550/arXiv.2209.00891 | Pages: 2572-2584
Zhenxi Lin, Ziheng Zhang, Meng Wang, Yinghui Shi, Xianlong Wu, Yefeng Zheng
Multi-modal entity alignment aims to identify equivalent entities between two different multi-modal knowledge graphs, which consist of structural triples and images associated with entities. Most previous works focus on how to utilize and encode information from different modalities, yet leveraging multi-modal knowledge in entity alignment is not trivial because of the modality heterogeneity. In this paper, we propose MCLEA, a Multi-modal Contrastive Learning based Entity Alignment model, to obtain effective joint representations for multi-modal entity alignment. Different from previous works, MCLEA considers task-oriented modality and models the inter-modal relationships for each entity representation. In particular, MCLEA first learns multiple individual representations from multiple modalities, and then performs contrastive learning to jointly model intra-modal and inter-modal interactions. Extensive experimental results show that MCLEA outperforms state-of-the-art baselines on public datasets under both supervised and unsupervised settings.
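The contrastive backbone of such a model is typically a symmetric InfoNCE objective between modality-specific views of the same entities. The sketch below illustrates that loss on random embeddings; it is a generic instance, not MCLEA's full training objective (which also models intra-modal interactions).

```python
import torch
import torch.nn.functional as F

def symmetric_infonce(view_a, view_b, temperature=0.1):
    """Symmetric InfoNCE between two aligned batches of embeddings, e.g.
    the structural view and the image view of the same entities; matching
    rows are positives, all other rows are in-batch negatives."""
    a = F.normalize(view_a, dim=-1)
    b = F.normalize(view_b, dim=-1)
    logits = a @ b.t() / temperature
    targets = torch.arange(len(a))
    return 0.5 * (F.cross_entropy(logits, targets)
                  + F.cross_entropy(logits.t(), targets))

struct_emb = torch.randn(16, 128)   # structural-view entity embeddings
image_emb = torch.randn(16, 128)    # image-view entity embeddings
print(symmetric_infonce(struct_emb, image_emb))
```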
Citations: 16