
Latest Publications in AI Open

GPT understands, too
Pub Date: 2024-01-01 DOI: 10.1016/j.aiopen.2023.08.012
Prompting a pretrained language model with natural language patterns has proven effective for natural language understanding (NLU). However, our preliminary study reveals that manual discrete prompts often lead to unstable performance: changing a single word in the prompt, for example, can cause a substantial performance drop. We propose P-Tuning, a novel method that employs trainable continuous prompt embeddings concatenated with discrete prompts. Empirically, P-Tuning not only stabilizes training by minimizing the gap between various discrete prompts, but also improves performance by a sizeable margin on a wide range of NLU tasks, including LAMA and SuperGLUE. P-Tuning is generally effective for both frozen and tuned language models, under both fully-supervised and few-shot settings.
Citations: 0
Authorship style transfer with inverse transfer data augmentation
Pub Date: 2024-01-01 DOI: 10.1016/j.aiopen.2024.08.003
Zhonghui Shao, Jing Zhang, Haoyang Li, Xinmei Huang, Chao Zhou, Yuanchun Wang, Jibing Gong, Cuiping Li, Hong Chen

Authorship style transfer aims to modify the style of neutral text to match the unique speaking or writing style of a particular individual. While Large Language Models (LLMs) present promising solutions, their effectiveness is limited by the small number of in-context learning demonstrations, particularly for authorship styles not frequently seen during pre-training. In response, this paper proposes an inverse transfer data augmentation (ITDA) method, leveraging LLMs to create (neutral text, stylized text) pairs. This method involves removing the existing styles from stylized texts, a process made more feasible due to the prevalence of neutral texts in pre-training. We use this augmented dataset to train a compact model that is efficient for deployment and adept at replicating the targeted style. Our experimental results, conducted across four datasets with distinct authorship styles, establish the effectiveness of ITDA over traditional style transfer methods and forward transfer using GPT-3.5. For further research and application, our dataset and code are openly accessible at https://github.com/Vicky-Shao/ITDA.

Citations: 0
Large language models in law: A survey
Pub Date: 2024-01-01 DOI: 10.1016/j.aiopen.2024.09.002
Jinqi Lai, Wensheng Gan, Jiayang Wu, Zhenlian Qi, Philip S. Yu
The advent of artificial intelligence (AI) has significantly impacted the traditional judicial industry. Moreover, with the recent development of AI-generated content (AIGC), AI and law have found applications in various domains, including image recognition, automatic text generation, and interactive chat. With the rapid emergence and growing popularity of large models, it is evident that AI will drive transformation in the traditional judicial industry. However, the application of legal large language models (LLMs) is still in its nascent stage, and several challenges need to be addressed. In this paper, we aim to provide a comprehensive survey of legal LLMs. We not only conduct an extensive survey of LLMs but also examine their applications in the judicial system. We first provide an overview of AI technologies in the legal field and showcase recent research on LLMs. We then discuss the practical applications of legal LLMs, such as providing legal advice to users and assisting judges during trials. In addition, we explore the limitations of legal LLMs with respect to data, algorithms, and judicial practice. Finally, we summarize practical recommendations and propose future development directions to address these challenges.
Citations: 0
A study of natural robustness of deep reinforcement learning algorithms towards adversarial perturbations
Pub Date: 2024-01-01 DOI: 10.1016/j.aiopen.2024.08.005
Qisai Liu, Xian Yeow Lee, Soumik Sarkar

Deep reinforcement learning (DRL) has been shown to have numerous potential applications in the real world. However, DRL algorithms are still extremely sensitive to noise and adversarial perturbations, which inhibits the deployment of RL in many real-life applications. Analyzing the robustness of DRL algorithms to adversarial attacks is an important prerequisite to their widespread adoption. Common test-time perturbations on DRL frameworks include perturbations to the observation channel and to the action channel. Compared with observation-channel attacks, action-channel attacks are less studied; hence, the DRL literature contains few comparisons of the effectiveness of these attacks. In this work, we examined the effectiveness of these two attack paradigms on common DRL algorithms and studied the natural robustness of DRL algorithms towards various adversarial attacks, in hopes of gaining insights into the response of each type of algorithm under different attack conditions.

Citations: 0
CellBoost: A pipeline for machine assisted annotation in neuroanatomy
Pub Date: 2024-01-01 DOI: 10.1016/j.aiopen.2024.09.001
Kui Qian, Beth Friedman, Jun Takatoh, Alexander Groisman, Fan Wang, David Kleinfeld, Yoav Freund

One of the important yet labor-intensive tasks in neuroanatomy is the identification of select populations of cells. Current high-throughput techniques enable marking cells with histochemical fluorescent molecules as well as through the genetic expression of fluorescent proteins. Modern scanning microscopes allow high-resolution multi-channel imaging of the mechanically or optically sectioned brain, with thousands of marked cells per square millimeter. Manual identification of all marked cells is prohibitively time-consuming. At the same time, simple segmentation algorithms for identifying marked cells suffer from high error rates and sensitivity to variation in fluorescent intensity and spatial distribution.

We present a methodology combining human judgement and machine learning that significantly reduces the labor of the anatomist while improving the consistency of the annotation.

As a demonstration, we analyzed murine brains with marked premotor neurons in the brainstem and compared the error rate of our method to the disagreement rate among human anatomists. This comparison shows that our method can reduce annotation time by as much as ten-fold without significantly increasing the rate of errors, achieving an accuracy similar to the level of agreement between different anatomists.

Citations: 0
Relation-aware deep neural network enables more efficient biomedical knowledge acquisition from massive literature
Pub Date: 2024-01-01 DOI: 10.1016/j.aiopen.2024.08.002
Chenyang Song, Zheni Zeng, Changyao Tian, Kuai Li, Yuan Yao, Suncong Zheng, Zhiyuan Liu, Maosong Sun

Biomedical knowledge is typically organized in a relational scheme, such as chemical-disease, gene-disease, and gene-pathway relations. Biomedical scientists rely heavily on search engines to acquire up-to-date relational knowledge from massive biomedical articles. The navigation efficiency of the retrieval process, however, is significantly restricted by keyword-matching techniques that are unaware of the biomedical relations among these keywords in articles. To bridge the gap between existing retrieval techniques and practical access demands for relational knowledge, we present a novel framework, Biomedical Relation-Aware Document Ranking (BioRADR), capable of retrieving articles expressing specific relations with respect to the queried entity pair. Based on a deep neural network, BioRADR can be trained from large-scale data automatically annotated via distant supervision, and empirical evaluation reveals that it outperforms the strongest baseline by over 8 points in NDCG@1. We implement an online system (http://bioradr.ai.thunlp.org/) based on BioRADR, enabling more efficient relation-oriented retrieval of biomedical articles.

Citations: 0
Label-aware debiased causal reasoning for Natural Language Inference
Pub Date: 2024-01-01 DOI: 10.1016/j.aiopen.2024.02.001
Kun Zhang, Dacao Zhang, Le Wu, Richang Hong, Ye Zhao, Meng Wang

Recently, researchers have argued that the impressive performance of Natural Language Inference (NLI) models is largely due to spurious correlations in the training data, which make models vulnerable and poorly generalized. Some work has made preliminary debiasing attempts by developing data-driven interventions or model-level debiased learning. Despite this progress, existing debiasing methods either suffer from the high cost of data annotation or require elaborate designs to identify biased factors. By conducting detailed investigations and data analysis, we argue that label information can provide meaningful guidance for identifying these spurious correlations in training data, a signal that has not received enough attention. We therefore design a novel Label-aware Debiased Causal Reasoning Network (LDCRN). Specifically, guided by the data analysis, we first build a causal graph to describe causal relations and spurious correlations in NLI. We then employ an NLI model (e.g., RoBERTa) to calculate the total causal effect of input sentences on labels. Meanwhile, we design a novel label-aware biased module to model spurious correlations and calculate their causal effect in a fine-grained manner. The debiasing process is realized by subtracting this biased causal effect from the total causal effect. Finally, extensive experiments over two well-known NLI datasets and multiple human-annotated challenging test sets demonstrate the superiority of LDCRN. Moreover, we have developed novel challenging test sets based on MultiNLI to facilitate the community.

Citations: 0
Boosting graph search with attention network for solving the general orienteering problem
Pub Date: 2024-01-01 DOI: 10.1016/j.aiopen.2024.01.006
Zongtao Liu, Wei Dong, Chaoliang Wang, Haoqingzi Shen, Gang Sun, Qun jiang, Quanjin Tao, Yang Yang

Recently, several studies have explored using neural networks (NNs) to solve different routing problems, a promising direction. These studies usually design an encoder–decoder framework that uses encoder embeddings of nodes and the problem-specific context to iteratively generate a node sequence (path), and then further optimize the produced result, e.g., with a beam search. However, these models are limited to accepting only the coordinates of nodes as input, disregarding the self-referential nature of the studied routing problems and failing to account for the low reliability of node selection in the initial stages, thereby posing challenges for real-world applications.

In this paper, we take the orienteering problem as an example to tackle these limitations of the previous studies. We propose a novel combination of a variant beam search algorithm and a learned heuristic for solving the general orienteering problem. We acquire the heuristic with an attention network that takes the distances among nodes as input, and learn it via a reinforcement learning framework. Empirical studies show that our method surpasses a wide range of baselines and achieves results comparable to optimal or highly specialized approaches.

Citations: 0
Wave2Graph: Integrating spectral features and correlations for graph-based learning in sound waves
Pub Date: 2024-01-01 DOI: 10.1016/j.aiopen.2024.08.004
Van-Truong Hoang, Khanh-Tung Tran, Xuan-Son Vu, Duy-Khuong Nguyen, Monowar Bhuyan, Hoang D. Nguyen

This paper investigates a novel graph-based representation of sound waves inspired by the physical phenomenon of correlated vibrations. We propose Wave2Graph, a framework for integrating multiple acoustic representations, including the spectrum of frequencies and correlations, into various neural computing architectures to achieve new state-of-the-art performance in sound classification. The capability and reliability of our end-to-end framework are clearly demonstrated in voice pathology for low-cost and non-invasive mass screening of medical conditions, including respiratory illnesses and Alzheimer's Dementia. We conduct extensive experiments on multiple public benchmark datasets (ICBHI and ADReSSo) and our real-world dataset (IJSound: respiratory disease detection using coughs and breaths). The Wave2Graph framework consistently outperforms previous state-of-the-art methods by a large margin, with up to a 7.65% improvement, demonstrating the usefulness of graph-based representations in signal processing and machine learning.

Citations: 0
How to generate popular post headlines on social media?
Pub Date: 2023-12-16 DOI: 10.1016/j.aiopen.2023.12.002
Zhouxiang Fang, Min Yu, Zhendong Fu, Boning Zhang, Xuanwen Huang, Xiaoqi Tang, Yang Yang

Posts, as important containers of user-generated content on social media, carry tremendous social influence and commercial value. As an integral component of a post, the headline has a decisive influence on the post's popularity. However, the current mainstream method for headline generation is still manual writing, which is unstable and requires extensive human effort. This drives us to explore a novel research question: can we automate the generation of popular headlines on social media? We collect more than 1 million posts from 42,447 celebrities from public data of Xiaohongshu, a well-known social media platform in China, and conduct careful observations of the headlines of these posts. The observations demonstrate that trends and personal styles are widespread in headlines on social media and contribute significantly to posts' popularity. Motivated by these insights, we present MEBART, which combines Multiple preference-Extractors with Bidirectional and Auto-Regressive Transformers (BART), capturing trends and personal styles to generate popular headlines on social media. We perform extensive experiments on real-world datasets and achieve SOTA performance compared with advanced baselines. In addition, ablation and case studies demonstrate that MEBART advances in capturing trends and personal styles.

Citations: 0