Sentiment analysis, as a part of natural language processing (NLP), has received much attention following the demand to understand people’s opinions. Aspect-based sentiment analysis (ABSA) is a fine-grained sentiment analysis task that aims to classify sentiment at the aspect level. Over the years, researchers have formulated ABSA into various tasks for different scenarios. Unlike early works, current ABSA tasks combine multiple sentiment elements to produce more detailed and informative results. However, the field is difficult to survey comprehensively because of its many different tasks, terms, and results. This paper surveyed recent studies on ABSA, focusing on its complex compound tasks. We investigated the key elements, problem formulations, and datasets currently used by the ABSA community. We reviewed the latest methodologies and identified the current state of the art through a comparative analysis. From our study, we found a shift toward generative methods for solving the ABSA problem, which signifies an evolving emphasis on holistic, end-to-end approaches. Finally, we identified open challenges and future directions for ABSA research.
Title: Methodologies and their comparison in complex compound aspect-based sentiment analysis: A survey
Authors: Faiz Ghifari Haznitrama, Ho-Jin Choi, Chin-Wan Chung
DOI: 10.1016/j.aiopen.2025.02.002
AI Open, Volume 6 (2025), Pages 53-69
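To make the compound-task idea concrete, here is a toy sketch (my own illustration, not from the survey) of the structured output an aspect-sentiment-quadruple task produces, together with the exact-match F1 commonly used to score such predictions. The field names and example sentence are assumptions for illustration only.

```python
review = "The pasta was great but the service was slow"
gold = [
    {"aspect": "pasta",   "category": "FOOD",    "opinion": "great", "polarity": "positive"},
    {"aspect": "service", "category": "SERVICE", "opinion": "slow",  "polarity": "negative"},
]

def quad_f1(pred, gold):
    # Exact-match F1 over predicted quadruples: a quad counts as a true
    # positive only if every element matches a gold quad.
    tp = sum(1 for q in pred if q in gold)
    p = tp / len(pred) if pred else 0.0
    r = tp / len(gold) if gold else 0.0
    return 2 * p * r / (p + r) if p + r else 0.0

score = quad_f1(gold, gold)  # a perfect prediction scores 1.0
```

The strict all-elements match is what makes compound tasks harder than single-element extraction: one wrong polarity or opinion span invalidates the whole tuple.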
Pub Date: 2025-01-01 | Epub Date: 2025-07-01 | DOI: 10.1016/j.aiopen.2025.04.001
Zhi Chen, Da Ma, Hanqi Li, Lu Chen, Jiabao Ji, Yuncong Liu, Bei Chen, Mengyue Wu, Su Zhu, Xin Dong, Fujiang Ge, Qingliang Miao, Jian-Guang Lou, Shuai Fan, Kai Yu
Building a universal conversational agent has been a long-standing goal of the dialogue research community. Most previous works focus on only a small set of dialogue tasks. In this work, we aim to build a unified dialogue foundation model (DFM) that can solve a massive range of diverse dialogue tasks. To achieve this goal, we collect DialogZoo, a large-scale, well-annotated dialogue dataset with rich task diversity. We introduce a framework that unifies all dialogue tasks and propose novel auxiliary self-supervised tasks to achieve stable training of DFM on the highly diverse, large-scale DialogZoo corpus. Experiments show that, compared with models of the same size, DFM achieves competitive performance on a rich set of cross-domain downstream dialogue tasks. Furthermore, DFM remains effective when scaled to large language models. This demonstrates that DFM largely extends the capability of unified dialogue pre-trained models.
Title: DFM: Dialogue foundation model for universal large-scale dialogue-oriented task learning
DOI: 10.1016/j.aiopen.2025.04.001
AI Open, Volume 6 (2025), Pages 108-117
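The "unify all dialogue tasks" idea generally means serializing heterogeneous tasks into a single text-to-text format. The sketch below is a hypothetical illustration of that pattern; the tag names and field layout are my assumptions, not DialogZoo's actual schema.

```python
def to_unified(task, dialogue, extra=""):
    # Serialize any dialogue task as "[TASK=...] turn | turn ...",
    # optionally appending grounding knowledge, so one seq2seq model
    # can consume intent detection, response generation, etc. alike.
    prompt = f"[TASK={task}] " + " | ".join(dialogue)
    if extra:
        prompt += f" [KNOWLEDGE] {extra}"
    return prompt

ex1 = to_unified("intent", ["user: book a flight to Paris"])
ex2 = to_unified("response", ["user: hi", "system: hello"], extra="greetings FAQ")
```

With every task flattened into the same prompt/target shape, a single model can be trained jointly across all of them.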
Pub Date: 2025-01-01 | Epub Date: 2025-09-23 | DOI: 10.1016/j.aiopen.2025.09.001
Francesco Di Feola, Lorenzo Tronchin, Valerio Guarrasi, Paolo Soda
Generative Adversarial Networks (GANs) have proven to be a powerful framework for denoising applications in medical imaging. However, GAN-based denoising algorithms still struggle to capture complex relationships within images. In this regard, the loss function plays a crucial role in guiding the image generation process, quantifying how much a synthetic image differs from a real one. To grasp highly complex and non-linear textural relationships during training, this work presents a novel approach that captures and embeds multi-scale texture information into the loss function. Our method introduces a differentiable multi-scale texture representation of the images, dynamically aggregated by a self-attention layer, thus enabling end-to-end gradient-based optimization. We validate our approach through extensive experiments on low-dose CT denoising, a challenging application that aims to enhance the quality of noisy CT scans. We use three publicly available datasets: one simulated and two real. The results are promising compared with other well-established loss functions and are consistent across three different GAN architectures. The code is available at: https://github.com/trainlab/MSTLF-TextureLoss.
Title: Multi-scale texture loss for CT denoising with GANs
DOI: 10.1016/j.aiopen.2025.09.001
AI Open, Volume 6 (2025), Pages 142-154
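A heavily simplified sketch of the multi-scale idea, under stated assumptions: variance stands in for the paper's texture descriptors, and fixed softmax weights stand in for the learned self-attention aggregation. This is an illustration of the structure, not the authors' implementation.

```python
import math

def downsample(signal):
    # Keep every other sample; a crude stand-in for image downscaling.
    return signal[::2]

def texture_stat(signal):
    # Variance as a toy texture proxy (the paper uses richer descriptors).
    m = sum(signal) / len(signal)
    return sum((x - m) ** 2 for x in signal) / len(signal)

def multiscale_texture_loss(real, fake, scores):
    # Softmax over per-scale scores mimics attention-style aggregation.
    total = sum(math.exp(s) for s in scores)
    weights = [math.exp(s) / total for s in scores]
    loss = 0.0
    for w in weights:
        loss += w * abs(texture_stat(real) - texture_stat(fake))
        real, fake = downsample(real), downsample(fake)  # next scale
    return loss

real = [1.0, 5.0, 1.0, 5.0, 1.0, 5.0, 1.0, 5.0]  # high-texture signal
fake = [3.0] * 8                                  # over-smoothed output
zero = multiscale_texture_loss(real, real, [0.0, 0.0, 0.0])
gap = multiscale_texture_loss(real, fake, [0.0, 0.0, 0.0])
```

The loss is zero for identical inputs and positive when the generator over-smooths, which is exactly the failure mode a texture-aware loss is meant to penalize.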
Pub Date: 2025-01-01 | Epub Date: 2025-02-12 | DOI: 10.1016/j.aiopen.2025.01.001
Rui Hao, Linmei Hu, Weijian Qi, Qingliu Wu, Yirui Zhang, Liqiang Nie
Dialogue-based language models mark a major milestone in artificial intelligence, owing to their impressive ability to interact with users and to handle challenging tasks prompted by customized instructions. However, prevalent large-scale dialogue-based language models like ChatGPT still have room for improvement, such as unstable responses to questions and an inability to think cooperatively like humans. Considering the conversational ability of dialogue-based language models and their inherent randomness in thinking, we propose the ChatLLM network, which allows multiple dialogue-based language models to interact, provide feedback, and think together. We design a network of ChatLLMs consisting of multiple layers of language models. Specifically, individual language model instances may hold distinct perspectives on the same problem, and by consolidating these diverse viewpoints via a separate language model, the ChatLLM network can make decisions more objectively and comprehensively. In addition, a language-based feedback mechanism, comparable to backpropagation, is devised to update the outputs of the language models within the network. This stratified system of interaction can be likened to the relationship between leaders and employees in a social organization, where collective decision-making often yields superior judgments or resolutions. Experiments demonstrate that our network attains significant improvements in problem-solving, with observable progress for each member.
Title: ChatLLM network: More brains, more intelligence
DOI: 10.1016/j.aiopen.2025.01.001
AI Open, Volume 6 (2025), Pages 45-52
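The layered interaction can be sketched as follows. The worker and consolidator functions are illustrative stand-ins for real model calls (a majority vote replaces the consolidating LLM), so this shows the control flow, not the authors' system.

```python
def worker_a(question):
    # Stand-in for one dialogue-based language model instance.
    return {"answer": 4, "rationale": "2 + 2 = 4"}

def worker_b(question):
    # A second instance with its own (here, agreeing) perspective.
    return {"answer": 4, "rationale": "adding two and two gives four"}

def consolidate(question, opinions):
    # Majority vote stands in for the separate consolidating model.
    answers = [o["answer"] for o in opinions]
    return max(set(answers), key=answers.count)

def chatllm_network(question, workers, consolidator):
    opinions = [w(question) for w in workers]  # first layer: independent views
    return consolidator(question, opinions)   # second layer: consolidation

result = chatllm_network("What is 2 + 2?", [worker_a, worker_b], consolidate)
```

The paper's language-based feedback mechanism would additionally pass the consolidated critique back to the workers for another round, which is where the backpropagation analogy comes from.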
Pub Date: 2025-01-01 | Epub Date: 2024-01-09 | DOI: 10.1016/j.aiopen.2024.01.002
Title: Erratum regarding Declaration of Competing Interest statements in previously published articles
DOI: 10.1016/j.aiopen.2024.01.002
AI Open, Volume 6 (2025), Pages 331-332
Pub Date: 2025-01-01 | Epub Date: 2025-03-03 | DOI: 10.1016/j.aiopen.2025.02.001
Michail Mamalakis, Antonios Mamalakis, Ingrid Agartz, Lynn Egeland Mørch-Johnsen, Graham K. Murray, John Suckling, Pietro Lio
The accelerated progress of artificial intelligence (AI) has popularized deep learning models across various domains, yet their inherent opacity poses challenges, particularly in critical fields like healthcare, medicine, and the geosciences. Explainable AI (XAI) has emerged to shed light on these ‘black box’ models, aiding in deciphering their decision-making processes. However, different XAI methods often produce significantly different explanations, leading to high inter-method variability that increases uncertainty and undermines trust in deep networks’ predictions. In this study, we address this challenge by introducing a novel framework designed to enhance the explainability of deep networks through a dual focus on maximizing both the accuracy and the comprehensibility of explanations. Our framework integrates outputs from multiple established XAI methods and leverages a non-linear neural network model, termed the ‘Explanation optimizer’, to construct a unified, optimal explanation. The optimizer evaluates explanation quality with two primary metrics: faithfulness, which measures how accurately the explanation reflects the network’s decision-making, and complexity, which assesses how comprehensible the explanation is. By balancing these metrics, the optimizer provides explanations that are both accurate and accessible, addressing a central limitation of current XAI methods. Through experiments on multi-class and binary classification tasks in both 2D object and 3D neuroscience imaging, we validate the efficacy of our approach. Our explanation optimizer achieved faithfulness scores averaging 155% and 63% higher than the best-performing individual XAI methods in the 3D and 2D applications, respectively, while also reducing complexity to enhance comprehensibility. These results demonstrate that optimal explanations based on specific quality criteria are achievable, offering a solution to the issue of inter-method variability in the current XAI literature and supporting more trustworthy deep network predictions.
Title: Solving the enigma: Enhancing faithfulness and comprehensibility in explanations of deep networks
DOI: 10.1016/j.aiopen.2025.02.001
AI Open, Volume 6 (2025), Pages 70-81
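A toy sketch of the optimize-over-combinations idea: two hypothetical attribution maps are blended with a weight, and candidates are scored by a faithfulness proxy minus a complexity penalty. The maps, reference, metrics, and grid search are all my illustrative assumptions; the paper uses a neural optimizer and principled metric definitions.

```python
map_a = [0.9, 0.1, 0.0, 0.0]      # attribution map from XAI method A
map_b = [0.5, 0.4, 0.1, 0.0]      # attribution map from XAI method B
reference = [1.0, 0.0, 0.0, 0.0]  # toy stand-in for the network's true reliance

def faithfulness(m):
    # Negative L1 distance to the reference: higher means more faithful.
    return -sum(abs(x - r) for x, r in zip(m, reference))

def complexity(m):
    # Number of non-negligible attributions: fewer is easier to read.
    return sum(1 for x in m if x > 0.05)

def combine(w):
    # Blend the two methods' maps with weight w on method A.
    return [w * a + (1 - w) * b for a, b in zip(map_a, map_b)]

def score(w):
    # Trade faithfulness against complexity, as the optimizer does.
    return faithfulness(combine(w)) - 0.1 * complexity(combine(w))

best_w = max((k / 10 for k in range(11)), key=score)
```

Here the grid search settles on the sparser, more reference-aligned map, mirroring how the Explanation optimizer favors explanations that are simultaneously faithful and simple.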
Pub Date: 2025-01-01 | Epub Date: 2025-09-22 | DOI: 10.1016/j.aiopen.2025.08.004
Zahiriddin Rustamov, Ayham Zaitouny, Nazar Zaki
Instance selection (IS) addresses the critical challenge of reducing dataset size while retaining informative characteristics, a task of growing importance as datasets reach millions of instances. Current IS methods often struggle to capture complex relationships in high-dimensional spaces and to scale to large datasets. This paper introduces a graph attention-based instance selection (GAIS) method that uses attention mechanisms to identify informative instances through their structural relationships in graph representations. We present two approaches for scalable graph construction: a distance-based mini-batch sampling technique that achieves dataset-size-independent complexity through strategic batch processing, and a hierarchical hashing approach that enables efficient similarity computation through random projections. The mini-batch approach preserves class distributions through stratified sampling, while the hierarchical hashing method captures relationships at multiple granularities through single-level, multi-level, and multi-view variants. Experiments across 39 datasets show that GAIS achieves reduction rates above 96% while maintaining or improving model performance relative to state-of-the-art IS methods. The findings show that the distance-based mini-batch approach offers optimal efficiency for large-scale datasets, while the multi-view variants excel on complex, high-dimensional data, demonstrating that attention-based importance scoring can effectively identify instances critical for maintaining decision boundaries while avoiding computationally prohibitive pairwise comparisons. The code is publicly available at https://github.com/zahiriddin-rustamov/gais.
Title: Scalable graph attention-based instance selection via mini-batch sampling and hierarchical hashing
DOI: 10.1016/j.aiopen.2025.08.004
AI Open, Volume 6 (2025), Pages 167-182
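The random-projection hashing that underlies the hierarchical variant can be sketched minimally as follows. Instances whose signed projections onto random hyperplanes agree land in the same bucket, so similarity only needs to be computed within buckets rather than over all pairs. This is a generic sign-random-projection sketch, not GAIS's actual multi-level scheme.

```python
import random

random.seed(0)
DIM, N_PLANES = 4, 3
# Random Gaussian hyperplanes shared by all instances.
planes = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(N_PLANES)]

def bucket(x):
    # One bit per hyperplane: the sign of x's projection onto it.
    return tuple(int(sum(p * v for p, v in zip(plane, x)) >= 0)
                 for plane in planes)

b = [0.3, -1.2, 0.7, 2.0]
c = [-v for v in b]          # exactly opposite direction of b

bb, bc = bucket(b), bucket(c)
```

Because every projection of `c` negates the corresponding projection of `b`, all sign bits flip, sending the two opposite-direction points to complementary buckets; near-duplicate points, by contrast, tend to share a bucket. A multi-level variant would simply rehash within each bucket using fresh planes.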
Long-term time series forecasting (LTSF) is crucial in modern society, playing a pivotal role in long-term planning and early warning systems. While many Transformer-based models have recently been introduced for LTSF, doubts have been raised about the effectiveness of attention modules in capturing cross-time dependencies. In this study, we design a mask-series experiment to validate this assumption and subsequently propose the “Cross-variable Linear Integrated ENhanced Transformer for Multivariate Long-Term Time Series Forecasting” (Client), an advanced model that outperforms both traditional Transformer-based models and linear models. Client employs a linear module to learn trend information and an enhanced Transformer module to capture cross-variable dependencies. Meanwhile, the cross-variable Transformer module in Client simplifies the embedding and position encoding layers and replaces the decoder module with a projection layer. Extensive experiments on nine real-world datasets confirm the state-of-the-art (SOTA) performance of Client, with the least computation time and memory consumption compared with previous Transformer-based models. Our code is available at https://github.com/daxin007/Client.
Title: Client: Cross-variable linear integrated enhanced transformer for multivariate long-term time series forecasting
Authors: Jiaxin Gao, Wenbo Hu, Dongxiao Zhang, Yuntian Chen
DOI: 10.1016/j.aiopen.2025.06.001
Pub Date: 2025-01-01
AI Open, Volume 6 (2025), Pages 93-107
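The division of labor between the two modules can be illustrated with a toy sketch: a least-squares linear extrapolation handles each variable's trend, and a fixed weighted blend stands in for the learned cross-variable Transformer. Both functions are illustrative assumptions, not Client's implementation.

```python
def linear_trend_forecast(series, horizon):
    # Least-squares line over time indices 0..n-1, extrapolated forward;
    # this plays the role of Client's linear (trend) module.
    n = len(series)
    t_mean = (n - 1) / 2
    y_mean = sum(series) / n
    slope = (sum((t - t_mean) * (y - y_mean) for t, y in enumerate(series))
             / sum((t - t_mean) ** 2 for t in range(n)))
    intercept = y_mean - slope * t_mean
    return [intercept + slope * (n + h) for h in range(horizon)]

def cross_variable_mix(forecasts, weights):
    # Stand-in for the cross-variable Transformer module: each output
    # variable is a weighted blend of every variable's linear forecast.
    return [[sum(w * f[i] for w, f in zip(w_row, forecasts))
             for i in range(len(forecasts[0]))]
            for w_row in weights]

var1 = [1.0, 2.0, 3.0, 4.0]       # linear trend, slope 1
var2 = [10.0, 10.0, 10.0, 10.0]   # flat series
f = [linear_trend_forecast(var1, 2), linear_trend_forecast(var2, 2)]
mixed = cross_variable_mix(f, [[1.0, 0.0], [0.5, 0.5]])
```

The point of the design is that trend extrapolation needs no attention at all, so attention capacity can be spent entirely on dependencies across variables.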
Pub Date: 2025-01-01 | Epub Date: 2025-09-29 | DOI: 10.1016/j.aiopen.2025.09.002
Shuang Wu, Daniela M. Romano
Emotion recognition has become increasingly significant in artificial intelligence; however, the impact of body movements on emotion interpretation remains under-explored. This paper presents a novel Hybrid Bayesian Pre-trained Long Short-Term Memory (HBP-LSTM) framework that combines low-level pose data with high-level kinematic features, utilising Bayesian inference to enhance the accuracy and robustness of emotion recognition. The proposed model is trained on high-quality laboratory data to capture the fundamental patterns of emotional expression through body movements. To evaluate the model’s robustness during testing, we introduce noise and employ adversarial attack methods such as the Fast Gradient Sign Method (FGSM). This assesses HBP-LSTM’s ability to maintain performance under data degradation and adversarial conditions, common challenges in real-world scenarios. We validated HBP-LSTM on two public datasets, EGBM and KDAEE, demonstrating high robustness against noise and adversarial perturbations and outperforming traditional models. HBP-LSTM accurately identifies seven basic emotions (happiness, sadness, surprise, fear, anger, disgust, and neutrality) with accuracies of 98% and 88% on the EGBM and KDAEE datasets, respectively. HBP-LSTM is thus a noise-resistant model within a reliable emotion recognition framework, laying the foundation for future applications of emotion recognition technology in more challenging real-world environments.
Title: Robust emotion recognition using hybrid Bayesian LSTM based on Laban movement analysis
DOI: 10.1016/j.aiopen.2025.09.002
AI Open, Volume 6 (2025), Pages 183-203
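FGSM, the attack used above to probe robustness, perturbs an input by epsilon times the sign of the loss gradient. A minimal one-dimensional sketch under stated assumptions (a toy squared-error "model", not the paper's network):

```python
def loss(x, target):
    # Toy objective standing in for the network's loss.
    return (x - target) ** 2

def grad_loss(x, target):
    # Analytic gradient of the squared error w.r.t. the input.
    return 2 * (x - target)

def fgsm(x, target, eps):
    g = grad_loss(x, target)
    sign = (g > 0) - (g < 0)   # sign of the gradient: -1, 0, or +1
    return x + eps * sign      # step in the direction that increases the loss

x, target, eps = 1.0, 0.0, 0.1
x_adv = fgsm(x, target, eps)
```

The perturbation is tiny and bounded by eps, yet it is guaranteed not to decrease the loss, which is why FGSM is a standard first check of a model's robustness.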
Pub Date: 2025-01-01 | Epub Date: 2024-01-06 | DOI: 10.1016/j.aiopen.2024.01.001
Title: Erratum regarding Declaration of Competing Interest statements in previously published articles
DOI: 10.1016/j.aiopen.2024.01.001
AI Open, Volume 6 (2025), Pages 329-330