
Latest articles in Proceedings of the conference. Association for Computational Linguistics. Meeting

IRIS: Interpretable Retrieval-Augmented Classification for Long Interspersed Document Sequences.
Fengnan Li, Elliot D Hill, Shu Jiang, Jiaxin Gao, Matthew M Engelhard

Transformer-based models have achieved state-of-the-art performance in document classification but struggle with long-text processing due to the quadratic computational complexity in the self-attention module. Existing solutions, such as sparse attention, hierarchical models, and key sentence extraction, partially address the issue but still fall short when the input sequence is exceptionally lengthy. To address this challenge, we propose IRIS (Interpretable Retrieval-Augmented Classification for long Interspersed Document Sequences), a novel, lightweight framework that utilizes retrieval to efficiently classify long documents while enhancing interpretability. IRIS segments documents into chunks, stores their embeddings in a vector database, and retrieves those most relevant to a given task using learnable query vectors. A linear attention mechanism then aggregates the retrieved embeddings for classification, allowing the model to process arbitrarily long documents without increasing computational cost and remaining trainable on a single GPU. Our experiments across six datasets show that IRIS achieves comparable performance to baseline models on standard benchmarks, and excels in three clinical note disease risk prediction tasks where documents are extremely long and key information is sparse. Furthermore, IRIS provides global interpretability by revealing a clear summary of key risk factors identified by the model. These findings highlight the potential of IRIS as an efficient and interpretable solution for long-document classification, particularly in healthcare applications where both performance and explainability are crucial.
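The retrieve-then-aggregate idea described above can be sketched in a few lines. This is a toy illustration, not the authors' implementation: `embed` is a stand-in for a real chunk encoder, the `query` argument plays the role of IRIS's learnable query vector, and the classifier weights are arbitrary placeholders.

```python
import math

def embed(chunk):
    # Toy stand-in for a real chunk encoder: a 3-dim hand-crafted embedding.
    v = [chunk.count("a"), chunk.count("e"), len(chunk)]
    n = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / n for x in v]          # unit-normalized

def cosine(u, v):
    return sum(a * b for a, b in zip(u, v))  # inputs are unit-norm

def iris_classify(document, query, top_k=2, chunk_size=20):
    # 1) Segment the document into fixed-size chunks and embed each one
    #    (in IRIS these embeddings would live in a vector database).
    chunks = [document[i:i + chunk_size] for i in range(0, len(document), chunk_size)]
    embs = [embed(c) for c in chunks]
    # 2) Retrieve the top-k chunks most similar to the query vector,
    #    so downstream cost is constant regardless of document length.
    ranked = sorted(range(len(embs)), key=lambda i: cosine(embs[i], query), reverse=True)
    retrieved = [embs[i] for i in ranked[:top_k]]
    # 3) Attention over the retrieved embeddings: softmax the query-chunk
    #    scores and take the weighted sum.
    scores = [cosine(e, query) for e in retrieved]
    m = max(scores)
    w = [math.exp(s - m) for s in scores]
    z = sum(w)
    pooled = [sum(wi * e[d] for wi, e in zip(w, retrieved)) / z
              for d in range(len(query))]
    # 4) A linear classifier head on the pooled representation
    #    (weights here are arbitrary, for illustration only).
    logit = sum(p * q for p, q in zip(pooled, [1.0, -1.0, 0.5]))
    return 1 if logit > 0 else 0
```

Because only the `top_k` retrieved embeddings reach the attention and classifier stages, documents of any length incur the same aggregation cost.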

Proceedings of the conference. Association for Computational Linguistics. Meeting, vol. 2025, pp. 30263-30283. Published 2025-07-01. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12357761/pdf/
Citations: 0
GraphCheck: Breaking Long-Term Text Barriers with Extracted Knowledge Graph-Powered Fact-Checking.
Pub Date: 2025-07-01 DOI: 10.18653/v1/2025.acl-long.729
Yingjian Chen, Haoran Liu, Yinhong Liu, Jinxiang Xie, Rui Yang, Han Yuan, Yanran Fu, Peng Yuan Zhou, Qingyu Chen, James Caverlee, Irene Li

Large language models (LLMs) are widely used, but they often generate subtle factual errors, especially in long-form text. These errors are fatal in some specialized domains such as medicine. Existing methods for fact-checking against grounding documents face two main challenges: (1) they struggle to understand complex multihop relations in long documents, often overlooking subtle factual errors; (2) most specialized methods rely on pairwise comparisons, requiring multiple model calls and leading to high resource and computational costs. To address these challenges, we propose GraphCheck, a fact-checking framework that uses extracted knowledge graphs to enhance text representation. Graph neural networks further process these graphs as a soft prompt, enabling LLMs to incorporate structured knowledge more effectively. Enhanced with graph-based reasoning, GraphCheck captures multihop reasoning chains that are often overlooked by existing methods, enabling precise and efficient fact-checking in a single inference call. Experimental results on seven benchmarks spanning both general and medical domains demonstrate up to a 7.1% overall improvement over baseline models. Notably, GraphCheck outperforms existing specialized fact-checkers and achieves comparable performance to state-of-the-art LLMs, such as DeepSeek-V3 and OpenAI-o1, with significantly fewer parameters.
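As a rough sketch of the idea (not the paper's system), the following checks extracted claim triples against a source knowledge graph in a single pass, accepting a triple if it matches directly or via a simple two-hop chain through a shared intermediate entity. `extract_triples` is a hypothetical stand-in for the LLM-based extractor, parsing one `head -relation-> tail` pattern per line.

```python
def extract_triples(text):
    # Toy stand-in for an LLM-based triple extractor: expects one
    # "head -relation-> tail" pattern per line.
    triples = set()
    for line in text.splitlines():
        if "->" not in line:
            continue
        head, rest = line.split("-", 1)
        relation, tail = rest.split("->", 1)
        triples.add((head.strip(), relation.strip(), tail.strip()))
    return triples

def graph_check(claim_text, source_text):
    # Compare the claim graph against the source graph in one pass,
    # returning the claim triples the source does not support.
    source = extract_triples(source_text)
    unsupported = []
    for head, relation, tail in sorted(extract_triples(claim_text)):
        if (head, relation, tail) in source:
            continue                     # direct match
        # Minimal multihop check: head -> m and m -> tail for some entity m
        # (relation labels are ignored here for simplicity).
        mids = {t for (h, _, t) in source if h == head}
        if any(h in mids and t == tail for (h, _, t) in source):
            continue                     # two-hop chain found
        unsupported.append((head, relation, tail))
    return unsupported
```

A real system would of course verify relation semantics along the chain; the point of the sketch is that all triples are checked in a single pass rather than one model call per claim-evidence pair.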

Proceedings of the conference. Association for Computational Linguistics. Meeting, vol. 2025, pp. 14976-14995. Published 2025-07-01. DOI: 10.18653/v1/2025.acl-long.729. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12360635/pdf/
Citations: 0
OLIVE: Object Level In-Context Visual Embeddings.
Pub Date: 2024-08-01 DOI: 10.18653/v1/2024.acl-long.282
Timothy Ossowski, Junjie Hu

Recent generalist vision-language models (VLMs) have demonstrated impressive reasoning capabilities across diverse multimodal tasks. However, these models still struggle with fine-grained object level understanding and grounding. In terms of modeling, existing VLMs implicitly align text tokens with image patch tokens, which is ineffective for embedding alignment at the same granularity and inevitably introduces noisy spurious background features. Additionally, these models struggle when generalizing to unseen visual concepts and may not be reliable for domain-specific tasks without further fine-tuning. To address these limitations, we propose a novel method to prompt large language models with in-context visual object vectors, thereby enabling controllable object level reasoning. This eliminates the necessity of fusing a lengthy array of image patch features and significantly speeds up training. Furthermore, we propose region-level retrieval using our object representations, facilitating rapid adaptation to new objects without additional training. Our experiments reveal that our method achieves competitive referring object classification and captioning performance, while also offering zero-shot generalization and robustness to visually challenging contexts.
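The region-level retrieval idea can be illustrated with a minimal nearest-neighbour store: label each object embedding once, then classify unseen object vectors by retrieval, with no further training. This is a hand-rolled sketch, not the paper's object-vector pipeline.

```python
import math

class ObjectMemory:
    """Minimal region-level retrieval store: each entry pairs an object
    embedding with a label; unseen objects are classified by retrieving
    the nearest stored embedding (no fine-tuning required)."""

    def __init__(self):
        self.entries = []  # list of (embedding, label) pairs

    def add(self, embedding, label):
        # Adapting to a new object concept is just appending one entry.
        self.entries.append((embedding, label))

    def classify(self, embedding):
        # Retrieve the stored object vector closest in Euclidean distance.
        def dist(entry):
            return math.sqrt(sum((a - b) ** 2 for a, b in zip(entry[0], embedding)))
        return min(self.entries, key=dist)[1]
```

The appeal mirrors the abstract's claim: supporting a new visual concept means inserting one labeled vector, rather than fine-tuning the model.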

Proceedings of the conference. Association for Computational Linguistics. Meeting, vol. 2024, pp. 5170-5185. Published 2024-08-01. DOI: 10.18653/v1/2024.acl-long.282. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11931571/pdf/
Citations: 0
Unity in Diversity: Collaborative Pre-training Across Multimodal Medical Sources.
Pub Date: 2024-08-01 DOI: 10.18653/v1/2024.acl-long.199
Xiaochen Wang, Junyu Luo, Jiaqi Wang, Yuan Zhong, Xiaokun Zhang, Yaqing Wang, Parminder Bhatia, Cao Xiao, Fenglong Ma

Although pre-training has become a prevalent approach for addressing various biomedical tasks, the current efficacy of pre-trained models is hindered by their reliance on a limited scope of medical sources. This limitation results in data scarcity during pre-training and restricts the range of applicable downstream tasks. In response to these challenges, we develop Medical Cross-Source Pre-training (MEDCSP), a new pre-training strategy designed to bridge the gap between multimodal medical sources. MEDCSP employs modality-level aggregation to unify patient data within individual sources. Additionally, leveraging temporal information and diagnosis history, MEDCSP effectively captures explicit and implicit correlations between patients across different sources. To evaluate the proposed strategy, we conduct comprehensive experiments on 6 modalities from 2 real-world medical data sources, evaluating MEDCSP on 4 tasks against 19 baselines, marking an initial yet essential step towards cross-source modeling in the medical domain.
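Modality-level aggregation as described above might look like the sketch below: pool each modality's record embeddings for a patient, then concatenate the pooled vectors in a fixed modality order to get one unified representation. Mean-pooling and the dictionary layout are assumptions for illustration, not necessarily the paper's choices.

```python
def aggregate_patient(records):
    """records maps modality name -> list of record embeddings for one
    patient within one source, e.g. {"notes": [...], "labs": [...]}.
    Returns a single flat patient vector."""
    patient_vec = []
    for modality in sorted(records):          # fixed modality order
        vecs = records[modality]
        dim = len(vecs[0])
        # Mean-pool this modality's records into one vector, then append.
        patient_vec.extend(sum(v[d] for v in vecs) / len(vecs)
                           for d in range(dim))
    return patient_vec
```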

Proceedings of the conference. Association for Computational Linguistics. Meeting, vol. 2024 (Volume 1: Long Papers), pp. 3644-3656. Published 2024-08-01. DOI: 10.18653/v1/2024.acl-long.199. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12007664/pdf/
Citations: 0
Medical Vision-Language Pre-Training for Brain Abnormalities.
Masoud Monajatipoor, Zi-Yi Dou, Aichi Chien, Nanyun Peng, Kai-Wei Chang

Vision-language models have become increasingly powerful for tasks that require an understanding of both visual and linguistic elements, bridging the gap between these modalities. In the context of multimodal clinical AI, there is a growing need for models that possess domain-specific knowledge, as existing models often lack the expertise required for medical applications. In this paper, we take brain abnormalities as an example to demonstrate how to automatically collect medical image-text aligned data for pretraining from public resources such as PubMed. In particular, we present a pipeline that streamlines the pre-training process by initially collecting a large brain image-text dataset from case reports and published journals and subsequently constructing a high-performance vision-language model tailored to specific medical tasks. We also investigate the unique challenge of mapping subfigures to subcaptions in the medical domain. We evaluated the resulting model with quantitative and qualitative intrinsic evaluations. The resulting dataset and our code can be found here https://github.com/masoud-monajati/MedVL_pretraining_pipeline.
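The subfigure-to-subcaption mapping challenge can be illustrated with a simplified splitter that assumes compound captions mark panels as "(A)", "(B)", and so on. Real captions are far messier, and this is not the paper's pipeline, just a minimal sketch of the problem.

```python
import re

def split_subcaptions(caption):
    # Split a compound figure caption on "(A) ... (B) ..." markers and map
    # each panel label to its text. re.split with a capturing group keeps
    # the labels in the result list at the odd positions.
    parts = re.split(r"\(([A-Z])\)", caption)
    # parts looks like: ['prefix', 'A', ' text for A ', 'B', ' text for B']
    return {label: text.strip()
            for label, text in zip(parts[1::2], parts[2::2])}
```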

Proceedings of the conference. Association for Computational Linguistics. Meeting, vol. 2024 (LREC/COLING), pp. 11159-11164. Published 2024-05-01. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11238846/pdf/
Citations: 0
HiGen: Hierarchy-Aware Sequence Generation for Hierarchical Text Classification.
Vidit Jain, Mukund Rungta, Yuchen Zhuang, Yue Yu, Zeyu Wang, Mu Gao, Jeffrey Skolnick, Chao Zhang

Hierarchical text classification (HTC) is a complex subtask under multi-label text classification, characterized by a hierarchical label taxonomy and data imbalance. The best-performing models aim to learn a static representation by combining document and hierarchical label information. However, the relevance of document sections can vary based on the hierarchy level, necessitating a dynamic document representation. To address this, we propose HiGen, a text-generation-based framework utilizing language models to encode dynamic text representations. We introduce a level-guided loss function to capture the relationship between text and label name semantics. Our approach incorporates a task-specific pretraining strategy, adapting the language model to in-domain knowledge and significantly enhancing performance for classes with limited examples. Furthermore, we present a new and valuable dataset called ENZYME, designed for HTC, which comprises articles from PubMed with the goal of predicting Enzyme Commission (EC) numbers. Through extensive experiments on the ENZYME dataset and the widely recognized WOS and NYT datasets, our methodology demonstrates superior performance, surpassing existing approaches while efficiently handling data and mitigating class imbalance. We release our code and dataset here: https://github.com/viditjain99/HiGen.
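Casting HTC as sequence generation means the target for each document is its root-to-leaf label path, which a language model is trained to emit token by token. A minimal sketch of building that target, assuming (hypothetically) that the taxonomy is given as a child-to-parent map:

```python
def label_sequence(taxonomy, leaf):
    # taxonomy maps each child label to its parent; roots are absent as keys.
    # Walk upward from the leaf, then reverse to get the root-to-leaf path
    # that a seq2seq HTC model would be trained to generate.
    path = [leaf]
    while path[-1] in taxonomy:
        path.append(taxonomy[path[-1]])
    return list(reversed(path))
```

For the ENZYME setting, the generated sequence would correspond to the nested levels of an EC number.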

Proceedings of the conference. Association for Computational Linguistics. Meeting, vol. 2024 (EACL), pp. 1354-1368. Published 2024-03-01. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11781299/pdf/
Citations: 0
SOCIALITE-LLAMA: An Instruction-Tuned Model for Social Scientific Tasks.
Pub Date: 2024-03-01 DOI: 10.18653/v1/2024.eacl-short.40
Gourab Dey, Adithya V Ganesan, Yash Kumar Lal, Manal Shah, Shreyashee Sinha, Matthew Matero, Salvatore Giorgi, Vivek Kulkarni, H Andrew Schwartz

Social science NLP tasks, such as emotion or humor detection, are required to capture the semantics along with the implicit pragmatics from text, often with limited amounts of training data. Instruction tuning has been shown to improve many capabilities of large language models (LLMs), such as commonsense reasoning, reading comprehension, and computer programming. However, little is known about the effectiveness of instruction tuning in the social domain, where implicit pragmatic cues often need to be captured. We explore the use of instruction tuning for social science NLP tasks and introduce Socialite-Llama, an open-source, instruction-tuned Llama2. On a suite of 20 social science tasks, Socialite-Llama improves upon the performance of Llama2 and matches or improves upon the performance of a state-of-the-art, multi-task finetuned model on a majority of them. Further, Socialite-Llama also leads to improvement on 5 out of 6 related social tasks as compared to Llama2, suggesting instruction tuning can lead to generalized social understanding. All resources including our code, model and dataset can be found through bit.ly/socialitellama.
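Instruction tuning turns each task example into a templated prompt. Below is a hypothetical Alpaca-style template for a social task; the actual Socialite-Llama format may differ, and the task wording is made up for illustration.

```python
def format_instruction(instruction, text, label=None):
    # Build an instruction-tuning prompt: instruction + input, with the
    # gold label appended only when formatting a training example.
    prompt = (
        "### Instruction:\n" + instruction + "\n\n"
        + "### Input:\n" + text + "\n\n"
        + "### Response:\n"
    )
    if label is not None:
        prompt += label          # training rows include the gold response
    return prompt
```

At inference time the same template is used with `label=None`, and the model completes the response section.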

Proceedings of the conference. Association for Computational Linguistics. Meeting, vol. 2024 (EACL v2-SP), pp. 454-468. Published 2024-03-01. DOI: 10.18653/v1/2024.eacl-short.40. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12584587/pdf/
Citations: 0
Improving the Transferability of Clinical Note Section Classification Models with BERT and Large Language Model Ensembles.
Weipeng Zhou, Dmitriy Dligach, Majid Afshar, Yanjun Gao, Timothy A Miller

Text in electronic health records is organized into sections, and classifying those sections into section categories is useful for downstream tasks. In this work, we attempt to improve the transferability of section classification models by combining the dataset-specific knowledge in supervised learning models with the world knowledge inside large language models (LLMs). Surprisingly, we find that zero-shot LLMs outperform supervised BERT-based models applied to out-of-domain data. We also find that their strengths are synergistic, so that a simple ensemble technique leads to additional performance gains.
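A "simple ensemble technique" in this setting could mix the supervised model's class distribution with a one-hot vote from the zero-shot LLM and take the argmax. The sketch below is one plausible reading, not the paper's method, and the section label names are made up.

```python
def ensemble_predict(bert_probs, llm_label, labels, llm_weight=0.5):
    # Soft ensemble: interpolate between the BERT model's probability
    # distribution and a one-hot vote from the zero-shot LLM.
    mixed = {}
    for lab in labels:
        vote = 1.0 if lab == llm_label else 0.0
        mixed[lab] = (1 - llm_weight) * bert_probs.get(lab, 0.0) + llm_weight * vote
    return max(mixed, key=mixed.get)
```

With `llm_weight=0.5`, the LLM's vote breaks close calls, while a confident BERT distribution can still override it.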

Proceedings of the conference. Association for Computational Linguistics. Meeting, vol. 2023, pp. 125-130. Published 2023-07-01. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10544420/pdf/nihms-1921258.pdf
引用次数: 0
Improving the Transferability of Clinical Note Section Classification Models with BERT and Large Language Model Ensembles
Pub Date : 2023-07-01 DOI: 10.18653/v1/2023.clinicalnlp-1.16
Weipeng Zhou, M. Afshar, Dmitriy Dligach, Yanjun Gao, Timothy Miller
Text in electronic health records is organized into sections, and classifying those sections into section categories is useful for downstream tasks. In this work, we attempt to improve the transferability of section classification models by combining the dataset-specific knowledge in supervised learning models with the world knowledge inside large language models (LLMs). Surprisingly, we find that zero-shot LLMs out-perform supervised BERT-based models applied to out-of-domain data. We also find that their strengths are synergistic, so that a simple ensemble technique leads to additional performance gains.
Citations: 0
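The abstract above mentions that "a simple ensemble technique leads to additional performance gains" but does not spell it out. As a hedged illustration only, the sketch below blends a supervised classifier's class probabilities with a zero-shot LLM's predicted label treated as a one-hot vote; the label set, function names, and equal weighting are assumptions for illustration, not the paper's actual method.

```python
# Hypothetical sketch of a simple two-model ensemble for section
# classification. All names and the weighting scheme are illustrative
# assumptions, not taken from the paper.

SECTION_LABELS = ["history", "medications", "assessment"]

def ensemble_predict(bert_probs, llm_label, llm_weight=0.5):
    """Blend a classifier's class probabilities with an LLM's zero-shot
    label, treated as a one-hot vote, and return the top class."""
    llm_vote = [1.0 if lab == llm_label else 0.0 for lab in SECTION_LABELS]
    combined = [(1 - llm_weight) * p + llm_weight * v
                for p, v in zip(bert_probs, llm_vote)]
    return SECTION_LABELS[max(range(len(combined)), key=combined.__getitem__)]

# The supervised model is torn between two sections; the LLM's vote
# breaks the tie toward "history".
print(ensemble_predict([0.40, 0.45, 0.15], "history"))  # -> history
```

Because the two models err in different ways (dataset-specific vs. world knowledge), even this naive vote-averaging can recover cases where either model alone is wrong.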
Less Likely Brainstorming: Using Language Models to Generate Alternative Hypotheses.
Pub Date : 2023-07-01 DOI: 10.18653/v1/2023.findings-acl.794
Liyan Tang, Yifan Peng, Yanshan Wang, Ying Ding, Greg Durrett, Justin F Rousseau

A human decision-maker benefits the most from an AI assistant that corrects for their biases. For problems such as generating interpretation of a radiology report given findings, a system predicting only highly likely outcomes may be less useful, where such outcomes are already obvious to the user. To alleviate biases in human decision-making, it is worth considering a broad differential diagnosis, going beyond the most likely options. We introduce a new task, "less likely brainstorming," that asks a model to generate outputs that humans think are relevant but less likely to happen. We explore the task in two settings: a brain MRI interpretation generation setting and an everyday commonsense reasoning setting. We found that a baseline approach of training with less likely hypotheses as targets generates outputs that humans evaluate as either likely or irrelevant nearly half of the time; standard MLE training is not effective. To tackle this problem, we propose a controlled text generation method that uses a novel contrastive learning strategy to encourage models to differentiate between generating likely and less likely outputs according to humans. We compare our method with several state-of-the-art controlled text generation models via automatic and human evaluations and show that our models' capability of generating less likely outputs is improved.

Citations: 0
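The abstract describes a contrastive strategy that pushes the model to prefer relevant-but-less-likely outputs over the most likely one when so instructed. As a minimal sketch under stated assumptions (a margin-style hinge over sequence log-probabilities; the function name and margin value are hypothetical, not the paper's formulation):

```python
# Illustrative margin-based contrastive objective: when generating under a
# "less likely" control signal, the model's log-probability for the
# less-likely hypothesis should exceed that of the most likely output by
# at least `margin`; otherwise a penalty pushes the two scores apart.

def contrastive_margin_loss(logp_less_likely, logp_most_likely, margin=1.0):
    """Hinge loss: zero once the less-likely target is preferred by the
    margin, positive while the model still favors the likely output."""
    return max(0.0, margin - (logp_less_likely - logp_most_likely))

# Scores already separated by more than the margin: no penalty.
print(contrastive_margin_loss(-2.0, -4.0))  # -> 0.0
# Model still prefers the most likely output: positive loss.
print(contrastive_margin_loss(-5.0, -3.0))  # -> 3.0
```

The point of the contrast term, as the abstract notes, is that plain MLE training on less-likely targets is not enough: without an explicit comparison against the likely output, generations drift back toward either likely or irrelevant text.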
Journal
Proceedings of the conference. Association for Computational Linguistics. Meeting