
Latest publications in arXiv - CS - Computation and Language

THaMES: An End-to-End Tool for Hallucination Mitigation and Evaluation in Large Language Models
Pub Date: 2024-09-17 DOI: arxiv-2409.11353
Mengfei Liang, Archish Arun, Zekun Wu, Cristian Munoz, Jonathan Lutch, Emre Kazim, Adriano Koshiyama, Philip Treleaven
Hallucination, the generation of factually incorrect content, is a growing challenge in Large Language Models (LLMs). Existing detection and mitigation methods are often isolated and insufficient for domain-specific needs, lacking a standardized pipeline. This paper introduces THaMES (Tool for Hallucination Mitigations and EvaluationS), an integrated framework and library addressing this gap. THaMES offers an end-to-end solution for evaluating and mitigating hallucinations in LLMs, featuring automated test set generation, multifaceted benchmarking, and adaptable mitigation strategies. It automates test set creation from any corpus, ensuring high data quality, diversity, and cost-efficiency through techniques like batch processing, weighted sampling, and counterfactual validation. THaMES assesses a model's ability to detect and reduce hallucinations across various tasks, including text generation and binary classification, applying optimal mitigation strategies like In-Context Learning (ICL), Retrieval Augmented Generation (RAG), and Parameter-Efficient Fine-tuning (PEFT). Evaluations of state-of-the-art LLMs using a knowledge base of academic papers, political news, and Wikipedia reveal that commercial models like GPT-4o benefit more from RAG than ICL, while open-weight models like Llama-3.1-8B-Instruct and Mistral-Nemo gain more from ICL. Additionally, PEFT significantly enhances the performance of Llama-3.1-8B-Instruct in both evaluation tasks.
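The abstract above names weighted sampling as one of the test-set-creation techniques. As a rough illustration only, the sketch below implements generic weighted sampling without replacement; the function name and the length-based weighting proxy are assumptions for illustration, not the THaMES implementation.

```python
import random

def weighted_sample(corpus, weights, k, seed=0):
    """Sample k documents without replacement, biased by weight.

    Higher-weight documents (e.g. longer or more entity-dense ones)
    are more likely to be selected for test-set generation.
    """
    rng = random.Random(seed)
    pool = list(zip(corpus, weights))
    chosen = []
    for _ in range(min(k, len(pool))):
        total = sum(w for _, w in pool)
        r = rng.uniform(0, total)
        acc = 0.0
        for i, (doc, w) in enumerate(pool):
            acc += w
            if acc >= r:
                chosen.append(doc)
                pool.pop(i)  # remove so the same doc is not drawn twice
                break
    return chosen

docs = ["short note", "a much longer technical report", "mid-length article"]
weights = [len(d) for d in docs]  # toy proxy for information content
print(weighted_sample(docs, weights, k=2))
```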
Citations: 0
Diversity-grounded Channel Prototypical Learning for Out-of-Distribution Intent Detection
Pub Date: 2024-09-17 DOI: arxiv-2409.11114
Bo Liu, Liming Zhan, Yujie Feng, Zexin Lu, Chengqiang Xie, Lei Xue, Xiao-Ming Wu, Albert Y. S. Lam
In the realm of task-oriented dialogue systems, a robust intent detection mechanism must effectively handle malformed utterances encountered in real-world scenarios. This study presents a novel fine-tuning framework for large language models (LLMs) aimed at enhancing in-distribution (ID) intent classification and out-of-distribution (OOD) intent detection, which utilizes semantic matching with prototypes derived from ID class names. By harnessing the highly distinguishable representations of LLMs, we construct semantic prototypes for each ID class using a diversity-grounded prompt tuning approach. We rigorously test our framework in a challenging OOD context, where ID and OOD classes are semantically close yet distinct, referred to as near-OOD detection. For a thorough assessment, we benchmark our method against the prevalent fine-tuning approaches. The experimental findings reveal that our method demonstrates superior performance in both few-shot ID intent classification and near-OOD intent detection tasks.
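The core mechanism above, matching an utterance embedding against per-class prototypes and flagging low-similarity inputs as OOD, can be sketched in a few lines. The toy 3-d embeddings, the threshold value, and the function names below are illustrative assumptions, not the paper's method.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    num = sum(a * b for a, b in zip(u, v))
    den = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return num / den

def detect_ood(embedding, prototypes, threshold=0.5):
    """Assign the nearest ID-class prototype, or flag OOD when even the
    best similarity falls below the threshold (near-OOD rejection)."""
    best_label, best_sim = None, -1.0
    for label, proto in prototypes.items():
        sim = cosine(embedding, proto)
        if sim > best_sim:
            best_label, best_sim = label, sim
    return (best_label, best_sim) if best_sim >= threshold else ("OOD", best_sim)

# Toy prototypes; in the paper these are derived from ID class names.
prototypes = {"book_flight": [1.0, 0.0, 0.0], "play_music": [0.0, 1.0, 0.0]}
print(detect_ood([0.9, 0.1, 0.0], prototypes))  # close to book_flight
print(detect_ood([0.1, 0.1, 0.9], prototypes))  # dissimilar to all: OOD
```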
Citations: 0
LOLA -- An Open-Source Massively Multilingual Large Language Model
Pub Date: 2024-09-17 DOI: arxiv-2409.11272
Nikit Srivastava, Denis Kuchelev, Tatiana Moteu, Kshitij Shetty, Michael Roeder, Diego Moussallem, Hamada Zahera, Axel-Cyrille Ngonga Ngomo
This paper presents LOLA, a massively multilingual large language model trained on more than 160 languages using a sparse Mixture-of-Experts Transformer architecture. Our architectural and implementation choices address the challenge of harnessing linguistic diversity while maintaining efficiency and avoiding the common pitfalls of multilinguality. Our analysis of the evaluation results shows competitive performance in natural language generation and understanding tasks. Additionally, we demonstrate how the learned expert-routing mechanism exploits implicit phylogenetic linguistic patterns to potentially alleviate the curse of multilinguality. We provide an in-depth look at the training process, an analysis of the datasets, and a balanced exploration of the model's strengths and limitations. As an open-source model, LOLA promotes reproducibility and serves as a robust foundation for future research. Our findings enable the development of compute-efficient multilingual models with strong, scalable performance across languages.
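Sparse Mixture-of-Experts routing, the architectural idea named above, sends each token to only a few experts chosen by a learned router. A minimal sketch of top-k routing follows; the logit values, k=2, and function names are illustrative assumptions, not LOLA's actual router.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of logits."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def moe_route(router_logits, k=2):
    """Pick the top-k experts for one token from router logits and return
    (expert_index, mixing_weight) pairs, as in a sparse MoE layer: only
    the selected experts run, and their outputs are mixed by these weights."""
    ranked = sorted(range(len(router_logits)),
                    key=lambda i: router_logits[i], reverse=True)
    top = ranked[:k]
    probs = softmax([router_logits[i] for i in top])
    return list(zip(top, probs))

# Router logits over 4 experts for one token; only 2 experts are activated.
print(moe_route([0.1, 2.0, -1.0, 1.5], k=2))
```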
Citations: 0
AraDiCE: Benchmarks for Dialectal and Cultural Capabilities in LLMs
Pub Date: 2024-09-17 DOI: arxiv-2409.11404
Basel Mousi, Nadir Durrani, Fatema Ahmad, Md. Arid Hasan, Maram Hasanain, Tameem Kabbani, Fahim Dalvi, Shammur Absar Chowdhury, Firoj Alam
Arabic, with its rich diversity of dialects, remains significantly underrepresented in Large Language Models, particularly in dialectal variations. We address this gap by introducing seven synthetic datasets in dialects alongside Modern Standard Arabic (MSA), created using Machine Translation (MT) combined with human post-editing. We present AraDiCE, a benchmark for Arabic Dialect and Cultural Evaluation. We evaluate LLMs on dialect comprehension and generation, focusing specifically on low-resource Arabic dialects. Additionally, we introduce the first-ever fine-grained benchmark designed to evaluate cultural awareness across the Gulf, Egypt, and Levant regions, providing a novel dimension to LLM evaluation. Our findings demonstrate that while Arabic-specific models like Jais and AceGPT outperform multilingual models on dialectal tasks, significant challenges persist in dialect identification, generation, and translation. This work contributes ~45K post-edited samples and a cultural benchmark, and highlights the importance of tailored training to improve LLM performance in capturing the nuances of diverse Arabic dialects and cultural contexts. We will release the dialectal translation models and benchmarks curated in this study.
Citations: 0
ProSLM: A Prolog Synergized Language Model for explainable Domain Specific Knowledge Based Question Answering
Pub Date: 2024-09-17 DOI: arxiv-2409.11589
Priyesh Vakharia, Abigail Kufeldt, Max Meyers, Ian Lane, Leilani Gilpin
Neurosymbolic approaches can add robustness to opaque neural systems by incorporating explainable symbolic representations. However, previous approaches have not used formal logic to contextualize queries to and validate outputs of large language models (LLMs). We propose ProSLM, a novel neurosymbolic framework, to improve the robustness and reliability of LLMs in question-answering tasks. We provide ProSLM with a domain-specific knowledge base, a logical reasoning system, and an integration to an existing LLM. This framework has two capabilities: (1) context gathering: generating explainable and relevant context for a given query, and (2) validation: confirming and validating the factual accuracy of a statement in accordance with a knowledge base (KB). Our work opens a new area of neurosymbolic generative AI text validation and user personalization.
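The validation capability described above, checking an LLM statement against a KB, can be sketched minimally. The actual system uses Prolog for logical reasoning; the Python set-membership check, toy facts, and function name below are simplifying assumptions that only show the data flow of accept/reject against a fact base.

```python
# Toy knowledge base of facts as (subject, relation, object) triples.
KB = {
    ("aspirin", "treats", "headache"),
    ("aspirin", "is_a", "nsaid"),
}

def validate(statement_triples, kb):
    """Validation step: accept a statement only if every triple it asserts
    is entailed by (here, simply present in) the knowledge base; otherwise
    return the unsupported triples so they can be flagged."""
    missing = [t for t in statement_triples if t not in kb]
    return (len(missing) == 0, missing)

ok, missing = validate([("aspirin", "treats", "headache")], KB)
print(ok)            # supported by the KB
ok, missing = validate([("aspirin", "treats", "insomnia")], KB)
print(ok, missing)   # unsupported claim surfaced for rejection
```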
Citations: 0
Exploring ChatGPT-based Augmentation Strategies for Contrastive Aspect-based Sentiment Analysis
Pub Date: 2024-09-17 DOI: arxiv-2409.11218
Lingling Xu, Haoran Xie, S. Joe Qin, Fu Lee Wang, Xiaohui Tao
Aspect-based sentiment analysis (ABSA) involves identifying sentiment towards specific aspect terms in a sentence and allows us to uncover nuanced perspectives and attitudes on particular aspects of a product, service, or topic. However, the scarcity of labeled data poses a significant challenge to training high-quality models. To address this issue, we explore the potential of data augmentation using ChatGPT, a well-performing large language model (LLM), to enhance the sentiment classification performance towards aspect terms. Specifically, we explore three data augmentation strategies based on ChatGPT: context-focused, aspect-focused, and context-aspect data augmentation techniques. Context-focused data augmentation focuses on changing the word expression of context words in the sentence while keeping aspect terms unchanged. In contrast, aspect-focused data augmentation aims to change aspect terms but keep context words unchanged. Context-aspect data augmentation integrates the above two data augmentations to generate augmented samples. Furthermore, we incorporate contrastive learning into the ABSA tasks to improve performance. Extensive experiments show that all three data augmentation techniques lead to performance improvements, with the context-aspect data augmentation strategy performing best and surpassing the performance of the baseline models.
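The context-focused versus aspect-focused distinction above amounts to which part of the sentence the LLM is instructed to rewrite. A minimal sketch of how such instructions could be composed follows; the prompt wording and function name are illustrative assumptions, not the paper's actual prompts.

```python
def build_augmentation_prompt(sentence, aspect, strategy):
    """Compose a ChatGPT-style instruction for one augmentation strategy.

    'context' -> paraphrase the context words, keep the aspect term fixed.
    'aspect'  -> swap the aspect term, keep the context words fixed.
    """
    if strategy == "context":
        return (f"Rewrite the sentence '{sentence}' with different wording, "
                f"but keep the aspect term '{aspect}' exactly unchanged.")
    if strategy == "aspect":
        return (f"In the sentence '{sentence}', replace the aspect term "
                f"'{aspect}' with a comparable aspect, changing nothing else.")
    raise ValueError(f"unknown strategy: {strategy}")

s = "The battery life is great but the screen is dim."
print(build_augmentation_prompt(s, "battery life", "context"))
print(build_augmentation_prompt(s, "battery life", "aspect"))
```

The context-aspect strategy described in the abstract would simply apply both instructions in sequence to generate combined augmented samples.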
Citations: 0
Zero-resource Hallucination Detection for Text Generation via Graph-based Contextual Knowledge Triples Modeling
Pub Date: 2024-09-17 DOI: arxiv-2409.11283
Xinyue Fang, Zhen Huang, Zhiliang Tian, Minghui Fang, Ziyi Pan, Quntian Fang, Zhihua Wen, Hengyue Pan, Dongsheng Li
LLMs obtain remarkable performance but suffer from hallucinations. Most research on detecting hallucination focuses on questions with short and concrete correct answers, whose faithfulness is easy to check. Hallucination detection for text generation with open-ended answers is more challenging. Some researchers use external knowledge to detect hallucinations in generated texts, but external resources for specific scenarios are hard to access. Recent studies on detecting hallucinations in long text without external resources conduct consistency comparison among multiple sampled outputs. To handle long texts, researchers split long texts into multiple facts and individually compare the consistency of each pair of facts. However, these methods (1) hardly achieve alignment among multiple facts; (2) overlook dependencies between multiple contextual facts. In this paper, we propose graph-based context-aware (GCA) hallucination detection for text generation, which aligns knowledge facts and considers the dependencies between contextual knowledge triples in consistency comparison. Particularly, to align multiple facts, we conduct a triple-oriented response segmentation to extract multiple knowledge triples. To model dependencies among contextual knowledge triples (facts), we construct the contextual triples into a graph and enhance triples' interactions via message passing and aggregating via RGCN. To avoid the omission of knowledge triples in long text, we conduct an LLM-based reverse verification via reconstructing the knowledge triples. Experiments show that our model enhances hallucination detection and excels all baselines.
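The underlying consistency check, scoring each triple in one response by how often the other sampled responses assert it, can be sketched simply. This stand-in omits the paper's graph construction and RGCN message passing; the function name and toy triples are illustrative assumptions.

```python
def triple_consistency(responses):
    """Score each knowledge triple in the first sampled response by the
    fraction of the other samples that also assert it; low support is
    evidence of hallucination. Each response is a set of (s, r, o) triples."""
    target, others = responses[0], responses[1:]
    scores = {}
    for triple in target:
        support = sum(triple in other for other in others)
        scores[triple] = support / len(others)
    return scores

# Three sampled responses, already segmented into knowledge triples.
samples = [
    {("Curie", "won", "Nobel Prize"), ("Curie", "born_in", "Paris")},
    {("Curie", "won", "Nobel Prize"), ("Curie", "born_in", "Warsaw")},
    {("Curie", "won", "Nobel Prize")},
]
print(triple_consistency(samples))  # the "Paris" triple gets zero support
```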
Citations: 0
Chain-of-Thought Prompting for Speech Translation
Pub Date: 2024-09-17 DOI: arxiv-2409.11538
Ke Hu, Zhehuai Chen, Chao-Han Huck Yang, Piotr Żelasko, Oleksii Hrinchuk, Vitaly Lavrukhin, Jagadeesh Balam, Boris Ginsburg
Large language models (LLMs) have demonstrated remarkable advancements in language understanding and generation. Building on the success of text-based LLMs, recent research has adapted these models to use speech embeddings for prompting, resulting in Speech-LLM models that exhibit strong performance in automatic speech recognition (ASR) and automatic speech translation (AST). In this work, we propose a novel approach to leverage ASR transcripts as prompts for AST in a Speech-LLM built on an encoder-decoder text LLM. The Speech-LLM model consists of a speech encoder and an encoder-decoder structure, Megatron-T5. By first decoding speech to generate ASR transcripts and subsequently using these transcripts along with encoded speech for prompting, we guide the speech translation in a two-step process like chain-of-thought (CoT) prompting. Low-rank adaptation (LoRA) is used for the T5 LLM for model adaptation and shows superior performance to full model fine-tuning. Experimental results show that the proposed CoT prompting significantly improves AST performance, achieving an average increase of 2.4 BLEU points across 6 En->X or X->En AST tasks compared to speech prompting alone. Additionally, compared to a related CoT prediction method that predicts a concatenated sequence of ASR and AST transcripts, our method performs better by an average of 2 BLEU points.
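The two-step data flow described above, decode an ASR transcript first, then condition the translation step on both the transcript and the encoded speech, can be shown with stub decoders. The stub functions, prompt template, and target language below are illustrative assumptions standing in for the actual Speech-LLM.

```python
def cot_translate(speech_embedding, asr_decode, ast_decode):
    """Two-step chain-of-thought prompting for speech translation:
    step 1 produces the ASR transcript, step 2 prompts the translation
    with that transcript alongside the encoded speech."""
    transcript = asr_decode(speech_embedding)
    prompt = f"Transcript: {transcript}\nTranslate to German:"
    return ast_decode(prompt, speech_embedding)

# Stub decoders standing in for the model, just to show the data flow.
fake_asr = lambda emb: "good morning"
fake_ast = lambda prompt, emb: "guten Morgen" if "good morning" in prompt else "?"
print(cot_translate([0.1, 0.2], fake_asr, fake_ast))
```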
{"title":"Chain-of-Thought Prompting for Speech Translation","authors":"Ke Hu, Zhehuai Chen, Chao-Han Huck Yang, Piotr Żelasko, Oleksii Hrinchuk, Vitaly Lavrukhin, Jagadeesh Balam, Boris Ginsburg","doi":"arxiv-2409.11538","DOIUrl":"https://doi.org/arxiv-2409.11538","url":null,"abstract":"Large language models (LLMs) have demonstrated remarkable advancements in\u0000language understanding and generation. Building on the success of text-based\u0000LLMs, recent research has adapted these models to use speech embeddings for\u0000prompting, resulting in Speech-LLM models that exhibit strong performance in\u0000automatic speech recognition (ASR) and automatic speech translation (AST). In\u0000this work, we propose a novel approach to leverage ASR transcripts as prompts\u0000for AST in a Speech-LLM built on an encoder-decoder text LLM. The Speech-LLM\u0000model consists of a speech encoder and an encoder-decoder structure\u0000Megatron-T5. By first decoding speech to generate ASR transcripts and\u0000subsequently using these transcripts along with encoded speech for prompting,\u0000we guide the speech translation in a two-step process like chain-of-thought\u0000(CoT) prompting. 
Low-rank adaptation (LoRA) is used for the T5 LLM for model\u0000adaptation and shows superior performance to full model fine-tuning.\u0000Experimental results show that the proposed CoT prompting significantly\u0000improves AST performance, achieving an average increase of 2.4 BLEU points\u0000across 6 En->X or X->En AST tasks compared to speech prompting alone.\u0000Additionally, compared to a related CoT prediction method that predicts a\u0000concatenated sequence of ASR and AST transcripts, our method performs better by\u0000an average of 2 BLEU points.","PeriodicalId":501030,"journal":{"name":"arXiv - CS - Computation and Language","volume":"16 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142262568","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
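The two-step decoding described in this abstract can be sketched as follows. `FakeSpeechLLM` and its `generate` method are hypothetical placeholders standing in for the paper's Speech-LLM (speech encoder plus Megatron-T5), not the authors' actual API; the prompt wording is likewise illustrative:

```python
class FakeSpeechLLM:
    """Toy stand-in for the Speech-LLM; returns canned outputs so the
    two-step control flow can be demonstrated without a real model."""

    def generate(self, speech_embeddings, prompt):
        # A real model would condition on the encoded speech; the fake
        # only inspects the prompt to pick a canned response.
        if prompt.startswith("Transcribe"):
            return "hello world"       # step-1 ASR transcript
        return "bonjour le monde"      # step-2 translation


def cot_translate(model, speech_embeddings, src_lang, tgt_lang):
    # Step 1: decode the speech into an ASR transcript.
    transcript = model.generate(
        speech_embeddings, f"Transcribe the following {src_lang} speech:"
    )
    # Step 2: prompt again with the encoded speech AND the intermediate
    # transcript, chain-of-thought style, to produce the translation.
    ast_prompt = (
        f"{src_lang} transcript: {transcript}\n"
        f"Translate the speech into {tgt_lang}:"
    )
    return model.generate(speech_embeddings, ast_prompt)
```

The related baseline the abstract compares against instead predicts the concatenated ASR+AST sequence in a single pass; the two-step variant keeps the transcript as an explicit intermediate prompt.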
Semformer: Transformer Language Models with Semantic Planning
Pub Date : 2024-09-17 DOI: arxiv-2409.11143
Yongjing Yin, Junran Ding, Kai Song, Yue Zhang
Next-token prediction serves as the dominant component in current neural language models. During the training phase, the model employs teacher forcing, which predicts tokens based on all preceding ground truth tokens. However, this approach has been found to create shortcuts, utilizing the revealed prefix to spuriously fit future tokens, potentially compromising the accuracy of the next-token predictor. In this paper, we introduce Semformer, a novel method of training a Transformer language model that explicitly models the semantic planning of the response. Specifically, we incorporate a sequence of planning tokens into the prefix, guiding the planning token representations to predict the latent semantic representations of the response, which are induced by an autoencoder. In a minimal planning task (i.e., graph path-finding), our model exhibits near-perfect performance and effectively mitigates shortcut learning, a feat that standard training methods and baseline models have been unable to accomplish. Furthermore, we pretrain Semformer from scratch with 125M parameters, demonstrating its efficacy through measures of perplexity, in-context learning, and fine-tuning on summarization tasks.
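A minimal sketch of the kind of auxiliary objective described in this abstract: standard teacher-forced cross-entropy on the response, plus a regression term tying planning-token representations to the autoencoder-induced latents. The function signature and the weighting `alpha` are illustrative assumptions, not the paper's exact loss:

```python
import numpy as np


def semformer_style_loss(lm_logprobs, target_ids, plan_reprs, latent_targets,
                         alpha=1.0):
    """lm_logprobs: (T, V) log-probabilities from teacher-forced decoding;
    target_ids: the T ground-truth next tokens;
    plan_reprs: (K, D) hidden states of the K planning tokens;
    latent_targets: (K, D) autoencoder latents of the response."""
    # Standard next-token (teacher-forcing) negative log-likelihood.
    nll = -np.mean([lm_logprobs[t, tok] for t, tok in enumerate(target_ids)])
    # Planning loss: planning-token representations regress onto the
    # latent semantic representations induced by the autoencoder.
    plan_mse = np.mean((plan_reprs - latent_targets) ** 2)
    return nll + alpha * plan_mse
```

Because the planning targets are produced by a separate autoencoder over the full response, the prefix cannot "spuriously fit" them the way it can fit individual future tokens, which is the shortcut the paper aims to suppress.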
Small Language Models can Outperform Humans in Short Creative Writing: A Study Comparing SLMs with Humans and LLMs
Pub Date : 2024-09-17 DOI: arxiv-2409.11547
Guillermo Marco, Luz Rello, Julio Gonzalo
In this paper, we evaluate the creative fiction writing abilities of a fine-tuned small language model (SLM), BART Large, and compare its performance to humans and two large language models (LLMs): GPT-3.5 and GPT-4o. Our evaluation consists of two experiments: (i) a human evaluation where readers assess the stories generated by the SLM compared to human-written stories, and (ii) a qualitative linguistic analysis comparing the textual characteristics of the stories generated by the different models. In the first experiment, we asked 68 participants to rate short stories generated by the models and humans along dimensions such as grammaticality, relevance, creativity, and attractiveness. BART Large outperformed human writers in most aspects, except creativity, with an overall score of 2.11 compared to 1.85 for human-written texts -- a 14% improvement. In the second experiment, the qualitative analysis revealed that, while GPT-4o exhibited near-perfect internal and external coherence, it tended to produce more predictable narratives, with only 3% of its stories seen as novel. In contrast, 15% of BART's stories were considered novel, indicating a higher degree of creativity despite its smaller model size. This study provides both quantitative and qualitative insights into how model size and fine-tuning influence the balance between creativity, fluency, and coherence in creative writing tasks.
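As a quick sanity check on the figures quoted in this abstract, the reported "14% improvement" follows directly from the two overall human-evaluation scores:

```python
slm_score, human_score = 2.11, 1.85  # overall scores from the human evaluation
improvement = (slm_score - human_score) / human_score
print(f"{improvement:.0%}")  # → 14%
```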