Large Language Models for Scientific Information Extraction: An Empirical Study for Virology

Findings | Pub Date: 2024-01-18 | DOI: 10.48550/arXiv.2401.10040

Mahsa Shamsabadi, Jennifer D'Souza, Sören Auer

Citations: 0

Abstract

In this paper, we champion the use of structured and semantic content representation of discourse-based scholarly communication, inspired by tools like Wikipedia infoboxes or structured Amazon product descriptions. These representations provide users with a concise overview, aiding scientists in navigating the dense academic landscape. Our novel automated approach leverages the robust text generation capabilities of LLMs to produce structured scholarly contribution summaries, offering both a practical solution and insights into LLMs’ emergent abilities. For LLMs, the prime focus is on improving their general intelligence as conversational agents. We argue that these models can also be applied effectively in information extraction (IE), specifically in complex IE tasks within terse domains like science. This paradigm shift replaces the traditional modular, pipelined machine learning approach with a simpler objective expressed through instructions. Our results show that a finetuned FLAN-T5 with 1000x fewer parameters than the state-of-the-art GPT-davinci is competitive for the task.

Source journal

Findings
