NapSS: Paragraph-level Medical Text Simplification via Narrative Prompting and Sentence-matching Summarization

Junru Lu, Jiazheng Li, Byron C. Wallace, Yulan He, Gabriele Pergola
DOI: 10.48550/arXiv.2302.05574 (https://doi.org/10.48550/arXiv.2302.05574)
Journal (as listed): Findings (Sydney (N.S.W.)), pages 1049-1061
Publication date: 2023-02-11
Publication type: Journal Article
Citation count: 4

Abstract

Accessing medical literature is difficult for laypeople because the content is written for specialists and contains medical jargon. Automated text simplification methods offer a potential means to address this issue. In this work, we propose a two-stage summarize-then-simplify strategy, which we call NapSS, that identifies the relevant content to simplify while ensuring that the original narrative flow is preserved. In this approach, we first generate reference summaries via sentence matching between the original and the simplified abstracts. These summaries are then used to train an extractive summarizer, which learns the most relevant content to be simplified. Then, to ensure the narrative consistency of the simplified text, we synthesize auxiliary narrative prompts that combine key phrases derived from syntactic analyses of the original text. Our model achieves results significantly better than the seq2seq baseline on an English medical corpus, yielding 3% to 4% absolute improvements in lexical similarity and a further 1.1% improvement in SARI score when combined with the baseline. We also highlight shortcomings of existing evaluation methods and introduce new metrics that take into account both lexical and high-level semantic similarity. A human evaluation conducted on a random sample of the test set further establishes the effectiveness of the proposed approach. Code and models are released at https://github.com/LuJunru/NapSS.
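
The sentence-matching step described in the abstract can be illustrated with a minimal sketch. The abstract does not state which similarity measure or threshold is used, so the token-level Jaccard overlap, the `threshold` value, and the function names below are all assumptions made for illustration, not the authors' implementation. The idea shown is only this: for each sentence of the simplified abstract, pick the best-matching sentence of the original abstract, and keep the picked sentences in their original order as a reference extractive summary.

```python
import re


def tokenize(sentence):
    """Lowercase a sentence and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", sentence.lower()))


def jaccard(a, b):
    """Token-level Jaccard similarity (an assumed stand-in for the paper's matcher)."""
    tokens_a, tokens_b = tokenize(a), tokenize(b)
    if not tokens_a or not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def build_reference_summary(source_sentences, simplified_sentences, threshold=0.2):
    """Select source sentences that best match each simplified sentence.

    `threshold` is a hypothetical cut-off: matches scoring below it are ignored.
    The selected sentences are returned in their original order, so the
    narrative flow of the source is kept.
    """
    selected = set()
    for simple in simplified_sentences:
        scores = [jaccard(src, simple) for src in source_sentences]
        if not scores:
            continue
        best = max(range(len(scores)), key=scores.__getitem__)
        if scores[best] >= threshold:
            selected.add(best)
    return [source_sentences[i] for i in sorted(selected)]


# Toy data invented for this example (not taken from the paper's corpus).
source = [
    "Hypertension is a major risk factor for cardiovascular morbidity.",
    "We searched MEDLINE and EMBASE up to May 2020.",
    "Antihypertensive therapy reduced the incidence of stroke.",
]
simplified = [
    "High blood pressure is a major risk factor for heart disease.",
    "Treatment reduced the number of strokes.",
]

reference_summary = build_reference_summary(source, simplified)
```

In this toy case the search-method sentence has no counterpart in the simplified text, so it is left out of the reference summary. Summaries built this way could then serve as training targets for an extractive summarizer, as the abstract describes.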