Automatic Title Generation for Learning Resources and Pathways with Pre-trained Transformer Models

Prakhar Mishra, Chaitali Diwan, S. Srinivasa, G. Srinivasaraghavan
Journal: Int. J. Semantic Comput.
Volume: 15 1
Publication date: 2021-12-01
Publication type: Journal Article
DOI: 10.1142/s1793351x21400134
URL: https://doi.org/10.1142/s1793351x21400134
Open access: no
Source platform: Semanticscholar
Citation count: 2

Abstract

Sparking curiosity and interest in a topic is a challenging task in online learning. A good preview that outlines the contents of a learning pathway can help learners understand the topic and become interested in it. To this end, we propose a hierarchical title generation approach that generates semantically relevant titles for the learning resources in a learning pathway, as well as a title for the pathway itself. Our approach to automatic title generation for a given text is based on the pre-trained Transformer language model GPT-2. A pool of candidate titles is generated, an appropriate title is selected from the pool, and the selected title is then refined, or de-noised, to obtain the final title. The model is trained on research paper abstracts from arXiv and evaluated on three different test sets. We show that it generates semantically and syntactically relevant titles, as reflected in ROUGE and BLEU scores and in human evaluations. We also propose an optional abstractive summarizer module, based on the pre-trained Transformer model T5, to shorten medium-length documents. This module is likewise trained and evaluated on research papers from the arXiv dataset. Finally, we show that the proposed hierarchical title generation model for learning pathways yields promising results.
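
The generate, select, and refine steps described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the abstract does not say how a title is selected or de-noised, so the sketch swaps in a simplified unigram-overlap score (a ROUGE-1 style F1) as the selection criterion and a toy clean-up as the refinement step. The function names (`rouge1_f1`, `select_title`, `refine_title`, `title_pathway`) are hypothetical, the GPT-2 generator is replaced by a caller-supplied `generate_candidates` function, and deriving the pathway title from the resource titles is only one possible reading of "hierarchical".

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rouge1_f1(candidate, reference):
    """Unigram-overlap F1 between two strings (a simplified ROUGE-1)."""
    cand, ref = Counter(tokenize(candidate)), Counter(tokenize(reference))
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def select_title(candidates, source_text):
    """Assumed criterion: keep the candidate that overlaps most with the source."""
    return max(candidates, key=lambda c: rouge1_f1(c, source_text))


def refine_title(title):
    """Toy de-noising: normalise spaces, drop repeated words, trim punctuation."""
    words = title.split()
    deduped = [
        w for i, w in enumerate(words)
        if i == 0 or w.lower() != words[i - 1].lower()
    ]
    return " ".join(deduped).rstrip(" .,;:")


def title_pathway(resources, generate_candidates):
    """Title each resource, then title the pathway from the resource titles.

    generate_candidates stands in for a language-model sampler: it takes a
    text and returns a list of candidate title strings.
    """
    resource_titles = [
        refine_title(select_title(generate_candidates(text), text))
        for text in resources
    ]
    summary = ". ".join(resource_titles)
    pathway_title = refine_title(
        select_title(generate_candidates(summary), summary)
    )
    return resource_titles, pathway_title
```

In practice `generate_candidates` would wrap a fine-tuned language model that samples several titles per input; the stub keeps the sketch runnable without any model weights.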