Similarity-Based Support for Text Reuse in Technical Writing

Proceedings of the 2015 ACM Symposium on Document Engineering. Pub Date: 2015-09-08. DOI: 10.1145/2682571.2797068

Axel J. Soto, A. Mohammad, Andrew Albert, Aminul Islam, E. Milios, Michael Doyle, R. Minghim, Maria Cristina Ferreira de Oliveira


Citations: 10

Abstract




Technical writing in professional environments, such as user manual authoring for new products, is a task that relies heavily on reuse of content. Therefore, technical content is typically created following a strategy where modular units of text have references to each other. One of the main challenges faced by technical authors is to avoid duplicating existing content, as this adds unnecessary effort, generates undesirable inconsistencies, and dramatically increases maintenance and translation costs. However, there are few computational tools available to support this activity. This paper investigates the use of different similarity methods for the task of identification of reuse opportunities in technical writing. We evaluated our results using existing ground truth as well as feedback from technical authors. Finally, we also propose a tool that combines text similarity algorithms with interactive visualizations to aid authors in understanding differences in a collection of topics and identifying reuse opportunities.
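The abstract does not say which similarity methods were compared, so the following is only a minimal sketch of the general idea, assuming TF-IDF weighting with cosine similarity as a stand-in baseline. The function names (`tokenize`, `tfidf_vectors`, `cosine`, `reuse_candidates`), the sample topics, and the 0.6 threshold are all hypothetical and are not taken from the paper.

```python
import math
import re
from collections import Counter
from itertools import combinations


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(topics):
    """Build one sparse TF-IDF vector (term -> weight dict) per topic."""
    token_lists = {name: tokenize(body) for name, body in topics.items()}
    n_topics = len(token_lists)
    doc_freq = Counter()
    for tokens in token_lists.values():
        doc_freq.update(set(tokens))  # count each term once per topic
    vectors = {}
    for name, tokens in token_lists.items():
        counts = Counter(tokens)
        # Smoothed IDF: terms present in every topic keep a nonzero weight.
        vectors[name] = {
            term: count * (math.log((1 + n_topics) / (1 + doc_freq[term])) + 1.0)
            for term, count in counts.items()
        }
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def reuse_candidates(topics, threshold=0.6):
    """Return (topic_a, topic_b, score) pairs at or above the threshold.

    The threshold is an illustrative assumption, not a value from the paper.
    Pairs are sorted from most to least similar.
    """
    vectors = tfidf_vectors(topics)
    pairs = []
    for a, b in combinations(sorted(vectors), 2):
        score = cosine(vectors[a], vectors[b])
        if score >= threshold:
            pairs.append((a, b, score))
    return sorted(pairs, key=lambda pair: pair[2], reverse=True)


if __name__ == "__main__":
    # Hypothetical topics standing in for modular units of a user manual.
    sample_topics = {
        "install_a": "Insert the battery and close the cover before turning on the device.",
        "install_b": "Insert the battery and close the cover before switching on the device.",
        "warranty": "The warranty covers manufacturing defects for two years.",
    }
    for a, b, score in reuse_candidates(sample_topics):
        print(f"{a} <-> {b}: {score:.3f}")
```

In this sketch, near-duplicate topics surface as high-scoring pairs that an author could review and merge into a single reusable unit. A real system would likely need stronger similarity measures and human confirmation, which is the role the abstract assigns to author feedback and interactive visualization.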

