Structured source retrieval for improving software search during program comprehension tasks

ACM SIGPLAN International Conference on Systems, Programming, Languages and Applications: Software for Humanity
Pub Date: 2014-10-20
DOI: 10.1145/2660252.2660253

Brian P. Eddy


Citations: 4

Abstract

During the software maintenance and evolution phase, the majority of a developer's time is spent on program comprehension tasks. Feature location (i.e., finding the first location to make a modification), impact analysis (i.e., determining what and to what extent a program is affected by a change), and traceability (i.e., determining where requirements are implemented in the program) are all examples of such tasks. Recent research in the area of program comprehension has focused on using textual information, structural information (i.e., information regarding the creation and use of objects and methods within the code), and execution traces to develop tools that ease the burden on developers and decrease the time spent on each task. Furthermore, new studies in automating these tasks have started using text retrieval techniques, such as the vector space model (VSM), latent semantic indexing (LSI), and latent Dirichlet allocation (LDA), for searching software. This doctoral symposium summary presents two promising areas for improving existing techniques by combining structural information with text retrieval. The first is a methodology for evaluating the usefulness of text obtained from a program by looking at the structural location of terms (e.g., method names, comments, identifiers). The second focuses on improving the existing text retrieval approaches by providing more flexible queries (i.e., search strings). These two areas are complementary and may be combined.
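To make the idea of combining structural location with text retrieval concrete, here is a minimal sketch of a vector space model over methods in which each term is weighted by where it appears. It is an illustration only, not the approach described in the summary: the `LOCATION_WEIGHTS` values, the function names, and the toy corpus are all assumptions introduced for this example.

```python
import math
from collections import Counter

# Hypothetical weights per structural location (illustrative values only).
LOCATION_WEIGHTS = {"method_name": 3.0, "identifier": 2.0, "comment": 1.0}


def build_vector(fields, weights=LOCATION_WEIGHTS):
    """Return location-weighted term frequencies for one method."""
    vec = Counter()
    for location, terms in fields.items():
        for term in terms:
            vec[term.lower()] += weights.get(location, 1.0)
    return vec


def idf_table(vectors):
    """Smoothed inverse document frequency over all method vectors."""
    n = len(vectors)
    df = Counter(term for vec in vectors for term in vec)
    return {term: math.log(1 + n / count) for term, count in df.items()}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank(query, corpus, weights=LOCATION_WEIGHTS):
    """Rank methods in `corpus` against a free-text query, best match first."""
    names = list(corpus)
    tfs = [build_vector(corpus[name], weights) for name in names]
    idf = idf_table(tfs)
    docs = [{t: w * idf[t] for t, w in tf.items()} for tf in tfs]
    q_tf = Counter(query.lower().split())
    # Query terms that never occur in the corpus carry no signal and are dropped.
    q = {t: c * idf[t] for t, c in q_tf.items() if t in idf}
    scores = [(name, cosine(q, doc)) for name, doc in zip(names, docs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy corpus: terms already extracted and split by structural location.
corpus = {
    "saveFile": {
        "method_name": ["save", "file"],
        "identifier": ["path", "buffer"],
        "comment": ["write", "buffer", "to", "disk"],
    },
    "openFile": {
        "method_name": ["open", "file"],
        "identifier": ["path", "handle"],
        "comment": ["read", "file", "from", "disk"],
    },
    "renderMenu": {
        "method_name": ["render", "menu"],
        "identifier": ["items"],
        "comment": ["draw", "menu", "and", "save", "layout"],
    },
}

for name, score in rank("save file", corpus):
    print(f"{name}: {score:.3f}")
```

With this toy corpus, the query "save file" ranks `saveFile` first, partly because both query terms occur in its method name, which carries the highest assumed weight. Changing the entries in `LOCATION_WEIGHTS` is one simple way to probe how much each structural location contributes to retrieval quality.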


