John Ahlgren, Maria Eugenia Berezin, Kinga Bojarczuk, Elena Dulskyte, Inna Dvortsova, Johann George, Natalija Gucevska, M. Harman, Shan He, R. Lämmel, E. Meijer, Silvia Sapora, Justin Spahr-Summers
Software-intensive organizations rely on large numbers of software assets of different types, e.g., source-code files, tables in the data warehouse, and software configurations. The most suitable owner of a given asset changes over time, e.g., due to reorganizations and changes in individual roles. New forms of automation can help suggest more suitable owners for any given asset at a given point in time. Such efforts toward ownership health increase accountability for ownership. The problem of finding the most suitable owners for an asset is essentially a program comprehension problem: how do we automatically determine who would be best placed to understand, maintain, and evolve (and thereby assume ownership of) a given asset? This paper introduces the Facebook Ownesty system, which uses a combination of ultra-large-scale data mining and machine learning and has been deployed at Facebook as part of the company's ownership management approach. Ownesty processes many millions of software assets (e.g., source-code files) and takes workflow and organizational aspects into account. The paper sets out open problems and challenges on ownership for the research community, with advances expected from the fields of software engineering, programming languages, and machine learning.
{"title":"Ownership at Large: Open Problems and Challenges in Ownership Management","authors":"John Ahlgren, Maria Eugenia Berezin, Kinga Bojarczuk, Elena Dulskyte, Inna Dvortsova, Johann George, Natalija Gucevska, M. Harman, Shan He, R. Lämmel, E. Meijer, Silvia Sapora, Justin Spahr-Summers","doi":"10.1145/3387904.3389293","DOIUrl":"https://doi.org/10.1145/3387904.3389293","url":null,"abstract":"Software-intensive organizations rely on large numbers of software assets of different types, e.g., source-code files, tables in the data warehouse, and software configurations. Who is the most suitable owner of a given asset changes over time, e.g., due to reorganization and individual function changes. New forms of automation can help suggest more suitable owners for any given asset at a given point in time. By such efforts on ownership health, accountability of ownership is increased. The problem of finding the most suitable owners for an asset is essentially a program comprehension problem: how do we automatically determine who would be best placed to understand, maintain, evolve (and thereby assume ownership of) a given asset. This paper introduces the Facebook Ownesty system, which uses a combination of ultra large scale data mining and machine learning and has been deployed at Facebook as part of the company's ownership management approach. Ownesty processes many millions of software assets (e.g., source-code files) and it takes into account workflow and organizational aspects. The paper sets out open problems and challenges on ownership for the research community with advances expected from the fields of software engineering, programming languages, and machine learning.","PeriodicalId":231095,"journal":{"name":"2020 IEEE/ACM 28th International Conference on Program Comprehension (ICPC)","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-04-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124870079","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Understanding what a software engineer (a developer, an incident responder, a production engineer, etc.) is working on is a challenging problem, especially when considering the more complex software engineering workflows in software-intensive organizations: i) engineers rely on a multitude (perhaps hundreds) of loosely integrated tools; ii) engineers engage in concurrent and relatively long-running workflows; iii) infrastructure (such as logging) is not fully aware of work items; iv) engineering processes (e.g., for incident response) are not explicitly modeled. In this paper, we explain the corresponding 'work-item prediction challenge' on the grounds of representative scenarios, report on related efforts at Facebook, discuss some lessons learned, and review related work in a call to arms to leverage, advance, and combine techniques from program comprehension, mining software repositories, process mining, and machine learning.
{"title":"Understanding What Software Engineers Are Working on The Work-Item Prediction Challenge","authors":"R. Lämmel, A. Kerber, Liane Praza","doi":"10.1145/3387904.3389294","DOIUrl":"https://doi.org/10.1145/3387904.3389294","url":null,"abstract":"Understanding what a software engineer (a developer, an incident responder, a production engineer, etc.) is working on is a challenging problem - especially when considering the more complex software engineering workflows in software-intensive organizations: i) engineers rely on a multitude (perhaps hundreds) of loosely integrated tools; ii) engineers engage in concurrent and relatively long running workflows; ii) infrastructure (such as logging) is not fully aware of work items; iv) engineering processes (e.g., for incident response) are not explicitly modeled. In this paper, we explain the corresponding’ work-item prediction challenge’ on the grounds of representative scenarios, report on related efforts at Facebook, discuss some lessons learned, and review related work to call to arms to leverage, advance, and combine techniques from program comprehension, mining software repositories, process mining, and machine learning.","PeriodicalId":231095,"journal":{"name":"2020 IEEE/ACM 28th International Conference on Program Comprehension (ICPC)","volume":"59 20","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-04-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"113981654","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Alexander LeClair, S. Haque, Lingfei Wu, Collin McMillan
Automatic source code summarization is the task of generating natural language descriptions for source code. Automatic code summarization is a rapidly expanding research area, especially as the community has taken greater advantage of advances in neural network and AI technologies. In general, source code summarization techniques use the source code as input and output a natural language description. Yet a strong consensus is developing that using structural information as input leads to improved performance. The first approaches to use structural information flattened the AST into a sequence. Recently, more complex approaches based on random AST paths or graph neural networks have improved on the models using flattened ASTs. However, the literature does not yet describe the use of a graph neural network together with the source code sequence as separate inputs to a model. Therefore, in this paper, we present an approach that uses a graph-based neural architecture, which better matches the default structure of the AST, to generate these summaries. We evaluate our technique using a data set of 2.1 million Java method-comment pairs and show improvement over four baseline techniques: two from the software engineering literature and two from the machine learning literature.
{"title":"Improved Code Summarization via a Graph Neural Network","authors":"Alexander LeClair, S. Haque, Lingfei Wu, Collin McMillan","doi":"10.1145/3387904.3389268","DOIUrl":"https://doi.org/10.1145/3387904.3389268","url":null,"abstract":"Automatic source code summarization is the task of generating natural language descriptions for source code. Automatic code summarization is a rapidly expanding research area, especially as the community has taken greater advantage of advances in neural network and AI technologies. In general, source code summarization techniques use the source code as input and outputs a natural language description. Yet a strong consensus is developing that using structural information as input leads to improved performance. The first approaches to use structural information flattened the AST into a sequence. Recently, more complex approaches based on random AST paths or graph neural networks have improved on the models using flattened ASTs. However, the literature still does not describe the using a graph neural network together with source code sequence as separate inputs to a model. Therefore, in this paper, we present an approach that uses a graph-based neural architecture that better matches the default structure of the AST to generate these summaries. We evaluate our technique using a data set of 2.1 million Java method-comment pairs and show improvement over four baseline techniques, two from the software engineering literature, and two from machine learning literature.","PeriodicalId":231095,"journal":{"name":"2020 IEEE/ACM 28th International Conference on Program Comprehension (ICPC)","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-04-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133317692","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Fang Liu, Ge Li, Bolin Wei, Xin Xia, Ming Li, Zhiyi Fu, Zhi Jin
Code completion, one of the most useful features in Integrated Development Environments (IDEs), can accelerate software development by suggesting libraries, APIs, and method names in real time. Recent studies have shown that statistical language models can improve the performance of code completion tools by learning from large-scale software repositories. However, these models suffer from three major drawbacks: a) the hierarchical structural information of the program is not fully utilized in the program's representation; b) semantic relationships in programs can span long distances, and existing language models based on recurrent neural networks are not sufficient to model such long-term dependencies; c) existing approaches perform a specific task in one model, which leads to the underuse of information from related tasks. To address these challenges, in this paper we propose a self-attentional neural architecture for code completion with multi-task learning. To utilize the hierarchical structural information of the programs, we present a novel method that considers the path from the predicting node to the root node. To capture the long-term dependencies in the input programs, we adopt a self-attentional network as the base language model. To enable knowledge sharing between related tasks, we propose a Multi-Task Learning (MTL) framework to learn two related code completion tasks jointly. Experiments on three real-world datasets demonstrate the effectiveness of our model compared with state-of-the-art methods.
{"title":"A Self-Attentional Neural Architecture for Code Completion with Multi-Task learning","authors":"Fang Liu, Ge Li, Bolin Wei, Xin Xia, Ming Li, Zhiyi Fu, Zhi Jin","doi":"10.1145/3387904.3389261","DOIUrl":"https://doi.org/10.1145/3387904.3389261","url":null,"abstract":"Code completion, one of the most useful features in the Integrated Development Environments (IDEs), can accelerate software development by suggesting the libraries, APIs, and method names in real-time. Recent studies have shown that statistical language models can improve the performance of code completion tools through learning from large-scale software repositories. However, these models suffer from three major drawbacks: a) The hierarchical structural information of the programs is not fully utilized in the program's representation; b) In programs, the semantic relationships can be very long. Existing recurrent neural networks based language models are not sufficient to model the long-term dependency. c) Existing approaches perform a specific task in one model, which leads to the underuse of the information from related tasks. To address these challenges, in this paper, we propose a selfattentional neural architecture for code completion with multi-task learning. To utilize the hierarchical structural information of the programs, we present a novel method that considers the path from the predicting node to the root node. To capture the long-term dependency in the input programs, we adopt a self-attentional architecture based network as the base language model. To enable the knowledge sharing between related tasks, we creatively propose a Multi-Task Learning (MTL) framework to learn two related tasks in code completion jointly. Experiments on three real-world datasets demonstrate the effectiveness of our model when compared with state-of-the-art methods.","PeriodicalId":231095,"journal":{"name":"2020 IEEE/ACM 28th International Conference on Program Comprehension (ICPC)","volume":"44 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125686728","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}