Interpretable Code Summarization

IF 5.7; CAS Zone 2 (Computer Science); JCR Q1 (COMPUTER SCIENCE, HARDWARE & ARCHITECTURE); IEEE Transactions on Reliability; Pub Date: 2024-03-14; DOI: 10.1109/TR.2024.3392876
Md Sarwar Kamal;Sonia Farhana Nimmy;Nilanjan Dey
IEEE Transactions on Reliability, volume 74 1, pages 2280-2289. https://ieeexplore.ieee.org/document/10530504/
Citations: 0

Abstract

Code summarization is the process of generating a readable natural-language description from source code. It has become a popular research topic in software maintenance, code generation, and code recovery. Existing code summarization methods follow an encoder-decoder approach and use various machine learning techniques to generate natural language from source code. Although most of these methods are state of the art, the complex encoding and decoding process that maps tokens to natural-language words is difficult to understand. These encoder-decoder approaches are therefore treated as opaque (black-box) models. This research proposes explainable AI methods that overcome the black-box nature of token mapping in the code summarization process. We create an abstract syntax tree (AST) from the tokens of the source code. We then embed the AST into natural-language words using a bilingual statistical probability approach to generate candidate statistical parse trees. We apply the PageRank algorithm to rank these parse trees, and from the best-ranked tree we generate the comment for the corresponding code snippet. To explain our generation method, we use a Takagi–Sugeno fuzzy approach, layerwise relevance propagation, and a hidden Markov model. These approaches make our method trustworthy and make the mapping of source code tokens to natural-language words understandable to humans.
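
Two of the steps named in the abstract, building an AST from source code and ranking candidates with PageRank, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not say how the parse trees are linked to one another, so the candidate graph below (`t1`, `t2`, `t3`), the helper names `ast_node_types` and `pagerank`, and the damping factor 0.85 are all assumptions chosen for illustration. Python's standard `ast` module stands in for whatever parser the paper uses.

```python
import ast


def ast_node_types(source: str) -> list[str]:
    """Parse Python source and list the AST node type names.

    ast.walk yields nodes breadth-first, so the root Module comes first.
    """
    tree = ast.parse(source)
    return [type(node).__name__ for node in ast.walk(tree)]


def pagerank(
    adjacency: dict[str, list[str]],
    damping: float = 0.85,
    iterations: int = 50,
) -> dict[str, float]:
    """Plain power-iteration PageRank over a small directed graph.

    Assumes every link target also appears as a key of `adjacency`.
    """
    nodes = list(adjacency)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every node starts with the teleportation share.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node, targets in adjacency.items():
            # A node without outgoing links spreads its rank evenly.
            receivers = targets if targets else nodes
            share = damping * rank[node] / len(receivers)
            for target in receivers:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical example: one code snippet and three candidate parse trees
# whose links are invented for this sketch.
node_types = ast_node_types("def add(a, b):\n    return a + b")
candidate_links = {"t1": ["t2"], "t2": ["t1", "t3"], "t3": ["t2"]}
scores = pagerank(candidate_links)
best_tree = max(scores, key=scores.get)
```

In this toy graph the candidate that the others link to receives the highest score, which mirrors the idea of generating the comment from the best-ranked tree.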
Source journal
IEEE Transactions on Reliability (Engineering & Technology: Electrical and Electronic Engineering)
CiteScore: 12.20
Self-citation rate: 8.50%
Articles published: 153
Review time: 7.5 months
Journal description: IEEE Transactions on Reliability is a refereed journal for the reliability and allied disciplines including, but not limited to, maintainability, physics of failure, life testing, prognostics, design and manufacture for reliability, reliability for systems of systems, network availability, mission success, warranty, safety, and various measures of effectiveness. Topics eligible for publication range from hardware to software, from materials to systems, from consumer and industrial devices to manufacturing plants, from individual items to networks, from techniques for making things better to ways of predicting and measuring behavior in the field. As an engineering subject that supports new and existing technologies, we constantly expand into new areas of the assurance sciences.