Mining framework usage graphs from app corpora

Sergio Mover, S. Sankaranarayanan, Rhys Braginton Pettee Olsen, B. E. Chang
{"title":"Mining framework usage graphs from app corpora","authors":"Sergio Mover, S. Sankaranarayanan, Rhys Braginton Pettee Olsen, B. E. Chang","doi":"10.1109/SANER.2018.8330216","DOIUrl":null,"url":null,"abstract":"We investigate the problem of mining graph-based usage patterns for large, object-oriented frameworks like Android—revisiting previous approaches based on graph-based object usage models (groums). Groums are a promising approach to represent usage patterns for object-oriented libraries because they simultaneously describe control flow and data dependencies between methods of multiple interacting object types. However, this expressivity comes at a cost: mining groums requires solving a subgraph isomorphism problem that is well known to be expensive. This cost limits the applicability of groum mining to large API frameworks. In this paper, we employ groum mining to learn usage patterns for object-oriented frameworks from program corpora. The central challenge is to scale groum mining so that it is sensitive to usages horizontally across programs from arbitrarily many developers (as opposed to simply usages vertically within the program of a single developer). To address this challenge, we develop a novel groum mining algorithm that scales on a large corpus of programs. We first use frequent itemset mining to restrict the search for groums to smaller subsets of methods in the given corpus. Then, we pose the subgraph isomorphism as a SAT problem and apply efficient pre-processing algorithms to rule out fruitless comparisons ahead of time. Finally, we identify containment relationships between clusters of groums to characterize popular usage patterns in the corpus (as well as classify less popular patterns as possible anomalies). We find that our approach scales on a corpus of over five hundred open source Android applications, effectively mining obligatory and best-practice usage patterns.","PeriodicalId":6602,"journal":{"name":"2018 IEEE 25th International Conference on Software Analysis, Evolution and Reengineering (SANER)","volume":"18 1","pages":"277-289"},"PeriodicalIF":0.0000,"publicationDate":"2018-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"10","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2018 IEEE 25th International Conference on Software Analysis, Evolution and Reengineering (SANER)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/SANER.2018.8330216","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 10

Abstract

We investigate the problem of mining graph-based usage patterns for large, object-oriented frameworks like Android—revisiting previous approaches based on graph-based object usage models (groums). Groums are a promising approach to represent usage patterns for object-oriented libraries because they simultaneously describe control flow and data dependencies between methods of multiple interacting object types. However, this expressivity comes at a cost: mining groums requires solving a subgraph isomorphism problem that is well known to be expensive. This cost limits the applicability of groum mining to large API frameworks. In this paper, we employ groum mining to learn usage patterns for object-oriented frameworks from program corpora. The central challenge is to scale groum mining so that it is sensitive to usages horizontally across programs from arbitrarily many developers (as opposed to simply usages vertically within the program of a single developer). To address this challenge, we develop a novel groum mining algorithm that scales on a large corpus of programs. We first use frequent itemset mining to restrict the search for groums to smaller subsets of methods in the given corpus. Then, we pose the subgraph isomorphism as a SAT problem and apply efficient pre-processing algorithms to rule out fruitless comparisons ahead of time. Finally, we identify containment relationships between clusters of groums to characterize popular usage patterns in the corpus (as well as classify less popular patterns as possible anomalies). We find that our approach scales on a corpus of over five hundred open source Android applications, effectively mining obligatory and best-practice usage patterns.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
从应用语料库中挖掘框架使用图
我们研究了为大型面向对象框架(如android)挖掘基于图的使用模式的问题——重新审视了以前基于图的对象使用模型(组)的方法。组是表示面向对象库的使用模式的一种很有前途的方法,因为它们同时描述了多个交互对象类型的方法之间的控制流和数据依赖关系。然而,这种表达性是有代价的:挖掘群需要解决子图同构问题,这是众所周知的昂贵问题。这种成本限制了群挖掘对大型API框架的适用性。在本文中,我们使用群挖掘从程序语料库中学习面向对象框架的使用模式。核心挑战是扩展群挖掘,以便对任意多个开发人员的程序中的横向使用敏感(与单个开发人员的程序中的简单垂直使用相反)。为了应对这一挑战,我们开发了一种新的群挖掘算法,该算法可以在大型程序语料库上进行扩展。我们首先使用频繁项集挖掘来将组的搜索限制在给定语料库中更小的方法子集中。然后,我们将子图同构作为一个SAT问题,并应用有效的预处理算法提前排除无结果的比较。最后,我们确定组簇之间的包含关系,以表征语料库中流行的使用模式(以及将不太流行的模式分类为可能的异常)。我们发现我们的方法可以在超过500个开源Android应用程序的语料库上扩展,有效地挖掘必要的和最佳实践的使用模式。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Exploring the integration of user feedback in automated testing of Android applications The Statechart Workbench: Enabling scalable software event log analysis using process mining Detecting code smells using machine learning techniques: Are we there yet? Classifying stack overflow posts on API issues Re-evaluating method-level bug prediction
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1