Automatic Classification of UML Class Diagrams from Images

Truong Ho-Quang, M. Chaudron, I. Samuelsson, J. Hjaltason, Bilal Karasneh, Hafeez Osman
Published in: 2014 21st Asia-Pacific Software Engineering Conference (publication date 2014-12-01)
DOI: 10.1109/APSEC.2014.65 (https://doi.org/10.1109/APSEC.2014.65)
Citation count: 28

Abstract

Graphical modelling of various aspects of software and systems is a common part of software development, and UML is the de facto standard for many types of software models. To research UML, academia needs a corpus of UML models. Because a large portion of UML class diagrams (UML CDs) is available as images on the Internet, an automated system that can classify UML CD images would be very beneficial for building such a corpus. In this study, we propose 23 image features and investigate their use for classifying UML CD images. We analyse the performance of the features and assess their contribution based on their Information Gain Attribute Evaluation scores. We study the specificity and sensitivity of six classification algorithms on a set of 1300 images. We found that 19 of the 23 proposed features can be considered influential predictors for classifying UML CD images. Across the six algorithms, the prediction rate reaches nearly 96% correctness for UML CDs and 91% for non-UML CDs.
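
To make the evaluation terms in the abstract concrete, the following is a minimal sketch of how the information gain of a single discrete feature, and the sensitivity and specificity of a classifier, could be computed. It is not the authors' implementation: the feature name `has_many_rectangles`, the labels `"CD"` and `"other"`, and the eight toy samples are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy of the labels minus the entropy left after splitting
    the samples by the value of one discrete feature."""
    total = len(labels)
    remainder = 0.0
    for value in set(feature_values):
        subset = [lab for f, lab in zip(feature_values, labels) if f == value]
        remainder += (len(subset) / total) * entropy(subset)
    return entropy(labels) - remainder


def sensitivity_specificity(actual, predicted, positive="CD"):
    """Return (sensitivity, specificity) for a binary classification."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == positive and p == positive for a, p in pairs)
    fn = sum(a == positive and p != positive for a, p in pairs)
    tn = sum(a != positive and p != positive for a, p in pairs)
    fp = sum(a != positive and p == positive for a, p in pairs)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data: 4 class-diagram images and 4 other images.
labels = ["CD", "CD", "CD", "CD", "other", "other", "other", "other"]
# Hypothetical binary image feature extracted from each image.
has_many_rectangles = [True, True, True, False, False, False, False, True]
# Hypothetical classifier output for the same images.
predicted = ["CD", "CD", "CD", "other", "other", "other", "other", "CD"]

gain = information_gain(has_many_rectangles, labels)
sens, spec = sensitivity_specificity(labels, predicted)
print(f"information gain: {gain:.4f} bits")
print(f"sensitivity: {sens:.2f}, specificity: {spec:.2f}")
```

In this framing, sensitivity corresponds to the share of UML CD images correctly recognised and specificity to the share of non-UML CD images correctly rejected. A feature with information gain close to zero tells the classifier little about the class label.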