Finding bugs in Gremlin-based graph database systems via randomized differential testing

Yingying Zheng, Wensheng Dou, Yicheng Wang, Zheng Qin, Leile Tang, Yu Gao, Dong Wang, Wei Wang, Jun Wei
Proceedings of the 31st ACM SIGSOFT International Symposium on Software Testing and Analysis, published 2022-07-18
DOI: https://doi.org/10.1145/3533767.3534409
Citations: 9

Abstract

Graph database systems (GDBs) store and retrieve graph data efficiently, and have become a critical component of many applications, e.g., knowledge graphs, social networks, and fraud detection. It is therefore important to ensure that GDBs operate correctly. Logic bugs can make a GDB return an incorrect result for a given query. These bugs are critical and can easily go unnoticed by developers when the graph and queries become complicated. Despite the importance of GDBs, logic bugs in GDBs have received less attention than those in relational database systems. In this paper, we present Grand, an approach for automatically finding logic bugs in GDBs that adopt Gremlin as their query language. The core idea of Grand is to construct semantically equivalent databases for multiple GDBs, and then compare the results of a Gremlin query on these databases. If the results that a query returns on multiple GDBs differ, the likely cause is a logic bug in at least one of these GDBs. To test GDBs effectively, we propose a model-based query generation approach that generates valid Gremlin queries that can potentially return non-empty results, and a data mapping approach that unifies the format of query results across different GDBs. We evaluate Grand on six widely used GDBs, e.g., Neo4j and HugeGraph. In total, we have found 21 previously unknown logic bugs in these GDBs. Among them, developers have confirmed 18 bugs and fixed 7.
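
The comparison step described in the abstract can be illustrated with a minimal Python sketch. It is not Grand's implementation. The names `normalize`, `canonical`, and `differential_test` are hypothetical, and the three "backends" are toy stand-ins that return hard-coded results instead of talking to a real GDB. The sketch assumes that results are compared as unordered multisets, which only makes sense for queries with no ordering step, and that numeric values count as equal when they are equal as floats.

```python
from collections import Counter
from typing import Any, Callable, Dict, Hashable, List


def normalize(value: Any) -> Hashable:
    """Map a backend-specific value to a canonical, hashable form (assumed mapping)."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return ("bool", value)
    if isinstance(value, (int, float)):
        return ("num", float(value))
    if isinstance(value, dict):
        items = sorted((str(k), normalize(v)) for k, v in value.items())
        return ("map", tuple(items))
    if isinstance(value, (list, tuple)):
        return ("list", tuple(normalize(v) for v in value))
    return ("str", str(value))


def canonical(results: List[Any]) -> Counter:
    """Treat a result list as an unordered multiset of normalized values."""
    return Counter(normalize(r) for r in results)


def differential_test(
    query: str, backends: Dict[str, Callable[[str], List[Any]]]
) -> List[List[str]]:
    """Run one query on every backend and group backend names by outcome.

    More than one group means the backends disagree, which is a candidate
    logic bug that still needs manual inspection.
    """
    groups: Dict[Hashable, List[str]] = {}
    for name, run in backends.items():
        try:
            outcome = ("ok", frozenset(canonical(run(query)).items()))
        except Exception as exc:  # an error on one backend only is also a discrepancy
            outcome = ("error", type(exc).__name__)
        groups.setdefault(outcome, []).append(name)
    return list(groups.values())


# Toy stand-ins for three GDBs loaded with the same graph (hypothetical data).
backends = {
    "gdb_a": lambda q: [1, 2, 3],
    "gdb_b": lambda q: [3.0, 1.0, 2.0],  # same values, other order and numeric type
    "gdb_c": lambda q: [1, 2],           # drops a result: a simulated logic bug
}

groups = differential_test("g.V().values('age')", backends)
print(groups)  # [['gdb_a', 'gdb_b'], ['gdb_c']]
```

A disagreement shows that at least one backend is wrong but not which one. In this toy run `gdb_c` is the outlier, yet a majority can also be wrong, so each discrepancy has to be traced back to the intended query semantics.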