
Empirical Software Engineering: Latest Articles

Just-in-Time crash prediction for mobile apps
IF 4.1 Zone 2 Computer Science Q1 Computer Science Pub Date: 2024-05-08 DOI: 10.1007/s10664-024-10455-7
Chathrie Wimalasooriya, Sherlock A. Licorish, Daniel Alencar da Costa, Stephen G. MacDonell

Just-In-Time (JIT) defect prediction aims to identify defects early, at commit time. Hence, developers can take precautions to avoid defects when the code changes are still fresh in their minds. However, the utility of JIT defect prediction has not been investigated in relation to crashes of mobile apps. We therefore conducted a multi-case study employing both quantitative and qualitative analysis. In the quantitative analysis, we used machine learning techniques for prediction. We collected 113 reliability-related metrics for about 30,000 commits from 14 Android apps and selected 14 important metrics for prediction. We found that both standard JIT metrics and static analysis warnings are important for JIT prediction of mobile app crashes. We further optimized prediction performance, comparing seven state-of-the-art defect prediction techniques with hyperparameter optimization. Our results showed that Random Forest is the best performing model with an AUC-ROC of 0.83. In our qualitative analysis, we manually analysed a sample of 642 commits and identified different types of changes that are common in crash-inducing commits. We explored whether different aspects of changes can be used as metrics in JIT models to improve prediction performance. We found these metrics improve the prediction performance significantly. Hence, we suggest considering static analysis warnings and Android-specific metrics to adapt standard JIT defect prediction models for a mobile context to predict crashes. Finally, we provide recommendations to bridge the gap between research and practice and point to opportunities for future research.
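The AUC-ROC figure reported above can be read as the probability that a randomly chosen crash-inducing commit receives a higher predicted score than a randomly chosen clean commit. The following is a minimal sketch of that computation, not the authors' pipeline; the function name `auc_roc` and the labels and scores are hypothetical illustration data.

```python
from typing import Sequence


def auc_roc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Fraction of (crash-inducing, clean) commit pairs ranked correctly."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one commit of each class")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # a tie counts as half a correct ranking
    return wins / (len(positives) * len(negatives))


# Hypothetical data: 1 = crash-inducing commit, 0 = clean commit.
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.8]
print(auc_roc(labels, scores))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is why the measure is commonly used when crash-inducing commits are much rarer than clean ones.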

Citations: 0
Analyzing and revivifying function signature inference using deep learning
IF 4.1 Zone 2 Computer Science Q1 Computer Science Pub Date: 2024-05-08 DOI: 10.1007/s10664-024-10453-9
Yan Lin, Trisha Singhal, Debin Gao, David Lo

Function signatures play an important role in binary analysis and security enhancement, with typical examples in bug finding and control-flow integrity enforcement. However, recovering function signatures by static binary analysis is challenging because information vital for such recovery is stripped off during compilation. Although function signature recovery using deep learning (DL) has been proposed to handle such challenges, the reported accuracy is low for binaries compiled with optimizations. In this paper, we first perform a systematic study to quantify the extent to which compiler optimizations (negatively) impact the accuracy of existing DL techniques based on Recurrent Neural Networks (RNNs) for function signature recovery. Our experiments show that the accuracy of the state-of-the-art DL technique drops from 98.7% to 87.7% when training and testing on optimized binaries. With the help of saliency maps, we further investigate the types of instructions that the existing RNN model deems most important in inferring function signatures. The results show that the existing RNN model mistakenly relies on non-argument-accessing instructions to infer the number of arguments, especially when dealing with optimized binaries. Finally, we identify specific weaknesses in such existing approaches and propose an enhanced DL approach named ReSIL that incorporates compiler-optimization-specific domain knowledge into the learning process. Our experimental results show that ReSIL significantly improves the accuracy and F1 score in inferring function signatures, e.g., raising the accuracy in inferring the number of arguments for callees compiled with optimization flag O1 from 84.83% to 92.68%. Meanwhile, ReSIL correctly treats the argument-accessing instructions as the most important ones for inference. We also demonstrate the security implications of ReSIL for Control-Flow Integrity enforcement in stopping potential Counterfeit Object-Oriented Programming (COOP) attacks.

Citations: 0
Ethics in AI through the practitioner’s view: a grounded theory literature review
IF 4.1 Zone 2 Computer Science Q1 Computer Science Pub Date: 2024-05-06 DOI: 10.1007/s10664-024-10465-5
Aastha Pant, Rashina Hoda, Chakkrit Tantithamthavorn, Burak Turhan

The term ethics is widely used, explored, and debated in the context of developing Artificial Intelligence (AI) based software systems. In recent years, numerous incidents have raised the profile of ethical issues in AI development and led to public concerns about the proliferation of AI technology in our everyday lives. But what do we know about the views and experiences of those who develop these systems – the AI practitioners? We conducted a grounded theory literature review (GTLR) of 38 primary empirical studies that included AI practitioners’ views on ethics in AI and analysed them to derive five categories: practitioner awareness, perception, need, challenge, and approach. These are underpinned by multiple codes and concepts that we explain with evidence from the included studies. We present a taxonomy of ethics in AI from practitioners’ viewpoints to assist AI practitioners in identifying and understanding the different aspects of AI ethics. The taxonomy provides a landscape view of the key aspects that concern AI practitioners when it comes to ethics in AI. We also share an agenda for future research studies and recommendations for practitioners, managers, and organisations to help in their efforts to better consider and implement ethics in AI.

Citations: 0
Hunting bugs: Towards an automated approach to identifying which change caused a bug through regression testing
IF 4.1 Zone 2 Computer Science Q1 Computer Science Pub Date: 2024-05-04 DOI: 10.1007/s10664-024-10479-z
Michel Maes-Bermejo, Alexander Serebrenik, Micael Gallego, Francisco Gortázar, Gregorio Robles, Jesús María González Barahona

Context

Finding code changes that introduced bugs is important both for practitioners and researchers, but doing it precisely is a manual, effort-intensive process. The perfect test method is a theoretical construct aimed at detecting Bug-Introducing Changes (BIC) through a theoretical perfect test. This perfect test always fails if the bug is present, and passes otherwise.

Objective

To explore a possible automatic operationalization of the perfect test method.

Method

To use regression tests as substitutes for the perfect test. For this, we transplant the regression tests to past snapshots of the code, and use them to identify the BIC, on a well-known collection of bugs from the Defects4J dataset.

Results

From 809 bugs in the dataset, when running our operationalization of the perfect test method, for 95 of them the BIC was identified precisely and in the remaining 4 cases, a list of candidates including the BIC was provided.

Conclusions

We demonstrate that the operationalization of the perfect test method through regression tests is feasible and can be completely automated in practice when tests can be transplanted and run in past snapshots of the code. Given that implementing a regression test when a bug is fixed is considered good practice, developers who follow it can effortlessly detect bug-introducing changes by using our operationalization of the perfect test method.
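The Method described above can be sketched as a search over past snapshots for the point where a transplanted regression test turns from passing to failing. This is a simplified illustration under stated assumptions, not the authors' tool: the function `find_bic_candidates`, the commit names, and the three-valued outcome (pass, fail, not runnable) are hypothetical, and the sketch assumes a single pass-to-fail transition in the history.

```python
from typing import Callable, List, Optional, Sequence

# Outcome of the transplanted test on one snapshot:
# True = passes, False = fails, None = could not be built or run there.
Outcome = Optional[bool]


def find_bic_candidates(
    snapshots: Sequence[str], run_test: Callable[[str], Outcome]
) -> List[str]:
    """Return the commits that may have introduced the bug, oldest first.

    `snapshots` is ordered from oldest to newest.
    """
    outcomes = [run_test(s) for s in snapshots]
    passing = [i for i, o in enumerate(outcomes) if o is True]
    if not passing:
        return []  # the test never passes, so there is no transition to find
    last_pass = passing[-1]
    for j in range(last_pass + 1, len(snapshots)):
        if outcomes[j] is False:
            # Every commit after the last pass, up to the first failure,
            # could have introduced the bug.
            return list(snapshots[last_pass + 1 : j + 1])
    return []  # the test never fails after the last passing snapshot


# Hypothetical history in which every snapshot can run the test.
clean_history = {"c1": True, "c2": True, "c3": False, "c4": False}
precise = find_bic_candidates(list(clean_history), clean_history.get)

# Hypothetical history in which the test cannot run on c3.
gappy_history = {"c1": True, "c2": True, "c3": None, "c4": False}
candidates = find_bic_candidates(list(gappy_history), gappy_history.get)

print(precise, candidates)
```

When the test runs on every snapshot, the result is a single commit; when some snapshots cannot run the test, the result widens to a list of candidates that still contains the bug-introducing change.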

Citations: 0
How do annotations affect Java code readability?
IF 4.1 Zone 2 Computer Science Q1 Computer Science Pub Date: 2024-05-03 DOI: 10.1007/s10664-024-10460-w
Eduardo Guerra, Everaldo Gomes, Jeferson Ferreira, Igor Wiese, Phyllipe Lima, Marco Gerosa, Paulo Meirelles

Context

Code annotations have gained widespread popularity in programming languages, offering developers the ability to attach metadata to code elements to define custom behaviors. Many modern frameworks and APIs use annotations to keep integration less verbose and located nearer to the corresponding code element. Despite these advantages, practitioners’ anecdotal evidence suggests that annotations might negatively affect code readability.

Objective

To better understand this effect, this paper systematically investigates the relationship between code annotations and code readability.

Method

In a survey with software developers (n=332), we present 15 pairs of Java code snippets with and without code annotations. These pairs were designed considering five categories of annotation used in real-world Java frameworks and APIs. Survey participants selected the code snippet they considered more readable for each pair and answered an open question about how annotations affect the code’s readability.

Results

Preferences were scattered for all categories of annotation usage, revealing no consensus among participants. The answers were spread even when segregated by participants’ programming or annotation-related experience. Nevertheless, some participants showed a consistent preference in favor or against annotations across all categories, which may indicate a personal preference. Our qualitative analysis of the open-ended questions revealed that participants often praise annotation impacts on design, maintainability, and productivity but expressed contrasting views on understandability and code clarity.

Conclusions

Software developers and API designers can consider our results when deciding whether to use annotations, equipped with the insight that developers express contrasting views of the annotations’ impact on code readability.

Citations: 0
VioDroid-Finder: automated evaluation of compliance and consistency for Android apps
IF 4.1 Zone 2 Computer Science Q1 Computer Science Pub Date: 2024-05-03 DOI: 10.1007/s10664-024-10470-8
Junren Chen, Cheng Huang, Jiaxuan Han

Rapid growth in the variety and quantity of apps makes it difficult for users to protect their privacy. Although regulations have been introduced and the Android ecosystem is constantly being improved, violations persist: privacy policies may not fully comply with regulations, and app behavior may not be fully consistent with privacy policies. To address these issues, this paper proposes an automated method called VioDroid-Finder for evaluating the compliance and consistency of Android apps. We first study existing common regulations and organize privacy policy content into 7 aspects (i.e., privacy categories); each privacy category has its own compliance rules that a privacy policy must follow. Secondly, we present a policy structure parser model based on a structure extraction/rebuilding method (which converts unstructured text into an XML tree) and a subtitle similarity calculation algorithm. Thirdly, we propose a violation analyzer that uses the BERT model to classify each sentence in the privacy policy; we collect existing issues and combine them with manual observations to define 6 types of violations, which are detected from the classification results. Then, we propose an inconsistency analyzer that uses static analysis to convert permissions, APIs, and GUI into a set of personal information; inconsistencies are detected by comparing that set with the personal information declared in the privacy policy. Finally, we evaluate 600 Chinese apps using the proposed method and detect many violations and inconsistencies, reflecting how widespread privacy violations currently are.
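The inconsistency check described above amounts to a set comparison between what an app can collect and what its privacy policy declares. The sketch below illustrates only that final comparison and only for permissions; the mapping `PERMISSION_TO_INFO`, the information labels, and the function `undeclared_information` are hypothetical and are not taken from the paper.

```python
from typing import Dict, Iterable, Set

# Hypothetical mapping from Android permissions to personal-information types.
PERMISSION_TO_INFO: Dict[str, str] = {
    "android.permission.ACCESS_FINE_LOCATION": "location",
    "android.permission.READ_CONTACTS": "contacts",
    "android.permission.CAMERA": "camera",
}


def undeclared_information(
    permissions: Iterable[str], declared: Iterable[str]
) -> Set[str]:
    """Return information types the app can access but the policy omits."""
    collected = {
        PERMISSION_TO_INFO[p] for p in permissions if p in PERMISSION_TO_INFO
    }
    return collected - set(declared)


# Hypothetical app: requests two permissions, policy mentions only location.
requested = [
    "android.permission.ACCESS_FINE_LOCATION",
    "android.permission.READ_CONTACTS",
]
policy_declares = {"location"}
print(undeclared_information(requested, policy_declares))
```

In the approach the abstract describes, the collected set would also be derived from API calls and GUI elements, and the declared set from the classified policy sentences; both steps are outside this sketch.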

Citations: 0
Patterns of multi-container composition for service orchestration with Docker Compose
IF 4.1 Zone 2 Computer Science Q1 Computer Science Pub Date: 2024-05-03 DOI: 10.1007/s10664-024-10462-8
Kalvin Eng, Abram Hindle, Eleni Stroulia

Software design patterns present general code solutions to common software design problems. Modern software systems rely heavily on containers for running their constituent service components. Yet, despite the prevalence of ready-to-use Docker service images that can participate in multi-container service compositions of applications, developers do not have much guidance on how to compose their own Docker service orchestrations. Thus, in this work, we curate a dataset of successful projects that employ Docker Compose as an orchestration tool to run multiple service containers; then, we engage in qualitative and quantitative analysis of Docker Compose configurations. The collection of data and analysis enables the identification and naming of repeating multi-container composition patterns that are used in numerous successful open-source projects, much like software design patterns. These patterns highlight how software systems are orchestrated in the real world and can serve as examples for anybody wishing to compose their own service orchestrations. These contributions also advance empirical research in software engineering patterns, as evidence is provided about how Docker Compose is used.

{"title":"Patterns of multi-container composition for service orchestration with Docker Compose","authors":"Kalvin Eng, Abram Hindle, Eleni Stroulia","doi":"10.1007/s10664-024-10462-8","DOIUrl":"https://doi.org/10.1007/s10664-024-10462-8","url":null,"abstract":"<p>Software design patterns present general code solutions to common software design problems. Modern software systems rely heavily on containers for running their constituent service components. Yet, despite the prevalence of ready-to-use Docker service images ready to participate in multi-container service compositions of applications, developers do not have much guidance on how to compose their own Docker service orchestrations. Thus in this work, we curate a dataset of successful projects that employ Docker Compose as an orchestration tool to run multiple service containers; then, we engage in qualitative and quantitative analysis of Docker Compose configurations. The collection of data and analysis enables the identification and naming of repeating multi-container composition patterns that are used in numerous successful open-source projects, much like software design patterns. These patterns highlight how software systems are orchestrated in the real-world and can give examples to anybody wishing to compose their own service orchestrations. 
These contributions also advance empirical research in software engineering patterns as evidence is provided about how Docker Compose is used.</p>","PeriodicalId":11525,"journal":{"name":"Empirical Software Engineering","volume":null,"pages":null},"PeriodicalIF":4.1,"publicationDate":"2024-05-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140887977","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Automatic bi-modal question title generation for Stack Overflow with prompt learning
IF 4.1 Zone 2 (Computer Science) Q1 Computer Science Pub Date: 2024-05-03 DOI: 10.1007/s10664-024-10466-4
Shaoyu Yang, Xiang Chen, Ke Liu, Guang Yang, Chi Yu

When drafting question posts for Stack Overflow, developers may not accurately summarize the core problems in the question titles, which can prevent these questions from getting timely help. Therefore, improving the quality of question titles has attracted wide attention from researchers. An initial study aimed to automatically generate the titles by analyzing only the code snippets in the question body. However, this study ignored the helpful information in the corresponding problem descriptions. Therefore, we propose an approach, SOTitle+, that considers bi-modal information (i.e., the code snippets and the problem descriptions) in the question body. We formalize title generation for different programming languages as separate but related tasks and utilize multi-task learning to solve these tasks. We then fine-tune the pre-trained language model CodeT5 to automatically generate the titles. Unfortunately, the inconsistent inputs and optimization objectives between the pre-training task and our investigated task may make it hard for fine-tuning to fully exploit the knowledge of the pre-trained model. To solve this issue, SOTitle+ further prompt-tunes CodeT5 with hybrid prompts (i.e., a mixture of hard and soft prompts). To verify the effectiveness of SOTitle+, we construct a large-scale, high-quality corpus from recent data dumps shared by Stack Overflow. Our corpus includes 179,119 high-quality question posts for six popular programming languages. Experimental results show that SOTitle+ can significantly outperform four state-of-the-art baselines in both automatic evaluation and human evaluation. In addition, our ablation studies confirm the effectiveness of the component settings (such as bi-modal information, prompt learning, hybrid prompts, and multi-task learning) of SOTitle+. Our work indicates that considering bi-modal information and prompt learning in Stack Overflow title generation is a promising exploration direction.

Citations: 0
An empirical study into the effects of transpilation on quantum circuit smells
IF 4.1 Zone 2 (Computer Science) Q1 Computer Science Pub Date: 2024-05-02 DOI: 10.1007/s10664-024-10461-9
Manuel De Stefano, Dario Di Nucci, Fabio Palomba, Andrea De Lucia

Quantum computing is a promising field that can solve complex problems beyond traditional computers' capabilities. The development of high-quality quantum software applications, called quantum software engineering, has recently gained attention. However, quantum software development faces challenges related to code quality. A recent study found that many open-source quantum programs are affected by quantum-specific code smells, with long circuit being the most common. While the study provided relevant insights into the prevalence of code smells in quantum circuits, it did not explore the potential effect of transpilation, a necessary step for executing quantum computer programs, on the emergence of code smells. Indeed, transpilation might alter the characteristics used to detect the presence of a smell in a circuit. To address this limitation, we present a new study investigating the impact of transpilation on quantum-specific code smells and how different target gate sets affect the results. We conducted experiments on 17 open-source quantum programs alongside a set of 100 synthetic circuits. We found that transpilation can significantly alter the metrics that are used to detect code smells, even in previously smell-free circuits, with the long circuit smell being the most susceptible to transpilation. Furthermore, the choice of the gate set significantly influences the presence and severity of code smells in transpiled circuits, highlighting the need for careful gate set selection to mitigate their impact. These findings have implications for circuit optimization and high-quality quantum software development. Further research is needed to understand the consequences of code smells and their potential impact on quantum computations, considering the characteristics and constraints of different gate sets and hardware platforms.

Citations: 0
Comparative analysis of real issues in open-source machine learning projects
IF 4.1 Zone 2 (Computer Science) Q1 Computer Science Pub Date: 2024-05-02 DOI: 10.1007/s10664-024-10467-3
Tuan Dung Lai, Anj Simmons, Scott Barnett, Jean-Guy Schneider, Rajesh Vasa

Context

In the last decade of data-driven decision-making, Machine Learning (ML) systems reign supreme. Because of the different characteristics between ML and traditional Software Engineering systems, we do not know to what extent the issue-reporting needs are different, and to what extent these differences impact the issue resolution process.

Objective

We aim to compare the differences between ML and non-ML issues in open-source applied AI projects in terms of resolution time and size of fix. This research aims to enhance the predictability of maintenance tasks by providing valuable insights for issue reporting and task scheduling activities.

Method

We collect issue reports from GitHub repositories of open-source ML projects using an automatic approach, filter them using ML keywords and libraries, manually categorize them using an adapted deep learning bug taxonomy, and compare resolution time and fix size for ML and non-ML issues in a controlled sample.

Result

We collected 147 ML issues and 147 non-ML issues for analysis. We found that ML issues take more time to resolve than non-ML issues; the median difference is 14 days. There is no significant difference in size of fix between ML and non-ML issues. No significant differences are found between different ML issue categories in terms of resolution time and size of fix.

Conclusion

Our study provided evidence that the life cycle for ML issues is stretched, and thus further work is required to identify the reason. The results also highlighted the need for future work to design custom tooling to support faster resolution of ML issues.

Citations: 0