{"title":"Can citations tell us about a paper's reproducibility? A case study of machine learning papers","authors":"Rochana R. Obadage, Sarah M. Rajtmajer, Jian Wu","doi":"arxiv-2405.03977","DOIUrl":null,"url":null,"abstract":"The iterative character of work in machine learning (ML) and artificial\nintelligence (AI) and reliance on comparisons against benchmark datasets\nemphasize the importance of reproducibility in that literature. Yet, resource\nconstraints and inadequate documentation can make running replications\nparticularly challenging. Our work explores the potential of using downstream\ncitation contexts as a signal of reproducibility. We introduce a sentiment\nanalysis framework applied to citation contexts from papers involved in Machine\nLearning Reproducibility Challenges in order to interpret the positive or\nnegative outcomes of reproduction attempts. Our contributions include training\nclassifiers for reproducibility-related contexts and sentiment analysis, and\nexploring correlations between citation context sentiment and reproducibility\nscores. Study data, software, and an artifact appendix are publicly available\nat https://github.com/lamps-lab/ccair-ai-reproducibility .","PeriodicalId":501285,"journal":{"name":"arXiv - CS - Digital Libraries","volume":"9 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-05-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - CS - Digital Libraries","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2405.03977","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0
Abstract
The iterative character of work in machine learning (ML) and artificial intelligence (AI) and reliance on comparisons against benchmark datasets emphasize the importance of reproducibility in that literature. Yet, resource constraints and inadequate documentation can make running replications particularly challenging. Our work explores the potential of using downstream citation contexts as a signal of reproducibility. We introduce a sentiment analysis framework applied to citation contexts from papers involved in Machine Learning Reproducibility Challenges in order to interpret the positive or negative outcomes of reproduction attempts. Our contributions include training classifiers for reproducibility-related contexts and sentiment analysis, and exploring correlations between citation context sentiment and reproducibility scores. Study data, software, and an artifact appendix are publicly available at https://github.com/lamps-lab/ccair-ai-reproducibility.
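
To make the described pipeline concrete, below is a minimal, hypothetical Python sketch: it substitutes an off-the-shelf Hugging Face sentiment model for the classifiers the authors train on reproducibility-related citation contexts, uses invented citation contexts and reproducibility scores, and computes a Spearman rank correlation between per-paper sentiment and reproducibility. None of the identifiers or data below come from the paper or its repository; they only illustrate the general workflow the abstract outlines.

```python
# Hypothetical sketch: an off-the-shelf sentiment model stands in for the
# paper's trained classifiers; contexts and scores below are invented examples.
from transformers import pipeline
from scipy.stats import spearmanr

# Placeholder citation contexts grouped by cited paper, plus an assumed
# reproducibility score per paper (e.g., from a reproduction report).
citation_contexts = {
    "paper_A": [
        "We were unable to match the accuracy reported in [1].",
        "Following [1], our reimplementation falls short of their results.",
    ],
    "paper_B": [
        "The released code of [2] runs out of the box and confirms the claims.",
        "We reproduce the main findings of [2] with minor effort.",
    ],
    "paper_C": [
        "Results of [3] were only partially reproducible after extra tuning.",
    ],
}
reproducibility_scores = {"paper_A": 0.3, "paper_B": 0.9, "paper_C": 0.6}

# Generic sentiment classifier (default SST-2 model); the paper instead trains
# classifiers for reproducibility-related contexts and their sentiment.
sentiment = pipeline("sentiment-analysis")

def mean_sentiment(contexts):
    """Average signed sentiment: +score for POSITIVE, -score for NEGATIVE."""
    preds = sentiment(contexts)
    signed = [p["score"] if p["label"] == "POSITIVE" else -p["score"] for p in preds]
    return sum(signed) / len(signed)

papers = sorted(citation_contexts)
sentiment_by_paper = [mean_sentiment(citation_contexts[p]) for p in papers]
score_by_paper = [reproducibility_scores[p] for p in papers]

# Rank correlation between citation-context sentiment and reproducibility score.
rho, p_value = spearmanr(sentiment_by_paper, score_by_paper)
print(f"Spearman rho = {rho:.2f} (p = {p_value:.3f})")
```

Spearman's rho is used here only as one plausible way to "explore correlations"; the abstract does not specify which correlation measure or sentiment model the authors actually use, so consult the linked repository for their implementation.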