{"title":"Can citations tell us about a paper's reproducibility? A case study of machine learning papers","authors":"Rochana R. Obadage, Sarah M. Rajtmajer, Jian Wu","doi":"arxiv-2405.03977","DOIUrl":null,"url":null,"abstract":"The iterative character of work in machine learning (ML) and artificial\nintelligence (AI) and reliance on comparisons against benchmark datasets\nemphasize the importance of reproducibility in that literature. Yet, resource\nconstraints and inadequate documentation can make running replications\nparticularly challenging. Our work explores the potential of using downstream\ncitation contexts as a signal of reproducibility. We introduce a sentiment\nanalysis framework applied to citation contexts from papers involved in Machine\nLearning Reproducibility Challenges in order to interpret the positive or\nnegative outcomes of reproduction attempts. Our contributions include training\nclassifiers for reproducibility-related contexts and sentiment analysis, and\nexploring correlations between citation context sentiment and reproducibility\nscores. Study data, software, and an artifact appendix are publicly available\nat https://github.com/lamps-lab/ccair-ai-reproducibility .","PeriodicalId":501285,"journal":{"name":"arXiv - CS - Digital Libraries","volume":"9 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-05-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - CS - Digital Libraries","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2405.03977","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0
Abstract
The iterative character of work in machine learning (ML) and artificial intelligence (AI) and reliance on comparisons against benchmark datasets emphasize the importance of reproducibility in that literature. Yet, resource constraints and inadequate documentation can make running replications particularly challenging. Our work explores the potential of using downstream citation contexts as a signal of reproducibility. We introduce a sentiment analysis framework applied to citation contexts from papers involved in Machine Learning Reproducibility Challenges in order to interpret the positive or negative outcomes of reproduction attempts. Our contributions include training classifiers for reproducibility-related contexts and sentiment analysis, and exploring correlations between citation context sentiment and reproducibility scores. Study data, software, and an artifact appendix are publicly available at https://github.com/lamps-lab/ccair-ai-reproducibility.
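
To make the described pipeline concrete, below is a minimal, hypothetical Python sketch: it substitutes an off-the-shelf Hugging Face sentiment model for the classifiers the authors train on reproducibility-related citation contexts, uses invented citation contexts and reproducibility scores, and computes a Spearman rank correlation between per-paper sentiment and reproducibility. None of the identifiers or data below come from the paper or its repository; they only illustrate the general workflow the abstract outlines.

```python
# Hypothetical sketch: an off-the-shelf sentiment model stands in for the
# paper's trained classifiers; contexts and scores below are invented examples.
from transformers import pipeline
from scipy.stats import spearmanr

# Placeholder citation contexts grouped by cited paper, plus an assumed
# reproducibility score per paper (e.g., from a reproduction report).
citation_contexts = {
    "paper_A": [
        "We were unable to match the accuracy reported in [1].",
        "Following [1], our reimplementation falls short of their results.",
    ],
    "paper_B": [
        "The released code of [2] runs out of the box and confirms the claims.",
        "We reproduce the main findings of [2] with minor effort.",
    ],
    "paper_C": [
        "Results of [3] were only partially reproducible after extra tuning.",
    ],
}
reproducibility_scores = {"paper_A": 0.3, "paper_B": 0.9, "paper_C": 0.6}

# Generic sentiment classifier (default SST-2 model); the paper instead trains
# classifiers for reproducibility-related contexts and their sentiment.
sentiment = pipeline("sentiment-analysis")

def mean_sentiment(contexts):
    """Average signed sentiment: +score for POSITIVE, -score for NEGATIVE."""
    preds = sentiment(contexts)
    signed = [p["score"] if p["label"] == "POSITIVE" else -p["score"] for p in preds]
    return sum(signed) / len(signed)

papers = sorted(citation_contexts)
sentiment_by_paper = [mean_sentiment(citation_contexts[p]) for p in papers]
score_by_paper = [reproducibility_scores[p] for p in papers]

# Rank correlation between citation-context sentiment and reproducibility score.
rho, p_value = spearmanr(sentiment_by_paper, score_by_paper)
print(f"Spearman rho = {rho:.2f} (p = {p_value:.3f})")
```

Spearman's rho is used here only as one plausible way to "explore correlations"; the abstract does not specify which correlation measure or sentiment model the authors actually use, so consult the linked repository for their implementation.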