This article examines the impact of new AI‐related technologies in data mining and big data on important research questions in crime analytics. Because the field is so broad, the review focuses on a selection of the most important topics. Challenges for information management, and in turn law and society, include: AI‐powered predictive policing; big data for legal and adversarial decisions; bias using big data and analytics in profiling and predicting criminality; forecasting crime risk and crime rates; and, regulating AI systems.
{"title":"Themes in data mining, big data, and crime analytics","authors":"G. Oatley","doi":"10.1002/widm.1432","DOIUrl":"https://doi.org/10.1002/widm.1432","url":null,"abstract":"This article examines the impact of new AI‐related technologies in data mining and big data on important research questions in crime analytics. Because the field is so broad, the review focuses on a selection of the most important topics. Challenges for information management, and in turn law and society, include: AI‐powered predictive policing; big data for legal and adversarial decisions; bias using big data and analytics in profiling and predicting criminality; forecasting crime risk and crime rates; and, regulating AI systems.","PeriodicalId":48970,"journal":{"name":"Wiley Interdisciplinary Reviews-Data Mining and Knowledge Discovery","volume":null,"pages":null},"PeriodicalIF":7.8,"publicationDate":"2021-10-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79585296","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The need for accurate and unbiased assessment of residential real property has always been important not only to financial institutions lending on or holding such assets but also to municipalities that rely on property taxes as their critical source of revenue. The common methodology for predicting residential property sale price is based on traditional multiple regression in spite of known issues. Machine learning methods have been proposed as an alternative approach but the results are far from satisfactory. A review of existing studies and relevant issues can help researchers better assess the pros and cons of the approaches in this important stream of research and move the field forward. This article provides such a review. In our review, we have noticed that common to both the regression‐based methods and machine learning methods are the use of batch‐mode learning. Thus in addition to providing a review of recent research on batch‐based residential property prediction models, this article also explores a new approach to constructing residential property price prediction models by treating past sale records as an evolving data stream. The results of our study show that the data stream approach outperforms the traditional regression method and demonstrate the potential of data stream methods in improving prediction models for residential property prices.
{"title":"Predicting home sale prices: A review of existing methods and illustration of data stream methods for improved performance","authors":"Donghui Shi, J. Guan, J. Zurada, Alan Levitan","doi":"10.1002/widm.1435","DOIUrl":"https://doi.org/10.1002/widm.1435","url":null,"abstract":"The need for accurate and unbiased assessment of residential real property has always been important not only to financial institutions lending on or holding such assets but also to municipalities that rely on property taxes as their critical source of revenue. The common methodology for predicting residential property sale price is based on traditional multiple regression in spite of known issues. Machine learning methods have been proposed as an alternative approach but the results are far from satisfactory. A review of existing studies and relevant issues can help researchers better assess the pros and cons of the approaches in this important stream of research and move the field forward. This article provides such a review. In our review, we have noticed that common to both the regression‐based methods and machine learning methods are the use of batch‐mode learning. Thus in addition to providing a review of recent research on batch‐based residential property prediction models, this article also explores a new approach to constructing residential property price prediction models by treating past sale records as an evolving data stream. The results of our study show that the data stream approach outperforms the traditional regression method and demonstrate the potential of data stream methods in improving prediction models for residential property prices.","PeriodicalId":48970,"journal":{"name":"Wiley Interdisciplinary Reviews-Data Mining and Knowledge Discovery","volume":null,"pages":null},"PeriodicalIF":7.8,"publicationDate":"2021-10-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87352821","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Tai Le Quy, Arjun Roy, Vasileios Iosifidis, Eirini Ntoutsi
As decision‐making increasingly relies on machine learning (ML) and (big) data, the issue of fairness in data‐driven artificial intelligence systems is receiving increasing attention from both research and industry. A large variety of fairness‐aware ML solutions have been proposed which involve fairness‐related interventions in the data, learning algorithms, and/or model outputs. However, a vital part of proposing new approaches is evaluating them empirically on benchmark datasets that represent realistic and diverse settings. Therefore, in this paper, we overview real‐world datasets used for fairness‐aware ML. We focus on tabular data as the most common data representation for fairness‐aware ML. We start our analysis by identifying relationships between the different attributes, particularly with respect to protected attributes and class attribute, using a Bayesian network. For a deeper understanding of bias in the datasets, we investigate interesting relationships using exploratory analysis.
{"title":"A survey on datasets for fairness‐aware machine learning","authors":"Tai Le Quy, Arjun Roy, Vasileios Iosifidis, Eirini Ntoutsi","doi":"10.1002/widm.1452","DOIUrl":"https://doi.org/10.1002/widm.1452","url":null,"abstract":"As decision‐making increasingly relies on machine learning (ML) and (big) data, the issue of fairness in data‐driven artificial intelligence systems is receiving increasing attention from both research and industry. A large variety of fairness‐aware ML solutions have been proposed which involve fairness‐related interventions in the data, learning algorithms, and/or model outputs. However, a vital part of proposing new approaches is evaluating them empirically on benchmark datasets that represent realistic and diverse settings. Therefore, in this paper, we overview real‐world datasets used for fairness‐aware ML. We focus on tabular data as the most common data representation for fairness‐aware ML. We start our analysis by identifying relationships between the different attributes, particularly with respect to protected attributes and class attribute, using a Bayesian network. For a deeper understanding of bias in the datasets, we investigate interesting relationships using exploratory analysis.","PeriodicalId":48970,"journal":{"name":"Wiley Interdisciplinary Reviews-Data Mining and Knowledge Discovery","volume":null,"pages":null},"PeriodicalIF":7.8,"publicationDate":"2021-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79017406","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Uncovering community structure has made a significant advancement in explaining, analyzing, and forecasting behaviors and dynamics of networks related to different fields in sociology, criminology, biology, medicine, communication, economics, and academia. Detecting and clustering communities is a powerful step toward identifying the structural properties and the behavioral patterns in social networks. Recently, online learning has been progressively adopted by a lot of educational practices which raise many questions about assessing the learners' engagement, collaboration, and behaviors in the new emerging learning communities. This systematic literature review aims to assess the use of community detection techniques in analyzing the network's structure in online learning environments. It provides a comprehensive overview of the existing research that adopted those techniques with identifying the educational objectives behind their application as well as suggesting possible future research directions. Our analysis covered 65 studies that found in the literature and applied different community discovery techniques on various types of online learning environments to analyze their users' interactions patterns. Our review revealed the potential of this field in improving educational practices and decisions and in utilizing the massive amount of data generated from interacting with those environments. Finally, we highlighted the need to include automated community discovery techniques in online learning environments to facilitate and enhance their use as well as we stressed on the urge for further advance research to uncover a lot of hidden opportunities.
{"title":"Detecting communities using social network analysis in online learning environments: Systematic literature review","authors":"Sahar Yassine, S. Kadry, M. Sicilia","doi":"10.1002/widm.1431","DOIUrl":"https://doi.org/10.1002/widm.1431","url":null,"abstract":"Uncovering community structure has made a significant advancement in explaining, analyzing, and forecasting behaviors and dynamics of networks related to different fields in sociology, criminology, biology, medicine, communication, economics, and academia. Detecting and clustering communities is a powerful step toward identifying the structural properties and the behavioral patterns in social networks. Recently, online learning has been progressively adopted by a lot of educational practices which raise many questions about assessing the learners' engagement, collaboration, and behaviors in the new emerging learning communities. This systematic literature review aims to assess the use of community detection techniques in analyzing the network's structure in online learning environments. It provides a comprehensive overview of the existing research that adopted those techniques with identifying the educational objectives behind their application as well as suggesting possible future research directions. Our analysis covered 65 studies that found in the literature and applied different community discovery techniques on various types of online learning environments to analyze their users' interactions patterns. Our review revealed the potential of this field in improving educational practices and decisions and in utilizing the massive amount of data generated from interacting with those environments. Finally, we highlighted the need to include automated community discovery techniques in online learning environments to facilitate and enhance their use as well as we stressed on the urge for further advance research to uncover a lot of hidden opportunities.","PeriodicalId":48970,"journal":{"name":"Wiley Interdisciplinary Reviews-Data Mining and Knowledge Discovery","volume":null,"pages":null},"PeriodicalIF":7.8,"publicationDate":"2021-09-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77808874","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In computer terminology, mining is considered as extracting meaningful information or knowledge from a large amount of data/information using computers. The meaningful information can be extracted from normal text, and images obtained from different resources, such as natural scene images, video, and documents by deriving semantics from text and content of the images. Although there are many pieces of work on text/data mining and several survey/review papers are published in the literature, to the best of our knowledge there is no survey paper on mining textual information from the natural scene, video, and document images considering word spotting techniques. In this article, we, therefore, provide a comprehensive review of both the non‐spotting and spotting based mining techniques. The mining approaches are categorized as feature, learning and hybrid‐based methods to analyze the strengths and limitations of the models of each category. In addition, it also discusses the usefulness of the methods according to different situations and applications. Furthermore, based on the review of different mining approaches, this article identifies the limitations of the existing methods and suggests new applications and future directions to continue the research in multiple directions. We believe such a review article will be useful to the researchers to quickly become familiar with the state‐of‐the‐art information and progresses made toward mining textual information from natural scene and video images.
{"title":"Mining text from natural scene and video images: A survey","authors":"P. Shivakumara, Alireza Alaei, U. Pal","doi":"10.1002/widm.1428","DOIUrl":"https://doi.org/10.1002/widm.1428","url":null,"abstract":"In computer terminology, mining is considered as extracting meaningful information or knowledge from a large amount of data/information using computers. The meaningful information can be extracted from normal text, and images obtained from different resources, such as natural scene images, video, and documents by deriving semantics from text and content of the images. Although there are many pieces of work on text/data mining and several survey/review papers are published in the literature, to the best of our knowledge there is no survey paper on mining textual information from the natural scene, video, and document images considering word spotting techniques. In this article, we, therefore, provide a comprehensive review of both the non‐spotting and spotting based mining techniques. The mining approaches are categorized as feature, learning and hybrid‐based methods to analyze the strengths and limitations of the models of each category. In addition, it also discusses the usefulness of the methods according to different situations and applications. Furthermore, based on the review of different mining approaches, this article identifies the limitations of the existing methods and suggests new applications and future directions to continue the research in multiple directions. We believe such a review article will be useful to the researchers to quickly become familiar with the state‐of‐the‐art information and progresses made toward mining textual information from natural scene and video images.","PeriodicalId":48970,"journal":{"name":"Wiley Interdisciplinary Reviews-Data Mining and Knowledge Discovery","volume":null,"pages":null},"PeriodicalIF":7.8,"publicationDate":"2021-08-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85412730","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Hyperspectral imaging has shown tremendous growth over the past three decades. Hyperspectral imaging was evolved through remote sensing. Along, with the technological enhancements hyperspectral imaging has outgrown, conquering over other various application areas. In addition to it, data enriched data cubes with abundant spectral and spatial information works as perk for capturing, analyzing, reviewing, and interpreting results from data. This review concentrates on emerging application areas of hyperspectral imaging. Emerging application areas are selected in ways where there is a vast scope for future enhancements by exploiting cutting edge technology, that is, deep learning. Applications of hyperspectral imaging techniques in some selected areas (remote sensing, document forgery, history and archaeology conservation, surveillance and security, machine vision for fruit quality inspection, medical imaging) are focused. The review pivots around the publicly available datasets and features used domain wise. This review can act as a baseline for deep learning and machine vision experts, historical geographers, and scholars by providing them a view of how hyperspectral imaging is implemented in multiple domains along with future research prospects.
{"title":"Critical insights into modern hyperspectral image applications through deep learning","authors":"Garima Jaiswal, Aruna Sharma, S. Yadav","doi":"10.1002/widm.1426","DOIUrl":"https://doi.org/10.1002/widm.1426","url":null,"abstract":"Hyperspectral imaging has shown tremendous growth over the past three decades. Hyperspectral imaging was evolved through remote sensing. Along, with the technological enhancements hyperspectral imaging has outgrown, conquering over other various application areas. In addition to it, data enriched data cubes with abundant spectral and spatial information works as perk for capturing, analyzing, reviewing, and interpreting results from data. This review concentrates on emerging application areas of hyperspectral imaging. Emerging application areas are selected in ways where there is a vast scope for future enhancements by exploiting cutting edge technology, that is, deep learning. Applications of hyperspectral imaging techniques in some selected areas (remote sensing, document forgery, history and archaeology conservation, surveillance and security, machine vision for fruit quality inspection, medical imaging) are focused. The review pivots around the publicly available datasets and features used domain wise. This review can act as a baseline for deep learning and machine vision experts, historical geographers, and scholars by providing them a view of how hyperspectral imaging is implemented in multiple domains along with future research prospects.","PeriodicalId":48970,"journal":{"name":"Wiley Interdisciplinary Reviews-Data Mining and Knowledge Discovery","volume":null,"pages":null},"PeriodicalIF":7.8,"publicationDate":"2021-07-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80501507","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
B. Bischl, Martin Binder, Michel Lang, Tobias Pielok, Jakob Richter, Stefan Coors, Janek Thomas, Theresa Ullmann, M. Becker, A. Boulesteix, Difan Deng, M. Lindauer
Most machine learning algorithms are configured by a set of hyperparameters whose values must be carefully chosen and which often considerably impact performance. To avoid a time‐consuming and irreproducible manual process of trial‐and‐error to find well‐performing hyperparameter configurations, various automatic hyperparameter optimization (HPO) methods—for example, based on resampling error estimation for supervised machine learning—can be employed. After introducing HPO from a general perspective, this paper reviews important HPO methods, from simple techniques such as grid or random search to more advanced methods like evolution strategies, Bayesian optimization, Hyperband, and racing. This work gives practical recommendations regarding important choices to be made when conducting HPO, including the HPO algorithms themselves, performance evaluation, how to combine HPO with machine learning pipelines, runtime improvements, and parallelization.
{"title":"Hyperparameter optimization: Foundations, algorithms, best practices, and open challenges","authors":"B. Bischl, Martin Binder, Michel Lang, Tobias Pielok, Jakob Richter, Stefan Coors, Janek Thomas, Theresa Ullmann, M. Becker, A. Boulesteix, Difan Deng, M. Lindauer","doi":"10.1002/widm.1484","DOIUrl":"https://doi.org/10.1002/widm.1484","url":null,"abstract":"Most machine learning algorithms are configured by a set of hyperparameters whose values must be carefully chosen and which often considerably impact performance. To avoid a time‐consuming and irreproducible manual process of trial‐and‐error to find well‐performing hyperparameter configurations, various automatic hyperparameter optimization (HPO) methods—for example, based on resampling error estimation for supervised machine learning—can be employed. After introducing HPO from a general perspective, this paper reviews important HPO methods, from simple techniques such as grid or random search to more advanced methods like evolution strategies, Bayesian optimization, Hyperband, and racing. This work gives practical recommendations regarding important choices to be made when conducting HPO, including the HPO algorithms themselves, performance evaluation, how to combine HPO with machine learning pipelines, runtime improvements, and parallelization.","PeriodicalId":48970,"journal":{"name":"Wiley Interdisciplinary Reviews-Data Mining and Knowledge Discovery","volume":null,"pages":null},"PeriodicalIF":7.8,"publicationDate":"2021-07-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88823751","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
P. Angelov, E. Soares, Richard Jiang, Nicholas I. Arnold, Peter M. Atkinson
This paper provides a brief analytical review of the current state‐of‐the‐art in relation to the explainability of artificial intelligence in the context of recent advances in machine learning and deep learning. The paper starts with a brief historical introduction and a taxonomy, and formulates the main challenges in terms of explainability building on the recently formulated National Institute of Standards four principles of explainability. Recently published methods related to the topic are then critically reviewed and analyzed. Finally, future directions for research are suggested.
{"title":"Explainable artificial intelligence: an analytical review","authors":"P. Angelov, E. Soares, Richard Jiang, Nicholas I. Arnold, Peter M. Atkinson","doi":"10.1002/widm.1424","DOIUrl":"https://doi.org/10.1002/widm.1424","url":null,"abstract":"This paper provides a brief analytical review of the current state‐of‐the‐art in relation to the explainability of artificial intelligence in the context of recent advances in machine learning and deep learning. The paper starts with a brief historical introduction and a taxonomy, and formulates the main challenges in terms of explainability building on the recently formulated National Institute of Standards four principles of explainability. Recently published methods related to the topic are then critically reviewed and analyzed. Finally, future directions for research are suggested.","PeriodicalId":48970,"journal":{"name":"Wiley Interdisciplinary Reviews-Data Mining and Knowledge Discovery","volume":null,"pages":null},"PeriodicalIF":7.8,"publicationDate":"2021-07-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78016549","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Electricity usage of buildings (including offices, malls, and residential apartments) represents a significant portion of a nation's energy expenditure and carbon footprint. In the United States, the buildings' appliances consume 72% of the total produced electricity approximately. In this regard, cyber‐physical system (CPS) researchers have put forth associated research questions to reduce cyber‐physical building environment energy consumption by minimizing the energy dissipation while securing occupants' comfort. Some of the questions in CPS building include finding the optimal HVAC control, monitoring appliances' energy usage, detecting insulation problems, estimating the occupants' number and activities, managing thermal comfort, intelligently interacting with the smart grid. Various machine learning (ML) applications have been studied in recent CPS researches to improve building energy efficiency by addressing these questions. In this paper, we comprehensively review and report on the contemporary applications of ML algorithms such as deep learning, transfer learning, active learning, reinforcement learning, and other emerging techniques that propose and envision to address the above challenges in the CPS building environment. Finally, we conclude this article by discussing diverse existing open questions and prospective future directions in the CPS building environment research.
{"title":"Trending machine learning models in cyber‐physical building environment: A survey","authors":"Zahid Hasan, Nirmalya Roy","doi":"10.1002/widm.1422","DOIUrl":"https://doi.org/10.1002/widm.1422","url":null,"abstract":"Electricity usage of buildings (including offices, malls, and residential apartments) represents a significant portion of a nation's energy expenditure and carbon footprint. In the United States, the buildings' appliances consume 72% of the total produced electricity approximately. In this regard, cyber‐physical system (CPS) researchers have put forth associated research questions to reduce cyber‐physical building environment energy consumption by minimizing the energy dissipation while securing occupants' comfort. Some of the questions in CPS building include finding the optimal HVAC control, monitoring appliances' energy usage, detecting insulation problems, estimating the occupants' number and activities, managing thermal comfort, intelligently interacting with the smart grid. Various machine learning (ML) applications have been studied in recent CPS researches to improve building energy efficiency by addressing these questions. In this paper, we comprehensively review and report on the contemporary applications of ML algorithms such as deep learning, transfer learning, active learning, reinforcement learning, and other emerging techniques that propose and envision to address the above challenges in the CPS building environment. Finally, we conclude this article by discussing diverse existing open questions and prospective future directions in the CPS building environment research.","PeriodicalId":48970,"journal":{"name":"Wiley Interdisciplinary Reviews-Data Mining and Knowledge Discovery","volume":null,"pages":null},"PeriodicalIF":7.8,"publicationDate":"2021-06-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75415831","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Chris Niessl, M. Herrmann, Chiara Wiedemann, Giuseppe Casalicchio, Anne-Laure Boulesteix Institute for Medical Information Processing, Biometry, Epidemiology, Lmu Munich, Germany, Department of Statistics
In recent years, the need for neutral benchmark studies that focus on the comparison of methods coming from computational sciences has been increasingly recognized by the scientific community. While general advice on the design and analysis of neutral benchmark studies can be found in recent literature, a certain flexibility always exists. This includes the choice of data sets and performance measures, the handling of missing performance values, and the way the performance values are aggregated over the data sets. As a consequence of this flexibility, researchers may be concerned about how their choices affect the results or, in the worst case, may be tempted to engage in questionable research practices (e.g., the selective reporting of results or the post hoc modification of design or analysis components) to fit their expectations. To raise awareness for this issue, we use an example benchmark study to illustrate how variable benchmark results can be when all possible combinations of a range of design and analysis options are considered. We then demonstrate how the impact of each choice on the results can be assessed using multidimensional unfolding. In conclusion, based on previous literature and on our illustrative example, we claim that the multiplicity of design and analysis options combined with questionable research practices lead to biased interpretations of benchmark results and to over‐optimistic conclusions. This issue should be considered by computational researchers when designing and analyzing their benchmark studies and by the scientific community in general in an effort towards more reliable benchmark results.
{"title":"Over‐optimism in benchmark studies and the multiplicity of design and analysis options when interpreting their results","authors":"Chris Niessl, M. Herrmann, Chiara Wiedemann, Giuseppe Casalicchio, Anne-Laure Boulesteix Institute for Medical Information Processing, Biometry, Epidemiology, Lmu Munich, Germany, Department of Statistics","doi":"10.1002/widm.1441","DOIUrl":"https://doi.org/10.1002/widm.1441","url":null,"abstract":"In recent years, the need for neutral benchmark studies that focus on the comparison of methods coming from computational sciences has been increasingly recognized by the scientific community. While general advice on the design and analysis of neutral benchmark studies can be found in recent literature, a certain flexibility always exists. This includes the choice of data sets and performance measures, the handling of missing performance values, and the way the performance values are aggregated over the data sets. As a consequence of this flexibility, researchers may be concerned about how their choices affect the results or, in the worst case, may be tempted to engage in questionable research practices (e.g., the selective reporting of results or the post hoc modification of design or analysis components) to fit their expectations. To raise awareness for this issue, we use an example benchmark study to illustrate how variable benchmark results can be when all possible combinations of a range of design and analysis options are considered. We then demonstrate how the impact of each choice on the results can be assessed using multidimensional unfolding. In conclusion, based on previous literature and on our illustrative example, we claim that the multiplicity of design and analysis options combined with questionable research practices lead to biased interpretations of benchmark results and to over‐optimistic conclusions. This issue should be considered by computational researchers when designing and analyzing their benchmark studies and by the scientific community in general in an effort towards more reliable benchmark results.","PeriodicalId":48970,"journal":{"name":"Wiley Interdisciplinary Reviews-Data Mining and Knowledge Discovery","volume":null,"pages":null},"PeriodicalIF":7.8,"publicationDate":"2021-06-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85483185","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}