Tai Le Quy, Arjun Roy, Vasileios Iosifidis, Eirini Ntoutsi
As decision‐making increasingly relies on machine learning (ML) and (big) data, the issue of fairness in data‐driven artificial intelligence systems is receiving increasing attention from both research and industry. A large variety of fairness‐aware ML solutions have been proposed which involve fairness‐related interventions in the data, learning algorithms, and/or model outputs. However, a vital part of proposing new approaches is evaluating them empirically on benchmark datasets that represent realistic and diverse settings. Therefore, in this paper, we overview real‐world datasets used for fairness‐aware ML. We focus on tabular data as the most common data representation for fairness‐aware ML. We start our analysis by identifying relationships between the different attributes, particularly with respect to protected attributes and class attribute, using a Bayesian network. For a deeper understanding of bias in the datasets, we investigate interesting relationships using exploratory analysis.
{"title":"A survey on datasets for fairness‐aware machine learning","authors":"Tai Le Quy, Arjun Roy, Vasileios Iosifidis, Eirini Ntoutsi","doi":"10.1002/widm.1452","DOIUrl":"https://doi.org/10.1002/widm.1452","url":null,"abstract":"As decision‐making increasingly relies on machine learning (ML) and (big) data, the issue of fairness in data‐driven artificial intelligence systems is receiving increasing attention from both research and industry. A large variety of fairness‐aware ML solutions have been proposed which involve fairness‐related interventions in the data, learning algorithms, and/or model outputs. However, a vital part of proposing new approaches is evaluating them empirically on benchmark datasets that represent realistic and diverse settings. Therefore, in this paper, we overview real‐world datasets used for fairness‐aware ML. We focus on tabular data as the most common data representation for fairness‐aware ML. We start our analysis by identifying relationships between the different attributes, particularly with respect to protected attributes and class attribute, using a Bayesian network. For a deeper understanding of bias in the datasets, we investigate interesting relationships using exploratory analysis.","PeriodicalId":48970,"journal":{"name":"Wiley Interdisciplinary Reviews-Data Mining and Knowledge Discovery","volume":"28 1","pages":""},"PeriodicalIF":7.8,"publicationDate":"2021-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79017406","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Uncovering community structure has made a significant advancement in explaining, analyzing, and forecasting behaviors and dynamics of networks related to different fields in sociology, criminology, biology, medicine, communication, economics, and academia. Detecting and clustering communities is a powerful step toward identifying the structural properties and the behavioral patterns in social networks. Recently, online learning has been progressively adopted by a lot of educational practices which raise many questions about assessing the learners' engagement, collaboration, and behaviors in the new emerging learning communities. This systematic literature review aims to assess the use of community detection techniques in analyzing the network's structure in online learning environments. It provides a comprehensive overview of the existing research that adopted those techniques with identifying the educational objectives behind their application as well as suggesting possible future research directions. Our analysis covered 65 studies that found in the literature and applied different community discovery techniques on various types of online learning environments to analyze their users' interactions patterns. Our review revealed the potential of this field in improving educational practices and decisions and in utilizing the massive amount of data generated from interacting with those environments. Finally, we highlighted the need to include automated community discovery techniques in online learning environments to facilitate and enhance their use as well as we stressed on the urge for further advance research to uncover a lot of hidden opportunities.
{"title":"Detecting communities using social network analysis in online learning environments: Systematic literature review","authors":"Sahar Yassine, S. Kadry, M. Sicilia","doi":"10.1002/widm.1431","DOIUrl":"https://doi.org/10.1002/widm.1431","url":null,"abstract":"Uncovering community structure has made a significant advancement in explaining, analyzing, and forecasting behaviors and dynamics of networks related to different fields in sociology, criminology, biology, medicine, communication, economics, and academia. Detecting and clustering communities is a powerful step toward identifying the structural properties and the behavioral patterns in social networks. Recently, online learning has been progressively adopted by a lot of educational practices which raise many questions about assessing the learners' engagement, collaboration, and behaviors in the new emerging learning communities. This systematic literature review aims to assess the use of community detection techniques in analyzing the network's structure in online learning environments. It provides a comprehensive overview of the existing research that adopted those techniques with identifying the educational objectives behind their application as well as suggesting possible future research directions. Our analysis covered 65 studies that found in the literature and applied different community discovery techniques on various types of online learning environments to analyze their users' interactions patterns. Our review revealed the potential of this field in improving educational practices and decisions and in utilizing the massive amount of data generated from interacting with those environments. Finally, we highlighted the need to include automated community discovery techniques in online learning environments to facilitate and enhance their use as well as we stressed on the urge for further advance research to uncover a lot of hidden opportunities.","PeriodicalId":48970,"journal":{"name":"Wiley Interdisciplinary Reviews-Data Mining and Knowledge Discovery","volume":"89 1","pages":""},"PeriodicalIF":7.8,"publicationDate":"2021-09-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77808874","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A coreset of an input set is its small summarization, such that solving a problem on the coreset as its input, provably yields the same result as solving the same problem on the original (full) set, for a given family of problems (models/classifiers/loss functions). Coresets have been suggested for many fundamental problems, for example, in machine/deep learning, computer vision, databases, and theoretical computer science. This introductory paper was written following requests regarding the many inconsistent coreset definitions, lack of source code, the required deep theoretical background from different fields, and the dense papers that make it hard for beginners to apply and develop coresets. The article provides folklore, classic, and simple results including step‐by‐step proofs and figures, for the simplest (accurate) coresets. Nevertheless, we did not find most of their constructions in the literature. Moreover, we expect that putting them together in a retrospective context would help the reader to grasp current results that usually generalize these fundamental observations. Experts might appreciate the unified notation and comparison table for existing results. Open source code is provided for all presented algorithms, to demonstrate their usage, and to support the readers who are more familiar with programming than mathematics.
{"title":"Overview of accurate coresets","authors":"Ibrahim Jubran, Alaa Maalouf, D. Feldman","doi":"10.1002/widm.1429","DOIUrl":"https://doi.org/10.1002/widm.1429","url":null,"abstract":"A coreset of an input set is its small summarization, such that solving a problem on the coreset as its input, provably yields the same result as solving the same problem on the original (full) set, for a given family of problems (models/classifiers/loss functions). Coresets have been suggested for many fundamental problems, for example, in machine/deep learning, computer vision, databases, and theoretical computer science. This introductory paper was written following requests regarding the many inconsistent coreset definitions, lack of source code, the required deep theoretical background from different fields, and the dense papers that make it hard for beginners to apply and develop coresets. The article provides folklore, classic, and simple results including step‐by‐step proofs and figures, for the simplest (accurate) coresets. Nevertheless, we did not find most of their constructions in the literature. Moreover, we expect that putting them together in a retrospective context would help the reader to grasp current results that usually generalize these fundamental observations. Experts might appreciate the unified notation and comparison table for existing results. Open source code is provided for all presented algorithms, to demonstrate their usage, and to support the readers who are more familiar with programming than mathematics.","PeriodicalId":48970,"journal":{"name":"Wiley Interdisciplinary Reviews-Data Mining and Knowledge Discovery","volume":"5 1","pages":""},"PeriodicalIF":7.8,"publicationDate":"2021-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79049243","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In computer terminology, mining is considered as extracting meaningful information or knowledge from a large amount of data/information using computers. The meaningful information can be extracted from normal text, and images obtained from different resources, such as natural scene images, video, and documents by deriving semantics from text and content of the images. Although there are many pieces of work on text/data mining and several survey/review papers are published in the literature, to the best of our knowledge there is no survey paper on mining textual information from the natural scene, video, and document images considering word spotting techniques. In this article, we, therefore, provide a comprehensive review of both the non‐spotting and spotting based mining techniques. The mining approaches are categorized as feature, learning and hybrid‐based methods to analyze the strengths and limitations of the models of each category. In addition, it also discusses the usefulness of the methods according to different situations and applications. Furthermore, based on the review of different mining approaches, this article identifies the limitations of the existing methods and suggests new applications and future directions to continue the research in multiple directions. We believe such a review article will be useful to the researchers to quickly become familiar with the state‐of‐the‐art information and progresses made toward mining textual information from natural scene and video images.
{"title":"Mining text from natural scene and video images: A survey","authors":"P. Shivakumara, Alireza Alaei, U. Pal","doi":"10.1002/widm.1428","DOIUrl":"https://doi.org/10.1002/widm.1428","url":null,"abstract":"In computer terminology, mining is considered as extracting meaningful information or knowledge from a large amount of data/information using computers. The meaningful information can be extracted from normal text, and images obtained from different resources, such as natural scene images, video, and documents by deriving semantics from text and content of the images. Although there are many pieces of work on text/data mining and several survey/review papers are published in the literature, to the best of our knowledge there is no survey paper on mining textual information from the natural scene, video, and document images considering word spotting techniques. In this article, we, therefore, provide a comprehensive review of both the non‐spotting and spotting based mining techniques. The mining approaches are categorized as feature, learning and hybrid‐based methods to analyze the strengths and limitations of the models of each category. In addition, it also discusses the usefulness of the methods according to different situations and applications. Furthermore, based on the review of different mining approaches, this article identifies the limitations of the existing methods and suggests new applications and future directions to continue the research in multiple directions. We believe such a review article will be useful to the researchers to quickly become familiar with the state‐of‐the‐art information and progresses made toward mining textual information from natural scene and video images.","PeriodicalId":48970,"journal":{"name":"Wiley Interdisciplinary Reviews-Data Mining and Knowledge Discovery","volume":"101 1","pages":""},"PeriodicalIF":7.8,"publicationDate":"2021-08-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85412730","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Hyperspectral imaging has shown tremendous growth over the past three decades. Hyperspectral imaging was evolved through remote sensing. Along, with the technological enhancements hyperspectral imaging has outgrown, conquering over other various application areas. In addition to it, data enriched data cubes with abundant spectral and spatial information works as perk for capturing, analyzing, reviewing, and interpreting results from data. This review concentrates on emerging application areas of hyperspectral imaging. Emerging application areas are selected in ways where there is a vast scope for future enhancements by exploiting cutting edge technology, that is, deep learning. Applications of hyperspectral imaging techniques in some selected areas (remote sensing, document forgery, history and archaeology conservation, surveillance and security, machine vision for fruit quality inspection, medical imaging) are focused. The review pivots around the publicly available datasets and features used domain wise. This review can act as a baseline for deep learning and machine vision experts, historical geographers, and scholars by providing them a view of how hyperspectral imaging is implemented in multiple domains along with future research prospects.
{"title":"Critical insights into modern hyperspectral image applications through deep learning","authors":"Garima Jaiswal, Aruna Sharma, S. Yadav","doi":"10.1002/widm.1426","DOIUrl":"https://doi.org/10.1002/widm.1426","url":null,"abstract":"Hyperspectral imaging has shown tremendous growth over the past three decades. Hyperspectral imaging was evolved through remote sensing. Along, with the technological enhancements hyperspectral imaging has outgrown, conquering over other various application areas. In addition to it, data enriched data cubes with abundant spectral and spatial information works as perk for capturing, analyzing, reviewing, and interpreting results from data. This review concentrates on emerging application areas of hyperspectral imaging. Emerging application areas are selected in ways where there is a vast scope for future enhancements by exploiting cutting edge technology, that is, deep learning. Applications of hyperspectral imaging techniques in some selected areas (remote sensing, document forgery, history and archaeology conservation, surveillance and security, machine vision for fruit quality inspection, medical imaging) are focused. The review pivots around the publicly available datasets and features used domain wise. This review can act as a baseline for deep learning and machine vision experts, historical geographers, and scholars by providing them a view of how hyperspectral imaging is implemented in multiple domains along with future research prospects.","PeriodicalId":48970,"journal":{"name":"Wiley Interdisciplinary Reviews-Data Mining and Knowledge Discovery","volume":"102 1","pages":""},"PeriodicalIF":7.8,"publicationDate":"2021-07-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80501507","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
B. Bischl, Martin Binder, Michel Lang, Tobias Pielok, Jakob Richter, Stefan Coors, Janek Thomas, Theresa Ullmann, M. Becker, A. Boulesteix, Difan Deng, M. Lindauer
Most machine learning algorithms are configured by a set of hyperparameters whose values must be carefully chosen and which often considerably impact performance. To avoid a time‐consuming and irreproducible manual process of trial‐and‐error to find well‐performing hyperparameter configurations, various automatic hyperparameter optimization (HPO) methods—for example, based on resampling error estimation for supervised machine learning—can be employed. After introducing HPO from a general perspective, this paper reviews important HPO methods, from simple techniques such as grid or random search to more advanced methods like evolution strategies, Bayesian optimization, Hyperband, and racing. This work gives practical recommendations regarding important choices to be made when conducting HPO, including the HPO algorithms themselves, performance evaluation, how to combine HPO with machine learning pipelines, runtime improvements, and parallelization.
{"title":"Hyperparameter optimization: Foundations, algorithms, best practices, and open challenges","authors":"B. Bischl, Martin Binder, Michel Lang, Tobias Pielok, Jakob Richter, Stefan Coors, Janek Thomas, Theresa Ullmann, M. Becker, A. Boulesteix, Difan Deng, M. Lindauer","doi":"10.1002/widm.1484","DOIUrl":"https://doi.org/10.1002/widm.1484","url":null,"abstract":"Most machine learning algorithms are configured by a set of hyperparameters whose values must be carefully chosen and which often considerably impact performance. To avoid a time‐consuming and irreproducible manual process of trial‐and‐error to find well‐performing hyperparameter configurations, various automatic hyperparameter optimization (HPO) methods—for example, based on resampling error estimation for supervised machine learning—can be employed. After introducing HPO from a general perspective, this paper reviews important HPO methods, from simple techniques such as grid or random search to more advanced methods like evolution strategies, Bayesian optimization, Hyperband, and racing. This work gives practical recommendations regarding important choices to be made when conducting HPO, including the HPO algorithms themselves, performance evaluation, how to combine HPO with machine learning pipelines, runtime improvements, and parallelization.","PeriodicalId":48970,"journal":{"name":"Wiley Interdisciplinary Reviews-Data Mining and Knowledge Discovery","volume":"22 1","pages":""},"PeriodicalIF":7.8,"publicationDate":"2021-07-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88823751","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
P. Angelov, E. Soares, Richard Jiang, Nicholas I. Arnold, Peter M. Atkinson
This paper provides a brief analytical review of the current state‐of‐the‐art in relation to the explainability of artificial intelligence in the context of recent advances in machine learning and deep learning. The paper starts with a brief historical introduction and a taxonomy, and formulates the main challenges in terms of explainability building on the recently formulated National Institute of Standards four principles of explainability. Recently published methods related to the topic are then critically reviewed and analyzed. Finally, future directions for research are suggested.
{"title":"Explainable artificial intelligence: an analytical review","authors":"P. Angelov, E. Soares, Richard Jiang, Nicholas I. Arnold, Peter M. Atkinson","doi":"10.1002/widm.1424","DOIUrl":"https://doi.org/10.1002/widm.1424","url":null,"abstract":"This paper provides a brief analytical review of the current state‐of‐the‐art in relation to the explainability of artificial intelligence in the context of recent advances in machine learning and deep learning. The paper starts with a brief historical introduction and a taxonomy, and formulates the main challenges in terms of explainability building on the recently formulated National Institute of Standards four principles of explainability. Recently published methods related to the topic are then critically reviewed and analyzed. Finally, future directions for research are suggested.","PeriodicalId":48970,"journal":{"name":"Wiley Interdisciplinary Reviews-Data Mining and Knowledge Discovery","volume":"70 1","pages":""},"PeriodicalIF":7.8,"publicationDate":"2021-07-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78016549","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Ce Yu, Kun Li, Yanxia Zhang, Jian Xiao, Chenzhou Cui, Yihan Tao, Shanjian Tang, Chao Sun, Chongke Bi
The improvement of observation capabilities has expanded the scale of new data available for time domain astronomy research, and the accumulation of observational data continues to accelerate. However, traditional data analysis methods are difficult to fully tap the potential scientific value of all data. Therefore, in the current and future research on light curve analysis, it is inevitable to use artificial intelligence (AI) technology to assist in data analysis in order to obtain as many candidates as possible with scientific research goals. This survey reviews important developments in light curve analysis over the past years, summarizes the basic concepts in machine learning and their applications in light curve analysis and concludes perspectives and challenges for light curve analysis in the near future. The full exploration of light curves of variable celestial objects relies heavily on new techniques derived from promotion of machine learning and deep learning in the astronomical big data era.
{"title":"A survey on machine learning based light curve analysis for variable astronomical sources","authors":"Ce Yu, Kun Li, Yanxia Zhang, Jian Xiao, Chenzhou Cui, Yihan Tao, Shanjian Tang, Chao Sun, Chongke Bi","doi":"10.1002/widm.1425","DOIUrl":"https://doi.org/10.1002/widm.1425","url":null,"abstract":"The improvement of observation capabilities has expanded the scale of new data available for time domain astronomy research, and the accumulation of observational data continues to accelerate. However, traditional data analysis methods are difficult to fully tap the potential scientific value of all data. Therefore, in the current and future research on light curve analysis, it is inevitable to use artificial intelligence (AI) technology to assist in data analysis in order to obtain as many candidates as possible with scientific research goals. This survey reviews important developments in light curve analysis over the past years, summarizes the basic concepts in machine learning and their applications in light curve analysis and concludes perspectives and challenges for light curve analysis in the near future. The full exploration of light curves of variable celestial objects relies heavily on new techniques derived from promotion of machine learning and deep learning in the astronomical big data era.","PeriodicalId":48970,"journal":{"name":"Wiley Interdisciplinary Reviews-Data Mining and Knowledge Discovery","volume":"268 1","pages":""},"PeriodicalIF":7.8,"publicationDate":"2021-07-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75389550","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Electricity usage of buildings (including offices, malls, and residential apartments) represents a significant portion of a nation's energy expenditure and carbon footprint. In the United States, the buildings' appliances consume 72% of the total produced electricity approximately. In this regard, cyber‐physical system (CPS) researchers have put forth associated research questions to reduce cyber‐physical building environment energy consumption by minimizing the energy dissipation while securing occupants' comfort. Some of the questions in CPS building include finding the optimal HVAC control, monitoring appliances' energy usage, detecting insulation problems, estimating the occupants' number and activities, managing thermal comfort, intelligently interacting with the smart grid. Various machine learning (ML) applications have been studied in recent CPS researches to improve building energy efficiency by addressing these questions. In this paper, we comprehensively review and report on the contemporary applications of ML algorithms such as deep learning, transfer learning, active learning, reinforcement learning, and other emerging techniques that propose and envision to address the above challenges in the CPS building environment. Finally, we conclude this article by discussing diverse existing open questions and prospective future directions in the CPS building environment research.
{"title":"Trending machine learning models in cyber‐physical building environment: A survey","authors":"Zahid Hasan, Nirmalya Roy","doi":"10.1002/widm.1422","DOIUrl":"https://doi.org/10.1002/widm.1422","url":null,"abstract":"Electricity usage of buildings (including offices, malls, and residential apartments) represents a significant portion of a nation's energy expenditure and carbon footprint. In the United States, the buildings' appliances consume 72% of the total produced electricity approximately. In this regard, cyber‐physical system (CPS) researchers have put forth associated research questions to reduce cyber‐physical building environment energy consumption by minimizing the energy dissipation while securing occupants' comfort. Some of the questions in CPS building include finding the optimal HVAC control, monitoring appliances' energy usage, detecting insulation problems, estimating the occupants' number and activities, managing thermal comfort, intelligently interacting with the smart grid. Various machine learning (ML) applications have been studied in recent CPS researches to improve building energy efficiency by addressing these questions. In this paper, we comprehensively review and report on the contemporary applications of ML algorithms such as deep learning, transfer learning, active learning, reinforcement learning, and other emerging techniques that propose and envision to address the above challenges in the CPS building environment. Finally, we conclude this article by discussing diverse existing open questions and prospective future directions in the CPS building environment research.","PeriodicalId":48970,"journal":{"name":"Wiley Interdisciplinary Reviews-Data Mining and Knowledge Discovery","volume":"729 1","pages":""},"PeriodicalIF":7.8,"publicationDate":"2021-06-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75415831","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Chris Niessl, M. Herrmann, Chiara Wiedemann, Giuseppe Casalicchio, Anne-Laure Boulesteix Institute for Medical Information Processing, Biometry, Epidemiology, Lmu Munich, Germany, Department of Statistics
In recent years, the need for neutral benchmark studies that focus on the comparison of methods coming from computational sciences has been increasingly recognized by the scientific community. While general advice on the design and analysis of neutral benchmark studies can be found in recent literature, a certain flexibility always exists. This includes the choice of data sets and performance measures, the handling of missing performance values, and the way the performance values are aggregated over the data sets. As a consequence of this flexibility, researchers may be concerned about how their choices affect the results or, in the worst case, may be tempted to engage in questionable research practices (e.g., the selective reporting of results or the post hoc modification of design or analysis components) to fit their expectations. To raise awareness for this issue, we use an example benchmark study to illustrate how variable benchmark results can be when all possible combinations of a range of design and analysis options are considered. We then demonstrate how the impact of each choice on the results can be assessed using multidimensional unfolding. In conclusion, based on previous literature and on our illustrative example, we claim that the multiplicity of design and analysis options combined with questionable research practices lead to biased interpretations of benchmark results and to over‐optimistic conclusions. This issue should be considered by computational researchers when designing and analyzing their benchmark studies and by the scientific community in general in an effort towards more reliable benchmark results.
{"title":"Over‐optimism in benchmark studies and the multiplicity of design and analysis options when interpreting their results","authors":"Chris Niessl, M. Herrmann, Chiara Wiedemann, Giuseppe Casalicchio, Anne-Laure Boulesteix Institute for Medical Information Processing, Biometry, Epidemiology, Lmu Munich, Germany, Department of Statistics","doi":"10.1002/widm.1441","DOIUrl":"https://doi.org/10.1002/widm.1441","url":null,"abstract":"In recent years, the need for neutral benchmark studies that focus on the comparison of methods coming from computational sciences has been increasingly recognized by the scientific community. While general advice on the design and analysis of neutral benchmark studies can be found in recent literature, a certain flexibility always exists. This includes the choice of data sets and performance measures, the handling of missing performance values, and the way the performance values are aggregated over the data sets. As a consequence of this flexibility, researchers may be concerned about how their choices affect the results or, in the worst case, may be tempted to engage in questionable research practices (e.g., the selective reporting of results or the post hoc modification of design or analysis components) to fit their expectations. To raise awareness for this issue, we use an example benchmark study to illustrate how variable benchmark results can be when all possible combinations of a range of design and analysis options are considered. We then demonstrate how the impact of each choice on the results can be assessed using multidimensional unfolding. In conclusion, based on previous literature and on our illustrative example, we claim that the multiplicity of design and analysis options combined with questionable research practices lead to biased interpretations of benchmark results and to over‐optimistic conclusions. This issue should be considered by computational researchers when designing and analyzing their benchmark studies and by the scientific community in general in an effort towards more reliable benchmark results.","PeriodicalId":48970,"journal":{"name":"Wiley Interdisciplinary Reviews-Data Mining and Knowledge Discovery","volume":"168 1","pages":""},"PeriodicalIF":7.8,"publicationDate":"2021-06-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85483185","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}