Pub Date: 2022-08-11 DOI: 10.1080/00224065.2022.2106910
M. Gronle, M. Grasso, Emidio Granito, F. Schaal, B. Colosimo
Abstract Open science can boost innovative solutions and knowledge development through transparent access to data shared within the research community and collaborative networks. Because of this, it has become a policy priority in various research and development strategy plans and roadmaps, but awareness of its potential is still limited in industry. Additive manufacturing (AM) is a field where open science initiatives may have a great impact: large academic and industrial communities are working in the same area, enormous quantities of data are generated daily by companies and research centers, and many challenging problems still need to be solved. This article presents a case study based on an open science collaboration project between TRUMPF Laser- und Systemtechnik GmbH, one of the major AM systems developers, and Politecnico di Milano. The case study relies on an open data set including in-line and in-situ signals gathered during the laser powder bed fusion of specimens of aluminum parts on an industrial machine. The signals were acquired by means of two photodiodes installed co-axially to the laser path. The specimens were designed to deliberately introduce anomalies in certain locations and in certain layers. The data set is specifically designed to support the development of novel in-situ monitoring methodologies for fast and robust anomaly detection while the part is being built. A layerwise statistical monitoring approach is proposed and preliminary results are presented, but the problem is open to additional research and to the exploration of novel solutions.
{"title":"Open data for open science in Industry 4.0: In-situ monitoring of quality in additive manufacturing","authors":"M. Gronle, M. Grasso, Emidio Granito, F. Schaal, B. Colosimo","doi":"10.1080/00224065.2022.2106910","DOIUrl":"https://doi.org/10.1080/00224065.2022.2106910","url":null,"PeriodicalId":54769,"journal":{"name":"Journal of Quality Technology","volume":"110 ","pages":"253 - 265"},"PeriodicalIF":2.5,"publicationDate":"2022-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"72506242","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2022-07-25 DOI: 10.1080/00224065.2022.2097966
Yuxia Liu, Yubin Tian, Dianpeng Wang
Abstract In experimental design, a common practical problem arises when the result includes one binary response and multiple continuous responses. However, this problem has received scant attention. Most studies of this problem consider the situation in which the continuous responses are independent of the stimulus level conditional on the binary response. In many practical applications, however, real data show that this conditional independence assumption is not always appropriate. This article considers a new model for the dependent situation, and a corresponding sequential design is proposed under the decision-theoretic framework. To deal with the complex computation involved in searching for optimal designs, fast algorithms are presented using two strategies to approximate the optimal criterion, denoted as SI-optimal design and Bayesian D-optimal design, respectively. Simulation studies based on data from a Chinese chemical material factory show that the proposed methods perform well in estimating the quantiles of interest.
{"title":"Bayesian sequential design for sensitivity experiments with hybrid responses","authors":"Yuxia Liu, Yubin Tian, Dianpeng Wang","doi":"10.1080/00224065.2022.2097966","DOIUrl":"https://doi.org/10.1080/00224065.2022.2097966","url":null,"PeriodicalId":54769,"journal":{"name":"Journal of Quality Technology","volume":"114 1","pages":"181 - 194"},"PeriodicalIF":2.5,"publicationDate":"2022-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87507321","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2022-07-19 DOI: 10.1080/00224065.2023.2219012
Christian Capezza, Fabio Centofanti, A. Lepore, A. Menafoglio, B. Palumbo, S. Vantini
Modern statistical process monitoring (SPM) applications focus on profile monitoring, i.e., the monitoring of process quality characteristics that can be modeled as profiles, also known as functional data. Despite the large interest in the profile monitoring literature, there is still a lack of software to facilitate its practical application. This article introduces the funcharts R package that implements recent developments on the SPM of multivariate functional quality characteristics, possibly adjusted by the influence of additional variables, referred to as covariates. The package also implements the real-time version of all control charting procedures to monitor profiles partially observed up to an intermediate domain point. The package is illustrated both through its built-in data generator and a real-case study on the SPM of Ro-Pax ship CO2 emissions during navigation, which is based on the ShipNavigation data provided in the Supplementary Material.
{"title":"funcharts: control charts for multivariate functional data in R","authors":"Christian Capezza, Fabio Centofanti, A. Lepore, A. Menafoglio, B. Palumbo, S. Vantini","doi":"10.1080/00224065.2023.2219012","DOIUrl":"https://doi.org/10.1080/00224065.2023.2219012","url":null,"PeriodicalId":54769,"journal":{"name":"Journal of Quality Technology","volume":"4 1","pages":""},"PeriodicalIF":2.5,"publicationDate":"2022-07-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84184134","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2022-07-13 DOI: 10.1080/00224065.2022.2080617
F. Pascual, Joseph P. Navelski
Abstract Many modern products fail due to one of multiple causes called competing risks. In this article, we propose variable features for monitoring product failure by control charts under competing risks. Failure reports arrive one at a time from a sample of population of units. Features are derived from both the reports and the assumed competing-risk statistical model. To assess the efficacy of different feature subsets in detecting shifts in the failure-time process, we consider control charts based on random forests and compare their average run length performance under different shift scenarios. We demonstrate the control charts with both simulated data sets and an actual field data set from a consulting problem. We also propose graphical fault-diagnosis methods for identifying assignable causes of alarm signals. Control charts based on the proposed features will provide valuable information to manufacturers in planning for warranty, part replacement, or repair.
{"title":"Monitoring reliability under competing risks using field data","authors":"F. Pascual, Joseph P. Navelski","doi":"10.1080/00224065.2022.2080617","DOIUrl":"https://doi.org/10.1080/00224065.2022.2080617","url":null,"PeriodicalId":54769,"journal":{"name":"Journal of Quality Technology","volume":"3 1","pages":"123 - 139"},"PeriodicalIF":2.5,"publicationDate":"2022-07-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74895174","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2022-07-12 DOI: 10.1080/00224065.2022.2096515
Joseph D. Conklin
{"title":"Design and Analysis of Experiments and Observational Studies using R","authors":"Joseph D. Conklin","doi":"10.1080/00224065.2022.2096515","DOIUrl":"https://doi.org/10.1080/00224065.2022.2096515","url":null,"abstract":"","PeriodicalId":54769,"journal":{"name":"Journal of Quality Technology","volume":"7 1","pages":"269 - 270"},"PeriodicalIF":2.5,"publicationDate":"2022-07-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85600900","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2022-06-15 DOI: 10.1080/00224065.2022.2081103
Amirhossein Fallahdizcheh, Chao Wang
Abstract The typical way to conduct data-driven prognosis is to train a degradation model with historical data, then apply the model to predict failure for in-service units. Most existing works assume the historical data and in-service data are from the same process. In practice, however, different but related processes can share similar degradation patterns. Thus, the historical data from these processes are expected to provide useful prognosis information for each other. In this article, we propose a data-level transfer learning framework to extract useful and shared information from different processes to benefit the prognosis of in-service units. In this framework, the degradation data in each process is modeled by a mixed effects model. To facilitate the information sharing among different mixed effects models, a hierarchical Bayesian structure is proposed to model and connect the distributions of mixed effects in different mixed models. Because the degradation paths in different processes are rarely the same, the dimension of the mixed effects/regressor in each process can be different. To handle this issue, we propose a tailored linear transformation to marginalize or expand the distributions of mixed effects in different degradation processes to achieve consistent dimensions. The transferred information is finally incorporated with the degradation data from in-service units to conduct prognosis. The proposed method is validated and compared with various benchmarks in extensive numerical studies and two case studies. The results show the proposed method can successfully transfer useful information in different processes to benefit the prognosis.
{"title":"Data-level transfer learning for degradation modeling and prognosis","authors":"Amirhossein Fallahdizcheh, Chao Wang","doi":"10.1080/00224065.2022.2081103","DOIUrl":"https://doi.org/10.1080/00224065.2022.2081103","url":null,"PeriodicalId":54769,"journal":{"name":"Journal of Quality Technology","volume":"100 1","pages":"140 - 162"},"PeriodicalIF":2.5,"publicationDate":"2022-06-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81621824","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2022-06-02 DOI: 10.1080/00224065.2022.2081104
Peihua Qiu, Kai-zuan Yang
Abstract Spatio-temporal process monitoring (STPM) has received considerable attention recently due to its broad applications in environmental monitoring, disease surveillance, streaming image processing, and more. Because spatio-temporal data often have complicated structure, including latent spatio-temporal data correlation, complex spatio-temporal mean structure, and nonparametric data distribution, STPM is a challenging research problem. In practice, if a spatio-temporal process has a distributional shift (e.g., a mean shift) starting at a specific time point, then the spatial locations with the shift are usually clustered in small regions. This kind of spatial feature of the shift has not yet been considered in the existing STPM literature. In this paper, we develop a new STPM method that takes into account the spatial feature of the shift in its construction. The new method combines the ideas of exponentially weighted moving average in the temporal domain for online process monitoring and spatial LASSO in the spatial domain for accommodating the spatial feature of a future shift. It can also accommodate the complicated spatio-temporal data structure well. Both simulation studies and a real-data application show that it can provide a reliable and effective tool for different STPM applications.
{"title":"Spatio-temporal process monitoring using exponentially weighted spatial LASSO","authors":"Peihua Qiu, Kai-zuan Yang","doi":"10.1080/00224065.2022.2081104","DOIUrl":"https://doi.org/10.1080/00224065.2022.2081104","url":null,"PeriodicalId":54769,"journal":{"name":"Journal of Quality Technology","volume":"30 1","pages":"163 - 180"},"PeriodicalIF":2.5,"publicationDate":"2022-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74938549","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
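The abstract of this entry names the exponentially weighted moving average (EWMA) as the temporal ingredient of the proposed monitoring method. As a rough illustration of that ingredient only, the sketch below computes the textbook univariate EWMA recursion z_t = lam * x_t + (1 - lam) * z_{t-1}. It is not the authors' spatio-temporal procedure and includes no spatial LASSO step; the function name `ewma` and the parameters `lam` and `start` are assumptions made for this example.

```python
def ewma(observations, lam=0.2, start=0.0):
    """Return the EWMA statistics z_1, ..., z_n for a sequence of observations.

    lam is the smoothing weight in (0, 1]; start is the initial value z_0.
    Both names are illustrative and do not come from the paper.
    """
    if not 0.0 < lam <= 1.0:
        raise ValueError("lam must lie in (0, 1]")
    stats = []
    z = start
    for x in observations:
        # New statistic: weight lam on the current observation,
        # weight (1 - lam) on the previous statistic.
        z = lam * x + (1.0 - lam) * z
        stats.append(z)
    return stats


if __name__ == "__main__":
    # A sustained upward shift from 0 to 1 pulls the statistic up gradually.
    print(ewma([0.0, 0.0, 1.0, 1.0, 1.0], lam=0.5))
```

A smaller `lam` gives more weight to history, which generally helps with small persistent shifts at the cost of slower reaction to large ones.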
Pub Date: 2022-04-12 DOI: 10.1080/00224065.2022.2060150
W. Woodall
{"title":"Book review: Introduction to statistical process control","authors":"W. Woodall","doi":"10.1080/00224065.2022.2060150","DOIUrl":"https://doi.org/10.1080/00224065.2022.2060150","url":null,"abstract":"","PeriodicalId":54769,"journal":{"name":"Journal of Quality Technology","volume":"6 1","pages":"267 - 268"},"PeriodicalIF":2.5,"publicationDate":"2022-04-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84374739","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2022-04-04 DOI: 10.1080/00224065.2022.2053794
Piao Chen, Kilian Buis, Xiujie Zhao
Abstract The gamma distribution is one of the most important parametric models in probability theory and statistics. Although a multitude of studies have theoretically investigated the properties of the gamma distribution, there is still a serious lack of tailored statistical tools to facilitate its practical application. To fill the gap, this paper develops a comprehensive R package for the gamma distribution. Specifically, the R package focuses on three important tasks: generating gamma random variables, estimating the model parameters, and constructing statistical limits, including confidence limits, prediction limits, and tolerance limits based on gamma random variables. The proposed package encompasses the state-of-the-art methods for the gamma distribution in the literature, and its usage is illustrated by a real application.
{"title":"A comprehensive toolbox for the gamma distribution: The gammadist package","authors":"Piao Chen, Kilian Buis, Xiujie Zhao","doi":"10.1080/00224065.2022.2053794","DOIUrl":"https://doi.org/10.1080/00224065.2022.2053794","url":null,"PeriodicalId":54769,"journal":{"name":"Journal of Quality Technology","volume":"41 1","pages":"75 - 87"},"PeriodicalIF":2.5,"publicationDate":"2022-04-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80021599","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
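The first two tasks named in the abstract of this entry, generating gamma random variables and estimating the model parameters, can be sketched in a few lines. The example below uses Python's standard library and a simple method-of-moments estimator (shape = mean² / variance, scale = variance / mean). It does not reproduce the gammadist R package or its API, and the paper's own estimators may differ; the function names `sample_gamma` and `fit_gamma_moments` are hypothetical.

```python
import random
import statistics


def sample_gamma(n, shape, scale, seed=None):
    """Draw n gamma variates with the given shape and scale (seeded for repeatability)."""
    rng = random.Random(seed)
    # random.gammavariate(alpha, beta) takes shape alpha and scale beta.
    return [rng.gammavariate(shape, scale) for _ in range(n)]


def fit_gamma_moments(data):
    """Method-of-moments estimates (shape, scale) for gamma-distributed data.

    For a gamma distribution, mean = shape * scale and variance = shape * scale**2,
    so shape = mean**2 / variance and scale = variance / mean.
    """
    mean = statistics.fmean(data)
    var = statistics.pvariance(data, mu=mean)
    return mean * mean / var, var / mean


if __name__ == "__main__":
    draws = sample_gamma(20000, shape=2.0, scale=3.0, seed=1)
    print(fit_gamma_moments(draws))
```

Method of moments is used here only because it has a closed form; maximum likelihood is usually more efficient for the gamma distribution but requires an iterative solve for the shape parameter.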
Pub Date: 2022-03-07 DOI: 10.1080/00224065.2022.2041378
Caleb King
A deep dive into the theoretical underpinnings of common high-dimensional statistical techniques, Dr. Giraud’s Introduction to High-Dimensional Statistics is a good reference for those who wish to explore the mathematical foundations of state-of-the-art multivariate methods. The book covers a wide array of topics, from estimation bounds to multivariate regression and even clustering. In this 2nd edition, Dr. Giraud expands his work to include more recent advances and statistical methods. The book consists of 12 chapters, starting with a brief introduction to the complexities of conducting statistics in high dimensions. The book then proceeds much like a standard statistical textbook, moving from properties of statistical estimators to statistical modeling, including regression and then other more advanced topics. Each chapter concludes with a set of exercises, many of which are portions of proofs from the chapter left for the reader. All that being said, do not let the title of the book fool you. By the author’s own admission, this is not an introduction on the same level as Hastie et al.’s Elements of Statistical Learning. Instead, the focus of this book is on the mathematical foundations of high-dimensional techniques, proving theorems regarding properties of estimators. I must confess this is not quite what I expected upon first look; one truly cannot judge a book by its cover. That is not to say that this book is lacking. It is impressive in its efficient, yet thorough, presentation of the theory. I especially appreciated how the author took time at the beginning to illustrate some of the strange behavior one encounters in very high dimensions. However, I did find it jarring that mathematical notation was very often used without much introduction. There is an appendix with notations at the end of the book, but I would’ve rather had a bit more interpretation within the text rather than having to flip back and forth. There were also a few typographical errors and partial omissions of formulas, though I can’t be sure if this was part of the text or bugs in the software I used to read the digital version. In summary, this book would certainly make for a good graduate-level textbook in an advanced course on statistical methods. If you are willing to put the necessary time and investment into rigorously exploring the foundations of high-dimensional statistics, then you can hardly do better than this book.
{"title":"Introduction to High-Dimensional Statistics, Christophe Giraud. Chapman & Hall/CRC Press, 2021, 364 pp., $72.00 hardcover, ISBN 978-0-367-71622-6.","authors":"Caleb King","doi":"10.1080/00224065.2022.2041378","DOIUrl":"https://doi.org/10.1080/00224065.2022.2041378","url":null,"PeriodicalId":54769,"journal":{"name":"Journal of Quality Technology","volume":"11 1","pages":"121 - 121"},"PeriodicalIF":2.5,"publicationDate":"2022-03-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90561275","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}