Pub Date : 2022-07-19 DOI: 10.1080/00224065.2023.2219012
Christian Capezza, Fabio Centofanti, A. Lepore, A. Menafoglio, B. Palumbo, S. Vantini
Modern statistical process monitoring (SPM) applications focus on profile monitoring, i.e., the monitoring of process quality characteristics that can be modeled as profiles, also known as functional data. Despite the large interest in the profile monitoring literature, there is still a lack of software to facilitate its practical application. This article introduces the funcharts R package that implements recent developments on the SPM of multivariate functional quality characteristics, possibly adjusted by the influence of additional variables, referred to as covariates. The package also implements the real-time version of all control charting procedures to monitor profiles partially observed up to an intermediate domain point. The package is illustrated both through its built-in data generator and a real-case study on the SPM of Ro-Pax ship CO2 emissions during navigation, which is based on the ShipNavigation data provided in the Supplementary Material.
Title : funcharts: control charts for multivariate functional data in R (Journal of Quality Technology)
Pub Date : 2022-07-13 DOI: 10.1080/00224065.2022.2080617
F. Pascual, Joseph P. Navelski
Many modern products fail due to one of multiple causes, called competing risks. In this article, we propose variable features for monitoring product failure by control charts under competing risks. Failure reports arrive one at a time from a sample of population of units. Features are derived from both the reports and the assumed competing-risk statistical model. To assess the efficacy of different feature subsets in detecting shifts in the failure-time process, we consider control charts based on random forests and compare their average run length performance under different shift scenarios. We demonstrate the control charts with both simulated data sets and an actual field data set from a consulting problem. We also propose graphical fault-diagnosis methods for identifying assignable causes of alarm signals. Control charts based on the proposed features will provide valuable information to manufacturers in planning for warranty, part replacement, or repair.
Title : Monitoring reliability under competing risks using field data (Journal of Quality Technology)
Pub Date : 2022-07-12 DOI: 10.1080/00224065.2022.2096515
Joseph D. Conklin
Title : Design and Analysis of Experiments and Observational Studies using R (Journal of Quality Technology)
Pub Date : 2022-06-15 DOI: 10.1080/00224065.2022.2081103
Amirhossein Fallahdizcheh, Chao Wang
The typical way to conduct data-driven prognosis is to train a degradation model with historical data, then apply the model to predict failure for in-service units. Most existing work assumes that the historical data and in-service data come from the same process. In practice, however, different but related processes can share similar degradation patterns. Thus, the historical data from these processes are expected to provide useful prognosis information for each other. In this article, we propose a data-level transfer learning framework to extract useful and shared information from different processes to benefit the prognosis of in-service units. In this framework, the degradation data in each process are modeled by a mixed effects model. To facilitate information sharing among different mixed effects models, a hierarchical Bayesian structure is proposed to model and connect the distributions of mixed effects in different mixed models. Because the degradation paths in different processes are rarely the same, the dimension of the mixed effects/regressor in each process can be different. To handle this issue, we propose a tailored linear transformation to marginalize or expand the distributions of mixed effects in different degradation processes to achieve consistent dimensions. The transferred information is finally combined with the degradation data from in-service units to conduct prognosis. The proposed method is validated and compared with various benchmarks in extensive numerical studies and two case studies. The results show that the proposed method can successfully transfer useful information across different processes to benefit the prognosis.
Title : Data-level transfer learning for degradation modeling and prognosis (Journal of Quality Technology)
Pub Date : 2022-06-02 DOI: 10.1080/00224065.2022.2081104
Peihua Qiu, Kai-zuan Yang
Spatio-temporal process monitoring (STPM) has received considerable attention recently due to its broad applications in environmental monitoring, disease surveillance, streaming image processing, and more. Because spatio-temporal data often have a complicated structure, including latent spatio-temporal data correlation, a complex spatio-temporal mean structure, and a nonparametric data distribution, STPM is a challenging research problem. In practice, if a spatio-temporal process has a distributional shift (e.g., a mean shift) starting at a specific time point, then the spatial locations with the shift are usually clustered in small regions. This kind of spatial feature of the shift has not yet been considered in the existing STPM literature. In this paper, we develop a new STPM method that takes the spatial feature of the shift into account in its construction. The new method combines the ideas of the exponentially weighted moving average in the temporal domain for online process monitoring and the spatial LASSO in the spatial domain for accommodating the spatial feature of a future shift. It can also accommodate the complicated spatio-temporal data structure well. Both simulation studies and a real-data application show that it can provide a reliable and effective tool for different STPM applications.
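For readers unfamiliar with the temporal ingredient named in this abstract, the following is a minimal Python sketch of a textbook univariate EWMA control chart. It is not the authors' method: the spatial LASSO step and the handling of spatio-temporal correlation are not reproduced, and the function name `ewma_chart`, the known in-control mean `mu0` and standard deviation `sigma`, the weight `lam`, and the limit multiplier `L` are all assumptions made for illustration.

```python
import math


def ewma_chart(xs, mu0=0.0, sigma=1.0, lam=0.2, L=3.0):
    """Run a two-sided EWMA chart over xs.

    Returns the list of EWMA statistics and the 1-based index of the
    first out-of-control signal (None if the chart never signals).
    """
    z = mu0  # the EWMA statistic starts at the in-control mean
    zs = []
    signal = None
    for t, x in enumerate(xs, start=1):
        # Exponentially weighted update: recent observations weigh more.
        z = lam * x + (1.0 - lam) * z
        zs.append(z)
        # Exact (time-varying) standard deviation of z under independence.
        sd_z = sigma * math.sqrt(lam / (2.0 - lam) * (1.0 - (1.0 - lam) ** (2 * t)))
        if signal is None and abs(z - mu0) > L * sd_z:
            signal = t
    return zs, signal


# Illustrative data: ten in-control points, then a sustained mean shift of 2.
data = [0.0] * 10 + [2.0] * 10
zs, signal = ewma_chart(data)
print(signal)
```

The chart accumulates evidence over time, so it signals a few observations after the shift begins rather than at the first shifted point; a smaller `lam` smooths more and reacts more slowly to large shifts.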
Title : Spatio-temporal process monitoring using exponentially weighted spatial LASSO (Journal of Quality Technology)
Pub Date : 2022-04-12 DOI: 10.1080/00224065.2022.2060150
W. Woodall
Title : Book review: Introduction to statistical process control (Journal of Quality Technology)
Pub Date : 2022-04-04 DOI: 10.1080/00224065.2022.2053794
Piao Chen, Kilian Buis, Xiujie Zhao
The gamma distribution is one of the most important parametric models in probability theory and statistics. Although a multitude of studies have theoretically investigated the properties of the gamma distribution, there is still a serious lack of tailored statistical tools to facilitate its practical application. To fill the gap, this paper develops a comprehensive R package for the gamma distribution. Specifically, the R package focuses on three important tasks: generating gamma random variables, estimating the model parameters, and constructing statistical limits, including confidence limits, prediction limits, and tolerance limits, based on the gamma random variables. The proposed package encompasses the state-of-the-art methods for the gamma distribution in the literature, and its usage is illustrated by a real application.
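The three tasks listed in this abstract can be illustrated with a small Python sketch that uses only the standard library. It does not use or mimic the gammadist package, whose interface is not described here: the estimator below is the simple method of moments, and the interval is a large-sample normal approximation for the mean, both chosen for brevity rather than taken from the paper. The helper names and the true parameters (shape 2, scale 3) are assumptions.

```python
import math
import random
import statistics


def gamma_moment_estimates(sample):
    """Method-of-moments estimates (shape, scale) for a gamma sample."""
    m = statistics.fmean(sample)
    v = statistics.variance(sample)
    # For a gamma distribution: mean = shape * scale, variance = shape * scale**2.
    return m * m / v, v / m


def mean_confidence_limits(sample, z=1.96):
    """Approximate 95% confidence limits for the mean (normal approximation)."""
    m = statistics.fmean(sample)
    se = statistics.stdev(sample) / math.sqrt(len(sample))
    return m - z * se, m + z * se


# Task 1: generate gamma random variables (shape 2, scale 3), seeded for repeatability.
rng = random.Random(1)
sample = [rng.gammavariate(2.0, 3.0) for _ in range(5000)]

# Task 2: estimate the model parameters.
shape_hat, scale_hat = gamma_moment_estimates(sample)

# Task 3: construct a statistical limit (here, confidence limits for the mean).
lower, upper = mean_confidence_limits(sample)
print(shape_hat, scale_hat, lower, upper)
```

Prediction and tolerance limits for the gamma distribution need more specialized constructions than this approximation and are left to the package itself.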
Title : A comprehensive toolbox for the gamma distribution: The gammadist package (Journal of Quality Technology)
Pub Date : 2022-03-07 DOI: 10.1080/00224065.2022.2041378
Caleb King
A deep dive into the theoretical underpinnings of common high-dimensional statistical techniques, Dr. Giraud’s Introduction to High-Dimensional Statistics is a good reference for those who wish to explore the mathematical foundations of state-of-the-art multivariate methods. The book covers a wide array of topics, from estimation bounds to multivariate regression and even clustering. In this 2nd edition, Dr. Giraud expands his work to include more recent advances and statistical methods. The book consists of 12 chapters, starting with a brief introduction to the complexities of conducting statistics in high dimensions. The book then proceeds much like a standard statistical textbook, moving from properties of statistical estimators to statistical modeling, including regression, and then to other more advanced topics. Each chapter concludes with a set of exercises, many of which are portions of proofs from the chapter left for the reader. All that being said, do not let the title of the book fool you. By the author’s own admission, this is not an introduction on the same level as Hastie et al.’s Elements of Statistical Learning. Instead, the focus of this book is on the mathematical foundations of high-dimensional techniques, proving theorems regarding properties of estimators. I must confess this is not quite what I expected upon first look; one truly cannot judge a book by its cover. That is not to say that this book is lacking. It is impressive in its efficient, yet thorough, presentation of the theory. I especially appreciated how the author took time at the beginning to illustrate some of the strange behavior one encounters in very high dimensions. However, I did find it jarring that mathematical notation was very often used without much introduction. There is an appendix with notations at the end of the book, but I would’ve rather had a bit more interpretation within the text rather than having to flip back and forth. There were also a few typographical errors and partial omissions of formulas, though I can’t be sure if this was part of the text or bugs in the software I used to read the digital version. In summary, this book would certainly make for a good graduate-level textbook in an advanced course on statistical methods. If you are willing to put the necessary time and investment into rigorously exploring the foundations of high-dimensional statistics, then you can hardly do better than this book.
Title : Introduction to High-Dimensional Statistics, Christophe Giraud. Chapman & Hall/CRC Press, 2021, 364 pp., $72.00 hardcover, ISBN 978-0-367-71622-6. (Journal of Quality Technology)
Pub Date : 2022-02-25 DOI: 10.1080/00224065.2022.2041379
Shuai Huang
This book is useful for readers who want to hone their skills in Bayesian modeling and computation. Written by experts in the area of Bayesian software and major contributors to some widely used Bayesian computational tools, this book covers not only basic Bayesian probabilistic inference but also a range of models, from linear models (and mixed effect models, hierarchical models, splines, etc.) to time series models such as the state space model. It also covers Bayesian additive regression trees. Almost all the concepts and techniques are implemented using PyMC3, TensorFlow Probability (TFP), ArviZ, and other libraries. By doing all the modeling, computation, and data analysis, the authors not only show how these things work, but also show how and why things don’t work, by emphasizing exploratory data analysis, model comparison, and diagnostics. To learn from the book, readers may need some statistical background, such as basic training in statistics and probability theory. Some understanding of Bayesian modeling and inference is also needed, such as the concepts of prior, likelihood, posterior, Bayes’ law, and Monte Carlo sampling. Some experience with Python would also be very beneficial for readers getting started on this journey of Bayesian modeling. The authors suggest a few books as possible preliminaries for their book. I feel that readers may also benefit from reading Andrew Gelman’s book, Bayesian Data Analysis, Chapman & Hall/CRC, 3rd Edition, 2013. Of course, as the authors point out, this book is not for a Bayesian Reader but a Bayesian practitioner. The book is more of an interactive experience in which Bayesian practitioners learn the computational tools needed to model, and to negotiate with, data as part of good modeling practice.
On the other hand, if readers already have experience with real-world data analysis using Python, R, or other similar tools, they may still learn a lot from this book even if it is their first experience with Bayesian modeling and computation. There is an abundance of figures and detailed explanations of how things are done and how the results are interpreted. Picking up these details requires some trained sensibility in dealing with real-world data, but aspiring and experienced practitioners should find all the details useful and impressive. There are also many big-picture schematic drawings to help readers connect the details with overall concepts such as end-to-end workflows. Figure 9.1 is a remarkable example. Overall, as Kevin Murphy pointed out in the Foreword, “this is a valuable addition to the literature, which should hopefully further the adoption of Bayesian methods”. I highly recommend that readers who are interested in learning Bayesian models and their applications in practice have this book on their bookshelf.
Title : Bayesian Modeling and Computation in Python (Journal of Quality Technology)
Pub Date : 2022-02-23 DOI: 10.1080/00224065.2022.2035284
Guanghui Wang, Changliang Zou
Change-point detection is a popular statistical method for Phase I analysis in statistical process control. The cpss package has been developed to provide users with multiple choices of change-point searching algorithms for a variety of frequently considered parametric change-point models, including univariate and multivariate mean and/or (co)variance change models, changes in linear models and generalized linear models, and change models in exponential families. In particular, it integrates the recently proposed COPSS criterion to determine the number of change-points in a data-driven fashion, which avoids the selection or specification of additional tuning parameters required by existing approaches. Hence it is more convenient to use in practical applications. In addition, the cpss package makes it possible to handle user-customized change-point models.
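To make the simplest model in this list concrete, here is a minimal Python sketch that locates a single change in the mean of a univariate sequence by least squares. It is not the cpss package and does not implement the COPSS criterion: the number of change-points is fixed at one by assumption, and the function name `single_mean_change_point` is hypothetical.

```python
def single_mean_change_point(xs):
    """Return k (1 <= k < n) such that splitting xs into xs[:k] and xs[k:]
    minimizes the total within-segment sum of squared errors."""
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two observations")
    total = sum(xs)
    total_sq = sum(x * x for x in xs)
    best_k, best_cost = None, float("inf")
    left_sum = 0.0
    for k in range(1, n):
        left_sum += xs[k - 1]
        right_sum = total - left_sum
        # SSE of a two-mean fit: sum(x^2) - S_left^2 / k - S_right^2 / (n - k)
        cost = total_sq - left_sum ** 2 / k - right_sum ** 2 / (n - k)
        if cost < best_cost:
            best_k, best_cost = k, cost
    return best_k


# Illustrative data: the mean jumps from 0 to 5 after the 30th observation.
series = [0.0] * 30 + [5.0] * 20
k_hat = single_mean_change_point(series)
print(k_hat)
```

Running sums keep the search linear in the sequence length. Choosing how many change-points to allow is the harder problem, and it is the one the abstract says the COPSS criterion addresses.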
Title : cpss: an R package for change-point detection by sample-splitting methods (Journal of Quality Technology)