Pub Date: 2022-02-25 | DOI: 10.1080/00224065.2022.2041379
Bayesian Modeling and Computation in Python
Shuai Huang
This book is useful for readers who want to hone their skills in Bayesian modeling and computation. Written by experts in Bayesian software and major contributors to some widely used Bayesian computational tools, it covers not only basic Bayesian probabilistic inference but also a range of models, from linear models (and mixed-effects models, hierarchical models, splines, etc.) to time series models such as the state space model. It also covers Bayesian additive regression trees. Almost all the concepts and techniques are implemented using PyMC3, TensorFlow Probability (TFP), ArviZ, and other libraries. By working through all the modeling, computation, and data analysis, the authors not only show how these things work but also, by emphasizing exploratory data analysis, model comparison, and diagnostics, show how and why things don't work. To learn from the book, readers need some statistical background, such as basic training in statistics and probability theory. Some understanding of Bayesian modeling and inference is also needed, such as the concepts of prior, likelihood, posterior, Bayes' law, and Monte Carlo sampling. Some experience with Python would also be very beneficial for getting started on this journey of Bayesian modeling. The authors suggest a few books as possible preliminaries to theirs; I feel that readers may also benefit from reading Bayesian Data Analysis by Andrew Gelman et al., Chapman & Hall/CRC, 3rd Edition, 2013. Of course, as the authors point out, this book is not for the Bayesian reader but for the Bayesian practitioner. It is more of an interactive experience in which practitioners learn the computational tools to model and to negotiate with data for good modeling practice. On the other hand, readers who already have experience with real-world data analysis in Python, R, or similar tools may still learn a great deal even if this is their first exposure to Bayesian modeling and computation. There is an abundance of figures and detailed explanations of how things are done and how the results are interpreted. Picking up these details requires some trained sensibility when dealing with real-world data, but aspiring and experienced practitioners alike should find them useful and impressive. There are also many big-picture schematic drawings that help readers connect the details with overall concepts such as end-to-end workflows; Figure 9.1 is a remarkable example. Overall, as Kevin Murphy points out in the Foreword, "this is a valuable addition to the literature, which should hopefully further the adoption of Bayesian methods". I highly recommend that readers interested in learning Bayesian models and their applications in practice have this book on their bookshelf.
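To give a flavor of the workflow the book teaches, here is a minimal sketch (not an example from the book; the data and variable names are made up for illustration) of specifying a prior and likelihood in PyMC3, sampling the posterior, and inspecting the fit with ArviZ:

```python
import numpy as np
import pymc3 as pm
import arviz as az

# Synthetic data standing in for a real-world measurement series.
rng = np.random.default_rng(42)
y = rng.normal(loc=1.5, scale=0.8, size=200)

with pm.Model() as model:
    # Prior, likelihood, posterior: the core objects discussed in the book.
    mu = pm.Normal("mu", mu=0.0, sigma=10.0)      # weakly informative prior
    sigma = pm.HalfNormal("sigma", sigma=5.0)
    pm.Normal("y_obs", mu=mu, sigma=sigma, observed=y)
    idata = pm.sample(1000, tune=1000, return_inferencedata=True)

# Diagnostics and model criticism, the part of the workflow the book emphasizes.
print(az.summary(idata, var_names=["mu", "sigma"]))
az.plot_trace(idata)
```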
{"title":"Bayesian Modeling and Computation in Python","authors":"Shuai Huang","doi":"10.1080/00224065.2022.2041379","DOIUrl":"https://doi.org/10.1080/00224065.2022.2041379","url":null,"abstract":"This book is useful for readers who want to hone their skills in Bayesian modeling and computation. Written by experts in the area of Bayesian software and major contributors to some existing widely used Bayesian computational tools, this book covers not only basic Bayesian probabilistic inference but also a range of models from linear models (and mixed effect models, hierarchical models, splines, etc) to time series models such as the state space model. It also covers the Bayesian additive regression trees. Almost all the concepts and techniques are implemented using PyMC3, Tensorflow Probability (TFP), ArviZ and other libraries. By doing all the modeling, computation, and data analysis, the authors not only show how these things work, but also show how and why things don’t work by emphasis on exploratory data analysis, model comparison, and diagnostics. To learn from the book, readers may need some statistical background such as basic training in statistics and probability theory. Some understanding of Bayesian modeling and inference is also needed, such as the concepts of prior, likelihood, posterior, the bayes’s law, and Monte Carlo sampling. Some experience with Python would also be very beneficial for readers to get started on this journey of Bayesian modeling. The authors suggested a few books as possible preliminaries for their book. I feel that the readers may also benefit from reading Andrew Gelman’s book, Bayesian Data Analysis, Chapman & Hall/CRC, 3rd Edition, 2013. Of course, as the authors pointed it out, this book is not for a Bayesian Reader but a Bayesian practitioner. The book is more of an interactive experience for Bayesian practitioners by learning all the computational tools to model and to negotiate with data for a good modeling practice. On the other hand, if readers have already had experience with real-world data analysis using Python or R or other similar tools, even if this book is their first experience with Bayesian modeling and computation, readers may still learn a lot from this book. There are an abundance of figures and detailed explanations of how things are done and how the results are interpreted. Picking up these details would need some trained sensibility when dealing with real-world data, but aspiring and experienced practitioners should find all the details useful and impressive. And there are also many big picture schematic drawings to help readers connect all the details with overall concepts such as end-to-end workflows. The Figure 9.1 is a remarkable example. Overall, as Kevin Murphy pointed out in the Forward, “this is a valuable addition to the literature, which should hopefully further the adoption of Bayesian methods”. 
I highly recommend readers who are interested in learning Bayesian models and their applications in practice to have this book on their bookshelf.","PeriodicalId":54769,"journal":{"name":"Journal of Quality Technology","volume":"30 1","pages":"266 - 266"},"PeriodicalIF":2.5,"publicationDate":"2022-02-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78872910","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2022-02-23 | DOI: 10.1080/00224065.2022.2035284
cpss: an R package for change-point detection by sample-splitting methods
Guanghui Wang, Changliang Zou
Abstract: Change-point detection is a popular statistical method for Phase I analysis in statistical process control. The cpss package provides users with multiple choices of change-point searching algorithms for a variety of frequently considered parametric change-point models, including univariate and multivariate mean and/or (co)variance change models, changes in linear models and generalized linear models, and change models in exponential families. In particular, it integrates the recently proposed COPSS criterion to determine the number of change-points in a data-driven fashion, avoiding the selection or specification of additional tuning parameters required by existing approaches and making it more convenient in practical applications. In addition, the cpss package offers considerable flexibility for handling user-customized change-point models.
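The package itself is written in R, but the sample-splitting idea can be sketched in Python for the simplest case, a univariate mean-change model. The sketch below is illustrative only: it uses greedy binary segmentation and an odd/even split to pick the number of change-points, not the exact COPSS criterion or the cpss implementation.

```python
import numpy as np

def best_split(x):
    """Best single split of segment x, scored by reduction in within-segment SSE."""
    n = len(x)
    if n < 2:
        return 0.0, None
    csum, csq = np.cumsum(x), np.cumsum(x ** 2)
    total = csq[-1] - csum[-1] ** 2 / n
    best_gain, best_idx = 0.0, None
    for i in range(1, n):
        left = csq[i - 1] - csum[i - 1] ** 2 / i
        right = (csq[-1] - csq[i - 1]) - (csum[-1] - csum[i - 1]) ** 2 / (n - i)
        gain = total - (left + right)
        if gain > best_gain:
            best_gain, best_idx = gain, i
    return best_gain, best_idx

def binary_segmentation(x, k):
    """Greedily place up to k change-points in x."""
    cps = []
    for _ in range(k):
        pts = [0] + sorted(cps) + [len(x)]
        candidates = []
        for s, e in zip(pts[:-1], pts[1:]):
            gain, idx = best_split(x[s:e])
            if idx is not None:
                candidates.append((gain, s + idx))
        if not candidates:
            break
        gain, cp = max(candidates)
        if gain <= 0:
            break
        cps.append(cp)
    return sorted(cps)

def choose_k_by_splitting(x, k_max=5):
    """Pick the number of change-points via an odd/even sample split."""
    n2 = len(x) // 2
    train, valid = x[0:2 * n2:2], x[1:2 * n2:2]   # equal lengths, order preserved
    best_k, best_loss = 0, np.inf
    for k in range(k_max + 1):
        pts = [0] + binary_segmentation(train, k) + [n2]
        # Segment means are fitted on one half and scored on the other half.
        loss = sum(np.sum((valid[s:e] - train[s:e].mean()) ** 2)
                   for s, e in zip(pts[:-1], pts[1:]))
        if loss < best_loss:
            best_k, best_loss = k, loss
    return best_k

rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(0, 1, 150), rng.normal(2, 1, 150), rng.normal(-1, 1, 150)])
print(choose_k_by_splitting(x))   # expected to report 2 change-points
```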
{"title":"cpss: an package for change-point detection by sample-splitting methods","authors":"Guanghui Wang, Changliang Zou","doi":"10.1080/00224065.2022.2035284","DOIUrl":"https://doi.org/10.1080/00224065.2022.2035284","url":null,"abstract":"Abstract Change-point detection is a popular statistical method for Phase I analysis in statistical process control. The cpss package has been developed to provide users with multiple choices of change-point searching algorithms for a variety of frequently considered parametric change-point models, including the univariate and multivariate mean and/or (co)variance change models, changes in linear models and generalized linear models, and change models in exponential families. In particular, it integrates the recently proposed COPSS criterion to determine the number of change-points in a data-driven fashion that avoids selecting or specifying additional tuning parameters in existing approaches. Hence it is more convenient to use in practical applications. In addition, the cpss package brings great possibilities to handle user-customized change-point models.","PeriodicalId":54769,"journal":{"name":"Journal of Quality Technology","volume":"100 1","pages":"61 - 74"},"PeriodicalIF":2.5,"publicationDate":"2022-02-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85866418","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2022-01-24 | DOI: 10.1080/00224065.2022.2035285
Design strategies and approximation methods for high-performance computing variability management
Yueyao Wang, Li Xu, Yili Hong, Rong Pan, Tyler H. Chang, T. Lux, Jon Bernard, L. Watson, K. Cameron
Abstract: Performance variability management is an active research area in high-performance computing (HPC). In this article, we focus on input/output (I/O) variability, a complicated function that is affected by many system factors. To study performance variability, computer scientists often use grid-based designs (GBDs), which are equivalent to full factorial designs, to collect I/O variability data, and use mathematical approximation methods to build a prediction model. Mathematical approximation models, being deterministic, can be biased, particularly when extrapolation is needed. In the statistics literature, space-filling designs (SFDs) and surrogate models such as the Gaussian process (GP) are popular for data collection and building predictive models. The applicability of SFDs and surrogates in the HPC variability management setting, however, needs investigation. In this case study, we investigate their applicability in the HPC setting in terms of design efficiency, prediction accuracy, and scalability. We first customize existing SFDs so that they can be applied in the HPC setting, and we conduct a comprehensive investigation of design strategies and the prediction ability of approximation methods, using both synthetic data simulated from three test functions and real data from the HPC setting. We then compare the different methods in terms of design efficiency, prediction accuracy, and scalability. In our synthetic and real data analyses, GP with SFDs outperforms the alternatives in most scenarios. With respect to the choice of approximation model, GP is recommended if the data are collected by SFDs; if the data are collected using GBDs, both GP and Delaunay can be considered. With the best choice of approximation method, the relative performance of SFDs and GBDs depends on the properties of the underlying surface. In the cases where SFDs perform better, the number of design points needed for SFDs is about half of, or less than, that needed by GBDs to achieve the same prediction accuracy. Although we observe that GBDs can also outperform SFDs for smooth underlying surfaces, GBDs are not scalable to high-dimensional experimental regions. Therefore, SFDs that can be tailored to high dimensions and non-smooth surfaces are recommended, especially when large numbers of input factors need to be considered in the model. This article has online supplementary materials.
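As a rough illustration of the kind of comparison the case study performs (a toy sketch, not the authors' code; the test function, sample sizes, and kernel choice are arbitrary assumptions), one can fit a GP surrogate to points from a space-filling design and from a grid-based design over the same region and compare held-out prediction error:

```python
import numpy as np
from scipy.stats import qmc
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

def test_function(x):
    """A smooth synthetic surface standing in for I/O variability."""
    return np.sin(3 * x[:, 0]) + np.cos(2 * x[:, 1]) + 0.5 * x[:, 0] * x[:, 1]

rng = np.random.default_rng(0)

# Space-filling design: 36 Latin hypercube points in [0, 1]^2.
sfd = qmc.LatinHypercube(d=2, seed=0).random(n=36)

# Grid-based design: a 6 x 6 full factorial over the same region.
g = np.linspace(0, 1, 6)
gbd = np.array([(a, b) for a in g for b in g])

# Held-out test points for assessing prediction accuracy.
x_test = rng.uniform(size=(500, 2))
y_test = test_function(x_test)

for name, design in [("SFD (Latin hypercube)", sfd), ("GBD (6x6 grid)", gbd)]:
    gp = GaussianProcessRegressor(kernel=RBF() + WhiteKernel(), normalize_y=True)
    gp.fit(design, test_function(design))
    rmse = np.sqrt(np.mean((gp.predict(x_test) - y_test) ** 2))
    print(f"{name}: test RMSE = {rmse:.4f}")
```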
{"title":"Design strategies and approximation methods for high-performance computing variability management","authors":"Yueyao Wang, Li Xu, Yili Hong, Rong Pan, Tyler H. Chang, T. Lux, Jon Bernard, L. Watson, K. Cameron","doi":"10.1080/00224065.2022.2035285","DOIUrl":"https://doi.org/10.1080/00224065.2022.2035285","url":null,"abstract":"Abstract Performance variability management is an active research area in high-performance computing (HPC). In this article, we focus on input/output (I/O) variability, which is a complicated function that is affected by many system factors. To study the performance variability, computer scientists often use grid-based designs (GBDs) which are equivalent to full factorial designs to collect I/O variability data, and use mathematical approximation methods to build a prediction model. Mathematical approximation models, as deterministic methods, could be biased particularly if extrapolations are needed. In statistics literature, space-filling designs (SFDs) and surrogate models such as Gaussian process (GP) are popular for data collection and building predictive models. The applicability of SFDs and surrogates in the HPC variability management setting, however, needs investigation. In this case study, we investigate their applicability in the HPC setting in terms of design efficiency, prediction accuracy, and scalability. We first customize the existing SFDs so that they can be applied in the HPC setting. We conduct a comprehensive investigation of design strategies and the prediction ability of approximation methods. We use both synthetic data simulated from three test functions and the real data from the HPC setting. We then compare different methods in terms of design efficiency, prediction accuracy, and scalability. In our synthetic and real data analysis, GP with SFDs outperforms in most scenarios. With respect to the choice of approximation models, GP is recommended if the data are collected by SFDs. If data are collected using GBDs, both GP and Delaunay can be considered. With the best choice of approximation method, the performance of SFDs and GBD depends on the property of the underlying surface. For the cases in which SFDs perform better, the number of design points needed for SFDs is about half of or less than that of the GBD to achieve the same prediction accuracy. Although we observe that the GBD can also outperform SFDs for smooth underlying surface, GBD is not scalable to high dimensional experimental regions. Therefore, SFDs that can be tailored to high dimension and non-smooth surface are recommended especially when large numbers of input factors need to be considered in the model. This article has online supplementary materials.","PeriodicalId":54769,"journal":{"name":"Journal of Quality Technology","volume":"39 1","pages":"88 - 103"},"PeriodicalIF":2.5,"publicationDate":"2022-01-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82735014","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2022-01-10 | DOI: 10.1080/00224065.2023.2180457
Knots and their effect on the tensile strength of lumber: A case study
Shuxiang Fan, S. Wong, J. Zidek
Abstract: When assessing the strength of sawn lumber for use in engineering applications, the sizes and locations of knots are an important consideration. Knots, which result from the growth of tree branches, are the most common visual characteristics of lumber. Large individual knots, as well as clusters of distinct knots, are known to have strength-reducing effects. However, the industry grading rules that govern knots are informed to some extent by subjective judgment, particularly regarding the spatial interaction of knots and their relationship with lumber strength. This case study reports the results of an experiment that investigated and modeled the strength-reducing effects of knots on a sample of Douglas Fir lumber. Experimental data were obtained by scanning lumber surfaces and applying tensile strength testing. The modeling approach presented incorporates all relevant knot information in a Bayesian framework, thereby contributing a more refined way of managing the quality of manufactured lumber.
{"title":"Knots and their effect on the tensile strength of lumber: A case study","authors":"Shuxiang Fan, S. Wong, J. Zidek","doi":"10.1080/00224065.2023.2180457","DOIUrl":"https://doi.org/10.1080/00224065.2023.2180457","url":null,"abstract":"Abstract When assessing the strength of sawn lumber for use in engineering applications, the sizes and locations of knots are an important consideration. Knots are the most common visual characteristics of lumber, that result from the growth of tree branches. Large individual knots, as well as clusters of distinct knots, are known to have strength-reducing effects. However, industry grading rules that govern knots are informed by subjective judgment to some extent, particularly the spatial interaction of knots and their relationship with lumber strength. This case study reports the results of an experiment that investigated and modeled the strength-reducing effects of knots on a sample of Douglas Fir lumber. Experimental data were obtained by taking scans of lumber surfaces and applying tensile strength testing. The modeling approach presented incorporates all relevant knot information in a Bayesian framework, thereby contributing a more refined way of managing the quality of manufactured lumber.","PeriodicalId":54769,"journal":{"name":"Journal of Quality Technology","volume":"8 1","pages":"510 - 522"},"PeriodicalIF":2.5,"publicationDate":"2022-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85147875","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2021-12-30 | DOI: 10.1080/00224065.2020.1764418
Mathematical Statistics
Shuai Huang
As a sister volume to the book Probability by the same author, this book is intended for the second course in a mathematical statistics sequence. Readers should have learned calculus and completed a calculus-based course in probability. As with most mathematical statistics textbooks, point estimation, interval estimation, and hypothesis testing are the core concepts. The book is written particularly for students having their first exposure to mathematical statistics, so the author has carefully selected his materials and focused on conceptual understanding, such as the fact that the sample mean and sample variance are themselves random variables. R is used throughout the text for graphics, computation, and Monte Carlo simulation. The homework is comprehensive. In all these respects, the book has a style similar to the author's Probability. The book's organization is deceptively simple: it has only four chapters. Chapter 1, almost 100 pages, is about random sampling. Chapter 2, another 100 pages, is about point estimation. Chapter 3, 135 pages, is about interval estimation. Chapter 4, 133 pages, is about hypothesis testing. This "simple" structure makes the four pillars of mathematical statistics very clear to readers learning the topic for the first time. Within each chapter, just as in Probability, each concept is presented in detail and from multiple angles, and when calculation is involved, enough intermediate steps are preserved that readers can easily follow them. One notable example is the presentation of hypothesis testing. Unlike many other textbooks that start with proven methods such as the Z-test, this book introduces the big picture first, and this big picture includes "a hunch": it presents at the very beginning a clear outline of the 12 steps of hypothesis testing, starting with "a hunch, or theory, concerning a problem of interest," then moving to the second step, "translate the theory into a question concerning an unknown parameter theta," then "state the null hypothesis of theta," and so on. Technical explanations of many of these steps are then given in detail, and the type 1 and type 2 errors are presented right alongside this 12-step outline. What is more, strange (i.e., idiosyncratic) forms of hypothesis testing are presented! One example concerns three brothers, Chico, Harpo, and Groucho. Each of them comes up with his own test statistic, namely x1 + x2, min(x1, x2), or max(x1, x2), where x1 and x2 form a random sample of size 2 from a uniform distribution U(0, theta). Is theta = 5, or theta = 2? Is this an allusion to the three little pigs? Nonetheless, this is a hilarious example that very effectively teaches the technical details of hypothesis testing, and it also revives this "ancient" technique, reminding readers that, in using the proven hypothesis testing methods, we have actually made choices (i.e., each of the three brothers' proposals has pros and cons in terms of the type 1 and type 2 errors).
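The three brothers' tests are easy to explore by simulation. The book uses R; the sketch below is an illustrative Python rendering (not from the book) that calibrates each statistic's rejection region to roughly 5 percent type 1 error under the null theta = 5 and then estimates its power against theta = 2:

```python
import numpy as np

rng = np.random.default_rng(0)
n_sim = 200_000

def simulate(theta):
    """Draw n_sim samples of size 2 from U(0, theta) and compute the three statistics."""
    x = rng.uniform(0, theta, size=(n_sim, 2))
    return {"Chico:   x1 + x2     ": x.sum(axis=1),
            "Harpo:   min(x1, x2) ": x.min(axis=1),
            "Groucho: max(x1, x2) ": x.max(axis=1)}

null = simulate(5.0)   # H0: theta = 5
alt = simulate(2.0)    # competing value theta = 2

for name in null:
    # Reject H0 for small values of the statistic; calibrate to ~5% type 1 error.
    c = np.quantile(null[name], 0.05)
    power = np.mean(alt[name] <= c)
    print(f"{name} critical value = {c:.3f}, power against theta = 2: {power:.3f}")
```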
{"title":"Mathematical Statistics","authors":"Shuai Huang","doi":"10.1080/00224065.2020.1764418","DOIUrl":"https://doi.org/10.1080/00224065.2020.1764418","url":null,"abstract":"As a sister book of the book Probability by the same author, this book is supposed to be the second course in a mathematical statistics sequence of classes. The readers should have learned calculus and completed a calculusbased course in probability. As with most mathematical statistics textbooks, point estimations, interval estimation, and hypothesis testing are the core concepts. This book is particularly written for students who would have their first exposure to mathematical statistics, so the author carefully selected his materials and had focused on the understanding of statistics such as the sample mean and sample variance being also random variables as well. R is used throughout the text for graphics, computation, and Monte Carlo simulation. The homework is comprehensive. From all these aspects, this book has a similar style as the other book of Probability by the same author. The book’s organization is deceptively simple: it only has four chapters. Chapter 1, almost 100 pages, is about random sampling. Chapter 2, another 100 pages, is about point estimation. Chapter 3, 135 pages, is about interval estimation. Chapter 4, 133 pages, is about hypothesis testing. This “simple” structure makes the four pillars of mathematical statistics very clear to readers who first learn the topic. Within each chapter, just like in the book Probability, each concept is presented in detail and in multiple aspects. And when calculation is involved, enough middle steps are preserved so readers can easily follow the steps. One notable example is the presentation of the hypothesis testing. Not like many other textbooks that start with proven methods such as the Z-test, this book introduces the big picture first, and this big picture includes “a hunch”: it presents in the very beginning a clear outline of the 12 steps for hypothesis testing, starts with “a hunch, or theory, concerning a problem of interest,” then moves to the second step “translate the theory into a question concerning an unknown parameter theta,” then “state the null hypothesis of theta.” ... Then technical explanation of many of these steps is given in detail. The type 1 and type 2 errors are also presented right along with this 12-step outline. What is more, strange (i.e., idiosyncratic) forms of hypothesis testing are presented! It concerns three brothers, Chico, Harpo, and Groucho. Each of them comes up with their own testing statistics, e.g., x1þ x2, min (x1, x2), or max(x1, x2), where x1 and x2 are random samples of size 2 from a uniform distribution U(0, theta). Is theta 1⁄4 5, or theta 1⁄4 2? Is this an allusion to the three little pigs? 
Nonetheless, this is a hilarious example that very effectively instructs the technical details of hypothesis testing, but also revives this “ancient” technique that tells readers that, in using the proven hypothesis testing methods, we actually have made choices (i.e., each of the three brothers’ proposals have pros and cons, in terms of the type 1 and ","PeriodicalId":54769,"journal":{"name":"Journal of Quality Technology","volume":"78 1 1","pages":"118 - 118"},"PeriodicalIF":2.5,"publicationDate":"2021-12-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85795417","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2021-12-30 | DOI: 10.1080/00224065.2021.2019569
Addendum to “Estimating pure-error from near replicates in design of experiments”
Caleb King, T. Bzik, Peter A. Parker, M. Wells, Benjamin R. Baer
{"title":"Addendum to “Estimating pure-error from near replicates in design of experiments”","authors":"Caleb King, T. Bzik, Peter A. Parker, M. Wells, Benjamin R. Baer","doi":"10.1080/00224065.2021.2019569","DOIUrl":"https://doi.org/10.1080/00224065.2021.2019569","url":null,"abstract":"","PeriodicalId":54769,"journal":{"name":"Journal of Quality Technology","volume":"385 1","pages":"123 - 123"},"PeriodicalIF":2.5,"publicationDate":"2021-12-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76614565","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2021-12-08 | DOI: 10.1080/00224065.2021.2006583
Industrial Data Analytics for Diagnosis and Prognosis: A Random Effects Modeling Approach
Jing Li
In Industrial Data Analytics for Diagnosis and Prognosis: A Random Effects Modeling Approach, distinguished engineers Shiyu Zhou and Yong Chen deliver a rigorous and practical introduction to the random effects modeling approach for industrial system diagnosis and prognosis. In the book’s two parts, general statistical concepts and useful theory are described and explained, as are industrial diagnosis and prognosis methods. The accomplished authors describe and model fixed effects, random effects, and variation in univariate and multivariate datasets, and cover the application of the random effects approach to the diagnosis of variation sources in industrial processes. They offer a detailed performance comparison of different diagnosis methods before moving on to the application of the random effects approach to failure prognosis in industrial processes and systems.
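As a minimal illustration of the random effects idea the book builds on (a sketch with simulated data, not an example from the book), a random-intercept model lets each unit deviate from a population-level fixed effect, which is the kind of unit-to-unit variation exploited in diagnosis and prognosis:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated degradation-style data: 20 units, 10 measurements each.
rng = np.random.default_rng(7)
n_units, n_obs = 20, 10
unit = np.repeat(np.arange(n_units), n_obs)
time = np.tile(np.arange(n_obs), n_units)
unit_effect = rng.normal(0.0, 0.5, size=n_units)        # random effect per unit
y = 1.0 + 0.3 * time + unit_effect[unit] + rng.normal(0.0, 0.2, size=n_units * n_obs)
data = pd.DataFrame({"y": y, "time": time, "unit": unit})

# Fixed effect: common trend over time; random effect: unit-specific intercept.
model = smf.mixedlm("y ~ time", data, groups=data["unit"])
result = model.fit()
print(result.summary())
```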
{"title":"Industrial Data Analytics for Diagnosis and Prognosis: A Random Effects Modeling Approach","authors":"Jing Li","doi":"10.1080/00224065.2021.2006583","DOIUrl":"https://doi.org/10.1080/00224065.2021.2006583","url":null,"abstract":"In Industrial Data Analytics for Diagnosis and Prognosis A Random Effects Modelling Approach, distinguished engineers Shiyu Zhou and Yong Chen deliver a rigorous and practical introduction to the random effects modeling approach for industrial system diagnosis and prognosis. In the book’s two parts, general statistical concepts and useful theory are described and explained, as are industrial diagnosis and prognosis methods. The accomplished authors describe and model fixed effects, random effects, and variation in univariate and multivariate datasets and cover the application of the random effects approach to diagnosis of variation sources in industrial processes. They offer a detailed performance comparison of different diagnosis methods before moving on to the application of the random effects approach to failure prognosis in industrial processes and systems.","PeriodicalId":54769,"journal":{"name":"Journal of Quality Technology","volume":"10 1","pages":"606 - 606"},"PeriodicalIF":2.5,"publicationDate":"2021-12-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89676795","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2021-11-01 | DOI: 10.1080/00224065.2021.1991863
Utilizing individual clear effects for intelligent factor allocations and design selections
Qi Zhou, William Li, Hongquan Xu
Abstract: Extensive studies have been conducted on how to select efficient designs with respect to a given criterion. Most design criteria aim to capture the overall efficiency of the design across all columns. When prior information indicates that a small number of factors and their two-factor interactions (2fi’s) are likely to be more significant than other effects, commonly used minimum aberration designs may no longer be the best choice. Motivated by a real-life experiment, we propose a new class of regular fractional factorial designs that focus on estimating a subset of columns and their corresponding 2fi’s clear of other important effects. After introducing the concept of individual clear effects (iCE) to describe clear 2fi’s involving a specific factor, we define the clear effect pattern criterion to characterize the distribution of iCE’s over all columns. We then obtain a new class of designs that sequentially maximize the clear effect pattern. These newly constructed designs are often different from existing optimal designs. We develop a series of theoretical results that are particularly useful for constructing designs with large run sizes, for which algorithmic construction becomes computationally challenging. We also provide practical guidelines on how to choose appropriate designs with respect to the run size, the number of factors, and the number of 2fi’s that need to be clear.
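The notion of a "clear" two-factor interaction underlying the iCE concept can be illustrated with a small script (an illustrative sketch, not the authors' construction algorithm): represent effects of a regular 2^(k-p) design as bit masks, build the defining contrast subgroup from the generator words, and flag a 2fi as clear when none of its aliases is a main effect or another 2fi.

```python
from itertools import combinations

FACTORS = "ABCDEFGH"

def word(s):
    """Encode a defining word such as 'ABCE' as a bit mask over the factors."""
    return sum(1 << FACTORS.index(c) for c in s)

def defining_subgroup(generators):
    """All nonzero products (XORs) of the generator words of a regular design."""
    group = {0}
    for w in generators:
        group |= {g ^ w for g in group}
    return group - {0}

def clear_two_factor_interactions(generators, k):
    """Return the 2fi's not aliased with any main effect or any other 2fi."""
    subgroup = defining_subgroup(generators)
    clear = []
    for a, b in combinations(range(k), 2):
        fi = (1 << a) | (1 << b)
        aliases = {fi ^ w for w in subgroup}
        if all(bin(e).count("1") > 2 for e in aliases):
            clear.append(FACTORS[a] + FACTORS[b])
    return clear

# Example: a 32-run 2^(7-2) design with generators F = ABCD and G = ABDE,
# i.e., defining words ABCDF and ABDEG. For this resolution IV design,
# every 2fi except those among C, E, F, G turns out to be clear.
print(clear_two_factor_interactions([word("ABCDF"), word("ABDEG")], k=7))
```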
{"title":"Utilizing individual clear effects for intelligent factor allocations and design selections","authors":"Qi Zhou, William Li, Hongquan Xu","doi":"10.1080/00224065.2021.1991863","DOIUrl":"https://doi.org/10.1080/00224065.2021.1991863","url":null,"abstract":"Abstract Extensive studies have been conducted on how to select efficient designs with respect to a criterion. Most design criteria aim to capture the overall efficiency of the design across all columns. When prior information indicated that a small number of factors and their two-factor interactions (2fi’s) are likely to be more significant than other effects, commonly used minimum aberration designs may no longer be the best choice. Motivated by a real-life experiment, we propose a new class of regular fractional factorial designs that focus on estimating a subset of columns and their corresponding 2fi’s clear of other important effects. After introducing the concept of individual clear effects (iCE) to describe clear 2fi’s involving a specific factor, we define the clear effect pattern criterion to characterize the distribution of iCE’s over all columns. We then obtain a new class of designs that sequentially maximize the clear effect pattern. These newly constructed designs are often different from existing optimal designs. We develop a series of theoretical results that can be particularly useful for constructing designs with large run sizes, for which algorithmic construction becomes computationally challenging. We also provide some practical guidelines on how to choose appropriate designs with respect to different run size, the number of factors, and the number of 2fi’s that need to be clear.","PeriodicalId":54769,"journal":{"name":"Journal of Quality Technology","volume":"171 1","pages":"3 - 17"},"PeriodicalIF":2.5,"publicationDate":"2021-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78446053","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2021-10-28 | DOI: 10.1080/00224065.2021.1991251
Self-starting process monitoring based on transfer learning
Zhijun Wang, Chunjie Wu, Miaomiao Yu, F. Tsung
Abstract: Conventional self-starting control schemes can perform poorly when monitoring processes with early shifts, being limited by the number of historical observations sampled. In real applications, pre-observed data sets from other production lines are often available, prompting us to propose a scheme that monitors the target process using historical data obtained from other sources. The methodology of self-taught clustering from unsupervised transfer learning is adapted to transfer knowledge from previous observations and improve out-of-control (OC) performance, especially for processes with early shifts. However, if the difference in distribution between the target process and the pre-observed data set is large, our scheme may not be the best choice. Simulation results and two illustrative examples demonstrate the superiority of the proposed scheme.
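For context, a standard self-starting scheme for individual observations, which is the baseline whose early-shift weakness motivates the paper, standardizes each new observation with the mean and standard deviation of all earlier ones. The sketch below is the classical Q-statistic approach, not the authors' transfer-learning method:

```python
import numpy as np
from scipy import stats

def q_statistics(x):
    """Self-starting Q statistics for individual observations (normal model,
    unknown mean and variance); approximately N(0, 1) when in control."""
    x = np.asarray(x, dtype=float)
    q = []
    for i in range(2, len(x)):               # need at least two prior observations
        prior = x[:i]
        xbar, s = prior.mean(), prior.std(ddof=1)
        t = (x[i] - xbar) / (s * np.sqrt(1.0 + 1.0 / i))
        q.append(stats.norm.ppf(stats.t.cdf(t, df=i - 1)))
    return np.array(q)

rng = np.random.default_rng(3)
obs = np.concatenate([rng.normal(0, 1, 15), rng.normal(3, 1, 15)])   # sizable shift after 15 points
q = q_statistics(obs)
print(np.where(np.abs(q) > 3)[0] + 2)   # indices of observations signaling on a 3-sigma chart
```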
{"title":"Self-starting process monitoring based on transfer learning","authors":"Zhijun Wang, Chunjie Wu, Miaomiao Yu, F. Tsung","doi":"10.1080/00224065.2021.1991251","DOIUrl":"https://doi.org/10.1080/00224065.2021.1991251","url":null,"abstract":"Abstract Conventional self-starting control schemes can perform poorly when monitoring processes with early shifts, being limited by the number of historical observations sampled. In real applications, pre-observed data sets from other production lines are always available, prompting us to propose a scheme that monitors the target process using historical data obtained from other sources. The methodology of self-taught clustering from unsupervised transfer learning is revised to transfer knowledge from previous observations and improve out-of-control (OC) performance, especially for processes with early shifts. However, if the difference in distribution between the target process and the pre-observed data set is large, our scheme may not be the best. Simulation results and two illustrative examples demonstrate the superiority of the proposed scheme.","PeriodicalId":54769,"journal":{"name":"Journal of Quality Technology","volume":"8 1","pages":"589 - 604"},"PeriodicalIF":2.5,"publicationDate":"2021-10-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82710881","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2021-10-26 | DOI: 10.1080/00224065.2023.2196034
Phase I analysis of high-dimensional processes in the presence of outliers
M. Ebadi, Shoja'eddin Chenouri, Stefan H. Steiner
Abstract: One of the significant challenges in monitoring product quality today is the high dimensionality of quality characteristics. In this paper, we address Phase I analysis of high-dimensional processes with individual observations when the number of samples collected over time is limited. Using a new charting statistic, we propose a robust procedure for parameter estimation in Phase I that remains efficient in the presence of outliers or contamination in the data. A consistent estimator is proposed for parameter estimation, and a finite-sample correction coefficient is derived and evaluated through simulation. We assess the statistical performance of the proposed method in Phase I, both in the absence and in the presence of outliers, and show that in both cases the proposed control chart scheme effectively detects various kinds of shifts in the process mean. In addition, we present two real-world examples to illustrate the applicability of the proposed method.
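For orientation, a conventional (low-to-moderate dimensional) robust Phase I screen estimates location and scatter robustly and flags observations with large robust Mahalanobis distances. The sketch below is only that generic baseline, not the paper's charting statistic, and the estimator it uses is not designed for the high-dimension, limited-sample regime the paper targets:

```python
import numpy as np
from scipy import stats
from sklearn.covariance import MinCovDet

rng = np.random.default_rng(5)
p, m = 5, 60                                    # dimension and number of Phase I samples
x = rng.multivariate_normal(np.zeros(p), np.eye(p), size=m)
x[::15] += 4.0                                  # contaminate a few observations

# Robust location/scatter via the minimum covariance determinant estimator.
mcd = MinCovDet(random_state=0).fit(x)
d2 = mcd.mahalanobis(x)                         # squared robust distances

# Flag observations beyond an approximate chi-square control limit.
limit = stats.chi2.ppf(0.999, df=p)
print("flagged observations:", np.where(d2 > limit)[0])
```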
{"title":"Phase I analysis of high-dimensional processes in the presence of outliers","authors":"M. Ebadi, Shoja'eddin Chenouri, Stefan H. Steiner","doi":"10.1080/00224065.2023.2196034","DOIUrl":"https://doi.org/10.1080/00224065.2023.2196034","url":null,"abstract":"Abstract One of the significant challenges in monitoring the quality of products today is the high dimensionality of quality characteristics. In this paper, we address Phase I analysis of high-dimensional processes with individual observations when the available number of samples collected over time is limited. Using a new charting statistic, we propose a robust procedure for parameter estimation in Phase I. This robust procedure is efficient in parameter estimation in the presence of outliers or contamination in the data. A consistent estimator is proposed for parameter estimation and a finite sample correction coefficient is derived and evaluated through simulation. We assess the statistical performance of the proposed method in Phase I. This assessment is carried out in the absence and presence of outliers. We show that, in both cases, the proposed control chart scheme effectively detects various kinds of shifts in the process mean. Besides, we present two real-world examples to illustrate the applicability of our proposed method.","PeriodicalId":54769,"journal":{"name":"Journal of Quality Technology","volume":"35 1","pages":"469 - 488"},"PeriodicalIF":2.5,"publicationDate":"2021-10-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80554616","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}