Technometrics: Latest Articles


Efficient Model-free Subsampling Method for Massive Data

Zone 3 (Engineering & Technology), Q1 STATISTICS & PROBABILITY

Technometrics

Pub Date : 2023-10-18 DOI: 10.1080/00401706.2023.2271091

Zheng Zhou, Zebin Yang, Aijun Zhang, Yongdao Zhou

Abstract: Subsampling plays a crucial role in tackling problems associated with the storage and statistical learning of massive datasets. However, most existing subsampling methods are model-based, which means their performance can drop significantly when the underlying model is misspecified. This issue calls for model-free subsampling methods that are robust under diverse model specifications. Several model-free subsampling methods have been developed recently, but their computing time grows explosively with the sample size, making them impractical for handling massive data. In this paper, an efficient model-free subsampling method is proposed, which segments the original data into regular data blocks and obtains subsamples from each data block by the data-driven subsampling method. Compared with existing model-free subsampling methods, the proposed method has a significant speed advantage and performs more robustly for datasets with complex underlying distributions. As demonstrated in simulation experiments, the proposed method is an order of magnitude faster than other commonly used model-free subsampling methods when the sample size of the original dataset reaches the order of 10^7. Moreover, simulation experiments and case studies show that the proposed method is more robust than other model-free subsampling methods under diverse model specifications and subsample sizes.

Keywords: Big data subsampling; Model robustness; Parallel computing; Uniform designs
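The abstract describes segmenting the data into regular blocks and then subsampling within each block. The following is a minimal sketch of that general idea only, not the authors' algorithm: the function name `blockwise_subsample`, the equal-width grid, the `bins_per_dim` parameter, and the proportional allocation are all assumptions made for illustration, and plain uniform random sampling is swapped in for the data-driven subsampling step that the paper uses inside each block.

```python
import numpy as np

def blockwise_subsample(X, n_sub, bins_per_dim=4, seed=0):
    """Split X into a regular grid of blocks and draw points from each block.

    Illustrative only: uniform random sampling stands in for the paper's
    data-driven subsampling inside each block.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    # Scale each coordinate to [0, 1] so that all blocks have the same shape.
    lo, hi = X.min(axis=0), X.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)
    U = (X - lo) / span
    # Integer grid cell per coordinate; the maximum is clipped into the last cell.
    cells = np.minimum((U * bins_per_dim).astype(int), bins_per_dim - 1)
    # Collapse the d cell indices into a single block label per point.
    labels = np.ravel_multi_index(tuple(cells.T), (bins_per_dim,) * d)
    chosen = []
    for label in np.unique(labels):
        members = np.flatnonzero(labels == label)
        # Budget proportional to block size, with at least one point per non-empty block.
        k = min(members.size, max(1, round(n_sub * members.size / n)))
        chosen.append(rng.choice(members, size=k, replace=False))
    return np.sort(np.concatenate(chosen))
```

Because each block is handled independently, the loop body could be dispatched to parallel workers. Note that rounding means the returned subsample size is only approximately `n_sub`.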



Citations: 0

Tensor-based Temporal Control for Partially Observed High-dimensional Streaming Data

Zone 3 (Engineering & Technology), Q1 STATISTICS & PROBABILITY

Technometrics

Pub Date : 2023-10-16 DOI: 10.1080/00401706.2023.2271060

Zihan Zhang, Shancong Mou, Kamran Paynabar, Jianjun Shi

Abstract: In advanced manufacturing processes, high-dimensional (HD) streaming data (e.g., sequential images or videos) are commonly used to provide online measurements of product quality. Although numerous studies address monitoring and anomaly detection using HD streaming data, little research has been conducted on feedback control based on HD streaming data to improve product quality, especially in the presence of incomplete responses. To address this challenge, this paper proposes a novel tensor-based automatic control method for partially observed HD streaming data, which consists of two stages: offline modeling and online control. In the offline modeling stage, we propose a one-step approach integrating parameter estimation of the system model with missing value imputation for the response data. Compared to existing data completion methods, this approach (i) improves the accuracy of parameter estimation, and (ii) maintains a stable and superior imputation performance over a wider range of the rank or missing ratio of the data to be completed. In the online control stage, for each incoming sample, missing observations are imputed by balancing its low-rank information and the one-step-ahead prediction result based on the control action from the last time step. Then, the optimal control action is computed by minimizing a quadratic loss function on the sum of squared deviations from the target. Furthermore, we conduct two sets of simulations and one case study on semiconductor manufacturing to validate the superiority of the proposed framework.

Keywords: Streaming Data; High Dimension; Tensor; Feedback Control; Partial Observation
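The online control step in the abstract chooses the control action that minimizes the sum of squared deviations from the target. As a rough illustration of that kind of quadratic-loss step, and not the paper's tensor formulation, the sketch below assumes a vectorized linear response model y = B u + c; the names `control_action`, `B`, `c`, and `target` are hypothetical.

```python
import numpy as np

def control_action(B, c, target):
    """Return u minimizing ||B @ u + c - target||^2 under the assumed linear model."""
    B = np.asarray(B, dtype=float)
    # Deviation that the control input has to compensate for.
    residual = np.asarray(target, dtype=float) - np.asarray(c, dtype=float)
    # Ordinary least squares gives the minimizer of the quadratic loss.
    u, *_ = np.linalg.lstsq(B, residual, rcond=None)
    return u

# Usage: a target that is exactly reachable is recovered up to rounding error.
B = np.array([[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]])
c = np.array([0.5, -1.0, 0.0])
u_true = np.array([2.0, -3.0])
u_hat = control_action(B, c, B @ u_true + c)
```

In this simplified setting the minimizer has a closed form, which is why a single least-squares solve per incoming sample is enough.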



Citations: 0

Machine Learning for Knowledge Discovery with R: Methodologies for Modeling, Inference, and Prediction, Kao-Tai Tsai, Boca Raton, FL: CRC Press, Taylor & Francis Group, LLC, 2022, xiii + 260 pp., $88.00, ISBN: 978-1-032-06536-6 (H)

Zone 3 (Engineering & Technology), Q1 STATISTICS & PROBABILITY

Technometrics

Pub Date : 2023-10-02 DOI: 10.1080/00401706.2023.2262891

Aszani Aszani

Citations: 0

Post-Shrinkage Strategies in Statistical and Machine Learning for High Dimensional Data, Syed Ejaz Ahmed, Feryaal Ahmed, and Bahadir Yüzbaşı, New York: Chapman and Hall/CRC Press, 2023, 408 pp., ISBN 9780367763442

Zone 3 (Engineering & Technology), Q1 STATISTICS & PROBABILITY

Technometrics

Pub Date : 2023-10-02 DOI: 10.1080/00401706.2023.2262896

Abdulkadir Hussein

Citations: 0

Computer Age Statistical Inference: Algorithms, Evidence, and Data Science, Student ed., Bradley Efron and Trevor Hastie, UK: Cambridge University Press, 2021, xix + 491 pp., $39.99 (pbk), ISBN 978-1-108-82341-8.

Zone 3 (Engineering & Technology), Q1 STATISTICS & PROBABILITY

Technometrics

Pub Date : 2023-10-02 DOI: 10.1080/00401706.2023.2262897

Stan Lipovetsky

Citations: 0

A Criminologist’s Guide to R: Crime by the Numbers, Jacob Kaplan, Boca Raton, FL: Chapman and Hall/CRC Press, Taylor & Francis Group, 2022, 432 pp., ISBN 9781032244075.

Zone 3 (Engineering & Technology), Q1 STATISTICS & PROBABILITY

Technometrics

Pub Date : 2023-10-02 DOI: 10.1080/00401706.2023.2262895

Enrique Garcia-Ceja

Citations: 0

Statistical Genomics, Brooke Fridley and Xuefeng Wang, New York, NY: Humana, 2023, 377 pp., EUR 169.99, ISBN 978-1-0716-2986-4 (eBook)

Zone 3 (Engineering & Technology), Q1 STATISTICS & PROBABILITY

Technometrics

Pub Date : 2023-10-02 DOI: 10.1080/00401706.2023.2262893

Irvanal Haq, Nila Lestari

Citations: 0

Mathematics of The Big Four Casino Table Games: Blackjack, Baccarat, Craps, & Roulette, Mark Bollman, Boca Raton, FL: CRC Press/Chapman & Hall, Taylor & Francis Group, 2021, xi + 353 pp., 43 B/W illustrations, $31.16 (pbk), ISBN 9780367740900

Zone 3 (Engineering & Technology), Q1 STATISTICS & PROBABILITY

Technometrics

Pub Date : 2023-10-02 DOI: 10.1080/00401706.2023.2262898

Stan Lipovetsky

Citations: 0

AI, Machine Learning and Deep Learning: A Security Perspective, edited by Fei Hu and Xiali Hei, Boca Raton, FL: CRC Press, 2023, 346 pp., 136 B/W illustrations, GBP 99.99 (Hardback), ISBN 9781032034041, https://doi.org/10.1201/9781003187158

Zone 3 (Engineering & Technology), Q1 STATISTICS & PROBABILITY

Technometrics

Pub Date : 2023-10-02 DOI: 10.1080/00401706.2023.2262890

Fajar Pitarsi Dharma, Moses Laksono Singgih, Hamdan S. Bintang

Citations: 0

Luck, Logic, and White Lies: The Mathematics of Games, 2nd ed., Jörg Bewersdorff, translated by David Kramer, Boca Raton, FL: A.K. Peters/CRC Press, Taylor & Francis Group, 2021, xx + 548 pp., $47.96 (pbk), ISBN 9780367548414

Zone 3 (Engineering & Technology), Q1 STATISTICS & PROBABILITY

Technometrics

Pub Date : 2023-10-02 DOI: 10.1080/00401706.2023.2262889

Stan Lipovetsky

Citations: 0
