Applied Stochastic Models in Business and Industry: Latest Articles (Page 5)

Data Quality: What if Deming Were Born Today?

IF 1.3; Zone 4 (Mathematics); Q3 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS

Applied Stochastic Models in Business and Industry

Pub Date : 2025-06-29 DOI: 10.1002/asmb.70025

Dennis K. J. Lin, Nicholas Rios

If Francis Bacon were born today, he might have said “data is power” instead of his original saying, “knowledge is power.” In modern society, data is everywhere. In memory of Deming (a guru in quality), this paper attempts to address the fundamental issue of data quality and how Deming would handle it. Specifically, we attempt to explain what data quality really means, and the critical impact that it has on data science. Statisticians, who understand how to collect high quality data, have much more to contribute to both the intellectual vitality and the practical utility of data science. At the same time, data science challenges statisticians to move out of some familiar habits to engage less structured problems, to become more comfortable with ambiguity, and to engage more scientists in a fruitful discussion on what various parties can bring to this new mode of investigation. Some potential avenues for future research in the collection of high-quality data will be proposed.

Citations: 0

Topic-Sentiment Hybrid Networks for Explainable Document Clustering: A Probabilistic Multi-Dimensional Similarity Analysis

IF 1.3; Zone 4 (Mathematics); Q3 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS

Applied Stochastic Models in Business and Industry

Pub Date : 2025-06-22 DOI: 10.1002/asmb.70024

Marco Ortu

This study introduces a statistical methodology for document clustering that integrates multiple dimensions of textual similarity through network topology analysis. The proposed methodology, which we call Multi-dimensional Similarity Network Analysis (MSNA), extends traditional document-clustering approaches by combining semantic embeddings, topic probability distributions, and emotional probability distributions into a unified similarity measure. We formalize this through a weighted combination of Jensen-Shannon divergences across different probability spaces, creating a comprehensive similarity network. The clustering is achieved through a community detection algorithm that optimizes a multi-objective modularity function, accounting for the different similarity dimensions. We prove the statistical consistency of our approach and derive bounds for the clustering performance under mild regularity conditions. The methodology is validated on a large-scale data set of Airbnb reviews $(n = 114{,}000)$ from Sardinia, Italy, containing text content, topic distributions, and emotional features. Results show significant improvements in both clustering quality (average silhouette score increased) and interpretability compared to traditional single-dimension approaches. From an empirical perspective, the synthetic data validation demonstrates robust performance with topic strength in the range $[0.4, 1.0]$ and emotion strength in $[0.2, 1.0]$, achieving mean Adjusted Rand Index scores of 0.44. The application to real-world data identifies five distinct clusters through PROCSIMA (PRObabilistic Clustering SIMilarity Analysis), and a subsequent SMARTS analysis (a semantic analysis of review topics and sentiment) reveals interpretable community structures within each cluster. The framework's ability to capture the semantic, topical, and emotional aspects of text simultaneously makes it particularly valuable for applications in customer experience analysis and service quality monitoring.
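The weighted combination of Jensen-Shannon divergences mentioned in the abstract can be illustrated with a minimal sketch. This is not the MSNA implementation: the function names, the dimension labels (`topic`, `emotion`), the weights, and the mapping from divergence to similarity (`1 - JSD / ln 2`) are all assumptions chosen for illustration, and the semantic-embedding dimension is left out because embeddings are not probability vectors.

```python
import numpy as np


def js_divergence(p, q):
    """Jensen-Shannon divergence (natural log) between two discrete distributions."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.sum()
    q = q / q.sum()
    m = 0.5 * (p + q)  # mixture distribution

    def kl(a, b):
        mask = a > 0  # terms with a == 0 contribute 0 to the sum
        return float(np.sum(a[mask] * np.log(a[mask] / b[mask])))

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def combined_similarity(doc_a, doc_b, weights):
    """Weighted similarity over several probability-vector dimensions.

    doc_a and doc_b map a dimension name to a probability vector;
    weights maps the same names to non-negative weights (hypothetical values).
    """
    total = sum(weights.values())
    score = 0.0
    for name, w in weights.items():
        jsd = js_divergence(doc_a[name], doc_b[name])
        # JSD with natural log is bounded by ln 2, so this term lies in [0, 1].
        score += (w / total) * (1.0 - jsd / np.log(2))
    return score


# Two hypothetical reviews described by topic and emotion distributions.
review_1 = {"topic": [0.7, 0.2, 0.1], "emotion": [0.9, 0.1]}
review_2 = {"topic": [0.6, 0.3, 0.1], "emotion": [0.2, 0.8]}
print(combined_similarity(review_1, review_2, {"topic": 0.5, "emotion": 0.5}))
```

Pairwise scores of this kind could serve as edge weights in a similarity network, on which a community detection step would then operate.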

Citations: 0

An Adaptive Learning Approach to Multivariate Time Forecasting in Industrial Processes

IF 1.3; Zone 4 (Mathematics); Q3 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS

Applied Stochastic Models in Business and Industry

Pub Date : 2025-06-09 DOI: 10.1002/asmb.70016

Fernando Miguelez, Josu Doncel, M. D. Ugarte

Industrial processes generate a massive amount of monitoring data that can be exploited to uncover hidden time losses in the system. This can be used to enhance the accuracy of maintenance policies and increase the effectiveness of the equipment. In this work, we propose a method for one-step probabilistic multivariate forecasting of time variables involved in a production process. The method is based on an Input-Output Hidden Markov Model (IO-HMM), in which the parameters of interest are the state transition probabilities and the parameters of the observations' joint density. The ultimate goal of the method is to predict operational process times in the near future, which enables the identification of hidden losses and the location of improvement areas in the process. The input stream in the IO-HMM model includes past values of the response variables and other process features, such as calendar variables, that can have an impact on the model's parameters. The discrete part of the IO-HMM models the operational mode of the process. The state transition probabilities are supposed to change over time and are updated using Bayesian principles. The continuous part of the IO-HMM models the joint density of the response variables. The estimate of the continuous model parameters is recursively computed through an adaptive algorithm that also admits a Bayesian interpretation. The adaptive algorithm allows for efficient updating of the current parameter estimates as soon as new information is available. We evaluate the method's performance using a real data set obtained from a company in a particular sector, and the results are compared with a collection of benchmark models.

Citations: 0

The Analysis of Association Rules: Sensitivity Analysis

IF 1.3; Zone 4 (Mathematics); Q3 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS

Applied Stochastic Models in Business and Industry

Pub Date : 2025-06-09 DOI: 10.1002/asmb.70022

Ron S. Kenett, Chris Gotwalt

Association rules extract information from transactional databases of documents, each containing a collection of items also called “tokens” or “words”. The approach is used in the analysis of text records, social media, and consumer behaviour. We present an innovative sensitivity analysis of association rule (AR) measures of interest. In text analytics, a document term matrix (DTM) consists of rows referring to documents and columns corresponding to items. With binary weights, “1” indicates the presence of a term in a document and “0” its absence. From a DTM one computes measures of interest characterising ARs. The approach we introduce is based on the application of befitting cross validation (BCV) principles to ARs. The sensitivity analysis of ARs is based on computer-generated repeated shuffling of training and validation sets, which provides an assessment of the uncertainty of AR measures of interest. We demonstrate this methodology with reports of symptoms associated with a Nicardipine drug product used in the treatment of high blood pressure and angina. Patients' self-reports on side effect events are analysed. Association rules derived from these reports describe combinations of terms in them. AR measures of interest are defined in Section 1. In Section 2 we introduce the case study that motivates the method we propose. In Section 3 we apply BCV principles by concatenating side effect events of Nicardipine by patient. Sensitivity analysis (SA) of ARs is introduced and demonstrated in Section 4. The sensitivity analysis method presented here is discussed in Section 5, where we formulate general data analysis considerations on how to organise and analyse semantic data.
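To make the idea of computing AR measures from a binary DTM concrete, here is a small sketch. The abstract does not list which measures the authors use, so support, confidence, and lift are assumed here as the common textbook choices. The resampling function uses plain random half-samples of documents as a stand-in; it is not the befitting cross validation (BCV) scheme, and all names and the toy matrix are hypothetical.

```python
import numpy as np


def rule_measures(dtm, antecedent, consequent):
    """Support, confidence and lift of the rule antecedent -> consequent.

    dtm is a binary document-term matrix (rows: documents, columns: terms);
    antecedent and consequent are lists of column indices.
    """
    dtm = np.asarray(dtm, dtype=bool)
    has_a = dtm[:, antecedent].all(axis=1)  # documents containing all antecedent terms
    has_c = dtm[:, consequent].all(axis=1)  # documents containing all consequent terms
    support = float(np.mean(has_a & has_c))
    p_a = float(np.mean(has_a))
    p_c = float(np.mean(has_c))
    confidence = support / p_a if p_a > 0 else float("nan")
    lift = confidence / p_c if p_c > 0 else float("nan")
    return support, confidence, lift


def shuffled_lifts(dtm, antecedent, consequent, n_repeats=200, seed=0):
    """Lift recomputed on random half-samples of documents, to gauge its spread."""
    dtm = np.asarray(dtm, dtype=bool)
    rng = np.random.default_rng(seed)
    half = dtm.shape[0] // 2
    lifts = []
    for _ in range(n_repeats):
        rows = rng.permutation(dtm.shape[0])[:half]
        lifts.append(rule_measures(dtm[rows], antecedent, consequent)[2])
    return np.array(lifts)


# Toy DTM: 6 documents, 3 terms (hypothetical data).
toy = [[1, 1, 0], [1, 1, 1], [1, 0, 0], [0, 1, 1], [0, 0, 1], [1, 1, 0]]
# Rule {term 0} -> {term 1}: support 0.5, confidence 0.75, lift 1.125
# (up to floating-point rounding).
print(rule_measures(toy, [0], [1]))
```

The spread of the values returned by `shuffled_lifts` gives a rough sense of how sensitive a measure is to which documents happen to be included, which is the kind of uncertainty the abstract refers to.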

Citations: 0

A New Framework to Estimate Return on Investment for Player Salaries in the National Basketball Association

IF 1.3; Zone 4 (Mathematics); Q3 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS

Applied Stochastic Models in Business and Industry

Pub Date : 2025-05-31 DOI: 10.1002/asmb.70020

Jackson P. Lautier

An essential component of financial analysis is a comparison of realized returns. These calculations are straightforward when all cash flows have dollar values. Complexities arise if some flows are nonmonetary, however, such as on-court basketball activities. To our knowledge, this problem remains open. We thus present the first known framework to estimate a return on investment for player salaries in the National Basketball Association (NBA). It is a flexible five-part procedure that includes a novel player credit estimator, the Wealth Redistribution Merit Share (WRMS). The WRMS is a per-game wealth redistribution estimator that allocates fractional performance-based credit to players standardized and centered to uniformity. We show it is asymptotically unbiased to the natural share and simultaneously more robust. The per-game approach allows for break-even analysis between high-performing players with frequent missed games and average-performing players with consistent availability. The WRMS may be used to allocate revenue from a single game to each of its players. Using a player's salary as an initial investment, this creates a sequence of cash flows that may be evaluated using traditional financial analysis. We illustrate all methods with empirical estimates from the 2022–2023 NBA regular season. All data and replication code are made available.
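The last step described above, treating salary as an initial investment followed by a sequence of cash flows, can be evaluated with a standard internal rate of return (IRR) calculation. The sketch below shows only that generic step; it does not implement the WRMS estimator, the cash-flow figures are invented, and the bisection bounds are assumptions.

```python
def npv(rate, cash_flows):
    """Net present value of cash_flows[t] received at period t = 0, 1, 2, ..."""
    return sum(cf / (1.0 + rate) ** t for t, cf in enumerate(cash_flows))


def irr(cash_flows, lo=-0.99, hi=10.0, tol=1e-10, max_iter=200):
    """Internal rate of return by bisection.

    Assumes the NPV changes sign exactly once on [lo, hi], which holds for an
    initial outlay followed by non-negative inflows.
    """
    f_lo, f_hi = npv(lo, cash_flows), npv(hi, cash_flows)
    if f_lo * f_hi > 0:
        raise ValueError("NPV does not change sign on the search interval")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        f_mid = npv(mid, cash_flows)
        if abs(f_mid) < tol or (hi - lo) < tol:
            return mid
        if f_lo * f_mid <= 0:
            hi = mid  # root lies in the lower half
        else:
            lo, f_lo = mid, f_mid  # root lies in the upper half
    return 0.5 * (lo + hi)


# Hypothetical example: a salary of 100 paid up front, then per-game
# revenue credits of 30 over four games.
flows = [-100.0, 30.0, 30.0, 30.0, 30.0]
print(irr(flows))  # per-period rate at which the NPV is zero
```

Here each period is one game, so the returned rate is a per-game rate; any annualisation would be a separate modelling choice.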

Citations: 0

Comparisons of Coherent Systems' Lifetimes in the Increasing Convex Order

IF 1.3; Zone 4 (Mathematics); Q3 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS

Applied Stochastic Models in Business and Industry

Pub Date : 2025-05-31 DOI: 10.1002/asmb.70021

Francesco Buono, Franco Pellerey

Stochastic orders have been widely used in the reliability literature to compare the performances of coherent systems, and various criteria have been provided for this purpose. In particular, sufficient conditions have been found for the lifetime of a system to be stochastically larger than that of another system having the same components with identically distributed lifetimes but a different structure function. Known results of this kind concern some of the most relevant stochastic orders, but in the literature no conditions have been provided for the well-known increasing convex order (icx). Here we describe conditions such that two lifetimes of coherent systems are comparable in this stochastic sense when conditions for other stronger orders do not apply. Illustrative examples are also given.
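For reference, the increasing convex order is usually defined as follows. This is the standard textbook definition, stated here as background rather than taken from the article; the symbols $X$, $Y$, $\phi$, and $\bar F$ are notation assumed for this note.

```latex
X \le_{\mathrm{icx}} Y
\quad\Longleftrightarrow\quad
\mathbb{E}[\phi(X)] \le \mathbb{E}[\phi(Y)]
\quad \text{for all increasing convex functions } \phi
\text{ for which the expectations exist.}
```

When the integrals below are finite, an equivalent condition in terms of the survival functions $\bar F_X(u) = P(X > u)$ and $\bar F_Y(u) = P(Y > u)$ is

```latex
\int_t^{\infty} \bar F_X(u)\,du \;\le\; \int_t^{\infty} \bar F_Y(u)\,du
\quad \text{for all } t,
\qquad \text{that is,} \qquad
\mathbb{E}\big[(X - t)_+\big] \le \mathbb{E}\big[(Y - t)_+\big]
\quad \text{for all } t .
```

The usual stochastic order implies the icx order, which is why icx comparisons can still hold when conditions for stronger orders fail.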

Citations: 0

Stochastic Modeling and Time-Frequency Analysis for Predictive Maintenance of Automotive Suspension Systems

IF 1.3; Zone 4 (Mathematics); Q3 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS

Applied Stochastic Models in Business and Industry

Pub Date : 2025-05-26 DOI: 10.1002/asmb.70013

Livio Fenga, Luca Biazzo

This article presents a real-time predictive maintenance model of vehicle suspensions based on vibration signal analysis. The study is grounded in the observation that suspension wear and failure are primarily driven by cumulative stresses and external shocks encountered during vehicle operation. We use a wavelet-based technique integrated with stochastic modeling and lifetime data analysis to predict the remaining useful life (RUL) of the suspension. The proposed framework provides a decision-making tool for determining whether and when suspension systems should be subjected to inspection, replacement, or overhaul. An empirical application, using vibration data from a uniaxial accelerometer mounted on a vehicle suspension under varying road conditions, validates the theoretical model and estimation procedure.

Citations: 0

Foreword to the Special Issue on Data Science in Business and Industry

IF 1.3; Zone 4 (Mathematics); Q3 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS

Applied Stochastic Models in Business and Industry

Pub Date : 2025-05-22 DOI: 10.1002/asmb.70019

David Banks, Alba Martínez-Ruiz, David F. Muñoz, Javier Trejos-Zelaya

Citations: 0

Feature Selection for Stock Movement Direction Prediction Using Sparse Support Vector Machine

IF 1.3; Zone 4 (Mathematics); Q3 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS

Applied Stochastic Models in Business and Industry

Pub Date : 2025-05-19 DOI: 10.1002/asmb.70011

Maoxuan Miao, Jinran Wu, Fengjing Cai, Liya Fu, Shurong Zheng, You-Gan Wang

In financial markets, accurate stock price movement prediction can significantly enhance investors' profits. However, the stock price is a highly complex dynamic system with considerable fluctuations, and the accuracy of direction prediction can be improved by selecting appropriate technical indicators. In this work, we propose a novel sparse support vector machine (SVM) that combines recursive feature elimination (RFE) and ReliefF using a weight parameter. Unlike traditional RFE-based SVMs, our approach constructs a nested feature subset structure, $F_1 \subset F_2 \subset \cdots \subset F_p$, using a new filter algorithm that combines backward sacrifice and ReliefF by weighting. This new filter algorithm can capture relevant features and feature interactions simultaneously and is crucial in preventing valuable features from being removed at each iteration. Moreover, the ReliefF algorithm combined with RFE can identify more discriminative feature subsets by reordering the features such that valuable ones are ranked higher than valueless ones, and removing valueless features sequentially through iterative processes. Our experimental results on nine stock datasets from the liquor and spirits concept demonstrate that our proposed method outperforms baseline sparse SVMs and SVM models in terms of accuracy and F-test, while also producing a desirable number of features and automatically eliminating redundancy among technical indicators. We also show that on most stock datasets, the ReliefF algorithm combined with RFE can effectively identify discriminative feature subsets for cases of linear and Gaussian kernel SVMs, and our proposed filter method can prevent valuable features from being removed at each iteration. In addition, our experimental findings reveal that feature subsets generated by technical indicators are more discriminative, while feature subsets generated by mapping subsets of technical indicators into a higher-dimensional space are less discriminative.

{"title":"Feature Selection for Stock Movement Direction Prediction Using Sparse Support Vector Machine","authors":"Maoxuan Miao, Jinran Wu, Fengjing Cai, Liya Fu, Shurong Zheng, You-Gan Wang","doi":"10.1002/asmb.70011","DOIUrl":"https://doi.org/10.1002/asmb.70011","url":null,"abstract":"In financial markets, accurate stock price movement prediction can significantly enhance investors' profits. However, the stock price is a highly complex dynamic system with considerable fluctuations, and the accuracy of direction prediction can be improved by selecting appropriate technical indicators. In this work, we propose a novel sparse support vector machines (SVMs) that combines recursive feature elimination (RFE) and ReliefF using a weight parameter. Unlike traditional RFE-based SVMs, our approach constructs a nested feature subset structure, <math>\u0000 <semantics>\u0000 <mrow>\u0000 <msub>\u0000 <mrow>\u0000 <mi>F</mi>\u0000 </mrow>\u0000 <mrow>\u0000 <mn>1</mn>\u0000 </mrow>\u0000 </msub>\u0000 <mo>⊂</mo>\u0000 <msub>\u0000 <mrow>\u0000 <mi>F</mi>\u0000 </mrow>\u0000 <mrow>\u0000 <mn>2</mn>\u0000 </mrow>\u0000 </msub>\u0000 <mo>⊂</mo>\u0000 <mi>⋯</mi>\u0000 <mo>⊂</mo>\u0000 <msub>\u0000 <mrow>\u0000 <mi>F</mi>\u0000 </mrow>\u0000 <mrow>\u0000 <mi>p</mi>\u0000 </mrow>\u0000 </msub>\u0000 </mrow>\u0000 <annotation>$$ {F}_1subset {F}_2subset cdots subset {F}_p $$</annotation>\u0000 </semantics></math>, using a new filter algorithm that combines backward sacrifice and ReliefF by weighting. This new filter algorithm can capture relevant features and feature interactions simultaneously and is crucial in preventing valuable features from being removed at each iteration. Moreover, the ReliefF algorithm combined with RFE can identify more discriminative feature subsets by reordering the features such that valuable ones are ranked higher than valueless ones, and removing valueless features sequentially through iterative processes. 
Our experimental results on nine stock datasets from the liquor and spirits concept demonstrate that our proposed method outperforms baseline sparse SVMs and SVM models in terms of accuracy and F-test, while also producing a desirable number of features and automatically eliminating redundancy among technical indicators. We also show that on most stock datasets, the ReliefF algorithm combined with RFE can effectively identify discriminative feature subsets for cases of linear and Gaussian kernel SVMs and our proposed filter method can prevent valuable features from being removed at each iteration. In addition, our experimental findings reveal that feature subsets generated by technical indicators are more discriminative while feature subsets generated by technical i","PeriodicalId":55495,"journal":{"name":"Applied Stochastic Models in Business and Industry","volume":"41 3","pages":""},"PeriodicalIF":1.3,"publicationDate":"2025-05-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/asmb.70011","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144091865","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Citations: 0

Bayesian Hierarchical Modeling of Noisy Gamma Processes: Formulation and Extensions for Unit-To-Unit Variability

IF 1.3 Zone 4 (Mathematics) Q3 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS

Applied Stochastic Models in Business and Industry

Pub Date : 2025-05-15 DOI: 10.1002/asmb.70014

Ryan Leadbetter, Gabriel González Cáceres, Aloke Phatak

The gamma process is a natural model for monotonic degradation processes. In practice, it is desirable to extend the single gamma process to incorporate measurement error and to construct models for the degradation of several nominally identical units. In this paper, we show how these extensions are easily facilitated through the Bayesian hierarchical modeling framework. Following the precepts of the Bayesian statistical workflow, we show the principled construction of a noisy gamma process model. We also reparameterise the gamma process to simplify the specification of priors and make it obvious how the single gamma process model can be extended to include unit-to-unit variability or covariates. We first fit the noisy gamma process model to a single simulated degradation trace. In doing so, we find an identifiability problem between the volatility of the gamma process and the measurement error when there are only a few noisy degradation observations. However, this lack of identifiability can be resolved by including extra information in the analysis through a stronger prior or extra data that informs one of the non-identifiable parameters, or by borrowing information from multiple units. We then explore extensions of the model to account for unit-to-unit variability and demonstrate them using a crack-propagation data set with added measurement error. Lastly, we perform model selection in a fully Bayesian framework by using cross-validation to approximate the expected log probability density of a new observation. We also show how failure time distributions with uncertainty intervals can be calculated for new units or units that are currently under test but have yet to fail.
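To make the model structure in the abstract concrete, the following is a minimal sketch of simulating one noisy degradation trace: a latent gamma process with independent gamma-distributed increments, observed with additive Gaussian measurement error. The function name, the parameterisation by a mean rate `mu` and a coefficient of variation `nu`, and the noise standard deviation `sigma` are assumptions chosen for illustration; they are not taken from the paper and may differ from its reparameterisation.

```python
import random


def simulate_noisy_gamma_process(times, mu, nu, sigma, seed=None):
    """Simulate one degradation trace observed with Gaussian measurement error.

    All parameter names are hypothetical, chosen for this sketch:
    times : strictly increasing observation times, all greater than 0
    mu    : mean degradation rate per unit time
    nu    : coefficient of variation of a unit-time increment
    sigma : standard deviation of the additive measurement noise
    """
    rng = random.Random(seed)
    latent, observed = [], []
    level, previous_t = 0.0, 0.0
    for t in times:
        dt = t - previous_t
        if dt <= 0:
            raise ValueError("times must be positive and strictly increasing")
        # Increment over dt ~ Gamma(shape=dt / nu^2, scale=mu * nu^2),
        # so its mean is mu * dt and its variance is (mu * nu)^2 * dt.
        level += rng.gammavariate(dt / nu**2, mu * nu**2)
        latent.append(level)
        # The observation is the latent level plus Gaussian noise.
        observed.append(level + rng.gauss(0.0, sigma))
        previous_t = t
    return latent, observed


latent, observed = simulate_noisy_gamma_process(
    times=[1, 2, 3, 4, 5], mu=1.0, nu=0.5, sigma=0.2, seed=1
)
```

The latent path is monotone by construction, while the observed path need not be. With only a handful of observations like this, variation due to `nu` and variation due to `sigma` are hard to tell apart, which is the identifiability problem the abstract describes.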



Citations: 0