
Communications in Statistics Case Studies Data Analysis and Applications: Latest Publications

A robust approach for outlier imputation: Singular spectrum decomposition
Q4 Mathematics Pub Date : 2021-12-28 DOI: 10.1080/23737484.2021.2017810
Maryam Movahedifar, Hossein Hassani, M. Yarmohammadi, M. Kalantari, Rangan Gupta
Abstract Singular spectrum analysis (SSA) is a nonparametric method for separating time series data into a sum of a small number of interpretable components (signal + noise). One of the steps of the SSA method, known as embedding, is extremely sensitive to contamination by outliers, which are often found in time series analysis. To reduce the effect of outliers, SSA based on the Singular Spectrum Decomposition (SSD) method is proposed. In this article, the abilities of SSD-based SSA and basic SSA are compared for time series reconstruction in the presence of outliers. It is noteworthy that the matrix norm used in basic SSA is the Frobenius norm, or L2-norm. A newer version of SSA based on the L1-norm, called L1-SSA, has been confirmed to be robust against outliers. In this regard, this research also introduces a new version of SSD based on the L1-norm, called L1-SSD. A wide empirical study on both simulated and real data verifies the efficiency of SSD-based SSA and the L1-norm in reconstructing time series polluted by outliers.
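The embedding and SVD steps named in this abstract are the core of basic SSA. As a point of reference only, here is a minimal NumPy sketch of basic (Frobenius/L2-norm) SSA reconstruction: embedding into a trajectory matrix, truncated SVD, and diagonal averaging. The SSD and L1-norm variants studied in the paper are not implemented, and the function name, window length L, and rank r are illustrative choices.

```python
import numpy as np

def basic_ssa_reconstruct(x, L=12, r=2):
    """Minimal basic SSA: embed, truncated SVD, diagonal averaging.

    x : 1-D array (the time series), L : window length, r : number of
    leading components kept as the 'signal'.  Illustrative sketch only.
    """
    x = np.asarray(x, dtype=float)
    N = len(x)
    K = N - L + 1
    # Embedding: build the L x K trajectory (Hankel) matrix.
    X = np.column_stack([x[i:i + L] for i in range(K)])
    # Decomposition: SVD of the trajectory matrix (Frobenius/L2 criterion).
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    # Keep the r leading elementary matrices.
    X_r = (U[:, :r] * s[:r]) @ Vt[:r, :]
    # Diagonal averaging (Hankelization) back to a series of length N.
    recon = np.zeros(N)
    counts = np.zeros(N)
    for i in range(L):
        for j in range(K):
            recon[i + j] += X_r[i, j]
            counts[i + j] += 1
    return recon / counts

if __name__ == "__main__":
    t = np.arange(200)
    clean = np.sin(2 * np.pi * t / 25)
    noisy = clean + 0.2 * np.random.default_rng(0).normal(size=t.size)
    noisy[50] += 5.0            # a single outlier, to see its leverage
    print(np.round(basic_ssa_reconstruct(noisy)[45:55], 2))
```

Running the example with and without the injected outlier shows how the L2 criterion spreads the outlier's influence over neighboring reconstructed values, which is the sensitivity the paper addresses.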
Citations: 1
Modeling serially correlated heavy-tailed data with some missing response values using stochastic EM algorithm
Q4 Mathematics Pub Date : 2021-12-22 DOI: 10.1080/23737484.2021.2017808
U. Nduka, I. Iwueze, C. Nwaigwe
Abstract The linear regression model is a popular tool used across nearly all areas of research. The model relies mainly on the assumption of uncorrelated errors from a Gaussian distribution. However, many datasets in practice violate this basic assumption, making inference in such cases invalid. Therefore, linear regression models with structured errors driven by heavy-tailed innovations are preferred in practice. Another issue that occurs frequently with real-life data is missing values, owing to reasons such as system breakdown and labor unrest. Despite the challenge these two issues pose to practitioners, there is a scarcity of literature in which they have been studied jointly. Hence, this article considers the two issues jointly, for the first time, and develops an efficient parameter estimation procedure for the Student’s-t autoregressive regression model for time series with missing values of the response variable. The procedure is based on a stochastic approximation expectation–maximization algorithm coupled with a Markov chain Monte Carlo technique. It yields closed-form expressions for the model parameters that are very easy to compute. Simulations and real-life data analysis show that the method is efficient for use with incomplete time series data.
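The paper's full procedure (stochastic approximation EM with MCMC for a Student's-t autoregressive regression) is beyond a short example. As a hedged illustration of the underlying idea only, the sketch below alternates a stochastic imputation step, which draws missing responses from the current fitted model, with the classical scale-mixture EM updates for a Student's-t regression with fixed degrees of freedom ν. The autoregressive error structure is omitted, and the function and variable names are made up for this sketch.

```python
import numpy as np

def toy_stochastic_em_t_regression(X, y, nu=4.0, n_iter=50, seed=0):
    """Toy stochastic-EM-style fit of y ~ X beta with Student's-t errors.

    Missing responses (NaN in y) are imputed by drawing from the current
    fitted t model (S-step); beta and sigma are then updated with the usual
    scale-mixture weights w_i = (nu+1)/(nu + r_i^2/sigma^2) (E/M-steps).
    Illustrative sketch only, not the estimator developed in the paper.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, float)
    y = np.asarray(y, float).copy()
    miss = np.isnan(y)
    # Crude start: OLS on the observed cases.
    beta = np.linalg.lstsq(X[~miss], y[~miss], rcond=None)[0]
    sigma = np.std(y[~miss] - X[~miss] @ beta)
    for _ in range(n_iter):
        # S-step: impute missing responses from the current fitted model.
        y[miss] = X[miss] @ beta + sigma * rng.standard_t(nu, size=miss.sum())
        # E-step: scale-mixture weights downweight heavy-tailed residuals.
        r = y - X @ beta
        w = (nu + 1.0) / (nu + (r / sigma) ** 2)
        # M-step: weighted least squares and weighted scale update.
        W = np.diag(w)
        beta = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)
        sigma = np.sqrt(np.sum(w * (y - X @ beta) ** 2) / len(y))
    return beta, sigma
```

The design choice to downweight large residuals via the scale-mixture weights is what gives the t model its robustness to heavy-tailed innovations; the stochastic imputation step stands in for the MCMC machinery of the paper.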
Citations: 0
Could significant regression be treated as insignificant: An anomaly in statistics?
Q4 Mathematics Pub Date : 2021-11-08 DOI: 10.1080/23737484.2021.1986171
Yushan Cheng, Yongchang Hui, Shuangzhe Liu, Wing-Keung Wong
Abstract The literature has found that regressions of independent (nearly) nonstationary time series could be spurious. We incorporate this idea to examine whether a significant regression could be treated as insignificant in some situations. To do so, we conjecture that a significant regression could appear significant in some cases but become insignificant in others. To check whether our conjecture could hold, we set up a model in which both the dependent and independent variables Yt and Xt are each the sum of two component variables, where the components are independent, (nearly) nonstationary AR(1) time series with coefficients α1 and α2. Following this model setup, we design several situations and the algorithm for our simulation to check whether the conjecture could hold. We find that, on the one hand, our conjecture could hold in some cases when α1 and α2 are of different signs. On the other hand, our findings show that the conjecture does not hold and significant regression cannot be treated as insignificant when α1 and α2 are of the same sign. As far as we know, this is the first article to discover that significant regression can be treated as insignificant in some situations while remaining significant in others, and this is its main contribution. We believe that our discovery could be an anomaly in statistics. Our findings are useful for academics and practitioners in their data analysis: if they find a regression to be insignificant, they should investigate further whether their analysis falls into the problem studied in this article.
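Because the component notation was lost in extraction above, the simulation sketch below only illustrates the general setting the abstract describes: each of Yt and Xt is the sum of two independent, nearly nonstationary AR(1) components, and the t-statistic of the regression slope is monitored across replications. The coefficient values (±0.95), sample size, rejection rule, and function names are hypothetical choices, not those of the paper.

```python
import numpy as np

def near_unit_ar1(n, alpha, rng):
    """Simulate a (nearly) nonstationary AR(1) path of length n."""
    e = rng.normal(size=n)
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = alpha * x[t - 1] + e[t]
    return x

def rejection_rate(alpha1, alpha2, n=200, reps=2000, seed=1):
    """Share of replications where the slope t-statistic exceeds 1.96
    even though Yt and Xt are built from mutually independent components."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(reps):
        y = near_unit_ar1(n, alpha1, rng) + near_unit_ar1(n, alpha2, rng)
        x = near_unit_ar1(n, alpha1, rng) + near_unit_ar1(n, alpha2, rng)
        X = np.column_stack([np.ones(n), x])
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ beta
        s2 = resid @ resid / (n - 2)
        se = np.sqrt(s2 * np.linalg.inv(X.T @ X)[1, 1])
        hits += abs(beta[1] / se) > 1.96
    return hits / reps

print(rejection_rate(0.95, 0.95))    # components with the same sign
print(rejection_rate(0.95, -0.95))   # components with different signs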
Citations: 1
Early detection of individual growing pigs’ sanitary challenges using functional data analysis of real-time feed intake patterns
Q4 Mathematics Pub Date : 2021-10-28 DOI: 10.1080/23737484.2021.1991855
Bernard Colin, Simon Germain, C. Pomar
Abstract This article is concerned with the design of an automatic numerical procedure which, integrated into automatic feeders, can identify changes in the feed intake patterns of individual pigs, thus allowing early detection of potential sanitary challenges within the herd. More precisely, the proposed procedure analyzes, every day and for each pig in the herd, the feed intake data collected over the preceding 5 consecutive days (the memory lag) to predict the feeding pattern of the following day. The procedure then evaluates, for each animal, the difference between the predicted and observed feeding patterns and automatically detects whether this difference exceeds a given threshold. If it does, a signal is sent to a monitoring center and the animal can be placed under observation.
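The paper analyzes functional intake curves; as a much simpler, hedged stand-in, the sketch below predicts each pig's next-day intake profile as the average of its previous 5 daily profiles and raises a flag whenever the distance between the predicted and observed profiles exceeds a threshold. The 5-day memory lag mirrors the abstract; the averaging predictor, the L2 distance, the threshold value, and the function name are assumptions.

```python
import numpy as np

def daily_alarms(intake, threshold, lag=5):
    """intake : array of shape (n_days, n_bins), one intake profile per day
    for a single pig (e.g., feed consumed per hour-of-day bin).

    Returns the list of day indices flagged for observation.  The predictor
    (mean of the previous `lag` days) and the L2 distance are simplifying
    assumptions, not the functional-data procedure of the paper.
    """
    intake = np.asarray(intake, float)
    flagged = []
    for day in range(lag, intake.shape[0]):
        predicted = intake[day - lag:day].mean(axis=0)
        observed = intake[day]
        if np.linalg.norm(observed - predicted) > threshold:
            flagged.append(day)   # signal the monitoring center
    return flagged

# Usage with synthetic data: 30 days x 24 hourly bins, intake drop on day 20.
rng = np.random.default_rng(2)
profiles = rng.normal(loc=100, scale=5, size=(30, 24))
profiles[20] *= 0.5
print(daily_alarms(profiles, threshold=150))
```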
Citations: 1
Fuzzy theories and statistics—fuzzy data analysis
Q4 Mathematics Pub Date : 2021-10-02 DOI: 10.1080/23737484.2021.1991854
N. Watanabe
Abstract Fuzzy theories are not well accepted in the field of statistics. However, they are important from the statistical viewpoint. In this note we first briefly discuss the statistical applications of fuzzy theories; the focus is on fuzzy set theory, and we do not address fuzzy measure theory. Second, we introduce some statistical tools for analyzing fuzzy data. Fuzzy data analysis is important in fields related to human sensitivity. Furthermore, we define fuzzy directional data as a special case of fuzzy data, and the statistical analysis of fuzzy directional data is also discussed.
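As a small, hedged illustration of one statistical tool for fuzzy data (not necessarily those introduced in the note), the sketch below represents observations as triangular fuzzy numbers (left endpoint, mode, right endpoint) and computes their componentwise sample mean, which is again a triangular fuzzy number. The data values are hypothetical.

```python
import numpy as np

# Triangular fuzzy observations (left endpoint, mode, right endpoint),
# e.g., sensory ratings where respondents give "about 6, between 5 and 8".
fuzzy_ratings = np.array([
    [5.0, 6.0, 8.0],
    [4.0, 5.0, 6.0],
    [6.0, 7.0, 9.0],
    [3.0, 5.0, 6.0],
])

# Under standard fuzzy arithmetic, the sample mean of triangular fuzzy
# numbers is the componentwise mean of their (l, m, r) parameters.
fuzzy_mean = fuzzy_ratings.mean(axis=0)
print("fuzzy mean (l, m, r):", fuzzy_mean)
```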
Citations: 0
Minimax strategies for Bernoulli two-armed bandit on a moderate control horizon
Q4 Mathematics Pub Date : 2021-10-02 DOI: 10.1080/23737484.2021.1986170
A. Kolnogorov, Denis Grunev
ABSTRACT We consider a Bernoulli two-armed bandit problem on a moderate control horizon, as applied to optimizing the processing of moderate amounts of data when two processing methods with different, a priori unknown efficiencies are available. One has to determine the most effective method and ensure its predominant application. In contrast to big data processing, for which several approaches have been developed, including batch processing, the optimization of moderate data processing is currently not well understood. We take a minimax approach and search for the minimax strategy and minimax risk as the Bayesian ones corresponding to the worst-case prior distribution, for which the Bayesian risk attains its maximal value. An approximation to the worst-case prior distribution and the corresponding Bayesian risk are obtained by numerical methods. Calculations show that the determined strategy yields a maximal regret close to the determined Bayesian risk and is therefore approximately minimax. The results can be applied to big data processing if the data arrive in batches of moderate size with approximately uniform properties.
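The worst-case prior and the minimax strategy are computed numerically in the paper and are not reproduced here. As a hedged illustration of the quantities involved, the sketch below uses Monte Carlo to estimate the maximal (worst-case) expected regret of one simple reference strategy, Thompson sampling with uniform Beta priors, for a Bernoulli two-armed bandit over a grid of success probabilities on a moderate horizon. The horizon, grid, choice of strategy, and function name are illustrative assumptions.

```python
import numpy as np

def thompson_regret(p1, p2, horizon=100, reps=200, seed=3):
    """Monte Carlo estimate of the expected regret of Thompson sampling."""
    rng = np.random.default_rng(seed)
    best = max(p1, p2)
    total = 0.0
    for _ in range(reps):
        a = np.ones(2)   # Beta posterior alpha for each arm
        b = np.ones(2)   # Beta posterior beta for each arm
        for _ in range(horizon):
            arm = int(np.argmax(rng.beta(a, b)))      # sample, then pick
            reward = rng.random() < (p1 if arm == 0 else p2)
            a[arm] += reward
            b[arm] += 1 - reward
            total += best - (p1 if arm == 0 else p2)  # per-step regret
    return total / reps

# Maximal regret over a coarse grid of (p1, p2) pairs.
grid = np.linspace(0.1, 0.9, 5)
max_regret = max(thompson_regret(p1, p2) for p1 in grid for p2 in grid)
print("estimated maximal regret on horizon 100:", round(max_regret, 2))
```

Scanning the grid for the largest expected regret mirrors, in a crude way, the worst-case reasoning behind the minimax criterion studied in the paper.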
Citations: 0
Special issue – Communications in statistics – Case studies and data analysis – 6th stochastic modeling techniques and data analysis international conference
Q4 Mathematics Pub Date : 2021-10-02 DOI: 10.1080/23737484.2021.2012013
C. Skiadas, Yiannis Dimotikalis, M. Caruana
This Special Issue on Statistical Methods and Data Analysis contains eleven invited articles presented at the 6th Stochastic Modeling Techniques and Data Analysis International Conference (SMTDA2020). The invited articles, theoretical, experimental, and observational, present new results with applications to real-life problems. An important objective was to select articles that present new methods for analyzing real-life data and advance the related fields. The following articles are included in this Special Issue: Mark A. Caruana and Liam Grech present their work on “Automobile Insurance Fraud Detection,” exploring the risk of financial losses that fraudulent claims pose to insurance companies. Alexander Kolnogorov and Denis Grunev, in their paper “Minimax Strategies for Bernoulli Two-Armed Bandit on a Moderate Control Horizon,” consider a Bernoulli two-armed bandit problem on a moderate control horizon as applied to optimizing the processing of moderate amounts of data when two processing methods with different, a priori unknown efficiencies are available. Panagiota Giannouli, Alex Karagrigoriou, Christos Kountzakis, and Kimon Ntotsis, in their paper “Multilevel Dimension Reduction for Credit Scoring Modelling and Prediction: Empirical Evidence for Greece,” propose an innovative approach to flexible and accurate credit scoring modeling using not only financial but also credit behavioral characteristics. Norio Watanabe discusses “Fuzzy Theories and Statistics – Fuzzy Data Analysis” and introduces some statistical tools for analyzing fuzzy data; fuzzy data analysis is important in fields related to human sensitivity.
Citations: 0
Sibling rivalry within inverse Weibull family to predict the COVID-19 spread in South Africa
Q4 Mathematics Pub Date : 2021-10-02 DOI: 10.1080/23737484.2021.1979433
Farzane Hashemi, A. Bekker, Kirsten Smith, M. Arashi
ABSTRACT This article draws attention to a comparative study of different members of the inverse Weibull power series (IWPS) family used to analyze COVID-19 data from South Africa for the period from 27 March to 23 August 2020. A new sibling within the IWPS family is introduced, namely the inverse Weibull negative binomial. An EM algorithm is developed for computing the maximum likelihood estimates of the model parameters. The IWPS growth curve model and its special cases are used for prediction of the COVID-19 spread in South Africa. It is found that the IWPS model fits the growth of confirmed COVID-19 cases well and yields worthwhile long-term predictions. The IWPS growth curve modeling for South Africa predicts that the number of newly confirmed cases will decrease at the end of November 2020.
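The paper fits the full IWPS family via an EM algorithm; as a simplified, hedged stand-in, the sketch below fits a single inverse Weibull growth curve K * exp(-(a/t)^b) to cumulative case counts by nonlinear least squares and reads off the fitted trajectory of new daily cases. The synthetic data, starting values, and function name are placeholders, not the South African series or the authors' estimator.

```python
import numpy as np
from scipy.optimize import curve_fit

def inverse_weibull_growth(t, K, a, b):
    """Cumulative growth curve proportional to the inverse Weibull CDF."""
    return K * np.exp(-((a / t) ** b))

# Placeholder cumulative counts (the real series runs 27 Mar - 23 Aug 2020).
days = np.arange(1, 151)
true_curve = inverse_weibull_growth(days, K=600_000, a=80, b=2.5)
observed = true_curve * (1 + 0.02 * np.random.default_rng(4).normal(size=days.size))

params, _ = curve_fit(inverse_weibull_growth, days, observed,
                      p0=(500_000, 60, 2.0), maxfev=10_000)
K_hat, a_hat, b_hat = params
print("fitted asymptote (projected total cases):", int(K_hat))
# New daily cases follow the derivative of the fitted curve; a declining
# derivative late in the horizon corresponds to a predicted decrease.
daily = np.diff(inverse_weibull_growth(np.arange(1, 300), *params))
print("fitted new cases around day 250:", int(daily[248]))
```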
Citations: 0
What is in the “I” of the beholder: modeling the processing of consonant addition in a child’s pronoun
Q4 Mathematics Pub Date : 2021-10-02 DOI: 10.1080/23737484.2021.1995912
E. Babatsouli
Abstract Phonological processing in child developmental speech has been a major topic of research, offering insights into intervention methods for child disordered speech. The present paper investigates the processing of consonant addition to the English personal pronoun I, a monosyllable comprising the diphthong /aɪ/. This phonological phenomenon has not been studied in the literature on monolingual or bilingual speech. Here, a child’s speech is elicited longitudinally from age 2;9 to 3;9, and additions to I are examined in terms of the phonological processes of anticipation and perseveration. Results reveal (i) decreasing additions with age, (ii) a larger processing distance between the triggering consonant and the added I in perseveration than in anticipation, (iii) addition dominance of the sonorants n, l and of the voiceless alveolar plosive t, matching their target frequencies in the child’s speech, (iv) no correlation between the probability of consonant addition and syllabic processing distance, and (v) a strong and statistically significant correlation between the mean and standard deviation of processing distance across the child’s ages, meaning that one or the other can be used in practice instead of both. These findings offer insights into speech error processing, with applications to intervention techniques for children with speech difficulties.
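Finding (v) concerns the correlation between the per-age mean and standard deviation of processing distance. As a hedged illustration only, the sketch below shows how such summaries and their Pearson correlation could be computed from a table of (age, processing distance) observations; the records and variable names are entirely hypothetical, not the child data of the paper.

```python
import numpy as np
from scipy.stats import pearsonr

# Hypothetical (age_in_months, processing_distance_in_syllables) records.
records = np.array([
    (33, 1), (33, 2), (33, 4), (34, 1), (34, 3),
    (36, 2), (36, 2), (38, 1), (38, 2), (40, 1),
    (40, 1), (42, 1), (42, 2), (44, 1), (45, 1),
], dtype=float)

ages = np.unique(records[:, 0])
means = np.array([records[records[:, 0] == a, 1].mean() for a in ages])
sds = np.array([records[records[:, 0] == a, 1].std(ddof=1)
                if (records[:, 0] == a).sum() > 1 else 0.0 for a in ages])

r, p = pearsonr(means, sds)
print(f"correlation between per-age mean and SD: r={r:.2f}, p={p:.3f}")
```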
Citations: 0
Automobile insurance fraud detection
Q4 Mathematics Pub Date : 2021-10-02 DOI: 10.1080/23737484.2021.1986169
M. Caruana, Liam Grech
Abstract The risk of incurring financial losses from fraudulent claims is an issue concerning all insurance companies. The detection of such claims is not an easy task, and a number of old-school methods have proven inefficient. Statistical techniques for predictive modelling have been applied to detect fraudulent claims. In this article, we compare two techniques: artificial neural networks and the Naïve Bayes classifier. The theory underpinning both techniques is discussed, and an application of these techniques to a dataset of labelled automobile insurance claims is then presented. Fraudulent claims constitute only a small percentage of the total number of claims, so datasets tend to be unbalanced, which in turn causes a number of problems. To overcome such issues, techniques for dealing with unbalanced datasets are also discussed. The suitability of neural networks and the Naïve Bayes classifier for the dataset is discussed, and the results are compared and contrasted using a number of performance measures, including ROC curves, accuracy, AUC, precision, and sensitivity. Both classification techniques gave comparable results, with the neural network performing slightly better than the Naïve Bayes classifier on the training dataset. However, when applied to the test data, the Naïve Bayes classifier slightly outperformed the artificial neural network.
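As a hedged sketch of the kind of comparison the abstract describes (not the authors' dataset or exact pipeline), the code below oversamples the minority fraud class in the training split and compares a Naïve Bayes classifier with a small neural network by test-set ROC AUC using scikit-learn. The synthetic data, the naive random oversampling, and the network size are assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.naive_bayes import GaussianNB
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import roc_auc_score

# Synthetic stand-in for labelled claims: roughly 5% "fraud" class.
X, y = make_classification(n_samples=5000, n_features=20, weights=[0.95],
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3,
                                          stratify=y, random_state=0)

# Naive random oversampling of the minority class, training set only.
rng = np.random.default_rng(0)
idx_fraud = np.where(y_tr == 1)[0]
extra = rng.choice(idx_fraud, size=4 * len(idx_fraud), replace=True)
X_bal = np.vstack([X_tr, X_tr[extra]])
y_bal = np.concatenate([y_tr, y_tr[extra]])

scaler = StandardScaler().fit(X_bal)
X_bal_s, X_te_s = scaler.transform(X_bal), scaler.transform(X_te)

nb = GaussianNB().fit(X_bal_s, y_bal)
nn = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500,
                   random_state=0).fit(X_bal_s, y_bal)

for name, model in [("Naive Bayes", nb), ("Neural network", nn)]:
    auc = roc_auc_score(y_te, model.predict_proba(X_te_s)[:, 1])
    print(f"{name}: test ROC AUC = {auc:.3f}")
```

Evaluating on the untouched, still-unbalanced test split is the point of the design: oversampling is applied only to the training data so that the reported AUC reflects the class mix actually seen in practice.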
Citations: 1