Pub Date: 2021-12-28; DOI: 10.1080/23737484.2021.2017810
Maryam Movahedifar, Hossein Hassani, M. Yarmohammadi, M. Kalantari, Rangan Gupta
Abstract Singular spectrum analysis (SSA) is a nonparametric method for separating time series data into a sum of a small number of interpretable components (signal + noise). One of the steps of the SSA method, referred to as embedding, is extremely sensitive to contamination by outliers, which are often encountered in time series analysis. To reduce the effect of outliers, SSA based on the Singular Spectrum Decomposition (SSD) method is proposed. In this article, the abilities of SSA based on SSD and of basic SSA are compared for time series reconstruction in the presence of outliers. It is noteworthy that the matrix norm used in basic SSA is the Frobenius norm, or L2-norm. There is a newer version of SSA that is based on the L1-norm and called L1-SSA, and it has been confirmed that L1-SSA is robust against outliers. In this regard, this research also introduces a new version of SSD based on the L1-norm, called L1-SSD. A wide empirical study on both simulated and real data verifies the efficiency of basic SSA based on SSD and the L1-norm in reconstructing time series polluted by outliers.
{"title":"A robust approach for outlier imputation: Singular spectrum decomposition","authors":"Maryam Movahedifar, Hossein Hassani, M. Yarmohammadi, M. Kalantari, Rangan Gupta","doi":"10.1080/23737484.2021.2017810","DOIUrl":"https://doi.org/10.1080/23737484.2021.2017810","url":null,"abstract":"Abstract Singular spectrum analysis (SSA) is a nonparametric method for separating time series data into a sum of small numbers of interpretable components (signal + noise). One of the steps of the SSA method, which is referenced to Embedding, is extremely sensitive to contamination of outliers which are often founded in time series analysis. To reduce the effect of outliers, SSA based on Singular Spectrum Decomposition (SSD) method is proposed. In this article, the ability of SSA based on SSD and basic SSA are compared in time series reconstruction in the presence of outliers. It is noteworthy that the matrix norm used in Basic SSA is the Frobenius norm or L 2-norm. There is a newer version of SSA that is based on L 1-norm and called L 1-SSA. It was confirmed that L 1-SSA is robust against outliers. In this regard, this research is also introduced a new version of SSD based on L 1-norm which is called L 1-SSD. A wide empirical study on both simulated and real data verifies the efficiency of basic SSA based on SSD and L 1-norm in reconstructing the time series where polluted by outliers.","PeriodicalId":36561,"journal":{"name":"Communications in Statistics Case Studies Data Analysis and Applications","volume":"34 1","pages":"234 - 250"},"PeriodicalIF":0.0,"publicationDate":"2021-12-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88992819","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2021-12-22; DOI: 10.1080/23737484.2021.2017808
U. Nduka, I. Iwueze, C. Nwaigwe
Abstract The linear regression model is a popular tool used in almost all areas of research. The model relies mainly on the assumption of uncorrelated errors from a Gaussian distribution. However, many datasets in practice violate this basic assumption, making inference in such cases invalid. Therefore, a linear regression model with structured errors driven by heavy-tailed innovations is preferred in practice. Another issue that occurs frequently with real-life data is missing values, due to reasons such as system breakdown and labor unrest. Despite the challenge these two issues pose to practitioners, there is a scarcity of literature in which they have been studied jointly. Hence, this article considers these two issues jointly, for the first time, and develops an efficient parameter estimation procedure for a Student's-t autoregressive regression model for time series with missing values of the response variable. The procedure is based on a stochastic approximation expectation–maximization algorithm coupled with a Markov chain Monte Carlo technique. The procedure gives efficient closed-form expressions for the parameters of the model, which are very easy to compute. Simulations and real-life data analysis show that the method is efficient for use with incomplete time series data.
{"title":"Modeling serially correlated heavy-tailed data with some missing response values using stochastic EM algorithm","authors":"U. Nduka, I. Iwueze, C. Nwaigwe","doi":"10.1080/23737484.2021.2017808","DOIUrl":"https://doi.org/10.1080/23737484.2021.2017808","url":null,"abstract":"Abstract The linear regression model is a popular tool used by almost all in different areas of research. The model relies mainly on the assumption of uncorrelated errors from a Gaussian distribution. However, many datasets in practice violate this basic assumption, making inference in such cases invalid. Therefore, the linear regression model with structured errors driven by heavy-tailed innovations are preferred in practice. Another issue that occur frequently with real-life data is missing values, due to some reasons such as system breakdown and labor unrest. Despite the challenge these two issues pose to practitioners, there is scarcity of literature where they have jointly been studied. Hence, this article considers these two issues jointly, for the first time, and develops an efficient parameter estimation procedure for Student’s-t autoregressive regression model for time series with missing values of the response variable. The procedure is based on a stochastic approximation expectation–maximization algorithm coupled with a Markov chain Monte Carlo technique. The procedure gives efficient closed-form expressions for the parameters of the model, which are very easy to compute. Simulations and real-life data analysis show that the method is efficient for use with incomplete time series data.","PeriodicalId":36561,"journal":{"name":"Communications in Statistics Case Studies Data Analysis and Applications","volume":"58 1","pages":"81 - 104"},"PeriodicalIF":0.0,"publicationDate":"2021-12-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76012816","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2021-11-08; DOI: 10.1080/23737484.2021.1986171
Yushan Cheng, Yongchang Hui, Shuangzhe Liu, Wing-Keung Wong
Abstract The literature has found that regression of independent (nearly) nonstationary time series could be spurious. We incorporate this idea to examine whether significant regression could be treated as insignificant in some situations. To do so, we conjecture that a significant regression could remain significant in some cases but could become insignificant in some other cases. To check whether our conjecture could hold, we set up a model in which both the dependent and independent variables Yt and Xt are each the sum of two variables, among which are independent (nearly) nonstationary AR(1) time series with autoregressive coefficients α1 and α2, respectively. Following this model setup, we design some situations and the algorithm for our simulation to check whether our conjecture could hold. We find that, on the one hand, our conjecture could hold, and significant regression could be treated as insignificant in some cases, when α1 and α2 are of different signs. On the other hand, our findings show that our conjecture does not hold, and significant regression cannot be treated as insignificant, when α1 and α2 are of the same sign. As far as we know, ours is the first article to discover that significant regression can be treated as insignificant in some situations while remaining significant in others; we believe this discovery could be an anomaly in statistics. Our findings are useful for academics and practitioners in their data analysis: if they find the regression is insignificant, they should investigate further whether their analysis falls into the problem studied in our article.
{"title":"Could significant regression be treated as insignificant: An anomaly in statistics?","authors":"Yushan Cheng, Yongchang Hui, Shuangzhe Liu, Wing-Keung Wong","doi":"10.1080/23737484.2021.1986171","DOIUrl":"https://doi.org/10.1080/23737484.2021.1986171","url":null,"abstract":"Abstract Literature has found that regression of independent (nearly) nonstationary time series could be spurious. We incorporate this idea to examine whether significant regression could be treated as insignificant in some situations. To do so, we conjecture that significant regression could appear significant in some cases but it could become insignificant in some other cases. To check whether our conjecture could hold, we set up a model in which both dependent and independent variables Yt and Xt are the sum of two variables, say and , in which and are independent and (nearly) nonstationary AR(1) time series such that and . Following this model-setup, we design some situations and the algorithm for our simulation to check whether our conjecture could hold. We find that on the one hand, our conjecture could hold that significant regression could appear significant in some cases when α 1 and α 2 are of different signs. On the other hand, our findings show that our conjecture does not hold and significant regression cannot be treated as insignificant when α 1 and α 2 are of the same signs. We note that as far as we know, our article is the first article to discover that significant regression can be treated as insignificant in some situations. Thus, the main contribution of our article is that our article is the first article to discover that significant regression can be treated as insignificant in some situations and remains significant in other situations. We believe that our discovery could be an anomaly in statistics. Our findings are useful for academics and practitioners in their data analysis in the way that if they find the regression is insignificant, they should investigate further whether their analysis falls into the problem studied in our article.","PeriodicalId":36561,"journal":{"name":"Communications in Statistics Case Studies Data Analysis and Applications","volume":"273 1","pages":"133 - 151"},"PeriodicalIF":0.0,"publicationDate":"2021-11-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74738609","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2021-10-28; DOI: 10.1080/23737484.2021.1991855
Bernard Colin, Simon Germain, C. Pomar
Abstract This article is concerned with the design of an automatic numerical procedure which, integrated into automatic feeders, can identify changes in the feed intake patterns of individual pigs, thus allowing early detection of potential sanitary challenges within the herd. More precisely, the proposed numerical procedure analyzes, every day and for each pig within the herd, the feed intake data collected during the previous 5 consecutive days (memory lag) to predict the feeding patterns of the following day. The procedure then evaluates, for each animal, the difference between the predicted and the observed feeding patterns and automatically detects whether this difference is greater than a given threshold. If it is, a signal is sent to a monitoring center and the animal can be placed under observation.
{"title":"Early detection of individual growing pigs’ sanitary challenges using functional data analysis of real-time feed intake patterns","authors":"Bernard Colin, Simon Germain, C. Pomar","doi":"10.1080/23737484.2021.1991855","DOIUrl":"https://doi.org/10.1080/23737484.2021.1991855","url":null,"abstract":"Abstract This article is concerned with the conception of an automatic numerical procedure which, integrated into automatic feeders, can identify changes in the feed intake patterns of individual pigs, thus allowing early detection of potential sanitary challenges within the herd. More precisely, the proposed numerical procedure analyzes every day, and for each pig within the herd, feed intake data collected during 5 consecutive days (memory lag) to predict the feeding patterns of the following day. Then, the procedure evaluates, for each animal, the difference between the predicted and the observed feeding patterns and automatically detects if this difference is greater than a given threshold. In this case, a signal is sent to a monitoring center and the animal can be placed under observation.","PeriodicalId":36561,"journal":{"name":"Communications in Statistics Case Studies Data Analysis and Applications","volume":"148 1","pages":"177 - 198"},"PeriodicalIF":0.0,"publicationDate":"2021-10-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74906957","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2021-10-02; DOI: 10.1080/23737484.2021.1991854
N. Watanabe
Abstract Fuzzy theories are not well accepted in the field of statistics. However, fuzzy theories are important from the statistical viewpoint. In this note we first briefly discuss the statistical applications of fuzzy theories. The main one is fuzzy set theory; we do not refer to fuzzy measure theory. Second, we introduce some statistical tools for analyzing fuzzy data. Fuzzy data analysis is important in fields related to human sensitivity. Furthermore, we define fuzzy directional data as a special case of fuzzy data. The statistical analysis of fuzzy directional data is also discussed.
{"title":"Fuzzy theories and statistics—fuzzy data analysis","authors":"N. Watanabe","doi":"10.1080/23737484.2021.1991854","DOIUrl":"https://doi.org/10.1080/23737484.2021.1991854","url":null,"abstract":"Abstract Fuzzy theories are not well accepted in the field of statistics. However, fuzzy theories are important from the statistical viewpoint. In this note we first discuss the statistical applications of the fuzzy theories briefly. The main is the fuzzy set theory and we do not refer the fuzzy measure theory. Second, we introduce some statistical tools for analyzing fuzzy data. The fuzzy data analysis is important in the fields related to human sensitivity. Furthermore we define the fuzzy directional data as a special case of fuzzy data. The statistical analysis of fuzzy directional data is also discussed.","PeriodicalId":36561,"journal":{"name":"Communications in Statistics Case Studies Data Analysis and Applications","volume":"19 1","pages":"561 - 572"},"PeriodicalIF":0.0,"publicationDate":"2021-10-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88111455","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2021-10-02; DOI: 10.1080/23737484.2021.1986170
A. Kolnogorov, Denis Grunev
ABSTRACT We consider a Bernoulli two-armed bandit problem on a moderate control horizon as applied to optimizing the processing of moderate amounts of data when there are two processing methods available with different, a priori unknown efficiencies. One has to determine the most effective method and provide its predominant application. In contrast to big data processing, for which several approaches have been developed, including batch processing, the optimization of moderate data processing is currently not well understood. We take a minimax approach and search for the minimax strategy and minimax risk as the Bayesian ones corresponding to the worst-case prior distribution, for which the Bayesian risk attains its maximal value. Close approximations to the worst-case prior distribution and the corresponding Bayesian risk are obtained by numerical methods. Calculations show that the determined strategy provides a value of maximal regret close to the determined Bayesian risk and, hence, is approximately minimax. The results can be applied to big data processing if the data arrives in batches of moderate size with approximately uniform properties.
{"title":"Minimax strategies for Bernoulli two-armed bandit on a moderate control horizon","authors":"A. Kolnogorov, Denis Grunev","doi":"10.1080/23737484.2021.1986170","DOIUrl":"https://doi.org/10.1080/23737484.2021.1986170","url":null,"abstract":"ABSTRACT We consider a Bernoulli two-armed bandit problem on a moderate control horizon as applied to optimization of processing moderate amounts of data if there are two processing methods available with different a priori unknown efficiencies. One has to determine the most effective method and provide its predominant application. In contrast to big data processing for which several approaches have been developed, including batch processing, the optimization of moderate data processing is currently not well understood. We consider minimax approach and search for minimax strategy and minimax risk as Bayesian ones corresponding to the worst-case prior distribution for which Bayesian risk attains its maximal value. Close to the worst-case prior distribution and corresponding Bayesian risk are obtained by numerical methods. Calculations show that determined strategy provides the value of maximal regret close to determined Bayesian risk and, hence, is approximately minimax one. Results can be applied to big data processing if the data arises by batches of moderate size with approximately uniform properties.","PeriodicalId":36561,"journal":{"name":"Communications in Statistics Case Studies Data Analysis and Applications","volume":"24 1","pages":"536 - 544"},"PeriodicalIF":0.0,"publicationDate":"2021-10-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90082110","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2021-10-02; DOI: 10.1080/23737484.2021.2012013
C. Skiadas, Yiannis Dimotikalis, M. Caruana
This Special Issue on Statistical Methods and Data Analysis contains eleven invited articles presented at the 6th Stochastic Modeling Techniques and Data Analysis International Conference (SMTDA2020). The invited articles, theoretical, experimental, and observational, present new results that have applications to real-life problems. An important objective was to select articles that present new methods for analyzing real-life data and lead to the advancement of the related fields. The following articles are included in this Special Issue: Mark A. Caruana and Liam Grech present their work on "Automobile Insurance Fraud Detection." They explore the risk insurance companies face of incurring financial losses from fraudulent claims. Alexander Kolnogorov and Denis Grunev, in their paper on "Minimax Strategies for Bernoulli Two-Armed Bandit on a Moderate Control Horizon," consider a Bernoulli two-armed bandit problem on a moderate control horizon as applied to optimization of processing moderate amounts of data when there are two processing methods available with different a priori unknown efficiencies. Panagiota Giannouli, Alex Karagrigoriou, Christos Kountzakis, and Kimon Ntotsis, in their paper "Multilevel Dimension Reduction for Credit Scoring Modelling and Prediction: Empirical Evidence for Greece," propose an innovative approach to flexible and accurate credit scoring modeling with the use of not only financial but also credit behavioral characteristics. Norio Watanabe discusses "Fuzzy Theories and Statistics – Fuzzy Data Analysis" and introduces some statistical tools for analyzing fuzzy data. Fuzzy data analysis is important in fields related to human sensitivity.
{"title":"Special issue – Communications in statistics – Case studies and data analysis – 6th stochastic modeling techniques and data analysis international conference","authors":"C. Skiadas, Yiannis Dimotikalis, M. Caruana","doi":"10.1080/23737484.2021.2012013","DOIUrl":"https://doi.org/10.1080/23737484.2021.2012013","url":null,"abstract":"This Special Issue on Statistical Methods and Data Analysis contains eleven invited articles presented at the 6th Stochastic Modeling Techniques and Data Analysis International Conference (SMTDA2020). The invited articles, theoretical, experimental and observational, present new results that have applications in real-life problems. An important objective was to select articles that present new methods for analyzing real-life data and lead to the advancement of the related fields. The following articles are included in this Special Issue: Mark A. Caruana and Liam Grech present their work on “Automobile Insurance Fraud Detection.” They explore the risk of incurring financial losses from fraudulent claims concerning insurance companies. Alexander Kolnogorov and Denis Grunev in their paper on “Minimax Strategies for Bernoulli Two-Armed Bandit on a Moderate Control Horizon” consider a Bernoulli two-armed bandit problem on a moderate control horizon as applied to optimization of processing moderate amounts of data when there are two processing methods available with different a priori unknown efficiencies. Panagiota Giannouli, Alex Karagrigoriou, Christos Kountzakis and Kimon Ntotsis in their paper “Multilevel Dimension Reduction for Credit Scoring Modelling and Prediction: Empirical Evidence for Greece” propose an innovative approach to flexible and accurate credit scoring modeling with the use of not only financial but also credit behavioral characteristics. Norio Watanabe is discussing “Fuzzy Theories and Statistics – Fuzzy Data Analysis –” and introduces some statistical tools for analyzing fuzzy data. The fuzzy data analysis is important in the fields related to human sensitivity.","PeriodicalId":36561,"journal":{"name":"Communications in Statistics Case Studies Data Analysis and Applications","volume":"47 1","pages":"517 - 519"},"PeriodicalIF":0.0,"publicationDate":"2021-10-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78411578","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2021-10-02; DOI: 10.1080/23737484.2021.1979433
Farzane Hashemi, A. Bekker, Kirsten Smith, M. Arashi
ABSTRACT This article draws attention to a comparative study of different members of the inverse Weibull power series (IWPS) family used to analyze the COVID-19 data from South Africa for the period from 27 March to 23 August 2020. A new sibling of the IWPS family is introduced, namely the inverse Weibull negative binomial. An EM algorithm is developed for computing the maximum likelihood estimates of the model parameters. The IWPS growth curve model and its special cases are used for prediction of the COVID-19 spread in South Africa. It is found that the IWPS model fits the growth of confirmed COVID-19 cases well and yields worthwhile long-term predictions. The IWPS growth curve modeling of the South African data predicts that the number of new confirmed cases will decrease at the end of November 2020.
{"title":"Sibling rivalry within inverse Weibull family to predict the COVID-19 spread in South Africa","authors":"Farzane Hashemi, A. Bekker, Kirsten Smith, M. Arashi","doi":"10.1080/23737484.2021.1979433","DOIUrl":"https://doi.org/10.1080/23737484.2021.1979433","url":null,"abstract":"ABSTRACT This article draws attention to a comparative study of different members within the inverse Weibull Power Series (IWPS) to analyze the COVID-19 data from South Africa for the period from 27 March to 23 August 2020. A new sibling of the IWPS is introduced, namely the inverse Weibull negative binomial. An EM algorithm is developed for computing the maximum likelihood estimates of the model parameters. The IWPS growth curve model and its special cases are used for prediction of the COVID-19 spread in South Africa. It is found that the IWPS model fits the disease growth of the COVID-19 confirmed cases well with worthy long-term predictions. The IWPS growth curve modeling of South African predicts that the number of confirmed new cases will decrease at the end of November 2020.","PeriodicalId":36561,"journal":{"name":"Communications in Statistics Case Studies Data Analysis and Applications","volume":"33 1","pages":"119 - 132"},"PeriodicalIF":0.0,"publicationDate":"2021-10-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78063200","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2021-10-02; DOI: 10.1080/23737484.2021.1995912
E. Babatsouli
Abstract Phonological processing in child developmental speech has been a major topic of research, offering insights into intervention methods for child disordered speech. The present paper investigates the processing of consonant addition to the English personal pronoun I, a monosyllable comprising the diphthong /aɪ/. This phonological phenomenon has not been studied in the literature in monolingual or bilingual speech. Here, a child's speech is elicited longitudinally from age 2;9 to 3;9, and additions to I are examined in terms of the phonological processes of anticipation and perseveration. Results reveal (i) decreasing additions with age, (ii) larger processing distance in perseveration than in anticipation between the triggering consonant and the added I, (iii) addition dominance of the sonorants n, l and of the voiceless alveolar plosive t, matching their target frequencies in the child's speech, (iv) no correlation between the probability of consonant addition occurrence and syllabic processing distance, and (v) a strong and statistically significant correlation between the mean and standard deviation of processing distance across the child's ages, meaning that one or the other can be used in practice instead of both. These findings offer insights into speech error processing, with applications to intervention techniques for children with speech difficulties.
{"title":"What is in the “I” of the beholder: modeling the processing of consonant addition in a child’s pronoun","authors":"E. Babatsouli","doi":"10.1080/23737484.2021.1995912","DOIUrl":"https://doi.org/10.1080/23737484.2021.1995912","url":null,"abstract":"Abstract Phonological processing in child developmental speech has been a major topic of research offering insights into intervention methods for child disordered speech. The present paper investigates the processing of consonant addition to the English personal pronoun I, a monosyllable comprising the diphthong //. This phonological phenomenon has not been studied in the literature in monolingual or bilingual speech. Here, a child’s speech is elicited longitudinally from age 2;9 to 3;9 and additions to I are examined in terms of the phonological processes of anticipation and perseveration. Results reveal (i) decreasing additions with age, (ii) larger processing distance in perseveration than in anticipation between triggering consonant and added I, (iii) addition dominance of the sonorants n, l and of the voiceless alveolar plosive t, matching their target frequencies in the child’s speech, (iv) no correlation between probability of consonant addition occurrence and syllabic processing distance, and (v) strong and statistically significant correlation between the mean and standard deviation of processing distance across the child’s ages, meaning that one or the other can be used in practice instead of both. These findings offer insights into speech error processing with applications to intervention techniques in children with speech difficulties.","PeriodicalId":36561,"journal":{"name":"Communications in Statistics Case Studies Data Analysis and Applications","volume":"117 1","pages":"670 - 694"},"PeriodicalIF":0.0,"publicationDate":"2021-10-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90714586","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2021-10-02; DOI: 10.1080/23737484.2021.1986169
M. Caruana, Liam Grech
Abstract The risk of incurring financial losses from fraudulent claims is an issue concerning all insurance companies. The detection of such claims is not an easy task. Moreover, a number of old-school methods have proven to be inefficient. Statistical techniques for predictive modelling have been applied to detect fraudulent claims. In this article, we compare two techniques: Artificial neural networks and the Naïve Bayes classifier. The theory underpinning both techniques is discussed and an application of these techniques to a dataset of labelled automobile insurance claims is then presented. Fraudulent claims only constitute a small percentage of the total number of claims. As a result, datasets tend to be unbalanced. This in turn causes a number of problems. To overcome such issues, techniques which deal with unbalanced datasets are also discussed. The suitability of Neural Networks and the Naïve Bayes classifier to the dataset is discussed and the results are compared and contrasted by using a number of performance measures including ROC curves, Accuracy, AUC, Precision, and Sensitivity. Both classification techniques gave comparable results with the Neural network giving slightly better results than the Naïve Bayes classifier on the training dataset. However, when applied to the test data, the Naïve Bayes classifier slightly outperformed the artificial neural network.
{"title":"Automobile insurance fraud detection","authors":"M. Caruana, Liam Grech","doi":"10.1080/23737484.2021.1986169","DOIUrl":"https://doi.org/10.1080/23737484.2021.1986169","url":null,"abstract":"Abstract The risk of incurring financial losses from fraudulent claims is an issue concerning all insurance companies. The detection of such claims is not an easy task. Moreover, a number of old-school methods have proven to be inefficient. Statistical techniques for predictive modelling have been applied to detect fraudulent claims. In this article, we compare two techniques: Artificial neural networks and the Naïve Bayes classifier. The theory underpinning both techniques is discussed and an application of these techniques to a dataset of labelled automobile insurance claims is then presented. Fraudulent claims only constitute a small percentage of the total number of claims. As a result, datasets tend to be unbalanced. This in turn causes a number of problems. To overcome such issues, techniques which deal with unbalanced datasets are also discussed. The suitability of Neural Networks and the Naïve Bayes classifier to the dataset is discussed and the results are compared and contrasted by using a number of performance measures including ROC curves, Accuracy, AUC, Precision, and Sensitivity. Both classification techniques gave comparable results with the Neural network giving slightly better results than the Naïve Bayes classifier on the training dataset. However, when applied to the test data, the Naïve Bayes classifier slightly outperformed the artificial neural network.","PeriodicalId":36561,"journal":{"name":"Communications in Statistics Case Studies Data Analysis and Applications","volume":"40 1","pages":"520 - 535"},"PeriodicalIF":0.0,"publicationDate":"2021-10-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85082934","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}