A Cost of Misclassification Adjustment Approach for Estimating Optimal Cut-Off Point for Classification
O.-A. Ampomah, R. Minkah, G. Kallah-Dagadu, E. N. N. Nortey
Classification is one of the main areas of machine learning, where the target variable is usually categorical with at least two levels. This study focuses on deducing an optimal cut-off point for continuous outcomes (e.g., predicted probabilities) produced by binary classifiers. To achieve this aim, the study modifies univariate discriminant functions by incorporating the costs of misclassification. By doing so, the cut-off point can be shifted systematically within its measurement range until the optimal point is obtained. Extensive simulation studies were conducted to compare the performance of the proposed method with existing classification methods under the binary logistic and Bayesian quantile regression frameworks. The simulation results indicate that logistic regression models incorporating the proposed method outperform the existing ordinary logistic regression and Bayesian regression models. We illustrate the proposed method with a practical dataset from the finance industry that assesses default status in home equity.
Journal of Probability and Statistics, 2024-05-15. https://doi.org/10.1155/2024/8082372
Flexible Lévy-Based Models for Time Series of Count Data with Zero-Inflation, Overdispersion, and Heavy Tails
Confort Kollie, Philip Ngare, B. Malenje
The explosion of time series count data with diverse characteristics and features in recent years has led to a proliferation of new analysis models and methods. Significant efforts have been devoted to achieving flexibility capable of handling complex dependence structures, capturing multiple distributional characteristics simultaneously, and addressing nonstationary patterns such as trends, seasonality, or change points. However, accommodating these features jointly remains challenging in the presence of long-range dependence. The Lévy-based modeling framework offers a promising tool to meet the requirements of modern data analysis. It enables the modeling of both short-range and long-range serial correlation structures by selecting the kernel set accordingly and accommodates various marginal distributions within the class of infinitely divisible laws. We propose an extension of the basic stationary framework that captures additional marginal properties, such as heavy-tailedness, under both short-term and long-term dependence, while simultaneously modeling overdispersion and zero inflation. Statistical inference is based on composite pairwise likelihood. The model’s flexibility is illustrated through applications to rainfall data in Guinea from 2008 to 2023 and to the number of NSF funding awards granted to academic institutions. The proposed model demonstrates remarkable flexibility and versatility, capable of simultaneously capturing overdispersion, zero inflation, and heavy-tailedness in count time series data.
Journal of Probability and Statistics, 2023-11-30. https://doi.org/10.1155/2023/1780404
Exponentially Generated Modified Chen Distribution with Applications to Lifetime Dataset
Awopeju Kabiru Abidemi, A. A. Abiodun
In this paper, the exponentially generated system is used to extend the two-parameter Chen distribution to a four-parameter distribution with better performance. The defining property of a proper probability distribution function is used to verify that the resulting distribution is a valid probability distribution. A simulation study with varying sample sizes is used to examine the asymptotic properties of the new distribution. Both small and large sample sizes are considered, showing that the estimates approach the true values as the sample size increases. Lifetime datasets are used for model comparison, demonstrating the superiority of the exponentially generated modified Chen distribution over some existing distributions. It is therefore recommended to use the four-parameter Chen distribution in place of the well-known two-parameter Chen distribution.
Journal of Probability and Statistics, 2023-11-21. https://doi.org/10.1155/2023/4458562
Bayesian Estimation of the Stress-Strength Reliability Based on Generalized Order Statistics for Pareto Distribution
Zahra Karimi Ezmareh, Gholamhossein Yari
The aim of this paper is to obtain a Bayesian estimator of stress-strength reliability based on generalized order statistics for the Pareto distribution. The dependence of the Pareto distribution's support on its parameter complicates the calculations; hence, in the literature, one of the parameters is usually assumed to be known. In this paper, for the first time, both parameters of the Pareto distribution are treated as unknown. In computing the Bayesian confidence interval for reliability based on generalized order statistics, the posterior distribution has a complex form that cannot be sampled by conventional methods. To solve this problem, we propose an acceptance-rejection algorithm to generate a sample from the posterior distribution. We also consider a particular case of this model and obtain the classical and Bayesian estimators for it; in this case, to obtain the Bayesian estimator of stress-strength reliability, we propose a change-of-variable method. The resulting confidence intervals are then compared by simulation. Finally, a practical example is provided.
Journal of Probability and Statistics, 2023-11-13. https://doi.org/10.1155/2023/8648261
Monitoring Changes in Clustering Solutions: A Review of Models and Applications
Muhammad Atif, Muhammad Shafiq, Muhammad Farooq, Gohar Ayub, Friedrich Leisch, Muhammad Ilyas
This article comprehensively reviews the applications and algorithms used for monitoring the evolution of clustering solutions in data streams. Clustering is an unsupervised learning problem that involves identifying natural subgroups in a large dataset. In contrast to supervised learning, clustering is a data mining technique that retrieves hidden patterns in the input data. The clustering solution reflects the mechanism that leads to a high level of similarity between items. Applications include pattern recognition, knowledge discovery, and market segmentation. However, many modern applications generate streaming or temporal datasets in which the underlying pattern is not stationary and may change over time. In the context of this article, change detection is the process of identifying differences in the cluster solutions obtained from streaming datasets at consecutive time points. We briefly review the models and algorithms introduced in the literature to monitor the evolution of clusters in data streams. Monitoring changes in clustering solutions in streaming datasets plays a vital role in policy-making and future prediction. The full range of applications cannot be covered in a single study, but some of the most common are highlighted in this article.
Journal of Probability and Statistics, 2023-11-03. https://doi.org/10.1155/2023/7493623
Fitting Time Series Models to Fisheries Data to Ascertain Age
Kathleen S. Kirch, Norou Diawara, Cynthia M. Jones
The ability of government agencies to assign accurate ages to fish is important to fisheries management. Accurate ageing allows the most reliable age-based models to be used to support sustainability and maximize economic benefit. Assigning age relies on validating putative annual marks by evaluating accretional material laid down in patterns in fish ear bones, typically by marginal increment analysis. These patterns often take the shape of a sawtooth wave, with an abrupt drop in accretion each year forming an annual band, and are typically validated qualitatively. Researchers have shown keen interest in modeling marginal increments to verify that the marks do, in fact, occur yearly. However, finding the best model to predict this sawtooth wave pattern has been challenging. We propose three new applications of time series models to validate the existence of the yearly sawtooth wave pattern in the data: autoregressive integrated moving average (ARIMA), unobserved components, and copula. These methods are expected to enable the identification of yearly patterns in accretion. ARIMA and unobserved components account for the dependence of observations and errors, while the copula incorporates a variety of marginal distributions and dependence structures. The unobserved components model produced the best results (AIC: −123.7, MSE: 0.00626), followed by the ARIMA model (AIC: −117.292, MSE: 0.0081), and then the copula model (AIC: −96.62, Kendall’s tau: −0.5503). The unobserved components model performed best due to the completeness of the dataset. In conclusion, all three models are effective tools for validating yearly accretional patterns in fish ear bones despite their differences in constraints and assumptions.
Journal of Probability and Statistics, 2023-10-07. https://doi.org/10.1155/2023/9991872
Clustering Analysis of Multivariate Data: A Weighted Spatial Ranks-Based Approach
Mohammed H. Baragilly, Hend Gabr, Brian H. Willis
Determining the right number of clusters without prior information about their number is a core problem in cluster analysis. In this paper, we propose a nonparametric clustering method based on different weighted spatial rank (WSR) functions. The main idea behind WSR is to define a dissimilarity measure locally, based on a localized version of multivariate ranks. We consider a nonparametric Gaussian kernel weight function. We compare the performance of the method with other standard techniques and assess its misclassification rate. The method is completely data-driven, robust against distributional assumptions, accurate for intuitive visualization, and can be used both to determine the number of clusters and to assign each observation to its cluster.
Journal of Probability and Statistics, 2023-09-30. https://doi.org/10.1155/2023/8849404
A New Type 1 Alpha Power Family of Distributions and Modeling Data with Correlation, Overdispersion, and Zero-Inflation in the Health Data Sets
Getachew Tekle, R. Roozegar, Zubair Ahmad
In recent years, the introduction of new families of distributions has attracted considerable attention owing to the limitations of classical univariate distributions. This study introduces a novel family of distributions called the new type 1 alpha power family of distributions. Based on this family, a special model called the new type 1 alpha power Weibull model is studied in depth. The new model is very flexible and exhibits interesting patterns: it can model real data with increasing, decreasing, parabola-down, and bathtub failure rate shapes. Its applicability is studied by applying it to health-sector data on the time to recovery of breast cancer patients, and its performance is compared with seven well-known models. Based on the model comparison, it is the best model for fitting health-related data without exceptional features. Furthermore, popular models for data with exceptional features such as correlation, overdispersion, and zero-inflation in aggregate are explored, with applications to epileptic seizure data. These features are sometimes beyond the reach of simple probability distribution models. Hence, this study fits eight candidate models separately to these data and compares them using standard techniques. Accordingly, the zero-inflated Poisson-normal-gamma model, which includes random effects in the linear predictor to handle the three features simultaneously, shows its supremacy over the others and is the best model for fitting health-related data with these features.
Journal of Probability and Statistics, 2023-08-31. https://doi.org/10.1155/2023/6611108
Applications of Robust Methods in Spatial Analysis
S. Selvaratnam
Spatial data analysis provides valuable information to governments as well as companies. Rapid improvements in modern technology, together with geographic information systems (GIS), have led to the collection and storage of ever more spatial data. We developed algorithms to choose optimal locations from a set of permanent locations in a space for efficient spatial data analysis. Distances between neighboring permanent locations need not be equally spaced. Robust and sequential methods were used to develop the algorithms for design construction. The constructed designs are robust against misspecified regression responses and variance/covariance structures of the responses. The proposed method can be extended in future work to image analysis, including three-dimensional image analysis.
Journal of Probability and Statistics, 2023-05-10. https://doi.org/10.1155/2023/1328265
Hybrid Model for Stock Market Volatility
Kofi Agyarko, N. K. Frempong, E. N. Wiah
Empirical evidence suggests that traditional GARCH-type models are unable to accurately estimate the volatility of financial markets. To improve on their accuracy, a hybrid model (BSGARCH(1, 1)) that combines the flexibility of B-splines with the GARCH(1, 1) model is proposed in this study. The lagged residuals from the GARCH(1, 1) model are fitted with a B-spline estimator and added to the results produced by the GARCH(1, 1) model. The proposed BSGARCH(1, 1) model was applied to simulated data and two real financial time series (NASDAQ 100 and S&P 500). The outcomes were then compared with those of the GARCH(1, 1), EGARCH(1, 1), GJR-GARCH(1, 1), and APARCH(1, 1) models with different error distributions (ED), using the mean absolute percentage error (MAPE), the root mean square error (RMSE), Theil’s inequality coefficient (TIC), and QLIKE. It was concluded that the proposed BSGARCH(1, 1) model outperforms the traditional GARCH-type models considered in the study on these performance metrics, and it can thus be used for estimating the volatility of stock markets.
Journal of Probability and Statistics, 2023-04-25. https://doi.org/10.1155/2023/6124649