Pub Date: 2021-12-28 DOI: 10.14710/medstat.14.2.118-124
D. Rosadi, D. Arisanty, D. Agustina
Forest fires are among the most important catastrophic events and have a great impact on the environment, infrastructure, and human life. In this study, we discuss a method for predicting the size of a forest fire using a hybrid approach that combines Fuzzy C-Means (FCM) clustering with Neural Network (NN) classification, trained either by backpropagation or by the extreme learning machine approach. For comparison, we consider a similar hybrid approach, namely FCM combined with classical Support Vector Machine (SVM) classification. In the empirical study, we apply these methods to several meteorological and Forest Weather Index (FWI) variables. We found that the hybrid FCM-SVM gives the best performance on the training data, whereas the hybrid FCM-NN with backpropagation gives the best performance on the testing data.
Title: PREDICTION OF FOREST FIRE USING NEURAL NETWORKS WITH BACKPROPAGATION LEARNING AND EXREME LEARNING MACHINE APPROACH USING METEOROLOGICAL AND WEATHER INDEX VARIABLES. Journal: Media Statistika.
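The fuzzy C-means step named in this abstract can be illustrated with the standard membership update. This is a minimal sketch, not the authors' pipeline: the function name `fcm_memberships`, the fuzzifier `m = 2`, and the toy points and centers are assumptions, and the abstract does not say how the memberships are passed to the NN or SVM classifiers.

```python
import math

def fcm_memberships(points, centers, m=2.0):
    """Standard fuzzy C-means membership update.

    Returns u[i][k], the degree to which point i belongs to cluster k.
    The fuzzifier m > 1 is an assumed value; the abstract does not report it.
    """
    power = 2.0 / (m - 1.0)
    memberships = []
    for p in points:
        dists = [math.dist(p, c) for c in centers]
        if any(d == 0.0 for d in dists):
            # The point sits exactly on a center: give it full membership there.
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            total = sum(hits)
            memberships.append([h / total for h in hits])
            continue
        row = [1.0 / sum((dk / dj) ** power for dj in dists) for dk in dists]
        memberships.append(row)
    return memberships

# Toy data (hypothetical): three points on a line and two cluster centers.
u = fcm_memberships([(0.0, 0.0), (1.0, 0.0), (4.0, 0.0)], [(0.0, 0.0), (4.0, 0.0)])
```

Each row of `u` sums to 1, so a point close to one center receives most, but not all, of its membership there.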
Pub Date: 2021-12-28 DOI: 10.14710/medstat.14.2.170-182
M. Miftahuddin, Retno Wahyuni Putri, I. Setiawan, R. S. Oktari
Variability of Sea Surface Temperature (SST) is one of the climatic features that influence global and regional climate dynamics. Missing data (gaps) in the SST dataset are worth investigating since they may statistically alter the value of the SST change. In this work, the partial least squares-structural equation modeling (PLS-SEM) approach is used to estimate the causal relationships between exogenous and endogenous latent variables. The indicators found to be significant, with loading factors greater than 0.7, are: i) sea surface temperature (°C) as a measure of the latent variable changes in SST, ii) wind speed (m/s) and relative humidity (%) as measures of the latent variable weather, and iii) air temperature (°C) and long-wave solar radiation (W/m²) as measures of the latent variable climate. The size of the R-squared value is influenced by the number of gaps. The bootstrapping results show that the latent variables weather and climate have a significant effect on changes in SST, as indicated by t-statistics greater than the t-table value. The structural model obtained is Changes in SST (η) = -0.330 weather + 0.793 climate + ζ. Weather has a negative coefficient, which means that the better the weather conditions, the lower the SST changes. Climate has a positive coefficient, which means that the better the climate, the larger the SST changes. Rising sea surface temperatures associated with an increase in the climate variable can contribute to global warming and affect El Niño and La Niña events.
Title: MODELING OF SEA SURFACE TEMPERATURE BASED ON PARTIAL LEAST SQUARE - STRUCTURAL EQUATION. Journal: Media Statistika.
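For reference, the structural model reported in this abstract can be typeset as follows. The symbols ξ₁ (weather) and ξ₂ (climate) for the exogenous latent variables are notation assumed here; the abstract itself only names η and ζ.

```latex
\eta = -0.330\,\xi_{1} + 0.793\,\xi_{2} + \zeta,
\qquad
\xi_{1} = \text{weather},\quad
\xi_{2} = \text{climate},\quad
\eta = \text{changes in SST}
```

Under this reading, a one-unit increase in the weather score lowers the predicted SST change by 0.330 units, while a one-unit increase in the climate score raises it by 0.793 units.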
Pub Date: 2021-12-28 DOI: 10.14710/medstat.14.2.108-117
Yundari Yundari, Shantika Martha
This research examines semiparametric Generalized Space-Time Autoregressive (GSTAR) space-time modeling and the determination of its spatial weights. The spatial weights in common use (uniform, binary, and distance-based) are all fixed weights. Because the GSTAR model is a stochastic model that takes the randomness of its variables into account, it is necessary to study random spatial weights. This study introduces a new method for estimating the observed values of a semiparametric GSTAR model with a uniform kernel. The data are Gamma Ray (GR) log data from four coal drill holes. The semiparametric GSTAR model aims to predict the GR log value in an unobserved soil layer from the observations in the layer above it and at the surrounding locations. The results show that semiparametric GSTAR modeling can predict the presence of coal seams and their thickness in the drill holes. A validation test on out-of-sample data shows that the error in each borehole is small. In addition, the predictions tend to approach the actual observed values at a depth of 1 meter further down.
Title: THE APPLICATION OF THE SEMIPARAMETRIC GSTAR MODEL IN DETERMINING GAMMA-RAY LOG DATA ON SOIL LAYERS. Journal: Media Statistika.
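As background for the fixed weights mentioned in this abstract, the sketch below shows a one-step prediction from a classical GSTAR(1;1) model with uniform spatial weights. It is not the semiparametric kernel method the paper proposes: the model order, the function names, and all coefficient and data values are assumptions chosen for illustration.

```python
def uniform_weights(n):
    """Uniform spatial weights: every other location gets weight 1 / (n - 1)."""
    return [[0.0 if i == j else 1.0 / (n - 1) for j in range(n)] for i in range(n)]

def gstar11_step(z_prev, phi0, phi1, weights):
    """One-step GSTAR(1;1) prediction for each location i:

    z_i(t) = phi0_i * z_i(t-1) + phi1_i * sum_j w_ij * z_j(t-1)
    """
    n = len(z_prev)
    prediction = []
    for i in range(n):
        neighbor_average = sum(weights[i][j] * z_prev[j] for j in range(n))
        prediction.append(phi0[i] * z_prev[i] + phi1[i] * neighbor_average)
    return prediction

# Hypothetical values for four locations (for example, four drill holes).
w = uniform_weights(4)
z_next = gstar11_step([3.0, 6.0, 9.0, 12.0], [0.5] * 4, [0.3] * 4, w)
```

With fixed weights like these, `w` never changes with the data, which is the limitation that motivates studying random spatial weights.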
Pub Date: 2021-12-28 DOI: 10.14710/medstat.14.2.206-215
T. W. Utami, Aisyah Lahdji
Coronavirus disease 2019 (COVID-19) is an infectious disease caused by the recently discovered severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). The disease is now a pandemic affecting many countries, including Indonesia. One of the Indonesian cities with many COVID cases is Semarang City, located in Central Java. The daily counts of COVID patients in Semarang City do not follow any particular distribution pattern. A model can therefore be built with nonparametric regression, a flexible statistical approach that requires no distributional assumptions. The nonparametric regression in this research uses the Local Polynomial Kernel approach. The polynomial order and the optimal bandwidth of the Local Polynomial Kernel Regression model are determined using the Generalized Cross Validation (GCV) method. The data used in this research are the daily numbers of COVID patient cases in Semarang, Indonesia. For these data, the optimal bandwidth is 0.86 and the polynomial order is 4, with a minimum GCV of 3179.568; the resulting model has an MSE of 2922.22 and a coefficient of determination of 97%. The estimates show that the number of COVID cases in Semarang City peaked at the beginning of July 2020; after the increase in July, cases decreased in August.
Title: MODELING OF LOCAL POLYNOMIAL KERNEL NONPARAMETRIC REGRESSION FOR COVID DAILY CASES IN SEMARANG CITY, INDONESIA. Journal: Media Statistika.
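The bandwidth selection described in this abstract can be sketched with a simplified case. The code below uses a local linear fit (order 1) with a Gaussian kernel instead of the order-4 polynomial reported in the paper, and the function names, the kernel choice, and the toy data are all assumptions. GCV is computed as the mean squared residual divided by (1 - tr(H)/n)², where H is the smoother matrix.

```python
import math

def local_linear_weights(x0, xs, h):
    """Smoother weights l_i(x0) of a local linear fit with a Gaussian kernel."""
    k = [math.exp(-0.5 * ((x - x0) / h) ** 2) for x in xs]
    s0 = sum(k)
    s1 = sum(ki * (x - x0) for ki, x in zip(k, xs))
    s2 = sum(ki * (x - x0) ** 2 for ki, x in zip(k, xs))
    denom = s0 * s2 - s1 ** 2
    return [ki * (s2 - (x - x0) * s1) / denom for ki, x in zip(k, xs)]

def local_linear_fit(x0, xs, ys, h):
    """Fitted value at x0: a weighted sum of the responses."""
    return sum(w * y for w, y in zip(local_linear_weights(x0, xs, h), ys))

def gcv(xs, ys, h):
    """Generalized cross validation score for bandwidth h."""
    n = len(xs)
    rss, trace = 0.0, 0.0
    for i, x0 in enumerate(xs):
        w = local_linear_weights(x0, xs, h)
        rss += (ys[i] - sum(wi * yi for wi, yi in zip(w, ys))) ** 2
        trace += w[i]  # diagonal entry of the smoother matrix
    return (rss / n) / (1.0 - trace / n) ** 2

# Hypothetical daily counts; pick the candidate bandwidth with the smallest GCV.
days = [float(d) for d in range(10)]
cases = [3.0, 5.0, 4.0, 8.0, 12.0, 15.0, 13.0, 9.0, 6.0, 4.0]
best_h = min([1.0, 1.5, 2.0, 3.0], key=lambda h: gcv(days, cases, h))
```

The same search extends to higher polynomial orders by minimizing GCV jointly over the order and the bandwidth, which is the selection the abstract reports.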
Pub Date: 2021-12-12 DOI: 10.14710/medstat.14.2.194-205
E. Sunandi, K. Notodiputro, B. Sartono
The Poisson Log-Normal model is one of the hierarchical mixed models that can be used for count data. Several estimation methods can be used to estimate the model parameters. The first objective of this study was to examine, through simulation, the performance of the parameter estimator and of the model built using the Hierarchical Bayes method via Markov Chain Monte Carlo (MCMC). The second objective was to apply the Poisson Log-Normal model to data on illiteracy cases in West Java, sourced from the March 2019 Susenas data. In 2019, illiteracy was a very rare occurrence in West Java Province, which makes it a suitable application case for this study. The simulation results showed that the Hierarchical Bayes parameter estimator through MCMC has the smallest Root Mean Squared Error of Prediction (RMSEP), with an absolute bias broadly similar to that of the Maximum Likelihood (ML) and Penalized Quasi-Likelihood (PQL) methods. Meanwhile, the empirical results showed that, among the fixed-effect variables, the number of respondents whose highest education is elementary school is associated with the greatest risk of illiteracy. In addition, the variability among census blocks significantly affects illiteracy cases in West Java in 2019.
Title: A STUDY OF GENERALIZED LINEAR MIXED MODEL FOR COUNT DATA USING HIERARCHICAL BAYES METHOD. Journal: Media Statistika.
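A common way to write the Poisson Log-Normal mixed model discussed in this abstract is given below. The notation is assumed here, since the abstract does not state the model equations: y_i is the count in unit i, x_i the vector of fixed-effect covariates, and u_i a normally distributed random effect (for example, a census block effect).

```latex
y_i \mid u_i \sim \mathrm{Poisson}(\mu_i),
\qquad
\log \mu_i = \mathbf{x}_i^{\top}\boldsymbol{\beta} + u_i,
\qquad
u_i \sim \mathcal{N}\!\left(0, \sigma_u^{2}\right)
```

Because exp(u_i) is log-normally distributed, the Poisson mean is log-normal given the covariates, which gives the model its name. A Hierarchical Bayes treatment would additionally place priors on β and σ_u² and sample the posterior by MCMC.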
Pub Date: 2021-12-12 DOI: 10.14710/medstat.14.2.137-145
Anisa Eka Haryati, Sugiyarto Surono
Clustering is a data analysis process applied to group unlabeled data. Fuzzy clustering is a clustering method based on membership values, which uses fuzzy sets as the basis of measurement for the grouping process. Fuzzy Subtractive Clustering (FSC) is one such fuzzy clustering method. This research applies the Hamming distance and the combined Minkowski-Chebyshev distance as the distance measure in Fuzzy Subtractive Clustering. The objective is to compare the quality of the clusters produced by Fuzzy Subtractive Clustering under these two distances. The data used are hypertension data, with the variables age, gender, systolic pressure, diastolic pressure, and body weight. The results show that the Partition Coefficient obtained with the combined Minkowski-Chebyshev distance is higher than that obtained with the Hamming distance. It can therefore be concluded that, in this study, the combined Minkowski-Chebyshev distance yields better cluster quality.
Title: COMPARATIVE STUDY OF DISTANCE MEASURES ON FUZZY SUBTRACTIVE CLUSTERING. Journal: Media Statistika.
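The two quantities compared in this abstract can be sketched briefly. The Partition Coefficient below is the usual average of squared memberships. The combined distance is written as a weighted sum of a Minkowski and a Chebyshev distance, which is an assumed form: the weights `w1`, `w2` and the order `p` are illustrative values, not taken from the paper.

```python
def partition_coefficient(memberships):
    """Average of squared memberships over all points.

    Ranges from 1 / (number of clusters) for a fully fuzzy partition
    to 1 for a crisp partition; higher means better separated clusters.
    """
    n = len(memberships)
    return sum(u ** 2 for row in memberships for u in row) / n

def minkowski_chebyshev(a, b, p=2.0, w1=0.5, w2=0.5):
    """Weighted combination of Minkowski (order p) and Chebyshev distances.

    The weights and the order are assumptions made for this sketch.
    """
    diffs = [abs(x - y) for x, y in zip(a, b)]
    minkowski = sum(d ** p for d in diffs) ** (1.0 / p)
    chebyshev = max(diffs)
    return w1 * minkowski + w2 * chebyshev

# Hypothetical memberships for three points and two clusters.
pc = partition_coefficient([[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]])
d = minkowski_chebyshev((0.0, 0.0), (3.0, 4.0))
```

Comparing two distance measures then amounts to running the clustering once per distance and keeping the one whose membership matrix gives the larger Partition Coefficient.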
Pub Date: 2021-12-12 DOI: 10.14710/medstat.14.2.183-193
A. Hoyyi, Abdurakhman Abdurakhman, D. Rosadi
Options are widely used in the financial sector. The Black-Scholes-Merton model is often used to calculate option prices from stock price movements. The model uses geometric Brownian motion, which assumes that the data are normally distributed. In reality, however, stock price movements can produce sharp spikes in the data, resulting in a non-normal distribution, so a stock price model that does not assume normality is needed. One of the fastest growing stock price models today is the exponential Lévy process model. A Lévy process can model data with excess kurtosis and a heavier tail than the normal distribution. One member of this family is the Variance Gamma (VG) process. The VG process has three parameters, which control volatility, kurtosis, and skewness, respectively. This research uses secondary data samples of options and stocks of two companies, namely Zoom Video Communications, Inc. (ZM) and Nokia Corporation (NOK). The price of the call options is determined using closed-form equations and Monte Carlo simulation. The simulation was carried out for various values until a convergent result was obtained.
Title: VARIANCE GAMMA PROCESS WITH MONTE CARLO SIMULATION AND CLOSED FORM APPROACH FOR EUROPEAN CALL OPTION PRICE DETERMINATION. Journal: Media Statistika.
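The Monte Carlo side of this abstract can be illustrated with a standard risk-neutral Variance Gamma simulation. This is a sketch under stated assumptions, not the authors' code: the VG increment is drawn as a Brownian motion with drift `theta` and volatility `sigma` evaluated at a gamma-distributed time with variance rate `nu`, and every numerical value below is hypothetical rather than fitted to ZM or NOK data.

```python
import math
import random

def vg_call_price_mc(s0, strike, r, t, sigma, nu, theta, n_paths=50_000, seed=0):
    """Monte Carlo price of a European call under a Variance Gamma model.

    Requires 1 - theta*nu - 0.5*sigma**2*nu > 0 so the martingale
    correction omega is defined.
    """
    rng = random.Random(seed)
    # Correction term that makes the discounted stock price a martingale.
    omega = math.log(1.0 - theta * nu - 0.5 * sigma ** 2 * nu) / nu
    payoff_sum = 0.0
    for _ in range(n_paths):
        # Gamma time change with mean t and variance nu * t (shape t/nu, scale nu).
        g = rng.gammavariate(t / nu, nu)
        x = theta * g + sigma * math.sqrt(g) * rng.gauss(0.0, 1.0)
        s_t = s0 * math.exp((r + omega) * t + x)
        payoff_sum += max(s_t - strike, 0.0)
    return math.exp(-r * t) * payoff_sum / n_paths

# Hypothetical inputs: at-the-money call, one year to maturity.
price = vg_call_price_mc(s0=100.0, strike=100.0, r=0.05, t=1.0,
                         sigma=0.2, nu=0.2, theta=-0.1)
```

Increasing `n_paths` and watching the estimate stabilize is one way to check for the kind of convergence the abstract mentions.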
Pub Date: 2021-06-30 DOI: 10.14710/medstat.14.1.56-67
R. Zulkarnain, A. Djuraidah, I. Sumertajaya, Indahwati Indahwati
Stochastic frontier analysis (SFA) is a widely favored method for measuring technical efficiency. SFA decomposes the error term into a noise component and an inefficiency component. The noise component is generally assumed to have a normal distribution, while the inefficiency component is assumed to have a half-normal distribution. However, in the presence of outliers, the normality assumption for the noise is not sufficient and can produce implausible technical efficiency scores. This paper explores the use of Student's t distribution for handling outliers in technical efficiency measurement. The model was applied to paddy rice production in East Java. The output variable was the quantity of production, while the input variables were land, seed, fertilizer, labor, and capital. To link the output and the inputs, either the Cobb-Douglas or the translog production function was chosen using a likelihood ratio test, and the parameters were estimated by maximum simulated likelihood. The technical efficiency scores were then calculated using the Jondrow method. The results showed that a Student's t distribution for the noise can reduce the outliers in the technical efficiency scores: it revised the extremely high technical efficiency scores downward and the extremely low scores upward. The performance of the model improved after the outliers were handled, as indicated by a smaller AIC value.
Title: UTILIZATION OF STUDENT'S T DISTRIBUTION TO HANDLE OUTLIERS IN TECHNICAL EFFICIENCY MEASUREMENT. Journal: Media Statistika.
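The error decomposition described in this abstract can be written out for the Cobb-Douglas case with the five inputs listed (land, seed, fertilizer, labor, capital). The notation is assumed here, and the translog alternative would add squared and cross-product terms in the log inputs.

```latex
\ln y_i = \beta_0 + \sum_{j=1}^{5} \beta_j \ln x_{ij} + v_i - u_i,
\qquad
v_i \sim t_{\nu}\!\left(0, \sigma_v^{2}\right),
\qquad
u_i \sim \mathcal{N}^{+}\!\left(0, \sigma_u^{2}\right),
\qquad
\mathrm{TE}_i = \exp(-u_i)
```

Here v_i is the two-sided noise, u_i ≥ 0 is the half-normal inefficiency, and the composed residual is ε_i = v_i - u_i. Since u_i is not observed, a Jondrow-type estimator replaces it with its conditional expectation given ε_i. The heavier tails of the t distribution let extreme residuals be absorbed by the noise rather than being read as extreme efficiency or inefficiency.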
Pub Date: 2021-06-25 DOI: 10.14710/medstat.14.1.44-55
P. Kartikasari, H. Yasin, D. A. I. Maruddani
The emergence of the novel coronavirus (SARS-CoV-2), which causes the COVID-19 pandemic, has become a serious health problem because of its high risk of death. Fast and appropriate action is therefore needed to reduce the spread of the pandemic. One way is to build a prediction model that can serve as a reference for the steps taken to overcome it. Because the disease spreads so quickly and massively, the data fluctuate sharply, and observations that are far apart in time remain correlated with each other (long memory). The best ARFIMA model obtained for predicting additional recovered cases of COVID-19 is ARFIMA(1, 0.489, 0), with an SMAPE of 12.44%, while for deaths it is ARFIMA(1, 0.429, 0), with an SMAPE of 13.52%. This shows that the ARFIMA model accommodates the long memory effect well, resulting in a small bias. Its parameter estimation is also simpler. The numbers of recoveries and deaths are both increasing, although the number of deaths remains very high relative to recoveries.
Title: AUTOREGRESSIVE FRACTIONAL INTEGRATED MOVING AVERAGE (ARFIMA) MODEL TO PREDICT COVID-19 PANDEMIC CASES IN INDONESIA. Journal: Media Statistika.
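The long memory component of an ARFIMA model comes from the fractional differencing operator (1 - B)^d, whose expansion weights decay slowly. The sketch below computes those weights with the standard recursion; the function name is hypothetical, and d = 0.489 is used on the assumption that the model order reported for recovered cases reads as ARFIMA(1, 0.489, 0).

```python
def frac_diff_weights(d, n_terms):
    """Coefficients pi_k in (1 - B)^d = sum_k pi_k * B^k.

    Uses the recursion pi_0 = 1, pi_k = pi_{k-1} * (k - 1 - d) / k.
    """
    weights = [1.0]
    for k in range(1, n_terms):
        weights.append(weights[-1] * (k - 1 - d) / k)
    return weights

def frac_diff(series, d):
    """Apply the truncated fractional difference to a series (oldest value first)."""
    w = frac_diff_weights(d, len(series))
    return [sum(w[k] * series[t - k] for k in range(t + 1))
            for t in range(len(series))]

# Weights for the assumed fractional order of the recovered-cases model.
w = frac_diff_weights(0.489, 6)
```

For 0 < d < 0.5 the weights after the first are all negative and shrink at a hyperbolic rate, so distant observations keep a small but non-negligible influence, which is how the model represents long memory.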
Pub Date: 2021-06-07 DOI: 10.14710/medstat.14.2.146-157
M. Alawiyah, D. A. Kusuma, B. N. Ruchjana
A commonly used class of time series models is the Box-Jenkins family. Box-Jenkins time series models can be combined with spatial data to form what is called a space-time model. One such model, suited to locations with heterogeneous characteristics, is the Generalized Space Time Autoregressive Integrated (GSTARI) model, which is used when the data are non-stationary or have a trend. This paper discusses the development of the GSTARI model under the assumption that the error variance is not constant, and applies it to confirmed positive Covid-19 cases in West Java Province, specifically in four regencies/cities with a high number of cases from 6 March 2020 to 31 December 2020: Depok City, Bekasi City, Bekasi Regency, and Karawang Regency. Under the assumption of non-constant error variance, the parameters can be estimated using the Autoregressive Conditional Heteroscedasticity (ARCH) method. The GSTARI-ARCH modeling procedure follows the three Box-Jenkins stages, namely identification, parameter estimation, and diagnostic checking. When the GSTARI-ARCH model is applied to the confirmed positive Covid-19 data of the four regencies/cities, the minimum RMSE is obtained for Bekasi City. The forecast plots for the four regencies/cities follow a pattern similar to the actual data, but only over a short horizon of 1-2 days.
Title: GSTARI-ARCH MODEL AND APPLICATION ON POSITIVE CONFIRMED DATA FOR COVID-19 IN WEST JAVA. Journal: Media Statistika.
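The non-constant error variance assumed in this abstract can be made concrete with the simplest ARCH specification. The abstract does not state the ARCH order, so the ARCH(1) form and the symbols below are assumptions: ε_t is the residual of the GSTARI mean model at one location and e_t is a standardized innovation.

```latex
\varepsilon_t = \sigma_t\, e_t,
\qquad
\sigma_t^{2} = \alpha_0 + \alpha_1\, \varepsilon_{t-1}^{2},
\qquad
\alpha_0 > 0,\ \alpha_1 \ge 0,
\qquad
e_t \overset{\text{iid}}{\sim} (0, 1)
```

A large residual at time t-1 raises the conditional variance at time t, so periods of sharp change in case counts are followed by wider forecast uncertainty, which is consistent with forecasts that track the data only over a short horizon.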