{"title":"Experience of distance education for project-based learning in data science","authors":"Kentaro Sakamaki, Masataka Taguri, Hiromu Nishiuchi, Yoshitomo Akimoto, Kazuyuki Koizumi","doi":"10.1007/s42081-022-00154-2","DOIUrl":"https://doi.org/10.1007/s42081-022-00154-2","url":null,"abstract":"","PeriodicalId":29911,"journal":{"name":"Japanese Journal of Statistics and Data Science","volume":"5 1","pages":"757 - 767"},"PeriodicalIF":1.3,"publicationDate":"2022-04-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43570645","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2022-04-09 DOI: 10.1007/s42081-022-00157-z
T. Stough, N. Cressie, E. Kang, A. Michalak, K. Sahr
{"title":"Correction to: Spatial analysis and visualization of global data on multi-resolution hexagonal grids","authors":"T. Stough, N. Cressie, E. Kang, A. Michalak, K. Sahr","doi":"10.1007/s42081-022-00157-z","DOIUrl":"https://doi.org/10.1007/s42081-022-00157-z","url":null,"abstract":"","PeriodicalId":29911,"journal":{"name":"Japanese Journal of Statistics and Data Science","volume":"5 1","pages":"271 - 272"},"PeriodicalIF":1.3,"publicationDate":"2022-04-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42326087","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2022-03-28 DOI: 10.1007/s42081-022-00152-4
Shu Yang, Jae Kwang Kim
{"title":"Correction to: Statistical data integration in survey sampling: a review","authors":"Shu Yang, Jae Kwang Kim","doi":"10.1007/s42081-022-00152-4","DOIUrl":"https://doi.org/10.1007/s42081-022-00152-4","url":null,"abstract":"","PeriodicalId":29911,"journal":{"name":"Japanese Journal of Statistics and Data Science","volume":"5 1","pages":"273 - 273"},"PeriodicalIF":1.3,"publicationDate":"2022-03-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49434293","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2022-03-15 DOI: 10.33369/jsds.v1i1.21010
Renny Alvionita, S. Nugroho, M. Chozin
Factorial experiments often involve large data sets and the use of a generalized inverse in the data analysis, which becomes less manageable as the amount of data increases. The objective of this study is to evaluate the accuracy of the partitioned design matrix method for a two-factor multivariate design. The design matrix is partitioned into several sub-matrices according to their source of variation. For a two-factor multivariate design, the partitioned design matrix method is much simpler than the usual sigma summation method for calculating the sum-of-products matrices and the degrees of freedom. The method can also be used to explain the derivation of the statistics for testing the hypothesis of equality of means corresponding to each source of variation.
{"title":"Partitioned Design Matrix Method for Two Factors Multivariate Design","authors":"Renny Alvionita, S. Nugroho, M. Chozin","doi":"10.33369/jsds.v1i1.21010","DOIUrl":"https://doi.org/10.33369/jsds.v1i1.21010","url":null,"abstract":"Factorial experiment often involves large data sets and the use of generalized inverse for the data analysis. It becomes less manageable as the data increased. The objective of this study is to evaluate the accuracy of partitioned design matrix method for two factors multivariate design. The design matrix is partitioned into several sub-matrices based on their source of variation. The partitioned design matrix method in two factors multivariate is much simpler than usual sigma summation method in calculating the sum of product matrix and the degrees of freedom. This method could also be used in explaining the derivation of the statistics for testing the hypothesis of the equality of the means which corresponds to the source of variation.","PeriodicalId":29911,"journal":{"name":"Japanese Journal of Statistics and Data Science","volume":"16 1","pages":""},"PeriodicalIF":1.3,"publicationDate":"2022-03-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78978403","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2022-03-15 DOI: 10.33369/jsds.v1i1.21009
Susi Wijuniamurti, S. Nugroho, R. Rachmawati
Hierarchical clustering can be performed with the Agglomerative Nesting (AGNES) method and the Divisive Analysis (DIANA) method. The objective of this research is to compare the two methods under Euclidean and Manhattan distance measures. For the agglomerative method, the clustering procedures are explored with several techniques, including single linkage, complete linkage, average linkage, and Ward. The data are taken from the National Socio-Economic Survey (SUSENAS): the percentage of residents over 5 years old in each province, living in urban or rural areas, who accessed the internet in the last 3 months in 2017, classified by purpose of access. Based on the Mean Square Error (MSE) for 2 and 3 clusters, it is concluded that the single linkage technique gives the best clustering performance for both Euclidean and Manhattan distances.
{"title":"Agglomerative Nesting (AGNES) Method and Divisive Analysis (DIANA) Method For Hierarchical Clustering On Some Distance Measurement Concepts","authors":"Susi Wijuniamurti, S. Nugroho, R. Rachmawati","doi":"10.33369/jsds.v1i1.21009","DOIUrl":"https://doi.org/10.33369/jsds.v1i1.21009","url":null,"abstract":"Clustering data through hierarchical approach could be performed by Agglomerative Nesting (AGNES) Method and Divisive Analysis (DIANA) Method. The objective of this research is to compare both the methods based on Euclid and Manhattan distance measurements. Of this research the clustering procedures of agglomerative method are conducted by exploring all techniques including single linkage, complete linkage, average linkage, and Ward. The data used are the National Socio-Economic Survey (SUSENAS) data which are selected specifically for the percentage of over 5 year old residents in each province, for both living in urban or rural, who access the internet in the last 3 months in 2017 but classified according purpose of accessing. By applying Mean Square Error (MSE) for 2 and 3 clusters, it can be concluded that the single linkage technique is the best performance of clustering procedure for both Euclidean and Manhattan distances.","PeriodicalId":29911,"journal":{"name":"Japanese Journal of Statistics and Data Science","volume":"1 1","pages":""},"PeriodicalIF":1.3,"publicationDate":"2022-03-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73666910","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
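The agglomerative procedure and the two distance measures named in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names (`euclidean`, `manhattan`, `linkage_distance`, `agnes`) and the toy points are assumptions made for this example, Ward linkage and the divisive (DIANA) method are left out, and the SUSENAS data are not reproduced.

```python
from math import sqrt


def euclidean(p, q):
    """Euclidean distance between two points of equal dimension."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def manhattan(p, q):
    """Manhattan (city-block) distance between two points."""
    return sum(abs(a - b) for a, b in zip(p, q))


def linkage_distance(cluster_a, cluster_b, metric, method="single"):
    """Distance between two clusters under a given linkage rule."""
    pairwise = [metric(p, q) for p in cluster_a for q in cluster_b]
    if method == "single":      # closest pair of members
        return min(pairwise)
    if method == "complete":    # farthest pair of members
        return max(pairwise)
    if method == "average":     # mean over all pairs
        return sum(pairwise) / len(pairwise)
    raise ValueError(f"unknown linkage method: {method}")


def agnes(points, n_clusters, metric, method="single"):
    """Merge the two closest clusters until n_clusters remain."""
    clusters = [[p] for p in points]  # start with one cluster per point
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = linkage_distance(clusters[i], clusters[j], metric, method)
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical toy data: two well-separated pairs of points.
toy_points = [(0, 0), (0, 1), (5, 5), (5, 6)]
toy_clusters = agnes(toy_points, 2, manhattan, method="single")
```

Passing `euclidean` instead of `manhattan`, or `"complete"` / `"average"` instead of `"single"`, gives the other combinations compared in the abstract.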
Pub Date : 2022-03-15 DOI: 10.33369/jsds.v1i1.21007
Novi Putriasari, Sigit Nugroho, R. Rachmawati, Winalia Agwil, Y. O. Sitohang
Red chili occupies a strategic position in the Indonesian economic structure because it is used in almost all Indonesian dishes. Therefore, controlling the price of red chili is a necessity for maintaining national economic stability. The purpose of this research is to forecast the weekly price of red chili using ARIMA and SSA, based on weekly chili price data from January 2016 to December 2019 sourced from the Statistics Indonesia (BPS) Branch Office of Bengkulu Province. The data were analyzed using R. Based on MAPE, ARIMA(2,1,2) provides the best forecast, with a value of 0.49%, compared with 10.64% for SSA.
{"title":"Forecasting A Weekly Red Chilli Price in Bengkulu City Using Autoregressive Integrated Moving Average (ARIMA) and Singular Spectrum Analysis (SSA) Methods","authors":"Novi Putriasari, Sigit Nugroho, R. Rachmawati, Winalia Agwil, Y. O. Sitohang","doi":"10.33369/jsds.v1i1.21007","DOIUrl":"https://doi.org/10.33369/jsds.v1i1.21007","url":null,"abstract":"Red chili occupies a strategic position in the Indonesian economic structure because its use applies to almost all Indonesian dishes. Therefore, controlling the price of red chili is anecessity to maintain national economic stability. The purpose of this research is to forecast a red chili weekly price using ARIMA and SSA based on the weekly data of chili prices from January 2016 - December 2019 sourced from Statistics Indonseia (BPS) Branch Office of Bengkulu Province. The data have been analyzed using software R. Based on MAPE, ARIMA (2,1,2) provides the best forecasting with value 0.49% while SSA 10.64%.","PeriodicalId":29911,"journal":{"name":"Japanese Journal of Statistics and Data Science","volume":"60 1","pages":""},"PeriodicalIF":1.3,"publicationDate":"2022-03-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77882538","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
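The abstract above ranks the two forecasting methods by MAPE (mean absolute percentage error). A minimal sketch of that criterion is given below; the function name `mape` and the example prices are hypothetical and are not taken from the study, which was carried out in R.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent.

    Assumes both sequences have the same length and that no actual
    value is zero (the ratio is undefined otherwise).
    """
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    if not actual:
        raise ValueError("at least one observation is required")
    ratios = [abs((a - f) / a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical weekly prices and two competing forecasts.
observed = [100.0, 200.0, 400.0]
forecast_a = [101.0, 198.0, 404.0]   # each value off by 1%
forecast_b = [110.0, 180.0, 440.0]   # each value off by 10%

# The method with the smaller MAPE is preferred.
better = "A" if mape(observed, forecast_a) < mape(observed, forecast_b) else "B"
```

A lower MAPE means the forecasts deviate less, in relative terms, from the observed series, which is the sense in which ARIMA(2,1,2) is reported as the better of the two models.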
Pub Date : 2022-03-15 DOI: 10.33369/jsds.v1i1.21012
A. Gumilar, Sigit Nugroho, Buyung Keraman
This research illustrates a simulation of quick count sampling for the 2014 Legislative Election in Bengkulu City, for which results are available from 589 polling stations (TPS). The problem addressed is how to determine the sample size and the appropriate sampling method for the 2014 Legislative Election in Bengkulu City. The purpose is to determine the sample size and the quick count sampling method that can predict the actual vote result of the Legislative Election. Three sampling methods are used in the quick count calculation: simple random sampling, cluster random sampling, and multistage random sampling. From the population of 589 TPS, a sample of 120 TPS, or about 20% of the population, was taken, based on sample size calculations for a finite population. After the sample size was chosen, sampling was simulated 100 times for each method, and the simulation results were tested for goodness of fit using the chi-squared test. Based on the test results, it can be concluded that a sample of 120 TPS taken by simple random sampling, cluster random sampling, or multistage random sampling can predict the actual vote result of the 2014 Legislative Election in Bengkulu with a 5% margin of error. For efficiency, the simple random sampling method can be selected.
{"title":"Simulation of Sample Determination Quick Count Legislative Elections In Bengkulu City","authors":"A. Gumilar, Sigit Nugroho, Buyung Keraman","doi":"10.33369/jsds.v1i1.21012","DOIUrl":"https://doi.org/10.33369/jsds.v1i1.21012","url":null,"abstract":"In this research illustrates the simulation of quick count of sampling for the year 2014 Legislative Election in Bengkulu City, which has a data acquisition result for 589 TPS. The problem in this research is how to know the sample size and the right sampling method for Legislative Election in Bengkulu City on Year 2014. The purpose of this research is to know the sample size and the quick count calculation sampling method that can predict the actual vote result for Legislative Election. The method used in the calculation of fast calculation consists of three methods, simple random sampling, cluster random sampling and multistage random sampling. From the population data of 589 polling stations (TPS) into the population, the sample size was taken as much as 120 TPS or about 20% of the population, based on the results of calculations for sample sizes in a limited population. After the sample was selected, a sample simulation of 100 times for each method and simulation results was tested for compatibility with the chi-squared test. Based on the test results, it can be concluded that for sample size 120 TPS taken by simple random sampling method, cluster random sampling or multistage random sampling can predict the actual vote result in Legislative Election Year 2014 in Bengkulu with margin of error 5%. For efficiency consideration simple random sampling method can be selected.","PeriodicalId":29911,"journal":{"name":"Japanese Journal of Statistics and Data Science","volume":"2 1","pages":""},"PeriodicalIF":1.3,"publicationDate":"2022-03-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82270186","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
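The simple random sampling variant of the quick count simulation described in the abstract above can be sketched roughly as follows. Everything specific here is an assumption for illustration: the vote counts are synthetic (three hypothetical parties, random counts per station), the helper names are invented, and only the Pearson chi-squared statistic is computed, without the comparison against a critical value. The population size (589), sample size (120), and number of repetitions (100) follow the abstract.

```python
import random


def vote_shares(stations):
    """Share of the total vote obtained by each party over the given stations."""
    n_parties = len(stations[0])
    totals = [sum(s[k] for s in stations) for k in range(n_parties)]
    grand_total = sum(totals)
    return [t / grand_total for t in totals]


def chi_squared_statistic(sample, expected_shares):
    """Pearson statistic comparing sampled party totals with expected shares."""
    n_parties = len(expected_shares)
    observed = [sum(s[k] for s in sample) for k in range(n_parties)]
    grand_total = sum(observed)
    expected = [grand_total * p for p in expected_shares]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


rng = random.Random(2014)  # fixed seed so the sketch is reproducible

# Synthetic population: 589 stations, 3 hypothetical parties per station.
population = [[rng.randint(50, 150) for _ in range(3)] for _ in range(589)]
actual = vote_shares(population)

# Draw 100 simple random samples of 120 stations, without replacement.
samples = [rng.sample(population, 120) for _ in range(100)]
estimates = [vote_shares(sample) for sample in samples]
statistics = [chi_squared_statistic(sample, actual) for sample in samples]

# Largest absolute deviation of any estimated share from the actual share.
max_error = max(
    abs(e - a) for est in estimates for e, a in zip(est, actual)
)
```

Cluster and multistage sampling would replace the single `rng.sample` call with a draw of groups of stations (for example by district) followed, in the multistage case, by a further draw within each selected group.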
Pub Date : 2022-03-15 DOI: 10.33369/jsds.v1i1.21011
Welly Fransiska, S. Nugroho, R. Rachmawati
Regression analysis is the study of the relationship between a dependent variable and one or more independent variables. One of the important assumptions that must be fulfilled for the regression coefficient estimator to be the Best Linear Unbiased Estimator (BLUE) is homoscedasticity. A violation of the homoscedasticity assumption is called heteroscedasticity. Under heteroscedasticity the estimator remains linear and unbiased, but it no longer has minimum variance, so it is no longer BLUE. The purpose of this study is to analyze and resolve the violation of the homoscedasticity assumption using Weighted Least Squares (WLS) and Quantile Regression. Based on the comparison between WLS and Quantile Regression, the more precise method for overcoming heteroscedasticity in this research is WLS, because it produces the greater value (98%).
{"title":"A Comparison of Weighted Least Square and Quantile Regression for Solving Heteroscedasticity in Simple Linear Regression","authors":"Welly Fransiska, S. Nugroho, R. Rachmawati","doi":"10.33369/jsds.v1i1.21011","DOIUrl":"https://doi.org/10.33369/jsds.v1i1.21011","url":null,"abstract":"Regression analysis is the study of the relationship between dependent variable and one or more independent variables. One of the important assumption that must be fulfilled to get the regression coefficient estimator Best Linear Unbiased Estimator (BLUE) is homoscedasticity. If the homoscedasticity assumption is violated then it is called heteroscedasticity. The consequences of heteroscedasticity are the estimator remain linear and unbiased, but it can cause estimator haven‘t a minimum variance so the estimator is no longer BLUE. The purpose of this study is to analyze and resolve the violation of heteroscedasticity assumption with Weighted Least Square(WLS) and Quantile Regression. Based on the results of the comparison between WLS and Quantile Regression obtained the most precise method used to overcome heteroscedasticity in this research is the WLS method because it produces that is greater (98%).","PeriodicalId":29911,"journal":{"name":"Japanese Journal of Statistics and Data Science","volume":"26 1","pages":""},"PeriodicalIF":1.3,"publicationDate":"2022-03-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83523224","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
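The Weighted Least Squares estimator for simple linear regression mentioned in the abstract above can be sketched in closed form as below. This is an illustrative sketch only: the function name `wls_fit`, the toy data, and the choice of weights proportional to 1/x² (which presumes the error variance grows with x²) are assumptions of this example and are not stated in the abstract.

```python
def wls_fit(x, y, w):
    """Weighted least squares fit of y = intercept + slope * x.

    Minimises sum_i w[i] * (y[i] - intercept - slope * x[i]) ** 2.
    Weights are typically taken inversely proportional to the error variance.
    """
    if not (len(x) == len(y) == len(w)):
        raise ValueError("x, y and w must have the same length")
    total_w = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / total_w   # weighted mean of x
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / total_w   # weighted mean of y
    s_xy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    s_xx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical data whose scatter widens as x grows.
x_values = [1.0, 2.0, 3.0, 4.0, 5.0]
y_values = [3.1, 4.8, 7.4, 8.5, 12.0]

# Assumed variance structure: Var(error_i) proportional to x_i ** 2.
weights = [1.0 / xi ** 2 for xi in x_values]

intercept, slope = wls_fit(x_values, y_values, weights)
```

With equal weights the same function reduces to ordinary least squares, so the effect of the weighting can be checked by fitting both ways and comparing the coefficients.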
Pub Date : 2022-03-03 DOI: 10.1007/s42081-022-00148-0
N. Watanabe
{"title":"A k-means method for trends of time series","authors":"N. Watanabe","doi":"10.1007/s42081-022-00148-0","DOIUrl":"https://doi.org/10.1007/s42081-022-00148-0","url":null,"abstract":"","PeriodicalId":29911,"journal":{"name":"Japanese Journal of Statistics and Data Science","volume":"22 1","pages":"303 - 319"},"PeriodicalIF":1.3,"publicationDate":"2022-03-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"53297581","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2022-02-16 DOI: 10.1007/s42081-022-00146-2
Victor Mooto Nawa
{"title":"A weighted score confidence interval for a binomial proportion","authors":"Victor Mooto Nawa","doi":"10.1007/s42081-022-00146-2","DOIUrl":"https://doi.org/10.1007/s42081-022-00146-2","url":null,"abstract":"","PeriodicalId":29911,"journal":{"name":"Japanese Journal of Statistics and Data Science","volume":"5 1","pages":"133 - 147"},"PeriodicalIF":1.3,"publicationDate":"2022-02-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47620088","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}