Pointwise norm-based clustering of data in arbitrary dimensional space
Pub Date: 2023-04-03 | DOI: 10.1080/23737484.2023.2199952
Soumita Modak
ABSTRACT In this paper, a novel nonparametric norm-based clustering algorithm is proposed to classify real-valued continuous data sets given in arbitrary dimensional space. For univariate, multivariate, or high-dimensional data, where the number of study variables may be close to or larger than the sample size, our straightforward algorithm operates entirely in a univariate set-up: it uses the observation-wise (or pointwise) norms, which quantify the distances of the observations from the origin (the null vector). The method begins by determining a sample quantile via nonparametric bootstrapping on the computed norms and always converges on its own. By design, the suggested algorithm is fast, detects the number of existing clusters itself, and forms well-defined groups. A data study demonstrates its competitiveness against two popular clustering algorithms, K-means and K-medoids.
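The abstract names two building blocks: observation-wise norms and a bootstrapped sample quantile. Below is a minimal Python sketch of just those pieces; the two-group split at the quantile and all function names are illustrative assumptions, not the author's published algorithm, which also determines the number of clusters itself.

```python
import numpy as np

def pointwise_norms(X):
    """Euclidean norm of each observation (row): reduces data of any
    dimension to a univariate sample of distances from the origin."""
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X[:, None]            # univariate data: treat as one column
    return np.linalg.norm(X, axis=1)

def bootstrap_quantile(norms, q=0.5, n_boot=1000, seed=0):
    """Nonparametric bootstrap estimate of the q-th quantile of the norms."""
    rng = np.random.default_rng(seed)
    stats = [np.quantile(rng.choice(norms, norms.size, replace=True), q)
             for _ in range(n_boot)]
    return float(np.mean(stats))

# Toy example: two spherical clusters at different distances from the origin.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.5, (50, 20)),   # cluster near the origin
               rng.normal(3, 0.5, (50, 20))])  # cluster far from the origin
norms = pointwise_norms(X)
cut = bootstrap_quantile(norms, q=0.5)
labels = (norms > cut).astype(int)             # naive split at the cut-off
```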
{"title":"Pointwise norm-based clustering of data in arbitrary dimensional space","authors":"Soumita Modak","doi":"10.1080/23737484.2023.2199952","DOIUrl":"https://doi.org/10.1080/23737484.2023.2199952","url":null,"abstract":"ABSTRACT In this paper, a novel nonparametric norm-based clustering algorithm is proposed to classify real-valued continuous data sets given in arbitrary dimensional space. For data univariate, multivariate or high-dimensional, with the number of study variables close to or larger than the data size, our straightforward algorithm is implemented throughout under an univariate set-up, where we make use of the observation-wise (or pointwise) norms which quantify the distances of the observations from the origin zero or the null vector. The method begins with determination of the sample quantile using nonparamteric bootstrapping on the computed norms and always converges independently. By its design, the suggested algorithm is fast enough to detect the number of existing clusters itself and to form well-defined groups. Data study demonstrates its competitiveness in comparison to 2 popular clustering algorithms K-means and K-medoids.","PeriodicalId":36561,"journal":{"name":"Communications in Statistics Case Studies Data Analysis and Applications","volume":"10 1","pages":"121 - 134"},"PeriodicalIF":0.0,"publicationDate":"2023-04-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77030573","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Spatial patterns in Brazilian state legislative elections
Pub Date: 2023-04-03 | DOI: 10.1080/23737484.2023.2203840
Pedro Chaim, M. Laurini
Abstract In this paper we explore spatial patterns in the voting behavior of the Brazilian electorate in the 2014 state legislative elections. With data aggregated at the municipality level, we employ a Beta regression model augmented with spatially correlated random effects to model the share of votes received by the three largest political parties: PMDB, PSDB, and PT. Results suggest that PT is preferred by the electorate in poorer and more densely populated areas, especially in states of the Northeast and South regions, while PSDB performs better in municipalities with relatively higher standards of living. Analysis of the spatial random effects also indicates that this component is especially important for capturing a major stylized fact: the simultaneous PSDB hegemony and relative lack of PMDB presence in the state of São Paulo.
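The spatial random effects require CAR-type priors and MCMC or INLA machinery beyond a short sketch, but the beta-regression core is compact. Here is a hedged maximum-likelihood version of just that part, assuming the standard mean-precision parameterization with a logit link; the covariate and coefficients are invented.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit, gammaln

def beta_nll(params, X, y):
    """Negative log-likelihood of beta regression with logit link:
    y_i ~ Beta(mu_i * phi, (1 - mu_i) * phi), logit(mu_i) = X_i @ beta."""
    beta, log_phi = params[:-1], params[-1]
    mu, phi = expit(X @ beta), np.exp(log_phi)
    a, b = mu * phi, (1.0 - mu) * phi
    return -np.sum(gammaln(a + b) - gammaln(a) - gammaln(b)
                   + (a - 1) * np.log(y) + (b - 1) * np.log1p(-y))

# Simulated example: one covariate (say, an income proxy) and vote shares.
rng = np.random.default_rng(0)
n = 500
X = np.column_stack([np.ones(n), rng.normal(size=n)])
mu_true = expit(X @ np.array([-0.5, 0.8]))
y = rng.beta(mu_true * 30, (1 - mu_true) * 30)

fit = minimize(beta_nll, x0=np.zeros(3), args=(X, y), method="BFGS")
print("coefficients:", fit.x[:-1], "precision:", np.exp(fit.x[-1]))
```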
{"title":"Spatial patterns in Brazilian state legislative elections","authors":"Pedro Chaim, M. Laurini","doi":"10.1080/23737484.2023.2203840","DOIUrl":"https://doi.org/10.1080/23737484.2023.2203840","url":null,"abstract":"Abstract In this paper we explore spatial patterns in voting behavior of the Brazilian electorate in the state legislative elections of 2014. With data aggregated at the municipality level, we employ a Beta regression model augmented with spatially correlated random effects to model the share of votes received by the three largest political parties: PMDB, PSDB, and PT. Results suggest PT is more preferred by the electorate in poorer and more densely populated areas, specially in states of the Northwest and South regions, while PSDB performs better in municipalities with relatively higher standards of living. Also, analysis of the spatial random effects indicates this component is especially important to account for the major stylized fact that is the simultaneous PSDB hegemony and relative lack of PMDB presence in the state of São Paulo.","PeriodicalId":36561,"journal":{"name":"Communications in Statistics Case Studies Data Analysis and Applications","volume":"1 1","pages":"181 - 195"},"PeriodicalIF":0.0,"publicationDate":"2023-04-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83016033","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Toward statistical real-time power fault detection
Pub Date: 2023-04-03 | DOI: 10.1080/23737484.2023.2199957
Mantautas Rimkus, P. Kokoszka, K. Prabakar, Haonan Wang
Abstract We propose a statistical fault detection methodology based on the high-frequency data streams that are becoming available in modern power grids. Our approach can be viewed as online (sequential) change point monitoring. However, because the structure of high-frequency power grid streaming data is largely unexplored and highly nonstandard, substantial new statistical development is required to make this methodology practically applicable. The paper develops scalar detectors based on multichannel data streams, determines data-driven alarm thresholds, and investigates the performance and robustness of the new tools. Thanks to a reasonably large database of faults, we can calculate the frequencies of false and correct fault signals and recommend implementations that optimize these empirical success rates.
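As a hedged illustration of online change point monitoring on a single scalar channel — not the paper's multichannel detectors — here is a CUSUM monitor with a bootstrap-calibrated, data-driven alarm threshold; the constants k and the calibration rule are assumptions for the sketch.

```python
import numpy as np

def cusum_monitor(stream, mu0, sigma0, k=0.5, h=5.0):
    """One-sided CUSUM detector: accumulates standardized upward drift and
    signals when it exceeds the alarm threshold h; returns alarm index or -1."""
    s = 0.0
    for t, x in enumerate(stream):
        s = max(0.0, s + (x - mu0) / sigma0 - k)
        if s > h:
            return t
    return -1

def calibrate_threshold(history, k=0.5, target_fa=0.01, n_boot=500, seed=0):
    """Data-driven threshold: bootstrap fault-free segments and take the
    (1 - target_fa) quantile of the maximum CUSUM statistic observed."""
    rng = np.random.default_rng(seed)
    mu0, sigma0 = history.mean(), history.std()
    maxima = []
    for _ in range(n_boot):
        seg = rng.choice(history, size=history.size, replace=True)
        s, m = 0.0, 0.0
        for x in seg:
            s = max(0.0, s + (x - mu0) / sigma0 - k)
            m = max(m, s)
        maxima.append(m)
    return float(np.quantile(maxima, 1.0 - target_fa))
```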
{"title":"Toward statistical real-time power fault detection","authors":"Mantautas Rimkus, P. Kokoszka, K. Prabakar, Haonan Wang","doi":"10.1080/23737484.2023.2199957","DOIUrl":"https://doi.org/10.1080/23737484.2023.2199957","url":null,"abstract":"Abstract We propose statistical fault detection methodology based on high-frequency data streams that are becoming available in modern power grids. Our approach can be treated as an online (sequential) change point monitoring methodology. However, due to the mostly unexplored and very nonstandard structure of high-frequency power grid streaming data, substantial new statistical development is required to make this methodology practically applicable. The paper includes development of scalar detectors based on multichannel data streams, determination of data-driven alarm thresholds and investigation of the performance and robustness of the new tools. Due to a reasonably large database of faults, we can calculate frequencies of false and correct fault signals, and recommend implementations that optimize these empirical success rates.","PeriodicalId":36561,"journal":{"name":"Communications in Statistics Case Studies Data Analysis and Applications","volume":"18 1","pages":"196 - 217"},"PeriodicalIF":0.0,"publicationDate":"2023-04-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75165080","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Spatio-temporal dependence modelling of extreme rainfall in South Africa: A Bayesian integrated nested Laplace approximation technique
Pub Date: 2023-04-03 | DOI: 10.1080/23737484.2023.2207503
T. A. Diriba, L. K. Debusho
Abstract Spatial and spatio-temporal dependence modeling of extreme value distributions is used to analyze the extremes of daily maximum rainfall data across selected weather stations in South Africa, combining the generalized Pareto distribution (GPD) with a flexible Bayesian latent Gaussian model (LGM). The paper demonstrates a spatio-temporal GPD model for extreme rainfall data that captures systematic variation through a spatial and spatio-temporal modeling framework, in which the temporal component treats week and month separately as random effects. The paper uses the Bayesian integrated nested Laplace approximation (INLA) algorithm to estimate the marginal posterior means of the parameters and hyperparameters of the Bayesian spatio-temporal models. Bayesian inference via INLA is then applied to predict the return levels at each station, incorporating both the uncertainty due to model estimation and the randomness inherent in the processes.
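A minimal frequentist sketch of the GPD building block — fitting threshold exceedances and converting the fit into a return level — is shown below, assuming scipy; the paper's Bayesian latent Gaussian model with INLA (typically run through the R-INLA package) goes well beyond this, and the rainfall series here is synthetic.

```python
import numpy as np
from scipy.stats import genpareto

# Peaks-over-threshold: daily rainfall exceedances above a high threshold u.
rng = np.random.default_rng(0)
rain = rng.gamma(shape=0.8, scale=8.0, size=20 * 365)  # synthetic daily rain
u = np.quantile(rain, 0.95)
exceed = rain[rain > u] - u

# Fit the GPD to the exceedances (location fixed at 0).
xi, _, sigma = genpareto.fit(exceed, floc=0)

# m-observation return level: x_m = u + (sigma/xi) * ((m * zeta_u)**xi - 1),
# where zeta_u is the empirical exceedance probability of u (xi != 0 case).
zeta_u = exceed.size / rain.size
m = 100 * 365                     # roughly the 100-year level for daily data
x_m = u + (sigma / xi) * ((m * zeta_u) ** xi - 1.0)
print(f"threshold {u:.1f}, 100-year return level {x_m:.1f}")
```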
{"title":"Spatio-temporal dependence modelling of extreme rainfall in South Africa: A Bayesian integrated nested Laplace approximation technique","authors":"T. A. Diriba, L. K. Debusho","doi":"10.1080/23737484.2023.2207503","DOIUrl":"https://doi.org/10.1080/23737484.2023.2207503","url":null,"abstract":"Abstract The spatial and spatio-temporal dependence modeling to extreme value distributions have been used to analyze the extremes of daily maximum rainfall data across selected weather stations in South Africa combining generalized Pareto distribution (GPD) with the flexible Bayesian Latent Gaussian Model (LGM). The paper demonstrated the spatio-temporal GPD model for applications in extreme rainfall data that capture systematic variation through spatial and spatio-temporal modeling framework, in which the temporal constitutes the week and month as random separately. The paper uses the Bayesian integrated Nested Laplace approximation (INLA) algorithm to estimate marginal posterior means of the parameters and hyper-parameters for Bayesian spatio-temporal models. The Bayesian inferences using INLA technique were applied to obtain prediction of the return levels at each station, which incorporate uncertainty due to model estimation, as well as the randomness that is inherent in the processes.","PeriodicalId":36561,"journal":{"name":"Communications in Statistics Case Studies Data Analysis and Applications","volume":"5 1","pages":"152 - 180"},"PeriodicalIF":0.0,"publicationDate":"2023-04-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89805499","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Uses of a new asymmetric loss-based process capability index in the electronic industries
Pub Date: 2023-04-03 | DOI: 10.1080/23737484.2023.2207499
Mahendra Saha, S. Dey
Abstract This article suggests a novel process capability index (PCI) based on an asymmetric (linear-exponential) loss function for a normal process, offering a specific way of incorporating the loss in capability analysis. We estimate the suggested PCI using the method of moments (MOM) when the process follows a normal distribution, and we compare the effectiveness of the investigated estimation methods in terms of their mean squared errors through simulation analysis. Additionally, confidence intervals for the index are constructed using the generalized confidence interval (GCI) and the parametric bootstrap confidence interval (BCI) approaches. Using Monte Carlo simulation, the performance of the GCI and BCI is compared in terms of average width, associated coverage probability, and relative coverage. Finally, three real data sets from the electronics industry are re-analyzed to show the usefulness of the suggested index, the MOM estimation, the GCI, and the BCI.
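The abstract does not give the new index's formula (its symbol is lost in extraction), so as a stand-in the sketch below applies the bootstrap-CI workflow to the classical index Cpk; note the paper's BCI is parametric, while this sketch resamples nonparametrically for brevity, and all specification limits are invented.

```python
import numpy as np

def cpk(x, lsl, usl):
    """Classical capability index Cpk — a stand-in for the paper's
    loss-based index, whose exact form the abstract does not give."""
    mu, s = x.mean(), x.std(ddof=1)
    return min(usl - mu, mu - lsl) / (3.0 * s)

def bootstrap_ci(x, lsl, usl, alpha=0.05, n_boot=2000, seed=0):
    """Percentile bootstrap confidence interval for the index."""
    rng = np.random.default_rng(seed)
    stats = [cpk(rng.choice(x, x.size, replace=True), lsl, usl)
             for _ in range(n_boot)]
    return tuple(np.quantile(stats, [alpha / 2, 1 - alpha / 2]))

rng = np.random.default_rng(42)
x = rng.normal(10.0, 0.5, size=100)       # in-control normal process
print(cpk(x, lsl=8.5, usl=11.5), bootstrap_ci(x, lsl=8.5, usl=11.5))
```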
{"title":"Uses of a new asymmetric loss-based process capability index in the electronic industries","authors":"Mahendra Saha, S. Dey","doi":"10.1080/23737484.2023.2207499","DOIUrl":"https://doi.org/10.1080/23737484.2023.2207499","url":null,"abstract":"Abstract This article suggests a novel process capability index (PCI) termed as , which is based on an asymmetric loss function (linear exponential) for a normal process and offers a specific method of incorporating the loss in capability analysis. Next, we estimate the suggested PCI using the moment estimation approach when the process follows a normal distribution, and we compare the effectiveness of the investigated estimation methods in terms of their mean squared errors through simulation analysis. Additionally, the confidence intervals for the index are constructed using the generalized confidence interval (GCI) and parametric bootstrap confidence interval (BCI) approach. Using Monte Carlo simulation, the performance of the GCI and BCI is compared in terms of average width, associated coverage probabilities, and relative coverage. Finally, three real data sets from the electronic industries are re-analyzed to show the usefulness of the suggested index, MOM estimation, GCI and BCI.","PeriodicalId":36561,"journal":{"name":"Communications in Statistics Case Studies Data Analysis and Applications","volume":"6 1","pages":"135 - 151"},"PeriodicalIF":0.0,"publicationDate":"2023-04-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90488465","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Zero-inflated Poisson quasi-Lindley regression for modeling number of doctor visit data
Pub Date: 2023-01-02 | DOI: 10.1080/23737484.2023.2164941
Hossein Zamani, Zohreh Pakdaman, Marzieh Shekari
Abstract Poisson regression is a popular approach for modeling count data. In many situations, however, the variance of the data is greater than the mean (over-dispersion), and generalized Poisson or mixed Poisson models — such as the Poisson gamma (negative binomial), Poisson inverse Gaussian, Poisson lognormal, and Poisson Lindley — have been proposed as alternatives to the Poisson for describing over-dispersed count data. In some situations, the source of over-dispersion is a large percentage of zeros in the dataset. In other words, the dataset contains more zeros than expected under common discrete distributions, a phenomenon known as zero inflation. To analyze such data, zero-inflated models such as the zero-inflated Poisson, zero-inflated generalized Poisson, and zero-inflated negative binomial have been applied. This work proposes the functional form and regression model of the zero-inflated Poisson quasi-Lindley (ZIPQL) and then fits it, alongside the alternative models, to data from the US National Medical Expenditure Survey.
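The quasi-Lindley mixing density is not given in the abstract, so here is a sketch of the plain zero-inflated Poisson (ZIP) regression likelihood that the ZIPQL generalizes, with logit and log links; the simulated data and coefficients are invented.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit, gammaln

def zip_nll(params, X, y):
    """Negative log-likelihood of ZIP regression:
    logit(pi_i) = X_i @ g (zero-inflation part),
    log(lambda_i) = X_i @ b (count part).  Plain ZIP shown; the paper's
    ZIPQL replaces the Poisson component with Poisson quasi-Lindley."""
    p = X.shape[1]
    g, b = params[:p], params[p:]
    pi, lam = expit(X @ g), np.exp(X @ b)
    ll_zero = np.log(pi + (1 - pi) * np.exp(-lam))
    ll_pos = np.log1p(-pi) - lam + y * np.log(lam) - gammaln(y + 1)
    return -np.sum(np.where(y == 0, ll_zero, ll_pos))

rng = np.random.default_rng(0)
n = 1000
X = np.column_stack([np.ones(n), rng.normal(size=n)])
pi = expit(X @ np.array([-1.0, 0.5]))     # true zero-inflation probability
lam = np.exp(X @ np.array([1.0, 0.3]))    # true Poisson mean
y = np.where(rng.random(n) < pi, 0, rng.poisson(lam))

fit = minimize(zip_nll, np.zeros(4), args=(X, y), method="BFGS")
print(fit.x)   # recovered (g, b), roughly matching the true coefficients
```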
{"title":"Zero-inflated poisson quasi-Lindley regression for modeling number of doctor visit data","authors":"Hossein Zamani, Zohreh Pakdaman, Marzieh Shekari","doi":"10.1080/23737484.2023.2164941","DOIUrl":"https://doi.org/10.1080/23737484.2023.2164941","url":null,"abstract":"Abstract The Poisson regression is a popular approach in modeling count data. However, in many situations often the variance of data is greater than the mean (over-dispersed data) and the generalized Poisson or mixed Poisson models such as the Poisson gamma (negative binomial), Poisson inverse Gaussian, Poisson lognormal, and Poisson Lindley have been proposed as the alternatives to the Poisson for describing over-dispersed count data. In some situations, the source of over-dispersion is the large percentage of zeros in the dataset. In the other words, the dataset involves an excessive number of zeros than are expected in the common discrete distributions which are known as the zero-inflated events. In order to analyze these data, zero-inflated models such as the zero-inflated Poisson, zero-inflated generalized Poisson, and zero-inflated negative binomial have been applied. This work proposes the functional form and the regression model of the zero-inflated Poisson quasi-Lindley (ZIPQL) and then, beside the alternative models, it was fitted and compared to US National Medical Expenditure Survey data.","PeriodicalId":36561,"journal":{"name":"Communications in Statistics Case Studies Data Analysis and Applications","volume":"27 1","pages":"1 - 15"},"PeriodicalIF":0.0,"publicationDate":"2023-01-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84907773","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Weighted least squares estimation of the risk-free rate from derivative prices
Pub Date: 2023-01-02 | DOI: 10.1080/23737484.2023.2186283
Jörgen Blomvall, Pontus Söderbäck, M. Singull
Abstract This study proposes a method for estimating the interest-rate spread over an OIS-implied spot rate used in market-consistent derivative pricing. Our method generalizes previously proposed ordinary least squares methods in the literature in two ways. First, it utilizes intraday data rather than data from a single point in time. Second, it is formulated as weighted least squares to counteract heteroscedasticity. Additionally, we present a general methodology for quantifying the performance difference between methods when the true value is unknown. We find that our method outperforms previously proposed methods with statistical significance and that the primary improvement comes from the use of intraday data.
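A minimal sketch of the weighted least squares estimator itself, with weights inversely proportional to the error variance, on invented heteroscedastic data; the paper's actual design matrix is built from derivative prices and OIS rates, which is beyond this sketch.

```python
import numpy as np

def wls(X, y, w):
    """Weighted least squares: solve (X'WX) beta = X'Wy, where the weights
    w_i are proportional to 1/Var(e_i) to counteract heteroscedasticity."""
    Xw = X * w[:, None]
    return np.linalg.solve(Xw.T @ X, Xw.T @ y)

# Example: observations further along x carry noisier errors.
rng = np.random.default_rng(0)
n = 200
x = rng.uniform(1, 10, n)
X = np.column_stack([np.ones(n), x])
y = 0.02 + 0.001 * x + rng.normal(0, 0.001 * x)  # Var(e) grows with x^2
beta = wls(X, y, w=1.0 / x**2)                   # weights ~ 1/variance
print(beta)
```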
{"title":"Weighted least squares estimation of the risk-free rate from derivative prices","authors":"Jörgen Blomvall, Pontus Söderbäck, M. Singull","doi":"10.1080/23737484.2023.2186283","DOIUrl":"https://doi.org/10.1080/23737484.2023.2186283","url":null,"abstract":"Abstract This study proposes a method for estimating the interest spread over an OIS-implied spot rate used in market-consistent derivative pricing. Our method generalizes previous proposed ordinary least squares methods in the literature in two ways. First, it utilizes intraday data rather than data from a single time. Second, it is formulated as weighted least squares to counteract heteroscedasticity. Additionally, we present a general methodology to quantify the performance difference between methods when the true value is unknown. We find that our method outperforms previously proposed methods with statistical significance and that the primary improvement is the utilization of intraday data.","PeriodicalId":36561,"journal":{"name":"Communications in Statistics Case Studies Data Analysis and Applications","volume":"79 1","pages":"72 - 105"},"PeriodicalIF":0.0,"publicationDate":"2023-01-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76642575","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Nonlinearity and forecast performance of realized volatility
Pub Date: 2023-01-02 | DOI: 10.1080/23737484.2023.2175277
Daiki Maki
Abstract This study examines whether accounting for the nonlinearity of realized volatility leads to better forecast performance. We propose a new realized volatility forecasting model that accommodates nonlinearities without assuming a particular nonlinear model. The proposed model uses a Taylor series approximation to account for nonlinearities. We apply it to the realized volatility of representative stock indices from the U.S., Japan, the U.K., and China and observe in-sample nonlinearities. Additionally, we evaluate out-of-sample forecast performance. The empirical results show that realized volatility is nonlinear and that the proposed models exhibit better forecast performance than standard models.
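One common way to let a linear realized-volatility model pick up nonlinearity is to augment a HAR-type regression with low-order polynomial (Taylor-style) terms; whether this matches the paper's exact construction is an assumption, and the RV series below is synthetic.

```python
import numpy as np

def har_design(rv, nonlinear=False):
    """HAR-style design matrix from daily realized volatility: daily,
    weekly (5-day), and monthly (22-day) averages, optionally augmented
    with quadratic (Taylor-style) terms to capture nonlinearity."""
    d = rv[21:-1]
    w = np.array([rv[i - 4:i + 1].mean() for i in range(21, rv.size - 1)])
    m = np.array([rv[i - 21:i + 1].mean() for i in range(21, rv.size - 1)])
    cols = [np.ones_like(d), d, w, m]
    if nonlinear:
        cols += [d**2, w**2, m**2]        # second-order expansion terms
    return np.column_stack(cols), rv[22:]  # predictors at t, target at t+1

rng = np.random.default_rng(0)
rv = np.abs(rng.normal(1.0, 0.3, 1000))   # synthetic daily RV series
X, y = har_design(rv, nonlinear=True)
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
forecast = X[-1] @ beta                    # one-step-ahead RV forecast
```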
{"title":"Nonlinearity and forecast performance of realized volatility","authors":"Daiki Maki","doi":"10.1080/23737484.2023.2175277","DOIUrl":"https://doi.org/10.1080/23737484.2023.2175277","url":null,"abstract":"Abstract This study examines whether accounting for the nonlinearity of realized volatility leads to better forecast performance. We propose a new realized volatility forecasting model that considers nonlinearities without the assumption of a particular nonlinear model. The proposed model uses the Taylor series approximation method to account for nonlinearities. We applied it to the realized volatility of representative stock indices from the U.S., Japan, the U.K., and China and observed their in-sample nonlinearities. Additionally, we evaluate out-of-sample forecast performance. The empirical results show that realized volatility has nonlinearity, and the proposed models exhibit better forecast performance than standard models.","PeriodicalId":36561,"journal":{"name":"Communications in Statistics Case Studies Data Analysis and Applications","volume":"252 1","pages":"51 - 71"},"PeriodicalIF":0.0,"publicationDate":"2023-01-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75826594","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A novel application of finite Gaussian mixture model (GMM) using real and simulated biomarkers of cardiovascular disease to distinguish adolescents with and without obesity
Pub Date: 2023-01-02 | DOI: 10.1080/23737484.2023.2194249
M. J. Hossain, P. Balagopal
Abstract Obesity-induced derangements in adipose tissue and other organs lead to the development of cardiovascular disease (CVD). The loss of cardiovascular health in children is a continuum, and the manifestation of overt CVD takes several years, so robust biomarkers are crucial for its early prediction, prevention, and management. Biomarkers of CVD are highly mutually correlated, and typical regression approaches do not precisely appraise the obesity-induced summative alterations of these overlapping variables. This study examines whether the confluence of CVD biomarkers can distinguish adolescents with obesity from their normal-weight counterparts, illustrating obesity as a strong risk factor for CVD. The biomarkers were measured in a well-controlled study of 21 adolescents. Applying a Gaussian mixture model to these biomarkers identified two distinct groups that matched the obesity status of the participants, which was further confirmed using supervised learning methods. Classifying biomarkers from a simulation study of 1,000 data points, each comprising a vector of five biomarkers and a classification identifier, likewise produced two groups that matched the classification in the simulated dataset. The precise identification of obesity from the pattern of concurring CVD biomarkers in real and simulated datasets confirms obesity as a strong risk factor for CVD.
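Fitting a two-component Gaussian mixture to correlated five-dimensional biomarker vectors is straightforward with scikit-learn; the group sizes, means, and correlation structure below are invented for illustration and are not the study's data.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Simulate two groups of five mutually correlated biomarkers.
rng = np.random.default_rng(0)
cov = 0.6 * np.ones((5, 5)) + 0.4 * np.eye(5)   # equicorrelated, PD
X = np.vstack([rng.multivariate_normal(np.zeros(5), cov, 500),
               rng.multivariate_normal(1.5 * np.ones(5), cov, 500)])
truth = np.repeat([0, 1], 500)

gmm = GaussianMixture(n_components=2, covariance_type="full",
                      random_state=0).fit(X)
labels = gmm.predict(X)
acc = max((labels == truth).mean(), (labels != truth).mean())  # label switch
print(f"agreement with true grouping: {acc:.2%}")
```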
{"title":"A novel application of finite Gaussian mixture model (GMM) using real and simulated biomarkers of cardiovascular disease to distinguish adolescents with and without obesity","authors":"M. J. Hossain, P. Balagopal","doi":"10.1080/23737484.2023.2194249","DOIUrl":"https://doi.org/10.1080/23737484.2023.2194249","url":null,"abstract":"Abstract Obesity-induced derangements in adipose tissue and other organs lead to the development of cardiovascular disease (CVD). The loss of CV-health in children is a continuum, and the manifestation of overt CVD takes several years. Therefore, robust biomarkers are crucial for its early prediction, prevention, and management. Biomarkers of CVD are highly mutually correlated, and typical regression approaches do not precisely appraise the obesity-induced summative alterations of these overlapping variables. This study examines if the confluence of biomarkers of CVD can distinguish adolescents with obesity from their normal-weight counterparts to illustrate obesity as a strong risk factor of CVD. The biomarkers were measured in a well-controlled study in 21 adolescents. Application of the Gaussian mixture model to these biomarkers identified two distinct groups that matched with the obesity status of participants, which was further confirmed using supervised learning methods. Classification of biomarkers from a simulation study of 1,000 data points, each comprising a vector of five biomarkers and the classification identifier, resulted in two groups that matched with the classification in the simulated dataset. The precise identification of obesity by the pattern of concurring CVD biomarkers in real and simulated datasets confirms obesity as a strong risk factor of CVD.","PeriodicalId":36561,"journal":{"name":"Communications in Statistics Case Studies Data Analysis and Applications","volume":"27 1","pages":"106 - 120"},"PeriodicalIF":0.0,"publicationDate":"2023-01-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80987134","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Using a Bayesian model of the joint distribution of pain and time on medication to decide on pain medication for neuropathy
Pub Date: 2023-01-01 | Epub Date: 2023-05-19 | DOI: 10.1080/23737484.2023.2212262
Guangyi Gao, Jo A Wick, Alexandra R Brown, Richard J Barohn, Byron J Gajewski
The PAIN-CONTRoLS trial compared four medications for treating cryptogenic sensory polyneuropathy. The primary outcome was a utility function that combined two outcomes: patients' pain score reduction and patients' quit rate. However, additional analysis of the individual outcomes can also be leveraged to inform the selection of an optimal medication for future patients. We demonstrate how joint modeling of the longitudinal and time-to-event data from PAIN-CONTRoLS can be used to predict the effects of medication in a patient-specific manner and to support patient-focused decisions. A joint model was used to evaluate the two outcomes while accounting for the association between the longitudinal process and the time-to-event process. Results suggested no significant association between patients' pain scores and time to quitting the medication in the PAIN-CONTRoLS study, but the joint model still provided robust estimates and a better model fit. Using the model estimates and patients' baseline characteristics, a drug profile covering both pain reduction and time on medication can be obtained for each drug, indicating how likely patients are to quit and how much pain reduction they should expect. Our analysis suggested that drugs viable for one patient may not be beneficial for others.
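A full Bayesian joint model fits the longitudinal and time-to-event submodels simultaneously. As a rough two-stage approximation — not the authors' method — one can extract per-patient pain slopes and feed them into a Cox model; this sketch assumes the lifelines package is available, and all data and effect sizes are simulated.

```python
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

rng = np.random.default_rng(0)
rows, surv = [], []
for pid in range(100):
    slope = rng.normal(-0.5, 0.3)                  # pain trajectory slope
    # Better pain relief (more negative slope) -> stays on drug longer.
    t_quit = 1.0 + rng.exponential(8 * np.exp(-2 * slope - 1))
    for t in range(12):                            # monthly pain scores
        if t > t_quit:
            break
        rows.append({"id": pid, "t": t,
                     "pain": 6 + slope * t + rng.normal(0, 0.5)})
    surv.append({"id": pid, "time": min(t_quit, 12.0),
                 "quit": t_quit <= 12.0})

long_df, surv_df = pd.DataFrame(rows), pd.DataFrame(surv)

# Stage 1: per-patient OLS slope of pain over time (longitudinal part).
slopes = (long_df.groupby("id")[["t", "pain"]]
          .apply(lambda g: float(np.polyfit(g["t"], g["pain"], 1)[0]))
          .rename("pain_slope"))

# Stage 2: Cox model for time to quitting, with the slope as covariate.
df = surv_df.join(slopes, on="id")[["time", "quit", "pain_slope"]]
cph = CoxPHFitter()
cph.fit(df, duration_col="time", event_col="quit")
cph.print_summary()
```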
{"title":"Using a Bayesian model of the joint distribution of pain and time on medication to decide on pain medication for neuropathy.","authors":"Guangyi Gao, Jo A Wick, Alexandra R Brown, Richard J Barohn, Byron J Gajewski","doi":"10.1080/23737484.2023.2212262","DOIUrl":"10.1080/23737484.2023.2212262","url":null,"abstract":"<p><p>The PAIN-CONTRoLS trial compared four medications in treating Cryptogenic sensory polyneuropathy. The primary outcome was a utility function that combined two outcomes, patients' pain score reduction and patients' quit rate. However, additional analysis of the individual outcomes could also be leveraged to inform selecting an optimal medication for future patients. We demonstrate how joint modeling of longitudinal and time-to-event data from PAIN-CONTRoLS can be used to predict the effects of medication in a patient-specific manner and helps to make patient-focused decisions. A joint model was used to evaluate the two outcomes while accounting for the association between the longitudinal process and the time-to-event processes. Results suggested no significant association between the patients' pain scores and time to the medication quit in the PAIN-CONTRoLS study, but the joint model still provided robust estimates and a better model fit. Using the model estimates, given patients' baseline characteristics, a drug profile on both the pain reduction and medication time could be obtained for each drug, providing information on how likely they would quit and how much pain reduction they should expect. Our analysis suggested that drugs viable for one patient may not be beneficial for others.</p>","PeriodicalId":36561,"journal":{"name":"Communications in Statistics Case Studies Data Analysis and Applications","volume":"9 3","pages":"252-269"},"PeriodicalIF":0.0,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10491414/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10241375","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}