While COVID-19 has resulted in a significant increase in global mortality rates, the impact of the pandemic on mortality from other causes remains uncertain. To gain insight into the broader effects of COVID-19 on various causes of death, we analyze an Italian dataset that includes monthly mortality counts for different causes from January 2015 to December 2020. Our approach involves a generalized additive model enhanced with correlated random effects. The generalized additive model component effectively captures non-linear relationships between various covariates and mortality rates, while the random effects are multivariate time series observed across locations, embodying the dependence structure among geographical locations and different causes of mortality. Adopting a Bayesian framework, we impose suitable priors on the model parameters. For efficient posterior computation we employ variational inference; specifically, a Gaussian variational approximation is assumed for the fixed-effect coefficients and random effects, which streamlines the analysis. The optimisation is performed using a coordinate ascent variational inference algorithm, and several computational strategies are implemented along the way to address the issues arising from the high-dimensional nature of the data, providing accelerated and stabilised parameter estimation and statistical inference.
Bayesian Dynamic Generalized Additive Model for Mortality during COVID-19 Pandemic. Wei Zhang, Antonietta Mira, Ernst C. Wit. arXiv:2409.02378 (2024-09-04).
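To make the variational step described above more tangible, here is a minimal, self-contained Python sketch that fits a Gaussian variational approximation to a toy Poisson log-linear model for monthly counts by maximizing a Monte Carlo estimate of the ELBO. It uses generic reparameterized optimization on simulated data rather than the authors' coordinate ascent algorithm, and all data and variable names are invented for illustration.

```python
# Minimal sketch (not the paper's code): Gaussian variational approximation
# q(beta) = N(mu, diag(sigma^2)) for a toy Poisson log-linear model of monthly
# counts, fitted by maximizing a Monte Carlo estimate of the ELBO.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

# Simulated monthly counts with a seasonal pattern (stand-in for mortality data).
n, p = 72, 4
t = np.arange(n)
X = np.column_stack([np.ones(n),
                     np.sin(2 * np.pi * t / 12),
                     np.cos(2 * np.pi * t / 12),
                     t / n])
beta_true = np.array([3.0, 0.4, -0.2, 0.5])
y = rng.poisson(np.exp(X @ beta_true))

eps = rng.standard_normal((128, p))   # fixed draws make the objective deterministic

def neg_elbo(params):
    mu, log_sd = params[:p], params[p:]
    beta = mu + eps * np.exp(log_sd)                          # reparameterized samples
    eta = beta @ X.T                                          # linear predictors
    loglik = (y * eta - np.exp(eta)).sum(axis=1).mean()       # Poisson, constants dropped
    logprior = (-0.5 * beta ** 2 / 100.0).sum(axis=1).mean()  # N(0, 10^2) priors
    entropy = log_sd.sum()                                    # Gaussian entropy up to a constant
    return -(loglik + logprior + entropy)

res = minimize(neg_elbo, x0=np.zeros(2 * p), method="Nelder-Mead",
               options={"maxiter": 50000, "maxfev": 50000})
print("variational posterior means:", np.round(res.x[:p], 2))
```

The paper's model additionally includes smooth GAM terms and correlated random effects across locations and causes of death, which this toy example omits.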
Flexible machine learning tools are being used increasingly to estimate heterogeneous treatment effects. This paper gives an accessible tutorial demonstrating the use of the causal forest algorithm, available in the R package grf. We start with a brief non-technical overview of treatment effect estimation methods with a focus on estimation in observational studies, although similar methods can be used in experimental studies. We then discuss the logic of estimating heterogeneous effects using the extension of the random forest algorithm implemented in grf. Finally, we illustrate causal forest by conducting a secondary analysis on the extent to which individual differences in resilience to high combat stress can be measured among US Army soldiers deploying to Afghanistan, based on information about these soldiers available prior to deployment. Throughout, we illustrate simple and interpretable exercises for both model selection and evaluation, including targeting operator characteristic curves, Qini curves, area-under-the-curve summaries, and best linear projections. A replication script with simulated data is available at github.com/grf-labs/grf/tree/master/experiments/ijmpr
Estimating Treatment Effect Heterogeneity in Psychiatry: A Review and Tutorial with Causal Forests. Erik Sverdrup, Maria Petukhova, Stefan Wager. arXiv:2409.01578 (2024-09-03).
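As a rough companion to the tutorial above, the sketch below estimates heterogeneous treatment effects on simulated data with a simple T-learner built from scikit-learn random forests. This is not the causal forest algorithm of the R package grf; it is only meant to make the notion of a conditional average treatment effect concrete, and all simulated quantities are assumptions.

```python
# Rough illustration only: a T-learner with random forests on simulated data,
# estimating conditional average treatment effects tau(x) = E[Y|X,W=1] - E[Y|X,W=0].
# This is NOT the causal forest of the R package grf discussed above.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(1)
n, d = 4000, 5
X = rng.normal(size=(n, d))
W = rng.binomial(1, 0.5, size=n)               # randomized treatment assignment
tau = 0.5 + X[:, 0]                            # true heterogeneous effect
Y = X[:, 1] + tau * W + rng.normal(size=n)

m1 = RandomForestRegressor(n_estimators=200, min_samples_leaf=20, random_state=0)
m0 = RandomForestRegressor(n_estimators=200, min_samples_leaf=20, random_state=0)
m1.fit(X[W == 1], Y[W == 1])                   # outcome model under treatment
m0.fit(X[W == 0], Y[W == 0])                   # outcome model under control

tau_hat = m1.predict(X) - m0.predict(X)        # plug-in CATE estimates
print("corr(tau_hat, tau) =", round(np.corrcoef(tau_hat, tau)[0, 1], 2))
```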
Christopher R. Wentland, Michael Weylandt, Laura P. Swiler, Thomas S. Ehrmann, Diana Bull
Attribution of climate impacts to a source forcing is critical to understanding, communicating, and addressing the effects of human influence on the climate. While standard attribution methods, such as optimal fingerprinting, have been successfully applied to long-term, widespread effects such as global surface temperature warming, they often struggle in low signal-to-noise regimes, typical of short-term climate forcings or climate variables which are loosely related to the forcing. Single-step approaches, which directly relate a source forcing and final impact, are unable to utilize additional climate information to improve attribution certainty. To address this shortcoming, this paper presents a novel multi-step attribution approach which is capable of analyzing multiple variables conditionally. A connected series of climate effects are treated as dependent, and relationships found in intermediary steps of a causal pathway are leveraged to better characterize the forcing impact. This enables attribution of the forcing level responsible for the observed impacts, while equivalent single-step approaches fail. Utilizing a scalar feature describing the forcing impact, simple forcing response models, and a conditional Bayesian formulation, this method can incorporate several causal pathways to identify the correct forcing magnitude. As an exemplar of a short-term, high-variance forcing, we demonstrate this method for the 1991 eruption of Mt. Pinatubo. Results indicate that including stratospheric and surface temperature and radiative flux measurements increases attribution certainty compared to analyses derived solely from temperature measurements. This framework has potential to improve climate attribution assessments for both geoengineering projects and long-term climate change, for which standard attribution methods may fail.
Conditional multi-step attribution for climate forcings. arXiv:2409.01396 (2024-09-02).
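The following toy calculation illustrates why conditioning on an intermediate variable along a causal pathway can sharpen inference about a forcing magnitude, which is the core idea of the abstract above. It assumes simple linear-Gaussian links and a flat prior on a grid; it is not the paper's model or data.

```python
# Toy sketch: posterior for a forcing magnitude theta from (i) the final impact
# alone versus (ii) the full pathway forcing -> intermediate -> impact.
# Assumed linear-Gaussian links; purely illustrative.
import numpy as np

rng = np.random.default_rng(2)
theta_true = 2.0
z = 1.5 * theta_true + rng.normal(scale=1.0)   # intermediate response (e.g., stratospheric signal)
y = 0.8 * z + rng.normal(scale=1.0)            # final impact (e.g., surface anomaly)

grid = np.linspace(-2.0, 6.0, 801)             # flat prior for theta over this grid
dx = grid[1] - grid[0]

def normal_pdf(x, mean, sd):
    return np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))

# Single-step: likelihood of the final impact only, intermediate step marginalized out
# (y = 1.2*theta + 0.8*e1 + e2, so its standard deviation is sqrt(0.8^2 + 1)).
lik_single = normal_pdf(y, 0.8 * 1.5 * grid, np.sqrt(0.8 ** 2 + 1.0))
# Multi-step: condition on the intermediate observation as well as the impact.
lik_multi = normal_pdf(z, 1.5 * grid, 1.0) * normal_pdf(y, 0.8 * z, 1.0)

for name, lik in [("single-step", lik_single), ("multi-step", lik_multi)]:
    post = lik / (lik.sum() * dx)
    mean = (grid * post).sum() * dx
    sd = np.sqrt(((grid - mean) ** 2 * post).sum() * dx)
    print(f"{name}: posterior mean {mean:.2f}, sd {sd:.2f}")
```

With the same observations, the multi-step posterior is noticeably narrower than the single-step one, mirroring the attribution-certainty gain reported above.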
We aim to explain whether a stress memory task has a significant impact on tonal coarticulation. We contribute a novel approach to analysing tonal coarticulation in phonetics, in which several f0 contours are compared with respect to their vibrations at higher resolution, something that in statistical terms is called variation of the second order. We identify speech recording frequency curves as functional observations and draw inspiration from the mathematical fields of functional data analysis and optimal transport. By leveraging results from these two disciplines, we make one key observation: we identify the time and frequency covariance functions as crucial features for capturing the finer effects of tonal coarticulation. This observation leads us to propose a two-step approach in which the mean functions are modelled via Generalized Additive Models, and the residuals of such models are investigated for any structure nested at the covariance level. If such structure exists, we describe the variation manifested by the covariances through covariance principal component analysis. The two-step approach allows us to uncover any variation not explained by generalized additive modelling, and it addresses a known shortcoming of these models, namely the difficulty of incorporating complex correlation structures in the data. The proposed method is illustrated on an articulatory dataset contrasting the pronunciation of non-sensical bi-syllabic combinations in the presence of a short-memory challenge.
Tonal coarticulation revisited: functional covariance analysis to investigate the planning of co-articulated tones by Standard Chinese speakers. Valentina Masarotto, Yiya Chen. arXiv:2409.01194 (2024-09-02).
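A minimal sketch of the second step described above, covariance principal component analysis of residual curves, is given below. The curves are simulated stand-ins for f0 contours, and the mean is removed with a simple pointwise average rather than a generalized additive model.

```python
# Minimal sketch of covariance PCA on residual curves: remove a mean per time
# point, form the sample covariance across curves, and inspect its leading
# eigenfunctions. Simulated f0-like contours; not the authors' data or pipeline.
import numpy as np

rng = np.random.default_rng(3)
n_curves, n_time = 60, 100
t = np.linspace(0, 1, n_time)
mean_curve = 200 + 20 * np.sin(2 * np.pi * t)                          # smooth "mean f0" contour
scores = rng.normal(size=(n_curves, 2))
phi = np.column_stack([np.sin(4 * np.pi * t), np.cos(6 * np.pi * t)])  # latent covariance modes
curves = mean_curve + scores @ phi.T * 5 + rng.normal(scale=1.0, size=(n_curves, n_time))

residuals = curves - curves.mean(axis=0)        # crude stand-in for a GAM mean fit
cov = np.cov(residuals, rowvar=False)           # time-by-time covariance function
eigval, eigvec = np.linalg.eigh(cov)
eigval, eigvec = eigval[::-1], eigvec[:, ::-1]  # sort eigenvalues descending

explained = eigval[:3] / eigval.sum()
print("variance explained by first 3 covariance PCs:", np.round(explained, 2))
```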
Fingerprint examiners appear to be reluctant to adopt probabilistic reasoning, statistical models, and empirical validation. The rate of adoption of the likelihood-ratio framework by fingerprint practitioners appears to be near zero. A factor in the reluctance to adopt the likelihood-ratio framework may be a perception that it would require a radical change in practice. The present paper proposes a small step that would require minimal changes to current practice. It proposes and demonstrates a method to convert traditional fingerprint-examination outputs ("identification", "inconclusive", "exclusion") to well-calibrated Bayes factors. The method makes use of a beta-binomial model, and both uninformative and informative priors.
A method to convert traditional fingerprint ACE / ACE-V outputs ("identification", "inconclusive", "exclusion") to Bayes factors. Geoffrey Stewart Morrison. arXiv:2409.00451 (2024-08-31).
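The core beta-binomial calculation can be sketched numerically: with validation counts of how often each conclusion is reached for same-source and for different-source comparisons, the posterior-predictive probability of a conclusion under a Beta(a, b) prior is (k + a) / (n + a + b), and the Bayes factor is the ratio of the same-source to the different-source predictive probability. The counts below are invented, and only the uniform Beta(1, 1) prior is shown; the paper also considers informative priors.

```python
# Toy sketch: convert a categorical examiner conclusion into a Bayes factor using
# beta-binomial posterior-predictive probabilities from (invented) validation counts.
from dataclasses import dataclass

@dataclass
class ValidationCounts:
    identification: int
    inconclusive: int
    exclusion: int

    @property
    def total(self) -> int:
        return self.identification + self.inconclusive + self.exclusion

def predictive_prob(k: int, n: int, a: float = 1.0, b: float = 1.0) -> float:
    """Posterior-predictive P(conclusion) under a Beta(a, b) prior (a = b = 1: uniform)."""
    return (k + a) / (n + a + b)

def bayes_factor(conclusion: str, same: ValidationCounts, diff: ValidationCounts) -> float:
    """BF = P(conclusion | same source) / P(conclusion | different source)."""
    k_same, k_diff = getattr(same, conclusion), getattr(diff, conclusion)
    return predictive_prob(k_same, same.total) / predictive_prob(k_diff, diff.total)

# Invented validation-study counts, for illustration only.
same_source = ValidationCounts(identification=920, inconclusive=70, exclusion=10)
diff_source = ValidationCounts(identification=3, inconclusive=97, exclusion=900)

for c in ("identification", "inconclusive", "exclusion"):
    print(c, "-> BF =", round(bayes_factor(c, same_source, diff_source), 2))
```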
Localizing sources of troublesome oscillations, particularly forced oscillations (FOs), in power systems has received considerable attention over the last few years. This is driven in part by the massive deployment of phasor measurement units (PMUs) that capture these oscillations when they occur; and in part by the increasing incidents of FOs due to malfunctioning components, wind power fluctuations, and/or cyclic loads. Capitalizing on the frequency divider formula of [1], we develop methods to localize single and multiple oscillatory sources using bus frequency measurements. The method to localize a single oscillation source does not require knowledge of network parameters. However, the method for localizing FOs caused by multiple sources requires this knowledge. We explain the reasoning behind this knowledge difference as well as demonstrate the success of our methods for source localization in multiple test systems.
Localizing Single and Multiple Oscillatory Sources: A Frequency Divider Approach. Rajasekhar Anguluri, Anamitra Pal. arXiv:2409.00566 (2024-08-31).
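For intuition about the kind of PMU bus-frequency data involved, the sketch below ranks simulated buses by the FFT magnitude of their frequency deviations at the dominant forced-oscillation frequency. This is a generic spectral screening step, not the frequency divider approach developed in the paper, and every signal and amplitude is invented.

```python
# Generic screening sketch (NOT the frequency-divider method): rank buses by the
# FFT magnitude of their frequency deviation at the dominant oscillation frequency.
# Simulated PMU-like signals sampled at 30 frames per second.
import numpy as np

rng = np.random.default_rng(4)
fs, T, n_bus = 30.0, 60.0, 5                   # sampling rate (Hz), window (s), buses
t = np.arange(0, T, 1 / fs)
f_forced = 0.7                                 # forced-oscillation frequency (Hz)

signals = 0.002 * rng.standard_normal((n_bus, t.size))    # ambient noise
amps = np.array([0.02, 0.008, 0.004, 0.002, 0.001])       # source nearest bus 0
signals += amps[:, None] * np.sin(2 * np.pi * f_forced * t)

spectra = np.abs(np.fft.rfft(signals, axis=1))
freqs = np.fft.rfftfreq(t.size, d=1 / fs)

k = np.argmax(spectra.sum(axis=0)[1:]) + 1     # dominant nonzero frequency bin
print(f"dominant frequency: {freqs[k]:.2f} Hz")
print("bus ranking by oscillation magnitude:", np.argsort(-spectra[:, k]))
```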
Tomasz Stanisz, Stanisław Drożdż, Jarosław Kwapień
As recent studies indicate, the structure imposed onto written texts by the presence of punctuation develops patterns which reveal certain characteristics of universality. In particular, based on a large collection of classic literary works, it has been evidenced that the distances between consecutive punctuation marks, measured in terms of the number of words, obey the discrete Weibull distribution - a discrete variant of a distribution often used in survival analysis. The present work extends the analysis of punctuation usage patterns to more experimental pieces of world literature. It turns out that the compliance of the distances between punctuation marks with the discrete Weibull distribution typically applies here as well. However, some of the works by James Joyce are distinct in this regard - in the sense that the tails of the relevant distributions are significantly thicker and, consequently, the corresponding hazard functions are decreasing functions not observed in typical literary texts in prose. "Finnegans Wake" - the same one to which science owes the word "quarks" for the most fundamental constituents of matter - is particularly striking in this context. At the same time, in all the studied texts, the sentence lengths - representing the distances between sentence-ending punctuation marks - reveal more freedom and are not constrained by the discrete Weibull distribution. This freedom in some cases translates into long-range nonlinear correlations, which manifest themselves in multifractality. Again, a text particularly spectacular in terms of multifractality is "Finnegans Wake".
Statistics of punctuation in experimental literature -- the remarkable case of "Finnegans Wake" by James Joyce. arXiv:2409.00483 (2024-08-31).
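To see the distance statistic in action, the sketch below counts the words between consecutive punctuation marks in a tiny sample sentence and fits a type-I discrete Weibull distribution by maximum likelihood. The sample text and starting values are arbitrary and only stand in for the literary corpora analysed above.

```python
# Sketch: measure word distances between consecutive punctuation marks and fit a
# type-I discrete Weibull on k = 1, 2, ..., with P(X = k) = q**((k-1)**beta) - q**(k**beta)
# and survival function P(X > k) = q**(k**beta). Tiny inline text for illustration.
import re
import numpy as np
from scipy.optimize import minimize

text = ("Yes, because he never did a thing like that before; he asked, quietly, "
        "and then, as always, the answer came late, too late, and it was no.")

# Word counts between consecutive punctuation marks.
chunks = re.split(r"[,.;:!?]", text)
distances = np.array([len(c.split()) for c in chunks if c.split()], dtype=float)

def neg_loglik(params):
    q, beta = params
    if not (0.0 < q < 1.0) or beta <= 0.0:
        return np.inf
    pmf = q ** ((distances - 1.0) ** beta) - q ** (distances ** beta)
    if np.any(pmf <= 0.0):
        return np.inf
    return -np.sum(np.log(pmf))

res = minimize(neg_loglik, x0=np.array([0.8, 1.0]), method="Nelder-Mead")
print("fitted (q, beta):", np.round(res.x, 3))
```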
Daniel García Rasines, Roi Naveiro, David Ríos Insua, Simón Rodríguez Santana
Pricing decisions stand out as one of the most critical tasks a company faces, particularly in today's digital economy. As with other business decision-making problems, pricing unfolds in a highly competitive and uncertain environment. Traditional analyses in this area have relied heavily on game theory and its variants. However, an important drawback of these approaches is their reliance on common knowledge assumptions, which are hardly tenable in competitive business domains. This paper introduces an innovative personalized pricing framework designed to assist decision-makers in undertaking pricing decisions amidst competition, considering both buyers' and competitors' preferences. Our approach (i) establishes a coherent framework for modeling competition while mitigating common knowledge assumptions; (ii) proposes a principled method to forecast competitors' pricing and customers' purchasing decisions, acknowledging major business uncertainties; and (iii) encourages structured thinking about the competitors' problems, thus enriching the solution process. To illustrate these properties, in addition to a general pricing template, we outline two specifications: one from the retail domain and a more intricate one from the pension fund domain.
Personalized Pricing Decisions Through Adversarial Risk Analysis. arXiv:2409.00444 (2024-08-31).
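A stylized sketch of the expected-profit calculation underlying such a framework is shown below: the competitor's price is simulated from a forecast distribution, the customer's choice is modelled given both prices, and our price is chosen to maximize Monte Carlo expected profit. All distributions, costs, and sensitivities are invented and do not reflect the paper's adversarial risk analysis machinery.

```python
# Stylized sketch: choose our price by Monte Carlo expected profit, integrating
# over uncertainty about the competitor's price and the customer's choice.
import numpy as np

rng = np.random.default_rng(5)
cost = 5.0
candidate_prices = np.linspace(6.0, 15.0, 19)
n_sim = 20000

# Forecast distribution for the competitor's price (invented).
competitor_price = rng.lognormal(mean=np.log(10.0), sigma=0.15, size=n_sim)

def purchase_prob(our_price, their_price, sensitivity=1.2):
    """Logit choice model: the cheaper offer is more likely to win the customer."""
    return 1.0 / (1.0 + np.exp(sensitivity * (our_price - their_price)))

expected_profit = [
    np.mean((p - cost) * purchase_prob(p, competitor_price))
    for p in candidate_prices
]
best = candidate_prices[int(np.argmax(expected_profit))]
print(f"price maximizing expected profit: {best:.2f}")
```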
Mohammadreza Ganji, Anas El Fathi Ph.D., Chiara Fabris Ph.D., Dayu Lv Ph.D., Boris Kovatchev Ph.D., Marc Breton Ph.D.
Background and Objective: Diabetes presents a significant challenge to healthcare due to the negative impact of poor blood sugar control on health and associated complications. Computer simulation platforms, notably exemplified by the UVA/Padova Type 1 Diabetes simulator, have emerged as a promising tool for advancing diabetes treatments by simulating patient responses in a virtual environment. The UVA Virtual Lab (UVLab) is a new simulation platform that mimics the metabolic behavior of people with Type 2 diabetes (T2D) with a large population of 6062 virtual subjects. Methods: This work introduces the Distribution-Based Sub-Population Selection (DSPS) method, a systematic approach to identifying virtual subsets that mimic the clinical behavior observed in real trials. The method transforms the sub-population selection task into a linear programming problem, enabling the identification of the largest representative virtual cohort. This selection process centers on key clinical outcomes in diabetes research, such as HbA1c and fasting plasma glucose (FPG), ensuring that the statistical properties (moments) of the selected virtual sub-population closely resemble those observed in the real-world clinical trial. Results: The DSPS method was applied to the insulin degludec (IDeg) arm of a phase 3 clinical trial. It was used to select a sub-population of virtual subjects that closely mirrored the clinical trial data across multiple key metrics, including glycemic efficacy, insulin dosages, and cumulative hypoglycemia events over a 26-week period. Conclusion: The DSPS algorithm is able to select a virtual sub-population within UVLab to reproduce and predict the outcomes of a clinical trial. This statistical method can bridge the gap between large population simulation platforms and previously conducted clinical trials.
Distribution-Based Sub-Population Selection (DSPS): A Method for in-Silico Reproduction of Clinical Trials Outcomes. arXiv:2409.00232 (2024-08-30).
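The linear-programming selection idea can be sketched compactly: choose weights in [0, 1] for virtual subjects, maximize the total weight, and constrain the weighted mean of an outcome to lie within a tolerance of the trial target. The snippet below does this for a single simulated HbA1c endpoint with scipy.optimize.linprog, whereas the DSPS method described above matches several moments and endpoints simultaneously.

```python
# Sketch of the LP selection idea: pick weights in [0, 1] for virtual subjects,
# maximizing total weight while forcing the weighted mean HbA1c change to sit
# within a tolerance of the clinical-trial target. Simulated values only.
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(6)
n = 2000
hba1c_change = rng.normal(loc=-0.9, scale=0.8, size=n)   # virtual cohort outcomes
target, tol = -1.2, 0.02                                 # trial mean and tolerance

# Maximize sum(w)  <=>  minimize -sum(w).
c = -np.ones(n)
# |sum(w_i (x_i - target))| <= tol * sum(w_i), linearized as two inequalities:
#   sum(w_i (x_i - target - tol)) <= 0   and   sum(w_i (target - x_i - tol)) <= 0
A_ub = np.vstack([hba1c_change - target - tol,
                  target - hba1c_change - tol])
b_ub = np.zeros(2)

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=(0.0, 1.0), method="highs")
w = res.x
print(f"subjects selected: {(w > 0.5).sum()} of {n}")
print(f"weighted mean HbA1c change: {np.average(hba1c_change, weights=w):.3f}")
```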
The remarkable growth of digital assets, from the inception of Bitcoin in 2009 to a $1 trillion market in 2024, underscores the momentum behind disruptive technologies and the global appetite for digital assets. This paper develops a framework to enhance actuaries' understanding of the cyber risks associated with the developing digital asset ecosystem, as well as their measurement methods in the context of digital asset insurance. By integrating actuarial perspectives, we aim to enhance understanding and modeling of cyber risks at both the micro and systemic levels. The qualitative examination sheds light on blockchain technology and its associated risks, while our quantitative framework offers a rigorous approach to modeling cyber risks in digital asset insurance portfolios. This multifaceted approach serves three primary objectives: i) to offer a clear and accessible education on the evolving digital asset ecosystem and the diverse spectrum of cyber risks it entails; ii) to develop a scientifically rigorous framework for quantifying cyber risks in the digital asset ecosystem; and iii) to provide practical applications, including pricing strategies and tail risk management. In particular, we develop frequency-severity models based on real loss data for pricing cyber risks in digital assets and utilize Monte Carlo simulation to estimate the tail risks, offering practical insights for risk management strategies. As digital assets continue to reshape finance, our work serves as a foundational step towards safeguarding the integrity and stability of this rapidly evolving landscape.
A Framework for Digital Asset Risks with Insurance Applications. Zhengming Li, Jianxi Su, Maochao Xu, Jimmy Yuen. arXiv:2408.17227 (2024-08-30).
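A minimal frequency-severity sketch of the kind of Monte Carlo tail-risk calculation mentioned above: Poisson incident counts, lognormal severities, aggregate annual losses, and empirical VaR and TVaR at the 95% level. The parameter values are invented rather than fitted to the real loss data used in the paper.

```python
# Minimal frequency-severity sketch: simulate aggregate annual cyber losses
# (Poisson frequency, lognormal severity) and report empirical tail measures.
import numpy as np

rng = np.random.default_rng(7)
n_years = 50_000
lam = 1.8                      # expected number of cyber incidents per year (invented)
mu, sigma = 12.0, 1.5          # lognormal severity parameters on the log scale (invented)

counts = rng.poisson(lam, size=n_years)
aggregate = np.array([rng.lognormal(mu, sigma, size=k).sum() for k in counts])

level = 0.95
var = np.quantile(aggregate, level)            # Value-at-Risk at the 95% level
tvar = aggregate[aggregate >= var].mean()      # Tail Value-at-Risk (expected shortfall)
print(f"VaR  95%: {var:,.0f}")
print(f"TVaR 95%: {tvar:,.0f}")
```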