Pub Date: 2021-04-01 | Epub Date: 2020-04-18 | DOI: 10.1007/s12561-020-09278-z
Ning Hao, Yue Selena Niu, Feifei Xiao, Heping Zhang
In many applications, such as copy number variant (CNV) detection, the goal is to identify short segments on which the observations have different means or medians from the background. Because these segments are hidden in a long sequence, they are very challenging to find. In this paper we study a super scalable short segment (4S) detection algorithm. This nonparametric method detects segments by clustering the locations where the observations exceed a threshold. It is computationally efficient and does not rely on a Gaussian noise assumption. Moreover, we develop a framework for assigning significance levels to detected segments. We demonstrate the advantages of the proposed method through theoretical, simulation, and real data studies.
{"title":"A super scalable algorithm for short segment detection.","authors":"Ning Hao, Yue Selena Niu, Feifei Xiao, Heping Zhang","doi":"10.1007/s12561-020-09278-z","DOIUrl":"https://doi.org/10.1007/s12561-020-09278-z","url":null,"abstract":"<p><p>In many applications such as copy number variant (CNV) detection, the goal is to identify short segments on which the observations have different means or medians from the background. Those segments are usually short and hidden in a long sequence, and hence are very challenging to find. We study a super scalable short segment (4S) detection algorithm in this paper. This nonparametric method clusters the locations where the observations exceed a threshold for segment detection. It is computationally efficient and does not rely on Gaussian noise assumption. Moreover, we develop a framework to assign significance levels for detected segments. We demonstrate the advantages of our proposed method by theoretical, simulation, and real data studies.</p>","PeriodicalId":45094,"journal":{"name":"Statistics in Biosciences","volume":null,"pages":null},"PeriodicalIF":1.0,"publicationDate":"2021-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1007/s12561-020-09278-z","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"25504729","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
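The threshold-then-cluster idea described in the abstract above can be illustrated with a minimal sketch. This is not the authors' 4S implementation: the function name `detect_short_segments` and the parameters `threshold`, `max_gap`, and `min_points` are hypothetical, and the clustering rule (merge exceedances that lie within `max_gap` positions of each other, then keep clusters with at least `min_points` exceedances) is an assumption chosen only to show the general shape of such a method.

```python
from typing import List, Sequence, Tuple


def detect_short_segments(
    y: Sequence[float],
    threshold: float,
    max_gap: int = 2,      # hypothetical: largest index gap allowed inside a cluster
    min_points: int = 3,   # hypothetical: fewest exceedances a reported segment needs
) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs of clustered threshold exceedances."""
    # Step 1: locations where the observation exceeds the threshold.
    hits = [i for i, value in enumerate(y) if value > threshold]
    segments: List[Tuple[int, int]] = []
    if not hits:
        return segments

    # Step 2: a single pass that groups nearby exceedances into clusters.
    start = prev = hits[0]
    count = 1
    for i in hits[1:]:
        if i - prev <= max_gap:
            # Close enough to the previous exceedance: extend the cluster.
            prev = i
            count += 1
        else:
            # Gap too large: close the current cluster and start a new one.
            if count >= min_points:
                segments.append((start, prev))
            start = prev = i
            count = 1
    if count >= min_points:
        segments.append((start, prev))
    return segments


signal = [0, 0, 5, 6, 0, 7, 0, 0, 0, 0, 9, 0, 0, 0]
print(detect_short_segments(signal, threshold=3))
```

Both steps are single linear passes over the data, which is consistent with the scalability the abstract emphasizes. The sketch only flags observations above the threshold; segments with a lower mean would need a mirrored pass, and the significance-level framework mentioned in the abstract is not covered here.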
Pub Date: 2021-04-01 | Epub Date: 2020-08-17 | DOI: 10.1007/s12561-020-09291-2
Yimei Li, Liang Zhu, Lei Liu, Leslie L Robison
Both panel-count data and panel-binary data are common data types in recurrent event studies. Because of inconsistent questionnaires or missing data during follow-up, mixed data types frequently need to be addressed. A recently proposed semiparametric approach uses a proportional means model to facilitate regression analyses of mixed panel-count and panel-binary data. This method can use all available information regardless of the record type and provides unbiased estimates. However, the large number of nuisance parameters in the nonparametric baseline hazard function makes the estimating procedure very complicated and time-consuming. We approximated the baseline hazard function to simplify the estimating procedure. Simulation studies showed that our method performed similarly to the previous semiparametric likelihood-based method but was much faster. Approximating the baseline hazard not only reduced the computational burden but also made it possible to implement the estimating procedure in standard software such as SAS.
{"title":"Regression analysis of mixed panel-count data with application to cancer studies.","authors":"Yimei Li, Liang Zhu, Lei Liu, Leslie L Robison","doi":"10.1007/s12561-020-09291-2","DOIUrl":"https://doi.org/10.1007/s12561-020-09291-2","url":null,"abstract":"<p><p>Both panel-count data and panel-binary data are common data types in recurrent event studies. Because of inconsistent questionnaires or missing data during the follow-ups, mixed data types need to be addressed frequently. A recently proposed semiparametric approach uses a proportional means model to facilitate regression analyses of mixed panel-count and panel-binary data. This method can use all available information regardless of the record type and provide unbiased estimates. However, the large number of nuisance parameters in the nonparametric baseline hazard function makes the estimating procedure very complicated and time-consuming. We approximated the baseline hazard function to simplify the estimating procedure. Simulation studies showed that our method performed similarly to that of the previous semiparametric likelihood-based method, but with much faster speed. Approximating the baseline hazard not only reduced the computational burden but also made it possible to implement the estimating procedure in a standard software, such as SAS.</p>","PeriodicalId":45094,"journal":{"name":"Statistics in Biosciences","volume":null,"pages":null},"PeriodicalIF":1.0,"publicationDate":"2021-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1007/s12561-020-09291-2","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"25500982","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2021-04-01 | Epub Date: 2020-04-28 | DOI: 10.1007/s12561-020-09280-5
Weiming Zhang, Debashis Ghosh
Mendelian Randomization (MR) is a class of instrumental variable methods that use genetic variants as instruments. It has become popular in epidemiological studies as a way to account for unmeasured confounders when estimating the effect of an exposure on an outcome. The success of Mendelian Randomization depends on three critical assumptions, which are difficult to verify. Sensitivity analysis methods are therefore needed to evaluate results and draw plausible conclusions. We propose a general and easy-to-apply approach to sensitivity analysis for Mendelian Randomization studies. Bound et al. (1995) derived a formula for the asymptotic bias of the instrumental variable estimator. Based on their work, we derive a new sensitivity analysis formula. Its sensitivity parameters include the correlation between the instruments and the unmeasured confounder, the direct effect of the instruments on the outcome, and the strength of the instruments. In our simulation studies, we examined our approach in various scenarios using either individual SNPs or an unweighted allele score as instruments. Using a previously published dataset from a bone mineral density study, we demonstrate that our proposed method is a useful tool for MR studies, and that investigators can combine their domain knowledge with our method to obtain bias-corrected results and draw informed conclusions about the scientific plausibility of their findings.
{"title":"A general approach to sensitivity analysis for Mendelian randomization.","authors":"Weiming Zhang, Debashis Ghosh","doi":"10.1007/s12561-020-09280-5","DOIUrl":"https://doi.org/10.1007/s12561-020-09280-5","url":null,"abstract":"<p><p>Mendelian Randomization (MR) represents a class of instrumental variable methods using genetic variants. It has become popular in epidemiological studies to account for the unmeasured confounders when estimating the effect of exposure on outcome. The success of Mendelian Randomization depends on three critical assumptions, which are difficult to verify. Therefore, sensitivity analysis methods are needed for evaluating results and making plausible conclusions. We propose a general and easy to apply approach to conduct sensitivity analysis for Mendelian Randomization studies. Bound et al. (1995) derived a formula for the asymptotic bias of the instrumental variable estimator. Based on their work, we derive a new sensitivity analysis formula. The parameters in the formula include sensitivity parameters such as the correlation between instruments and unmeasured confounder, the direct effect of instruments on outcome and the strength of instruments. In our simulation studies, we examined our approach in various scenarios using either individual SNPs or unweighted allele score as instruments. 
By using a previously published dataset from researchers involving a bone mineral density study, we demonstrate that our proposed method is a useful tool for MR studies, and that investigators can combine their domain knowledge with our method to obtain bias-corrected results and make informed conclusions on the scientific plausibility of their findings.</p>","PeriodicalId":45094,"journal":{"name":"Statistics in Biosciences","volume":null,"pages":null},"PeriodicalIF":1.0,"publicationDate":"2021-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1007/s12561-020-09280-5","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"25504730","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
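To see why the three quantities named in the abstract above (instrument-confounder correlation, direct effect of the instrument on the outcome, and instrument strength) govern the bias, a simple worked identity for a single instrument may help. This is a textbook-style illustration under an assumed linear model, not the sensitivity formula derived in the paper; the symbols $\beta$, $\gamma$, $Z$, $X$, $Y$, and $e$ are notation introduced here.

```latex
% Assumed outcome model: exposure X, instrument Z, and an error term e that
% collects the unmeasured confounder and noise.
\[
  Y = \beta X + \gamma Z + e .
\]
% By linearity of covariance, the ratio (Wald-type) IV estimand satisfies,
% provided \operatorname{Cov}(Z, X) \neq 0,
\[
  \frac{\operatorname{Cov}(Z, Y)}{\operatorname{Cov}(Z, X)}
  = \beta
  + \underbrace{\gamma \, \frac{\operatorname{Var}(Z)}{\operatorname{Cov}(Z, X)}}_{\text{direct effect of } Z \text{ on } Y}
  + \underbrace{\frac{\operatorname{Cov}(Z, e)}{\operatorname{Cov}(Z, X)}}_{\text{association of } Z \text{ with confounding}} .
\]
```

When the instrument has no direct effect on the outcome ($\gamma = 0$) and is uncorrelated with the error term ($\operatorname{Cov}(Z, e) = 0$), both bias terms vanish and the ratio equals $\beta$. Both bias terms are divided by $\operatorname{Cov}(Z, X)$, so a weak instrument amplifies even small violations of the other two assumptions.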
Pub Date: 2021-03-11 | DOI: 10.1007/s12561-021-09302-w
Yongli Han, Courtney Baker, E. Vogtmann, X. Hua, Jianxin Shi, Danping Liu
{"title":"Modeling Longitudinal Microbiome Compositional Data: A Two-Part Linear Mixed Model with Shared Random Effects","authors":"Yongli Han, Courtney Baker, E. Vogtmann, X. Hua, Jianxin Shi, Danping Liu","doi":"10.1007/s12561-021-09302-w","DOIUrl":"https://doi.org/10.1007/s12561-021-09302-w","url":null,"abstract":"","PeriodicalId":45094,"journal":{"name":"Statistics in Biosciences","volume":null,"pages":null},"PeriodicalIF":1.0,"publicationDate":"2021-03-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1007/s12561-021-09302-w","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48308399","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2021-03-10 | DOI: 10.1007/s12561-021-09307-5
Huilin Li, Hongzhe Li
{"title":"Introduction to Special Issue on Statistics in Microbiome and Metagenomics","authors":"Huilin Li, Hongzhe Li","doi":"10.1007/s12561-021-09307-5","DOIUrl":"https://doi.org/10.1007/s12561-021-09307-5","url":null,"abstract":"","PeriodicalId":45094,"journal":{"name":"Statistics in Biosciences","volume":null,"pages":null},"PeriodicalIF":1.0,"publicationDate":"2021-03-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1007/s12561-021-09307-5","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48262560","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2021-03-05 | DOI: 10.1007/s12561-021-09303-9
O. Egbon, Omodolapo Somo-Aina, E. Gayawan
{"title":"Spatial Weighted Analysis of Malnutrition Among Children in Nigeria: A Bayesian Approach","authors":"O. Egbon, Omodolapo Somo-Aina, E. Gayawan","doi":"10.1007/s12561-021-09303-9","DOIUrl":"https://doi.org/10.1007/s12561-021-09303-9","url":null,"abstract":"","PeriodicalId":45094,"journal":{"name":"Statistics in Biosciences","volume":null,"pages":null},"PeriodicalIF":1.0,"publicationDate":"2021-03-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1007/s12561-021-09303-9","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"52603312","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2021-02-18 | DOI: 10.1007/s12561-021-09301-x
Naitee Ting, Lihong Huang, Q. Deng, J. Cappelleri
{"title":"Average Response over Time as Estimand: An Alternative Implementation of the While on Treatment Strategy","authors":"Naitee Ting, Lihong Huang, Q. Deng, J. Cappelleri","doi":"10.1007/s12561-021-09301-x","DOIUrl":"https://doi.org/10.1007/s12561-021-09301-x","url":null,"abstract":"","PeriodicalId":45094,"journal":{"name":"Statistics in Biosciences","volume":null,"pages":null},"PeriodicalIF":1.0,"publicationDate":"2021-02-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1007/s12561-021-09301-x","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"52603291","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2021-01-03 | DOI: 10.1007/s12561-020-09299-8
David D. Hanagal
{"title":"Positive Stable Shared Frailty Models Based on Additive Hazards","authors":"David D. Hanagal","doi":"10.1007/s12561-020-09299-8","DOIUrl":"https://doi.org/10.1007/s12561-020-09299-8","url":null,"abstract":"","PeriodicalId":45094,"journal":{"name":"Statistics in Biosciences","volume":null,"pages":null},"PeriodicalIF":1.0,"publicationDate":"2021-01-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1007/s12561-020-09299-8","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"52603272","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2021-01-01 | Epub Date: 2020-04-02 | DOI: 10.1007/s12561-020-09277-0
Yifan Zhu, Ying Qing Chen
Since December 2019, a disease caused by a novel strain of coronavirus (COVID-19) has infected many people, and the cumulative number of confirmed cases had reached almost 180,000 as of 17 March 2020. The COVID-19 outbreak was believed to have emerged from a seafood market in Wuhan, a metropolis of more than 11 million people in Hubei province, China. We introduced a statistical disease transmission model that uses case symptom onset data to estimate the transmissibility of the early-phase outbreak in China, and provided sensitivity analyses under various assumptions about the natural history of COVID-19. We fitted the transmission model to several publicly available sources of outbreak data up to 11 February 2020, and estimated the efficacy of the lockdown intervention in Wuhan. The estimated $R_0$ was between 2.7 and 4.2 under plausible distributional assumptions for the incubation period and the relative infectivity over the infectious period. 95% confidence intervals for $R_0$ were also reported. Potential issues, such as data quality concerns and comparisons of different modelling approaches, were discussed.
{"title":"On a Statistical Transmission Model in Analysis of the Early Phase of COVID-19 Outbreak.","authors":"Yifan Zhu, Ying Qing Chen","doi":"10.1007/s12561-020-09277-0","DOIUrl":"https://doi.org/10.1007/s12561-020-09277-0","url":null,"abstract":"<p><p>Since December 2019, a disease caused by a novel strain of coronavirus (COVID-19) had infected many people and the cumulative confirmed cases have reached almost 180,000 as of 17, March 2020. The COVID-19 outbreak was believed to have emerged from a seafood market in Wuhan, a metropolis city of more than 11 million population in Hubei province, China. We introduced a statistical disease transmission model using case symptom onset data to estimate the transmissibility of the early-phase outbreak in China, and provided sensitivity analyses with various assumptions of disease natural history of the COVID-19. We fitted the transmission model to several publicly available sources of the outbreak data until 11, February 2020, and estimated lock down intervention efficacy of Wuhan city. The estimated <math><msub><mi>R</mi> <mn>0</mn></msub> </math> was between 2.7 and 4.2 from plausible distribution assumptions of the incubation period and relative infectivity over the infectious period. 95% confidence interval of <math><msub><mi>R</mi> <mn>0</mn></msub> </math> were also reported. 
Potential issues such as data quality concerns and comparison of different modelling approaches were discussed.</p>","PeriodicalId":45094,"journal":{"name":"Statistics in Biosciences","volume":null,"pages":null},"PeriodicalIF":1.0,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1007/s12561-020-09277-0","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"37836121","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}