Statistical Applications in Genetics and Molecular Biology: Latest Articles


Spectral dynamic causal modelling of resting-state fMRI: an exploratory study relating effective brain connectivity in the default mode network to genetics.

IF 0.9, Zone 4 (Mathematics), Q4 BIOCHEMISTRY & MOLECULAR BIOLOGY

Statistical Applications in Genetics and Molecular Biology

Pub Date : 2020-08-31 DOI: 10.1515/sagmb-2019-0058

Yunlong Nie, Eugene Opoku, Laila Yasmin, Yin Song, Jie Wang, Sidi Wu, Vanessa Scarapicchia, Jodie Gawryluk, Liangliang Wang, Jiguo Cao, Farouk S Nathoo

We conduct an imaging genetics study to explore how effective brain connectivity in the default mode network (DMN) may be related to genetics within the context of Alzheimer's disease and mild cognitive impairment. We develop an analysis of longitudinal resting-state functional magnetic resonance imaging (rs-fMRI) and genetic data obtained from a sample of 111 subjects with a total of 319 rs-fMRI scans from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database. A Dynamic Causal Model (DCM) is fit to the rs-fMRI scans to estimate effective brain connectivity within the DMN, and the estimated connectivity is related to a set of single nucleotide polymorphisms (SNPs) contained in an empirical disease-constrained set, which is obtained out-of-sample from 663 ADNI subjects having only genome-wide data. We relate longitudinal effective brain connectivity estimated using spectral DCM to SNPs using both linear mixed effect (LME) models and function-on-scalar regression (FSR). In both cases we implement a parametric bootstrap for testing SNP coefficients and make comparisons with p-values obtained from asymptotic null distributions. In both networks, at an initial q-value threshold of 0.1, no effects are found. We report on exploratory patterns of associations with relatively high ranks that exhibit stability to the differing assumptions made by both FSR and LME.
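The parametric bootstrap mentioned in the abstract can be illustrated with a deliberately simplified sketch. It assumes an ordinary linear model with a single SNP covariate rather than the LME or FSR models of the paper, and the function names (`t_stat`, `parametric_bootstrap_pvalue`) are hypothetical. The idea is to simulate responses from the fitted null model, recompute the test statistic each time, and compare the observed statistic with that simulated null distribution.

```python
import numpy as np

def t_stat(y, x):
    """t statistic for the slope in y = b0 + b1 * x + noise (ordinary least squares)."""
    X = np.column_stack([np.ones_like(x, dtype=float), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - 2)          # residual variance
    cov = sigma2 * np.linalg.inv(X.T @ X)          # covariance of the estimates
    return beta[1] / np.sqrt(cov[1, 1])

def parametric_bootstrap_pvalue(y, x, n_boot=999, seed=0):
    """Two-sided p-value for the slope, using data simulated from the null model.

    Assumption for this sketch: the null model is intercept-only with Gaussian noise.
    """
    rng = np.random.default_rng(seed)
    t_obs = t_stat(y, x)
    mu0, sigma0 = y.mean(), y.std(ddof=1)          # fitted null model
    t_null = np.array([
        t_stat(rng.normal(mu0, sigma0, size=len(y)), x) for _ in range(n_boot)
    ])
    # Add-one correction keeps the p-value strictly positive.
    return (1 + np.sum(np.abs(t_null) >= abs(t_obs))) / (n_boot + 1)

# Toy data: genotype coded 0/1/2 and a connectivity-like response (simulated).
rng = np.random.default_rng(1)
genotype = rng.integers(0, 3, size=60)
response = 0.3 * genotype + rng.normal(0.0, 1.0, size=60)
p_boot = parametric_bootstrap_pvalue(response, genotype)
```

A bootstrap p-value of this kind can then be set against the p-value from the asymptotic t distribution, which is the comparison the abstract describes.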


Citations: 3

EBADIMEX: an empirical Bayes approach to detect joint differential expression and methylation and to classify samples.

IF 0.9, Zone 4 (Mathematics), Q4 BIOCHEMISTRY & MOLECULAR BIOLOGY

Statistical Applications in Genetics and Molecular Biology

Pub Date : 2019-11-16 DOI: 10.1515/sagmb-2018-0050

Tobias Madsen, Michał Świtnicki, Malene Juul, Jakob Skou Pedersen

DNA methylation and gene expression are interdependent and both implicated in cancer development and progression, with many individual biomarkers discovered. A joint analysis of the two data types can potentially lead to biological insights that are not discoverable with separate analyses. To optimally leverage the joint data for identifying perturbed genes and classifying clinical cancer samples, it is important to accurately model the interactions between the two data types. Here, we present EBADIMEX for jointly identifying differential expression and methylation and classifying samples. The moderated t-test widely used with empirical Bayes priors in current differential expression methods is generalised to a multivariate setting by developing: (1) a moderated Welch t-test for equality of means with unequal variances; (2) a moderated F-test for equality of variances; and (3) a multivariate test for equality of means with equal variances. This leads to parametric models with prior distributions for the parameters, which allow fast evaluation and robust analysis of small data sets. EBADIMEX is demonstrated on simulated data as well as a large breast cancer (BRCA) cohort from TCGA. We show that the use of empirical Bayes priors and moderated tests works particularly well on small data sets.
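As background for the moderated tests listed above, the following is a minimal sketch of the standard univariate moderated t-statistic with equal variances, which is the starting point that the abstract says is generalised. It is not the EBADIMEX implementation. The prior degrees of freedom and prior variance (`prior_df`, `prior_var`) are assumed to be given here; in practice they are estimated from all genes.

```python
import numpy as np

def moderated_t(group_a, group_b, prior_df, prior_var):
    """Moderated two-sample t-statistics, one per row (gene).

    group_a, group_b: arrays of shape (genes, samples).
    The per-gene pooled variance is shrunk towards prior_var with weight prior_df.
    """
    a = np.asarray(group_a, dtype=float)
    b = np.asarray(group_b, dtype=float)
    n_a, n_b = a.shape[1], b.shape[1]
    resid_df = n_a + n_b - 2
    pooled_var = (
        a.var(axis=1, ddof=1) * (n_a - 1) + b.var(axis=1, ddof=1) * (n_b - 1)
    ) / resid_df
    # Posterior (moderated) variance: weighted average of prior and sample variance.
    post_var = (prior_df * prior_var + resid_df * pooled_var) / (prior_df + resid_df)
    t = (a.mean(axis=1) - b.mean(axis=1)) / np.sqrt(post_var * (1 / n_a + 1 / n_b))
    return t, prior_df + resid_df   # statistic and its degrees of freedom

t_mod, df_mod = moderated_t([[1.0, 2.0, 3.0]], [[2.0, 3.0, 4.0]], prior_df=4, prior_var=3.0)
```

With `prior_df=0` the statistic reduces to the ordinary pooled-variance t-statistic, so the prior only matters when per-gene variance estimates are unreliable, which is the small-sample setting the abstract highlights.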


Citations: 0

Properties and Evaluation of the MOBIT - a novel Linkage-based Test Statistic and Quantification Method for Imprinting.

IF 0.9, Zone 4 (Mathematics), Q4 BIOCHEMISTRY & MOLECULAR BIOLOGY

Statistical Applications in Genetics and Molecular Biology

Pub Date : 2019-07-10 DOI: 10.1515/sagmb-2018-0025

Markus Brugger, Michael Knapp, Konstantin Strauch

Genomic imprinting is a parent-of-origin effect apparent in an appreciable number of human diseases. We have proposed the new imprinting test statistic MOBIT, which is based on MOD score analysis. We were interested in the properties of the MOBIT concerning its distribution under three hypotheses: (1) H0,a: no linkage, no imprinting; (2) H0,b: linkage, no imprinting; (3) H1: linkage and imprinting. More specifically, we assessed the confounding between imprinting and sex-specific recombination frequencies, which presents a major difficulty in linkage-based testing for imprinting, and evaluated the power of the test. To this end, we have performed a linkage simulation study of affected sib-pairs and a three-generation pedigree with two trait models, many two- and multipoint marker scenarios, three genetic map ratios, two sample sizes, and five imprinting degrees. We also investigated the ability of the MOBIT to quantify the degree of imprinting and applied the MOBIT using a real data example on house dust mite allergy. We further proposed and evaluated two approaches to obtain empiric p values for the MOBIT. Our results showed that two-point analyses assuming a sex-averaged marker map led to an inflated type I error due to confounding, especially for a larger marker-trait locus distance. When the correct sex-specific marker map was assumed, two-point analyses had reduced power to detect imprinting, compared to sex-averaged analyses with an appropriate correction for the inflation of the test statistic. However, confounding was not an issue in multipoint analysis unless the map ratio was extreme and marker spacing was sparse. With multipoint analysis, power as well as the ability to quantify the imprinting degree were almost equally high when a sex-averaged or the correct sex-specific map was used in the analysis. We recommend obtaining empiric p values for the MOBIT using genotype simulations based on the best-fitting nonimprinting model of the real dataset analysis. In addition, an implementation of a method based on the permutation of parental sexes is also available. In summary, we propose to perform multipoint analyses using densely spaced markers to efficiently discover new imprinted loci and to reliably quantify the degree of imprinting.
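The simulation-based p values recommended above follow a generic recipe that can be sketched in a few lines. This is not the MOBIT software: the function name is hypothetical, the simulated statistics are assumed to come from replicates generated under the null (for example the best-fitting nonimprinting model), and the add-one correction is a common convention rather than something stated in the abstract.

```python
import numpy as np

def empirical_p_value(observed, simulated):
    """One-sided empirical p-value: share of null replicates at least as large
    as the observed statistic, with an add-one correction."""
    simulated = np.asarray(simulated, dtype=float)
    return (1 + np.sum(simulated >= observed)) / (1 + simulated.size)

# Four hypothetical null replicates and an observed statistic of 3.0.
p_emp = empirical_p_value(3.0, [1.0, 2.0, 3.0, 4.0])   # (1 + 2) / (1 + 4) = 0.6
```

The smallest attainable value is 1 / (1 + number of replicates), so the number of simulations bounds how small a p value can be reported.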


Citations: 0

Assessing genome-wide significance for the detection of differentially methylated regions.

IF 0.9, Zone 4 (Mathematics), Q4 BIOCHEMISTRY & MOLECULAR BIOLOGY

Statistical Applications in Genetics and Molecular Biology

Pub Date : 2018-09-19 DOI: 10.1515/sagmb-2017-0050

Christian M Page, Linda Vos, Trine B Rounge, Hanne F Harbo, Bettina K Andreassen

DNA methylation plays an important role in human health and disease, and methods for the identification of differentially methylated regions are of increasing interest. There is currently a lack of statistical methods that properly address multiple testing, i.e. that control genome-wide significance for differentially methylated regions. We introduce a scan statistic (DMRScan), which overcomes these limitations. We benchmark DMRScan against two well-established methods (bumphunter, DMRcate), using a simulation study based on real methylation data. An implementation of DMRScan is available from Bioconductor. Our method has higher power than alternative methods across different simulation scenarios, particularly for small effect sizes. DMRScan exhibits greater flexibility in statistical modeling and can be used with more complex designs than current methods. DMRScan is the first dynamic approach which properly addresses the multiple-testing challenges for the identification of differentially methylated regions. DMRScan outperformed alternative methods in terms of power, while keeping the false discovery rate controlled.
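The general idea of a sliding-window scan statistic can be sketched as follows. This is an illustration under simplifying assumptions, not the DMRScan algorithm: it takes per-site z-scores ordered by genomic position, uses a fixed window width, and ignores both the correlation between neighbouring sites and the calculation of a genome-wide significance threshold, which is the actual contribution of the paper. The function names are hypothetical.

```python
import numpy as np

def sliding_window_scan(z, window):
    """Normalised window sums of per-site z-scores.

    Under independent standard-normal nulls, each value is again standard normal.
    """
    z = np.asarray(z, dtype=float)
    csum = np.concatenate([[0.0], np.cumsum(z)])
    sums = csum[window:] - csum[:-window]      # sum over each run of `window` sites
    return sums / np.sqrt(window)

def top_window(z, window):
    """Start index and statistic of the window with the largest absolute value."""
    stats = sliding_window_scan(z, window)
    start = int(np.argmax(np.abs(stats)))
    return start, stats[start]

# Seven sites with a block of elevated z-scores at positions 2 to 4.
start, stat = top_window([0.0, 0.0, 3.0, 3.0, 3.0, 0.0, 0.0], window=3)
```

Because overlapping windows share sites, the window statistics are strongly dependent, which is why a naive per-window correction is not appropriate and a dedicated threshold is needed.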


Citations: 4

A variable selection approach in the multivariate linear model: an application to LC-MS metabolomics data.

IF 0.9, Zone 4 (Mathematics), Q4 BIOCHEMISTRY & MOLECULAR BIOLOGY

Statistical Applications in Genetics and Molecular Biology

Pub Date : 2018-09-08 DOI: 10.1515/sagmb-2017-0077

Marie Perrot-Dockès, Céline Lévy-Leduc, Julien Chiquet, Laure Sansonnet, Margaux Brégère, Marie-Pierre Étienne, Stéphane Robin, Grégory Genta-Jouve

Omic data are characterized by the presence of strong dependence structures that result either from data acquisition or from some underlying biological processes. Applying statistical procedures that do not adjust the variable selection step to the dependence pattern may result in a loss of power and the selection of spurious variables. The goal of this paper is to propose a variable selection procedure within the multivariate linear model framework that accounts for the dependence between the multiple responses. We focus on a specific type of dependence, in which the responses of a given individual are assumed to be modelled as a time series. We propose a novel Lasso-based approach within the framework of the multivariate linear model that takes the dependence structure into account by using covariance structures of different types of stationary processes for the random error matrix. Our numerical experiments show that including the estimation of the covariance matrix of the random error matrix in the Lasso criterion dramatically improves the variable selection performance. Our approach is successfully applied to an untargeted LC-MS (Liquid Chromatography-Mass Spectrometry) data set made of African copal samples. Our methodology is implemented in the R package MultiVarSel, which is available from the Comprehensive R Archive Network (CRAN).
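One way to picture how a time-series covariance structure can enter the selection step is to whiten the responses before applying the Lasso. The sketch below shows only the whitening, under the assumption of an AR(1) error process with a known parameter `phi` and unit innovation variance. It is written in Python for illustration and is not the MultiVarSel code; the function names are hypothetical, and in practice the covariance has to be estimated.

```python
import numpy as np

def ar1_covariance(q, phi):
    """Covariance matrix of q consecutive values of a stationary AR(1) process
    with parameter phi and unit innovation variance."""
    idx = np.arange(q)
    return phi ** np.abs(idx[:, None] - idx[None, :]) / (1.0 - phi ** 2)

def whiten_rows(Y, sigma):
    """Transform each row of Y so that errors with covariance sigma become
    uncorrelated with unit variance.

    With sigma = L @ L.T (Cholesky), the result is Y @ inv(L).T.
    """
    L = np.linalg.cholesky(sigma)
    return np.linalg.solve(L, Y.T).T

sigma = ar1_covariance(5, phi=0.6)
W = whiten_rows(np.eye(5), sigma)     # whitening matrix itself, inv(L).T
```

After this transformation the rows of the error matrix have identity covariance, so a standard Lasso criterion can be applied to the whitened responses.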


Citations: 6

Biology challenging statistics.

IF 0.9, Zone 4 (Mathematics), Q4 BIOCHEMISTRY & MOLECULAR BIOLOGY

Statistical Applications in Genetics and Molecular Biology

Pub Date : 2018-08-30 DOI: 10.1515/sagmb-2018-0048

Michael P H Stumpf

Citations: 0

Editorial change at Statistical Applications in Genetics and Molecular Biology.

IF 0.9, Zone 4 (Mathematics), Q4 BIOCHEMISTRY & MOLECULAR BIOLOGY

Statistical Applications in Genetics and Molecular Biology

Pub Date : 2018-08-24 DOI: 10.1515/sagmb-2018-0046

Torsten Krüger

Citations: 2

A test for detecting differential indirect trans effects between two groups of samples.

IF 0.9, Zone 4 (Mathematics), Q4 BIOCHEMISTRY & MOLECULAR BIOLOGY

Statistical Applications in Genetics and Molecular Biology

Pub Date : 2018-07-31 DOI: 10.1515/sagmb-2017-0058

Nimisha Chaturvedi, Renée X de Menezes, Jelle J Goeman, Wessel van Wieringen

Integrative analysis of copy number and gene expression data can help in understanding the cis and trans effect of copy number aberrations on transcription levels of genes involved in a pathway. To analyse how these copy number mediated gene-gene interactions differ between groups of samples we propose a new method, named dNET. Our method uses ridge regression to model the network topology involving one gene's expression level, its gene dosage and the expression levels of other genes in the network. The interaction parameters are estimated by fitting the model per gene for all samples together. However, instead of testing for differential network topology per gene, dNET tests for an overall difference in estimated parameters between two groups of samples and produces a single p-value. With the help of several simulation studies, we show that dNET can detect differential network nodes with high accuracy and low rate of false positives even in the presence of differential cis effects. We also apply dNET to publicly available TCGA cancer datasets and identify pathways where copy number mediated gene-gene interactions differ between samples with cancer stage lower than stage 3 and samples with cancer stage 3 or above.
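The per-gene ridge regression described above can be sketched with the closed-form ridge estimator. This illustrates only the model-fitting step, not dNET or its group-difference test. The helper names are hypothetical, the penalty `lam` is assumed to be given, and the data are assumed to be centred so that no intercept is needed.

```python
import numpy as np

def design_for_gene(expr, copy_number, j):
    """Predictors for gene j: its own copy number plus the expression of all
    other genes in the pathway. Arrays have shape (samples, genes)."""
    others = np.delete(expr, j, axis=1)
    return np.column_stack([copy_number[:, j], others])

def ridge_fit(X, y, lam):
    """Closed-form ridge estimate: (X'X + lam * I)^(-1) X'y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

# Tiny check of the estimator: with X = I and lam = 1 the coefficients are y / 2.
beta = ridge_fit(np.eye(2), np.array([2.0, 4.0]), lam=1.0)
```

Fitting this model once per gene yields a matrix of interaction coefficients; comparing such coefficients between two groups of samples is the part that the paper addresses with a single overall test.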


Citations: 0

On the relation between the true and sample correlations under Bayesian modelling of gene expression datasets.

IF 0.9, Zone 4 (Mathematics), Q4 BIOCHEMISTRY & MOLECULAR BIOLOGY

Statistical Applications in Genetics and Molecular Biology

Pub Date : 2018-07-14 DOI: 10.1515/sagmb-2017-0068

Royi Jacobovic

The prediction of cancer prognosis and metastatic potential immediately after the initial diagnosis is a major challenge in current clinical research. The relevance of such a signature is clear, as it will free many patients from the agony and toxic side-effects associated with the adjuvant chemotherapy automatically and sometimes carelessly prescribed to them. Motivated by this issue, several previous works presented a Bayesian model which led to the following conclusion: thousands of samples are needed to generate a robust gene list for predicting outcome. This conclusion is based on the existence of some statistical assumptions, including asymptotic independence of sample correlations. The current work makes two main contributions: (1) It shows that while the assumptions of the Bayesian model discussed by previous papers seem to be non-restrictive, they are quite strong. To demonstrate this point, it is shown that some standard sparse and Gaussian models are not included in the set of models which are mathematically consistent with these assumptions. (2) It is shown that the empirical Bayes methodology which was applied in order to test the relevant assumptions does not detect severe violations, and consequently an overestimation of the required sample size might be incurred. Finally, we suggest that under some regularity conditions the current theoretical results may be used for the development of a new method to test the asymptotic independence assumption.


Citations: 0

Empirical Bayesian approach to testing multiple hypotheses with separate priors for left and right alternatives.

IF 0.9 Zone 4 (Mathematics) Q4 BIOCHEMISTRY & MOLECULAR BIOLOGY

Statistical Applications in Genetics and Molecular Biology

Pub Date : 2018-07-05 DOI: 10.1515/sagmb-2018-0002

Naveen K Bansal, Mehdi Maadooliat, Steven J Schrodi

We consider a multiple hypotheses problem with directional alternatives in a decision-theoretic framework. We obtain an empirical Bayes rule subject to a constraint on the mixed directional false discovery rate (mdFDR≤α) under a semiparametric setting in which the distribution of the test statistic is parametric but the prior distribution is nonparametric. We propose separate priors for the left-tail and right-tail alternatives, as may be required in many applications. The proposed Bayes rule is compared through simulation against rules proposed by Benjamini and Yekutieli and by Efron. We illustrate the proposed methodology on two data sets from biological experiments: HIV-transfected cell-line mRNA expression data, and a quantitative trait genome-wide SNP data set. We have developed a user-friendly web-based Shiny app for the proposed method, which is available at https://npseb.shinyapps.io/npseb/. The HIV and SNP data can be accessed directly in the app, and the results presented in this paper can be reproduced there.


Citations: 0
