Independent Component Analysis and Statistical Modelling for the Identification of Metabolomics Biomarkers in 1H-NMR Spectroscopy
Baptiste Féraud, Réjane Rousseau, P. Tullio, M. Verleysen, B. Govaerts
Journal of Biometrics & Biostatistics, published 2017-08-23. DOI: 10.4172/2155-6180.1000367
Citation count: 1
Abstract
To maintain life, living organisms produce and transform small molecules called metabolites. Metabolomics studies, through these metabolites, the biological reactions that develop in response to a physio-pathological stimulus. 1H-NMR spectroscopy is widely used to describe the metabolite composition of a sample graphically, as a spectrum. Biologists can then confirm or rule out the development of a biological reaction according to whether specific NMR spectral regions change from one physiological situation to another. However, this process requires a preliminary identification step, which traditionally consists of examining the first two components of a Principal Component Analysis (PCA). This paper presents a new two-step methodology that provides knowledge about specific 1H-NMR spectral regions through the identification of biomarkers and the visualization of the effects caused by external changes. The first step uses Independent Component Analysis (ICA) to decompose the spectral data into statistically independent components, or sources of information. The independent (pure or composite) metabolites contained in biofluids are recovered through the sources, and their quantities through the mixing weights. Specific questions related to ICA, such as the choice of the number of components and their ordering, are discussed. The second step consists of statistical modelling of the ICA mixing weights and introduces statistical hypothesis tests on the parameters of the estimated models, with the objective of selecting the sources that contain biomarkers (that is, significantly fluctuating spectral regions). Statistical models are used here because they adapt to many kinds of data and contexts. A computation of contrasts is also proposed, which allows the changes in the spectra caused by changes in the factor of interest to be visualized. The methodology is innovative because it combines multi-factor studies (through mixed models) with statistical confirmation of the factor effects.
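The two steps described in the abstract can be sketched on simulated data as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the simulated spectra, the small symmetric FastICA routine (`fast_ica`, `sym_decorrelate`), the fixed number of components `q=2`, the ordinary linear model with a single two-level factor, and the hard-coded critical value `T_CRIT` are all hypothetical choices made here. The mixed models and the component selection and ordering rules discussed in the paper are not reproduced.

```python
import numpy as np


def sym_decorrelate(W):
    """Return (W W^T)^(-1/2) W, which makes the rows of W orthonormal."""
    s, u = np.linalg.eigh(W @ W.T)
    return (u / np.sqrt(s)) @ u.T @ W


def fast_ica(X, q, n_iter=500, tol=1e-8, seed=0):
    """Symmetric FastICA (tanh nonlinearity) on X of shape (n_spectra, n_points).

    Returns sources S (q, n_points) and mixing weights A (n_spectra, q)
    such that the centred spectra are approximately A @ S.
    """
    m = X.shape[1]
    Xc = X - X.mean(axis=1, keepdims=True)        # centre each spectrum
    U, d, _ = np.linalg.svd(Xc, full_matrices=False)
    K = (U[:, :q] / d[:q]).T * np.sqrt(m)         # PCA whitening matrix, (q, n_spectra)
    Z = K @ Xc                                    # whitened data, Z Z^T / m = I
    W = sym_decorrelate(np.random.default_rng(seed).standard_normal((q, q)))
    for _ in range(n_iter):
        G = np.tanh(W @ Z)
        # Fixed-point update: E[z g(w'z)] - E[g'(w'z)] w, one row per component
        W_new = (G @ Z.T) / m - (1.0 - G ** 2).mean(axis=1)[:, None] * W
        W_new = sym_decorrelate(W_new)
        converged = np.max(np.abs(np.abs(np.sum(W_new * W, axis=1)) - 1.0)) < tol
        W = W_new
        if converged:
            break
    S = W @ Z                                     # independent sources
    A = np.linalg.pinv(W @ K)                     # mixing weights
    return S, A


# Simulated data (assumption): two peak-shaped sources; only the weight of
# the first one depends on the two-level factor of interest.
rng = np.random.default_rng(1)
n_per_group, m = 10, 400
axis = np.arange(m)                               # stand-in for the ppm axis
group = np.repeat([0, 1], n_per_group)


def peak(center):
    return np.exp(-0.5 * ((axis - center) / 5.0) ** 2)


a_marker = 1.0 + 1.0 * group + 0.1 * rng.standard_normal(group.size)
a_other = 1.0 + 0.3 * rng.standard_normal(group.size)
X = (np.outer(a_marker, peak(100)) + np.outer(a_other, peak(300))
     + 0.01 * rng.standard_normal((group.size, m)))

# Step 1: decompose the spectra into sources and mixing weights.
S, A = fast_ica(X, q=2)

# Step 2: fit weight = b0 + b1 * group for each source and test b1 = 0.
D = np.column_stack([np.ones(group.size), group])
beta, *_ = np.linalg.lstsq(D, A, rcond=None)      # shape (2, q)
resid = A - D @ beta
sigma2 = (resid ** 2).sum(axis=0) / (group.size - 2)
se = np.sqrt(sigma2 * np.linalg.inv(D.T @ D)[1, 1])
t_stat = beta[1] / se

# Two-sided 1% critical value of Student's t with 18 degrees of freedom.
T_CRIT = 2.878
selected = np.flatnonzero(np.abs(t_stat) > T_CRIT)

# Contrast: estimated change in the spectrum between the two factor levels,
# rebuilt from the selected sources only.
contrast = beta[1, selected] @ S[selected]

print("selected sources:", selected)
print("largest change at axis index:", int(np.argmax(np.abs(contrast))))
```

In practice a library implementation of ICA and a dedicated mixed-model routine would replace the hand-written pieces. The sketch only shows how the sources locate spectral regions, how the mixing weights carry the factor effect, and how a contrast maps that effect back onto the spectral axis.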