Longitudinal functional data are increasingly common in the health domain. The motivating dataset for this paper comprises H-NMR spectra of kidney transplant patients [8]. Our aim is to cluster patients into clinical outcome subgroups that reflect the success of the transplantation. The NMR spectrum of each patient at each time point is a functional observation, and the spectra were collected longitudinally at up to nine time points. Methods exist for functional data collected at a single time point, but not for longitudinal functional data collected on a grid of time points with missing observations. We therefore first apply a method that extracts the same number of functional features for each subject. Next, we propose a novel nonparametric clustering method for multivariate functional data. We applied the proposed clustering method to the kidney transplant dataset, using both a subset of the raw data restricted to two time points and the extracted functional features. The proposed method achieved better clustering performance on the extracted functional features than on the subset of raw data. A simulation study was performed to further evaluate the method. Its design mimicked the kidney transplant dataset but with a larger sample size, and scenarios with different noise levels were considered. The simulation study demonstrated the accuracy of the proposed method.
Minzhen Xie, Haiyan Liu, Jeanine Houwing-Duistermaat. Nonparametric clustering for longitudinal functional data with the application to H-NMR spectra of kidney transplant patients. Longitudinal functional data clustering. Theoretical Biology Forum. Pub Date: 2021-01-01. DOI: 10.19272/202111401003
A. Fuady, S. el Bouhaddani, H. Uh, Jeanine Houwing-Duistermaat
Multiple technologies exist that measure the same omics data but are based on different aspects of the molecules. In practice, studies use different technologies and therefore have different biomarkers. An example is the glycan age index, a biomarker for biological age that is constructed from three IgG glycans measured by ultra-performance liquid chromatography (UPLC). A second technology is liquid chromatography-mass spectrometry (LCMS). To estimate the effect of a biomarker on an outcome variable, two issues need to be addressed. Firstly, a measurement error model is needed to map one technology to the other using a calibration study. Here, we consider two approaches: one based on the chemical properties of the two technologies and one based on estimating this relationship with O2PLS. Secondly, the use of an approximation of the biomarker in the main study needs to be taken into account by means of a regression calibration method. The performance of the two approaches is studied via simulations. The methods are used to estimate the relationship between glycan age and menopause using data from two cohorts, Korcula and Vis. In conclusion, (1) both measurement error models give similar results and suggest an association between the glycan age index and menopause status; (2) the chemical mapping approach outperforms O2PLS when the measurement error variance is low, while O2PLS works better when it is larger; (3) statistical efficiency is lost when adding irrelevant information increases the noise level.
Estimation of the effect of surrogate multi-omic biomarkers. Theoretical Biology Forum. Pub Date: 2021-01-01. DOI: 10.19272/202111402006
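The two-step logic described in the abstract above (map one technology to the other in a calibration study, then use the mapped biomarker in the main study) can be sketched in Python as follows. This is a minimal illustration under simplifying assumptions, not the authors' implementation: the mapping is a single straight line rather than a chemical mapping or O2PLS, the data are noise-free toy values invented for the example, and all names (`fit_line`, `regression_calibration`, `w_calib`, `x_calib`, `w_main`, `y_main`) are hypothetical.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def regression_calibration(w_calib, x_calib, w_main, y_main):
    """Two-step estimate of the effect of biomarker X on outcome Y.

    Step 1 (calibration study): learn the map from the surrogate
    measurement W to the target biomarker X.
    Step 2 (main study): replace the unobserved X by its prediction
    and regress the outcome on that prediction.
    """
    a, b = fit_line(w_calib, x_calib)
    x_hat = [a + b * w for w in w_main]
    return fit_line(x_hat, y_main)


# Invented, noise-free toy data (assumption): X = 2 + 0.5 * W, Y = 1 + 3 * X.
w_calib = [1.0, 2.0, 3.0, 4.0, 5.0]
x_calib = [2.0 + 0.5 * w for w in w_calib]
w_main = [0.0, 2.0, 4.0, 6.0, 8.0, 10.0]
y_main = [1.0 + 3.0 * (2.0 + 0.5 * w) for w in w_main]

intercept, slope = regression_calibration(w_calib, x_calib, w_main, y_main)
naive_intercept, naive_slope = fit_line(w_main, y_main)  # ignores the mapping
print(f"calibrated slope: {slope:.3f}, naive slope: {naive_slope:.3f}")
```

In this toy setting the naive regression of the outcome on the surrogate recovers a slope of about 1.5, whereas the calibrated regression recovers about 3.0, the slope used to generate the outcome. A real analysis would also need to propagate the uncertainty of the calibration step into the standard errors, which this sketch does not do.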
Sonia Dembowska, Alejandro F Frangi, Jeanine Houwing-Duistermaat, Haiyan Liu
The use of statistical methods to predict outcomes from high-dimensional datasets is becoming increasingly popular in medicine for forecasting and monitoring patient health. Our work is motivated by a longitudinal dataset containing 1H NMR spectra of metabolites from 18 patients undergoing a kidney transplant, together with their graft outcomes, which fall into one of three categories: acute rejection, delayed graft function and primary function. We propose a functional partial least squares (FPLS) model that extends existing PLS methods for longitudinally measured scalar omics datasets to longitudinally measured functional datasets. We designed an iterative algorithm to link multiple time points and applied the proposed method to the data from kidney transplant patients. Finally, we compared the AUC of our method with the AUC of univariate methods that use the information from only one time point. Our method outperformed the existing methods. A simulation study that mimics the kidney transplant dataset, but with a larger sample size and several scenarios, was performed to evaluate the performance of the new method in larger datasets. The scenarios vary in how difficult it is to distinguish the two groups. The three-time-point model performed better than any of the individual models, with average AUCs of 0.909 and 0.811, respectively.
Multivariate functional partial least squares for classification using longitudinal data. Theoretical Biology Forum. Pub Date: 2021-01-01. DOI: 10.19272/202111402007
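The comparison reported in the abstract above rests on the AUC of a two-group classifier. As a rough sketch of how such a comparison can be computed, the Python below evaluates the AUC as the proportion of (case, control) pairs ranked correctly, for each single time point and for a combined score. It does not implement FPLS: the combined score is a plain average over time points used as a stand-in, the scores are invented toy numbers, and the names `auc`, `scores_by_time` and `labels` are hypothetical.

```python
def auc(scores, labels):
    """AUC as the fraction of (case, control) pairs in which the case
    has the higher score; ties count as half a correct pair."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one subject in each group")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Invented toy scores (assumption): one list per time point, one value per subject.
labels = [0, 0, 0, 1, 1, 1]
scores_by_time = [
    [0.2, 0.6, 0.4, 0.5, 0.3, 0.9],  # time point 1
    [0.3, 0.2, 0.7, 0.6, 0.8, 0.4],  # time point 2
    [0.5, 0.3, 0.2, 0.4, 0.7, 0.6],  # time point 3
]

# AUC of each single-time-point score.
single_aucs = [auc(s, labels) for s in scores_by_time]

# Stand-in for a multi-time-point model: average each subject's scores.
combined = [sum(v) / len(v) for v in zip(*scores_by_time)]
combined_auc = auc(combined, labels)

print("single time points:", [round(a, 3) for a in single_aucs])
print("combined:", round(combined_auc, 3))
```

On these invented scores the single-time-point AUCs are 6/9, 7/9 and 8/9, while the averaged score separates the two groups perfectly. That is a property of the toy numbers only and says nothing about the results reported in the paper.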
I. Budimir, C. Sala, M. G. Bacalini, P. Garagnani, G. Castellani
DNA methylation studies usually focus on groups of CpG sites. Neighbouring CpG sites are analyzed together because of their group behaviour. However, this approach ignores possible interactions between more distant CpG sites. In this work, we investigate the complete methylation correlation structure of chromosome 21. Two data sets were used for the correlation analysis: a smaller data set with methylation measurements from Down syndrome patients and their family members, and a larger data set with healthy subjects. This allowed us to examine the general properties of the methylation correlation structure as well as its modifications in the presence of an extra copy of the chromosome. We observed that CpG sites work in small, highly correlated groups. While some groups coincided with CpG islands, others contained CpG sites scattered across the whole chromosome. Groups of highly correlated CpG sites were preserved in Down syndrome. Moreover, the methylome of a Down syndrome patient had newly formed correlations between CpG sites, suggesting that the methylation correlation structure is stronger in Down syndrome than in an unaffected individual.
DNA methylation correlation structure of chromosome 21 in Down syndrome. Theoretical Biology Forum. Pub Date: 2021-01-01. DOI: 10.19272/202111402008
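One simple way to look for groups of highly correlated CpG sites, including sites that are not neighbours, is to compute all pairwise correlations and join sites whose absolute correlation exceeds a cut-off. The Python sketch below does this with Pearson correlation and a union-find structure. It is an illustration under stated assumptions rather than the procedure used in the paper: the threshold of 0.9, the site labels and the methylation values are all invented, and the function names `pearson` and `correlated_groups` are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long lists of values."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def correlated_groups(beta, threshold=0.9):
    """Group sites linked by |correlation| >= threshold.

    beta maps a site label to its methylation values, one per subject.
    Groups are the connected components of the thresholded correlation graph.
    """
    sites = list(beta)
    parent = {s: s for s in sites}

    def find(s):
        # Follow parent links to the group representative (with path halving).
        while parent[s] != s:
            parent[s] = parent[parent[s]]
            s = parent[s]
        return s

    for a, b in combinations(sites, 2):
        if abs(pearson(beta[a], beta[b])) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for s in sites:
        groups.setdefault(find(s), []).append(s)
    return sorted(sorted(g) for g in groups.values())


# Invented toy methylation values (assumption): 4 sites, 5 subjects.
beta = {
    "site_1": [0.10, 0.20, 0.30, 0.40, 0.50],
    "site_2": [0.12, 0.22, 0.31, 0.42, 0.52],  # neighbour of site_1
    "site_3": [0.80, 0.20, 0.60, 0.10, 0.50],  # unrelated pattern
    "site_4": [0.55, 0.60, 0.65, 0.70, 0.75],  # distant, but tracks site_1
}
groups = correlated_groups(beta)
print(groups)
```

With these toy values, `site_1`, `site_2` and the distant `site_4` end up in one group and `site_3` stays on its own. For a whole chromosome the number of site pairs grows quadratically, so a practical analysis would need a more memory-efficient correlation computation than this pairwise loop.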
Pub Date: 2019-01-01 DOI: 10.1007/978-981-15-1731-0_5
A. Kamimura, T. Ohira
Potential Applications and Challenges. Theoretical Biology Forum.
Pub Date: 2019-01-01 DOI: 10.1007/978-981-15-1731-0_2
A. Kamimura, T. Ohira
Chases and Escapes. Theoretical Biology Forum.
A. Munawar, A. Akrem, A. Hussain, P. Spencer, C. Betzel
Snake venom contains a myriad of biologically active proteins and peptides. Three-finger toxins are highly conserved in their molecular structure but, interestingly, possess diverse biological functions. During the course of evolution, the introduction of subtle mutations in loop regions and slight variations in the three-dimensional structure have resulted in their functional versatility. Cytotoxin-1 (UniProt ID: P01467), isolated from Naja mossambica mossambica, showed the potential to inhibit chymotrypsin and the chymotryptic activity of the 20S proteasome. In the present work we describe a molecular model of cytotoxin-1 in complex with chymotrypsin, prepared with the online server ClusPro. Analysis of the molecular model shows that cytotoxin-1 (P01467) binds to chymotrypsin through its loop I, located near the N-terminus. The concave side of loop I of the toxin fits well in the substrate-binding pocket of the protease. We propose Phe10 as the dedicated P1 site of the ligand. Being a potent inhibitor of the 20S proteasome, cytotoxin-1 (P01467) can serve as a potential antitumor agent. Snake venom cytotoxins have already been investigated as anticancer agents. The molecular model of cytotoxin-1 in complex with chymotrypsin provides important information towards understanding the complex formation.
Molecular model of cytotoxin-1 from Naja mossambica mossambica venom in complex with chymotrypsin. Theoretical Biology Forum. Pub Date: 2015-01-01. DOI: 10.1400/240197
A. Bazzani. Dynamical systems on graphs: from random walks to transportation networks. Theoretical Biology Forum. Pub Date: 2015-01-01. DOI: 10.1400/240192