Double-stage discretization approaches for biomarker-based bladder cancer survival modeling
M. Nascimben, M. Venturin, L. Rimondini
Communications in Applied and Industrial Mathematics, vol. 12, no. 1, pp. 29-47 (2021). DOI: 10.2478/caim-2021-0003
Abstract: Bioinformatic techniques targeting gene expression data require specific analysis pipelines for studying properties, adaptation, and disease outcomes in a sample population. The present investigation compared the results of four numerical experiments modeling survival rates from bladder cancer genetic profiles. The research showed that a sequence of two discretization phases produced remarkable results compared with a classic approach that discretizes the gene expression data once. The analyses with two discretization phases either followed a primary discretizer with a refinement step or pre-binned the input values before the main discretization scheme. Among all tests, the best model combines a data transformation to compensate for skewness, a discretization phase using the class-attribute interdependence maximization algorithm, and a final classification by voting feature intervals, a classifier that also optimizes the discrete intervals.
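The pre-binning variant described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which skewness transformation or which pre-binning scheme was used, so `log1p` and equal-frequency quantile bins are assumptions, and the function names (`caim_score`, `caim_cuts`, `double_stage_discretize`) and the toy data are hypothetical. The second stage is a greedy search on the class-attribute interdependence maximization criterion, restricted to the cut points proposed by the first stage. The final voting-feature-intervals classifier is not included.

```python
import math
from bisect import bisect_right
from statistics import quantiles


def caim_score(x, y, cuts):
    """CAIM criterion: mean over intervals of (majority-class count)^2 / interval size."""
    n_intervals = len(cuts) + 1
    counts = [{} for _ in range(n_intervals)]  # per-interval class counts
    for value, label in zip(x, y):
        r = bisect_right(cuts, value)  # index of the interval that holds value
        counts[r][label] = counts[r].get(label, 0) + 1
    total = 0.0
    for by_class in counts:
        if by_class:  # an empty interval adds nothing but still counts in the mean
            total += max(by_class.values()) ** 2 / sum(by_class.values())
    return total / n_intervals


def caim_cuts(x, y, candidates):
    """Greedy CAIM search restricted to the given candidate cut points."""
    n_classes = len(set(y))
    remaining = sorted(set(candidates))
    cuts, best = [], 0.0
    while remaining:
        score, cut = max((caim_score(x, y, sorted(cuts + [c])), c) for c in remaining)
        # Accept the best candidate if it improves the criterion, or while
        # there are still fewer intervals than classes.
        if score > best or len(cuts) + 1 < n_classes:
            cuts.append(cut)
            remaining.remove(cut)
            best = score
        else:
            break
    return sorted(cuts)


def double_stage_discretize(expression, y, n_prebins=4):
    """Return (cut points, bin index per sample) for one gene."""
    # Assumed skewness correction: log(1 + value).
    x = [math.log1p(v) for v in expression]
    # Stage 1 (assumed): equal-frequency pre-binning proposes candidate cuts.
    candidates = quantiles(x, n=n_prebins)
    # Stage 2: supervised CAIM discretization over those candidates only.
    cuts = caim_cuts(x, y, candidates)
    bins = [bisect_right(cuts, v) for v in x]
    return cuts, bins


# Toy data (made up): low expression -> class 0, high expression -> class 1.
expression = [1, 2, 3, 4, 10, 11, 12, 13]
outcome = [0, 0, 0, 0, 1, 1, 1, 1]
cuts, bins = double_stage_discretize(expression, outcome)
print(cuts, bins)
```

Restricting the supervised search to pre-binned boundaries shrinks the candidate set from every midpoint between distinct values to `n_prebins - 1` points per gene, which is one plausible motivation for pre-binning. Whether it helps accuracy is an empirical question that this sketch does not answer.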
About the journal:
Communications in Applied and Industrial Mathematics (CAIM) is one of the official journals of the Italian Society for Applied and Industrial Mathematics (SIMAI). Providing immediate open access to original, unpublished, high-quality contributions, CAIM is devoted to the timely reporting of ongoing original research work, new interdisciplinary subjects, and new developments. The journal focuses on the applications of mathematics to the solution of problems in industry, technology, environment, cultural heritage, and natural sciences, with a special emphasis on new and interesting mathematical ideas relevant to these fields of application. Encouraging novel cross-disciplinary approaches to mathematical research, CAIM aims to provide an ideal platform for scientists who cooperate in different fields, including pure and applied mathematics, computer science, engineering, physics, chemistry, biology, and medicine, and to link scientists with professionals active in industry, research centres, academia, or the public sector. Coverage includes research articles describing new analytical or numerical methods, descriptions of modelling approaches, simulations for more accurate predictions or experimental observations of complex phenomena, and verification/validation of numerical and experimental methods; invited or submitted reviews and perspectives concerning mathematical techniques in relation to applications; and fields in which new problems have arisen for which mathematical models and techniques are not yet available.