Interrelated feature selection from health surveys using domain knowledge graph
Markian Jaworsky, Xiaohui Tao, Lei Pan, Shiva Raj Pokhrel, Jianming Yong, Ji Zhang
Health Information Science and Systems, 11(1):54. Journal Article. Published 2023-11-16 (eCollection 2023-12-01).
DOI: https://doi.org/10.1007/s13755-023-00254-7
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10654272/pdf/
Impact factor: 4.7. JCR: Q1 (Medical Informatics).
Citations: 0
Abstract
Finding patterns among risk factors and chronic illnesses can suggest shared causes, provide guidance for healthier lifestyles, and give clues to possible treatments for outliers. Prior studies have typically addressed data challenges in isolation, using single-disease datasets. However, features that predict the likelihood of many diseases are more helpful in establishing a healthy lifestyle than features tied to one disease, and, from the patient's point of view, they help keep health advice generalized and contemporary. We construct and present a novel knowledge-based qualitative method to remove redundant features from a dataset and redefine the outliers. The results of our trials on five annual chronic disease health surveys demonstrate that our Knowledge Graph-based feature selection, when applied to many machine learning and deep learning multi-label classifiers, can improve classification performance. Our methodology is compatible with future directions, such as graph neural networks. It provides clinicians with an efficient process to select the most relevant health survey questions and responses regarding one or many human organ systems.
About the journal:
Health Information Science and Systems is a multidisciplinary journal that integrates artificial intelligence, computer science, and information technology with health science and services. It embraces information science research coupled with topics related to the modeling, design, development, integration, and management of health information systems, smart health, artificial intelligence in medicine, computer-aided diagnosis, and medical expert systems. The scope includes: (i) smart health, artificial intelligence in medicine, computer-aided diagnosis, medical image processing, and medical expert systems; (ii) medical big data and medical, health, and biomedical information resources such as patient medical records, devices and equipment, and software and tools to capture, store, retrieve, process, analyze, and optimize the use of information in the health domain; (iii) data management, data mining, and knowledge discovery, all of which play a key role in decision making, management of public health, examination of standards, and privacy and security issues; (iv) development of new architectures and applications for health information systems.