{"title":"Private pathological assessment via machine learning and homomorphic encryption","authors":"Ahmad Al Badawi, Mohd Faizal Bin Yusof","doi":"10.1186/s13040-024-00379-9","DOIUrl":null,"url":null,"abstract":"The objective of this research is to explore the applicability of machine learning and fully homomorphic encryption (FHE) in the private pathological assessment, with a focus on the inference phase of support vector machines (SVM) for the classification of confidential medical data. A framework is introduced that utilizes the Cheon-Kim-Kim-Song (CKKS) FHE scheme, facilitating the execution of SVM inference on encrypted datasets. This framework ensures the privacy of patient data and negates the necessity of decryption during the analytical process. Additionally, an efficient feature extraction technique is presented for the transformation of medical imagery into vectorial representations. The system’s evaluation across various datasets substantiates its practicality and efficacy. The proposed method delivers classification accuracy and performance on par with traditional, non-encrypted SVM inference, while upholding a 128-bit security level against established cryptographic attacks targeting the CKKS scheme. The secure inference process is executed within a temporal span of mere seconds. The findings of this study underscore the viability of FHE in enhancing the security and efficiency of bioinformatics analyses, potentially benefiting fields such as cardiology, oncology, and medical imagery. The implications of this research are significant for the future of privacy-preserving machine learning, promoting progress in diagnostic procedures, tailored medical treatments, and clinical investigations.","PeriodicalId":48947,"journal":{"name":"Biodata Mining","volume":null,"pages":null},"PeriodicalIF":4.0000,"publicationDate":"2024-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Biodata Mining","FirstCategoryId":"99","ListUrlMain":"https://doi.org/10.1186/s13040-024-00379-9","RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"MATHEMATICAL & COMPUTATIONAL BIOLOGY","Score":null,"Total":0}
引用次数: 0
Abstract
The objective of this research is to explore the applicability of machine learning and fully homomorphic encryption (FHE) in the private pathological assessment, with a focus on the inference phase of support vector machines (SVM) for the classification of confidential medical data. A framework is introduced that utilizes the Cheon-Kim-Kim-Song (CKKS) FHE scheme, facilitating the execution of SVM inference on encrypted datasets. This framework ensures the privacy of patient data and negates the necessity of decryption during the analytical process. Additionally, an efficient feature extraction technique is presented for the transformation of medical imagery into vectorial representations. The system’s evaluation across various datasets substantiates its practicality and efficacy. The proposed method delivers classification accuracy and performance on par with traditional, non-encrypted SVM inference, while upholding a 128-bit security level against established cryptographic attacks targeting the CKKS scheme. The secure inference process is executed within a temporal span of mere seconds. The findings of this study underscore the viability of FHE in enhancing the security and efficiency of bioinformatics analyses, potentially benefiting fields such as cardiology, oncology, and medical imagery. The implications of this research are significant for the future of privacy-preserving machine learning, promoting progress in diagnostic procedures, tailored medical treatments, and clinical investigations.
期刊介绍:
BioData Mining is an open access, open peer-reviewed journal encompassing research on all aspects of data mining applied to high-dimensional biological and biomedical data, focusing on computational aspects of knowledge discovery from large-scale genetic, transcriptomic, genomic, proteomic, and metabolomic data.
Topical areas include, but are not limited to:
-Development, evaluation, and application of novel data mining and machine learning algorithms.
-Adaptation, evaluation, and application of traditional data mining and machine learning algorithms.
-Open-source software for the application of data mining and machine learning algorithms.
-Design, development and integration of databases, software and web services for the storage, management, retrieval, and analysis of data from large scale studies.
-Pre-processing, post-processing, modeling, and interpretation of data mining and machine learning results for biological interpretation and knowledge discovery.