
Latest publications in Machine Learning: Science and Technology

Semi-Supervised Segmentation of Abdominal Organs and Liver Tumor: Uncertainty Rectified Curriculum Labeling Meets X-Fuse
Pub Date : 2024-05-15 DOI: 10.1088/2632-2153/ad4c38
Pengju Lyu, Wenjian Liu, Tingyi Lin, Jie Zhang, Yao Liu, Cheng Wang, Jianjun Zhu
Precise segmentation of liver tumors and associated organs holds immense value for surgical and radiological intervention, enabling anatomical localization for pre-operative planning and intra-operative guidance. Modern deep learning models for medical image segmentation have evolved from convolutional neural networks to transformer architectures, significantly boosting global context understanding. However, accurate delineation, especially of hepatic lesions, remains an enduring challenge because models focus predominantly on spatial feature extraction and fail to adequately characterize complex medical anatomies. Moreover, the relative paucity of expertly annotated medical imaging data restricts model exposure to diverse pathological presentations. In this paper, we present a three-phase cascaded segmentation framework featuring an X-Fuse model that synergistically integrates complementary spatial- and frequency-domain information in dual encoders to enrich latent feature representation. To enhance model generalizability, building upon the X-Fuse topology and taking advantage of additional unlabeled pathological data, we integrate curriculum pseudo-labeling with Jensen-Shannon variance-based uncertainty rectification to promote optimized pseudo-supervision in the context of semi-supervised learning. We further introduce a tumor-focused augmentation technique, including training-free copy-paste and knowledge-based synthesis, that is effective despite its simplicity and substantially improves model adaptability to diverse lesional morphologies. Extensive experiments and modular evaluations on a holdout test set demonstrate that our methods significantly outperform existing state-of-the-art segmentation models in both supervised and semi-supervised settings, as measured by the Dice similarity coefficient, achieving superior delineation of bones (95.42%), liver (96.26%), and liver tumors (89.53%), a 16.41% increase over V-Net in the supervised-only, augmentation-free scenario. Our method marks a significant step toward more reliable and robust AI-assisted diagnostic tools for liver tumor intervention. We have made the code publicly available.
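As an illustration of the uncertainty-rectification idea described in the abstract, the following sketch masks pseudo-labels by a per-voxel Jensen-Shannon divergence between two model predictions. The threshold value, tensor shapes, and function names are illustrative assumptions, not the authors' released code.

```python
import torch
import torch.nn.functional as F

def js_divergence(p, q, eps=1e-8):
    """Per-voxel Jensen-Shannon divergence between two softmax maps of
    shape (B, C, H, W); returns a (B, H, W) uncertainty map."""
    m = 0.5 * (p + q)
    kl_pm = (p * ((p + eps) / (m + eps)).log()).sum(dim=1)
    kl_qm = (q * ((q + eps) / (m + eps)).log()).sum(dim=1)
    return 0.5 * (kl_pm + kl_qm)

def rectified_pseudo_label_loss(logits_a, logits_b, threshold=0.1):
    """Pseudo-supervise prediction A with prediction B, but only on voxels
    whose JS-based uncertainty falls below a (tunable) threshold."""
    prob_a, prob_b = F.softmax(logits_a, dim=1), F.softmax(logits_b, dim=1)
    uncertainty = js_divergence(prob_a, prob_b)            # (B, H, W)
    pseudo = prob_b.argmax(dim=1)                          # hard pseudo-labels
    mask = (uncertainty < threshold).float()               # keep confident voxels only
    loss = F.cross_entropy(logits_a, pseudo, reduction="none")
    return (loss * mask).sum() / mask.sum().clamp(min=1.0)
```

In a curriculum-style schedule, the threshold would typically be relaxed over training so that more voxels contribute pseudo-supervision as the model matures.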
Citations: 0
Physics-inspired spatiotemporal-graph AI ensemble for the detection of higher order wave mode signals of spinning binary black hole mergers
Pub Date : 2024-05-15 DOI: 10.1088/2632-2153/ad4c37
Minyang Tian, Eliu Huerta, Huihuo Zheng, Prayush Kumar
We present a new class of AI models for the detection of quasi-circular, spinning, non-precessing binary black hole mergers whose waveforms include the higher order gravitational wave modes $(\ell, |m|) = \{(2, 2), (2, 1), (3, 3), (3, 2), (4, 4)\}$, and mode mixing effects in the $(\ell = 3, |m| = 2)$ harmonics. These AI models combine hybrid dilated convolutional neural networks, which accurately model both short- and long-range temporal sequential information of gravitational waves, with graph neural networks that capture spatial correlations among gravitational wave observatories, in order to consistently describe and identify the presence of a signal in a three-detector network encompassing the Advanced LIGO and Virgo detectors. We first trained these spatiotemporal-graph AI models on synthetic noise, densely sampling the signal manifold with 1.2 million modeled waveforms, within 1.7 hours on 256 NVIDIA A100 GPUs in the Polaris supercomputer at the Argonne Leadership Computing Facility. This distributed training approach exhibited optimal classification performance and strong scaling up to 512 NVIDIA A100 GPUs. With these AI ensembles we processed data from a three-detector network, and found that an ensemble of 4 AI models achieves state-of-the-art performance for signal detection, reporting two misclassifications for every decade of searched data. We distributed AI inference over 128 GPUs in the Polaris supercomputer and 128 nodes in the Theta supercomputer, and completed the processing of a decade of gravitational wave data from a three-detector network within 3.5 hours. Finally, we fine-tuned these AI ensembles to process the entire month of February 2020, which is part of the O3b LIGO/Virgo observing run, and found 6 gravitational waves, concurrently identified in Advanced LIGO and Advanced Virgo data, with zero false positives. This analysis was completed in one hour using one NVIDIA A100 GPU.
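A minimal sketch of how per-detector dilated 1D convolutions and a simple graph-style pooling across the three observatories could be combined, assuming PyTorch. The layer sizes, the two-class head, and the mean-pooling aggregation are placeholders rather than the architecture used in the paper.

```python
import torch
import torch.nn as nn

class DilatedEncoder(nn.Module):
    """Stack of 1D convolutions with increasing dilation, capturing both
    short- and long-range structure in a single detector's strain series."""
    def __init__(self, channels=16, dilations=(1, 2, 4, 8)):
        super().__init__()
        layers, in_ch = [], 1
        for d in dilations:
            layers += [nn.Conv1d(in_ch, channels, kernel_size=3,
                                 dilation=d, padding=d), nn.ReLU()]
            in_ch = channels
        self.net = nn.Sequential(*layers)

    def forward(self, x):                    # x: (batch, 1, time)
        return self.net(x).mean(dim=-1)      # (batch, channels)

class DetectorGraphClassifier(nn.Module):
    """Shares one encoder over three detectors (e.g. H1, L1, V1) and pools
    their embeddings as a fully connected graph before classifying."""
    def __init__(self, channels=16):
        super().__init__()
        self.encoder = DilatedEncoder(channels)
        self.head = nn.Linear(channels, 2)   # signal vs. noise

    def forward(self, strains):              # strains: (batch, 3, time)
        nodes = torch.stack([self.encoder(strains[:, i:i + 1])
                             for i in range(strains.shape[1])], dim=1)
        return self.head(nodes.mean(dim=1))  # simple graph mean-pooling
```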
Citations: 0
Interpolation of Environmental Data Using Deep Learning and Model Inference
Pub Date : 2024-05-14 DOI: 10.1088/2632-2153/ad4b94
C. Ibebuchi, Itohan-Osa Abu
The temporal resolution of environmental data sets plays a major role in the granularity of the information that can be derived from the data. In most cases, different data sets are required to have a common temporal resolution to enable consistent evaluation and application in informed decision-making. This study leverages deep learning with long short-term memory (LSTM) neural networks and model inference to enhance the temporal resolution of climate datasets, specifically temperature and precipitation, from daily to sub-daily scales. We trained our model to learn the relationship between daily and sub-daily data, subsequently applying this knowledge to increase the resolution of a separate dataset with a coarser (daily) temporal resolution. Our findings reveal a high degree of accuracy for temperature predictions, evidenced by a correlation of 0.99 and a mean absolute error of 0.21 °C between the actual and predicted sub-daily values. In contrast, the approach was less effective for precipitation, achieving an explained variance of only 37%, compared to 98% for temperature. Beyond the sub-daily interpolation of the climate data sets, we adapted our approach to increase the temporal resolution of the Landsat normalized difference vegetation index (NDVI) from a 16-day to a 5-day interval, using an LSTM model pre-trained on the Sentinel-2 NDVI, which exists at a relatively higher temporal resolution. The explained variance between the predicted Landsat and Sentinel-1 data is 70%, with a mean absolute error of 0.03. These results suggest that our method is particularly suitable for environmental datasets with less pronounced short-term variability, offering a promising tool for improving the resolution and utility of the data.
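A minimal sketch of the daily-to-sub-daily mapping with an LSTM, assuming PyTorch. The 7-day input window and the four (6-hourly) outputs per day are illustrative assumptions, not the configuration reported in the paper.

```python
import torch
import torch.nn as nn

class TemporalDownscaler(nn.Module):
    """Maps a window of daily values to the sub-daily values of its final
    day (here 4 values per day, i.e. 6-hourly), using an LSTM."""
    def __init__(self, hidden=64, steps_per_day=4):
        super().__init__()
        self.lstm = nn.LSTM(input_size=1, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, steps_per_day)

    def forward(self, daily):                # daily: (batch, window, 1)
        out, _ = self.lstm(daily)
        return self.head(out[:, -1])         # (batch, steps_per_day)

model = TemporalDownscaler()
daily = torch.randn(32, 7, 1)                # toy batch of 7-day windows
sub_daily = model(daily)                     # predicted 6-hourly values
loss = nn.functional.mse_loss(sub_daily, torch.randn(32, 4))
```

Once trained on a site with both daily and sub-daily records, the same model can be run in inference mode on a coarser (daily-only) dataset, which is the transfer step described in the abstract.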
Citations: 0
Towards XAI agnostic explainability to assess differential diagnosis for Meningitis diseases
Pub Date : 2024-05-10 DOI: 10.1088/2632-2153/ad4a1f
Aya Messai, Ahlem Drif, A. Ouyahia, Meriem Guechi, Mounira Rais, Lars Kaderali, Hocine Cherifi
Meningitis, characterized by inflammation of the meninges and cerebrospinal fluid (CSF), poses diagnostic challenges due to its diverse clinical manifestations. This work introduces an explainable AI methodology for automatic medical decision-making that determines critical features, and their relevant values, for the differential diagnosis of various meningitis cases. We proceed with knowledge acquisition to define the rules for this research. Currently, we have established the etiological diagnosis of Meningococcaemia, Meningococcal Meningitis, Tuberculous Meningitis, Aseptic Meningitis, Haemophilus influenzae Meningitis, and Pneumococcal Meningitis. Data preprocessing was conducted after collecting samples from meningitis patients at Setif Hospital in Algeria. Tree-based ensemble methods were then applied to assess the model's performance. Finally, we implement a model-agnostic XAI approach based on the SHapley Additive exPlanations (SHAP) technique to attribute each feature's contribution to the model's output. Experiments were conducted on the collected dataset and on the SINAN database, obtained from the Brazilian Government's Health Information System on Notifiable Diseases, which comprises 6729 patients aged over 18 years. The Extreme Gradient Boosting model was chosen for its superior performance metrics (Accuracy: 0.90, AUROC: 0.94, and F1-score: 0.98). The Setif hospital data also yielded notable performance metrics (Accuracy: 0.7143, F1-score: 0.7857). The study's findings showcase each feature's contribution to the model's predictions and diagnosis, and reveal critical biomarker ranges associated with distinct types of meningitis. A significant diagnostic effect was found for Meningococcal Meningitis with elevated neutrophil levels (>40%) and balanced lymphocyte levels (40-60%). Tuberculous Meningitis demonstrated low neutrophil levels (<60%) and elevated lymphocyte levels (>60%). Haemophilus influenzae meningitis exhibited a predominance of neutrophils (>80%), while Aseptic Meningitis showed lower neutrophil levels (<40%) and lymphocyte levels within the range of 50-60%. The majority of the AI-based medical decisions were validated by our team of infectious disease experts, confirming the alignment of the algorithmic diagnoses with clinical practice.
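A hedged sketch of the modelling-plus-attribution pipeline described above, assuming the xgboost and shap packages are available. The synthetic features and label are placeholders for the clinical variables, and TreeExplainer is used here simply as the standard fast path for tree ensembles.

```python
import numpy as np
import xgboost as xgb
import shap

# Toy stand-in for the clinical table; the feature columns (e.g. neutrophil
# and lymphocyte percentages) and the label rule are illustrative only.
rng = np.random.default_rng(0)
X = rng.uniform(0, 100, size=(500, 4))
y = (X[:, 0] > 40).astype(int)               # synthetic binary diagnosis label

model = xgb.XGBClassifier(n_estimators=200, max_depth=4)
model.fit(X, y)

# SHAP attribution: per-sample, per-feature contributions to the prediction.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)
print(np.abs(shap_values).mean(axis=0))      # mean |SHAP| as global importance
```

Inspecting the sign and magnitude of the SHAP values over ranges of a feature is what supports statements such as "elevated neutrophil levels (>40%) drive the Meningococcal Meningitis prediction" in the abstract.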
Citations: 0
Beyond Dynamics: Learning to Discover Conservation Principles
Pub Date : 2024-05-10 DOI: 10.1088/2632-2153/ad4a20
Antonii Belyshev, Alexander Kovrigin, Andrey Ustyuzhanin
The discovery of conservation principles is crucial for understanding the fundamental behavior of both classical and quantum physical systems across numerous domains. This paper introduces an innovative method that merges representation learning and topological analysis to explore the topology of conservation law spaces. Notably, the robustness of our approach to noise makes it suitable for complex experimental setups, and its scope extends to the analysis of quantum systems, as successfully demonstrated in our paper. We exemplify the method's potential to unearth previously unknown conservation principles and to support interdisciplinary research through a variety of physical simulations. In conclusion, this work emphasizes the significance of data-driven techniques in deepening our comprehension of the principles governing classical and quantum physical systems.
Citations: 0
Exploiting Data Diversity in Multi-Domain Federated Learning
Pub Date : 2024-05-03 DOI: 10.1088/2632-2153/ad4768
Hussain Ahmad Madni, Rao Muhammad Umer, G. Foresti
Federated Learning (FL) is an evolving machine learning technique that allows collaborative model training without sharing the original data among participants. In real-world scenarios, data residing at multiple clients are often heterogeneous in terms of resolutions, magnifications, scanners, or imaging protocols, which makes global FL model convergence in collaborative training challenging. Most existing FL methods consider data heterogeneity within a single domain, assuming the same data variation at each client site. In this paper, we consider data heterogeneity in FL across different domains of heterogeneous data by addressing the problems of domain shift, class imbalance, and missing data. We propose MDFL (Multi-Domain Federated Learning), which handles heterogeneous training data from multiple domains by training a robust Transformer model. We use two loss functions, one for correctly predicting class labels and the other for encouraging similarity and dissimilarity over latent features, to optimize the global FL model. We perform various experiments using different convolution-based networks and non-convolutional Transformer architectures on multi-domain datasets. We evaluate the proposed approach on benchmark datasets and compare it with existing FL methods. Our results show the superiority of the proposed approach, which yields a more robust global FL model than existing methods.
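A sketch of a two-term objective in the spirit described above, assuming PyTorch: a cross-entropy term for class labels plus a latent-feature term that encourages intra-class similarity and inter-class dissimilarity. The margin form of the second term and the weighting are assumptions, not the paper's exact loss.

```python
import torch
import torch.nn.functional as F

def combined_fl_loss(logits, features, labels, margin=1.0, alpha=0.5):
    """Classification loss plus a latent-space term that pulls together
    same-class features and pushes apart different-class features."""
    ce = F.cross_entropy(logits, labels)

    feats = F.normalize(features, dim=1)
    dists = torch.cdist(feats, feats)                      # pairwise distances
    same = (labels[:, None] == labels[None, :]).float()
    pull = (same * dists).sum() / same.sum().clamp(min=1.0)
    push = ((1 - same) * F.relu(margin - dists)).sum() / (1 - same).sum().clamp(min=1.0)
    return ce + alpha * (pull + push)
```

In an FL round, each client would compute this loss locally on its own domain's batches before the server aggregates the model updates.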
Citations: 0
Analysis of Machine Learning Prediction Reliability based on Sampling Distance Evaluation with Feature Decorrelation
Pub Date : 2024-04-23 DOI: 10.1088/2632-2153/ad4231
Evan Askanazi, Ilya Grinberg
Despite their successful use in a wide variety of disciplines for data analysis and prediction, machine learning (ML) methods suffer from a lack of understanding of the reliability of predictions, owing to the black-box nature and limited transparency of ML models. In materials science and other fields, typical ML model results include a significant number of low-quality predictions. This problem is known to be particularly acute for target systems that differ significantly from the data used for ML model training. However, to date, a general method for uncertainty quantification (UQ) of ML predictions has not been available. Focusing on intuitive and computationally efficient similarity-based UQ, we show that a simple metric based on Euclidean feature-space distance and sampling density, together with decorrelation of the features using Gram-Schmidt orthogonalization, allows effective separation of accurately predicted data points from data points with poor prediction accuracy. To demonstrate the generality of the method, we apply it to support vector regression models for various small data sets in materials science and other fields. We also show that the proposed metric is a more effective UQ tool than the standard approach of using the average distance of the k nearest neighbors (k = 1-10) in feature space for similarity evaluation. Our method is computationally simple, can be used with any ML method, and enables analysis of the sources of ML prediction errors. Therefore, it is suitable for use as a standard technique for estimating ML prediction reliability for small data sets and as a tool for data set design.
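A minimal sketch of the distance part of such a similarity metric, assuming NumPy: the training features are decorrelated via a QR factorisation (equivalent to Gram-Schmidt orthogonalization for full-column-rank data), and reliability is scored by the mean Euclidean distance to the k nearest training points. The sampling-density component mentioned in the abstract is omitted for brevity.

```python
import numpy as np

def decorrelate(train_X, test_X):
    """Gram-Schmidt-style decorrelation of feature columns via QR of the
    centred training matrix; the same transform is applied to the test set.
    Assumes the training features have full column rank."""
    mu = train_X.mean(axis=0)
    Q, R = np.linalg.qr(train_X - mu)          # columns of Q are orthonormal
    return Q, (test_X - mu) @ np.linalg.inv(R)

def distance_uq(train_X, test_X, k=5):
    """Mean Euclidean distance to the k nearest training points in the
    decorrelated space; larger values flag less reliable predictions."""
    train_Q, test_Q = decorrelate(train_X, test_X)
    d = np.linalg.norm(test_Q[:, None, :] - train_Q[None, :, :], axis=-1)
    return np.sort(d, axis=1)[:, :k].mean(axis=1)
```

Thresholding this score then separates test points that lie inside the sampled region of feature space from extrapolated points whose predictions should be trusted less.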
Citations: 0
Machine Learning for Efficient Grazing-Exit X-ray Absorption Near Edge Structure Spectroscopy Analysis: Bayesian Optimization Approach
Pub Date : 2024-04-23 DOI: 10.1088/2632-2153/ad4253
C. Cakir, Can Bogoclu, Franziska Emmerling, Christina Streli, A. Guilherme Buzanich, Martin Radtke
In materials science, traditional techniques for analysing layered structures are essential for obtaining information about local structure, electronic properties, and chemical states. While valuable, these methods often require high-vacuum environments and have limited depth-profiling capabilities. The Grazing-Exit X-ray Absorption Near-Edge Structure (GE-XANES) technique addresses these limitations by providing depth-resolved insight at ambient conditions, facilitating in situ material analysis without special sample preparation. However, GE-XANES is limited by long data acquisition times, which hinders its practicality for various applications. To overcome this, we have incorporated Bayesian Optimization (BO) into the GE-XANES data acquisition process. This approach significantly reduces the data acquisition time from 20 hours to 25 minutes. We used standard GE-XANES experiments, which serve as a reference, to validate the effectiveness and accuracy of the BO-informed experimental setup. Our results show that this optimized approach maintains data quality while significantly improving efficiency, making GE-XANES accessible to a wider range of materials science applications.
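A hedged sketch of driving an acquisition parameter with Bayesian optimization, assuming the scikit-optimize package. The objective function, the grazing-exit angle as the tuned parameter, and the search range are synthetic placeholders, since the actual instrument interface and optimized quantity are not described in the abstract.

```python
import numpy as np
from skopt import gp_minimize

def negative_quality(params):
    """Placeholder objective: measure (or simulate) a spectrum at the given
    grazing-exit angle and return the negative of a quality score, so that
    minimisation maximises quality. The quadratic form is purely synthetic."""
    angle = params[0]
    return -np.exp(-((angle - 1.2) ** 2) / 0.1)   # synthetic optimum near 1.2 deg

result = gp_minimize(
    negative_quality,
    dimensions=[(0.1, 3.0)],        # assumed angle range in degrees
    n_calls=20,                     # 20 measurements instead of a dense scan
    n_initial_points=5,
    random_state=0,
)
print("best angle:", result.x[0], "score:", -result.fun)
```

The time saving reported in the abstract comes from this kind of model-guided sampling: the Gaussian-process surrogate decides where the next measurement is most informative rather than scanning the parameter grid exhaustively.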
Citations: 0
Quadratic hyper-surface kernel-free large margin distribution machine-based regression and its least-square form
Pub Date : 2024-04-19 DOI: 10.1088/2632-2153/ad40fc
Hao He, Kuaini Wang, Yuzhu Jiang, Huimin Pei
ε-Support vector regression (ε-SVR) is a powerful machine learning approach that focuses on minimizing the margin, which represents the tolerance range between predicted and actual values. However, recent theoretical studies have highlighted that simply minimizing structural risk does not necessarily yield a good margin distribution. Instead, the distribution of margins has been shown to play a more crucial role in achieving better generalization performance. Furthermore, the kernel-free technique offers a significant advantage, as it effectively reduces the overall running time and simplifies the parameter selection process compared to the kernel trick. Building on existing kernel-free regression methods, we present two efficient and robust approaches: quadratic hyper-surface kernel-free large margin distribution machine-based regression (QLDMR) and quadratic hyper-surface kernel-free least squares large margin distribution machine-based regression (QLSLDMR). QLDMR optimizes the margin distribution by considering both the ε-insensitive loss and a quadratic loss function, similar to large-margin distribution machine-based regression (LDMR). QLSLDMR aims to reduce the computational cost of QLDMR by transforming the inequality constraints into an equality constraint, inspired by least squares support vector machines (LSSVR). Both models combine the spirit of optimal margin distribution with the kernel-free technique and, after simplification, are convex, so they can be solved by classical methods. Experimental results demonstrate the superiority of the optimal margin distribution combined with the kernel-free technique in terms of robustness, generalization, and efficiency.
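A minimal sketch of the kernel-free idea, assuming NumPy: an explicit quadratic feature map makes the regressor a quadratic hyper-surface f(x) = x'Ax + b'x + c without any kernel evaluations, fitted here by regularized least squares as a simplified stand-in for the least-squares variant (QLSLDMR). The margin-distribution and ε-insensitive terms are omitted.

```python
import numpy as np
from itertools import combinations_with_replacement

def quadratic_features(X):
    """Explicit quadratic map phi(x) = [x_i*x_j, x_i, 1]; a linear model in
    phi corresponds to a quadratic hyper-surface in the original space."""
    n, d = X.shape
    quads = [X[:, i] * X[:, j] for i, j in combinations_with_replacement(range(d), 2)]
    return np.column_stack(quads + [X, np.ones(n)])

def fit_quadratic_surface(X, y, lam=1e-3):
    """Ridge-regularized least-squares fit of the quadratic surface."""
    Phi = quadratic_features(X)
    w = np.linalg.solve(Phi.T @ Phi + lam * np.eye(Phi.shape[1]), Phi.T @ y)
    return lambda Xnew: quadratic_features(Xnew) @ w

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
y = (X ** 2).sum(axis=1) + 0.1 * rng.normal(size=200)
predict = fit_quadratic_surface(X, y)
print(np.mean((predict(X) - y) ** 2))        # training error of the quadratic fit
```

Because the feature map is explicit and finite-dimensional, training reduces to solving a small linear system, which is the source of the running-time advantage the abstract attributes to kernel-free methods.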
Citations: 0
Multimodal Protein Representation Learning and Target-aware Variational Auto-encoders for Protein-binding Ligand Generation
Pub Date : 2024-04-15 DOI: 10.1088/2632-2153/ad3ee4
Nhat-Khang Ngô, T. Hy
Without knowledge of specific pockets, generating ligands based on the global structure of a protein target plays a crucial role in drug discovery, as it helps reduce the search space for potential drug-like candidates in the pipeline. However, contemporary methods require optimizing tailored networks for each protein, which is arduous and costly. To address this issue, we introduce TargetVAE, a target-aware variational auto-encoder that generates ligands with desirable properties, including high binding affinity and high synthesizability, for arbitrary target proteins, guided by a multimodal deep neural network built on geometric and sequence models, named Protein Multimodal Network (PMN), as the prior for the generative model. PMN unifies different representations of proteins (e.g., the primary structure as a sequence of amino acids, the 3D tertiary structure, and a residue-level graph) into a single representation. Our multimodal architecture learns from the entire protein structure and is able to capture its sequential, topological, and geometrical information by utilizing language modeling, graph neural networks, and geometric deep learning. We showcase the superiority of our approach through extensive experiments and evaluations, including predicting protein-ligand binding affinity on the PDBBind v2020 dataset as well as assessing generative model quality, ligand generation for unseen targets, and docking score computation. Empirical results demonstrate the promising and competitive performance of our proposed approach. Our software package is publicly available at https://github.com/HySonLab/Ligand_Generation
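A minimal sketch of a target-conditioned VAE, assuming PyTorch: a ligand vector is encoded and decoded while being conditioned on a fixed-size protein embedding, assumed here to come from a pretrained multimodal protein encoder. The dimensions and the simple MLP encoder/decoder are placeholders for the far richer PMN-guided model described above.

```python
import torch
import torch.nn as nn

class TargetConditionedVAE(nn.Module):
    """Conditional VAE: ligand representation conditioned on a protein embedding."""
    def __init__(self, ligand_dim=256, protein_dim=128, latent_dim=64):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(ligand_dim + protein_dim, 256), nn.ReLU())
        self.mu = nn.Linear(256, latent_dim)
        self.logvar = nn.Linear(256, latent_dim)
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim + protein_dim, 256), nn.ReLU(),
            nn.Linear(256, ligand_dim))

    def forward(self, ligand, protein):
        h = self.encoder(torch.cat([ligand, protein], dim=-1))
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)   # reparameterisation
        recon = self.decoder(torch.cat([z, protein], dim=-1))
        kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
        return recon, kl

    def generate(self, protein, n=8):
        """Sample n ligand vectors for one protein embedding of shape (1, protein_dim)."""
        z = torch.randn(n, self.mu.out_features)
        return self.decoder(torch.cat([z, protein.repeat(n, 1)], dim=-1))
```

Because the condition is the protein embedding rather than a per-target network, the same trained model can, in principle, generate candidates for unseen targets, which is the generalization the abstract emphasizes.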
Citations: 1