Pub Date: 2025-11-01; Epub Date: 2025-10-01; DOI: 10.1016/j.jbi.2025.104920
Şeyma Selcan Mağara, Noah Dietrich, Ali Burak Ünal, Mete Akgün
Objective:
Record linkage is essential for integrating data from multiple sources, with diverse applications in real-world healthcare and research. Probabilistic Privacy-Preserving Record Linkage (PPRL) enables this integration while protecting sensitive information from unauthorized access, especially when datasets lack exact identifiers. As privacy regulations evolve and multi-institutional collaborations expand globally, there is a growing demand for methods that effectively balance security, accuracy, and efficiency. However, ensuring both privacy and scalability in large-scale record linkage remains a key challenge.
Method:
This paper presents a novel and efficient PPRL method based on a secure three-party multi-party computation (MPC) framework. Our approach allows multiple parties to compute linkage results without exposing their private inputs and significantly improves the speed of the linkage process compared to existing PPRL solutions.
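The MPC idea behind the method can be illustrated with additive secret sharing, a standard building block of three-party protocols. The sketch below is not the authors' protocol; it only shows how three parties can hold random-looking shares of two private values and compute their sum without any single party seeing either input (`MOD`, `share`, and `reconstruct` are illustrative names):

```python
import random

MOD = 2**32  # shares live in the ring of integers mod 2^32

def share(value, n_parties=3):
    """Split an integer into n additive shares that sum to value mod MOD."""
    shares = [random.randrange(MOD) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % MOD)
    return shares

def reconstruct(shares):
    """Recombine shares; requires all parties' shares, so no single party learns anything."""
    return sum(shares) % MOD

# Two private values are secret-shared across three computing parties.
a_shares = share(7)
b_shares = share(35)

# Addition is purely local: each party adds its own two shares.
sum_shares = [(x + y) % MOD for x, y in zip(a_shares, b_shares)]
assert reconstruct(sum_shares) == 42
```

Real PPRL protocols build similarity scores (e.g., secure comparisons over shared Bloom-filter encodings) from primitives like this; the point here is only that arithmetic on shares never exposes the underlying records.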
Result:
Our method preserves the linkage quality of a state-of-the-art (SOTA) MPC-based PPRL method while achieving up to 14 times faster performance. For example, linking a record against a database of 10,000 records takes just 8.74 s in a realistic network with 700 Mbps bandwidth and 60 ms latency, compared to 92.32 s with the SOTA method. Even on a slower internet connection with 100 Mbps bandwidth and 60 ms latency, the linkage completes in 28 s, whereas the SOTA method requires 287.96 s. These results demonstrate the significant scalability and efficiency improvements of our approach.
Conclusion:
Our novel PPRL method, based on secure 3-party computation, offers an efficient and scalable solution for large-scale record linkage while ensuring privacy protection. The approach demonstrates significant performance improvements, making it a promising tool for secure data integration in privacy-sensitive sectors.
Accelerating probabilistic privacy-preserving medical record linkage: A three-party MPC approach. Journal of Biomedical Informatics 171 (2025), Article 104920.
Objective
Biobanks and biomolecular resources are increasingly central to data-driven biomedical research, encompassing not only metadata but also granular, sample-related data from diverse sources such as healthcare systems, national registries, and research outputs. However, the lack of a standardised, machine-readable format for representing such data limits interoperability, data reuse and integration into clinical and research environments. While MIABIS provides a conceptual model for biobank data, its abstract nature and reliance on heterogeneous implementations create barriers to practical, scalable adoption. This study presents a pragmatic, operational implementation of MIABIS focused on enabling real-world exchange and integration of sample-level data.
Methods
We systematically evaluated established data exchange standards, comparing HL7 FHIR and OMOP CDM with respect to their suitability for structuring sample-related data in a semantically robust and machine-readable form. Based on this analysis, we developed a FHIR-based representation of MIABIS that supports complex biobank structures and enables integration with federated data infrastructures. Supporting tools, including a Python library and an implementation guide, were created to ensure usability across diverse research and clinical contexts.
Results
We created nine interoperable FHIR profiles covering core MIABIS entities, ensuring consistency with FHIR standards. To support adoption, we developed an open-source Python library that abstracts FHIR interactions and provides schema validation for MIABIS-compliant data. The library was integrated into an ETL tool in operation at the Czech Node of BBMRI-ERIC (the European Biobanking and Biomolecular Resources Research Infrastructure) to demonstrate usability with real-world sample-related data. Separately, we validated the representation of MIABIS entities at the organisational level by converting the data structures of the BBMRI-ERIC Directory into FHIR, demonstrating compatibility with federated data infrastructures.
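To give a flavor of what a machine-readable representation of sample data looks like, the sketch below builds a minimal FHIR R4 Specimen resource as plain JSON and runs a toy schema check. This is an illustration only, not the authors' library: `make_specimen` and `validate_specimen` are hypothetical helpers, and real MIABIS profiles impose far more constraints than this check does.

```python
import json

def make_specimen(specimen_id, material_code, collected_date):
    """Build a minimal FHIR R4 Specimen resource as a plain dict.
    Hypothetical helper for illustration; MIABIS profiles constrain
    many more elements (codings, extensions, references)."""
    return {
        "resourceType": "Specimen",
        "id": specimen_id,
        "type": {"coding": [{"code": material_code}]},
        "collection": {"collectedDateTime": collected_date},
    }

def validate_specimen(resource):
    """Toy schema check: correct resource type and required keys present."""
    required = {"resourceType", "id", "type", "collection"}
    return resource.get("resourceType") == "Specimen" and required <= resource.keys()

spec = make_specimen("sample-001", "blood-plasma", "2024-05-01")
assert validate_specimen(spec)
payload = json.dumps(spec)  # serialized form, ready to exchange with a FHIR server
```

The value of a profile-backed library is exactly this pattern at scale: construction and validation are centralized, so every exporting biobank emits structurally identical resources.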
Conclusion
This work delivers a machine-readable, interoperable implementation of MIABIS, enabling the exchange of both organisational and sample-level data across biobanks and health information systems. By integrating MIABIS with HL7 FHIR, we provide a host of reusable tools and mechanisms for further evolution of the data model. Combined, these benefits can help with the integration into clinical and research workflows, supporting data discoverability, reuse, and cross-institutional collaboration in biomedical research.
Definitions to data flow: Operationalizing MIABIS in HL7 FHIR. Radovan Tomášik, Šimon Koňár, Niina Eklund, Cäcilia Engels, Zdenka Dudova, Radoslava Kacová, Roman Hrstka, Petr Holub. Journal of Biomedical Informatics 171 (2025), Article 104919. DOI: 10.1016/j.jbi.2025.104919.
Pub Date: 2025-11-01; Epub Date: 2025-10-04; DOI: 10.1016/j.jbi.2025.104925
Luke Stevens , Nan Kennedy , Rob J. Taylor , Adam Lewis , Frank E. Harrell Jr , Matthew S. Shotwell , Emily S. Serdoz , Gordon R. Bernard , Wesley H. Self , Christopher J. Lindsell , Paul A. Harris , Jonathan D. Casey
Objective
Since 2012, the electronic data capture platform REDCap has included an embedded randomization module allowing a single randomization per study record with the ability to stratify by variables such as study site and participant sex at birth. In recent years, platform, adaptive, decentralized, and pragmatic trials have gained popularity. These trial designs often require approaches to randomization not supported by the original REDCap randomization module, including randomizing patients into multiple domains or at multiple points in time, changing allocation tables to add or drop study groups, or adaptively changing allocation ratios based on data from previously enrolled participants. Our team aimed to develop new randomization functions to address these issues.
Methods
A collaborative process facilitated by the NIH-funded Trial Innovation Network was initiated to modernize the randomization module in REDCap, incorporating feedback from clinical trialists, biostatisticians, technologists, and other experts.
Results
This effort led to the development of an advanced randomization module within the REDCap platform. In addition to supporting platform, adaptive, decentralized, and pragmatic trials, the new module introduces several new features, such as improved support for blinded randomization, additional randomization metadata capture (e.g., user identity and timestamp), additional tools allowing REDCap administrators to support investigators using the randomization module, and the ability for clinicians participating in pragmatic or decentralized trials to perform randomization through a survey without needing log-in access to the study database. As of June 19, 2025, multiple randomizations have been used in 211 projects from 55 institutions, randomizations with real-time trigger logic in 108 projects from 64 institutions, and blinded group allocation in 24 projects from 17 institutions.
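The kind of allocation such a module automates can be sketched with classic permuted-block randomization, stratified by site. This is a generic illustration of the technique, not REDCap's implementation; `block_randomize`, the seeds, and the site names are all illustrative.

```python
import random

def block_randomize(n_blocks, groups=("A", "B"), block_size=4, seed=0):
    """Permuted-block allocation: each block holds an equal number of
    assignments per group, shuffled, so arm sizes never drift far apart."""
    rng = random.Random(seed)  # fixed seed makes the sequence reproducible/auditable
    per_group = block_size // len(groups)
    sequence = []
    for _ in range(n_blocks):
        block = list(groups) * per_group
        rng.shuffle(block)
        sequence.extend(block)
    return sequence

# A separate sequence per stratum (e.g., study site) keeps balance within strata.
allocations = {site: block_randomize(5, seed=i)
               for i, site in enumerate(["site1", "site2"])}
for seq in allocations.values():
    assert seq.count("A") == seq.count("B") == 10
```

Production systems add concealment (the sequence is hidden until assignment), audit metadata, and support for adaptive ratios, which is precisely what the module described above provides on top of the basic scheme.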
Conclusion
The new randomization module aims to streamline the randomization process, improve trial efficiency, and ensure robust data integrity, thereby supporting the conduct of more sophisticated and adaptive clinical trials.
A REDCap advanced randomization module to meet the needs of modern trials. Journal of Biomedical Informatics 171 (2025), Article 104925.
Pub Date: 2025-11-01; Epub Date: 2025-10-23; DOI: 10.1016/j.jbi.2025.104930
Jianfu Li , Yiming Li , Zenan Sun , Evan Yu , Ahmed M. Abdelhameed , Weiguo Cao , Haifang Li , Jianping He , Pengze Li , Jingna Feng , Yue Yu , Xinyue Hu , Manqi Li , Rakesh Kumar , Yifang Dang , Fang Li , Shahyar M Gharacholou , Cui Tao
Objective
Multimodal large language models (LLMs) offer new potential for enhancing cardiovascular decision support, particularly in interpreting echocardiographic data. This study systematically evaluates and benchmarks foundation models from diverse domains on echocardiogram-based tasks to assess their effectiveness, limitations and potential in clinical cardiovascular applications.
Methods
We curated three cardiovascular imaging datasets—EchoNet-Dynamic, TMED2, and an expert-annotated echocardiogram (TTE) dataset—to evaluate performance on four critical tasks: (1) cardiac function evaluation through ejection fraction (EF) prediction, (2) cardiac view classification, (3) aortic stenosis (AS) severity assessment, and (4) cardiovascular disease classification. We evaluated six multimodal LLMs: EchoClip (cardiovascular-specific), BiomedGPT and LLaVA-Med (medical-domain), and MiniCPM-V 2.6, LLaMA-3-Vision-Alpha, and Gemini-1.5 (general-domain). Models were assessed using zero-shot, few-shot, and fine-tuning strategies, where applicable. Performance was measured using mean absolute error (MAE) and root mean squared error (RMSE) for EF prediction, and accuracy, precision, recall, and F1 score for classification tasks.
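The regression metrics used for EF prediction are standard. For reference, a minimal sketch of MAE and RMSE over paired true/predicted EF values (the numbers below are made up for illustration, not the study's data):

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error: average magnitude of the prediction errors."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error: penalizes large errors more heavily than MAE."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

ef_true = [55.0, 60.0, 35.0, 45.0]   # hypothetical ejection fractions (%)
ef_pred = [50.0, 62.0, 40.0, 45.0]
assert round(mae(ef_true, ef_pred), 2) == 3.0
assert round(rmse(ef_true, ef_pred), 3) == 3.674
```

Because RMSE squares the residuals, a model with an occasional large EF miss shows a wider RMSE–MAE gap, which is useful when comparing zero-shot and fine-tuned models.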
Results
Domain-specific models such as EchoClip demonstrated the strongest zero-shot performance in EF prediction, achieving an MAE of 10.34. General-domain models showed limited effectiveness without adaptation, with MiniCPM-V 2.6 reporting an MAE of 251.92. Fine-tuning significantly improved outcomes; for example, MiniCPM-V 2.6's MAE decreased to 31.93, and view classification accuracy increased from 20% to 63.05%. In classification tasks, EchoClip achieved F1 scores of 0.2716 for AS severity and 0.4919 for disease classification but exhibited limited performance in view classification (F1 = 0.1457). Few-shot learning yielded modest gains but was generally less effective than fine-tuning.
Conclusions
This evaluation and benchmarking study demonstrated the importance of domain-specific pretraining and model adaptation in cardiovascular decision support tasks. Cardiovascular-focused models and fine-tuned general-domain models achieved superior performance, especially for complex assessments such as EF estimation. These findings offer critical insights into the current capabilities and future directions for clinically meaningful AI integration in cardiovascular medicine.
Exploring multimodal large language models on transthoracic Echocardiogram (TTE) tasks for cardiovascular decision support. Journal of Biomedical Informatics 171 (2025), Article 104930.
Pub Date: 2025-11-01; Epub Date: 2025-10-10; DOI: 10.1016/j.jbi.2025.104926
Yanlei Kang , Haoyu Zhuang , Yunliang Jiang , Zhong Li
The prediction of drug–target interactions (DTIs) and binding affinities (DTAs) plays a pivotal role in drug discovery and design. However, most existing methods fail to fully exploit the rich multimodal information inherent in molecular structures. In this study, we propose a multimodal feature fusion model, MF-DTA. On the representational level, MF-DTA introduces the molecular fragment graph, generated via BRICS-based decomposition, as a novel modality. This representation enables a more intuitive capture of the structural characteristics and pharmacophore-related information of drug molecules. In terms of model architecture, a deformable convolutional layer is applied to the protein residue–residue contact map (hereafter referred to as the contact map) to flexibly adjust the distribution of sampling points and enhance the representational capability. To effectively integrate the multimodal information from both drug and target branches, a mixture-of-experts (MoE)-based multihead attention mechanism is employed for local fusion, while a dual-decoder architecture facilitates cross-modal interaction between drug and target features. The final output yields a high-quality prediction of binding affinity. Cross-validation experiments conducted on several benchmark datasets demonstrate that MF-DTA consistently outperforms state-of-the-art methods. Specifically, it achieves CI improvements of 0.1%, 0.5%, and 0.3% over the best-performing baseline models on the Davis, KIBA, and BindingDB datasets, respectively, and exceeds traditional models by 1% to 2% on average. The model also ranks among the best performers in terms of the MSE and Rm² metrics. Model visualization further supports its interpretability, confirming that it successfully learns meaningful drug–target interaction patterns. To further assess the practical utility of the proposed model, we apply it to screen potential candidate compounds from a natural product library targeting tubulin.
In summary, MF-DTA offers not only accurate and robust binding affinity prediction capabilities but also strong interpretability, making it a powerful and practical tool for drug design and target identification.
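The mixture-of-experts fusion idea can be sketched in miniature: a softmax gate weights each expert's output, and the weighted outputs are summed. This is a generic MoE gating illustration, not MF-DTA's MoE-based multihead attention; `moe_combine` and the toy vectors are hypothetical.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of gate logits."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def moe_combine(gate_logits, expert_outputs):
    """Combine expert output vectors with softmax gate weights.
    Generic sketch: the paper applies gating to attention features,
    not raw vectors as here."""
    weights = softmax(gate_logits)
    dim = len(expert_outputs[0])
    return [sum(w * out[i] for w, out in zip(weights, expert_outputs))
            for i in range(dim)]

# Equal logits give equal weights, so the result is the experts' average.
out = moe_combine([0.0, 0.0], [[1.0, 2.0], [3.0, 4.0]])
assert out == [2.0, 3.0]
```

The gate lets the model route drug-branch versus target-branch features to whichever expert handles them best, which is the rationale for using MoE in local fusion.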
MF-DTA: Predicting drug–target affinity with multi-modal feature fusion model. Journal of Biomedical Informatics 171 (2025), Article 104926.
Pub Date: 2025-11-01; Epub Date: 2025-10-13; DOI: 10.1016/j.jbi.2025.104924
Xinyao Liu , Junchang Xin , Qi Shen , Zhihong Huang , Zhiqiong Wang
Objective:
Radiology reports provide important references for physicians' treatment decisions by including descriptions and diagnostic results of imaging. Automatic generation of radiology reports reduces the workload of physicians and significantly improves work efficiency. However, existing report generation methods use image-to-text conversion to generate reports directly from medical images and fail to fully simulate the radiologist's diagnostic process of "examine first, describe later". As a result, existing methods often generate only generic normal descriptions and struggle to accurately describe specific lesion features.
Methods:
To address this issue, we mimic the working mode of radiologists by first checking whether the patient suffers from a certain disease, and then using the learned medical knowledge to describe the images to form a report. We propose a soft label-guided transformer (SLGT) for radiology report generation. Firstly, the pseudo-labels of the samples are obtained, and the soft label-guided attention mechanism is utilized to highlight features related to the disease labels in the encoding stage. Secondly, text features from the decoding phase and image features are aligned, and the generated text features are used to guide the potential representations. Finally, a hybrid loss is designed that includes losses for text generation, disease classification, and visual-textual alignment. Optimization of SLGT using the hybrid loss allows the model to learn richer features that are more relevant to disease abnormalities, which improves the performance of the model.
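The hybrid objective is a weighted combination of the three losses named above. A minimal sketch, assuming placeholder weights (the abstract does not give the actual coefficients):

```python
def hybrid_loss(gen_loss, cls_loss, align_loss, weights=(1.0, 0.5, 0.5)):
    """Weighted sum of text-generation, disease-classification, and
    visual-textual alignment losses. The weights here are illustrative
    placeholders, not SLGT's actual coefficients."""
    w_gen, w_cls, w_align = weights
    return w_gen * gen_loss + w_cls * cls_loss + w_align * align_loss

# Example: generation loss dominates; the auxiliary losses regularize it.
assert hybrid_loss(2.0, 1.0, 1.0) == 3.0
```

Training against such a sum lets gradients from the classification and alignment terms push the encoder toward disease-relevant features even though the primary target is report text.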
Results:
The proposed SLGT is evaluated on the widely used IU X-ray, MIMIC-CXR, and COV-CTR datasets. The experiments show that the proposed model SLGT outperforms the previous state-of-the-art models on three datasets.
Conclusion:
This work improves the performance of automatically generating medical reports, making their application in computer-aided diagnosis feasible.
Soft label-guided transformer for radiology report generation. Journal of Biomedical Informatics 171 (2025), Article 104924.
Pub Date : 2025-11-01Epub Date: 2025-10-22DOI: 10.1016/j.jbi.2025.104944
Weiru Fu , Hao Li , Ling Luo, Hongfei Lin
Objective:
Adverse Drug Event (ADE) extraction from social media is a critical yet challenging task due to the semantic similarity between adverse effects and therapeutic indications, as well as the prevalence of overlapping and discontinuous mentions often caused by comorbid conditions. This study aims to develop a robust model for accurate ADE extraction from noisy and irregular social media texts.
Methods:
We propose ADENER, a grid-tagging architecture that models ADE extraction as multi-label word-pair classification. ADENER incorporates two core encoding mechanisms: the convolutional capture layer fuses multi-dimensional textual features, captures long-range word-pair dependencies via dilated convolutions, and enhances interactions through semantic association matrices for social media text irregularities; the syntactic affine layer integrates path-level dependency information to enhance global logic understanding, enabling the model to distinguish between therapeutic symptom entities and ADE entities through syntactic cues. The decoding stage uses four-type relational labels to uniformly decode flat, overlapping, and discontinuous ADE mentions.
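The four-type relational labels can be made concrete with a small decoder sketch. This is an illustrative simplification, not ADENER's implementation: the NNW (Next-Neighboring-Word) and THW (Tail-Head-Word) label names follow the common W2NER-style grid-tagging convention and are assumptions here.

```python
# Illustrative decoder for a word-pair tagging grid (simplified sketch).
# NNW/THW label semantics follow the common W2NER-style convention and are
# an assumption, not ADENER's exact label set.
def decode_mentions(n_words, nnw, thw):
    """nnw: set of (i, j) pairs meaning word j is the next word inside a
    mention after word i (Next-Neighboring-Word).
    thw: set of (tail, head) pairs marking a mention whose first word is
    `head` and last word is `tail` (Tail-Head-Word)."""
    heads = {h for (_, h) in thw}
    mentions = set()

    def walk(path):
        last = path[-1]
        if (last, path[0]) in thw:
            mentions.add(tuple(path))          # chain closed: complete mention
        for j in range(n_words):
            if (last, j) in nnw and j not in path:
                walk(path + [j])               # extend the word chain

    for h in heads:
        walk([h])
    return sorted(mentions)
```

For tokens like `["bad", "muscle", "and", "joint", "pain"]`, the pairs `nnw = {(1, 4), (3, 4)}` and `thw = {(4, 1), (4, 3)}` recover both the discontinuous mention "muscle pain" (indices 1, 4) and the overlapping "joint pain" (indices 3, 4), which is how a word-pair grid handles flat, overlapping, and discontinuous mentions uniformly.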
Results:
We evaluated ADENER on three widely used ADE extraction datasets: CADEC, CADECv2, and SMM4H. The model achieved F1 scores of 74.64%, 77.97%, and 61.73% on these datasets, respectively, outperforming all compared baseline models while maintaining competitive computational efficiency. The results demonstrate the effectiveness of our model in addressing the challenges posed by irregular and noisy social media data.
Conclusion:
ADENER offers a unified and effective solution for ADE extraction from social media, capable of handling flat, overlapping, and discontinuous entity mentions and correctly distinguishing ADE entities from therapeutic symptom entities. By incorporating convolutional capture layers for semantic word-pair interactions and syntactic affine layers for dependency-based logic understanding, our approach significantly improves extraction accuracy, providing a valuable tool for pharmacovigilance research and real-world drug safety monitoring.
ADENER: A syntax-augmented grid-tagging model for Adverse Drug Event extraction in social media. Journal of Biomedical Informatics, Volume 171, Article 104944.
Pub Date : 2025-11-01Epub Date: 2025-10-17DOI: 10.1016/j.jbi.2025.104929
Kassem Anis Bouali, Elena Šikudová
Objective:
Early diagnosis of Alzheimer’s disease depends on accessible cognitive assessments, such as the Rey-Osterrieth Complex Figure (ROCF) test. However, manual scoring of this test is labor-intensive and subjective, which introduces experimental biases. Additionally, deep learning models face challenges due to the limited availability of annotated clinical data, particularly for assessments like the ROCF test. This scarcity of data restricts model generalization and exacerbates domain shifts across different populations.
Methods:
We propose a novel framework comprising a data synthesis pipeline and ROCF-Net, a deep learning model specifically designed for ROCF scoring. The synthesis pipeline is lightweight and capable of generating realistic, diverse, and annotated ROCF drawings. ROCF-Net, on the other hand, is a cross-domain scoring model engineered to address domain discrepancies in stroke texture and line artifacts. It maintains high scoring accuracy through a novel line-specific attention mechanism tailored to the unique characteristics of ROCF drawings.
Results:
Unlike conventional synthetic medical imaging methods, our approach generates ROCF drawings that accurately reflect Alzheimer’s-specific abnormalities with minimal computational cost. Our scoring model achieves state-of-the-art (SOTA) performance across differently sourced datasets, with a Mean Absolute Error (MAE) of 3.53 and a Pearson Correlation Coefficient (PCC) of 0.86. This demonstrates both high predictive accuracy and computational efficiency, outperforming existing ROCF scoring methods that rely on Convolutional Neural Networks (CNNs) while avoiding the overhead of parameter-heavy transformer models. We also show that training on our synthetic data generalizes as well as training on real clinical data, with only a minimal performance difference (MAE differed by 1.43 and PCC by 0.07), indicating no statistically significant gap.
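The two reported metrics, MAE and PCC, follow their standard definitions; a minimal sketch of how they are computed over predicted versus clinician-assigned ROCF scores (the function names are illustrative, not the paper's code):

```python
import numpy as np

def mae(y_true, y_pred):
    # Mean Absolute Error: average magnitude of the scoring errors.
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean(np.abs(y_true - y_pred)))

def pcc(y_true, y_pred):
    # Pearson Correlation Coefficient between predicted and true scores.
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.corrcoef(y_true, y_pred)[0, 1])
```

A lower MAE means smaller absolute scoring error on the 0–36 ROCF scale, while a PCC near 1 means the model preserves the ranking and spread of clinician scores; the two are complementary, since a model can have a low MAE yet poor correlation on a narrow score range.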
Conclusion:
Our work introduces four contributions: (1) a cost-effective pipeline for generating synthetic ROCF data, reducing dependency on clinical datasets; (2) a domain-agnostic model for automated ROCF scoring across diverse drawing styles; (3) a lightweight attention mechanism aligning model decisions with clinical scoring for transparency; and (4) a bias-aware framework using synthetic data to reduce demographic disparities, promoting fair cognitive assessment across populations.
Synthetic-to-real attentive deep learning for Alzheimer’s assessment: A domain-agnostic framework for ROCF scoring. Journal of Biomedical Informatics, Volume 171, Article 104929.
Pub Date : 2025-11-01Epub Date: 2025-10-08DOI: 10.1016/j.jbi.2025.104923
Qiuyang Feng , Xiao Huang
Drug–drug interactions (DDIs) are a major concern in healthcare, as concurrent drug use can cause severe adverse effects. Existing machine learning methods often neglect data imbalance and DDI directionality, limiting clinical reliability. To overcome these issues, we employed the GPT-4o large language model to convert free-text DDI descriptions into structured triplets for directionality analysis and applied SMOTE to alleviate class imbalance. Using four key drug features (molecular fingerprints, enzymes, pathways, and targets), our deep neural network (DNN) achieved 88.9% accuracy and showed an average AUPR gain of 0.68 for minority classes attributable to SMOTE. By applying attention-based feature-importance analysis, we demonstrated that the most influential feature in the DNN model was supported by pharmacological evidence. These results demonstrate the effectiveness of our framework for accurate and robust DDI prediction. The source code and data are available at https://github.com/FrankFengF/Drug-drug-interaction-prediction-
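The core SMOTE idea, synthesizing minority-class samples by interpolating between a sample and one of its nearest minority-class neighbors, can be sketched in a few lines. This is a minimal illustrative implementation in NumPy, assuming standard SMOTE behavior; the study would in practice use a library implementation such as imbalanced-learn's `SMOTE`.

```python
import numpy as np

def smote_like(X_min, n_new, k=3, seed=0):
    """Generate n_new synthetic minority samples by interpolating each
    randomly picked sample toward one of its k nearest minority-class
    neighbors (the SMOTE idea; illustrative sketch, not imbalanced-learn)."""
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, float)
    out = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))
        # Euclidean distances to all other minority samples.
        d = np.linalg.norm(X_min - X_min[i], axis=1)
        d[i] = np.inf                      # exclude the sample itself
        nbrs = np.argsort(d)[:k]           # k nearest minority neighbors
        j = rng.choice(nbrs)
        lam = rng.random()                 # interpolation factor in [0, 1)
        out.append(X_min[i] + lam * (X_min[j] - X_min[i]))
    return np.vstack(out)
```

Because every synthetic point lies on a segment between two existing minority samples, oversampling stays inside the minority-class feature region rather than duplicating points, which is what drives the AUPR gains on rare DDI classes.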
Multi-feature machine learning for enhanced drug–drug interaction prediction. Journal of Biomedical Informatics, Volume 171, Article 104923.
Pub Date : 2025-11-01Epub Date: 2025-10-11DOI: 10.1016/j.jbi.2025.104931
Tien-Yu Chang , Qinglin Gou , Leyi Zhao , Tiancheng Zhou , Hongyu Chen , Dong Yang , Huiwen Ju , Kaleb E. Smith , Chengkun Sun , Jinqian Pan , Yu Huang , Xing He , Xuhong Zhang , Daguang Xu , Jie Xu , Jiang Bian , Aokun Chen
Objective
Lung cancer is the most prevalent cancer and the leading cause of cancer-related death in the United States. Lung cancer screening with low-dose computed tomography (LDCT) helps identify lung cancer at an early stage and thus improves overall survival. The growing adoption of LDCT screening has increased radiologists’ workload and demands specialized training to accurately interpret LDCT images and report findings. Advances in artificial intelligence (AI), including large language models (LLMs) and vision models, could help reduce this burden and improve accuracy.
Methods
We devised LUMEN (Lung cancer screening with Unified Multimodal Evaluation and Navigation), a multimodal AI framework that mimics the radiologist’s workflow by identifying nodules in LDCT images, generating their characteristics, and drafting corresponding radiology reports in accordance with reporting guidelines. LUMEN integrates computer vision, vision-language models (VLMs), and LLMs. To assess our system, we developed a benchmarking framework to evaluate the generated lung cancer screening reports against the findings and management criteria outlined in the Lung Imaging Reporting and Data System (Lung-RADS). The framework extracts these findings from radiology reports and measures clinical accuracy, focusing on information that is clinically important for lung cancer screening, independently of report format.
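One piece of such a format-independent extraction step can be sketched with a regular expression that pulls a Lung-RADS category out of free-text report wording. This is a hedged illustration only: the pattern, the function name, and the assumption that reports phrase the category as, e.g., "Lung-RADS category 4B" are mine, not the paper's implementation.

```python
import re

def extract_lungrads(report):
    """Pull a Lung-RADS category (e.g. '2', '3', '4A', '4B', '4X') from
    free-text report wording; returns None if no category is stated.
    Illustrative sketch only, assuming common phrasings like
    'Lung-RADS category 4B' or 'Lung-RADS: 2'."""
    m = re.search(
        r"lung[- ]?rads(?:\s*(?:category|score))?\s*[:#]?\s*([0-4][ABX]?)",
        report,
        flags=re.IGNORECASE,
    )
    return m.group(1).upper() if m else None
```

Evaluating on extracted structured elements like this, rather than on surface text overlap, is what lets the benchmark score clinical accuracy independently of how a report is worded or formatted.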
Results
This benchmarking framework complements existing LLM/VLM semantic-accuracy metrics and provides a more comprehensive view of system performance. Our lung cancer screening report generation system outperformed contemporary VLM systems, including M3D CT2Report. Furthermore, compared to standard LLM metrics, the clinical metrics we designed for lung cancer screening more accurately reflect the clinical utility of the generated reports.
Conclusion
LUMEN demonstrates the feasibility of generating clinically accurate lung nodule reports from LDCT images through a nodule-centric VQA approach, highlighting the potential of integrating VLMs and LLMs to support radiologists in lung cancer screening workflows. Our findings also underscore the importance of applying clinically meaningful evaluation metrics in developing medical AI systems.
From image to report: automating lung cancer screening interpretation and reporting with vision-language models. Journal of Biomedical Informatics, Volume 171, Article 104931.