Pub Date: 2025-02-25 | DOI: 10.7507/1001-5515.202402012
Bin Quan, Yajing Huang, Yanfang Li, Qinqun Chen, Honglai Zhang, Li Li, Guiqing Liu, Hang Wei
Cardiotocography (CTG) is an important non-invasive tool for diagnosing fetal distress during pregnancy. To meet the needs of intelligent fetal heart monitoring based on deep learning, this paper proposes TWD-MOAL, a deep active learning algorithm based on three-way decision (TWD) theory and multi-objective optimization active learning (MOAL). During the training of a convolutional neural network (CNN) classification model, the algorithm applies TWD theory to select high-confidence samples as pseudo-labeled samples in a fine-grained batch processing mode, while low-confidence samples are annotated by obstetrics experts. The TWD-MOAL algorithm was validated on a dataset of 16 355 prenatal CTG records collected by our group. Experimental results showed that the algorithm achieved an accuracy of 80.63% using only 40% of the labeled samples, and outperformed existing active learning algorithms under other frameworks on various indicators. The study shows that the intelligent fetal heart monitoring model based on TWD-MOAL is reasonable and feasible. The algorithm significantly reduces the time and cost of labeling by obstetric experts and effectively alleviates the data imbalance problem in clinical CTG signals, which is of great significance for assisting obstetricians in interpreting CTG signals and realizing intelligent fetal monitoring.
[Research on intelligent fetal heart monitoring model based on deep active learning]. 生物医学工程学杂志, 42(1): 57-64.
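The three-way decision step described in the abstract can be sketched as a simple confidence router. The thresholds below (alpha, beta) are illustrative assumptions, not values from the paper:

```python
import numpy as np

def three_way_split(confidences, alpha=0.9, beta=0.6):
    """Route samples by model confidence using a three-way decision:
    accept (use as pseudo-label), defer (revisit later), or reject
    (send to an expert for annotation). alpha/beta are assumed values."""
    confidences = np.asarray(confidences)
    accept = confidences >= alpha           # high confidence -> pseudo-labeled
    reject = confidences < beta             # low confidence -> expert annotation
    defer = ~accept & ~reject               # boundary region -> decide in a later round
    return accept, defer, reject
```

In each training round, accepted samples would join the training set with their predicted labels, rejected samples would go to the obstetrics experts, and boundary samples would be deferred.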
Pub Date: 2025-02-25 | DOI: 10.7507/1001-5515.202403028
Yudong Cai, Xue Liu, Xiang Liao, Yi Zhou
The processing mechanism of the human brain for speech information is a significant source of inspiration for speech enhancement technology. Attention and lateral inhibition are key mechanisms in auditory information processing that can selectively enhance specific information. Building on this, the study introduces a dual-branch U-Net that integrates lateral inhibition and feedback-driven attention mechanisms. Noisy speech signals input into the first branch of the U-Net led to the selective feedback of time-frequency units with high confidence. The generated activation-layer gradients, in conjunction with the lateral inhibition mechanism, were used to compute attention maps. These maps were then concatenated to the second branch of the U-Net, directing the network's focus and achieving selective enhancement of auditory speech signals. Speech enhancement was evaluated with five metrics, including perceptual evaluation of speech quality, and the method was compared with five others: Wiener, SEGAN, PHASEN, Demucs and GRN. The experimental results demonstrated that the proposed method improved speech enhancement in various noise scenarios by 18% to 21% over the baseline network across multiple performance metrics. The improvement was particularly notable under low signal-to-noise ratios, where the proposed method showed a significant advantage over the other methods. The speech enhancement technique based on lateral inhibition and feedback-driven attention mechanisms holds significant potential in auditory speech enhancement, making it suitable for clinical practice related to cochlear implants and hearing aids.
[Neural network for auditory speech enhancement featuring feedback-driven attention and lateral inhibition]. 生物医学工程学杂志, 42(1): 82-89.
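As a rough illustration of how lateral inhibition sharpens an attention map, the sketch below applies a center-surround rule to a 1-D map. The kernel size and inhibition strength are assumptions; the paper's actual mechanism operates on time-frequency units inside the U-Net:

```python
import numpy as np

def lateral_inhibit(attention, k=3, strength=0.5):
    """Sharpen an attention map by subtracting the average of neighboring
    bins (a simple center-surround lateral inhibition), then rectifying.
    k and strength are illustrative hyperparameters."""
    attention = np.asarray(attention, dtype=float)
    kernel = np.ones(k) / k
    local_mean = np.convolve(attention, kernel, mode="same")
    # neighbor average excluding the center bin itself
    neighbor_mean = (local_mean * k - attention) / (k - 1)
    return np.maximum(attention - strength * neighbor_mean, 0.0)
```

An isolated peak survives unchanged while weaker flanking activity is suppressed, concentrating the network's focus.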
Pub Date: 2025-02-25 | DOI: 10.7507/1001-5515.202411025
Hong Liang, Jipeng Sun, Yong Fan, Desen Cao, Kunlun He, Zhengbo Zhang, Zhi Mao
The intensive care unit (ICU) is a highly equipment-intensive area with a wide variety of medical devices, and the accuracy and timeliness of medical equipment data collection are in high demand. Integrating the Internet of Things (IoT) into ICU medical devices is of great significance for enhancing the quality of medical care and nursing, as well as for the advancement of digital and intelligent ICUs. This study focuses on the construction of the IoT for ICU medical devices and proposes solutions covering overall architecture design, device connection, data collection, data standardization, platform construction and application implementation. The overall architecture was designed around a perception layer, network layer, platform layer and application layer; three modes of device connection and data acquisition were proposed; and data standardization based on Integrating the Healthcare Enterprise-Patient Care Device (IHE-PCD) was adopted. The scheme was verified in practice at the Chinese People's Liberation Army General Hospital: a total of 122 devices in four ICU wards were connected to the IoT, storing 21.76 billion data items with a volume of 12.5 TB, which solved the long-standing difficulty of systematic medical equipment data collection and integration in ICUs. These results demonstrate the feasibility and reliability of the approach. This work provides a reference for the construction of hospital ICU IoT and offers richer data for medical big data research, which can support the improvement of ICU medical services and promote the development of ICUs toward digitalization and intelligence.
[Research and application implementation of the Internet of Things scheme for intensive care unit medical equipment]. 生物医学工程学杂志, 42(1): 65-72.
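A minimal sketch of the data-standardization idea: vendor-specific device payloads mapped onto one shared observation schema. The field names below are purely illustrative and are not the actual IHE-PCD message format:

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass
class DeviceObservation:
    """One normalized reading; field names are illustrative, not IHE-PCD."""
    device_id: str
    metric: str       # e.g. "heart_rate"
    value: float
    unit: str         # e.g. "bpm"
    timestamp: str    # ISO-8601, UTC

def normalize(raw: dict, field_map: dict, unit: str, metric: str) -> DeviceObservation:
    """Map one vendor-specific payload onto the shared schema, so that all
    three connection/acquisition modes feed the platform identical records."""
    return DeviceObservation(
        device_id=str(raw[field_map["id"]]),
        metric=metric,
        value=float(raw[field_map["value"]]),
        unit=unit,
        timestamp=datetime.now(timezone.utc).isoformat(),
    )
```

Each device type then only needs its own `field_map`, while downstream storage and analysis see a single uniform record shape.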
Pub Date: 2025-02-25 | DOI: 10.7507/1001-5515.202406072
Zhiyuan Chen, Yongzhi Huang, Haiqing Yu, Chunyan Cao, Minpeng Xu, Dong Ming
Depression, a mental health disorder, has emerged as one of the significant challenges in global public health. Investigating the pathogenesis of depression and accurately assessing symptomatic changes are fundamental to formulating effective clinical diagnosis and treatment strategies. Using non-invasive brain imaging technologies such as functional magnetic resonance imaging and scalp electroencephalography, existing studies have confirmed that the onset of depression is closely associated with abnormal neural activity and altered functional connectivity in multiple brain regions. Magnetoencephalography, unaffected by tissue conductivity and skull thickness, offers high spatial resolution and signal-to-noise ratio, giving it unique advantages and significant value in revealing the abnormal brain mechanisms and neural characteristics of depression. Starting from the rhythmic, nonlinear dynamic, and connectivity characteristics of magnetoencephalography in patients with depression, this review surveys research progress on depression-related magnetoencephalography features, discusses current issues and future development trends, and provides insights for the study of pathophysiological mechanisms as well as for the clinical diagnosis and treatment of depression.
[Research progress on the characteristics of magnetoencephalography signals in depression]. 生物医学工程学杂志, 42(1): 189-196.
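Two of the feature families the review covers can be illustrated with minimal numpy stand-ins: band power for rhythmic features and a correlation matrix for functional connectivity. Real MEG pipelines use far more elaborate estimators; this only shows the basic quantities:

```python
import numpy as np

def band_power(x, fs, lo, hi):
    """Average spectral power of signal x in the [lo, hi] Hz band
    (simple FFT periodogram)."""
    freqs = np.fft.rfftfreq(len(x), d=1 / fs)
    psd = np.abs(np.fft.rfft(x)) ** 2 / len(x)
    mask = (freqs >= lo) & (freqs <= hi)
    return psd[mask].mean()

def connectivity(signals):
    """Functional connectivity as the Pearson correlation matrix
    across channels (rows of `signals`)."""
    return np.corrcoef(signals)
```

A 10 Hz oscillation, for example, carries its power in the alpha band (8-12 Hz) rather than the beta band, which is the kind of rhythmic contrast studied in depression.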
Pub Date: 2025-02-25 | DOI: 10.7507/1001-5515.202401068
Xiaoli Yang, Zhaolian Wang, Qian Wang, Yiting Zhang, Zixuan Song, Yuchang Zhang, Yafei Qi, Xiaopeng Ma
The gradient field, one of the core magnetic fields in magnetic resonance imaging (MRI) systems, is generated by gradient coils and plays a critical role in spatial encoding and the generation of echo signals. The uniformity or linearity of the gradient field directly impacts the quality and distortion level of MRI images. However, traditional point measurement methods lack accuracy in assessing the linearity of gradient fields, making it difficult to provide effective parameters for image distortion correction. This paper introduced a spherical measurement-based method that involved measuring the magnetic field distribution on a sphere, followed by detailed magnetic field calculations and linearity analysis. This study, applied to assess the nonlinearity of asymmetric head gradient coils, demonstrated more comprehensive and precise results compared to point measurement methods. This advancement not only strengthens the scientific basis for the design of gradient coils but also provides more reliable parameters and methods for the accurate correction of MRI image distortions.
[Spherical measurement-based analysis of gradient nonlinearity in magnetic resonance imaging]. 生物医学工程学杂志, 42(1): 174-180.
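The linearity analysis can be illustrated by fitting the best linear gradient to field samples taken on a sphere and reporting the peak relative deviation. This is a simplified stand-in for the paper's detailed field calculation, not its actual method:

```python
import numpy as np

def gradient_nonlinearity(points, field):
    """Fit the best linear model B = g . r + b0 to field samples taken at
    3-D points, and report the peak deviation relative to the fitted field."""
    A = np.hstack([points, np.ones((len(points), 1))])  # columns: x, y, z, 1
    coef, *_ = np.linalg.lstsq(A, field, rcond=None)    # least-squares gradient fit
    fitted = A @ coef
    return np.max(np.abs(field - fitted)) / np.max(np.abs(fitted))
```

A perfectly linear z-gradient yields a nonlinearity near zero; adding a cubic term (a typical coil imperfection) produces a clearly nonzero figure.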
Pub Date: 2025-02-25 | DOI: 10.7507/1001-5515.202411010
Hongxiang Gao, Zhipeng Cai, Jianqing Li, Chengyu Liu
Cardiovascular diseases and psychological disorders represent two major threats to human physical and mental health. Research on electrocardiogram (ECG) signals offers valuable opportunities to address these issues. However, existing methods are constrained by limitations in understanding ECG features and transferring knowledge across tasks. To address these challenges, this study developed a multi-resolution feature encoding network based on residual networks, which effectively extracted local morphological features and global rhythm features of ECG signals, thereby enhancing feature representation. Furthermore, a model compression-based continual learning method was proposed, enabling the structured transfer of knowledge from simpler tasks to more complex ones, resulting in improved performance in downstream tasks. The multi-resolution learning model demonstrated superior or comparable performance to state-of-the-art algorithms across five datasets, including tasks such as ECG QRS complex detection, arrhythmia classification, and emotion classification. The continual learning method achieved significant improvements over conventional training approaches in cross-domain, cross-task, and incremental data scenarios. These results highlight the potential of the proposed method for effective cross-task knowledge transfer in ECG analysis and offer a new perspective for multi-task learning using ECG signals.
[The joint analysis of heart health and mental health based on continual learning]. 生物医学工程学杂志, 42(1): 1-8.
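The idea of multi-resolution feature encoding (local morphology versus global rhythm) can be sketched with average pooling at several window sizes; the actual model uses residual convolutional blocks rather than plain pooling:

```python
import numpy as np

def multi_resolution_features(x, windows=(4, 16, 64)):
    """Concatenate average-pooled views of a 1-D signal at several temporal
    resolutions: small windows preserve local morphology (e.g. QRS shape),
    large windows summarize global rhythm. Window sizes are illustrative."""
    feats = []
    for w in windows:
        n = len(x) // w
        pooled = x[: n * w].reshape(n, w).mean(axis=1)  # non-overlapping windows
        feats.append(pooled)
    return np.concatenate(feats)
```

A downstream classifier then sees both fine-grained and coarse views of the same ECG segment in a single feature vector.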
Pub Date: 2025-02-25 | DOI: 10.7507/1001-5515.202406041
Ziqiong Wang, Dechun Zhao, Lu Qin, Yi Chen, Yuchen Shen
In audiovisual emotion recognition, representation learning is a research direction receiving considerable attention, and the key lies in constructing effective affective representations with both consistency and variability. However, accurately realizing such representations still faces many challenges. To this end, this paper proposes a cross-modal audiovisual recognition model based on a multi-head cross-attention mechanism. The model achieved feature fusion and modality alignment through a multi-head cross-attention architecture, and adopted a segmented training strategy to cope with missing modalities. In addition, a unimodal auxiliary loss task was designed and parameters were shared in order to preserve the independent information of each modality. Ultimately, the model achieved macro and micro F1 scores of 84.5% and 88.2%, respectively, on the crowd-sourced emotional multimodal actors dataset (CREMA-D). The model effectively captures intra- and inter-modal feature representations of the audio and video modalities and unifies the unimodal and multimodal emotion recognition frameworks, providing a new solution for audiovisual emotion recognition.
[Audiovisual emotion recognition based on a multi-head cross attention mechanism]. 生物医学工程学杂志, 42(1): 24-31.
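A minimal numpy sketch of multi-head cross-attention, where queries come from one modality (say, audio) and keys/values from the other (video). The projection weights here are random stand-ins for learned parameters:

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_cross_attention(q_in, kv_in, wq, wk, wv, n_heads):
    """Cross-attention: queries from one modality attend over keys/values
    projected from the other modality; heads are split along the feature dim."""
    q, k, v = q_in @ wq, kv_in @ wk, kv_in @ wv
    d = q.shape[-1] // n_heads               # per-head dimension
    out = []
    for h in range(n_heads):
        qh, kh, vh = (m[:, h * d:(h + 1) * d] for m in (q, k, v))
        scores = softmax(qh @ kh.T / np.sqrt(d))  # scaled dot-product attention
        out.append(scores @ vh)
    return np.concatenate(out, axis=-1)      # (len_q, d_model)
```

The output has the query modality's sequence length but carries information gathered from the other modality, which is what enables the fusion and alignment described above.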
Pub Date: 2025-02-25 | DOI: 10.7507/1001-5515.202405035
An Zeng, Zhifu Shuai, Dan Pan, Jinzhi Lin
Alzheimer's disease (AD) classification models usually segment the entire brain image into voxel blocks and assign them labels consistent with the entire image, but not every voxel block is closely related to the disease. To this end, an AD auxiliary diagnosis framework based on weakly supervised multi-instance learning (MIL) and multi-scale feature fusion is proposed, and the framework is designed from three aspects: within the voxel block, between voxel blocks, and high-confidence voxel blocks. First, a three-dimensional convolutional neural network was used to extract deep features within the voxel block; then the spatial correlation information between voxel blocks was captured through position encoding and attention mechanism; finally, high-confidence voxel blocks were selected and combined with multi-scale information fusion strategy to integrate key features for classification decision. The performance of the model was evaluated on the Alzheimer's Disease Neuroimaging Initiative (ADNI) and Open Access Series of Imaging Studies (OASIS) datasets. Experimental results showed that the proposed framework improved ACC and AUC by 3% and 4% on average compared with other mainstream frameworks in the two tasks of AD classification and mild cognitive impairment conversion classification, and could find the key voxel blocks that trigger the disease, providing an effective basis for AD auxiliary diagnosis.
Experimental results showed that the proposed framework improved ACC and AUC by an average of 3% and 4%, respectively, compared with other mainstream frameworks on the two tasks of AD classification and mild cognitive impairment conversion classification, and could locate the key voxel blocks associated with the disease, providing an effective basis for AD auxiliary diagnosis.
[Classification of Alzheimer's disease based on multi-example learning and multi-scale feature fusion]. 生物医学工程学杂志, 42(1): 132-139.
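The high-confidence voxel-block selection can be sketched as top-k instance selection followed by simple averaging. The paper's framework fuses multi-scale features rather than averaging raw probabilities, so this is only the selection idea:

```python
import numpy as np

def bag_prediction(instance_probs, k=3):
    """Aggregate per-voxel-block (instance) class probabilities into one
    scan-level (bag) prediction using only the k most confident blocks."""
    instance_probs = np.asarray(instance_probs)   # (n_blocks, n_classes)
    confidence = instance_probs.max(axis=1)       # per-block peak probability
    top = np.argsort(confidence)[-k:]             # indices of high-confidence blocks
    fused = instance_probs[top].mean(axis=0)      # simple fusion of selected blocks
    return top, fused
```

The returned indices also identify which blocks drove the decision, mirroring the framework's ability to point at disease-relevant regions.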
Medical cross-modal retrieval aims to achieve semantic similarity search between different modalities of medical cases, such as quickly locating relevant ultrasound images through ultrasound reports, or using ultrasound images to retrieve matching reports. However, existing medical cross-modal hash retrieval methods face significant challenges, including semantic and visual differences between modalities and the scalability issues of hash algorithms in handling large-scale data. To address these challenges, this paper proposes a Medical image Semantic Alignment Cross-modal Hashing based on Transformer (MSACH). The algorithm employed a segmented training strategy, combining modality feature extraction and hash function learning, effectively extracting low-dimensional features containing important semantic information. A Transformer encoder was used for cross-modal semantic learning. By introducing manifold similarity constraints, balance constraints, and a linear classification network constraint, the algorithm enhanced the discriminability of the hash codes. Experimental results demonstrated that the MSACH algorithm improved the mean average precision (MAP) by 11.8% and 12.8% on two datasets compared to traditional methods. The algorithm exhibits outstanding performance in enhancing retrieval accuracy and handling large-scale medical data, showing promising potential for practical applications.
{"title":"[Cross-modal hash retrieval of medical images based on Transformer semantic alignment].","authors":"Qianlin Wu, Lun Tang, Qinghai Liu, Liming Xu, Qianbin Chen","doi":"10.7507/1001-5515.202407034","DOIUrl":"https://doi.org/10.7507/1001-5515.202407034","url":null,"abstract":"","PeriodicalId":39324,"journal":{"name":"生物医学工程学杂志","volume":"42 1","pages":"156-163"},"PeriodicalIF":0.0,"publicationDate":"2025-02-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143504735","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
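The retrieval step the abstract relies on — binarize each modality's embedding into a compact hash code, then rank candidates by Hamming distance — can be illustrated without the learned components. The sketch below is a minimal stand-in, not MSACH itself: it uses a random linear projection where MSACH learns a Transformer-based hash function under manifold, balance, and classification constraints, and every name, dimension, and data array here is invented for the example.

```python
import numpy as np

def hash_codes(features, projection):
    """Binarize real-valued embeddings into ±1 hash codes via the sign
    of a linear projection (stand-in for a learned hash function)."""
    return np.sign(features @ projection)

def hamming_rank(query_code, db_codes):
    """Rank database items by Hamming distance to the query code.
    For ±1 codes, distance = (bits - dot product) / 2."""
    bits = query_code.shape[0]
    dists = (bits - db_codes @ query_code) / 2
    return np.argsort(dists)

rng = np.random.default_rng(0)
proj = rng.standard_normal((64, 32))        # 64-d features -> 32-bit codes
reports = rng.standard_normal((100, 64))    # stand-ins for report embeddings
images = reports + 1e-3 * rng.standard_normal((100, 64))  # paired image embeddings

img_codes = hash_codes(images, proj)
query = hash_codes(reports[7:8], proj)[0]   # query with report #7
print(hamming_rank(query, img_codes)[0])    # index of the closest image
```

Because each image embedding differs from its paired report only by tiny noise in this toy setup, the paired image comes back first; in a real system the quality of the learned hash function, not the noise level, determines that ranking.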
Pub Date : 2025-02-25DOI: 10.7507/1001-5515.202304067
Yuxin Zhang, Chenrui Zhang, Shihao Sun, Guizhi Xu
This paper proposes a motor imagery recognition algorithm based on feature fusion and transfer adaptive boosting (TrAdaboost) to address the low accuracy of cross-subject motor imagery (MI) recognition, thereby improving the reliability of MI-based brain-computer interfaces (BCI) for cross-individual use. Time-frequency domain features of MI were obtained using an autoregressive model, power spectral density, and the discrete wavelet transform, while the filter bank common spatial pattern was used to extract spatial domain features and multi-scale dispersion entropy was employed to extract nonlinear features. Dataset IV-2a from the 4th International BCI Competition was used for the binary classification task, with the pattern recognition model constructed by combining the improved TrAdaboost ensemble learning algorithm with a support vector machine (SVM), k-nearest neighbors (KNN), and a mind evolutionary algorithm-based back propagation (MEA-BP) neural network. The results show that the SVM-based TrAdaboost ensemble learning algorithm performed best when 30% of the target-domain instance data was transferred, with an average classification accuracy of 86.17%, a Kappa value of 0.7233, and an AUC value of 0.8498. These results suggest that the algorithm can be used to recognize MI signals across individuals, providing a new way to improve the generalization capability of BCI recognition models.
{"title":"[Research on motor imagery recognition based on feature fusion and transfer adaptive boosting].","authors":"Yuxin Zhang, Chenrui Zhang, Shihao Sun, Guizhi Xu","doi":"10.7507/1001-5515.202304067","DOIUrl":"https://doi.org/10.7507/1001-5515.202304067","url":null,"abstract":"","PeriodicalId":39324,"journal":{"name":"生物医学工程学杂志","volume":"42 1","pages":"9-16"},"PeriodicalIF":0.0,"publicationDate":"2025-02-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143504757","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
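The core TrAdaboost mechanism the abstract builds on — source-domain (other-subject) samples lose weight when misclassified, while target-domain errors are boosted as in AdaBoost — can be sketched on a toy one-dimensional problem. This is a minimal illustration of the generic TrAdaBoost weight update with a decision-stump base learner, not the paper's improved SVM-based variant; all function names and data below are invented for the example.

```python
import numpy as np

def stump_fit(X, y, w):
    """Fit a weighted decision stump on a 1-D feature: choose the
    threshold and polarity minimizing the weighted 0-1 error."""
    best_err, best_thr, best_pol = np.inf, 0.0, 1
    for thr in np.unique(X):
        for pol in (1, -1):
            pred = (pol * (X - thr) >= 0).astype(int)
            err = w[pred != y].sum()
            if err < best_err:
                best_err, best_thr, best_pol = err, thr, pol
    return best_thr, best_pol

def stump_predict(X, thr, pol):
    return (pol * (X - thr) >= 0).astype(int)

def tradaboost(Xs, ys, Xt, yt, rounds=10):
    """Minimal TrAdaBoost: misclassified source samples are shrunk by a
    fixed factor; target errors are up-weighted AdaBoost-style."""
    n, m = len(Xs), len(Xt)
    X, y = np.concatenate([Xs, Xt]), np.concatenate([ys, yt])
    w = np.ones(n + m)
    beta_src = 1.0 / (1.0 + np.sqrt(2.0 * np.log(n) / rounds))
    models = []
    for _ in range(rounds):
        p = w / w.sum()
        thr, pol = stump_fit(X, y, p)
        pred = stump_predict(X, thr, pol)
        eps = p[n:][pred[n:] != yt].sum() / p[n:].sum()  # target-domain error
        eps = min(max(eps, 1e-10), 0.499)                # keep weak-learner bound
        beta_t = eps / (1.0 - eps)
        w[:n] *= beta_src ** (pred[:n] != ys).astype(float)
        w[n:] *= beta_t ** (-(pred[n:] != yt).astype(float))
        models.append((thr, pol, beta_t))
    half = models[rounds // 2:]   # vote with the later learners only
    def predict(Xq):
        votes = sum(np.log(1.0 / bt) * stump_predict(Xq, thr, pol)
                    for thr, pol, bt in half)
        thresh = 0.5 * sum(np.log(1.0 / bt) for _, _, bt in half)
        return (votes >= thresh).astype(int)
    return predict

# Toy domain shift: the source decision boundary (4.0) differs from the
# target boundary (6.0), the way another subject's MI features might.
rng = np.random.default_rng(1)
Xs = rng.uniform(0, 10, 200); ys = (Xs > 4.0).astype(int)
Xt = rng.uniform(0, 10, 100); yt = (Xt > 6.0).astype(int)
predict = tradaboost(Xs, ys, Xt, yt)
print(predict(np.array([5.0, 7.0])))   # should follow the target rule
```

The final vote uses only the later half of the learners, following the original TrAdaBoost formulation, since early rounds are still dominated by source-domain behavior.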