Pub Date: 2023-02-01 | Epub Date: 2022-12-10 | DOI: 10.1142/S0129065723500089
Sepehr Shirani, Antonio Valentin, Gonzalo Alarcon, Farhana Kazi, Saeid Sanei
To enable accurate recognition of neuronal excitability in an epileptic brain, for modeling or for localization of the epileptic zone, the brain response to single-pulse electrical stimulation (SPES) has been decomposed into its constituent components using adaptive singular spectrum analysis (SSA). Given the response at the neuronal level, these components are expected to be the inhibitory and excitatory components. The prime objective is to thoroughly investigate the nature of delayed responses (elicited between 100 ms and 1 s after SPES) for localization of the epileptic zone. SSA is a powerful subspace signal analysis method for separating single-channel signals into their constituent uncorrelated components. The consistency of the results for both early and delayed brain responses verifies the usability of the approach.
{"title":"Separating Inhibitory and Excitatory Responses of Epileptic Brain to Single-Pulse Electrical Stimulation.","authors":"Sepehr Shirani, Antonio Valentin, Gonzalo Alarcon, Farhana Kazi, Saeid Sanei","doi":"10.1142/S0129065723500089","DOIUrl":"10.1142/S0129065723500089","url":null,"abstract":"<p><p>To enable an accurate recognition of neuronal excitability in an epileptic brain for modeling or localization of epileptic zone, here the brain response to single-pulse electrical stimulation (SPES) has been decomposed into its constituent components using adaptive singular spectrum analysis (SSA). Given the response at neuronal level, these components are expected to be the inhibitory and excitatory components. The prime objective is to thoroughly investigate the nature of delayed responses (elicited between 100[Formula: see text]ms-1 s after SPES) for localization of the epileptic zone. SSA is a powerful subspace signal analysis method for separation of single channel signals into their constituent uncorrelated components. The consistency in the results for both early and delayed brain responses verifies the usability of the approach.</p>","PeriodicalId":50305,"journal":{"name":"International Journal of Neural Systems","volume":"33 2","pages":"2350008"},"PeriodicalIF":8.0,"publicationDate":"2023-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9190493","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Repetitive transcranial magnetic stimulation (rTMS) is proposed as an effective treatment for major depressive disorder (MDD). However, because of the suboptimal treatment outcomes of rTMS, predicting response to this technique is a crucial task. We developed a deep learning (DL) model to classify responders (R) and non-responders (NR). To this end, we assessed the pre-treatment EEG of 34 MDD patients and extracted effective connectivity (EC) among all electrodes in four frequency bands. The two-dimensional EC maps are put together to create a rich connectivity image, and a sequence of these images is fed to the DL model. The DL framework was constructed from transfer learning (TL) models, namely the pre-trained convolutional neural networks (CNNs) VGG16, Xception, and EfficientNetB0. Long short-term memory (LSTM) cells equipped with an attention mechanism were then added on top of the TL models to fully exploit the spatiotemporal information of the EEG signal. Using leave-one-subject-out cross-validation (LOSO CV), Xception-BLSTM-Attention achieved the highest performance, with an accuracy of 98.86% and a specificity of 97.73%. Fusing these models into an ensemble based on optimized majority voting yielded 99.32% accuracy and 98.34% specificity. The ensemble of TL-LSTM-Attention models can therefore accurately predict the treatment outcome.
{"title":"Attention-Based Convolutional Recurrent Deep Neural Networks for the Prediction of Response to Repetitive Transcranial Magnetic Stimulation for Major Depressive Disorder.","authors":"Mohsen Sadat Shahabi, Ahmad Shalbaf, Behrooz Nobakhsh, Reza Rostami, Reza Kazemi","doi":"10.1142/S0129065723500077","DOIUrl":"https://doi.org/10.1142/S0129065723500077","url":null,"abstract":"<p><p>Repetitive Transcranial Magnetic Stimulation (rTMS) is proposed as an effective treatment for major depressive disorder (MDD). However, because of the suboptimal treatment outcome of rTMS, the prediction of response to this technique is a crucial task. We developed a deep learning (DL) model to classify responders (R) and non-responders (NR). With this aim, we assessed the pre-treatment EEG signal of 34 MDD patients and extracted effective connectivity (EC) among all electrodes in four frequency bands of EEG signal. Two-dimensional EC maps are put together to create a rich connectivity image and a sequence of these images is fed to the DL model. Then, the DL framework was constructed based on transfer learning (TL) models which are pre-trained convolutional neural networks (CNN) named VGG16, Xception, and EfficientNetB0. Then, long short-term memory (LSTM) cells are equipped with an attention mechanism added on top of TL models to fully exploit the spatiotemporal information of EEG signal. Using leave-one subject out cross validation (LOSO CV), Xception-BLSTM-Attention acquired the highest performance with 98.86% of accuracy and 97.73% of specificity. Fusion of these models as an ensemble model based on optimized majority voting gained 99.32% accuracy and 98.34% of specificity. Therefore, the ensemble of TL-LSTM-Attention models can predict accurately the treatment outcome.</p>","PeriodicalId":50305,"journal":{"name":"International Journal of Neural Systems","volume":"33 2","pages":"2350007"},"PeriodicalIF":8.0,"publicationDate":"2023-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10638196","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2023-02-01 | DOI: 10.1142/S0129065723500053
Ningbo Fei, Rong Li, Hongyan Cui, Yong Hu
Somatosensory evoked potentials (SEPs) are commonly used for intraoperative monitoring to detect neurological deficits during scoliosis surgery. However, SEPs can vary substantially in response to patient-specific factors such as physiological parameters, leading to false warnings. This study proposes a prediction model to quantify SEP amplitude variation due to non-injury-related physiological changes in patients undergoing scoliosis surgery. Based on a hybrid network of attention-based long short-term memory (LSTM) and convolutional neural networks (CNNs), we develop a deep learning framework for predicting the SEP value in response to variation of physiological variables. Training and model selection were based on a 5-fold cross-validation scheme with mean square error (MSE) as the evaluation metric. On the test set, the proposed model obtained an MSE of 0.027 on left cortical SEP, 0.024 on left subcortical SEP, 0.031 on right cortical SEP, and 0.025 on right subcortical SEP. The model can thus quantify the effect of physiological parameters on SEP amplitude under normal physiological variation during scoliosis surgery, and the predicted SEP amplitude provides a potential varying reference for intraoperative SEP monitoring.
{"title":"A Prediction Model for Normal Variation of Somatosensory Evoked Potential During Scoliosis Surgery.","authors":"Ningbo Fei, Rong Li, Hongyan Cui, Yong Hu","doi":"10.1142/S0129065723500053","DOIUrl":"https://doi.org/10.1142/S0129065723500053","url":null,"abstract":"<p><p>Somatosensory evoked potential (SEP) has been commonly used as intraoperative monitoring to detect the presence of neurological deficits during scoliosis surgery. However, SEP usually presents an enormous variation in response to patient-specific factors such as physiological parameters leading to the false warning. This study proposes a prediction model to quantify SEP amplitude variation due to noninjury-related physiological changes of the patient undergoing scoliosis surgery. Based on a hybrid network of attention-based long-short-term memory (LSTM) and convolutional neural networks (CNNs), we develop a deep learning-based framework for predicting the SEP value in response to variation of physiological variables. The training and selection of model parameters were based on a 5-fold cross-validation scheme using mean square error (MSE) as evaluation metrics. The proposed model obtained MSE of 0.027[Formula: see text][Formula: see text] on left cortical SEP, MSE of 0.024[Formula: see text][Formula: see text] on left subcortical SEP, MSE of 0.031[Formula: see text][Formula: see text] on right cortical SEP, and MSE of 0.025[Formula: see text][Formula: see text] on right subcortical SEP based on the test set. The proposed model could quantify the affection from physiological parameters to the SEP amplitude in response to normal variation of physiology during scoliosis surgery. The prediction of SEP amplitude provides a potential varying reference for intraoperative SEP monitoring.</p>","PeriodicalId":50305,"journal":{"name":"International Journal of Neural Systems","volume":"33 2","pages":"2350005"},"PeriodicalIF":8.0,"publicationDate":"2023-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10629178","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The problem of human activity recognition (HAR) has been increasingly attracting the efforts of the research community and has several applications. It consists of recognizing human motion and/or behavior within a given image or video sequence, using raw sensor measurements as input. In this paper, a multimodal approach to video-based HAR is proposed. It is based on 3D visual data collected with an RGB + depth camera, resulting in both raw video and 3D skeletal sequences. These data are transformed into six different 2D image representations. Five of them are based on the skeletal data: four in the spectral domain and one a pseudo-colored image. The sixth is a "dynamic" image, an artificially created image that summarizes the RGB data of the whole video sequence in a visually comprehensible way. To classify a given activity video, all six 2D images are first extracted, and six trained convolutional neural networks are then used to extract visual features. The latter are fused into a single feature vector and fed into a support vector machine for classification into human activities. For evaluation purposes, a challenging motion activity recognition dataset is used, with single-view, cross-view and cross-subject experiments. Moreover, the proposed approach is compared to three other state-of-the-art methods, demonstrating superior performance in most experiments.
{"title":"A Multimodal Fusion Approach for Human Activity Recognition.","authors":"Dimitrios Koutrintzes, Evaggelos Spyrou, Eirini Mathe, Phivos Mylonas","doi":"10.1142/S0129065723500028","DOIUrl":"https://doi.org/10.1142/S0129065723500028","url":null,"abstract":"<p><p>The problem of human activity recognition (HAR) has been increasingly attracting the efforts of the research community, having several applications. It consists of recognizing human motion and/or behavior within a given image or a video sequence, using as input raw sensor measurements. In this paper, a multimodal approach addressing the task of video-based HAR is proposed. It is based on 3D visual data that are collected using an RGB + depth camera, resulting to both raw video and 3D skeletal sequences. These data are transformed into six different 2D image representations; four of them are in the spectral domain, another is a pseudo-colored image. The aforementioned representations are based on skeletal data. The last representation is a \"dynamic\" image which is actually an artificially created image that summarizes RGB data of the whole video sequence, in a visually comprehensible way. In order to classify a given activity video, first, all the aforementioned 2D images are extracted and then six trained convolutional neural networks are used so as to extract visual features. The latter are fused so as to form a single feature vector and are fed into a support vector machine for classification into human activities. For evaluation purposes, a challenging motion activity recognition dataset is used, while single-view, cross-view and cross-subject experiments are performed. Moreover, the proposed approach is compared to three other state-of-the-art methods, demonstrating superior performance in most experiments.</p>","PeriodicalId":50305,"journal":{"name":"International Journal of Neural Systems","volume":"33 1","pages":"2350002"},"PeriodicalIF":8.0,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9083202","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2023-01-01 | DOI: 10.1142/S0129065722500575
Jennifer Sorinas, Juan C Fernandez Troyano, Jose Manuel Ferrández, Eduardo Fernandez
The large range of potential applications of affective brain-computer interfaces (aBCI), not only for patients but also for healthy people, makes the need for a commonly accepted protocol for real-time EEG-based emotion recognition ever more pressing. Using wavelet packets for spectral feature extraction, in keeping with the nature of the EEG signal, we have specified some of the main parameters needed to implement robust positive- and negative-emotion classification. Twelve seconds emerged as the most appropriate sliding-window size, and from that a set of 20 target frequency-location variables was identified as the most relevant features carrying the emotional information. Lastly, QDA and KNN classifiers, together with a population rating criterion for stimulus labeling, are suggested as the most suitable approaches for EEG-based emotion recognition. The proposed model reached a mean accuracy of 98% (s.d. 1.4) and 98.96% (s.d. 1.28) in a subject-dependent (SD) approach for the QDA and KNN classifiers, respectively. This model represents a step toward real-time classification. Moreover, new insights regarding the subject-independent (SI) setting are discussed, although those results were not conclusive.
{"title":"Unraveling the Development of an Algorithm for Recognizing Primary Emotions Through Electroencephalography.","authors":"Jennifer Sorinas, Juan C Fernandez Troyano, Jose Manuel Ferrández, Eduardo Fernandez","doi":"10.1142/S0129065722500575","DOIUrl":"https://doi.org/10.1142/S0129065722500575","url":null,"abstract":"<p><p>The large range of potential applications, not only for patients but also for healthy people, that could be achieved by affective brain-computer interface (aBCI) makes more latent the necessity of finding a commonly accepted protocol for real-time EEG-based emotion recognition. Based on wavelet package for spectral feature extraction, attending to the nature of the EEG signal, we have specified some of the main parameters needed for the implementation of robust positive and negative emotion classification. Twelve seconds has resulted as the most appropriate sliding window size; from that, a set of 20 target frequency-location variables have been proposed as the most relevant features that carry the emotional information. Lastly, QDA and KNN classifiers and population rating criterion for stimuli labeling have been suggested as the most suitable approaches for EEG-based emotion recognition. The proposed model reached a mean accuracy of 98% (s.d. 1.4) and 98.96% (s.d. 1.28) in a subject-dependent (SD) approach for QDA and KNN classifier, respectively. This new model represents a step forward towards real-time classification. Moreover, new insights regarding subject-independent (SI) approximation have been discussed, although the results were not conclusive.</p>","PeriodicalId":50305,"journal":{"name":"International Journal of Neural Systems","volume":"33 1","pages":"2250057"},"PeriodicalIF":8.0,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10587567","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2023-01-01 | DOI: 10.1142/S0129065723500016
D Nhu, M Janmohamed, L Shakhatreh, O Gonen, P Perucca, A Gilligan, P Kwan, T J O'Brien, C W Tan, L Kuhlmann
Deep learning for automated interictal epileptiform discharge (IED) detection has been a topical area, with many papers published in recent years. All existing works viewed EEG signals as time series and developed task-specific models for IED classification; general time-series classification (TSC) methods were not considered. Moreover, none of these methods were evaluated on public datasets, making direct comparisons challenging. This paper explored two state-of-the-art convolution-based TSC algorithms, InceptionTime and Minirocket, for IED detection. We fine-tuned and cross-evaluated them on a public dataset (Temple University Events, TUEV) and two private datasets, and provide ready metrics for benchmarking future work. We observed that the optimal parameters correlated with the clinical duration of an IED, achieving the best area under the precision-recall curve (AUPRC) of 0.98 and F1 of 0.80 on the private datasets. The AUPRC and F1 on the TUEV dataset were 0.99 and 0.97, respectively. While algorithms trained on the private sets maintained their performance when tested on the TUEV data, those trained on TUEV did not generalize well to the private data. These results stem from differences in the class distributions across datasets and indicate a need for public datasets with greater diversity of IED waveforms, background activities and artifacts to facilitate standardization and benchmarking of algorithms.
{"title":"Automated Interictal Epileptiform Discharge Detection from Scalp EEG Using Scalable Time-series Classification Approaches.","authors":"D Nhu, M Janmohamed, L Shakhatreh, O Gonen, P Perucca, A Gilligan, P Kwan, T J O'Brien, C W Tan, L Kuhlmann","doi":"10.1142/S0129065723500016","DOIUrl":"https://doi.org/10.1142/S0129065723500016","url":null,"abstract":"<p><p>Deep learning for automated interictal epileptiform discharge (IED) detection has been topical with many published papers in recent years. All existing works viewed EEG signals as time-series and developed specific models for IED classification; however, general time-series classification (TSC) methods were not considered. Moreover, none of these methods were evaluated on any public datasets, making direct comparisons challenging. This paper explored two state-of-the-art convolutional-based TSC algorithms, InceptionTime and Minirocket, on IED detection. We fine-tuned and cross-evaluated them on a public (Temple University Events - TUEV) and two private datasets and provided ready metrics for benchmarking future work. We observed that the optimal parameters correlated with the clinical duration of an IED and achieved the best area under precision-recall curve (AUPRC) of 0.98 and F1 of 0.80 on the private datasets, respectively. The AUPRC and F1 on the TUEV dataset were 0.99 and 0.97, respectively. While algorithms trained on the private sets maintained their performance when tested on the TUEV data, those trained on TUEV could not generalize well to the private data. These results emerge from differences in the class distributions across datasets and indicate a need for public datasets with a better diversity of IED waveforms, background activities and artifacts to facilitate standardization and benchmarking of algorithms.</p>","PeriodicalId":50305,"journal":{"name":"International Journal of Neural Systems","volume":"33 1","pages":"2350001"},"PeriodicalIF":8.0,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9098300","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
K. Raeisi, M. Khazaei, G. Tamburro, Pierpaolo Croce, S. Comani, F. Zappasodi
{"title":"Spatio-Temporal Graph Attention Network for Neonatal Seizure Detection","authors":"K. Raeisi, M. Khazaei, G. Tamburro, Pierpaolo Croce, S. Comani, F. Zappasodi","doi":"10.2139/ssrn.4327675","DOIUrl":"https://doi.org/10.2139/ssrn.4327675","url":null,"abstract":"","PeriodicalId":50305,"journal":{"name":"International Journal of Neural Systems","volume":"1 1","pages":""},"PeriodicalIF":8.0,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"68774270","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2023-01-01 | DOI: 10.1142/S0129065722500605
Ronghao Xian, Rikong Lugu, Hong Peng, Qian Yang, Xiaohui Luo, Jun Wang
Nonlinear spiking neural P (NSNP) systems are a class of neural-like computational models inspired by the nonlinear mechanism of spiking neurons; this nonlinear spiking mechanism is their distinguishing feature. To handle edge detection in images, this paper proposes a variant, NSNP systems with two outputs, termed NSNP-TO systems. Based on the NSNP-TO system, an edge detection framework is developed, termed the ED-NSNP detector. The detection ability of the ED-NSNP detector relies on two convolutional kernels, and particle swarm optimization (PSO) is used to optimize the parameters of the two kernels for good detection performance. The proposed ED-NSNP detector is evaluated on several open benchmark images and compared with seven baseline edge detection methods. The comparison results indicate the feasibility and effectiveness of the proposed detector.
{"title":"Edge Detection Method Based on Nonlinear Spiking Neural Systems.","authors":"Ronghao Xian, Rikong Lugu, Hong Peng, Qian Yang, Xiaohui Luo, Jun Wang","doi":"10.1142/S0129065722500605","DOIUrl":"https://doi.org/10.1142/S0129065722500605","url":null,"abstract":"<p><p>Nonlinear spiking neural P (NSNP) systems are a class of neural-like computational models inspired from the nonlinear mechanism of spiking neurons. NSNP systems have a distinguishing feature: nonlinear spiking mechanism. To handle edge detection of images, this paper proposes a variant, nonlinear spiking neural P (NSNP) systems with two outputs (TO), termed as NSNP-TO systems. Based on NSNP-TO system, an edge detection framework is developed, termed as ED-NSNP detector. The detection ability of ED-NSNP detector relies on two convolutional kernels. To obtain good detection performance, particle swarm optimization (PSO) is used to optimize the parameters of the two convolutional kernels. The proposed ED-NSNP detector is evaluated on several open benchmark images and compared with seven baseline edge detection methods. The comparison results indicate the availability and effectiveness of the proposed ED-NSNP detector.</p>","PeriodicalId":50305,"journal":{"name":"International Journal of Neural Systems","volume":"33 1","pages":"2250060"},"PeriodicalIF":8.0,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10533406","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2023-01-01 | DOI: 10.1142/S0129065722500599
Jianyong Wang, Lei Zhang, Zhang Yi
Three-dimensional (3D) medical image segmentation plays a crucial role in medical care applications. Although various two-dimensional (2D) and 3D neural network models have been applied to 3D medical image segmentation and achieved impressive results, a trade-off remains between efficiency and accuracy. To address this issue, a novel mixture convolutional network (MixConvNet) is proposed, in which traditional 2D/3D convolutional blocks are replaced with novel MixConv blocks. In the MixConv block, 3D convolution is decomposed into a mixture of 2D convolutions from different views. The MixConv block therefore fully utilizes the advantages of 2D convolution while maintaining the learning ability of 3D convolution: it acts as a 3D convolution and can thus process volumetric input directly and learn inter-slice features, which the traditional 2D convolutional block cannot. By contrast, because the proposed MixConv block contains only 2D convolutions, it has significantly fewer trainable parameters and a smaller computation budget than a block containing 3D convolutions. Furthermore, the proposed MixConvNet is pre-trained with small input patches and fine-tuned with large input patches to further improve segmentation performance. In experiments on the Decathlon Heart dataset and Sliver07 dataset, the proposed MixConvNet outperformed state-of-the-art methods such as UNet3D, VNet, and nnUnet.
{"title":"Mixture 2D Convolutions for 3D Medical Image Segmentation.","authors":"Jianyong Wang, Lei Zhang, Zhang Yi","doi":"10.1142/S0129065722500599","DOIUrl":"https://doi.org/10.1142/S0129065722500599","url":null,"abstract":"<p><p>Three-dimensional (3D) medical image segmentation plays a crucial role in medical care applications. Although various two-dimensional (2D) and 3D neural network models have been applied to 3D medical image segmentation and achieved impressive results, a trade-off remains between efficiency and accuracy. To address this issue, a novel mixture convolutional network (MixConvNet) is proposed, in which traditional 2D/3D convolutional blocks are replaced with novel MixConv blocks. In the MixConv block, 3D convolution is decomposed into a mixture of 2D convolutions from different views. Therefore, the MixConv block fully utilizes the advantages of 2D convolution and maintains the learning ability of 3D convolution. It acts as 3D convolutions and thus can process volumetric input directly and learn intra-slice features, which are absent in the traditional 2D convolutional block. By contrast, the proposed MixConv block only contains 2D convolutions; hence, it has significantly fewer trainable parameters and less computation budget than a block containing 3D convolutions. Furthermore, the proposed MixConvNet is pre-trained with small input patches and fine-tuned with large input patches to improve segmentation performance further. In experiments on the Decathlon Heart dataset and Sliver07 dataset, the proposed MixConvNet outperformed the state-of-the-art methods such as UNet3D, VNet, and nnUnet.</p>","PeriodicalId":50305,"journal":{"name":"International Journal of Neural Systems","volume":"33 1","pages":"2250059"},"PeriodicalIF":8.0,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10533407","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In recent years, deep learning has shown very competitive performance in seizure detection. However, most current methods either convert electroencephalogram (EEG) signals into spectral images and employ 2D-CNNs, or split the one-dimensional (1D) features of EEG signals into many segments and employ 1D-CNNs. These investigations are further constrained by not considering the temporal links between time-series segments or spectrogram images. We therefore propose a Dual-Modal Information Bottleneck (Dual-modal IB) network for EEG seizure detection. The network extracts EEG features in both the time-series and spectrogram dimensions, allowing information from the two modalities to pass through the Dual-modal IB, which requires the model to gather and condense the most pertinent information in each modality and share only what is necessary. Specifically, we make full use of the information shared between the two modality representations to obtain key information for seizure detection and to remove irrelevant features between the two modalities. In addition, to exploit the intrinsic temporal dependencies, we introduce a bidirectional long short-term memory (BiLSTM) into the Dual-modal IB model, which models the temporal relationships among the per-modality features extracted by the convolutional neural networks (CNNs). On the CHB-MIT dataset, the proposed framework achieves an average segment-based sensitivity of 97.42%, specificity of 99.32%, and accuracy of 98.29%, with an average event-based sensitivity of 96.02% and a false detection rate (FDR) of 0.70/h. We release our code at https://github.com/LLLL1021/Dual-modal-IB.
{"title":"Dual-Modal Information Bottleneck Network for Seizure Detection.","authors":"Jiale Wang, Xinting Ge, Yunfeng Shi, Mengxue Sun, Qingtao Gong, Haipeng Wang, Wenhui Huang","doi":"10.1142/S0129065722500617","DOIUrl":"https://doi.org/10.1142/S0129065722500617","url":null,"abstract":"<p><p>In recent years, deep learning has shown very competitive performance in seizure detection. However, most of the currently used methods either convert electroencephalogram (EEG) signals into spectral images and employ 2D-CNNs, or split the one-dimensional (1D) features of EEG signals into many segments and employ 1D-CNNs. Moreover, these investigations are further constrained by the absence of consideration for temporal links between time series segments or spectrogram images. Therefore, we propose a Dual-Modal Information Bottleneck (Dual-modal IB) network for EEG seizure detection. The network extracts EEG features from both time series and spectrogram dimensions, allowing information from different modalities to pass through the Dual-modal IB, requiring the model to gather and condense the most pertinent information in each modality and only share what is necessary. Specifically, we make full use of the information shared between the two modality representations to obtain key information for seizure detection and to remove irrelevant feature between the two modalities. In addition, to explore the intrinsic temporal dependencies, we further introduce a bidirectional long-short-term memory (BiLSTM) for Dual-modal IB model, which is used to model the temporal relationships between the information after each modality is extracted by convolutional neural network (CNN). For CHB-MIT dataset, the proposed framework can achieve an average segment-based sensitivity of 97.42%, specificity of 99.32%, accuracy of 98.29%, and an average event-based sensitivity of 96.02%, false detection rate (FDR) of 0.70/h. We release our code at https://github.com/LLLL1021/Dual-modal-IB.</p>","PeriodicalId":50305,"journal":{"name":"International Journal of Neural Systems","volume":"33 1","pages":"2250061"},"PeriodicalIF":8.0,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9098298","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}