Pub Date: 2022-08-29 DOI: 10.23919/eusipco55093.2022.9909716
Shubhr Singh, Huy Phan, Emmanouil Benetos
Polyphonic sound event detection (SED) involves predicting the sound events present in an audio recording along with their onset and offset times. Recently, deep neural networks, specifically convolutional recurrent neural networks (CRNNs), have achieved impressive results on this task. The convolutional part of the architecture extracts translation-invariant features from the input, and the recurrent part learns the underlying temporal relationships between audio frames. Recent studies have shown that the weight-sharing paradigm of recurrent networks can be a hindrance for certain kinds of time-series data, specifically where there is a temporal conditional shift, i.e., where the conditional distribution of a label changes across the temporal scale. This raises a relevant question: does a similar phenomenon occur in polyphonic sound events because the polyphony level varies along the temporal axis? In this work, we explore this question and investigate whether relaxed weight sharing improves the performance of a CRNN for polyphonic SED. We propose using hypernetworks to relax weight sharing in the recurrent part and show that the CRNN's performance improves by ≈ 3% across two datasets, paving the way for further exploration of temporal conditional shift in polyphonic SED.
{"title":"Hypernetworks for Sound event Detection: a Proof-of-Concept","authors":"Shubhr Singh, Huy Phan, Emmanouil Benetos","doi":"10.23919/eusipco55093.2022.9909716","DOIUrl":"https://doi.org/10.23919/eusipco55093.2022.9909716","PeriodicalId":231263,"journal":{"name":"2022 30th European Signal Processing Conference (EUSIPCO)","volume":"33 1","pages":"0"},"publicationDate":"2022-08-29","platform":"Semanticscholar","paperid":"127374303"}
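The idea of relaxing weight sharing with a hypernetwork, as described in the abstract above, can be illustrated with a toy sketch. This is not the authors' architecture: the linear hypernetwork, the per-step embeddings, and the names `generate_weights`, `matvec`, and `hyper` are all assumptions chosen for illustration. The point is only that a small generator network can emit a different weight matrix at each time step instead of reusing one shared matrix.

```python
def generate_weights(z, hyper, rows, cols):
    """Hypothetical linear hypernetwork: map embedding z to a rows x cols matrix."""
    # Each row of `hyper` produces one entry of the flattened weight matrix.
    flat = [sum(h * zi for h, zi in zip(h_row, z)) for h_row in hyper]
    return [flat[r * cols:(r + 1) * cols] for r in range(rows)]


def matvec(weights, x):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


# Hypernetwork parameters (assumed values), shape (rows * cols) x len(z).
hyper = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [1.0, 0.0]]

# One embedding per time step; different embeddings give different weights.
step_embeddings = [[1.0, 0.0], [0.0, 1.0]]

x = [2.0, 3.0]
outputs = [matvec(generate_weights(z, hyper, 2, 2), x) for z in step_embeddings]
print(outputs)
```

With these toy values the first step applies an identity matrix and the second a swap matrix, so the same input is transformed differently at each step.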
Pub Date: 2022-08-29 DOI: 10.23919/eusipco55093.2022.9909891
Mahmoud El-Hindi, Michael Muma, A. Zoubir
Speaker diarization systems process audio files by labelling speech segments according to the speakers' identities. Many speaker diarization systems work offline and are not suited for online applications. We present a semi-supervised, online, low-complexity system. While speaker diarization generally operates in an unsupervised manner, the presented system relies on enrolling the speakers who participate in the conversation. The diarization system has two main novel aspects. The first is an online learning strategy that evaluates processed segments according to their usefulness for learning a speaker, i.e., for updating a speaker model. Each segment is evaluated using two metrics to determine whether it should be used to update the system. The second is a vector quantization approach in which the score depends not only on the target speaker's codebook but also on an alternative codebook. We also present an approach for computing the alternative codebook. Simulation results show that the proposed system outperforms a comparable system without the proposed online learning strategy, with benefits especially for short training lengths.
{"title":"Semi-Supervised Online Speaker Diarization using Vector Quantization with Alternative Codebooks","authors":"Mahmoud El-Hindi, Michael Muma, A. Zoubir","doi":"10.23919/eusipco55093.2022.9909891","DOIUrl":"https://doi.org/10.23919/eusipco55093.2022.9909891","PeriodicalId":231263,"journal":{"name":"2022 30th European Signal Processing Conference (EUSIPCO)","volume":"16 1","pages":"0"},"publicationDate":"2022-08-29","platform":"Semanticscholar","paperid":"127203912"}
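One plausible way to read the vector quantization scoring described in the abstract above is as a distortion gap between two codebooks. The sketch below assumes squared Euclidean distortion and a simple difference of nearest-codeword distortions; the abstract does not specify the actual scoring rule, so `min_distortion`, `segment_score`, and the toy codebooks are hypothetical.

```python
def min_distortion(frame, codebook):
    """Squared Euclidean distance from a feature frame to its nearest codeword."""
    return min(sum((x - c) ** 2 for x, c in zip(frame, cw)) for cw in codebook)


def segment_score(frames, target_codebook, alternative_codebook):
    """Average distortion gap (assumed rule); positive favors the target speaker."""
    gaps = [
        min_distortion(f, alternative_codebook) - min_distortion(f, target_codebook)
        for f in frames
    ]
    return sum(gaps) / len(gaps)


# Toy 2-D feature vectors; real systems would use e.g. cepstral features.
target = [(0.0, 0.0), (1.0, 1.0)]
alternative = [(5.0, 5.0)]
frames = [(0.0, 1.0), (1.0, 0.0)]

score = segment_score(frames, target, alternative)
print(score)
```

A segment whose frames lie close to the target codebook gets a positive score here, while frames closer to the alternative codebook push the score negative.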
Pub Date: 2022-08-29 DOI: 10.23919/eusipco55093.2022.9909586
Zhengning Zhang, Lin Zhang, Yue Wang, P. Feng, Shaobo Liu, Jian Wang
Fine-grained ship detection in optical remote sensing images is challenging because the datasets are long-tailed, a problem that is often coupled with the multi-scale nature of ships and complex environments. In this paper, a novel average instance area imbalance ratio (AIAIR) is first used to quantitatively evaluate the coupled long-tailed distribution and multi-scale problem. Based on this, we propose feature space decoupling and augmentation guided by cross-level semantic segmentation, in which features at different class-wise balance levels are scheduled. On this basis, a Siamese Semantic Segmentation Guided Ship Detection Network (SGSDet) is proposed to improve fine-grained ship detection performance. The proposed method can be easily plugged into existing object detection models. Numerical experiments show that the proposed method outperforms the baseline by 2.32% mAP on the ShipRSImageNet dataset without extra annotations.
{"title":"Cross-Level Semantic Segmentation Guided Feature Space Decoupling And Augmentation for Fine-Grained Ship Detection","authors":"Zhengning Zhang, Lin Zhang, Yue Wang, P. Feng, Shaobo Liu, Jian Wang","doi":"10.23919/eusipco55093.2022.9909586","DOIUrl":"https://doi.org/10.23919/eusipco55093.2022.9909586","PeriodicalId":231263,"journal":{"name":"2022 30th European Signal Processing Conference (EUSIPCO)","volume":"14 1","pages":"0"},"publicationDate":"2022-08-29","platform":"Semanticscholar","paperid":"128915271"}
Pub Date: 2022-08-29 DOI: 10.23919/eusipco55093.2022.9909751
David Meltzer, D. Luengo
Recognition of individuals through different biometric traits is becoming increasingly important. Apart from traditional biomarkers (like fingerprints), many alternative traits have been proposed during the last two decades: ECG and EEG signals, iris or facial recognition, behavioral traits, etc. Several works have shown that ECG-based recognition is a feasible alternative for either stand-alone or multibiometric recognition systems. In this paper, we propose a novel, efficient and scalable clustering-based method for ECG biometric recognition. First, fixed-length segments of the ECG are extracted without the need to compute any fiducial points. Unique traits for each subject are then obtained by computing the autocorrelation (AC) of each segment, followed by the discrete cosine transform (DCT) to compress the available information. Finally, hierarchical agglomerative clustering (HAC) is applied to generate the groups that are later used for classification. Combining the DCT, which reduces the feature dimensionality, with HAC, which decreases the number of features required by the classifier, results in a method that is efficient in terms of both memory storage and computation. Furthermore, the proposed AC/DCT-HAC (ADH) approach is robust, since no fiducial points (which may be difficult to extract reliably) are required; it is also scalable and attains better performance than other approaches with higher storage/computational cost.
{"title":"An efficient clustering-based non-fiducial approach for ECG biometric recognition","authors":"David Meltzer, D. Luengo","doi":"10.23919/eusipco55093.2022.9909751","DOIUrl":"https://doi.org/10.23919/eusipco55093.2022.9909751","PeriodicalId":231263,"journal":{"name":"2022 30th European Signal Processing Conference (EUSIPCO)","volume":"13 1","pages":"0"},"publicationDate":"2022-08-29","platform":"Semanticscholar","paperid":"130553894"}
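The AC/DCT feature step described in the abstract above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the mean removal, the normalization by lag-0 energy, the unnormalized DCT-II, and the values of `max_lag` and `n_coeffs` are all assumptions, and the subsequent HAC step is omitted.

```python
import math


def autocorrelation(segment, max_lag):
    """Normalized autocorrelation for lags 0..max_lag (lag 0 equals 1)."""
    mean = sum(segment) / len(segment)
    x = [v - mean for v in segment]
    energy = sum(v * v for v in x)
    return [
        sum(x[n] * x[n + lag] for n in range(len(x) - lag)) / energy
        for lag in range(max_lag + 1)
    ]


def dct2(values, n_coeffs):
    """First n_coeffs coefficients of the unnormalized DCT-II."""
    size = len(values)
    return [
        sum(v * math.cos(math.pi * (n + 0.5) * k / size) for n, v in enumerate(values))
        for k in range(n_coeffs)
    ]


def ac_dct_features(segment, max_lag=8, n_coeffs=4):
    """Compress the autocorrelation of a fixed-length segment with the DCT."""
    return dct2(autocorrelation(segment, max_lag), n_coeffs)


# Synthetic stand-in for an ECG segment: a sine wave with period 16 samples.
segment = [math.sin(2 * math.pi * n / 16) for n in range(64)]
features = ac_dct_features(segment)
print(features)
```

Because the features depend only on a fixed-length window, no fiducial points (such as R-peaks) need to be located first. The resulting vectors could then be grouped with any agglomerative clustering routine.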
Pub Date: 2022-08-29 DOI: 10.23919/eusipco55093.2022.9909601
Petteri Pulkkinen, V. Koivunen
This paper addresses the problems of co-design and cooperation among radar and communication systems operating in a shared spectrum scenario. Online learning makes it possible to use the spectrum flexibly while managing and mitigating interference that varies rapidly over time, frequency, and space. We extend the previously proposed Model-Based Online Learning (MBOL) algorithm [1] to allocate frequency and power resources among co-designed and collaborating sensing and communication systems in dynamic interference scenarios. The proposed MBOL algorithm learns a predictive spectrum model using online convex optimization (OCO), assigns sub-bands between sensing and communications tasks, and optimizes their power for the tasks at hand. The proposed MBOL method is evaluated in simulations using the proposed constrained regret criterion. It is shown to improve sensing and communications performance compared to the baseline method, achieving lower and sub-linear constrained regret.
{"title":"Model-Based Online Learning for Joint Radar-Communication Systems Operating in Dynamic Interference","authors":"Petteri Pulkkinen, V. Koivunen","doi":"10.23919/eusipco55093.2022.9909601","DOIUrl":"https://doi.org/10.23919/eusipco55093.2022.9909601","PeriodicalId":231263,"journal":{"name":"2022 30th European Signal Processing Conference (EUSIPCO)","volume":"22 1","pages":"0"},"publicationDate":"2022-08-29","platform":"Semanticscholar","paperid":"130840149"}
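As background for the online convex optimization (OCO) mentioned in the abstract above, here is a generic projected online gradient descent loop. It is not the MBOL algorithm: the scalar decision variable, the squared loss, the fixed step size, and the interval constraint are assumptions made to keep the sketch short.

```python
def project(x, lo, hi):
    """Project a scalar onto the interval [lo, hi]."""
    return max(lo, min(hi, x))


def online_gradient_descent(targets, lo=0.0, hi=1.0, step=0.5):
    """Play x_t, observe the loss (x_t - y_t)^2, then take a projected gradient step."""
    x = lo
    plays = []
    for y in targets:
        plays.append(x)              # decision is made before y is revealed
        grad = 2.0 * (x - y)         # gradient of the squared loss at x
        x = project(x - step * grad, lo, hi)
    return plays


# Hypothetical sequence of observed values (e.g. a normalized interference level).
plays = online_gradient_descent([0.3, 0.3, 2.0, 0.3])
print(plays)
```

The learner commits to each decision before seeing the corresponding loss, which is the defining feature of the online setting. The projection keeps every decision inside the feasible set even when an observation falls outside it.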
Pub Date: 2022-08-29 DOI: 10.23919/eusipco55093.2022.9909782
Congji Yin, Wenjiang Feng, Junbing Li, Guojun Li
In this paper, a new double expectation propagation-based decision feedback equalizer (DEP-DFE) is proposed for severe inter-symbol interference (ISI) channels employing turbo equalization. The EP algorithm is used at both the equalizer output and the channel decoder output. The proposed DEP-DFE offers a new approach to alleviating error propagation. Additionally, its computational complexity is nearly half that of the EP-based minimum mean square error (MMSE) linear equalizer (EP-MMSE-LE) proposed by Santos et al. The bit error ratio performance of the proposed equalizer is verified through simulation in the well-known, severely frequency-selective Proakis-C channel for different scenarios. Simulation results demonstrate that the proposed DEP-DFE achieves similar or better performance than the EP-MMSE-LE. Moreover, it improves significantly on the double expectation propagation-based MMSE-LE (DEP-MMSE-LE).
{"title":"A Low-Complexity Double EP-Based DFE for Turbo Equalization","authors":"Congji Yin, Wenjiang Feng, Junbing Li, Guojun Li","doi":"10.23919/eusipco55093.2022.9909782","DOIUrl":"https://doi.org/10.23919/eusipco55093.2022.9909782","PeriodicalId":231263,"journal":{"name":"2022 30th European Signal Processing Conference (EUSIPCO)","volume":"1 1","pages":"0"},"publicationDate":"2022-08-29","platform":"Semanticscholar","paperid":"130840759"}
Pub Date: 2022-08-29 DOI: 10.23919/eusipco55093.2022.9909553
Quentin Giboulot, Patrick Bas, R. Cogranne, Dirk Borghys
This paper studies the problem of Cover Source Mismatch (CSM) in steganalysis, i.e., the impact of a testing set that does not originate from the same source as the training set. In this study, the trained steganalyzer uses a state-of-the-art deep-learning architecture that tends to generalize better than feature-based steganalysis. Different sources of mismatch, such as the sensor model, the ISO sensitivity, the processing pipeline and the content, are investigated. Our conclusions are that, on the one hand, deep-learning steganalysis is still very sensitive to the CSM and, on the other hand, the holistic strategy leverages the good generalization properties of deep learning to reduce the CSM with a relatively small number of training samples.
{"title":"The Cover Source Mismatch Problem in Deep-Learning Steganalysis","authors":"Quentin Giboulot, Patrick Bas, R. Cogranne, Dirk Borghys","doi":"10.23919/eusipco55093.2022.9909553","DOIUrl":"https://doi.org/10.23919/eusipco55093.2022.9909553","PeriodicalId":231263,"journal":{"name":"2022 30th European Signal Processing Conference (EUSIPCO)","volume":"30 1","pages":"0"},"publicationDate":"2022-08-29","platform":"Semanticscholar","paperid":"126686706"}
Pub Date: 2022-08-29 DOI: 10.23919/eusipco55093.2022.9909877
Zhuoqian Jiang, J. Xin, Weiliang Zuo, Nanning Zheng, A. Sano
In this paper, we explore the problem of near-field source localization in an unknown spatially colored noise environment using an end-to-end neural network based on deep residual learning. Specifically, the proposed approach takes the multi-dimensional information of the array covariance as input and directly outputs the locations of the near-field sources through a regression structure. The architecture of the deep neural network is designed with the trade-off between expressive power and computational complexity in mind. In addition, because the training data are generated by traversing the spatial locations in combination with the degree of separation, the proposed approach performs robustly for different separations of the location parameters. The simulation results demonstrate that the proposed approach outperforms existing model-driven methods under various conditions, especially in adverse scenarios with low SNRs, a small number of snapshots, or correlated sources.
{"title":"Deep Residual Learning Based Localization of Near-Field Sources in Unknown Spatially Colored Noise Fields","authors":"Zhuoqian Jiang, J. Xin, Weiliang Zuo, Nanning Zheng, A. Sano","doi":"10.23919/eusipco55093.2022.9909877","DOIUrl":"https://doi.org/10.23919/eusipco55093.2022.9909877","PeriodicalId":231263,"journal":{"name":"2022 30th European Signal Processing Conference (EUSIPCO)","volume":"238 1","pages":"0"},"publicationDate":"2022-08-29","platform":"Semanticscholar","paperid":"123317499"}
Pub Date: 2022-08-29 DOI: 10.23919/eusipco55093.2022.9909958
Talha Bozkus, U. Mitra
Wireless communication networks are well modeled by Markov Decision Processes (MDPs), but they induce a large state space, which makes policy optimization challenging. Reinforcement learning methods such as Q-learning enable the solution of policy optimization problems in unknown environments. Herein, a graph-learning algorithm is proposed to improve the accuracy and complexity of the Q-learning algorithm for a multiple access communications problem. By exploiting the structural properties of the wireless network MDP, several structurally related Markov chains are created, and these chains are sampled to learn multiple policies, which are then fused. Furthermore, a state-action aggregation method is proposed to reduce the time and memory complexity of the algorithm. Numerical results show that the proposed algorithm reduces the policy error by 80% and the runtime by 70% compared with other state-of-the-art Q-learning algorithms.
{"title":"Ensemble Link Learning for Large State Space Multiple Access Communications","authors":"Talha Bozkus, U. Mitra","doi":"10.23919/eusipco55093.2022.9909958","DOIUrl":"https://doi.org/10.23919/eusipco55093.2022.9909958","PeriodicalId":231263,"journal":{"name":"2022 30th European Signal Processing Conference (EUSIPCO)","volume":"30 1","pages":"0"},"publicationDate":"2022-08-29","platform":"Semanticscholar","paperid":"126209361"}
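For readers unfamiliar with the baseline that the abstract above builds on, a single tabular Q-learning update looks as follows. This shows only the standard update rule, not the proposed ensemble or aggregation method; the two-state "idle"/"busy" channel, the action names, and the values of `alpha` and `gamma` are hypothetical.

```python
def q_update(q, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """One tabular Q-learning step: move Q(s, a) toward r + gamma * max_a' Q(s', a')."""
    best_next = max(q[next_state].values())
    td_target = reward + gamma * best_next
    q[state][action] += alpha * (td_target - q[state][action])
    return q[state][action]


# Toy Q-table for a two-state channel model (values are made up).
q = {
    "idle": {"wait": 0.0, "transmit": 0.0},
    "busy": {"wait": 1.0, "transmit": 0.0},
}

new_value = q_update(q, "idle", "transmit", reward=1.0, next_state="busy")
print(new_value)
```

The table holds one entry per state-action pair, which is why large state spaces make plain Q-learning costly in both time and memory.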
Pub Date: 2022-08-29 DOI: 10.23919/eusipco55093.2022.9909948
J. Birdi, J. D’hooge, A. Bertrand
Quantitative ultrasound (QUS) imaging complements standard B-mode images with a quantitative representation of the target's acoustic properties. The attenuation coefficient is an important parameter characterizing these properties, with applications in medical diagnosis and tissue characterization. Traditional QUS methods use analytical models to estimate this coefficient from the acquired signal. Propagation effects such as diffraction, which are difficult to model analytically, are usually ignored, which degrades the estimation accuracy of these methods. To tackle this issue, reference phantom measurements are commonly used. These are, however, time-consuming and may not always be feasible, limiting the practical applicability of existing approaches. To overcome these challenges, we leverage recent advances in deep learning and propose a neural network approach that takes the magnitude spectra of the backscattered ultrasound signal at different axial depths as input and provides the target's attenuation coefficient as output. For the presented proof-of-concept study, the network was trained on a simulated dataset and learnt a suitable model from the training data, thereby avoiding the need for an analytical model. The trained network was tested on both simulated and tissue-mimicking phantom datasets, demonstrating the capability of neural networks to provide accurate attenuation estimates from diffraction-affected recordings without a reference phantom measurement.
{"title":"A Neural Network Approach for Ultrasound Attenuation Coefficient Estimation","authors":"J. Birdi, J. D’hooge, A. Bertrand","doi":"10.23919/eusipco55093.2022.9909948","DOIUrl":"https://doi.org/10.23919/eusipco55093.2022.9909948","PeriodicalId":231263,"journal":{"name":"2022 30th European Signal Processing Conference (EUSIPCO)","volume":"9 1","pages":"0"},"publicationDate":"2022-08-29","platform":"Semanticscholar","paperid":"126498877"}
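To make concrete what kind of analytical model the abstract above contrasts with, here is a sketch of a simple spectral log-difference estimate. It is not the proposed neural network. It assumes attenuation that grows linearly with frequency, power spectra that decay as exp(-4 * alpha * f * z) over the round trip, and no diffraction or backscatter variation, which are exactly the simplifications the abstract says limit such methods. The function names and synthetic values are hypothetical.

```python
import math


def attenuation_log_difference(spec_near, spec_far, freqs_mhz, z_near_cm, z_far_cm):
    """Estimate alpha in Np/cm/MHz from power spectra at two axial depths."""
    dz = z_far_cm - z_near_cm
    # Factor 4 = 2 (round trip) * 2 (power instead of amplitude); for magnitude
    # spectra the factor would be 2.
    estimates = [
        (math.log(p_near) - math.log(p_far)) / (4.0 * f * dz)
        for p_near, p_far, f in zip(spec_near, spec_far, freqs_mhz)
    ]
    return sum(estimates) / len(estimates)


def synthetic_power_spectrum(alpha, freqs_mhz, depth_cm):
    """Idealized diffraction-free power spectrum used only to exercise the estimator."""
    return [math.exp(-4.0 * alpha * f * depth_cm) for f in freqs_mhz]


freqs = [2.0, 3.0, 4.0]
near = synthetic_power_spectrum(0.1, freqs, 1.0)
far = synthetic_power_spectrum(0.1, freqs, 2.0)

alpha_hat = attenuation_log_difference(near, far, freqs, 1.0, 2.0)
print(alpha_hat)
```

On this idealized data the estimator recovers the attenuation used to generate it. On real recordings, depth-dependent diffraction biases the log-spectral ratio, which is why a reference phantom or a learned model is needed.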