Title: Sound source localization and detection based on densely connected network and attention mechanism
Authors: Bomao Zhou, Jin Tang
Journal: Applied Acoustics (JCR Q1, Acoustics; impact factor 3.4)
Publication date: 2024-10-13
Publication type: Journal Article
DOI: 10.1016/j.apacoust.2024.110338
URL: https://www.sciencedirect.com/science/article/pii/S0003682X24004894
Citations: 0
Abstract
Sound Source Localization and Detection (SSLD) is the joint task of detecting sound event activity and localizing the sound sources (sound source localization, SSL). This paper proposes an SSLD method based on dense connections and an attention mechanism. We propose a densely connected block parallel to the gated linear unit (GLU) to enhance feature propagation and alleviate the vanishing gradient problem. We introduce multi-head self-attention (MHSA) to aggregate contextual information and model long-term dependencies. The output adopts the activity-coupled Cartesian direction-of-arrival (ACCDOA) representation. We decompose the loss function based on Euclidean distance and propose a loss function consisting of a vector length loss and a vector angle loss, which correspond to sound activity detection (SAD) and direction-of-arrival (DOA) estimation in the SSLD task, respectively. Experimental results indicate that the proposed model achieves state-of-the-art performance compared to the baseline models, and that the proposed loss function improves performance over the general Euclidean distance loss. In addition, we propose a system for remote SSL. In a real-world environment, we used two laser Doppler vibrometers (LDVs) to remotely capture sound signals from different types of target objects. The results of these real-world experiments validate the effectiveness of the proposed system and demonstrate its potential application value.
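The abstract does not give the exact form of the length and angle losses, so the following is only a minimal NumPy sketch of the idea, not the authors' formulation. It rests on the law-of-cosines identity ||p - t||^2 = (|p| - |t|)^2 + 2|p||t|(1 - cos θ), which splits the squared Euclidean distance between a predicted and a target ACCDOA vector into a length-mismatch term (activity) and an angle-dependent term (direction). The function names `accdoa_encode` and `length_angle_loss`, the use of arccos for the angle term, the `angle_weight` parameter, and the choice to average the angle term over active frames only are all assumptions made for illustration.

```python
import numpy as np


def accdoa_encode(active, azimuth, elevation):
    """Hypothetical helper: unit DOA vector scaled by activity (0 or 1).

    Angles are in radians. An inactive source maps to the zero vector.
    """
    x = np.cos(elevation) * np.cos(azimuth)
    y = np.cos(elevation) * np.sin(azimuth)
    z = np.sin(elevation)
    return active * np.array([x, y, z], dtype=float)


def length_angle_loss(pred, target, angle_weight=1.0, eps=1e-8):
    """Illustrative length + angle loss on ACCDOA vectors of shape (N, 3).

    Returns (total, length_loss, angle_loss). This is an assumed form,
    not the loss defined in the paper.
    """
    pred_len = np.linalg.norm(pred, axis=-1)
    tgt_len = np.linalg.norm(target, axis=-1)

    # Length term: vector norm encodes activity, so this relates to SAD.
    length_loss = np.mean((pred_len - tgt_len) ** 2)

    # Angle term: direction only matters where the target is active (DOA).
    active = tgt_len > 0.5
    if np.any(active):
        dot = np.sum(pred[active] * target[active], axis=-1)
        denom = np.maximum(pred_len[active] * tgt_len[active], eps)
        cos_theta = np.clip(dot / denom, -1.0, 1.0)
        angle_loss = np.mean(np.arccos(cos_theta))
    else:
        angle_loss = 0.0

    return length_loss + angle_weight * angle_loss, length_loss, angle_loss


# Two frames: one active source straight ahead, one silent frame.
target = np.stack([accdoa_encode(1, 0.0, 0.0), accdoa_encode(0, 0.0, 0.0)])
pred = np.array([[0.0, 0.5, 0.0], [0.0, 0.0, 0.0]])
total, length_loss, angle_loss = length_angle_loss(pred, target)
```

In this example the first prediction is both too short (length 0.5 instead of 1) and rotated by 90 degrees, so both terms are nonzero, while the silent frame contributes only to the length term. Separating the terms this way makes it possible to weight detection and direction errors independently, which a plain Euclidean loss does not allow.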
About the journal:
Since its launch in 1968, Applied Acoustics has been publishing high quality research papers providing state-of-the-art coverage of research findings for engineers and scientists involved in applications of acoustics in the widest sense.
Applied Acoustics looks not only at recent developments in the understanding of acoustics but also at ways of exploiting that understanding. The Journal aims to encourage the exchange of practical experience through publication and in so doing creates a fund of technological information that can be used for solving related problems. The presentation of information in graphical or tabular form is especially encouraged. If a report of a mathematical development is a necessary part of a paper, it is important to ensure that it is there only as an integral part of a practical solution to a problem and is supported by data. Applied Acoustics encourages the exchange of practical experience through complete papers, short technical notes, and review articles, and thereby provides a wealth of technological information that can be used to solve related problems.
Manuscripts that address all fields of applications of acoustics ranging from medicine and NDT to the environment and buildings are welcome.