Yuan Wu , Gaorong Zhao , Likairui Zhang , Xinrong Hu , Lei Ding
HearDrinking: Drunkenness detection and BACs predictions based on acoustic signal
Pervasive and Mobile Computing, vol. 108, Article 102020. Published 2025-02-10. DOI: 10.1016/j.pmcj.2025.102020
URL: https://www.sciencedirect.com/science/article/pii/S1574119225000094
Journal impact factor: 3.0 (JCR Q2, Computer Science, Information Systems)
Citation count: 0
Abstract
Alcohol poisoning is a severe health concern that results from excessive drinking and can be life-threatening. With home monitoring, individuals can quickly check their blood alcohol content and keep it from reaching hazardous levels. However, most existing drunkenness detection systems require extra hardware or considerable effort from the user, which makes them impractical in everyday life. Motivated by this, we present HearDrinking, a device-free, noise-resistant drunkenness detection system for smartphones. It uses the smartphone's microphone to record the user's voice activity and then mines drunkenness-related features to detect drunkenness accurately. Detecting drunkenness from acoustic signals is non-trivial for two reasons: voice activity is susceptible to interference from ambient noise, and extracting fine-grained drunkenness-related representations from voice activity remains an open problem. To address the first challenge, HearDrinking employs a multi-modal fusion method for noise-resistant voice activity detection. To address the second, HearDrinking first computes log-Mel spectrograms from the speech signal. Log-Mel spectrograms contain temporal and spectral information that is absent in image data, so conventional convolutions designed for images are often of limited effectiveness at extracting features from them. To overcome this limitation, we integrate Omni-dimensional Dynamic Convolution (ODConv) with ShuffleNetV2 to create OD-ShuffleNetV2, in which ODConv replaces certain conventional convolutions of the ShuffleNetV2 network. Multiple convolution kernels are fused based on the log-Mel spectrogram using multi-dimensional attention, thereby optimizing the network structure. Comprehensive experiments with 15 participants show a drunkenness detection accuracy of 96.08% and Blood Alcohol Content (BAC) predictions with an average error of 5 mg/dl.
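To make the log-Mel step concrete, the following is a minimal NumPy sketch of how a log-Mel spectrogram can be computed from a speech signal. It is not the authors' implementation: the abstract does not give the sampling rate, frame length, hop size, or number of Mel bands, so the values used here (16 kHz, 512-sample Hann frames, 160-sample hop, 40 bands) and the HTK-style Mel formula are illustrative assumptions, and all function names are hypothetical.

```python
import numpy as np

def hz_to_mel(f):
    # HTK-style Mel scale (an assumption; other variants exist).
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels):
    """Triangular filters whose edges are evenly spaced on the Mel scale."""
    n_bins = n_fft // 2 + 1
    fft_freqs = np.linspace(0.0, sr / 2.0, n_bins)
    edges = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2))
    fb = np.zeros((n_mels, n_bins))
    for i in range(n_mels):
        lo, ctr, hi = edges[i], edges[i + 1], edges[i + 2]
        rising = (fft_freqs - lo) / (ctr - lo)
        falling = (hi - fft_freqs) / (hi - ctr)
        fb[i] = np.maximum(0.0, np.minimum(rising, falling))
    return fb

def log_mel_spectrogram(signal, sr=16000, n_fft=512, hop=160, n_mels=40, eps=1e-10):
    """Return an array of shape (n_mels, n_frames). All defaults are assumed values."""
    signal = np.asarray(signal, dtype=float)
    if len(signal) < n_fft:
        # Zero-pad very short inputs so that at least one frame exists.
        signal = np.pad(signal, (0, n_fft - len(signal)))
    n_frames = 1 + (len(signal) - n_fft) // hop
    # Build an index matrix so each row selects one overlapping frame.
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hanning(n_fft)
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2      # (n_frames, n_bins)
    mel = power @ mel_filterbank(sr, n_fft, n_mels).T     # (n_frames, n_mels)
    return np.log(mel + eps).T                            # (n_mels, n_frames)

# Usage: one second of a synthetic 1 kHz tone stands in for recorded speech.
t = np.arange(16000) / 16000.0
tone = np.sin(2.0 * np.pi * 1000.0 * t)
spec = log_mel_spectrogram(tone)
print(spec.shape)  # (40, 97)
```

The result is a two-dimensional array with frequency bands along one axis and time frames along the other, which is the kind of input a convolutional network such as OD-ShuffleNetV2 would consume. Unlike an image, the two axes have different physical meanings, which is the property the abstract cites as the reason for replacing some conventional convolutions.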
About the journal:
As envisioned by Mark Weiser as early as 1991, pervasive computing systems and services have truly become integral parts of our daily lives. Tremendous developments in a multitude of technologies, ranging from personalized and embedded smart devices (e.g., smartphones, sensors, wearables, IoT devices) to ubiquitous connectivity via a variety of wireless mobile communications and cognitive networking infrastructures, to advanced computing techniques (including edge, fog and cloud) and user-friendly middleware services and platforms, have significantly contributed to the unprecedented advances in pervasive and mobile computing. Cutting-edge applications and paradigms have evolved, such as cyber-physical systems and smart environments (e.g., smart city, smart energy, smart transportation, smart healthcare) that also involve humans in the loop through social interactions and participatory and/or mobile crowd sensing. The goal of pervasive computing systems is to improve human experience and quality of life, without explicit awareness of the underlying communications and computing technologies.
The Pervasive and Mobile Computing Journal (PMC) is a high-impact, peer-reviewed technical journal that publishes high-quality scientific articles spanning theory and practice, and covering all aspects of pervasive and mobile computing and systems.