A RNN for Temporal Consistency in Low-Light Videos Enhanced by Single-Frame Methods
IEEE Signal Processing Letters, vol. 31, pp. 2795-2799
Pub Date : 2024-10-08 DOI: 10.1109/LSP.2024.3475969
Claudio Rota;Marco Buzzelli;Simone Bianco;Raimondo Schettini
Low-light video enhancement (LLVE) has received little attention compared to low-light image enhancement (LLIE), mainly due to the lack of paired low-/normal-light video datasets. Consequently, a common approach to LLVE is to enhance each video frame individually using LLIE methods. However, this practice introduces temporal inconsistencies in the resulting video. In this work, we propose a recurrent neural network (RNN) that, given a low-light video and its per-frame enhanced version, produces a temporally consistent video preserving the underlying frame-based enhancement. We achieve this by training our network with a combination of a new forward-backward temporal consistency loss and a content-preserving loss. At inference time, we can use our trained network to correct videos processed by any LLIE method. Experimental results show that our method achieves the best trade-off between temporal consistency improvement and fidelity to the per-frame enhanced video, exhibiting lower memory complexity and comparable time complexity with respect to other state-of-the-art methods for temporal consistency.
Pub Date : 2024-10-08 DOI: 10.1109/LSP.2024.3475358
Arka Roy;Udit Satija
Respiratory disorders have become the third largest cause of death worldwide. They can be assessed through one of two key diagnostic modalities: breathing patterns (BPs), i.e., airflow signals, and respiratory sounds (RSs). In recent years, only a few studies have investigated the correlation between these two modalities, which reflects the structural flaws of the lungs under disease conditions. In this letter, we propose 'RS-2-BP': a unified deep learning framework for deriving the electrical impedance tomography-based airflow signals from respiratory sounds using a hybrid neural network architecture, namely ReSTL, that comprises cascaded standard and residual shrinkage convolution blocks, followed by feature-refined transformer encoders and long short-term memory (LSTM) units. The proposed framework is extensively evaluated using the publicly available BRACETS dataset. Experimental results suggest that ReSTL can accurately derive the BPs from RSs with an average mean absolute error of $0.024 \pm 0.011$, $0.436 \pm 0.120$, $0.020 \pm 0.011$, $0.134 \pm 0.068$
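The abstract only names the building blocks of ReSTL. The sketch below is a hypothetical PyTorch arrangement of the described cascade (standard and residual shrinkage 1-D convolutions, transformer encoders, an LSTM, and a regression head mapping a respiratory-sound waveform to an airflow signal); the class names, widths, depths, and head counts are all assumptions for illustration, not the published configuration.

```python
import torch
import torch.nn as nn

class ResidualShrinkageBlock1d(nn.Module):
    """Residual block with learned channel-wise soft-thresholding, in the
    spirit of deep residual shrinkage networks (threshold sub-net assumed)."""
    def __init__(self, channels: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv1d(channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm1d(channels), nn.ReLU(),
            nn.Conv1d(channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm1d(channels),
        )
        # Small sub-network predicting a per-channel shrinkage threshold.
        self.threshold = nn.Sequential(
            nn.AdaptiveAvgPool1d(1), nn.Flatten(),
            nn.Linear(channels, channels), nn.Sigmoid(),
        )

    def forward(self, x):
        y = self.body(x)
        tau = (self.threshold(y.abs()) * y.abs().mean(dim=-1)).unsqueeze(-1)
        y = torch.sign(y) * torch.clamp(y.abs() - tau, min=0.0)  # soft threshold
        return x + y

class ReSTLSketch(nn.Module):
    """Illustrative cascade: conv stem + residual shrinkage blocks ->
    transformer encoder -> LSTM -> per-sample airflow regression."""
    def __init__(self, in_ch=1, width=64, n_blocks=2):
        super().__init__()
        self.stem = nn.Conv1d(in_ch, width, kernel_size=7, padding=3)
        self.blocks = nn.Sequential(
            *[ResidualShrinkageBlock1d(width) for _ in range(n_blocks)])
        enc_layer = nn.TransformerEncoderLayer(
            d_model=width, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers=2)
        self.lstm = nn.LSTM(width, width, batch_first=True)
        self.head = nn.Linear(width, 1)

    def forward(self, sound):                 # sound: (N, 1, T)
        f = self.blocks(self.stem(sound))     # (N, width, T)
        f = self.encoder(f.transpose(1, 2))   # (N, T, width)
        f, _ = self.lstm(f)
        return self.head(f).squeeze(-1)       # (N, T) derived breathing pattern
```

The soft-thresholding step suppresses low-magnitude (presumably noisy) activations, which is the usual motivation for shrinkage blocks on physiological signals, while the transformer-then-LSTM stack captures long-range context before the sequential regression.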