Multi-scale feature flow alignment fusion with Transformer for the microscopic images segmentation of activated sludge
Authors: Lijie Zhao, Yingying Zhang, Guogang Wang, Mingzhong Huang, Qichun Zhang, Hamid Reza Karimi
Journal: Signal Image and Video Processing (Impact Factor 2.0; JCR Q3, Engineering, Electrical & Electronic)
Published: 2023-11-02
DOI: 10.1007/s11760-023-02836-0 (https://doi.org/10.1007/s11760-023-02836-0)
Citations: 0
Abstract
Accurate segmentation of microscopic images of activated sludge is essential for monitoring wastewater treatment processes. However, the task is challenging because of poor contrast, artifacts, morphological similarities, and distribution imbalance. This work develops FafFormer, a novel Transformer-based image segmentation model that incorporates pyramid pooling and flow alignment fusion. In the encoder, a Pyramid Pooling Module extracts multi-scale features of flocs and filamentous bacteria with different morphologies. In the decoder, a flow alignment fusion module fuses the multi-scale features, using the generated semantic flow as auxiliary information to restore boundary details and facilitate fine-grained upsampling. A Focal–Lovász Loss was designed to handle class imbalance for filamentous bacteria and flocs. Image segmentation experiments were conducted on an activated sludge dataset from a municipal wastewater treatment plant. Compared with existing models, FafFormer showed superior accuracy and reliability, especially for filamentous bacteria.
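To make the class-imbalance idea concrete, here is a minimal sketch of the focal term only, written in plain Python. It uses the standard focal loss formula, FL = -alpha * (1 - p_t)^gamma * log(p_t); the abstract does not say how the focal and Lovász terms are combined or which hyperparameters were used, so the function names, the default `gamma=2.0`, and the optional `alpha` weights are assumptions for illustration, and the Lovász term is omitted.

```python
import math


def focal_loss(probs, target, gamma=2.0, alpha=None):
    """Standard focal loss for a single pixel (illustrative, not the paper's code).

    probs  -- predicted class probabilities for the pixel (should sum to 1)
    target -- index of the true class
    gamma  -- focusing parameter; gamma = 0 reduces to cross-entropy
    alpha  -- optional per-class weights (hypothetical values, chosen by the user)
    """
    # Clamp to avoid log(0) for very confident wrong predictions.
    p_t = min(max(probs[target], 1e-12), 1.0)
    weight = 1.0 if alpha is None else alpha[target]
    # (1 - p_t) ** gamma shrinks the loss of well-classified pixels.
    return -weight * (1.0 - p_t) ** gamma * math.log(p_t)


def mean_focal_loss(pixel_probs, pixel_targets, gamma=2.0, alpha=None):
    """Average the per-pixel focal loss over a flattened image."""
    losses = [
        focal_loss(p, t, gamma, alpha)
        for p, t in zip(pixel_probs, pixel_targets)
    ]
    return sum(losses) / len(losses)
```

With `gamma > 0`, pixels that are already classified confidently contribute far less than uncertain ones, which is one common way to keep an abundant class (for example, background or flocs) from dominating a rare one (for example, thin filaments).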
About the journal:
The journal is interdisciplinary and presents the theory and practice of signal, image and video processing. It aims at:
- Disseminating high-level research results and engineering developments to all signal, image or video processing researchers and research groups.
- Presenting practical solutions to current signal, image and video processing problems in engineering and science.
Subject areas covered by the journal include but are not limited to:
Adaptive processing – biomedical signal processing – multimedia signal processing – communication signal processing – non-linear signal processing – array processing – statistics and statistical signal processing – modeling – filtering – data science – graph signal processing – multi-resolution signal analysis and wavelets – segmentation – coding – restoration – enhancement – storage and retrieval – colour and multi-spectral processing – scanning – displaying – printing – interpolation – image processing – video processing – motion detection and estimation – stereoscopic processing – image and video coding.