Tianchen Ji, Huaiying Fang, Rencheng Zhang, Jianhong Yang, Zhifeng Wang, Xin Wang
Waste Management, volume 192, pages 58-68. Published 2024-11-26. DOI: 10.1016/j.wasman.2024.11.027. Article page: https://www.sciencedirect.com/science/article/pii/S0956053X24005841
Plastic waste identification based on multimodal feature selection and cross-modal Swin Transformer
The classification and recycling of municipal solid waste (MSW) are strategies for resource conservation and pollution prevention, with plastic waste identification being an essential component of waste sorting. Multimodal detection of solid waste has increasingly replaced single-modal methods constrained by limited informational capacity. However, existing hyperspectral feature selection algorithms and multimodal identification methods have yet to leverage cross-modal information exhaustively. Therefore, two RGB-hyperspectral image (RGB-HSI) multimodal instance segmentation datasets were constructed to support research in plastic waste sorting. A feature band selection algorithm based on the Activation Weight function was proposed to automatically select influential hyperspectral bands from multimodal data, thereby reducing the burden of data acquisition, transmission, and inference. Furthermore, the multimodal Selective Feature Network (SFNet) was introduced to balance information across various modalities and stages. Moreover, the Correlation Swin Transformer Block was proposed, specifically crafted to fuse cross-modal mutual information, which can be synergistically employed with SFNet to enhance multimodal recognition capabilities further. Experimental results show that the Activation Weight band selection function can select the most effective feature bands. At the same time, the Correlation SF-Swin Transformer achieved the highest F1-scores of 97.85% and 97.37% in the two plastic waste object detection experiments, respectively. The source code and final models are available at https://github.com/Bazenr/Correlation-SFSwin, and the dataset can be accessed at https://www.kaggle.com/datasets/bazenr/rgb-hsi-rgb-nir-municipal-solid-waste.
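The abstract does not define the Activation Weight function, so the sketch below is not the authors' algorithm. It only illustrates the general idea behind feature band selection: rank hyperspectral bands by some per-band importance score and keep the top k, so that fewer bands need to be acquired, transmitted, and processed. The function names, the weights, and the toy 6-band pixel are all hypothetical.

```python
from typing import List, Sequence


def select_top_bands(weights: Sequence[float], k: int) -> List[int]:
    """Return the indices of the k highest-weighted bands, in ascending band order."""
    if not 0 < k <= len(weights):
        raise ValueError("k must be between 1 and the number of bands")
    # Rank band indices by weight (largest first); ties keep the lower index first.
    ranked = sorted(range(len(weights)), key=lambda i: (-weights[i], i))
    return sorted(ranked[:k])


def reduce_pixel(spectrum: Sequence[float], bands: Sequence[int]) -> List[float]:
    """Keep only the selected bands of one hyperspectral pixel."""
    return [spectrum[b] for b in bands]


# Hypothetical per-band importance scores for a toy 6-band sensor
# (assumed values, not taken from the paper).
band_weights = [0.05, 0.40, 0.10, 0.30, 0.02, 0.13]
selected = select_top_bands(band_weights, k=3)

# One toy pixel spectrum (one reflectance value per band).
pixel = [0.11, 0.52, 0.47, 0.80, 0.33, 0.29]
reduced = reduce_pixel(pixel, selected)

print(selected)  # [1, 3, 5]
print(reduced)   # [0.52, 0.8, 0.29]
```

In this toy case the pixel shrinks from six values to three; how the importance scores are actually learned is described in the paper and its repository, not here.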
About the journal:
Waste Management is devoted to the presentation and discussion of information on solid wastes; it covers the entire lifecycle of solid wastes.
Scope:
Addresses solid wastes in both industrialized and economically developing countries
Covers various types of solid wastes, including:
Municipal (e.g., residential, institutional, commercial, light industrial)
Agricultural
Special (e.g., construction and demolition (C&D), healthcare, household hazardous wastes, sewage sludge)