MCIR-YOLO: White Medication Pill Classification Using Multi-Band Infrared Images
Authors: Mohan Wang; Yang Jiang; Baohui Xu; Mengqiang Huang; Xu Xue; Xu Wu; Wenjian Kuang; Xiang Liu; Harm Tolner
Journal: IEEE Photonics Journal (impact factor 2.1; JCR Q3, Engineering, Electrical & Electronic)
Published: 2024-07-11
DOI: 10.1109/JPHOT.2024.3426929
URL: https://ieeexplore.ieee.org/document/10595466/
Identifying and classifying pills is a critical task in modern hospitals, particularly for avoiding medication errors. Conventional visual recognition and classification methods rely mainly on visible-light images, which are inadequate for distinguishing white pills that look alike. White pills do, however, exhibit distinctive infrared (IR) properties across spectral bands. Building on this observation, this paper introduces MCIR-YOLO, a multi-band infrared object detection algorithm that extends the YOLOv5s model with multimodal fusion. The study presents a new dataset of IR images of white round pills captured in six channels, with peak wavelengths ranging from approximately 1400 nm to 1650 nm. A multimodal fusion strategy integrates features from the six IR channels at multiple levels, exploiting the scale features of each IR modality so that information is fused comprehensively across modalities. The model also adds an auxiliary detection branch, independent of the backbone, that computes a separate loss from the fused features, which reduces the overall loss. Attention modules are placed after two distinct fusion points to refine the features; they use the mean and scaling of the IR features and significantly improve detection accuracy. Experiments show that the improved model outperforms the baseline YOLOv5s: on the self-constructed dataset of white round pill IR images, mAP@0.5 increases by 5.47% for the single-channel configuration (peak at 1650 nm) and by 7.96% for the six-channel configuration. Notably, six-channel recognition with MCIR-YOLO outperforms the best-performing single-channel IR image recognition by 12.05%.
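The abstract states that the attention modules use the mean and scaling of the IR features, but it does not give the layer definition. As a rough illustration only, the sketch below assumes a squeeze-and-excitation-style gate: each channel is summarized by its spatial mean, the mean is mapped to a weight between 0 and 1, and the channel is scaled by that weight. The function name `channel_attention`, the plain sigmoid gate without learned weights, and the toy input values are all assumptions made for this example, not the authors' module.

```python
import math


def channel_attention(features):
    """Reweight each channel by a gate derived from its spatial mean.

    features: list of channels, each a 2-D list (rows of floats).
    Returns (gated_features, gates).
    NOTE: hypothetical sketch; the paper's actual module is not specified here.
    """
    # Squeeze: one scalar per channel, the mean over all pixels.
    means = [
        sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
        for ch in features
    ]
    # Gate: map each mean into (0, 1) with a sigmoid.
    # A trained module would apply learned weights before this step;
    # they are omitted in this sketch.
    gates = [1.0 / (1.0 + math.exp(-m)) for m in means]
    # Scale: multiply every pixel of a channel by that channel's gate.
    gated = [
        [[v * g for v in row] for row in ch]
        for ch, g in zip(features, gates)
    ]
    return gated, gates


# Six made-up 2x2 feature maps, standing in for one map per IR band.
bands = [[[float(k), float(k)], [float(k), float(k)]] for k in range(-2, 4)]
gated, gates = channel_attention(bands)
```

In this toy input, bands with a larger mean response receive a larger gate, so they contribute more to whatever layer consumes the fused features. A real implementation would operate on tensors inside the detection network and learn the gating weights during training.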
About the journal:
Breakthroughs in the generation of light and in its control and utilization have given rise to the field of photonics, a rapidly expanding area of science and technology with major technological and economic impact. Photonics integrates quantum electronics and optics to accelerate progress in the generation of novel photon sources and in their utilization in emerging applications at the micro and nano scales, spanning from the far-infrared/THz to the X-ray region of the electromagnetic spectrum. IEEE Photonics Journal is an online-only journal dedicated to the rapid disclosure of top-quality peer-reviewed research at the forefront of all areas of photonics. Contributions addressing issues ranging from fundamental understanding to emerging technologies and applications are within the scope of the Journal. The Journal includes the following topics: photon sources from far infrared to X-rays; photonics materials and engineered photonic structures; integrated optics and optoelectronics; ultrafast, attosecond, high-field and short-wavelength photonics; biophotonics, including DNA photonics; nanophotonics; magnetophotonics; fundamentals of light propagation and interaction, nonlinear effects; optical data storage; fiber optics and optical communications devices, systems, and technologies; Micro Opto Electro Mechanical Systems (MOEMS); microwave photonics; optical sensors.