Title: Automated Medical Image Captioning with Soft Attention-Based LSTM Model Utilizing YOLOv4 Algorithm
Authors: Paspula Ravinder, Saravanan Srinivasan
Journal: Journal of Computer Science
Publication date: 2024-01-01
DOI: 10.3844/jcssp.2024.52.68

Abstract: Medical image captioning is a prominent research area. Interpreting and captioning medical images is a time-consuming and costly process that often requires expert support, and the growing volume of medical images makes it challenging for radiologists to handle the workload alone. Automating medical image captioning addresses the cost and time issues while helping radiologists improve the reliability and accuracy of the generated captions; it also allows less experienced radiologists to benefit from automated support. Despite previous efforts to automate medical image captioning, unresolved issues remain, including overly detailed captions, difficulty in identifying abnormal regions in complex images, and low accuracy and reliability of some generated captions. To tackle these challenges, we propose a new deep learning model specifically tailored for captioning medical images. Our model extracts features from images and generates meaningful sentences describing the identified defects with high accuracy. The approach uses a multi-model neural network that closely mimics the human visual system and automatically learns to describe the content of images. Our proposed method consists of two stages. In the first stage, known as the information extraction phase, we employ the YOLOv4 algorithm.
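The soft-attention mechanism named in the title, which lets the LSTM decoder weight image regions differently at each decoding step, can be illustrated with a minimal, framework-free sketch. The additive scoring with scalar weights `w_region` and `w_state` is a simplification assumed here for clarity, not the authors' exact formulation:

```python
import math

def soft_attention(region_features, decoder_state, w_region=1.0, w_state=0.5):
    """Compute soft-attention weights over image regions and the
    resulting context vector for one decoding step (illustrative only)."""
    # Score each region by a simple additive compatibility with the decoder state.
    scores = [
        sum(w_region * f + w_state * s for f, s in zip(feat, decoder_state))
        for feat in region_features
    ]
    # Softmax turns scores into attention weights that sum to 1.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    alphas = [e / total for e in exps]
    # Context vector: attention-weighted sum of the region features,
    # which would be fed into the LSTM alongside the previous word.
    dim = len(region_features[0])
    context = [
        sum(a * feat[i] for a, feat in zip(alphas, region_features))
        for i in range(dim)
    ]
    return alphas, context

# Example: three 2-D region features and a 2-D decoder state.
alphas, context = soft_attention(
    [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]], [0.2, 0.2]
)
```

Because the weights are a softmax ("soft" attention), every region contributes to the context vector, and the whole step stays differentiable, so it can be trained end-to-end with the LSTM.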
Journal Introduction:
The Journal of Computer Science aims to publish research articles on the theoretical foundations of information and computation, and on practical techniques for their implementation and application in computer systems. JCS is updated twelve times a year and is a peer-reviewed journal covering the latest and most compelling research of the time.