Assistive Image Annotation Systems with Deep Learning and Natural Language Capabilities: A Review

Moseli Mots'oehli
arXiv - CS - Emerging Technologies, published 2024-06-28
DOI: arxiv-2407.00252 (https://doi.org/arxiv-2407.00252)

Abstract

While supervised learning has achieved significant success in computer vision tasks, acquiring high-quality annotated data remains a bottleneck. This paper surveys both scholarly and non-scholarly work on AI-assistive deep learning image annotation systems that provide textual suggestions, captions, or descriptions of the input image to the annotator. Such assistance can potentially improve annotation efficiency and quality. Our survey covers annotation for a range of computer vision tasks, including image classification, object detection, regression, instance and semantic segmentation, and pose estimation. We review various datasets and how they contribute to the training and evaluation of AI-assistive annotation systems. We also examine methods that leverage neuro-symbolic learning, deep active learning, and self-supervised learning algorithms to enable semantic image understanding and generate free-text output. These include image captioning, visual question answering, and multi-modal reasoning. Despite this promising potential, there is limited publicly available work on AI-assistive image annotation with textual output capabilities. We conclude by suggesting future research directions to advance the field, emphasizing the need for more publicly accessible datasets and for collaboration between academia and industry.