TsDa-ASAM: Balancing efficiency and accuracy in coke image particle size segmentation via two-stage distillation-aware adaptive segment anything model

IF 3.5 2区 计算机科学 Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Applied Intelligence Pub Date : 2025-03-13 DOI:10.1007/s10489-025-06427-z
Yalin Wang, Yubin Peng, Xujie Tan, Yuqing Pan, Chenliang Liu
{"title":"TsDa-ASAM: Balancing efficiency and accuracy in coke image particle size segmentation via two-stage distillation-aware adaptive segment anything model","authors":"Yalin Wang,&nbsp;Yubin Peng,&nbsp;Xujie Tan,&nbsp;Yuqing Pan,&nbsp;Chenliang Liu","doi":"10.1007/s10489-025-06427-z","DOIUrl":null,"url":null,"abstract":"<div><p>Coke image segmentation is a crucial step in coke particle size control of the sintering process. However, due to the complexity of model architecture and the dense distribution of coke particles in the images, existing segmentation methods fail to satisfy the efficiency and accuracy requirements for coke image segmentation in industrial scenarios. To address these challenges, this paper proposes a two-stage distillation-aware adaptive segment anything model to balance efficiency and accuracy in coke image particle size segmentation, referred to as TsDa-ASAM. In the first stage, knowledge distillation methods are employed to distill the Segment Anything Model (SAM) into a lightweight model, explicitly focusing on enhancing segmentation efficiency. In the second stage, a domain knowledge injection strategy is formulated, which incorporates domain knowledge into the distillation model to effectively enhance the accuracy. Moreover, an adaptive prompt point selection algorithm is introduced to address the redundancy issue of prompt points in SAM, improving the efficiency of TsDa-ASAM. The effectiveness of TsDa-ASAM is validated through extensive experiments on the publicly available dataset SA-1B and the coke image dataset from industrial sites. After distillation and fine-tuning, the segmentation accuracy of the proposed model improved by 10%, and the segmentation efficiency of TsDa-ASAM was enhanced by 2 to 3 times with the integration of the adaptive prompt point selection algorithm. The experimental results have effectively demonstrated the potential of the proposed model in balancing accuracy and efficiency.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"55 6","pages":""},"PeriodicalIF":3.5000,"publicationDate":"2025-03-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Applied Intelligence","FirstCategoryId":"94","ListUrlMain":"https://link.springer.com/article/10.1007/s10489-025-06427-z","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0

Abstract

Coke image segmentation is a crucial step in coke particle size control of the sintering process. However, due to the complexity of model architecture and the dense distribution of coke particles in the images, existing segmentation methods fail to satisfy the efficiency and accuracy requirements for coke image segmentation in industrial scenarios. To address these challenges, this paper proposes a two-stage distillation-aware adaptive segment anything model to balance efficiency and accuracy in coke image particle size segmentation, referred to as TsDa-ASAM. In the first stage, knowledge distillation methods are employed to distill the Segment Anything Model (SAM) into a lightweight model, explicitly focusing on enhancing segmentation efficiency. In the second stage, a domain knowledge injection strategy is formulated, which incorporates domain knowledge into the distillation model to effectively enhance the accuracy. Moreover, an adaptive prompt point selection algorithm is introduced to address the redundancy issue of prompt points in SAM, improving the efficiency of TsDa-ASAM. The effectiveness of TsDa-ASAM is validated through extensive experiments on the publicly available dataset SA-1B and the coke image dataset from industrial sites. After distillation and fine-tuning, the segmentation accuracy of the proposed model improved by 10%, and the segmentation efficiency of TsDa-ASAM was enhanced by 2 to 3 times with the integration of the adaptive prompt point selection algorithm. The experimental results have effectively demonstrated the potential of the proposed model in balancing accuracy and efficiency.

查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
TsDa-ASAM:利用两级蒸馏感知自适应分割模型平衡焦炭图像粒度分割的效率和准确性
焦炭图像分割是烧结过程中焦炭粒度控制的关键步骤。然而,由于模型架构的复杂性和焦炭颗粒在图像中的密集分布,现有的分割方法无法满足工业场景下焦炭图像分割的效率和精度要求。为了解决这些挑战,本文提出了一种两阶段蒸馏感知自适应分割模型,以平衡焦炭图像粒度分割的效率和准确性,称为TsDa-ASAM。在第一阶段,采用知识蒸馏的方法将分割任意模型(SAM)提炼成一个轻量级模型,明确地关注于提高分割效率。第二阶段,提出领域知识注入策略,将领域知识注入到精馏模型中,有效提高精馏模型的精度。此外,引入自适应提示点选择算法,解决了提示点冗余问题,提高了TsDa-ASAM的效率。TsDa-ASAM的有效性通过在公开可用的数据集SA-1B和工业现场焦炭图像数据集上的大量实验得到验证。经过精加工和微调,该模型的分割精度提高了10%,结合自适应提示点选择算法,TsDa-ASAM的分割效率提高了2 ~ 3倍。实验结果有效地证明了该模型在平衡精度和效率方面的潜力。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
Applied Intelligence
Applied Intelligence 工程技术-计算机:人工智能
CiteScore
6.60
自引率
20.80%
发文量
1361
审稿时长
5.9 months
期刊介绍: With a focus on research in artificial intelligence and neural networks, this journal addresses issues involving solutions of real-life manufacturing, defense, management, government and industrial problems which are too complex to be solved through conventional approaches and require the simulation of intelligent thought processes, heuristics, applications of knowledge, and distributed and parallel processing. The integration of these multiple approaches in solving complex problems is of particular importance. The journal presents new and original research and technological developments, addressing real and complex issues applicable to difficult problems. It provides a medium for exchanging scientific research and technological achievements accomplished by the international community.
期刊最新文献
A hybrid decision tree-Markowitz framework for intelligent trading portfolio optimization Unsupervised multimodal graph-based model for geo-social analysis Anchor-based tensor clustering of multi-view data through high order relational fusion Deep Rayleigh quotient iteration for solving eigenvalue problems of linear differential operators Capability of large language models in assisting GPs with diagnoses
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1