Multiscale geometric window transformer for orthodontic teeth point cloud registration

IF 3.5 | Zone 3, Computer Science | JCR Q2, COMPUTER SCIENCE, INFORMATION SYSTEMS | Multimedia Systems | Pub Date: 2024-05-31 | DOI: 10.1007/s00530-024-01369-x
Hao Wang, Yan Tian, Yongchuan Xu, Jiahui Xu, Tao Yang, Yan Lu, Hong Chen
Citations: 0

Abstract

Digital orthodontic treatment monitoring has attracted increasing attention over the past decade, yet current deep-learning-based methods still face significant challenges. Transformers, with their strong ability to model long-range dependencies, are well suited to tooth point cloud registration. Nonetheless, most transformer-based point cloud registration networks suffer from two problems. First, they do not embed reliable geometric information, so the learned features are not geometrically discriminative and blur the boundary between inliers and outliers. Second, the attention mechanism lacks continuous downsampling when extracting features that are invariant to geometric transformations at the superpixel level, which limits the field of view and may limit the model's perception of local and global information. In this paper, we propose GeoSwin, which uses a novel geometric window transformer to accurately register tooth point clouds from different stages of orthodontic treatment. The method uses point distance, normal vector angle, and bidirectional spatial angular distances as the input geometric embedding of the transformer, and then applies a proposed variable multiscale attention mechanism to perceive geometric information from local to global scales. Experiments on the Shing3D Dental Dataset demonstrate the effectiveness of our approach, which outperforms other state-of-the-art approaches across multiple metrics. Our code and models are available at GeoSwin.
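
The abstract names three geometric cues (point distance, normal vector angle, and bidirectional spatial angular distances) but does not define them. The sketch below is only an illustration of how such pairwise cues could be computed, not the GeoSwin implementation. It assumes a point-pair-feature style reading in which the two "bidirectional" terms are the angles between each point's normal and the offset vector joining the pair. All function names (`pair_features`, `geometric_embedding_inputs`) are hypothetical.

```python
import math


def _sub(a, b):
    """Component-wise difference a - b of two 3D tuples."""
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _norm(v):
    """Euclidean length of a 3D tuple."""
    return math.sqrt(v[0] ** 2 + v[1] ** 2 + v[2] ** 2)


def _angle(u, v):
    """Angle in radians between two 3D vectors; 0.0 if either has zero length."""
    nu, nv = _norm(u), _norm(v)
    if nu == 0.0 or nv == 0.0:
        return 0.0
    cos = (u[0] * v[0] + u[1] * v[1] + u[2] * v[2]) / (nu * nv)
    return math.acos(max(-1.0, min(1.0, cos)))  # clamp against rounding error


def pair_features(p_i, n_i, p_j, n_j):
    """Hypothetical 4-tuple of geometric cues for the ordered point pair (i, j)."""
    d = _sub(p_j, p_i)  # offset vector from point i to point j
    return (
        _norm(d),          # point distance
        _angle(n_i, n_j),  # normal vector angle
        _angle(n_i, d),    # assumed: angle between normal at i and the offset
        _angle(n_j, d),    # assumed: angle between normal at j and the offset
    )


def geometric_embedding_inputs(points, normals):
    """Pairwise table: result[i][j] holds pair_features for points i and j."""
    n = len(points)
    return [
        [pair_features(points[i], normals[i], points[j], normals[j])
         for j in range(n)]
        for i in range(n)
    ]
```

Under this assumed definition every cue is built from lengths and angles only, so the values do not change when the same rotation and translation are applied to both points and the rotation to both normals. That is the kind of transformation invariance the abstract asks of the features.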

Source journal: Multimedia Systems
Category: Engineering & Technology, Computer Science: Theory and Methods
CiteScore: 5.40
Self-citation rate: 7.70%
Articles published: 148
Review time: 4.5 months
Journal description: This journal details innovative research ideas, emerging technologies, state-of-the-art methods and tools in all aspects of multimedia computing, communication, storage, and applications. It features theoretical, experimental, and survey articles.