Cross-Organ and Cross-Scanner Adenocarcinoma Segmentation using Rein to Fine-tune Vision Foundation Models

Pengzhou Cai, Xueyuan Zhang, Ze Zhao
arXiv:2409.11752 · arXiv - EE - Image and Video Processing · Published 2024-09-18
Citations: 0

Abstract

In recent years, significant progress has been made in tumor segmentation within the field of digital pathology. However, variations in organs, tissue preparation methods, and image acquisition processes can lead to domain discrepancies among digital pathology images. To address this problem, in this paper we use Rein, a parameter-efficient fine-tuning method, to fine-tune various vision foundation models (VFMs) for the MICCAI 2024 Cross-Organ and Cross-Scanner Adenocarcinoma Segmentation challenge (COSAS2024). The core of Rein is a set of learnable tokens that are directly linked to instances, improving performance at the instance level in each layer. In the data environment of the COSAS2024 challenge, extensive experiments demonstrate that fine-tuning VFMs with Rein achieves satisfactory results. Specifically, we used Rein to fine-tune ConvNeXt and DINOv2. Our team used the former to achieve scores of 0.7719 and 0.7557 in the preliminary and final test phases of task 1, respectively, while the latter achieved scores of 0.8848 and 0.8192 in the preliminary and final test phases of task 2. Code is available at GitHub.
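The abstract describes Rein's core mechanism as a set of learnable tokens, linked to instances, that refine a frozen backbone's features at each layer. As a rough illustration of that idea (a minimal sketch, not the authors' actual implementation; the class name, token count, and the similarity-based refinement are all assumptions), one layer-level adapter might look like:

```python
import torch
import torch.nn as nn

class ReinLayerAdapter(nn.Module):
    """Sketch of a Rein-style adapter: a small bank of learnable tokens
    produces an additive refinement of one frozen VFM layer's features.
    Only the adapter parameters would be trained; the backbone stays frozen."""

    def __init__(self, embed_dim: int, num_tokens: int = 100):
        super().__init__()
        # Learnable tokens shared across all patches at this layer.
        self.tokens = nn.Parameter(torch.randn(num_tokens, embed_dim) * 0.02)
        self.proj = nn.Linear(embed_dim, embed_dim)

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (batch, num_patches, embed_dim) from the frozen backbone layer.
        # Similarity between each patch feature and each learnable token.
        attn = torch.softmax(feats @ self.tokens.T, dim=-1)  # (B, N, T)
        # Token-weighted refinement, added back to the frozen features.
        delta = self.proj(attn @ self.tokens)                # (B, N, C)
        return feats + delta
```

One such adapter would be interposed after each backbone layer, so the instance-level link the abstract mentions is realized per layer while the VFM weights themselves remain untouched.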