ConVision Benchmark: A Contemporary Framework to Benchmark CNN and ViT Models

AI, Pub Date: 2024-07-11, DOI: 10.3390/ai5030056
Shreyas Bangalore Vijayakumar, Krishna Teja Chitty-Venkata, Kanishk Arya, Arun Somani

Abstract

Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs) have shown remarkable performance in computer vision tasks, including object detection and image recognition. These models have evolved significantly in architecture, efficiency, and versatility. Concurrently, deep-learning frameworks have diversified, and the resulting spread of versions often complicates reproducibility and unified benchmarking. We propose ConVision Benchmark, a comprehensive PyTorch framework that standardizes the implementation and evaluation of state-of-the-art CNN and ViT models. The framework addresses common challenges such as version mismatches and inconsistent validation metrics. As a proof of concept, we performed an extensive benchmark analysis on a COVID-19 dataset covering nearly 200 CNN and ViT models, among which DenseNet-161 and MaxViT-Tiny achieved exceptional accuracy, peaking at around 95%. Although we primarily used the COVID-19 dataset for image classification, the framework is adaptable to a variety of datasets, which broadens its applicability across domains. Our methodology includes rigorous performance evaluations, reporting accuracy, precision, recall, F1 score, and computational efficiency (FLOPs, MACs, and CPU and GPU latency). The ConVision Benchmark facilitates a comprehensive understanding of model efficacy, aiding researchers in deploying high-performance models for diverse applications.
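
The evaluation metrics named in the abstract can be illustrated with a short, dependency-free sketch. This is not the ConVision Benchmark code: the function names `classification_metrics` and `median_latency_ms` are hypothetical, and macro averaging over classes is an assumption, since the abstract does not say which averaging scheme the authors use.

```python
import time
from statistics import median


def classification_metrics(y_true, y_pred, num_classes):
    """Return accuracy and macro-averaged precision, recall, and F1.

    Macro averaging is an assumption made for this sketch; the abstract
    does not specify how per-class scores are combined.
    """
    # Per-class counts of true positives, false positives, false negatives.
    tp = [0] * num_classes
    fp = [0] * num_classes
    fn = [0] * num_classes
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # true class t was missed

    def safe_div(a, b):
        # Treat an undefined ratio (empty denominator) as 0.0.
        return a / b if b else 0.0

    precision = [safe_div(tp[c], tp[c] + fp[c]) for c in range(num_classes)]
    recall = [safe_div(tp[c], tp[c] + fn[c]) for c in range(num_classes)]
    f1 = [safe_div(2 * p * r, p + r) for p, r in zip(precision, recall)]
    return {
        "accuracy": safe_div(sum(tp), len(y_true)),
        "precision": sum(precision) / num_classes,
        "recall": sum(recall) / num_classes,
        "f1": sum(f1) / num_classes,
    }


def median_latency_ms(fn, warmup=3, runs=10):
    """Time a zero-argument callable and return the median latency in ms."""
    # Warm-up calls are excluded so one-time setup costs do not skew timing.
    for _ in range(warmup):
        fn()
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        timings.append((time.perf_counter() - start) * 1000.0)
    return median(timings)


if __name__ == "__main__":
    scores = classification_metrics([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)
    print(scores)
    print(median_latency_ms(lambda: sum(range(10_000))))
```

In a PyTorch setting, `fn` would wrap a forward pass on a fixed input. For GPU latency the timer is only meaningful if the device is synchronized (for example with `torch.cuda.synchronize()`) before each clock reading, because CUDA kernels launch asynchronously.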