MetaCNN: A New Hybrid Deep Learning Image-based Approach for Vehicle Classification Using Transformer-like Framework
Authors: Juntian Chen, Ruikang Luo
Published in: Proceedings of the 5th International Conference on Computer Science and Software Engineering, 2022-10-21
DOI: https://doi.org/10.1145/3569966.3570099
Abstract: With the development of vehicles and traffic systems in the early 21st century, the need for traffic monitoring and vehicle classification is growing. Alongside advances in deep learning, the computer vision field has produced versatile models capable of meeting this need, including CNNs, the Vision Transformer (ViT), and MetaFormer. However, these models rely on different data processing techniques, and each falls short in either efficiency or effectiveness. In particular, CNNs are weak at capturing global information, while ViT is weak at extracting local information. To address this gap, we propose MetaCNN, a model that combines a CNN with PoolFormer, a specific MetaFormer structure, so that it draws on the strengths of both models and compensates for their respective deficiencies. To verify the feasibility of our model, we tested it on a real-world remote sensing dataset of vehicle images from six different regions under different weather conditions. MetaCNN demonstrated better recognition performance than the baseline models. The results further show that MetaCNN is adept at vehicle classification in remote sensing images, even under complex scenarios.
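The abstract does not specify how MetaCNN wires the CNN and PoolFormer parts together, so the following is only a minimal sketch of the pooling idea that PoolFormer is known for: tokens are mixed by local average pooling instead of attention. It is not the authors' implementation. The function names `pool_token_mixer` and `residual_block`, the single-channel grid, and the 3x3 window are illustrative assumptions, and normalization and the channel MLP of a full block are left out.

```python
def pool_token_mixer(grid, pool_size=3):
    """PoolFormer-style token mixer on a single-channel 2D grid (hypothetical helper).

    Each output value is the mean of the in-bounds neighbours inside a
    pool_size x pool_size window (padding cells are not counted),
    minus the input value at that position.
    """
    height, width = len(grid), len(grid[0])
    radius = pool_size // 2
    mixed = []
    for r in range(height):
        row = []
        for c in range(width):
            # Collect only the cells that actually exist around (r, c).
            window = [
                grid[rr][cc]
                for rr in range(max(0, r - radius), min(height, r + radius + 1))
                for cc in range(max(0, c - radius), min(width, c + radius + 1))
            ]
            row.append(sum(window) / len(window) - grid[r][c])
        mixed.append(row)
    return mixed


def residual_block(grid, pool_size=3):
    """Residual connection around the mixer: x + mixer(x).

    Assumption: normalization and the channel MLP are omitted for brevity.
    """
    mixed = pool_token_mixer(grid, pool_size)
    return [
        [x + m for x, m in zip(grid_row, mixed_row)]
        for grid_row, mixed_row in zip(grid, mixed)
    ]


if __name__ == "__main__":
    # A single bright cell gets spread to its neighbours by the mixer.
    spike = [[0.0, 0.0, 0.0], [0.0, 9.0, 0.0], [0.0, 0.0, 0.0]]
    for row in residual_block(spike):
        print(row)
```

The mixer has no learned parameters, which is why pooling is cheap compared with attention. With a 3x3 window each position only sees its immediate neighbours, so in this sketch any wider context would have to come from stacking blocks or from other parts of the network.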