Yuan Cheng, Rui Lin, Peining Zhen, Tianshu Hou, C. Ng, Hai-Bao Chen, Hao Yu, Ngai Wong
2022 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), published 2022-01-01. DOI: https://doi.org/10.1109/WACV51458.2022.00277
FASSST: Fast Attention Based Single-Stage Segmentation Net for Real-Time Instance Segmentation
Real-time instance segmentation is crucial in various AI applications. This work designs a network named Fast Attention based Single-Stage Segmentation NeT (FASSST) that performs instance segmentation at video-grade speed. Using an instance attention module (IAM), FASSST quickly locates target instances, then segments them with region of interest (ROI) feature fusion (RFF), which aggregates ROI features from pyramid mask layers. The module employs an efficient single-stage feature regression that maps features directly to instance coordinates and class probabilities. Experiments on the COCO and Cityscapes datasets show that FASSST achieves state-of-the-art speed at competitive accuracy: real-time inference at 47.5 FPS on a GTX 1080 Ti GPU and 5.3 FPS on a Jetson Xavier NX board, with only 71.6 GFLOPs.
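The throughput figures above can be read as per-frame latency, which is often the easier number to budget against. The following is a minimal sketch of that arithmetic, not code from the paper. The function and variable names are hypothetical, and the implied compute rate assumes that the quoted 71.6 GFLOPs is the cost of one forward pass on one image, which the abstract does not state explicitly.

```python
# Convert the throughput figures quoted in the abstract into per-frame latency.
# All names here are illustrative; none come from the FASSST paper.

# Frames per second as reported in the abstract.
REPORTED_FPS = {
    "GTX 1080 Ti": 47.5,
    "Jetson Xavier NX": 5.3,
}

# ASSUMPTION: 71.6 GFLOPs is the compute cost of a single forward pass.
ASSUMED_GFLOPS_PER_FRAME = 71.6


def frame_latency_ms(fps: float) -> float:
    """Average time spent on one frame, in milliseconds."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


def implied_gflops_per_second(fps: float, gflops_per_frame: float) -> float:
    """Sustained compute rate implied by a per-frame cost and a frame rate."""
    return fps * gflops_per_frame


def summarize(reported=REPORTED_FPS, gflops_per_frame=ASSUMED_GFLOPS_PER_FRAME):
    """Return (device, fps, latency_ms, gflops_per_s) rows, rounded to 0.1."""
    rows = []
    for device, fps in reported.items():
        rows.append((
            device,
            fps,
            round(frame_latency_ms(fps), 1),
            round(implied_gflops_per_second(fps, gflops_per_frame), 1),
        ))
    return rows


if __name__ == "__main__":
    for device, fps, latency, rate in summarize():
        print(f"{device}: {fps} FPS, {latency} ms/frame, {rate} GFLOP/s implied")
```

Under these assumptions, 47.5 FPS corresponds to roughly 21 ms per frame and 5.3 FPS to roughly 189 ms per frame.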