Pub Date: 2019-04-08
DOI: 10.1109/DICTA47822.2019.8945926
J. Komori, K. Hotta
Semantic segmentation of 3D point clouds is a difficult task because of their unordered representation. PointNet is a pioneering work that operates directly on 3D point clouds to predict per-point semantic labels. However, it predicts labels without using local structure in metric space. Recent studies have tackled this problem and achieved better performance. In addition to this problem, we consider that treating all channels with the same weight is an obstacle to improving accuracy. Therefore, we propose AB-PointNet, a modification of PointNet that predicts 3D point semantic labels by considering the importance of channels. To this end, we use an attention module that emphasizes channels useful for prediction and suppresses unimportant ones, which makes it possible to learn more effective features. In experiments, we evaluate our method on a large-scale indoor 3D point cloud dataset with 13 semantic labels. Our proposed AB-PointNet improves mean IoU by 3.2% over the conventional PointNet.
Title: AB-PointNet for 3D Point Cloud Recognition
Venue: 2019 Digital Image Computing: Techniques and Applications (DICTA), pp. 1-6
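The channel weighting described in the AB-PointNet abstract can be illustrated with a minimal sketch. The abstract does not specify the attention module's design, so the code below assumes a squeeze-and-excitation style gate: per-point features are averaged over all points, passed through a small two-layer bottleneck, and turned into one weight in (0, 1) per channel. The function name `channel_attention`, the reduction ratio `R`, and the random weights are all hypothetical, chosen only for illustration.

```python
import numpy as np


def channel_attention(features, w1, b1, w2, b2):
    """Reweight the channels of per-point features (assumed SE-style gate).

    features: (N, C) array, one C-dimensional feature per point.
    w1, b1:   bottleneck layer parameters, shapes (C, C // R) and (C // R,).
    w2, b2:   expansion layer parameters, shapes (C // R, C) and (C,).
    Returns the reweighted features (N, C) and the channel gate (C,).
    """
    # Squeeze: average over points. The mean is symmetric, so the gate
    # does not depend on the order of the points.
    squeezed = features.mean(axis=0)
    # Excite: bottleneck with ReLU, then a sigmoid gate per channel.
    hidden = np.maximum(0.0, squeezed @ w1 + b1)
    gate = 1.0 / (1.0 + np.exp(-(hidden @ w2 + b2)))
    # Scale: emphasize channels with a gate near 1, suppress those near 0.
    return features * gate, gate


# Illustrative sizes and randomly initialized (untrained) parameters.
rng = np.random.default_rng(0)
N, C, R = 1024, 64, 4
features = rng.standard_normal((N, C))
w1 = rng.standard_normal((C, C // R)) * 0.1
b1 = np.zeros(C // R)
w2 = rng.standard_normal((C // R, C)) * 0.1
b2 = np.zeros(C)

weighted, gate = channel_attention(features, w1, b1, w2, b2)
```

In a trained network the gate parameters would be learned jointly with the segmentation loss; here they are random, so the example only demonstrates the data flow and shapes.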
Pub Date: 2018-12-01
DOI: 10.1109/E-SCIENCE.2005.25
Title: Conference Chairs and Committees
Venue: 2019 Digital Image Computing: Techniques and Applications (DICTA)
Pub Date: 2018-11-20
DOI: 10.1109/DICTA47822.2019.8945810
Anh-Dzung Doan, Y. Latif, Tat-Jun Chin, Yu Liu, Shin-Fang Ch'ng, Thanh-Toan Do, I. Reid
A major focus of current research on place recognition is visual localization for autonomous driving. In this scenario, because cameras operate continuously, it is realistic to expect video as the input to visual localization algorithms, as opposed to the single-image querying approach used in other place recognition works. In this paper, we show that exploiting temporal continuity in the testing sequence significantly improves visual localization, both qualitatively and quantitatively. Although intuitive, this idea has not been fully explored in recent works. Our main contribution is a novel Monte Carlo-based visual localization technique that can efficiently reason over the image sequence. We also propose an image retrieval pipeline that relies on local features and an encoding technique to represent an image as a single vector. The experimental results show that our proposed method achieves better results than state-of-the-art approaches for the task of visual localization under significant appearance change. Our synthetic dataset is made available at: http://tiny.cc/jd73bz
Title: Visual Localization under Appearance Change: A Filtering Approach
Venue: 2019 Digital Image Computing: Techniques and Applications (DICTA), pp. 1-8
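The idea of reasoning over an image sequence with a Monte Carlo method can be sketched with a generic particle filter. This is a toy illustration under strong assumptions, not the authors' pipeline: the vehicle pose is reduced to a 1D position along a route, and the image retrieval step is replaced by a simulated noisy position observation. The names `particle_filter_step` and `run_demo`, the Gaussian noise models, and all numeric parameters are hypothetical.

```python
import numpy as np


def particle_filter_step(particles, weights, motion, observation, rng,
                         motion_std=0.5, obs_std=2.0):
    """One predict/update/resample cycle of a bootstrap particle filter."""
    # Predict: move every particle by the expected motion plus noise.
    particles = particles + motion + rng.normal(0.0, motion_std, particles.shape)
    # Update: weight particles by a Gaussian observation likelihood
    # (a stand-in for an image retrieval similarity score).
    likelihood = np.exp(-0.5 * ((observation - particles) / obs_std) ** 2)
    weights = weights * likelihood + 1e-300  # guard against all-zero weights
    weights = weights / weights.sum()
    # Resample: draw particles in proportion to their weights.
    idx = rng.choice(len(particles), size=len(particles), p=weights)
    particles = particles[idx]
    weights = np.full(len(particles), 1.0 / len(particles))
    return particles, weights


def run_demo(seed=0, steps=50, n_particles=500):
    """Track a position moving 1 unit per frame; return mean absolute errors."""
    rng = np.random.default_rng(seed)
    true_pos = 0.0
    particles = rng.uniform(0.0, 100.0, size=n_particles)  # unknown start
    weights = np.full(n_particles, 1.0 / n_particles)
    filtered_err, raw_err = [], []
    for _ in range(steps):
        true_pos += 1.0
        observation = true_pos + rng.normal(0.0, 2.0)  # simulated single-frame estimate
        particles, weights = particle_filter_step(
            particles, weights, 1.0, observation, rng)
        estimate = float(np.sum(particles * weights))
        filtered_err.append(abs(estimate - true_pos))
        raw_err.append(abs(observation - true_pos))
    # Skip the first frames, during which the filter is still converging.
    return float(np.mean(filtered_err[10:])), float(np.mean(raw_err[10:]))
```

Calling `run_demo()` returns the mean absolute error of the filtered estimate and of the raw per-frame observations. In this toy setting, combining a motion prior with successive observations would typically be expected to give a smoother estimate than treating each frame independently, which is the intuition behind exploiting temporal continuity.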