{"title":"Open-Ended Fine-Grained 3D Object Categorization by Combining Shape and Texture Features in Multiple Colorspaces","authors":"Nils Keunecke, S. Kasaei","doi":"10.1109/HUMANOIDS47582.2021.9555670","DOIUrl":null,"url":null,"abstract":"As a consequence of an ever-increasing number of service robots, there is a growing demand for highly accurate real-time 3D object recognition. Considering the expansion of robot applications in more complex and dynamic environments, it is evident that it is not possible to pre-program all object categories and anticipate all exceptions in advance. Therefore, robots should have the functionality to learn about new object categories in an open-ended fashion while working in the environment. Towards this goal, we propose a deep transfer learning approach to generate a scale- and pose-invariant object representation by considering shape and texture information in multiple color spaces. The obtained global object representation is then fed to an instance-based object category learning and recognition, where a non-expert human user exists in the learning loop and can interactively guide the process of experience acquisition by teaching new object categories, or by correcting insufficient or erroneous categories. In this work, shape information encodes the common patterns of all categories, while texture information is used to describes the appearance of each instance in detail. Multiple color space combinations and network architectures are evaluated to find the most descriptive system. Experimental results showed that the proposed network architecture outperformed the selected state-of-the-art in terms of object classification accuracy and scalability. Furthermore, we performed a real robot experiment in the context of serve_a_beer scenario to show the real-time performance of the proposed approach.","PeriodicalId":320510,"journal":{"name":"2020 IEEE-RAS 20th International Conference on Humanoid Robots (Humanoids)","volume":"53 17","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 IEEE-RAS 20th International Conference on Humanoid Robots (Humanoids)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/HUMANOIDS47582.2021.9555670","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
As a consequence of the ever-increasing number of service robots, there is a growing demand for highly accurate real-time 3D object recognition. Considering the expansion of robot applications into more complex and dynamic environments, it is evident that pre-programming all object categories and anticipating all exceptions in advance is not possible. Therefore, robots should be able to learn new object categories in an open-ended fashion while working in the environment. Towards this goal, we propose a deep transfer learning approach that generates a scale- and pose-invariant object representation by considering shape and texture information in multiple color spaces. The obtained global object representation is then fed to an instance-based object category learning and recognition module, in which a non-expert human user is part of the learning loop and can interactively guide the process of experience acquisition by teaching new object categories or by correcting insufficient or erroneous ones. In this work, shape information encodes the common patterns of all categories, while texture information describes the appearance of each instance in detail. Multiple color-space combinations and network architectures are evaluated to find the most descriptive system. Experimental results showed that the proposed network architecture outperformed the selected state-of-the-art approaches in terms of object classification accuracy and scalability. Furthermore, we performed a real-robot experiment in the context of a serve_a_beer scenario to demonstrate the real-time performance of the proposed approach.
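To make the pipeline described in the abstract concrete, the sketch below shows one way the two main ideas could fit together: deep transfer-learning features extracted in multiple color spaces, feeding an instance-based category store that a user can teach or correct at any time. This is a minimal illustrative sketch, not the authors' implementation: the backbone choice (an ImageNet-pretrained MobileNetV2), the three color spaces (RGB, HSV, CIELAB), and all class and function names are assumptions made here for illustration, and the shape-feature branch (computed from 3D data in the paper) is omitted.

```python
# Minimal sketch (not the authors' code) of multi-colorspace transfer-learning
# features plus instance-based open-ended category learning.
# Assumes OpenCV, NumPy, PyTorch/torchvision and Python 3.10+.
import cv2
import numpy as np
import torch
import torchvision.models as models
import torchvision.transforms as T

# Frozen ImageNet backbone used as a generic texture-feature extractor
# (deep transfer learning: no fine-tuning, embeddings are reused as-is).
backbone = models.mobilenet_v2(weights=models.MobileNet_V2_Weights.DEFAULT)
backbone.classifier = torch.nn.Identity()  # keep the 1280-d pooled embedding
backbone.eval()

preprocess = T.Compose([
    T.ToTensor(),                               # HxWxC uint8 -> CxHxW float in [0, 1]
    T.Resize((224, 224), antialias=True),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

def texture_features(bgr_crop: np.ndarray) -> np.ndarray:
    """Concatenate CNN embeddings of the same object crop computed in
    several color spaces (RGB, HSV, CIELAB are assumed here)."""
    views = [
        cv2.cvtColor(bgr_crop, cv2.COLOR_BGR2RGB),
        cv2.cvtColor(bgr_crop, cv2.COLOR_BGR2HSV),
        cv2.cvtColor(bgr_crop, cv2.COLOR_BGR2LAB),
    ]
    feats = []
    with torch.no_grad():
        for view in views:
            x = preprocess(view).unsqueeze(0)   # (1, 3, 224, 224)
            feats.append(backbone(x).squeeze(0).numpy())
    # In the paper a shape descriptor would also be concatenated here.
    return np.concatenate(feats)                # e.g. 3 x 1280 = 3840-d

class OpenEndedCategoryStore:
    """Instance-based learning: the user teaches or corrects categories at
    any time; recognition is nearest-neighbour over stored instances."""
    def __init__(self) -> None:
        self.instances: list[tuple[str, np.ndarray]] = []

    def teach(self, label: str, representation: np.ndarray) -> None:
        # Teaching a new category and correcting an error are the same
        # operation: store another labelled instance.
        self.instances.append((label, representation))

    def recognize(self, representation: np.ndarray) -> str | None:
        if not self.instances:
            return None                          # nothing learned yet
        dists = [np.linalg.norm(representation - rep)
                 for _, rep in self.instances]
        return self.instances[int(np.argmin(dists))][0]
```

In such a setup, the robot's perception loop would call texture_features on each segmented object crop and query the store; whenever the human user supplies or corrects a label, teach adds the instance, so the set of known categories grows open-endedly without retraining the backbone.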