Pub Date: 2012-09-01. DOI: 10.1109/TAMD.2012.2200250
Fabian Chersi
Humans are very efficient at learning new skills through imitation and social interaction with other individuals. Recent experimental findings on the functioning of the mirror neuron system in humans and animals, and on the coding of intentions, have led to the development of more realistic and powerful models of action understanding and imitation. This paper describes the implementation on a humanoid robot of a spiking neuron model of the mirror system. The proposed architecture is validated in an imitation task where the robot has to observe and understand manipulative action sequences executed by a human demonstrator and reproduce them on demand using its own motor repertoire. To instruct the robot what to observe and learn, and when to imitate, the demonstrator uses a simple form of sign language. Two basic principles underlie the functioning of the system: 1) imitation is primarily directed toward reproducing the goals of observed actions rather than the exact hand trajectories; and 2) the capacity to understand the motor intentions of another individual is based on the resonance of the same neural populations that are active during action execution. Experimental findings show that even a very simple form of gesture-based communication makes it possible to develop robotic architectures that are efficient, simple, and user friendly.
Title: Learning Through Imitation: a Biological Approach to Robotics. Journal: IEEE Transactions on Autonomous Mental Development, pp. 204-214.
Pub Date: 2012-09-01. DOI: 10.1109/TAMD.2012.2207455
Scott Heath, R. Schulz, David Ball, Janet Wiles
Time and space are fundamental to human language and embodied cognition. In our early work we investigated how Lingodroids, robots with the ability to build their own maps, could evolve their own geopersonal spatial language. In subsequent studies we extended the framework developed for learning spatial concepts and words to learning temporal intervals. This paper considers a new aspect of time, the naming of concepts like morning, afternoon, dawn, and dusk, which are events that are part of day-night cycles, but are not defined by specific time points on a clock. Grounding of such terms refers to events and features of the diurnal cycle, such as light levels. We studied event-based time in which robots experienced day-night cycles that varied with the seasons throughout a year. Then we used meet-at tasks to demonstrate that the words learned were grounded, where the times to meet were morning and afternoon, rather than specific clock times. The studies show how words and concepts for a novel aspect of cyclic time can be grounded through experience with events rather than by times as measured by clocks or calendars.
Title: Long Summer Days: Grounded Learning of Words for the Uneven Cycles of Real World Events. Journal: IEEE Transactions on Autonomous Mental Development, pp. 192-203.
Pub Date: 2012-09-01. DOI: 10.1109/TAMD.2012.2199754
S. Lallée, U. Pattacini, Séverin Lemaignan, A. Lenz, C. Melhuish, L. Natale, Sergey Skachek, Katharina Hamann, Jasmin Steinwender, E. A. Sisbot, G. Metta, J. Guitton, R. Alami, Matthieu Warnier, A. Pipe, Felix Warneken, Peter Ford Dominey
Robots should be capable of interacting in a cooperative and adaptive manner with their human counterparts in open-ended tasks that can change in real time. An important aspect of the robot's behavior will be the ability to acquire new knowledge of the cooperative tasks by observing and interacting with humans. The current research addresses this challenge. We present results from a cooperative human-robot interaction system that has been specifically developed for portability between different humanoid platforms, by means of abstraction layers at the perceptual and motor interfaces. In the perceptual domain, the resulting system is demonstrated to learn to recognize objects and to recognize actions as sequences of perceptual primitives, and to transfer this learning and recognition between different robotic platforms. For execution, composite actions and plans are shown to be learned on one robot and executed successfully on a different one. Most importantly, the system provides the ability to link actions into shared plans that form the basis of human-robot cooperation, applying principles from human cognitive development to the domain of robot cognitive systems.
Title: Towards a Platform-Independent Cooperative Human Robot Interaction System: III An Architecture for Learning and Executing Actions and Shared Plans. Journal: IEEE Transactions on Autonomous Mental Development, pp. 239-253.
Pub Date: 2012-09-01. DOI: 10.1109/TAMD.2012.2216703
F. Harris, J. Krichmar, H. Siegelmann, H. Wagatsuma
The five articles in this special issue focus on human-robot interactions. The papers bring together fields of study such as cognitive architectures, computational neuroscience, developmental psychology, machine psychology, and socially affective robots.
Title: Guest Editorial: Biologically Inspired Human-Robot Interactions - Developing More Natural Ways to Communicate with our Machines. Journal: IEEE Transactions on Autonomous Mental Development, pp. 190-191.
Pub Date: 2012-06-01. DOI: 10.1109/TAMD.2011.2166261
G. Pezzulo
Recent research in cognitive psychology, neuroscience, and robotics has widely explored the tight relations between language and action systems in primates. However, the link between the pragmatics of linguistic and nonlinguistic interactions has received less attention up to now. In this paper, we argue that cognitive agents exploit the same cognitive processes and neural substrate (a general pragmatic competence) across linguistic and nonlinguistic interactive contexts. Elaborating on Levinson's idea of an “interaction engine” that makes it possible to convey and recognize communicative intentions in both linguistic and nonlinguistic interactions, we offer a computationally guided analysis of pragmatic competence, suggesting that the core abilities required for successful linguistic interactions could derive from more primitive architectures for action control, nonlinguistic interactions, and joint actions. Furthermore, we make the case for a novel, embodied approach to human-robot interaction and communication, in which the ability to carry on face-to-face communication develops in coordination with the pragmatic competence required for joint action.
Title: The “Interaction Engine”: A Common Pragmatic Competence Across Linguistic and Nonlinguistic Interactions. Journal: IEEE Transactions on Autonomous Mental Development, pp. 105-123.
Pub Date: 2012-06-01. DOI: 10.1109/TAMD.2011.2174636
J. Weng, M. Luciw
This is a theoretical, modeling, and algorithmic paper about the spatial aspect of brain-like information processing, modeled by the developmental network (DN) model. The new brain architecture allows the external environment (including teachers) to interact with the sensory ends and the motor ends of the skull-closed brain through development. It does not allow the human programmer to hand-pick extra-body concepts or to handcraft the concept boundaries inside the brain. Mathematically, the brain spatial processing performs real-time mapping from to , through network updates, where the contents of all emerge from experience. Using its limited resources, the brain does increasingly better through experience. A new principle is that the effector ends serve as hubs for concept learning and abstraction. The effector ends also serve as input and the sensory ends also serve as output. As DN embodiments, the Where-What Networks (WWNs) present three major functional novelties: new concept abstraction, concepts as emergent goals, and goal-directed perception. The WWN series appears to be the first general-purpose emergent system for detecting and recognizing multiple objects in complex backgrounds. Among others, the most significant new mechanism is general-purpose top-down attention.
Title: Brain-Like Emergent Spatial Processing. Journal: IEEE Transactions on Autonomous Mental Development, pp. 161-185.
Pub Date: 2012-06-01. DOI: 10.1109/TAMD.2011.2178846
Kotaro Hayashi, M. Shiomi, T. Kanda, N. Hagita
We studied people's acceptance of robots that perform tasks in a city. Three different beings (a human, a human wearing a mascot costume, and a robot) performed tasks in three different scenarios: endless guidance, responding to irrational complaints, and removing an accidentally discarded key from the trash. All of these tasks involved beings interacting with visitors in troublesome situations: dull, stressful, and dirty. For this paper, 30 participants watched nine videos (three tasks performed by three beings) and evaluated each being's appropriateness for the task and its human-likeness. The results indicate that people prefer that a robot rather than a human perform these troublesome tasks, even though they require much interaction with people. In addition, comparisons with the costumed human suggest that people's belief that a being deserves human rights, rather than its human-like appearance, behavior, or cognitive capability, is one explanation for their judgments about appropriateness.
Title: Are Robots Appropriate for Troublesome and Communicative Tasks in a City Environment? Journal: IEEE Transactions on Autonomous Mental Development, pp. 150-160.
Pub Date: 2012-06-01. DOI: 10.1109/TAMD.2011.2177660
S. Nishide, J. Tani, Toru Takahashi, Hiroshi G. Okuno, T. Ogata
Research in brain science has uncovered the human capability to use tools as if they were part of the human body (known as tool-body assimilation) through trial and experience. This paper presents a method that applies a robot's active sensing experience to create a tool-body assimilation model. The model is composed of a feature extraction module, a dynamics learning module, and a tool-body assimilation module. A self-organizing map (SOM) is used for the feature extraction module to extract object features from raw images. A multiple time-scales recurrent neural network (MTRNN) is used as the dynamics learning module. Parametric bias (PB) nodes are attached to the weights of the MTRNN as a second-order network to modulate the behavior of the MTRNN based on the properties of the tool. The generalization capability of neural networks gives the model the ability to deal with unknown tools. Experiments were conducted with the humanoid robot HRP-2 using no tool, I-shaped, T-shaped, and L-shaped tools. The distribution of PB values shows that the model has learned that the robot's dynamic properties change when holding a tool. Motion generation experiments show that the tool-body assimilation model can be applied to unknown tools to generate goal-oriented motions.
Title: Tool–Body Assimilation of Humanoid Robot Using a Neurodynamical System. Journal: IEEE Transactions on Autonomous Mental Development, pp. 139-149.
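The tool-body assimilation abstract above names a self-organizing map (SOM) as its feature extraction module. As a rough illustration of what a SOM does, the following minimal sketch performs one online update of a generic 1-D SOM in plain Python. It is not the authors' implementation: the map size, the learning rate, the Gaussian neighborhood, and the function names `best_matching_unit` and `som_step` are all assumptions made for this example.

```python
import math

def best_matching_unit(weights, x):
    """Return the index of the unit whose weight vector is closest to x."""
    def dist2(w):
        return sum((wi - xi) ** 2 for wi, xi in zip(w, x))
    return min(range(len(weights)), key=lambda i: dist2(weights[i]))

def som_step(weights, x, learning_rate=0.5, radius=1.0):
    """One online update of a 1-D SOM. Mutates weights, returns the BMU index.

    Units are arranged on a line; a Gaussian neighborhood (an assumption of
    this sketch) decides how strongly each unit is pulled toward the input.
    """
    bmu = best_matching_unit(weights, x)
    for i, w in enumerate(weights):
        h = math.exp(-((i - bmu) ** 2) / (2.0 * radius ** 2))
        for d in range(len(w)):
            w[d] += learning_rate * h * (x[d] - w[d])
    return bmu

# Toy data: three units with 2-D weights and one input sample.
units = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
sample = [1.0, 1.2]
winner = som_step(units, sample)
```

After the step, the winning unit has moved halfway toward the sample and its neighbors have moved by a smaller amount, which is how nearby units come to represent similar inputs.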
Pub Date: 2012-06-01. DOI: 10.1109/TAMD.2011.2170213
H. Firouzi, M. N. Ahmadabadi, Babak Nadjar Araabi, S. Amizadeh, M. Mirian, R. Siegwart
A probabilistic framework for interactive learning in continuous and multimodal perceptual spaces is proposed. In this framework, the agent learns the task along with an adaptive partitioning of its multimodal perceptual space. The learning process is formulated in a Bayesian reinforcement learning setting to facilitate the adaptive partitioning. The partitioning is done gradually and softly using Gaussian distributions. The parameters of the distributions are adapted based on the agent's estimate of its actions' expected values. The probabilistic nature of the method results in experience generalization in addition to robustness against uncertainty and noise. To benefit from the diversity of experience generalization in different perceptual subspaces, the learning is performed in multiple perceptual subspaces (including the original space) in parallel. In every learning step, the policies learned in the subspaces are fused to select the final action. This concurrent learning in multiple spaces and the decision fusion result in faster learning, the possibility of adding and/or removing sensors (i.e., gradual expansion or contraction of the perceptual space), and appropriate robustness against probable sensor failure or ambiguity in sensor data. Results of two sets of simulations, in addition to some experiments, are reported to demonstrate the key properties of the framework.
Title: Interactive Learning in Continuous Multimodal Space: A Bayesian Approach to Action-Based Soft Partitioning and Learning. Journal: IEEE Transactions on Autonomous Mental Development, pp. 124-138.
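The interactive-learning abstract above describes soft partitioning of a perceptual space with Gaussian distributions and fusion of policies learned in several subspaces. The sketch below illustrates those two ideas in a heavily simplified form: a 1-D observation, fixed Gaussian partitions, and fusion by plain averaging. None of this is taken from the paper. The function names, the averaging rule, and the toy value table are assumptions, and the Bayesian adaptation of the Gaussian parameters is left out entirely.

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of a 1-D Gaussian at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def soft_memberships(x, partitions):
    """Normalized membership of observation x in each (mean, std) partition."""
    densities = [gaussian_pdf(x, m, s) for m, s in partitions]
    total = sum(densities)
    if total == 0.0:  # x is far from every partition: fall back to uniform
        return [1.0 / len(partitions)] * len(partitions)
    return [d / total for d in densities]

def action_values(x, partitions, q_table):
    """Membership-weighted action values; q_table[p][a] is a per-partition estimate."""
    weights = soft_memberships(x, partitions)
    n_actions = len(q_table[0])
    return [sum(w * q[a] for w, q in zip(weights, q_table))
            for a in range(n_actions)]

def fuse_and_select(per_subspace_values):
    """Average the action values of all subspaces (assumed rule) and pick the best."""
    n_actions = len(per_subspace_values[0])
    n_spaces = len(per_subspace_values)
    fused = [sum(v[a] for v in per_subspace_values) / n_spaces
             for a in range(n_actions)]
    best = max(range(n_actions), key=fused.__getitem__)
    return best, fused

# Toy setup: two partitions on one sensor axis, two actions.
partitions = [(0.0, 1.0), (4.0, 1.0)]
q_table = [[1.0, 0.0], [0.0, 1.0]]
midpoint_weights = soft_memberships(2.0, partitions)
near_first = action_values(0.0, partitions, q_table)
choice, fused = fuse_and_select([[1.0, 0.0], [0.0, 0.5]])
```

Because the memberships are soft, an observation halfway between the two partitions draws equally on both value estimates, which is the experience-generalization effect the abstract refers to.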
Pub Date: 2012-03-01. DOI: 10.1109/TAMD.2011.2163513
Yuanlong Yu, G. Mann, R. Gosine
The selective attention mechanism is employed by humans and primates to realize a truly intelligent perception system, which has the cognitive capability of learning and thinking about how to perceive the environment autonomously. The attention mechanism involves top-down and bottom-up pathways, which correspond to goal-directed and automatic perceptual behaviors, respectively. Rather than considering automatic perception, this paper presents an artificial system for goal-directed visual perception that uses an object-based top-down visual attention mechanism. This cognitive system can guide perception to an object of interest according to the current task, context, and learned knowledge. It consists of three successive stages: preattentive processing, top-down attentional selection, and post-attentive perception. The preattentive processing stage divides the input scene into homogeneous proto-objects, one of which is then selected by the top-down attention and finally sent to the post-attentive perception stage for high-level analysis. Experimental results of target detection in cluttered environments are shown to validate this system.
Title: A Goal-Directed Visual Perception System Using Object-Based Top–Down Attention. Journal: IEEE Transactions on Autonomous Mental Development, pp. 87-103.