Zhuang Yang, Yu Du, Dong Liu, Kesong Zhao, Ming Cong
A human-robot interaction system for automated chemical experiments based on vision and natural language processing semantics
Engineering Applications of Artificial Intelligence, Volume 146, Article 110226. Published 2025-02-17. DOI: 10.1016/j.engappai.2025.110226
https://www.sciencedirect.com/science/article/pii/S095219762500226X
Abstract
Using collaborative robots to replace researchers in performing repetitive and hazardous chemical experiments can effectively enhance experimental efficiency. However, this technology still faces several challenges, including understanding researchers' natural language instructions and autonomously generating action sequences. Therefore, we developed a general control framework for robots in automated chemical experiments based on visual and natural language semantic information. First, starting with the recognition of keywords within Chinese language instructions, we established a domain dictionary for chemical experiment operations and proposed an instruction understanding model based on a bidirectional long short-term memory network with a conditional random field (BiLSTM-CRF), enhancing the robot's ability to understand user instructions. Then, a rule matching method for chemical experimental information and a multimodal information feature matching mechanism were established for command content verification and the automatic generation of multiple types of structured language. At the same time, a robot feedback mechanism was added, enabling human-computer interaction and establishing closed-loop control of the system. Finally, we propose a robot action sequence generation mechanism based on hierarchical finite state machines (HFSM), which transforms structured language into the operational strategies the robot requires for chemical experiments. Experimental results show that, on the instruction task comprehension dataset created in this paper, the proposed method improves the F1 score by up to 4.44% in the instruction keyword extraction task compared to other models. In addition, compared to traditional manual teaching control, this method significantly reduces time costs. This verifies that the method effectively enhances the robot's ability to comprehend Chinese instructions and generates reliable executable action sequences.
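To make the last step more concrete, the following is a minimal sketch of how a hierarchical finite state machine could turn a structured command into a flat action sequence. It is an illustration only, not the authors' implementation: the `State` class, the `sequence` and `run` helpers, the task and action names (`transfer_liquid`, `move_to`, `tilt`, and so on), and the command fields (`source`, `target`, `volume_ml`) are all assumptions introduced here, since the abstract does not specify them.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class State:
    """A node in the hierarchy: a primitive action or a composite sub-machine."""
    name: str
    substates: dict = field(default_factory=dict)    # empty for primitive actions
    initial: Optional[str] = None                    # entry sub-state of a composite
    transitions: dict = field(default_factory=dict)  # sub-state -> next sub-state on "done"


def sequence(name, steps):
    """Build a composite state whose sub-states run in order (hypothetical helper).

    Sub-state names must be unique within one composite state.
    """
    names = [s.name for s in steps]
    return State(
        name=name,
        substates={s.name: s for s in steps},
        initial=names[0],
        transitions=dict(zip(names, names[1:])),
    )


def run(state, params, trace):
    """Execute a state; composites delegate to their sub-machine until it exits."""
    if not state.substates:
        # Primitive action: fill in command parameters and record the action.
        trace.append(state.name.format(**params))
        return
    current = state.initial
    while current is not None:
        run(state.substates[current], params, trace)
        # A missing transition means the sub-machine has finished.
        current = state.transitions.get(current)


# Assumed task hierarchy: transfer_liquid -> pick / pour / place -> primitives.
pick = sequence("pick", [State("move_to({source})"), State("close_gripper")])
pour = sequence("pour", [State("move_to({target})"), State("tilt({volume_ml} ml)"), State("untilt")])
place = sequence("place", [State("move_to(rack)"), State("open_gripper")])
transfer = sequence("transfer_liquid", [pick, pour, place])

# Assumed structured command, as might be produced by the instruction understanding step.
command = {"source": "beaker_A", "target": "flask_B", "volume_ml": 50}

actions = []
run(transfer, command, actions)
print(actions)
```

In this sketch each composite state owns a small sub-machine, so a task such as `pour` can be reused inside other high-level tasks without duplicating its primitive steps. A real system would presumably add guard conditions, failure transitions, and the feedback loop mentioned in the abstract.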
About the journal:
Artificial Intelligence (AI) is pivotal in driving the fourth industrial revolution, witnessing remarkable advancements across various machine learning methodologies. AI techniques have become indispensable tools for practicing engineers, enabling them to tackle previously insurmountable challenges. Engineering Applications of Artificial Intelligence serves as a global platform for the swift dissemination of research elucidating the practical application of AI methods across all engineering disciplines. Submitted papers are expected to present novel aspects of AI utilized in real-world engineering applications, validated using publicly available datasets to ensure the replicability of research outcomes. Join us in exploring the transformative potential of AI in engineering.