Semantic-compensation-based recovery in multi-agent systems
A. Unruh, H. Harjadi, J. Bailey, K. Ramamohanarao
IEEE 2nd Symposium on Multi-Agent Security and Survivability, 2005. Published 2005-09-12.
DOI: 10.1109/MASSUR.2005.1507051 (https://doi.org/10.1109/MASSUR.2005.1507051)
Citations: 17
Abstract
In agent systems, an agent's recovery from execution problems is often complicated by constraints that are not present in a more traditional distributed database systems environment. An analysis of agent-related crash recovery issues is presented, and requirements for achieving 'acceptable' agent crash recovery are discussed. Motivated by this analysis, a novel approach to managing agent recovery is presented. It utilises an event- and task-driven model for employing semantic compensation, task retries, and checkpointing. The compensation/retry model requires a situated model of action and failure, and provides the agent with an emergent, unified treatment of both crash recovery and run-time failure-handling. This approach helps the agent to recover acceptably from crashes and execution problems; improve system predictability; manage inter-task dependencies; and address the way in which exogenous events or crashes can trigger the need for a re-decomposition of a task. An agent architecture is then presented, which uses pair processing to leverage these recovery techniques and increase the agent's availability on crash restart.
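As a rough illustration of how task retries, semantic compensation, and checkpointing can fit together, the sketch below retries each step of a task a bounded number of times and, if a step still fails, compensates the already completed steps in reverse order. It is not the paper's model or architecture: the names `Step`, `TaskRunner`, `max_retries`, and the in-memory `checkpoint` list are all hypothetical, and a real agent would persist its checkpoints and reason about which compensations are semantically appropriate.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Step:
    """Hypothetical unit of work with a forward action and a semantic undo."""
    name: str
    action: Callable[[], None]       # forward action; raises on failure
    compensate: Callable[[], None]   # semantic "undo" for a completed action
    max_retries: int = 2             # extra attempts after the first failure


@dataclass
class TaskRunner:
    # In-memory stand-in for a persistent checkpoint of completed step names.
    checkpoint: List[str] = field(default_factory=list)

    def run(self, steps: List[Step]) -> bool:
        done: List[Step] = []
        for step in steps:
            if self._attempt(step):
                done.append(step)
                self.checkpoint.append(step.name)
            else:
                # Retries exhausted: compensate completed steps, newest first.
                for prev in reversed(done):
                    prev.compensate()
                    self.checkpoint.remove(prev.name)
                return False
        return True

    @staticmethod
    def _attempt(step: Step) -> bool:
        # Try the action once, then up to max_retries more times.
        for _ in range(step.max_retries + 1):
            try:
                step.action()
                return True
            except Exception:
                continue
        return False
```

In this sketch a failed run leaves the checkpoint empty again, so a restart after a crash would see only the steps whose effects are still in place. Compensation here is a plain callback; the abstract's point that it needs a situated model of action and failure is deliberately not modelled.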