{"title":"Iterative Graph Alignment","authors":"Fangyuan Yu, Hardeep Singh Arora, Matt Johnson","doi":"arxiv-2408.16667","DOIUrl":null,"url":null,"abstract":"By compressing diverse narratives, LLMs go beyond memorization, achieving\nintelligence by capturing generalizable causal relationships. However, they\nsuffer from local 'representation gaps' due to insufficient training data\ndiversity, limiting their real-world utility, especially in tasks requiring\nstrict alignment to rules. Traditional alignment methods relying on heavy human\nannotations are inefficient and unscalable. Recent self-alignment techniques\nalso fall short, as they often depend on self-selection based prompting and\nmemorization-based learning. To address these issues, we introduce Iterative\nGraph Alignment (IGA), an annotation-free rule-based alignment algorithm. A\nteacher model (VLM) employs Iterative Graph Prompting (IGP) to create logical\ngraphs and reference answers. The student model (LLM) identifies local\nknowledge gaps by attempting to align its responses with these references,\ncollaborating with helper models to generate diverse answers. These aligned\nresponses are then used for iterative supervised fine-tuning (SFT). Our\nevaluations across five rule-based scenarios demonstrate IGP's effectiveness,\nwith a 73.12\\% alignment improvement in Claude Sonnet 3.5, and\nLlama3-8B-Instruct achieving an 86.20\\% improvement, outperforming Claude\nSonnet 3.5 in rule-based alignment.","PeriodicalId":501315,"journal":{"name":"arXiv - CS - Multiagent Systems","volume":"43 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-08-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - CS - Multiagent Systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2408.16667","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0
Abstract
By compressing diverse narratives, LLMs go beyond memorization, achieving
intelligence by capturing generalizable causal relationships. However, they
suffer from local 'representation gaps' due to insufficient training data
diversity, limiting their real-world utility, especially in tasks requiring
strict alignment to rules. Traditional alignment methods relying on heavy human
annotations are inefficient and unscalable. Recent self-alignment techniques
also fall short, as they often depend on self-selection based prompting and
memorization-based learning. To address these issues, we introduce Iterative
Graph Alignment (IGA), an annotation-free rule-based alignment algorithm. A
teacher model (VLM) employs Iterative Graph Prompting (IGP) to create logical
graphs and reference answers. The student model (LLM) identifies local
knowledge gaps by attempting to align its responses with these references,
collaborating with helper models to generate diverse answers. These aligned
responses are then used for iterative supervised fine-tuning (SFT). Our
evaluations across five rule-based scenarios demonstrate IGP's effectiveness,
with a 73.12\% alignment improvement in Claude Sonnet 3.5, and
Llama3-8B-Instruct achieving an 86.20\% improvement, outperforming Claude
Sonnet 3.5 in rule-based alignment.