LLaVA-docent: Instruction tuning with multimodal large language model to support art appreciation education

Unggi Lee, Minji Jeon, Yunseo Lee, Gyuri Byun, Yoorim Son, Jaeyoon Shin, Hongkyu Ko, Hyeoncheol Kim
Journal: Computers and Education Artificial Intelligence, Volume 7, Article 100297
Published: 2024-09-13
DOI: 10.1016/j.caeai.2024.100297
URL: https://www.sciencedirect.com/science/article/pii/S2666920X24001000

Abstract

Although AI systems have been developed to support learning in many domains, AI assistance for art appreciation education has not been extensively explored. Art appreciation, often perceived by most students as an unfamiliar and challenging endeavor, can become more accessible with a generative-AI-enabled conversation partner that asks tailored questions and encourages the audience to appreciate artwork deeply. This study explores the application of multimodal large language models (MLLMs) in art appreciation education, with a focus on developing LLaVA-Docent, a model designed to serve as a personal tutor for art appreciation. Our approach followed design and development research, iteratively refining the application to produce a functional MLLM-enabled chatbot along with a data design framework for art appreciation education. To that end, we built a virtual dialogue dataset generated by GPT-4, which was instrumental in training our MLLM, LLaVA-Docent. We evaluated LLaVA-Docent by benchmarking it against alternative settings, which revealed its distinct strengths and weaknesses. Our findings highlight the efficacy of the MLLM-based personalized art appreciation chatbot and demonstrate its applicability to a novel approach to teaching and experiencing art appreciation.
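
The abstract does not specify how the GPT-4-generated virtual dialogues are stored. As a rough illustration only, the sketch below packs one tutor-student exchange into the multi-turn conversation layout commonly used by the open-source LLaVA training data (`id`, `image`, and a `conversations` list of `from`/`value` turns). The helper `make_record`, the record id, the image path, and the dialogue text are all hypothetical and are not taken from the paper.

```python
import json

# Placeholder that LLaVA-style data uses to mark where the image is attached.
IMAGE_TOKEN = "<image>"


def make_record(record_id, image_path, turns):
    """Build one multi-turn record (hypothetical helper, not from the paper).

    turns is a list of (speaker, text) pairs. Speakers must alternate,
    starting with "human" (the student) followed by "gpt" (the tutor).
    """
    if not turns:
        raise ValueError("a dialogue needs at least one turn")
    conversations = []
    for i, (speaker, text) in enumerate(turns):
        expected = "human" if i % 2 == 0 else "gpt"
        if speaker != expected:
            raise ValueError(f"turn {i} should be from {expected!r}, got {speaker!r}")
        # Only the first human turn carries the image placeholder.
        value = f"{IMAGE_TOKEN}\n{text}" if i == 0 else text
        conversations.append({"from": speaker, "value": value})
    return {"id": record_id, "image": image_path, "conversations": conversations}


record = make_record(
    "demo-0001",               # hypothetical record id
    "artworks/demo-0001.jpg",  # hypothetical image path
    [
        ("human", "I am not sure what to look at first."),
        ("gpt", "Start with what stands out. Which colors catch your eye?"),
        ("human", "The dark blue in the sky."),
        ("gpt", "How does that blue make the scene feel to you?"),
    ],
)
print(json.dumps(record, indent=2))
```

Keeping the tutor's turns phrased as questions mirrors the abstract's description of a conversation partner that provides tailored questions; the alternation check simply guards against malformed generated dialogues before they reach training.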

Source journal metrics: CiteScore 16.80; self-citation rate 0.00%; articles published 66; review time 50 days