Machine Learning Offers Opportunities to Advance Library Services

Evidence Based Library and Information Practice (IF 0.4, Q4, Information Science & Library Science) | Pub Date: 2024-06-14 | DOI: 10.18438/eblip30527
Samantha Kaplan

Abstract

A Review of:
Wang, Y. (2022). Using machine learning and natural language processing to analyze library chat reference transcripts. Information Technology and Libraries, 41(3). https://doi.org/10.6017/ital.v41i3.14967

Objective – The study sought to develop a model to predict if library chat questions are reference or non-reference.

Design – Supervised machine learning and natural language processing.

Setting – College of New Jersey academic library.

Subjects – 8,000 Springshare LibChat transactions collected from 2014 to 2021.

Methods – The chat logs were downloaded into Excel, cleaned, and individual questions were labelled reference or non-reference by hand. Labelled data were preprocessed to remove nonmeaningful and stop words, and reformatted to lowercase. Data were then stemmed to group words with similar meaning. The feature of question length was then added and data were transformed from text to numeric for text vectorization. Data were then divided into training and testing sets. The Python packages Natural Language Toolkit (NLTK) and scikit-learn were used for analysis, building random forest and gradient boosting models which were evaluated via confusion matrix.

Main Results – Both models performed very well in precision, recall and accuracy, with the random forest model having better overall results than the gradient boosting model, as well as a more efficient fit time, though slightly longer prediction time.

Conclusion – High volume library chat services could benefit from utilizing machine learning to develop models that inform plugins or chat enhancements to filter chat queries quickly.
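The abstract reports that the models were evaluated via a confusion matrix and compared on precision, recall and accuracy. As a rough illustration of how those three figures follow from a confusion matrix for a two-class (reference vs. non-reference) task, here is a minimal standard-library sketch. It is not the study's code: the function names `confusion_counts` and `summarize` are hypothetical, the labels and predictions are invented for the example, and treating "reference" as the positive class is an assumption.

```python
from collections import Counter


def confusion_counts(y_true, y_pred, positive="reference"):
    """Tally the four cells of a binary confusion matrix.

    `positive` names the class treated as the positive label
    (an assumption; the study does not state which class it used).
    """
    counts = Counter()
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            counts["tp" if truth == positive else "fp"] += 1
        else:
            counts["fn" if truth == positive else "tn"] += 1
    return counts


def summarize(counts):
    """Derive precision, recall and accuracy from the four cells."""
    tp, fp, fn, tn = (counts[k] for k in ("tp", "fp", "fn", "tn"))
    total = tp + fp + fn + tn
    return {
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "accuracy": (tp + tn) / total if total else 0.0,
    }


# Invented hand labels and model predictions, NOT data from the study.
y_true = ["reference", "reference", "non-reference",
          "non-reference", "reference", "non-reference"]
y_pred = ["reference", "non-reference", "non-reference",
          "reference", "reference", "non-reference"]

counts = confusion_counts(y_true, y_pred)
metrics = summarize(counts)
print(dict(counts))
print(metrics)
```

With these invented labels there are two true positives, one false positive, one false negative and two true negatives, so precision, recall and accuracy each come out to 2/3. In practice the same figures would normally be obtained from scikit-learn's metrics module instead of hand-written helpers.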