Text document clustering using self organizing map: Theses and dissertations of Universitas Indonesia

Yantine Arsita Br. Panjaitan, I. Surjandari, Asma Rosyidah
Published in: 2017 3rd International Conference on Science in Information Technology (ICSITech), 2017-10-01
DOI: 10.1109/ICSITECH.2017.8257096 (https://doi.org/10.1109/ICSITECH.2017.8257096)
Citation count: 1

Abstract

Accessibility is a critical aspect for a college library to consider in order to help users search its collections. The Library of Universitas Indonesia, one of Asia's largest libraries with a collection of more than 1,500,000 books, must also attend to accessibility to match the size of its holdings. UI-ana collections, that is, works produced by or associated with Universitas Indonesia, and in particular theses (undergraduate and graduate) and dissertations, are among the largest collections in the library. However, the current collection management system is still based on the submission of each item to the library. Because these collections are arranged with no exact criterion, it is hard for users to find theses and dissertations on the same topic. Managing these collections according to a defined criterion is therefore needed to help users search them. This research aims to determine the categories that can represent theses and dissertations by text mining the abstracts of the 2005–2015 collections with a clustering algorithm, namely the Self-Organizing Map. The study found 139 categories, which will be used to classify the theses and dissertations of Universitas Indonesia.
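The abstract does not state the preprocessing, map size, or training parameters used in the study. Purely as an illustration of the general technique, the sketch below clusters a few short texts with a small Self-Organizing Map in plain Python. The whitespace tokenisation, the smoothed TF-IDF weighting, the 2x2 grid, the linear decay schedules, and all function names (`tfidf_vectors`, `train_som`, `cluster`) are assumptions made for this example, not the authors' method.

```python
import math
import random
from collections import Counter


def tfidf_vectors(docs):
    """Turn raw strings into L2-normalised TF-IDF vectors (assumed weighting)."""
    tokenised = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenised for tok in toks})
    doc_freq = Counter(tok for toks in tokenised for tok in set(toks))
    n_docs = len(docs)
    vectors = []
    for toks in tokenised:
        counts = Counter(toks)
        # Smoothed IDF: ln((1 + N) / (1 + df)) + 1
        vec = [
            counts[t] * (math.log((1 + n_docs) / (1 + doc_freq[t])) + 1.0)
            for t in vocab
        ]
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0
        vectors.append([v / norm for v in vec])
    return vocab, vectors


def best_matching_unit(weights, vec):
    """Return the grid coordinate of the node closest to vec (squared Euclidean)."""
    return min(
        weights,
        key=lambda node: sum((w - x) ** 2 for w, x in zip(weights[node], vec)),
    )


def train_som(vectors, rows=2, cols=2, epochs=200, lr0=0.5, sigma0=1.0, seed=0):
    """Train a rows x cols map; learning rate and radius decay linearly (assumed)."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    weights = {
        (r, c): [rng.random() for _ in range(dim)]
        for r in range(rows)
        for c in range(cols)
    }
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)
        sigma = max(sigma0 * (1.0 - frac), 0.01)
        # Present the documents in a fresh random order each epoch.
        for vec in rng.sample(vectors, len(vectors)):
            bmu = best_matching_unit(weights, vec)
            for node, w in weights.items():
                grid_dist2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
                # Gaussian neighbourhood: nodes near the BMU move the most.
                influence = math.exp(-grid_dist2 / (2.0 * sigma ** 2))
                for i in range(dim):
                    w[i] += lr * influence * (vec[i] - w[i])
    return weights


def cluster(docs, **som_kwargs):
    """Assign each document to the map node that best matches its vector."""
    _, vectors = tfidf_vectors(docs)
    weights = train_som(vectors, **som_kwargs)
    return [best_matching_unit(weights, v) for v in vectors]


if __name__ == "__main__":
    sample = [
        "neural network image classification",
        "supply chain inventory cost",
        "library collection catalog search",
    ]
    for text, node in zip(sample, cluster(sample)):
        print(node, text)
```

Each document ends up labelled with a grid coordinate, and documents that share a coordinate form one cluster. A study at the scale described in the abstract would need a much larger map and proper text preprocessing, so this sketch only shows the mechanics.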