Title: Text Mining for Internship Titles Clustering Using Shared Nearest Neighbor
Author: L. Zahrotun
Journal: Computer Engineering and Applications
Published: 2017-10-26
Type: Journal Article
DOI: 10.18495/comengapp.v6i3.214
URL: https://doi.org/10.18495/comengapp.v6i3.214
Citations: 2
Abstract
The internship course is one of the compulsory subjects in the Undergraduate Program of Informatics Engineering at Ahmad Dahlan University, Yogyakarta. In the last few semesters, we found that some students failed this subject. On investigation, these students had faced obstacles such as determining the main theme for their job description. In this study, we propose an application that groups internship titles using a text mining technique based on Shared Nearest Neighbor (SNN) clustering and cosine similarity. The results were obtained with the parameter values K = 7, epsilon = 0.5, and MinT = 0.3, which produced 22 clusters and 0 outliers. These values show that every internship title was assigned to a cluster. The 7 topics taken by the majority of students are: 1) Information Systems (7 titles); 2) Instructional Media (5 titles); 3) Archiving Applications (4 titles); 4) Web Profile Implementation (3 titles); 5) Instructional Media for University Courses (3 titles); 6) Multimedia (3 titles); and 7) Workshop & Training (3 titles).
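The idea of combining cosine similarity with shared nearest neighbors can be illustrated with a minimal sketch. This is not the paper's implementation: it is a simplified, Jarvis-Patrick-style variant that links two titles when each appears in the other's k-nearest-neighbor list and they share enough neighbors, then reads clusters off as connected components. The function names, the `min_shared` threshold, the plain word-count vectors, and the sample titles are all assumptions made for illustration. The sketch does not reproduce the epsilon and MinT parameters reported in the abstract.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two sparse term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def snn_clusters(titles, k=2, min_shared=0.5):
    """Group titles whose k-nearest-neighbor lists overlap enough.

    min_shared is a hypothetical threshold: the fraction of the k neighbors
    that two mutually neighboring titles must share to be linked.
    """
    # Assumption: whitespace tokenization and raw term counts, no stemming.
    vectors = [Counter(t.lower().split()) for t in titles]
    n = len(titles)

    # k nearest neighbors of each title, ranked by cosine similarity.
    neighbors = []
    for i in range(n):
        ranked = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: cosine_similarity(vectors[i], vectors[j]),
            reverse=True,
        )
        neighbors.append(set(ranked[:k]))

    # Union-find structure used to merge linked titles into clusters.
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i in range(n):
        for j in range(i + 1, n):
            # Link only titles that are in each other's neighbor lists.
            if j in neighbors[i] and i in neighbors[j]:
                shared = len(neighbors[i] & neighbors[j]) / k
                if shared >= min_shared:
                    parent[find(i)] = find(j)

    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(titles[i])
    return list(clusters.values())


if __name__ == "__main__":
    # Made-up sample titles, not data from the study.
    sample = [
        "library information system",
        "hospital information system",
        "school information system",
        "math instructional media",
        "physics instructional media",
        "chemistry instructional media",
    ]
    for cluster in snn_clusters(sample, k=2, min_shared=0.5):
        print(cluster)
```

On the made-up sample, the three "information system" titles and the three "instructional media" titles should end up in separate groups. A fuller SNN density approach would also count, for each title, how many neighbors exceed a similarity threshold and treat low-density titles as outliers, which is where parameters such as epsilon and MinT would come in.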