K-Gram As A Determinant Of Plagiarism Level in Rabin-Karp Algorithm
Authors: A. Siahaan, Mesran, R. Rahim, Dodi Siregar
Journal: International Journal of Scientific & Technology Research, volume 46 1, pages 350-353
Published: 2017-09-21
DOI: 10.17605/OSF.IO/J9KVR (https://doi.org/10.17605/OSF.IO/J9KVR)
Citations: 16
Abstract
Rabin-Karp is one of the algorithms used to measure the similarity of two strings. Here a string can be either a short sentence or a whole document containing complex words. In this algorithm, the plagiarism level is determined from the hash values that the two examined documents have in common. Each word is split into K-Grams of a fixed length, and each K-Gram is converted into a hash value. Every hash value in the source document is then compared with the hash values in the target document, and the number of matching hashes gives the plagiarism level. The length of the K-Gram is therefore the determinant of the plagiarism level: choosing a suitable K-Gram length produces an accurate result, and the result varies with each K-Gram value.
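The pipeline described in the abstract (form K-Grams, hash them, count matching hashes) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `kgram_hashes` and `similarity`, the base 256, the modulus 1,000,003, the text normalization, the use of character K-Grams over the whole normalized text, and the Dice coefficient as the similarity score are all choices made here for illustration and are not taken from the abstract.

```python
def kgram_hashes(text, k, base=256, mod=1_000_003):
    """Return the rolling hash of every k-gram (length-k substring) of text.

    Assumptions (not from the abstract): base, modulus, and normalization.
    """
    # Assumed normalization: lowercase, keep only letters and digits.
    s = "".join(ch for ch in text.lower() if ch.isalnum())
    if k <= 0 or len(s) < k:
        return []

    high = pow(base, k - 1, mod)  # weight of the leading character

    # Hash of the first k-gram, computed directly.
    h = 0
    for ch in s[:k]:
        h = (h * base + ord(ch)) % mod
    hashes = [h]

    # Rabin-Karp rolling update: drop the leading character,
    # shift by one position, and append the next character.
    for i in range(k, len(s)):
        h = ((h - ord(s[i - k]) * high) * base + ord(s[i])) % mod
        hashes.append(h)
    return hashes


def similarity(source, target, k):
    """Percentage similarity from matching k-gram hashes.

    Uses the Dice coefficient over the sets of distinct hashes;
    this scoring formula is an assumption of the sketch.
    """
    a = set(kgram_hashes(source, k))
    b = set(kgram_hashes(target, k))
    if not a and not b:
        return 0.0
    return 200.0 * len(a & b) / (len(a) + len(b))
```

With this sketch, `similarity("abcdef", "abcxyz", 1)` gives 50.0 while `similarity("abcdef", "abcxyz", 2)` gives 40.0, which illustrates the abstract's point that the reported level changes with the K-Gram length. Because hashes are taken modulo a fixed value, unrelated K-Grams can occasionally collide; a stricter version would confirm each hash match by comparing the underlying substrings.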