A study of recent advances in cache memories
Pub Date: 2014-11-01. DOI: 10.1109/IC3I.2014.7019786
M. T. Banday, Munis Khan
Registers within the processor; caches within, on, or outside the processor; and virtual memory on the disk drive together build the memory hierarchy of modern computer systems. The principle of locality of reference is what makes this hierarchy work efficiently. In recent years, cache organizations and designs have witnessed several advances that have not only improved performance metrics such as hit rate, speed, latency, and energy consumption, but have also produced new designs and organizations for chip multiprocessors, such as multilevel caches, Non-Uniform Cache Access (NUCA), and hybrid caches. This paper presents a study of current competing processors in terms of the factors that determine the performance and throughput of cache organization and design. To evaluate their performance and viability, it reviews recent cache trends, including hybrid cache memory, non-uniform cache architecture, energy-efficient replacement algorithms, cache memory programming, software-defined caches, and emerging techniques for making caches reliable against soft errors. It discusses the pros and cons of emerging cache architectures and designs.
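To make the locality principle concrete, here is a small illustrative sketch (not from the paper) that times row-major against column-major traversal of a matrix; the strided column walk defeats spatial locality and typically runs measurably slower on a cached machine.

```python
# Illustrative sketch: why locality of reference matters for cache performance.
import time
import numpy as np

N = 2048
a = np.zeros((N, N), dtype=np.float64)   # C (row-major) memory layout

def sum_rows(m):
    # Row-major traversal: consecutive accesses reuse the same cache lines.
    s = 0.0
    for i in range(m.shape[0]):
        s += m[i, :].sum()
    return s

def sum_cols(m):
    # Column traversal: each access strides N*8 bytes, hurting spatial locality.
    s = 0.0
    for j in range(m.shape[1]):
        s += m[:, j].sum()
    return s

for f in (sum_rows, sum_cols):
    t0 = time.perf_counter()
    f(a)
    print(f.__name__, round(time.perf_counter() - t0, 4), "s")
```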
{"title":"A study of recent advances in cache memories","authors":"M. T. Banday, Munis Khan","doi":"10.1109/IC3I.2014.7019786","DOIUrl":"https://doi.org/10.1109/IC3I.2014.7019786","url":null,"abstract":"Registers within a processor, cache within, on, or outside the processor, and virtual memory on the disk drive builds memory hierarchy in modern computer systems. The principle of locality of reference makes this memory hierarchy work efficiently. In recent years, cache organizations and designs have witnessed several advances that have not only improved their performance such as hit rates, speed, latency, energy consumption, etc. but various new designs and organizations for chip multi-processors such as multilevel caches, Non-Uniform Cache Access (NUCA), hybrid caches, etc. have also emerged. This paper presents a study of current competing processors in terms of various factors determining performance and throughput of cache organization and design. To evaluate their performance and viability, it reviews recent cache trends that include hybrid cache memory, non-uniform cache architecture, energy efficient replacement algorithms, cache memory programming, software defined caches and emerging techniques for making cache reliable against soft errors. It discusses the pros and cons of emerging cache architectures and designs.","PeriodicalId":430848,"journal":{"name":"2014 International Conference on Contemporary Computing and Informatics (IC3I)","volume":"162 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124890046","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Predicting the risk of newborns based on fuzzy clustering method with prediction risk assessment
Pub Date: 2014-11-01. DOI: 10.1109/IC3I.2014.7019584
Jyothi Thomas, G. Kulanthaivel
The role of the cervix in the pathogenesis of premature delivery is controversial. In a prospective, multicenter study of pregnant women, we used vaginal ultrasonography to measure the length of the cervix; we also documented the incidence of spontaneous delivery before 35 weeks' gestation. We performed vaginal ultrasonography at approximately 24 and 28 weeks of gestation in women with singleton pregnancies and then assessed the relation between the length of the cervix and the risk of spontaneous preterm delivery. We examined 2915 women at approximately 24 weeks of gestation and 2531 of these women again at approximately 28 weeks. Spontaneous preterm delivery (at less than 35 weeks) occurred in 126 of the women (4.3 percent) examined at 24 weeks. The length of the cervix was normally distributed at 24 and 28 weeks (mean [±SD], 35.2±8.3 mm and 33.7±8.5 mm, respectively). The relative risk of preterm delivery increased as the length of the cervix decreased. The paper also discusses the approximation properties of other possible types of nonlinearities that might be implemented by artificial neural networks. The daily registration has N cases, each representing one of the known stimulus-response pairs. The objective of this work is to develop a function F that maps the vector of input variables t to the vector of output variables P; F is an arbitrary function, in this case the electric power consumption. It is modelled with an Artificial Neural Network (ANN), namely a Multilayer Perceptron (MLP); an alternative is to model it using interpolation algorithms. For the lengths measured at 28 weeks, the corresponding relative risks were 2.80, 3.52, 5.39, 9.57, 13.88, and 24.94 (P<0.001 for values at or below the 50th percentile; P=0.003 for values at the 75th percentile). The risk of spontaneous preterm delivery is increased in women who are found to have a short cervix by vaginal ultrasonography during pregnancy.
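The relative-risk figures quoted above come from 2×2 contingency tables. The sketch below shows the computation with made-up counts; the study's raw tables are not reproduced in the abstract, so the numbers here are purely illustrative.

```python
# Relative risk from a 2x2 table: RR = [a/(a+b)] / [c/(c+d)],
# where a,b are preterm/term counts among short-cervix women and
# c,d are the counts among the remaining women. Counts below are made up.
def relative_risk(a, b, c, d):
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    return risk_exposed / risk_unexposed

print(round(relative_risk(20, 80, 106, 2709), 2))  # ~5.31 with these toy counts
```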
{"title":"Predicting the risk of newborns based on fuzzy clustering method with prediction risk assessment","authors":"Jyothi Thomas, G. Kulanthaivel","doi":"10.1109/IC3I.2014.7019584","DOIUrl":"https://doi.org/10.1109/IC3I.2014.7019584","url":null,"abstract":"The role of the cervix in the pathogenesis of premature delivery is controversial. In a prospective, multicenter study of pregnant women, we used vaginal ultrasonography to measure the length of the cervix; we also documented the incidence of spontaneous delivery before 35 weeks' gestation. We performed vaginal ultrasonography at approximately 24 and 28 weeks of gestation in women with singleton pregnancies. We then assessed the relation between the length of the cervix and the risk of spontaneous preterm delivery. We examined 2915 women at approximately 24 weeks of gestation and 2531 of these women again at approximately 28 weeks. Spontaneous preterm delivery (at less than 35 weeks) occurred in 126 of the women (4.3 percent) examined at 24 weeks. The length of the cervix was normally distributed at 24 and 28 weeks (mean [SD], 35.28.3 mm and 33.78.5 mm, respectively). The relative risk of preterm delivery increased as the length of the cervix decreased. The paper discusses approximation properties of other possible types of nonlinearities that might be implemented by artificial neural networks. The daily registration has N cases that each of the well-known stimulus-answer couples represents. The objective of this work is to develop a function that allows finding the vector of entrance variables t to the vector of exit variables P. F is any function, in this case the electric power consumption. Their modeling with Artificial Neural Network (ANN) is Multi a Perceptron Layer (PMC). Another form of modeling it is using Interpolation Algorithms (AI). For the lengths measured at 28 weeks, the corresponding relative risks were 2.80, 3.52, 5.39, 9.57, 13.88, and 24.94 (P0.001 for values at or below the 50th percentile; P0.003 for values at the 75th percentile). The risk of spontaneous preterm delivery is increased in women who are found to have a short cervix by vaginal ultrasonography during pregnancy.","PeriodicalId":430848,"journal":{"name":"2014 International Conference on Contemporary Computing and Informatics (IC3I)","volume":"18 5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124986849","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Compression of images represented in hexagonal lattice using wavelet and Gabor filter
Pub Date: 2014-11-01. DOI: 10.1109/IC3I.2014.7019622
K. Jeevan, S. Krishnakumar
Hexagonal geometry has advantages such as higher sampling efficiency, consistent connectivity, and higher angular resolution. In addition, the layout of photo-receptors in the human retina more closely resembles a hexagonal structure. For these reasons, many researchers have studied the possibility of using a hexagonal structure to represent digital images. Wavelets have their own advantages, and combining wavelets with image processing on a hexagonal grid gives better performance, because the hexagonal wavelet inherits the advantages of the hexagonal grid along with those of wavelets. In this work, wavelet-based image compression is performed on both square- and hexagonally-sampled images, and the performance is compared using Mean Square Error (MSE) and Peak Signal to Noise Ratio (PSNR). A Gabor filter is used for the interpolation of hexagonally sampled images. Compression in the hexagonal domain gives better results than compression in the rectangular domain.
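The two fidelity metrics used for the comparison are standard; here is a minimal sketch, assuming 8-bit images held as NumPy arrays (not the authors' code):

```python
# MSE and PSNR between an original image and its compressed reconstruction.
import numpy as np

def mse(original, reconstructed):
    diff = original.astype(np.float64) - reconstructed.astype(np.float64)
    return np.mean(diff ** 2)

def psnr(original, reconstructed, peak=255.0):
    # Higher PSNR (in dB) means a reconstruction closer to the original.
    m = mse(original, reconstructed)
    return float("inf") if m == 0 else 10.0 * np.log10(peak ** 2 / m)
```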
{"title":"Compression of images represented in hexagonal lattice using wavelet and gabor filter","authors":"K. Jeevan, S. Krishnakumar","doi":"10.1109/IC3I.2014.7019622","DOIUrl":"https://doi.org/10.1109/IC3I.2014.7019622","url":null,"abstract":"Hexagonal geometry has some advantageous like higher sampling efficiency, consistent connectivity and higher angular resolution. In addition to these advantages, the layout of photo-receptors in the human retina is more closely resembles to the hexagonal structure. It is due to these reasons many researchers have studied the possibility of using a hexagonal structure to represent digital images. Wavelet also have its own advantage and combining wavelet and processing of images in Hexagonal grid, that also will give better performance, because hexagonal wavelet includes the advantages of the hexagonal grid along with the wavelets. In this wok, the wavelet based image compression is performed on both square as well as hexagonal sampled images and the performance is compared using Mean Square Error (MSE) and Peak Signal to Noise Ratio (PSNR). Gabor filter is used for the interpolation of hexagonally sampled images. Compression on hexagonal domain gives better results compared to compression on rectangular domain.","PeriodicalId":430848,"journal":{"name":"2014 International Conference on Contemporary Computing and Informatics (IC3I)","volume":"53 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125431691","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
An unsupervised intelligent system to detect fabrication in photocopy document using variations in Bounding Box features
Pub Date: 2014-11-01. DOI: 10.1109/IC3I.2014.7019814
Suman V. Patgar, K. Rani, T. Vasudev
Photocopied documents are very common in everyday life. People frequently carry and produce photocopies to avoid damaging or losing the original documents. This provision, however, is misused for temporary benefit by fabricating fake photocopied documents. When a photocopied document is produced, it may need to be checked for authenticity. This work attempts to detect such fabricated photocopies. The paper proposes a system to detect fabricated photocopied documents using bounding-box features. It focuses mainly on detecting fabrications in which content is manipulated by smearing whitener over the original content and writing new content above it, or by manipulating content with a cut-and-paste technique. A detailed experimental study has been performed on a collected sample set of considerable size, and a decision model has been developed for classification. Testing on a set of collected samples yielded an average detection rate close to 86%.
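The paper's bounding-box features are not published as code, but the following hypothetical sketch shows one way such features could be extracted with OpenCV connected components, flagging components whose box height deviates strongly from the document-wide median. The file name and the 2.5-sigma threshold are illustrative assumptions, not the authors' settings.

```python
# Hypothetical bounding-box outlier check on a binarized photocopy image.
import cv2
import numpy as np

img = cv2.imread("photocopy.png", cv2.IMREAD_GRAYSCALE)
# Otsu binarization; invert so dark text becomes white foreground components.
_, binary = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
n, labels, stats, _ = cv2.connectedComponentsWithStats(binary, connectivity=8)

# stats columns: [x, y, width, height, area]; row 0 is the background.
heights = stats[1:, cv2.CC_STAT_HEIGHT].astype(np.float64)

# Overwritten or pasted-in text often shows up as outlier box geometry:
# compare each component's height with the document-wide median.
median_h = np.median(heights)
suspect = np.abs(heights - median_h) > 2.5 * heights.std()
print("suspect components:", np.flatnonzero(suspect) + 1)
```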
{"title":"An unsupervised intelligent system to detect fabrication in photocopy document using variations in Bounding Box features","authors":"Suman V. Patgar, K. Rani, T. Vasudev","doi":"10.1109/IC3I.2014.7019814","DOIUrl":"https://doi.org/10.1109/IC3I.2014.7019814","url":null,"abstract":"Photocopy documents are very common in our normal life. People are permitted to carry and produce photocopied documents frequently, to avoid damages or losing the original documents. But this provision is misused for temporary benefits by fabricating fake photocopied documents. When a photocopied document is produced, it may be required to check for its authenticity. An attempt is made in this direction to detect such fabricated photocopied documents. This paper proposes a system to detect fabricated photocopied document using Bounding box. The work in this paper mainly focuses on detecting fabrication of photocopied document in which some contents are manipulated by smearing whitener over the original content and writing new contents above it and manipulate the content through cut and paste technique. A detailed experimental study has been performed using a collected sample set of considerable size and a decision model is developed for classification. Testing is performed on set of collected testing samples resulted in an average detection rate close to 86%.","PeriodicalId":430848,"journal":{"name":"2014 International Conference on Contemporary Computing and Informatics (IC3I)","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131515706","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Survey on skin based face detection on different illumination, poses and occlusion
Pub Date: 2014-11-01. DOI: 10.1109/IC3I.2014.7019746
J. Rajeshwari, K. Karibasappa, M. T. Gopalkrishna
Face detection is used to locate as well as identify human faces in images or video under varying illumination, pose, orientation, in-plane rotation, and position. Since the face is the centre of attention in video and images, face detection is used in several applications such as security, retrieval, video compression, and recognition technology. Faces exhibit a high degree of variance in appearance, which makes face detection a difficult problem in computer vision. Moreover, many images and videos are corrupted, so a thorough study of skin-based face detection is required. This paper presents a comparison of different skin-based methods used to detect faces under different illumination, pose, and occlusion. By reviewing existing algorithms, better algorithms can be developed for computer vision problems.
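As a hedged illustration of the skin-based first stage that many of the surveyed detectors share, the sketch below applies a widely cited Cr/Cb threshold rule in YCrCb space. The exact thresholds vary across the reviewed papers; the values and file name here are assumptions, not the survey's own.

```python
# Skin-colour candidate mask via fixed Cr/Cb thresholds in YCrCb space.
import cv2
import numpy as np

frame = cv2.imread("frame.jpg")                     # BGR image
ycrcb = cv2.cvtColor(frame, cv2.COLOR_BGR2YCrCb)
lower = np.array([0, 133, 77], dtype=np.uint8)      # Y, Cr, Cb lower bounds
upper = np.array([255, 173, 127], dtype=np.uint8)   # Y, Cr, Cb upper bounds
mask = cv2.inRange(ycrcb, lower, upper)             # 255 where pixel looks skin-like
candidates = cv2.bitwise_and(frame, frame, mask=mask)  # regions passed to later stages
```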
{"title":"Survey on skin based face detection on different illumination, poses and occlusion","authors":"J. Rajeshwari, K. Karibasappa, M. T. Gopalkrishna","doi":"10.1109/IC3I.2014.7019746","DOIUrl":"https://doi.org/10.1109/IC3I.2014.7019746","url":null,"abstract":"Face detection is used to locate and as well as identify the human faces in the image or video in different illumination, pose, orientation, in-plane rotation, position. Since face is the centre of attraction in video and images, it is used in several applications such as security, retrieval, video compression, recognition technology. Face detection is having a high degree of variance in its appearance which makes difficult problem in computer vision. Most of the images or videos are corrupted, hence a complete study of face detection based on skin is required. This paper presents a comparison on different methods used to detect the face for different illumination, pose and occlusion on skin based. By reviewing existing algorithms better algorithms can be developed for computer vision problems.","PeriodicalId":430848,"journal":{"name":"2014 International Conference on Contemporary Computing and Informatics (IC3I)","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132628350","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Opinion based book recommendation using Naive Bayes classifier
Pub Date: 2014-11-01. DOI: 10.1109/IC3I.2014.7019672
Anand Shanker Tewari, T. S. Ansari, A. Barman
In the rapidly growing field of e-commerce, buyers are surrounded by a great deal of product information. However, search engines such as Google and Baidu cannot satisfy buyers' demands, because the information users want about a product cannot be obtained quickly, easily, and accurately, so buyers must spend a lot of time filtering out unnecessary information. Many e-commerce websites ask buyers to review products they have purchased, and as the popularity of e-commerce grows day by day, the volume of customer reviews a product receives grows heavily as well. As a result, it is difficult for a buyer to read all the reviews before deciding on a purchase. In this paper, we extract, summarize, and categorize all the customer reviews of a book. The paper proposes a book recommendation technique based on opinion mining and a Naïve Bayes classifier to recommend top-ranking books to buyers. It also considers an important factor, the price of the book, during recommendation, and presents a novel, efficient tabular method for recommending books to the buyer, especially when the buyer visits the website for the first time.
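Here is a minimal sketch of the opinion-mining step, assuming scikit-learn and a small labelled set of book reviews (the paper's corpus and exact feature set are not public); per-book ratios of positive predictions could then rank the catalogue.

```python
# Naive Bayes polarity classification of book reviews (toy training data).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

reviews = ["great plot, loved it", "poorly written and dull",
           "excellent reference book", "waste of money"]
labels = [1, 0, 1, 0]                      # 1 = positive opinion, 0 = negative

vec = CountVectorizer()
X = vec.fit_transform(reviews)             # bag-of-words counts
clf = MultinomialNB().fit(X, labels)

# Score a new review; aggregated per book, such scores can rank recommendations.
print(clf.predict(vec.transform(["dull characters but a great ending"])))
```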
{"title":"Opinion based book recommendation using Naive Bayes classifier","authors":"Anand Shanker Tewari, T. S. Ansari, A. Barman","doi":"10.1109/IC3I.2014.7019672","DOIUrl":"https://doi.org/10.1109/IC3I.2014.7019672","url":null,"abstract":"In the rapidly increasing field of E-commerce, buyer is surrounded by many product information. However, search engines like Google, Baidu, can't satisfy the demands of buyer because the information about the product that the users want can't be obtain quickly, easily and correctly. So buyer has to spend lots of time in removing the unnecessary information. Many e-commerce website often request buyers to review products that they have already purchased. As the popularity of e-commerce is increasing day by day, the reviews from customers about the product receives also increasing heavily. As a result of this it is difficult for a buyer to read all the reviews to make a decision about the product purchase. In this paper, we extracted, summarize and categorize all the customer reviews of a book. This paper proposes a book recommendation technique based on opinion mining and Naïve Bayes classifier to recommend top ranking books to buyers. This paper also considered the important factor like price of the book while recommendation and presented a novel tabular efficient method for recommending books to the buyer, especially when the buyer is coming first time to the website.","PeriodicalId":430848,"journal":{"name":"2014 International Conference on Contemporary Computing and Informatics (IC3I)","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133621662","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Neuro-fuzzy design of a fuzzy PI controller with real-time implementation on a speed control system
Pub Date: 2014-11-01. DOI: 10.1109/IC3I.2014.7019689
Arijit Ghosh, Satyaki Sen, C. Dey
Linguistic modelling of complex and nonlinear systems lies at the heart of many control and decision-making processes. In this area, fuzzy logic is one of the most effective tools for building such linguistic models. Here, a fuzzy PI controller is first designed with 49 expert-defined rules to achieve the desired performance for a speed control system. Thereafter, a neuro-fuzzy controller is developed through back-propagation training on the input-output data set obtained from the previously designed fuzzy controller. The performance of the proposed neuro-fuzzy PI controller is tested through simulation as well as real-time experimentation on a DC servo speed control system. Both the simulation and the experimental results substantiate the suitability of the designed neuro-fuzzy controller for closely approximating the behaviour of the nonlinear fuzzy controller.
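The training stage can be sketched as fitting a neural network to input-output pairs recorded from the fuzzy controller. In the sketch below, `fuzzy_pi` is only a placeholder for the expert 49-rule controller, and the sampling ranges and network size are assumptions rather than the paper's settings.

```python
# Neuro-fuzzy idea: train an MLP to mimic a fuzzy PI mapping (e, de) -> u.
import numpy as np
from sklearn.neural_network import MLPRegressor

def fuzzy_pi(e, de):
    # Placeholder teacher; a real implementation evaluates the 49-rule base.
    return np.tanh(0.8 * e + 0.4 * de)

rng = np.random.default_rng(0)
E = rng.uniform(-1, 1, (5000, 2))          # sampled (error, change-of-error) pairs
u = fuzzy_pi(E[:, 0], E[:, 1])             # teacher control actions

net = MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
net.fit(E, u)                              # back-propagation training
print("fit quality R^2:", net.score(E, u))
```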
{"title":"Neuro-fuzzy design of a fuzzy PI controller with real-time implementation on a speed control system","authors":"Arijit Ghosh, Satyaki Sen, C. Dey","doi":"10.1109/IC3I.2014.7019689","DOIUrl":"https://doi.org/10.1109/IC3I.2014.7019689","url":null,"abstract":"Linguistic modelling of complex and nonlinear system constitutes to be the heart of many control and decision-making process. In this area, fuzzy logic is one of the most effective tools to build such linguistic models. Here, initially a fuzzy PI controller is designed with expert defined 49 rules to achieve desirable performance for a speed control system. Thereafter, a neuro-fuzzy controller is developed through back propagation training based on the input-output data set obtained from the previously designed fuzzy controller. Performance of the proposed neuro-fuzzy PI controller is tested through simulation study as well as real time experimentation on a DC servo speed control system. Both the simulation and experimental results substantiate the suitability of the designed neuro-fuzzy controller for closely approximating the behaviour of nonlinear fuzzy controller.","PeriodicalId":430848,"journal":{"name":"2014 International Conference on Contemporary Computing and Informatics (IC3I)","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117242027","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A distributed approach for selection of optimal actor nodes in wireless sensor and actor networks
Pub Date: 2014-11-01. DOI: 10.1109/IC3I.2014.7019702
V. Ranga, M. Dave, A. Verma
In this paper we study the problem of optimal actor-node selection based on mutual exclusion in the context of wireless sensor and actor networks (WSANs) and propose a novel distributed approach to solve it. The major requirements for any approach in this scenario are: (1) it should select the minimum number of actor nodes to act on the given incident event region; (2) the overlaps between acting ranges should be minimal, so that redundant action by actor nodes, i.e., wastage of resources, is avoided; and (3) the whole event region must be covered by one or more actors, as available in the network, i.e., full coverage must be achieved. We propose a novel approach called Distributed Optimal Actor-node Selection based on Mutual Exclusion (DOASME). The simulation results show the performance in terms of the size of the actor cover set, overlapped region, non-overlapped region, and maximum actor coverage degree. We also compare our simulation results with previously proposed benchmark algorithms.
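DOASME itself is distributed and is not reproduced in the abstract; as a centralized illustration of the stated coverage goals (minimum actors, full event coverage), here is a greedy selection sketch over hypothetical actor positions and event points.

```python
# Greedy actor-cover sketch: repeatedly pick the actor covering the most
# still-uncovered event points until the event region is fully covered.
import math

def covers(actor, point, acting_range):
    return math.dist(actor, point) <= acting_range

def select_actors(actors, event_points, acting_range):
    uncovered = set(event_points)
    chosen = []
    while uncovered:
        best = max(actors, key=lambda a: sum(covers(a, p, acting_range)
                                             for p in uncovered))
        gained = {p for p in uncovered if covers(best, p, acting_range)}
        if not gained:
            break                          # remaining points are unreachable
        chosen.append(best)
        uncovered -= gained
    return chosen

print(select_actors([(0, 0), (4, 0), (2, 2)],
                    [(0, 1), (1, 0), (4, 1), (2, 3)], acting_range=2.0))
```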
{"title":"A distributed approach for selection of optimal actor nodes in wireless sensor and actor networks","authors":"V. Ranga, M. Dave, A. Verma","doi":"10.1109/IC3I.2014.7019702","DOIUrl":"https://doi.org/10.1109/IC3I.2014.7019702","url":null,"abstract":"In this research paper we study the problem of optimal actor nodes selection based on mutual exclusion in the context of wireless sensor and actor network (WSAN) and propose a novel distributed approach to solve it. The major requirements for any proposed approach in such scenario are: (1) the proposed approach should select the minimum number of actor nodes to act on the given incident event region, (2) the overlaps between acting ranges should be minimum to avoid redundant action by actor nodes i.e. wastage of resources should be minimal, and finally, whole event region must be covered by one or more than one actors as per their availability in the network i.e. full coverage should be achieved. We have proposed one novel approach called Distributed Optimal Actor nodes Selection based on Mutual Exclusion approach (DOASME) in this research paper. The simulation results show the performance in terms of size of actor cover set, overlapped region, non-overlapped region and maximum actor coverage degree. We have also compared our simulated results with previously proposed benchmark algorithms.","PeriodicalId":430848,"journal":{"name":"2014 International Conference on Contemporary Computing and Informatics (IC3I)","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114926948","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Security analysis of TLS authentication
Pub Date: 2014-11-01. DOI: 10.1109/IC3I.2014.7019737
A. Ranjan, Vijay Kumar, M. Hussain
TLS is the cryptographic protocol used on the Internet. It consists of a set of protocols used for negotiating cryptographic parameters, encrypting and decrypting data, and reporting errors during the process. Security analysis of any cryptographic protocol is needed to discover vulnerabilities and to evaluate its security properties. We first analysed the protocol theoretically using the automated tool Scyther and drew important conclusions. We then performed a real-time experiment to identify loopholes in TLS authentication, gathered and recorded the resulting data, analysed the underlying causes, and suggested some generic countermeasures. In this paper we set out to find the loopholes of TLS; we found that certificate forging can be considered a loophole of the TLS security mechanism, identified its cause, and proposed countermeasures.
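On the client side, a standard defence against the certificate-forging loophole discussed here is strict certificate and hostname verification. Below is a hedged example using Python's ssl module with its default trust store; the host name is illustrative, and this is a generic countermeasure rather than the paper's specific one.

```python
# TLS client with certificate verification: forged or self-signed
# certificates fail the handshake instead of being silently accepted.
import socket
import ssl

ctx = ssl.create_default_context()        # loads system CAs, secure defaults
ctx.check_hostname = True                 # reject certs not matching the host
ctx.verify_mode = ssl.CERT_REQUIRED       # require a valid certificate chain

with socket.create_connection(("example.org", 443), timeout=5) as sock:
    with ctx.wrap_socket(sock, server_hostname="example.org") as tls:
        print(tls.version(), tls.getpeercert()["subject"])
```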
{"title":"Security analysis of TLS authentication","authors":"A. Ranjan, Vijay Kumar, M. Hussain","doi":"10.1109/IC3I.2014.7019737","DOIUrl":"https://doi.org/10.1109/IC3I.2014.7019737","url":null,"abstract":"TLS is the cryptographic protocol used in the internet. It consists of set of protocols which are used for negotiation of cryptographic parameters, encryption-decryption and reporting errors during the process. Security Analysis of any cryptographic protocol is very much needed to discover vulnerability and to evaluate its security properties. First we theoretically analysed the protocol using automated tool scyther and draw important conclusion. After that we have performed one real time experiment to identify the loopholes with TLS authentication. We gathered the data and prepared the record of it then we have analysed the reasons behind it and suggested some generic countermeasures to handle them. In this paper we intend to find out the loopholes of TLS and found that certificate forging could be considered as a loophole of TLS security mechanism and discovered its cause and proposed the countermeasures.","PeriodicalId":430848,"journal":{"name":"2014 International Conference on Contemporary Computing and Informatics (IC3I)","volume":"144 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115134136","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Using K-means cluster based techniques in external plagiarism detection
Pub Date: 2014-11-01. DOI: 10.1109/IC3I.2014.7019659
Rajiv Yerra, Yiu-Kai Ng
Text document categorization is a rapidly emerging research field in which documents are identified, differentiated, and classified manually or algorithmically. This paper focuses on the application of automatic text document categorization to the plagiarism detection domain. In today's world, plagiarism has become a prime concern, especially in research and education. The paper studies and compares different methods of document categorization for external plagiarism detection. The primary focus is to explore unsupervised document categorization/clustering methods using different variations of the K-means algorithm and to compare them with the general N-gram-based method and the Vector Space Model-based method. Finally, analysis and evaluation are performed using a data set from PAN-2013, and performance is compared on precision, recall, and efficiency in terms of algorithm execution time.
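Here is a minimal sketch of the clustering stage, assuming scikit-learn TF-IDF vectors over toy documents; the paper's K-means variants and its PAN-2013 pipeline differ in detail. Documents sharing a cluster with a suspicious text become candidates for the finer-grained n-gram comparison stage.

```python
# K-means over TF-IDF vectors as a candidate-retrieval step for
# external plagiarism detection (toy corpus).
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

docs = ["suspicious document text ...", "source document one ...",
        "source document two ...", "unrelated essay ..."]

X = TfidfVectorizer(stop_words="english").fit_transform(docs)
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

# Cluster labels: documents grouped with the suspicious one are compared
# further (e.g. by n-gram overlap) in the detailed detection stage.
print(km.labels_)
```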
{"title":"Using K-means cluster based techniques in external plagiarism detection","authors":"Rajiv Yerra, Yiu-Kai Ng","doi":"10.1109/IC3I.2014.7019659","DOIUrl":"https://doi.org/10.1109/IC3I.2014.7019659","url":null,"abstract":"Text document categorization is one of the rapidly emerging research fields, where documents are identified, differentiated and classified manually or algorithmically. The paper focuses on application of automatic text document categorization in plagiarism detection domain. In today's world plagiarism has become a prime concern, especially in research and educational fields. This paper aims on the study and comparison of different methods of document categorization in external plagiarism detection. Here the primary focus is to explore the unsupervised document categorization/ clustering methods using different variations of K-means algorithm and compare it with the general N-gram based method and Vector Space Model based method. Finally the analysis and evaluation is done using data set from PAN-20131 and performance is compared based on precision, recall and efficiency in terms of time taken for algorithm execution.","PeriodicalId":430848,"journal":{"name":"2014 International Conference on Contemporary Computing and Informatics (IC3I)","volume":"60 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116983927","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}