Pub Date: 2015-08-20 | DOI: 10.1109/IC3.2015.7346664
Divyesh Patel, T. Srivastava
The field of Discrete Tomography (DT) deals with the reconstruction of 2D discrete images from a small number of their projections. The prototypical problem of DT is to reconstruct a binary image from its horizontal and vertical projections. This problem turns out to be highly underdetermined, so additional constraints must be imposed. This paper exploits the convexity property of binary images and considers the reconstruction of h-convex binary images from their horizontal and vertical projections. The problem is transformed into two different optimization problems by defining two appropriate objective functions, and two simulated annealing (SA) algorithms are developed to solve them. The SA algorithms are tested on various randomly generated test images, as well as on noisy images. Numerical results are reported showing good reconstruction fidelity.
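The optimization idea in the abstract can be sketched as a plain simulated annealing loop over single-pixel flips. The objective below (sum of absolute projection errors) and all parameter values are illustrative assumptions, and the h-convexity constraint is omitted for brevity; this is not the paper's exact algorithm.

```python
import math
import random

def projections(img):
    # Horizontal (row sums) and vertical (column sums) projections.
    h = [sum(row) for row in img]
    v = [sum(col) for col in zip(*img)]
    return h, v

def cost(img, h, v):
    # L1 distance between the image's projections and the targets.
    ph, pv = projections(img)
    return sum(abs(a - b) for a, b in zip(ph, h)) + \
           sum(abs(a - b) for a, b in zip(pv, v))

def sa_reconstruct(h, v, iters=20000, t0=1.0, alpha=0.999, seed=0):
    # Simulated annealing: start from the all-zero image, propose random
    # pixel flips, accept per the Metropolis criterion, cool geometrically.
    rng = random.Random(seed)
    m, n = len(h), len(v)
    img = [[0] * n for _ in range(m)]
    c, t = cost(img, h, v), t0
    for _ in range(iters):
        i, j = rng.randrange(m), rng.randrange(n)
        img[i][j] ^= 1                      # flip one pixel
        c2 = cost(img, h, v)
        if c2 <= c or rng.random() < math.exp((c - c2) / max(t, 1e-9)):
            c = c2                           # accept the move
        else:
            img[i][j] ^= 1                   # reject: undo the flip
        t *= alpha
        if c == 0:                           # projections matched exactly
            break
    return img, c
```

On small instances this typically reaches zero projection error; note that several distinct binary images can share the same two projections, which is exactly the underdetermination the abstract mentions.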
{"title":"Reconstructing h-convex binary images from its horizontal and vertical projections by simulated annealing","authors":"Divyesh Patel, T. Srivastava","doi":"10.1109/IC3.2015.7346664","DOIUrl":"https://doi.org/10.1109/IC3.2015.7346664","url":null,"abstract":"The field of Discrete Tomography (DT) deals with the reconstruction of 2D discrete images from a few number of their projections. The ideal problem of DT is to reconstruct a binary image from its horizontal and vertical projections. It turns out that this problem of DT is highly underdetermined and therefore it is inevitable to impose additional constraints to this problem. This paper uses the convexity property of binary images and the problem of reconstruction of h-convex binary images from its horizontal and vertical projections is considered here. This problem is transformed into two different optimization problems by defining two appropriate objective functions. Then two simulated annealing (SA) algorithms to solve the two optimization problems are developed. The SA algorithms are tested on various randomly generated test images. The algorithms are also tested on noisy images. Finally numerical results have been reported showing good reconstruction fidelity.","PeriodicalId":217950,"journal":{"name":"2015 Eighth International Conference on Contemporary Computing (IC3)","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124597731","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2015-08-20 | DOI: 10.1109/IC3.2015.7346676
Prabhat Dansena, K. P. Kumar, R. Pal
Automatic extraction of important regions from a cheque image aids automatic analysis of the cheque, and can be used for automated cheque clearing, detection of cheque fraud, and so on. In this paper, a novel approach to extracting important regions from a cheque image, based on the identification of lines, is proposed. Experimental results demonstrate the success of the proposed approach.
{"title":"Line based extraction of important regions from a cheque image","authors":"Prabhat Dansena, K. P. Kumar, R. Pal","doi":"10.1109/IC3.2015.7346676","DOIUrl":"https://doi.org/10.1109/IC3.2015.7346676","url":null,"abstract":"Automatic extraction of important regions from a cheque image helps in automatic analysis of the cheque. It can be used for automated clearing of cheques, detection of frauds in the cheques, and so on. A novel approach of extracting important regions from a cheque image is proposed, in this paper, based on identification of lines. Experimental results demonstrate the success of the proposed approach.","PeriodicalId":217950,"journal":{"name":"2015 Eighth International Conference on Contemporary Computing (IC3)","volume":"98 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122641255","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2015-08-20 | DOI: 10.1109/IC3.2015.7346743
Bhuvan Mehan, Sanjay Batish, R. Bhatia, A. Dhiman
The Border Node preferred Social Ranking based Routing Protocol (BNSR), an extension of the BMFR routing protocol, is presented in this paper. BNSR follows a position-based routing strategy using location services such as GPS, and its forwarding strategy favors border nodes in order to reduce delay and optimize path length. BNSR selects the next-hop border node on the basis of social ranking, a parameter of the continuous opinion dynamics optimization (CODO) technique. The protocol is simulated in NS2, and the results show that the algorithm works well, producing a better packet delivery ratio (PDR) and minimal end-to-end delay. Compared with the BMFR protocol, the proposed protocol is considerably more efficient in VANETs. To the best of our knowledge, we are the first to introduce the concept of social ranking for selecting next-hop border nodes.
{"title":"BNSR: Border Node preferred Social Ranking based Routing Protocol for VANETs","authors":"Bhuvan Mehan, Sanjay Batish, R. Bhatia, A. Dhiman","doi":"10.1109/IC3.2015.7346743","DOIUrl":"https://doi.org/10.1109/IC3.2015.7346743","url":null,"abstract":"Border Node preferred Social Ranking based Routing Protocol (BNSR) is present in this paper which is the expansion of the BMFR routing protocol. Routing strategy of BNSR follows the position based routing by using any location services such as GPS system and forwarding strategy follows the prominence of border node based forwarding to shrivel the delay and optimize the path length. BNSR considers the concept of social ranking which is a parameter of CODO (continuous opinion dynamic optimization) technique on which basis the next hop border node is selected. The protocol is simulated with NS2 simulator and results shows the algorithm works well and produces better packet delivery ratio (PDR) and minimum end-to-end delay. When compared with BMFR protocol the consequence of purposed protocol is much better and much efficient in VANETs. We are the first to acquaint the concept of social ranking in selecting the next hop border nodes in the best of our knowledge.","PeriodicalId":217950,"journal":{"name":"2015 Eighth International Conference on Contemporary Computing (IC3)","volume":"146 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122413735","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2015-08-20 | DOI: 10.1109/IC3.2015.7346706
E. Katta, Anuja Arora
Cross Language Information Retrieval (CLIR) is a subdomain of Information Retrieval that deals with retrieving information in a language different from the language of the user's query. In this paper, an improved English-Hindi CLIR system is proposed. Several neglected aspects of this broad research area must be addressed to improve the performance of English-Hindi CLIR; in particular, little research effort has gone into improving the searching and ranking components of CLIR systems, especially for English-Hindi. This paper applies algorithms such as Naïve Bayes and particle swarm optimization to improve the ranking and searching aspects of a CLIR system. To make the system more efficient, document terms are matched to the query terms in the same sequence as they appear in the search query. The approach also uses a bilingual English-Hindi translator to convert the query into Hindi. Further, Hindi query expansion and synonym generation help retrieve more relevant results than existing English-Hindi CLIR systems. Together, these techniques give the user a chance to choose a more appropriate Hindi query rather than relying on a single translated query, improving overall performance.
{"title":"An improved approach to English-Hindi based Cross Language Information Retrieval system","authors":"E. Katta, Anuja Arora","doi":"10.1109/IC3.2015.7346706","DOIUrl":"https://doi.org/10.1109/IC3.2015.7346706","url":null,"abstract":"Cross Language Information Retrieval (CLIR) is a sub domain of Information Retrieval. It deals with retrieval of information in a specified language that is different from the language of user's query. In this paper, an improved English-Hindi based CLIR is proposed. There are various un-noticed domains in this broad research area that are required to be worked upon in order to improve the performance of an English-Hindi based CLIR. Not much research effort has been put up to improve the searching and ranking aspects of CLIR systems, especially in case of English-Hindi based CLIR. This paper focuses on applying algorithms like Naïve Bayes and particle swarm optimization in order to improve ranking and searching aspects of a CLIR system. We matched terms contained in documents to the query terms in same sequence as present in the search query to make our system more efficient. Along with this our approach also makes use of bilingual English-Hindi translator for query conversion in Hindi language. Further, we use Hindi query extension and synonym generation which helps in retrieving more relevant results in an English-Hindi based CLIR as compared to existing one. Both of these techniques applied to this improved approach gives user a change to choose more appropriate Hindi query than just by using the single translated query and hence improving overall performance.","PeriodicalId":217950,"journal":{"name":"2015 Eighth International Conference on Contemporary Computing (IC3)","volume":"os-12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127760755","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2015-08-20 | DOI: 10.1109/IC3.2015.7346643
Shailendra Tiwari, R. Srivastava
Bayesian statistical algorithms play a significant role in the quality of images produced by emission tomography modalities such as PET/SPECT, since they can provide an accurate system model. Their major drawbacks include slow convergence, the choice of an optimal initial point, and ill-posedness. To address these issues, this paper proposes a hybrid cascaded framework for a Median Root Prior (MRP) based reconstruction algorithm. The framework breaks the reconstruction process into two parts, primary and secondary. In the primary part, the simultaneous algebraic reconstruction technique (SART) is applied to overcome the problems of slow convergence and initialization; it converges quickly and produces good reconstruction results in fewer iterations than other iterative methods. The task of the primary part is to supply an enhanced image to the secondary part as an initial estimate for the reconstruction process. The secondary part is a hybrid combination of a reconstruction component and a prior component: reconstruction is done using MRP, while anisotropic diffusion (AD) is used as the prior to deal with ill-posedness. A comparative analysis of the proposed model against standard methods from the literature is presented, both qualitatively and quantitatively, on a simulated phantom and standard medical image test data. Cascading the primary and secondary reconstruction steps yields significant improvements in reconstructed image quality, accelerates convergence, and provides enhanced results from the projection data. The obtained results justify the applicability of the proposed method.
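The SART step used in the primary part can be sketched as below. This is the generic textbook SART update on a small dense system matrix, not the authors' PET/SPECT implementation; the function name and relaxation parameter are illustrative.

```python
def sart_iteration(A, p, x, lam=1.0):
    # One SART sweep over all rays:
    #   x_j += lam / colsum_j * sum_i a_ij * (p_i - (A x)_i) / rowsum_i
    # where p is the measured projection vector and A the system matrix.
    m, n = len(A), len(A[0])
    col = [sum(A[i][j] for i in range(m)) or 1.0 for j in range(n)]
    row = [sum(A[i]) or 1.0 for i in range(m)]
    Ax = [sum(A[i][j] * x[j] for j in range(n)) for i in range(m)]
    return [x[j] + lam / col[j] * sum(A[i][j] * (p[i] - Ax[i]) / row[i]
                                      for i in range(m))
            for j in range(n)]
```

Iterating this update drives the residual p - Ax toward zero; in the cascaded framework described above, the resulting image would serve as the initial estimate for the MRP-based secondary stage.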
{"title":"An efficient and modified median root prior based framework for PET/SPECT reconstruction algorithm","authors":"Shailendra Tiwari, R. Srivastava","doi":"10.1109/IC3.2015.7346643","DOIUrl":"https://doi.org/10.1109/IC3.2015.7346643","url":null,"abstract":"Bayesian statistical algorithm plays a significant role in the quality of the images produced by Emission Tomography like PET/SPECT, since they can provide an accurate system model. The major drawbacks associated with this algorithm include the problem of slow convergence, choice of optimum initial point and ill-posedness. To address these issues, in this paper a hybrid-cascaded framework for Median Root Prior (MRP) based reconstruction algorithm is proposed. This framework consists of breaking the reconstruction process into two parts viz. primary and secondary. During primary part, simultaneous algebraic reconstruction technique (SART) is applied to overcome the problems of slow convergence and initialization. It provides fast convergence and produce good reconstruction results with lesser number of iterations than other iterative methods. The task of primary part is to provide an enhanced image to secondary part to be used as an initial estimate for reconstruction process. The secondary part is a hybrid combination of two parts namely the reconstruction part and the prior part. The reconstruction is done using Median Root Prior (MRP) while Anisotropic Diffusion (AD) is used as prior to deal with ill-posedness. A comparative analysis of the proposed model with some other standard methods in literature is presented both qualitatively and quantitatively for a simulated phantom and a standard medical image test data. Using cascaded primary and secondary reconstruction steps, yields significant improvements in reconstructed image quality. It also accelerates the convergence and provides enhanced results using the projection data. The obtained result justifies the applicability of the proposed method.","PeriodicalId":217950,"journal":{"name":"2015 Eighth International Conference on Contemporary Computing (IC3)","volume":"56 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133725086","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2015-08-20 | DOI: 10.1109/IC3.2015.7346648
Chandresh Kumar Maurya, Durga Toshniwal, G. V. Venkoparao
Anomaly detection is an important task in many real-world applications such as fraud detection, suspicious-activity detection, and health-care monitoring. In this paper, we tackle this problem from a supervised learning perspective in an online learning setting. We maximize the well-known Gmean metric for class-imbalance learning in an online learning framework. Specifically, we show that maximizing Gmean is equivalent to minimizing a convex surrogate loss function, and based on this we propose a novel online learning algorithm for anomaly detection. Extensive experiments show that the performance of the proposed algorithm with respect to the sum metric is as good as the recently proposed Cost-Sensitive Online Classification (CSOC) algorithm for class-imbalance learning over various benchmark data sets, while keeping running time close to that of the perceptron algorithm. We also observe that other competitive online algorithms do not perform consistently over data sets of varying size. This demonstrates the potential applicability of our proposed approach.
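The cost-sensitive online update idea can be sketched with a weighted hinge loss, a standard convex surrogate for imbalanced classification. The weighting scheme, parameter values, and function name here are illustrative assumptions, not the paper's exact Gmean derivation.

```python
def online_imbalanced_train(stream, dim, rho=0.9, eta=0.1):
    # rho weights the (assumed minority) positive class more heavily;
    # minimizing this class-weighted hinge loss online is one convex
    # surrogate for balanced-accuracy/Gmean-style objectives.
    w = [0.0] * dim
    for x, y in stream:                      # y in {-1, +1}
        margin = y * sum(wi * xi for wi, xi in zip(w, x))
        if margin < 1:                       # hinge loss is active
            c = rho if y > 0 else (1 - rho)  # class-dependent cost
            for k in range(dim):             # perceptron-style step
                w[k] += eta * c * y * x[k]
    return w
```

The update touches each example once, so the per-example cost stays close to that of the plain perceptron, matching the running-time comparison made in the abstract.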
{"title":"Online anomaly detection via class-imbalance learning","authors":"Chandresh Kumar Maurya, Durga Toshniwal, G. V. Venkoparao","doi":"10.1109/IC3.2015.7346648","DOIUrl":"https://doi.org/10.1109/IC3.2015.7346648","url":null,"abstract":"Anomaly detection is an important task in many real world applications such as fraud detection, suspicious activity detection, health care monitoring etc. In this paper, we tackle this problem from supervised learning perspective in online learning setting. We maximize well known Gmean metric for class-imbalance learning in online learning framework. Specifically, we show that maximizing Gmean is equivalent to minimizing a convex surrogate loss function and based on that we propose novel online learning algorithm for anomaly detection. We then show, by extensive experiments, that the performance of the proposed algorithm with respect to sum metric is as good as a recently proposed Cost-Sensitive Online Classification(CSOC) algorithm for class-imbalance learning over various benchmarked data sets while keeping running time close to the perception algorithm. Our another conclusion is that other competitive online algorithms do not perform consistently over data sets of varying size. This shows the potential applicability of our proposed approach.","PeriodicalId":217950,"journal":{"name":"2015 Eighth International Conference on Contemporary Computing (IC3)","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132915173","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2015-08-20 | DOI: 10.1109/IC3.2015.7346736
Pratik Ranjan, H. Om
Signature schemes are used to verify the authenticity of a signature and the corresponding document. Undeniable signature schemes are challenge-response based interactive schemes in which the active participation of the signer is compulsory. They are used in private communication, where confidential deals and agreements take place, since a legitimate signer cannot deny his signature. In this paper, we analyze Thomas and Lal's braid-group based zero-knowledge undeniable signature scheme and show that it is insecure against man-in-the-middle and impersonation attacks. In addition, we propose an efficient undeniable signature scheme using braid groups that provides secrecy and authenticity for a legitimate signer, and we show that our scheme is secure against the above-mentioned attacks.
{"title":"An efficient undeniable signature scheme using braid groups","authors":"Pratik Ranjan, H. Om","doi":"10.1109/IC3.2015.7346736","DOIUrl":"https://doi.org/10.1109/IC3.2015.7346736","url":null,"abstract":"The signature schemes are used to verify the authenticity of a signature and the corresponding documents. The undeniable signature schemes are challenge and response based interactive schemes, where the active participation of signer is compulsory. These schemes are used in private communication where the confidential deals and agreements take place as a legitimate signer cannot deny his signature. In this paper, we analyze the Thomas and Lal's braid group based zero-knowledge undeniable signature scheme and show that it is insecure against the man-in-the-middle and impersonation attacks. In addition, we propose an efficient undeniable signature scheme using the braid groups that provides secrecy and authenticity of a legitimate signer. Furthermore, we show that our scheme is secure against the above mentioned attacks.","PeriodicalId":217950,"journal":{"name":"2015 Eighth International Conference on Contemporary Computing (IC3)","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132956689","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2015-08-20 | DOI: 10.1109/IC3.2015.7346690
Arijul Haque, K. S. Rao
This work explores the spectral energies of neutral, sad, and angry speech, and analyzes the potential of spectral energy modification to convert neutral speech to sad or angry speech. A method of modifying the spectral energy of neutral speech signals, based on a filter bank implementation, is proposed for converting given neutral speech to target emotional speech. Since pitch plays a vital role in expressing emotion, we first modify the pitch contour using Gaussian normalization; this is followed by spectral energy modification using the proposed method. The expressiveness of the resulting speech is compared with speech obtained by modifying only the pitch contour, and we observe improvements in expressiveness due to the incorporation of the proposed spectral energy modification. The method performs quite well for neutral-to-sad conversion; however, the quality of conversion to anger is poorer, and the reasons behind this are analyzed.
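The Gaussian normalization step for the pitch contour can be sketched as a z-score remapping of F0 values onto the target emotion's pitch statistics. This is one common reading of Gaussian normalization for prosody conversion; the function name and target statistics below are hypothetical.

```python
import statistics

def gaussian_normalize_pitch(f0_src, mu_tgt, sigma_tgt):
    # Map each source F0 value onto the target emotion's pitch
    # distribution: the value keeps its z-score under the source
    # statistics but adopts the target mean and standard deviation.
    mu_s = statistics.mean(f0_src)
    sigma_s = statistics.pstdev(f0_src) or 1.0  # guard flat contours
    return [mu_tgt + (f - mu_s) * sigma_tgt / sigma_s for f in f0_src]
```

By construction the output contour has exactly the target mean and standard deviation, while preserving the shape of the source contour.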
{"title":"Analysis and modification of spectral energy for neutral to sad emotion conversion","authors":"Arijul Haque, K. S. Rao","doi":"10.1109/IC3.2015.7346690","DOIUrl":"https://doi.org/10.1109/IC3.2015.7346690","url":null,"abstract":"This work explores the spectral energies of neutral, sad and angry speech, and analyzes the potential of spectral energy modification to convert neutral speech to sad/angry speech. A method of modifying the spectral energy of neutral speech signals based on a filter bank implementation is proposed for the purpose of converting a given neutral speech to a target emotional speech. Since pitch plays a vital role in emotion expression, we modify the pitch contour first by using the method of Gaussian normalization. This is followed by modification of spectral energy using a method proposed in this paper. The expressiveness of the resultant speech is compared with speech obtained by modifying only the pitch contour, and we have observed improvements in expressiveness due to incorporation of proposed spectral energy modification. The method is found to be quite good for neutral to sad conversion. However, the quality of conversion to anger is not good, and the reasons behind this are analyzed.","PeriodicalId":217950,"journal":{"name":"2015 Eighth International Conference on Contemporary Computing (IC3)","volume":"148 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122919079","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2015-08-20 | DOI: 10.1109/IC3.2015.7346705
P. Suja, P. KalyanKumarV., Shikha Tripathi
Emotions are characterized as responses to a person's internal and external events. Emotion recognition from facial expressions in videos plays a vital role in human-computer interaction, where dynamic changes in facial movements need to be recognized quickly. In this work, we propose a simple geometry-based method for recognizing the six basic emotions in video sequences of the BU-4DFE database. We choose an optimal subset of the 83 feature points provided in the BU-4DFE database. A video expressing an emotion contains frames covering the neutral, onset, apex, and offset phases of that emotion; we dynamically identify the most expressive frame (the apex). The Euclidean distances between corresponding feature points in the apex and neutral frames are computed to form the feature vector. The feature vectors thus formed for all emotions and subjects are given to Neural Networks (NN) and a Support Vector Machine (SVM) with different kernels for classification, and we compare the accuracies obtained by NN and SVM. Our proposed method is simple, uses only two frames, and yields good accuracy on the BU-4DFE database; although very complex algorithms for BU-4DFE exist in the literature, our simple method gives comparable results. In future it can be applied to real-time implementation and kinesics.
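The feature-vector construction described above (per-landmark Euclidean distance between the neutral and apex frames) can be sketched as follows; the function name is hypothetical and the landmark count is reduced for illustration.

```python
import math

def emotion_feature_vector(neutral_pts, apex_pts):
    # One entry per landmark: how far each facial feature point moved
    # between the neutral frame and the most expressive (apex) frame.
    # In the paper's setting there would be one pair per chosen
    # BU-4DFE landmark rather than the toy pairs used here.
    return [math.dist(p, q) for p, q in zip(neutral_pts, apex_pts)]
```

The resulting fixed-length vector, one per video, is what would be fed to the NN or SVM classifier.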
{"title":"Dynamic facial emotion recognition from 4D video sequences","authors":"P. Suja, P. KalyanKumarV., Shikha Tripathi","doi":"10.1109/IC3.2015.7346705","DOIUrl":"https://doi.org/10.1109/IC3.2015.7346705","url":null,"abstract":"Emotions are characterized as responses to internal and external events of a person. Emotion recognition through facial expressions from videos plays a vital role in human computer interaction where the dynamic changes in face movements needs to be realized quickly. In this work, we propose a simple method, using the geometrical based approach for the recognition of six basic emotions in video sequences of BU-4DFE database. We have chosen optimum feature points out of the 83 feature points provided in the BU-4DFE database. A video expressing emotion will have frames containing neutral, onset, apex and offset of that emotion. We have dynamically identified the frame that is most expressive for an emotion (apex). The Euclidean distance between the feature points in apex and neutral frame is determined and their difference in corresponding neutral and the apex frame is calculated to form the feature vector. The feature vectors thus formed for all the emotions and subjects are given to Neural Networks (NN) and Support Vector Machine (SVM) with different kernels for classification. We have compared the accuracy obtained by NN & SVM. Our proposed method is simple, uses only two frames and yields good accuracy for BU-4DFE database. Very complex algorithms exist in literature using BU-4DFE database and our proposed simple method gives comparable results. It can be applied for real time implementation and kinesics in future.","PeriodicalId":217950,"journal":{"name":"2015 Eighth International Conference on Contemporary Computing (IC3)","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115173525","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2015-08-20 | DOI: 10.1109/IC3.2015.7346657
Arko Banerjee
In this paper a novel approach to document clustering is introduced by defining a representative-based document similarity model that performs probabilistic segmentation of documents into chunks. The frequently occurring chunks, which are considered representatives of the document set, may correspond to phrases or stems of true words. The representative-based document similarity model, containing a term-document matrix with respect to the representatives, is a compact representation of the vector space model that improves the quality of document clustering over traditional methods.
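A toy sketch of the representative-based similarity model: here frequent character 4-grams stand in for the probabilistically segmented chunks (a deliberate simplification; the paper's segmentation is probabilistic, not fixed-length), and documents are compared by cosine similarity over the resulting compact term-document matrix.

```python
from collections import Counter
from math import sqrt

def frequent_chunks(docs, n=4, top_k=50):
    # Collect the most frequent character n-grams across the corpus;
    # these play the role of the "representatives" of the document set.
    counts = Counter()
    for d in docs:
        for i in range(len(d) - n + 1):
            counts[d[i:i + n]] += 1
    return [c for c, _ in counts.most_common(top_k)]

def doc_vector(doc, reps):
    # One column of the representative/term-document matrix.
    return [doc.count(r) for r in reps]

def cosine(u, v):
    num = sum(a * b for a, b in zip(u, v))
    den = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return num / den if den else 0.0
```

Because the matrix has one row per representative rather than per vocabulary term, it is far more compact than a full vector space model, which is the property the abstract highlights.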
{"title":"Leveraging probabilistic segmentation to document clustering","authors":"Arko Banerjee","doi":"10.1109/IC3.2015.7346657","DOIUrl":"https://doi.org/10.1109/IC3.2015.7346657","url":null,"abstract":"In this paper a novel approach to document clustering has been introduced by defining a representative-based document similarity model that performs probabilistic segmentation of documents into chunks. The frequently occuring chunks that are considered as representatives of the document set, may represent phrases or stem of true words. The representative based document similarity model, containing a term-document matrix with respect to the representatives, is a compact representation of the vector space model that improves quality of document clustering over traditional methods.","PeriodicalId":217950,"journal":{"name":"2015 Eighth International Conference on Contemporary Computing (IC3)","volume":"98 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134552491","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}