Jin Zeng, Jidong Ge, Yemao Zhou, Yi Feng, Chuanyi Li, Zhongjin Li, B. Luo
The traditional approach to measuring text similarity uses the TF-IDF algorithm to obtain document vectors and then computes the cosine similarity between them. However, this purely statistical method ignores the latent semantics of articles and words; in effect, it considers only the words themselves. Latent Semantic Analysis (LSA) adds a semantic space on top of the TF-IDF computation: through Singular Value Decomposition, each word and document is assigned a position in that space. This allows semantic analysis, document clustering, and the mapping between semantic classes and document classes to be carried out at the same time. Here, we summarize text similarity measures and gradually extend them to Latent Semantic Analysis. Our experiment shows that the statutes predicted by LSA are more accurate than those predicted by TF-IDF alone.
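As an illustration of the TF-IDF baseline described in this abstract, the following is a minimal sketch, not the authors' implementation. The function names, the toy documents, and the particular TF-IDF weighting (relative term frequency times `log(N / df)`) are assumptions made for the example; the SVD step of LSA is not shown.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: weight} dict per tokenized document."""
    n = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * math.log(n / df[term])
            for term, count in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical toy corpus; the real data are legal case texts and statutes.
docs = [
    "theft of property by force".split(),
    "robbery of property with violence".split(),
    "contract dispute over late payment".split(),
]
vecs = tfidf_vectors(docs)
sim_01 = cosine(vecs[0], vecs[1])  # documents sharing some terms
sim_02 = cosine(vecs[0], vecs[2])  # documents sharing no terms
```

Because the score depends only on shared terms, "theft" and "robbery" contribute nothing to `sim_01`, and two documents with no terms in common always score zero. This is the limitation that the semantic space of LSA is meant to address.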
{"title":"Statutes Recommendation Based on Text Similarity","authors":"Jin Zeng, Jidong Ge, Yemao Zhou, Yi Feng, Chuanyi Li, Zhongjin Li, B. Luo","doi":"10.1109/WISA.2017.52","DOIUrl":"https://doi.org/10.1109/WISA.2017.52","PeriodicalId":204706,"journal":{"name":"2017 14th Web Information Systems and Applications Conference (WISA)","volume":"1 1","pages":"0"},"publicationDate":"2017-11-01","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"133750222"}
Authentication of objects of interest plays a vital role in security-sensitive environments. Pattern recognition, which classifies patterns based on prior knowledge or on statistical information extracted from the patterns, provides various solutions for recognizing and authenticating the identity of objects or persons. Identifying faces or objects of interest requires taking samples to train a classifier and then classifying input probe images; the recognition rate depends on the classification features. Facial recognition accuracy decreases when the illumination of the image changes, and the Single Sample per Person setting, where only one training sample is available, does not give the best matching results. In this paper, we present a model that takes different sample images, extracts Local Binary Patterns, constructs normalized histograms to train an SVM classifier, and then classifies input probe images using binary and multiclass Support Vector Machines.
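The feature-extraction step mentioned in this abstract can be sketched as follows. This is a minimal illustration of the basic 3x3 Local Binary Pattern operator with a normalized 256-bin histogram, assuming a grayscale image given as a list of rows; the function name and the neighbor ordering are choices made for the example, and the SVM training stage is not shown.

```python
def lbp_histogram(image):
    """Normalized 256-bin histogram of basic 3x3 LBP codes.

    `image` is a 2D list of grayscale intensities. Border pixels are
    skipped because they lack a full 8-neighborhood.
    """
    # Neighbor offsets, clockwise from top-left; the order fixes bit positions.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    hist = [0] * 256
    rows, cols = len(image), len(image[0])
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            center = image[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(offsets):
                # Set the bit when the neighbor is at least as bright as the center.
                if image[r + dr][c + dc] >= center:
                    code |= 1 << bit
            hist[code] += 1
    total = sum(hist)
    return [h / total for h in hist] if total else hist


# Hypothetical 4x4 test image and a uniformly brightened copy of it.
image = [
    [10, 20, 30, 40],
    [50, 60, 70, 80],
    [90, 80, 70, 60],
    [50, 40, 30, 20],
]
hist = lbp_histogram(image)
brighter = [[p + 50 for p in row] for row in image]
same_under_brightening = lbp_histogram(brighter) == hist
```

Each code depends only on whether a neighbor is brighter than the center, so adding a constant to every pixel leaves the histogram unchanged. This is one reason LBP features are commonly used when illumination varies.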
{"title":"Face Recognition by SVM Using Local Binary Patterns","authors":"Ejaz Ul Haq, Xu Huarong, M. I. Khattak","doi":"10.1109/WISA.2017.68","DOIUrl":"https://doi.org/10.1109/WISA.2017.68","PeriodicalId":204706,"journal":{"name":"2017 14th Web Information Systems and Applications Conference (WISA)","volume":"6 1","pages":"0"},"publicationDate":"2017-11-01","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"127873218"}
Time series classification presents a specific machine learning challenge because the variables are ordered. Recent studies show that the simple nearest neighbor classifier with elastic distance measures is hard to beat, and many researchers focus on alternative distance measures. Unlike the nearest neighbor classifier, which tries to find the training sample with the minimum distance to the test instance, we use a reconstruction strategy to determine the label of a new time series. Concretely, we reconstruct each test time series from as few training samples as possible and then calculate the residuals between the test time series and the selected training samples of each class. The test time series is assigned to the class with the minimum residual. To select the required time series from the training set, we employ a sparsity constraint to discover the optimal combination of training samples that fits the test time series. To handle datasets that are not linearly separable, we extend our method with the kernel trick. Extensive experimental results show that the proposed method achieves significant improvement on commonly used time series datasets.
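The reconstruct-then-compare-residuals idea in this abstract can be sketched as below. The abstract does not name the sparse solver, so this example substitutes an L1-penalized least-squares fit (Lasso) solved by coordinate descent; the function names, the penalty `lam`, the number of sweeps, and the toy data are all assumptions, and the kernel extension is not shown.

```python
import math


def soft_threshold(value, lam):
    """Shrink value toward zero by lam (the L1 proximal step)."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


def lasso_coefficients(train, target, lam=0.1, sweeps=200):
    """Minimize 0.5 * ||target - sum_j coef[j] * train[j]||^2 + lam * ||coef||_1."""
    coef = [0.0] * len(train)
    residual = list(target)  # target minus the current reconstruction
    for _ in range(sweeps):
        for j, series in enumerate(train):
            norm_sq = sum(x * x for x in series)
            if norm_sq == 0.0:
                continue
            # Correlation of series j with the residual that excludes its own term.
            rho = sum(x * (r + coef[j] * x) for x, r in zip(series, residual))
            new = soft_threshold(rho, lam) / norm_sq
            delta = new - coef[j]
            if delta != 0.0:
                residual = [r - delta * x for r, x in zip(residual, series)]
                coef[j] = new
    return coef


def classify(train, labels, target, lam=0.1):
    """Assign target to the class whose selected samples reconstruct it best."""
    coef = lasso_coefficients(train, target, lam)
    best_label, best_residual = None, float("inf")
    for label in sorted(set(labels)):
        # Reconstruct the target using only the coefficients of this class.
        recon = [0.0] * len(target)
        for c, series, lab in zip(coef, train, labels):
            if lab == label:
                recon = [p + c * x for p, x in zip(recon, series)]
        res = math.sqrt(sum((t - p) ** 2 for t, p in zip(target, recon)))
        if res < best_residual:
            best_label, best_residual = label, res
    return best_label


# Hypothetical toy training set: rising and falling series of equal length.
train = [[0, 1, 2, 3], [0, 2, 4, 6], [3, 2, 1, 0], [6, 4, 2, 0]]
labels = ["up", "up", "down", "down"]
predicted_up = classify(train, labels, [0, 1.5, 3, 4.5])
predicted_down = classify(train, labels, [4.5, 3, 1.5, 0])
```

The L1 penalty drives most coefficients to exactly zero, which is one common way to obtain a reconstruction from few training samples. All series must have the same length in this sketch.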
{"title":"Efficient Time Series Classification via Sparse Linear Combination","authors":"Zhenguo Zhang, Peng Nie, Yanlong Wen","doi":"10.1109/WISA.2017.37","DOIUrl":"https://doi.org/10.1109/WISA.2017.37","PeriodicalId":204706,"journal":{"name":"2017 14th Web Information Systems and Applications Conference (WISA)","volume":"12 1","pages":"0"},"publicationDate":"2017-11-01","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"114372920"}
Xiuli Wang, Zhuoming Xu, Xiutao Xia, Chengwang Mao
This paper addresses the sparsity problem in collaborative filtering (CF) by developing an aggregated user-user similarity measure suitable for the user-based CF model. The aggregated similarity measure is a weighted aggregation of the SimRank++ similarity on the user-item bipartite graph and the cosine similarity of Linked Open Data (LOD)-based user profiles, which are derived from both the rating data and the items' descriptive attributes retrieved from LOD resources. To validate the effectiveness of the aggregated similarity and evaluate the accuracy of rating predictions with the user-based CF method, comparative experiments among four similarity measures (the Pearson correlation coefficient, the SimRank++ similarity, the cosine similarity, and the aggregated similarity) were conducted on the MovieLens 100k dataset and DBpedia. The experimental results indicate that the proposed aggregated similarity measure overall outperforms the other three similarity measures in terms of both Root Mean Squared Error (RMSE) and Mean Absolute Error (MAE), especially with 30 to 100 nearest neighbors.
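To make the weighted aggregation and the evaluation metrics concrete, here is a minimal sketch under stated assumptions. The abstract does not give the exact aggregation formula, so a convex combination with a hypothetical weight `alpha` is assumed; the mean-centered prediction rule is the textbook user-based CF form, and all input values are invented for the example. Computing SimRank++ itself is not shown.

```python
import math


def aggregate_similarity(sim_graph, sim_profile, alpha=0.5):
    """Convex combination of a graph-based and a profile-based similarity.

    alpha is a hypothetical weight; the paper's actual weighting is not
    given in the abstract.
    """
    return alpha * sim_graph + (1.0 - alpha) * sim_profile


def predict_rating(user_mean, neighbors):
    """Mean-centered user-based CF prediction.

    neighbors is a list of (similarity, neighbor_rating, neighbor_mean).
    """
    num = sum(s * (r - m) for s, r, m in neighbors)
    den = sum(abs(s) for s, _, _ in neighbors)
    return user_mean if den == 0 else user_mean + num / den


def mae(predicted, actual):
    """Mean Absolute Error."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def rmse(predicted, actual):
    """Root Mean Squared Error."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))


# Invented similarities for two neighbors of the target user.
sim_a = aggregate_similarity(0.8, 0.4)  # graph-based 0.8, profile-based 0.4
sim_b = aggregate_similarity(0.2, 0.6)
prediction = predict_rating(3.0, [(sim_a, 5.0, 4.0), (sim_b, 2.0, 3.0)])
errors = (mae([prediction, 4.0], [3.0, 5.0]), rmse([prediction, 4.0], [3.0, 5.0]))
```

In a sparse rating matrix, a rating-only similarity may be undefined for many user pairs. Combining it with a profile-based similarity gives such pairs a usable score, which is the motivation stated in the abstract.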
{"title":"Computing User Similarity by Combining SimRank++ and Cosine Similarities to Improve Collaborative Filtering","authors":"Xiuli Wang, Zhuoming Xu, Xiutao Xia, Chengwang Mao","doi":"10.1109/WISA.2017.22","DOIUrl":"https://doi.org/10.1109/WISA.2017.22","PeriodicalId":204706,"journal":{"name":"2017 14th Web Information Systems and Applications Conference (WISA)","volume":"426 1","pages":"0"},"publicationDate":"2017-11-01","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"116230102"}
Yutian Chen, Wenyan Gan, Lei Zhang, Chong Liu, Xianlei Wang
Visual place recognition, the ability to recognize a known place in the environment using vision as the main sensor modality, is an active research field in robotic navigation and localization. Despite significant progress in computer vision and machine learning techniques, challenges remain, especially in dynamic environments with illumination changes, viewpoint changes, and similar variations. In this paper, we present a survey and comparative study of existing approaches to visual place recognition, including place feature extraction methods, image similarity metrics, and search algorithms, as well as some benchmark datasets and evaluation metrics. Experimental results show that methods combining feature extraction with convolutional neural networks and sequential image search achieve higher precision in large-scale dynamic environments.
{"title":"A Survey on Visual Place Recognition for Mobile Robots Localization","authors":"Yutian Chen, Wenyan Gan, Lei Zhang, Chong Liu, Xianlei Wang","doi":"10.1109/WISA.2017.7","DOIUrl":"https://doi.org/10.1109/WISA.2017.7","PeriodicalId":204706,"journal":{"name":"2017 14th Web Information Systems and Applications Conference (WISA)","volume":"1 1","pages":"0"},"publicationDate":"2017-11-01","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"130631469"}
Building a unified authentication center is a popular and effective way to implement single sign-on across many enterprise applications. When integrating multiple systems, the most important issue is handling highly concurrent, high-volume user requests while ensuring the stability and efficiency of the authentication service. To address problems of the authentication center such as overload, single points of failure, and slow response times, we propose a distributed architecture with a cache to support unified authentication. Authentication tickets can be shared among multiple nodes through the cache. Hot and important data can be prefetched into the cache to improve response time. We also propose a multi-factor cache replacement algorithm based on Hybrid, which takes complex and diverse user behavior into account to improve the effectiveness of data replacement. The experimental results show that the optimized distributed authentication architecture can guarantee the stability of the system, that the cache mechanism can improve response time, and that the multi-factor cache replacement algorithm based on Hybrid can improve the cache hit ratio.
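The abstract does not specify which factors the replacement algorithm combines, so the following is only a generic sketch of a multi-factor eviction policy and not the paper's algorithm. The class name, the two factors (hit count and age since last access), and their weights are all hypothetical.

```python
class ScoredCache:
    """Tiny in-memory cache that evicts the entry with the lowest score.

    The score combines hit count and recency with hypothetical weights;
    the factors used in the paper are not given in the abstract.
    """

    def __init__(self, capacity, w_freq=1.0, w_recency=1.0):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.w_freq = w_freq
        self.w_recency = w_recency
        self.entries = {}  # key -> [value, hits, last_access_tick]
        self.tick = 0      # logical clock, advanced on every operation

    def _score(self, key):
        _, hits, last = self.entries[key]
        age = self.tick - last
        # Frequently hit, recently used entries score higher.
        return self.w_freq * hits - self.w_recency * age

    def get(self, key):
        self.tick += 1
        entry = self.entries.get(key)
        if entry is None:
            return None
        entry[1] += 1
        entry[2] = self.tick
        return entry[0]

    def put(self, key, value):
        self.tick += 1
        if key not in self.entries and len(self.entries) >= self.capacity:
            victim = min(self.entries, key=self._score)
            del self.entries[victim]
        hits = self.entries[key][1] if key in self.entries else 0
        self.entries[key] = [value, hits, self.tick]


# Hypothetical ticket keys; ticket-a is read twice, so ticket-b is evicted.
cache = ScoredCache(capacity=2)
cache.put("ticket-a", "alice")
cache.put("ticket-b", "bob")
cache.get("ticket-a")
cache.get("ticket-a")
cache.put("ticket-c", "carol")
remaining = sorted(cache.entries)
```

In a distributed deployment, the same scoring idea would run inside a shared cache service so that every authentication node sees the same tickets. This sketch keeps everything in one process for clarity.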
{"title":"The Optimization Mechanism Research of Distributed Unified Authentication Based on Cache","authors":"Dongju Yang, Kai Feng","doi":"10.1109/WISA.2017.4","DOIUrl":"https://doi.org/10.1109/WISA.2017.4","PeriodicalId":204706,"journal":{"name":"2017 14th Web Information Systems and Applications Conference (WISA)","volume":"59 1","pages":"0"},"publicationDate":"2017-11-01","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"126642518"}