Pub Date: 2020-01-01 | DOI: 10.4018/ijwsr.2020010103
A. Sathish, S. Ravimaran, S. J. N. Kumar
With the rapid development of cloud computing and services, there has been a growing trend of using the cloud for large-scale data storage. This has raised major security concerns about data handling. These concerns can be addressed by an efficient shielded access on key propagation (ESAKP) technique combined with an adaptive optimization algorithm for password generation and a double permutation. Passwords are generated by adaptive ant lion optimization (AALO), which tackles the problem of inefficiency. The resulting scheme offers stronger security, which relies on an efficient selection property that eliminates the worst fit in each iteration. The optimized password is used by an adaptive Vigenère cipher for efficient key generation, in which adaptiveness prevents the dilemma of choosing the first letter of the alphabet, which in turn reduces computation time and improves security. Additionally, the symmetric key is encrypted asymmetrically with the Elliptic Curve Diffie-Hellman (EC-DH) algorithm, followed by a double-stage permutation that scrambles the data for added security.
Title: A Well-Organized Safeguarded Access on Key Propagation by Malleable Optimization in Blend With Double Permutation | Journal: International Journal of Web Services Research | Pages: 43-63
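As a point of reference for the key-generation step, the classical Vigenère cipher can be sketched as follows. This is only the textbook cipher: the abstract does not specify the adaptive variant, the AALO password search, or the EC-DH and permutation stages, so none of those are modeled. The function name `vigenere` and the sample key are illustrative assumptions.

```python
import string

ALPHABET = string.ascii_uppercase


def vigenere(text: str, key: str, decrypt: bool = False) -> str:
    """Classical Vigenere cipher over A-Z; non-letters pass through unchanged."""
    key = key.upper()
    if not key or any(k not in ALPHABET for k in key):
        raise ValueError("key must be a non-empty string of letters A-Z")
    shifts = [ALPHABET.index(k) for k in key]
    out = []
    used = 0  # counts letters only, so punctuation does not consume key material
    for ch in text.upper():
        if ch in ALPHABET:
            shift = shifts[used % len(shifts)]
            if decrypt:
                shift = -shift
            out.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
            used += 1
        else:
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    secret = vigenere("ATTACK AT DAWN", "LEMON")
    print(secret)
    print(vigenere(secret, "LEMON", decrypt=True))
```

The classical cipher is not secure on its own, which is presumably why the abstract layers optimization-driven password selection and asymmetric key exchange on top of it.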
Pub Date: 2020-01-01 | DOI: 10.4018/ijwsr.2020010101
Xin Zhang, Jiali You, Hanxing Xue, Jinlin Wang
In the era of the Internet of Things, cloud services struggle to meet users' real-time transmission requirements for data generated at the network edge, especially for Internet video services. Using devices at the network edge, such as intelligent routers, to serve content close to users can effectively reduce backbone traffic and enhance service performance. This article proposes a decentralized PageRank-based content dissemination model at the network edge, in which a node selection algorithm is designed to distribute content evenly across the network. Each node can quickly obtain data from neighbor nodes, thereby reducing the cloud load and network bandwidth consumption and improving service response performance. Simulation shows that, compared with two other dissemination algorithms, content is distributed more evenly, which means every node has more opportunity to obtain data from its neighbors, and the service rejection rate decreases by an average of 5.2% under highly concurrent requests.
Title: A Decentralized PageRank Based Content Dissemination Model at the Edge of Network | Journal: International Journal of Web Services Research | Pages: 1-16
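For readers unfamiliar with the ranking that the dissemination model builds on, the standard PageRank power iteration can be sketched as below. This is the ordinary centralized formulation, not the decentralized variant or the node selection algorithm of the article, whose details the abstract does not give. The damping factor 0.85, the function name, and the sample topology are assumptions for illustration, and every linked node is assumed to appear as a key in the input.

```python
def pagerank(graph, damping=0.85, iterations=100, tol=1e-10):
    """graph: dict mapping each node to the list of nodes it links to."""
    nodes = list(graph)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Rank held by dangling nodes (no out-links) is spread uniformly.
        dangling = sum(rank[v] for v in nodes if not graph[v])
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for v in nodes:
            if graph[v]:
                share = damping * rank[v] / len(graph[v])
                for w in graph[v]:
                    new_rank[w] += share
        converged = sum(abs(new_rank[v] - rank[v]) for v in nodes) < tol
        rank = new_rank
        if converged:
            break
    return rank


if __name__ == "__main__":
    # Hypothetical edge topology: three devices all point at one router.
    topology = {"router": ["a", "b"], "a": ["router"], "b": ["router"], "c": ["router"]}
    for node, score in sorted(pagerank(topology).items(), key=lambda kv: -kv[1]):
        print(node, round(score, 4))
```

A node with a high score is one that many neighbors can reach, which is one plausible reason to use such a ranking when choosing where to place content.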
Pub Date: 2020-01-01 | DOI: 10.4018/ijwsr.2020010104
Santhoshkumar Srinivasan, Yuhong Yan, Yong-bo Liang, Abhijeet Roy, B. Kumara, Incheon Paik, Wuhui Chen, Frederic Montagut, R. Molva, S. Golega, Shuai Zhao, Bo Cheng, Le Yu, Shou-lu Hou, Yang Zhang
Along with true information, rumors spread in online social networks (OSNs) on an unprecedented scale. Recently, rumor identification has gained growing interest among researchers. Finding rumors also poses critical challenges such as noisy and imprecise input data, data sparsity, and unclear interpretation of the output. To address these issues, we propose a neuro-fuzzy classification approach called the neuro-fuzzy rumor detector (NFRD) to automatically identify rumors in OSNs. NFRD quickly transforms the input into fuzzy rules that classify the rumor. Neural networks handle larger input data. Fuzzy systems are better at handling uncertainty and imprecision in input data by producing fuzzy rules that effectively eliminate unclear inputs. NFRD also considers the semantic aspects of information to ensure better classification. The neuro-fuzzy approach addresses the most common problems, such as uncertainty elimination, noise reduction, and quicker generalization. Experimental results show that the proposed approach performs well against state-of-the-art rumor detection techniques.
Title: A Neuro-Fuzzy Approach to Detect Rumors in Online Social Networks | Journal: International Journal of Web Services Research | Pages: 64-82
Pub Date: 2019-10-04 | DOI: 10.4018/978-1-5225-7501-6.ch003
Prerna Mahajan, Geetika Gaba, N. Chauhan
The value of Big Data is now recognized by many industries and governments. Efficient mining of Big Data enables companies to improve their competitive advantage and adds value for many social and economic sectors. In fact, several governments have launched important projects with huge investments to extract the maximum benefit from Big Data. The private sector has also made significant efforts to maximize profits and optimize resources. However, Big Data sharing brings new information security and privacy issues. Traditional technologies and methods are no longer appropriate and lack performance when applied in a Big Data context. This chapter presents Big Data security challenges and the state of the art in methods, mechanisms, and solutions used to protect data-intensive information systems.
Title: Big Data Security | Journal: International Journal of Web Services Research
Pub Date: 2019-10-01 | DOI: 10.4018/ijwsr.2019100102
Lin Xiao, Chuanmin Mi, Yetian Chen, Lihua Huang
This study aims to understand the determinants of consumer satisfaction with bed-and-breakfast establishments (B&Bs) and to build a hierarchical structure of these determinants. Content analysis was conducted on consumer online review data, and ten determinants of consumer satisfaction were identified. The interpretive structural modeling (ISM) technique was then used to develop a five-level hierarchical structural model based on these determinants. Finally, the cross-impact matrix multiplication applied to classification (MICMAC) technique was used to analyze the driving and dependence power of each determinant. This study has the potential to make significant contributions from both theoretical and practical perspectives.
Title: Understanding the Determinants of Consumer Satisfaction With B&B Hotels: An Interpretive Structural Modeling Approach | Journal: International Journal of Web Services Research | Pages: 21-39
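The core computation behind ISM and MICMAC can be illustrated with a small sketch. It assumes the usual formulation: a binary direct-influence matrix is closed transitively (with every factor reaching itself) to give the reachability matrix, and driving and dependence power are its row and column sums. The three-factor matrix below is a made-up example, not the ten determinants identified in the study.

```python
def reachability(adjacency):
    """Transitive closure of a binary influence matrix, with self-reachability."""
    n = len(adjacency)
    reach = [[bool(adjacency[i][j]) or i == j for j in range(n)] for i in range(n)]
    # Warshall's algorithm: allow paths through intermediate factor k.
    for k in range(n):
        for i in range(n):
            if reach[i][k]:
                for j in range(n):
                    if reach[k][j]:
                        reach[i][j] = True
    return reach


def driving_and_dependence(reach):
    """Driving power is the row sum; dependence power is the column sum."""
    n = len(reach)
    driving = [sum(row) for row in reach]
    dependence = [sum(reach[i][j] for i in range(n)) for j in range(n)]
    return driving, dependence


if __name__ == "__main__":
    # Hypothetical chain of influence: factor 0 -> factor 1 -> factor 2.
    direct = [[0, 1, 0],
              [0, 0, 1],
              [0, 0, 0]]
    drive, depend = driving_and_dependence(reachability(direct))
    print(drive, depend)
```

Factors with high driving power and low dependence sit at the bottom of the ISM hierarchy, while those with the reverse profile sit at the top.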
Pub Date: 2019-10-01 | DOI: 10.4018/ijwsr.2019100103
Jun Zeng, Feng Li, Xin He, Junhao Wen
Point of interest (POI) recommendation is a significant task in location-based social networks (LBSNs), e.g., Foursquare and Brightkite. It helps users explore their surroundings and helps POI owners increase income. While several approaches have been proposed for recommendation services, an integrated analysis of POI recommendation is lacking. In this article, the authors propose a unified recommendation framework that fuses personalized user preference, geographical influence, and social reputation. The TF-IDF method is adopted to measure the interest level and contribution of locations when calculating the similarity between users. Geographical influence includes geographical distance and location popularity. The authors find that friends in Brightkite share few commonly visited POIs, which means friends' interests may vary greatly. Instead of directly getting recommendations from so-called friends in an LBSN, users obtain recommendations from others according to their reputation. Finally, experimental results on a real-world dataset demonstrate that the proposed method performs much better than other recommendation methods.
Title: Fused Collaborative Filtering With User Preference, Geographical and Social Influence for Point of Interest Recommendation | Journal: International Journal of Web Services Research | Pages: 40-52
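One way to read the TF-IDF step is to treat each user as a document and each visited location as a term, so that locations everyone visits contribute little to similarity. The sketch below follows that reading with cosine similarity; the exact weighting used in the article is not given in the abstract, and the function names and check-in data are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(checkins):
    """checkins: dict mapping user -> list of visited location ids (repeats allowed)."""
    n_users = len(checkins)
    # Document frequency: number of users who visited each location at least once.
    df = Counter()
    for visits in checkins.values():
        df.update(set(visits))
    vectors = {}
    for user, visits in checkins.items():
        counts = Counter(visits)
        total = sum(counts.values())
        vectors[user] = {
            loc: (c / total) * math.log(n_users / df[loc]) for loc, c in counts.items()
        }
    return vectors


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * v.get(loc, 0.0) for loc, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


if __name__ == "__main__":
    data = {
        "u1": ["museum", "museum", "mall"],
        "u2": ["museum", "mall"],
        "u3": ["park", "mall"],
    }
    vecs = tfidf_vectors(data)
    print(cosine(vecs["u1"], vecs["u2"]), cosine(vecs["u1"], vecs["u3"]))
```

In this toy data the mall is visited by every user, so its inverse document frequency is zero and it does not make `u1` and `u3` look similar.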
Pub Date: 2019-10-01 | DOI: 10.4018/ijwsr.2019100101
S. Bukhari, Yunni Xia
The cloud computing paradigm provides an ideal platform for supporting large-scale scientific-workflow-based applications over the internet. However, the scheduling and execution of scientific workflows still face various challenges, such as cost and response time management, which aims at handling acquisition delays of physical servers and minimizing the overall completion time of workflows. A careful investigation shows that most existing approaches assume static performance of physical machines (PMs) and ignore the impact of resource acquisition delays in their scheduling models. In this article, the authors present a meta-heuristic-based method for scheduling scientific workflows that aims to reduce workflow completion time by appropriately managing the acquisition and transmission delays required for inter-PM communications. The authors also carry out extensive case studies based on real-world commercial clouds and multiple workflow templates. Experimental results clearly show that the proposed method outperforms state-of-the-art ones such as ICPCP, CEGA, and JIT-C in terms of workflow completion time.
Title: A Novel Completion-Time-Minimization Scheduling Approach of Scientific Workflows Over Heterogeneous Cloud Computing Systems | Journal: International Journal of Web Services Research | Pages: 1-20
Pub Date: 2019-10-01 | DOI: 10.4018/ijwsr.2019100104
Abdulrahman Elhosuieny, Mofreh Salem, Amr Thabet, Abdelhameed Ibrahim
Mobile computation applications now attract major interest from researchers. Limited processing power and short battery lifetime are obstacles to executing computationally intensive applications. This article presents an automatic decision-making offloading framework for mobile computation. The proposed framework consists of two phases: adaptive learning and modeling, and runtime computation offloading. In the adaptive phase, a curve-fitting (CF) technique based on non-linear polynomial regression (NPR) is used to build an approximate time-predicting model that estimates the execution time of the detected computation-intensive applications. The runtime computation phase uses the time-predicting model to compute the predicted execution time and decide whether to offload the application and run it remotely or to run it locally. When the offloading decision is positive, a RESTful web service carries out the offloading task. Experimentally, the proposed framework outperforms a competitive state-of-the-art technique by 73% with respect to time. The proposed time-predicting model deviates minimally from the originally obtained values, with mean squared errors of 0.4997, 8.9636, 0.0020, and 0.6797 for the matrix-determinant, image-sharpening, matrix-multiplication, and n-queens problems, respectively.
Title: ADOMC-NPR Automatic Decision-Making Offloading Framework for Mobile Computation Using Nonlinear Polynomial Regression Model | Journal: International Journal of Web Services Research | Pages: 53-73
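The two phases can be illustrated with a minimal sketch: fit a polynomial to measured (input size, execution time) pairs, then compare the predicted local time with an estimate of the remote time. The polynomial degree, the timing data, and the decision rule "offload when the predicted local time exceeds the remote estimate" are assumptions made here for illustration; the abstract does not state the framework's actual rule or model order.

```python
def polyfit(xs, ys, degree):
    """Least-squares polynomial fit; returns c with y ~ c[0] + c[1]*x + ...

    Needs at least degree + 1 distinct x values, otherwise the system is singular.
    """
    m = degree + 1
    # Normal equations A c = b with A[i][j] = sum x^(i+j), b[i] = sum y * x^i.
    a = [[sum(x ** (i + j) for x in xs) for j in range(m)] for i in range(m)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(m)]
    # Gaussian elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, m):
            factor = a[row][col] / a[col][col]
            for k in range(col, m):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * m
    for i in reversed(range(m)):
        tail = sum(a[i][j] * coeffs[j] for j in range(i + 1, m))
        coeffs[i] = (b[i] - tail) / a[i][i]
    return coeffs


def predict(coeffs, x):
    return sum(c * x ** i for i, c in enumerate(coeffs))


def mse(coeffs, xs, ys):
    """Mean squared error of the fitted model on the given samples."""
    return sum((predict(coeffs, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def should_offload(coeffs, input_size, remote_time):
    """Assumed rule: offload when predicted local time exceeds the remote estimate."""
    return predict(coeffs, input_size) > remote_time


if __name__ == "__main__":
    sizes = [1, 2, 3, 4, 5]                           # hypothetical input sizes
    times = [2 * n * n + 3 * n + 1 for n in sizes]    # hypothetical local timings
    model = polyfit(sizes, times, degree=2)
    print(model, mse(model, sizes, times), should_offload(model, 10, remote_time=100.0))
```

A real remote-time estimate would also have to include network transfer, which this sketch folds into the single `remote_time` argument.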
Pub Date: 2019-07-01 | DOI: 10.4018/IJWSR.2019070102
Hajar Elmaghraoui, Laila Benhlima, D. Chiadmi
In this article, the authors propose a dynamic web service composition approach based on representing the semantic relationship between web services using a weighted directed AND/OR graph. The nodes in this graph represent available services while the arcs represent the semantic input/output dependencies among them. The novelty of this work consists of constructing the graph and computing offline the shortest paths between each pair of its nodes to disconnect this tedious task from the composition query process. A set of dynamic optimization techniques has been included to reduce the size of the graph and thus improve the scalability and performance of this approach. In addition to the sequence and fork relations between services, this solution also supports the parallel relation. Furthermore, a recovery mechanism is integrated to ensure the continuity of the execution of the composition.
Title: Automatic Dynamic Web Service Composition Using AND/OR Directed Graphs | Journal: International Journal of Web Services Research | Pages: 29-43
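The idea of precomputing shortest paths offline, so that a composition query becomes a table lookup, can be sketched with the Floyd-Warshall algorithm on an ordinary weighted directed graph. This is a simplification: it ignores the AND/OR semantics, the optimization techniques, and the recovery mechanism described in the abstract, and the service names and weights are hypothetical.

```python
import math


def all_pairs_shortest_paths(nodes, arcs):
    """nodes: list of ids; arcs: dict mapping (u, v) -> non-negative weight."""
    dist = {u: {v: (0.0 if u == v else math.inf) for v in nodes} for u in nodes}
    nxt = {u: {v: None for v in nodes} for u in nodes}
    for (u, v), w in arcs.items():
        if w < dist[u][v]:
            dist[u][v] = w
            nxt[u][v] = v
    # Floyd-Warshall: relax every pair through each intermediate node k.
    for k in nodes:
        for i in nodes:
            for j in nodes:
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
                    nxt[i][j] = nxt[i][k]
    return dist, nxt


def path(nxt, src, dst):
    """Rebuild the node sequence from the next-hop table; empty if unreachable."""
    if src == dst:
        return [src]
    if nxt[src][dst] is None:
        return []
    hops = [src]
    while src != dst:
        src = nxt[src][dst]
        hops.append(src)
    return hops


if __name__ == "__main__":
    services = ["A", "B", "C", "D"]
    deps = {("A", "B"): 1, ("B", "C"): 2, ("A", "C"): 5, ("C", "D"): 1}
    dist, nxt = all_pairs_shortest_paths(services, deps)
    print(dist["A"]["D"], path(nxt, "A", "D"))
```

The cubic precomputation is paid once offline; answering a query afterwards only walks the next-hop table.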
Pub Date: 2019-07-01 | DOI: 10.4018/IJWSR.2019070104
Yi Zhao, Yu Qiao, K. He
Clustering has become an increasingly important task in the analysis of large documents. Clustering aims to organize these documents and to facilitate better search and knowledge extraction. Most existing clustering methods that use user-generated tags consider only their positive influence on automatic clustering performance. The authors argue that not all user-generated tags provide useful information for clustering. In this article, the authors propose a new solution for clustering, named HRT-LDA (High Representation Tags Latent Dirichlet Allocation), which considers the effects of different tags on clustering performance. To this end, the authors apply a tag filtering strategy and a tag appending strategy based on transfer learning, Word2vec, TF-IDF, and semantic computing. Extensive experiments on real-world datasets demonstrate that HRT-LDA outperforms state-of-the-art tagging-augmented LDA methods for clustering.
Title: A Novel Tagging Augmented LDA Model for Clustering | Journal: International Journal of Web Services Research | Pages: 59-77