A bio-inspired genetic algorithm for community mining
Yitong Lu, Mingxin Liang, Chao Gao, Yuxin Liu, Xianghua Li
Pub Date: 2016-08-01 | DOI: 10.1109/FSKD.2016.7603255
Community structure, a vital property of complex networks, contributes greatly to understanding and detecting the inherent functions of real networks. However, existing algorithms, ranging from optimization-based to model-based strategies, still need to be strengthened in terms of robustness and accuracy. In this paper, a multi-headed slime mold, Physarum, is used to optimize a genetic algorithm (GA), owing to its intelligence in generating foraging networks, as established by biological research. A Physarum-based Network Model (PNM) is proposed, which shows an ability to recognize inter-community edges. Combining PNM with a GA, a novel genetic algorithm called PNGACD is put forward to enhance the GA's efficiency, integrating the a priori edge recognition of PNM into the initialization phase. Experiments on six real-world networks are used to evaluate the efficiency of the proposed method. The results show a remarkable improvement in robustness and accuracy, demonstrating that PNGACD outperforms existing algorithms.
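The abstract does not state which fitness function the GA optimizes; Newman modularity Q is the conventional objective in GA-based community detection and is assumed here. A minimal stdlib sketch, with an illustrative six-node graph (two triangles joined by a bridge):

```python
# Sketch of a modularity fitness function of the kind commonly used as the
# GA objective in community detection. The graph and partition below are
# illustrative, not taken from the paper.

def modularity(edges, communities):
    """Newman modularity Q for an undirected graph.

    edges: list of (u, v) pairs; communities: dict node -> community id.
    """
    m = len(edges)
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    q = 0.0
    for u, v in edges:                   # observed intra-community edge fraction
        if communities[u] == communities[v]:
            q += 1.0 / m
    # subtract the expected intra-community fraction under the null model
    for c in set(communities.values()):
        d = sum(deg for n, deg in degree.items() if communities[n] == c)
        q -= (d / (2.0 * m)) ** 2
    return q

# Two triangles joined by one bridge edge: the natural split scores high.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
part = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}
```

A GA such as PNGACD would evaluate `modularity` on each candidate partition; PNM's edge recognition would bias the initial population toward partitions that keep likely intra-community edges together.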
An adaptive threshold shot detection algorithm based on improved block color features
Huayong Liu, Tao Li
Pub Date: 2016-08-01 | DOI: 10.1109/FSKD.2016.7603460
An adaptive-threshold shot detection algorithm based on improved block color features is proposed in this paper. The method adopts an improved block color feature extraction scheme based on rectangular rings of equal area. A sub-block cumulative color histogram is extracted as the color feature, and different weights are assigned to the rectangular rings to emphasize the central part of the frame. Adaptive thresholds for detecting abrupt and gradual shots are then calculated, and different detection modules are applied according to the distance between frame features. In abrupt shot detection, the frame differences over several frames and the edge-shape features between adjacent frames are computed to detect flashes. In gradual shot detection, discontinuous frame differences between the current frame and subsequent frames are used to locate the boundary of the gradual transition. Experimental results show that the method performs well on different types of video.
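The adaptive-threshold idea can be sketched independently of the color features: declare an abrupt cut when a frame difference exceeds its recent local statistics. The window size, the multiplier `k`, and the reduction of each frame difference to a single scalar are illustrative assumptions, not the paper's definitions:

```python
# Minimal sketch of adaptive abrupt-cut detection: flag frame i when its
# difference from the previous frame exceeds the local mean by k standard
# deviations (and by a sanity margin of 2x the mean, to suppress noise).

def detect_cuts(diffs, window=5, k=3.0):
    """diffs[i]: scalar feature distance between frame i-1 and frame i."""
    cuts = []
    for i, d in enumerate(diffs):
        lo = max(0, i - window)
        ctx = diffs[lo:i] or [d]            # recent context window
        mean = sum(ctx) / len(ctx)
        var = sum((x - mean) ** 2 for x in ctx) / len(ctx)
        if d > mean + k * var ** 0.5 and d > 2 * mean:
            cuts.append(i)
    return cuts

# A spike at index 4 stands in for an abrupt shot change.
diffs = [1.0, 1.2, 0.9, 1.1, 9.5, 1.0, 1.1, 0.95, 1.05, 1.0]
```

In the paper's full method, `diffs` would come from the ring-weighted sub-block histograms, and a second, lower threshold over accumulated differences would handle gradual transitions.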
The concept of possibility based on human cognition
Wei Mei
Pub Date: 2016-08-01 | DOI: 10.1109/FSKD.2016.7603302
An axiomatic definition of possibility measure without the “maxitivity” operator is proposed in this work. It has a psychological foundation in human cognition of possibility and is consistent with the conditional-probability interpretation of possibility. A possibility measure defined this way obeys a disjunctive operator of arithmetic mean and a conjunctive operator of product, which offers a different perspective on the understanding of possibility and should promote cross-fertilization between probability and possibility theory.
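The two operators named in the abstract are simple to state concretely; the numeric values below are illustrative only:

```python
# The operators the paper derives: disjunction as the arithmetic mean and
# conjunction as the product, in contrast to the standard max/min operators
# of classical possibility theory.

def poss_or(a, b):
    """Disjunctive operator: arithmetic mean."""
    return (a + b) / 2.0

def poss_and(a, b):
    """Conjunctive operator: product."""
    return a * b
```

Note one immediate difference from max/min: both operators here are strictly sensitive to both arguments, so `poss_or(0.4, 0.8)` yields 0.6 rather than the 0.8 that maxitivity would give.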
Building vietnamese dependency treebank based on Chinese-Vietnamese bilingual word alignment
Ying Li, Jianyi Guo, Zhengtao Yu, Hongbin Wang, Yonghua Wen
Pub Date: 2016-08-01 | DOI: 10.1109/FSKD.2016.7603371
Treebanks are among the most important resources in natural language processing. Compared with the rich and mature Chinese corpora, Vietnamese syntactic analysis is much more difficult. This paper presents a new approach that uses a Chinese-Vietnamese bilingual word-aligned corpus to build a Vietnamese dependency treebank. First, word alignment is performed on Chinese-Vietnamese sentence pairs; second, dependency parsing is carried out on the Chinese sentences; finally, the Vietnamese dependency treebank is generated from the Chinese-Vietnamese alignment relations and the Chinese dependency trees. At the same time, converting Vietnamese phrase-structure trees into dependency trees significantly improves the accuracy of dependency analysis. Experimental results show that this approach simplifies the manual collection and annotation of a Vietnamese treebank, saving both manpower and time, and that its accuracy improves significantly compared with machine-learning methods.
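The core projection step, transferring each Chinese head-dependent arc to the aligned Vietnamese tokens, can be sketched as follows. The 1-to-1 alignment, the head encoding, and the function name are illustrative simplifications; the paper's pipeline handles many-to-many alignments and further post-processing:

```python
# Hedged sketch of dependency projection through word alignment: given a
# Chinese dependency tree and a Chinese->Vietnamese alignment, copy each
# head relation onto the aligned Vietnamese tokens.

def project_dependencies(zh_heads, alignment):
    """zh_heads: dict zh token index -> zh head index (root maps to -1).
    alignment: dict zh token index -> vi token index (assumed 1-to-1).
    Returns vi token index -> vi head index."""
    vi_heads = {}
    for dep, head in zh_heads.items():
        if dep in alignment:
            vi_dep = alignment[dep]
            # the root stays the root; otherwise follow the head's alignment
            vi_heads[vi_dep] = -1 if head == -1 else alignment.get(head, -1)
    return vi_heads

zh_heads = {0: 1, 1: -1, 2: 1}      # token 1 is the root of the zh tree
alignment = {0: 0, 1: 1, 2: 2}      # monotone 1-to-1 alignment
```

Running `project_dependencies(zh_heads, alignment)` reproduces the same tree shape over the Vietnamese indices, which is exactly the structure the treebank construction accumulates sentence by sentence.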
Risk assessment of long-distance water transmission pipeline based on fuzzy similarity evaluation approach
Dan Li, W. Yao
Pub Date: 2016-08-01 | DOI: 10.1109/FSKD.2016.7603331
To ensure the normal operation of water supply projects, especially long-distance water transmission pipelines (LDWTP), it is necessary to recognize and avoid potential risk events during operation. Because the vagueness and correlation of risk indices usually have a significant impact on assessment results, indices with high similarity are identified and eliminated using fuzzy similarity theory and triangular fuzzy numbers. A quantitative risk assessment is then conducted on the newly established risk index system, using the fuzzy comprehensive evaluation method combined with the analytic hierarchy process (AHP). The case study shows five groups of highly similar indices, allowing the risk index system to be screened from 27 indices down to 22. The assessment shows that the risk grades of the four categories, from high to low, are technology risk, management risk, third-party risk, and nature risk. The comprehensive risk of the LDWTP is high; the result helps decision makers identify the most significant risk events and take the measures necessary for risk reduction and control.
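The fuzzy comprehensive evaluation step combines AHP weights with a membership matrix into a grade vector, B = W · R. The weights and memberships below are invented for illustration and are not the case-study values:

```python
# Minimal sketch of fuzzy comprehensive evaluation: AHP weights W over the
# risk indices, membership matrix R (rows = indices, columns = risk grades),
# grade vector B = W . R; the largest component of B gives the overall grade.

def fuzzy_evaluate(weights, memberships):
    grades = len(memberships[0])
    return [sum(w * row[g] for w, row in zip(weights, memberships))
            for g in range(grades)]

W = [0.5, 0.3, 0.2]                  # AHP weights for three indices (toy)
R = [[0.6, 0.3, 0.1],                # memberships in (high, medium, low)
     [0.2, 0.5, 0.3],
     [0.3, 0.4, 0.3]]
B = fuzzy_evaluate(W, R)
```

Here `B` comes out as roughly (0.42, 0.38, 0.20), so the "high" grade dominates, mirroring the paper's conclusion that the comprehensive LDWTP risk is high; the preceding fuzzy-similarity screening simply reduces the number of rows of R before this step.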
Capacity maximization on multi-relay MIMO cooperative system
Ying Lin, Suoping Li, Haiyan Chen, Duo Peng
Pub Date: 2016-08-01 | DOI: 10.1109/FSKD.2016.7603246
Cooperative networking is widely used because of its advantages in network capacity and transmission reliability. This paper presents an analytical study of the channel capacity of an amplify-and-forward (AF) relay system with multiple relays in a multiple-input multiple-output (MIMO) setting. The resource allocation problem is studied, and an algorithm for the amplification matrix of each antenna of the multiple relays is presented. The problem is then solved using a global optimization method. Simulation results show that the proposed algorithm can greatly improve the channel capacity at little cost.
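The paper optimizes matrix-valued amplification for MIMO relays; as a hedged scalar illustration of the quantity being maximized, the classical single-antenna AF result can be sketched (the MIMO case replaces these scalars with amplification matrices and a log-det capacity expression):

```python
import math

# Classical two-hop amplify-and-forward link: with per-hop SNRs g1 and g2,
# the end-to-end SNR is g = g1*g2 / (g1 + g2 + 1), and the half-duplex
# capacity is C = (1/2) * log2(1 + g). This is the textbook scalar analogue
# of the MIMO objective, not the paper's own formula.

def af_capacity(g1, g2):
    g = g1 * g2 / (g1 + g2 + 1.0)
    return 0.5 * math.log2(1.0 + g)
```

The factor 1/2 reflects the two transmission phases of the relay protocol; the amplification design in the paper effectively shapes the per-stream `g1`, `g2` terms to maximize the matrix generalization of this expression.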
Pub Date : 2016-08-01DOI: 10.1109/FSKD.2016.7603459
Q. Zheng, C. Zheng
Photo-realistic image synthesis using full lens model provides us with realistic lens effects, but it suffers from low rendering efficiency when many light paths are obstructed by lens stops and lens barrel. This paper proposes a novel method to generate light paths, along which rays can propagate through the lens system. As a first step, a light passage function is defined as the objective function for sampling light paths. The sampling is implemented in a hypercube space, by means of both adaptive Markov chain sampling and interacting Markov chain Monte Carlo. Then light paths are constructed based on these samples. This approach can be easily incorporated in existing rendering methods to trace rays through a full lens model. Experimental results show that this approach can effectively increase the number of valid rays which can go through the lens system, therefore improving the rendering efficiency.
{"title":"Adaptive light paths generation through full lens model","authors":"Q. Zheng, C. Zheng","doi":"10.1109/FSKD.2016.7603459","DOIUrl":"https://doi.org/10.1109/FSKD.2016.7603459","url":null,"abstract":"Photo-realistic image synthesis using full lens model provides us with realistic lens effects, but it suffers from low rendering efficiency when many light paths are obstructed by lens stops and lens barrel. This paper proposes a novel method to generate light paths, along which rays can propagate through the lens system. As a first step, a light passage function is defined as the objective function for sampling light paths. The sampling is implemented in a hypercube space, by means of both adaptive Markov chain sampling and interacting Markov chain Monte Carlo. Then light paths are constructed based on these samples. This approach can be easily incorporated in existing rendering methods to trace rays through a full lens model. Experimental results show that this approach can effectively increase the number of valid rays which can go through the lens system, therefore improving the rendering efficiency.","PeriodicalId":373155,"journal":{"name":"2016 12th International Conference on Natural Computation, Fuzzy Systems and Knowledge Discovery (ICNC-FSKD)","volume":"189 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123212721","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
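The Markov-chain sampling idea can be sketched with a toy passage function: a Metropolis walk in the unit hypercube whose target is near-zero wherever "the ray is blocked". The passage function, proposal scale, and the choice to start the chain at the hypercube center are all illustrative assumptions:

```python
import random

# Hedged sketch of Markov-chain sampling against a light passage function:
# f is 1 inside a small admissible region (standing in for "the ray makes
# it through the lens") and nearly 0 outside, so accepted states cluster
# where valid light paths exist.

def passage(x):
    return 1.0 if all(0.4 < xi < 0.6 for xi in x) else 1e-6

def metropolis(f, dim=2, steps=500, scale=0.1, seed=1):
    rng = random.Random(seed)
    x = [0.5] * dim                      # start at the hypercube centre
    valid = []
    for _ in range(steps):
        # Gaussian proposal, clipped to the unit hypercube
        y = [min(1.0, max(0.0, xi + rng.gauss(0, scale))) for xi in x]
        if f(y) / f(x) >= rng.random():  # Metropolis acceptance ratio
            x = y
        if f(x) == 1.0:
            valid.append(list(x))
    return valid

paths = metropolis(passage)
```

Nearly every retained sample lies in the admissible region, which is the efficiency gain the paper reports: effort is concentrated on rays that actually traverse the lens system instead of being discarded at the stops.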
Domain adaptation for statistical machine translation
Xiaoxue Wang, Conghui Zhu, Sheng Li, T. Zhao, Dequan Zheng
Pub Date: 2016-08-01 | DOI: 10.1109/FSKD.2016.7603425
Statistical machine translation (SMT) plays an increasingly important role today. SMT performance depends largely on the size and quality of the training data, yet translation demand is diverse: how to make the best use of limited in-domain data to satisfy translation needs from different domains is one of the hot topics in current SMT research. Domain adaptation aims to markedly improve domain-specific performance by exploiting abundant out-of-domain parallel corpora when in-domain parallel corpora are lacking, and it is one of the keys to putting SMT into practical use. This paper introduces mainstream domain-adaptation methods for SMT, compares the advantages and disadvantages of representative methods on the same data, and offers views on possible future directions of domain adaptation for SMT.
Modal linguistic summaries based on natural language equivalence with cognitive semantics
R. Katarzyniak, Wojciech A. Lorkiewicz, Dominik P. Wiecek
Pub Date: 2016-08-01 | DOI: 10.1109/FSKD.2016.7603360
An original model of linguistic summaries extracted from episodic data is briefly presented. In particular, a class of linguistic summaries expressed as modal equivalences is considered. The model is tailored to the concept of autonomous agent systems and is supported by several detailed, non-technical, natural language processing and knowledge representation theories. Complementing the well-known classic interpretation of linguistic summaries based on fuzzy set theory, the proposed model deals with a different class of vague cognitive concepts: epistemic modalities, in particular the concepts of knowledge, belief, and possibility. Each sub-class of linguistic summaries is processed as understood in the context of natural systems and supported by related cognitive semantics. Remarks on relevant implementation technologies are given, and an illustrative computational example is presented.
Parallelizing hot topic detection of microblog on spark
Wei Ai, Dapu Li
Pub Date: 2016-08-01 | DOI: 10.1109/FSKD.2016.7603392
With the arrival of the big data age, how to extract valuable hot topics quickly and accurately from vast amounts of digitized text has attracted increasing attention. This paper proposes a parallel Two-phase Mic-mac Hot Topic Detection (TMHTD) method designed specifically for microblogging in a big data environment, implemented on the Apache Spark cloud computing platform. TMHTD is a distributed two-phase clustering framework for document sets, consisting of micro-clustering and macro-clustering. In the first phase, TMHTD partitions the original data set into a group of smaller data sets, and each subset is clustered into many small topics, producing intermediate results. In the second phase, the intermediate results are merged and further clustered to obtain the final hot-topic set. An optimization of TMHTD is also proposed to improve detection accuracy. To handle large data sets, a group of MapReduce jobs is designed to accomplish hot-topic detection in a highly scalable way. Extensive experimental results indicate that TMHTD improves both accuracy and performance significantly over existing approaches.
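The micro/macro two-phase structure can be sketched with a simple leader-style clusterer standing in for the paper's clustering rule. Documents are modeled as word sets, similarity is the Jaccard index, and both thresholds are illustrative assumptions:

```python
# Hedged sketch of two-phase topic clustering: phase 1 clusters each
# partition locally into micro-topics; phase 2 re-clusters the micro-topic
# centroids globally. In the real system each partition would live on a
# separate Spark worker.

def jaccard(a, b):
    return len(a & b) / len(a | b)

def leader_cluster(items, threshold):
    """Greedy leader clustering: join the first cluster whose leader is
    similar enough, else open a new cluster."""
    clusters = []
    for it in items:
        for c in clusters:
            if jaccard(it, c[0]) >= threshold:
                c.append(it)
                break
        else:
            clusters.append([it])
    return clusters

def two_phase(partitions, micro_t=0.5, macro_t=0.3):
    micro = []                                   # phase 1: local micro-topics
    for part in partitions:
        for c in leader_cluster(part, micro_t):
            micro.append(set().union(*c))        # centroid = union of words
    return leader_cluster(micro, macro_t)        # phase 2: global merge

p1 = [{"spark", "big", "data"}, {"spark", "data", "cluster"}, {"cat", "pet"}]
p2 = [{"big", "data", "spark"}, {"dog", "pet", "cat"}]
topics = two_phase([p1, p2])
```

On this toy input the two partitions collapse into two global topics (a "spark/data" topic and a "pets" topic), illustrating how duplicate micro-topics discovered on different partitions are merged in the macro phase.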