Portfolio optimization refers to the rational allocation of assets to achieve investment objectives. In the current depressed investment environment caused by sluggish economic conditions, investors seek a balance between return and risk. To address this problem, this paper proposes a portfolio optimization algorithm based on affinity propagation and a genetic algorithm, called POGA. First, the affinity propagation algorithm constructs a candidate set of stocks based on correlation analysis of the stock time series. Second, using the Sharpe ratio as the optimization objective, a genetic algorithm searches for an optimal portfolio strategy with higher return and lower risk. Finally, experimental results on real-world stock data show that the algorithm selects portfolios with higher return and lower risk.
{"title":"Research on Portfolio Optimization Based on Affinity Propagation and Genetic Algorithm","authors":"Chong Liu, Wenyan Gan, Yutian Chen","doi":"10.1109/WISA.2017.9","DOIUrl":"https://doi.org/10.1109/WISA.2017.9","url":null,"abstract":"Portfolio optimization refers to the reasonable allocation of assets to achieve the investment objectives. At present, the investment environment depression was due to sluggish economic conditions. For investors, they expect to find a balance between return and risk in a complex investment environment. In order to solve this problem, this paper proposes a portfolio optimization algorithm named Portfolio Optimization based on Affinity propagation and Genetic algorithm, as also called POGA, which based on affinity propagation and genetic algorithm. Firstly, the affinity propagation algorithm is used to construct a candidate set of portfolio based on the correlation analysis of the stock time series. Secondly, using the Sharpe-ratio as the Optimization objective function, the genetic algorithm is used to solve an optimal portfolio strategy with higher-return and lower-risk. Finally, the experimental result of real-world stock data show that a portfolio with higher return and lower risk will be selected.","PeriodicalId":204706,"journal":{"name":"2017 14th Web Information Systems and Applications Conference (WISA)","volume":"51 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116004379","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Document clustering (or text clustering) is the application of cluster analysis to textual documents. It has applications in automatic document organization, topic extraction, and fast information retrieval or filtering. At the same time, many challenges remain; for example, clustering accuracy needs to be improved. In this regard, cluster correction becomes the object of analysis. In this paper, we focus on polysemy and synonymy in the clustering process. Polysemy is the ambiguity of an individual word or phrase that can be used (in different contexts) to express two or more different meanings, whereas synonymy is the semantic relation that holds between two or more words that can (in a given context) express the same meaning. Both conditions affect clustering results. To address them, we use a bag-of-words model to distinguish contexts of the same word and word2vec to re-cluster words with similar meanings. Cosine similarity is used to measure the similarity between two nonzero vectors in both models.
{"title":"Cluster Correction on Polysemy and Synonymy","authors":"Zemin Qin, Hao Lian, Tieke He, B. Luo","doi":"10.1109/WISA.2017.45","DOIUrl":"https://doi.org/10.1109/WISA.2017.45","url":null,"abstract":"Document clustering (or text clustering) is the application of cluster analysis to textual documents. It has applications in automatic document organization, topic extraction and fast information retrieval or filtering. At the same time, there are still many challenges, for example the accuracy of clustering needs to be improved. In this regard, the process of cluster correction becomes the object of analysis. In this paper, we focus on the polysemy and synonymy issue in clustering process. Polysemy represents the ambiguity of an individual word or phrase that can be used (in different contexts) to express two or more different meanings. However, synonymy is the semantic relation that holds between two or more words that can (in a given context) express the same meaning. These two conditions will affect our results of clustering. In order that, we use bag of words model to distinguish contexts of the same words and word2vec to re-cluster word with the similar meaning. Cosine similarity is also use to measure of similarity between two nonzero vectors in these two model.","PeriodicalId":204706,"journal":{"name":"2017 14th Web Information Systems and Applications Conference (WISA)","volume":"89 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116669890","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
With the development of scientific research, scientific publications are valuable resources for newcomers to a research field, but their sheer volume makes it challenging for researchers to dive into a new field. As a common remedy, topics are used to organize publications. In this paper, we propose two modified LDA topic models, cc-LDA and cp-LDA, for topic analysis and influential paper discovery on scientific publications. Compared with existing research on LDA, we incorporate citation information, including how often and where citations occur, into our models. cc-LDA integrates paper content and citation occurrence into the LDA model, while cp-LDA considers both the occurrence and the position of citations. Both models not only find topics in the form of citation distributions but also help discover influential papers under certain topics. Furthermore, both models extract more representative vectors for papers, which achieve good performance in subsequent clustering.
{"title":"Topic Analysis and Influential Paper Discovery on Scientific Publications","authors":"Ye Li, Jun He, Hongyan Liu","doi":"10.1109/WISA.2017.69","DOIUrl":"https://doi.org/10.1109/WISA.2017.69","url":null,"abstract":"With the development of scientific research, scientific publications are valuable resources for new-comers in the research field. But massive scientific publications make it a challenge for researchers diving into a new research field. As a good practice to this problem, topics are put forward to organize publications. In this paper, we propose two modified LDA topic models as solutions to topic analysis and influential paper discovery on scientific publications, cc-LDA and cp-LDA. Compared to state-of-the-art researches on LDA, we incorporate citation information including its occurrence times and occurrence position into our models. Model cc-LDA integrates paper content and citation occurrence into LDA model, while cp-LDA considers both occurrence and position of citations. Both models can not only find topics in the form of citation distribution, but also help discover influential papers under certain topics. Furthermore, both models can extract more representative vectors for papers, which achieve good performance in subsequent clustering.","PeriodicalId":204706,"journal":{"name":"2017 14th Web Information Systems and Applications Conference (WISA)","volume":"114 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124792298","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Abnormality detection in power plants is a typical IoT application that aims to identify anomalies in routinely collected monitoring sensor data and thereby help detect possible equipment faults. In developing abnormality detection, we find three challenges. The first is the lack of cooperation between sensors: physical sensors cannot share data and interact with each other. Second, the rapid growth in sensor data volume and the dynamic nature of production make it difficult to predefine all possible associations between sensors. Third, it is difficult for developers with little or no professional knowledge of the production process to build IoT applications. In this paper, we propose a proactive data service model that encapsulates stream sensor data into services. Events are spread among the proactive data services, and by analyzing event correlations we realize service hyperlinks that enable proactive real-time interaction among services. A real application and experiments verify that our proactive data service based method is more effective than traditional rule-based methods for detecting abnormalities in power plants.
{"title":"A Proactive Data Service Model to Encapsulating Stream Sensor Data into Service","authors":"Shouli Zhang, Chen Liu, Shen Su, Yanbo Han, Dandan Feng","doi":"10.1109/WISA.2017.5","DOIUrl":"https://doi.org/10.1109/WISA.2017.5","url":null,"abstract":"Abnormality Detection in power plant is a typical IoT application which aims to identify anomalies in these routinely collected monitoring sensor data; intend to help detect possible faults in the equipment. However, on the development of abnormality detection, we find that there are three challenges. The first one is the lack of cooperation between sensors. It means that the physical sensors cannot share and interact with each other. Secondly, the rapid increase in volume of sensor data and dynamic situation of production result in challenges to predefine all possible associations between sensors. Thirdly, it is difficult to build IoT application for developers who have little or no professional knowledge about production process. In this paper, we proposed a proactive data service model to encapsulate stream sensor data into services. We spread events among the proactive data services. By analysis of event correlations, we have realized service hyperlinks which help to offer the proactive real-time interaction with services. Real application and experiments verified that our proactive data service based method is more effective compare with traditional rule-based methods to detect abnormalities in power plant.","PeriodicalId":204706,"journal":{"name":"2017 14th Web Information Systems and Applications Conference (WISA)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128978694","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
As people accumulate ever more personal information through personal and work activities, managing this information becomes a serious problem and an important research issue. Modeling personal desktop activities and identifying them are two basic problems in supporting activity-based operations. To the best of our knowledge, there is no literature on formalizing and identifying desktop activities from a personal information management perspective. This work faces a number of challenges, including the facts that people exhibit personalized behaviors, have individual interests, needs, and resources, and that no experimental data set is available. In this paper, we perform a user experiment to learn about desktop activities in a personal information management context. We collect information access activities in a naturalistic setting and propose a conceptual activity model by analyzing features of user behavior at desktop computers. We present an effective and efficient method for automatically identifying desktop activities. To evaluate its performance, we develop a prototype system to collect real users' activities and evaluate our identification method. The results verify the effectiveness and efficiency of our approach.
{"title":"A Method to Identify Personal Desktop Activities","authors":"Ruolan Li, Huan Liao, Huili Su, Yukun Li, Yongxuan Lai","doi":"10.1109/WISA.2017.39","DOIUrl":"https://doi.org/10.1109/WISA.2017.39","url":null,"abstract":"As people acquire much more personal information as a result of personal and work activities, the management of these information becomes a serious problem and an important research issue. Modeling personal desktop activities and identifying them are two basic problems for supporting activity-based operations. To the best of our knowledge there is no literature on formalizing and identifying desktop activity from personal information management perspective. There are a number of challenges to this work, including the fact that people exhibit personalized behaviors, have individual interests, needs and resources, no available experimental data set, etc. In this paper, we perform a user experiment to learn about user desktop activities in a personal information management context. We collected information access activities in a naturalistic setting and propose a conceptual activity model by analyzing features of user behaviors at their desktop computers. We present an effective and efficient method of automatically identifying desktop activities. To evaluate performance of our method, we develop a prototype system to collect real users activities, and evaluate our methods for identifying activities. The results verify the effectiveness and efficiency of our methods.","PeriodicalId":204706,"journal":{"name":"2017 14th Web Information Systems and Applications Conference (WISA)","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134390971","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
With the continuous construction and development of university libraries, finding interesting books among massive collections has become a pressing problem. In this paper, we develop a personalized book recommender system based on the Chinese Library Classification, named CLCM. CLCM uses an Upper and Lower Level Relations Model (ULLRM) to describe characteristic words and fuses a Dominant and Recessive Feedback Model (DRFM) to update users' preferences. In addition, visualization of book inquiries improves query efficiency. Experimental results show that CLCM performs much better than state-of-the-art approaches in the university library setting.
{"title":"Personalized Book Recommender System Based on Chinese Library Classification","authors":"H. Zhang, Yingyuan Xiao, Zhongjing Bu","doi":"10.1109/WISA.2017.42","DOIUrl":"https://doi.org/10.1109/WISA.2017.42","url":null,"abstract":"with the continuous construction and development of university library, how to find interesting books from the massive books is becoming a concerned problem. In this paper, we develop a personalized book recommender system based on Chinese Library Classification Method named CLCM. CLCM uses Upper and Lower Level Relations Model (ULLRM) to describe the characteristic words and fuses the Dominant and Recessive Feedback Model (DRFM) to update the users' preferences. And visualization of book inquiry improves the efficiency of inquiring. The experimental results show that CLCM performs much better than the state-of-the art approaches in the university library.","PeriodicalId":204706,"journal":{"name":"2017 14th Web Information Systems and Applications Conference (WISA)","volume":"2012 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132137906","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Collaborative filtering techniques have demonstrated excellent performance in top-N recommendation. However, conventional similarity measures are insufficient under data sparsity and cold-start conditions, which leads to poor prediction accuracy. To overcome this limitation, a collaborative filtering algorithm that calculates similarity based on item ratings and attributes is proposed. We first calculate the similarity of item attributes and then calculate item similarity from users' ratings of the items. A weighted control coefficient is introduced to combine the attribute-based and rating-based similarities, which helps to obtain nearest neighbors. Experiments show that our algorithm has major potential for solving the cold-start problem, thereby improving the precision of the recommender system.
{"title":"A Collaborative Filtering Algorithm of Calculating Similarity Based on Item Rating and Attributes","authors":"Zelong Li, Mengxing Huang, Yu Zhang","doi":"10.1109/WISA.2017.35","DOIUrl":"https://doi.org/10.1109/WISA.2017.35","url":null,"abstract":"Nowadays, the collaborative filtering techniques have demonstrated an excellent performance in the top-N recommendation. However conventional methods in similarity measurement are insufficient when the condition of data sparsity and cold start occur, which leads to a poor accuracy in prediction. In order to concur the limitation, a collaborative filtering algorithm of calculating similarity based on item rating and attributes is proposed. Firstly, we calculate the similarity of item attributes, then calculate the similarity of the project according to the user rating of the project. Meanwhile, a weighted control coefficient is proposed to combine the similarity between item attributes and rating of items, which contribute to obtain nearest neighbors. Experiments have shown that our algorithm has major potential in solving the problem of cold start, therefore improving the precision of the recommendation system.","PeriodicalId":204706,"journal":{"name":"2017 14th Web Information Systems and Applications Conference (WISA)","volume":"235 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132628653","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Massive Electronic Medical Records (EMRs) contain a great deal of knowledge, and Named Entity Recognition (NER) in Chinese EMRs is a very important task. However, due to the lack of a Chinese medical dictionary, there are few studies on NER in Chinese EMRs. In this paper, we first build a medical dictionary. We then investigate the effects of different types of features on Chinese clinical NER based on the Conditional Random Fields (CRF) algorithm, the most popular algorithm for NER, including bag-of-characters, part-of-speech, dictionary, and word clustering features. In the experiments, we randomly selected 220 clinical texts from Peking Anzhen Hospital. The results show that these features benefit Chinese named entity recognition to varying degrees. Finally, after analyzing the experimental results, we derive some rules of thumb.
{"title":"Named Entity Recognition in Chinese Electronic Medical Records Based on CRF","authors":"Kaixin Liu, Qingcheng Hu, Jianwei Liu, Chunxiao Xing","doi":"10.1109/WISA.2017.8","DOIUrl":"https://doi.org/10.1109/WISA.2017.8","url":null,"abstract":"Massive Electronic Medical Records (EMRs) contain a lot of knowledge and Named Entity Recognition (NER) in Chinese EMR is a very important task. However, due to the lack of Chinese medical dictionary, there are few studies on NER in Chinese EMR. In this paper, we first build a medical dictionary. We then investigated the effects of different types of features in Chinese clinical NER tasks based on Condition Random Fields (CRF) algorithm, the most popular algorithm for NER, including bag-of-characters, part of speech, dictionary feature, and word clustering features. In the experimental section, we randomly selected 220 clinical texts from Peking Anzhen Hospital. The experimental results showed that these features were beneficial in varying degrees to Chinese named entity recognition. Finally, after analyzing the experimental results, we get some rules of thumb.","PeriodicalId":204706,"journal":{"name":"2017 14th Web Information Systems and Applications Conference (WISA)","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131121891","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Some work on information spreading shows that attention plays an important role in explaining human behavior and that attention decay exists in the process of information spreading. However, few researchers take attention decay into consideration when studying the spreading dynamics of information. In this paper, we propose a susceptible-received-accepted-immune (SRAI) information spreading model that integrates memory, social reinforcement, and attention decay to explore the effect of attention decay on information spreading dynamics. We simulate the model on different complex networks and verify the impact of attention decay on the spreading process. In particular, the simulation results show that in some situations the effect of attention decay decreases as the randomness of the network increases. Our work provides insight into the role of attention decay in information spreading.
{"title":"The Influence of the Attention Decay in an Information Spreading Model","authors":"Zili Xiong, Zaobin Gan, Haifeng Xiang, Hongwei Lu","doi":"10.1109/WISA.2017.13","DOIUrl":"https://doi.org/10.1109/WISA.2017.13","url":null,"abstract":"Some work in information spreading show that the attention plays an important role in explaining human behaviors and the attention decay exists in the process of information spreading. However, few researchers take the attention decay into consideration when studying the spreading dynamics of information. In this paper, we propose a susceptible-received-accepted-immune (SRAI) information spreading model to explore the attention decay's effect on the spread dynamics of information, integrating the memory, the social reinforcement and the attention decay. We simulate the model in different complex networks and verify the impacts of the attention decay on the information spreading process. Particularly, simulation results show that in some situations, the effect of the attention decay will decrease with the increasement of the network's randomness. Our work can provide insights to the understanding of the role of the attention decay in information spreading.","PeriodicalId":204706,"journal":{"name":"2017 14th Web Information Systems and Applications Conference (WISA)","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116192250","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Time series prediction plays an important role in many areas, and multi-step-ahead forecasts, such as river flow and stock price forecasts, can help people make sound decisions. Many predictive models do not work well in multi-step-ahead prediction. LSTM (Long Short-Term Memory) is a recurrent structure in the hidden layer of a recurrent neural network that can capture long-term dependencies in time series. In this paper, we model different types of data patterns, use an LSTM RNN for multi-step-ahead prediction, and compare the prediction results with those of other traditional models.
{"title":"Multi-step Ahead Time Series Forecasting for Different Data Patterns Based on LSTM Recurrent Neural Network","authors":"L. Yunpeng, Hou Di, Bao Junpeng, Qi Yong","doi":"10.1109/WISA.2017.25","DOIUrl":"https://doi.org/10.1109/WISA.2017.25","url":null,"abstract":"Time series prediction problems can play an important role in many areas, and multi-step ahead time series forecast, like river flow forecast, stock price forecast, could help people to make right decisions. Many predictive models do not work very well in multi-step ahead predictions. LSTM (Long Short-Term Memory) is an iterative structure in the hidden layer of the recurrent neural network which could capture the long-term dependency in time series. In this paper, we try to model different types of data patterns, use LSTM RNN for multi-step ahead prediction, and compare the prediction result with other traditional models.","PeriodicalId":204706,"journal":{"name":"2017 14th Web Information Systems and Applications Conference (WISA)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122370533","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}