Pub Date: 2017-07-01 | DOI: 10.1109/ISI.2017.8004881
C. Cai, Linjing Li, D. Zeng
Public sentiment expressed through social media is usually regarded as an important measure for public opinion monitoring, policy making, and so forth. However, the deluge of user-generated content on the web, especially on social platforms, poses a great challenge to public sentiment analysis. Web-derived Emotional Word Detection (WEWD) has therefore been proposed as a fundamental tool to alleviate this problem. Most previous work on WEWD focuses on rules, syntax, and sentence structure; few approaches utilize semantic information, which has the potential to further increase the accuracy and efficiency of WEWD. In this paper, we propose a Global-Local Latent Semantic (GLLS) framework for WEWD that makes full use of latent semantic information with the help of multiple-sense word embedding. We devise two computational WEWD models, called Ensemble GLLS (EGLLS) and Deep GLLS (DGLLS). EGLLS uses ensemble learning to fuse the global and local latent semantics, while DGLLS uses a deep neural network. We also design an old-new corpus enrichment technique to increase the effectiveness of the overall training and detection process. To the best of our knowledge, this is the first work that applies multiple-sense word embedding and deep neural networks to WEWD-related tasks. Experiments on real datasets demonstrate the effectiveness of the proposed idea and methods.
Title: Web-derived Emotional Word Detection in social media using Latent Semantic information
Venue: 2017 IEEE International Conference on Intelligence and Security Informatics (ISI)
Pub Date: 2017-07-01 | DOI: 10.1109/ISI.2017.8004887
C. Cai, Linjing Li, D. Zeng
Social bots are regarded as the most common kind of malware on social platforms. They can produce fake messages, spread rumours, and even manipulate public opinion. Recently, social bots have been created in massive numbers and have spread widely across social platforms, with negative effects on the public and on netizen security. Bot detection aims to distinguish bots from humans and has attracted increasing attention in recent years. In this paper, we propose a behavior enhanced deep model (BeDM) for bot detection. The proposed model treats user content as temporal text data rather than plain text in order to extract latent temporal patterns. Moreover, BeDM fuses content information and behavior information using deep learning. To the best of our knowledge, this is the first attempt to apply deep neural networks to bot detection. Experiments on a real-world dataset collected from Twitter also demonstrate the effectiveness of the proposed model.
Title: Behavior enhanced deep bot detection in social media
Venue: 2017 IEEE International Conference on Intelligence and Security Informatics (ISI)
Pub Date: 2017-07-01 | DOI: 10.1109/ISI.2017.8004890
Hongyuan Ma, Ou Tao, Chunlu Zhao, Pengxiao Li, Lihong Wang
Caching query results is an efficient technique for Web search engines. A state-of-the-art approach named Static-Dynamic Cache (SDC) is widely used in practice. The replacement policy is the key factor in the performance of a cache system, and policies such as LIRS, ARC, CLOCK, SKLRU and RANDOM have been widely studied in different research areas. In this paper, we discuss replacement policies for the static-dynamic cache and conduct experiments on real large-scale query logs from two well-known commercial Web search engine companies. The experimental results show that the ARC replacement policy can work well with the static-dynamic cache, especially for large-scale query results caches.
Title: Impact of replacement policies on static-dynamic query results cache in web search engines
Venue: 2017 IEEE International Conference on Intelligence and Security Informatics (ISI)
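The static-dynamic cache named in the abstract above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the static part is assumed to hold the most frequent queries of a past log and is never evicted, and the dynamic part uses plain LRU as a stand-in for the policies the paper compares (LIRS, ARC, CLOCK, SKLRU, RANDOM). The class name `StaticDynamicCache` and its parameters are hypothetical.

```python
from collections import Counter, OrderedDict


class StaticDynamicCache:
    """Toy static-dynamic query cache (hypothetical sketch, LRU dynamic part)."""

    def __init__(self, training_queries, static_size, dynamic_size):
        # Static part: the most frequent queries in a historical log.
        # Assumption: it is filled once and never evicted.
        top = Counter(training_queries).most_common(static_size)
        self.static = {query for query, _ in top}
        # Dynamic part: an LRU list; the oldest entry sits at the front.
        self.dynamic = OrderedDict()
        self.dynamic_size = dynamic_size
        self.hits = 0
        self.misses = 0

    def lookup(self, query):
        """Return True on a cache hit; on a miss, admit the query to the dynamic part."""
        if query in self.static:
            self.hits += 1
            return True
        if query in self.dynamic:
            self.dynamic.move_to_end(query)  # mark as most recently used
            self.hits += 1
            return True
        self.misses += 1
        if self.dynamic_size > 0:
            if len(self.dynamic) >= self.dynamic_size:
                self.dynamic.popitem(last=False)  # evict least recently used
            self.dynamic[query] = True
        return False

    def hit_ratio(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


if __name__ == "__main__":
    past_log = ["a", "a", "a", "b", "b", "c"]
    cache = StaticDynamicCache(past_log, static_size=1, dynamic_size=2)
    for q in ["a", "b", "c", "b", "d", "c"]:
        cache.lookup(q)
    print(cache.hit_ratio())
```

Swapping the replacement policy only touches the dynamic branch of `lookup`, which is why the dynamic part is kept separate from the static set here.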
This paper focuses on one type of Covert Storage Channel (CSC) that uses the 6-bit TCP flag header in TCP/IP network packets to transmit secret messages between accomplices. We use relative entropy to characterize the irregularity of network flows in comparison to normal traffic. A normal profile is created from the frequency distribution of TCP flags in regular traffic packets. In detection, the TCP flag frequency distribution of network traffic is computed for each unique IP pair. To evaluate the accuracy and efficiency of the proposed method, this study uses real regular traffic data sets as well as CSC messages produced by coding schemes under two assumptions: clear text, composed of a list of keywords common in Unix systems, and encrypted text. Moreover, smart accomplices may use only those TCP flags that appear in normal traffic. In detection, the relative entropy can then reveal the dissimilarity between a different frequency distribution and this normal profile. We have also used two data processing methods in detection: one summarizes all the packets for a pair of IP addresses into one flow, and the other slides a window over such a flow to generate multiple frames of packets. The experimental results, displayed as Receiver Operating Characteristic (ROC) curves, show that the method is promising for differentiating normal and CSC traffic packet streams. Furthermore, the delay in raising an alert is analyzed for CSC messages to show the method's efficiency.
Title: Raising flags: Detecting covert storage channels using relative entropy
Authors: Josephine K. Chow, Xiangyang Li, X. Mountrouidou
Pub Date: 2017-03-08 | DOI: 10.1145/3017680.3022454
Venue: 2017 IEEE International Conference on Intelligence and Security Informatics (ISI)
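The relative-entropy comparison described in the abstract above can be illustrated with a short sketch. It assumes that each packet is reduced to a flag label such as "ACK" or "SYN", that the score is the Kullback-Leibler divergence D(p || q) of a flow's flag distribution p from the normal profile q, and that additive smoothing (`alpha`) is used so that flags unseen in normal traffic do not produce a division by zero. The smoothing, the function names, and the toy traffic are illustrative assumptions and are not taken from the paper.

```python
import math
from collections import Counter


def flag_distribution(flags, support, alpha=1.0):
    """Smoothed frequency distribution of flag labels over a fixed support."""
    counts = Counter(flags)
    total = sum(counts[f] for f in support) + alpha * len(support)
    return {f: (counts[f] + alpha) / total for f in support}


def relative_entropy(p, q):
    """D(p || q) in bits; p and q share the same keys and q is strictly positive."""
    return sum(p[f] * math.log2(p[f] / q[f]) for f in p if p[f] > 0)


def score_flow(normal_flags, flow_flags, alpha=1.0):
    """Score one flow (all packets of an IP pair) against the normal profile."""
    # alpha must be > 0 so that every flag in the support has nonzero mass in q.
    support = sorted(set(normal_flags) | set(flow_flags))
    q = flag_distribution(normal_flags, support, alpha)
    p = flag_distribution(flow_flags, support, alpha)
    return relative_entropy(p, q)


def score_windows(normal_flags, flow_flags, size, step=1, alpha=1.0):
    """Score successive frames of `size` packets taken by sliding a window over the flow."""
    return [
        score_flow(normal_flags, flow_flags[i:i + size], alpha)
        for i in range(0, len(flow_flags) - size + 1, step)
    ]


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    normal = ["ACK"] * 80 + ["SYN"] * 10 + ["FIN"] * 10
    benign = ["ACK"] * 8 + ["SYN"] + ["FIN"]
    covert = ["URG", "RST", "SYN", "FIN"] * 25
    print(score_flow(normal, benign), score_flow(normal, covert))
```

A detector would raise an alert when a score exceeds a threshold; sweeping that threshold is what produces an ROC curve, and the windowed variant lets an alert fire before the whole flow has been seen.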