Pub Date : 2012-06-11 DOI: 10.1109/ISI.2012.6284099
Ming Yang, Hsinchun Chen
Web forums are frequently used as platforms for exchanging information and opinions, as well as for disseminating propaganda. Online content can be misused, however, when the information being distributed, such as radical opinions, is unsolicited or inappropriate. Radical opinion is highly hidden and distributed in Web forums, while non-radical content is unspecific and topically more diverse. It is costly and time-consuming to label a large amount of radical content (positive examples) and non-radical content (negative examples) for training classification systems. Nevertheless, it is easy to obtain large volumes of unlabeled content in Web forums. In this paper, we propose and develop a topic-sensitive partially supervised learning approach to address the difficulties of radical opinion identification in hate group Web forums. Specifically, we design a labeling heuristic to extract high-quality positive and negative examples from unlabeled datasets. The empirical evaluation results from two large hate group Web forums suggest that our proposed approach generally outperforms the benchmark techniques and exhibits more stable performance than its counterparts.
Title: Partially supervised learning for radical opinion identification in hate group web forums. Published in: 2012 IEEE International Conference on Intelligence and Security Informatics.
Pub Date : 2012-06-11 DOI: 10.1109/ISI.2012.6284291
James R. Johnson, Anita Miller, L. Khan, B. Thuraisingham
A detective distributes information on a current case to his law enforcement peers. He quickly receives a computer-generated response with leads identified within hundreds of thousands of previously distributed free-text documents from thousands of other detectives. The challenges lie in the nature of free text: unstructured formats, confusing word usage, cut-and-paste additions, abbreviations, inserted HTML/XML tags, multimedia content, and domain-specific terminology. This research proposes a new data structure, the semantic information structure, which encapsulates the extracted content on classes of information such as people, vehicles, events, organizations, objects, and locations, as well as contextual information about the connections and measures that enable prioritization of files containing related pieces of content. The structure is designed to be the output of automated natural language processing methods that extract entities, expanded entity phrases, and their links, driven by ontologies, DL-safe rules, abductive hypotheses, and semantic composition. Importance and significance measures aid in prioritization.
Title: Extracting semantic information structures from free text law enforcement data
Pub Date : 2012-06-11 DOI: 10.1109/ISI.2012.6284145
A. Badia
One of the main challenges in intelligence work is to assess the trustworthiness of data sources. In an adversarial setting, in which the subjects under study actively try to disturb the data-gathering process, trustworthiness is one of the most important properties of a source. The recent increase in the use of open source data has exacerbated the problem, due to the proliferation of sources. In this paper we propose computerized methods to help analysts evaluate the truthfulness of data sources (open or not). We apply methods developed in database and Semantic Web research to determine data quality (which includes truthfulness but also related aspects such as accuracy and timeliness). Research on data quality has made frequent use of provenance metadata, that is, metadata related to the origin of the data: where it comes from, how and when it was obtained, and any relevant conditions that might help determine how it came to be in its current form. We study the application of similar methods to the particular situation of the intelligence analyst, focusing on trust. This paper describes ongoing research; what is explained here is a first attempt at tackling this complex but very important problem. Due to lack of space, relevant work in the research literature is not discussed, several technical considerations are omitted, and further research directions are only sketched.
Title: Evaluating source trustability with data provenance: A research note
Pub Date : 2012-06-01 DOI: 10.1109/ISI.2012.6284301
Ben-xian Li, Meng-jun Li, Duoyong Sun, Jiang Li, Wenju Li
Estimating the centrality of nodes in terrorist networks is important. In this paper, we present a novel method for evaluating node centrality in terrorism networks. The algorithm is based on the network cohesion degree measures used in social network analysis. Its advantage is that it considers both the degree and the position of a node. The experimental results show that the method has some advantage in efficiency over the betweenness centrality method.
Title: Evaluation method for node importance based on node condensation in terrorism networks
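The abstract above describes a node-importance measure built on network cohesion that accounts for both the degree and the position of a node, but it does not give the formula. As a rough illustration only, the sketch below uses one common formulation from the node-contraction literature, which is an assumption here and not necessarily the authors' exact measure: cohesion is taken as 1 / (n * l), where n is the number of nodes and l the average shortest-path length, and the importance of a node is the relative gain in cohesion after merging that node with its neighbours. The function names and the toy network are hypothetical, and the graph is assumed to be connected and undirected.

```python
from collections import deque


def average_path_length(adj):
    """Mean shortest-path length over all reachable ordered node pairs (BFS)."""
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0


def cohesion(adj):
    """Assumed cohesion measure: 1 / (n * l); defined as 1 for a single node."""
    n = len(adj)
    if n == 1:
        return 1.0
    return 1.0 / (n * average_path_length(adj))


def contract(adj, v):
    """Merge v and its neighbours into one node, kept under the name v."""
    merged = adj[v] | {v}
    new_adj = {
        u: {v if w in merged else w for w in nbrs}
        for u, nbrs in adj.items()
        if u not in merged
    }
    new_adj[v] = {u for u, nbrs in new_adj.items() if v in nbrs}
    return new_adj


def importance(adj, v):
    """Relative gain in cohesion after contracting v (connected graph assumed)."""
    return 1.0 - cohesion(adj) / cohesion(contract(adj, v))


# Hypothetical toy network: a hub "c" with three peripheral contacts.
star = {"c": {"a", "b", "d"}, "a": {"c"}, "b": {"c"}, "d": {"c"}}
scores = {node: importance(star, node) for node in star}
print(sorted(scores.items(), key=lambda kv: -kv[1]))
```

In this toy network the hub scores higher than any peripheral node, because contracting it collapses the whole graph into a single node. Under this formulation a node gains importance both from having many neighbours (more nodes are merged) and from sitting on short paths (the average path length drops more), which matches the degree-and-position idea in the abstract.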
Pub Date : 2012-06-01 DOI: 10.1109/ISI.2012.6284312
Satyen Abrol, L. Khan, V. Khadilkar, B. Thuraisingham, Tyrone Cadenhead
This paper describes a framework, SNODSOC (Stream-based Novel Class Detection for Social Network Analysis), that detects evolving patterns and trends in social microblogs. SNODSOC extends our powerful data mining system, SNOD (Stream-based Novel Class Detection), to detect novel patterns and trends within microblogs.
Title: Design and implementation of SNODSOC: Novel class detection for social network analysis
Pub Date : 2012-06-01 DOI: 10.1109/ISI.2012.6284097
S. Pratt, P. Giabbanelli, Piper J. Jackson, Vijay Mago
Attempts to model insurgency have suffered from several obstacles. Qualitative research may be vague and conflicting, while quantitative research is limited by the difficulty of collecting sufficient data in war and of inferring complex relationships. We propose an innovative combination of Fuzzy Cognitive Maps and Cellular Automata to capture this complexity. Our approach is computational, so it can be used to develop a simulation platform in which military and political analysts can test scenarios. We proceed step by step to illustrate the potential of our approach in a population-centric war, similar to the ongoing campaign in Afghanistan. While the project still requires validation and improvement of the knowledge base by domain experts, as well as construction of accurate simulation scenarios, this example fully specifies the general problem definition and the technical structure of the model.
Title: Rebel with many causes: A computational model of insurgency
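The insurgency abstract above names Fuzzy Cognitive Maps (FCMs) as one of its two building blocks without giving details. The sketch below shows only the standard textbook FCM update rule, in which each concept's activation is recomputed from its own value plus the weighted influence of the other concepts and then squashed by a sigmoid. It is not the authors' model: the three concepts, the weights, and the function names are hypothetical, and the cellular automaton layer (how neighbouring cells would influence each other) is left out entirely.

```python
import math


def sigmoid(x, lam=1.0):
    """Squashing function that keeps activations in the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-lam * x))


def fcm_step(state, weights):
    """One synchronous update: A_i <- f(A_i + sum over j != i of w[j][i] * A_j)."""
    n = len(state)
    return [
        sigmoid(state[i] + sum(weights[j][i] * state[j] for j in range(n) if j != i))
        for i in range(n)
    ]


def run_fcm(state, weights, max_steps=100, tol=1e-6):
    """Iterate until activations change by less than tol, or max_steps is reached."""
    for _ in range(max_steps):
        nxt = fcm_step(state, weights)
        if max(abs(a - b) for a, b in zip(nxt, state)) < tol:
            return nxt
        state = nxt
    return state


# Hypothetical concepts, for illustration only:
# 0 = grievance, 1 = security presence, 2 = support for insurgents.
# weights[j][i] is the causal influence of concept j on concept i, in [-1, 1].
weights = [
    [0.0, 0.0, 0.7],   # grievance raises insurgent support
    [-0.3, 0.0, -0.5], # security presence lowers grievance and support
    [0.0, 0.4, 0.0],   # insurgent support draws more security presence
]
final_state = run_fcm([0.8, 0.2, 0.1], weights)
print([round(a, 3) for a in final_state])
```

In a combined model of the kind the abstract describes, one would presumably attach a map like this to each cell of a grid and let some concepts depend on neighbouring cells; that coupling is an open design choice the abstract does not specify.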
In the public policy community, a need exists for a software tool that comprehensively applies leading concepts of counterterrorism (CT) analysis and planning to systematically display, track, operationalize, and update on an analyst's computer all the decisions and processes involved in countering an ongoing terrorist insurgency facing a targeted adversary. Initially, such a prototype would be employed for analytical and training purposes, with a more advanced version used in a government's actual CT campaign. To the author's knowledge, no such comprehensive software tool exists at present to address any of these requirements. This research note, which is of a preliminary nature, is primarily descriptive and does not discuss how such a tool would be operationalized into a software program, since that would involve releasing proprietary intellectual property that is intended for commercial purposes. The author hopes that this preliminary research note will generate interest in a collaborative effort with other researchers and software developers to create and operationalize such a software tool kit.
Title: Research note: Concept to develop a software-based counter-terrorism campaign decision support tool. Author: Joshua Sinai. DOI: 10.1145/1938606.1938612. Publication date: 2010-07-25. Listed under: 2012 IEEE International Conference on Intelligence and Security Informatics.