Extracting semantic information structures from free text law enforcement data

2012 IEEE International Conference on Intelligence and Security Informatics. Pub Date: 2012-06-11. DOI: 10.1109/ISI.2012.6284291

James R. Johnson, Anita Miller, L. Khan, B. Thuraisingham


Citations: 10

Abstract

A detective distributes information on a current case to his law enforcement peers. He quickly receives a computer-generated response with leads identified within hundreds of thousands of previously distributed free text documents from thousands of other detectives. The challenges lie in the nature of free text: unstructured formats, confusing word usage, cut-and-paste additions, abbreviations, inserted HTML/XML tags, multimedia content, and domain-specific terminology. This research proposes a new data structure, the semantic information structure, which encapsulates the extracted content information on classes of information such as people, vehicles, events, organizations, objects, and locations, as well as the contextual information about the connections and measures that enable prioritization of files containing related pieces of content. The structure is designed to be produced by automated natural language processing methods that extract entities, expanded entity phrases, and their links, driven by ontologies, DLSafe rules, abductive hypotheses, and semantic composition. Importance and significance measures aid in prioritization.
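The abstract names the ingredients of the semantic information structure (classes of entities, expanded entity phrases, links, and measures used for prioritization) but not its concrete layout. The following is only a minimal sketch of how such a record might be represented, not the authors' design. Every name in it (EntityClass, Entity, Link, SemanticInfoStructure, prioritize) is hypothetical, and the ranking rule (count shared entities, break ties by a single importance value) is an assumed stand-in for the paper's importance and significance measures, which the abstract does not define.

```python
from dataclasses import dataclass, field
from enum import Enum


class EntityClass(Enum):
    # The six classes of information listed in the abstract.
    PERSON = "person"
    VEHICLE = "vehicle"
    EVENT = "event"
    ORGANIZATION = "organization"
    OBJECT = "object"
    LOCATION = "location"


@dataclass(frozen=True)
class Entity:
    entity_class: EntityClass
    phrase: str  # expanded entity phrase, e.g. "red pickup truck"


@dataclass(frozen=True)
class Link:
    source: Entity
    target: Entity
    relation: str  # hypothetical relation label, e.g. "drives"


@dataclass
class SemanticInfoStructure:
    document_id: str
    entities: set[Entity] = field(default_factory=set)
    links: list[Link] = field(default_factory=list)
    importance: float = 0.0  # assumed placeholder for the paper's measures


def prioritize(query: SemanticInfoStructure,
               candidates: list[SemanticInfoStructure]) -> list[str]:
    """Rank candidate documents by entities shared with the query.

    Assumed rule for illustration only: more shared entities first,
    then higher importance, then document id for a stable order.
    Documents sharing no entity with the query are dropped.
    """
    scored = []
    for candidate in candidates:
        shared = len(query.entities & candidate.entities)
        if shared > 0:
            scored.append((shared, candidate.importance, candidate.document_id))
    scored.sort(key=lambda item: (-item[0], -item[1], item[2]))
    return [document_id for _, _, document_id in scored]
```

Under these assumptions, a query record built from a current case would be compared against records extracted from earlier documents, and the returned list of document ids gives the order in which a detective might review them.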
