Relation recognition among named entities from a crime corpus using a web-based semantic similarity measurement
Priyanka Das, A. Das
2017 Third International Conference on Research in Computational Intelligence and Communication Networks (ICRCICN), 2017-11-01
DOI: https://doi.org/10.1109/ICRCICN.2017.8234525
The present work proposes an unsupervised approach for recognising relations between named entities in a large corpus on crime in Indian states and union territories. First, named entities are identified in the extracted crime corpus, and certain entity pairs that facilitate crime analysis are selected. Each entity pair, together with its intermediate context words, is then represented as a shallow parse tree that serves as a relation instance. From the parse trees, only the head words in each entity pair, which reflect the main meaning of the phrases, are considered when measuring semantic similarity: a web search engine retrieves the page counts of those words and of their conjunctions. The page counts are used to compute the Simpson coefficient between the pairs, and based on this similarity score an agglomerative hierarchical clustering technique groups entity pairs that share the same relationship into clusters. Each resulting cluster is characterised by the most frequent head word in the group. The proposed method offers a simple similarity measure for relation extraction from crime data and provides better accuracy than other existing methods.
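The similarity and clustering steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the standard overlap (Simpson) coefficient, H(P AND Q) / min(H(P), H(Q)), where H is a page count, and it uses average linkage with a stopping threshold, neither of which the abstract specifies. The page counts, the head words, the threshold, and all function names are hypothetical; a real run would obtain the counts from a search engine.

```python
from itertools import combinations


def simpson_coefficient(count_p, count_q, count_pq):
    """Overlap (Simpson) coefficient from page counts: H(P AND Q) / min(H(P), H(Q))."""
    smaller = min(count_p, count_q)
    if smaller == 0:
        return 0.0  # no pages for one of the words, so no evidence of similarity
    return count_pq / smaller


def agglomerative_clusters(items, similarity, threshold):
    """Average-linkage agglomerative clustering.

    Repeatedly merges the two most similar clusters and stops once the best
    available similarity falls below `threshold` (an assumed stopping rule).
    """
    clusters = [[item] for item in items]

    def linkage(a, b):
        # Mean pairwise similarity between members of two clusters.
        return sum(similarity(x, y) for x in a for y in b) / (len(a) * len(b))

    while len(clusters) > 1:
        (i, j), best = max(
            (((i, j), linkage(clusters[i], clusters[j]))
             for i, j in combinations(range(len(clusters)), 2)),
            key=lambda scored: scored[1],
        )
        if best < threshold:
            break
        clusters[i] = clusters[i] + clusters[j]  # i < j, so deleting j is safe
        del clusters[j]
    return clusters


# Hypothetical page counts for single head words and for their conjunctions.
SINGLE_COUNTS = {"arrested": 1000, "detained": 800, "killed": 1200, "murdered": 600}
JOINT_COUNTS = {
    frozenset({"arrested", "detained"}): 480,
    frozenset({"killed", "murdered"}): 420,
    frozenset({"arrested", "killed"}): 50,
    frozenset({"arrested", "murdered"}): 30,
    frozenset({"detained", "killed"}): 40,
    frozenset({"detained", "murdered"}): 30,
}


def head_word_similarity(w1, w2):
    """Simpson coefficient between two head words, using the counts above."""
    if w1 == w2:
        return 1.0
    joint = JOINT_COUNTS.get(frozenset({w1, w2}), 0)
    return simpson_coefficient(SINGLE_COUNTS[w1], SINGLE_COUNTS[w2], joint)


clusters = agglomerative_clusters(list(SINGLE_COUNTS), head_word_similarity, threshold=0.5)
print(clusters)
```

Here each head word stands in for the entity pair it was extracted from. With these made-up counts, "arrested" and "detained" end up in one group and "killed" and "murdered" in another, because their cross-group similarity (0.05) is below the assumed threshold of 0.5.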