Detecting Multi-Relationship Links in Sparse Datasets
Dong Nie, M. Roantree
International Conference on Enterprise Information Systems, 2019-05-03
DOI: https://doi.org/10.5220/0007696901490157
In application areas such as healthcare and insurance, many patients or clients have their lifetime record spread across the databases of different providers. Record linkage is the task of using algorithms to identify the same individual across different datasets. Where unique identifiers exist, linking those records is trivial. However, very large numbers of individuals cannot be matched because common identifiers do not exist across datasets and their identifying information is inexact or, often, quite different (e.g. after a change of address). In this research, we provide a new approach to record linkage that also detects relationships between customers (e.g. family members). We present a validation that highlights the best parameter and configuration settings for the types of relationship links required.