A. Vakilian, Yodsawalai Chodpathumwan, Arash Termehchy, A. Nayyeri
Cost-Effective Conceptual Design Over Taxonomies
Proceedings of the 20th International Workshop on the Web and Databases, 2017-05-14
DOI: https://doi.org/10.1145/3068839.3068841
It is known that annotating entities in unstructured and semistructured datasets with their concepts improves the effectiveness of answering queries over these datasets. Ideally, one would like to annotate the entities of all relevant concepts in a dataset. However, annotating concepts in large datasets takes substantial time and computational resources, and an organization may have sufficient resources to annotate only a subset of the relevant concepts. The organization would then want to annotate the subset of concepts that provides the most effective answers to queries over the dataset. We propose a formal framework that quantifies the amount by which annotating entities of concepts from a taxonomy in a dataset improves the effectiveness of answering queries over the dataset. Because the problem of finding the most effective subset is NP-hard, we propose an efficient approximation algorithm for it. Our extensive empirical studies validate our framework and show the accuracy and efficiency of our algorithm.
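To make the selection problem concrete, here is a minimal sketch of choosing concepts under a fixed annotation budget. It is not the approximation algorithm proposed in the paper, which the abstract does not describe. It is a generic benefit-per-cost greedy heuristic, and it assumes each concept has an independent, additive benefit, which ignores the taxonomy structure the paper's framework accounts for. The `Concept` class, the `greedy_select` function, and all costs and benefits below are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    name: str
    cost: float     # hypothetical cost of annotating this concept (must be > 0)
    benefit: float  # hypothetical gain in query-answering effectiveness


def greedy_select(concepts, budget):
    """Pick concepts in descending benefit/cost order while they fit the budget.

    Assumption: benefits are independent and additive. This is a generic
    heuristic for illustration, not the paper's algorithm.
    """
    chosen = []
    remaining = budget
    ranked = sorted(concepts, key=lambda c: c.benefit / c.cost, reverse=True)
    for concept in ranked:
        if concept.cost <= remaining:
            chosen.append(concept)
            remaining -= concept.cost
    return chosen


# Made-up example values, chosen only to exercise the function.
concepts = [
    Concept("person", cost=4.0, benefit=8.0),
    Concept("athlete", cost=3.0, benefit=3.0),
    Concept("city", cost=2.0, benefit=5.0),
]
selected = greedy_select(concepts, budget=6.0)
print([c.name for c in selected])
```

With a budget of 6.0, the sketch picks "city" first (the highest benefit per unit cost), then "person", and skips "athlete" because no budget remains. A ratio-based greedy rule like this can be far from optimal in general, which is one reason a problem of this kind calls for a purpose-built approximation algorithm.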