Causal Discovery From Unknown Interventional Datasets Over Overlapping Variable Sets
Authors: Fuyuan Cao; Yunxia Wang; Kui Yu; Jiye Liang
Journal: IEEE Transactions on Knowledge and Data Engineering, vol. 36, no. 12, pp. 7725-7742
Publication date: 2024-08-15
Publication type: Journal Article
DOI: 10.1109/TKDE.2024.3443997
URL: https://ieeexplore.ieee.org/document/10637995/
Journal metrics: impact factor 8.9; JCR Q1 (Computer Science, Artificial Intelligence)
Inferring causal structures from experimental data is a challenging task in many fields. Most causal structure learning algorithms for unknown interventions assume that all datasets share an identical variable set. However, because of privacy, ethical, financial, and practical concerns, the variable sets observed by different sources or domains are often not entirely identical. A few algorithms have been proposed to handle partially overlapping variable sets, but they focus on the case of known intervention targets. To better reflect real-world settings, we therefore consider discovering causal relationships over overlapping variable sets under unknown interventions, in a scenario where one problem is studied across multiple domains. We propose an algorithm that discovers the causal relationships over the integrated set of variables from unknown interventions, mainly by handling the entangled inconsistencies caused by the incomplete observation of variables and by unknown intervention targets. Specifically, we first distinguish two types of inconsistencies and then deal with each of them by presenting several lemmas. Finally, we construct a fusion rule that combines the structures learned in the individual domains, yielding the final structures over the integrated set of variables. Theoretical analysis and experimental results on synthetic, benchmark, and real-world datasets verify the effectiveness of the proposed algorithm.
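To make the setting concrete, the sketch below shows what "overlapping variable sets" and "inconsistencies between domains" can look like in code. It is a deliberately naive illustration and not the fusion rule or the lemmas of the paper, which the abstract does not spell out. The function `integrate`, the domain names `D1` to `D3`, and the toy graphs are all hypothetical. Each domain contributes a set of observed variables and a set of directed edges; edges are merged over the union of the variables, and any pair of variables on which two domains disagree is flagged rather than resolved.

```python
from itertools import combinations


def integrate(domains):
    """Naively merge per-domain directed edges (illustrative only).

    domains maps a domain name to (variables, edges), where edges is a
    set of (cause, effect) tuples over that domain's variables.
    Returns (all_vars, fused_edges, conflicts).
    """
    all_vars = set().union(*(variables for variables, _ in domains.values()))

    # For every pair observed together in some domain, record what that
    # domain claims: an oriented edge, or None for "no edge".
    votes = {}
    for variables, edges in domains.values():
        for a, b in combinations(sorted(variables), 2):
            if (a, b) in edges:
                state = (a, b)
            elif (b, a) in edges:
                state = (b, a)
            else:
                state = None
            votes.setdefault((a, b), set()).add(state)

    fused, conflicts = set(), {}
    for pair, states in votes.items():
        if len(states) == 1:
            (state,) = states
            if state is not None:
                fused.add(state)  # all observing domains agree on this edge
        else:
            conflicts[pair] = states  # domains disagree: flag, do not resolve
    return all_vars, fused, conflicts


# Hypothetical toy input: D3 does not observe B, so the chain A -> B -> C
# shows up there as a direct edge A -> C.
domains = {
    "D1": ({"A", "B", "C"}, {("A", "B"), ("B", "C")}),
    "D2": ({"B", "C", "D"}, {("B", "C"), ("C", "D")}),
    "D3": ({"A", "C"}, {("A", "C")}),
}

all_vars, fused, conflicts = integrate(domains)
print(sorted(all_vars))
print(sorted(fused))
print(conflicts)
```

In this toy input the only flagged pair is `("A", "C")`: `D1` observes the mediator `B` and reports no direct edge, while `D3` does not observe `B` and reports `A -> C`. This is one plausible way incomplete observation can produce a disagreement. Unknown intervention targets, the second source of inconsistency named in the abstract, are not modeled here.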
About the journal:
The IEEE Transactions on Knowledge and Data Engineering encompasses knowledge and data engineering aspects within computer science, artificial intelligence, electrical engineering, computer engineering, and related fields. It provides an interdisciplinary platform for disseminating new developments in knowledge and data engineering and explores the practicality of these concepts in both hardware and software. Specific areas covered include knowledge-based and expert systems, AI techniques for knowledge and data management, tools, and methodologies, distributed processing, real-time systems, architectures, data management practices, database design, query languages, security, fault tolerance, statistical databases, algorithms, performance evaluation, and applications.