Identifying Counterfactual Queries with the R Package cfid
Santtu Tikka
R Journal, published 2023-11-01
DOI: https://doi.org/10.32614/rj-2023-053
In the framework of structural causal models, counterfactual queries describe events that concern multiple alternative states of the system under study. Counterfactual queries often take the form of "what if" questions such as "would an applicant have been hired if they had over 10 years of experience, when in reality they only had 5 years of experience?" Such questions, and counterfactual inference in general, are crucial, for example when addressing the problem of fairness in decision-making. Because counterfactual events contain contradictory states of the world, it is impossible to conduct a randomized experiment to address them without making several restrictive assumptions. However, it is sometimes possible to identify such queries from observational and experimental data by representing the system under study as a causal model, and the available data as symbolic probability distributions. Shpitser and Pearl (2007) constructed two algorithms, called ID* and IDC*, for identifying counterfactual queries and conditional counterfactual queries, respectively. These two algorithms are analogous to the ID and IDC algorithms by Shpitser and Pearl (2006a, 2006b) for identification of interventional distributions, which were implemented in R by Tikka and Karvanen (2017) in the causaleffect package. We present the R package [cfid](https://CRAN.R-project.org/package=cfid) that implements the ID* and IDC* algorithms. Identification of counterfactual queries and the features of cfid are demonstrated via examples.
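As a rough illustration of the workflow the abstract describes, the following sketch asks cfid whether one conditional counterfactual query is identifiable. The graph, the variable names, and the value levels are illustrative assumptions, not taken from the abstract, and the calls (`dag()`, `cf()`, `conj()`, `identifiable()`) reflect our reading of the package documentation, so they should be checked against the installed version.

```r
# Minimal sketch, assuming the cfid package is installed from CRAN.
library(cfid)

# Illustrative causal graph (an assumption for this example):
# directed edges use "->", and "<->" marks a latent confounder of X and Y.
g <- dag("X -> W -> Y <- Z <- D X <-> Y")

# Counterfactual variables: cf(variable, value level, interventions).
v1 <- cf("Y", 0, c(X = 0))  # Y taking level 0 under the intervention do(X = 0)
v2 <- cf("X", 1)            # X observed at a different level
v3 <- cf("Z", 0, c(D = 0))  # Z taking level 0 under the intervention do(D = 0)
v4 <- cf("D", 0)            # D observed at level 0

# Conjunctions: gamma is the query event, delta the conditioning event.
gamma <- conj(v1)
delta <- conj(v2, v3, v4)

# Supplying a conditioning conjunction makes this a conditional
# counterfactual query, the case the abstract assigns to IDC*.
result <- identifiable(g, gamma, delta)
print(result)
```

Printing the result reports whether the query is identifiable and, if it is, an identifying formula in terms of the available distributions.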
Journal: R Journal
Categories: Computer Science, Interdisciplinary Applications; Statistics & Probability
CiteScore: 2.70
Self-citation rate: 0.00%
Articles published: 40
Review time: >12 weeks
About the journal:
The R Journal is the open access, refereed journal of the R project for statistical computing. It features short to medium length articles covering topics that should be of interest to users or developers of R.
The R Journal intends to reach a wide audience and have a thorough review process. Papers are expected to be reasonably short, clearly written, not too technical, and of course focused on R. Authors of refereed articles should take care to:
- put their contribution in context, in particular discuss related R functions or packages;
- explain the motivation for their contribution;
- provide code examples that are reproducible.