Learning Bayesian Networks: A Copula Approach for Mixed-Type Data
Author: Federico Castelletti
Journal: Psychometrika
Publication date: 2024-04-12
DOI: https://doi.org/10.1007/s11336-024-09969-2
Abstract
Estimating dependence relationships between variables is a crucial issue in many applied domains, and in psychology in particular. When several variables are considered, they can be organized into a network that encodes their conditional dependence relations. Typically, however, the underlying network structure is completely unknown or can only be partially drawn; accordingly, it must be learned from the available data, a process known as structure learning. In addition, data arising from social and psychological studies are often of different types, as they can include categorical, discrete and continuous measurements. In this paper, we develop a novel Bayesian methodology for structure learning of directed networks which applies to mixed data, i.e., data possibly containing continuous, discrete, ordinal and binary variables simultaneously. Whenever known dependence structures among variables are available, represented by paths or edge directions that can be postulated in advance based on the specific problem under consideration, our method can easily incorporate them. We evaluate the proposed method through extensive simulation studies, where it compares well with current state-of-the-art alternative methods. Finally, we apply our methodology to well-being data from a social survey promoted by the United Nations, and to mental health data collected from a cohort of medical students. R code implementing the proposed methodology is available at https://github.com/FedeCastelletti/bayes_networks_mixed_data.
About the journal:
The journal Psychometrika is devoted to the advancement of theory and methodology for behavioral data in psychology, education, and the social and behavioral sciences generally. Its coverage is offered in two sections: Theory and Methods (T&M), and Application Reviews and Case Studies (ARCS). T&M articles present original research and reviews on the development of quantitative models, statistical methods, and mathematical techniques for evaluating data from psychology, the social and behavioral sciences, and related fields. Application Reviews can be integrative, drawing together disparate methodologies for applications, or comparative and evaluative, discussing advantages and disadvantages of one or more methodologies in applications. Case Studies highlight methodology that deepens understanding of substantive phenomena through more informative data analysis or more elegant data description.