Proceedings of machine learning research: latest articles

Fairness-Aware Class Imbalanced Learning on Multiple Subgroups.
Davoud Ataee Tarzanagh, Bojian Hou, Boning Tong, Qi Long, Li Shen

We present a novel Bayesian-based optimization framework that addresses the challenge of generalization in overparameterized models when dealing with imbalanced subgroups and limited samples per subgroup. Our proposed tri-level optimization framework utilizes local predictors, which are trained on a small amount of data, as well as a fair and class-balanced predictor at the middle and lower levels. To effectively overcome saddle points for minority classes, our lower-level formulation incorporates sharpness-aware minimization. Meanwhile, at the upper level, the framework dynamically adjusts the loss function based on validation loss, ensuring a close alignment between the global predictor and local predictors. Theoretical analysis demonstrates the framework's ability to enhance classification and fairness generalization, potentially resulting in improvements in the generalization bound. Empirical results validate the superior performance of our tri-level framework compared to existing state-of-the-art approaches. The source code can be found at https://github.com/PennShenLab/FACIMS.
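
The sharpness-aware minimization used at the lower level can be illustrated in isolation. The following is a minimal sketch of one generic SAM update on a toy quadratic loss; it is not the authors' tri-level implementation (see the linked repository for that). The function names, the toy loss, the radius `rho`, and the learning rate `lr` are all assumptions made for illustration.

```python
import math

def loss(w, curvatures):
    """Toy loss 0.5 * sum(c_i * w_i**2), a stand-in for a training loss."""
    return 0.5 * sum(c * wi * wi for c, wi in zip(curvatures, w))

def grad(w, curvatures):
    """Gradient of the toy loss."""
    return [c * wi for c, wi in zip(curvatures, w)]

def sam_step(w, curvatures, lr=0.1, rho=0.05):
    """One sharpness-aware minimization step.

    1. Perturb the weights along the normalized gradient, to the
       boundary of an L2 ball of radius rho (approximate worst case).
    2. Apply the gradient computed at the perturbed point to the
       original weights.
    """
    g = grad(w, curvatures)
    norm = math.sqrt(sum(gi * gi for gi in g))
    if norm == 0.0:  # stationary point: nothing to do
        return list(w)
    perturbed = [wi + rho * gi / norm for wi, gi in zip(w, g)]
    g_sharp = grad(perturbed, curvatures)
    return [wi - lr * gi for wi, gi in zip(w, g_sharp)]

curvatures = [10.0, 0.1]  # one sharp and one flat direction (hypothetical)
w = [1.0, 1.0]
initial_loss = loss(w, curvatures)
for _ in range(50):
    w = sam_step(w, curvatures)
```

Because the gradient is evaluated at the perturbed point, the update penalizes directions in which the loss rises quickly within the `rho`-ball, which is the intuition behind preferring flatter minima.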

Proceedings of machine learning research, vol. 216, pp. 2123-2133. Published 2023-08-01. PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11003754/pdf/
Citations: 0
Maximum Likelihood Estimation of Flexible Survival Densities with Importance Sampling.
Mert Ketenci, Shreyas Bhave, Noémie Elhadad, Adler Perotte

Survival analysis is a widely used technique for analyzing time-to-event data in the presence of censoring. In recent years, numerous survival analysis methods have emerged that scale to large datasets and relax traditional assumptions such as proportional hazards. These models, while performant, are very sensitive to model hyperparameters, including (1) the number of bins and the bin size for discrete models and (2) the number of cluster assignments for mixture-based models. Each of these choices requires extensive tuning by practitioners to achieve optimal performance. In addition, we demonstrate in empirical studies that (1) the optimal bin size may differ drastically depending on the metric of interest (e.g., concordance vs. Brier score), and (2) mixture models may suffer from mode collapse and numerical instability. We propose a survival analysis approach that eliminates the need to tune hyperparameters such as mixture assignments and bin sizes, reducing the burden on practitioners. We show that the proposed approach matches or outperforms baselines on several real-world datasets.
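
To make the bin hyperparameters concrete, here is a minimal sketch of the censored likelihood of a generic discrete-time survival model, the kind of baseline whose bin count and bin size must be tuned. It is not the method proposed in the paper. The function names, the bin edges, and the probabilities are hypothetical; the last bin is assumed to be open-ended, and times are assumed to be non-negative.

```python
import bisect
import math

def bin_index(t, edges):
    """Index of the bin containing time t.

    Bin k covers [edges[k], edges[k + 1]); the last bin is open-ended.
    Assumes edges is sorted and t >= edges[0].
    """
    return bisect.bisect_right(edges, t) - 1

def discrete_nll(times, events, probs, edges, floor=1e-12):
    """Negative log-likelihood of a discrete-time survival model.

    probs[k] is the modeled probability that the event falls in bin k.
    An observed event contributes log probs[k]; a censored record
    contributes the log-probability of surviving past its bin.
    The floor guards against log(0) for records censored in the last bin.
    """
    nll = 0.0
    for t, observed in zip(times, events):
        k = bin_index(t, edges)
        if observed:
            nll -= math.log(max(probs[k], floor))
        else:
            nll -= math.log(max(sum(probs[k + 1:]), floor))
    return nll

edges = [0.0, 2.0, 4.0, 6.0]    # hypothetical bin edges
probs = [0.4, 0.3, 0.2, 0.1]    # hypothetical event probabilities per bin
times = [1.0, 3.0, 5.0]
events = [True, False, True]    # the second record is censored
nll = discrete_nll(times, events, probs, edges)
```

Changing `edges` changes both which bin each record lands in and how much probability mass a censored record retains, which is one way to see why results can be sensitive to the discretization.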

Proceedings of machine learning research, vol. 219, pp. 360-380. Published 2023-08-01. PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11441640/pdf/
Citations: 0
Bringing At-home Pediatric Sleep Apnea Testing Closer to Reality: A Multi-modal Transformer Approach.
Hamed Fayyaz, Abigail Strang, Rahmatollah Beheshti

Sleep apnea in children is a major health problem affecting one to five percent of children in the US. If not treated in a timely manner, it can also lead to other physical and mental health issues. Pediatric sleep apnea has different clinical causes and characteristics than sleep apnea in adults. Despite a large body of work on adult apnea, pediatric sleep apnea has been studied in a much more limited fashion. Relatedly, at-home sleep apnea testing tools and algorithmic methods for automatic detection of sleep apnea are widely available for adults, but not for children. In this study, we target this gap by presenting a machine learning-based model for detecting apnea events from commonly collected sleep signals. We show that our method outperforms state-of-the-art methods across two public datasets, as determined by the F1-score and AUROC measures. Additionally, we show that using two of the signals that are easier to collect at home (ECG and SpO2) can also achieve very competitive results, potentially addressing the concerns about collecting various sleep signals from children outside the clinic. Therefore, our study can greatly inform ongoing progress toward increasing the accessibility of pediatric sleep apnea testing and improving the timeliness of treatment interventions.

Proceedings of machine learning research, vol. 219, pp. 167-185. Published 2023-08-01. PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10854997/pdf/
Citations: 0
A Computational Framework for EEG Causal Oscillatory Connectivity.
Eric Rawls, Casey Gilmore, Erich Kummerfeld, Kelvin Lim, Tasha Nienow

Here we advance a new approach for measuring EEG causal oscillatory connectivity, capitalizing on recent advances in causal discovery analysis for skewed time series data and in spectral parameterization of time-frequency (TF) data. We first parameterize EEG TF data into separate oscillatory and aperiodic components. We then measure causal interactions between the separated oscillatory data with the recently proposed causal connectivity method Greedy Adjacencies and Non-Gaussian Orientations (GANGO). We apply GANGO to contemporaneous time series, then extend the method to lagged data that control for temporal autocorrelation. We apply this approach to EEG data acquired in a clinical trial investigating noninvasive transcranial direct current stimulation (tDCS) to treat executive dysfunction following mild traumatic brain injury (mTBI). First, we analyze whole-scalp oscillatory connectivity patterns using community detection. Then we demonstrate that tDCS increases the effect size of causal theta-band oscillatory connections between prefrontal sensors and the rest of the scalp, while simultaneously decreasing causal alpha-band oscillatory connections between prefrontal sensors and the rest of the scalp. Improved executive functioning following tDCS could result from increased prefrontal causal theta-band oscillatory influence and decreased prefrontal causal alpha-band oscillatory influence.

Proceedings of machine learning research, vol. 223, pp. 40-51. Published 2023-08-01. PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11545965/pdf/
Citations: 0
Optimizing Dynamic Antibiotic Treatment Strategies against Invasive Methicillin-Resistant Staphylococcus Aureus Infections using Causal Survival Forests and G-Formula on Statewide Electronic Health Record Data.
Inyoung Jun, Scott A Cohen, Sarah E Ser, Simone Marini, Robert J Lucero, Jiang Bian, Mattia Prosperi

Developing models for individualized, time-varying treatment optimization from observational data with large variable spaces, e.g., electronic health records (EHR), is problematic because of inherent, complex bias that can change over time. Traditional methods such as the g-formula are robust, but must identify critical subsets of variables due to combinatorial issues. Machine learning approaches such as causal survival forests have fewer constraints and can provide fine-tuned, individualized counterfactual predictions. In this study, we aimed to optimize time-varying antibiotic treatment (identifying treatment heterogeneity and conditional treatment effects) against invasive methicillin-resistant Staphylococcus aureus (MRSA) infections, using statewide EHR data collected in Florida, USA. While many previous studies focused on measuring the effects of the first empiric treatment (usually vancomycin), our study focuses on dynamic sequential treatment changes, comparing possible switches from vancomycin to other antibiotics at clinically relevant time points, e.g., after obtaining a bacterial culture and susceptibility testing. Our study population included adults admitted to the hospital with invasive MRSA. We collected demographic, clinical, medication, and laboratory information from the EHR for these patients. Then, we followed three sequential antibiotic choices (i.e., the empiric treatment, the subsequent directed treatment, and the final sustaining treatment), evaluating 30-day mortality as the outcome. We applied both causal survival forests and the g-formula using different clinical intervention policies. We found that switching from vancomycin to another antibiotic improved survival probability, yet there was a benefit from initiating vancomycin compared to not using it at any time point. These findings are consistent with the empiric choice of vancomycin before confirmation of MRSA and shed light on how to manage switches during the course of treatment. In conclusion, this application of causal machine learning to EHR data demonstrates utility in modeling dynamic, heterogeneous treatment effects that cannot be evaluated precisely using randomized clinical trials.
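
The g-formula idea can be shown in its simplest form: a single, time-fixed treatment and one discrete confounder. The sketch below standardizes the stratum-specific outcome means over the confounder distribution. It is a deliberately simplified illustration, not the time-varying analysis described in the abstract; the function name, the strata, and every record are synthetic and hypothetical.

```python
from collections import defaultdict

def g_formula_mean(records, treatment):
    """Standardized mean outcome if everyone received `treatment`.

    records: iterable of (stratum, treatment, outcome) tuples.
    Computes sum over strata l of E[Y | A=treatment, L=l] * P(L=l).
    """
    records = list(records)
    n = len(records)
    stratum_size = defaultdict(int)
    outcome_sum = defaultdict(float)
    outcome_n = defaultdict(int)
    for stratum, a, y in records:
        stratum_size[stratum] += 1
        if a == treatment:
            outcome_sum[stratum] += y
            outcome_n[stratum] += 1
    total = 0.0
    for stratum, size in stratum_size.items():
        if outcome_n[stratum] == 0:
            # Positivity violation: the stratum mean is not identifiable.
            raise ValueError(f"no records with treatment {treatment!r} in {stratum!r}")
        total += (outcome_sum[stratum] / outcome_n[stratum]) * (size / n)
    return total

# Synthetic records: (severity stratum, treated?, died within 30 days?)
toy = [
    ("mild", 1, 0), ("mild", 1, 0), ("mild", 0, 0), ("mild", 0, 1),
    ("severe", 1, 1), ("severe", 1, 0), ("severe", 0, 1), ("severe", 0, 1),
]
risk_treated = g_formula_mean(toy, 1)
risk_untreated = g_formula_mean(toy, 0)
```

With many covariates and several decision points the number of strata grows combinatorially, which is the limitation the abstract attributes to the g-formula.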

Proceedings of machine learning research, vol. 218, pp. 98-115. Published 2023-08-01. PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10584043/pdf/
Citations: 0
Py-Tetrad and RPy-Tetrad: A New Python Interface with R Support for Tetrad Causal Search.
Joseph D Ramsey, Bryan Andrews

We give novel Python and R interfaces for the (Java) Tetrad project for causal modeling, search, and estimation. The Tetrad project is a mainstay in the literature, having been under consistent development for over 30 years. Some of its algorithms are now classics, like PC and FCI; others are recent developments. It is increasingly the case, however, that researchers need to access the underlying Java code from Python or R. Existing methods for doing this are inadequate. We provide new, up-to-date methods using the JPype Python-Java interface and the Reticulate Python-R interface, directly solving these issues. With the addition of some simple tools and the provision of working examples for both Python and R, using JPype and Reticulate to interface Python and R with Tetrad is straightforward and intuitive.

Proceedings of machine learning research, vol. 223, pp. 40-51. Published 2023-08-01. PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11316512/pdf/
Citations: 0
Fair Survival Time Prediction via Mutual Information Minimization.
Hyungrok Do, Yuxin Chang, Yoon Sang Cho, Padhraic Smyth, Judy Zhong

Survival analysis is a general framework for predicting the time until a specific event occurs, often in the presence of censoring. Although this framework is widely used in practice, few studies to date have considered fairness for time-to-event outcomes, despite recent significant advances in the algorithmic fairness literature more broadly. In this paper, we propose a framework to achieve demographic parity in survival analysis models by minimizing the mutual information between predicted time-to-event and sensitive attributes. We show that our approach effectively minimizes mutual information to encourage statistical independence of time-to-event predictions and sensitive attributes. Furthermore, we propose four types of disparity assessment metrics based on common survival analysis metrics. Through experiments on multiple benchmark datasets, we demonstrate that by minimizing the dependence between the prediction and the sensitive attributes, our method can systematically improve the fairness of survival predictions and is robust to censoring.
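
The quantity being minimized can be made concrete with a small example. The sketch below computes a plug-in estimate of the mutual information between binned predictions and a sensitive attribute; a value of zero corresponds to statistical independence, i.e., demographic parity of the binned predictions. The abstract does not say which estimator the authors use, so this discrete estimator, the function name, and the toy labels are assumptions for illustration only.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Plug-in estimate of I(X; Y) in nats for two discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), count in joint.items():
        # p(x, y) * log(p(x, y) / (p(x) * p(y))), written with raw counts
        mi += (count / n) * math.log(count * n / (px[x] * py[y]))
    return mi

groups = ["a", "a", "b", "b"]                        # hypothetical sensitive attribute
fair_preds = ["short", "long", "short", "long"]      # same distribution in both groups
unfair_preds = ["short", "short", "long", "long"]    # fully determined by the group

mi_fair = mutual_information(fair_preds, groups)
mi_unfair = mutual_information(unfair_preds, groups)
```

In the second case the prediction reveals the group exactly, so the estimate reaches its maximum for a balanced binary attribute, log 2 nats.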

Proceedings of machine learning research, vol. 219, pp. 128-149. Published 2023-08-01. PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11067550/pdf/
Citations: 0
Typed Markers and Context for Clinical Temporal Relation Extraction.
Cheng Cheng, Jeremy C Weiss

Reliable extraction of temporal relations from clinical notes is a growing need in many clinical research domains. Our work introduces typed markers to the task of clinical temporal relation extraction. We demonstrate that adding medical entity information to clinical text as tags, together with context sentences, and feeding the result to a transformer-based architecture can outperform more complex systems that require feature engineering and temporal reasoning. We propose several strategies for creating typed markers that incorporate entity type information at different granularities, and we test their effectiveness in extensive experiments. Our system establishes the best result on I2B2, a clinical benchmark dataset for temporal relation extraction, with an F1 score of 83.5%, a substantial 3.3% improvement over the previous best system.
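
As a rough illustration of what a typed marker is, the sketch below wraps each entity span in tags that carry its type before the text would be passed to a model. The marker syntax, the function name, the example sentence, and the type labels are hypothetical choices for this sketch; the abstract does not specify the exact format the authors use.

```python
def add_typed_markers(text, entities):
    """Wrap each entity span in markers that carry its entity type.

    entities: list of (start, end, entity_type) character spans,
    assumed to be non-overlapping. The <TYPE> ... </TYPE> syntax is
    an illustrative assumption, not the paper's format.
    """
    out = text
    # Insert from right to left so earlier character offsets stay valid.
    for start, end, etype in sorted(entities, reverse=True):
        out = f"{out[:start]}<{etype}> {out[start:end]} </{etype}>{out[end:]}"
    return out

sentence = "Patient was admitted after the surgery."
spans = [(12, 20, "OCCURRENCE"), (31, 38, "TREATMENT")]
marked = add_typed_markers(sentence, spans)
# marked == "Patient was <OCCURRENCE> admitted </OCCURRENCE> after the <TREATMENT> surgery </TREATMENT>."
```

Coarser or finer type labels can be passed through the same function, which corresponds to the different granularities of entity type information mentioned in the abstract.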

Proceedings of machine learning research, vol. 219, pp. 94-109. Published 2023-08-01. PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10929572/pdf/
Citations: 0
EASL: A Framework for Designing, Implementing, and Evaluating ML Solutions in Clinical Healthcare Settings.
Eric Prince, Todd C Hankinson, Carsten Görg

We introduce the Explainable Analytical Systems Lab (EASL) framework, an end-to-end solution designed to facilitate the development, implementation, and evaluation of clinical machine learning (ML) tools. EASL is highly versatile, applies to a variety of contexts, and includes resources for data management, ML model development, visualization and user interface development, service hosting, and usage analytics. To demonstrate its practical applications, we present the EASL framework in the context of a case study: designing and evaluating a deep learning classifier to predict diagnoses from medical imaging. The framework is composed of three modules, each with its own set of resources. The Workbench module stores data and supports development of initial ML models, the Canvas module contains a medical imaging viewer and web development framework, and the Studio module hosts the ML model and provides web analytics and support for conducting user studies. EASL encourages model developers to take a holistic view by integrating model development, implementation, and evaluation into one framework, which helps ensure that models are both effective and reliable when used in a clinical setting. EASL contributes to our understanding of machine learning applied to healthcare by providing a comprehensive framework that makes it easier to develop and evaluate ML tools within a clinical setting.

Proceedings of machine learning research, vol. 219, pp. 612-630, 2023-08-01. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11235083/pdf/
Citations: 0
Assessing the Impact of Context Inference Error and Partial Observability on RL Methods for Just-In-Time Adaptive Interventions.
Karine Karine, Predrag Klasnja, Susan A Murphy, Benjamin M Marlin

Just-in-Time Adaptive Interventions (JITAIs) are a class of personalized health interventions developed within the behavioral science community. JITAIs aim to provide the right type and amount of support by iteratively selecting a sequence of intervention options from a pre-defined set of components in response to each individual's time-varying state. In this work, we explore the application of reinforcement learning methods to the problem of learning intervention option selection policies. We study the effect of context inference error and partial observability on the ability to learn effective policies. Our results show that propagating uncertainty from context inferences is critical to improving intervention efficacy as context uncertainty increases, while policy gradient algorithms can provide remarkable robustness to partially observed behavioral state information.

Proceedings of machine learning research, vol. 216, pp. 1047-1057, 2023-08-01. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10506656/pdf/nihms-1926373.pdf
Citations: 0