
Latest articles in Proceedings of Machine Learning Research

Network conditioning for synergistic learning on partial annotations.
Benjamin Billot, Neel Dey, Esra Abaci Turk, P Ellen Grant, Polina Golland

The robustness and accuracy of multi-organ segmentation networks are limited by the scarcity of labels. A common strategy to alleviate the annotation burden is to use partially labelled datasets, where each image can be annotated for a subset of all organs of interest. Unfortunately, this approach causes inconsistencies in the background class since it can now include target organs. Moreover, we consider the even more relaxed setting of region-based segmentation, where voxels can be labelled for super-regions, thus causing further inconsistencies across annotations. Here we propose CoNeMOS (Conditional Network for Multi-Organ Segmentation), a framework that leverages a label-conditioned network for synergistic learning on partially labelled region-based segmentations. Conditioning is achieved by combining convolutions with expressive Feature-wise Linear Modulation (FiLM) layers, whose parameters are controlled by an auxiliary network. In contrast to other conditioning methods, FiLM layers are stable to train and add negligible computation overhead, which enables us to condition the entire network. As a result, the network can learn where it needs to extract shared or label-specific features, instead of imposing it with the architecture (e.g., with different segmentation heads). By encouraging flexible synergies across labels, our method obtains state-of-the-art results for the segmentation of challenging low-resolution fetal MRI data. Our code is available at https://github.com/BBillot/CoNeMOS.
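The FiLM conditioning the abstract describes amounts to a per-channel affine transform whose scale and shift come from an auxiliary network. A minimal sketch, assuming a lookup table stands in for that auxiliary network; the shapes and toy data below are illustrative, not the authors' implementation:

```python
import numpy as np

def film(features, gamma, beta):
    """Feature-wise Linear Modulation: per-channel scale and shift."""
    # Broadcast the (batch, channels) parameters over the spatial dims.
    shape = gamma.shape + (1,) * (features.ndim - 2)
    return gamma.reshape(shape) * features + beta.reshape(shape)

# Toy "auxiliary network": a lookup table from a label id to (gamma, beta).
# In the paper this is a learned network; the table here is an assumption.
rng = np.random.default_rng(0)
n_labels, channels = 4, 8
gamma_table = rng.normal(1.0, 0.1, size=(n_labels, channels))
beta_table = rng.normal(0.0, 0.1, size=(n_labels, channels))

x = rng.normal(size=(2, channels, 16, 16))  # conv feature maps
label = np.array([1, 3])                    # one condition per batch item
y = film(x, gamma_table[label], beta_table[label])
```

Because the modulation is just an elementwise affine map, it adds negligible compute, which is what makes conditioning every layer of the network affordable.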

Proceedings of Machine Learning Research, vol. 250, pp. 119-130, 2024-07-01. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12393823/pdf/
Citations: 0
Self-supervised pretraining in the wild imparts image acquisition robustness to medical image transformers: an application to lung cancer segmentation.
Jue Jiang, Harini Veeraraghavan

Self-supervised learning (SSL) is an approach to pretrain models with unlabeled datasets and extract useful feature representations such that these models can be easily fine-tuned for various downstream tasks. Self-pretraining applies SSL on curated task-specific datasets without using task-specific labels. The increasing availability of public data repositories now makes it possible to utilize diverse, large, task-unrelated datasets to pretrain models in the "wild" using SSL. However, the benefit of such wild-pretraining over self-pretraining has not been studied in the context of medical image analysis. Hence, we analyzed transformers (Swin and ViT) and a convolutional neural network, each created with wild- and self-pretraining and trained to segment lung tumors from 3D computed tomography (CT) scans, in terms of: (a) accuracy, (b) fine-tuning epoch efficiency, and (c) robustness to image acquisition differences (contrast versus non-contrast, slice thickness, and image reconstruction kernels). We also studied feature reuse using centered kernel alignment (CKA) with the Swin networks. Our analysis with two independent testing datasets (public N = 139; internal N = 196) showed that wild-pretrained Swin models significantly outperformed self-pretrained Swin for the various imaging acquisitions. Fine-tuning epoch efficiency was higher for both wild-pretrained Swin and ViT models compared to their self-pretrained counterparts. Feature reuse close to the final encoder layers was lower than in the early layers for wild-pretrained models, irrespective of the pretext tasks used in SSL. Models and code will be made available through GitHub upon manuscript acceptance.
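Centered kernel alignment, used above to measure feature reuse between layers, has a simple linear form. A minimal sketch (not the paper's code) comparing two feature matrices computed for the same inputs:

```python
import numpy as np

def linear_cka(X, Y):
    """Linear centered kernel alignment between two feature matrices.

    X: (n_samples, d1), Y: (n_samples, d2) activations for the same inputs.
    Returns a similarity in [0, 1]; 1 means identical representations
    up to rotation and isotropic scaling.
    """
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    # HSIC-style cross- and self-similarity via Frobenius norms.
    cross = np.linalg.norm(X.T @ Y, "fro") ** 2
    self_x = np.linalg.norm(X.T @ X, "fro")
    self_y = np.linalg.norm(Y.T @ Y, "fro")
    return cross / (self_x * self_y)

rng = np.random.default_rng(0)
A = rng.normal(size=(100, 32))
R, _ = np.linalg.qr(rng.normal(size=(32, 32)))   # random rotation
assert np.isclose(linear_cka(A, A @ R), 1.0)     # invariant to rotation
assert linear_cka(A, rng.normal(size=(100, 32))) < 0.5  # unrelated features
```

Comparing a layer's features before and after fine-tuning with a score like this is one way to quantify how much of the pretrained representation is reused.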

Proceedings of Machine Learning Research, vol. 250, pp. 708-721, 2024-07-01. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11741178/pdf/
Citations: 0
Theoretical Analysis of Learned Database Operations under Distribution Shift through Distribution Learnability.
Sepanta Zeighami, Cyrus Shahabi

Use of machine learning to perform database operations, such as indexing, cardinality estimation, and sorting, is shown to provide substantial performance benefits. However, when datasets change and data distribution shifts, empirical results also show performance degradation for learned models, possibly to worse than non-learned alternatives. This, together with a lack of theoretical understanding of learned methods, undermines their practical applicability, since there are no guarantees on how well the models will perform after deployment. In this paper, we present the first known theoretical characterization of the performance of learned models in dynamic datasets, for the aforementioned operations. Our results show novel theoretical characteristics achievable by learned models and provide bounds on the performance of the models that characterize their advantages over non-learned methods, showing why and when learned models can outperform the alternatives. Our analysis develops the distribution learnability framework and novel theoretical tools which build the foundation for the analysis of learned database operations in the future.
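A minimal illustration of the phenomenon the abstract analyzes: a toy "learned index" fit on one key distribution predicts positions perfectly, but its error bound degrades once inserts arrive from a shifted distribution. The linear model and data below are hypothetical, not the paper's constructions:

```python
def fit_linear_index(keys):
    """Fit position ~ a*key + b on a sorted key array (a crude learned index)."""
    n = len(keys)
    xs, ys = keys, range(n)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var = sum((x - mean_x) ** 2 for x in xs)
    a = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / var
    b = mean_y - a * mean_x
    return a, b

def max_error(keys, a, b):
    """Worst-case distance between predicted and true position."""
    return max(abs(round(a * k + b) - i) for i, k in enumerate(keys))

uniform = list(range(0, 1000, 2))          # distribution seen at fit time
a, b = fit_linear_index(uniform)
err_before = max_error(uniform, a, b)

# Inserts arrive from a shifted distribution; the model is not refit.
shifted = sorted(uniform + [x * x % 997 for x in range(200)])
err_after = max_error(shifted, a, b)
assert err_before <= err_after   # the search-window bound degrades
```

The max-error quantity is exactly what a learned index must bound to guarantee lookup cost, which is why distribution shift translates directly into performance degradation.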

Proceedings of Machine Learning Research, vol. 235, pp. 58283-58305, 2024-07-01. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11534081/pdf/
Citations: 0
DISCRET: Synthesizing Faithful Explanations For Treatment Effect Estimation.
Yinjun Wu, Mayank Keoliya, Kan Chen, Neelay Velingker, Ziyang Li, Emily J Getzen, Qi Long, Mayur Naik, Ravi B Parikh, Eric Wong

Designing faithful yet accurate AI models is challenging, particularly in the field of individual treatment effect estimation (ITE). ITE prediction models deployed in critical settings such as healthcare should ideally be (i) accurate, and (ii) provide faithful explanations. However, current solutions are inadequate: state-of-the-art black-box models do not supply explanations, post-hoc explainers for black-box models lack faithfulness guarantees, and self-interpretable models greatly compromise accuracy. To address these issues, we propose DISCRET, a self-interpretable ITE framework that synthesizes faithful, rule-based explanations for each sample. A key insight behind DISCRET is that explanations can serve dually as database queries to identify similar subgroups of samples. We provide a novel RL algorithm to efficiently synthesize these explanations from a large search space. We evaluate DISCRET on diverse tasks involving tabular, image, and text data. DISCRET outperforms the best self-interpretable models and has accuracy comparable to the best black-box models while providing faithful explanations. DISCRET is available at https://github.com/wuyinjun-1993/DISCRET-ICML2024.
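The key insight above, explanations doubling as database queries, can be sketched with interval rules that select a subgroup whose outcomes yield the effect estimate. The rule format and toy data are assumptions for illustration, not the DISCRET implementation:

```python
# Each explanation is a conjunction of interval predicates; it doubles as a
# query selecting a subgroup, whose outcomes give the effect estimate.
def matches(row, rule):
    return all(lo <= row[feat] <= hi for feat, (lo, hi) in rule.items())

def subgroup_effect(data, rule):
    """Mean treated minus mean control outcome inside the selected subgroup."""
    group = [r for r in data if matches(r, rule)]
    treated = [r["y"] for r in group if r["t"] == 1]
    control = [r["y"] for r in group if r["t"] == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)

data = [
    {"age": 65, "t": 1, "y": 3.0},
    {"age": 70, "t": 0, "y": 1.0},
    {"age": 40, "t": 1, "y": 9.0},
    {"age": 35, "t": 0, "y": 8.5},
]
rule = {"age": (60, 80)}   # hypothetical explanation: "age in [60, 80]"
effect = subgroup_effect(data, rule)
```

Because the estimate is computed directly from the subgroup the rule retrieves, the explanation is faithful by construction; the RL component searches over which rule to emit per sample.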

Proceedings of Machine Learning Research, vol. 235, pp. 53597-53618, 2024-07-01. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11350397/pdf/
Citations: 0
Kernel Debiased Plug-in Estimation: Simultaneous, Automated Debiasing without Influence Functions for Many Target Parameters.
Brian Cho, Yaroslav Mukhin, Kyra Gan, Ivana Malenica

When estimating target parameters in nonparametric models with nuisance parameters, substituting the unknown nuisances with nonparametric estimators can introduce "plug-in bias." Traditional methods addressing this suboptimal bias-variance trade-off rely on the influence function (IF) of the target parameter. When estimating multiple target parameters, these methods require debiasing the nuisance parameter multiple times using the corresponding IFs, which poses analytical and computational challenges. In this work, we leverage the targeted maximum likelihood estimation (TMLE) framework to propose a novel method named kernel debiased plug-in estimation (KDPE). KDPE refines an initial estimate through regularized likelihood maximization steps, employing a nonparametric model based on reproducing kernel Hilbert spaces. We show that KDPE: (i) simultaneously debiases all pathwise differentiable target parameters that satisfy our regularity conditions, (ii) does not require the IF for implementation, and (iii) remains computationally tractable. We numerically illustrate the use of KDPE and validate our theoretical results.
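As a toy analogy for the plug-in bias the abstract describes (this is not KDPE itself): estimating theta = mu^2 with the plug-in xbar^2 is biased upward by Var(X)/n, and the classic influence-function correction subtracts an estimate of that term. A small simulation, under assumed Gaussian data:

```python
import random

# Target theta = mu^2 = 1. The naive plug-in xbar**2 overshoots by ~Var(X)/n,
# while the IF-corrected estimator xbar**2 - s2/n is (nearly) unbiased.
random.seed(0)
n, mu, trials = 50, 1.0, 20000
plug, debiased = 0.0, 0.0
for _ in range(trials):
    xs = [random.gauss(mu, 1.0) for _ in range(n)]
    xbar = sum(xs) / n
    s2 = sum((x - xbar) ** 2 for x in xs) / (n - 1)
    plug += xbar ** 2
    debiased += xbar ** 2 - s2 / n
plug /= trials
debiased /= trials
# plug sits near 1 + 1/n = 1.02; debiased sits near the true value 1.
```

KDPE replaces this hand-derived per-parameter correction with regularized likelihood updates in an RKHS model, so no influence function needs to be worked out for each target.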

Proceedings of Machine Learning Research, vol. 235, pp. 8534-8555, 2024-07-01. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11359899/pdf/
Citations: 0
Position: Topological Deep Learning is the New Frontier for Relational Learning.
Theodore Papamarkou, Tolga Birdal, Michael Bronstein, Gunnar Carlsson, Justin Curry, Yue Gao, Mustafa Hajij, Roland Kwitt, Pietro Liò, Paolo Di Lorenzo, Vasileios Maroulas, Nina Miolane, Farzana Nasrin, Karthikeyan Natesan Ramamurthy, Bastian Rieck, Simone Scardapane, Michael T Schaub, Petar Veličković, Bei Wang, Yusu Wang, Guo-Wei Wei, Ghada Zamzmi

Topological deep learning (TDL) is a rapidly evolving field that uses topological features to understand and design deep learning models. This paper posits that TDL is the new frontier for relational learning. TDL may complement graph representation learning and geometric deep learning by incorporating topological concepts, and can thus provide a natural choice for various machine learning settings. To this end, this paper discusses open problems in TDL, ranging from practical benefits to theoretical foundations. For each problem, it outlines potential solutions and future research opportunities. At the same time, this paper serves as an invitation to the scientific community to actively participate in TDL research to unlock the potential of this emerging field.

Proceedings of Machine Learning Research, vol. 235, pp. 39529-39555, 2024-07-01. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11973457/pdf/
Citations: 0
Hidden Population Estimation with Indirect Inference and Auxiliary Information.
Justin Weltz, Eric Laber, Alexander Volfovsky

Many populations defined by illegal or stigmatized behavior are difficult to sample using conventional survey methodology. Respondent Driven Sampling (RDS) is a participant referral process frequently employed in this context to collect information. This sampling methodology can be modeled as a stochastic process that explores the graph of a social network, generating a partially observed subgraph between study participants. The methods currently used to impute the missing edges in this subgraph exhibit biased downstream estimation. We leverage auxiliary participant information and concepts from indirect inference to ameliorate these issues and improve estimation of the hidden population size. These advances result in smaller bias and higher precision in the estimation of the study participant arrival rate, the sample subgraph, and the population size. Lastly, we use our method to estimate the number of People Who Inject Drugs (PWID) in the Kohtla-Jarve region of Estonia.
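The respondent-driven sampling process the abstract models can be sketched as a coupon-limited referral walk over a social graph that records the partially observed subgraph. The graph model and parameters below are illustrative assumptions, not the paper's setup:

```python
import random

def rds_sample(graph, seed_node, n_referrals, budget, rng):
    """Chain-referral (RDS) walk: each recruit refers up to n_referrals
    not-yet-sampled neighbors until the coupon budget is spent. Returns
    the sampled nodes and the referral edges actually observed."""
    sampled, edges = [seed_node], []
    frontier = [seed_node]
    while frontier and len(sampled) < budget:
        node = frontier.pop(0)
        candidates = [v for v in graph[node] if v not in sampled]
        for v in rng.sample(candidates, min(n_referrals, len(candidates))):
            if len(sampled) >= budget:
                break
            sampled.append(v)
            edges.append((node, v))
            frontier.append(v)
    return sampled, edges

rng = random.Random(0)
# Erdos-Renyi-style hidden population of 200 people.
n, p = 200, 0.05
graph = {i: [] for i in range(n)}
for i in range(n):
    for j in range(i + 1, n):
        if rng.random() < p:
            graph[i].append(j)
            graph[j].append(i)

nodes, edges = rds_sample(graph, seed_node=0, n_referrals=3, budget=60, rng=rng)
```

The observed referral edges form only a tree-like slice of the true graph, which is exactly why the missing-edge imputation and downstream size estimation the paper improves are needed.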

Proceedings of Machine Learning Research, vol. 244, pp. 3730-3746, 2024-07-01. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12448677/pdf/
Citations: 0
From Basic to Extra Features: Hypergraph Transformer Pretrain-then-Finetuning for Balanced Clinical Predictions on EHR.
Ran Xu, Yiwen Lu, Chang Liu, Yong Chen, Yan Sun, Xiao Hu, Joyce C Ho, Carl Yang

Electronic Health Records (EHRs) contain rich patient information and are crucial for clinical research and practice. In recent years, deep learning models have been applied to EHRs, but they often rely on massive features, which may not be readily available for all patients. We propose HTP-Star, which leverages hypergraph structures with a pretrain-then-finetune framework for modeling EHR data, enabling seamless integration of additional features. Additionally, we design two techniques, namely (1) Smoothness-inducing Regularization and (2) Group-balanced Reweighting, to enhance the model's robustness during finetuning. Through experiments conducted on two real EHR datasets, we demonstrate that HTP-Star consistently outperforms various baselines while striking a balance between patients with basic and extra features.
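A hypergraph over EHR data of the kind described above can be represented by a patient-by-code incidence matrix, where a hyperedge connects every patient whose record contains a given code. The toy records and the single averaging round below are hypothetical illustrations, not HTP-Star's architecture:

```python
import numpy as np

# Patients are nodes; each clinical code is a hyperedge over all patients
# whose record contains it.
records = {
    "p1": {"diabetes", "hypertension"},
    "p2": {"hypertension"},
    "p3": {"ckd", "diabetes"},
}
patients = sorted(records)
codes = sorted(set().union(*records.values()))

# Incidence matrix H: H[i, j] = 1 iff patient i carries code j.
H = np.array([[code in records[p] for code in codes] for p in patients], float)

# One hypergraph message-passing round: node -> hyperedge -> node averaging.
deg_v = H.sum(axis=1, keepdims=True)        # codes per patient
deg_e = H.sum(axis=0, keepdims=True)        # patients per code
X = np.eye(len(patients))                   # initial node features
edge_msg = (H / deg_e).T @ X                # hyperedge = mean of its patients
X_new = (H / deg_v) @ edge_msg              # patient = mean of its hyperedges
```

New ("extra") features slot in as additional hyperedges, i.e. extra columns of H, which is what makes the integration seamless rather than requiring a new input layer.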

Proceedings of Machine Learning Research, vol. 248, pp. 182-197, 2024-06-01. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11876795/pdf/
Citations: 0
Retrieving Evidence from EHRs with LLMs: Possibilities and Challenges.
Hiba Ahsan, Denis Jered McInerney, Jisoo Kim, Christopher Potter, Geoffrey Young, Silvio Amir, Byron C Wallace

Unstructured data in Electronic Health Records (EHRs) often contains critical information, complementary to imaging, that could inform radiologists' diagnoses. But the large volume of notes often associated with patients together with time constraints renders manually identifying relevant evidence practically infeasible. In this work we propose and evaluate a zero-shot strategy for using LLMs as a mechanism to efficiently retrieve and summarize unstructured evidence in patient EHR relevant to a given query. Our method entails tasking an LLM to infer whether a patient has, or is at risk of, a particular condition on the basis of associated notes; if so, we ask the model to summarize the supporting evidence. Under expert evaluation, we find that this LLM-based approach provides outputs consistently preferred to a pre-LLM information retrieval baseline. Manual evaluation is expensive, so we also propose and validate a method using an LLM to evaluate (other) LLM outputs for this task, allowing us to scale up evaluation. Our findings indicate the promise of LLMs as interfaces to EHR, but also highlight the outstanding challenge posed by "hallucinations". In this setting, however, we show that model confidence in outputs strongly correlates with faithful summaries, offering a practical means to limit confabulations.
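The two-step zero-shot strategy described above (first flag the condition, then summarize supporting evidence) can be sketched as follows; `complete` is a stand-in stub for any chat-completion call, and the prompts are assumptions rather than the paper's exact wording:

```python
# `complete` is a hypothetical stub standing in for an LLM call; it only
# exists so the sketch is runnable.
def complete(prompt):
    return "yes" if "smoker" in prompt else "no"

def retrieve_evidence(notes, condition, complete):
    joined = "\n".join(notes)
    # Step 1: zero-shot yes/no screen over the patient's notes.
    flag = complete(
        f"Notes:\n{joined}\n\nDoes this patient have, or risk, "
        f"{condition}? Answer yes or no."
    )
    if flag.strip().lower().startswith("yes"):
        # Step 2: only if flagged, ask for a supporting-evidence summary.
        return complete(
            f"Notes:\n{joined}\n\nSummarize the evidence that the patient "
            f"has or is at risk of {condition}."
        )
    return None

notes = ["45 y/o male, heavy smoker, chronic cough."]
out = retrieve_evidence(notes, "lung cancer", complete)
```

Gating the summarization step on the yes/no screen keeps the expensive generation call to flagged patients only, and the summary is the natural unit for the faithfulness evaluation the paper performs.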

Proceedings of Machine Learning Research, vol. 248, pp. 489-505, 2024-06-01. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11368037/pdf/
Citations: 0
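The two-step zero-shot strategy described above (first infer whether a condition applies, then summarize the supporting evidence) can be sketched as follows. The `retrieve_evidence` function, the prompt wording, and the `toy_llm` stand-in are hypothetical illustrations, not the authors' actual prompts or models.

```python
def retrieve_evidence(notes, condition, llm):
    """Two-step zero-shot retrieval sketch:
    1) ask the LLM whether each note indicates the condition;
    2) if yes, ask it to summarize the supporting evidence."""
    evidence = []
    for note in notes:
        verdict = llm(f"Does this note indicate {condition}? Answer yes/no.\n{note}")
        if verdict.strip().lower().startswith("yes"):
            summary = llm(f"Summarize the evidence for {condition} in:\n{note}")
            evidence.append((note, summary))
    return evidence

# Toy stand-in for a real LLM call, for illustration only.
def toy_llm(prompt):
    if prompt.startswith("Does"):
        return "yes" if "smoker" in prompt else "no"
    return "Patient is a long-term smoker."

hits = retrieve_evidence(
    ["45 y/o smoker, chronic cough", "routine checkup, no complaints"],
    "COPD risk",
    toy_llm,
)
```

In a real deployment `llm` would wrap an API call, and (per the abstract's findings) the model's confidence in each summary could be used to filter out likely confabulations.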
Temporally Multi-Scale Sparse Self-Attention for Physical Activity Data Imputation.
Hui Wei, Maxwell A Xu, Colin Samplawski, James M Rehg, Santosh Kumar, Benjamin M Marlin

Wearable sensors enable health researchers to continuously collect data pertaining to the physiological state of individuals in real-world settings. However, such data can be subject to extensive missingness due to a complex combination of factors. In this work, we study the problem of imputing missing step count data, one of the most ubiquitous forms of wearable sensor data. We construct a novel, large-scale dataset consisting of a training set with over 3 million hourly step count observations and a test set with over 2.5 million hourly step count observations. We propose a domain-knowledge-informed sparse self-attention model for this task that captures the temporally multi-scale nature of step count data. We assess the performance of the model relative to baselines and conduct ablation studies to verify our specific model designs.

Proceedings of Machine Learning Research, vol. 248, pp. 137-154, 2024-06-01. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11421853/pdf/
Citations: 0
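A temporally multi-scale sparse attention pattern of the kind the abstract describes can be sketched as a boolean mask over hourly positions, combining a dense local window with coarser daily and weekly offsets. The window size, the stride choices, and the function itself are assumptions for illustration, not the paper's exact design.

```python
def multiscale_mask(n, local=3, strides=(24, 168)):
    """Build an n x n boolean attention mask (sketch): position i attends
    to a local window of nearby hours plus positions at daily (24h) and
    weekly (168h) offsets, reflecting multi-scale temporal structure."""
    mask = [[False] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if abs(i - j) <= local:
                mask[i][j] = True  # dense local window
            elif any(abs(i - j) % s == 0 for s in strides):
                mask[i][j] = True  # coarse daily/weekly stride
    return mask

# Mask for 48 hours (two days) of step counts.
m = multiscale_mask(48)
```

Each hour then attends to its immediate neighbors plus the same hour on other days, so the number of attended positions per query stays small relative to a fully dense attention pattern.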