Proceedings of Machine Learning Research: latest articles

Half-Hop: A graph upsampling approach for slowing down message passing
Pub Date : 2023-07-01 DOI: 10.48550/arXiv.2308.09198
Mehdi Azabou, Venkataramana Ganesh, Shantanu Thakoor, Chi-Heng Lin, Lakshmi Sathidevi, Ran Liu, Michal Valko, Petar Veličković, Eva L. Dyer
Message passing neural networks have shown a lot of success on graph-structured data. However, there are many instances where message passing can lead to over-smoothing or fail when neighboring nodes belong to different classes. In this work, we introduce a simple yet general framework for improving learning in message passing neural networks. Our approach essentially upsamples edges in the original graph by adding "slow nodes" at each edge that can mediate communication between a source and a target node. Our method only modifies the input graph, making it plug-and-play and easy to use with existing models. To understand the benefits of slowing down message passing, we provide theoretical and empirical analyses. We report results on several supervised and self-supervised benchmarks, and show improvements across the board, notably in heterophilic conditions where adjacent nodes are more likely to have different labels. Finally, we show how our approach can be used to generate augmentations for self-supervised learning, where slow nodes are randomly introduced into different edges in the graph to generate multi-scale views with variable path lengths.
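
As an illustration of the edge upsampling idea described in this abstract, the following is a minimal sketch, not the authors' implementation. It assumes a directed edge list and replaces each selected edge (u, v) with a two-edge path u -> s -> v through a new slow node s. The function name `insert_slow_nodes`, the probability parameter `p`, and the plain-list graph representation are assumptions made here; how slow-node features are initialized, and whether extra edges are attached to slow nodes, is not stated in the abstract and is left out.

```python
import random


def insert_slow_nodes(num_nodes, edges, p=1.0, seed=0):
    """Subdivide directed edges with new "slow nodes" (illustrative sketch).

    num_nodes: number of nodes in the original graph (ids 0..num_nodes-1).
    edges:     list of directed (source, target) pairs.
    p:         probability that a given edge receives a slow node (assumed knob).
    Returns (new_num_nodes, new_edges, slow_node_ids).
    """
    rng = random.Random(seed)
    new_edges = []
    slow_nodes = []
    next_id = num_nodes
    for u, v in edges:
        if rng.random() < p:
            s = next_id  # fresh id for the slow node on this edge
            next_id += 1
            slow_nodes.append(s)
            # A message from u now needs two hops to reach v.
            new_edges.append((u, s))
            new_edges.append((s, v))
        else:
            new_edges.append((u, v))  # edge left untouched
    return next_id, new_edges, slow_nodes


# Upsample every edge of a 3-node path graph.
total, upsampled, slow = insert_slow_nodes(3, [(0, 1), (1, 2)], p=1.0)
```

Because only the edge list changes, the output can be passed to an existing message passing model unchanged. Choosing p below 1 and varying the seed gives different views of the same graph with variable path lengths, in the spirit of the augmentation use mentioned in the abstract.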
Proceedings of Machine Learning Research, vol. 202, pp. 1341-1360. https://doi.org/10.48550/arXiv.2308.09198
Citations: 1
A Conditional Normalizing Flow for Accelerated Multi-Coil MR Imaging.
Jeffrey Wen, Rizwan Ahmad, Philip Schniter

Accelerated magnetic resonance (MR) imaging attempts to reduce acquisition time by collecting data below the Nyquist rate. As an ill-posed inverse problem, many plausible solutions exist, yet the majority of deep learning approaches generate only a single solution. We instead focus on sampling from the posterior distribution, which provides more comprehensive information for downstream inference tasks. To do this, we design a novel conditional normalizing flow (CNF) that infers the signal component in the measurement operator's nullspace, which is later combined with measured data to form complete images. Using fastMRI brain and knee data, we demonstrate fast inference and accuracy that surpasses recent posterior sampling techniques for MRI. Code is available at https://github.com/jwen307/mri_cnf.
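
The combination step mentioned in this abstract can be illustrated with a toy example. The sketch below assumes a single-coil, mask-only measurement operator acting directly on k-space samples, so the nullspace is simply the set of unsampled locations; the Fourier transform, coil sensitivities, and the CNF itself are omitted. The names `apply_mask`, `combine_with_measurements`, and `guess` are hypothetical, and `guess` stands in for one sampled nullspace component.

```python
def apply_mask(kspace, mask):
    """Toy measurement operator: keep sampled k-space entries, zero the rest."""
    return [k if m else 0j for k, m in zip(kspace, mask)]


def combine_with_measurements(measured, mask, nullspace_guess):
    """Keep measured entries where sampled; fill unsampled entries from the guess."""
    return [y if m else z for y, m, z in zip(measured, mask, nullspace_guess)]


mask = [1, 0, 1, 0]                      # 1 = sampled, 0 = skipped
truth = [1 + 2j, 3 - 1j, -2 + 0j, 0.5j]  # made-up k-space values
measured = apply_mask(truth, mask)

# Stand-in for one sampled nullspace component; values at sampled
# locations are ignored by the combination step.
guess = [9j, 3 - 1j, 9j, 1 + 1j]
estimate = combine_with_measurements(measured, mask, guess)
```

Whatever the guess contains, re-measuring `estimate` with the same mask reproduces `measured`, so every sample drawn this way is consistent with the acquired data; only the unsampled part varies between samples.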

Proceedings of Machine Learning Research, vol. 202, pp. 36926-36939. Pub Date: 2023-07-01. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10712023/pdf/
Citations: 0
Radiology Reports Improve Visual Representations Learned from Radiographs.
Haoxu Huang, Samyak Rawlekar, Sumit Chopra, Cem M Deniz

Although humans' ability to visually understand the structure of the world plays a crucial role in perceiving the world and making appropriate decisions, human perception does not rely solely on vision but amalgamates information from acoustic, verbal, and visual stimuli. An active area of research revolves around designing an efficient framework that adapts to multiple modalities and ideally improves the performance of existing tasks. While numerous frameworks have proved effective on natural datasets like ImageNet, a limited number of studies have been carried out in the biomedical domain. In this work, we extend the available frameworks for natural data to biomedical data by leveraging the abundant, unstructured multi-modal data available as radiology images and reports. We attempt to answer the question: "Among multi-modal learning, self-supervised learning, and joint learning using both strategies, which one most improves the visual representation for downstream chest radiograph classification tasks?" Our experiments indicated that in limited labeled data settings with 1% and 10% labeled data, joint learning with multi-modal and self-supervised models outperforms self-supervised learning and is on par with multi-modal learning. Additionally, we found that multi-modal learning is generally more robust on out-of-distribution datasets. The code is publicly available online.

Proceedings of Machine Learning Research, vol. 227, pp. 1385-1405. Pub Date: 2023-07-01. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11234265/pdf/
Citations: 0
Fully Bayesian Autoencoders with Latent Sparse Gaussian Processes.
Ba-Hien Tran, Babak Shahbaba, Stephan Mandt, Maurizio Filippone

We present a fully Bayesian autoencoder model that treats both local latent variables and global decoder parameters in a Bayesian fashion. This approach allows for flexible priors and posterior approximations while keeping the inference costs low. To achieve this, we introduce an amortized MCMC approach by utilizing an implicit stochastic network to learn sampling from the posterior over local latent variables. Furthermore, we extend the model by incorporating a Sparse Gaussian Process prior over the latent space, allowing for a fully Bayesian treatment of inducing points and kernel hyperparameters and leading to improved scalability. Additionally, we enable Deep Gaussian Process priors on the latent space and the handling of missing data. We evaluate our model on a range of experiments focusing on dynamic representation learning and generative modeling, demonstrating the strong performance of our approach in comparison to existing methods that combine Gaussian Processes and autoencoders.

Proceedings of Machine Learning Research, vol. 202, pp. 34409-34430. Pub Date: 2023-07-01. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11031196/pdf/
Citations: 0
Half-Hop: A graph upsampling approach for slowing down message passing.
Mehdi Azabou, Venkataramana Ganesh, Shantanu Thakoor, Chi-Heng Lin, Lakshmi Sathidevi, Ran Liu, Michal Valko, Petar Veličković, Eva L Dyer

Message passing neural networks have shown a lot of success on graph-structured data. However, there are many instances where message passing can lead to over-smoothing or fail when neighboring nodes belong to different classes. In this work, we introduce a simple yet general framework for improving learning in message passing neural networks. Our approach essentially upsamples edges in the original graph by adding "slow nodes" at each edge that can mediate communication between a source and a target node. Our method only modifies the input graph, making it plug-and-play and easy to use with existing models. To understand the benefits of slowing down message passing, we provide theoretical and empirical analyses. We report results on several supervised and self-supervised benchmarks, and show improvements across the board, notably in heterophilic conditions where adjacent nodes are more likely to have different labels. Finally, we show how our approach can be used to generate augmentations for self-supervised learning, where slow nodes are randomly introduced into different edges in the graph to generate multi-scale views with variable path lengths.

Proceedings of Machine Learning Research, vol. 202, pp. 1341-1360. Pub Date: 2023-07-01. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10559225/pdf/nihms-1931959.pdf
Citations: 0
Controlled Differential Equations on Long Sequences via Non-standard Wavelets.
Sourav Pal, Zhanpeng Zeng, Sathya N Ravi, Vikas Singh

Neural controlled differential equations (NCDEs) are a powerful mechanism to model the dynamics in temporal sequences, e.g., in applications involving physiological measures, where apart from the initial condition, the dynamics also depend on subsequent measures or even a different "control" sequence. But NCDEs do not scale well to longer sequences. Existing strategies adapt rough path theory and instead model the dynamics over summaries known as log signatures. While these strategies are rigorous and elegant, inverting the summaries is difficult, which limits the scope of problems where these ideas can offer strong benefits (reconstruction, generative modeling). For tasks where it is sensible to assume that the (long) sequences in the training data are a fixed length of temporal measurements (an assumption that holds in most experiments tackled in the literature), we describe an efficient simplification. First, we recast the regression/classification task as an integral transform. We then show how restricting the class of operators (permissible in the integral transform) allows the use of a known algorithm that leverages non-standard wavelets to decompose the operator. Our task (learning the operator) thereby simplifies radically. A neural variant of this idea yields consistent improvements across a wide gamut of use cases tackled in existing works. We also describe a novel application to modeling tasks involving coupled differential equations.

Proceedings of Machine Learning Research, vol. 202, pp. 26820-26836. Pub Date: 2023-07-01. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11178150/pdf/
Citations: 0
The Unintended Consequences of Discount Regularization: Improving Regularization in Certainty Equivalence Reinforcement Learning.
Sarah Rathnam, Sonali Parbhoo, Weiwei Pan, Susan A Murphy, Finale Doshi-Velez

Discount regularization, using a shorter planning horizon when calculating the optimal policy, is a popular choice to restrict planning to a less complex set of policies when estimating an MDP from sparse or noisy data (Jiang et al., 2015). It is commonly understood that discount regularization functions by de-emphasizing or ignoring delayed effects. In this paper, we reveal an alternate view of discount regularization that exposes unintended consequences. We demonstrate that planning under a lower discount factor produces an identical optimal policy to planning using any prior on the transition matrix that has the same distribution for all states and actions. In fact, it functions like a prior with stronger regularization on state-action pairs with more transition data. This leads to poor performance when the transition matrix is estimated from data sets with uneven amounts of data across state-action pairs. Our equivalence theorem leads to an explicit formula to set regularization parameters locally for individual state-action pairs rather than globally. We demonstrate the failures of discount regularization and how we remedy them using our state-action-specific method across simple empirical examples as well as a medical cancer simulator.
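
To make the common understanding of discount regularization concrete, here is a small sketch of planning under two discount factors in a toy deterministic MDP. The MDP, the function name `greedy_policy`, and the specific rewards are assumptions made for illustration. The sketch only shows how a lower discount factor de-emphasizes delayed rewards and changes the planned policy; it does not reproduce the paper's equivalence theorem or its state-action-specific regularizer.

```python
def greedy_policy(next_state, reward, gamma, iters=1000):
    """Value iteration on a deterministic MDP (illustrative sketch).

    next_state[s][a]: state reached by taking action a in state s.
    reward[s][a]:     immediate reward for taking action a in state s.
    gamma:            discount factor used for planning.
    Returns the greedy action for each state.
    """
    n = len(next_state)
    values = [0.0] * n
    for _ in range(iters):
        q = [
            [reward[s][a] + gamma * values[next_state[s][a]]
             for a in range(len(next_state[s]))]
            for s in range(n)
        ]
        values = [max(row) for row in q]
    return [row.index(max(row)) for row in q]


# State 0: action 0 stays and pays 1 now; action 1 moves toward state 2.
# State 1: both actions move to state 2 with no reward.
# State 2: absorbing, pays 3 on every step.
next_state = [[0, 1], [2, 2], [2, 2]]
reward = [[1.0, 0.0], [0.0, 0.0], [3.0, 3.0]]

far_sighted = greedy_policy(next_state, reward, gamma=0.9)
short_sighted = greedy_policy(next_state, reward, gamma=0.5)
```

In this toy problem, moving from state 0 is worth 3γ²/(1-γ) and staying is worth 1/(1-γ), so the planner moves only when 3γ² > 1. With gamma=0.9 the policy in state 0 picks action 1 (move); with gamma=0.5 it picks action 0 (stay).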

Proceedings of Machine Learning Research, vol. 202, pp. 28746-28767. Pub Date: 2023-07-01. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10472113/pdf/nihms-1926341.pdf
Citations: 0
Improved Algorithms for White-Box Adversarial Streams.
Ying Feng, David P Woodruff

We study streaming algorithms in the white-box adversarial stream model, where the internal state of the streaming algorithm is revealed to an adversary who adaptively generates the stream updates, but the algorithm obtains fresh randomness unknown to the adversary at each time step. We incorporate cryptographic assumptions to construct robust algorithms against such adversaries. We propose efficient algorithms for sparse recovery of vectors, low rank recovery of matrices and tensors, as well as low rank plus sparse recovery of matrices, i.e., robust PCA. Unlike deterministic algorithms, our algorithms can report when the input is not sparse or low rank even in the presence of such an adversary. We use these recovery algorithms to improve upon and solve new problems in numerical linear algebra and combinatorial optimization on white-box adversarial streams. For example, we give the first efficient algorithm for outputting a matching in a graph with insertions and deletions to its edges provided the matching size is small, and otherwise we declare the matching size is large. We also improve the approximation versus memory tradeoff of previous work for estimating the number of non-zero elements in a vector and computing the matrix rank.

Proceedings of Machine Learning Research, vol. 202, pp. 9962-9975. Pub Date: 2023-07-01. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11576266/pdf/
Citations: 0
Causal isotonic calibration for heterogeneous treatment effects.
Lars van der Laan, Ernesto Ulloa-Pérez, Marco Carone, Alex Luedtke

We propose causal isotonic calibration, a novel nonparametric method for calibrating predictors of heterogeneous treatment effects. In addition, we introduce a novel data-efficient variant of calibration that avoids the need for hold-out calibration sets, which we refer to as cross-calibration. Causal isotonic cross-calibration takes cross-fitted predictors and outputs a single calibrated predictor obtained using all available data. We establish under weak conditions that causal isotonic calibration and cross-calibration both achieve fast doubly-robust calibration rates so long as either the propensity score or outcome regression is estimated well in an appropriate sense. The proposed causal isotonic calibrator can be wrapped around any black-box learning algorithm to provide strong distribution-free calibration guarantees while preserving predictive performance.

{"title":"Causal isotonic calibration for heterogeneous treatment effects.","authors":"Lars van der Laan,&nbsp;Ernesto Ulloa-Pérez,&nbsp;Marco Carone,&nbsp;Alex Luedtke","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>We propose causal isotonic calibration, a novel nonparametric method for calibrating predictors of heterogeneous treatment effects. In addition, we introduce a novel data-efficient variant of calibration that avoids the need for hold-out calibration sets, which we refer to as cross-calibration. Causal isotonic cross-calibration takes cross-fitted predictors and outputs a single calibrated predictor obtained using all available data. We establish under weak conditions that causal isotonic calibration and cross-calibration both achieve fast doubly-robust calibration rates so long as either the propensity score or outcome regression is estimated well in an appropriate sense. The proposed causal isotonic calibrator can be wrapped around any black-box learning algorithm to provide strong distribution-free calibration guarantees while preserving predictive performance.</p>","PeriodicalId":74504,"journal":{"name":"Proceedings of machine learning research","volume":"202 ","pages":"34831-34854"},"PeriodicalIF":0.0,"publicationDate":"2023-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10416780/pdf/nihms-1900331.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9996727","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Unsupervised Stain Decomposition via Inversion Regulation for Multiplex Immunohistochemistry Images.
Shahira Abousamra, Danielle Fassler, Jiachen Yao, Rajarsi Gupta, Tahsin Kurc, Luisa Escobar-Hoyos, Dimitris Samaras, Kenneth Shroyer, Joel Saltz, Chao Chen

Multiplex immunohistochemistry (mIHC) is a cost-effective and accessible method for in situ labeling of multiple protein biomarkers in a tissue sample. By assigning a different stain to each biomarker, it allows different cell types in the tumor vicinity to be visualized for downstream analysis. However, detecting the different stains in a given mIHC image is challenging, especially when the number of stains is high. Previous deep-learning-based methods mostly assume full supervision, yet annotation can be costly. In this paper, we propose a novel unsupervised stain decomposition method that detects different stains simultaneously. Our method requires no supervision beyond color samples of the different stains. A main technical challenge is that the problem is underdetermined and can have multiple solutions. To address this issue, we propose a novel inversion regulation technique, which eliminates most undesirable solutions. On a 7-plexed IHC image dataset, the proposed method achieves high-quality stain decomposition results without human annotation.

Proceedings of machine learning research, vol. 227, pp. 74-94. Pub Date : 2023-07-01. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11138139/pdf/
Citations: 0