The Dyson equalizer: adaptive noise stabilization for low-rank signal detection and recovery.

Impact factor: 1.4 · Zone 4 (Mathematics) · JCR Q2 (Mathematics, Applied) · Information and Inference: A Journal of the IMA · Pub date: 2025-01-16 · eCollection date: 2025-03-01 · DOI: 10.1093/imaiai/iaae036
Boris Landa, Yuval Kluger
Information and Inference: A Journal of the IMA, 14(1), iaae036. Journal article. https://doi.org/10.1093/imaiai/iaae036 · PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11735832/pdf/

Abstract

Detecting and recovering a low-rank signal in a noisy data matrix is a fundamental task in data analysis. Typically, this task is addressed by inspecting and manipulating the spectrum of the observed data, e.g. thresholding the singular values of the data matrix at a certain critical level. This approach is well established in the case of homoskedastic noise, where the noise variance is identical across the entries. However, in numerous applications, the noise can be heteroskedastic, where the noise characteristics may vary considerably across the rows and columns of the data. In this scenario, the spectral behaviour of the noise can differ significantly from the homoskedastic case, posing various challenges for signal detection and recovery. To address these challenges, we develop an adaptive normalization procedure that equalizes the average noise variance across the rows and columns of a given data matrix. Our proposed procedure is data-driven and fully automatic, supporting a broad range of noise distributions, variance patterns and signal structures. Our approach relies on random matrix theory results that describe the resolvent of the noise via the so-called Dyson equation. By leveraging this relation, we can accurately infer the noise level in each row and each column directly from the resolvent of the data. We establish that in many cases, our normalization enforces the standard spectral behaviour of homoskedastic noise, namely the Marchenko-Pastur (MP) law, allowing for simple and reliable detection of signal components. Furthermore, we demonstrate that our approach can substantially improve signal recovery in heteroskedastic settings by manipulating the spectrum after normalization. Lastly, we apply our method to single-cell RNA sequencing and spatial transcriptomics data, showcasing accurate fits to the MP law after normalization.
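
The spectral thresholding idea described in the abstract can be illustrated with a small synthetic sketch. This is not the Dyson equalizer itself: the row and column noise variances `r` and `c` are assumed known here (an "oracle" normalization), whereas the paper's procedure infers them from the data. The function names, the rank-one variance pattern `r[i] * c[j]`, the 10% `slack` above the MP edge, and the signal strengths are all illustrative assumptions.

```python
import numpy as np


def mp_upper_edge(m, n, sigma2=1.0):
    """Upper edge of the Marchenko-Pastur law for the eigenvalues of
    (1/n) * E @ E.T, where E is m x n with i.i.d. entries of variance sigma2."""
    return sigma2 * (1.0 + np.sqrt(m / n)) ** 2


def count_above_edge(Y, sigma2=1.0, slack=0.1):
    """Count eigenvalues of (1/n) * Y @ Y.T exceeding the MP upper edge.
    `slack` is a hypothetical safety margin for finite-size fluctuations."""
    m, n = Y.shape
    eigs = np.linalg.svd(Y, compute_uv=False) ** 2 / n
    return int(np.sum(eigs > (1.0 + slack) * mp_upper_edge(m, n, sigma2)))


def variance_profile(size, spread=400.0):
    """Geometrically spaced variances, rescaled to have mean 1 (assumption)."""
    v = np.geomspace(1.0, spread, size)
    return v / v.mean()


def oracle_normalize(A, r, c):
    """Divide entry (i, j) by sqrt(r[i] * c[j]); uses the true variances,
    which a data-driven method would have to estimate."""
    return A / np.sqrt(np.outer(r, c))


rng = np.random.default_rng(0)
m, n, rank = 300, 600, 2

# Heteroskedastic Gaussian noise with variance r[i] * c[j] (average variance 1).
r = variance_profile(m)
c = variance_profile(n)
noise = np.sqrt(np.outer(r, c)) * rng.standard_normal((m, n))

# Planted rank-2 signal with orthonormal singular vectors.
U, _ = np.linalg.qr(rng.standard_normal((m, rank)))
V, _ = np.linalg.qr(rng.standard_normal((n, rank)))
signal = U @ np.diag(np.array([4.0, 3.0]) * np.sqrt(n)) @ V.T
Y = signal + noise

raw_noise_count = count_above_edge(noise)
normalized_noise_count = count_above_edge(oracle_normalize(noise, r, c))
raw_signal_count = count_above_edge(Y)
normalized_signal_count = count_above_edge(oracle_normalize(Y, r, c))

print("noise only, raw:       ", raw_noise_count)
print("noise only, normalized:", normalized_noise_count)
print("signal + noise, raw:       ", raw_signal_count)
print("signal + noise, normalized:", normalized_signal_count)
```

In this setup, pure heteroskedastic noise already pushes eigenvalues past the MP edge, so thresholding the raw spectrum at that edge miscounts the signal components. After the variances are equalized, the noise entries are i.i.d. with unit variance and only the planted components remain above the edge.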
