Multiple Testing of Linear Forms for Noisy Matrix Completion
Wanteng Ma, Lilun Du, Dong Xia, Ming Yuan
arXiv - MATH - Statistics Theory, 2023-12-01, arxiv-2312.00305 (https://doi.org/arxiv-2312.00305)
Many important tasks of large-scale recommender systems can be naturally cast
as testing multiple linear forms for noisy matrix completion. These problems,
however, present unique challenges because of the subtle bias-and-variance
tradeoff of, and the intricate dependence among, the estimated entries induced by
the low-rank structure. In this paper, we develop a general approach to
overcome these difficulties by introducing new statistics for individual tests
with sharp asymptotics both marginally and jointly, and utilizing them to
control the false discovery rate (FDR) via a data splitting and symmetric
aggregation scheme. We show that valid FDR control can be achieved with
guaranteed power under nearly optimal sample size requirements using the
proposed methodology. Extensive numerical simulations and real data examples
are also presented to further illustrate its practical merits.
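
As a rough illustration of what FDR control through data splitting and symmetric aggregation can look like, the sketch below combines two test statistics per hypothesis, one from each half of a split sample, into a product that is roughly symmetric about zero under the null, and then picks a data-driven threshold. This is not the authors' procedure: the function names, the product aggregation, and the threshold rule are all assumptions taken from generic symmetry-based FDR methods, and the inputs are taken to be already-computed statistics rather than anything derived from a matrix completion estimator.

```python
import numpy as np


def symmetric_fdr_threshold(w, alpha):
    """Return the smallest t > 0 such that
    #{j : w_j <= -t} / max(#{j : w_j >= t}, 1) <= alpha.

    The count in the left tail serves as an estimate of the number of
    false discoveries in the right tail (hypothetical helper).
    """
    w = np.asarray(w, dtype=float)
    # Only the observed nonzero magnitudes need to be tried as thresholds.
    candidates = np.sort(np.abs(w[w != 0]))
    for t in candidates:
        est_false = np.sum(w <= -t)
        discoveries = max(np.sum(w >= t), 1)
        if est_false / discoveries <= alpha:
            return float(t)
    return float("inf")  # no threshold meets the target level


def symmetric_aggregation_select(t1, t2, alpha=0.1):
    """Aggregate split-sample statistics by their product and select
    hypotheses whose aggregated statistic reaches the threshold."""
    w = np.asarray(t1, dtype=float) * np.asarray(t2, dtype=float)
    thr = symmetric_fdr_threshold(w, alpha)
    return np.flatnonzero(w >= thr), thr


if __name__ == "__main__":
    # Toy statistics: the first three hypotheses carry signal in both splits.
    t1 = [3.0, 3.0, 3.0, 0.5, -0.5, 1.0]
    t2 = [3.0, 3.0, 3.0, -0.5, -0.5, -1.0]
    selected, thr = symmetric_aggregation_select(t1, t2, alpha=0.1)
    print(selected, thr)
```

The product is large and positive only when both splits agree in sign and magnitude, which is why the left tail of the aggregated statistics can stand in for the unknown number of false rejections in the right tail.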