Constraining Galaxy-Halo connection using machine learning

IF 1.9; Zone 4 (Physics and Astrophysics); JCR Q2 (Astronomy & Astrophysics); Astronomy and Computing; Pub Date: 2024-10-01; DOI: 10.1016/j.ascom.2024.100883
A. Jana, L. Samushia
Astronomy and Computing, Volume 49, Article 100883. https://www.sciencedirect.com/science/article/pii/S2213133724000982
Citations: 0

Abstract

We investigate the potential of machine learning (ML) methods to model small-scale galaxy clustering for constraining Halo Occupation Distribution (HOD) parameters. Our analysis reveals that while many ML algorithms report good statistical fits, they often yield likelihood contours that are significantly biased in both mean values and variances relative to the true model parameters. This highlights the importance of careful data processing and algorithm selection in ML applications for galaxy clustering, as even seemingly robust methods can lead to biased results if not applied correctly. Provided their robustness is established, ML tools offer a promising approach to exploring the HOD parameter space at significantly reduced computational cost compared to traditional brute-force methods. Using our pipeline based on artificial neural networks (ANNs), we successfully recreate some standard results from recent literature. Properly restricting the HOD parameter space, transforming the training data, and carefully selecting ML algorithms are essential for achieving unbiased and robust predictions. Among the methods tested, ANNs outperform random forests (RF) and ridge regression in predicting clustering statistics when the HOD prior space is appropriately restricted. We demonstrate these findings using the projected two-point correlation function ($w_p(r_p)$), angular multipoles of the correlation function ($\xi_\ell(r)$), and the void probability function (VPF) of Luminous Red Galaxies from Dark Energy Spectroscopic Instrument mocks. Our results show that while combining $w_p(r_p)$ and VPF improves parameter constraints, adding the multipoles $\xi_0$, $\xi_2$, and $\xi_4$ to $w_p(r_p)$ does not significantly improve the constraints.
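
The abstract's point that transforming the training data matters can be illustrated with a toy emulator. The following is only a minimal sketch under stated assumptions, not the authors' pipeline: `toy_wp` is a hypothetical power-law stand-in for the projected correlation function (not a real HOD model), the two parameters `theta` are arbitrary, and closed-form ridge regression is used purely because it is short to write with NumPy. It compares an emulator fitted to raw targets against one fitted to log-transformed targets inside a restricted parameter box.

```python
import numpy as np

rng = np.random.default_rng(0)
r_p = np.logspace(-1, 1, 8)  # toy separation bins (arbitrary units)

def toy_wp(theta):
    """Hypothetical power-law stand-in for w_p(r_p); NOT a real HOD model."""
    theta = np.atleast_2d(theta)
    amp = np.exp(3.0 + 1.5 * theta[:, :1])   # amplitude set by theta_0
    slope = 0.8 + 0.4 * theta[:, 1:2]        # slope set by theta_1
    return amp * r_p[None, :] ** (-slope)    # shape (n_samples, n_bins)

def fit_ridge(X, Y, lam=1e-6):
    """Closed-form ridge regression; the small penalty also acts on the intercept."""
    A = np.hstack([np.ones((len(X), 1)), X])
    return np.linalg.solve(A.T @ A + lam * np.eye(A.shape[1]), A.T @ Y)

def predict(W, X):
    """Evaluate the fitted linear model at new parameter points."""
    A = np.hstack([np.ones((len(X), 1)), X])
    return A @ W

# Restricted prior box: uniform samples in [-0.5, 0.5]^2 (an assumption)
X_train = rng.uniform(-0.5, 0.5, size=(200, 2))
X_test = rng.uniform(-0.5, 0.5, size=(50, 2))
Y_train, Y_test = toy_wp(X_train), toy_wp(X_test)

# Emulator 1: regress on the raw statistic
W_raw = fit_ridge(X_train, Y_train)
pred_raw = predict(W_raw, X_test)

# Emulator 2: regress on the log of the statistic, then undo the transform
W_log = fit_ridge(X_train, np.log(Y_train))
pred_log = np.exp(predict(W_log, X_test))

def max_frac_err(pred, truth):
    """Largest fractional deviation over all test points and bins."""
    return float(np.max(np.abs(pred / truth - 1.0)))

err_raw = max_frac_err(pred_raw, Y_test)
err_log = max_frac_err(pred_log, Y_test)
print(f"max fractional error, raw targets: {err_raw:.3g}")
print(f"max fractional error, log targets: {err_log:.3g}")
```

In this toy setup the log of the statistic is exactly linear in the parameters, so the log-target emulator is far more accurate than the raw-target one. Real clustering statistics are not this simple, which is presumably why more flexible models such as ANNs are needed, but the sketch shows why the choice of target transformation can dominate emulator accuracy.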
Source journal
Astronomy and Computing
Categories: Astronomy & Astrophysics; Computer Science, Interdisciplinary Applications
CiteScore: 4.10
Self-citation rate: 8.00%
Articles published: 67
Journal description: Astronomy and Computing is a peer-reviewed journal that focuses on the broad area between astronomy, computer science and information technology. The journal aims to publish the work of scientists and (software) engineers in all aspects of astronomical computing, including the collection, analysis, reduction, visualisation, preservation and dissemination of data, and the development of astronomical software and simulations. The journal covers applications of academic computer science techniques to astronomy, as well as novel applications of information technologies within astronomy.