Comparing sampling techniques to chart parameter space of 21 cm global signal with Artificial Neural Networks

Impact factor: 5.3; Zone 2, Physics and Astrophysics; JCR Q1, Astronomy & Astrophysics; Journal of Cosmology and Astroparticle Physics; Pub Date: 2024-10-14; DOI: 10.1088/1475-7516/2024/10/041
Anshuman Tripathi, Gursharanjit Kaur, Abhirup Datta and Suman Majumdar
Citations: 0

Abstract

Understanding the first billion years of the universe requires studying two critical epochs: the Epoch of Reionization (EoR) and Cosmic Dawn (CD). However, due to limited data, the properties of the Intergalactic Medium (IGM) during these periods remain poorly understood, leading to a vast parameter space for the global 21 cm signal. Training an Artificial Neural Network (ANN) on a narrowly defined parameter space can result in biased inferences. To mitigate this, the training dataset must be drawn uniformly from the entire parameter space to cover all possible signal realizations. However, drawing all possible realizations is computationally challenging, so a representative subset of this space must be sampled instead. This study aims to identify optimal sampling techniques for the high dimensionality and large volume of the 21 cm signal parameter space. The optimally sampled training set will be used to train the ANN to perform inference from global signal experiments. We investigate three sampling techniques, namely random, Latin hypercube (stratified), and Hammersley sequence (quasi-Monte Carlo) sampling, and compare their outcomes. Our findings reveal that sufficient samples must be drawn for robust and accurate ANN model training, regardless of the sampling technique employed. The required sample size depends primarily on two factors: the complexity of the data and the number of free parameters. More free parameters require more realizations. Among the sampling techniques tested, we find that ANN models trained with Hammersley sequence sampling are more robust than those trained with Latin hypercube or random sampling.
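The three sampling schemes named in the abstract can be illustrated with a short pure-Python sketch. This is not the authors' code: the function names, the list of prime bases, and the example bounds are all assumptions made for illustration. The Hammersley construction shown is the standard one, with first coordinate i/n and the remaining coordinates given by the radical inverse of i in successive prime bases.

```python
import random

# Prime bases for the Hammersley radical-inverse coordinates (assumed list;
# it limits this sketch to at most 11 dimensions).
PRIMES = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]


def radical_inverse(i, base):
    """Van der Corput radical inverse of the integer i in the given base."""
    inv, frac = 0.0, 1.0 / base
    while i > 0:
        inv += (i % base) * frac
        i //= base
        frac /= base
    return inv


def random_sample(n, d, rng):
    """n independent uniform points in the unit hypercube [0, 1)^d."""
    return [[rng.random() for _ in range(d)] for _ in range(n)]


def latin_hypercube(n, d, rng):
    """Stratified sampling: each dimension is cut into n equal strata and
    every stratum receives exactly one point."""
    columns = []
    for _ in range(d):
        strata = list(range(n))
        rng.shuffle(strata)  # independent stratum order per dimension
        columns.append([(s + rng.random()) / n for s in strata])
    return [[columns[j][i] for j in range(d)] for i in range(n)]


def hammersley(n, d):
    """Deterministic Hammersley point set of n points in d dimensions."""
    if d - 1 > len(PRIMES):
        raise ValueError("not enough prime bases for this dimension")
    return [
        [i / n] + [radical_inverse(i, PRIMES[j]) for j in range(d - 1)]
        for i in range(n)
    ]


def scale(points, bounds):
    """Map unit-cube points onto per-parameter (low, high) ranges."""
    return [
        [lo + u * (hi - lo) for u, (lo, hi) in zip(p, bounds)]
        for p in points
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical ranges for three free parameters (illustrative only).
    bounds = [(0.0, 1.0), (10.0, 100.0), (-3.0, 3.0)]
    for name, pts in [
        ("random", random_sample(8, 3, rng)),
        ("latin hypercube", latin_hypercube(8, 3, rng)),
        ("hammersley", hammersley(8, 3)),
    ]:
        print(name, scale(pts, bounds)[:2])
```

Each sampler returns points in the unit hypercube, so the same `scale` step maps any of them onto the physical parameter ranges before the signals for the training set are simulated. Random and Latin hypercube draws depend on the seed, while the Hammersley set is fully determined by the sample size and dimension.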
Source journal
Journal of Cosmology and Astroparticle Physics (Earth Sciences and Astronomy: Astronomy and Astrophysics)
CiteScore: 10.20
Self-citation rate: 23.40%
Articles published: 632
Review time: 1 month
About the journal: Journal of Cosmology and Astroparticle Physics (JCAP) encompasses theoretical, observational and experimental areas as well as computation and simulation. The journal covers the latest developments in the theory of all fundamental interactions and their cosmological implications (e.g. M-theory and cosmology, brane cosmology). JCAP's coverage also includes topics such as formation, dynamics and clustering of galaxies, pre-galactic star formation, x-ray astronomy, radio astronomy, gravitational lensing, active galactic nuclei, intergalactic and interstellar matter.
Latest articles from this journal
Teleparallel geometry with spherical symmetry: the diagonal and proper frames
On marginals and profiled posteriors for cosmological parameter estimation
Efficient hybrid technique for generating sub-grid haloes in reionization simulations
Constructing viable interacting dark matter and dark energy models: a dynamical systems approach
A speed limit on tachyon fields from cosmological and fine-structure data