
arXiv - PHYS - Data Analysis, Statistics and Probability: Latest Papers

Similarity-Based Analysis of Atmospheric Organic Compounds for Machine Learning Applications
Pub Date : 2024-06-26 DOI: arxiv-2406.18171
Hilda Sandström, Patrick Rinke
The formation of aerosol particles in the atmosphere impacts air quality and climate change, but many of the organic molecules involved remain unknown. Machine learning could aid in identifying these compounds through accelerated analysis of molecular properties and detection characteristics. However, such progress is hindered by the current lack of curated datasets for atmospheric molecules and their associated properties. To tackle this challenge, we propose a similarity analysis that connects atmospheric compounds to existing large molecular datasets used for machine learning development. We find a small overlap between atmospheric and non-atmospheric molecules using standard molecular representations in machine learning applications. The identified out-of-domain character of atmospheric compounds is related to their distinct functional groups and atomic composition. Our investigation underscores the need for collaborative efforts to gather and share more molecular-level atmospheric chemistry data. The presented similarity-based analysis can be used for future dataset curation for machine learning development in the atmospheric sciences.
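The kind of similarity analysis described in this abstract can be illustrated with a minimal sketch. The abstract does not name the similarity measure or the molecular representation, so the Tanimoto coefficient on set-based fingerprints is an assumption here (it is a common choice for comparing molecules), and the substructure labels and the two-molecule reference dataset are hypothetical.

```python
def tanimoto(a: set, b: set) -> float:
    """Tanimoto (Jaccard) similarity between two feature sets."""
    if not a and not b:
        return 1.0  # two empty fingerprints are treated as identical
    return len(a & b) / len(a | b)


def max_similarity_to_dataset(query: set, dataset: list) -> float:
    """Similarity of a query molecule to its nearest neighbour in a dataset."""
    return max(tanimoto(query, ref) for ref in dataset)


# Hypothetical fingerprints: each molecule is a set of substructure labels.
reference_dataset = [
    {"C-C", "C=O", "O-H"},
    {"C-C", "C-N", "N-H"},
]
atmospheric_molecule = {"C-C", "C=O", "O-O-H", "O-N=O"}

score = max_similarity_to_dataset(atmospheric_molecule, reference_dataset)
print(f"nearest-neighbour similarity: {score:.2f}")
```

A low nearest-neighbour score for many atmospheric molecules would be one way to express the "small overlap" the abstract reports, under the assumptions above.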
Citations: 0
Brownian friction dynamics: fluctuations in sliding distance
Pub Date : 2024-06-23 DOI: arxiv-2406.16139
Ruibin Xu, Feng Zhou, B. N. J. Persson
We have studied the fluctuation (noise) in the position of sliding blocks under constant driving forces on different substrate surfaces. The experimental data are complemented by simulations using a simple spring-block model where the asperity contact regions are modeled by miniblocks connected to the big block by viscoelastic springs. The miniblocks experience forces that fluctuate randomly with the lateral position, simulating the interaction between asperities on the block and the substrate. The theoretical model provides displacement power spectra that agree well with the experimental results.
Citations: 0
Gaussian approximation of dynamic cavity equations for linearly-coupled stochastic dynamics
Pub Date : 2024-06-20 DOI: arxiv-2406.14200
Mattia Tarabolo, Luca Dall'Asta
Stochastic dynamics on sparse graphs and disordered systems often lead to complex behaviors characterized by heterogeneity in time and spatial scales, slow relaxation, localization, and aging phenomena. The mathematical tools and approximation techniques required to analyze these complex systems are still under development, posing significant technical challenges and resulting in a reliance on numerical simulations. We introduce a novel computational framework for investigating the dynamics of sparse disordered systems with continuous degrees of freedom. Starting with a graphical model representation of the dynamic partition function for a system of linearly-coupled stochastic differential equations, we use dynamic cavity equations on locally tree-like factor graphs to approximate the stochastic measure. Here, cavity marginals are identified with local functionals of single-site trajectories. Our primary approximation involves a second-order truncation of a small-coupling expansion, leading to a Gaussian form for the cavity marginals. For linear dynamics with additive noise, this method yields a closed set of causal integro-differential equations for cavity versions of one-time and two-time averages. These equations provide an exact dynamical description within the local tree-like approximation, retrieving classical results for the spectral density of sparse random matrices. Global constraints, non-linear forces, and state-dependent noise terms can be addressed using a self-consistent perturbative closure technique. The resulting equations resemble those of dynamical mean-field theory in the mode-coupling approximation used for fully-connected models. However, due to their cavity formulation, the present method can also be applied to ensembles of sparse random graphs and employed as a message-passing algorithm on specific graph instances.
Citations: 0
Machine Learning Models for Accurately Predicting Properties of CsPbCl3 Perovskite Quantum Dots
Pub Date : 2024-06-20 DOI: arxiv-2406.15515
Mehmet Sıddık Çadırcı, Musa Çadırcı
Perovskite Quantum Dots (PQDs) have a promising future for several applications due to their unique properties. This study investigates the effectiveness of Machine Learning (ML) in predicting the size, absorbance (1S abs) and photoluminescence (PL) properties of $\mathrm{CsPbCl}_3$ PQDs using synthesis features as the input dataset. The study employed the ML models Support Vector Regression (SVR), Nearest Neighbour Distance (NND), Random Forest (RF), Gradient Boosting Machine (GBM), Decision Tree (DT) and Deep Learning (DL). Although all models produced highly accurate results, SVR and NND demonstrated the most accurate property prediction by achieving excellent performance on the test and training datasets, with high $\mathrm{R}^2$, low Root Mean Squared Error (RMSE) and low Mean Absolute Error (MAE) metric values. Given that ML continues to improve, its ability to understand the QD field could prove invaluable in shaping the future of nanomaterial design.
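The three metrics named in this abstract have standard definitions, sketched below in plain Python. The function name `regression_metrics` and the measured and predicted sizes are hypothetical illustrations, not values or code from the paper.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R^2, RMSE and MAE for paired true and predicted values."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    # Residual and total sums of squares define R^2 = 1 - SS_res / SS_tot.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n,
    }


# Hypothetical quantum-dot sizes in nm (illustrative values only).
measured = [8.0, 9.0, 10.0, 11.0]
predicted = [8.5, 9.0, 9.5, 11.0]

metrics = regression_metrics(measured, predicted)
print(metrics)
```

A high R^2 together with low RMSE and MAE on both training and test data is the pattern the abstract reports for SVR and NND.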
Citations: 0
Encoder-Decoder Neural Networks in Interpretation of X-ray Spectra
Pub Date : 2024-06-20 DOI: arxiv-2406.14044
Jalmari Passilahti, Anton Vladyka, Johannes Niskanen
Encoder-decoder neural networks (EDNN) condense information most relevant to the output of the feedforward network to activation values at a bottleneck layer. We study the use of this architecture in emulation and interpretation of simulated X-ray spectroscopic data with the aim to identify key structural characteristics for the spectra, previously studied using emulator-based component analysis (ECA). We find an EDNN to outperform ECA in covered target variable variance, but also discover complications in interpreting the latent variables in physical terms. As a compromise of the benefits of these two approaches, we develop a network where the linear projection of ECA is used, thus maintaining the beneficial characteristics of vector expansion from the latent variables for their interpretation. These results underline the necessity of information recovery after its condensation and identification of decisive structural degrees for the output spectra for a justified interpretation.
Citations: 0
On countering adversarial perturbations in graphs using error correcting codes
Pub Date : 2024-06-20 DOI: arxiv-2406.14245
Saif Eddin Jabari
We consider the problem of a graph subjected to adversarial perturbations, such as those arising from cyber-attacks, where edges are covertly added or removed. The adversarial perturbations occur during the transmission of the graph between a sender and a receiver. To counteract potential perturbations, we explore a repetition coding scheme with sender-assigned binary noise and majority voting on the receiver's end to rectify the graph's structure. Our approach operates without prior knowledge of the attack's characteristics. We provide an analytical derivation of a bound on the number of repetitions needed to satisfy probabilistic constraints on the quality of the reconstructed graph. We show that the method can accurately decode graphs that were subjected to non-random edge removal, namely, those connected to vertices with the highest eigenvector centrality, in addition to random addition and removal of edges by the attacker.
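The core idea of repetition coding with majority voting can be sketched as follows. This is a simplified illustration, not the paper's scheme: it omits the sender-assigned binary noise step, and it assumes the attacker flips each node pair independently with a fixed probability in every transmitted copy. The function names, the four-node graph, and the parameter values are hypothetical.

```python
import random
from itertools import combinations


def transmit(edges, n_nodes, flip_prob, rng):
    """Send one copy of the graph; each node pair is flipped with flip_prob."""
    received = set()
    for pair in combinations(range(n_nodes), 2):
        present = pair in edges
        if rng.random() < flip_prob:
            present = not present  # edge covertly added or removed
        if present:
            received.add(pair)
    return received


def majority_decode(copies, n_nodes):
    """Keep an edge only if it appears in a strict majority of the copies."""
    decoded = set()
    for pair in combinations(range(n_nodes), 2):
        votes = sum(pair in copy for copy in copies)
        if 2 * votes > len(copies):
            decoded.add(pair)
    return decoded


rng = random.Random(0)
# Edges are stored as (i, j) with i < j to match itertools.combinations.
graph = {(0, 1), (1, 2), (2, 3), (0, 3)}
copies = [transmit(graph, 4, 0.1, rng) for _ in range(15)]
decoded = majority_decode(copies, 4)
print("recovered exactly:", decoded == graph)
```

Using an odd number of repetitions avoids ties in the vote. In this simplified setting, more repetitions lower the chance that a majority of copies of any one node pair is corrupted, which is the quantity a bound on the number of repetitions would control.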
Citations: 0
On the analysis of two-time correlation functions: equilibrium vs non-equilibrium systems
Pub Date : 2024-06-18 DOI: arxiv-2406.12520
Anastasia Ragulskaya, Vladimir Starostin, Fajun Zhang, Christian Gutt, Frank Schreiber
X-ray photon correlation spectroscopy (XPCS) is a powerful tool for the investigation of dynamics covering a broad range of time and length scales. The two-time correlation function (TTC) is commonly used to track non-equilibrium dynamical evolution in XPCS measurements, followed by the extraction of one-time correlations. While the theoretical foundation for the quantitative analysis of TTCs is primarily established for equilibrium systems, where key parameters such as diffusion remain constant, non-equilibrium systems pose a unique challenge. In such systems, different projections ("cuts") of the TTC may lead to divergent results if the underlying fundamental parameters themselves are subject to temporal variations. This article explores widely used approaches for TTC calculations and common methods for extracting relevant information from correlation functions on case studies, particularly in the light of comparing dynamics in equilibrium and non-equilibrium systems.
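A minimal sketch of a TTC calculation and of one possible "cut" is given below. It assumes the commonly used pixel-averaged normalisation C(t1, t2) = <I(t1) I(t2)> / (<I(t1)> <I(t2)>), which may differ from the variants the article compares. The function names and the three-frame, two-pixel intensity series are hypothetical.

```python
def two_time_correlation(frames):
    """frames[t][p] is the intensity of pixel p at time index t."""
    n_pix = len(frames[0])
    means = [sum(frame) / n_pix for frame in frames]
    ttc = []
    for i, frame_i in enumerate(frames):
        row = []
        for j, frame_j in enumerate(frames):
            # Pixel-averaged product, normalised by the two mean intensities.
            cross = sum(a * b for a, b in zip(frame_i, frame_j)) / n_pix
            row.append(cross / (means[i] * means[j]))
        ttc.append(row)
    return ttc


def one_time_from_diagonals(ttc):
    """Average the TTC along lines of constant delay t2 - t1 (one "cut")."""
    n = len(ttc)
    return [
        sum(ttc[t][t + lag] for t in range(n - lag)) / (n - lag)
        for lag in range(n)
    ]


# Hypothetical intensity series: 3 time frames, 2 pixels each.
frames = [[1.0, 3.0], [3.0, 1.0], [1.0, 3.0]]
ttc = two_time_correlation(frames)
g2 = one_time_from_diagonals(ttc)
print(g2)
```

Averaging along constant-delay diagonals presumes that the dynamics do not depend on the absolute time t1, which is the equilibrium assumption the abstract cautions about for non-equilibrium systems.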
Citations: 0
Uncertainties in ROC (Receiver Operating Characteristic) Curves Derived from Counting Data
Pub Date : 2024-06-17 DOI: arxiv-2406.11396
M. P. Fewell
The ROC (receiver operating characteristic) curve is a widely used device for assessing decision-making systems. It seems surprising, in view of its history dating back to World War Two, that the assignment of uncertainties to a ROC curve is apparently not settled. This note returns to the question, focusing on the application of ROC curves to the analysis of data from counting experiments and taking a practical operational approach to the concept of uncertainty.
Citations: 0
Predicting Exoplanetary Features with a Residual Model for Uniform and Gaussian Distributions
Pub Date : 2024-06-16 DOI: arxiv-2406.10771
Andrew Sweet
The advancement of technology has led to rampant growth in data collection across almost every field, including astrophysics, with researchers turning to machine learning to process and analyze this data. One prominent example of this data in astrophysics is the atmospheric retrievals of exoplanets. In order to help bridge the gap between machine learning and astrophysics domain experts, the 2023 Ariel Data Challenge was hosted to predict posterior distributions of 7 exoplanetary features. The procedure outlined in this paper leveraged a combination of two deep learning models to address this challenge: a Multivariate Gaussian model that generates the mean and covariance matrix of a multivariate Gaussian distribution, and a Uniform Quantile model that predicts quantiles for use as the upper and lower bounds of a uniform distribution. Training of the Multivariate Gaussian model was found to be unstable, while training of the Uniform Quantile model was stable. An ensemble of uniform distributions was found to have competitive results during testing (posterior score of 696.43), and when combined with a multivariate Gaussian distribution achieved a final rank of third in the 2023 Ariel Data Challenge (final score of 681.57).
Citations: 0
Insights into Dark Matter Direct Detection Experiments: Decision Trees versus Deep Learning
Pub Date : 2024-06-14 DOI: arxiv-2406.10372
Daniel E. Lopez-Fogliani, Andres D. Perez, Roberto Ruiz de Austri
The detection of Dark Matter (DM) remains a significant challenge in particle physics. This study exploits advanced machine learning models to improve detection capabilities of liquid xenon time projection chamber experiments, utilizing state-of-the-art transformers alongside traditional methods like Multilayer Perceptrons and Convolutional Neural Networks. We evaluate various data representations and find that simplified feature representations, particularly corrected S1 and S2 signals, retain critical information for classification. Our results show that while transformers offer promising performance, simpler models like XGBoost can achieve comparable results with optimal data representations. We also derive exclusion limits in the cross-section versus DM mass parameter space, showing minimal differences between XGBoost and the best performing deep learning models. The comparative analysis of different machine learning approaches provides a valuable reference for future experiments by guiding the choice of models and data representations to maximize detection capabilities.
Citations: 0