Guojun Tang, Jason E. Black, Tyler S. Williamson, Steve H. Drew
Integrating Electronic Health Records (EHR) with machine learning presents opportunities for enhancing the accuracy and accessibility of data-driven diabetes prediction. In particular, data-driven machine learning models can identify patients at high risk for diabetes early, potentially leading to more effective therapeutic strategies and reduced healthcare costs. However, regulatory restrictions create barriers to developing centralized predictive models. This paper addresses these challenges by introducing a federated learning approach, which amalgamates predictive models without centralized data storage and processing, thus avoiding privacy issues. This marks the first application of federated learning to predict diabetes using real clinical datasets in Canada, extracted from the Canadian Primary Care Sentinel Surveillance Network (CPCSSN), without cross-province patient data sharing. We address class-imbalance issues through downsampling techniques and compare federated learning performance against province-based and centralized models. Experimental results show that the federated MLP model performs similarly to or better than the model trained with the centralized approach. However, the federated logistic regression model showed inferior performance compared to its centralized counterpart.
{"title":"Federated Diabetes Prediction in Canadian Adults Using Real-world Cross-Province Primary Care Data","authors":"Guojun Tang, Jason E. Black, Tyler S. Williamson, Steve H. Drew","doi":"arxiv-2408.12029","DOIUrl":"https://doi.org/arxiv-2408.12029","journal":{"name":"arXiv - CS - Computational Engineering, Finance, and Science"},"publicationDate":"2024-08-21"}
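The model-amalgamation step described in the abstract above can be illustrated with a minimal sketch. It assumes sample-size-weighted parameter averaging (often called FedAvg); the abstract does not state which aggregation rule the authors used, and the function name, province weights, and sample counts below are hypothetical.

```python
from typing import List, Sequence


def federated_average(client_weights: Sequence[Sequence[float]],
                      client_sizes: Sequence[int]) -> List[float]:
    """Sample-size-weighted average of per-client parameter vectors."""
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need at least one client and one size per client")
    total = sum(client_sizes)
    dim = len(client_weights[0])
    averaged = [0.0] * dim
    for weights, n in zip(client_weights, client_sizes):
        for j in range(dim):
            # Each client contributes in proportion to its local sample count.
            averaged[j] += weights[j] * n / total
    return averaged


# Hypothetical example: three provinces send locally trained parameters;
# only the parameters leave each province, never the patient records.
province_weights = [[0.2, -1.0], [0.4, -0.5], [0.0, 0.0]]
province_sizes = [100, 300, 100]
global_weights = federated_average(province_weights, province_sizes)
```

In a full training loop the averaged parameters would be sent back to each province for another round of local training; that loop is omitted here.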
Saakaar Bhatnagar, Andrew Comerford, Zelu Xu, Davide Berti Polato, Araz Banaeizadeh, Alessandro Ferraris
As the demand for lithium-ion batteries rapidly increases, there is a need to design these cells in a safe manner to mitigate thermal runaway. Thermal runaway in batteries leads to an uncontrollable temperature rise and potentially fires, which is a major safety concern. Typically, when modelling the chemical kinetics of thermal runaway, calorimetry data (e.g., Accelerated Rate Calorimetry (ARC)) is needed to determine the temperature-driven decomposition kinetics. Conventional methods of fitting Arrhenius Ordinary Differential Equation (ODE) thermal runaway models to ARC data make several assumptions that reduce the fidelity and generalizability of the obtained model. In this paper, Chemical Reaction Neural Networks (CRNNs) are trained to fit the kinetic parameters of N-equation Arrhenius ODEs to ARC data obtained from a Molicel 21700 P45B cell. The models are found to be better approximations of the experimental data. The flexibility of the method is demonstrated by experimenting with two-equation and four-equation models. Thermal runaway simulations are conducted in 3D using the obtained kinetic parameters, showing the applicability of the obtained thermal runaway models to large-scale simulations.
{"title":"Chemical Reaction Neural Networks for Fitting Accelerated Rate Calorimetry Data","authors":"Saakaar Bhatnagar, Andrew Comerford, Zelu Xu, Davide Berti Polato, Araz Banaeizadeh, Alessandro Ferraris","doi":"arxiv-2408.11984","DOIUrl":"https://doi.org/arxiv-2408.11984","journal":{"name":"arXiv - CS - Computational Engineering, Finance, and Science"},"publicationDate":"2024-08-21"}
We assess the skin thermal injury risk in the situation where a test subject is exposed to an electromagnetic beam until the occurrence of flight action. The physical process is modeled as follows. The absorbed electromagnetic power increases the skin temperature. Wherever the temperature is above a threshold, thermal nociceptors are activated and transduce an electrical signal. When the activated skin volume reaches a threshold, the flight signal is initiated. After the delay of human reaction time, the flight action materializes when the subject moves away or the beam power is turned off. The injury risk is quantified by the thermal damage parameter calculated in the Arrhenius equation. It depends on the beam power density absorbed into the skin, which is not measurable. In addition, the volume threshold for flight initiation is unknown. To circumvent these difficulties, we normalize the formulation and write the thermal damage parameter in terms of the occurrence time of flight action, which is reliably observed in exposure tests. This thermal injury formulation provides a viable framework for investigating the effects of model parameters.
{"title":"Assessing skin thermal injury risk in exposure tests of heating until flight","authors":"Hongyun Wang, Shannon E. Foley, Hong Zhou","doi":"arxiv-2408.11947","DOIUrl":"https://doi.org/arxiv-2408.11947","journal":{"name":"arXiv - CS - Computational Engineering, Finance, and Science"},"publicationDate":"2024-08-21"}
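The thermal damage parameter mentioned in the abstract above is commonly written as an Arrhenius integral, Omega = ∫ A exp(-Ea / (R T(t))) dt, over the temperature history T(t). The sketch below evaluates that integral with the trapezoidal rule. It is an illustration only: the paper's normalized formulation in terms of the flight-action time is not reproduced, and the kinetic constants and temperature histories are placeholder assumptions, not values from the paper.

```python
import math
from typing import Sequence

R_GAS = 8.314  # universal gas constant, J/(mol*K)


def arrhenius_damage(times_s: Sequence[float], temps_k: Sequence[float],
                     a_freq: float, e_act: float) -> float:
    """Trapezoidal estimate of Omega = integral of A*exp(-Ea/(R*T(t))) dt."""
    if len(times_s) != len(temps_k) or len(times_s) < 2:
        raise ValueError("need matching time/temperature samples (at least 2)")
    rates = [a_freq * math.exp(-e_act / (R_GAS * t)) for t in temps_k]
    omega = 0.0
    for i in range(1, len(times_s)):
        dt = times_s[i] - times_s[i - 1]
        omega += 0.5 * (rates[i - 1] + rates[i]) * dt
    return omega


# Placeholder kinetics, assumed for illustration (not from the paper).
A_FREQ = 3.1e98  # frequency factor, 1/s
E_ACT = 6.28e5   # activation energy, J/mol

times = [0.0, 0.5, 1.0, 1.5, 2.0]  # seconds
omega_const = arrhenius_damage(times, [320.0] * 5, A_FREQ, E_ACT)
omega_hot = arrhenius_damage(times, [330.0] * 5, A_FREQ, E_ACT)

# For a constant temperature the integral reduces to rate * duration.
closed_form = A_FREQ * math.exp(-E_ACT / (R_GAS * 320.0)) * 2.0
```

The strong temperature sensitivity of the exponential is why a 10 K difference in skin temperature changes the accumulated damage by orders of magnitude in this toy setting.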
Reinforcement Learning (RL) has experienced significant advancement over the past decade, prompting a growing interest in applications within finance. This survey critically evaluates 167 publications, exploring diverse RL applications and frameworks in finance. Financial markets, marked by their complexity, multi-agent nature, information asymmetry, and inherent randomness, serve as an intriguing test-bed for RL. Traditional finance offers certain solutions, and RL advances these with a more dynamic approach, incorporating machine learning methods, including transfer learning, meta-learning, and multi-agent solutions. This survey dissects key RL components through the lens of Quantitative Finance. We uncover emerging themes, propose areas for future research, and critique the strengths and weaknesses of existing methods.
{"title":"The Evolution of Reinforcement Learning in Quantitative Finance","authors":"Nikolaos Pippas, Cagatay Turkay, Elliot A. Ludvig","doi":"arxiv-2408.10932","DOIUrl":"https://doi.org/arxiv-2408.10932","journal":{"name":"arXiv - CS - Computational Engineering, Finance, and Science"},"publicationDate":"2024-08-20"}
In recent years, Paris, France, transformed its transportation infrastructure, marked by a notable reallocation of space away from cars to active modes of transportation. Key initiatives driving this transformation included Plan Vélo I and II, during which the city created over 1,000 kilometres of new bike paths to encourage cycling. To make room for them, substantial road capacity was removed from the system. This transformation provides a unique opportunity to investigate the impact of a large-scale network re-configuration on network-wide traffic flow. Using the Network Fundamental Diagram (NFD) and a re-sampling methodology for its estimation, we use empirical loop detector data from 2010 and 2023 to investigate the impact of these policy interventions on the network's capacity, critical density, and free-flow speed. We find that in the urban core, which saw the most policy interventions, per-lane capacity decreased by over 50%, accompanied by a 60% drop in free-flow speed. Similarly, in the zone with fewer interventions, capacity declined by 34%, with a 40% reduction in free-flow speed. While these changes seem substantial, the NFDs show that overall congestion did not increase, indicating a shift to other modes of transport and hence presumably more sustainable urban mobility.
{"title":"Effects of the Plan Vélo I and II on vehicular flow in Paris -- An Empirical Analysis","authors":"Elena Natterer, Allister Loder, Klaus Bogenberger","doi":"arxiv-2408.09836","DOIUrl":"https://doi.org/arxiv-2408.09836","journal":{"name":"arXiv - CS - Computational Engineering, Finance, and Science"},"publicationDate":"2024-08-19"}
Yifei Yang, Runhan Shi, Zuchao Li, Shu Jiang, Bao-Liang Lu, Yang Yang, Hai Zhao
Retrosynthesis analysis is pivotal yet challenging in drug discovery and organic chemistry. Despite the proliferation of computational tools over the past decade, AI-based systems often fall short in generalizing across diverse reaction types and exploring alternative synthetic pathways. This paper presents BatGPT-Chem, a large language model with 15 billion parameters, tailored for enhanced retrosynthesis prediction. Integrating chemical tasks via a unified framework of natural language and SMILES notation, this approach synthesizes extensive instructional data from an expansive chemical database. Employing both autoregressive and bidirectional training techniques across over one hundred million instances, BatGPT-Chem captures a broad spectrum of chemical knowledge, enabling precise prediction of reaction conditions and exhibiting strong zero-shot capabilities. Superior to existing AI methods, our model demonstrates significant advancements in generating effective strategies for complex molecules, as validated by stringent benchmark tests. BatGPT-Chem not only boosts the efficiency and creativity of retrosynthetic analysis but also establishes a new standard for computational tools in synthetic design. This development empowers chemists to adeptly address the synthesis of novel compounds, potentially expediting the innovation cycle in drug manufacturing and materials science. We release our trial platform at https://www.batgpt.net/dapp/chem.
{"title":"BatGPT-Chem: A Foundation Large Model For Retrosynthesis Prediction","authors":"Yifei Yang, Runhan Shi, Zuchao Li, Shu Jiang, Bao-Liang Lu, Yang Yang, Hai Zhao","doi":"arxiv-2408.10285","DOIUrl":"https://doi.org/arxiv-2408.10285","journal":{"name":"arXiv - CS - Computational Engineering, Finance, and Science"},"publicationDate":"2024-08-19"}
The harmful effects of climate change are becoming increasingly apparent. A vital issue that must be addressed is the generation of energy from non-renewable and often polluting sources. For this reason, the development of renewable energy sources is of great importance. Unfortunately, an overly rapid spread of renewables can disrupt the stability of the power system and lead to blackouts. One should not simply support this spread without ensuring sustainability and understanding the diffusion process. In this research, we propose a new agent-based model of the diffusion of photovoltaic panels. It is an extension of the $q$-voter model that utilizes a multi-layer network structure. The model is analyzed by Monte Carlo simulations and mean-field approximation. The impact of parameters and specifications on the basic properties of the model is discussed.
{"title":"Multi-layer diffusion model of photovoltaic installations","authors":"Tomasz Weron, Janusz Szwabinski","doi":"arxiv-2408.09904","DOIUrl":"https://doi.org/arxiv-2408.09904","journal":{"name":"arXiv - CS - Computational Engineering, Finance, and Science"},"publicationDate":"2024-08-19"}
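As background for the abstract above, the following is a minimal Monte Carlo sketch of a single-layer $q$-voter model with independence on a complete graph. It is not the multi-layer extension the paper proposes; the update rule shown is a commonly used variant, and the function names, the reading of +1 as "has installed panels", and all parameter values are assumptions made for illustration.

```python
import random
from typing import List


def q_voter_step(states: List[int], q: int, p_independent: float,
                 rng: random.Random) -> None:
    """One update of a q-voter model with independence on a complete graph."""
    n = len(states)
    i = rng.randrange(n)
    if rng.random() < p_independent:
        # Independent behaviour: ignore everyone else, flip with prob. 1/2.
        if rng.random() < 0.5:
            states[i] = -states[i]
        return
    # Conformity: draw q distinct agents other than i.
    panel = [j if j < i else j + 1 for j in rng.sample(range(n - 1), q)]
    first = states[panel[0]]
    if all(states[j] == first for j in panel):
        states[i] = first  # adopt the unanimous state of the panel


def simulate(n: int, q: int, p_independent: float, steps: int,
             seed: int) -> List[int]:
    """Run the model from a random initial condition; +1 = adopter (assumed)."""
    rng = random.Random(seed)
    states = [rng.choice((-1, 1)) for _ in range(n)]
    for _ in range(steps):
        q_voter_step(states, q, p_independent, rng)
    return states


final_states = simulate(n=200, q=4, p_independent=0.1, steps=20000, seed=1)
adoption = sum(1 for s in final_states if s == 1) / len(final_states)

# Without independence, full consensus is an absorbing state.
consensus = [1] * 50
rng_check = random.Random(0)
for _ in range(1000):
    q_voter_step(consensus, 4, 0.0, rng_check)
```

Averaging `adoption` over many seeds and sweeping `p_independent` is the usual way such a model is compared against a mean-field approximation.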
Paul Meijer, Nicole Howard, Jessica Liang, Autumn Kelsey, Sathya Subramanian, Ed Johnson, Paul Mariz, James Harvey, Madeline Ambrose, Vitalii Tereshchenko, Aldan Beaubien, Neelima Inala, Yousef Aggoune, Stark Pister, Anne Vetto, Melissa Kinsey, Tom Bumol, Ananda Goldrath, Xiaojun Li, Troy Torgerson, Peter Skene, Lauren Okada, Christian La France, Zach Thomson, Lucas Graybuck
The high incidence of irreproducible research has led to urgent appeals for transparency and equitable practices in open science. For the scientific disciplines that rely on computationally intensive analyses of large data sets, a granular understanding of the analysis methodology is an essential component of reproducibility. This paper discusses the guiding principles of a computational reproducibility framework that enables a scientist to proactively generate a complete reproducible trace as analysis unfolds, and share data, methods and executable tools as part of a scientific publication, allowing other researchers to verify results and easily re-execute the steps of the scientific investigation.
{"title":"Provide Proactive Reproducible Analysis Transparency with Every Publication","authors":"Paul Meijer, Nicole Howard, Jessica Liang, Autumn Kelsey, Sathya Subramanian, Ed Johnson, Paul Mariz, James Harvey, Madeline Ambrose, Vitalii Tereshchenko, Aldan Beaubien, Neelima Inala, Yousef Aggoune, Stark Pister, Anne Vetto, Melissa Kinsey, Tom Bumol, Ananda Goldrath, Xiaojun Li, Troy Torgerson, Peter Skene, Lauren Okada, Christian La France, Zach Thomson, Lucas Graybuck","doi":"arxiv-2408.09103","DOIUrl":"https://doi.org/arxiv-2408.09103","journal":{"name":"arXiv - CS - Computational Engineering, Finance, and Science"},"publicationDate":"2024-08-17"}
Yihe Wang, Nadia Mammone, Darina Petrovsky, Alexandros T. Tzallas, Francesco C. Morabito, Xiang Zhang
Electroencephalogram (EEG) has emerged as a cost-effective and efficient method for supporting neurologists in assessing Alzheimer's disease (AD). Existing approaches predominantly utilize handcrafted features or Convolutional Neural Network (CNN)-based methods. However, the potential of the transformer architecture, which has shown promising results in various time series analysis tasks, remains underexplored in interpreting EEG for AD assessment. Furthermore, most studies are evaluated on the subject-dependent setup but often overlook the significance of the subject-independent setup. To address these gaps, we present ADformer, a novel multi-granularity transformer designed to capture temporal and spatial features to learn effective EEG representations. We employ multi-granularity data embedding across both dimensions and utilize self-attention to learn local features within each granularity and global features among different granularities. We conduct experiments across 5 datasets with a total of 525 subjects in setups including subject-dependent, subject-independent, and leave-subjects-out. Our results show that ADformer outperforms existing methods in most evaluations, achieving F1 scores of 75.19% and 93.58% on two large datasets with 65 subjects and 126 subjects, respectively, in distinguishing AD and healthy control (HC) subjects under the challenging subject-independent setup.
{"title":"ADformer: A Multi-Granularity Transformer for EEG-Based Alzheimer's Disease Assessment","authors":"Yihe Wang, Nadia Mammone, Darina Petrovsky, Alexandros T. Tzallas, Francesco C. Morabito, Xiang Zhang","doi":"arxiv-2409.00032","DOIUrl":"https://doi.org/arxiv-2409.00032","journal":{"name":"arXiv - CS - Computational Engineering, Finance, and Science"},"publicationDate":"2024-08-17"}
Susanna Baars, Jigar Parekh, Ihar Antonau, Philipp Bekemeyer, Ulrich Römer
The long runtime associated with simulating multidisciplinary systems challenges the use of Bayesian optimization for multidisciplinary design optimization (MDO). This is particularly the case if the coupled system is modeled in a partitioned manner and feedback loops, known as strong coupling, are present. This work introduces a method for Bayesian optimization in MDO called "Multidisciplinary Design Optimization using Thompson Sampling", abbreviated as MDO-TS. Instead of replacing the whole system with a single surrogate, we substitute each discipline with its own Gaussian process surrogate. Since an entire multidisciplinary analysis is no longer required for enrichment, evaluations can potentially be saved. However, the objective and its associated uncertainty can no longer be estimated analytically. Since most adaptive sampling strategies assume the availability of these estimates, they cannot be applied without modification. Thompson sampling does not require this explicit availability. Instead, Thompson sampling balances exploration and exploitation by selecting actions based on optimizing random samples from the objective. We combine Thompson sampling with an approximate sampling strategy that uses random Fourier features. This approach produces continuous functions that can be evaluated iteratively. We study the application of this infill criterion to both an analytical problem and the shape optimization of a simple fluid-structure interaction example.
{"title":"Partitioned Surrogates and Thompson Sampling for Multidisciplinary Bayesian Optimization","authors":"Susanna Baars, Jigar Parekh, Ihar Antonau, Philipp Bekemeyer, Ulrich Römer","doi":"arxiv-2408.08691","DOIUrl":"https://doi.org/arxiv-2408.08691","journal":{"name":"arXiv - CS - Computational Engineering, Finance, and Science"},"publicationDate":"2024-08-16"}
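The idea in the abstract above of optimizing a random sample path built from random Fourier features can be sketched in one dimension as follows. This is a simplified illustration under stated assumptions: it draws from a zero-mean Gaussian process prior with a squared-exponential kernel of unit variance, and it leaves out conditioning on observed data and the coupling between disciplines, both of which the actual method would need. The function names, length scale, feature count, and candidate grid are hypothetical.

```python
import math
import random
from typing import Callable, Sequence


def sample_rff_path(num_features: int, length_scale: float,
                    rng: random.Random) -> Callable[[float], float]:
    """Draw one approximate sample path of a zero-mean GP with an RBF kernel."""
    # Frequencies follow the kernel's spectral density, N(0, 1/length_scale^2).
    omegas = [rng.gauss(0.0, 1.0 / length_scale) for _ in range(num_features)]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_features)]
    coeffs = [rng.gauss(0.0, 1.0) for _ in range(num_features)]
    scale = math.sqrt(2.0 / num_features)

    def path(x: float) -> float:
        # A fixed, continuous function that can be evaluated at any x.
        return scale * sum(c * math.cos(w * x + b)
                           for c, w, b in zip(coeffs, omegas, phases))

    return path


def thompson_candidate(path: Callable[[float], float],
                       candidates: Sequence[float]) -> float:
    """Pick the candidate that minimizes the sampled path."""
    return min(candidates, key=path)


rng = random.Random(0)
path = sample_rff_path(num_features=200, length_scale=0.3, rng=rng)
grid = [i / 100 for i in range(101)]  # candidate designs on [0, 1]
x_next = thompson_candidate(path, grid)
```

Because the sampled path is an ordinary closed-form function, it can be queried repeatedly, for example inside a fixed-point iteration between disciplines, and always returns the same value at the same input.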