
Journal of Computational Science: Latest Articles

Real time patient scheduling orchestration for improving key performance indicators in a hospital emergency department
IF 3.1 Zone 3 (Computer Science) Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date: 2024-08-13 DOI: 10.1016/j.jocs.2024.102422

Healthcare systems worldwide are increasingly subject to in-depth analysis, and their problems are of concern to the general public. For example, overcrowding in emergency departments leads to longer waiting times, more frequent medical errors, longer lengths of stay and worse performance indicators. Overcrowding reduces the availability of staff and material resources and therefore degrades the quality of care. The main cause of overcrowding in emergency departments is the permanent interference between scheduled patients, unscheduled patients, and urgent unscheduled patients arriving at the emergency department. The objective of the present study is to develop an innovative decision support system that minimizes this interference while taking into account the perturbations that can occur throughout the day. The ultimate goal of the research is to improve the performance indicators via two processes: the first is a memetic algorithm based on a four-dimensional hypercube genetic algorithm and local search techniques, and the second is a multi-agent system that dynamically orchestrates the patient pathway (given by the scheduling algorithm). To test and validate the approach, experiments were designed with real data from the adult emergency department at Lille University Medical Center. Simulations showed that the approach reduced patient waiting time by 28.12%.

Citations: 0
Investigating the interaction between EEG and fNIRS: A multimodal network analysis of brain connectivity
IF 3.1 Zone 3 (Computer Science) Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date: 2024-08-12 DOI: 10.1016/j.jocs.2024.102416

The brain is a complex system with functional and structural networks. Different neuroimaging methods have been developed to explore these networks, but each method has its own strengths and limitations, depending on the signal it measures. Combining techniques such as electroencephalography (EEG) and functional near-infrared spectroscopy (fNIRS) has gained interest, but how the information derived from these modalities is related remains an open question. The multilayer network model has emerged as a promising approach to integrating data from different sources. In this study, we investigated the hemodynamic and electrophysiological data captured by fNIRS and EEG to compare the brain network topologies derived from each modality, examining how these topologies vary between resting state (RS) and task-related conditions. Additionally, we adopted the multilayer network model to integrate EEG and fNIRS data and to evaluate the benefits of combining multiple modalities, compared with using a single modality, in capturing characteristic brain functioning.

A small-world network structure was observed in the rest, right motor imagery, and left motor imagery conditions in both modalities. We found that EEG captures faster changes in neural activity, thus providing a more precise estimate of the timing of information transfer between brain regions in RS, whereas fNIRS provides insight into the slower hemodynamic responses associated with longer-lasting, sustained neural processes in cognitive tasks. The multilayer approach outperformed unimodal analyses, offering a richer understanding of brain function. Complementarity between EEG and fNIRS was observed, particularly during tasks, as well as a certain level of redundancy and complementarity between the multimodal and unimodal approaches, which depends on the modality and the specific brain state. Overall, the results highlight differences in how EEG and fNIRS capture brain network topology in RS and tasks, and emphasize the value of integrating multiple modalities for a comprehensive view of brain connectivity and function.

Citations: 0
Some notes on the basic concepts of support vector machines
IF 3.1 Zone 3 (Computer Science) Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date: 2024-08-08 DOI: 10.1016/j.jocs.2024.102390

Support vector machines (SVMs) are classic binary classification algorithms and have been shown to be a robust and well-behaved technique for classification in many real-world problems. However, there are ambiguities in the basic concepts of SVMs, although these ambiguities do not affect their effectiveness. Corinna Cortes and Vladimir Vapnik, who presented SVMs in 1995, pointed out that an SVM predicts through a hyperplane with a maximal margin. However, the existing literature contains two different definitions of the margin. Cortes and Vapnik also converted an SVM into an optimization problem that is much easier to solve. Nevertheless, existing papers do not explain well how this optimization problem is derived from an SVM. These ambiguities can make the basic concepts of SVMs harder to understand. To address this, the paper defines a separating hyperplane of a training data set and, from it, an optimal separating hyperplane of the set. The two definitions are reasonable, since the paper proves that $\mathbf{w}_0^{T}\mathbf{x} + b_0 = 0$ is an optimal separating hyperplane of a training data set when $\mathbf{w}_0$ and $b_0$ constitute a solution to the above optimization problem. Some notes on the margin and the optimization problem are given based on the two definitions. These notes should help clarify the basic concepts of SVMs.
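For reference, the optimization problem the abstract refers to is usually written in the following hard-margin form. This is the standard textbook formulation for a linearly separable training set $\{(\mathbf{x}_i, y_i)\}_{i=1}^{n}$ with labels $y_i \in \{-1, +1\}$; the abstract does not reproduce it, so the notation here is an assumption and may differ from the paper's.

```latex
\min_{\mathbf{w},\,b}\ \tfrac{1}{2}\,\mathbf{w}^{T}\mathbf{w}
\quad\text{subject to}\quad
y_i\left(\mathbf{w}^{T}\mathbf{x}_i + b\right) \ge 1,
\qquad i = 1,\dots,n .
```

If $(\mathbf{w}_0, b_0)$ solves this problem, the training points closest to the hyperplane $\mathbf{w}_0^{T}\mathbf{x} + b_0 = 0$ lie at distance $1/\lVert\mathbf{w}_0\rVert$ from it, and the band separating the two classes has width $2/\lVert\mathbf{w}_0\rVert$. Texts differ on which of these two quantities they call the margin, which may be the ambiguity the abstract alludes to.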

Citations: 0
Modelling the queues of connected and autonomous vehicles at signal-free intersections considering the correlated vehicle arrivals
IF 3.1 Zone 3 (Computer Science) Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date: 2024-08-08 DOI: 10.1016/j.jocs.2024.102420

Advances in connected and autonomous vehicle (CAV) technologies have made signal-free intersections a viable option for enhancing traffic performance. In the absence of traffic signal control, sequencing control strategies become crucial to ensuring the safety and efficiency of conflicting traffic flows at these intersections. The First-Come-First-Serve (FCFS) and Longest-Queue-First (LQF) strategies have received significant attention as fundamental approaches to managing CAVs at signal-free intersections, serving as baselines for evaluating innovative strategies. However, the impact of varying traffic demand in conflicting directions on the volatility of CAV queues at signal-free intersections remains unclear, and analytical quantitative estimates of how these two fundamental sequencing strategies affect fairness within CAV queues are lacking. Furthermore, in urban road networks, CAVs entering a downstream intersection typically originate from an upstream intersection, so their arrivals are typically bunched and correlated. This phenomenon has received little attention in the modelling of CAV queues. To this end, this paper exploits the ability of the Markovian Arrival Process (MAP) to describe bunched and correlated arrivals, and develops a MAP-based double-input queueing model and its computational framework to estimate the queueing process of CAVs at signal-free intersections. Basic statistical metrics, such as queue length, delay, conditional queue length, and queue length variance, are derived. Additionally, numerical experiments are conducted to examine the queueing performance of the FCFS and LQF strategies under different traffic conditions. The results suggest that the effectiveness of the FCFS and LQF strategies varies with the level of traffic demand in the conflicting directions.
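The difference between the two sequencing rules can be illustrated with a toy simulation. The sketch below is not the paper's MAP-based model: it assumes discrete time slots, fixed arrival lists, two conflicting directions labelled "A" and "B", and exactly one vehicle crossing per slot. The function names `simulate` and `mean_delay` and the tie-breaking rules are illustrative choices.

```python
from collections import deque


def simulate(arrivals_a, arrivals_b, policy, horizon):
    """Toy two-queue intersection: one vehicle crosses per time slot.

    arrivals_a, arrivals_b: sorted lists of integer arrival slots.
    policy: "FCFS" or "LQF".
    Returns a list of (direction, arrival_slot, departure_slot) tuples.
    """
    pending = {"A": deque(arrivals_a), "B": deque(arrivals_b)}
    queues = {"A": deque(), "B": deque()}
    served = []
    for t in range(horizon):
        # Vehicles that have arrived by slot t join their queue.
        for d in ("A", "B"):
            while pending[d] and pending[d][0] <= t:
                queues[d].append(pending[d].popleft())
        waiting = [d for d in ("A", "B") if queues[d]]
        if not waiting:
            continue
        if policy == "FCFS":
            # Earliest head-of-line arrival first; ties go to direction A.
            d = min(waiting, key=lambda k: (queues[k][0], k))
        elif policy == "LQF":
            # Longest queue first; ties go to the earlier head-of-line
            # arrival, then to direction A.
            d = min(waiting, key=lambda k: (-len(queues[k]), queues[k][0], k))
        else:
            raise ValueError(f"unknown policy: {policy}")
        served.append((d, queues[d].popleft(), t))
    return served


def mean_delay(served, direction):
    """Average number of slots waited by vehicles from one direction."""
    delays = [dep - arr for d, arr, dep in served if d == direction]
    return sum(delays) / len(delays)


if __name__ == "__main__":
    steady = [0, 1, 2, 3]   # direction A: one vehicle per slot
    bunched = [2, 2, 2]     # direction B: a bunch of three in slot 2
    for policy in ("FCFS", "LQF"):
        result = simulate(steady, bunched, policy, horizon=10)
        print(policy, mean_delay(result, "A"), mean_delay(result, "B"))
```

With these inputs, FCFS gives mean delays of 0.75 slots for direction A and 2.0 for direction B, while LQF gives 1.0 and about 1.67. The total delay is the same under both rules because the server never idles while a vehicle waits; only its split between the two directions changes, which is the fairness question raised in the abstract.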

Citations: 0
Material hardness descriptor derived by symbolic regression
IF 3.1 Zone 3 (Computer Science) Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date: 2024-08-06 DOI: 10.1016/j.jocs.2024.102402

Hardness is a material property with implications in several industrial fields, including oil and gas and manufacturing. However, the relationship between this macroscale property and atomic (i.e., microscale) properties is unknown, and over the last decade several models have tried unsuccessfully to correlate them across a wide range of chemical space. Understanding this relationship is of fundamental importance for the discovery of harder materials with specific characteristics that can be employed in a wide range of fields. In this work, we have found a physical descriptor for Vickers hardness using a symbolic-regression artificial-intelligence approach based on compressed sensing. SISSO (Sure Independence Screening plus Sparsifying Operator) is an artificial-intelligence algorithm used for discovering simple and interpretable predictive models. It performs feature selection from up to billions of candidates obtained from several primary features by applying a set of mathematical operators. The resulting sparse SISSO model accurately describes the target property (i.e., Vickers hardness) with minimal complexity. The fitting was performed on experimental hardness values for 61 materials, comprising binary, ternary, and quaternary transition-metal borides, carbides, nitrides, carbonitrides, carboborides, and boronitrides. The descriptor found is a non-linear function of the microscopic properties, with the most significant contribution coming from a combination of Voigt-averaged bulk modulus, Poisson's ratio, and Reuss-averaged shear modulus. Results of high-throughput screening of 635 candidate materials using this descriptor suggest that material hardness can be enhanced through mixing with harder yet metastable structures (e.g., metastable VN, TaN, ReN2, Cr3N4, and ZrB6 all exhibit high hardness).

Citations: 0
Step-based checkpointing with high-level algorithmic differentiation
IF 3.1 Zone 3 (Computer Science) Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date: 2024-08-06 DOI: 10.1016/j.jocs.2024.102405

Automated code generation allows for a separation between the development of a model, expressed via a domain-specific language, and lower-level implementation details. Algorithmic differentiation can be applied symbolically at the level of the domain-specific language, and the code generator reused to implement the code required for an adjoint calculation. However, adjoint calculations are complicated by the well-known problem of storing or recomputing the forward data required by the adjoint, and different checkpointing strategies have been developed to tackle this problem. This article considers the combination of high-level algorithmic differentiation with step-based checkpointing schedules, with the primary application being solvers for time-dependent partial differential equations. The focus is on algorithmic differentiation using a dynamically constructed record of forward operations, where the precise structure of the original forward calculation is not known ahead of time. In addition, high-level approaches provide a simplified view of the model itself. This allows the data required to restart and advance the forward calculation, and the data required to advance the adjoint, to be identified. The difference between the two types of data is leveraged here to implement checkpointing strategies with improved performance.
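The store-or-recompute trade-off behind checkpointing can be sketched on a scalar toy problem. The example below is a minimal illustration under stated assumptions, not the article's method: the model `step`, its derivative `dstep`, and the fixed-interval schedule are all hypothetical, and in this toy the restart data and the adjoint data are the same single number, so it does not reproduce the distinction between the two kinds of data that the article exploits.

```python
import math


def step(x):
    """One forward step of a toy nonlinear model (hypothetical)."""
    return x + 0.1 * math.sin(x)


def dstep(x):
    """Derivative of step with respect to its input."""
    return 1.0 + 0.1 * math.cos(x)


def adjoint_store_all(x0, n_steps):
    """Reference adjoint: keep every forward state in memory."""
    states = [x0]
    for _ in range(n_steps):
        states.append(step(states[-1]))
    lam = 1.0  # sensitivity of the final state to itself
    for n in reversed(range(n_steps)):
        lam *= dstep(states[n])  # chain rule, applied backwards in time
    return states[-1], lam


def adjoint_checkpointed(x0, n_steps, interval):
    """Store a restart state every `interval` steps and recompute the rest."""
    checkpoints = {}
    x = x0
    for n in range(n_steps):
        if n % interval == 0:
            checkpoints[n] = x  # state at the start of step n
        x = step(x)
    x_final = x

    lam = 1.0
    # Process the blocks of steps from last to first.
    for start in sorted(checkpoints, reverse=True):
        end = min(start + interval, n_steps)
        # Recompute the states x_start .. x_{end-1} from the checkpoint.
        block = [checkpoints[start]]
        for _ in range(end - start - 1):
            block.append(step(block[-1]))
        # Advance the adjoint backwards through this block.
        for xn in reversed(block):
            lam *= dstep(xn)
    return x_final, lam


if __name__ == "__main__":
    print(adjoint_store_all(0.5, 10))
    print(adjoint_checkpointed(0.5, 10, interval=4))
```

Both functions return the final state and its derivative with respect to the initial state. The checkpointed version holds at most `interval` recomputed states plus the stored checkpoints at any time, at the cost of roughly one extra forward sweep.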

Citations: 0
DARSI: A deep auto-regressive time series inference architecture for forecasting of aerodynamic parameters
IF 3.1 Zone 3 (Computer Science) Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date: 2024-08-06 DOI: 10.1016/j.jocs.2024.102401

In fluid mechanics, computationally intensive simulations demand significant time, especially when predicting aerodynamic coefficients, which makes time series forecasting techniques an important alternative. Existing methods are effective for periodic time series, but the challenge escalates for aperiodic or chaotic system responses. To address this challenge, we introduce DARSI (Deep Auto-Regressive Time Series Inference), an efficient hybrid architecture of Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) components. Evaluated against established architectures (CNN, DLinear, LSTM, LSTNet, and PatchTST) for forecasting Coefficient of Lift ($C_L$) values corresponding to Angles of Attack (AoAs) across periodic, aperiodic, and chaotic regimes, DARSI shows an average increase of 79.95% in CORR, a 76.57% reduction in MAPE, a 94.70% reduction in MSE, a 76.18% reduction in QL, and a 75.21% reduction in RRSE. Particularly adept at predicting chaotic aerodynamic coefficients, DARSI performs best in static scenarios, surpassing DLinear and providing greater reliability. In dynamic scenarios, DLinear takes the lead, with DARSI in second position alongside PatchTST. Furthermore, static AoAs at 24.7 are identified as the most chaotic, surpassing those at 24.9, and the study reveals a potential inflection point at AoA 24.7 in static scenarios for both DLinear and DARSI, which warrants further confirmation. This research positions DARSI as a capable alternative to simulations, offering computational efficiency with implications for time series forecasting applications across industries, particularly for aerodynamic predictions in chaotic scenarios.
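Four of the error metrics named in the abstract can be computed as in the sketch below. These are common textbook definitions for a single series and are an assumption here; the paper may define them differently (for example, averaging CORR over several series). QL (quantile loss) is left out because the abstract does not state the quantile level.

```python
import math


def mse(y, yhat):
    """Mean squared error."""
    return sum((a - b) ** 2 for a, b in zip(y, yhat)) / len(y)


def mape(y, yhat):
    """Mean absolute percentage error, in percent (y must be non-zero)."""
    return 100.0 * sum(abs((a - b) / a) for a, b in zip(y, yhat)) / len(y)


def rrse(y, yhat):
    """Root relative squared error: error relative to predicting the mean."""
    mean_y = sum(y) / len(y)
    num = sum((a - b) ** 2 for a, b in zip(y, yhat))
    den = sum((a - mean_y) ** 2 for a in y)
    return math.sqrt(num / den)


def corr(y, yhat):
    """Pearson correlation between targets and forecasts."""
    mean_y = sum(y) / len(y)
    mean_h = sum(yhat) / len(yhat)
    cov = sum((a - mean_y) * (b - mean_h) for a, b in zip(y, yhat))
    sd_y = math.sqrt(sum((a - mean_y) ** 2 for a in y))
    sd_h = math.sqrt(sum((b - mean_h) ** 2 for b in yhat))
    return cov / (sd_y * sd_h)


if __name__ == "__main__":
    actual = [1.0, 2.0, 4.0]      # made-up values, for illustration only
    forecast = [1.0, 2.0, 3.0]
    for metric in (mse, mape, rrse, corr):
        print(metric.__name__, metric(actual, forecast))
```

Lower is better for MSE, MAPE and RRSE, and higher is better for CORR, which is why the abstract reports an increase for CORR and reductions for the others.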

Citations: 0
Resonance modeling of the tsunami caused by the Aegean Sea Earthquake (Mw7.0) of October 30, 2020
IF 3.1; Zone 3 (Computer Science); Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS; Pub Date: 2024-08-05; DOI: 10.1016/j.jocs.2024.102398

The resonance of tsunami waves in semi-enclosed bays is central to understanding and mitigating the impact of seismic events on coastal communities. Semi-enclosed bays can amplify incoming tsunami waves through resonance, which occurs when the natural frequencies of the bay match those of the incoming waves. This resonance can significantly increase wave height and inundation levels, posing an increased risk to nearby settlements and infrastructure. Understanding the resonance patterns in these bays is crucial for accurate hazard assessment, early warning systems, and effective disaster preparedness and response strategies. On October 30, 2020, an earthquake occurred between Seferihisar Bay in Turkey and the Greek island of Samos in the Aegean Sea. Long waves generated by the normal-faulting earthquake caused notable damage to settlements within Seferihisar Bay and on the north coast of Samos Island. According to the measurements of the Syros mareograph stations, wave heights were between 2 and 20 cm and wave periods between 9 and 20 seconds. On-site surveys conducted after the earthquake reported inundation in six settlements within Seferihisar Bay. However, inundation was notably higher in Sığacık and Akarca, reaching 2–3 times higher than in other locations, with the water level reaching 2 m. Because this variance in inundation levels is attributed to resonance in Sığacık and Akarca rather than to the propagation of tsunami waves, this study focused on wave resonance modeling in Seferihisar Bay. The resonance modeling was performed using the RIDE wave model.
Furthermore, the research was expanded to assess the resonance patterns that might emerge in the event of another earthquake or an underwater landslide along the fault line responsible for the seismic event, covering wave periods of T = 1–9 minutes and T = 20–30 minutes. Modeling results revealed that on the day of the earthquake, wave heights in Sığacık Marina and Akarca surged by a factor of 8.5 compared with the wave height at the epicenter. This increase is 2 to 2.5 times higher than that calculated for other locations (Demircili, Altınköy, and Tepecik). Consequently, it was concluded that resonance was one of the reasons for the more severe inundation in Sığacık and Akarca. Moreover, supplementary investigations indicated that waves with a period of T < 9 minutes would pose higher risks for Demircili, Altınköy, Sığacık Marina, and Tepecik than the waves observed on the day of the earthquake. By comprehensively studying wave resonance in semi-enclosed bays, researchers and policymakers can better anticipate the potential impact of tsunami events and take measures to protect coastal communities, ultimately increasing resilience and reducing the loss of life and property in vulnerable areas.
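
The resonance condition described in the abstract (incoming wave periods matching the bay's natural periods) can be illustrated with a back-of-the-envelope estimate. The sketch below uses the classical quarter-wavelength formula for an idealized rectangular bay of uniform depth that is open at one end, T_n = 4L / ((2n + 1) * sqrt(g * h)). This is not the RIDE wave model used in the study, and the bay length and depth in the example are hypothetical values, not measurements of Seferihisar Bay.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2

def open_basin_period(length_m, depth_m, mode=0):
    """Natural period (s) of mode n for an idealized rectangular bay open at one end.

    Uses T_n = 4L / ((2n + 1) * sqrt(g * h)), valid for long waves over uniform depth.
    """
    if length_m <= 0 or depth_m <= 0 or mode < 0:
        raise ValueError("length and depth must be positive, mode non-negative")
    return 4.0 * length_m / ((2 * mode + 1) * math.sqrt(G * depth_m))

def near_resonance(wave_period_s, natural_period_s, tolerance=0.1):
    """True if the incoming period lies within a relative tolerance of a natural period."""
    return abs(wave_period_s - natural_period_s) <= tolerance * natural_period_s

# Hypothetical bay: 10 km long, 50 m deep (illustrative numbers only).
t0 = open_basin_period(10_000, 50)          # fundamental mode, in seconds
t1 = open_basin_period(10_000, 50, mode=1)  # first higher mode, one third of t0
```

A real bay has irregular geometry and variable depth, which is why a numerical model is needed to locate resonant periods at specific sites such as a marina.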

Citations: 0
Exploring the integration of IoT and Generative AI in English language education: Smart tools for personalized learning experiences
IF 3.1; Zone 3 (Computer Science); Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS; Pub Date: 2024-08-04; DOI: 10.1016/j.jocs.2024.102397

English language education is undergoing a transformative shift, propelled by advancements in technology. This research explores the integration of the Internet of Things (IoT) and Generative Artificial Intelligence (Generative AI) in the context of English language education, with a focus on developing a personalized oral assessment method. The proposed method leverages real-time data collection from IoT devices and Generative AI's language generation capabilities to create a dynamic and adaptive learning environment. The study addresses historical challenges in traditional teaching methodologies, emphasizing the need for AI approaches. The research objectives encompass a comprehensive exploration of the historical context, challenges, and existing technological interventions in English language education. A novel, technology-driven oral assessment method is designed, implemented, and rigorously evaluated using datasets such as Librispeech and L2Arctic. The ablation study investigates the impact of training dataset proportions and model learning rates on the method's performance. Results from the study highlight the importance of maintaining a balance in dataset proportions, selecting an optimal learning rate, and considering model depth in achieving optimal performance.

Citations: 0
A topological approach for semi-supervised learning
IF 3.1; Zone 3 (Computer Science); Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS; Pub Date: 2024-08-03; DOI: 10.1016/j.jocs.2024.102403

Machine Learning and Deep Learning methods have become the state-of-the-art approach to data classification tasks. These methods require acquiring and labelling a considerable amount of data; however, this is not straightforward in some fields, since data annotation is time-consuming and might require expert knowledge. This challenge can be tackled by means of semi-supervised learning methods that take advantage of both labelled and unlabelled data. In this work, we present new semi-supervised learning methods based on techniques from Topological Data Analysis (TDA). In particular, we have created two semi-supervised learning methods following two topological approaches. In the former, we have used a homological approach that consists of studying the persistence diagrams associated with the data using the bottleneck and Wasserstein distances. In the latter, we have considered the connectivity of the data. In addition, we have carried out a thorough analysis of the developed methods using 9 tabular datasets with low and high dimensionality. The results show that the developed semi-supervised methods outperform models trained with only manually labelled data, and are an alternative to other classical semi-supervised learning algorithms.
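
The abstract gives no detail on how connectivity is used, so the following is only a generic illustration of the idea and should not be read as the authors' method. It links points that lie within a radius `eps` of each other, finds the connected components of that graph with a union-find structure, and gives each unlabelled point the majority label of its component. The function name, the `eps` parameter, and the toy data are all assumptions made for this sketch.

```python
from math import dist

def propagate_labels(points, labels, eps):
    """Label unlabelled points (None) by the majority label of their eps-connected component."""
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Connect every pair of points closer than eps.
    for i in range(n):
        for j in range(i + 1, n):
            if dist(points[i], points[j]) <= eps:
                parent[find(i)] = find(j)

    # Count the known labels inside each component.
    votes = {}
    for i, lab in enumerate(labels):
        if lab is not None:
            counts = votes.setdefault(find(i), {})
            counts[lab] = counts.get(lab, 0) + 1

    # Known labels are kept; unknown ones take the component majority, if any.
    result = []
    for i, lab in enumerate(labels):
        if lab is not None:
            result.append(lab)
        else:
            counts = votes.get(find(i))
            result.append(max(counts, key=counts.get) if counts else None)
    return result

# Toy data: two small clusters and one isolated point.
pts = [(0, 0), (0.5, 0), (1, 0), (5, 5), (5.5, 5), (20, 20)]
labs = ["a", None, None, "b", None, None]
out = propagate_labels(pts, labs, eps=0.6)
```

Points in a component with no labelled member stay unlabelled, which reflects a common design choice in semi-supervised pipelines: only pseudo-label the points for which there is supporting evidence.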

Citations: 0