First-order algorithms have long dominated the training of deep neural networks, excelling in tasks like image classification and natural language processing. There is now a compelling opportunity to explore alternatives that could outperform current state-of-the-art results. From estimation theory, the Extended Kalman Filter (EKF) arose as a viable alternative and has shown advantages over backpropagation methods. Current computational advances offer the opportunity to revisit algorithms derived from the EKF, which have been almost excluded from the training of convolutional neural networks. This article revisits a decoupled formulation of the EKF and introduces the Fully Decoupled Extended Kalman Filter (FDEKF) for training convolutional neural networks in image classification tasks. The FDEKF is a second-order algorithm with advantages over first-order algorithms: it can converge faster and reach higher accuracy, owing to a higher probability of finding the global optimum. In this research, experiments are conducted on well-known datasets that include Fashion, Sports, and Handwritten Digits images. The FDEKF shows faster convergence than other algorithms such as the popular Adam optimizer, the sKAdam algorithm, and the reduced extended Kalman filter. Finally, motivated by the FDEKF achieving its highest accuracy on images of natural scenes, we show its effectiveness in a further experiment focused on outdoor terrain recognition.
Armando Gaytan, Ofelia Begovich-Mendoza, Nancy Arana-Daniel. "Training of Convolutional Neural Networks for Image Classification with Fully Decoupled Extended Kalman Filter." Algorithms, 2024-06-06. https://doi.org/10.3390/a17060243
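To make the decoupling idea concrete, here is a minimal sketch of a fully decoupled EKF weight update for a scalar-output model, in Python. It is not the paper's implementation: the per-weight covariance layout, the noise values R and Q, and the toy linear "network" are all illustrative assumptions. The key point is that each weight carries its own scalar variance, so no full covariance matrix is ever stored or inverted.

```python
import numpy as np

def fdekf_step(w, P, h, err, R=1.0, Q=1e-6):
    """One fully decoupled EKF update: every weight w[i] is treated as an
    independent 1-D state with its own variance P[i].

    w   : (n,) weight vector
    P   : (n,) per-weight error variances (the decoupled diagonal)
    h   : (n,) output Jacobian d(prediction)/d(w) at the current input
    err : scalar innovation, target - prediction
    R   : assumed measurement-noise variance; Q : process noise keeping P alive
    """
    S = R + np.sum(h * h * P)          # innovation variance (scalar output)
    K = P * h / S                      # decoupled Kalman gains, one per weight
    w = w + K * err                    # state (weight) update
    P = P - K * h * P + Q              # per-weight variance update
    return w, P

# toy usage: fit y = x.w with a linear "network"
rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0, 0.5])
w, P = np.zeros(3), np.ones(3)
for _ in range(200):
    x = rng.normal(size=3)
    y = true_w @ x
    h = x                              # Jacobian of x.w with respect to w
    w, P = fdekf_step(w, P, h, y - w @ x)
print(np.round(w, 3))                  # approaches [2., -1., 0.5]
```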
This paper explores and evaluates various machine learning methods, paired with different text features, for determining the authorship of texts, using the Azerbaijani language as an example. We consider techniques such as artificial neural networks, convolutional neural networks, random forests, and support vector machines. These techniques are used with text features such as word length, sentence length, combined word length and sentence length, n-grams, and word frequencies. The models were trained and tested on the works of many famous Azerbaijani writers. The results of computer experiments comparing the various techniques and text features were analyzed, and the cases in which particular text features yielded better results were identified.
Rustam Azimov, Efthimios Providas. "A Comparative Study of Machine Learning Methods and Text Features for Text Authorship Recognition in the Example of Azerbaijani Language Texts." Algorithms, 2024-06-05. https://doi.org/10.3390/a17060242
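A hedged sketch of the kind of pipeline the study compares: stylometric features (word and sentence length) versus character n-grams, fed to two of the mentioned classifier families via scikit-learn. The toy English corpus, feature choices, and model settings below are placeholders, not the paper's data or configuration.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# toy two-author corpus; in the paper the samples are passages from
# Azerbaijani writers, so real data would replace these strings
texts = ["short clipped lines . very short .",
         "long winding sentences that meander on and on without pause",
         "short terse prose . again short .",
         "another long flowing sentence rich in subordinate clauses indeed"] * 5
labels = [0, 1, 0, 1] * 5

def stylometric(doc):
    words = doc.split()
    sents = [s for s in doc.split(".") if s.strip()]
    return [np.mean([len(w) for w in words]),          # mean word length
            np.mean([len(s.split()) for s in sents])]  # mean sentence length

X_style = np.array([stylometric(t) for t in texts])
X_ngram = TfidfVectorizer(analyzer="char", ngram_range=(2, 3)).fit_transform(texts)

for name, model, X in [("SVM + char n-grams", LinearSVC(), X_ngram),
                       ("RF + style features", RandomForestClassifier(n_estimators=100), X_style)]:
    print(name, cross_val_score(model, X, labels, cv=5).mean())
```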
A fitness landscape analysis of the loss surfaces produced by product unit neural networks is performed in order to gain a better understanding of the impact of product units on the characteristics of the loss surfaces. The loss surface characteristics of product unit neural networks are then compared to the characteristics of loss surfaces produced by neural networks that make use of summation units. The failure of certain optimization algorithms in training product unit neural networks is explained through trends observed between loss surface characteristics and optimization algorithm performance. The paper shows that the loss surfaces of product unit neural networks have extremely large gradients with many deep ravines and valleys, which explains why gradient-based optimization algorithms fail at training these neural networks.
Andries P. Engelbrecht, Robert Gouldie. "Fitness Landscape Analysis of Product Unit Neural Networks." Algorithms, 2024-06-04. https://doi.org/10.3390/a17060241
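The gradient blow-up the analysis reports is easy to reproduce. The sketch below, using assumed toy inputs, evaluates a single product unit and its weight gradient: because the unit's output is itself a factor of every partial derivative, gradient magnitudes grow exponentially with the weights.

```python
import numpy as np

def product_unit(x, w):
    """A product unit computes prod_i x_i**w_i (= exp(sum_i w_i ln x_i) for
    x_i > 0), in contrast to a summation unit's sum_i w_i * x_i."""
    return np.prod(np.power(x, w))

def grad_w(x, w):
    # d/dw_i of prod_j x_j**w_j = ln(x_i) * prod_j x_j**w_j -- the output
    # itself is a factor, so gradients scale multiplicatively and explode
    # as the weights grow
    return np.log(x) * product_unit(x, w)

x = np.array([3.0, 5.0, 2.0])
for w_scale in [0.5, 2.0, 4.0]:
    w = w_scale * np.ones(3)
    print(w_scale, np.abs(grad_w(x, w)).max())
# the largest gradient component grows exponentially with the weight scale:
# the steep ravines and valleys the landscape analysis reports
```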
Control charts are tools of paramount importance in statistical process control. They are broadly applied in monitoring processes and improving quality, as they allow the detection of special causes of variation with a significant level of accuracy. Furthermore, several strategies can be employed in different contexts, each offering its own advantages. This study therefore focuses on monitoring the variability of univariate processes through the variance, using the Binomial version of the ATTRIVAR Same Sample S2 (B-ATTRIVAR SS S2) control chart, which couples attribute and variable inspections (ATTRIVAR stands for attribute + variable), thereby combining the cost-effectiveness of the former with the richer information and greater performance of the latter. The Binomial version was used because inspections involve two attributes, and the Same Sample variant because the same sample undergoes both the attribute and the variable stages of inspection. A computational application was developed in the R language using the Shiny package, creating an interface that facilitates the chart's use in the quality control of production processes. The application enables users to input process parameters and generate the B-ATTRIVAR SS control chart for monitoring process variability through the variance. Its performance was validated by comparing the data obtained from the application with those produced by a simpler code, with the two exhibiting striking similarity.
João Pedro Costa Violante, Marcela A. G. Machado, Amanda dos Santos Mendes, Túlio S. Almeida. "An Interface to Monitor Process Variability Using the Binomial ATTRIVAR SS Control Chart." Algorithms, 2024-05-16. https://doi.org/10.3390/a17050216
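The paper's application is written in R with Shiny; as a language-neutral illustration of the variable (S2) stage only, here is a sketch of probability-based variance chart limits in Python. The in-control variance, subgroup size, false-alarm rate, and simulated variance shift are all assumed values, and the attribute (Binomial) stage of the B-ATTRIVAR scheme is not modeled.

```python
import numpy as np
from scipy.stats import chi2

def s2_chart_limits(sigma0_sq, n, alpha=0.0027):
    """Probability limits for an S^2 chart: in control,
    S^2 ~ sigma0^2 * chi2(n-1) / (n-1) for subgroups of size n.
    alpha = 0.0027 mirrors the usual two-sided 3-sigma false-alarm rate."""
    lcl = sigma0_sq * chi2.ppf(alpha / 2, n - 1) / (n - 1)
    ucl = sigma0_sq * chi2.ppf(1 - alpha / 2, n - 1) / (n - 1)
    return lcl, ucl

rng = np.random.default_rng(1)
sigma0_sq, n = 1.0, 10
lcl, ucl = s2_chart_limits(sigma0_sq, n)
samples = rng.normal(0.0, 1.4, size=(30, n))   # process variance shifted upward
s2 = samples.var(axis=1, ddof=1)               # plotted statistic per subgroup
print(f"LCL={lcl:.3f} UCL={ucl:.3f}, signals: {(s2 > ucl).sum()} of 30")
```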
This paper introduces an algorithm to tackle the boundary condition (BC) problem, which has long persisted in the numerical and computational treatment of smoothed particle hydrodynamics (SPH). Central to the BC problem is the need for an effective method to reconcile a numerical representation of particles with 2D or 3D geometry. We describe and evaluate an algorithmic solution—boundary SPH (BSPH)—drawn from a novel twist on the mesh-based boundary method, allowing SPH particles to interact (directly and implicitly) with either convex or concave 3D meshes. The method draws inspiration from existing works in graphics, particularly discrete signed distance fields, to determine whether particles intersect with or are submerged beneath mesh triangles. We evaluate the efficacy of BSPH through application to several simulation environments of varying mesh complexity, showing practical real-time implementation in Unity3D and its high-level shader language (HLSL), which we use to parallelize particle operations. To examine robustness, we portray slip and no-slip conditions in simulation, and we evaluate convex and concave meshes separately. To demonstrate empirical utility, we show pressure gradients as measured in simulated still-water tank implementations of hydrodynamics. Our results show that BSPH, despite producing irregular pressure values among particles close to the boundary manifolds of the meshes, successfully prevents particles from intersecting or submerging into the boundary manifold. Average FPS calculations for each simulation scenario show that the mesh boundary method can still be used effectively in simple simulation scenarios. We additionally point the reader to future work that could investigate the effect of simulation parameters and scene complexity on simulation performance, resolve abnormal pressure values along the mesh boundary, and test the method's robustness on a wider variety of simulation environments.
Ryan Kim, Paul M. Torrens. "Boundary SPH for Robust Particle–Mesh Interaction in Three Dimensions." Algorithms, 2024-05-16. https://doi.org/10.3390/a17050218
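A minimal sketch of the particle-triangle test at the heart of such a boundary method, assuming a single triangle and a simple project-to-surface response. The function names and the smoothing-radius threshold are illustrative; the paper's BSPH operates on whole convex or concave meshes with signed distance queries and runs in HLSL rather than Python.

```python
import numpy as np

def barycentric_inside(p, a, b, c):
    """True if the projection of p onto triangle (a, b, c)'s plane lies inside
    the triangle (components of p along the normal cancel in the dot products)."""
    v0, v1, v2 = b - a, c - a, p - a
    d00, d01, d11 = v0 @ v0, v0 @ v1, v1 @ v1
    d20, d21 = v2 @ v0, v2 @ v1
    denom = d00 * d11 - d01 * d01
    v = (d11 * d20 - d01 * d21) / denom
    w = (d00 * d21 - d01 * d20) / denom
    return v >= 0 and w >= 0 and v + w <= 1

def resolve_boundary(p, tri, h):
    """Push particle p back to the surface if it sits within smoothing radius h
    on the submerged (negative) side of triangle tri."""
    a, b, c = tri
    n = np.cross(b - a, c - a)
    n = n / np.linalg.norm(n)
    d = (p - a) @ n                      # signed distance to the triangle plane
    if -h < d < 0 and barycentric_inside(p, a, b, c):
        p = p - d * n                    # project back onto the surface
    return p

tri = (np.array([0., 0., 0.]), np.array([1., 0., 0.]), np.array([0., 1., 0.]))
print(resolve_boundary(np.array([0.2, 0.2, -0.05]), tri, h=0.1))  # lifted to z=0
```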
In a distributed information system, the scale of operational components such as applications, operating systems, databases, servers, and networks is immense, with intricate access relationships among them. Silos between professional domains are pronounced and cross-domain linkage mechanisms are insufficient, making it difficult to locate the infrastructure components that cause exceptions under a particular application. Existing research is effective only in local scenarios, and its accuracy and generalization remain very limited. This paper proposes a novel fault location method based on dynamic operation maps and alarm common point analysis. During the fault period, alarm entities are associated with dynamic operation maps, and alarm common points are obtained through graph search, covering deployment-relationship common points, connection common points (physical and logical), and access-flow common points. Compared with knowledge graph approaches, this method eliminates the complex process of knowledge graph construction, making it more concise and efficient. Furthermore, in contrast to indicator correlation analysis methods, it supplements the analysis with configuration correlation information, resulting in more precise localization. In practical validation, its fault hit rate exceeds 82%, significantly better than existing mainstream methods.
Sheng Wu, Jihong Guan. "Fault Location Method Based on Dynamic Operation and Maintenance Map and Common Alarm Points Analysis." Algorithms, 2024-05-16. https://doi.org/10.3390/a17050217
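A toy sketch of the common point idea: treat the dynamic operation map as a dependency graph, collect everything reachable from each alarmed entity, and intersect the results. The graph, entity names, and flattened edge types below are invented for illustration; the paper distinguishes deployment, connection, and access-flow common points rather than merging them into one edge set.

```python
# toy dynamic operation map: entity -> infrastructure it depends on
# (deployment, connection, and access-flow edges are all flattened here)
depends_on = {
    "app-A": ["vm-1", "db-1"],
    "app-B": ["vm-2", "db-1"],
    "db-1":  ["server-3"],
    "vm-1":  ["server-3"],
    "vm-2":  ["server-4"],
}

def reachable(entity):
    """All infrastructure nodes reachable from an alarmed entity."""
    seen, stack = set(), [entity]
    while stack:
        node = stack.pop()
        for nxt in depends_on.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

def common_points(alarms):
    """Intersect the reachable sets: nodes shared by every alarmed entity
    are the candidate root causes (the 'common points' of the paper)."""
    sets = [reachable(a) for a in alarms]
    return set.intersection(*sets) if sets else set()

print(common_points(["app-A", "app-B"]))   # {'db-1', 'server-3'}
```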
Quantum computing has the potential to solve problems that are currently intractable for classical computers using algorithms like Quantum Phase Estimation (QPE); however, noise significantly hinders the performance of today's quantum computers. Machine learning has the potential to improve the performance of QPE algorithms, especially in the presence of noise. In this work, QPE circuits were simulated with varying levels of depolarizing noise to generate datasets of QPE output. In each case, the phase being estimated was generated with a phase gate, and each circuit modeled was defined by a randomly selected phase. The model accuracy, prediction speed, overfitting level, and variation of accuracy with noise level were determined for five machine learning algorithms. These attributes were compared to the traditional method of post-processing, and a 6–36x improvement in model performance was noted, depending on the dataset. No algorithm was a clear winner when considering these four criteria: the lowest-error model (a neural network) was also the slowest predictor, while the algorithm with the lowest overfitting and fastest prediction time (linear regression) had the highest error level and a high degree of variation of error with noise. The XGBoost ensemble algorithm was judged to be the best tradeoff between these criteria due to its error level, prediction time, and low variation of error with noise. For the first time, a machine learning model was validated using a 2-qubit datapoint obtained from an IBMQ quantum computer. The best 2-qubit model predicted within 2% of the actual phase, while the traditional method had a 25% error.
Charles Woodrum, Torrey Wagner, David Weeks. "Improving 2–5 Qubit Quantum Phase Estimation Circuits Using Machine Learning." Algorithms, 2024-05-15. https://doi.org/10.3390/a17050214
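A sketch of the data-generation-plus-regression setup described here, with assumptions flagged: the ideal QPE outcome distribution is mixed with a uniform distribution as a stand-in for depolarizing noise, and scikit-learn's LinearRegression and GradientBoostingRegressor stand in for the paper's five algorithms (which include a neural network and XGBoost). The qubit count, sample sizes, and noise range are arbitrary.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

def qpe_distribution(phi, t=4):
    """Ideal QPE outcome probabilities with t counting qubits
    (the squared Dirichlet-kernel amplitudes over the 2^t basis states)."""
    k = np.arange(2 ** t)
    delta = phi - k / 2 ** t
    amp = np.sin(np.pi * delta * 2 ** t) / (2 ** t * np.sin(np.pi * delta) + 1e-12)
    return amp ** 2

rng = np.random.default_rng(7)
X, y = [], []
for _ in range(2000):
    phi = rng.uniform(0, 1)                       # randomly selected phase
    p = qpe_distribution(phi)
    lam = rng.uniform(0, 0.5)                     # depolarizing strength
    p_noisy = (1 - lam) * p + lam / p.size        # mix with the uniform state
    X.append(p_noisy); y.append(phi)
X, y = np.array(X), np.array(y)

Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)
for model in [LinearRegression(), GradientBoostingRegressor()]:
    err = np.abs(model.fit(Xtr, ytr).predict(Xte) - yte).mean()
    print(type(model).__name__, f"mean |phase error| = {err:.4f}")
```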
We consider the estimation of the marginal likelihood in Bayesian statistics, with primary emphasis on Gaussian graphical models, where the intractability of the marginal likelihood in high dimensions is a frequently researched problem. We propose a general algorithm that can be widely applied to a variety of problem settings and excels particularly when dealing with near log-concave posteriors. Our method builds upon a previously posited algorithm that uses MCMC samples to partition the parameter space and forms piecewise constant approximations over these partition sets as a means of estimating the normalizing constant. In this paper, we refine the aforementioned local approximations by taking advantage of the shape of the target distribution and leveraging an expectation propagation algorithm to approximate Gaussian integrals over rectangular polytopes. Our numerical experiments show the versatility and accuracy of the proposed estimator, even as the parameter space increases in dimension and becomes more complicated.
Eric Chuu, Yabo Niu, A. Bhattacharya, Debdeep Pati. "EPSOM-Hyb: A General Purpose Estimator of Log-Marginal Likelihoods with Applications in Probabilistic Graphical Models." Algorithms, 2024-05-15. https://doi.org/10.3390/a17050213
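The baseline idea the paper builds on, a partition-based piecewise-constant estimate of a normalizing constant from samples, can be sketched in one dimension. The target density, sample count, and partition resolution below are assumptions, and the paper's refinement (expectation propagation for Gaussian integrals over the rectangular partition sets) is not shown.

```python
import numpy as np

def log_target(x):
    """Unnormalized log density: a standard normal without its constant,
    so the true normalizing constant is sqrt(2*pi)."""
    return -0.5 * x ** 2

rng = np.random.default_rng(3)
samples = rng.normal(0.0, 1.0, size=20000)      # stand-in for MCMC draws

# partition the sampled region into rectangles (intervals in 1-D) using
# sample quantiles, then treat the log density as constant on each piece
edges = np.quantile(samples, np.linspace(0.001, 0.999, 200))
mids = 0.5 * (edges[:-1] + edges[1:])
vols = np.diff(edges)
Z_hat = np.sum(np.exp(log_target(mids)) * vols)
print(Z_hat, np.sqrt(2 * np.pi))                # ~2.50 vs exact 2.5066
```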
Linear assignment problems hold a pivotal role in combinatorial optimization, offering a broad spectrum of applications within the field of data sciences. They consist of assigning “agents” to “tasks” in a way that leads to a minimum total cost associated with the assignment. The assignment is balanced when the number of agents equals the number of tasks, with a one-to-one correspondence between agents and tasks, and unbalanced otherwise. Additional options and constraints may be imposed, such as allowing agents to perform multiple tasks or allowing tasks to be performed by multiple agents. In this paper, we propose a novel framework that can solve all these assignment problems employing methodologies derived from the field of statistical physics. We describe this formalism in detail and validate all its assertions. A major part of this framework is the definition of a concave effective free energy function that encapsulates the constraints of the assignment problem within a finite temperature context. We demonstrate that this free energy monotonically decreases as a function of a parameter β representing the inverse of temperature. As β increases, the free energy converges to the optimal assignment cost. Furthermore, we demonstrate that when β values are sufficiently large, the exact solution to the assignment problem can be derived by rounding off the elements of the computed assignment matrix to the nearest integer. We describe a computer implementation of our framework and illustrate its application to multi-task assignment problems for which the Hungarian algorithm is not applicable.
P. Koehl, H. Orland. "A General Statistical Physics Framework for Assignment Problems." Algorithms, 2024-05-14. https://doi.org/10.3390/a17050212
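To illustrate the finite-temperature behavior described above, here is a sketch using the standard Sinkhorn/softassign relaxation. This is not the authors' free-energy formalism, but it shows the same β-limit: the soft assignment's expected cost decreases as β grows, and rounding at large β recovers the Hungarian solution on a toy cost matrix.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def soft_assign(C, beta, iters=2000):
    """Finite-temperature assignment: Sinkhorn-normalize exp(-beta*C) into a
    doubly stochastic matrix; as beta grows it sharpens toward a permutation."""
    K = np.exp(-beta * (C - C.min()))   # shift for numerical safety
    u = np.ones(C.shape[0])
    v = np.ones(C.shape[1])
    for _ in range(iters):
        u = 1.0 / (K @ v)
        v = 1.0 / (K.T @ u)
    return u[:, None] * K * v[None, :]

rng = np.random.default_rng(4)
C = rng.uniform(size=(6, 6))
for beta in [1, 10, 100]:
    P = soft_assign(C, beta)
    print(beta, np.sum(P * C))          # expected cost falls as beta grows

rows = np.arange(6)
cols = soft_assign(C, 100).argmax(axis=1)   # round off at large beta
# at large beta the argmax is (typically) a permutation matching Hungarian
print("rounded cost:", C[rows, cols].sum(),
      "Hungarian:", C[rows, linear_sum_assignment(C)[1]].sum())
```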
A double optimal solution (DOS) of a least-squares problem Ax=b, A∈R^{q×n} with q≠n, is derived in an (m+1)-dimensional varying affine Krylov subspace (VAKS); two minimization techniques exactly determine the m+1 expansion coefficients of the solution x in the VAKS. The minimal-norm solution is obtained automatically regardless of whether the linear system is consistent or inconsistent. A new double optimal algorithm (DOA) is created; it saves substantial time by inverting only an m×m positive definite matrix at each iteration step, where m≪min(n,q). The properties of the DOA are investigated and an estimate of the residual error is provided. The residual norms are proven to be strictly decreasing over the iterations; hence, the DOA is absolutely convergent. Numerical tests reveal the efficiency of the DOA for solving least-squares problems. The DOA is applicable to least-squares problems regardless of whether q<n or q>n. The Moore–Penrose inverse matrix is also addressed by adopting the DOA; the accuracy and efficiency of the proposed method are proven. The (m+1)-dimensional VAKS differs from the traditional m-dimensional affine Krylov subspace used in the conjugate gradient (CG)-type iterative algorithms CGNR (or CGLS) and CGNE (or Craig's method) for solving least-squares problems with q>n. We propose a variant of the Karush–Kuhn–Tucker equation, and then apply the partial-pivoting Gaussian elimination method to solve the variant, which performs better than the original Karush–Kuhn–Tucker equation, the CGNR, and the CGNE for solving over-determined linear systems. Our main contribution is the development of a double-optimization-based iterative algorithm in a varying affine Krylov subspace for effectively and accurately solving least-squares problems, even for a dense and ill-conditioned matrix A with q≪n or q≫n.
Chein-Shan Liu, C. Kuo, Chih-Wen Chang. "Solving Least-Squares Problems via a Double-Optimal Algorithm and a Variant of the Karush–Kuhn–Tucker Equation for Over-Determined Systems." Algorithms, 2024-05-14. https://doi.org/10.3390/a17050211
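As a small illustration of the Karush–Kuhn–Tucker route to over-determined least squares (not the paper's DOA or its variant equation), the sketch below assembles the classical augmented KKT system and solves it with LAPACK's partial-pivoting LU via NumPy, checking the result against the standard least-squares solution. The sizes and data are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(5)
q, n = 200, 10                                   # over-determined: q >> n
A = rng.normal(size=(q, n))
b = rng.normal(size=q)

# augmented (KKT-style) system for min ||Ax - b||: with residual r = b - Ax,
# the optimality condition A^T r = 0 yields the symmetric block system
#   [ I    A ] [r]   [b]
#   [ A^T  0 ] [x] = [0]
M = np.block([[np.eye(q), A],
              [A.T, np.zeros((n, n))]])
rhs = np.concatenate([b, np.zeros(n)])
sol = np.linalg.solve(M, rhs)                    # LAPACK: partial-pivoting LU
x_kkt = sol[q:]

x_lstsq = np.linalg.lstsq(A, b, rcond=None)[0]
print(np.max(np.abs(x_kkt - x_lstsq)))           # agreement to ~1e-12
```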