Pub Date : 2024-06-17, DOI: 10.1016/j.jcmds.2024.100095
Perfect Y. Gidisu, Michiel E. Hochstenbach
A CUR factorization is often utilized as a substitute for the singular value decomposition (SVD), especially when a concrete interpretation of the singular vectors is challenging. Moreover, if the original data matrix possesses properties like nonnegativity and sparsity, a CUR decomposition can better preserve them compared to the SVD. An essential aspect of this approach is the methodology used for selecting a subset of columns and rows from the original matrix. This study investigates the effectiveness of one-round sampling and iterative subselection techniques and introduces new iterative subselection strategies based on iterative SVDs. One provably appropriate technique for index selection in constructing a CUR factorization is the discrete empirical interpolation method (DEIM). Our contribution aims to improve the approximation quality of the DEIM scheme by iteratively invoking it in several rounds, in the sense that we select subsequent columns and rows based on the previously selected ones. Thus, we modify the matrix A after each iteration by removing the information that has been captured by the previously selected columns and rows. We also discuss how iterative procedures for computing a few singular vectors of large data matrices can be integrated with the new iterative subselection strategies. We present the results of numerical experiments, providing a comparison of one-round sampling and iterative subselection techniques, and demonstrating the improved approximation quality associated with using the latter.
{"title":"A DEIM-CUR factorization with iterative SVDs","authors":"Perfect Y. Gidisu, Michiel E. Hochstenbach","doi":"10.1016/j.jcmds.2024.100095","DOIUrl":"https://doi.org/10.1016/j.jcmds.2024.100095","url":null,"abstract":"<div><p>A CUR factorization is often utilized as a substitute for the singular value decomposition (SVD), especially when a concrete interpretation of the singular vectors is challenging. Moreover, if the original data matrix possesses properties like nonnegativity and sparsity, a CUR decomposition can better preserve them compared to the SVD. An essential aspect of this approach is the methodology used for selecting a subset of columns and rows from the original matrix. This study investigates the effectiveness of <em>one-round sampling</em> and iterative subselection techniques and introduces new iterative subselection strategies based on iterative SVDs. One provably appropriate technique for index selection in constructing a CUR factorization is the discrete empirical interpolation method (DEIM). Our contribution aims to improve the approximation quality of the DEIM scheme by iteratively invoking it in several rounds, in the sense that we select subsequent columns and rows based on the previously selected ones. Thus, we modify <span><math><mi>A</mi></math></span> after each iteration by removing the information that has been captured by the previously selected columns and rows. We also discuss how iterative procedures for computing a few singular vectors of large data matrices can be integrated with the new iterative subselection strategies. We present the results of numerical experiments, providing a comparison of one-round sampling and iterative subselection techniques, and demonstrating the improved approximation quality associated with using the latter.</p></div>","PeriodicalId":100768,"journal":{"name":"Journal of Computational Mathematics and Data Science","volume":"12 ","pages":"Article 100095"},"PeriodicalIF":0.0,"publicationDate":"2024-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2772415824000063/pdfft?md5=16d9fd47f077d52851c28e4d876eb3c0&pid=1-s2.0-S2772415824000063-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141484237","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-03-21, DOI: 10.1016/j.jcmds.2024.100094
A. Bocchinfuso, D. Calvetti, E. Somersalo
Dictionary learning methods continue to gain popularity for the solution of challenging inverse problems. In the dictionary learning approach, the computational forward model is replaced by a large dictionary of possible outcomes, and the problem is to identify the dictionary entries that best match the data, akin to traditional query matching in search engines. Sparse coding techniques are used to guarantee that the dictionary matching identifies only a few of the dictionary entries, and dictionary compression methods are used to reduce the complexity of the matching problem. In this article, we propose a workflow to facilitate the dictionary matching process. First, the full dictionary is divided into subdictionaries that are separately compressed. The error introduced by the dictionary compression is handled in the Bayesian framework as a modeling error. Furthermore, we propose a new Bayesian data-driven group sparsity coding method to help identify subdictionaries that are not relevant for the dictionary matching. After discarding irrelevant subdictionaries, the dictionary matching is addressed as a deflated problem using sparse coding. The compression and deflation steps can lead to substantial decreases in computational complexity. The effectiveness of compensating for the dictionary compression error and of using the novel group sparsity promotion to deflate the original dictionary is illustrated by applying the methodology to real-world problems: glitch detection in the LIGO experiment and hyperspectral remote sensing.
{"title":"Bayesian sparsity and class sparsity priors for dictionary learning and coding","authors":"A. Bocchinfuso , D. Calvetti, E. Somersalo","doi":"10.1016/j.jcmds.2024.100094","DOIUrl":"https://doi.org/10.1016/j.jcmds.2024.100094","url":null,"abstract":"<div><p>Dictionary learning methods continue to gain popularity for the solution of challenging inverse problems. In the dictionary learning approach, the computational forward model is replaced by a large dictionary of possible outcomes, and the problem is to identify the dictionary entries that best match the data, akin to traditional query matching in search engines. Sparse coding techniques are used to guarantee that the dictionary matching identifies only few of the dictionary entries, and dictionary compression methods are used to reduce the complexity of the matching problem. In this article, we propose a work flow to facilitate the dictionary matching process. First, the full dictionary is divided into subdictionaries that are separately compressed. The error introduced by the dictionary compression is handled in the Bayesian framework as a modeling error. Furthermore, we propose a new Bayesian data-driven group sparsity coding method to help identify subdictionaries that are not relevant for the dictionary matching. After discarding irrelevant subdictionaries, the dictionary matching is addressed as a deflated problem using sparse coding. The compression and deflation steps can lead to substantial decreases of the computational complexity. The effectiveness of compensating for the dictionary compression error and using the novel group sparsity promotion to deflate the original dictionary are illustrated by applying the methodology to real world problems, the glitch detection in the LIGO experiment and hyperspectral remote sensing.</p></div>","PeriodicalId":100768,"journal":{"name":"Journal of Computational Mathematics and Data Science","volume":"11 ","pages":"Article 100094"},"PeriodicalIF":0.0,"publicationDate":"2024-03-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2772415824000051/pdfft?md5=87116ca1a8ef189c30f80b5ed4b567bd&pid=1-s2.0-S2772415824000051-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140321133","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-02-08, DOI: 10.1016/j.jcmds.2024.100092
C.Y. Chew, G. Teng, Y.S. Lai
We present a simulation method for generating random variables from Erlang and negative binomial distributions using the generalized Lambert W function. The generalized Lambert W function is utilized to solve the quantile functions of these distributions, allowing for efficient and accurate generation of random variables. The simulation procedure is based on Halley’s method and is demonstrated through the generation of 100,000 random variables for each distribution. The results show close agreement with the theoretical mean and variance values, indicating the effectiveness of the proposed method. This approach offers a valuable tool for generating random variables from Erlang and negative binomial distributions in various applications.
{"title":"Simulation of Erlang and negative binomial distributions using the generalized Lambert W function","authors":"C.Y. Chew , G. Teng , Y.S. Lai","doi":"10.1016/j.jcmds.2024.100092","DOIUrl":"https://doi.org/10.1016/j.jcmds.2024.100092","url":null,"abstract":"<div><p>We present a simulation method for generating random variables from Erlang and negative binomial distributions using the generalized Lambert <span><math><mi>W</mi></math></span> function. The generalized Lambert <span><math><mi>W</mi></math></span> function is utilized to solve the quantile functions of these distributions, allowing for efficient and accurate generation of random variables. The simulation procedure is based on Halley’s method and is demonstrated through the generation of 100,000 random variables for each distribution. The results show close agreement with the theoretical mean and variance values, indicating the effectiveness of the proposed method. This approach offers a valuable tool for generating random variables from Erlang and negative binomial distributions in various applications.</p></div>","PeriodicalId":100768,"journal":{"name":"Journal of Computational Mathematics and Data Science","volume":"10 ","pages":"Article 100092"},"PeriodicalIF":0.0,"publicationDate":"2024-02-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2772415824000038/pdfft?md5=106597da03409e1369af24276ca25af6&pid=1-s2.0-S2772415824000038-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139737745","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-02-07, DOI: 10.1016/j.jcmds.2024.100091
Martyna Lukaszewicz, Ousseini Issaka Salia, Paul A. Hohenlohe, Erkan O. Buzbas
Statistical estimation of parameters in large models of evolutionary processes is often too computationally inefficient to pursue using exact model likelihoods, even with single-nucleotide polymorphism (SNP) data, which offers a way to reduce the size of genetic data while retaining relevant information. Approximate Bayesian Computation (ABC) to perform statistical inference about parameters of large models takes advantage of simulations to bypass direct evaluation of model likelihoods. We develop a mechanistic model to simulate forward-in-time divergent selection with variable migration rates, modes of reproduction (sexual, asexual), and length and number of migration-selection cycles. We investigate the computational feasibility of ABC to perform statistical inference and study the quality of estimates of the position of loci under selection and the strength of selection. To expand the parameter space of positions under selection, we enhance the model by implementing an outlier scan on summarized observed data. We evaluate the usefulness of summary statistics well known to capture the strength of selection, and assess their informativeness under divergent selection. We also evaluate the effect of genetic drift with respect to an idealized deterministic model with single-locus selection. We discuss the role of the recombination rate as a confounding factor in estimating the strength of divergent selection, and emphasize its importance in the breakdown of linkage disequilibrium (LD). We answer the question for which part of the parameter space of the model we recover a strong signal for estimating selection, and determine whether population differentiation-based summary statistics or LD-based summary statistics perform well in estimating selection.
{"title":"Approximate Bayesian computational methods to estimate the strength of divergent selection in population genomics models","authors":"Martyna Lukaszewicz , Ousseini Issaka Salia , Paul A. Hohenlohe , Erkan O. Buzbas","doi":"10.1016/j.jcmds.2024.100091","DOIUrl":"https://doi.org/10.1016/j.jcmds.2024.100091","url":null,"abstract":"<div><p>Statistical estimation of parameters in large models of evolutionary processes is often too computationally inefficient to pursue using exact model likelihoods, even with single-nucleotide polymorphism (SNP) data, which offers a way to reduce the size of genetic data while retaining relevant information. Approximate Bayesian Computation (ABC) to perform statistical inference about parameters of large models takes the advantage of simulations to bypass direct evaluation of model likelihoods. We develop a mechanistic model to simulate forward-in-time divergent selection with variable migration rates, modes of reproduction (sexual, asexual), length and number of migration-selection cycles. We investigate the computational feasibility of ABC to perform statistical inference and study the quality of estimates on the position of loci under selection and the strength of selection. To expand the parameter space of positions under selection, we enhance the model by implementing an outlier scan on summarized observed data. We evaluate the usefulness of summary statistics well-known to capture the strength of selection, and assess their informativeness under divergent selection. We also evaluate the effect of genetic drift with respect to an idealized deterministic model with single-locus selection. We discuss the role of the recombination rate as a confounding factor in estimating the strength of divergent selection, and emphasize its importance in break down of linkage disequilibrium (LD). We answer the question for which part of the parameter space of the model we recover strong signal for estimating the selection, and determine whether population differentiation-based summary statistics or LD–based summary statistics perform well in estimating selection.</p></div>","PeriodicalId":100768,"journal":{"name":"Journal of Computational Mathematics and Data Science","volume":"10 ","pages":"Article 100091"},"PeriodicalIF":0.0,"publicationDate":"2024-02-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2772415824000026/pdfft?md5=74c5a713f0b6de0a968b0a22ee2b9d09&pid=1-s2.0-S2772415824000026-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139731929","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-01-14, DOI: 10.1016/j.jcmds.2024.100090
Xin Guo, Jiequn Han, Mahan Tajrobehkar, Wenpin Tang
Motivated by the super-diffusivity of self-repelling random walk, which has roots in statistical physics, this paper develops a new perturbation mechanism for optimization algorithms. In this mechanism, perturbations are adapted to the history of states via the notion of occupation time. After integrating this mechanism into the framework of perturbed gradient descent (PGD) and perturbed accelerated gradient descent (PAGD), two new algorithms are proposed: perturbed gradient descent adapted to occupation time (PGDOT) and its accelerated version (PAGDOT). PGDOT and PAGDOT are guaranteed to avoid getting stuck at non-degenerate saddle points, and are shown to converge to second-order stationary points at least as fast as PGD and PAGD, respectively. The theoretical analysis is corroborated by empirical studies in which the new algorithms consistently escape saddle points and outperform not only their counterparts, PGD and PAGD, but also other popular alternatives including stochastic gradient descent, Adam, and several state-of-the-art adaptive gradient methods.
{"title":"Escaping saddle points efficiently with occupation-time-adapted perturbations","authors":"Xin Guo , Jiequn Han , Mahan Tajrobehkar , Wenpin Tang","doi":"10.1016/j.jcmds.2024.100090","DOIUrl":"https://doi.org/10.1016/j.jcmds.2024.100090","url":null,"abstract":"<div><p>Motivated by the super-diffusivity of self-repelling random walk, which has roots in statistical physics, this paper develops a new perturbation mechanism for optimization algorithms. In this mechanism, perturbations are adapted to the history of states via the notion of occupation time. After integrating this mechanism into the framework of perturbed gradient descent (PGD) and perturbed accelerated gradient descent (PAGD), two new algorithms are proposed: perturbed gradient descent adapted to occupation time (PGDOT) and its accelerated version (PAGDOT). PGDOT and PAGDOT are guaranteed to avoid getting stuck at non-degenerate saddle points, and are shown to converge to second-order stationary points at least as fast as PGD and PAGD, respectively. The theoretical analysis is corroborated by empirical studies in which the new algorithms consistently escape saddle points and outperform not only their counterparts, PGD and PAGD, but also other popular alternatives including stochastic gradient descent, Adam, and several state-of-the-art adaptive gradient methods.</p></div>","PeriodicalId":100768,"journal":{"name":"Journal of Computational Mathematics and Data Science","volume":"10 ","pages":"Article 100090"},"PeriodicalIF":0.0,"publicationDate":"2024-01-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2772415824000014/pdfft?md5=ef92b7ba4259b7a90a297dea99cfb00a&pid=1-s2.0-S2772415824000014-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139503605","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2023-12-18, DOI: 10.1016/j.jcmds.2023.100089
Ahmed Yunus, Jonathan Loo
To effectively prevent crimes, it is vital to anticipate their patterns and likely occurrences. Our efforts focused on analyzing diverse open-source datasets related to London, such as Met Police records, public social media posts, and data from transportation hubs such as bus and rail stations. These datasets provided rich insights into human behaviors, activities, and demographics across different parts of London, paving the way for a machine learning-driven prediction system. We developed this system using unique crime-related features extracted from these datasets. Furthermore, our study outlined methods to gather detailed street-level information from local communities using various applications. This innovative approach significantly enhances our ability to deeply understand and predict crime patterns. The proposed predictive system has the potential to forecast potential crimes in advance, enabling government bodies to proactively deploy targeted interventions, ultimately aiming to prevent and address criminal incidents more effectively.
{"title":"London street crime analysis and prediction using crowdsourced dataset","authors":"Ahmed Yunus, Jonathan Loo","doi":"10.1016/j.jcmds.2023.100089","DOIUrl":"10.1016/j.jcmds.2023.100089","url":null,"abstract":"<div><p>To effectively prevent crimes, it is vital to anticipate their patterns and likely occurrences. Our efforts focused on analyzing diverse open-source datasets related to London, such as the Met police records, public social media posts, data from transportation hubs like bus and rail stations etc. These datasets provided rich insights into human behaviors, activities, and demographics across different parts of London, paving the way for a machine learning-driven prediction system. We developed this system using unique crime-related features extracted from these datasets. Furthermore, our study outlined methods to gather detailed street-level information from local communities using various applications. This innovative approach significantly enhances our ability to deeply understand and predict crime patterns. The proposed predictive system has the potential to forecast potential crimes in advance, enabling government bodies to proactively deploy targeted interventions, ultimately aiming to prevent and address criminal incidents more effectively.</p></div>","PeriodicalId":100768,"journal":{"name":"Journal of Computational Mathematics and Data Science","volume":"10 ","pages":"Article 100089"},"PeriodicalIF":0.0,"publicationDate":"2023-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2772415823000160/pdfft?md5=9901b92589c99927f4a51aa0d969d7a5&pid=1-s2.0-S2772415823000160-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139017375","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2023-11-02, DOI: 10.1016/j.jcmds.2023.100086
Luis Pedro Silvestrin, Harry van Zanten, Mark Hoogendoorn, Ger Koole
With the development of new sensors and monitoring devices, more sources of data become available to be used as inputs for machine learning models. These can on the one hand help to improve the accuracy of a model. On the other hand, combining these new inputs with historical data remains a challenge that has not yet been studied in enough detail. In this work, we propose a transfer learning algorithm that combines new and historical data with different input dimensions. This approach is easy to implement, efficient, with computational complexity equivalent to the ordinary least-squares method, and requires no hyperparameter tuning, making it straightforward to apply when the new data is limited. Different from other approaches, we provide a rigorous theoretical study of its robustness, showing that it cannot be outperformed by a baseline that utilizes only the new data. Our approach achieves state-of-the-art performance on 9 real-life datasets, outperforming the linear DSFT, another linear transfer learning algorithm, and performing comparably to non-linear DSFT.
{"title":"Transfer learning across datasets with different input dimensions: An algorithm and analysis for the linear regression case","authors":"Luis Pedro Silvestrin , Harry van Zanten , Mark Hoogendoorn , Ger Koole","doi":"10.1016/j.jcmds.2023.100086","DOIUrl":"https://doi.org/10.1016/j.jcmds.2023.100086","url":null,"abstract":"<div><p>With the development of new sensors and monitoring devices, more sources of data become available to be used as inputs for machine learning models. These can on the one hand help to improve the accuracy of a model. On the other hand, combining these new inputs with historical data remains a challenge that has not yet been studied in enough detail. In this work, we propose a transfer learning algorithm that combines new and historical data with different input dimensions. This approach is easy to implement, efficient, with computational complexity equivalent to the ordinary least-squares method, and requires no hyperparameter tuning, making it straightforward to apply when the new data is limited. Different from other approaches, we provide a rigorous theoretical study of its robustness, showing that it cannot be outperformed by a baseline that utilizes only the new data. Our approach achieves state-of-the-art performance on 9 real-life datasets, outperforming the linear DSFT, another linear transfer learning algorithm, and performing comparably to non-linear DSFT.<span><sup>1</sup></span></p></div>","PeriodicalId":100768,"journal":{"name":"Journal of Computational Mathematics and Data Science","volume":"9 ","pages":"Article 100086"},"PeriodicalIF":0.0,"publicationDate":"2023-11-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2772415823000135/pdfft?md5=8c5d403909a1ea698959ce44c171ed61&pid=1-s2.0-S2772415823000135-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134657055","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2023-09-30, DOI: 10.1016/j.jcmds.2023.100085
Chartese Jones
The notion of improvement plays out in many forms in our lives. We look for better quality, faster speed, and smoother connections. To achieve our desired goals, we must ask questions. How to make a process stronger? How to make a process more efficient? How to make a process more effective? Image denoising plays a vital role in many professions, and understanding how noise can be present in images has led to multiple denoising techniques. These techniques include total variation regularization, non-local regularization, sparse representation, and low-rank minimization, to name a few. Many of these techniques exist because of the concept of improvement. First, we have a change (problem). This change invokes thoughts and questions. How these changes occur and how they are handled play an essential role in the realization or malfunction of that process. With this understanding, we first look to fully understand the process in order to achieve success. As it relates to image denoising, the non-local means filter is incredibly effective in image reconstruction. In particular, the non-local means filter removes noise and sharpens edges without losing too many fine structures and details. The non-local means algorithm is also remarkably accurate. However, the disadvantage that plagues the non-local means filtering algorithm is its computational burden, which is due to the non-local averaging. In this paper, we investigate innovative ways to reduce the computational burden and enhance the effectiveness of this filtering process. Research examining image analysis shows there is a battle between noise reduction and the preservation of actual features, which makes the reduction of noise a difficult task. For exploration, we propose a quarter-match non-local means denoising filtering algorithm. The filters help to identify a more concentrated region in the image, thereby enhancing the computational efficiency of existing non-local means denoising methods and producing an enriched comparison for overlaying in the restoration process. To survey the constructs of this new algorithm, the authors use the original non-local means filtering algorithm, referred to as the “State of the Art”, and other selective processes to test the effectiveness and efficiency of the new model. When comparing the original non-local means with the new quarter-match filtering algorithm, we can, on average, reduce the computational cost by half while improving the quality of the image. To further test our new algorithm, magnetic resonance (MR) and synthetic aperture radar (SAR) images are used as specimens for real-world applications.
{"title":"Quarter match non-local means algorithm for noise removal","authors":"Chartese Jones","doi":"10.1016/j.jcmds.2023.100085","DOIUrl":"https://doi.org/10.1016/j.jcmds.2023.100085","url":null,"abstract":"<div><p>The notion of improving plays out in many forms in our lives. We look for better quality, faster speed, and leisurelier connections. To achieve our desired goals, we must ask questions. How to make a process stronger? How to make a process more efficient? How to make a process more effective? Image denoising plays a vital role in many professions and understanding how noise can be present in images has led to multiple denoising techniques. These techniques include total variation regularization, non-local regularization, sparse representation, and low-rank minimization just to name a few. Many of these techniques exist because of the concept of improvement. First, we have a change (problem). This change invokes thoughts and questions. How these changes occur and how they are handled play an essential role in the realization or malfunction of that process. With this understanding, first, we look to fully understand the process to achieve success. As it relates to image denoising, the non-local means is incredibly effective in image reconstruction. In particular, the non-local means filter removes noise and sharpens edges without losing too many fine structures and details. Also, the non-local means algorithm is amazingly accurate. Consequently, the disadvantage that plagues the non-local means filtering algorithm is the computational burden and it is due to the non-local averaging. In this paper, we investigate innovative ways to reduce the computational burden and enhance the effectiveness of this filtering process. Research examining image analysis shows there is a battle between noise reduction and the preservation of actual features, which makes the reduction of noise a difficult task. For exploration, we propose a quarter-match non-local means denoising filtering algorithm. The filters help to classify a more concentrated region in the image and thereby enhance the computational efficiency of the existing non-local means denoising methods and produce an enriched comparison for overlying in the restoration process. To survey the constructs of this new algorithm, the authors use the original non-local means filtering algorithm, which is coined, “State of the Art” and other selective processes to test the effectiveness and efficiency of the new model. When comparing the original non-local means with the new quarter match filtering algorithm, on average, we can reduce the computational cost by half, while improving the quality of the image. To further test our new algorithm, medical resonance (MR) and synthetic aperture radar (SAR) images are used as specimens for real-world applications.</p></div>","PeriodicalId":100768,"journal":{"name":"Journal of Computational Mathematics and Data Science","volume":"9 ","pages":"Article 100085"},"PeriodicalIF":0.0,"publicationDate":"2023-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"50194986","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2023-08-24, DOI: 10.1016/j.jcmds.2023.100083
Juarez S. Azevedo, Jarbas A. Fernandes
In many situations, uncertainty about the mechanical properties of surrounding soils due to the lack of data and spatial variations requires tools that involve the study of parameters by means of random variables or random functions. Usually only a few measurements of parameters, such as permeability or porosity, are available to build a model, and some measurements of the geomechanical behavior, such as displacements, stresses, and strains, are needed to check/calibrate the model. In order to introduce this type of modeling in geomechanical analysis, taking into account the random nature of soil parameters, Bayesian inference techniques are implemented in highly heterogeneous porous media. Within the framework of a coupling algorithm, these are incorporated into the inverse poroelasticity problem, with porosity, permeability and Young's modulus treated as stationary random fields obtained by the moving average (MA) method. To this end, the Metropolis–Hastings (MH) algorithm was chosen to seek the geomechanical parameters that yield the lowest misfit. Numerical simulations related to injection problems and fluid withdrawal in a 3D domain are performed to compare the performance of this methodology. We conclude with some remarks about numerical experiments.
{"title":"The parameter inversion in coupled geomechanics and flow simulations using Bayesian inference","authors":"Juarez S. Azevedo , Jarbas A. Fernandes","doi":"10.1016/j.jcmds.2023.100083","DOIUrl":"https://doi.org/10.1016/j.jcmds.2023.100083","url":null,"abstract":"<div><p>In many situations, uncertainty about the mechanical properties of surrounding soils due to the lack of data and spatial variations requires tools that involve the study of parameters by means of random variables or random functions. Usually only a few measurements of parameters, such as permeability or porosity, are available to build a model, and some measurements of the geomechanical behavior, such as displacements, stresses, and strains are needed to check/calibrate the model. In order to introduce this type of modeling in geomechanical analysis, taking into account the random nature of soil parameters, Bayesian inference techniques are implemented in highly heterogeneous porous media. Within the framework of a coupling algorithm, these are incorporated into the inverse poroelasticity problem, with porosity, permeability and Young modulus treated as stationary random fields obtained by the moving average (MA) method. To this end, the Metropolis–Hasting (MH) algorithm was chosen to seek the geomechanical parameters that yield the lowest misfit. Numerical simulations related to injection problems and fluid withdrawal in a <span><math><mrow><mn>3</mn><mi>D</mi></mrow></math></span> domain are performed to compare the performance of this methodology. We conclude with some remarks about numerical experiments.</p></div>","PeriodicalId":100768,"journal":{"name":"Journal of Computational Mathematics and Data Science","volume":"9 ","pages":"Article 100083"},"PeriodicalIF":0.0,"publicationDate":"2023-08-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"50194987","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2023-08-01, DOI: 10.1016/j.jcmds.2023.100082
Yen Lee Loh
The discrete Laplace transform (DLT) with M inputs and N outputs has a nominal computational cost of O(MN). There are approximate DLT algorithms with O(M+N) cost such that the output errors divided by the sum of the inputs are less than a fixed tolerance η. However, certain important applications of DLTs require a more stringent accuracy criterion, where the output errors divided by the true output values are less than η. We present a fast DLT algorithm combining two strategies. The bottom-up strategy exploits the Taylor expansion of the Laplace transform kernel. The top-down strategy chooses groups of terms in the DLT to include or neglect, based on the whole summand, and not just on the Laplace transform kernel. The overall effort is O(M+N) when the source and target points are very dense or very sparse, and appears to be O((M+N)^1.5) in the intermediate regime. Our algorithm achieves the same accuracy as brute-force evaluation, and is typically 10–100 times faster.
{"title":"Fast discrete Laplace transforms","authors":"Yen Lee Loh","doi":"10.1016/j.jcmds.2023.100082","DOIUrl":"https://doi.org/10.1016/j.jcmds.2023.100082","url":null,"abstract":"<div><p>The discrete Laplace transform (DLT) with <span><math><mi>M</mi></math></span> inputs and <span><math><mi>N</mi></math></span> outputs has a nominal computational cost of <span><math><mrow><mi>O</mi><mrow><mo>(</mo><mi>M</mi><mi>N</mi><mo>)</mo></mrow></mrow></math></span>. There are approximate DLT algorithms with <span><math><mrow><mi>O</mi><mrow><mo>(</mo><mi>M</mi><mo>+</mo><mi>N</mi><mo>)</mo></mrow></mrow></math></span> cost such that the output errors divided by the <em>sum of the inputs</em> are less than a fixed tolerance <span><math><mi>η</mi></math></span>. However, certain important applications of DLTs require a more stringent accuracy criterion, where the output errors divided by the <em>true output values</em> are less than <span><math><mi>η</mi></math></span>. We present a fast DLT algorithm combining two strategies. The bottom-up strategy exploits the Taylor expansion of the Laplace transform kernel. The top-down strategy chooses groups of terms in the DLT to include or neglect, based on the whole summand, and not just on the Laplace transform kernel. The overall effort is <span><math><mrow><mi>O</mi><mrow><mo>(</mo><mi>M</mi><mo>+</mo><mi>N</mi><mo>)</mo></mrow></mrow></math></span> when the source and target points are very dense or very sparse, and appears to be <span><math><mrow><mi>O</mi><mrow><mo>(</mo><msup><mrow><mrow><mo>(</mo><mi>M</mi><mo>+</mo><mi>N</mi><mo>)</mo></mrow></mrow><mrow><mn>1</mn><mo>.</mo><mn>5</mn></mrow></msup><mo>)</mo></mrow></mrow></math></span> in the intermediate regime. Our algorithm achieves the same accuracy as brute-force evaluation, and is typically 10–100 times faster.</p></div>","PeriodicalId":100768,"journal":{"name":"Journal of Computational Mathematics and Data Science","volume":"8 ","pages":"Article 100082"},"PeriodicalIF":0.0,"publicationDate":"2023-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"50194985","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}