Foundations of data science (Springfield, Mo.): Latest Articles

CHATGPT FOR COMPUTATIONAL TOPOLOGY.
IF 1.7 Q2 MATHEMATICS, APPLIED Pub Date : 2024-06-01 DOI: 10.3934/fods.2024009
Jian Liu, Li Shen, Guo-Wei Wei

ChatGPT represents a significant milestone in the field of artificial intelligence (AI), finding widespread applications across diverse domains. However, its effectiveness in mathematical contexts has been somewhat constrained by its susceptibility to conceptual errors. Concurrently, topological data analysis (TDA), a relatively new discipline, has garnered substantial interest in recent years. Nonetheless, the advancement of TDA is impeded by the limited understanding of computational algorithms and coding proficiency among theoreticians. This work endeavors to bridge the gap between theoretical topological concepts and their practical implementation in computational topology through the utilization of ChatGPT. We showcase how a pure theoretician, devoid of computational experience and coding skills, can effectively transform mathematical formulations and concepts into functional codes for computational topology with the assistance of ChatGPT. Our strategy outlines a productive process wherein a mathematician trains ChatGPT on pure mathematical concepts, steers ChatGPT towards generating computational topology codes, and subsequently validates the generated codes using established examples. Our specific case studies encompass the computation of Betti numbers, Laplacian matrices, and Dirac matrices for simplicial complexes, as well as the persistence of various homologies and Laplacians. Furthermore, we explore the application of ChatGPT in computing recently developed topological theories for hypergraphs and digraphs, as well as the persistent harmonic space, which has not been computed in the literature, to the best of our knowledge. This work serves as an initial step towards effectively transforming pure mathematical theories into practical computational tools, with the ultimate goal of enabling real applications across diverse fields.
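
As a rough illustration of the kind of computation the abstract mentions (Betti numbers and Laplacian matrices of a simplicial complex), here is a minimal NumPy sketch. It is not the code from the paper: the helper name `boundary_matrix` and the example complex (a hollow triangle) are assumptions chosen for illustration.

```python
import numpy as np

def boundary_matrix(faces, simplices):
    """Matrix of the boundary map sending k-simplices to (k-1)-faces.

    Simplices are sorted vertex tuples; the i-th face (vertex i dropped)
    gets the sign (-1)**i, the usual orientation convention.
    """
    index = {f: i for i, f in enumerate(faces)}
    B = np.zeros((len(faces), len(simplices)))
    for j, s in enumerate(simplices):
        for i in range(len(s)):
            face = s[:i] + s[i + 1:]  # drop the i-th vertex
            B[index[face], j] = (-1) ** i
    return B

# Hypothetical example: a hollow triangle (3 vertices, 3 edges, no 2-simplex).
vertices = [(0,), (1,), (2,)]
edges = [(0, 1), (0, 2), (1, 2)]

B1 = boundary_matrix(vertices, edges)
rank_B1 = np.linalg.matrix_rank(B1)

# Betti numbers via rank-nullity; there are no triangles, so rank(B2) = 0.
betti0 = len(vertices) - rank_B1
betti1 = len(edges) - rank_B1

# Combinatorial Laplacians: L0 = B1 B1^T, and L1 = B1^T B1 (the B2 term vanishes here).
L0 = B1 @ B1.T
L1 = B1.T @ B1

# The multiplicity of the zero eigenvalue of L1 equals betti1.
harmonic_dim = int(np.sum(np.isclose(np.linalg.eigvalsh(L1), 0.0, atol=1e-9)))
```

Checking `harmonic_dim` against `betti1` on a small complex with a known answer mirrors the validation step the abstract describes, where generated code is tested against established examples.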
Foundations of data science (Springfield, Mo.), vol. 6, no. 2, pp. 221-250. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11463974/pdf/
Citations: 0
PERSISTENT PATH LAPLACIAN.
Q2 MATHEMATICS, APPLIED Pub Date : 2023-03-01 DOI: 10.3934/fods.2022015
Rui Wang, Guo-Wei Wei

Path homology, proposed by S.-T. Yau and his co-workers, provides a new mathematical model for directed graphs and networks. Persistent path homology (PPH) extends path homology with filtration to deal with asymmetric structures. However, PPH is constrained to purely topological persistence and cannot track the homotopic shape evolution of data during filtration. To overcome this limitation of PPH, the persistent path Laplacian (PPL) is introduced to capture the shape evolution of data. PPL's harmonic spectra fully recover PPH's topological persistence, and its non-harmonic spectra reveal the homotopic shape evolution of data during filtration.
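
The distinction between harmonic and non-harmonic spectra can be illustrated with a deliberately simplified sketch. It is restricted to dimension 0 and to a single filtration value at a time, where the Laplacian built from the edge boundary map is the ordinary graph Laplacian of the edges present; it is not the full PPL construction. The weighted digraph and the helper `laplacian0` are hypothetical.

```python
import numpy as np

def laplacian0(n_vertices, edges):
    """0-th Laplacian B1 @ B1.T, where B1 sends a directed edge a->b to (b - a)."""
    B = np.zeros((n_vertices, len(edges)))
    for j, (a, b) in enumerate(edges):
        B[a, j] = -1.0
        B[b, j] = 1.0
    return B @ B.T

# Hypothetical filtered digraph: (tail, head, filtration value).
weighted_edges = [(0, 1, 1.0), (1, 2, 2.0), (2, 3, 2.0), (3, 0, 3.0)]

summary = []
for t in [0.0, 1.0, 2.0, 3.0]:
    # Keep only the edges that have appeared by filtration value t.
    edges = [(a, b) for a, b, w in weighted_edges if w <= t]
    eig = np.linalg.eigvalsh(laplacian0(4, edges))
    zero = np.isclose(eig, 0.0, atol=1e-9)
    betti0 = int(zero.sum())  # harmonic part: number of connected components
    # Non-harmonic part: smallest nonzero eigenvalue, if there is one.
    lam_min = float(eig[~zero].min()) if (~zero).any() else None
    summary.append((t, betti0, lam_min))
```

In this toy filtration the zero-eigenvalue count stays the same between the last two steps, while the smallest nonzero eigenvalue changes when the path closes into a cycle. That is the kind of shape change the harmonic part alone does not register.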
Foundations of data science (Springfield, Mo.), vol. 5, no. 1, pp. 26-55. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10575407/pdf/nihms-1888540.pdf
Citations: 13
Diagnostic of the Lévy area for geophysical flow models in view of defining high order stochastic discrete-time schemes
Q2 MATHEMATICS, APPLIED Pub Date : 2023-01-01 DOI: 10.3934/fods.2023011
Pierre-Marie Boulvard, Etienne Mémin
In this paper we characterize numerically through two criteria the Lévy area related to unresolved fluctuation velocities associated to a stochastic coarse-scale representation of geophysical fluid flow dynamics. We study in particular whether or not the process associated to the random unresolved velocity components exhibits a Lévy area corresponding to a Wiener process, and if the law of this process can reasonably be approached by a centered Dirac measure. This exploration enables us to answer positively to a conjecture made for the constitution of high-order discrete time evolution schemes for stochastic representation defined from stochastic transport.
Citations: 1
Hierarchical regularization networks for sparsification based learning on noisy datasets
Q2 MATHEMATICS, APPLIED Pub Date : 2023-01-01 DOI: 10.3934/fods.2023009
P. Shekhar, M. Babu, Abani Patra
Citations: 0
Persistent hyperdigraph homology and persistent hyperdigraph Laplacians
Q2 MATHEMATICS, APPLIED Pub Date : 2023-01-01 DOI: 10.3934/fods.2023010
Dong Chen, Jian Liu, Jie Wu, Guo-Wei Wei
Hypergraphs are useful mathematical models for describing complex relationships among members of a structured graph, while hyperdigraphs serve as a generalization that can encode asymmetric relationships in the data. However, obtaining topological information directly from hyperdigraphs remains a challenge. To address this issue, we introduce hyperdigraph homology in this work. We also propose topological hyperdigraph Laplacians, which can extract both harmonic spectra and non-harmonic spectra from directed and internally organized data. Moreover, we introduce persistent hyperdigraph homology and persistent hyperdigraph Laplacians through filtration, enabling the capture of topological persistence and homotopic shape evolution of directed and structured data across multiple scales. The proposed methods offer new multiscale algebraic topology tools for topological data analysis.
Citations: 9
Unsupervised learning of observation functions in state space models by nonparametric moment methods
Q2 MATHEMATICS, APPLIED Pub Date : 2023-01-01 DOI: 10.3934/fods.2023002
Qingci An, Yannis Kevrekidis, Fei Lu, Mauro Maggioni
We investigate the unsupervised learning of non-invertible observation functions in nonlinear state space models. Assuming abundant data of the observation process along with the distribution of the state process, we introduce a nonparametric generalized moment method to estimate the observation function via constrained regression. The major challenge comes from the non-invertibility of the observation function and the lack of data pairs between the state and observation. We address the fundamental issue of identifiability from quadratic loss functionals and show that the function space of identifiability is the closure of a RKHS that is intrinsic to the state process. Numerical results show that the first two moments and temporal correlations, along with upper and lower bounds, can identify functions ranging from piecewise polynomials to smooth functions, leading to convergent estimators. The limitations of this method, such as non-identifiability due to symmetry and stationarity, are also discussed.
Citations: 1
Noise calibration for SPDEs: A case study for the rotating shallow water model
Q2 MATHEMATICS, APPLIED Pub Date : 2023-01-01 DOI: 10.3934/fods.2023012
Dan Crisan, Oana Lang, Alexander Lobbe, Peter-Jan van Leeuwen, Roland Potthast
Citations: 0
Weight set decomposition for weighted rank and rating aggregation: An interpretable and visual decision support tool
Q2 MATHEMATICS, APPLIED Pub Date : 2023-01-01 DOI: 10.3934/fods.2023001
Tyler A. Perini, A. Langville, Glenn Kramer, Jeff Shrager, Mark Shapiro
Citations: 1
GEOMETRIC STRUCTURE GUIDED MODEL AND ALGORITHMS FOR COMPLETE DECONVOLUTION OF GENE EXPRESSION DATA.
Q2 MATHEMATICS, APPLIED Pub Date : 2022-09-01 DOI: 10.3934/fods.2022013
Duan Chen, Shaoyu Li, Xue Wang

Complete deconvolution analysis for bulk RNA-seq data is important and helpful to distinguish whether the differences of disease-associated GEPs (gene expression profiles) in tissues of patients and normal controls are due to changes in cellular composition of tissue samples, or due to GEP changes in specific cells. One of the major techniques to perform complete deconvolution is nonnegative matrix factorization (NMF), which also has a wide range of applications in the machine learning community. However, NMF is a well-known strongly ill-posed problem, so a direct application of NMF to RNA-seq data will suffer severe difficulties in the interpretability of solutions. In this paper, we develop an NMF-based mathematical model and corresponding computational algorithms to improve the solution identifiability of deconvoluting bulk RNA-seq data. In our approach, we combine the biological concept of marker genes with the solvability conditions of the NMF theories, and develop a geometric-structure-guided optimization model. In this strategy, the geometric structure of bulk tissue data is first explored by the spectral clustering technique. Then, the identified information of marker genes is integrated as solvability constraints, while the overall correlation graph is used as manifold regularization. Both synthetic and biological data are used to validate the proposed model and algorithms, from which solution interpretability and accuracy are significantly improved.
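
For orientation, plain NMF (the baseline the abstract starts from, not the authors' geometric-structure-guided model) can be sketched with the classical multiplicative updates for the Frobenius objective. The function name `nmf`, the synthetic data, and all parameter values are assumptions for illustration; the marker-gene constraints and manifold regularization described in the abstract are not included.

```python
import numpy as np

def nmf(X, k, iters=500, seed=0, eps=1e-10):
    """Factor a nonnegative matrix X (genes x samples) as W @ H with W, H >= 0.

    Uses multiplicative updates for the squared Frobenius error.
    W plays the role of cell-type expression profiles, H of proportions.
    """
    rng = np.random.default_rng(seed)
    m, n = X.shape
    W = rng.random((m, k)) + eps
    H = rng.random((k, n)) + eps
    for _ in range(iters):
        H *= (W.T @ X) / (W.T @ W @ H + eps)  # update proportions
        W *= (X @ H.T) / (W @ H @ H.T + eps)  # update profiles
    return W, H

def rel_error(X, W, H):
    return np.linalg.norm(X - W @ H) / np.linalg.norm(X)

# Synthetic "bulk" data: 20 genes, 8 samples, 3 hypothetical cell types.
rng = np.random.default_rng(1)
W_true = rng.random((20, 3))
H_true = rng.random((3, 8))
H_true /= H_true.sum(axis=0)  # each sample's proportions sum to 1
X = W_true @ H_true

W, H = nmf(X, k=3)
err = rel_error(X, W, H)
```

Even when the reconstruction error is small, the recovered `W` and `H` need not match `W_true` and `H_true`: rescaling or mixing the factors can give an equally good fit. This non-uniqueness is the identifiability problem that the additional constraints in the abstract are meant to address.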
Foundations of data science (Springfield, Mo.), pp. 441-466. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10798655/pdf/
Citations: 0
ASPECTS OF TOPOLOGICAL APPROACHES FOR DATA SCIENCE.
Q2 MATHEMATICS, APPLIED Pub Date : 2022-06-01 DOI: 10.3934/fods.2022002
Jelena Grbić, Jie Wu, Kelin Xia, Guo-Wei Wei

We establish a new theory which unifies various aspects of topological approaches for data science, by being applicable both to point cloud data and to graph data, including networks beyond pairwise interactions. We generalize simplicial complexes and hypergraphs to super-hypergraphs and establish super-hypergraph homology as an extension of simplicial homology. Driven by applications, we also introduce super-persistent homology.
Foundations of data science (Springfield, Mo.), vol. 4, no. 2, pp. 165-216. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9881677/pdf/nihms-1825620.pdf
Citations: 0