
Annals of Mathematics and Artificial Intelligence: Latest Publications

Personalized choice prediction with less user information
IF 1.2 | CAS Zone 4 (Computer Science) | Q4 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-01-30 | DOI: 10.1007/s10472-024-09927-9 | Vol. 92(6), pp. 1489-1509
Francine Chen, Yanxia Zhang, Minh Nguyen, Matt Klenk, Charlene Wu

While most models of human choice are linear to ease interpretation, it is not clear whether linear models are good models of human decision making. Prior studies have investigated how task conditions and group characteristics, such as personality or socio-demographic background, influence human decisions, but no prior work has investigated how to use less personal information for choice prediction. We propose a deep learning model based on self-attention and cross-attention that models human decision making while taking into account both subject-specific information and task conditions. We show that our model consistently predicts human decisions more accurately than linear models and other baseline models while remaining interpretable. In addition, although a larger amount of subject-specific information generally leads to more accurate choice prediction, collecting additional surveys to gather subject background information burdens subjects and is costly and time-consuming. To address this, we introduce a training scheme that reduces the number of surveys that must be collected in order to achieve more accurate predictions.
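Since the core mechanism named in the abstract is attention, a generic sketch may be useful. The following is standard scaled dot-product attention in plain Python, not the authors' architecture; reading the queries as task-condition embeddings and the keys/values as subject-specific information is an illustrative assumption.

```python
import math

def attention(queries, keys, values):
    """Scaled dot-product attention (generic sketch). In a cross-attention
    reading, `queries` could encode task conditions while `keys`/`values`
    encode subject-specific information; that pairing is an assumption."""
    d = len(keys[0])
    out = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        m = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        weights = [w / total for w in weights]
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out

# A zero query scores every key equally, so the output is the mean of the values.
result = attention([[0.0, 0.0]], [[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [3.0, 0.0]])
```

With the zero query above, both attention weights are 0.5 and the output row is the average of the two value vectors.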

Citations: 0
Clique detection with a given reliability
IF 1.2 | CAS Zone 4 (Computer Science) | Q4 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-01-29 | DOI: 10.1007/s10472-024-09928-8
Dmitry Semenov, Alexander Koldanov, Petr Koldanov, Panos Pardalos

In this paper we propose a new notion of clique reliability, understood as the ratio of the number of statistically significant links in a clique to the number of edges of the clique. This notion relies on a recently proposed technique for separating inferences about pairwise connections between network vertices into significant and admissible ones. We extend this technique to the problem of clique detection and propose a method for the step-by-step construction of a clique with a given reliability. We present results of constructing cliques with a given reliability using return data for the stocks included in the Dow Jones index.
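The reliability ratio defined above is straightforward to compute once the pairwise significance test has been run. A minimal sketch of the definition, assuming the set of significant edges is given (the significance-testing step itself is out of scope here):

```python
from itertools import combinations

def clique_reliability(clique, significant_edges):
    """Reliability of a clique: the ratio of statistically significant
    links to the total number of clique edges. Illustrative sketch of
    the definition in the abstract; `significant_edges` is assumed to
    come from a prior pairwise significance test."""
    edges = [frozenset(e) for e in combinations(sorted(clique), 2)]
    hits = sum(1 for e in edges if e in significant_edges)
    return hits / len(edges)

# A 4-clique has 6 edges; if 5 were inferred significant, reliability is 5/6.
clique = {"A", "B", "C", "D"}
significant = {frozenset(p) for p in
               [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("B", "D")]}
reliability = clique_reliability(clique, significant)
```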

Citations: 0
Parallel homological calculus for 3D binary digital images
IF 1.2 | CAS Zone 4 (Computer Science) | Q4 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-01-29 | DOI: 10.1007/s10472-023-09913-7 | Vol. 92(1), pp. 77-113
Fernando Díaz-del-Río, Helena Molina-Abril, Pedro Real, Darian Onchis, Sergio Blanco-Trejo

Topological representations of binary digital images usually take into consideration different adjacency types between colors. Within the cubical-voxel 3D binary image context, we design an algorithm for computing the isotopic model of an image, called the (6, 26)-Homological Region Adjacency Tree ((6, 26)-Hom-Tree). This algorithm is based on a flexible graph scaffolding at the inter-voxel level called the Homological Spanning Forest (HSF) model. Hom-Trees are edge-weighted trees in which each node is a maximally connected set of constant-value voxels, interpreted as a subtree of the HSF. This representation integrates and relates the homological information (connected components, tunnels and cavities) of the maximally connected regions of constant color, using 6-adjacency and 26-adjacency for black and white voxels, respectively (the criteria most commonly used for 3D images). The Euler-Poincaré numbers (which may also be computed by counting the number of cells of each dimension in a cubical complex) and the connected-component labeling of the foreground and background of a given image can also be computed straightforwardly from its Hom-Tree. Given a 3D binary well-composed image I_D (where D is the set of black voxels), an almost fully parallel algorithm for constructing the Hom-Tree via HSF computation is implemented and tested here. If I_D has m_1 × m_2 × m_3 voxels, the time complexity of the reproducible algorithm is near O(log(m_1 + m_2 + m_3)), under the assumption that a processing element is available for each cubical voxel. Strategies for using the compressed information of the Hom-Tree representation to distinguish two topologically different images having the same homological information (Betti numbers) are discussed. The topological discriminatory power of the Hom-Tree and the low time complexity of the proposed implementation guarantee its usability within machine learning methods for the classification and comparison of natural 3D images.
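The cell-counting route to the Euler-Poincaré characteristic mentioned in the abstract can be sketched directly: build the cubical complex spanned by a voxel set, count cells of each dimension, and alternate signs. This is only an illustration of that counting argument (the paper computes these invariants from the Hom-Tree in parallel); the cell encoding below, a lowest corner plus the set of spanned axes, is a common convention, not the paper's.

```python
from itertools import product, combinations

def euler_characteristic(voxels):
    """Euler-Poincare characteristic of the cubical complex spanned by
    unit voxels, via counting cells of each dimension. Each cell is
    identified by its lowest corner plus the axes it extends along."""
    cells = set()
    for v in voxels:
        for k in range(4):                          # cell dimension 0..3
            for axes in combinations(range(3), k):  # axes the cell spans
                fixed = [a for a in range(3) if a not in axes]
                for offs in product((0, 1), repeat=len(fixed)):
                    base = list(v)
                    for a, o in zip(fixed, offs):
                        base[a] += o
                    cells.add((tuple(base), axes))
    # chi = #vertices - #edges + #faces - #cubes
    return sum((-1) ** len(axes) for _, axes in cells)

# A single voxel is contractible (chi = 1); a flat 3x3 square of voxels
# with its center removed is homotopy-equivalent to a circle (chi = 0).
single = euler_characteristic({(0, 0, 0)})
ring = {(x, y, 0) for x in range(3) for y in range(3)} - {(1, 1, 0)}
```

Deduplicating cells through the set is what makes shared faces between adjacent voxels count only once.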

Citations: 0
Weighted and Choquet L^p distance representation of comparative dissimilarity relations on fuzzy description profiles
IF 1.2 | CAS Zone 4 (Computer Science) | Q4 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-01-24 | DOI: 10.1007/s10472-024-09924-y | Vol. 92(6), pp. 1407-1436
Giulianella Coletti, Davide Petturiti, Bernadette Bouchon-Meunier

We consider comparative dissimilarity relations on pairs of fuzzy description profiles, the latter providing a fuzzy set-based representation of pairs of objects. Such a relation expresses the idea of “no more dissimilar than” and is used by a decision maker when performing a case-based decision task under vague information. We first limit ourselves to those relations admitting a weighted L^p distance representation, for which we provide an axiomatic characterization in the case where the relation is complete, transitive and defined on the entire space of pairs of fuzzy description profiles. Next, we switch to the more general class of comparative dissimilarity relations representable by a Choquet L^p distance, parameterized by a completely alternating normalized capacity.
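For concreteness, a weighted L^p distance between two fuzzy description profiles (membership vectors in [0, 1]^n) can be sketched as follows. The weight normalization and example values are assumptions for illustration; the paper's contribution is the axiomatic characterization of which comparative relations admit such a representation, not this formula itself.

```python
def weighted_lp_distance(u, v, weights, p=2.0):
    """Weighted L^p distance between two fuzzy description profiles
    (membership vectors in [0, 1]^n), with nonnegative weights that
    sum to 1. Illustrative sketch only."""
    assert len(u) == len(v) == len(weights)
    return sum(w * abs(a - b) ** p
               for a, b, w in zip(u, v, weights)) ** (1.0 / p)

# Two profiles differing only in the first (most heavily weighted) feature:
d = weighted_lp_distance((1.0, 0.0, 0.5), (0.0, 0.0, 0.5), (0.5, 0.25, 0.25))
```

A comparative relation "pair (a, b) is no more dissimilar than pair (c, d)" would then be represented by comparing two such distance values.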

Citations: 0
ISAIM-2022: international symposium on artificial intelligence and mathematics
IF 1.2 | CAS Zone 4 (Computer Science) | Q4 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-01-19 | DOI: 10.1007/s10472-024-09922-0 | Vol. 92(1), pp. 1-4
Dimitrios I. Diochnos, Martin Charles Golumbic, Frederick Hoffman
Citations: 0
Stability of accuracy for the training of DNNs via the uniform doubling condition
IF 1.2 | CAS Zone 4 (Computer Science) | Q4 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-01-19 | DOI: 10.1007/s10472-023-09919-1 | Vol. 92(2), pp. 439-483
Yitzchak Shmalo

We study the stability of accuracy during the training of deep neural networks (DNNs). In this context, the training of a DNN is performed via the minimization of a cross-entropy loss function, and the performance metric is accuracy (the proportion of objects that are classified correctly). While training results in a decrease of loss, the accuracy does not necessarily increase during the process and may sometimes even decrease. The goal of achieving stability of accuracy is to ensure that if accuracy is high at some initial time, it remains high throughout training. A recent result by Berlyand, Jabin, and Safsten introduces a doubling condition on the training data, which ensures the stability of accuracy during training for DNNs using the absolute value activation function. For training data in R^n, this doubling condition is formulated using slabs in R^n and depends on the choice of the slabs. The goal of this paper is twofold. First, to make the doubling condition uniform, that is, independent of the choice of slabs. This leads to sufficient conditions for stability in terms of training data only. In other words, for a training set T that satisfies the uniform doubling condition, there exists a family of DNNs such that a DNN from this family with high accuracy on the training set at some training time t_0 will have high accuracy for all times t > t_0. Moreover, establishing uniformity is necessary for the numerical implementation of the doubling condition. We demonstrate how to numerically implement a simplified version of this uniform doubling condition on a dataset and apply it to achieve stability of accuracy using a few model examples. The second goal is to extend the original stability results from the absolute value activation function to a broader class of piecewise linear activation functions with finitely many critical points, such as the popular Leaky ReLU.
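The phenomenon motivating the paper, loss decreasing while accuracy decreases, is easy to see in a tiny binary example. The numbers below are fabricated purely to exhibit the effect, not taken from the paper:

```python
import math

def cross_entropy(probs, labels):
    """Mean binary cross-entropy; probs are predicted P(label = 1)."""
    return -sum(math.log(p if y == 1 else 1.0 - p)
                for p, y in zip(probs, labels)) / len(labels)

def accuracy(probs, labels):
    """Fraction of objects classified correctly at threshold 0.5."""
    return sum((p > 0.5) == (y == 1)
               for p, y in zip(probs, labels)) / len(labels)

labels = [1, 1]
before = [0.51, 0.51]  # both predictions barely correct
after = [0.49, 0.99]   # one flips to wrong, the other becomes very confident
```

Here the mean loss drops (roughly 0.67 to 0.36) while accuracy falls from 1.0 to 0.5, which is exactly the instability the doubling condition is designed to rule out.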

Citations: 0
Combinatorial and geometric problems in imaging sciences
IF 1.2 | CAS Zone 4 (Computer Science) | Q4 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-01-19 | DOI: 10.1007/s10472-024-09923-z | Vol. 92(1), pp. 5-6
Valentin E. Brimkov
Citations: 0
Best-effort adaptation
IF 1.2 | CAS Zone 4 (Computer Science) | Q4 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-01-13 | DOI: 10.1007/s10472-023-09917-3 | Vol. 92(2), pp. 393-438
Pranjal Awasthi, Corinna Cortes, Mehryar Mohri

We study a problem of best-effort adaptation motivated by several applications and considerations, which consists of determining an accurate predictor for a target domain, for which a moderate number of labeled samples is available, while leveraging information from another domain for which substantially more labeled samples are at one's disposal. We present a new and general discrepancy-based theoretical analysis of sample reweighting methods, including bounds holding uniformly over the weights. We show how these bounds can guide the design of learning algorithms, which we discuss in detail. We further show that our learning guarantees and algorithms provide improved solutions for standard domain adaptation problems, for which few or no labeled data are available from the target domain. We finally report the results of a series of experiments demonstrating the effectiveness of our best-effort adaptation and domain adaptation algorithms, as well as comparisons with several baselines. We also discuss how our analysis can benefit the design of principled solutions for fine-tuning.
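To make the sample-reweighting idea concrete, a generic objective mixes per-sample weights on the abundant source losses with a uniform average over the scarce target losses. The function name, mixing form, and trade-off parameter below are all illustrative assumptions; the paper's discrepancy-based bounds are what determine how such weights should actually be chosen.

```python
def best_effort_objective(source_losses, target_losses, q, lam=0.5):
    """Sketch of a reweighted objective: per-sample weights q (summing
    to 1) over the source losses, mixed with the mean target loss via a
    hypothetical trade-off parameter lam. Illustrative only."""
    assert len(q) == len(source_losses)
    assert abs(sum(q) - 1.0) < 1e-9
    src = sum(w * l for w, l in zip(q, source_losses))
    tgt = sum(target_losses) / len(target_losses)
    return lam * src + (1.0 - lam) * tgt

# Downweighting a poorly transferring source sample lowers the objective:
obj = best_effort_objective([1.0, 0.0], [0.5, 0.5], [0.25, 0.75])
```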

Citations: 0
RAMP experiments in solving the uncapacitated facility location problem
IF 1.2 | CAS Zone 4 (Computer Science) | Q4 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2023-12-30 | DOI: 10.1007/s10472-023-09920-8 | Vol. 92(2), pp. 485-504
Telmo Matos

In this paper, we consider three Relaxation Adaptive Memory Programming (RAMP) approaches for solving the Uncapacitated Facility Location Problem (UFLP), whose objective is to locate a set of facilities and allocate these facilities to all clients at minimum cost. Different levels of sophistication were implemented to measure the performance of the RAMP approach. At the simplest level, (Dual-)RAMP explores the dual side of the problem more intensively, combining Lagrangean relaxation and subgradient optimization with a simple improvement method on the primal side. At the most sophisticated level, RAMP combines a dual-ascent procedure on the dual side with a Scatter Search (SS) procedure on the primal side, forming Primal-Dual RAMP (PD-RAMP). The Dual-RAMP algorithm starts, on the dual side, with the dualization of the initial problem, after which a projection method projects the dual solutions into the primal solution space. Next, on the primal side, the projected solutions are refined through an improvement method. In the PD-RAMP algorithm, the SS procedure is incorporated on the primal side to carry out a more intensive exploration. The algorithm alternates between the dual and the primal side until a fixed number of iterations is reached. Computational experiments on a standard testbed for the UFLP were conducted to assess the performance of all the RAMP algorithms.
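The dual-side machinery the abstract mentions can be sketched with the textbook Lagrangean bound for the UFLP, obtained by relaxing the "each client is served exactly once" constraints with multipliers. This is a standard sketch of that building block under a toy instance, not the paper's RAMP procedure:

```python
def uflp_lagrangian_bound(f, c, lam):
    """Lagrangean lower bound for the UFLP with the assignment
    constraints relaxed under multipliers lam[j]. f[i] is the opening
    cost of facility i; c[i][j] is the cost of serving client j from
    facility i. Textbook sketch, not the paper's exact procedure."""
    bound = sum(lam)
    opened = []
    for i, fi in enumerate(f):
        # Reduced cost of opening facility i: its opening cost plus every
        # profitable (negative) reduced service cost it could pick up.
        reduced = fi + sum(min(0.0, c[i][j] - lam[j]) for j in range(len(lam)))
        if reduced < 0.0:
            bound += reduced
            opened.append(i)
    return bound, opened

# Toy instance: 2 facilities, 2 clients; the true optimum costs 6
# (open facility 1, serve both clients from it: 2 + 3 + 1).
f = [3.0, 2.0]
c = [[1.0, 4.0], [3.0, 1.0]]
bound, opened = uflp_lagrangian_bound(f, c, [2.0, 2.0])
```

Subgradient optimization would then adjust the multipliers iteratively to push this lower bound up toward the optimum.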

Citations: 0
Learning from masked analogies between sentences at multiple levels of formality
IF 1.2 | CAS Zone 4 (Computer Science) | Q4 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2023-12-26 | DOI: 10.1007/s10472-023-09918-2

Abstract

This paper explores the inference of sentence analogies not restricted to the formal level. We introduce MaskPrompt, a prompt-based method that addresses the analogy task as masked analogy completion. This enables us to fine-tune, in a lightweight manner, pre-trained language models on the task of reconstructing masked spans in analogy prompts. We apply constraints that are approximations of the parallelogram view of analogy to construct a corpus of sentence analogies from textual entailment sentence pairs. In the constructed corpus, sentence analogies are characterized by their level of formality, ranging from strict to loose. We apply MaskPrompt to this corpus and compare it with the basic fine-tuning paradigm. Our experiments show that MaskPrompt outperforms basic fine-tuning in solving analogies in terms of overall performance, with gains of over 2% in accuracy. Furthermore, we study the contribution of loose analogies, i.e., analogies relaxed on the formal aspect. When fine-tuning with a small number of them (several hundred), the accuracy on strict analogies jumps from 82% to 99%. This demonstrates that loose analogies effectively capture implicit but coherent analogical regularities. We also use MaskPrompt with different masking schemes to optimize analogy solutions. The best masking scheme during fine-tuning is to mask any term: it exhibits the highest robustness in accuracy on all tested equivalent forms of analogies.
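The masked-completion framing can be illustrated with a small prompt builder that renders an analogy A : B :: C : D with any one of its four terms masked, in the spirit of the "mask any term" scheme the abstract reports as most robust. The template wording and mask token here are assumptions, not the paper's actual prompt format:

```python
def masked_analogy_prompt(terms, mask_index, mask_token="<mask>"):
    """Render the analogy A : B :: C : D with the term at `mask_index`
    replaced by a mask token, so a language model can be fine-tuned to
    reconstruct it. Template and token are illustrative assumptions."""
    a, b, c, d = (mask_token if i == mask_index else t
                  for i, t in enumerate(terms))
    return f"{a} is to {b} as {c} is to {d}"

prompt = masked_analogy_prompt(
    ("I bought a car.", "I bought a red car.",
     "She wrote a letter.", "She wrote a long letter."),
    mask_index=3)
```

Varying `mask_index` over all four positions during fine-tuning corresponds to the "mask any term" scheme.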
Citations: 0