
Latest Publications in Machine Learning

Compressed sensing: a discrete optimization approach
IF 7.5 · CAS Tier 3 (Computer Science) · Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2024-07-11 · DOI: 10.1007/s10994-024-06577-0
Dimitris Bertsimas, Nicholas A. G. Johnson

We study the Compressed Sensing (CS) problem, which is the problem of finding the most sparse vector that satisfies a set of linear measurements up to some numerical tolerance. CS is a central problem in Statistics, Operations Research and Machine Learning which arises in applications such as signal processing, data compression, image reconstruction, and multi-label learning. We introduce an $\ell_2$ regularized formulation of CS which we reformulate as a mixed integer second order cone program. We derive a second order cone relaxation of this problem and show that under mild conditions on the regularization parameter, the resulting relaxation is equivalent to the well studied basis pursuit denoising problem. We present a semidefinite relaxation that strengthens the second order cone relaxation and develop a custom branch-and-bound algorithm that leverages our second order cone relaxation to solve small-scale instances of CS to certifiable optimality. When compared against solutions produced by three state of the art benchmark methods on synthetic data, our numerical results show that our approach produces solutions that are on average 6.22% more sparse. When compared only against the experiment-wise best performing benchmark method on synthetic data, our approach produces solutions that are on average 3.10% more sparse. On real world ECG data, for a given $\ell_2$ reconstruction error our approach produces solutions that are on average 9.95% more sparse than benchmark methods (3.88% more sparse if only compared against the best performing benchmark), while for a given sparsity level our approach produces solutions that have on average 10.77% lower reconstruction error than benchmark methods (1.42% lower error if only compared against the best performing benchmark). When used as a component of a multi-label classification algorithm, our approach achieves greater classification accuracy than benchmark compressed sensing methods. This improved accuracy comes at the cost of an increase in computation time by several orders of magnitude. Thus, for applications where runtime is not of critical importance, leveraging integer optimization can yield sparser and lower error solutions to CS than existing benchmarks.
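The equivalence result above links the paper's second order cone relaxation to basis pursuit denoising. For reference, here is a minimal sketch of that relaxed problem solved by iterative soft-thresholding (ISTA) on the objective $\min_x \tfrac{1}{2}\|Ax-b\|_2^2 + \lambda\|x\|_1$; the problem sizes and $\lambda$ are arbitrary, and this is not the authors' mixed integer program or branch-and-bound solver.

```python
import numpy as np

def ista_bpdn(A, b, lam=0.1, iters=500):
    """ISTA for basis pursuit denoising: min_x 0.5*||Ax - b||_2^2 + lam*||x||_1."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2          # 1/L with L = sigma_max(A)^2
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        z = x - step * (A.T @ (A @ x - b))          # gradient step on the smooth term
        x = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)  # soft-thresholding
    return x

rng = np.random.default_rng(0)
A = rng.standard_normal((30, 100))
x_true = np.zeros(100)
x_true[rng.choice(100, size=5, replace=False)] = rng.standard_normal(5)
b = A @ x_true + 0.01 * rng.standard_normal(30)
x_hat = ista_bpdn(A, b, lam=0.05)
print("recovered nonzeros:", int(np.sum(np.abs(x_hat) > 1e-3)))
```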

{"title":"Compressed sensing: a discrete optimization approach","authors":"Dimitris Bertsimas, Nicholas A. G. Johnson","doi":"10.1007/s10994-024-06577-0","DOIUrl":"https://doi.org/10.1007/s10994-024-06577-0","url":null,"abstract":"<p>We study the Compressed Sensing (CS) problem, which is the problem of finding the most sparse vector that satisfies a set of linear measurements up to some numerical tolerance. CS is a central problem in Statistics, Operations Research and Machine Learning which arises in applications such as signal processing, data compression, image reconstruction, and multi-label learning. We introduce an <span>(ell _2)</span> regularized formulation of CS which we reformulate as a mixed integer second order cone program. We derive a second order cone relaxation of this problem and show that under mild conditions on the regularization parameter, the resulting relaxation is equivalent to the well studied basis pursuit denoising problem. We present a semidefinite relaxation that strengthens the second order cone relaxation and develop a custom branch-and-bound algorithm that leverages our second order cone relaxation to solve small-scale instances of CS to certifiable optimality. When compared against solutions produced by three state of the art benchmark methods on synthetic data, our numerical results show that our approach produces solutions that are on average <span>(6.22%)</span> more sparse. When compared only against the experiment-wise best performing benchmark method on synthetic data, our approach produces solutions that are on average <span>(3.10%)</span> more sparse. On real world ECG data, for a given <span>(ell _2)</span> reconstruction error our approach produces solutions that are on average <span>(9.95%)</span> more sparse than benchmark methods (<span>(3.88%)</span> more sparse if only compared against the best performing benchmark), while for a given sparsity level our approach produces solutions that have on average <span>(10.77%)</span> lower reconstruction error than benchmark methods (<span>(1.42%)</span> lower error if only compared against the best performing benchmark). When used as a component of a multi-label classification algorithm, our approach achieves greater classification accuracy than benchmark compressed sensing methods. This improved accuracy comes at the cost of an increase in computation time by several orders of magnitude. Thus, for applications where runtime is not of critical importance, leveraging integer optimization can yield sparser and lower error solutions to CS than existing benchmarks.</p>","PeriodicalId":49900,"journal":{"name":"Machine Learning","volume":"56 1","pages":""},"PeriodicalIF":7.5,"publicationDate":"2024-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141611977","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Explainable dating of Greek papyri images
IF 7.5 · CAS Tier 3 (Computer Science) · Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2024-07-11 · DOI: 10.1007/s10994-024-06589-w
John Pavlopoulos, Maria Konstantinidou, Elpida Perdiki, Isabelle Marthot-Santaniello, Holger Essler, Georgios Vardakas, Aristidis Likas

Greek literary papyri, which are unique witnesses of antique literature, do not usually bear a date. They are thus currently dated based on palaeographical methods, with broad approximations which often span more than a century. We created a dataset of 242 images of papyri written in “bookhand” scripts whose date can be securely assigned, and we used it to train algorithms for the task of dating, showing its challenging nature. To address data scarcity, we extended our dataset by segmenting each image into its respective text lines. By using the line-based version of our dataset, we trained a Convolutional Neural Network, equipped with a fragmentation-based augmentation strategy, and we achieved a mean absolute error of 54 years. The results improve further when the task is cast as a multi-class classification problem, predicting the century. Using our network, we computed precise date estimations for papyri whose date is disputed or vaguely defined, employing explainability to understand dating-driving features.
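As a rough sketch of the line-based regression setup, below is a small convolutional regressor trained against a mean-absolute-error objective (matching the reported MAE metric). The architecture, the 64×256 grayscale line-crop input size, and the dummy year targets are all assumptions for illustration; the authors' network and fragmentation-based augmentation are not reproduced here.

```python
import torch
import torch.nn as nn

# Assumed setup: grayscale 64x256 text-line crops -> a single predicted year,
# trained with an L1 (MAE) objective. Architecture and sizes are illustrative.
class LineDater(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(1),                    # global average pooling
        )
        self.head = nn.Linear(32, 1)                    # regression head: year

    def forward(self, x):
        return self.head(self.features(x).flatten(1))

model = LineDater()
lines = torch.randn(4, 1, 64, 256)                      # four synthetic line crops
target_years = torch.tensor([[150.0], [220.0], [90.0], [300.0]])  # dummy labels
loss = nn.L1Loss()(model(lines), target_years)          # mean absolute error in years
print(loss.item())
```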

{"title":"Explainable dating of greek papyri images","authors":"John Pavlopoulos, Maria Konstantinidou, Elpida Perdiki, Isabelle Marthot-Santaniello, Holger Essler, Georgios Vardakas, Aristidis Likas","doi":"10.1007/s10994-024-06589-w","DOIUrl":"https://doi.org/10.1007/s10994-024-06589-w","url":null,"abstract":"<p>Greek literary papyri, which are unique witnesses of antique literature, do not usually bear a date. They are thus currently dated based on palaeographical methods, with broad approximations which often span more than a century. We created a dataset of 242 images of papyri written in “bookhand” scripts whose date can be securely assigned, and we used it to train algorithms for the task of dating, showing its challenging nature. To address data scarcity, we extended our dataset by segmenting each image into its respective text lines. By using the line-based version of our dataset, we trained a Convolutional Neural Network, equipped with a fragmentation-based augmentation strategy, and we achieved a mean absolute error of 54 years. The results improve further when the task is cast as a multi-class classification problem, predicting the century. Using our network, we computed precise date estimations for papyri whose date is disputed or vaguely defined, employing explainability to understand dating-driving features.</p>","PeriodicalId":49900,"journal":{"name":"Machine Learning","volume":"29 1","pages":""},"PeriodicalIF":7.5,"publicationDate":"2024-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141612077","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Moreau-Yoshida variational transport: a general framework for solving regularized distributional optimization problems
IF 7.5 · CAS Tier 3 (Computer Science) · Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2024-07-10 · DOI: 10.1007/s10994-024-06586-z
Dai Hai Nguyen, Tetsuya Sakurai

We address a general optimization problem involving the minimization of a composite objective functional defined over a class of probability distributions. The objective function consists of two components: one assumed to have a variational representation, and the other expressed in terms of the expectation operator of a possibly nonsmooth convex regularizer function. Such a regularized distributional optimization problem widely appears in machine learning and statistics, including proximal Monte-Carlo sampling, Bayesian inference, and generative modeling for regularized estimation and generation. Our proposed method, named Moreau-Yoshida Variational Transport (MYVT), introduces a novel approach to tackle this regularized distributional optimization problem. First, as the name suggests, our method utilizes the Moreau-Yoshida envelope to provide a smooth approximation of the nonsmooth function in the objective. Second, we reformulate the approximate problem as a concave-convex saddle point problem by leveraging the variational representation. Subsequently, we develop an efficient primal–dual algorithm to approximate the saddle point. Furthermore, we provide theoretical analyses and present experimental results to showcase the effectiveness of the proposed method.
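A concrete instance of the smoothing step: for the nonsmooth regularizer $g(y)=|y|$, the Moreau-Yoshida envelope has the closed form of the Huber function, and its minimizer is the soft-thresholding proximal operator. The sketch below checks these two standard facts numerically; it covers only the envelope approximation, not MYVT's saddle-point reformulation or its primal-dual algorithm.

```python
import numpy as np

def moreau_envelope_abs(x, lam):
    """Moreau-Yoshida envelope of g(y) = |y|: inf_y |y| + (1/(2*lam)) * (x - y)^2.
    The closed form is the Huber function, which is smooth everywhere."""
    return np.where(np.abs(x) <= lam, x**2 / (2 * lam), np.abs(x) - lam / 2)

def prox_abs(x, lam):
    """Proximal operator of |.| (the minimizer in the envelope): soft-thresholding."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

xs = np.linspace(-2, 2, 9)
print("smooth envelope:", moreau_envelope_abs(xs, 0.5))   # close to |x|, but smooth
print("prox (minimizer):", prox_abs(xs, 0.5))
```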

{"title":"Moreau-Yoshida variational transport: a general framework for solving regularized distributional optimization problems","authors":"Dai Hai Nguyen, Tetsuya Sakurai","doi":"10.1007/s10994-024-06586-z","DOIUrl":"https://doi.org/10.1007/s10994-024-06586-z","url":null,"abstract":"<p>We address a general optimization problem involving the minimization of a composite objective functional defined over a class of probability distributions. The objective function consists of two components: one assumed to have a variational representation, and the other expressed in terms of the expectation operator of a possibly nonsmooth convex regularizer function. Such a regularized distributional optimization problem widely appears in machine learning and statistics, including proximal Monte-Carlo sampling, Bayesian inference, and generative modeling for regularized estimation and generation. Our proposed method, named Moreau-Yoshida Variational Transport (MYVT), introduces a novel approach to tackle this regularized distributional optimization problem. First, as the name suggests, our method utilizes the Moreau-Yoshida envelope to provide a smooth approximation of the nonsmooth function in the objective. Second, we reformulate the approximate problem as a concave-convex saddle point problem by leveraging the variational representation. Subsequently, we develop an efficient primal–dual algorithm to approximate the saddle point. Furthermore, we provide theoretical analyses and present experimental results to showcase the effectiveness of the proposed method.</p>","PeriodicalId":49900,"journal":{"name":"Machine Learning","volume":"20 1","pages":""},"PeriodicalIF":7.5,"publicationDate":"2024-07-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141584981","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Permutation-invariant linear classifiers
IF 7.5 · CAS Tier 3 (Computer Science) · Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2024-07-09 · DOI: 10.1007/s10994-024-06561-8
Ludwig Lausser, Robin Szekely, Hans A. Kestler

Invariant concept classes form the backbone of classification algorithms immune to specific data transformations, ensuring consistent predictions regardless of these alterations. However, this robustness can come at the cost of limited access to the original sample information, potentially impacting generalization performance. This study introduces an addition to these classes—the permutation-invariant linear classifiers. Distinguished by their structural characteristics, permutation-invariant linear classifiers are unaffected by permutations on feature vectors, a property not guaranteed by other non-constant linear classifiers. The study characterizes this new concept class, highlighting its constant capacity, independent of input dimensionality. In practical assessments using linear support vector machines, the permutation-invariant classifiers exhibit superior performance in permutation experiments on artificial datasets and real mutation profiles. Interestingly, they outperform general linear classifiers not only in permutation experiments but also in permutation-free settings, surpassing unconstrained counterparts. Additionally, findings from real mutation profiles support the significance of tumor mutational burden as a biomarker.
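One way to see such a classifier: a linear classifier whose weights are all equal acts on the input only through its feature sum, so any permutation of the features leaves the prediction unchanged. The toy check below illustrates exactly that invariance; it is a minimal example, not the authors' characterization of the full concept class or their support vector machine experiments.

```python
import numpy as np

# f(x) = sign(w * sum(x) + b): equal per-feature weights collapse to a sum,
# so the decision is invariant under any permutation of the feature vector.
rng = np.random.default_rng(1)
w, b = 0.8, -0.1
f = lambda v: np.sign(w * v.sum() + b)

x = rng.standard_normal(10)
perm = rng.permutation(10)
assert f(x) == f(x[perm])          # identical prediction after permuting features
print("prediction:", f(x))
```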

{"title":"Permutation-invariant linear classifiers","authors":"Ludwig Lausser, Robin Szekely, Hans A. Kestler","doi":"10.1007/s10994-024-06561-8","DOIUrl":"https://doi.org/10.1007/s10994-024-06561-8","url":null,"abstract":"<p>Invariant concept classes form the backbone of classification algorithms immune to specific data transformations, ensuring consistent predictions regardless of these alterations. However, this robustness can come at the cost of limited access to the original sample information, potentially impacting generalization performance. This study introduces an addition to these classes—the permutation-invariant linear classifiers. Distinguished by their structural characteristics, permutation-invariant linear classifiers are unaffected by permutations on feature vectors, a property not guaranteed by other non-constant linear classifiers. The study characterizes this new concept class, highlighting its constant capacity, independent of input dimensionality. In practical assessments using linear support vector machines, the permutation-invariant classifiers exhibit superior performance in permutation experiments on artificial datasets and real mutation profiles. Interestingly, they outperform general linear classifiers not only in permutation experiments but also in permutation-free settings, surpassing unconstrained counterparts. Additionally, findings from real mutation profiles support the significance of tumor mutational burden as a biomarker.</p>","PeriodicalId":49900,"journal":{"name":"Machine Learning","volume":"65 1","pages":""},"PeriodicalIF":7.5,"publicationDate":"2024-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141574756","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Regional bias in monolingual English language models
IF 7.5 · CAS Tier 3 (Computer Science) · Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2024-07-09 · DOI: 10.1007/s10994-024-06555-6
Jiachen Lyu, Katharina Dost, Yun Sing Koh, Jörg Wicker

In Natural Language Processing (NLP), pre-trained language models (LLMs) are widely employed and refined for various tasks. These models have shown considerable social and geographic biases creating skewed or even unfair representations of certain groups. Research focuses on biases toward L2 (English as a second language) regions but neglects bias within L1 (first language) regions. In this work, we ask if there is regional bias within L1 regions already inherent in pre-trained LLMs and, if so, what the consequences are in terms of downstream model performance. We contribute an investigation framework specifically tailored for low-resource regions, offering a method to identify bias without imposing strict requirements for labeled datasets. Our research reveals subtle geographic variations in the word embeddings of BERT, even in cultures traditionally perceived as similar. These nuanced features, once captured, have the potential to significantly impact downstream tasks. Generally, models exhibit comparable performance on datasets that share similarities, and conversely, performance may diverge when datasets differ in their nuanced features embedded within the language. It is crucial to note that estimating model performance solely based on standard benchmark datasets may not necessarily apply to the datasets with distinct features from the benchmark datasets. Our proposed framework plays a pivotal role in identifying and addressing biases detected in word embeddings, particularly evident in low-resource regions such as New Zealand.
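As a minimal probe of the kind of geographic variation described, the sketch below embeds one surface form in two differently flavored contexts with an off-the-shelf BERT and compares the contextual vectors ("bach" is New Zealand English for a holiday home). The model choice, sentences, and sub-token pooling are illustrative assumptions; the paper's bias-identification framework for low-resource regions goes well beyond this.

```python
import torch
from transformers import AutoTokenizer, AutoModel

# Hypothetical probe: embed the same word in two region-flavored contexts and
# compare the contextual BERT vectors.
tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased").eval()

def embed(sentence, word):
    enc = tok(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc).last_hidden_state[0]      # (seq_len, 768)
    word_ids = tok(word, add_special_tokens=False)["input_ids"]
    mask = torch.isin(enc["input_ids"][0], torch.tensor(word_ids))
    return hidden[mask].mean(dim=0)                     # average the word's sub-tokens

a = embed("The bach by the beach was busy all summer.", "bach")   # NZ sense
b = embed("Bach composed many famous cantatas.", "bach")          # composer sense
print("cosine similarity:", torch.cosine_similarity(a, b, dim=0).item())
```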

{"title":"Regional bias in monolingual English language models","authors":"Jiachen Lyu, Katharina Dost, Yun Sing Koh, Jörg Wicker","doi":"10.1007/s10994-024-06555-6","DOIUrl":"https://doi.org/10.1007/s10994-024-06555-6","url":null,"abstract":"<p>In Natural Language Processing (NLP), pre-trained language models (LLMs) are widely employed and refined for various tasks. These models have shown considerable social and geographic biases creating skewed or even unfair representations of certain groups. Research focuses on biases toward L2 (English as a second language) regions but neglects bias within L1 (first language) regions. In this work, we ask if there is regional bias within L1 regions already inherent in pre-trained LLMs and, if so, what the consequences are in terms of downstream model performance. We contribute an investigation framework specifically tailored for low-resource regions, offering a method to identify bias without imposing strict requirements for labeled datasets. Our research reveals subtle geographic variations in the word embeddings of BERT, even in cultures traditionally perceived as similar. These nuanced features, once captured, have the potential to significantly impact downstream tasks. Generally, models exhibit comparable performance on datasets that share similarities, and conversely, performance may diverge when datasets differ in their nuanced features embedded within the language. It is crucial to note that estimating model performance solely based on standard benchmark datasets may not necessarily apply to the datasets with distinct features from the benchmark datasets. Our proposed framework plays a pivotal role in identifying and addressing biases detected in word embeddings, particularly evident in low-resource regions such as New Zealand.</p>","PeriodicalId":49900,"journal":{"name":"Machine Learning","volume":"35 1","pages":""},"PeriodicalIF":7.5,"publicationDate":"2024-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141574755","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Conformal predictions for probabilistically robust scalable machine learning classification
IF 7.5 · CAS Tier 3 (Computer Science) · Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2024-07-09 · DOI: 10.1007/s10994-024-06571-6
Alberto Carlevaro, Teodoro Alamo, Fabrizio Dabbene, Maurizio Mongelli

Conformal predictions make it possible to define reliable and robust learning algorithms. But they are essentially a method for evaluating whether an algorithm is good enough to be used in practice. To define a reliable learning framework for classification from the very beginning of its design, the concept of scalable classifier was introduced to generalize the concept of classical classifier by linking it to statistical order theory and probabilistic learning theory. In this paper, we analyze the similarities between scalable classifiers and conformal predictions by introducing a new definition of a score function and defining a special set of input variables, the conformal safety set, which can identify patterns in the input space that satisfy the error coverage guarantee, i.e., that the probability of observing the wrong (possibly unsafe) label for points belonging to this set is bounded by a predefined $\varepsilon$ error level. We demonstrate the practical implications of this framework through an application in cybersecurity for identifying DNS tunneling attacks. Our work contributes to the development of probabilistically robust and reliable machine learning models.
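The error-coverage guarantee mentioned here is the split-conformal mechanic: a threshold taken at a finite-sample-corrected quantile of calibration nonconformity scores covers the truth with probability at least $1-\varepsilon$. A generic sketch with synthetic scores follows; it is not the paper's scalable-classifier construction or its conformal safety set.

```python
import numpy as np

def conformal_threshold(cal_scores, eps=0.1):
    """Split-conformal threshold: the ceil((n+1)*(1-eps))-th smallest calibration score."""
    n = len(cal_scores)
    k = int(np.ceil((n + 1) * (1 - eps)))
    return np.sort(cal_scores)[min(k, n) - 1]

rng = np.random.default_rng(2)
cal = rng.uniform(size=1000)            # stand-in nonconformity scores of true labels
q = conformal_threshold(cal, eps=0.1)
test = rng.uniform(size=5000)           # scores of fresh exchangeable examples
print("empirical coverage:", np.mean(test <= q))   # close to 1 - eps = 0.9
```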

{"title":"Conformal predictions for probabilistically robust scalable machine learning classification","authors":"Alberto Carlevaro, Teodoro Alamo, Fabrizio Dabbene, Maurizio Mongelli","doi":"10.1007/s10994-024-06571-6","DOIUrl":"https://doi.org/10.1007/s10994-024-06571-6","url":null,"abstract":"<p>Conformal predictions make it possible to define reliable and robust learning algorithms. But they are essentially a method for evaluating whether an algorithm is good enough to be used in practice. To define a reliable learning framework for classification from the very beginning of its design, the concept of scalable classifier was introduced to generalize the concept of classical classifier by linking it to statistical order theory and probabilistic learning theory. In this paper, we analyze the similarities between scalable classifiers and conformal predictions by introducing a new definition of a score function and defining a special set of input variables, the conformal safety set, which can identify patterns in the input space that satisfy the error coverage guarantee, i.e., that the probability of observing the wrong (possibly unsafe) label for points belonging to this set is bounded by a predefined <span>(varepsilon)</span> error level. We demonstrate the practical implications of this framework through an application in cybersecurity for identifying DNS tunneling attacks. Our work contributes to the development of probabilistically robust and reliable machine learning models.</p>","PeriodicalId":49900,"journal":{"name":"Machine Learning","volume":"72 1","pages":""},"PeriodicalIF":7.5,"publicationDate":"2024-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141574797","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Neural discovery of balance-aware polarized communities
IF 7.5 · CAS Tier 3 (Computer Science) · Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2024-07-09 · DOI: 10.1007/s10994-024-06581-4
Francesco Gullo, Domenico Mandaglio, Andrea Tagarelli

Signed graphs are a model to depict friendly (positive) or antagonistic (negative) interactions (edges) among users (nodes). 2-Polarized-Communities (2pc) is a well-established combinatorial-optimization problem whose goal is to find two polarized communities from a signed graph, i.e., two subsets of nodes (disjoint, but not necessarily covering the entire node set) which exhibit a high number of both intra-community positive edges and negative inter-community edges. The state of the art in 2pc suffers from the limitations that (i) existing methods rely on a single (optimal) solution to a continuous relaxation of the problem in order to produce the ultimate discrete solution via rounding, and (ii) the 2pc objective function comes with no control on size balance among communities. In this paper, we provide advances to the 2pc problem by addressing both these limitations, with a twofold contribution. First, we devise a novel neural approach that allows us to soundly and elegantly explore a variety of suboptimal solutions to the relaxed 2pc problem, so as to pick the one that leads to the best discrete solution after rounding. Second, we introduce a generalization of the 2pc objective function, termed $\gamma$-polarity, which fosters size balance among communities, and we incorporate it into the proposed machine-learning framework. Extensive experiments attest to the high accuracy of our approach, its superiority over the state of the art, and the capability of the $\gamma$-polarity function to discover high-quality size-balanced communities.
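For orientation, one standard formulation scores a candidate assignment $x \in \{-1, 0, +1\}^n$ (the two communities plus the unassigned rest) on the signed adjacency matrix $A$ via the ratio $x^\top A x / x^\top x$, which rewards positive intra-community edges and negative inter-community edges. The toy evaluation below assumes that objective; the neural exploration of suboptimal relaxed solutions and the $\gamma$-polarity variant are not reproduced.

```python
import numpy as np

def polarity_score(A, x):
    """Score a candidate pair of polarized communities: x^T A x / x^T x,
    with x_i in {-1, 0, +1} (community 1, unassigned, community 2)."""
    x = np.asarray(x, dtype=float)
    return (x @ A @ x) / (x @ x)

# Toy signed graph: nodes 0,1 are mutual friends, 2,3 are friends,
# and the two groups are antagonistic to each other.
A = np.array([[ 0,  1, -1,  0],
              [ 1,  0, -1,  0],
              [-1, -1,  0,  1],
              [ 0,  0,  1,  0]])
x = np.array([1, 1, -1, -1])       # assign {0,1} vs {2,3}
print(polarity_score(A, x))        # high score: polarized split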

{"title":"Neural discovery of balance-aware polarized communities","authors":"Francesco Gullo, Domenico Mandaglio, Andrea Tagarelli","doi":"10.1007/s10994-024-06581-4","DOIUrl":"https://doi.org/10.1007/s10994-024-06581-4","url":null,"abstract":"<p><i>Signed graphs</i> are a model to depict friendly (<i>positive</i>) or antagonistic (<i>negative</i>) interactions (edges) among users (nodes). <span>2-Polarized-Communities</span> (<span>2pc</span>) is a well-established combinatorial-optimization problem whose goal is to find two <i>polarized</i> communities from a signed graph, i.e., two subsets of nodes (disjoint, but not necessarily covering the entire node set) which exhibit a high number of both intra-community positive edges and negative inter-community edges. The state of the art in <span>2pc</span> suffers from the limitations that (<i>i</i>) existing methods rely on a single (optimal) solution to a continuous relaxation of the problem in order to produce the ultimate discrete solution via rounding, and (<i>ii</i>) <span>2pc</span> objective function comes with no control on size balance among communities. In this paper, we provide advances to the <span>2pc</span> problem by addressing both these limitations, with a twofold contribution. First, we devise a novel neural approach that allows for soundly and elegantly explore a variety of suboptimal solutions to the relaxed <span>2pc</span> problem, so as to pick the one that leads to the best discrete solution after rounding. Second, we introduce a generalization of <span>2pc</span> objective function – termed <span>(gamma )</span>-<i>polarity </i>– which fosters size balance among communities, and we incorporate it into the proposed machine-learning framework. Extensive experiments attest high accuracy of our approach, its superiority over the state of the art, and capability of function <span>(gamma )</span>-polarity to discover high-quality size-balanced communities.</p>","PeriodicalId":49900,"journal":{"name":"Machine Learning","volume":"179 1","pages":""},"PeriodicalIF":7.5,"publicationDate":"2024-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141577849","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
FairMOE: counterfactually-fair mixture of experts with levels of interpretability
IF 7.5 · CAS Tier 3 (Computer Science) · Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2024-07-08 · DOI: 10.1007/s10994-024-06583-2
Joe Germino, Nuno Moniz, Nitesh V. Chawla

With the rise of artificial intelligence in our everyday lives, the need for human interpretation of machine learning models’ predictions emerges as a critical issue. Generally, interpretability is viewed as a binary notion with a performance trade-off. Either a model is fully-interpretable but lacks the ability to capture more complex patterns in the data, or it is a black box. In this paper, we argue that this view is severely limiting and that instead interpretability should be viewed as a continuous domain-informed concept. We leverage the well-known Mixture of Experts architecture with user-defined limits on non-interpretability. We extend this idea with a counterfactual fairness module to ensure the selection of consistently fair experts: FairMOE. We perform an extensive experimental evaluation with fairness-related data sets and compare our proposal against state-of-the-art methods. Our results demonstrate that FairMOE is competitive with the leading fairness-aware algorithms in both fairness and predictive measures while providing more consistent performance, competitive scalability, and, most importantly, greater interpretability.
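A toy rendering of the two ingredients, under simplifying assumptions: a small pool of experts at different interpretability levels, and a counterfactual screen that flips the protected attribute and keeps only experts whose predictions are (near-)unchanged. The expert pool, synthetic data, and tolerance are stand-ins, not the FairMOE gating mechanism or its guarantees.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(3)
X = rng.standard_normal((500, 4))
s = rng.integers(0, 2, 500)                     # protected attribute
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)   # label independent of s
Xs = np.column_stack([X, s])

# Experts at different interpretability levels (both stand-ins).
experts = [("logit", LogisticRegression(max_iter=1000).fit(Xs, y)),
           ("tree",  DecisionTreeClassifier(max_depth=4, random_state=0).fit(Xs, y))]

def counterfactually_fair(model, Xs, tol=0.02):
    """Flip the protected attribute; require predictions to be (near-)unchanged."""
    Xs_flip = Xs.copy()
    Xs_flip[:, -1] = 1 - Xs_flip[:, -1]
    return np.mean(model.predict(Xs) != model.predict(Xs_flip)) <= tol

fair_pool = [name for name, m in experts if counterfactually_fair(m, Xs)]
print("experts passing the counterfactual screen:", fair_pool)
```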

{"title":"FairMOE: counterfactually-fair mixture of experts with levels of interpretability","authors":"Joe Germino, Nuno Moniz, Nitesh V. Chawla","doi":"10.1007/s10994-024-06583-2","DOIUrl":"https://doi.org/10.1007/s10994-024-06583-2","url":null,"abstract":"<p>With the rise of artificial intelligence in our everyday lives, the need for human interpretation of machine learning models’ predictions emerges as a critical issue. Generally, interpretability is viewed as a binary notion with a performance trade-off. Either a model is fully-interpretable but lacks the ability to capture more complex patterns in the data, or it is a black box. In this paper, we argue that this view is severely limiting and that instead interpretability should be viewed as a continuous domain-informed concept. We leverage the well-known Mixture of Experts architecture with user-defined limits on non-interpretability. We extend this idea with a counterfactual fairness module to ensure the selection of consistently <i>fair</i> experts: <b>FairMOE</b>. We perform an extensive experimental evaluation with fairness-related data sets and compare our proposal against state-of-the-art methods. Our results demonstrate that FairMOE is competitive with the leading fairness-aware algorithms in both fairness and predictive measures while providing more consistent performance, competitive scalability, and, most importantly, greater interpretability.</p>","PeriodicalId":49900,"journal":{"name":"Machine Learning","volume":"29 1","pages":""},"PeriodicalIF":7.5,"publicationDate":"2024-07-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141574798","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Fast linear model trees by PILOT
IF 7.5 · CAS Tier 3 (Computer Science) · Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2024-07-08 · DOI: 10.1007/s10994-024-06590-3
Jakob Raymaekers, Peter J. Rousseeuw, Tim Verdonck, Ruicong Yao

Linear model trees are regression trees that incorporate linear models in the leaf nodes. This preserves the intuitive interpretation of decision trees and at the same time enables them to better capture linear relationships, which is hard for standard decision trees. But most existing methods for fitting linear model trees are time consuming and therefore not scalable to large data sets. In addition, they are more prone to overfitting and extrapolation issues than standard regression trees. In this paper we introduce PILOT, a new algorithm for linear model trees that is fast, regularized, stable and interpretable. PILOT trains in a greedy fashion like classic regression trees, but incorporates an $L^2$ boosting approach and a model selection rule for fitting linear models in the nodes. The abbreviation PILOT stands for PIecewise Linear Organic Tree, where ‘organic’ refers to the fact that no pruning is carried out. PILOT has the same low time and space complexity as CART without its pruning. An empirical study indicates that PILOT tends to outperform standard decision trees and other linear model trees on a variety of data sets. Moreover, we prove its consistency in an additive model setting under weak assumptions. When the data is generated by a linear model, the convergence rate is polynomial.
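To convey the piecewise-linear mechanic, the sketch below fits a depth-1 linear model tree: it scans candidate split points and keeps the one minimizing the summed squared error of per-side least-squares lines. PILOT's $L^2$ boosting node fits, model-selection rule, and regularization are deliberately omitted; this shows only the basic "linear models in the leaves" idea.

```python
import numpy as np

def fit_line(x, y):
    """Least-squares line y ~ a + b*x; returns coefficients and the SSE."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta, np.sum((y - X @ beta) ** 2)

def best_split(x, y):
    """Depth-1 linear model tree: pick the threshold with the lowest summed SSE."""
    best_sse, best_t = np.inf, None
    for t in np.unique(x)[1:-1]:                # candidate thresholds
        left, right = x <= t, x > t
        sse = fit_line(x[left], y[left])[1] + fit_line(x[right], y[right])[1]
        if sse < best_sse:
            best_sse, best_t = sse, t
    return best_t

rng = np.random.default_rng(4)
x = rng.uniform(-3, 3, 200)
y = np.where(x < 0, 2 * x + 1, -x) + 0.1 * rng.standard_normal(200)
print("chosen split:", best_split(x, y))        # near the true kink at 0
```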

{"title":"Fast linear model trees by PILOT","authors":"Jakob Raymaekers, Peter J. Rousseeuw, Tim Verdonck, Ruicong Yao","doi":"10.1007/s10994-024-06590-3","DOIUrl":"https://doi.org/10.1007/s10994-024-06590-3","url":null,"abstract":"<p>Linear model trees are regression trees that incorporate linear models in the leaf nodes. This preserves the intuitive interpretation of decision trees and at the same time enables them to better capture linear relationships, which is hard for standard decision trees. But most existing methods for fitting linear model trees are time consuming and therefore not scalable to large data sets. In addition, they are more prone to overfitting and extrapolation issues than standard regression trees. In this paper we introduce PILOT, a new algorithm for linear model trees that is fast, regularized, stable and interpretable. PILOT trains in a greedy fashion like classic regression trees, but incorporates an <i>L</i><sup>2</sup> boosting approach and a model selection rule for fitting linear models in the nodes. The abbreviation PILOT stands for PIecewise Linear Organic Tree, where ‘organic’ refers to the fact that no pruning is carried out. PILOT has the same low time and space complexity as CART without its pruning. An empirical study indicates that PILOT tends to outperform standard decision trees and other linear model trees on a variety of data sets. Moreover, we prove its consistency in an additive model setting under weak assumptions. When the data is generated by a linear model, the convergence rate is polynomial.</p>","PeriodicalId":49900,"journal":{"name":"Machine Learning","volume":"10 1","pages":""},"PeriodicalIF":7.5,"publicationDate":"2024-07-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141574800","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
A systematic approach for learning imbalanced data: enhancing zero-inflated models through boosting
IF 7.5 · CAS Tier 3 (Computer Science) · Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2024-07-08 · DOI: 10.1007/s10994-024-06558-3
Yeasung Jeong, Kangbok Lee, Young Woong Park, Sumin Han

In this paper, we propose systematic approaches for learning imbalanced data based on a two-regime process: regime 0, which generates excess zeros (majority class), and regime 1, which contributes to generating an outcome of one (minority class). The proposed model contains two latent equations: a split probit (logit) equation in the first stage and an ordinary probit (logit) equation in the second stage. Because boosting improves the accuracy of prediction versus using a single classifier, we combined a boosting strategy with the two-regime process. Thus, we developed the zero-inflated probit boost (ZIPBoost) and zero-inflated logit boost (ZILBoost) methods. We show that the weight functions of ZIPBoost have the desired properties for good predictive performance. Like AdaBoost, the weight functions upweight misclassified examples and downweight correctly classified examples. We show that the weight functions of ZILBoost have similar properties to those of LogitBoost. The algorithm will focus more on examples that are hard to classify in the next iteration, resulting in improved prediction accuracy. We provide the relative performance of ZIPBoost and ZILBoost, which rely on the excess kurtosis of the data distribution. Furthermore, we show the convergence and time complexity of our proposed methods. We demonstrate the performance of our proposed methods using a Monte Carlo simulation, mergers and acquisitions (M&A) data application, and imbalanced datasets from the Keel repository. The results of the experiments show that our proposed methods yield better prediction accuracy compared to other learning algorithms.
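The two-regime structure can be written out directly: assuming independent latent probit errors, an observation equals one only when the split equation selects regime 1 and the outcome equation fires, so $P(y=1 \mid x) = \Phi(x^\top \alpha)\,\Phi(x^\top \beta)$. The simulation below generates zero-inflated data from that factorization with hand-picked coefficients; the boosting estimation in ZIPBoost/ZILBoost is not implemented here.

```python
import numpy as np
from scipy.stats import norm

# Zero-inflated probit data generator (illustrative coefficients, independent
# latent errors assumed): y = 1 iff regime 1 is selected AND the outcome fires.
rng = np.random.default_rng(5)
X = np.column_stack([np.ones(1000), rng.standard_normal((1000, 2))])
alpha = np.array([-1.0, 1.2, 0.0])   # split (regime) equation -> excess zeros
beta = np.array([0.5, 0.0, 1.0])     # outcome equation

p_one = norm.cdf(X @ alpha) * norm.cdf(X @ beta)   # P(y=1 | x) under the model
y = rng.uniform(size=1000) < p_one
print("share of ones (minority class):", y.mean())
```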

{"title":"A systematic approach for learning imbalanced data: enhancing zero-inflated models through boosting","authors":"Yeasung Jeong, Kangbok Lee, Young Woong Park, Sumin Han","doi":"10.1007/s10994-024-06558-3","DOIUrl":"https://doi.org/10.1007/s10994-024-06558-3","url":null,"abstract":"<p>In this paper, we propose systematic approaches for learning imbalanced data based on a two-regime process: regime 0, which generates excess zeros (majority class), and regime 1, which contributes to generating an outcome of one (minority class). The proposed model contains two latent equations: a split probit (logit) equation in the first stage and an ordinary probit (logit) equation in the second stage. Because boosting improves the accuracy of prediction versus using a single classifier, we combined a boosting strategy with the two-regime process. Thus, we developed the zero-inflated probit boost (ZIPBoost) and zero-inflated logit boost (ZILBoost) methods. We show that the weight functions of ZIPBoost have the desired properties for good predictive performance. Like AdaBoost, the weight functions upweight misclassified examples and downweight correctly classified examples. We show that the weight functions of ZILBoost have similar properties to those of LogitBoost. The algorithm will focus more on examples that are hard to classify in the next iteration, resulting in improved prediction accuracy. We provide the relative performance of ZIPBoost and ZILBoost, which rely on the excess kurtosis of the data distribution. Furthermore, we show the convergence and time complexity of our proposed methods. We demonstrate the performance of our proposed methods using a Monte Carlo simulation, mergers and acquisitions (M&amp;A) data application, and imbalanced datasets from the Keel repository. The results of the experiments show that our proposed methods yield better prediction accuracy compared to other learning algorithms.</p>","PeriodicalId":49900,"journal":{"name":"Machine Learning","volume":"40 1","pages":""},"PeriodicalIF":7.5,"publicationDate":"2024-07-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141574796","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0