Asian Conference on Machine Learning: Latest Papers

RoLNiP: Robust Learning Using Noisy Pairwise Comparisons
Pub Date : 2023-03-04 DOI: 10.48550/arXiv.2303.02341
Samartha S Maheshwara, Naresh Manwani
This paper presents a robust approach for learning from noisy pairwise comparisons. We propose sufficient conditions on the loss function under which the risk minimization framework becomes robust to noise in pairwise similar/dissimilar data. Our approach does not require knowledge of the noise rate in the uniform noise case. In the case of conditional noise, the proposed method depends on the noise rates. For such cases, we offer a provably correct approach for estimating the noise rates. Thus, we propose an end-to-end approach to learning robust classifiers in this setting. We experimentally show that the proposed approach, RoLNiP, outperforms the robust state-of-the-art methods for learning with noisy pairwise comparisons.
Citations: 0
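To make the two noise settings named in the RoLNiP abstract concrete, the sketch below builds pairwise similar/dissimilar labels and corrupts them. It only illustrates the data model, not the authors' method: the function names, the +1/-1 encoding, and the rates `rho`, `rho_sim`, and `rho_dis` are assumptions chosen for this example.

```python
import random

def pairwise_label(y_a, y_b):
    """Return +1 if two class labels match (similar pair), else -1 (dissimilar pair)."""
    return 1 if y_a == y_b else -1

def flip_uniform(s, rho, rng):
    """Uniform noise: every pairwise label is flipped with the same probability rho."""
    return -s if rng.random() < rho else s

def flip_conditional(s, rho_sim, rho_dis, rng):
    """Conditional noise: the flip probability depends on the clean pairwise label."""
    rho = rho_sim if s == 1 else rho_dis
    return -s if rng.random() < rho else s

rng = random.Random(0)
# Hypothetical binary class labels, grouped into 1000 pairs.
classes = [rng.choice([0, 1]) for _ in range(2000)]
pairs = [(classes[i], classes[i + 1]) for i in range(0, 2000, 2)]
clean = [pairwise_label(a, b) for a, b in pairs]
noisy_uniform = [flip_uniform(s, 0.2, rng) for s in clean]
noisy_conditional = [flip_conditional(s, 0.1, 0.3, rng) for s in clean]

flip_rate = sum(c != n for c, n in zip(clean, noisy_uniform)) / len(clean)
print(f"empirical uniform flip rate: {flip_rate:.3f}")
```

In the uniform case a single rate governs every pair, which is the setting where the abstract says the noise rate need not be known; in the conditional case the two rates differ and would have to be estimated.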
AIIR-MIX: Multi-Agent Reinforcement Learning Meets Attention Individual Intrinsic Reward Mixing Network
Pub Date : 2023-02-19 DOI: 10.48550/arXiv.2302.09531
Wei Li, Weiyan Liu, Shitong Shao, Shiyi Huang
Deducing the contribution of each agent and assigning the corresponding reward to them is a crucial problem in cooperative Multi-Agent Reinforcement Learning (MARL). Previous studies try to resolve the issue by designing an intrinsic reward function, but in these studies the intrinsic reward is simply added to the environment reward, which makes the performance of their MARL frameworks unsatisfactory. We propose a novel method named Attention Individual Intrinsic Reward Mixing Network (AIIR-MIX) in MARL. Its contributions are as follows: (a) we construct a novel intrinsic reward network based on the attention mechanism to make teamwork more effective; (b) we propose a mixing network that is able to combine intrinsic and extrinsic rewards non-linearly and dynamically in response to changing conditions of the environment. We compare AIIR-MIX with many State-Of-The-Art (SOTA) MARL methods on battle games in StarCraft II. The results demonstrate that AIIR-MIX performs admirably and can defeat the current advanced methods on average test win rate. To validate the effectiveness of AIIR-MIX, we conduct additional ablation studies. The results show that AIIR-MIX can dynamically assign each agent a real-time intrinsic reward in accordance with their actual contribution.
Citations: 0
On the Interpretability of Attention Networks
Pub Date : 2022-12-30 DOI: 10.48550/arXiv.2212.14776
L. N. Pandey, Rahul Vashisht, H. G. Ramaswamy
Attention mechanisms form a core component of several successful deep learning architectures, and are based on one key idea: "The output depends only on a small (but unknown) segment of the input." In several practical applications like image captioning and language translation, this is mostly true. In trained models with an attention mechanism, the outputs of an intermediate module that encodes the segment of input responsible for the output are often used as a way to peek into the "reasoning" of the network. We make such a notion more precise for a variant of the classification problem that we term selective dependence classification (SDC) when used with attention model architectures. Under such a setting, we demonstrate various error modes where an attention model can be accurate but fail to be interpretable, and show that such models do occur as a result of training. We illustrate various situations that can accentuate and mitigate this behaviour. Finally, we use our objective definition of interpretability for SDC tasks to evaluate a few attention model learning algorithms designed to encourage sparsity and demonstrate that these algorithms help improve interpretability.
Citations: 2
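The key idea quoted in the attention interpretability abstract, that the output depends on a small segment of the input, can be sketched with plain soft attention. This is a generic illustration, not the paper's SDC setup: the segment vectors, the scores, and the helper names `softmax` and `attend` are assumptions for this example.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(segments, scores):
    """Weight each segment vector by its attention weight and sum the results."""
    weights = softmax(scores)
    dim = len(segments[0])
    pooled = [sum(w * seg[d] for w, seg in zip(weights, segments)) for d in range(dim)]
    return weights, pooled

# Three hypothetical input segments and the scores an attention module might assign.
segments = [[0.1, 0.0], [2.0, 1.0], [0.0, 0.3]]
scores = [0.0, 4.0, 0.0]
weights, pooled = attend(segments, scores)

# The common interpretability reading: the highest-weight segment "explains" the output.
focus = max(range(len(weights)), key=weights.__getitem__)
print(focus, [round(w, 3) for w in weights])
```

Reading `focus` as the explanation is exactly the practice the abstract questions: a model can classify correctly while its largest weight sits on a segment that is not the one the label truly depends on.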
Evaluating the Perceived Safety of Urban City via Maximum Entropy Deep Inverse Reinforcement Learning
Pub Date : 2022-11-19 DOI: 10.48550/arXiv.2211.10660
Yaxuan Wang, Zhixin Zeng, Qijun Zhao
Inspired by expert evaluation policy for urban perception, we proposed a novel inverse reinforcement learning (IRL) based framework for predicting urban safety and recovering the corresponding reward function. We also presented a scalable state representation method to model the prediction problem as a Markov decision process (MDP) and use reinforcement learning (RL) to solve the problem. Additionally, we built a dataset called SmallCity based on the crowdsourcing method to conduct the research. As far as we know, this is the first time the IRL approach has been introduced to the urban safety perception and planning field to help experts quantitatively analyze perceptual features. Our results showed that IRL has promising prospects in this field. We will later open-source the crowdsourcing data collection site and the model proposed in this paper.
Citations: 0
One Gradient Frank-Wolfe for Decentralized Online Convex and Submodular Optimization
Pub Date : 2022-10-30 DOI: 10.48550/arXiv.2210.16790
T. Nguyen, K. Nguyen, D. Trystram
Decentralized learning has been studied intensively in recent years, motivated by its wide applications in the context of federated learning. The majority of previous research focuses on the offline setting in which the objective function is static. However, the offline setting becomes unrealistic in numerous machine learning applications that involve massive, changing data. In this paper, we propose decentralized online algorithms for convex and continuous DR-submodular optimization, two classes of functions that are present in a variety of machine learning problems. Our algorithms achieve performance guarantees comparable to those in the centralized offline setting. Moreover, on average, each participant performs only a single gradient computation per time step. Subsequently, we extend our algorithms to the bandit setting. Finally, we illustrate the competitive performance of our algorithms in real-world experiments.
Citations: 0
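For readers unfamiliar with the Frank-Wolfe method named in the title above, here is a minimal sketch of the classic centralized, offline version over the probability simplex. It is not the decentralized online algorithm from the paper; the quadratic objective, the `target` point, and the standard step size 2/(t+2) are assumptions picked for illustration. Each iteration uses one gradient evaluation and a linear minimization step instead of a projection.

```python
def lmo_simplex(grad):
    """Linear minimization oracle over the probability simplex:
    <grad, v> is minimized at the vertex with the smallest gradient entry."""
    k = min(range(len(grad)), key=grad.__getitem__)
    return [1.0 if i == k else 0.0 for i in range(len(grad))]

def frank_wolfe(grad_fn, x0, steps):
    """Classic Frank-Wolfe: one gradient call and one convex combination per step."""
    x = list(x0)
    for t in range(steps):
        v = lmo_simplex(grad_fn(x))
        gamma = 2.0 / (t + 2.0)
        x = [(1.0 - gamma) * xi + gamma * vi for xi, vi in zip(x, v)]
    return x

# Hypothetical objective: f(x) = ||x - target||^2, minimized on the simplex.
target = [0.2, 0.5, 0.3]

def grad_fn(x):
    return [2.0 * (xi - ti) for xi, ti in zip(x, target)]

x = frank_wolfe(grad_fn, [1.0, 0.0, 0.0], steps=1000)
print([round(xi, 3) for xi in x])
```

Because every iterate is a convex combination of simplex vertices, feasibility is kept without any projection, which is the property that makes the method attractive for constrained settings.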
Cross-Scale Context Extracted Hashing for Fine-Grained Image Binary Encoding
Pub Date : 2022-10-14 DOI: 10.48550/arXiv.2210.07572
Xuetong Xue, Jiaying Shi, Xinxue He, Sheng Xu, Zhaoming Pan
Deep hashing has been widely applied to large-scale image retrieval tasks owing to its efficient computation and low storage cost, achieved by encoding high-dimensional image data into binary codes. Since binary codes do not contain as much information as float features, the essence of binary encoding is preserving the main context to guarantee retrieval quality. However, existing hashing methods are severely limited in suppressing redundant background information and in accurately encoding from Euclidean space to Hamming space with a simple sign function. To solve these problems, a Cross-Scale Context Extracted Hashing Network (CSCE-Net) is proposed in this paper. First, we design a two-branch framework to capture fine-grained local information while maintaining high-level global semantic information. In addition, an Attention-guided Information Extraction (AIE) module is introduced between the two branches; working together with global sliding windows, it suppresses areas with little context information. Unlike previous methods, our CSCE-Net learns a content-related Dynamic Sign Function (DSF) to replace the original simple sign function. Therefore, the proposed CSCE-Net is context-sensitive and able to perform well on accurate image binary encoding. We further demonstrate that our CSCE-Net is superior to the existing hashing methods and improves retrieval performance on standard benchmarks.
Citations: 1
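The "simple sign function" baseline that the CSCE-Net abstract contrasts itself with can be sketched in a few lines. This shows only the baseline encoding and Hamming-distance retrieval, not the paper's Dynamic Sign Function; the feature values and the helper names `sign_hash` and `hamming` are assumptions for this example.

```python
def sign_hash(features):
    """Binarize a float feature vector with a plain sign function (1 if x >= 0, else 0)."""
    return [1 if x >= 0 else 0 for x in features]

def hamming(a, b):
    """Count the positions where two equal-length binary codes differ."""
    return sum(x != y for x, y in zip(a, b))

# Hypothetical float features for a query image and two database images.
query = sign_hash([0.9, -0.2, 0.01, -1.3])
near = sign_hash([0.7, -0.4, -0.01, -1.1])
far = sign_hash([-0.8, 0.5, -0.6, 1.2])

print(hamming(query, near), hamming(query, far))
```

Note the third feature: 0.01 and -0.01 are almost identical in Euclidean space yet land on different bits, which is one way a fixed sign threshold can encode inaccurately.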
Semantic Cross Attention for Few-shot Learning
Pub Date : 2022-10-12 DOI: 10.48550/arXiv.2210.06311
Bin Xiao, Chien Liu, W. Hsaio
Few-shot learning (FSL) has attracted considerable attention recently. Among existing approaches, the metric-based method aims to train an embedding network that places similar samples close together and dissimilar samples as far apart as possible, and it achieves promising results. FSL is characterized by using only a few images to train a model that can generalize to novel classes in image classification problems, but this setting makes it difficult to learn the visual features that can identify the images' appearance variations. The model training is likely to move in the wrong direction, as the images in an identical semantic class may have dissimilar appearances, whereas the images in different semantic classes may share a similar appearance. We argue that FSL can benefit from additional semantic features to learn discriminative feature representations. Thus, this study proposes a multi-task learning approach that treats learning the semantic features of label text as an auxiliary task to help boost the performance of the FSL task. Our proposed model uses word-embedding representations as semantic features to help train the embedding network and a semantic cross-attention module to bridge the semantic features into the typical visual modality. The proposed approach is simple, but produces excellent results. We apply our proposed approach to two previous metric-based FSL methods, and both improve substantially. The source code for our model is accessible from GitHub.
Citations: 1
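As a rough picture of what "metric-based" means in the few-shot abstract above, the sketch below classifies a query by its distance to class means in embedding space. This nearest-class-mean rule is one common metric-based scheme and is not claimed to be either of the two methods the paper builds on; the 2-D embeddings and the helper names are assumptions for this example.

```python
import math

def mean_vector(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[d] for v in vectors) / n for d in range(len(vectors[0]))]

def classify(query, support):
    """Assign the query to the class whose mean embedding is closest.

    `support` maps a class name to its few embedded support samples."""
    prototypes = {label: mean_vector(vs) for label, vs in support.items()}
    return min(prototypes, key=lambda label: math.dist(query, prototypes[label]))

# Hypothetical 2-D embeddings with two support samples per class.
support = {
    "cat": [[1.0, 0.9], [0.8, 1.1]],
    "dog": [[-1.0, -0.8], [-0.9, -1.2]],
}
print(classify([0.7, 0.6], support))
```

With so few support samples per class, the class means are sensitive to appearance variation, which is the weakness the abstract proposes to offset with semantic features from label text.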
Pose Guided Human Image Synthesis with Partially Decoupled GAN
Pub Date : 2022-10-07 DOI: 10.48550/arXiv.2210.03627
Jianguo Wu, Jianzong Wang, Shijing Si, Xiaoyang Qu, Jing Xiao
Pose Guided Human Image Synthesis (PGHIS) is a challenging task of transforming a human image from the reference pose to a target pose while preserving its style. Most existing methods encode the texture of the whole reference human image into a latent space, and then utilize a decoder to synthesize the image texture of the target pose. However, it is difficult to recover the detailed texture of the whole human image. To alleviate this problem, we propose a method that decouples the human body into several parts (e.g., hair, face, hands, feet) and then uses each of these parts to guide the synthesis of a realistic image of the person, which preserves the detailed information of the generated images. In addition, we design a multi-head attention-based module for PGHIS. Because most convolutional neural network-based methods have difficulty modeling long-range dependency due to the convolutional operation, the long-range modeling capability of the attention mechanism is more suitable than convolutional neural networks for the pose transfer task, especially for sharp pose deformation. Extensive experiments on Market-1501 and DeepFashion datasets reveal that our method almost outperforms other existing state-of-the-art methods in terms of both qualitative and quantitative metrics.
Citations: 2
Efficient Deep Clustering of Human Activities and How to Improve Evaluation
Pub Date : 2022-09-17 DOI: 10.48550/arXiv.2209.08335
Louis Mahon, Thomas Lukasiewicz
There has been much recent research on human activity recognition (HAR), due to the proliferation of wearable sensors in watches and phones, and the advances of deep learning methods, which avoid the need to manually extract features from raw sensor signals. A significant disadvantage of deep learning applied to HAR is the need for manually labelled training data, which is especially difficult to obtain for HAR datasets. Progress is starting to be made in the unsupervised setting, in the form of deep HAR clustering models, which can assign labels to data without having been given any labels to train on, but there are problems with evaluating deep HAR clustering models, which makes assessing the field and devising new methods difficult. In this paper, we highlight several distinct problems with how deep HAR clustering models are evaluated, describing these problems in detail and conducting careful experiments to explicate the effect that they can have on results. We then discuss solutions to these problems, and suggest standard evaluation settings for future deep HAR clustering models. Additionally, we present a new deep clustering model for HAR. When tested under our proposed settings, our model performs better than (or on par with) existing models, while also being more efficient and better able to scale to more complex datasets by avoiding the need for an autoencoder.
Citations: 0
On PAC Learning Halfspaces in Non-interactive Local Privacy Model with Public Unlabeled Data
Pub Date : 2022-09-17 DOI: 10.48550/arXiv.2209.08319
Jinyan Su, Jinhui Xu, Di Wang
In this paper, we study the problem of PAC learning halfspaces in the non-interactive local differential privacy model (NLDP). To breach the barrier of exponential sample complexity, previous results studied a relaxed setting where the server has access to some additional public but unlabeled data. We continue in this direction. Specifically, we consider the problem under the standard setting instead of the large margin setting studied before. Under different mild assumptions on the underlying data distribution, we propose two approaches that are based on the Massart noise model and self-supervised learning and show that it is possible to achieve sample complexities that are only linear in the dimension and polynomial in other terms for both private and public data, which significantly improve the previous results. Our methods could also be used for other private PAC learning problems.
Citations: 2
Journal
Asian Conference on Machine Learning