Machine learning and knowledge discovery in databases : European Conference, ECML PKDD ... : proceedings. ECML PKDD (Conference): Latest Articles

Optimization of Annealed Importance Sampling Hyperparameters

Pub Date : 2022-09-27 DOI: 10.48550/arXiv.2209.13226

Shirin Goshtasbpour, F. Pérez-Cruz

Annealed Importance Sampling (AIS) is a popular algorithm used to estimate the intractable marginal likelihood of deep generative models. Although AIS is guaranteed to provide an unbiased estimate for any set of hyperparameters, common implementations rely on simple heuristics, such as geometric-average bridging distributions between the initial and target distributions, which affect estimation performance when the computation budget is limited. To reduce the number of sampling iterations, we present a parametric AIS process with flexible intermediary distributions defined by a residual density with respect to the geometric mean path. Our method allows parameter sharing between annealing distributions, the use of a fixed linear schedule for discretization, and amortization of hyperparameter selection in latent variable models. We assess the performance of Optimized-Path AIS for marginal likelihood estimation of deep generative models and compare it to more computationally intensive AIS.
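For orientation, the baseline that the abstract contrasts against (AIS with geometric-mean bridging distributions and a fixed linear schedule) can be sketched on a toy one-dimensional problem. This is only an illustrative sketch of standard AIS, not the Optimized-Path method of the paper; the Gaussian target, the Metropolis-Hastings transition, and all function and parameter names are assumptions made for the example.

```python
import math
import random


def log_p0(x):
    # Normalized standard normal: the initial distribution (assumed for this toy case).
    return -0.5 * x * x - 0.5 * math.log(2.0 * math.pi)


def log_f_target(x, mu=1.0, sigma=0.5):
    # Unnormalized Gaussian target; its true normalizer is sigma * sqrt(2 * pi).
    return -0.5 * ((x - mu) / sigma) ** 2


def log_f_bridge(x, beta):
    # Geometric-mean path: f_beta = p0^(1 - beta) * f_target^beta.
    return (1.0 - beta) * log_p0(x) + beta * log_f_target(x)


def ais_log_z(num_chains=500, num_steps=50, mh_step=0.5, seed=0):
    """Estimate the log normalizer of the target with plain AIS."""
    rng = random.Random(seed)
    betas = [k / num_steps for k in range(num_steps + 1)]  # fixed linear schedule
    log_weights = []
    for _ in range(num_chains):
        x = rng.gauss(0.0, 1.0)  # exact sample from p0
        log_w = 0.0
        for k in range(1, num_steps + 1):
            # Importance-weight increment between consecutive bridging distributions.
            log_w += log_f_bridge(x, betas[k]) - log_f_bridge(x, betas[k - 1])
            # One Metropolis-Hastings step that leaves f_{beta_k} invariant.
            proposal = x + rng.gauss(0.0, mh_step)
            log_accept = log_f_bridge(proposal, betas[k]) - log_f_bridge(x, betas[k])
            if log_accept >= 0.0 or rng.random() < math.exp(log_accept):
                x = proposal
        log_weights.append(log_w)
    # Log of the average weight, computed stably (log-mean-exp).
    top = max(log_weights)
    return top + math.log(sum(math.exp(lw - top) for lw in log_weights) / num_chains)


estimate = ais_log_z()
truth = math.log(0.5 * math.sqrt(2.0 * math.pi))
```

Because the initial distribution is normalized, the average weight estimates the target's normalizer directly, so `estimate` should land close to `truth`. Fewer annealing steps make the estimate cheaper but noisier, which is the budget trade-off the abstract refers to.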

Citations: 1

Probing Spurious Correlations in Popular Event-Based Rumor Detection Benchmarks

Pub Date : 2022-09-19 DOI: 10.48550/arXiv.2209.08799

Jiaying Wu, Bryan Hooi

As social media becomes a hotbed for the spread of misinformation, the crucial task of rumor detection has witnessed promising advances fostered by open-source benchmark datasets. Although these datasets are widely used, we find that they suffer from spurious correlations, which are ignored by existing studies and lead to severe overestimation of existing rumor detection performance. The spurious correlations stem from three causes: (1) event-based data collection and labeling schemes assign the same veracity label to multiple highly similar posts from the same underlying event; (2) merging multiple data sources spuriously relates source identities to veracity labels; and (3) labeling bias. In this paper, we closely investigate three of the most popular rumor detection benchmark datasets (i.e., Twitter15, Twitter16 and PHEME), and propose event-separated rumor detection as a solution to eliminate spurious cues. Under the event-separated setting, we observe that the accuracy of existing state-of-the-art models drops significantly, by over 40%, becoming only comparable to a simple neural classifier. To better address this task, we propose Publisher Style Aggregation (PSA), a generalizable approach that aggregates publisher posting records to learn writing style and veracity stance. Extensive experiments demonstrate that our method outperforms existing baselines in terms of effectiveness, efficiency and generalizability.
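The event-separated setting can be pictured as a split in which no event contributes posts to both the training and the test set. The sketch below is one plausible way to build such a split, not the authors' protocol; the `event` field name, the toy posts, and the 25% test fraction are all hypothetical.

```python
import random


def event_separated_split(posts, test_fraction=0.25, seed=0):
    """Split posts so that every event falls entirely in train or entirely in test."""
    events = sorted({post["event"] for post in posts})
    rng = random.Random(seed)
    rng.shuffle(events)
    num_test = max(1, round(len(events) * test_fraction))
    test_events = set(events[:num_test])
    train = [post for post in posts if post["event"] not in test_events]
    test = [post for post in posts if post["event"] in test_events]
    return train, test


# Hypothetical toy data: four events with two near-duplicate posts each.
posts = [
    {"event": f"event_{i}", "text": f"post {j} about event {i}", "label": i % 2}
    for i in range(4)
    for j in range(2)
]
train, test = event_separated_split(posts)
```

A random post-level split of the same data would scatter near-duplicate posts from one event across both sets, which is the leakage described in cause (1) above.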

Volume: 95 1, Pages: 274-290

Citations: 2

Fast and Accurate Importance Weighting for Correcting Sample Bias

Pub Date : 2022-09-09 DOI: 10.48550/arXiv.2209.04215

Antoine de Mathelin, F. Deheeger, M. Mougeot, N. Vayatis

Bias in datasets can be very detrimental to appropriate statistical estimation. In response to this problem, importance weighting methods have been developed to match any biased distribution to its corresponding unbiased target distribution. The seminal Kernel Mean Matching (KMM) method is still considered state of the art in this research field. However, one of the main drawbacks of this method is its computational burden on large datasets. Building on previous works by Huang et al. (2007) and de Mathelin et al. (2021), we derive a novel importance weighting algorithm which scales to large datasets by using a neural network to predict the instance weights. We show, on multiple public datasets and under various sample biases, that our proposed approach drastically reduces the computational time on large datasets while maintaining sample bias correction performance similar to that of other importance weighting methods. The proposed approach appears to be the only one able to give relevant reweighting in a reasonable time for large datasets with up to two million data points.
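The general idea of importance weighting can be illustrated with a deliberately simple case in which the target proportions of two groups are known. This is neither KMM nor the neural weighting algorithm of the paper, both of which estimate weights without such knowledge; the group labels, values, and function names are hypothetical.

```python
from collections import Counter


def importance_weights(sample_groups, target_proportions):
    """Weight each instance by target proportion / observed proportion of its group."""
    counts = Counter(sample_groups)
    total = len(sample_groups)
    return [target_proportions[g] / (counts[g] / total) for g in sample_groups]


def weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Biased sample: group "a" is over-represented (8 of 10 instances).
groups = ["a"] * 8 + ["b"] * 2
values = [1.0] * 8 + [5.0] * 2

weights = importance_weights(groups, {"a": 0.5, "b": 0.5})
naive_mean = sum(values) / len(values)           # 1.8, skewed toward group "a"
corrected_mean = weighted_mean(values, weights)  # about 3.0, the balanced-population mean
```

Reweighting recovers the mean of the balanced population from the skewed sample. Methods such as KMM aim for the same correction when the density ratio has to be estimated from data.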

Volume: 22 1, Pages: 659-674

Citations: 3

Class-Incremental Learning via Knowledge Amalgamation

Pub Date : 2022-09-05 DOI: 10.48550/arXiv.2209.02112

Marcus Vinícius de Carvalho, Mahardhika Pratama, Jie Zhang, Yajuan San

Catastrophic forgetting has been a significant problem hindering the deployment of deep learning algorithms in the continual learning setting. Numerous methods have been proposed to address the catastrophic forgetting problem, in which an agent loses its generalization power on old tasks while learning new tasks. We put forward an alternative strategy to handle catastrophic forgetting with knowledge amalgamation (CFA), which learns a student network from multiple heterogeneous teacher models specializing in previous tasks and can be applied to current offline methods. The knowledge amalgamation process is carried out in a single-head manner with only a selected number of memorized samples and no annotations. The teachers and students do not need to share the same network structure, allowing heterogeneous tasks to be adapted to a compact or sparse data representation. We compare our method with competitive baselines from different strategies, demonstrating our approach's advantages.

Citations: 2

Scalable Adversarial Online Continual Learning

Pub Date : 2022-09-04 DOI: 10.48550/arXiv.2209.01558

T. Dam, Mahardhika Pratama, Md Meftahul Ferdaus, S. Anavatti, Hussein Abbas

Adversarial continual learning (ACL) is effective for continual learning problems because its feature alignment process generates task-invariant features with low susceptibility to the catastrophic forgetting problem. Nevertheless, the ACL method imposes considerable complexity because it relies on task-specific networks and discriminators. It also goes through an iterative training process that is not suited to online (one-epoch) continual learning problems. This paper proposes a scalable adversarial continual learning (SCALE) method, which puts forward a parameter generator that transforms common features into task-specific features and a single discriminator in the adversarial game to induce common features. The training process is carried out in a meta-learning fashion using a new combination of three loss functions. SCALE outperforms prominent baselines by noticeable margins in both accuracy and execution time.

Citations: 1

An Ion Exchange Mechanism Inspired Story Ending Generator for Different Characters

Pub Date : 2022-09-01 DOI: 10.48550/arXiv.2209.00200

Xinyu Jiang, Qi Zhang, Chongyang Shi, Kaiying Jiang, Liang Hu, Shoujin Wang

Story ending generation aims at generating reasonable endings for a given story context. Most existing studies in this area focus on generating coherent or diversified story endings, while ignoring that different characters may lead to different endings for a given story. In this paper, we propose a Character-oriented Story Ending Generator (CoSEG) to customize an ending for each character in a story. Specifically, we first propose a character modeling module to learn the personalities of characters from their descriptive experiences extracted from the story context. Then, inspired by the ion exchange mechanism in chemical reactions, we design a novel vector breaking/forming module to learn the intrinsic interactions between each character and the corresponding context through an analogical information exchange procedure. Finally, we leverage the attention mechanism to learn effective character-specific interactions and feed each interaction into a decoder to generate character-oriented endings. Extensive experimental results and case studies demonstrate that CoSEG achieves significant improvements in the quality of generated endings compared with state-of-the-art methods, and it effectively customizes the endings for different characters.

Volume: 34 1, Pages: 553-570

Citations: 0

EpiGNN: Exploring Spatial Transmission with Graph Neural Network for Regional Epidemic Forecasting

Pub Date : 2022-08-23 DOI: 10.48550/arXiv.2208.11517

Feng Xie, Zhong Zhang, Liang Li, B. Zhou, Yusong Tan

Epidemic forecasting is the key to effective control of epidemic transmission and helps the world mitigate the crisis that threatens public health. To better understand the transmission and evolution of epidemics, we propose EpiGNN, a graph neural network-based model for epidemic forecasting. Specifically, we design a transmission risk encoding module to characterize local and global spatial effects of regions in epidemic processes and incorporate them into the model. Meanwhile, we develop a Region-Aware Graph Learner (RAGL) that takes transmission risk, geographical dependencies, and temporal information into account to better explore spatial-temporal dependencies and makes regions aware of related regions' epidemic situations. The RAGL can also combine with external resources, such as human mobility, to further improve prediction performance. Comprehensive experiments on five real-world epidemic-related datasets (including influenza and COVID-19) demonstrate the effectiveness of our proposed method and show that EpiGNN outperforms state-of-the-art baselines by 9.48% in RMSE.

Volume: 3 1, Pages: 469-485

Citations: 10

Cloud-Based Real-Time Molecular Screening Platform with MolFormer

Pub Date : 2022-08-13 DOI: 10.48550/arXiv.2208.06665

Brian M. Belgodere, V. Chenthamarakshan, Payel Das, Pierre L. Dognin, Toby Kurien, Igor Melnyk, Youssef Mroueh, Inkit Padhi, Mattia Rigotti, Jarret Ross, Yair Schiff, R. Young

With the prospect of automating a number of chemical tasks with high fidelity, chemical language processing models are emerging at a rapid speed. Here, we present a cloud-based real-time platform that allows users to virtually screen molecules of interest. For this purpose, molecular embeddings inferred from a recently proposed large chemical language model, named MolFormer, are leveraged. The platform currently supports three tasks: nearest neighbor retrieval, chemical space visualization, and property prediction. Based on the functionalities of this platform and results obtained, we believe that such a platform can play a pivotal role in automating chemistry and chemical engineering research, as well as assist in drug discovery and material design tasks. A demo of our platform is provided at www.ibm.biz/molecular_demo.
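Nearest neighbor retrieval over molecular embeddings can be sketched as ranking a library of vectors by similarity to a query vector. The abstract does not say which similarity measure the platform uses, so cosine similarity is an assumption here, and the three-dimensional vectors and molecule names are made-up placeholders rather than MolFormer outputs.

```python
import math


def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest_neighbors(query, library, k=2):
    """Return the names of the k library entries most similar to the query embedding."""
    ranked = sorted(
        library.items(),
        key=lambda item: cosine_similarity(query, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:k]]


# Hypothetical embeddings; real ones would come from the language model.
library = {
    "mol_a": [1.0, 0.0, 0.0],
    "mol_b": [0.9, 0.1, 0.0],
    "mol_c": [0.0, 1.0, 0.0],
}
query = [0.8, 0.6, 0.0]
hits = nearest_neighbors(query, library, k=2)
```

An exhaustive scan like this is fine for a small library. A real-time service over a large collection would more likely rely on an approximate nearest neighbor index.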

Citations: 1

Improving Micro-video Recommendation by Controlling Position Bias

Pub Date : 2022-08-09 DOI: 10.48550/arXiv.2208.05315

Yisong Yu, Beihong Jin, Jiageng Song, Beibei Li, Y. Zheng, Wei-wei Zhu

As micro-video apps become popular, the numbers of micro-videos and users increase rapidly, which highlights the importance of micro-video recommendation. Although micro-video recommendation can naturally be treated as sequential recommendation, previous sequential recommendation models do not fully consider the characteristics of micro-video apps, and in their inductive biases, the role of positions does not accord with the reality of the micro-video scenario. Therefore, in this paper, we present a model named PDMRec (Position Decoupled Micro-video Recommendation). PDMRec applies separate self-attention modules to model micro-video information and positional information and then aggregates them, preventing noisy correlations between micro-video semantics and positional information from being encoded into the sequence embeddings. Moreover, PDMRec proposes contrastive learning strategies that closely match the characteristics of the micro-video scenario, thus reducing the interference from micro-video positions in sequences. We conduct extensive experiments on two real-world datasets. The experimental results show that PDMRec outperforms multiple existing state-of-the-art models and achieves significant performance improvements.

Volume: 23 1, Pages: 508-523

Citations: 2

No More Strided Convolutions or Pooling: A New CNN Building Block for Low-Resolution Images and Small Objects

Pub Date : 2022-08-07 DOI: 10.48550/arXiv.2208.03641

Raja Sunkara, Tie Luo

Convolutional neural networks (CNNs) have achieved resounding success in many computer vision tasks such as image classification and object detection. However, their performance degrades rapidly on tougher tasks where images are of low resolution or objects are small. In this paper, we point out that this is rooted in a defective yet common design in existing CNN architectures, namely the use of strided convolution and/or pooling layers, which results in a loss of fine-grained information and the learning of less effective feature representations. To this end, we propose a new CNN building block called SPD-Conv in place of each strided convolution layer and each pooling layer (thus eliminating them altogether). SPD-Conv comprises a space-to-depth (SPD) layer followed by a non-strided convolution (Conv) layer, and can be applied in most if not all CNN architectures. We explain this new design under the two most representative computer vision tasks: object detection and image classification. We then create new CNN architectures by applying SPD-Conv to YOLOv5 and ResNet, and empirically show that our approach significantly outperforms state-of-the-art deep learning models, especially on tougher tasks with low-resolution images and small objects. We have open-sourced our code at https://github.com/LabSAINT/SPD-Conv.
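The space-to-depth step can be illustrated without any deep learning framework. The sketch below shows the generic space-to-depth rearrangement on nested lists: it halves the spatial resolution while moving every pixel into the channel dimension, so nothing is discarded before the non-strided convolution. It is not taken from the authors' repository, and the channel ordering, the function name, and the nested-list layout are assumptions.

```python
def space_to_depth(feature_map, scale=2):
    """Rearrange a [C][H][W] nested list into [C * scale**2][H // scale][W // scale]."""
    height = len(feature_map[0])
    width = len(feature_map[0][0])
    if height % scale or width % scale:
        raise ValueError("height and width must be divisible by scale")
    out = []
    for row_offset in range(scale):
        for col_offset in range(scale):
            for channel in feature_map:
                # Keep every scale-th pixel, starting at (row_offset, col_offset).
                out.append(
                    [row[col_offset::scale] for row in channel[row_offset::scale]]
                )
    return out


# One 4x4 channel holding the values 0..15 in row-major order.
x = [[[4 * r + c for c in range(4)] for r in range(4)]]
y = space_to_depth(x)
# y has 4 channels of size 2x2; the first is [[0, 2], [8, 10]].
```

A stride-2 convolution or pooling layer would keep or summarize only part of each 2x2 neighborhood. Here all sixteen input values survive in `y`, which is the property the abstract attributes to replacing strided layers.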

Volume: 116 1, Pages: 443-459

Citations: 40
