We present a new benchmark task for graph-based machine learning: predicting future air quality (PM2.5 concentration) observed by a geographically distributed network of environmental sensors. While prior work has successfully applied Graph Neural Networks (GNNs) to a wide family of spatio-temporal prediction tasks, the benchmark introduced here poses a technical challenge that has been less studied in the context of graph-based spatio-temporal learning: distribution shift over long periods of time. A central goal of this paper is to understand the behavior of spatio-temporal GNNs under distribution shift. We conduct a comprehensive comparative study of graph-based and non-graph-based machine learning models under two data split methods, one of which induces distribution shift and one of which does not. Our empirical results suggest that GNN models tend to suffer more from distribution shift than non-graph-based models, which calls for special attention when deploying spatio-temporal GNNs in practice.
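To make the two split methods concrete, the sketch below contrasts a random split (train and test drawn from the same time range, so their distributions largely coincide) with a temporal split (test on the last stretch of time, where seasonal and long-term trends induce distribution shift). This is a minimal illustration, not the benchmark's actual splitting code; the dataframe layout and the "timestamp" column name are our own assumptions.

```python
import numpy as np
import pandas as pd

def random_split(df: pd.DataFrame, test_frac: float = 0.2, seed: int = 0):
    """Random split: train and test cover the same time range,
    so the PM2.5 distributions of the two sets largely coincide."""
    idx = np.random.default_rng(seed).permutation(len(df))
    cut = int(len(df) * (1 - test_frac))
    return df.iloc[idx[:cut]], df.iloc[idx[cut:]]

def temporal_split(df: pd.DataFrame, test_frac: float = 0.2):
    """Temporal split: test on the final stretch of time, so seasonal
    and long-term trends shift the test distribution away from training."""
    df = df.sort_values("timestamp")  # hypothetical column name
    cut = int(len(df) * (1 - test_frac))
    return df.iloc[:cut], df.iloc[cut:]
```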
Large Language Models (LLMs) have demonstrated remarkable performance across various natural language tasks, marking significant strides towards general artificial intelligence. While progress towards general artificial intelligence is typically pursued by scaling up models, a complementary direction is to develop lightweight custom models that better serve particular domains, given the high cost of training and deploying LLMs and the scarcity of computational resources. In this paper, we present MindLLM, a novel series of bilingual lightweight large language models trained from scratch, alleviating these burdens by offering models with 1.3 billion and 3 billion parameters. We give a thorough account of the experience accrued during model development, covering every step of the process, including data construction, model architecture, evaluation, and applications; we hope these insights prove valuable to fellow academics and developers. MindLLM consistently matches or surpasses the performance of larger open-source models on some public benchmarks. We also introduce an innovative instruction tuning framework tailored for smaller models to enhance their capabilities efficiently. Moreover, we explore the application of MindLLM in specific vertical domains such as law and finance, underscoring the agility and adaptability of our lightweight models.
Graph contrastive learning (GCL) has emerged as a promising paradigm for learning graph representations. Recently, the idea of hard negatives has been introduced into GCL, providing more challenging self-supervised objectives and alleviating overfitting. These methods use different graphs in the same mini-batch as negative examples and assign larger weights to the truly hard ones. However, the influence of such weighting strategies is limited in practice, since a small mini-batch may not contain any sufficiently challenging negative examples. In this paper, we offer a more flexible way to control the hardness of negatives by directly manipulating their representations. Assuming that (1) good negative representations should not deviate far from the representations of real graph samples, and (2) the computation performed by the graph encoder may introduce biases into graph representations, we first design a negative representation generator (NRG) which (1) employs real graphs as prototypes to perturb, and (2) introduces parameterized perturbations through the feed-forward computation of the graph encoder to match these biases. We then design a generation loss to train the parameters of the NRG and adaptively generate negative representations for more challenging contrastive objectives. Experiments on eight benchmark datasets show that our proposed framework, ANGCL, achieves a 1.6% relative improvement over the best baseline and can be successfully integrated with three types of graph augmentations. Ablation studies and hyper-parameter experiments further demonstrate the effectiveness of ANGCL.
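The following sketch illustrates the perturbation idea only: real graph representations serve as prototypes, and a learned, bounded offset moves them to produce harder negatives that stay close to the data. The module name, the form of the perturbation network, and the bound eps are our own illustrative assumptions; the paper's NRG injects its parameterized perturbations through the graph encoder's feed-forward computation, which is not reproduced here.

```python
import torch
import torch.nn as nn

class NegativeGenerator(nn.Module):
    """Illustrative NRG-style module: perturbs real graph representations
    (used as prototypes) with a learned, bounded offset."""
    def __init__(self, dim: int, eps: float = 0.5):
        super().__init__()
        self.perturb = nn.Sequential(nn.Linear(dim, dim), nn.Tanh())
        self.eps = eps  # bounds the deviation from the real samples

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        # z: (batch, dim) representations of real graphs in the mini-batch.
        # Tanh keeps the offset in (-1, 1), so negatives stay within eps of z.
        return z + self.eps * self.perturb(z)
```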
Trajectory classification focuses on predicting the class or category of a moving object based on its observed movement over time. Classifying trajectory data with classical approaches can be challenging due to the arbitrary and often long length of trajectories. To overcome this, trajectories are commonly mapped into fixed-dimensional vector representations designed to encode their most significant features. Here we propose a novel vector representation for trajectories that combines previously employed features with new ones derived from the Kramers–Moyal coefficients (KMCs). Because KMCs originate from a Taylor expansion that progressively encapsulates more information about a stochastic process, it is reasonable to expect them to be effective for trajectory classification. We evaluated our representation using different classifiers on several benchmark datasets previously used for trajectory classification. With the addition of features extracted from KMCs, our results show a consistent increase in classification accuracy and F1 score of around 4% across all datasets and models used for evaluation, with increases of up to 20% in accuracy and up to 23% in F1 score in some scenarios.
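For intuition, the sketch below estimates the first two KMCs (drift and diffusion) of a one-dimensional trajectory via binned conditional moments of its increments, D_n(x) ≈ ⟨(Δx)^n | x⟩ / (n! Δt). The binning scheme and function names are our own choices, and the paper's feature set built on top of these coefficients is not specified here; summary statistics of the coefficient profiles would be one plausible way to obtain fixed-length features.

```python
import numpy as np
from math import factorial

def km_coefficients(x: np.ndarray, dt: float, n_bins: int = 20,
                    orders: tuple = (1, 2)) -> dict:
    """Estimate Kramers-Moyal coefficients D_n(x) of a 1-D trajectory x
    sampled at interval dt, via binned conditional moments of increments."""
    dx = np.diff(x)
    edges = np.linspace(x.min(), x.max(), n_bins + 1)
    which = np.clip(np.digitize(x[:-1], edges) - 1, 0, n_bins - 1)
    coeffs = {}
    for n in orders:
        d = np.full(n_bins, np.nan)
        for b in range(n_bins):
            mask = which == b
            if mask.any():
                # conditional moment of order n, normalized by n! * dt
                d[b] = (dx[mask] ** n).mean() / (factorial(n) * dt)
        coeffs[n] = d  # D_n evaluated on each state bin
    return coeffs
```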
In this paper, a series of novel activation functions is presented, derived using the improved Riemann–Liouville conformable fractional derivative (CFD). The study investigates the use of fractional activation functions in Multilayer Perceptron (MLP) models and their impact on classification performance, verified on the IRIS, MNIST, and FMNIST datasets. Fractional activation functions introduce a non-integer power exponent, allowing complex patterns and representations to be captured more effectively. The experiments compare MLP models employing fractional activation functions, such as the fractional sigmoid, hyperbolic tangent, and rectified linear unit, against traditional models using standard activation functions, their improved versions, and existing fractional functions. The numerical studies confirm the theoretical observations made in the paper. The findings highlight the potential of the new functions as a valuable tool for classification tasks in deep learning, and suggest that incorporating fractional activation functions into MLP architectures can lead to superior accuracy and robustness.
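As a minimal sketch of how such a function can arise, the standard conformable derivative satisfies T_α f(x) = x^(1-α) f'(x); applying it to the softplus (whose ordinary derivative is the sigmoid) yields an activation with a non-integer power factor that recovers the standard sigmoid at α = 1. This construction is our own illustration based on the standard conformable-derivative identity; the paper's improved Riemann–Liouville CFD may yield a different functional form, and the |x| + eps regularization below is our own choice to keep the power defined at and below zero.

```python
import numpy as np

def sigmoid(x: np.ndarray) -> np.ndarray:
    return 1.0 / (1.0 + np.exp(-x))

def fractional_sigmoid(x: np.ndarray, alpha: float = 0.9,
                       eps: float = 1e-6) -> np.ndarray:
    """Conformable-derivative activation: T_alpha applied to softplus,
    using T_alpha f(x) = x**(1 - alpha) * f'(x). Since softplus' = sigmoid,
    alpha = 1 recovers the standard sigmoid exactly."""
    return (np.abs(x) + eps) ** (1.0 - alpha) * sigmoid(x)
```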

