
Proceedings of the ... AAAI Conference on Artificial Intelligence: latest publications

Better and Faster: Adaptive Event Conversion for Event-Based Object Detection
Yan Peng, Yueyi Zhang, Peilin Xiao, Xiaoyan Sun, Feng Wu
Event cameras are bio-inspired imaging sensors that asynchronously collect sparse event streams and offer many advantages. In this paper, we focus on building better and faster event-based object detectors. To this end, we first propose a computationally efficient event representation, the Hyper Histogram, which adequately preserves both the polarity and temporal information of events. Then we devise an Adaptive Event Conversion module, which converts events into Hyper Histograms according to event density via an adaptive queue. Moreover, we introduce a novel event-based augmentation method, Shadow Mosaic, which significantly improves event sample diversity and enhances the generalization ability of detection models. We integrate our proposed modules into three representative object detection models: YOLOv5, Deformable-DETR, and RetinaNet. Experimental results on three event-based detection datasets (1Mpx, Gen1, and MVSEC-NIGHTL21) demonstrate that our approach outperforms other state-of-the-art methods by a large margin while achieving a much faster running speed (< 14 ms and < 4 ms for 50 ms of event data on the 1Mpx and Gen1 datasets, respectively).
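The polarity-and-time binning behind event-histogram representations can be sketched in plain NumPy. This is an illustrative approximation, not the paper's actual Hyper Histogram layout; the `event_histogram` helper and its parameters are hypothetical.

```python
import numpy as np

def event_histogram(events, height, width, time_bins, window):
    """Bin (x, y, t, p) events into a per-polarity, per-time-slice count
    histogram. An illustrative approximation of polarity/temporal binning;
    the paper's actual Hyper Histogram layout may differ."""
    hist = np.zeros((2, time_bins, height, width), dtype=np.int32)
    for x, y, t, p in events:
        b = min(int(t / window * time_bins), time_bins - 1)  # time slice index
        hist[p, b, y, x] += 1
    return hist

# Two events on a 4x4 sensor over a 50 ms window, split into 2 time slices.
events = [(0, 0, 5.0, 0), (1, 2, 40.0, 1)]  # (x, y, t in ms, polarity)
h = event_histogram(events, height=4, width=4, time_bins=2, window=50.0)
```

Each event increments exactly one cell, so the representation keeps both polarity channels and a coarse temporal ordering, at a cost linear in the number of events.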
DOI: 10.1609/aaai.v37i2.25298 | pp. 2056-2064 | Published 2023-06-26
Citations: 1
Learning Generalizable Batch Active Learning Strategies via Deep Q-networks (Student Abstract)
Yichen Li, Wen-Jie Shen, Boyu Zhang, Feng Mao, Zongzhang Zhang, Yang Yu
To handle a large amount of unlabeled data, batch active learning (BAL) queries humans for the labels of a batch of the most valuable data points at every round. Most current BAL strategies are based on human-designed heuristics, such as uncertainty sampling or mutual information maximization. However, there exists a disagreement between these heuristics and the ultimate goal of BAL, i.e., optimizing the model's final performance within the query budgets. This disagreement leads to a limited generality of these heuristics. To this end, we formulate BAL as an MDP and propose a data-driven approach based on deep reinforcement learning. Our method learns the BAL strategy by maximizing the model's final performance. Experiments on the UCI benchmark show that our method can achieve competitive performance compared to existing heuristics-based approaches.
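For context, the uncertainty-sampling heuristic that the learned policy is compared against can be sketched in a few lines; the `uncertainty_batch` helper is a hypothetical illustration, not the authors' code.

```python
import numpy as np

def uncertainty_batch(probs, batch_size):
    """Heuristic baseline: pick the batch of unlabeled points whose
    predicted class distributions have the highest entropy."""
    entropy = -(probs * np.log(probs + 1e-12)).sum(axis=1)
    return np.argsort(-entropy)[:batch_size]

# Three unlabeled points; the first is maximally uncertain.
probs = np.array([[0.5, 0.5],
                  [0.9, 0.1],
                  [0.6, 0.4]])
batch = uncertainty_batch(probs, batch_size=2)
```

The paper's point is that such hand-designed scores are fixed in advance, whereas a reinforcement-learned policy can adapt its selection rule to directly optimize final model performance under the query budget.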
DOI: 10.1609/aaai.v37i13.26989 | pp. 16258-16259 | Published 2023-06-26
Citations: 0
Inconsistent Cores for ASP: The Perks and Perils of Non-monotonicity
J. Fichte, Markus Hecher, Stefan Szeider
Answer Set Programming (ASP) is a prominent modeling and solving framework. An inconsistent core (IC) of an ASP program is an inconsistent subset of rules. In the case of inconsistent programs, a smallest or subset-minimal IC contains crucial rules for the inconsistency. In this work, we study finding minimal ICs of ASP programs and key fragments from a complexity-theoretic perspective. Interestingly, due to ASP's non-monotonic behavior, even consistent programs admit ICs. It turns out that there is an entire landscape of problems involving ICs with a diverse range of complexities up to the fourth level of the Polynomial Hierarchy. Deciding the existence of an IC is, already for tight programs, on the second level of the Polynomial Hierarchy. Furthermore, we give encodings for IC-related problems on the fragment of tight programs and illustrate feasibility on small instance sets.
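The non-monotonic effect described above, a consistent program with an inconsistent subset of rules, can be reproduced with a brute-force stable-model enumerator. This is a toy sketch of the standard Gelfond-Lifschitz construction, not the paper's encodings; the `(head, pos_body, neg_body)` rule encoding is an assumption for illustration.

```python
from itertools import combinations

def least_model(definite_rules):
    """Fixpoint (least model) of a definite, negation-free program."""
    m, changed = set(), True
    while changed:
        changed = False
        for head, pos in definite_rules:
            if pos <= m and head not in m:
                m.add(head)
                changed = True
    return m

def stable_models(rules, atoms):
    """Enumerate stable models of a normal program with constraints.

    Each rule is (head, pos_body, neg_body); head=None encodes a constraint.
    Brute force over candidate atom sets via the Gelfond-Lifschitz reduct;
    only workable for toy programs.
    """
    models = []
    for r in range(len(atoms) + 1):
        for cand in combinations(sorted(atoms), r):
            m = set(cand)
            # Reduct: drop rules whose negative body intersects the candidate.
            reduct = [(h, pos) for h, pos, neg in rules
                      if h is not None and not (neg & m)]
            violated = any(pos <= m and not (neg & m)
                           for h, pos, neg in rules if h is None)
            if not violated and least_model(reduct) == m:
                models.append(m)
    return models

# Consistent program P = { a.   a :- not a. } has stable model {a},
# yet its subset { a :- not a. } alone has no stable model: an
# inconsistent core of a consistent program, exactly the non-monotonic
# effect the paper studies.
P = [("a", set(), set()), ("a", set(), {"a"})]
```

Dropping the fact `a.` removes the support that defuses `a :- not a.`, which is why subsets of a consistent ASP program can be inconsistent, unlike in monotonic logics.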
DOI: 10.1609/aaai.v37i5.25783 | pp. 6363-6371 | Published 2023-06-26
Citations: 0
LADA-Trans-NER: Adaptive Efficient Transformer for Chinese Named Entity Recognition Using Lexicon-Attention and Data-Augmentation
Jiguo Liu, Chao Liu, Nan Li, Shihao Gao, Mingqi Liu, Dali Zhu
Recently, word enhancement has become very popular for Chinese Named Entity Recognition (NER), reducing segmentation errors and increasing the semantic and boundary information of Chinese words. However, once lexical information is integrated, these methods tend to ignore the semantic context that precedes and follows each word in the sentence. Moreover, the regularity of word-length information has not been fully explored in existing word-character fusion methods. In this work, we propose a Lexicon-Attention and Data-Augmentation (LADA) method for Chinese NER. We discuss the challenges of using existing methods to incorporate word information for NER and show how our proposed methods can overcome those challenges. LADA is based on a Transformer Encoder that utilizes a lexicon to construct a directed graph and fuses word information by updating the optimal edges of the graph. Specifically, we introduce an advanced data augmentation method to obtain the optimal representation for the NER task. Experimental results show that the augmentation done using LADA can considerably boost the performance of our NER system and achieve significantly better results than previous state-of-the-art methods and variant models in the literature on four publicly available NER datasets, namely Resume, MSRA, Weibo, and OntoNotes v4. We also observe better generalization and application to a real-world setting from LADA on multi-source complex entities.
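The lexicon-to-directed-graph step can be illustrated with exhaustive span matching over the character sequence. This is a simplified sketch (the `lexicon_edges` helper and the toy lexicon are hypothetical); LADA's actual edge construction and fusion are more involved.

```python
def lexicon_edges(chars, lexicon):
    """Exhaustively match lexicon words against spans of the character
    sequence, returning (start, end) edges of a directed word graph."""
    n = len(chars)
    return [(i, j) for i in range(n) for j in range(i + 1, n + 1)
            if chars[i:j] in lexicon]

# Toy example: a character sequence with overlapping lexicon matches.
sentence = "南京市长江大桥"
lexicon = {"南京", "南京市", "长江", "大桥", "长江大桥"}
edges = lexicon_edges(sentence, lexicon)
```

Overlapping edges such as (0, 2) and (0, 3) encode competing segmentations; a model that attends over this graph can weigh word boundaries instead of committing to a single segmentation up front.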
DOI: 10.1609/aaai.v37i11.26554 | pp. 13236-13245 | Published 2023-06-26
Citations: 1
Xaitk-Saliency: An Open Source Explainable AI Toolkit for Saliency
Brian Hu, Paul Tunison, Brandon Richard Webster, Anthony J. Hoogs
Advances in artificial intelligence (AI) using techniques such as deep learning have fueled the recent progress in fields such as computer vision. However, these algorithms are still often viewed as "black boxes", which cannot easily explain how they arrived at their final output decisions. Saliency maps are one commonly used form of explainable AI (XAI), which indicate the input features an algorithm paid attention to during its decision process. Here, we introduce the open source xaitk-saliency package, an XAI framework and toolkit for saliency. We demonstrate its modular and flexible nature by highlighting two example use cases for saliency maps: (1) object detection model comparison and (2) doppelganger saliency for person re-identification. We also show how the xaitk-saliency package can be paired with visualization tools to support the interactive exploration of saliency maps. Our results suggest that saliency maps may play a critical role in the verification and validation of AI models, ensuring their trusted use and deployment. The code is publicly available at: https://github.com/xaitk/xaitk-saliency.
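As background on how such saliency maps are produced, here is a minimal occlusion-style saliency sketch in plain NumPy. It deliberately does not use the xaitk-saliency API; `occlusion_saliency` and the toy score function are hypothetical.

```python
import numpy as np

def occlusion_saliency(image, score_fn, patch=2):
    """Slide a zero patch over the image and record the score drop at each
    location; larger drops mark regions the model relied on."""
    base = score_fn(image)
    h, w = image.shape
    saliency = np.zeros((h, w))
    for i in range(0, h, patch):
        for j in range(0, w, patch):
            occluded = image.copy()
            occluded[i:i + patch, j:j + patch] = 0.0
            saliency[i:i + patch, j:j + patch] = base - score_fn(occluded)
    return saliency

# Toy "model": its score is the mean of the top-left 2x2 quadrant only.
image = np.ones((4, 4))
score_fn = lambda im: im[:2, :2].mean()
sal = occlusion_saliency(image, score_fn, patch=2)
```

Because the method only queries `score_fn`, it is black-box: it needs no access to gradients or internals, which is the setting the toolkit's perturbation-based implementations target.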
DOI: 10.1609/aaai.v37i13.26871 | pp. 15760-15766 | Published 2023-06-26
Citations: 3
Tackling Safe and Efficient Multi-Agent Reinforcement Learning via Dynamic Shielding (Student Abstract)
Wenli Xiao, Yiwei Lyu, J. Dolan
Multi-agent Reinforcement Learning (MARL) has been increasingly used in safety-critical applications but has no safety guarantees, especially during training. In this paper, we propose dynamic shielding, a novel decentralized MARL framework to ensure safety in both training and deployment phases. Our framework leverages Shield, a reactive system running in parallel with the reinforcement learning algorithm to monitor and correct agents' behavior. In our algorithm, shields dynamically split and merge according to the environment state in order to maintain decentralization and avoid conservative behaviors while enjoying formal safety guarantees. We demonstrate the effectiveness of MARL with dynamic shielding in the mobile navigation scenario.
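The core shielding idea, monitoring and correcting an agent's action, can be sketched as follows; the safety predicate and fallback are hypothetical, and the paper's dynamic splitting and merging of shields across agents is omitted.

```python
def shielded_action(policy_action, state, is_safe, fallback):
    """Pass the agent's proposed action through a safety monitor and
    substitute a safe fallback when it would violate the specification.
    The paper's shields additionally split and merge dynamically across
    agents; that coordination logic is omitted here."""
    return policy_action if is_safe(state, policy_action) else fallback(state)

# Hypothetical navigation safety spec: never move forward next to an obstacle.
is_safe = lambda state, action: not (state == "near_obstacle" and action == "forward")
fallback = lambda state: "stop"
```

Because the shield intervenes at action-selection time, safety holds during training as well as deployment, which is the guarantee the abstract emphasizes.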
DOI: 10.1609/aaai.v37i13.27041 | pp. 16362-16363 | Published 2023-06-26
Citations: 0
InParformer: Evolutionary Decomposition Transformers with Interactive Parallel Attention for Long-Term Time Series Forecasting
Haizhou Cao, Zhenhao Huang, Tiechui Yao, Jue Wang, Hui He, Yangang Wang
Long-term time series forecasting (LTSF) provides substantial benefits for numerous real-world applications, while placing essential demands on a model's capacity to capture long-range dependencies. Recent Transformer-based models have significantly improved LTSF performance. It is worth noting that the Transformer with its self-attention mechanism was originally proposed to model language sequences, whose tokens (i.e., words) are discrete and highly semantic. However, unlike language sequences, most time series are sequences of continuous numeric points. Time steps with temporal redundancy are weakly semantic, and leveraging time-domain tokens alone makes it hard to depict the overall properties of a time series (e.g., the overall trend and periodic variations). To address these problems, we propose a novel Transformer-based forecasting model named InParformer with an Interactive Parallel Attention (InPar Attention) mechanism. InPar Attention is proposed to learn long-range dependencies comprehensively in both the frequency and time domains. To improve its learning capacity and efficiency, we further design several mechanisms, including query selection, key-value pair compression, and recombination. Moreover, InParformer is constructed with evolutionary seasonal-trend decomposition modules to enhance intricate temporal pattern extraction. Extensive experiments on six real-world benchmarks show that InParformer outperforms the state-of-the-art forecasting Transformers.
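The seasonal-trend decomposition the abstract mentions builds on the classical moving-average split, which can be sketched as follows; `trend_seasonal_split` is a hypothetical simplification of the paper's evolutionary decomposition modules.

```python
import numpy as np

def trend_seasonal_split(x, period):
    """Classical moving-average trend/seasonal split: the smoothed series
    is the trend, and the residual carries the seasonal/periodic part."""
    kernel = np.ones(period) / period
    pad = period // 2
    padded = np.pad(x, pad, mode="edge")  # extend edges so output matches input length
    trend = np.convolve(padded, kernel, mode="valid")[:len(x)]
    seasonal = x - trend
    return trend, seasonal

# A flat series decomposes into a flat trend and zero seasonal residual.
x = np.full(8, 3.0)
trend, seasonal = trend_seasonal_split(x, period=4)
```

Separating the two components lets a forecaster model the slowly varying trend and the periodic residual with different mechanisms, the design motivation behind decomposition-based Transformers.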
DOI: 10.1609/aaai.v37i6.25845 | pp. 6906-6915 | Published 2023-06-26
Citations: 0
Photogrammetry and VR for Comparing 2D and Immersive Linguistic Data Collection (Student Abstract)
Jacob Rubinstein, Cynthia Matuszek, Don Engel
The overarching goal of this work is to enable the collection of language describing a wide variety of objects viewed in virtual reality. We aim to create full 3D models from a small number of ‘keyframe’ images of objects found in the publicly available Grounded Language Dataset (GoLD) using photogrammetry. We will then collect linguistic descriptions by placing our models in virtual reality and having volunteers describe them. To evaluate the impact of virtual reality immersion on linguistic descriptions of the objects, we intend to apply contrastive learning to perform grounded language learning, then compare the descriptions collected from images (in GoLD) versus our models.
DOI: 10.1609/aaai.v37i13.27016 | pp. 16312-16313 | Published 2023-06-26
Citations: 0
Towards Global Video Scene Segmentation with Context-Aware Transformer
Yang Yang, Yurui Huang, Weili Guo, Baohua Xu, Dingyin Xia
Videos such as movies or TV episodes usually need to divide the long storyline into cohesive units, i.e., scenes, to facilitate the understanding of video semantics. The key challenge lies in finding the boundaries of scenes by comprehensively considering the complex temporal structure and semantic information. To this end, we introduce a novel Context-Aware Transformer (CAT) with a self-supervised learning framework to learn high-quality shot representations, for generating well-bounded scenes. More specifically, we design the CAT with local-global self-attentions, which can effectively consider both the long-term and short-term context to improve the shot encoding. For training the CAT, we adopt the self-supervised learning schema. Firstly, we leverage shot-to-scene level pretext tasks to facilitate the pre-training with pseudo boundary, which guides CAT to learn the discriminative shot representations that maximize intra-scene similarity and inter-scene discrimination in an unsupervised manner. Then, we transfer contextual representations for fine-tuning the CAT with supervised data, which encourages CAT to accurately detect the boundary for scene segmentation. As a result, CAT is able to learn the context-aware shot representations and provides global guidance for scene segmentation. Our empirical analyses show that CAT can achieve state-of-the-art performance when conducting the scene segmentation task on the MovieNet dataset, e.g., offering 2.15 improvements on AP.
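The pseudo-boundary idea can be approximated by thresholding the similarity of adjacent shot embeddings; this is a hypothetical stand-in (`pseudo_boundaries` and the threshold are assumptions), not CAT's actual pretext-task construction.

```python
import numpy as np

def pseudo_boundaries(shot_embs, threshold):
    """Mark a scene boundary wherever the cosine similarity between
    adjacent shot embeddings falls below a threshold."""
    normed = shot_embs / np.linalg.norm(shot_embs, axis=1, keepdims=True)
    sims = (normed[:-1] * normed[1:]).sum(axis=1)  # cosine of adjacent pairs
    return [i + 1 for i, s in enumerate(sims) if s < threshold]

# Four shots: two visually similar pairs, so one boundary between shots 1 and 2.
shots = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
boundaries = pseudo_boundaries(shots, threshold=0.5)
```

Boundaries obtained this way can serve as weak labels: representations are then trained so that intra-scene similarity is high and inter-scene similarity low, the unsupervised objective the abstract describes.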
DOI: 10.1609/aaai.v37i3.25426 | pp. 3206-3213 | Published 2023-06-26
Citations: 3
Phase-Informed Bayesian Ensemble Models Improve Performance of COVID-19 Forecasts
A. Adiga, Gursharn Kaur, Lijing Wang, Benjamin Hurt, P. Porebski, S. Venkatramanan, B. Lewis, M. Marathe
Despite hundreds of methods published in the literature, forecasting epidemic dynamics remains challenging yet important. The challenges stem from multiple sources, including: the need for timely data, co-evolution of epidemic dynamics with behavioral and immunological adaptations, and the evolution of new pathogen strains. The ongoing COVID-19 pandemic highlighted these challenges; in an important article, Reich et al. did a comprehensive analysis highlighting many of these challenges. In this paper, we take another step in critically evaluating existing epidemic forecasting methods. Our methods are based on a simple yet crucial observation - epidemic dynamics go through a number of phases (waves). Armed with this understanding, we propose a modification to our deployed Bayesian ensembling case time series forecasting framework. We show that ensembling methods employing the phase information and using different weighting schemes for each phase can produce improved forecasts. We evaluate our proposed method against both the currently deployed model and the COVID-19 Forecast Hub models. The overall performance of the proposed model is consistent across the pandemic but, more importantly, it is ranked third and first during two critical rapid-growth phases in cases, regimes where the performance of most models from the CDC forecasting hub dropped significantly.
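The phase-aware weighting idea described in the abstract — giving each component model a different weight depending on which phase (wave) the epidemic curve is in — can be sketched as follows. Everything here is a hypothetical illustration: the phase rule, the weight values, and the point-forecast averaging stand in for what the paper fits within a Bayesian ensembling framework.

```python
def detect_phase(recent_cases):
    """Crude illustrative phase rule: 'growth' if cases rose over the
    lookback window, otherwise 'decline'."""
    return "growth" if recent_cases[-1] > recent_cases[0] else "decline"

# Hypothetical per-phase weights for three component models; in practice
# these would be fitted from each model's past phase-specific accuracy.
PHASE_WEIGHTS = {
    "growth":  [0.6, 0.3, 0.1],
    "decline": [0.2, 0.3, 0.5],
}

def phase_weighted_ensemble(model_forecasts, recent_cases):
    """model_forecasts: list of point forecasts, one per component model.
    Returns the ensemble forecast under the weights of the current phase."""
    weights = PHASE_WEIGHTS[detect_phase(recent_cases)]
    return sum(w * f for w, f in zip(weights, model_forecasts))

# Rising case counts select the growth-phase weights, favouring model 0:
forecast = phase_weighted_ensemble([120.0, 100.0, 80.0], recent_cases=[50, 60, 75])
# -> 0.6*120 + 0.3*100 + 0.1*80 = 110.0
```

A single fixed-weight ensemble would blur models that do well in growth phases with models that do well in decline phases; switching weight sets by phase is the mechanism the abstract credits for the improved rankings during rapid-growth regimes.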
{"title":"Phase-Informed Bayesian Ensemble Models Improve Performance of COVID-19 Forecasts","authors":"A. Adiga, Gursharn Kaur, Lijing Wang, Benjamin Hurt, P. Porebski, S. Venkatramanan, B. Lewis, M. Marathe","doi":"10.1609/aaai.v37i13.26855","DOIUrl":"https://doi.org/10.1609/aaai.v37i13.26855","url":null,"abstract":"Despite hundreds of methods published in the literature, forecasting epidemic dynamics remains challenging yet important. The challenges stem from multiple sources, including: the need for timely data, co-evolution of epidemic dynamics with behavioral and immunological adaptations, and the evolution of new pathogen strains. The ongoing COVID-19 pandemic highlighted these challenges; in an important article, Reich et al. did a comprehensive analysis highlighting many of these challenges.\u0000\u0000In this paper, we take another step in critically evaluating existing epidemic forecasting methods. Our methods are based on a simple yet crucial observation - epidemic dynamics go through a number of phases (waves). Armed with this understanding, we propose a modification to our deployed Bayesian ensembling case time series forecasting framework. We show that ensembling methods employing the phase information and using different weighting schemes for each phase can produce improved forecasts. We evaluate our proposed method with both the currently deployed model and the COVID-19 forecasthub models. The overall performance of the proposed model is consistent across the pandemic but more importantly, it is ranked third and first during two critical rapid growth phases in cases, regimes where the performance of most models from the CDC forecasting hub dropped significantly.","PeriodicalId":74506,"journal":{"name":"Proceedings of the ... AAAI Conference on Artificial Intelligence. 
AAAI Conference on Artificial Intelligence","volume":"74 1","pages":"15647-15653"},"PeriodicalIF":0.0,"publicationDate":"2023-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77380139","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1