Title: XM2A: Multi-Scale Multi-Head Attention with Cross-Talk for Multi-Variate Time Series Analysis
Authors: Yash Garg, K. Candan
DOI: 10.1109/MIPR51284.2021.00030
Venue: 2021 IEEE 4th International Conference on Multimedia Information Processing and Retrieval (MIPR)
Published: 2021-09-01
Citations: 0
Abstract
Advances in sensory technologies are enabling the capture of a diverse spectrum of real-world data streams. Increasing availability of such data, especially in the form of multi-variate time series, opens new opportunities for applications that rely on identifying and leveraging complex temporal patterns. A particular challenge such algorithms face is that complex patterns consist of multiple simpler patterns of varying scales (temporal lengths). While several recent works (such as multi-head attention networks) recognized the fact that complex patterns need to be understood in the form of multiple simpler patterns, we note that existing works lack the ability to represent the interactions across these constituent patterns. To tackle this limitation, in this paper we propose a novel Multi-scale Multi-head Attention with Cross-Talk (XM2A) framework designed to represent the multi-scale patterns that make up a complex pattern by configuring each attention head to learn a pattern at a particular scale, and by accounting for the co-existence of patterns at multiple scales through a cross-talking mechanism among the heads. Experiments show that XM2A outperforms state-of-the-art attention mechanisms, such as Transformer and MSMSA, on benchmark datasets, such as SADD, AUSLAN, and MOCAP.
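The abstract gives only a high-level description of the architecture, so the following is a minimal, hypothetical sketch of the two ideas it names: (1) each attention head operates at its own temporal scale, and (2) a cross-talk step mixes information across heads. The pooling/upsampling choice, the random projections, and the `cross_talk` mixing matrix are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def avg_pool(x, scale):
    # Coarsen the time axis with a non-overlapping average window of length `scale`.
    T, d = x.shape
    Tp = T // scale
    return x[:Tp * scale].reshape(Tp, scale, d).mean(axis=1)

def xm2a_sketch(x, scales, cross_talk, rng):
    """Hypothetical sketch of multi-scale multi-head attention with cross-talk.

    x          : (T, d) multi-variate time series; T must be divisible by each scale.
    scales     : one temporal scale per head, e.g. (1, 2, 3).
    cross_talk : (H, H) matrix mixing head outputs (the cross-talk mechanism).
    """
    T, d = x.shape
    heads = []
    for s in scales:
        xs = avg_pool(x, s)  # each head sees the series at its own scale
        Wq, Wk, Wv = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3))
        q, k, v = xs @ Wq, xs @ Wk, xs @ Wv
        attn = softmax(q @ k.T / np.sqrt(d))        # scaled dot-product attention
        out = attn @ v                              # (T // s, d) output at scale s
        heads.append(np.repeat(out, s, axis=0))     # upsample back to length T
    H = np.stack(heads, axis=0)                     # (num_heads, T, d)
    # Cross-talk: each head's output becomes a mixture of all heads' outputs.
    mixed = np.einsum('hg,gtd->htd', cross_talk, H)
    # Concatenate the mixed heads along the feature axis, as in multi-head attention.
    return mixed.transpose(1, 0, 2).reshape(T, -1)

rng = np.random.default_rng(0)
x = rng.standard_normal((12, 4))        # toy series: 12 steps, 4 variates
C = np.full((3, 3), 1.0 / 3.0)          # uniform cross-talk among 3 heads
y = xm2a_sketch(x, (1, 2, 3), C, rng)   # -> shape (12, 12): 3 heads * 4 features
```

With an identity `cross_talk` matrix the sketch degrades to independent multi-scale heads, which is one way to see what the cross-talk step adds: without it, patterns learned at different scales never interact.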