Title: MiniFloats on RISC-V Cores: ISA Extensions with Mixed-Precision Short Dot Products
Authors: Luca Bertaccini, Gianna Paulin, Matheus Cavalcante, Tim Fischer, Stefan Mach, Luca Benini
Pub Date: 2024-02-19 | DOI: 10.1109/tetc.2024.3365354
Journal: IEEE Transactions on Emerging Topics in Computing
Abstract: (not available)
Title: Engravings, Secrets, and Interpretability of Neural Networks
Authors: Nathaniel Hobbs, Periklis A. Papakonstantinou, Jaideep Vaidya
Pub Date: 2024-01-31 | DOI: 10.1109/tetc.2024.3358759
Journal: IEEE Transactions on Emerging Topics in Computing
Abstract: (not available)
Title: Unsupervised Domain Adaptation Via Contrastive Adversarial Domain Mixup: A Case Study on COVID-19
Authors: Huimin Zeng, Zhenrui Yue, Lanyu Shang, Yang Zhang, Dong Wang
Pub Date: 2024-01-26 | DOI: 10.1109/tetc.2024.3354419
Journal: IEEE Transactions on Emerging Topics in Computing
Abstract: (not available)
Title: Combining Trust Graphs and Keystroke Dynamics to Counter Fake Identities in Social Networks
Authors: Francesco Buccafurri, Gianluca Lax, Denis Migdal, Lorenzo Musarella, Christophe Rosenberger
Pub Date: 2024-01-01 | DOI: 10.1109/tetc.2023.3346691
Journal: IEEE Transactions on Emerging Topics in Computing
Abstract: (not available)
Title: MFDS-STGCN: Predicting the Behaviors of College Students With Fine-Grained Spatial-Temporal Activities Data
Authors: Dongbo Zhou, Hongwei Yu, Jie Yu, Shuai Zhao, Wenhui Xu, Qianqian Li, Fengyin Cai
Pub Date: 2024-01-01 | DOI: 10.1109/TETC.2023.3344131
Journal: IEEE Transactions on Emerging Topics in Computing, vol. 12, no. 1, pp. 254-265
Abstract: Mining and predicting college students' behaviors from fine-grained spatial-temporal campus activity data plays a key role in their academic success and personal development. Most existing behavior prediction methods use shallow learning algorithms such as statistics, clustering, and correlation analysis, which fail to mine the long-term spatial-temporal dependencies and semantic correlations in these fine-grained campus data. We propose a novel multi-fragment dynamic semantic spatial-temporal graph convolutional network, MFDS-STGCN, built on a spatial-temporal graph convolutional network (STGCN), for the automatic prediction of college students' behaviors and abnormal behaviors. To evaluate the model, we construct a dataset of 7.6 million behavioral records from approximately 400 students over 140 days. Extensive experimental results demonstrate that the proposed method outperforms multiple baseline methods on both behavior prediction and abnormal behavior prediction, with accuracies of 92.60% and 90.84%, respectively. To make the predictions actionable, we further establish an early-warning management mechanism: based on the predictions and big-data analyses, education administrators can detect undesirable abnormal behaviors in time and implement effective interventions to better guide students' campus lives, ultimately helping them develop and grow more effectively.
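The abstract above builds on the STGCN idea of alternating graph (spatial) and temporal aggregation over activity data. A minimal pure-Python sketch of that idea follows; the function names and the simple averaging kernels are hypothetical illustrations, not the authors' MFDS-STGCN code.

```python
def graph_conv(features, adj):
    """Spatial step: each node averages its own and its neighbors'
    feature vectors (a row-normalized adjacency propagation)."""
    n, dim = len(features), len(features[0])
    out = []
    for i in range(n):
        neigh = [j for j in range(n) if adj[i][j] == 1 or j == i]
        out.append([sum(features[j][k] for j in neigh) / len(neigh)
                    for k in range(dim)])
    return out

def temporal_pool(frames, kernel=3):
    """Temporal step: slide a window over time steps and average each
    node's features inside the window."""
    out = []
    for t in range(len(frames) - kernel + 1):
        window = frames[t:t + kernel]
        n, dim = len(window[0]), len(window[0][0])
        out.append([[sum(w[i][k] for w in window) / kernel
                     for k in range(dim)] for i in range(n)])
    return out
```

Stacking such spatial and temporal steps (with learned weights instead of plain averages) is what lets an STGCN capture the long-term spatial-temporal dependencies the abstract contrasts with shallow methods.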
Title: A FeFET-Based ADC Offset Robust Compute-In-Memory Architecture for Streaming Keyword Spotting (KWS)
Authors: Yandong Luo, Johan Vanderhaegen, Oleg Rybakov, Martin Kraemer, Niel Warren, Shimeng Yu
Pub Date: 2023-12-28 | DOI: 10.1109/TETC.2023.3345346
Journal: IEEE Transactions on Emerging Topics in Computing, vol. 12, no. 1, pp. 23-34
Abstract: Keyword spotting (KWS) on edge devices requires low power consumption and real-time response. In this work, a ferroelectric field-effect transistor (FeFET)-based compute-in-memory (CIM) architecture is proposed for streaming KWS processing. Compared with the conventional sequential processing scheme, inference latency is reduced by 7.7x-17.6x without loss of energy efficiency. To make the KWS models robust to hardware non-idealities such as analog-to-digital converter (ADC) offset, an offset-aware training scheme is proposed, consisting of ADC offset noise injection and frame-wise normalization. This scheme improves mean accuracy by 1.5%-5.2% and chip yield by 5%-39% for TC-ResNet and DS-TC-ResNet (with MatchboxNet configuration), respectively. The proposed CIM architecture is implemented in FeFET technology, with a simulated energy consumption as low as 1.65 μJ/decision for 12-word keyword spotting using TC-ResNet8.
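The offset-aware training scheme named in the abstract has two ingredients: injecting random per-ADC offsets during training, and normalizing each feature frame so that common-mode offsets cancel. A minimal sketch of both, assuming Gaussian column offsets and zero-mean/unit-variance frame normalization (the paper's exact noise model and constants are not reproduced here):

```python
import math
import random

def inject_adc_offset(columns, offset_std=0.05):
    """Training-time noise injection: add one random offset per ADC
    column, resampled each call, mimicking column-wise ADC offset in
    the analog CIM readout. offset_std is a hypothetical magnitude."""
    return [[x + off for x in col]
            for col, off in ((c, random.gauss(0.0, offset_std))
                             for c in columns)]

def frame_normalize(frame, eps=1e-9):
    """Frame-wise normalization: zero-mean, unit-variance per feature
    frame, which cancels any offset common to the whole frame."""
    mean = sum(frame) / len(frame)
    var = sum((x - mean) ** 2 for x in frame) / len(frame)
    return [(x - mean) / math.sqrt(var + eps) for x in frame]
```

Training on offset-corrupted activations while normalizing per frame is what makes the deployed model tolerant of the ADC offsets that would otherwise cut chip yield.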
Title: Near-Memory Computing With Compressed Embedding Table for Personalized Recommendation
Authors: Jeongmin Lim, Young Geun Kim, Sung Woo Chung, Farinaz Koushanfar, Joonho Kong
Pub Date: 2023-12-28 | DOI: 10.1109/TETC.2023.3345870
Journal: IEEE Transactions on Emerging Topics in Computing, vol. 12, no. 3, pp. 938-951
Abstract: Deep learning (DL)-based recommendation models play an important role in many real-world applications. However, the embedding layer, a key part of these models, requires sparse accesses to a very large memory space followed by pooling (i.e., reduction) operations, which forces the system to overprovision memory capacity for model deployment. Moreover, with a conventional CPU-based architecture it is difficult to exploit locality, creating a heavy data-transfer burden between the CPU and memory. To resolve this problem, we propose an embedding vector element quantization and compression method that reduces the memory footprint (capacity) required by the embedding tables. In addition, to reduce data transfer and memory accesses, we propose near-memory acceleration hardware with an SRAM buffer that stores frequently accessed embedding vectors. Our quantization and compression method achieves compression ratios of 3.95-4.14 on embedding tables in widely used datasets while negligibly affecting inference accuracy. Our acceleration technique with 3D-stacked DRAM, which enables near-memory processing in the logic die with high DRAM bandwidth, delivers a 4.9x-5.4x embedding-layer speedup over 8-core CPU execution while reducing memory energy consumption by 5.9x-12.1x on average.
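The memory-footprint idea in the abstract above can be illustrated with a generic scheme: quantize embedding-table elements to narrow integers, then dequantize during the sum-pooling reduction the embedding layer performs after its sparse lookups. This sketch uses plain symmetric linear quantization with hypothetical names; the paper's actual codec and compression ratios are not reproduced here.

```python
def quantize_table(table, bits=8):
    """Symmetric per-table linear quantization of embedding elements:
    map floats to integers in [-qmax, qmax], returning the quantized
    table and the scale needed to dequantize."""
    qmax = 2 ** (bits - 1) - 1
    max_abs = max(abs(x) for row in table for x in row)
    scale = (max_abs / qmax) if max_abs else 1.0
    qtable = [[round(x / scale) for x in row] for row in table]
    return qtable, scale

def pooled_lookup(qtable, scale, indices):
    """Sum-pool the embedding vectors selected by one sample's sparse
    indices, dequantizing once after the integer accumulation."""
    dim = len(qtable[0])
    return [sum(qtable[i][k] for i in indices) * scale
            for k in range(dim)]
```

Storing 8-bit elements instead of 32-bit floats alone shrinks the table about 4x, in the same range as the 3.95-4.14 compression ratios the abstract reports, while the pooled result stays close to the full-precision sum.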