2022 IEEE 46th Annual Computers, Software, and Applications Conference (COMPSAC): Latest Papers

An Approach to Real-Time Fall Detection based on OpenPose and LSTM
Pub Date : 2022-06-01 DOI: 10.1109/COMPSAC54236.2022.00250
Po-Chih Chen, Chih-Hung Chang, Yu-Wei Chan, Yin-Te Tsai, W. Chu
Falls are consistently the leading cause of death among seniors. As the global population ages and birth rates decline, the shortage of nursing staff seriously affects health care for the elderly. We believe that using information and communication technology to automatically detect and identify falls among the elderly can reduce fall-related injuries. Unlike previous approaches based on wearable sensing devices, this paper proposes a method that identifies a fall from the displacement of the relative positional parameters of the human body in the image. We implemented a system that combines OpenPose with LSTM, a deep learning neural network model for time series. The system performs image recognition, captures the human joint parameters of falling postures in the image, applies simple filtering to the identified parameters, and then uses the filtered parameters for model training.
Citations: 2
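The preprocessing described in the fall-detection abstract above (capture joint parameters per frame, filter them, then feed a time series to an LSTM) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' pipeline: it assumes each keypoint arrives as an (x, y, confidence) triple, and the confidence threshold `min_conf`, the window length, and the function names are all hypothetical.

```python
from typing import List, Tuple

# Assumed layout: one (x, y, confidence) triple per joint.
Keypoint = Tuple[float, float, float]
Frame = List[Keypoint]


def filter_frame(frame: Frame, min_conf: float = 0.3) -> List[float]:
    """Flatten one frame to [x0, y0, x1, y1, ...], zeroing low-confidence joints."""
    features: List[float] = []
    for x, y, conf in frame:
        if conf >= min_conf:
            features.extend([x, y])
        else:
            # Hypothetical filtering rule: unreliable joints become (0, 0).
            features.extend([0.0, 0.0])
    return features


def make_windows(frames: List[Frame], window: int = 3, stride: int = 1) -> List[List[List[float]]]:
    """Slice the filtered frames into fixed-length, overlapping sequences."""
    filtered = [filter_frame(f) for f in frames]
    return [filtered[i:i + window] for i in range(0, len(filtered) - window + 1, stride)]


if __name__ == "__main__":
    # Five synthetic frames with two joints each; the second joint is low-confidence.
    frames = [[(10.0, 20.0, 0.9), (11.0, 40.0, 0.1)] for _ in range(5)]
    windows = make_windows(frames, window=3)
    print(len(windows), len(windows[0]), len(windows[0][0]))
```

Each window is a (time steps × features) sequence of the shape a sequence model such as an LSTM would typically consume; the model itself is left out because the abstract does not specify its architecture.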
A Structure-Focused Deep Learning Approach for Table Recognition from Document Images
Pub Date : 2022-06-01 DOI: 10.1109/COMPSAC54236.2022.00105
Mengxi Zhou, R. Ramnath
In this paper, we present a nuanced exploration of deep learning (DL) techniques for extracting structural information from document images generated by the digitization of business processes. The driving example is the extraction of table columns and rows using a simple stacked CNN architecture and a combination of ensemble techniques. In addition, the component models of the ensemble are diversified by training on datasets created by applying a “semantics-preserving” transformation to the base dataset. This transformation also aims to ease recognition in certain noisy images commonly encountered in practice. Our experiments demonstrate how DL techniques can be applied and innovatively combined to measurably improve the accuracy of structure extraction.
Citations: 0
Predicting Mortality Rate based on Comprehensive Features of Intensive Care Unit Patients
Pub Date : 2022-06-01 DOI: 10.1109/COMPSAC54236.2022.00222
Jagan Moahan Reddy Danda, Kumar Priyansh, H. Shahriar, Hisham M. Haddad, A. Cuzzocrea, Nazmus Sakib
Predictive analytics has been gaining momentum in health care since the adoption of electronic health record (EHR) systems in hospitals. In particular, machine learning models are built from critical care EHR data and the information provided during ICU admission to predict the mortality of patients admitted to the ICU. According to the MIMIC-IV dataset, the survival rate of patients admitted to the ICU is 89.76%. This paper proposes a hybrid prediction technique that uses Random Forest and XGBoost to predict the mortality rate. The proposed technique performs well in predicting the mortality rate despite the class imbalance in the dataset. Experiments conducted on the MIMIC-IV dataset yield a prediction accuracy of 89.72%.
Citations: 1
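To put the class imbalance mentioned in the mortality-prediction abstract above in perspective, the short calculation below works out the accuracy of a trivial classifier that always predicts the majority class. It is a worked arithmetic example only: the cohort size of 10,000 stays is a hypothetical round number chosen to match the 89.76% survival rate quoted in the abstract, and the function name is illustrative.

```python
def majority_baseline_accuracy(n_survived: int, n_died: int) -> float:
    """Accuracy of a trivial classifier that always predicts the larger class."""
    total = n_survived + n_died
    return max(n_survived, n_died) / total


if __name__ == "__main__":
    # Hypothetical cohort of 10,000 ICU stays with the quoted 89.76% survival rate.
    n_survived, n_died = 8976, 1024
    baseline = majority_baseline_accuracy(n_survived, n_died)
    print(f"majority-class baseline accuracy: {baseline:.4f}")
    print(f"accuracy reported in the abstract: {0.8972:.4f}")
```

On a dataset this imbalanced, accuracy by itself says little about how well deaths are identified, so readers may want to look in the full paper for class-sensitive metrics such as recall or AUC.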
Towards a Distributed Inference Detection System in a Multi-Database Context
Pub Date : 2022-06-01 DOI: 10.1109/COMPSAC54236.2022.00246
Sad Rafik, P. Lachat, N. Bennani, V. Rehn-Sonigo
The omnipresence of services offered by diverse applications leads customers to share more and more personal data, some of which are sensitive. Dishonest entities perform inference attacks by querying non-sensitive data in order to deduce the stored sensitive data. Detecting those attacks is still an open problem in a setting where a dishonest entity has access to the databases of distinct data controllers that contain data collected from the same customer. This problem has previously been addressed with a centralized detection system. However, that approach is limited by its centralized nature: the system protects the customers' privacy at the expense of the data controllers' privacy. Hence, in this article we describe a distributed architecture that detects inference attacks in a multi-database context while preserving the privacy of both the applications and the customers.
Citations: 1
Grasp Position Estimation from Depth Image Using Stacked Hourglass Network Structure
Pub Date : 2022-06-01 DOI: 10.1109/COMPSAC54236.2022.00187
Keisuke Hamamoto, Huimin Lu, Yujie Li, Tohru Kamiya, Y. Nakatoh, S. Serikawa
In recent years, robots have come into use in places other than factories. However, most robots currently used in such places can only perform the actions they were programmed to perform in a predefined space. For robots to become widespread in the future, not only in factories, distribution warehouses, and similar places but also in homes and other environments where robots receive complex commands and their surroundings are constantly changing, robots must be made intelligent. Therefore, this study proposes a deep learning grasp position estimation model that uses depth images to achieve intelligent pick-and-place. Some previous studies have used both RGB images and depth images. In this study, however, we use only depth images as training data because we expect the inference to be based on the object's shape, independent of the object's color information. Because inference is based on the target object's shape and does not depend on the RGB image, the deep learning model is expected to minimize the need for re-training when the packaging of the target object changes on the production line. We propose a deep learning model that focuses on the stacked encoder-decoder structure of the Stacked Hourglass Network. We compared the proposed method with the baseline method using the same evaluation metrics and on a real robot; the proposed method shows higher accuracy than other methods in previous studies.
Citations: 0
Formal Analysis and Verification of DPSTM v2 Architecture Using CSP
Pub Date : 2022-06-01 DOI: 10.1109/COMPSAC54236.2022.00138
Pei Li, Jiaqi Yin, Huibiao Zhu, Lili Xiao, M. Popovic
Transactional memory is designed for developing parallel programs and improving their efficiency. PSTM (Python software transactional memory) mainly supports multi-core parallel programs written in Python. DPSTM (distributed Python software transactional memory) was developed to better meet the development requirements of distributed concurrent programs and to enhance the safety of the system. Compared with PSTM, DPSTM offers higher operating efficiency and stronger fault tolerance. In this paper, we apply CSP (Communicating Sequential Processes) to formally analyze the components of the DPSTM v2 architecture, the data exchange process between components, and two different transaction processing modes. We use the model checker PAT (Process Analysis Toolkit) to model the DPSTM v2 architecture and verify eight properties, including deadlock freedom, ACI (atomicity, isolation, and consistency), sequential consistency, data server availability, read tolerance, and crash tolerance. The verification results show that the DPSTM v2 architecture guarantees all of the above properties. In particular, the system continues to operate normally when some of the data servers have crashed, ensuring the safety of a distributed system.
Citations: 0
Concept drift detection for distributed multi-model machine learning systems
Pub Date : 2022-06-01 DOI: 10.1109/COMPSAC54236.2022.00168
Beverly Abadines Quon, J. Gaudiot
Many works focus on optimizing machine learning models during their training phase but fail to account for how these models adapt to the model-serving phase once they are deployed in real-world applications. In this phase, models must process streams of data that can evolve over time and distort the relationship between incoming data, causing concept drift. This paper proposes leveraging the advantages of emerging feature stores to improve concept drift detection on unlabeled, dynamic data streams across multiple models. First, we introduce Drift Detection on Distributed Datasets (QuaD), which combines classical drift detectors to make use of labeled and unlabeled data and to create a local context (i.e., per live model) and a global context (i.e., across multiple models). Second, we propose using feature store entities, SHAP values, and Collaborative Filtering (CF) to augment unlabeled data across multiple models. To the best of our knowledge, QuaD is the first work that examines the collective behavior of concept drift across multiple models and discerns associations between models that may share a susceptibility in a dynamic setting. QuaD uses a combination of performance-based and data distribution-based drift detectors and CF to capture varying types of concept drift in labeled and unlabeled data streams, and it is modeled around the data abstraction provided by emerging feature stores.
Citations: 0
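As a rough illustration of what a data distribution-based drift detector, one of the two detector families named in the concept-drift abstract above, might look like on unlabeled data, the sketch below compares a reference window of one feature with a current window using the two-sample Kolmogorov-Smirnov statistic. This is a generic textbook technique swapped in for illustration, not QuaD itself: the abstract does not say which detectors QuaD combines, and the fixed `threshold` and function names are hypothetical.

```python
from bisect import bisect_right
from typing import Sequence


def ks_statistic(reference: Sequence[float], current: Sequence[float]) -> float:
    """Largest gap between the two empirical CDFs (two-sample KS statistic)."""
    ref, cur = sorted(reference), sorted(current)
    gap = 0.0
    # The empirical CDFs only change at observed values, so checking those suffices.
    for value in set(ref) | set(cur):
        cdf_ref = bisect_right(ref, value) / len(ref)
        cdf_cur = bisect_right(cur, value) / len(cur)
        gap = max(gap, abs(cdf_ref - cdf_cur))
    return gap


def drift_detected(reference: Sequence[float], current: Sequence[float],
                   threshold: float = 0.5) -> bool:
    """Flag drift when the distributions differ by more than a hypothetical threshold."""
    return ks_statistic(reference, current) > threshold


if __name__ == "__main__":
    reference = [0.1, 0.2, 0.3, 0.4]
    shifted = [1.1, 1.2, 1.3, 1.4]
    print(drift_detected(reference, reference), drift_detected(reference, shifted))
```

A fixed threshold keeps the example short; in practice a significance test (for example `scipy.stats.ks_2samp`, which returns a p-value) is the more usual choice.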
Big data architectures for data lakes: A systematic literature review
Pub Date : 2022-06-01 DOI: 10.1109/COMPSAC54236.2022.00179
Sonam Ramchand, Tariq Mahmood
The rise of big data technologies has demanded different concepts and practices for data exploitation; among them, the data lake is a recently emerged concept meant to deal with heterogeneous data. Data lakes have been part of the big data era since 2010, but there has not yet been a systematic review of data lake implementation. In this survey, we conduct such a review and provide a road map for researchers that describes what has happened to data lakes so far. We aim to explain the basic concept of data lakes and, based on the literature review, propose a novel data lake definition that best describes the concept. One of the main problems in implementing a data lake is deciding which technologies to use, so this study covers technologies that can potentially be used for data lake implementation. Furthermore, data lake architectures and their variants are discussed in detail. We also analyze the current state, challenges, and pros and cons of data lakes. This study gathers in one place what researchers need to understand the data lake concept, architectures, technologies, approaches, current state, and challenges.
Citations: 3
Madelyn: Multi-Domain Multi-Agent Reinforcement Learning for Data-center Networks
Pub Date : 2022-06-01 DOI: 10.1109/COMPSAC54236.2022.00109
A. Kattepur, S. David
Data-center network configurations are crucial in ensuring end-to-end differentiated service performance within 5G. Data-center networks encompass two domains: (i) the fat-tree networking fabric with leaf, spine, and super-spine layers and (ii) data-center server nodes with container and workload placement policies. These have traditionally been managed in silos, with context and configurations driven within each domain. In this work, we examine how configuration changes in one domain affect the other. We develop Madelyn, a multi-domain multi-agent reinforcement learning framework for data-center networks that can propose network-aware placement of virtual network functions. The framework takes into account the data-center fabric weights, drop rates, capacities, load balancing, and traffic shaping. It also considers network function pod placements based on affinity/anti-affinity rules, node capacities, and taints/tolerations. Using this multi-agent framework, we provide network-aware scheduling policies for differentiated network function virtualization services running on Kubernetes pods within data-center networks. The results are demonstrated on a real traffic dataset collected from Ericsson's testbed networks.
Citations: 1
A Correlation-based Real-time Segmentation Scheme for Multi-user Collaborative Activities
Pub Date : 2022-06-01 DOI: 10.1109/COMPSAC54236.2022.00150
Kisoo Kim, Hyunju Kim, Dongman Lee
Activity segmentation, which divides a continuous sensor stream into a set of activity segments, is a crucial preprocessing step in Human Activity Recognition (HAR), and real-world smart services require it to run in real time. In Multi-user Collaborative Activity Recognition (MCAR), existing single-user activity segmentation schemes fail to detect transition points correctly because events from multiple users are concurrent and overlapping. In this paper, we propose a novel activity segmentation scheme for MCAR that expresses complex events and the correlations between them. The proposed scheme first creates an event stream from a sensor stream and defines event sets in terms of time windows. For each time window, two types of correlation are calculated for every event pair: duration correlation and history correlation. The change score of a time window is then measured by comparing its correlation values with those of the preceding windows. Finally, the scheme selects as an activity transition point any time window whose change score exceeds the transition threshold. We evaluate the proposed method on two multi-user collaborative activity datasets, and the experimental results show that it achieves better segmentation performance than existing approaches.
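The window-by-window procedure in the abstract can be sketched roughly as follows. The abstract does not give the correlation formulas, so everything concrete here is an assumption: duration correlation is approximated by the overlap-to-span ratio of two event intervals, history correlation is omitted, and the change score is taken as the mean absolute difference from the average of the preceding windows. The function names, the `history` parameter and the threshold value are hypothetical.

```python
from itertools import combinations


def duration_correlation(window_events):
    """Pairwise overlap ratio for the events in one time window.

    window_events maps an event label to a (start, end) interval.
    Assumed stand-in for the paper's duration correlation.
    """
    corr = {}
    for a, b in combinations(sorted(window_events), 2):
        (s1, e1), (s2, e2) = window_events[a], window_events[b]
        overlap = max(0.0, min(e1, e2) - max(s1, s2))
        span = max(e1, e2) - min(s1, s2)
        corr[(a, b)] = overlap / span if span > 0 else 0.0
    return corr


def change_score(current, previous_windows):
    """Mean absolute difference between the current correlations and the
    average over the preceding windows; absent pairs count as 0."""
    if not previous_windows:
        return 0.0
    pairs = set(current)
    for prev in previous_windows:
        pairs |= set(prev)
    if not pairs:
        return 0.0
    total = 0.0
    for pair in pairs:
        baseline = sum(p.get(pair, 0.0) for p in previous_windows) / len(previous_windows)
        total += abs(current.get(pair, 0.0) - baseline)
    return total / len(pairs)


def find_transitions(windows, threshold=0.6, history=2):
    """Return indices of windows whose change score exceeds the threshold."""
    correlations = [duration_correlation(w) for w in windows]
    transitions = []
    for i, corr in enumerate(correlations):
        previous = correlations[max(0, i - history):i]
        if change_score(corr, previous) > threshold:
            transitions.append(i)
    return transitions


watching_tv = {"tv_on": (0, 10), "sofa": (0, 10)}
cooking = {"stove_on": (0, 10), "fridge_open": (0, 10)}
windows = [watching_tv, watching_tv, cooking, cooking]
print(find_transitions(windows))
```

With these toy windows, the correlated pair changes completely at the third window (index 2), which is the only window reported as a transition. Because each window is scored as soon as it arrives and only a short history is kept, a scheme of this shape can run online.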
Citations: 1