2022 21st IEEE International Conference on Machine Learning and Applications (ICMLA): Latest Papers

Predicting Clinical Events via Graph Neural Networks

2022 21st IEEE International Conference on Machine Learning and Applications (ICMLA)

Pub Date : 2022-12-01 DOI: 10.1109/ICMLA55696.2022.00207

Teja Kanchinadam, Shaheen Gauher

Timely detection of clinical events gives healthcare providers the opportunity to make meaningful interventions that can improve health outcomes. This work describes a methodology developed at a large U.S. healthcare insurance company for predicting clinical events using administrative claims data. Most of the existing literature on predicting clinical events leverages historical data in Electronic Health Records (EHR). EHR data, however, has limitations that make it undesirable for real-time use cases: it is inconsistent, expensive, inefficient, and sparsely available. In contrast, administrative claims data is relatively consistent, efficient, and readily available. In this work, we introduce a novel modeling workflow. First, we learn custom embeddings for medical codes within claims data in order to uncover the hidden relationships between them. Second, we introduce a novel way of representing a member’s health history as a graph, such that the relationships between various diagnosis and procedure codes are captured. Finally, we apply Graph Neural Networks (GNN) to perform multi-label graph classification for clinical event prediction. Our approach produces more accurate predictions than other standard classification approaches and can be easily generalized to other clinical prediction tasks.
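The graph representation described in the abstract can be illustrated with a small sketch. The abstract does not say how edges are defined, so the version below assumes, purely for illustration, that two codes are linked whenever they appear in the same visit. The code strings, the function names `build_member_graph` and `mean_aggregate`, and the single mean-aggregation step standing in for a GNN layer are all hypothetical.

```python
from collections import defaultdict


def build_member_graph(visits):
    """Build an undirected co-occurrence graph from a member's claims history.

    visits: list of visits, each a list of medical code strings.
    Returns (nodes, edges): a sorted list of codes, and a dict mapping a
    sorted code pair to the number of visits in which both codes appear.
    Assumption: co-occurrence within a visit defines an edge.
    """
    nodes = set()
    edges = defaultdict(int)
    for codes in visits:
        unique = sorted(set(codes))
        nodes.update(unique)
        for i, a in enumerate(unique):
            for b in unique[i + 1:]:
                edges[(a, b)] += 1
    return sorted(nodes), dict(edges)


def mean_aggregate(nodes, edges, features):
    """One round of message passing: average each node with its neighbors."""
    neighbors = {n: [] for n in nodes}
    for a, b in edges:
        neighbors[a].append(b)
        neighbors[b].append(a)
    updated = {}
    for n in nodes:
        group = [features[n]] + [features[m] for m in neighbors[n]]
        dim = len(features[n])
        updated[n] = [sum(vec[k] for vec in group) / len(group) for k in range(dim)]
    return updated


# Placeholder codes and embeddings; real inputs would be learned code embeddings.
visits = [["DX_A", "PX_1"], ["DX_A", "DX_B", "PX_1"]]
nodes, edges = build_member_graph(visits)
features = {"DX_A": [1.0, 0.0], "DX_B": [0.0, 1.0], "PX_1": [1.0, 1.0]}
smoothed = mean_aggregate(nodes, edges, features)
```

A graph-level classifier would then pool the node vectors into one vector per member and apply one sigmoid output per clinical event, which is what makes the task multi-label.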



Citations: 0

Can We Predict Consequences of Cyber Attacks?

2022 21st IEEE International Conference on Machine Learning and Applications (ICMLA)

Pub Date : 2022-12-01 DOI: 10.1109/ICMLA55696.2022.00174

Prerit Datta, A. Namin, Keith S. Jones

Threat modeling is a process by which security designers and researchers analyze the security of a system against known threats and vulnerabilities. Security experts use a myriad of threat intelligence and vulnerability databases to make important day-to-day decisions. Security experts and incident responders require the right set of skills and tools to recognize attack consequences and convey them to various stakeholders. In this paper, we used natural language processing (NLP) and deep learning to analyze text descriptions of cyberattacks and predict their consequences. This can help to quickly analyze new attacks discovered in the wild, help security practitioners take the requisite actions, and convey attack consequences to stakeholders in a simple way. In this work, we predicted the multiple labels (availability, access control, confidentiality, integrity, and other) corresponding to each text description in MITRE’s CWE dataset. We compared the performance of various CNN and LSTM deep neural networks in predicting these labels. The results indicate that it is possible to predict multiple labels using an LSTM deep neural network with as many output layers as there are labels. The LSTM models performed better than the CNN models.
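To make the multi-label setup concrete, the sketch below shows how the five consequence labels named in the abstract could be encoded as a binary target vector and how per-label output probabilities could be turned back into label names. The function names and the 0.5 threshold are assumptions for illustration; the network itself is not shown.

```python
# Label order is an arbitrary choice for this sketch.
LABELS = ["availability", "access control", "confidentiality", "integrity", "other"]


def encode(consequences):
    """Turn a list of consequence names into a 0/1 vector, one slot per label."""
    present = set(consequences)
    unknown = present - set(LABELS)
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")
    return [1 if label in present else 0 for label in LABELS]


def decode(probabilities, threshold=0.5):
    """Keep every label whose output probability reaches the threshold."""
    return [label for label, p in zip(LABELS, probabilities) if p >= threshold]


target = encode(["integrity", "availability"])
predicted = decode([0.9, 0.2, 0.1, 0.7, 0.4])
```

Because each label is decided independently, a single description can receive several consequences at once, which is what separates this task from ordinary multi-class classification.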



Citations: 0

Quantifying Cognitive Load from Voice using Transformer-Based Models and a Cross-Dataset Evaluation

2022 21st IEEE International Conference on Machine Learning and Applications (ICMLA)

Pub Date : 2022-12-01 DOI: 10.1109/ICMLA55696.2022.00055

Pascal Hecker, A. Kappattanavar, Maximilian Schmitt, S. Moontaha, Johannes Wagner, F. Eyben, Björn Schuller, B. Arnrich

Cognitive load is frequently induced in laboratory setups to measure responses to stress, and its impact on voice has been studied in the field of computational paralinguistics. One dataset on this topic was provided in the Computational Paralinguistics Challenge (ComParE) 2014, and therefore offers great comparability. Recently, transformer-based deep learning architectures established a new state of the art and are gradually finding their way into the audio domain. In this context, we investigate the performance of popular transformer architectures in the audio domain on the ComParE 2014 dataset, and the impact of different pre-training and fine-tuning setups on these models. Further, we recorded a small custom dataset, designed to be comparable with the ComParE 2014 one, to assess cross-corpus model generalisability. We find that the transformer models outperform the challenge baseline, the challenge winner, and more recent deep learning approaches. Models based on the ‘large’ architecture perform well on the task at hand, while models based on the ‘base’ architecture perform at chance level. Fine-tuning on related domains (such as ASR or emotion) before fine-tuning on the targets yields no higher performance than models pre-trained only in a self-supervised manner. The generalisability of the models between datasets is more intricate than expected, as seen in unexpectedly low performance on the small custom dataset, and we discuss potential ‘hidden’ underlying discrepancies between the datasets. In summary, transformer-based architectures outperform previous attempts to quantify cognitive load from voice. This is promising, in particular for healthcare-related problems in computational paralinguistics applications, since datasets are sparse in that realm.



Citations: 1

An Edge-based Real-Time Object Detection

2022 21st IEEE International Conference on Machine Learning and Applications (ICMLA)

Pub Date : 2022-12-01 DOI: 10.1109/ICMLA55696.2022.00075

A. Ahmadinia, Jaabaal Shah

This paper looks at performance bottlenecks of real-time object detection on edge devices. "You only look once v4" (YOLOv4) is currently one of the leading state-of-the-art models for real-time object detection, and its tiny version, YOLOv4-tiny, is designed for edge devices. To improve object detection accuracy without sacrificing detection speed, we propose an object detection method based on YOLOv4-tiny and VGG-Net. First, we implement mosaic data augmentation and the Mish activation function to increase the generalization ability of the proposed model, making it more robust. Second, to enhance the richness of the extracted features, an extra 3x3 convolution layer is added so that two successive 3x3 convolutions yield a 5x5 receptive field. This enables us to extract global features in the first CSP (Cross Stage Partial Network) block and to restructure the connections of the subsequent layers to have the same effect on the next CSP blocks. Evaluation results show that the proposed model has comparable performance and memory footprint but significantly greater accuracy than YOLOv4-tiny. Also, the proposed tiny model has similar performance to YOLOv4-tiny, and improves accuracy with much lower memory overhead, which makes it an ideal solution for real-time object detection, especially on edge devices.
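The claim that two successive 3x3 convolutions cover a 5x5 receptive field follows from the standard receptive-field recurrence, sketched below. The function name and the `(kernel_size, stride)` layer format are choices made for this illustration, not taken from the paper.

```python
def receptive_field(layers):
    """Receptive field of a stack of convolutions along one spatial axis.

    layers: list of (kernel_size, stride) tuples, ordered from input to output.
    Each layer widens the field by (kernel_size - 1) times the product of the
    strides of all earlier layers.
    """
    field, jump = 1, 1
    for kernel, stride in layers:
        field += (kernel - 1) * jump
        jump *= stride
    return field


print(receptive_field([(3, 1)]))          # 3
print(receptive_field([(3, 1), (3, 1)]))  # 5
```

With equal input and output channel counts and biases ignored, the stacked pair uses 2 x 9 = 18 weights per channel pair against 25 for a single 5x5 kernel, which is the usual VGG-style argument for stacking small kernels.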



Citations: 0

Recurrent Neural Network-Based Video Compression

2022 21st IEEE International Conference on Machine Learning and Applications (ICMLA)

Pub Date : 2022-12-01 DOI: 10.1109/ICMLA55696.2022.00154

Zahra Montajabi, V. Ghassab, N. Bouguila

Recently, video compression has gained a large focus among computer vision problems in media technologies. Using state-of-the-art video compression methods, videos can be transmitted at better quality while requiring less bandwidth and memory. The advent of neural network-based video compression methods has remarkably improved video coding performance. In this paper, a video compression method based on a Recurrent Neural Network (RNN) is presented. The method includes an encoder, a middle module, and a decoder. A binarizer is used in the middle module to achieve better quantization performance. In the encoder and decoder modules, long short-term memory (LSTM) units are used to keep valuable information and discard unnecessary information, iteratively reducing the quality loss of the reconstructed video. This method reduces the complexity of neural network-based compression schemes and encodes videos with less quality loss. The proposed method is evaluated using the peak signal-to-noise ratio (PSNR), video multimethod assessment fusion (VMAF), and structural similarity index measure (SSIM) quality metrics. The proposed method is applied to two different public video compression datasets, and the results show that it outperforms existing standard video encoding schemes such as H.264 and H.265.
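Of the three quality metrics named above, PSNR is simple enough to show in full. The sketch below uses the standard definition, 10 * log10(MAX^2 / MSE), on flat lists of pixel values; the function name and the default peak value of 255 (8-bit samples) are assumptions of this example, and the abstract does not state the bit depth used.

```python
import math


def psnr(reference, reconstructed, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    if not reference or len(reference) != len(reconstructed):
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - d) ** 2 for r, d in zip(reference, reconstructed)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals: no distortion
    return 10.0 * math.log10(max_value ** 2 / mse)


original = [100, 100, 100, 100]
decoded = [110, 90, 110, 90]
score = psnr(original, decoded)
```

Higher values mean the decoded frame is closer to the original. For video, the per-frame scores are typically averaged over the sequence.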



Citations: 0

Anomaly Detection from Multilinear Observations via Time-Series Analysis and 3DTPCA

2022 21st IEEE International Conference on Machine Learning and Applications (ICMLA)

Pub Date : 2022-12-01 DOI: 10.1109/ICMLA55696.2022.00112

Jackson Cates, R. Hoover, Kyle A. Caudle, D. Marchette, Cagri Ozdemir

In the era of big data, there is massive demand for new techniques to forecast and analyze multi-dimensional data. One task that has seen great interest in the community is anomaly detection of streaming data. Toward this end, the current research develops a novel approach to anomaly detection of streaming 2-dimensional observations via multilinear time-series analysis and 3-dimensional tensor principal component analysis (3DTPCA). We approach this problem utilizing dimensionality reduction and probabilistic inference in a low-dimensional space. We first propose a natural extension to 2-dimensional tensor principal component analysis (2DTPCA) to perform data dimensionality reduction on 4-dimensional tensor objects, aptly named 3DTPCA. We then represent the sub-sequences of our time-series observations as a 4-dimensional tensor utilizing a sliding window. Finally, we use 3DTPCA to compute reconstruction errors for inferring anomalous instances within the multilinear data stream. Experimental validation is presented via MovingMNIST data. Results illustrate that the proposed approach has a significant speedup in training time compared with deep learning, while performing competitively in terms of accuracy.
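The sliding-window step, which turns a stream of 2-dimensional observations into a 4-dimensional tensor, can be sketched with plain nested lists. The function names, the window length, and the step size are illustrative assumptions; the 3DTPCA decomposition itself is not reproduced here.

```python
def sliding_windows(frames, window, step=1):
    """Stack overlapping sub-sequences of 2-D frames into a 4-D nested list.

    frames: list of T frames, each an H x W nested list.
    Returns a list with shape (num_windows, window, H, W).
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    return [frames[s:s + window] for s in range(0, len(frames) - window + 1, step)]


def shape(tensor):
    """Shape of a regular nested list, read along the first element of each level."""
    dims = []
    while isinstance(tensor, list):
        dims.append(len(tensor))
        if not tensor:
            break
        tensor = tensor[0]
    return tuple(dims)


# Six toy 2 x 3 frames; frame t is filled with the value t.
frames = [[[t] * 3 for _ in range(2)] for t in range(6)]
windows = sliding_windows(frames, window=4)
print(shape(windows))  # (3, 4, 2, 3)
```

Each window would then be projected onto the low-dimensional basis and reconstructed, with a large reconstruction error marking the window as a candidate anomaly.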



Citations: 0

Hyperparameter Tuning in Offline Reinforcement Learning

2022 21st IEEE International Conference on Machine Learning and Applications (ICMLA)

Pub Date : 2022-12-01 DOI: 10.1109/ICMLA55696.2022.00101

Andrew Tittaferrante, A. Yassine

In this work, we propose a reliable hyperparameter tuning scheme for offline reinforcement learning. We demonstrate our proposed scheme using the simplest antmaze environment from the standard offline benchmark dataset, D4RL. The usual approach to policy evaluation in offline reinforcement learning involves online evaluation, i.e., cherry-picking the best performance on the test environment. To mitigate this cherry-picking, we propose an ad hoc online evaluation metric, which we name "median-median-return". This metric enables more reliable reporting of results because it represents the expected performance of the learned policy by taking the median online evaluation performance across both epochs and training runs. To demonstrate our scheme, we employ the recent state-of-the-art algorithm IQL and perform a thorough hyperparameter search based on our proposed metric. The tuned architectures enjoy notably stronger cherry-picked performance, and the best models are able to surpass the reported state-of-the-art performance on average.
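A minimal sketch of the "median-median-return" metric follows. The abstract only says the median is taken across both epochs and training runs, so the order used here (median over epochs within each run, then median over runs) is an assumption, as are the function name and the toy numbers.

```python
from statistics import median


def median_median_return(returns_by_run):
    """Median over runs of each run's median evaluation return over epochs.

    returns_by_run: list of training runs, each a list of per-epoch
    online evaluation returns.
    """
    if not returns_by_run or any(not run for run in returns_by_run):
        raise ValueError("every run needs at least one evaluation return")
    per_run = [median(run) for run in returns_by_run]
    return median(per_run)


# Toy returns for three runs with three evaluation epochs each.
runs = [
    [0.1, 0.9, 0.5],
    [0.2, 0.4, 0.3],
    [0.8, 0.6, 0.7],
]
print(median_median_return(runs))          # 0.5
print(max(max(run) for run in runs))       # 0.9, the cherry-picked alternative
```

The gap between the two printed numbers shows why reporting the best epoch of the best run overstates what a freshly trained policy is likely to achieve.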


Citations: 0

Sat2rain: Multiple Satellite Images to Rainfall Amounts Conversion By Improved GAN

2022 21st IEEE International Conference on Machine Learning and Applications (ICMLA)

Pub Date : 2022-12-01 DOI: 10.1109/ICMLA55696.2022.00233

Hidetomo Sakaino, A. Higuchi

This paper presents a method for converting cloud images to precipitation images based on an improved Generative Adversarial Network (GAN) using multiple satellite and radar images. Since heavy rainfall events have been increasing every year across the globe, precipitation radar images over land have become more important to use and predict, as they provide much denser data than on-the-ground sensors. However, the coverage of such radar sites is limited to small regions such as land and/or areas near the sea. On the other hand, satellite images, e.g., from Himawari-8, are available globally, but they provide no direct precipitation images, i.e., rain clouds. GAN is a good choice for image translation, but it is known that high edges and textures can be lost. This paper proposes ‘sat2rain’, a two-step algorithm with a new constraint on the loss function. First, multiple satellite band and topography images are input to the GAN, where block-wise images cut from the overall images are used to cover an area of more than 2500 km x 2500 km. Second, enhanced GAN-based training between satellite images and radar images is conducted. Experimental results show the effectiveness of the proposed mesh-wise sat2rain method over the previous point-wise Random Forest method in terms of high edge and texture.
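The block-wise input step can be pictured as tiling a large image into smaller patches. The sketch below assumes non-overlapping blocks whose size divides the image exactly; the abstract does not say whether the authors' blocks overlap, and the function name and block sizes are hypothetical.

```python
def split_into_blocks(image, block_h, block_w):
    """Split a 2-D nested list into non-overlapping blocks in row-major order."""
    rows, cols = len(image), len(image[0])
    if rows % block_h or cols % block_w:
        raise ValueError("block size must divide the image size exactly")
    blocks = []
    for top in range(0, rows, block_h):
        for left in range(0, cols, block_w):
            blocks.append([row[left:left + block_w] for row in image[top:top + block_h]])
    return blocks


# A toy 4 x 4 single-band image with pixel value = row * 4 + column.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
blocks = split_into_blocks(image, 2, 2)
```

In a multi-band setting the same tiling would be applied to every satellite band and to the topography layer, so that each block carries all input channels for one region.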



Citations: 2

Lung Nodules Identification in CT Scans Using Multiple Instance Learning*

2022 21st IEEE International Conference on Machine Learning and Applications (ICMLA)

Pub Date : 2022-12-01 DOI: 10.1109/ICMLA55696.2022.00089

Wiem Safta, H. Frigui

We propose a Multiple Instance Learning (MIL) approach for lung nodule classification to address the limitations of current Computer-Aided Diagnosis (CAD) systems. One limitation is the need for a large collection of training samples that must be segmented and annotated by radiologists. Another is the use of a fixed volume size for all nodules regardless of their actual sizes. Using a MIL approach, we represent each nodule by a nested sequence of volumes centered at the identified center of the nodule. We extract one feature vector from each volume. The feature vectors for each nodule are then combined and represented as a bag. Using this representation, we investigate and compare many MIL algorithms and feature extraction methods. We start by applying benchmark MIL algorithms to traditional engineered features based on the Gray Level Co-occurrence Matrix (GLCM). Then, we design and train simple Convolutional Neural Networks (CNNs) to learn and extract features that characterize lung nodules. These extracted features are then fed to a benchmark MIL algorithm to learn a classification model. We report the results of three experiments applied to both GLCM and CNN features using two benchmark datasets. We designed our experiments to compare the different features and to compare MIL with Single Instance Learning (SIL), where a single feature vector represents a nodule. We show that our MIL representation using CNN features is more accurate for the lung nodule diagnosis task. We also show that the MIL representation achieves better results than SIL applied to the ground-truth region of each nodule.
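The bag construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`nested_crops`, `simple_features`, `build_bag`), the crop half-sizes, and the four summary statistics used in place of GLCM or CNN features are all assumptions chosen for illustration, and the input is random data rather than a CT scan.

```python
import numpy as np


def nested_crops(volume, center, half_sizes):
    """Return cubic sub-volumes of increasing size, all centered at `center`.

    Crops are clipped at the volume boundary, so crops near an edge may be
    smaller than (2*h + 1) along some axes.
    """
    crops = []
    for h in half_sizes:
        slices = tuple(
            slice(max(c - h, 0), min(c + h + 1, dim))
            for c, dim in zip(center, volume.shape)
        )
        crops.append(volume[slices])
    return crops


def simple_features(crop):
    """Hypothetical stand-in for a GLCM or CNN feature extractor."""
    return np.array([crop.mean(), crop.std(), crop.min(), crop.max()])


def build_bag(volume, center, half_sizes=(2, 4, 8)):
    """One instance (feature vector) per nested crop; the stack is the bag."""
    crops = nested_crops(volume, center, half_sizes)
    return np.stack([simple_features(c) for c in crops])


# Synthetic 32x32x32 volume standing in for a CT scan (assumption).
rng = np.random.default_rng(0)
ct = rng.normal(size=(32, 32, 32))

bag = build_bag(ct, center=(16, 16, 16))
print(bag.shape)  # (3, 4): three nested volumes, four features each
```

Each row of `bag` is one instance, so a MIL classifier would receive the whole array with a single nodule-level label, whereas a SIL baseline would receive a single feature vector per nodule.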

Citations: 0

Ontology-Based Post-Hoc Explanations via Simultaneous Concept Extraction*

2022 21st IEEE International Conference on Machine Learning and Applications (ICMLA)

Pub Date : 2022-12-01 DOI: 10.1109/ICMLA55696.2022.00147

A. Ponomarev, Anton Agafonov

Ontology-based explanation techniques make it possible to explain why a neural network arrived at a conclusion, using human-understandable terms and their formal definitions. The paper proposes a method for building post-hoc ontology-based explanations by training a multi-label neural network that maps the activations of a specified "black box" network to ontology concepts. To simplify training of such a network, we employ a semantic loss that takes into account the relationships between concepts. An experiment with a synthetic dataset shows that the proposed method can generate accurate ontology-based explanations of a given network.
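As a rough illustration of a loss that accounts for relationships between concepts, the sketch below adds a penalty for violated subclass relations to a standard multi-label cross-entropy. The abstract does not specify the form of the semantic loss, so this implication penalty is a swapped-in stand-in and not the authors' formulation. The concept names, the `subclass_pairs` list, the weight `0.5`, and the predicted probabilities are all hypothetical.

```python
import numpy as np


def bce(probs, targets, eps=1e-7):
    """Mean binary cross-entropy over all samples and concepts."""
    p = np.clip(probs, eps, 1 - eps)
    return float(-np.mean(targets * np.log(p) + (1 - targets) * np.log(1 - p)))


def implication_penalty(probs, subclass_pairs):
    """Mean of max(0, p_child - p_parent) over all (child, parent) pairs.

    If the ontology says child is a subclass of parent, the predicted
    probability of the child should not exceed that of the parent.
    """
    child = probs[:, [c for c, _ in subclass_pairs]]
    parent = probs[:, [p for _, p in subclass_pairs]]
    return float(np.mean(np.maximum(0.0, child - parent)))


# Hypothetical concepts: 0 = Animal, 1 = Dog, 2 = Cat.
subclass_pairs = [(1, 0), (2, 0)]  # Dog is an Animal, Cat is an Animal

# Hypothetical concept probabilities predicted from "black box" activations.
probs = np.array([[0.9, 0.8, 0.1],   # consistent with the ontology
                  [0.2, 0.7, 0.1]])  # violates Dog -> Animal
targets = np.array([[1.0, 1.0, 0.0],
                    [1.0, 1.0, 0.0]])

penalty = implication_penalty(probs, subclass_pairs)
loss = bce(probs, targets) + 0.5 * penalty  # 0.5 is an arbitrary weight
print(round(penalty, 3))  # 0.125
```

Only the second sample contributes to the penalty, because it assigns a higher probability to Dog than to its parent concept Animal.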

Citations: 0