Pub Date : 2022-12-01 DOI: 10.1109/ICMLA55696.2022.00207
Teja Kanchinadam, Shaheen Gauher
Timely detection of clinical events gives healthcare providers the opportunity to make meaningful interventions that can improve health outcomes. This work describes a methodology developed at a large U.S. healthcare insurance company for predicting clinical events using administrative claims data. Most of the existing literature on predicting clinical events leverages historical data in Electronic Health Records (EHR). EHR data, however, has limitations that make it undesirable for real-time use cases: it is inconsistent, expensive, inefficient, and sparsely available. In contrast, administrative claims data is relatively consistent, efficient, and readily available. In this work, we introduce a novel modeling workflow. First, we learn custom embeddings for medical codes within claims data in order to uncover the hidden relationships between them. Second, we introduce a novel way of representing a member’s health history as a graph that captures the relationships between the various diagnosis and procedure codes. Finally, we apply Graph Neural Networks (GNN) to perform multi-label graph classification for clinical event prediction. Our approach produces more accurate predictions than other standard classification approaches and can be easily generalized to other clinical prediction tasks.
Title: Predicting Clinical Events via Graph Neural Networks. Published in: 2022 21st IEEE International Conference on Machine Learning and Applications (ICMLA).
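The graph representation of a member's health history described in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' construction: the abstract does not say how edges are defined, so the sketch assumes that codes appearing on the same claim are connected and that edge weights count such co-occurrences. The function name `build_member_graph` and the sample codes are hypothetical.

```python
from collections import Counter
from itertools import combinations


def build_member_graph(claims):
    """Build an undirected co-occurrence graph from one member's claims.

    `claims` is a list of claims, each a list of medical code strings.
    Assumption (not from the paper): two codes share an edge when they
    appear on the same claim; the edge weight counts such claims.
    """
    nodes = set()
    edges = Counter()
    for codes in claims:
        unique = sorted(set(codes))  # ignore repeats within a claim
        nodes.update(unique)
        for a, b in combinations(unique, 2):
            edges[(a, b)] += 1  # a < b because `unique` is sorted
    return sorted(nodes), dict(edges)


# Hypothetical history: two claims mixing diagnosis and procedure codes.
history = [["E11.9", "I10", "99213"], ["I10", "99213"]]
nodes, edges = build_member_graph(history)
```

In a full pipeline, each node would presumably carry its learned code embedding as a feature vector, and the whole graph would receive the multi-label target.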
Pub Date : 2022-12-01 DOI: 10.1109/ICMLA55696.2022.00174
Prerit Datta, A. Namin, Keith S. Jones
Threat modeling is a process by which security designers and researchers analyze the security of a system against known threats and vulnerabilities. Security experts rely on a myriad of threat intelligence and vulnerability databases to make important day-to-day decisions. Security experts and incident responders require the right set of skills and tools to recognize attack consequences and convey them to various stakeholders. In this paper, we use natural language processing (NLP) and deep learning to analyze text descriptions of cyberattacks and predict their consequences. This can help to quickly analyze new attacks discovered in the wild, help security practitioners take requisite actions, and convey attack consequences to stakeholders in a simple way. We predict the multiple labels (availability, access control, confidentiality, integrity, and other) corresponding to each text description in MITRE’s CWE dataset, and we compare the performance of various CNN and LSTM deep neural networks in predicting these labels. The results indicate that multiple labels can be predicted using an LSTM deep neural network with as many output layers as there are labels. The LSTM models performed better than the CNN models.
Title: Can We Predict Consequences of Cyber Attacks?
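The multi-label target described in the abstract above can be illustrated with a short sketch. The five label names come from the abstract; the function name `encode_consequences` and the encoding itself are assumptions made for illustration, since the abstract does not describe the preprocessing.

```python
LABELS = ["availability", "access control", "confidentiality", "integrity", "other"]


def encode_consequences(consequences):
    """Return a multi-hot target vector with one binary entry per label."""
    unknown = set(consequences) - set(LABELS)
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")
    return [1 if label in consequences else 0 for label in LABELS]


# A weakness description tagged with two consequences.
target = encode_consequences(["confidentiality", "integrity"])
```

Under the one-output-layer-per-label design mentioned in the abstract, each position of this vector would be predicted by its own binary output.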
Pub Date : 2022-12-01 DOI: 10.1109/ICMLA55696.2022.00055
Pascal Hecker, A. Kappattanavar, Maximilian Schmitt, S. Moontaha, Johannes Wagner, F. Eyben, Björn Schuller, B. Arnrich
Cognitive load is frequently induced in laboratory setups to measure responses to stress, and its impact on voice has been studied in the field of computational paralinguistics. One dataset on this topic was provided in the Computational Paralinguistics Challenge (ComParE) 2014, and therefore offers great comparability. Recently, transformer-based deep learning architectures have established a new state of the art and are gradually finding their way into the audio domain. In this context, we investigate the performance of popular transformer architectures in the audio domain on the ComParE 2014 dataset, and the impact of different pre-training and fine-tuning setups on these models. Further, we recorded a small custom dataset, designed to be comparable with the ComParE 2014 one, to assess cross-corpus model generalisability. We find that the transformer models outperform the challenge baseline, the challenge winner, and more recent deep learning approaches. Models based on the ‘large’ architecture perform well on the task at hand, while models based on the ‘base’ architecture perform at chance level. Fine-tuning on related domains (such as ASR or emotion) before fine-tuning on the targets yields no higher performance than models pre-trained only in a self-supervised manner. The generalisability of the models between datasets is more intricate than expected, as seen in an unexpectedly low performance on the small custom dataset, and we discuss potential ‘hidden’ underlying discrepancies between the datasets. In summary, transformer-based architectures outperform previous attempts to quantify cognitive load from voice. This is promising, in particular for healthcare-related problems in computational paralinguistics applications, since datasets are scarce in that realm.
Title: Quantifying Cognitive Load from Voice using Transformer-Based Models and a Cross-Dataset Evaluation
Pub Date : 2022-12-01 DOI: 10.1109/ICMLA55696.2022.00075
A. Ahmadinia, Jaabaal Shah
This paper examines the performance bottlenecks of real-time object detection on edge devices. "You only look once v4" (YOLOv4) is currently one of the leading state-of-the-art models for real-time object detection, and its tiny version, YOLOv4-tiny, is designed for edge devices. To improve object detection accuracy without sacrificing detection speed, we propose an object detection method based on YOLOv4-tiny and VGG-Net. First, we implement mosaic data augmentation and the Mish activation function to increase the generalization ability of the proposed model, making it more robust. Second, to enrich the extracted features, we add an extra 3x3 convolution layer so that two successive 3x3 convolutions yield a 5x5 receptive field. This enables us to extract global features in the first CSP (Cross Stage Partial Network) block, and we restructure the connections of the subsequent layers to have the same effect on the following CSP blocks. Evaluation results show that the proposed model has comparable performance and memory footprint but significantly greater accuracy than YOLOv4-tiny. In addition, the proposed tiny model performs similarly to YOLOv4-tiny while improving accuracy with much lower memory overhead, which makes it an ideal solution for real-time object detection, especially on edge devices.
Title: An Edge-based Real-Time Object Detection
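The claim in the abstract above that two successive 3x3 convolutions cover a 5x5 receptive field can be checked with the standard receptive-field recurrence. The sketch below is a generic calculation, not code from the paper; the function name `receptive_field` is hypothetical and dilation is assumed to be 1.

```python
def receptive_field(layers):
    """Receptive field of a stack of conv layers given (kernel, stride) pairs."""
    field, jump = 1, 1
    for kernel, stride in layers:
        field += (kernel - 1) * jump  # each layer widens the field by (k - 1) * jump
        jump *= stride                # spacing between adjacent outputs, in input pixels
    return field


two_3x3 = receptive_field([(3, 1), (3, 1)])
one_5x5 = receptive_field([(5, 1)])
```

Both stacks see the same 5x5 input window, but two 3x3 kernels use 18 weights per input-output channel pair against 25 for a single 5x5 kernel, which is the usual VGG-style argument for stacking small kernels.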
Pub Date : 2022-12-01 DOI: 10.1109/ICMLA55696.2022.00154
Zahra Montajabi, V. Ghassab, N. Bouguila
Recently, video compression has attracted considerable attention among computer vision problems in media technologies. State-of-the-art video compression methods allow videos to be transmitted at better quality while requiring less bandwidth and memory. The advent of neural network-based video compression methods has remarkably improved video coding performance. In this paper, we present a video compression method based on a Recurrent Neural Network (RNN). The method includes an encoder, a middle module, and a decoder. A binarizer is used in the middle module to achieve better quantization performance. In the encoder and decoder modules, long short-term memory (LSTM) units retain valuable information and discard unnecessary information to iteratively reduce the quality loss of the reconstructed video. This method reduces the complexity of neural network-based compression schemes and encodes videos with less quality loss. The proposed method is evaluated using the peak signal-to-noise ratio (PSNR), video multimethod assessment fusion (VMAF), and structural similarity index measure (SSIM) quality metrics. Applied to two public video compression datasets, the method outperforms existing standard video encoding schemes such as H.264 and H.265.
Title: Recurrent Neural Network-Based Video Compression
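Of the three quality metrics named in the abstract above, PSNR is simple enough to show in full. The sketch below uses the standard definition, 10 log10(MAX^2 / MSE); the function name `psnr`, the flat pixel-list input, and the 8-bit peak value of 255 are assumptions made for illustration and are not taken from the paper.

```python
import math


def psnr(reference, reconstructed, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(reference) != len(reconstructed) or not reference:
        raise ValueError("inputs must be non-empty and the same length")
    mse = sum((r - x) ** 2 for r, x in zip(reference, reconstructed)) / len(reference)
    if mse == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(max_value ** 2 / mse)


frame = [0, 64, 128, 255]
decoded = [1, 63, 129, 254]  # every pixel is off by one, so the MSE is 1
score = psnr(frame, decoded)
```

For video, the per-frame values would typically be averaged over the sequence; the abstract does not say which aggregation the authors used.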
Pub Date : 2022-12-01 DOI: 10.1109/ICMLA55696.2022.00112
Jackson Cates, R. Hoover, Kyle A. Caudle, D. Marchette, Cagri Ozdemir
In the era of big data, there is massive demand for new techniques to forecast and analyze multi-dimensional data. One task that has seen great interest in the community is anomaly detection of streaming data. Toward this end, the current research develops a novel approach to anomaly detection of streaming 2-dimensional observations via multilinear time-series analysis and 3-dimensional tensor principal component analysis (3DTPCA). We approach this problem utilizing dimensionality reduction and probabilistic inference in a low-dimensional space. We first propose a natural extension to 2-dimensional tensor principal component analysis (2DTPCA) to perform data dimensionality reduction on 4-dimensional tensor objects, aptly named 3DTPCA. We then represent the sub-sequences of our time-series observations as a 4-dimensional tensor utilizing a sliding window. Finally, we use 3DTPCA to compute reconstruction errors for inferring anomalous instances within the multilinear data stream. Experimental validation is presented via MovingMNIST data. Results illustrate that the proposed approach has a significant speedup in training time compared with deep learning, while performing competitively in terms of accuracy.
Title: Anomaly Detection from Multilinear Observations via Time-Series Analysis and 3DTPCA
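The sliding-window step in the abstract above, which turns a stream of 2-dimensional observations into a 4-dimensional tensor, can be sketched with plain nested lists. This covers only the windowing; the 3DTPCA decomposition and the reconstruction-error scoring are not shown. The function names and the toy stream are hypothetical.

```python
def sliding_windows(frames, window):
    """Stack overlapping sub-sequences of 2-D frames into a 4-D nested list.

    Result shape: (num_windows, window, height, width).
    """
    if not 1 <= window <= len(frames):
        raise ValueError("window must be between 1 and the number of frames")
    return [frames[i:i + window] for i in range(len(frames) - window + 1)]


def shape(tensor):
    """Return the dimensions of a regular nested list."""
    dims = []
    while isinstance(tensor, list):
        dims.append(len(tensor))
        tensor = tensor[0]
    return tuple(dims)


# Six hypothetical 2x3 frames; frame t is filled with the value t.
stream = [[[t] * 3 for _ in range(2)] for t in range(6)]
windows = sliding_windows(stream, window=4)
```

A window would then be flagged as anomalous when its reconstruction from the learned low-dimensional representation differs too much from the original, as the abstract describes.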
Pub Date : 2022-12-01 DOI: 10.1109/ICMLA55696.2022.00101
Andrew Tittaferrante, A. Yassine
In this work, we propose a reliable hyperparameter tuning scheme for offline reinforcement learning. We demonstrate our proposed scheme using the simplest antmaze environment from the standard benchmark offline dataset, D4RL. The usual approach to policy evaluation in offline reinforcement learning involves online evaluation, i.e., cherry-picking the best performance on the test environment. To mitigate this cherry-picking, we propose an ad hoc online evaluation metric, which we name "median-median-return". This metric enables more reliable reporting of results because it represents the expected performance of the learned policy by taking the median online evaluation performance across both epochs and training runs. To demonstrate our scheme, we employ IQL, a recent state-of-the-art algorithm, and perform a thorough hyperparameter search based on our proposed metric. The tuned architectures enjoy notably stronger cherry-picked performance, and the best models are able to surpass the reported state-of-the-art performance on average.
Title: Hyperparameter Tuning in Offline Reinforcement Learning
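The "median-median-return" metric from the abstract above can be sketched in a few lines. The abstract says only that the median is taken across both epochs and training runs; the order used here (epochs first, then runs), the function name, and the sample returns are assumptions.

```python
from statistics import median


def median_median_return(returns_by_run):
    """Median over training runs of each run's median return over epochs.

    `returns_by_run[r][e]` is the online evaluation return of run r at epoch e.
    Assumption: epochs are aggregated first, then runs.
    """
    return median(median(epoch_returns) for epoch_returns in returns_by_run)


# Three hypothetical training runs, three evaluation epochs each.
runs = [[10.0, 50.0, 90.0], [20.0, 40.0, 95.0], [0.0, 30.0, 60.0]]
robust = median_median_return(runs)
cherry_picked = max(max(run) for run in runs)
```

The gap between `robust` and `cherry_picked` on the same data shows why reporting only the best epoch of the best run can overstate what a learned policy typically achieves.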
Pub Date : 2022-12-01 DOI: 10.1109/ICMLA55696.2022.00233
Hidetomo Sakaino, A. Higuchi
This paper presents a method for converting cloud images to precipitation images based on an improved Generative Adversarial Network (GAN) using multiple satellite and radar images. Since heavy rainfall events have been increasing every year around the world, precipitation radar images over land, which provide much denser observations than on-the-ground sensor data, are becoming more important to use and predict. However, the coverage of such radar sites is limited to small regions on land and/or near the sea. On the other hand, satellite images, e.g., from Himawari-8, are available globally, but they provide no direct precipitation images, i.e., images of rain clouds. GANs are a good choice for image translation, but they are known to lose sharp edges and textures. This paper proposes ‘sat2rain’, a two-step algorithm with a new constraint on the loss function. First, multiple satellite band and topography images are input to the GAN, where block-wise images cut from the overall images are used to cover an area of over 2500 km x 2500 km. Second, enhanced GAN-based training between satellite images and radar images is conducted. Experimental results show the effectiveness of the proposed mesh-wise sat2rain method over the previous point-wise Random Forest method in terms of sharp edges and textures.
Title: Sat2rain: Multiple Satellite Images to Rainfall Amounts Conversion By Improved GAN
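The block-wise input mentioned in the abstract above implies cutting a large image into smaller tiles. The sketch below shows one way to compute tile positions; the abstract gives no block size or overlap policy, so the function name `block_origins`, the 256-pixel block, and the rule of shifting the last block inward are all assumptions.

```python
def block_origins(height, width, block, stride=None):
    """Top-left corners of square blocks that tile a height x width image.

    The last row and column of blocks are shifted inward so every block
    stays inside the image. `stride` defaults to `block` (no overlap).
    """
    stride = stride or block
    if block > height or block > width:
        raise ValueError("block must fit inside the image")

    def starts(size):
        positions = list(range(0, size - block + 1, stride))
        if positions[-1] != size - block:
            positions.append(size - block)  # cover the trailing edge
        return positions

    return [(r, c) for r in starts(height) for c in starts(width)]


# Hypothetical 500 x 600 pixel image cut into 256-pixel blocks.
origins = block_origins(height=500, width=600, block=256)
```

Each origin `(r, c)` identifies the block `image[r:r + 256, c:c + 256]`, which would be fed to the network together with the matching topography tile.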
Pub Date : 2022-12-01 DOI: 10.1109/ICMLA55696.2022.00089
Wiem Safta, H. Frigui
We propose a Multiple Instance Learning (MIL) approach to lung nodule classification that addresses the limitations of current Computer-Aided Diagnosis (CAD) systems. One of these limitations is the need for a large collection of training samples that must be segmented and annotated by radiologists. Another is the use of a fixed volume size for all nodules regardless of their actual sizes. Using a MIL approach, we represent each nodule by a nested sequence of volumes centered at the identified center of the nodule. We extract one feature vector from each volume. The set of feature vectors for each nodule is combined and represented as a bag. Using this representation, we investigate and compare many MIL algorithms and feature extraction methods. We start by applying benchmark MIL algorithms to traditional Gray Level Co-occurrence Matrix (GLCM) engineered features. Then, we design and train simple Convolutional Neural Networks (CNNs) to learn and extract features that characterize lung nodules. These extracted features are then fed to a benchmark MIL algorithm to learn a classification model. We report the results of three experiments applied to both GLCM and CNN features using two benchmark datasets. We designed our experiments to compare the different features and to compare MIL with Single Instance Learning (SIL), where a single feature vector represents a nodule. We show that our MIL representation using CNN features is more accurate for the lung nodule diagnosis task. We also show that the MIL representation achieves better results than SIL applied to the ground truth region of each nodule.
Title: Lung Nodules Identification in CT Scans Using Multiple Instance Learning
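The bag representation described in the abstract above, a nested sequence of volumes centered on the nodule, can be sketched as a list of slice bounds. The volume sizes, the clipping at the scan border, and the function name `nested_volume_bounds` are assumptions; the abstract does not give these details.

```python
def nested_volume_bounds(center, half_sizes, shape):
    """Bounds of nested cubes centred on a nodule, clipped to the scan.

    Returns one ((z0, z1), (y0, y1), (x0, x1)) tuple per half-size, with
    half-open ranges suitable for slicing a 3-D array.
    """
    bag = []
    for half in sorted(half_sizes):
        bounds = tuple(
            (max(c - half, 0), min(c + half + 1, dim))
            for c, dim in zip(center, shape)
        )
        bag.append(bounds)
    return bag


# Hypothetical nodule centre in a 64 x 128 x 128 scan (z, y, x order).
bag = nested_volume_bounds(center=(5, 60, 60), half_sizes=[4, 8, 16], shape=(64, 128, 128))
```

Each set of bounds would be cropped from the scan and passed to a feature extractor (GLCM or CNN in the abstract), giving one instance vector per volume; the resulting bag carries a single nodule-level label.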
Pub Date : 2022-12-01 DOI: 10.1109/ICMLA55696.2022.00147
A. Ponomarev, Anton Agafonov
Ontology-based explanation techniques make it possible to explain why a neural network arrived at some conclusion using human-understandable terms and their formal definitions. This paper proposes a method for building post-hoc ontology-based explanations by training a multi-label neural network that maps the activations of the specified "black box" network to ontology concepts. To simplify the training of such a network, we employ a semantic loss that takes into account the relationships between concepts. An experiment with a synthetic dataset shows that the proposed method can generate accurate ontology-based explanations of a given network.
Title: Ontology-Based Post-Hoc Explanations via Simultaneous Concept Extraction
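The abstract above does not define its semantic loss, so the sketch below shows one common formulation as an assumption: the negative log-probability that independently predicted binary concepts satisfy a logical constraint. The function names and the "Cat implies Animal" axiom are hypothetical, and the brute-force enumeration is for illustration only.

```python
import math
from itertools import product


def semantic_loss(probs, constraint):
    """Negative log-probability that independent binary concepts satisfy `constraint`.

    `probs[i]` is the predicted probability of concept i; `constraint` takes a
    tuple of 0/1 values and returns True when the assignment is consistent.
    Brute-force enumeration, so only practical for a handful of concepts.
    """
    satisfied = 0.0
    for assignment in product((0, 1), repeat=len(probs)):
        if constraint(assignment):
            weight = 1.0
            for p, value in zip(probs, assignment):
                weight *= p if value else 1.0 - p
            satisfied += weight
    return -math.log(satisfied) if satisfied > 0 else math.inf


# Hypothetical ontology axiom: concept 0 ("Cat") implies concept 1 ("Animal").
def cat_implies_animal(assignment):
    cat, animal = assignment
    return not cat or bool(animal)


consistent = semantic_loss([0.9, 0.95], cat_implies_animal)
inconsistent = semantic_loss([0.9, 0.05], cat_implies_animal)
```

Predictions that respect the axiom incur a small penalty, while predicting "Cat" without "Animal" is penalised heavily, which is how relationships between concepts can guide training.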