
2022 21st IEEE International Conference on Machine Learning and Applications (ICMLA): Latest Papers

Understanding Linguistic Variations in Neutral and Strongly Opinionated Reviews
Pub Date : 2022-12-01 DOI: 10.1109/ICMLA55696.2022.00237
Salim Sazzed
Reviews with a user rating close to the center of the rating scale are often referred to as neutral reviews and are prevalent in consumer feedback. By leveraging annotated data, implicit characteristics of neutral reviews can be learned for better prediction. In the absence of annotated data, unsupervised lexicon-based approaches are often employed. Nevertheless, the word-level sentiment and hand-crafted aggregation rules of lexicon-based approaches are usually inadequate for distinguishing neutral reviews. Therefore, in this study, we try to find additional distinguishing signals for identifying neutral reviews. We investigate a number of attributes, such as the frequency of contrasting conjunctions, extreme opinions, intensifiers, modifiers, and negation, to discover distinctive elements in neutral reviews. We find that some linguistic features, such as contrasting conjunctions and mitigators, can provide additional signals that may help to distinguish neutral reviews across multi-domain datasets. Our analysis and findings deliver insights for developing effective unsupervised methods for discerning different types of reviews.
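The kind of attribute counting this abstract describes can be sketched in a few lines of Python. This is only an illustration: the word lists and the `count_features` helper are assumptions made up for the example, not the lexicons or code used in the paper.

```python
import re

# Illustrative word lists only (assumptions, not the lexicons used in the paper).
CONTRASTING_CONJUNCTIONS = {"but", "however", "although", "though", "yet"}
MITIGATORS = {"somewhat", "fairly", "slightly", "rather"}
NEGATIONS = {"not", "no", "never"}


def count_features(review: str) -> dict:
    """Count occurrences of each illustrative feature class in one review."""
    # Lowercase and keep alphabetic tokens (apostrophes allowed inside words).
    tokens = re.findall(r"[a-z']+", review.lower())
    return {
        "tokens": len(tokens),
        "contrasting_conjunctions": sum(t in CONTRASTING_CONJUNCTIONS for t in tokens),
        "mitigators": sum(t in MITIGATORS for t in tokens),
        "negations": sum(t in NEGATIONS for t in tokens),
    }


if __name__ == "__main__":
    print(count_features("The screen is great, but the battery is somewhat weak."))
```

Comparing such counts between reviews rated near the middle of the scale and reviews rated at the extremes is one simple way to check whether a feature separates the two groups.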
Citations: 1
CNN-n-GRU: end-to-end speech emotion recognition from raw waveform signal using CNNs and gated recurrent unit networks
Pub Date : 2022-12-01 DOI: 10.1109/ICMLA55696.2022.00116
Alaa Nfissi, W. Bouachir, N. Bouguila, B. Mishara
We present CNN-n-GRU, a new end-to-end (E2E) architecture built of an n-layer convolutional neural network (CNN) followed sequentially by an n-layer Gated Recurrent Unit (GRU) for speech emotion recognition. CNNs and RNNs both exhibited promising outcomes when fed raw waveform voice inputs. This inspired our idea to combine them into a single model to maximise their potential. Instead of using handcrafted features or spectrograms, we train CNNs to recognise low-level speech representations from the raw waveform, which allows the network to capture relevant narrow-band emotion characteristics. On the other hand, RNNs (GRUs in our case) can learn temporal characteristics, allowing the network to better capture the signal’s time-distributed features. Because a CNN can generate multiple levels of representation abstraction, we exploit early layers to extract high-level features and then supply the appropriate input to subsequent RNN layers in order to aggregate long-term dependencies. By taking advantage of both CNNs and GRUs in a single model, the proposed architecture has important advantages over other models from the literature. The proposed model was evaluated using the TESS dataset and compared to state-of-the-art methods. Our experimental results demonstrate that the proposed model is more accurate than traditional classification approaches for speech emotion recognition.
Citations: 0
A Real-time Digit Gesture Recognition System Based on mmWave Radar
Pub Date : 2022-12-01 DOI: 10.1109/ICMLA55696.2022.00129
Chun Yuan, Youxuan Zhong, Jiake Tian, Y. Zou
Gesture communication is one of the most general communication methods in the world, with the obvious advantage of exchanging information without worrying about the boundaries between languages. Therefore, establishing a cost-effective way of capturing and understanding human gestures has long been a popular research topic in human-machine interaction, particularly in emerging scenarios such as smart cities. In this paper, we propose a system based on a commercially available mmWave radar to recognize digits represented by the travel path of the human hand using a specially designed convolutional neural network (CNN) algorithm. We show that the proposed system is capable of recording the path of the moving hand in real time at the cost of 1 transmitter, 2 receivers, and 2.78 GHz of bandwidth from the mmWave radar. Our experimental results show that an average prediction accuracy of 98.8% is achieved in a validation test based on a 7:3 split of an existing dataset, and an average prediction accuracy of 95.3% in a generalization test using fresh data.
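Two of the numbers in this abstract lend themselves to a quick worked calculation. The sketch below assumes that the stated 2.78 GHz is the swept bandwidth of the radar, in which case the standard range-resolution relation c / (2B) applies, and it shows the sample counts a 7:3 split would give for a hypothetical dataset size. The function names and the dataset size of 1000 are made up for illustration.

```python
SPEED_OF_LIGHT_M_S = 299_792_458.0


def range_resolution_m(bandwidth_hz: float) -> float:
    """Standard radar range resolution c / (2B), in metres."""
    return SPEED_OF_LIGHT_M_S / (2.0 * bandwidth_hz)


def split_7_3(n_samples: int) -> tuple:
    """Sizes of the two parts of a 7:3 split (larger part first)."""
    n_first = n_samples * 7 // 10
    return n_first, n_samples - n_first


if __name__ == "__main__":
    # Roughly 5.4 cm under the swept-bandwidth assumption stated above.
    print(f"{range_resolution_m(2.78e9) * 100:.2f} cm")
    print(split_7_3(1000))  # hypothetical dataset size
```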
Citations: 0
Unsupervised Multivariate Time-Series Transformers for Seizure Identification on EEG
Pub Date : 2022-12-01 DOI: 10.1109/ICMLA55696.2022.00208
Ilkay Yildiz Potter, George Zerveas, Carsten Eickhoff, D. Duncan
Epilepsy is one of the most common neurological disorders, typically observed via seizure episodes. Epileptic seizures are commonly monitored through electroencephalogram (EEG) recordings because these are routine and inexpensive to collect. The stochastic nature of EEG makes seizure identification via manual inspection by highly trained experts a tedious endeavor, motivating the use of automated identification. The literature on automated identification focuses mostly on supervised learning methods requiring expert labels of EEG segments that contain seizures, which are difficult to obtain. Motivated by these observations, we pose seizure identification as an unsupervised anomaly detection problem. To this end, we employ the first unsupervised transformer-based model for seizure identification on raw EEG. We train an autoencoder involving a transformer encoder via an unsupervised loss function, incorporating a novel masking strategy uniquely designed for multivariate time-series data such as EEG. Training employs EEG recordings that do not contain any seizures, while seizures are identified with respect to reconstruction errors at inference time. We evaluate our method on three publicly available benchmark EEG datasets for distinguishing seizure vs. non-seizure windows. Our method leads to significantly better seizure identification performance than supervised learning counterparts, by up to 16% recall, 9% accuracy, and 9% Area under the Receiver Operating Characteristics Curve (AUC), establishing particular benefits on highly imbalanced data. Through accurate seizure identification, our method could facilitate widely accessible and early detection of epilepsy development, without needing expensive label collection or manual feature extraction.
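The detection step described here, flagging windows whose reconstruction error is unusually high, can be sketched independently of the transformer autoencoder itself. In the example below the error values are invented, and the threshold rule (mean plus three standard deviations of the errors on seizure-free training windows) is an assumption chosen for illustration; the abstract does not say how the paper sets its decision threshold.

```python
import statistics


def mse(window, reconstruction):
    """Mean squared error between a window and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(window, reconstruction)) / len(window)


def fit_threshold(train_errors, k=3.0):
    """Threshold from errors on seizure-free windows: mean + k * std (assumed rule)."""
    return statistics.fmean(train_errors) + k * statistics.pstdev(train_errors)


def flag_windows(errors, threshold):
    """Mark a window as anomalous when its reconstruction error exceeds the threshold."""
    return [e > threshold for e in errors]


if __name__ == "__main__":
    # Invented reconstruction errors from seizure-free training windows.
    train_errors = [0.10, 0.20, 0.15, 0.12, 0.18, 0.11, 0.14, 0.16, 0.13, 0.17]
    threshold = fit_threshold(train_errors)
    print(flag_windows([0.12, 0.90], threshold))
```

Because only seizure-free recordings are needed to fit the threshold, no seizure labels enter the procedure, which is the point the abstract makes about avoiding label collection.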
Citations: 4
Performance Benchmark of Machine Learning-Based Methodology for Swahili News Article Categorization
Pub Date : 2022-12-01 DOI: 10.1109/ICMLA55696.2022.00238
Shaun Anthony Little, Kaushik Roy, Ahmed Al Hamoud
As data increases at unprecedented rates, so does the need to classify this data, including news article data. Unfortunately, most news article categorization research utilizes global languages such as English or Spanish, and not much research considers low-resource languages like Swahili. Testing multiple classifiers and preprocessing methods, we show that the SVM model with tokenization and stop word removal achieves the highest accuracy (85.13%) for Swahili news article categorization. These results from the first publicly available peer-reviewed Swahili news article dataset provide benchmark performance for Swahili news article categorization and contribute to lean Swahili text classification research.
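The preprocessing named in this abstract, tokenization followed by stop word removal, might look roughly like the sketch below. The stop word set is a handful of common Swahili function words picked for illustration and is not the list used in the paper; the `preprocess` helper and the sample sentence are likewise made up for the example.

```python
import re

# A few common Swahili function words, for illustration only;
# this is not the stop word list used in the paper.
STOP_WORDS = {"na", "ya", "wa", "kwa", "ni", "za", "la"}


def preprocess(text: str) -> list:
    """Lowercase, split into alphabetic tokens, and drop stop words."""
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


if __name__ == "__main__":
    print(preprocess("Timu ya taifa imeshinda mechi kwa mabao mawili"))
```

The resulting token lists would then be turned into feature vectors and passed to a classifier such as an SVM; that step is omitted here to keep the example free of third-party dependencies.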
Citations: 0
A Robust Approach to Fine-tune Pre-trained Transformer-based models for Text Summarization through Latent Space Compression
Pub Date : 2022-12-01 DOI: 10.1109/ICMLA55696.2022.00030
Ala Alam Falaki, R. Gras
We propose a technique to reduce the number of decoder parameters in a sequence-to-sequence (seq2seq) architecture for automatic text summarization. This approach uses a pre-trained Autoencoder (AE) trained on top of an encoder’s output to reduce its embedding dimension, which significantly reduces the summarizer model’s decoder size. Two experiments were performed to validate the idea: a custom seq2seq architecture with various pre-trained encoders, and incorporating the approach in an encoder-decoder model (BART) for text summarization. Both studies showed promising results in terms of ROUGE score. However, the impressive outcome is the 54% decrease in inference time and a 57% drop in GPU memory usage while fine-tuning, with minimal quality loss (4.5% R1 score). It significantly reduces the hardware requirement to fine-tune large-scale pre-trained models. It is also shown that our approach can be combined with other network size reduction techniques (e.g., distillation) to further reduce any encoder-decoder model’s parameter count. The implementation and checkpoints are available on GitHub.
Citations: 0
GDSCAN: Pedestrian Group Detection using Dynamic Epsilon
Pub Date : 2022-12-01 DOI: 10.1109/ICMLA55696.2022.00267
Ming-Jie Chen, Shadi Banitaan, Mina Maleki, Yichun Li
In order to maintain human safety in autonomous vehicles, pedestrian detection and tracking in real time have become crucial research areas. The critical challenge in this field is to improve pedestrian detection accuracy while reducing tracking processing time. Because pedestrians move in groups with the same speed and direction, we can address this challenge by detecting and tracking pedestrian groups. This work focused on pedestrian group detection. Various clustering methods were used in this study to identify pedestrian groups. Firstly, pedestrians were identified using a convolutional neural network approach. Secondly, K-Means and DBSCAN clustering methods were used to identify pedestrian groups based on the coordinates of the pedestrians’ bounding boxes. Moreover, we proposed a modified DBSCAN clustering method named GDSCAN that applies a dynamic epsilon to different areas of an image. The experimental results on the MOT17 dataset show that GDSCAN outperformed the K-Means and DBSCAN methods based on the Silhouette Coefficient score and Adjusted Rand Index (ARI).
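To make the idea of a position-dependent epsilon concrete, here is a small DBSCAN variant in plain Python. It is a sketch under stated assumptions, not the GDSCAN algorithm from the paper: the abstract does not say how epsilon is chosen per image area, so the `dynamic_eps` rule below (epsilon grows linearly with the vertical position, on the reasoning that pedestrians lower in the frame are closer to the camera and appear farther apart in pixels) is invented for illustration, as are the sample coordinates.

```python
import math

NOISE = -1


def dynamic_eps(point, image_height, base_eps):
    """Assumed rule: scale epsilon from 0.5x (top of image) to 1.5x (bottom)."""
    _, y = point
    return base_eps * (0.5 + y / image_height)


def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan_dynamic(points, image_height, base_eps=40.0, min_pts=2):
    """DBSCAN where each point's neighbourhood radius depends on its position."""
    labels = [None] * len(points)  # None = not yet visited
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, dynamic_eps(points[i], image_height, base_eps))
        if len(neighbors) < min_pts:
            labels[i] = NOISE
            continue
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # former noise becomes a border point
            if labels[j] is not None:
                continue  # already assigned; do not expand again
            labels[j] = cluster
            j_eps = dynamic_eps(points[j], image_height, base_eps)
            j_neighbors = region_query(points, j, j_eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point: keep expanding
        cluster += 1
    return labels


if __name__ == "__main__":
    # Hypothetical bounding-box centres (x, y) in a 400-pixel-high frame.
    centres = [(100, 50), (115, 52), (300, 380), (345, 382), (200, 200)]
    print(dbscan_dynamic(centres, image_height=400))
```

With a single fixed epsilon small enough to separate distant pedestrians near the top of the frame, the two nearby pedestrians at the bottom would not be grouped; letting epsilon vary with position is what allows both pairs to be found in this toy example.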
Citations: 1
Flexible Exploration Strategies in Multi-Agent Reinforcement Learning for Instability by Mutual Learning
Pub Date : 2022-12-01 DOI: 10.1109/ICMLA55696.2022.00100
Yuki Miyashita, T. Sugawara
A fundamental challenge in multi-agent reinforcement learning is effective exploration of state-action spaces, because agents must learn their policies in an environment that is non-stationary due to the changing policies of other learning agents. As the agents’ learning progresses, different undesired situations may appear one after another, and agents have to learn again to adapt to them. Therefore, agents must learn again with a high probability of exploration to find the appropriate actions for the exposed situation. However, existing algorithms can fail to relearn behavior in these situations for lack of exploration, because agents usually become exploitation-oriented when using simple exploration strategies such as the ε-greedy strategy. Therefore, we propose two types of simple exploration strategies, where each agent monitors the trend of performance and controls the exploration probability, ε, based on the transition of performance. By introducing a coordinated problem called the PushBlock problem, which includes the above issue, we show that the proposed method could improve the overall performance relative to conventional ε-greedy strategies and analyze their effects on the generated behavior.
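One way to picture an agent that "monitors the trend of performance and controls ε" is the controller sketched below. It is not either of the two strategies proposed in the paper, whose details the abstract does not give. The rule used here is an assumption: compare the mean return of consecutive windows of episodes, reset ε to its maximum when the mean drops by more than a tolerance, and otherwise decay ε toward a floor. The class name and all parameter values are hypothetical.

```python
from collections import deque


class AdaptiveEpsilon:
    """Raise exploration when windowed performance drops; otherwise decay it."""

    def __init__(self, eps_min=0.05, eps_max=0.5, window=10, decay=0.9, tolerance=1.0):
        self.eps = eps_max
        self.eps_min = eps_min
        self.eps_max = eps_max
        self.decay = decay
        self.tolerance = tolerance
        self.recent = deque(maxlen=window)  # returns in the current window
        self.previous_mean = None           # mean return of the last full window

    def update(self, episode_return):
        """Record one episode return and return the exploration probability to use next."""
        self.recent.append(episode_return)
        if len(self.recent) < self.recent.maxlen:
            return self.eps  # window not full yet: leave epsilon unchanged
        current_mean = sum(self.recent) / len(self.recent)
        dropped = (
            self.previous_mean is not None
            and current_mean < self.previous_mean - self.tolerance
        )
        if dropped:
            self.eps = self.eps_max  # performance fell: explore again
        else:
            self.eps = max(self.eps_min, self.eps * self.decay)
        self.previous_mean = current_mean
        self.recent.clear()
        return self.eps


if __name__ == "__main__":
    controller = AdaptiveEpsilon(window=2, decay=0.5)
    for episode_return in [10, 10, 10, 10, 2, 2]:
        print(controller.update(episode_return))
```

In contrast, a plain decaying ε-greedy schedule would keep shrinking ε through the drop at the end of this toy sequence, which is the exploitation-oriented behavior the abstract identifies as the problem.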
Citations: 1
Stochastic Induction of Decision Trees with Application to Learning Haar Trees
Pub Date : 2022-12-01 DOI: 10.1109/ICMLA55696.2022.00137
A. Alizadeh, Mukesh Singhal, Vahid Behzadan, Pooya Tavallali, A. Ranganath
Decision trees are a convenient and established approach for any supervised learning task. Decision trees are trained by greedily splitting a leaf node into two leaf nodes until a specific stopping criterion is reached. Splitting a node consists of finding the best feature and threshold that minimize a criterion. The criterion minimization problem is solved through a costly exhaustive search algorithm. This paper proposes a novel stochastic approach for criterion minimization. The algorithm is compared with several other related state-of-the-art decision tree learning methods, including the baseline non-stochastic approach. We apply the proposed algorithm to learn a Haar tree over the MNIST dataset that consists of over 200,000 features and 60,000 samples. The result is comparable to the performance of oblique trees while providing a significant speed-up in both inference and training times.
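The cost of the exhaustive search mentioned in this abstract is easy to see in code. The sketch below implements the baseline (try every feature and every observed threshold, keep the pair with the lowest weighted Gini impurity) and, for contrast, a variant that only evaluates a random sample of candidates. The sampled variant is a generic stand-in chosen for illustration; it is not the stochastic algorithm proposed in the paper, which the abstract does not describe. The Gini criterion and the toy data are also assumptions.

```python
import random


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def split_cost(X, y, feature, threshold):
    """Weighted Gini impurity after splitting on X[i][feature] <= threshold."""
    left = [y[i] for i, row in enumerate(X) if row[feature] <= threshold]
    right = [y[i] for i, row in enumerate(X) if row[feature] > threshold]
    n = len(y)
    return len(left) / n * gini(left) + len(right) / n * gini(right)


def exhaustive_split(X, y):
    """Baseline: evaluate every (feature, observed value) pair."""
    candidates = [(f, row[f]) for f in range(len(X[0])) for row in X]
    return min(candidates, key=lambda c: split_cost(X, y, c[0], c[1]))


def sampled_split(X, y, n_candidates, rng):
    """Stand-in stochastic variant: evaluate only randomly drawn candidates."""
    candidates = []
    for _ in range(n_candidates):
        f = rng.randrange(len(X[0]))
        candidates.append((f, rng.choice(X)[f]))
    return min(candidates, key=lambda c: split_cost(X, y, c[0], c[1]))


if __name__ == "__main__":
    X = [[0.0, 5.0], [1.0, 3.0], [2.0, 4.0], [3.0, 6.0]]
    y = [0, 0, 1, 1]
    print(exhaustive_split(X, y))
    print(sampled_split(X, y, n_candidates=3, rng=random.Random(0)))
```

The exhaustive version evaluates features × samples candidates per node, each costing a pass over the data, which is what becomes expensive at the scale of 200,000 features and 60,000 samples; the sampled version trades optimality of the split for a fixed evaluation budget.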
Citations: 0
Learning Non-linear White-box Predictors: A Use Case in Energy Systems
Pub Date : 2022-12-01 DOI: 10.1109/ICMLA55696.2022.00082
Sandra Wilfling, M. Ebrahimi, Qamar Alfalouji, G. Schweiger, Mina Basirat
Many applications in energy systems require models that represent the non-linear dynamics of the underlying systems. Black-box models with non-linear architecture are suitable candidates for modeling these systems; however, they are computationally expensive and lack interpretability. An inexpensive white-box linear combination learned over a suitable polynomial feature set can result in a high-performing non-linear model that is easier to interpret, validate, and verify against reference models created by domain experts. This paper proposes a workflow to learn a linear combination of non-linear terms for an engineered polynomial feature set. We first detect non-linear dependencies and then attempt to reconstruct them using feature expansion. Afterwards, we select the candidate predictors with the highest correlation coefficients for predictive regression analysis. We demonstrate how to learn inexpensive yet comprehensible linear combinations of non-linear terms from four datasets. Experimental evaluations show that our workflow yields improvements in the metrics R², CV-RMSE, and MAPE on all datasets. Further evaluation of the learned models’ goodness of fit using prediction error plots also confirms that the proposed workflow results in models that can more accurately capture the nature of the underlying physical systems.
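The feature expansion and correlation-based selection steps described above might look roughly like the sketch below. It is not the authors' implementation: the degree-2 expansion, the Pearson coefficient, the toy data, and the helper names `expand_degree2`, `pearson`, and `top_k_by_correlation` are all assumptions for illustration, and the final regression step is left out.

```python
from itertools import combinations_with_replacement
from math import sqrt


def expand_degree2(names, rows):
    """Append all degree-2 products (x_i * x_j with i <= j) to each row."""
    pairs = list(combinations_with_replacement(range(len(names)), 2))
    new_names = list(names) + [f"{names[i]}*{names[j]}" for i, j in pairs]
    new_rows = [list(r) + [r[i] * r[j] for i, j in pairs] for r in rows]
    return new_names, new_rows


def pearson(a, b):
    """Pearson correlation coefficient; 0.0 if either input is constant."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / sqrt(va * vb) if va > 0 and vb > 0 else 0.0


def top_k_by_correlation(names, rows, target, k):
    """Names of the k features with the largest |correlation| to the target."""
    scores = []
    for j, name in enumerate(names):
        column = [r[j] for r in rows]
        scores.append((abs(pearson(column, target)), name))
    scores.sort(key=lambda s: s[0], reverse=True)
    return [name for _, name in scores[:k]]


# Hypothetical toy data: the target is the product of the two raw inputs,
# so no single raw feature explains it linearly.
rows = [[1.0, 2.0], [2.0, 1.0], [3.0, 3.0], [4.0, 1.0], [5.0, 4.0], [6.0, 2.0]]
target = [a * b for a, b in rows]
names, expanded = expand_degree2(["a", "b"], rows)
selected = top_k_by_correlation(names, expanded, target, k=1)
```

On this toy data the interaction term `a*b` ranks first, so an ordinary linear regression fitted on the selected terms can capture a non-linear relationship while each coefficient remains attached to a named, inspectable term.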
Citations: 2