Understanding Linguistic Variations in Neutral and Strongly Opinionated Reviews
Salim Sazzed
Pub Date: 2022-12-01 | DOI: 10.1109/ICMLA55696.2022.00237
Reviews with a user rating close to the center of the rating scale are often referred to as neutral reviews and are prevalent in consumer feedback. By leveraging annotated data, implicit characteristics of neutral reviews can be learned for better prediction. In the absence of annotated data, unsupervised lexicon-based approaches are often employed. Nevertheless, word-level sentiment scores and the hand-crafted aggregation rules of lexicon-based methods are usually inadequate for distinguishing neutral reviews. Therefore, in this study, we try to find additional distinguishing signals for identifying neutral reviews. We investigate a number of attributes, such as the frequency of contrasting conjunctions, extreme opinions, intensifiers, modifiers, and negation, to discover distinctive elements in neutral reviews. We find that some linguistic features, such as contrasting conjunctions and mitigators, can provide additional signals that may help to distinguish neutral reviews across multi-domain datasets. Our analysis and findings deliver insights for developing effective unsupervised methods for discerning different types of reviews.
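As an illustration of the kind of signals the study examines, here is a minimal Python sketch that counts contrasting conjunctions and mitigators in a review; the word lists are small illustrative samples, not the paper's lexicons:

```python
import re

# Illustrative (assumed) word lists, not the study's lexicons.
CONTRAST = {"but", "however", "although", "though", "yet", "nevertheless"}
MITIGATORS = {"somewhat", "slightly", "fairly", "rather", "a bit", "kind of"}

def linguistic_signals(review: str) -> dict:
    # Lowercase and tokenize on letter runs; join back for multi-word mitigators.
    tokens = re.findall(r"[a-z']+", review.lower())
    text = " ".join(tokens)
    return {
        "contrast": sum(t in CONTRAST for t in tokens),
        "mitigator": sum(text.count(m) for m in MITIGATORS),
        "length": len(tokens),
    }

signals = linguistic_signals("The food was good, but the service was somewhat slow.")
```

A neutral review would then be expected to show comparatively higher contrast and mitigator counts than a strongly opinionated one.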
CNN-n-GRU: end-to-end speech emotion recognition from raw waveform signal using CNNs and gated recurrent unit networks
Alaa Nfissi, W. Bouachir, N. Bouguila, B. Mishara
Pub Date: 2022-12-01 | DOI: 10.1109/ICMLA55696.2022.00116
We present CNN-n-GRU, a new end-to-end (E2E) architecture for speech emotion recognition, built of an n-layer convolutional neural network (CNN) followed sequentially by an n-layer gated recurrent unit (GRU) network. CNNs and RNNs have both exhibited promising outcomes when fed raw waveform voice inputs, which inspired us to combine them into a single model to maximise their potential. Instead of using handcrafted features or spectrograms, we train the CNN to recognise low-level speech representations from the raw waveform, which allows the network to capture relevant narrow-band emotion characteristics. The RNN layers (GRUs in our case), in turn, learn temporal characteristics, allowing the network to better capture the signal's time-distributed features. Because a CNN can generate multiple levels of representation abstraction, we exploit the early layers to extract high-level features and supply the appropriate input to subsequent RNN layers in order to aggregate long-term dependencies. By taking advantage of both CNNs and GRUs in a single model, the proposed architecture has important advantages over other models from the literature. The proposed model was evaluated on the TESS dataset and compared to state-of-the-art methods. Our experimental results demonstrate that the proposed model is more accurate than traditional classification approaches for speech emotion recognition.
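To illustrate the CNN-to-GRU plumbing, the sketch below computes how the temporal length of a raw waveform shrinks through a stack of 1-D convolutions before the resulting feature sequence becomes the GRU's input sequence length; all kernel and stride values are assumed for illustration, not taken from the paper:

```python
def conv1d_out_len(length, kernel, stride, padding=0):
    # Standard output-length formula for a 1-D convolution.
    return (length + 2 * padding - kernel) // stride + 1

def cnn_sequence_length(n_samples, layers):
    # layers: list of (kernel, stride) pairs, one per CNN layer.
    length = n_samples
    for kernel, stride in layers:
        length = conv1d_out_len(length, kernel, stride)
    return length

# e.g. 1 s of 16 kHz audio through four downsampling conv layers (assumed)
seq_len = cnn_sequence_length(16000, [(80, 4), (3, 2), (3, 2), (3, 2)])
```

Each of the `seq_len` time steps then carries a learned feature vector that the GRU stack consumes to aggregate long-term dependencies.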
A Real-time Digit Gesture Recognition System Based on mmWave Radar
Chun Yuan, Youxuan Zhong, Jiake Tian, Y. Zou
Pub Date: 2022-12-01 | DOI: 10.1109/ICMLA55696.2022.00129
Gesture communication is one of the most universal communication methods in the world, with the obvious advantage of exchanging information without the barrier of different languages. Therefore, establishing a cost-effective way of capturing and understanding human gestures has long been a popular research topic in human-machine interaction, particularly in emerging scenarios such as smart cities. In this paper, we propose a system based on a commercially available mmWave radar that recognizes digits traced by the travel path of the human hand, using a specially designed convolutional neural network (CNN). We show that the proposed system is capable of recording the path of the moving hand in real time at the cost of one transmitter, two receivers, and 2.78 GHz of bandwidth from the mmWave radar. Our experimental results show an average prediction accuracy of 98.8% in a validation test based on a 7:3 split of the existing dataset, and an average prediction accuracy of 95.3% in a generalization test using fresh data.
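The 7:3 validation split mentioned above can be sketched as follows (an illustrative helper with a fixed seed, not the authors' pipeline):

```python
import random

def split_7_3(samples, seed=0):
    # Shuffle a copy deterministically, then cut at the 70% mark.
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * 0.7)
    return shuffled[:cut], shuffled[cut:]

train, val = split_7_3(list(range(100)))
```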
Unsupervised Multivariate Time-Series Transformers for Seizure Identification on EEG
Ilkay Yildiz Potter, George Zerveas, Carsten Eickhoff, D. Duncan
Pub Date: 2022-12-01 | DOI: 10.1109/ICMLA55696.2022.00208
Epilepsy is one of the most common neurological disorders, typically observed via seizure episodes. Epileptic seizures are commonly monitored through electroencephalogram (EEG) recordings because they are collected routinely and at low expense. The stochastic nature of EEG makes seizure identification via manual inspection by highly trained experts a tedious endeavor, motivating the use of automated identification. The literature on automated identification focuses mostly on supervised learning methods, which require expert labels of EEG segments that contain seizures; such labels are difficult to obtain. Motivated by these observations, we pose seizure identification as an unsupervised anomaly detection problem. To this end, we employ the first unsupervised transformer-based model for seizure identification on raw EEG. We train an autoencoder involving a transformer encoder via an unsupervised loss function, incorporating a novel masking strategy uniquely designed for multivariate time-series data such as EEG. Training employs EEG recordings that do not contain any seizures, while seizures are identified with respect to reconstruction errors at inference time. We evaluate our method on three publicly available benchmark EEG datasets for distinguishing seizure vs. non-seizure windows. Our method leads to significantly better seizure identification performance than supervised learning counterparts, by up to 16% recall, 9% accuracy, and 9% Area under the Receiver Operating Characteristics Curve (AUC), establishing particular benefits on highly imbalanced data. Through accurate seizure identification, our method could facilitate widely accessible and early detection of epilepsy development, without needing expensive label collection or manual feature extraction.
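A minimal sketch of the general idea, under assumptions about the masking geometry and threshold rule (the paper's exact scheme is not reproduced here): mask contiguous spans per channel during training, and flag windows with high reconstruction error at inference:

```python
import random

def mask_channels(window, span=3, seed=0):
    # window: list of channels, each a list of floats.
    # Zero out one contiguous span per channel, independently per channel.
    rng = random.Random(seed)
    masked = []
    for channel in window:
        start = rng.randrange(0, max(1, len(channel) - span))
        masked.append([0.0 if start <= i < start + span else v
                       for i, v in enumerate(channel)])
    return masked

def reconstruction_error(original, reconstructed):
    # Mean squared error over all channels and time steps.
    n = sum(len(c) for c in original)
    return sum((a - b) ** 2
               for co, cr in zip(original, reconstructed)
               for a, b in zip(co, cr)) / n

window = [[1.0] * 10, [2.0] * 10]          # toy 2-channel EEG window
masked = mask_channels(window)
err = reconstruction_error(window, masked)  # here: error from masked spans only
is_seizure = err > 0.5                      # threshold would be tuned on non-seizure data
```

In the actual method a transformer autoencoder produces the reconstruction; the untrained stand-in above merely shows where the error and threshold enter.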
Performance Benchmark of Machine Learning-Based Methodology for Swahili News Article Categorization
Shaun Anthony Little, Kaushik Roy, Ahmed Al Hamoud
Pub Date: 2022-12-01 | DOI: 10.1109/ICMLA55696.2022.00238
As data increases at unprecedented rates, so does the need to classify it, including news article data. Unfortunately, most news article categorization research targets global languages such as English or Spanish, and little research considers low-resource languages like Swahili. Testing multiple classifiers and preprocessing methods, we show that an SVM model with tokenization and stop word removal achieves the highest accuracy (85.13%) for Swahili news article categorization. These results, obtained on the first publicly available peer-reviewed Swahili news article dataset, provide a performance benchmark for Swahili news article categorization and contribute to the lean body of Swahili text classification research.
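The tokenization and stop word removal step can be sketched as below; the Swahili stop word list is a tiny assumed sample, not the study's full list:

```python
import re

# A few common Swahili function words (illustrative sample, not exhaustive).
SWAHILI_STOPWORDS = {"na", "ya", "wa", "kwa", "ni", "za", "katika", "la"}

def preprocess(text: str) -> list:
    # Lowercase, split into word tokens, drop stop words.
    tokens = re.findall(r"\w+", text.lower())
    return [t for t in tokens if t not in SWAHILI_STOPWORDS]

tokens = preprocess("Timu ya taifa ni bora katika mechi hii")
```

The surviving tokens would then be vectorized (e.g. TF-IDF) and fed to the SVM classifier.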
A Robust Approach to Fine-tune Pre-trained Transformer-based models for Text Summarization through Latent Space Compression
Ala Alam Falaki, R. Gras
Pub Date: 2022-12-01 | DOI: 10.1109/ICMLA55696.2022.00030
We propose a technique to reduce the number of decoder parameters in a sequence-to-sequence (seq2seq) architecture for automatic text summarization. The approach uses an autoencoder (AE), pre-trained on top of the encoder's output to reduce its embedding dimension, which significantly reduces the size of the summarizer's decoder. Two experiments were performed to validate the idea: a custom seq2seq architecture with various pre-trained encoders, and incorporating the approach into an encoder-decoder model (BART) for text summarization. Both studies showed promising results in terms of ROUGE score. The most notable outcome, however, is a 54% decrease in inference time and a 57% drop in GPU memory usage while fine-tuning, with minimal quality loss (4.5% in ROUGE-1 score). This significantly reduces the hardware requirements for fine-tuning large-scale pre-trained models. We also show that our approach can be combined with other network size reduction techniques (e.g. distillation) to further reduce the parameter count of any encoder-decoder model. The implementation and checkpoints are available on GitHub.
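A back-of-the-envelope sketch of why shrinking the embedding dimension shrinks the decoder: with assumed dimensions (not the paper's), a rough transformer-decoder parameter count scales roughly quadratically with model width:

```python
def decoder_params(d_model, vocab=50000, layers=6, ffn_mult=4):
    # Rough transformer-decoder count: per layer, self-attention plus
    # cross-attention (4 weight matrices each) and a feed-forward block,
    # plus a final LM head. Biases and layer norms are ignored.
    attn = 2 * 4 * d_model * d_model
    ffn = 2 * d_model * (ffn_mult * d_model)
    return layers * (attn + ffn) + d_model * vocab

full = decoder_params(768)          # encoder's native width (assumed)
compressed = decoder_params(256)    # AE bottleneck width (assumed)
saving = 1 - compressed / full
```

Even with these toy numbers, compressing the latent space from 768 to 256 dimensions removes the large majority of decoder parameters, which is consistent with the reported memory and inference-time savings.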
GDSCAN: Pedestrian Group Detection using Dynamic Epsilon
Ming-Jie Chen, Shadi Banitaan, Mina Maleki, Yichun Li
Pub Date: 2022-12-01 | DOI: 10.1109/ICMLA55696.2022.00267
To maintain human safety around autonomous vehicles, real-time pedestrian detection and tracking have become crucial research areas. The critical challenge in this field is to improve pedestrian detection accuracy while reducing tracking processing time. Because pedestrians tend to move in groups with the same speed and direction, we can address this challenge by detecting and tracking pedestrian groups. This work focuses on pedestrian group detection. Various clustering methods were used to identify pedestrian groups. First, pedestrians were identified using a convolutional neural network approach. Second, the K-Means and DBSCAN clustering methods were used to identify pedestrian groups based on the coordinates of the pedestrians' bounding boxes. Moreover, we propose a modified DBSCAN clustering method, named GDSCAN, that applies a dynamic epsilon to different areas of an image. Experimental results on the MOT17 dataset show that GDSCAN outperformed the K-Means and DBSCAN methods based on the Silhouette Coefficient score and Adjusted Rand Index (ARI).
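The dynamic-epsilon idea can be sketched as a DBSCAN-style region grower whose epsilon depends on a point's position in the image; the linear-in-y schedule below is a stand-in assumption for the perspective effect, not GDSCAN's actual rule:

```python
def dynamic_eps(point, base=1.0, scale=0.05):
    # Assumed schedule: points lower in the image get a larger epsilon.
    x, y = point
    return base + scale * y

def neighbors(points, i):
    # Neighbors of point i within its own (position-dependent) epsilon.
    eps = dynamic_eps(points[i])
    xi, yi = points[i]
    return [j for j, (x, y) in enumerate(points)
            if j != i and (x - xi) ** 2 + (y - yi) ** 2 <= eps ** 2]

def cluster(points, min_pts=1):
    labels = [-1] * len(points)   # -1 = noise / unvisited
    cid = 0
    for i in range(len(points)):
        if labels[i] != -1 or len(neighbors(points, i)) < min_pts:
            continue
        stack, labels[i] = [i], cid
        while stack:              # grow the cluster from core points
            for j in neighbors(points, stack.pop()):
                if labels[j] == -1:
                    labels[j] = cid
                    if len(neighbors(points, j)) >= min_pts:
                        stack.append(j)
        cid += 1
    return labels

# two pedestrians close together, one far away (bounding-box centers)
labels = cluster([(0, 10), (1, 10), (50, 10)], min_pts=1)
```

In this toy run the two nearby points share a cluster label while the distant one is left as noise; varying `base` and `scale` changes how generously points in different image regions are grouped.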
Flexible Exploration Strategies in Multi-Agent Reinforcement Learning for Instability by Mutual Learning
Yuki Miyashita, T. Sugawara
Pub Date: 2022-12-01 | DOI: 10.1109/ICMLA55696.2022.00100
A fundamental challenge in multi-agent reinforcement learning is effective exploration of state-action spaces, because agents must learn their policies in an environment made non-stationary by the changing policies of other learning agents. As learning progresses, different undesired situations may appear one after another, and agents have to learn anew to adapt to them. Agents must therefore relearn with a high probability of exploration to find appropriate actions for the situation at hand. However, existing algorithms can fail to relearn behaviors owing to insufficient exploration of these situations, because agents usually become exploitation-oriented under simple exploration strategies such as the ε-greedy strategy. We therefore propose two types of simple exploration strategies in which each agent monitors the trend of its performance and controls the exploration probability ε based on transitions in that performance. By introducing a coordination problem called the PushBlock problem, which exhibits the above issue, we show that the proposed method improves overall performance relative to conventional ε-greedy strategies, and we analyze its effects on the generated behavior.
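One plausible reading of such a strategy, sketched under assumptions (the window size and step rule are illustrative, not the paper's controllers): raise ε when the recent performance trend turns downward, instead of decaying it forever:

```python
from collections import deque

class TrendEpsilon:
    def __init__(self, eps=0.1, window=10, step=0.05,
                 eps_min=0.01, eps_max=0.5):
        self.eps, self.step = eps, step
        self.eps_min, self.eps_max = eps_min, eps_max
        self.window = window
        self.returns = deque(maxlen=2 * window)  # two comparison windows

    def update(self, episode_return):
        self.returns.append(episode_return)
        if len(self.returns) < 2 * self.window:
            return self.eps                       # not enough history yet
        older = list(self.returns)[: self.window]
        recent = list(self.returns)[self.window:]
        if sum(recent) / self.window < sum(older) / self.window:
            # performance fell: the environment likely changed, explore more
            self.eps = min(self.eps + self.step, self.eps_max)
        else:
            # performance stable or improving: exploit more
            self.eps = max(self.eps - self.step, self.eps_min)
        return self.eps

ctrl = TrendEpsilon()
for r in [10] * 10 + [5] * 10:   # performance drops in the second half
    eps = ctrl.update(r)
```

The controller reacts only to the observed return trend, so it needs no knowledge of the other agents' policies, matching the decentralized setting described above.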
Stochastic Induction of Decision Trees with Application to Learning Haar Trees
A. Alizadeh, Mukesh Singhal, Vahid Behzadan, Pooya Tavallali, A. Ranganath
Pub Date: 2022-12-01 | DOI: 10.1109/ICMLA55696.2022.00137
Decision trees are a convenient and established approach for many supervised learning tasks. Decision trees are trained by greedily splitting leaf nodes into two until a specific stopping criterion is reached. Splitting a node consists of finding the best feature and threshold that minimize a criterion, a minimization problem typically solved through a costly exhaustive search. This paper proposes a novel stochastic approach to criterion minimization. The algorithm is compared with several related state-of-the-art decision tree learning methods, including the baseline non-stochastic approach. We apply the proposed algorithm to learn a Haar tree over the MNIST dataset, which comprises over 200,000 features and 60,000 samples. The result is comparable to the performance of oblique trees while providing a significant speed-up in both inference and training times.
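The stochastic split search can be sketched as sampling random (feature, threshold) candidates and keeping the best by Gini impurity; the candidate count and toy data are illustrative, not the paper's algorithm:

```python
import random

def gini(labels):
    # Gini impurity for binary labels (0/1).
    n = len(labels)
    if n == 0:
        return 0.0
    p1 = sum(labels) / n
    return 1.0 - p1 ** 2 - (1.0 - p1) ** 2

def split_gini(X, y, feat, thr):
    # Weighted Gini impurity of the two children induced by (feat, thr).
    left = [yi for xi, yi in zip(X, y) if xi[feat] <= thr]
    right = [yi for xi, yi in zip(X, y) if xi[feat] > thr]
    n = len(y)
    return len(left) / n * gini(left) + len(right) / n * gini(right)

def stochastic_best_split(X, y, n_candidates=50, seed=0):
    # Instead of scanning every feature/threshold pair, sample candidates.
    rng = random.Random(seed)
    best = (float("inf"), None, None)
    for _ in range(n_candidates):
        feat = rng.randrange(len(X[0]))
        thr = rng.choice([xi[feat] for xi in X])
        score = split_gini(X, y, feat, thr)
        if score < best[0]:
            best = (score, feat, thr)
    return best

# feature 1 perfectly separates the classes
X = [(5, 0), (3, 0), (4, 1), (6, 1)]
y = [0, 0, 1, 1]
score, feat, thr = stochastic_best_split(X, y)
```

With hundreds of thousands of Haar features, sampling a fixed candidate budget rather than exhaustively scanning all feature-threshold pairs is what yields the reported training speed-up.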
Learning Non-linear White-box Predictors: A Use Case in Energy Systems
Sandra Wilfling, M. Ebrahimi, Qamar Alfalouji, G. Schweiger, Mina Basirat
Pub Date: 2022-12-01 | DOI: 10.1109/ICMLA55696.2022.00082
Many applications in energy systems require models that represent the non-linear dynamics of the underlying systems. Black-box models with non-linear architectures are suitable candidates for modeling these systems; however, they are computationally expensive and lack interpretability. An inexpensive white-box linear combination learned over a suitable polynomial feature set can yield a high-performing non-linear model that is easier to interpret, validate, and verify against reference models created by domain experts. This paper proposes a workflow to learn a linear combination of non-linear terms from an engineered polynomial feature set. We first detect non-linear dependencies and then attempt to reconstruct them using feature expansion. Afterwards, we select the possible predictors with the highest correlation coefficients for predictive regression analysis. We demonstrate how to learn inexpensive yet comprehensible linear combinations of non-linear terms on four datasets. Experimental evaluations show that our workflow yields improvements in the R², CV-RMSE, and MAPE metrics on all datasets. Further evaluation of the learned models' goodness of fit using prediction error plots confirms that the proposed workflow produces models that more accurately capture the nature of the underlying physical systems.
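The expansion-then-correlation-selection steps can be sketched as follows (a stdlib-only toy with an assumed quadratic target; the real workflow selects several predictors and richer non-linear terms):

```python
import math

def pearson(xs, ys):
    # Pearson correlation coefficient between two equal-length sequences.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 8.0, 18.0, 32.0, 50.0]   # assumed toy target: y = 2 * x**2

# polynomial feature expansion, then rank terms by correlation with y
terms = {d: [xi ** d for xi in x] for d in (1, 2, 3)}
best_d = max(terms, key=lambda d: abs(pearson(terms[d], y)))

# closed-form least squares for y ~ a * x**best_d (no intercept)
t = terms[best_d]
a = sum(ti * yi for ti, yi in zip(t, y)) / sum(ti * ti for ti in t)
```

The selected term and its fitted coefficient remain directly readable, which is the white-box advantage the paper argues for over black-box non-linear models.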