Title: Capsule Network with Its Limitation, Modification, and Applications - A Survey
Authors: Mahmood Ul Haq, M. A. J. Sethi, A. Rehman
DOI: https://doi.org/10.3390/make5030047 (published 2023-08-02, pp. 891-921)

Modern computer vision and machine learning methods have enabled numerous advances in fields such as pattern recognition and image classification. The capsule network is an advanced machine learning architecture that encodes features according to their hierarchical relationships. A capsule network is a neural network that performs a form of inverse graphics: it represents an object in terms of its constituent parts and models the relationships between those parts, unlike CNNs, which lose most of the evidence about spatial location and require large amounts of training data. We therefore present a comparative review of capsule network architectures used across a variety of applications. The paper's main contribution is to summarize and explain the significant capsule network architectures published to date, together with their advantages, limitations, modifications, and applications.
Title: Autoencoder Feature Residuals for Network Intrusion Detection: One-Class Pretraining for Improved Performance
Authors: B. Lewandowski, R. Paffenroth
DOI: https://doi.org/10.3390/make5030046 (published 2023-07-31, pp. 868-890)

The proliferation of novel attacks and growing volumes of data have forced practitioners in network intrusion detection to work constantly to keep pace with an evolving adversarial landscape. Researchers have sought to harness deep learning techniques to detect zero-day attacks and to let network intrusion detection systems alert network operators more efficiently. The technique outlined in this work uses a one-class training process to shape autoencoder feature residuals for the effective detection of network attacks. We show that autoencoder feature residuals are a suitable replacement for an original set of input features, often performing at least as well as the original feature set. This property allows autoencoder feature residuals to eliminate the need for extensive feature engineering without reducing classification performance. Additionally, we find that, without generating any new data relative to the original feature set, using autoencoder feature residuals often improves classifier performance. Practical side benefits also emerge when analyzing the data compression that autoencoder feature residuals can provide.
Title: Efficient Latent Space Compression for Lightning-Fast Fine-Tuning and Inference of Transformer-Based Models
Authors: Ala Alam Falaki, R. Gras
DOI: https://doi.org/10.3390/make5030045 (published 2023-07-30, pp. 847-867)

This paper presents a technique to reduce the number of parameters in a transformer-based encoder-decoder architecture by incorporating autoencoders. To discover the optimal compression, we trained different autoencoders on the embedding space (the encoder's output) of several pre-trained models. The experiments reveal that reducing the embedding size can dramatically decrease GPU memory usage while speeding up inference. The proposed architecture was incorporated into the BART model and tested on summarization, translation, and classification tasks. The summarization results show that a 60% reduction in decoder size (from 96 M to 40 M parameters) makes inference twice as fast and uses less than half the GPU memory during fine-tuning, with only a 4.5% drop in R-1 score. The same trend holds for translation and, partially, for classification. Our approach reduces the GPU memory usage and processing time of large-scale sequence-to-sequence models for both fine-tuning and inference. The implementation and checkpoints are available on GitHub.
Title: Low Cost Evolutionary Neural Architecture Search (LENAS) Applied to Traffic Forecasting
Authors: Daniel Klosa, C. Büskens
DOI: https://doi.org/10.3390/make5030044 (published 2023-07-28, pp. 830-846)

Traffic forecasting is an important task in transportation engineering, as it helps authorities plan and control traffic flow, detect congestion, and reduce environmental impact. The application of deep learning techniques to such large, complex datasets has become prevalent in recent years. However, these methods demand proficiency in neural architecture engineering, a skill set that many decision makers in traffic management centers do not possess. Neural architecture search (NAS) methods alleviate this problem by discovering customized neural architectures for a given task, and their application to traffic prediction has only recently been explored. Performance estimation of neural architectures, a sub-problem of NAS and often its computational bottleneck, hinders the transfer of this research to real-world applications. Recently, zero-cost (ZC) proxies have emerged as a cost-effective means of evaluating network architectures without training, circumventing the bottleneck at some expense of accuracy. This work extends previous research on evolutionary NAS (ENAS) by evaluating the utility of ZC proxies for the task of traffic prediction. We answer research questions about the stability of ZC proxies and their correlation with validation losses on real-world datasets. When used in the ENAS framework, ZC proxies speed up the search process by two orders of magnitude without greatly affecting the accuracy of the prediction model. These results establish the viability of ZC proxies as a practical means of accelerating NAS methods.
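To illustrate what "scoring without training" looks like, the toy sketch below ranks candidate architectures with a plain gradient-norm score, one simple member of the ZC proxy family (the paper evaluates several such proxies). The architectures, data, and proxy choice are placeholder assumptions.

```python
# Sketch: rank candidate networks by a zero-cost gradient-norm proxy.
import torch
import torch.nn as nn

def grad_norm_score(model, x, y):
    # A single forward/backward pass per candidate; no training occurs,
    # which is what makes the evaluation "zero-cost" relative to NAS.
    model.zero_grad()
    loss = nn.functional.mse_loss(model(x), y)
    loss.backward()
    return sum(p.grad.norm().item() for p in model.parameters()
               if p.grad is not None)

# Rank two candidate architectures on one batch of (here synthetic) traffic data.
x, y = torch.randn(64, 12), torch.randn(64, 1)
candidates = [nn.Sequential(nn.Linear(12, 32), nn.ReLU(), nn.Linear(32, 1)),
              nn.Sequential(nn.Linear(12, 8), nn.Tanh(), nn.Linear(8, 1))]
scores = [grad_norm_score(m, x, y) for m in candidates]
best = candidates[max(range(len(scores)), key=scores.__getitem__)]
```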
Title: Classification Confidence in Exploratory Learning: A User's Guide
Authors: P. Salamon, David Salamon, V. A. Cantu, Michelle An, Tyler Perry, Robert A. Edwards, A. Segall
DOI: https://doi.org/10.3390/make5030043 (published 2023-07-21, pp. 803-829)

This paper investigates the post-hoc calibration of confidence for "exploratory" machine learning classification problems. The difficulty in these problems stems from the continuing desire to push the boundaries of which categories have enough examples to generalize from when curating datasets, and from confusion regarding the validity of those categories. We argue that for such problems the "one-versus-all" approach (top-label calibration) must be used, rather than the "calibrate-the-full-response-matrix" approach advocated elsewhere in the literature. We introduce and test four new algorithms designed to handle the idiosyncrasies of category-specific confidence estimation using only the test set and the final model. Chief among these methods is the use of kernel density ratios for confidence calibration, including a novel algorithm for choosing the bandwidth. We test our claims and explore the limits of calibration on a bioinformatics application (PhANNs) as well as the classic MNIST benchmark. Finally, our analysis argues that post-hoc calibration should always be performed, may be performed using only the test dataset, and should be sanity-checked visually.
Title: Deep Learning and Autonomous Vehicles: Strategic Themes, Applications, and Research Agenda Using SciMAT and Content-Centric Analysis, a Systematic Review
Authors: Fábio Eid Morooka, Adalberto Manoel Junior, T. Sigahi, Jefferson de Souza Pinto, I. Rampasso, R. Anholon
DOI: https://doi.org/10.3390/make5030041 (published 2023-07-13, pp. 763-781)

Applications of deep learning (DL) in autonomous vehicle (AV) projects have attracted increasing interest from both researchers and companies, driving a rapid expansion of scientific production on DL-AV in recent years and encouraging researchers to conduct systematic literature reviews (SLRs) to organize knowledge on the topic. However, a critical analysis of the existing SLRs on DL-AV reveals methodological gaps, particularly regarding the use of bibliometric software: powerful tools for analyzing large amounts of data and for providing a holistic understanding of the structure of knowledge in a field. This study identifies the strategic themes and trends in DL-AV research using the Science Mapping Analysis Tool (SciMAT) and content analysis. Strategic diagrams and cluster networks were developed using SciMAT, enabling the identification of motor themes and research opportunities. The content analysis categorized the contributions of the academic literature on DL applications in AV project design; on neural networks and AI models used in AVs; and on transdisciplinary themes in DL-AV research, including energy, legislation, ethics, and cybersecurity. Potential research avenues are discussed for each of these categories. The findings can benefit both experienced scholars, who gain access to condensed information about the DL-AV literature, and new researchers drawn to topics related to technological development and other issues with social and environmental impact.
Title: The Value of Numbers in Clinical Text Classification
Authors: Kristian Miok, P. Corcoran, Irena Spasic
DOI: https://doi.org/10.3390/make5030040 (published 2023-07-07, pp. 746-762)

Clinical text often includes numbers of various types and formats, yet most current text classification approaches do not take advantage of them. This study aims to demonstrate that using numbers as features can significantly improve the performance of text classification models, and that such features can feasibly be extracted from clinical text. Unsupervised learning was used to identify patterns of number usage in clinical text. These patterns were analyzed manually and converted into pattern-matching rules. Information extraction was then used to incorporate numbers as features into a document representation model, and we evaluated text classification models trained on that representation. Our experiments were performed with two document representation models (a vector space model and a word embedding model) and two classification models (support vector machines and neural networks). The results showed that even a handful of numerical features can significantly improve text classification performance. We conclude that commonly used document representations do not represent numbers in a way that machine learning algorithms can effectively utilize as features. Although we demonstrated that traditional information extraction can effectively convert numbers into features, further community-wide research is required to systematically incorporate number representation into the word embedding process.
Title: Research on Forest Fire Detection Algorithm Based on Improved YOLOv5
Authors: Jianfeng Li, Xiao-Feng Lian
DOI: https://doi.org/10.3390/make5030039 (published 2023-06-28, pp. 725-745)

Forest fires are among the world's deadliest natural disasters, and early detection can help minimize damage to ecosystems and forest life. In this paper, we propose YOLOv5-IFFDM, an improved fire detection method based on YOLOv5. First, fire and smoke detection accuracy and the network's perception of small targets are improved by adding an attention mechanism to the backbone network. Second, the loss function is improved and a SoftPool pyramid pooling structure is used to increase the model's regression accuracy, detection performance, and robustness. In addition, a random mosaic augmentation technique enhances the data to increase the model's generalization ability, and the a priori boxes for flame and smoke detection are re-clustered to improve accuracy and speed. Finally, the parameters of the trained model's convolutional and normalization layers are merged to further reduce the processing load and improve detection speed. Experimental results on self-built forest-fire and smoke datasets show that the algorithm achieves high detection accuracy and fast detection speed, with average accuracy of up to 90.5% for fire and 84.3% for smoke and a detection speed of up to 75 FPS (frames per second), meeting the requirements of real-time, efficient fire detection.
Title: Using Machine Learning with Eye-Tracking Data to Predict if a Recruiter Will Approve a Resume
Authors: Angel Pina, Corbin Petersheim, Josh Cherian, J. Lahey, Gerianne Alexander, T. Hammond
DOI: https://doi.org/10.3390/make5030038 (published 2023-06-28, pp. 713-724)

When job seekers are unsuccessful in getting a position, they often receive no feedback to inform them on how to develop a better application in the future. There is therefore a critical need to understand which qualifications recruiters value, in order to help applicants. To address this need, we used eye-trackers to measure and record visual data from recruiters screening resumes, gaining insight into which Areas of Interest (AOIs) most influenced recruiters' decisions. Using this eye-tracking data alone, we trained a machine learning classifier to predict whether a recruiter would move a resume on to the next level of the hiring process, achieving an AUC of 0.767. We found that features associated with recruiters looking outside the content of a resume were most predictive of their decision, along with total time viewing the resume and time spent on the Experience and Education sections. We hypothesize that this behavior reflects the recruiter pausing to consider the content of the resume. These initial results suggest that applicants should focus on designing clear and concise resumes that are easy for recruiters to absorb and think about, with additional attention given to the Experience and Education sections.
Title: Drug-Drug Interaction Extraction from Biomedical Text Using Relation BioBERT with BLSTM
Authors: Maryam KafiKang, Abdeltawab Hendawi
DOI: https://doi.org/10.3390/make5020036 (published 2023-06-10)

In the context of pharmaceuticals, drug-drug interactions (DDIs) occur when two or more drugs interact, potentially altering the drugs' intended effects and resulting in adverse patient health outcomes; it is therefore essential to identify and understand these interactions. In recent years, a growing number of novel compounds have been discovered, giving rise to many new DDIs, most of which are still reported primarily in biomedical articles and other text sources, so effective methods are needed to extract and analyze them. Despite the development of various techniques, accurately predicting DDIs remains a significant challenge. This paper proposes a novel solution that leverages Relation BioBERT (R-BioBERT) to detect and classify DDIs and a Bidirectional Long Short-Term Memory (BLSTM) network to improve prediction accuracy. In addition to determining whether two drugs interact, the proposed method also identifies the specific type of interaction between them. Results show that the use of BLSTM leads to significantly higher F-scores than our baseline model on three well-known DDI extraction datasets: SemEval 2013, TAC 2018, and TAC 2019.