Machine learning and knowledge extraction: latest articles
Learning Sentence-Level Representations with Predictive Coding
Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2023-01-09 DOI: 10.3390/make5010005
Vladimir Araujo, M. Moens, Álvaro Soto
Learning sentence representations is an essential and challenging topic in the deep learning and natural language processing communities. Recent methods pre-train big models on a massive text corpus, focusing mainly on learning the representation of contextualized words. As a result, these models cannot generate informative sentence embeddings since they do not explicitly exploit the structure and discourse relationships existing in contiguous sentences. Drawing inspiration from human language processing, this work explores how to improve sentence-level representations of pre-trained models by borrowing ideas from predictive coding theory. Specifically, we extend BERT-style models with bottom-up and top-down computation to predict future sentences in latent space at each intermediate layer in the networks. We conduct extensive experimentation with various benchmarks for the English and Spanish languages, designed to assess sentence- and discourse-level representations and pragmatics-focused assessments. Our results show that our approach improves sentence representations consistently for both languages. Furthermore, the experiments also indicate that our models capture discourse and pragmatics knowledge. In addition, to validate the proposed method, we carried out an ablation study and a qualitative study with which we verified that the predictive mechanism helps to improve the quality of the representations.
Pages: 59-77
Citations: 1
IPPT4KRL: Iterative Post-Processing Transfer for Knowledge Representation Learning
Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2023-01-06 DOI: 10.3390/make5010004
Weihang Zhang, O. Șerban, Jiahao Sun, Yike Guo
Knowledge Graphs (KGs), a structural way to model human knowledge, have been a critical component of many artificial intelligence applications. Many KG-based tasks are built using knowledge representation learning, which embeds KG entities and relations into a low-dimensional semantic space. However, the quality of representation learning is often limited by the heterogeneity and sparsity of real-world KGs. Multi-KG representation learning, which utilizes KGs from different sources collaboratively, presents one promising solution. In this paper, we propose a simple, but effective iterative method that post-processes pre-trained knowledge graph embedding (IPPT4KRL) on individual KGs to maximize the knowledge transfer from another KG when a small portion of alignment information is introduced. Specifically, additional triples are iteratively included in the post-processing based on their adjacencies to the cross-KG alignments to refine the pre-trained embedding space of individual KGs. We also provide the benchmarking results of existing multi-KG representation learning methods on several generated and well-known datasets. The empirical results of the link prediction task on these datasets show that the proposed IPPT4KRL method achieved comparable and even superior results when compared against more complex methods in multi-KG representation learning.
Pages: 43-58
Citations: 0
Detecting Arabic Cyberbullying Tweets Using Machine Learning
Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2023-01-05 DOI: 10.3390/make5010003
Alanoud Mohammed Alduailaj, A. Belghith
The advancement of technology has paved the way for a new type of bullying, which often leads to negative stigma in the social setting. Cyberbullying is a cybercrime wherein one individual becomes the target of harassment and hatred. It has recently become more prevalent due to a rise in the usage of social media platforms, and, in some severe situations, it has even led to victims’ suicides. In the literature, several cyberbullying detection methods are proposed, but they are mainly focused on word-based data and user account attributes. Furthermore, most of them are related to the English language. Meanwhile, only a few papers have studied cyberbullying detection in Arabic social media platforms. This paper, therefore, aims to use machine learning in the Arabic language for automatic cyberbullying detection. The proposed mechanism identifies cyberbullying using the Support Vector Machine (SVM) classifier algorithm by using a real dataset obtained from YouTube and Twitter to train and test the classifier. Moreover, we include the Farasa tool to overcome text limitations and improve the detection of bullying attacks.
Pages: 29-42
Citations: 8
Machine Learning and Knowledge Extraction: 7th IFIP TC 5, TC 12, WG 8.4, WG 8.9, WG 12.9 International Cross-Domain Conference, CD-MAKE 2023, Benevento, Italy, August 29 – September 1, 2023, Proceedings
Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2023-01-01 DOI: 10.1007/978-3-031-40837-3
Citations: 0
Skew Class-balanced Re-weighting for Unbiased Scene Graph Generation
Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2023-01-01 DOI: 10.3390/make5010018
Haeyong Kang, C. D. Yoo
An unbiased scene graph generation (SGG) algorithm referred to as Skew Class-Balanced Re-Weighting (SCR) is proposed to address the biased predicate prediction caused by the long-tailed distribution. Prior works focus mainly on alleviating the deteriorating performance of minority predicate predictions, at the cost of drastically dropping recall scores, i.e., losing majority predicate performance. The trade-off between majority and minority predicate performance in the limited SGG datasets has not yet been correctly analyzed. In this paper, to alleviate the issue, the Skew Class-Balanced Re-Weighting (SCR) loss function is considered for unbiased SGG models. Leveraging the skewness of biased predicate predictions, the SCR estimates the target predicate weight coefficient and then assigns more weight to the biased predicates for a better trade-off between the majority predicates and the minority ones. Extensive experiments conducted on the standard Visual Genome dataset and Open Images V4 and V6 show the performance and generality of the SCR with traditional SGG models.
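To make the idea of class re-weighting concrete, the following is a minimal sketch of plain inverse-frequency weighting, a generic baseline and not the SCR loss from the paper (the abstract does not give the SCR formula). The function names and the toy predicate labels are hypothetical.

```python
from collections import Counter

def inverse_frequency_weights(labels):
    """Weight each class by total / (num_classes * count), so rare classes weigh more."""
    counts = Counter(labels)
    total = len(labels)
    return {c: total / (len(counts) * n) for c, n in counts.items()}

def weighted_mean_loss(losses, labels, weights):
    """Average per-sample losses after scaling each by its class weight."""
    scaled = [loss * weights[label] for loss, label in zip(losses, labels)]
    return sum(scaled) / len(scaled)

# Hypothetical long-tailed predicate labels: "on" dominates, "riding" is rare.
labels = ["on"] * 8 + ["riding"] * 2
weights = inverse_frequency_weights(labels)
```

With these weights, each sample of the rare class contributes four times as much to the loss as a sample of the frequent class, which is the kind of majority/minority trade-off the abstract discusses.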
Pages: 287-303
Citations: 4
Synthetic Data Generation for Visual Detection of Flattened PET Bottles
Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2022-12-29 DOI: 10.3390/make5010002
Vitālijs Feščenko, Jānis Ārents, R. Kadikis
Polyethylene terephthalate (PET) bottle recycling is a highly automated task; however, manual quality control is required due to inefficiencies of the process. In this paper, we explore automation of the quality control sub-task, namely visual bottle detection, using convolutional neural network (CNN)-based methods and synthetic generation of labelled training data. We propose a synthetic generation pipeline tailored for transparent and crushed PET bottle detection; however, it can also be applied to undeformed bottles if the viewpoint is set from above. We conduct various experiments on CNNs to compare the quality of real and synthetic data, show that synthetic data can reduce the amount of real data required and experiment with the combination of both datasets in multiple ways to obtain the best performance.
Pages: 14-28
Citations: 2
Multimodal AutoML via Representation Evolution
Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2022-12-23 DOI: 10.3390/make5010001
Blaž Škrlj, Matej Bevec, Nadine Lavrac
With the increasing amounts of available data, learning simultaneously from different types of inputs is becoming necessary to obtain robust and well-performing models. With the advent of representation learning in recent years, lower-dimensional vector-based representations have become available for both images and texts, while automating simultaneous learning from multiple modalities remains a challenging problem. This paper presents an AutoML (automated machine learning) approach to identifying model configurations for data composed of two modalities: texts and images. The approach is based on the idea of representation evolution, the process of automatically amplifying heterogeneous representations across several modalities, optimized jointly with a collection of fast, well-regularized linear models. The proposed approach is benchmarked against 11 unimodal and multimodal (texts and images) approaches on four real-life benchmark datasets from different domains. It achieves competitive performance with minimal human effort and low computing requirements, enabling learning from multiple modalities in an automated manner for a wider community of researchers.
Pages: 1-13
Citations: 0
An Explainable Deep Learning Framework for Detecting and Localising Smoke and Fire Incidents: Evaluation of Grad-CAM++ and LIME
Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2022-12-06 DOI: 10.3390/make4040057
Ioannis D. Apostolopoulos, I. Athanasoula, Mpesiana A. Tzani, P. Groumpos
Climate change is expected to increase fire events and activity with multiple impacts on human lives. Large grids of forest and city monitoring devices can assist in incident detection, accelerating human intervention in extinguishing fires before they get out of control. Artificial Intelligence promises to automate the detection of fire-related incidents. This study enrols 53,585 fire/smoke and normal images and benchmarks seventeen state-of-the-art Convolutional Neural Networks for distinguishing between the two classes. The Xception network proves to be superior to the rest of the CNNs, obtaining very high accuracy. Grad-CAM++ and LIME algorithms improve the post hoc explainability of Xception and verify that it is learning features found in the critical locations of the image. Both methods agree on the suggested locations, strengthening the abovementioned outcome.
Pages: 1124-1135
Citations: 3
Ontology Completion with Graph-Based Machine Learning: A Comprehensive Evaluation
Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2022-12-01 DOI: 10.3390/make4040056
Sebastian Mežnar, Matej Bevec, N. Lavrač, Blaž Škrlj
Increasing quantities of semantic resources offer a wealth of human knowledge, but their growth also increases the probability of wrong knowledge base entries. The development of approaches that identify potentially spurious parts of a given knowledge base is therefore highly relevant. We propose an approach for ontology completion that transforms an ontology into a graph and recommends missing edges using structure-only link analysis methods. By systematically evaluating thirteen methods (some for knowledge graphs) on eight different semantic resources, including Gene Ontology, Food Ontology, Marine Ontology, and similar ontologies, we demonstrate that a structure-only link analysis can offer a scalable and computationally efficient ontology completion approach for a subset of analyzed data sets. To the best of our knowledge, this is currently the most extensive systematic study of the applicability of different types of link analysis methods across semantic resources from different domains. It demonstrates that by considering symbolic node embeddings, explanations of the predictions (links) can be obtained, making this branch of methods potentially more valuable than black-box methods.
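As a rough illustration of what a structure-only link analysis can look like, the sketch below ranks candidate missing edges by the Jaccard coefficient of node neighbourhoods. This is a standard heuristic chosen for illustration; the abstract does not say whether it is among the thirteen methods evaluated. The function name and the toy term graph are hypothetical.

```python
from itertools import combinations

def jaccard_link_scores(edges):
    """Score every non-adjacent node pair by the overlap of their neighbourhoods."""
    neighbours = {}
    for u, v in edges:
        neighbours.setdefault(u, set()).add(v)
        neighbours.setdefault(v, set()).add(u)
    scores = {}
    for u, v in combinations(sorted(neighbours), 2):
        if v in neighbours[u]:
            continue  # the edge already exists, nothing to recommend
        shared = neighbours[u] & neighbours[v]
        if shared:
            union = neighbours[u] | neighbours[v]
            scores[(u, v)] = len(shared) / len(union)
    return scores

# Hypothetical toy ontology, flattened to an undirected graph of related terms.
toy_edges = [
    ("enzyme", "protein"), ("kinase", "enzyme"), ("kinase", "protein"),
    ("protease", "enzyme"), ("protein", "molecule"),
]
# Highest-scoring pairs first: these are the recommended missing edges.
ranked = sorted(jaccard_link_scores(toy_edges).items(), key=lambda kv: -kv[1])
```

The score uses only the graph structure (no labels or text), which is what keeps this family of methods cheap to run on large ontologies.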
Pages: 1107-1123
Citations: 0
SGD-Based Cascade Scheme for Higher Degrees Wiener Polynomial Approximation of Large Biomedical Datasets
Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2022-11-21 DOI: 10.3390/make4040055
I. Izonin, R. Tkachenko, Rostyslav Holoven, Kyrylo Yemets, Myroslav Havryliuk, Shishir K. Shandilya
The modern development of the biomedical engineering area is accompanied by the availability of large volumes of data with a non-linear response surface. The effective analysis of such data requires the development of new, more productive machine learning methods. This paper proposes a cascade ensemble that combines the advantages of using a high-order Wiener polynomial and Stochastic Gradient Descent algorithm while eliminating their disadvantages to ensure a high accuracy of the approximation of such data with a satisfactory training time. The work presents flow charts of the learning algorithms and the application of the developed ensemble scheme, and all the steps are described in detail. The simulation was carried out based on a real-world dataset. Procedures for the proposed model tuning have been performed. The high accuracy of the approximation based on the developed ensemble scheme was established experimentally. The possibility of an implicit approximation by high orders of the Wiener polynomial with a slight increase in the number of its members is shown. It ensures a low training time for the proposed method during the analysis of large datasets, which provides the possibility of its practical use in the biomedical engineering area.
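The building block behind this kind of scheme can be sketched as a second-degree polynomial feature expansion followed by a linear model trained with stochastic gradient descent. The code below shows a single such level only, under the assumption that each level is a quadratic expansion; the cascade itself, its tuning, and the dataset are not described in the abstract and are not reproduced here. All names and the toy target function are hypothetical.

```python
import random
from itertools import combinations_with_replacement

def quadratic_features(x):
    """Second-degree polynomial terms: 1, x_i, and x_i * x_j for i <= j."""
    pairs = combinations_with_replacement(range(len(x)), 2)
    return [1.0] + list(x) + [x[i] * x[j] for i, j in pairs]

def fit_sgd(samples, targets, lr=0.05, epochs=500, seed=0):
    """Fit linear weights over the expanded features with per-sample SGD."""
    rng = random.Random(seed)
    feats = [quadratic_features(x) for x in samples]
    w = [0.0] * len(feats[0])
    order = list(range(len(feats)))
    for _ in range(epochs):
        rng.shuffle(order)  # visit samples in a fresh random order each epoch
        for k in order:
            err = sum(wi * fi for wi, fi in zip(w, feats[k])) - targets[k]
            w = [wi - lr * err * fi for wi, fi in zip(w, feats[k])]
    return w

def predict(w, x):
    """Evaluate the fitted polynomial at a new input."""
    return sum(wi * fi for wi, fi in zip(w, quadratic_features(x)))

# Hypothetical noiseless target with an interaction term, sampled on a 5x5 grid.
grid = [-1.0, -0.5, 0.0, 0.5, 1.0]
samples = [(a, b) for a in grid for b in grid]
targets = [1 + 2 * a - b + 0.5 * a * b for a, b in samples]
weights = fit_sgd(samples, targets)
```

Feeding the output of one such level into the next as an extra input raises the effective polynomial degree while each level keeps only a quadratic number of terms, which is one plausible reading of the "implicit approximation by high orders" mentioned in the abstract.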
Machine learning and knowledge extraction, pp. 1088-1106. Published 2022-11-21. https://doi.org/10.3390/make4040055
Citations: 1
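The cascade idea summarized in the abstract above can be illustrated with a small sketch. It is one plausible reading only, not the authors' algorithm: each level expands its inputs with a quadratic (degree-2) polynomial, fits a linear model on those terms with plain per-sample SGD, and passes its prediction to the next level as an extra input, so the implicit polynomial degree grows while the term count stays small. The function names (`quadratic_expand`, `fit_sgd`, `fit_cascade`, `predict_cascade`), the learning rate, the epoch count, and the way predictions are passed between levels are all assumptions made for illustration.

```python
import itertools
import numpy as np

def quadratic_expand(X):
    """Return the degree-2 terms of X: every x_i plus every product x_i * x_j (i <= j)."""
    n = X.shape[1]
    pairs = itertools.combinations_with_replacement(range(n), 2)
    products = [X[:, i] * X[:, j] for i, j in pairs]
    return np.column_stack([X] + products)

def fit_sgd(F, y, epochs=200, lr=0.005, seed=0):
    """Fit a linear model with bias by per-sample SGD on squared error."""
    rng = np.random.default_rng(seed)
    w = np.zeros(F.shape[1])
    b = 0.0
    for _ in range(epochs):
        for k in rng.permutation(len(y)):
            err = F[k] @ w + b - y[k]
            w -= lr * err * F[k]
            b -= lr * err
    return w, b

def fit_cascade(X, y, levels=2):
    """Train `levels` quadratic models; each later level also sees the previous prediction."""
    models = []
    Z = X
    for _ in range(levels):
        F = quadratic_expand(Z)
        # Standardize the expanded terms so a single learning rate suits all of them.
        mu, sigma = F.mean(axis=0), F.std(axis=0) + 1e-12
        w, b = fit_sgd((F - mu) / sigma, y)
        models.append((mu, sigma, w, b))
        pred = ((F - mu) / sigma) @ w + b
        # Assumed cascade link: append the prediction to the raw inputs.
        Z = np.column_stack([X, pred])
    return models

def predict_cascade(models, X):
    """Run the trained levels in order and return the last level's prediction."""
    Z = X
    for mu, sigma, w, b in models:
        pred = ((quadratic_expand(Z) - mu) / sigma) @ w + b
        Z = np.column_stack([X, pred])
    return pred
```

With two input variables, the second level in this sketch works with 9 terms, yet squaring the first level's degree-2 prediction gives it access to degree-4 behaviour; a full degree-4 polynomial in two variables would need 14 non-constant terms. Calling `fit_cascade(X, y, levels=2)` followed by `predict_cascade(models, X)` is enough to try it on a small array.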