
Adv. Artif. Intell. Mach. Learn.: Latest Articles

FishRecGAN: An End to End GAN Based Network for Fisheye Rectification and Calibration
Pub Date : 2023-05-09 DOI: 10.48550/arXiv.2305.05222
Xin Shen, Kyungdon Joo, Jean Oh
We propose an end-to-end deep learning approach to rectify fisheye images and simultaneously calibrate camera intrinsic and distortion parameters. Our method consists of two parts: a Quick Image Rectification Module developed with a Pix2Pix GAN and Wasserstein GAN (W-Pix2PixGAN), and a Calibration Module with a CNN architecture. Our Quick Rectification Network performs robust rectification with good resolution, making it suitable for constant calibration in camera-based surveillance equipment. To achieve high-quality calibration, we use the straightened output from the Quick Rectification Module as a guidance-like semantic feature map for the Calibration Module to learn the geometric relationship between the straightened feature and the distorted feature. We train and validate our method with a large synthesized dataset labeled with well-simulated parameters applied to a perspective image dataset. Our solution has achieved robust performance at high resolution with a significant PSNR value of 22.343.
Citations: 8
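The FishRecGAN abstract reports its rectification quality as a PSNR value. As a reminder of what that number measures, here is a minimal sketch of the standard PSNR definition, 10 * log10(MAX^2 / MSE). The function name `psnr`, the flat-list image representation, and the 8-bit default `max_value=255.0` are assumptions for illustration, not details taken from the paper.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images.

    Both images are flat sequences of pixel intensities (an assumption
    made to keep the sketch dependency-free).
    """
    if len(reference) != len(restored) or len(reference) == 0:
        raise ValueError("images must be non-empty and the same size")
    # Mean squared error over all pixels.
    mse = sum((r - s) ** 2 for r, s in zip(reference, restored)) / len(reference)
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Higher values mean the rectified image is closer to the ground-truth perspective image; the value depends on the intensity range assumed for `max_value`.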
Should ChatGPT and Bard Share Revenue with Their Data Providers? A New Business Model for the AI Era
Pub Date : 2023-05-04 DOI: 10.48550/arXiv.2305.02555
Dong Zhang
With various AI tools such as ChatGPT becoming increasingly convenient and popular, we are entering a true AI era. We can foresee that exceptional AI tools will soon reap considerable profits. A crucial question arises: should AI tools share revenue with their training data providers in addition to traditional stakeholders and shareholders? The answer is Yes. Large AI tools, such as large language models, always require more and better-quality data to continuously improve, but current copyright laws limit their access to various types of data. Sharing revenue between AI tools and their data providers could transform the current hostile zero-sum game relationship between AI tools and a majority of copyrighted data owners into a collaborative and mutually beneficial one, which is necessary to facilitate the development of a virtuous cycle among AI tools, their users and data providers that drives forward AI technology and builds a healthy AI ecosystem. However, current revenue-sharing business models do not work for AI tools in the forthcoming AI era, since the most widely used metrics for website-based traffic and action, such as clicks, will be replaced by new metrics such as prompts and cost per prompt for generative AI tools. Therefore, a completely new revenue-sharing business model must be established. This new business model, which must be independent of AI tools and be easily explained to data providers, needs to establish a prompt-based scoring system to measure data engagement of each data provider. This paper systematically discusses how to build such a scoring system for all data providers for AI tools based on classification and content similarity models, and outlines the requirements for AI tools or third parties to build it. AI tools can share revenue with data providers using such a scoring system, which would encourage more data owners to participate in the revenue-sharing program. This will be a utilitarian AI era where all parties benefit.
Citations: 0
Structural Vibration Signal Denoising Using Stacking Ensemble of Hybrid CNN-RNN
Pub Date : 2023-03-11 DOI: 10.54364/AAIML.2023.1165
Youzhi Liang, Wen-Chieh Liang, Jianguo Jia
Vibration signals have been increasingly utilized in various engineering fields for analysis and monitoring purposes, including structural health monitoring, fault diagnosis and damage detection, where vibration signals can provide valuable information about the condition and integrity of structures. In recent years, there has been a growing trend towards the use of vibration signals in the field of bioengineering. Activity-induced structural vibrations, particularly footstep-induced signals, are useful for analyzing the movement of biological systems such as the human body and animals, providing valuable information regarding an individual’s gait, body mass, and posture, making them an attractive tool for health monitoring, security, and human-computer interaction. However, the presence of various types of noise can compromise the accuracy of footstep-induced signal analysis. In this paper, we propose a novel ensemble model that leverages both the ensemble of multiple signals and of recurrent and convolutional neural network predictions. The proposed model consists of three stages: preprocessing, hybrid modeling, and ensemble. In the preprocessing stage, features are extracted using the Fast Fourier Transform and wavelet transform to capture the underlying physics-governed dynamics of the system and extract spatial and temporal features. In the hybrid modeling stage, a bi-directional LSTM is used to denoise the noisy signal concatenated with FFT results, and a CNN is used to obtain a condensed feature representation of the signal. In the ensemble stage, three layers of a fully-connected neural network are used to produce the final denoised signal. The proposed model addresses the challenges associated with structural vibration signals and outperforms the prevailing algorithms over a wide range of noise levels, as evaluated using PSNR, SNR, and WMAPE.
Citations: 3
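The denoising abstract evaluates its model with PSNR, SNR, and WMAPE. The sketch below shows commonly used definitions of the latter two for a clean signal and a denoised estimate. The function names `snr_db` and `wmape` and the exact formulas are assumptions; the paper may normalize these metrics differently.

```python
import math


def snr_db(clean, denoised):
    """Signal-to-noise ratio in dB, treating (clean - denoised) as residual noise."""
    signal_power = sum(c * c for c in clean)
    noise_power = sum((c - d) ** 2 for c, d in zip(clean, denoised))
    if signal_power == 0:
        raise ValueError("clean signal has zero power")
    if noise_power == 0:
        return math.inf  # perfect reconstruction
    return 10.0 * math.log10(signal_power / noise_power)


def wmape(clean, denoised):
    """Weighted mean absolute percentage error: sum|error| / sum|clean|."""
    total = sum(abs(c) for c in clean)
    if total == 0:
        raise ValueError("clean signal sums to zero magnitude")
    return sum(abs(c - d) for c, d in zip(clean, denoised)) / total
```

A better denoiser raises `snr_db` and lowers `wmape`; unlike plain MAPE, WMAPE stays well defined when individual samples of the clean signal are zero.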
A Comparison of Methods for Neural Network Aggregation
Pub Date : 2023-03-06 DOI: 10.48550/arXiv.2303.03488
John Pomerat, Aviv Segev
Deep learning has been successful in theory. For deep learning to succeed in industry, we need algorithms capable of handling the many inconsistencies that appear in real data. These inconsistencies can have large effects on the implementation of a deep learning algorithm. Artificial Intelligence is currently changing the medical industry. However, receiving authorization to use medical data for training machine learning algorithms is a huge hurdle. A possible solution is sharing the data without sharing the patient information. We propose a multi-party computation protocol for the deep learning algorithm. The protocol preserves both the privacy and the security of the training data. Three approaches to neural network assembly are analyzed: transfer learning, average ensemble learning, and series network learning. The results are compared to approaches based on data-sharing in different experiments. We analyze the security issues of the proposed protocol. Although the analysis is based on medical data, the results of multi-party computation of machine learning training are theoretical and can be implemented in multiple research areas.
Citations: 0
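Of the three aggregation approaches named in the abstract above, average ensemble learning is the simplest to illustrate. The sketch below averages class-probability vectors produced by separately trained models and picks the most likely class. It is an assumption that each party contributes only its model's output probabilities; the function name `average_ensemble` is hypothetical and the paper's protocol may combine networks differently.

```python
def average_ensemble(predictions):
    """Average per-model class probabilities and return (mean_probs, label).

    `predictions` is a list with one probability vector per model,
    all for the same input sample.
    """
    if not predictions:
        raise ValueError("need at least one model prediction")
    n_classes = len(predictions[0])
    if any(len(p) != n_classes for p in predictions):
        raise ValueError("all models must predict the same number of classes")
    # Element-wise mean across models.
    mean_probs = [
        sum(p[c] for p in predictions) / len(predictions)
        for c in range(n_classes)
    ]
    # Index of the class with the highest averaged probability.
    label = max(range(n_classes), key=mean_probs.__getitem__)
    return mean_probs, label
```

For example, three models voting `[0.6, 0.4]`, `[0.2, 0.8]`, and `[0.3, 0.7]` yield class 1, even though the first model alone would have chosen class 0.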
One-class Damage Detector Using Deeper Fully Convolutional Data Descriptions for Civil Application
Pub Date : 2023-03-03 DOI: 10.54364/aaiml.2023.1159
Takato Yasuno, M. Okano, Junichiro Fujii
Infrastructure managers must maintain high standards to ensure user satisfaction during the lifecycle of infrastructures. Surveillance cameras and visual inspections have enabled progress in automating the detection of anomalous features and assessing the occurrence of deterioration. However, collecting damage data is typically time-consuming and requires repeated inspections. The one-class damage detection approach has an advantage in that normal images can be used to optimize model parameters. Additionally, visual evaluation of heatmaps enables us to understand localized anomalous features. The authors highlight damage vision applications utilized in the robust property and localized damage explainability. First, we propose a civil-purpose application for automating one-class damage detection, reproducing a Fully Convolutional Data Description (FCDD) as a baseline model. We have obtained accurate and explainable results in experimental studies on concrete damage and steel corrosion in civil engineering. Additionally, to develop a more robust application, we applied our method to another outdoor domain that contains complex and noisy backgrounds, using natural disaster datasets collected with various devices. Furthermore, we propose a valuable solution of deeper FCDDs focusing on other powerful backbones to improve the performance of damage detection and implement ablation studies on disaster datasets. The key results indicate that the deeper FCDDs outperformed the baseline FCDD on datasets representing natural disaster damage caused by hurricanes, typhoons, earthquakes, and four-event disasters.
Citations: 5
Evolutionary Augmentation Policy Optimization for Self-supervised Learning
Pub Date : 2023-03-02 DOI: 10.48550/arXiv.2303.01584
Noah Barrett, Zahra Sadeghi, S. Matwin
Self-supervised Learning (SSL) is a machine learning algorithm for pretraining Deep Neural Networks (DNNs) without requiring manually labeled data. The central idea of this learning technique is based on an auxiliary stage, also known as a pretext task, in which labeled data are created automatically through data augmentation and exploited for pretraining the DNN. However, the effect of each pretext task is not well studied or compared in the literature. In this paper, we study the contribution of augmentation operators to the performance of self-supervised learning algorithms in constrained settings. We propose an evolutionary search method for optimization of the data augmentation pipeline in pretext tasks and measure the impact of augmentation operators in several SOTA SSL algorithms. By encoding different combinations of augmentation operators in chromosomes, we seek the optimal augmentation policies through an evolutionary optimization mechanism. We further introduce methods for analyzing and explaining the performance of optimized SSL algorithms. Our results indicate that our proposed method can find solutions that improve on the classification accuracy of SSL algorithms, which confirms the influence of augmentation policy choice on the overall performance of SSL algorithms. We also compare optimal SSL solutions found by our evolutionary search mechanism and show the effect of batch size in the pretext task on two visual datasets.
Citations: 0
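The abstract above describes encoding combinations of augmentation operators in chromosomes and searching them with an evolutionary mechanism. The following is a minimal sketch of that general idea only: the operator names, the helper names, the hyperparameters, and especially `toy_fitness` are hypothetical stand-ins. In the setting the abstract describes, fitness would come from the downstream accuracy of an SSL model pretrained with the candidate policy, which is far too costly to reproduce here.

```python
import random

# Hypothetical operator vocabulary; the paper's actual operator set is not given here.
OPERATORS = ["crop", "flip", "color_jitter", "grayscale", "blur", "rotate"]


def random_chromosome(rng, length):
    """A chromosome is an ordered list of augmentation operator names."""
    return [rng.choice(OPERATORS) for _ in range(length)]


def mutate(rng, chromosome, rate):
    """Replace each gene with a random operator with probability `rate`."""
    return [rng.choice(OPERATORS) if rng.random() < rate else gene
            for gene in chromosome]


def crossover(rng, a, b):
    """Single-point crossover of two equal-length chromosomes (length >= 2)."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]


def evolve(fitness, length=3, population_size=12, generations=20,
           mutation_rate=0.2, seed=0):
    """Return the fittest chromosome found by a simple elitist search."""
    rng = random.Random(seed)
    population = [random_chromosome(rng, length) for _ in range(population_size)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: population_size // 2]  # keep the fitter half as parents
        children = [ranked[0]]  # elitism: the current best always survives
        while len(children) < population_size:
            a, b = rng.sample(parents, 2)
            children.append(mutate(rng, crossover(rng, a, b), mutation_rate))
        population = children
    return max(population, key=fitness)


def toy_fitness(chromosome):
    """Placeholder score: rewards operator diversity and the presence of 'crop'."""
    return len(set(chromosome)) + (1 if "crop" in chromosome else 0)
```

Calling `evolve(toy_fitness)` returns a three-operator policy. Because of elitism, the best score never decreases from one generation to the next, which matters when each fitness evaluation is as expensive as a pretraining run.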
EuclidNet: Deep Visual Reasoning for Constructible Problems in Geometry
Pub Date : 2022-12-27 DOI: 10.54364/aaiml.2023.1152
M. Wong, Xintong Qi, C. Tan
In this paper, we present a visual reasoning framework driven by deep learning for solving constructible problems in geometry that is useful for automated geometry theorem proving. Constructible problems in geometry often ask for the sequence of straightedge-and-compass constructions needed to construct a given goal from some initial setup. Our EuclidNet framework leverages the neural network architecture Mask R-CNN to extract the visual features from the initial setup and goal configuration with extra points of intersection, and then generates possible construction steps as intermediary data models that are used as feedback in the training process for further refinement of the construction step sequence. This process is repeated recursively until either a solution is found, in which case we backtrack the path for a step-by-step construction guide, or the problem is identified as unsolvable. Our EuclidNet framework is validated on the Euclidea problem set, with an average accuracy of 75% without prior knowledge, and on complex Japanese Sangaku geometry problems, demonstrating its capacity to leverage backtracking for deep visual reasoning of challenging problems.
Citations: 1
Predicting Survival of Tongue Cancer Patients by Machine Learning Models
Pub Date : 2022-12-23 DOI: 10.48550/arXiv.2212.12114
Angelos Vasilopoulos, N. Xi
Tongue cancer is a common oral cavity malignancy that originates in the mouth and throat. Much effort has been invested in improving its diagnosis, treatment, and management. Surgical removal, chemotherapy, and radiation therapy remain the major treatments for tongue cancer. The treatment effect is determined by patients’ survival status. Previous studies have identified certain survival and risk factors based on descriptive statistics, ignoring the complex, nonlinear relationship among clinical and demographic variables. In this study, we utilize five cutting-edge machine learning models and clinical data to predict the survival of tongue cancer patients after treatment. Five-fold cross-validation, bootstrap analysis, and permutation feature importance are applied to estimate and interpret model performance. The prognostic factors identified by our method are consistent with previous clinical studies. Our method is accurate, interpretable, and thus usable as additional evidence in tongue cancer treatment and management.
Citations: 0
Prediction of Auto Insurance Risk Based on t-SNE Dimensionality Reduction
Pub Date : 2022-12-19 DOI: 10.54364/AAIML.2022.1139
J. Levitas, Konstantin Yavilberg, O. Korol, Genadi Man
Correct risk estimation of policyholders is of great significance to auto insurance companies. While the current tools used in this field have been proven in practice to be quite efficient and beneficial, we argue that there is still a lot of room for development and improvement in the auto insurance risk estimation process. To this end, we develop a framework that combines a neural network with the dimensionality reduction technique t-SNE (t-distributed stochastic neighbour embedding). This enables us to visually represent the complex structure of the risk as a two-dimensional surface, while still preserving the properties of the local region in the feature space. The obtained results, which are based on real insurance data, reveal a clear contrast between the high-risk and low-risk policyholders, and indeed improve upon the actual risk estimation performed by the insurer. Due to the visual accessibility of the portfolio in this approach, we argue that this framework could be advantageous to the auto insurer, both as a main risk prediction tool and as an additional validation stage in other approaches.
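For readers unfamiliar with t-SNE, the standard formulation can be summarized as follows. This is the textbook definition, not something taken from the abstract; the abstract does not state the perplexity, the bandwidths, or how the neural network is coupled to the embedding. Here x_i are the high-dimensional feature vectors, y_i their two-dimensional images, n the number of points, and sigma_i a per-point bandwidth.

```latex
% Neighbour affinities in the original feature space
p_{j\mid i} = \frac{\exp\!\left(-\lVert x_i - x_j\rVert^2 / 2\sigma_i^2\right)}
                   {\sum_{k \neq i} \exp\!\left(-\lVert x_i - x_k\rVert^2 / 2\sigma_i^2\right)},
\qquad
p_{ij} = \frac{p_{j\mid i} + p_{i\mid j}}{2n}

% Student-t affinities in the two-dimensional embedding
q_{ij} = \frac{\left(1 + \lVert y_i - y_j\rVert^2\right)^{-1}}
              {\sum_{k \neq l} \left(1 + \lVert y_k - y_l\rVert^2\right)^{-1}}

% Objective minimized over the embedding points y_i
C = \mathrm{KL}(P \,\Vert\, Q) = \sum_{i \neq j} p_{ij} \log \frac{p_{ij}}{q_{ij}}
```

Because the objective penalizes placing true neighbours far apart much more heavily than the reverse, the embedding tends to keep local neighbourhoods intact, which matches the abstract's remark about preserving the properties of the local region.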
Citations: 0
Can a face tell us anything about an NBA prospect? - A Deep Learning approach
Pub Date : 2022-12-13 DOI: 10.48550/arXiv.2212.06804
A. Gavros, Foteini Gavrou
Statistical analysis and modeling are becoming increasingly popular in professional sports organizations. Sophisticated methods and models of sports talent evaluation have been created for this purpose. In this research, we present a different perspective from the dominant tactic of statistical data analysis. We deploy Convolutional Neural Networks in an attempt to predict the career trajectory of newly drafted players from each draft class. We created a database consisting of about 1500 images of players from every draft class since 1990. We then divided the players into five different quality classes based on their NBA career. Next, we trained popular image classification models on our data and conducted a series of tests in an attempt to create models that will provide reliable predictions of the rookie players’ careers. The results of this study suggest that there is a potential correlation between facial characteristics and athletic talent, worthy of further investigation.
Citations: 0