Computational experiments with cellular-automata generated images reveal intrinsic limitations of convolutional neural networks on pattern recognition tasks

Weihua Lei, Cleber Zanchettin, Flávio A. O. Santos, Luís A. Nunes Amaral
{"title":"细胞自动生成图像的计算实验揭示了卷积神经网络在模式识别任务中的内在局限性","authors":"Weihua Lei, Cleber Zanchettin, Flávio A. O. Santos, Luís A. Nunes Amaral","doi":"10.1063/5.0213905","DOIUrl":null,"url":null,"abstract":"The extraordinary success of convolutional neural networks (CNNs) in various computer vision tasks has revitalized the field of artificial intelligence. The out-sized expectations created by this extraordinary success have, however, been tempered by a recognition of CNNs’ fragility. Importantly, the magnitude of the problem is unclear due to a lack of rigorous benchmark datasets. Here, we propose a solution to the benchmarking problem that reveals the extent of the vulnerabilities of CNNs and of the methods used to provide interpretability to their predictions. We employ cellular automata (CA) to generate images with rigorously controllable characteristics. CA allow for the definition of both extraordinarily simple and highly complex discrete functions and allow for the generation of boundless datasets of images without repeats. In this work, we systematically investigate the fragility and interpretability of the three popular CNN architectures using CA-generated datasets. We find a sharp transition from a learnable phase to an unlearnable phase as the latent space entropy of the discrete CA functions increases. Furthermore, we demonstrate that shortcut learning is an inherent trait of CNNs. Given a dataset with an easy-to-learn and strongly predictive pattern, CNN will consistently learn the shortcut even if the pattern occurs only on a small fraction of the image. Finally, we show that widely used attribution methods aiming to add interpretability to CNN outputs are strongly CNN-architecture specific and vary widely in their ability to identify input regions of high importance to the model. Our results provide significant insight into the limitations of both CNNs and the approaches developed to add interpretability to their predictions and raise concerns about the types of tasks that should be entrusted to them.","PeriodicalId":502250,"journal":{"name":"APL Machine Learning","volume":"115 14","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-07-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Computational experiments with cellular-automata generated images reveal intrinsic limitations of convolutional neural networks on pattern recognition tasks\",\"authors\":\"Weihua Lei, Cleber Zanchettin, Flávio A. O. Santos, Luís A. Nunes Amaral\",\"doi\":\"10.1063/5.0213905\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The extraordinary success of convolutional neural networks (CNNs) in various computer vision tasks has revitalized the field of artificial intelligence. The out-sized expectations created by this extraordinary success have, however, been tempered by a recognition of CNNs’ fragility. Importantly, the magnitude of the problem is unclear due to a lack of rigorous benchmark datasets. Here, we propose a solution to the benchmarking problem that reveals the extent of the vulnerabilities of CNNs and of the methods used to provide interpretability to their predictions. We employ cellular automata (CA) to generate images with rigorously controllable characteristics. CA allow for the definition of both extraordinarily simple and highly complex discrete functions and allow for the generation of boundless datasets of images without repeats. 
In this work, we systematically investigate the fragility and interpretability of the three popular CNN architectures using CA-generated datasets. We find a sharp transition from a learnable phase to an unlearnable phase as the latent space entropy of the discrete CA functions increases. Furthermore, we demonstrate that shortcut learning is an inherent trait of CNNs. Given a dataset with an easy-to-learn and strongly predictive pattern, CNN will consistently learn the shortcut even if the pattern occurs only on a small fraction of the image. Finally, we show that widely used attribution methods aiming to add interpretability to CNN outputs are strongly CNN-architecture specific and vary widely in their ability to identify input regions of high importance to the model. Our results provide significant insight into the limitations of both CNNs and the approaches developed to add interpretability to their predictions and raise concerns about the types of tasks that should be entrusted to them.\",\"PeriodicalId\":502250,\"journal\":{\"name\":\"APL Machine Learning\",\"volume\":\"115 14\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-07-15\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"APL Machine Learning\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1063/5.0213905\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"APL Machine Learning","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1063/5.0213905","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0
Abstract
The extraordinary success of convolutional neural networks (CNNs) in various computer vision tasks has revitalized the field of artificial intelligence. The outsized expectations created by this extraordinary success have, however, been tempered by a recognition of CNNs' fragility. Importantly, the magnitude of the problem is unclear due to a lack of rigorous benchmark datasets. Here, we propose a solution to the benchmarking problem that reveals the extent of the vulnerabilities of CNNs and of the methods used to provide interpretability to their predictions. We employ cellular automata (CA) to generate images with rigorously controllable characteristics. CA allow for the definition of both extraordinarily simple and highly complex discrete functions and for the generation of boundless datasets of images without repeats. In this work, we systematically investigate the fragility and interpretability of three popular CNN architectures using CA-generated datasets. We find a sharp transition from a learnable phase to an unlearnable phase as the latent space entropy of the discrete CA functions increases. Furthermore, we demonstrate that shortcut learning is an inherent trait of CNNs. Given a dataset with an easy-to-learn and strongly predictive pattern, a CNN will consistently learn the shortcut even if the pattern occurs in only a small fraction of the image. Finally, we show that widely used attribution methods aiming to add interpretability to CNN outputs are strongly CNN-architecture-specific and vary widely in their ability to identify input regions of high importance to the model. Our results provide significant insight into the limitations of both CNNs and the approaches developed to add interpretability to their predictions, and raise concerns about the types of tasks that should be entrusted to them.
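For readers unfamiliar with the data-generation idea mentioned in the abstract, the sketch below shows how a one-dimensional elementary cellular automaton can be iterated to produce a binary image, in the spirit of the CA-generated datasets described above. It is a minimal illustration, not the authors' actual pipeline: the function names, the restriction to elementary (Wolfram) rules, and the entropy proxy are assumptions, since the paper's generators and its precise definition of latent space entropy are not specified in the abstract.

```python
import numpy as np

def elementary_ca_image(rule: int, width: int = 64, height: int = 64, seed: int = 0) -> np.ndarray:
    """Generate a binary image by iterating a 1D elementary cellular automaton.

    Each image row is one time step; the first row is random. `rule` is the
    Wolfram rule number (0-255) encoding the 3-cell update table.
    """
    rng = np.random.default_rng(seed)
    # Lookup table: output bit for each of the 8 possible 3-cell neighborhoods.
    table = np.array([(rule >> v) & 1 for v in range(8)], dtype=np.uint8)

    image = np.zeros((height, width), dtype=np.uint8)
    image[0] = rng.integers(0, 2, size=width, dtype=np.uint8)
    for t in range(1, height):
        prev = image[t - 1]
        left, right = np.roll(prev, 1), np.roll(prev, -1)  # periodic boundaries
        neighborhood = (left << 2) | (prev << 1) | right   # encode as value 0..7
        image[t] = table[neighborhood]
    return image

def rule_output_entropy(rule: int) -> float:
    """Shannon entropy (bits) of the rule's output bit over the 8 neighborhoods.

    Only a crude proxy for the 'latent space entropy' of a CA function; the
    paper's exact definition is not given in the abstract.
    """
    p1 = np.array([(rule >> v) & 1 for v in range(8)]).mean()
    if p1 in (0.0, 1.0):
        return 0.0
    return float(-(p1 * np.log2(p1) + (1 - p1) * np.log2(1 - p1)))

if __name__ == "__main__":
    img = elementary_ca_image(rule=110)
    print(img.shape, rule_output_entropy(110))
```

Sweeping the rule number (and the random seed for the initial row) yields an effectively unlimited supply of distinct labeled images whose generating function, and hence difficulty, is controlled exactly, which is the property that makes CA attractive as a rigorously controllable benchmark.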