No Saved Kaleidoscope: a 100% Jitted Neural Network Coding Language with Pythonic Syntax
Augusto Seben da Rosa, Marlon Daniel Angeli, Jorge Aikes Junior, Alef Iury Ferreira, Lucas Rafael Gris, Anderson da Silva Soares, Arnaldo Candido Junior, Frederico Santos de Oliveira, Gabriel Trevisan Damke, Rafael Teixeira Sousa
arXiv - CS - Programming Languages, arXiv:2409.11600, published 2024-09-17
Abstract
We developed a JIT-compiled language for training artificial neural networks, built with C++, LLVM, and CUDA. It features object-oriented characteristics, strong typing, parallel workers for data pre-processing, Pythonic syntax for expressions, PyTorch-like model declaration, and automatic differentiation. To manage VRAM, we implement caching and pooling mechanisms, and we use cuBLAS for high-performance matrix multiplication and cuDNN for convolutional layers. In our experiments with residual convolutional neural networks on ImageNet, we reach similar speed but degraded accuracy. The GRU network experiments show similar accuracy, but our compiler has degraded speed on that task. However, our compiler demonstrates promising results on the CIFAR-10 benchmark, where we reach the same accuracy and about the same speed as PyTorch. We make the code publicly available at: https://github.com/NoSavedDATA/NoSavedKaleidoscope