System Design for an Integrated Lifelong Reinforcement Learning Agent for Real-Time Strategy Games
Indranil Sur, Zachary A. Daniels, Abrar Rahman, Kamil Faber, Gianmarco J. Gallardo, Tyler L. Hayes, Cameron Taylor, Mustafa Burak Gurbuz, James Smith, Sahana P Joshi, N. Japkowicz, Michael Baron, Z. Kira, Christopher Kanan, Roberto Corizzo, Ajay Divakaran, M. Piacentino, Jesse Hostetler, Aswin Raghavan
DOI: https://doi.org/10.1145/3564121.3565236

As artificial and robotic systems are increasingly deployed and relied upon for real-world applications, it is important that they exhibit the ability to continually learn and adapt in dynamically changing environments, becoming Lifelong Learning Machines. Continual/lifelong learning (LL) involves minimizing catastrophic forgetting of old tasks while maximizing a model's capability to learn new tasks. This paper addresses the challenging lifelong reinforcement learning (L2RL) setting. Pushing the state of the art forward in L2RL and making L2RL useful for practical applications requires more than developing individual L2RL algorithms; it requires progress at the systems level, especially on the non-trivial problem of integrating multiple L2RL algorithms into a common framework. In this paper, we introduce the Lifelong Reinforcement Learning Components Framework (L2RLCF), which standardizes L2RL systems and assimilates different continual learning components (each addressing a different aspect of the lifelong learning problem) into a unified system. As an instantiation of L2RLCF, we develop a standard API allowing easy integration of novel lifelong learning components. We describe a case study demonstrating how multiple independently developed LL components can be integrated into a single realized system. We also introduce an evaluation environment to measure the effect of combining various system components. It employs different LL scenarios (sequences of tasks) consisting of StarCraft II minigames and allows for fair, comprehensive, and quantitative comparison of different combinations of components within a challenging common setting.
Link-Adaptation for Improved Quality-of-Service in V2V Communication using Reinforcement Learning
Serene Banerjee, Joy Bose, Sleeba Paul Puthepurakel, Pratyush Kiran Uppuluri, Subhadip Bandyopadhyay, Y. S. K. Reddy, Ranjani H. G.
DOI: https://doi.org/10.1145/3564121.3564122

For autonomous driving, safer travel, and fleet management, Vehicle-to-Vehicle (V2V) communication protocols are an emerging area of research and development. State-of-the-art techniques include machine learning (ML) and reinforcement learning (RL) to adapt modulation and coding rates as the vehicle moves. However, channel state estimates are often incorrect and change rapidly in a V2V scenario. We propose a combination of input features, including (a) sensor inputs from other parameters in the vehicle, such as speed and global positioning system (GPS) data, (b) estimates of interference and load for each vehicle, and (c) channel state estimates, to find the optimal rate that maximizes Quality-of-Service. Our model uses an ensemble of RL agents to predict trends in the input parameters and to find the interdependencies among them. An RL agent then uses these inputs to select the best modulation and coding rate as the vehicle moves. We demonstrate our results through prototype experiments using real data collected from customer networks.
Performance Evaluation of gcForest inferencing on multi-core CPU and FPGA
P. Manavar, Sharyu Vijay Mukhekar, M. Nambiar
DOI: https://doi.org/10.1145/3564121.3564797

Decision forests have proved useful in machine learning tasks. gcForest is a model that leverages ensembles of decision forests for classification. It combines several decision forests in a layered architecture in such a way that it has been shown to give results competitive with convolutional neural networks. This paper analyzes the performance of a gcForest model, trained on the MNIST digit-classification data set, on a multi-core CPU-based system. Using a performance-model-based approach, it also presents an analysis of performance on a well-endowed FPGA accelerator card for the same model. We conclude that the multi-core CPU system can deliver more throughput than the FPGA under batched workloads, while the FPGA offers lower latency for a single inference. We also analyze the scalability of the gcForest model on the multi-core server system and, with the help of experiments and models, uncover ways to improve it.
Modeling Email Server I/O Events As Multi-temporal Point Processes
Vinayaka Kamath, Evan Sinclair, Damon Gilkerson, V. Padmanabhan, Sreangsu Acharyya
DOI: https://doi.org/10.1145/3564121.3564129

We model the read workload experienced by an email server as a superposition of reads performed by different software clients at non-deterministic times, each modeled as a dependent point process. The probability of a read event occurring on an email is affected, among other factors, by the age of the email and the time of the recipient's day. Unlike the more commonly encountered variants of point processes (the one-dimensional temporal, or the multi-dimensional spatial or spatio-temporal), the dependence between the different temporal axes, age and time of day, is incorporated by a point process defined over a non-Euclidean manifold. The model captures the diverse patterns exhibited by the different clients: for example, the influence of the age of an email, the time of the user's day, recent reads by the same or different clients, and whether the client is controlled directly by the user, is a software agent acting semi-autonomously on the user's behalf, or is a server-side batch job that attempts to avoid adverse impact on the user's latency experience. We show how estimating this point process can be mapped to Poisson regression, thereby saving the time to implement custom model-training software.
Ensembling Deep Learning And CIELAB Color Space Model for Fire Detection from UAV images
Yash Jain, Vishu Saxena, Sparsh Mittal
DOI: https://doi.org/10.1145/3564121.3564130

Wildfires can cause significant damage to forests and endanger wildlife. Detecting forest fires at an early stage helps the authorities prevent them from spreading further. In this paper, we first propose a novel technique, termed the CIELAB-color technique, which detects fire based on the color of the fire in CIELAB color space. We also train state-of-the-art CNNs to detect fire. Since deep learning (CNNs) and image processing have complementary strengths, we combine them in an ensemble architecture: it uses two CNNs and the CIELAB-color technique and performs majority voting to decide the final fire/no-fire prediction. We finally propose a chain-of-classifiers technique, which first tests an image using the CIELAB-color technique; if the image is flagged as no-fire, it is further checked using a CNN. This technique has a lower model size than the ensemble technique. On the FLAME dataset, the ensemble technique achieves 93.32% accuracy, outperforming both previous works and either CNNs or the CIELAB-color technique used individually. The source code can be obtained from https://github.com/CandleLabAI/FireDetection.
An hardware accelerator design of Mobile-Net model on FPGA
Sanjaya M V, M. Rao
DOI: https://doi.org/10.1145/3564121.3564124

Domain-specific hardware architectures and hardware accelerators have become a vital part of modern system design. Especially for math-intensive applications involving machine-perception tasks, hardware accelerators working in tandem with general-purpose microprocessors can prove energy-efficient in both server and edge scenarios. FPGAs, owing to their reconfigurability, make it possible to design customized hardware matched to the computational and memory requirements of a specific application. This work proposes an optimized low-latency hardware accelerator implementation of the MobileNetV2 CNN on an FPGA: MobileNetV2 inference on a Xilinx UltraScale+ MPSoC platform using solely half-precision floating-point arithmetic for both the parameters and the activations of the network. The implementation is further optimized by merging every batch-norm layer into its preceding convolutional layer. For applications that cannot compromise algorithm performance for execution speed and efficiency, an optimized floating-point inference is proposed. The implementation offers an overall performance improvement of at least 20X over inference on the processor alone, with moderate resource utilization, minimal variance in inference latency, and almost no degradation in model accuracy.
Address Location Correction System for Q-commerce
Y. Reddy, Sumanth Sadu, A. Ganesan, Jose Mathew
DOI: https://doi.org/10.1145/3564121.3564800

Hyperlocal e-commerce companies in India deliver food and groceries in roughly 20-40 minutes, and more recently some companies have targeted sub-ten-minute deliveries. Such "instant" delivery platforms, referred to as quick (q-)commerce, onboard the GPS locations of customer addresses along with their text addresses to enable Delivery Partners (DPs) to navigate to customer locations seamlessly. Inaccurate GPS locations lead to breached delivery-time promises and order cancellations, because DPs may not find the address easily or may not even get close to it. As a first step towards correcting these inaccurate locations, we design a classifier that identifies whether a captured GPS location is incorrect, using the text address. The classifier is trained in a self-supervised manner. We propose two strategies to generate the training set: one based on location perturbation using Gaussian noise, and another based on swapping pairs of addresses in a dataset with accurate address locations. An ensemble of the outputs of models trained on these two datasets gives 84.5% precision and 49% recall in a large Indian city on our internal test set.
A Hybrid Planning System for Smart Charging of Electric Fleets
Kshitij Garg, A. Narayanan, P. Misra, Arunchandar Vasan, Vivek Bandhu, Debarupa Das
DOI: https://doi.org/10.1145/3564121.3564125

Electric vehicle (EV) fleets are well suited for last-mile deliveries from both sustainability and operational-cost perspectives. To ensure economic parity with non-EV options, even captive chargers for EV fleets need to be managed intelligently. Specifically, the EVs need to be adequately charged for their entire delivery runs while handling reduced time flexibility between runs, a limited number of chargers, and deviations from the planned schedule. Existing works either solve smaller instances of this problem optimally or larger instances with significant sub-optimality. In addition, they typically consider either day-ahead or real-time planning in isolation. We complement existing works with a hybrid approach that first identifies a day-ahead plan for assigning EVs to chargers, and then uses online replanning to handle any deviations in real time. For the day-ahead planning, we use a learning agent (LA) that learns to assign EVs to chargers over several problem instances. Because the agent solves a given instance during its testing phase, it scales in problem size with limited sub-optimality. For the online replanning, we use a greedy heuristic that dynamically refines the day-ahead plan to handle delays in EV arrivals. We evaluate our approach using representative datasets. As baselines for the LA, we use an exact mixed-integer linear program (MILP) for small problem instances and a greedy heuristic for large ones. As baselines for the replanning, we use no-planning and no-replanning. Our experiments show that the LA performs 8.5-14% better than the greedy heuristic on large problem instances, while remaining reasonably close (within 22%) to the optimal on smaller instances. For online replanning, our approach performs about 7-20% better than no-planning and no-replanning across a range of delay profiles.
Acceleration-aware, Retraining-free Evolutionary Pruning for Automated Fitment of Deep Learning Models on Edge Devices
Jeet Dutta, Swarnava Dey, Arijit Mukherjee, Arpan Pal
DOI: https://doi.org/10.1145/3564121.3564133

Deep learning architectures used in computer vision, natural language and speech processing, unsupervised clustering, and other areas have become highly complex and application-specific in recent times. Despite existing automated feature-engineering techniques, building such complex models still requires extensive domain knowledge or huge infrastructure for techniques such as Neural Architecture Search (NAS). Further, many industrial applications need on-premises decision-making close to the sensors, making deployment of deep learning models on edge devices a desirable and often necessary option. Instead of designing application-specific deep learning models from scratch, transforming already-built models can achieve faster time to market and lower cost. In this work, we present an efficient retraining-free model compression method that searches for the best hyper-parameters to reduce model size and latency without losing accuracy. Moreover, the proposed method accounts for any drop in accuracy due to hardware acceleration when a deep neural network is executed on accelerator hardware.
How Provenance helps Quality Assurance Activities in AI/ML Systems
Takao Nakagawa, Kenichiro Narita, Kyoung-Sook Kim
DOI: https://doi.org/10.1145/3564121.3564801

Quality assurance is required for the wide use of artificial intelligence (AI) systems in industry and society, including mission-critical areas such as the medical and disaster-management domains. However, quality evaluation methods for machine learning (ML) components, especially deep neural networks, have not yet been established. In addition, various metrics are applied by evaluators with different quality requirements and testing environments, from data collection through experimentation to deployment. In this paper, we propose a quality provenance model, AIQPROV, to record who evaluated quality, when, from which viewpoint, and how the evaluation was used. The AIQPROV model focuses on human activities, since quality assurance is a field that requires human intervention. Moreover, we present an extension of the W3C PROV framework and construct a database to store the provenance information of the quality assurance lifecycle, validating our model with 11 use cases.