Learning to infer: RL-based search for DNN primitive selection on Heterogeneous Embedded Systems

de Prado, Miguel; Pazos Escudero, Nuria; Benini, Luca

doi:10.23919/DATE.2019.8714959

2019

Abstract

Deep Learning is increasingly being adopted by industry for computer vision applications running on embedded devices. While the accuracy of Convolutional Neural Networks (CNNs) has reached a mature and remarkable state, inference latency and throughput remain a major concern, especially when targeting low-cost and low-power embedded platforms. CNN inference latency may become a bottleneck for Deep Learning adoption by industry, as it is a crucial specification for many real-time processes. Furthermore, deployment of CNNs across heterogeneous platforms presents major compatibility issues due to vendor-specific technology and acceleration libraries. In this work, we present QS-DNN, a fully automatic search based on Reinforcement Learning which, combined with an inference engine optimizer, efficiently explores the design space and empirically finds the optimal combinations of libraries and primitives to speed up the inference of CNNs on heterogeneous embedded devices. We show that an optimized combination can achieve a 45x speedup in inference latency on CPU compared to a dependency-free baseline, and 2x on average on GPGPU compared to the best vendor library. Further, we demonstrate that the quality of results and the time-to-solution are much better than with Random Search, with up to 15x better results for a short-time search.
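
The kind of search the abstract describes can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes plain tabular Q-learning with an epsilon-greedy policy, a state made of the layer index and the previously chosen primitive, and a reward equal to the negative layer latency. The latency table, the names `lib_a`, `lib_b`, `lib_c`, and the `SWITCH_PENALTY` for changing library between consecutive layers are all hypothetical values chosen for illustration, not measurements from the paper.

```python
import random

# Hypothetical per-layer latencies (ms) for three made-up primitives.
# These numbers are illustrative assumptions, not data from the paper.
LATENCY = [
    {"lib_a": 5.0, "lib_b": 3.0, "lib_c": 4.5},
    {"lib_a": 2.0, "lib_b": 6.0, "lib_c": 2.5},
    {"lib_a": 4.0, "lib_b": 1.0, "lib_c": 3.0},
]
# Assumed extra cost (ms) when consecutive layers use different libraries,
# e.g. for a data-layout conversion.
SWITCH_PENALTY = 1.5


def total_latency(choices):
    """Sum layer latencies plus a penalty for each library switch."""
    total = 0.0
    for i, prim in enumerate(choices):
        total += LATENCY[i][prim]
        if i > 0 and choices[i - 1] != prim:
            total += SWITCH_PENALTY
    return total


def q_search(episodes=2000, alpha=0.1, gamma=1.0,
             eps_start=1.0, eps_end=0.05, seed=0):
    """Tabular Q-learning over per-layer primitive choices.

    Returns the best configuration seen during the search and its latency.
    """
    rng = random.Random(seed)
    prims = list(LATENCY[0])
    q = {}  # maps ((layer, previous primitive), action) -> estimated value
    best, best_lat = None, float("inf")
    for ep in range(episodes):
        # Linearly decay exploration from eps_start to eps_end.
        eps = eps_start + (eps_end - eps_start) * ep / max(1, episodes - 1)
        prev, choices = None, []
        for layer in range(len(LATENCY)):
            state = (layer, prev)
            if rng.random() < eps:
                action = rng.choice(prims)  # explore
            else:
                action = max(prims, key=lambda p: q.get((state, p), 0.0))
            switched = prev is not None and prev != action
            reward = -(LATENCY[layer][action]
                       + (SWITCH_PENALTY if switched else 0.0))
            # Value of the best follow-up action, zero after the last layer.
            if layer + 1 < len(LATENCY):
                nxt = (layer + 1, action)
                future = max(q.get((nxt, p), 0.0) for p in prims)
            else:
                future = 0.0
            old = q.get((state, action), 0.0)
            q[(state, action)] = old + alpha * (reward + gamma * future - old)
            prev = action
            choices.append(action)
        lat = total_latency(choices)
        if lat < best_lat:
            best, best_lat = choices, lat
    return best, best_lat


if __name__ == "__main__":
    config, latency = q_search()
    print(config, latency)
```

On a three-layer toy table exhaustive enumeration is trivial, so the sketch only shows the shape of the loop. In a real setting the lookup in `LATENCY` would be replaced by timing each candidate primitive on the target device.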

Details

Title Learning to infer: RL-based search for DNN primitive selection on Heterogeneous Embedded Systems

Author(s) de Prado, Miguel (ETH Zürich, Zürich, Switzerland)
Pazos Escudero, Nuria (School of Engineering – HE-Arc Ingénierie, HES-SO University of Applied Sciences Western Switzerland)
Benini, Luca (ETH Zürich, Zürich, Switzerland)

Date 2019-03

Published in Proceedings of 2019 Design, Automation & Test in Europe Conference & Exhibition (DATE), 25-29 March 2019, Florence, Italy

Publisher Florence, Italy, 25-29 March 2019

Pagination pp. 1409-1414

Presented at 2019 Design, Automation & Test in Europe Conference & Exhibition (DATE), Florence, Italy, 2019-03-25, 2019-03-29

ISBN 978-3-9819263-2-3

DOI https://doi.org/10.23919/DATE.2019.8714959

Keywords (free) libraries ; acceleration ; engines ; reinforcement learning ; space exploration ; optimization ; computer architecture

Paper type published full paper

Domain Engineering and Architecture

School HE-Arc Ingénierie

Institute No institute

This document appears in Conference papers