Automated design space exploration for optimised deployment of DNN on Arm Cortex-A CPUs

de Prado, Miguel; Mundy, Andrew; Saeed, Rabia; Denna, Maurizio; Pazos Escudero, Nuria; Benini, Luca

doi:10.1109/TCAD.2020.3046568

2020

The spread of deep learning on embedded devices has prompted the development of numerous methods to optimise the deployment of deep neural networks (DNNs). Prior work has mainly focused on: i) efficient DNN architectures, ii) network optimisation techniques such as pruning and quantisation, iii) optimised algorithms to speed up the execution of the most computationally intensive layers, and iv) dedicated hardware to accelerate the data flow and computation. However, there is a lack of research on cross-level optimisation, as the space of approaches becomes too large to test exhaustively for a globally optimised solution. This leads to suboptimal deployment in terms of latency, accuracy, and memory. In this work, we first detail and analyse the methods to improve the deployment of DNNs across the different levels of software optimisation. Building on this knowledge, we present an automated exploration framework to ease the deployment of DNNs. The framework relies on a Reinforcement Learning search that, combined with a deep learning inference framework, automatically explores the design space and learns an optimised solution that improves performance and reduces memory use on embedded CPU platforms. We present a set of results for state-of-the-art DNNs on a range of Arm Cortex-A CPU platforms, achieving up to 4× improvement in performance and over 2× reduction in memory with negligible loss in accuracy with respect to the BLAS floating-point implementation.
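
The search described in the abstract can be illustrated with a minimal sketch. It is not the authors' framework: the primitive names, the per-layer latencies, the conversion cost `CONVERT_MS`, and the simple epsilon-greedy tabular update are all assumptions invented for illustration. It only shows the general idea of picking one implementation per layer, measuring the whole network, and comparing against an all-BLAS floating-point baseline.

```python
import random

# Hypothetical per-layer latencies (ms) for three candidate primitives.
# These numbers are invented for illustration, not measurements from the paper.
LATENCY_MS = [
    {"blas_fp32": 10.0, "winograd_fp32": 6.0, "gemm_int8": 4.0},
    {"blas_fp32": 8.0, "winograd_fp32": 9.0, "gemm_int8": 3.5},
    {"blas_fp32": 5.0, "winograd_fp32": 3.0, "gemm_int8": 4.5},
]
CONVERT_MS = 1.5  # assumed cost of switching data type between adjacent layers


def dtype(primitive):
    """Return the data type suffix of a primitive name, e.g. 'fp32'."""
    return primitive.rsplit("_", 1)[1]


def network_latency(config):
    """Total latency of one per-layer choice, including conversion costs."""
    total = sum(LATENCY_MS[i][p] for i, p in enumerate(config))
    total += sum(CONVERT_MS for a, b in zip(config, config[1:])
                 if dtype(a) != dtype(b))
    return total


def search(episodes=500, epsilon=0.2, lr=0.1, seed=0):
    """Epsilon-greedy search with a tabular value estimate per decision."""
    rng = random.Random(seed)
    q = {}  # (layer, previous primitive, primitive) -> estimated -latency
    baseline = tuple("blas_fp32" for _ in LATENCY_MS)
    best, best_latency = baseline, network_latency(baseline)
    for _ in range(episodes):
        # Roll out one full configuration, layer by layer.
        config, prev = [], None
        for layer, options in enumerate(LATENCY_MS):
            actions = sorted(options)
            if rng.random() < epsilon:
                choice = rng.choice(actions)
            else:
                choice = max(actions,
                             key=lambda a: q.get((layer, prev, a), 0.0))
            config.append(choice)
            prev = choice
        # Reward is the negative end-to-end latency of the rollout.
        latency = network_latency(config)
        prev = None
        for layer, choice in enumerate(config):
            key = (layer, prev, choice)
            old = q.get(key, 0.0)
            q[key] = old + lr * (-latency - old)
            prev = choice
        if latency < best_latency:
            best, best_latency = tuple(config), latency
    return best, best_latency


if __name__ == "__main__":
    best, latency = search()
    print(best, latency)
```

The conversion cost is what makes the choices interact across layers: the fastest primitive for a single layer is not necessarily part of the fastest network, which is why a per-layer greedy pick can be suboptimal. In a real setting the lookup table would be replaced by measurements on the target device.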

Title

Automated design space exploration for optimised deployment of DNN on Arm Cortex-A CPUs

Author(s)

de Prado, Miguel (School of Engineering – HE-Arc Ingénierie, HES-SO University of Applied Sciences Western Switzerland ; ETH Zürich, Zürich, Switzerland)
Mundy, Andrew (School of Engineering – HE-Arc Ingénierie, HES-SO University of Applied Sciences Western Switzerland ; ETH Zürich, Zürich, Switzerland)
Saeed, Rabia (School of Engineering – HE-Arc Ingénierie, HES-SO University of Applied Sciences Western Switzerland ; ETH Zürich, Zürich, Switzerland)
Denna, Maurizio (School of Engineering – HE-Arc Ingénierie, HES-SO University of Applied Sciences Western Switzerland ; ETH Zürich, Zürich, Switzerland)
Pazos Escudero, Nuria (School of Engineering – HE-Arc Ingénierie, HES-SO University of Applied Sciences Western Switzerland ; ETH Zürich, Zürich, Switzerland)
Benini, Luca (School of Engineering – HE-Arc Ingénierie, HES-SO University of Applied Sciences Western Switzerland ; ETH Zürich, Zürich, Switzerland)

Date

2020-12

Published in

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems

Volume

2021, 40

Issue

11

Pages / Article number

2293-2305

Publisher

Piscataway, NJ, USA, Institute of Electrical and Electronics Engineers (IEEE)

Page count & equivalents

14 p.

DOI

https://doi.org/10.1109/TCAD.2020.3046568

Keywords

optimization ; convolution ; neural networks ; quantization (signal) ; memory management ; software ; space exploration

Article type

scientific

Domain

Engineering and Architecture

School

HE-Arc Ingénierie

Institute

No institute

This document appears in

Published articles
Global
