QUENN : QUantization engine for low-power neural networks

De Prado, Miguel; Benini, Luca; Denna, Maurizio; Pazos Escudero, Nuria

doi:10.1145/3203217.3203282

2018

Abstract

Deep Learning is moving to edge devices, ushering in a new age of distributed Artificial Intelligence (AI). The high demand for computational resources required by deep neural networks may be alleviated by approximate computing techniques, most notably reduced-precision arithmetic with coarsely quantized numerical representations. In this context, Bonseyes is an initiative that enables stakeholders to bring AI to low-power and autonomous environments such as automotive, healthcare and consumer electronics. To achieve this, we introduce LPDNN, a framework for optimized deployment of Deep Neural Networks on heterogeneous embedded devices. In this work, we detail the quantization engine that is integrated in LPDNN. The engine depends on a fine-grained workflow which enables a Neural Network Design Exploration and a sensitivity analysis of each layer for quantization. We demonstrate the engine with a case study on AlexNet and VGG16 for three different techniques for direct quantization: standard fixed-point, dynamic fixed-point and k-means clustering, and show the potential of the latter. We argue that using a Gaussian quantizer with k-means clustering can achieve better performance than linear quantizers. Without retraining, we achieve savings of over 55.64% in weight storage and 69.17% in run-time memory accesses with less than a 1% drop in top-5 accuracy on ImageNet.
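
To make the three direct quantization techniques named in the abstract more concrete, here is a minimal sketch in plain Python. It is not the QUENN or LPDNN implementation: the function names, the rule for choosing the fractional length, the evenly spaced centroid initialisation and the synthetic Gaussian weights are all assumptions made for illustration.

```python
import math
import random


def fixed_point(values, total_bits, frac_bits):
    """Standard fixed-point: one signed format with a fixed fractional length."""
    scale = 2 ** frac_bits
    lo = -(2 ** (total_bits - 1))
    hi = 2 ** (total_bits - 1) - 1
    # Round to the nearest representable step and saturate at the range limits.
    return [min(max(round(v * scale), lo), hi) / scale for v in values]


def dynamic_fixed_point(values, total_bits):
    """Dynamic fixed-point: fractional length chosen per tensor (assumed rule)."""
    max_abs = max(abs(v) for v in values)
    # Integer bits needed to cover the largest magnitude; the sign bit is separate.
    int_bits = math.ceil(math.log2(max_abs)) if max_abs > 0 else 0
    return fixed_point(values, total_bits, total_bits - 1 - int_bits)


def kmeans_quantize(values, bits, iterations=20):
    """Non-linear quantization: 1-D k-means (Lloyd's algorithm), 2**bits centroids."""
    k = 2 ** bits
    lo, hi = min(values), max(values)
    # Assumption: centroids start evenly spaced over the value range.
    centroids = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    for _ in range(iterations):
        sums, counts = [0.0] * k, [0] * k
        for v in values:
            j = min(range(k), key=lambda c: abs(v - centroids[c]))
            sums[j] += v
            counts[j] += 1
        # Empty clusters keep their previous centroid.
        centroids = [sums[j] / counts[j] if counts[j] else centroids[j]
                     for j in range(k)]
    return [min(centroids, key=lambda c: abs(v - c)) for v in values]


def mse(a, b):
    """Mean squared error between two equally long sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic, roughly Gaussian "weights" (hypothetical data, not from the paper).
    weights = [rng.gauss(0.0, 0.05) for _ in range(2000)]
    for name, quantized in [
        ("standard fixed-point (4 bits, 2 fractional)", fixed_point(weights, 4, 2)),
        ("dynamic fixed-point (4 bits)", dynamic_fixed_point(weights, 4)),
        ("k-means (16 centroids)", kmeans_quantize(weights, 4)),
    ]:
        print(f"{name}: MSE = {mse(weights, quantized):.3e}")
```

On bell-shaped data such as this, the k-means centroids concentrate where most values lie, which is the intuition behind the abstract's argument for non-linear quantizers. The sketch measures only reconstruction error; it says nothing about accuracy or memory savings on real networks.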

Details

Title QUENN : QUantization engine for low-power neural networks

Author(s) De Prado, Miguel (Integrated Systems Laboratory, ETH Zürich Switzerland ; School of Engineering – HE-Arc Ingénierie, HES-SO University of Applied Sciences Western Switzerland)
Benini, Luca (Integrated Systems Laboratory, ETH Zürich Switzerland)
Denna, Maurizio (Nviso, Switzerland)
Pazos Escudero, Nuria (School of Engineering – HE-Arc Ingénierie, HES-SO University of Applied Sciences Western Switzerland)

Date 2018-05

Published in Proceedings of the 15th ACM International Conference on Computing Frontiers, 8-10 May 2018, Ischia, Italy

Publisher Ischia, Italy, 08-10 May 2018

Pagination 9 p.

Presented at Proceedings of the 15th ACM International Conference on Computing Frontiers - CF '18, Ischia {80077}, Italy, 2018-05-08, 2018-05-10

ISBN 9781450357616

DOI https://doi.org/10.1145/3203217.3203282

Paper type full paper

Field Engineering and Architecture

School HE-Arc Ingénierie

Institute No institute

The document appears in Conference papers
Global
