Networks-on-Chip based Deep Neural Networks Accelerators for IoT Edge Devices

Ascia G.; Catania V.; Monteleone S.; Palesi M.; Patti D.
2019-01-01

Abstract

The need to perform deep neural network inference on resource-constrained embedded devices (e.g., Internet of Things nodes) calls for specialized architectures that achieve the best trade-off among performance, energy, and cost. One of the most promising architectures in this context is based on massively parallel, specialized cores interconnected by means of a Network-on-Chip (NoC). In this paper, we extensively evaluate NoC-based deep neural network accelerators by exploring the design space spanned by several architectural parameters, including network size, routing algorithm, local memory size, link width, and number of memory interfaces. We show that latency is dominated mainly by on-chip communication, whereas energy consumption is accounted for mainly by memory (both on-chip and off-chip). The outcome of the analysis thus points toward a research line devoted to optimizing the on-chip communication fabric for performance and the memory subsystem for energy efficiency.
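The exploration the abstract describes can be pictured as a sweep over the Cartesian product of the listed architectural parameters, scoring each configuration for latency and energy. The sketch below is a minimal illustration of that idea only: the candidate values, the evaluate() function, and every constant in it are assumptions made here for illustration, not taken from the paper (a real study would obtain these numbers from a cycle-accurate NoC simulator).

```python
# Hypothetical design-space sweep over the parameters named in the abstract.
# All parameter values and the cost model below are illustrative assumptions.
from itertools import product

NETWORK_SIZES = [4, 8, 16]          # mesh side (e.g., 8 -> 8x8 cores)
ROUTING_ALGOS = ["XY", "odd-even"]  # deterministic vs. adaptive routing
LOCAL_MEM_KB = [32, 64, 128]        # per-core local memory size
LINK_WIDTHS = [32, 64, 128]         # NoC link width in bits
MEM_INTERFACES = [1, 2, 4]          # number of off-chip memory interfaces

def evaluate(size, routing, mem_kb, link_bits, mem_ifs):
    """Toy closed-form cost model returning (latency, energy).

    It only mimics the trends the abstract reports: latency dominated by
    on-chip communication, energy dominated by memory. The routing
    algorithm's effect is ignored in this toy model; a simulator would
    capture it.
    """
    hops = size                                 # avg hop count grows with mesh side
    latency = hops * (128.0 / link_bits)        # wider links -> fewer flits per packet
    energy = 0.5 * mem_kb + 100.0 / mem_ifs     # memory-dominated energy
    return latency, energy

# Exhaustive sweep over the Cartesian product of parameter values.
results = [
    (cfg, *evaluate(*cfg))
    for cfg in product(NETWORK_SIZES, ROUTING_ALGOS, LOCAL_MEM_KB,
                       LINK_WIDTHS, MEM_INTERFACES)
]

# Report the extreme points of the trade-off.
best_latency = min(results, key=lambda r: r[1])
best_energy = min(results, key=lambda r: r[2])
print("lowest-latency config:", best_latency)
print("lowest-energy config:", best_energy)
```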
Year: 2019
ISBN: 978-1-7281-2949-5
Keywords: Deep Neural Network accelerator; Design space exploration; Energy analysis; IoT edge devices; Network-on-Chip; Performance evaluation
Files in this item:

2019_08939236.pdf (restricted: archive administrators only)
Type: Publisher's Version (PDF)
Size: 1.53 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.11769/382933
Citations
  • PMC: ND
  • Scopus: 6
  • Web of Science: 5