The need for performing deep neural network inferences on resource-constrained embedded devices (e.g., Internet of Things nodes) requires specialized architectures to achieve the best trade-off among performance, energy, and cost. One of the most promising architectures in this context is based on massive parallel and specialized cores interconnected by means of a Network-on-Chip (NoC). In this paper, we extensively evaluate NoC-based deep neural network accelerators by exploring the design space spanned by several architectural parameters including, network size, routing algorithm, local memory size, link width, and number of memory interfaces. We show how latency is mainly dominated by the on-chip communication whereas energy consumption is mainly accounted by memory (both on-chip and off-chip). The outcome of the analysis, thus, pushes toward a research line devoted to the optimization of the on-chip communication fabric and the memory subsystem for performance improvement and energy efficiency, respectively.
|Titolo:||Networks-on-Chip based Deep Neural Networks Accelerators for IoT Edge Devices|
|Data di pubblicazione:||2019|
|Appare nelle tipologie:||4.1 Contributo in Atti di convegno|