Improving Inference Latency and Energy of DNNs through Wireless Enabled Multi-Chip-Module-based Architectures and Model Parameters Compression

Ascia G.; Catania V.; Mineo A.; Monteleone S.; Palesi M.; Patti D.
2020-01-01

Abstract

The performance and energy figures of Deep Neural Network (DNN) accelerators are profoundly affected by the communication and memory sub-system. In this paper, we make the case for a state-of-the-art multi-chip-module-based architecture for DNN inference acceleration. We propose a hybrid wired/wireless network-in-package interconnection fabric and a compression technique that drastically improve communication efficiency and reduce memory and communication traffic, with a consequent improvement in performance and energy metrics. We assess the inference performance and energy improvement against accuracy degradation for different CNNs, showing that inference latency and inference energy can be reduced by up to 77% and 68%, respectively, while keeping the accuracy degradation below 5% with respect to the original uncompressed CNN.
Year: 2020
ISBN: 978-1-7281-8847-8
Keywords: DNN Accelerators; DNN Compression; Multi-Chip-Module; Network-in-Package; Network-on-Chip; Wireless NoC
Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.11769/505555