Performance and energy figures of Deep Neural Network (DNN) accelerators are profoundly affected by the communication and memory sub-system. In this paper, we make the case of a state-of-the-art multi-chip-module-based architecture for DNN inference acceleration. We propose a hybrid wired/wireless network-in-package interconnection fabric and a compression technique for drastically improving the communication efficiency and reducing the memory and communication traffic with a consequent improvement of performance and energy metrics. We assess the inference performance and energy improvement vs. accuracy degradation for different CNNs showing that up to 77% and 68% of inference latency reduction and inference energy reduction, respectively, can be obtained while keeping the accuracy degradation below 5% as respect to the original uncompressed CNN.
Improving Inference Latency and Energy of DNNs through Wireless Enabled Multi-Chip-Module-based Architectures and Model Parameters Compression
Ascia G.;Catania V.;Mineo A.;Monteleone S.;Palesi M.;Patti D.
2020-01-01
Abstract
Performance and energy figures of Deep Neural Network (DNN) accelerators are profoundly affected by the communication and memory sub-system. In this paper, we make the case of a state-of-the-art multi-chip-module-based architecture for DNN inference acceleration. We propose a hybrid wired/wireless network-in-package interconnection fabric and a compression technique for drastically improving the communication efficiency and reducing the memory and communication traffic with a consequent improvement of performance and energy metrics. We assess the inference performance and energy improvement vs. accuracy degradation for different CNNs showing that up to 77% and 68% of inference latency reduction and inference energy reduction, respectively, can be obtained while keeping the accuracy degradation below 5% as respect to the original uncompressed CNN.File | Dimensione | Formato | |
---|---|---|---|
09241714 (1).pdf
solo gestori archivio
Tipologia:
Versione Editoriale (PDF)
Dimensione
685.23 kB
Formato
Adobe PDF
|
685.23 kB | Adobe PDF | Visualizza/Apri |
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.