“School of Cognitive”
Paper IPM / Cognitive / 14914
Abstract:
In this paper, we introduce an architecture for accelerating the convolution stages of Convolutional Neural Networks (CNNs) in embedded vision systems. The architecture exploits the inherent parallelism of CNNs to reduce the bandwidth, resource usage, and power consumption of the computationally intensive convolution operations required by real-time embedded applications. We implement the proposed architecture using fixed-point arithmetic on a ZC706 evaluation board featuring a Xilinx Zynq-7000 System on Chip (SoC), where the embedded ARM processor, which runs at a high clock speed, serves as the main controller to increase flexibility and speed. The architecture runs at 150 MHz and delivers 19.2 giga multiply-accumulate operations per second (GMAC/s) while consuming less than 10 W. It uses only 391 DSP48 modules, a significant improvement in utilization compared with state-of-the-art architectures.
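As a rough illustration of what the quoted figures imply, the sketch below derives the number of multiply-accumulate operations per clock cycle from the stated clock frequency and throughput, and estimates a lower bound on the time for one convolution layer. It assumes the 19.2 GMAC/s figure is a peak rate sustained at 150 MHz; the function names and the example layer dimensions are hypothetical and do not come from the paper.

```python
# Figures quoted in the abstract.
CLOCK_HZ = 150_000_000                   # 150 MHz
THROUGHPUT_MAC_PER_S = 19_200_000_000    # 19.2 GMAC/s
POWER_BOUND_W = 10                       # "less than 10 W" (upper bound)


def macs_per_cycle(throughput_mac_per_s, clock_hz):
    """MAC operations completed per clock cycle, assuming peak throughput."""
    return throughput_mac_per_s / clock_hz


def conv_layer_macs(out_h, out_w, out_ch, in_ch, kernel):
    """MAC count of a standard convolution layer with a square kernel."""
    return out_h * out_w * out_ch * in_ch * kernel * kernel


def min_layer_time_s(macs, throughput_mac_per_s):
    """Lower bound on layer time if the peak rate were fully sustained."""
    return macs / throughput_mac_per_s


if __name__ == "__main__":
    print("MACs per cycle:", macs_per_cycle(THROUGHPUT_MAC_PER_S, CLOCK_HZ))

    # Hypothetical layer (not from the paper): 56x56 output,
    # 64 input channels, 64 output channels, 3x3 kernel.
    macs = conv_layer_macs(56, 56, 64, 64, 3)
    print("Layer MACs:", macs)
    print("Minimum layer time (s):", min_layer_time_s(macs, THROUGHPUT_MAC_PER_S))

    # Power is only bounded from above, so this is a lower bound on efficiency.
    print("GMAC/s per W (at least):", THROUGHPUT_MAC_PER_S / 1e9 / POWER_BOUND_W)
```

Under this assumption the quoted figures correspond to 128 MACs per cycle. Real layer times would be longer than the computed bound whenever memory bandwidth or control overhead keeps the datapath from running at its peak rate.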