DC-CNN: Computational Flow Redefinition for Efficient CNN through Structural Decoupling

Fuxun Yu (1,a), Zhuwei Qin (1,b), Di Wang (2), Ping Xu (1,c), Chenchen Liu (3), Zhi Tian (1,d) and Xiang Chen (1,e)

1 George Mason University, Fairfax, VA, USA
  (a) fyu2@gmu.edu, (b) zqin@gmu.edu, (c) pxu3@gmu.edu, (d) ztian1@gmu.edu, (e) xchen26@gmu.edu
2 Microsoft, Redmond, WA, USA
  wangdi@microsoft.com
3 University of Maryland, Baltimore County, Baltimore, MD, USA
  ccliu@umbc.edu

ABSTRACT

Convolutional Neural Networks (CNNs) have recently been widely applied in novel intelligent applications and systems. However, CNN computation performance is significantly hindered by the conventional computation flow, which processes the model sequentially, layer by layer, with massive convolution operations. Such a layer-wise sequential computation flow causes performance issues such as resource under-utilization and large memory overhead. To solve these problems, we propose a novel CNN structural decoupling method, which decouples CNN models into “critical paths” and eliminates the inter-layer data dependency. Based on this method, we redefine the CNN computation flow into parallel and cascade computing paradigms, which significantly enhance CNN computation performance on both multi-core and single-core CPU processors. Experiments show that our DC-CNN framework reduces latency by 24% to 33% on multi-core CPUs for CIFAR and ImageNet. On small-capacity mobile platforms, cascade computing reduces latency by an average of 24% on ImageNet and 42% on CIFAR10, while memory reduction reaches an average of 21% and 64%, respectively.
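To make the decoupling idea concrete, below is a minimal PyTorch sketch under simplifying assumptions: the network is split into two independent “critical paths” that share only the input, so they can be dispatched across cores (parallel mode) or executed one at a time with low peak memory (cascade mode). The CriticalPath and DecoupledCNN names, the two-path split, and the logit summation are illustrative assumptions, not the actual DC-CNN implementation described in the paper.

    # Illustrative sketch only: paths share no intermediate activations,
    # so there is no inter-layer data dependency between them.
    import torch
    import torch.nn as nn

    class CriticalPath(nn.Module):
        """One decoupled path: a narrow, stand-alone sub-network."""
        def __init__(self, in_ch, width, num_classes):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(in_ch, width, 3, padding=1), nn.ReLU(),
                nn.Conv2d(width, width, 3, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1),
            )
            self.head = nn.Linear(width, num_classes)

        def forward(self, x):
            return self.head(self.features(x).flatten(1))

    class DecoupledCNN(nn.Module):
        """Paths share only the input; partial logits are summed at the end."""
        def __init__(self, in_ch=3, widths=(32, 32), num_classes=10):
            super().__init__()
            self.paths = nn.ModuleList(
                [CriticalPath(in_ch, w, num_classes) for w in widths])

        def forward(self, x, cascade=False):
            if cascade:
                # Cascade mode: run paths sequentially, so peak memory is
                # bounded by a single path's activation footprint.
                out = 0
                for p in self.paths:
                    out = out + p(x)
                return out
            # Parallel mode: since paths are independent, they can be
            # forked to separate workers (asynchronous when the model
            # is compiled with torch.jit.script).
            futures = [torch.jit.fork(p, x) for p in self.paths]
            return sum(torch.jit.wait(f) for f in futures)

    model = DecoupledCNN()
    logits = model(torch.randn(1, 3, 32, 32), cascade=True)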

Keywords: Neural Network, Computation Optimization.


