Towards Cross-Platform Inference on Edge Devices with Emerging Neuromorphic Architecture

Shangyu Wu (1,a), Yi Wang (1,b), Amelie Chi Zhou (1,c), Rui Mao (1,d), Zili Shao (2), and Tao Li (3)
(1) The National Engineering Laboratory for Big Data System Computing Technology, Shenzhen University, Shenzhen, China
(a) shangyuwu1006@gmail.com, (b) yiwang@szu.edu.cn, (c) chi.zhou@szu.edu.cn, (d) mao@szu.edu.cn
(2) The Chinese University of Hong Kong, Hong Kong, China, shao@cse.cuhk.edu.hk
(3) University of Florida, Gainesville, FL, USA, taoli@ece.ufl.edu

ABSTRACT

Deep convolutional neural networks have become the mainstream solution for many artificial-intelligence applications. However, they are still rarely deployed on mobile or edge devices because inference incurs a substantial amount of data movement across these devices' limited resources. The emerging processing-in-memory neuromorphic architecture offers a promising direction for accelerating inference. The key issue then becomes how to effectively allocate the inference workload between the computing and storage resources of an edge device.

This paper presents Mobile-I, a resource allocation scheme that accelerates the Inference process on Mobile or edge devices. Mobile-I targets the emerging 3D neuromorphic architecture, reducing the processing latency of computing resources while fully utilizing the limited on-chip storage resources. We formulate the target problem as a resource allocation problem and adopt a software-based solution that enables cross-platform deployment across multiple mobile or edge devices. We conduct a set of experiments using realistic workloads generated from an Intel Movidius Neural Compute Stick. Experimental results show that, compared with representative schemes, Mobile-I effectively reduces processing latency and improves the utilization of computing resources with negligible overhead.
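The abstract does not give the details of the formulation. As a rough illustration of what a latency-driven resource allocation problem of this kind can look like, the following minimal sketch greedily places layer weights on-chip under a capacity budget. Every name, number, and the greedy heuristic itself are illustrative assumptions, not Mobile-I's actual method.

# Hypothetical sketch of a latency-driven resource-allocation problem:
# assign each layer's weights to on-chip or off-chip storage so as to
# minimize total inference latency under an on-chip capacity budget.
# All identifiers and numbers are assumptions for illustration only.

def allocate(layers, onchip_capacity):
    """layers: list of (name, size, onchip_latency, offchip_latency)."""
    # Start with every layer off-chip, then move the layers with the
    # largest latency savings per unit of storage on-chip (a greedy
    # knapsack heuristic).
    by_benefit = sorted(
        layers,
        key=lambda l: (l[3] - l[2]) / l[1],  # latency saved per byte
        reverse=True,
    )
    placement, used = {}, 0
    for name, size, on_lat, off_lat in by_benefit:
        if used + size <= onchip_capacity and off_lat > on_lat:
            placement[name] = "on-chip"
            used += size
        else:
            placement[name] = "off-chip"
    total = sum(
        on if placement[n] == "on-chip" else off
        for n, _, on, off in layers
    )
    return placement, total

if __name__ == "__main__":
    # Toy CNN with illustrative sizes (KB) and per-layer latencies (ms).
    net = [("conv1", 64, 0.4, 1.1), ("conv2", 256, 1.0, 2.6),
           ("fc1", 512, 1.8, 3.0), ("fc2", 128, 0.5, 0.9)]
    plan, latency = allocate(net, onchip_capacity=640)
    print(plan, f"total latency = {latency:.1f} ms")

A greedy benefit-per-byte heuristic is only one way to approach such a formulation; an exact integer-programming solution would be another, and the paper's actual solution is not specified here.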

Keywords: Edge computing, Memory management, Scheduling, Neuromorphic architecture, Inference.


