Emerging Neural Workloads and Their Impact on Hardware

David Brooks1, Martin M. Frank2, Tayfun Gokmen2, Udit Gupta1, X. Sharon Hu3, Shubham Jain4, Ann Franchesca Laguna3, Michael Niemier3, Ian O'Connor5, Anand Raghunathan4, Ashish Ranjan2, Dayane Reis3, Jacob R. Stevens4, Carole-Jean Wu6 and Xunzhao Yin7

1 Harvard University
dbrooks@eecs.harvard.edu
ugupta@g.harvard.edu
2 IBM T.J. Watson Research Center
mmfrank@us.ibm.com
tgokmen@us.ibm.com
ashish.ranjan@ibm.com
3 University of Notre Dame
shu@nd.edu
alaguna@nd.edu
mniemier@nd.edu
dreis@nd.edu
4 Purdue University
jain130@purdue.edu
raghunathan@purdue.edu
steven69@purdue.edu
5 École Centrale de Lyon
Ian.Oconnor@ec-lyon.fr
6 Facebook
carolejeanwu@fb.com
7 Zhejiang University
xzyin1@zju.edu.cn

ABSTRACT

We consider existing and emerging neural workloads and the hardware accelerators that might be best suited to them. We begin with a discussion of analog crossbar arrays, which are known to be well suited for the matrix-vector multiplication operations that are commonplace in existing neural network models such as convolutional neural networks (CNNs). We highlight candidate crosspoint devices, the device and materials challenges that must be overcome for a given device to be employed in a crossbar array for a computationally interesting neural workload, and how circuit and algorithmic optimizations may be employed to mitigate undesirable device and materials characteristics. We then discuss two emerging neural workloads. We first consider machine learning models for one- and few-shot learning tasks (i.e., where a network can be trained with just one or a few representative examples of a given class). Notably, crossbar-based architectures can be used to accelerate such models. Hardware solutions based on content addressable memory arrays will also be discussed. We then consider machine learning models for recommendation systems. Recommendation models, an emerging class of machine learning models, employ distinct neural network architectures that operate on continuous and categorical input features, which makes hardware acceleration challenging. We will discuss the open research challenges and opportunities within this space.


