AIX: A high performance and energy efficient inference accelerator on FPGA for a DNN-based commercial speech recognition

Minwook Ahn, Seok Joong Hwang, Wonsub Kim, Seungrok Jung, Yeonbok Lee, Mookyoung Chung, Woohyung Lim and Youngjoon Kim
SK Telecom 6, Hwangsaeul-ro, 258beon-gil, Bundang-gu, Seongnam-si, Gyeonggi-do, Korea
minwook.ahn@sk.com
nzthing@sk.com
wonsub79.kim@sk.com
seungrok.jung@sk.com
yeonbok.lee@sk.com
mk.chung@sk.com
w.lim@sk.com
youngjoon.kim

ABSTRACT
Automatic speech recognition (ASR) is crucial to virtual personal assistant (VPA) services such as Apple Siri, Amazon Alexa, Google Now, and SKT NUGU. Recently, ASR accuracy has advanced remarkably through the application of deep learning. However, with the explosive growth in user utterances and the increasing complexity of ASR models, demand for custom accelerators in datacenters is rising sharply in order to process these workloads in real time with low power consumption. This paper evaluates a custom inference accelerator for ASR enhanced by a deep neural network, called AIX (Artificial Intelligence aXellerator). AIX is implemented on a Xilinx FPGA and has been deployed in SKT NUGU since 2018. By fully exploiting the DSP slices and memory bandwidth of the FPGA, AIX outperforms cutting-edge CPUs by 10.2 times and even a state-of-the-art GPU by 20.1 times on real-time ASR workloads in terms of both performance and power consumption. This improvement yields faster response times in ASR and, in turn, reduces the number of machines required in datacenters to one third.

Keywords: Neural network, Inference, Accelerator, Speech recognition, FPGA