11.4 Learning Gets Smarter


Time

Label

Presentation Title
Authors

14:00

11.4.1

(Best Paper Award Candidate)
NEUADC: NEURAL NETWORK-INSPIRED RRAM-BASED SYNTHESIZABLE ANALOG-TO-DIGITAL CONVERSION WITH RECONFIGURABLE QUANTIZATION SUPPORT
Speaker:
Xuan Zhang, Washington University in St. Louis, US
Authors:
Weidong Cao, Xin He, Ayan Chakrabarti and Xuan Zhang, Washington University, US
Abstract
Traditional analog-to-digital converters (ADCs) employ dedicated analog and mixed-signal (AMS) circuits and require a time-consuming manual design process. They also exhibit limited reconfigurability and are unable to support diverse quantization schemes using the same circuitry. In this paper, we propose NeuADC, an automated design approach to synthesizing an analog-to-digital (A/D) interface that can approximate the desired quantization function using a neural network (NN) with a single hidden layer. Our design leverages the mixed-signal resistive random-access memory (RRAM) crossbar architecture in a novel dual-path configuration to realize basic NN operations at the circuit level and exploits a smooth bit-encoding scheme to improve the training accuracy. Results obtained from SPICE simulations based on 130nm technology suggest that NeuADC not only delivers promising performance compared to state-of-the-art ADC designs across comprehensive design metrics, but also intrinsically supports multiple reconfigurable quantization schemes using the same hardware substrate, paving the way for future adaptable application-driven signal conversion. The robustness of NeuADC's quantization quality under moderate RRAM resistance precision is also evaluated using SPICE simulations.
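As a rough illustration of why a single hidden layer is enough to represent a quantization function, the sketch below hand-builds a 3-bit uniform quantizer from threshold units. It is not the NeuADC design: the weights are constructed by hand rather than trained, hard steps replace the smooth encoding mentioned in the abstract, and the names `build_weights` and `quantize` are hypothetical.

```python
def step(v):
    """Hard threshold unit: 1 when v >= 0, else 0."""
    return 1 if v >= 0 else 0


def build_weights(n_bits):
    """Hand-construct thresholds and output weights for a uniform quantizer on [0, 1).

    Hidden unit k fires when x >= k / 2**n_bits. The output weight for bit j on
    hidden unit k is the change in bit j when the code steps from k-1 to k
    (+1, -1 or 0), so summing over the active units telescopes to bit j of the code.
    """
    levels = 2 ** n_bits
    thresholds = [k / levels for k in range(1, levels)]
    weights = [
        [((k >> j) & 1) - (((k - 1) >> j) & 1) for k in range(1, levels)]
        for j in range(n_bits)
    ]
    return thresholds, weights


def quantize(x, thresholds, weights):
    """Single-hidden-layer forward pass; returns the output bits, LSB first."""
    hidden = [step(x - t) for t in thresholds]
    return [sum(w * h for w, h in zip(row, hidden)) for row in weights]


thresholds, weights = build_weights(3)
bits = quantize(0.3, thresholds, weights)  # 0.3 falls in code 2, i.e. [0, 1, 0] LSB first
```

Swapping in a different weight table gives a different quantization scheme on the same structure, which is the kind of reconfigurability the abstract describes at the hardware level.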

14:30

11.4.2

HOLYLIGHT: A NANOPHOTONIC ACCELERATOR FOR DEEP LEARNING IN DATA CENTERS
Speaker:
Weichen Liu, School of Computer Science and Engineering, Nanyang Technological University, Singapore, SG
Authors:
Weichen Liu¹, Wenyang Liu², Yichen Ye³, Qian Lou⁴, Yiyuan Xie³ and Lei Jiang⁵
¹Nanyang Technological University, SG; ²College of Computer Science, Chongqing University, CN; ³College of Electronics and Information Engineering, Southwest University, CN; ⁴Department of Intelligent Systems Engineering, Indiana University, US; ⁵Indiana University Bloomington, US
Abstract
Convolutional Neural Networks (CNNs) are widely adopted in object recognition, speech processing and machine translation, due to their extremely high inference accuracy. However, it is challenging to compute the massive, computationally expensive convolutions of deep CNNs on traditional CPUs and GPUs. Emerging nanophotonic technology has been employed for on-chip data communication because of its CMOS compatibility, high bandwidth and low power consumption. In this paper, we propose a nanophotonic accelerator, HolyLight, to boost CNN inference throughput in datacenters. Instead of an all-photonic design, HolyLight performs convolutions with photonic integrated circuits and processes the other CNN operations with CMOS circuits for high inference accuracy. We first build HolyLight-M from microdisk-based matrix-vector multipliers and find that analog-to-digital converters (ADCs) seriously limit its inference throughput per watt. We then use microdisk-based adders and shifters to architect HolyLight-A without ADCs. Compared to the state-of-the-art ReRAM-based accelerator, HolyLight-A improves CNN inference throughput per watt by 13x with trivial accuracy degradation.

15:00

11.4.3

TRANSFER AND ONLINE REINFORCEMENT LEARNING IN STT-MRAM BASED EMBEDDED SYSTEMS FOR AUTONOMOUS DRONES
Speaker:
Insik Yoon, Georgia Institute of Technology, US
Authors:
Insik Yoon¹, Aqeel Anwar¹, Titash Rakshit² and Arijit Raychowdhury¹
¹Georgia Institute of Technology, US; ²Samsung, US
Abstract
In this paper we present an algorithm-hardware co-design for camera-based autonomous flight in small drones. We show that the large write latency and write energy of non-volatile memory (NVM) based embedded systems make them unsuitable for real-time reinforcement learning (RL). We address this by performing transfer learning (TL) on meta-environments and RL on the last few layers of a deep convolutional network. While the NVM stores the meta-model (from TL), an on-die SRAM stores the weights of the last few layers. Thus all the real-time updates via RL are carried out on the SRAM arrays. This provides us with a practical platform with performance comparable to end-to-end RL and 83.4% lower energy per image frame.
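The memory split described in the abstract can be pictured with a toy model in which the early layer is read-only and only the last layer is ever written. This is a minimal sketch under stated assumptions, not the authors' system: the class `SplitModel` is hypothetical, the network is a tiny two-layer one, and a plain squared-error gradient step stands in for the RL update.

```python
class SplitModel:
    """Toy two-layer model: a frozen feature layer plus a trainable output layer."""

    def __init__(self, frozen_rows, trainable):
        # Stand-in for the NVM-resident meta-model: tuples make it read-only.
        self.frozen_rows = tuple(tuple(row) for row in frozen_rows)
        # Stand-in for the SRAM-resident last-layer weights.
        self.trainable = list(trainable)
        self.sram_writes = 0  # number of weight writes, all of them to the last layer

    def features(self, x):
        """Frozen layer: ReLU of each row's dot product with the input."""
        return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in self.frozen_rows]

    def predict(self, x):
        return sum(w * f for w, f in zip(self.trainable, self.features(x)))

    def update(self, x, target, lr=0.1):
        """One gradient step on squared error, touching only the last layer."""
        feats = self.features(x)
        error = self.predict(x) - target
        for i, f in enumerate(feats):
            self.trainable[i] -= lr * error * f
            self.sram_writes += 1


model = SplitModel(frozen_rows=[(1.0, 0.0), (0.0, 1.0)], trainable=[0.0, 0.0])
before = model.predict((1.0, 2.0))
model.update((1.0, 2.0), target=3.0)
after = model.predict((1.0, 2.0))
```

Because the frozen rows are never modified, every write lands in the small trainable list, which mirrors the idea of keeping real-time updates off the slow, energy-hungry NVM.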

15:15

11.4.4

AIX: A HIGH PERFORMANCE AND ENERGY EFFICIENT INFERENCE ACCELERATOR ON FPGA FOR A DNN-BASED COMMERCIAL SPEECH RECOGNITION
Speaker:
Minwook Ahn, SK Telecom, KR
Authors:
Minwook Ahn, Seok Joong Hwang, Wonsub Kim, Seungrok Jung, Yeonbok Lee, Mookyoung Chung, Woohyung Lim and Youngjoon Kim, SK Telecom, KR
Abstract
Automatic speech recognition (ASR) is crucial in virtual personal assistant (VPA) services such as Apple Siri, Amazon Alexa, Google Now and SKT NUGU. Recently, ASR has shown remarkable advances in accuracy through deep learning. However, with the explosive increase in user utterances and the growing complexity of ASR, demand for custom accelerators in datacenters is rising sharply so that utterances can be processed in real time with low power consumption. This paper evaluates a custom inference accelerator for ASR enhanced by a deep neural network, called AIX (Artificial Intelligence aXellerator). AIX is developed on a Xilinx FPGA and has been deployed in SKT NUGU since 2018. Owing to the full exploitation of the DSP slices and memory bandwidth provided by the FPGA, AIX outperforms cutting-edge CPUs by 10.2 times and even a state-of-the-art GPU by 20.1 times on real-time ASR workloads in terms of performance and power consumption. This improvement yields faster ASR response times and in turn reduces the number of required machines in datacenters to a third.

15:30

End of session
Coffee Break in Exhibition Area

Coffee Breaks in the Exhibition Area

On all conference days (Tuesday to Thursday), coffee and tea will be served during the coffee breaks at the below-mentioned times in the exhibition area.

Lunch Breaks (Lunch Area)

On all conference days (Tuesday to Thursday), a seated lunch (lunch buffet) will be offered in the Lunch Area to fully registered conference delegates only. There will be badge control at the entrance to the lunch break area.

Tuesday, March 26, 2019

Coffee Break 10:30 - 11:30
Lunch Break 13:00 - 14:30
Keynote Lecture "Leonardo da Vinci, Humanism and Engineering between Florence and Milan" by Claudio Giorgione in room 1 13:50 - 14:20
Coffee Break 16:00 - 17:00

Wednesday, March 27, 2019

Coffee Break 10:00 - 11:00
Lunch Break 12:30 - 14:30
Keynote Lecture "Heterogeneous, High Scale Computing in the Era of Intelligent, Cloud-Connected" by David Pellerin, Amazon, US in room 1 13:50 - 14:20
Coffee Break 16:00 - 17:00

Thursday, March 28, 2019

Coffee Break 10:00 - 11:00
University Booth Best Demo Award Presentation at the University Booth 10:30
Lunch Break 12:30 - 14:00
Keynote Lecture "A Fundamental Look at Models and Intelligence" by Edward A. Lee, University of California, Berkeley, US in room 1 13:20 - 13:50
Coffee Break 15:30 - 16:00