A Peripheral Circuit Reuse Structure Integrated with a Retimed Data Flow for Low Power RRAM Crossbar‐based CNN

Keni Qiu1, Weiwen Chen1, Yuanchao Xu1, Lixue Xia2, Yu Wang2 and Zili Shao3
1Capital Normal University, Beijing, China
2Tsinghua University, Beijing, China
3Hong Kong Polytechnic University, Hong Kong, China

ABSTRACT


Convolutional computations implemented in RRAM crossbar‐based Computing Systems (RCS) offer high performance and low power. However, current designs are energy‐unbalanced across their three parts: RRAM crossbar computation, peripheral circuits, and memory accesses. The latter two can significantly limit the potential gains of RCS. To address the high power overhead of peripheral circuits in RCS, this paper proposes a Peripheral Circuit Unit (PeriCU)‐Reuse scheme that meets power budgets in energy‐constrained embedded systems. The underlying idea is to focus on the expensive ADCs/DACs and arrange for multiple convolution layers to be served sequentially by the same PeriCU. The first step of the solution is to determine the number of PeriCUs, which are organized by cycle frames. Inside a cycle frame, layers are computed in parallel across PeriCUs and sequentially within each PeriCU. In addition, a layer retiming technique further reduces the energy consumption of RCS by assigning two adjacent layers to the same PeriCU, which bypasses the energy‐consuming memory accesses between them. Experiments on five convolutional applications validate that the PeriCU‐Reuse scheme, integrated with the retiming technique, can meet variable power budgets and further reduce energy consumption.
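
The scheduling idea in the abstract can be illustrated with a toy model. The sketch below is not the paper's algorithm: it assumes a fixed power cost per PeriCU, picks the largest PeriCU count that fits the budget, and compares two layer-to-PeriCU assignments by counting how many adjacent-layer boundaries cross PeriCUs (taken here as a stand-in for memory accesses that cannot be bypassed). All function names, parameters, and numbers are hypothetical.

```python
def num_pericus(power_budget_mw, pericu_power_mw, num_layers):
    """Largest PeriCU count that fits the budget, capped at one per layer.

    Assumption: every PeriCU draws the same fixed power (hypothetical model).
    """
    n = int(power_budget_mw // pericu_power_mw)
    if n < 1:
        raise ValueError("power budget is below the cost of a single PeriCU")
    return min(n, num_layers)


def contiguous_assignment(num_layers, n_pericus):
    """Map each layer to a PeriCU so that adjacent layers share a unit."""
    base, extra = divmod(num_layers, n_pericus)
    owner = []
    for p in range(n_pericus):
        size = base + (1 if p < extra else 0)
        owner.extend([p] * size)
    return owner


def round_robin_assignment(num_layers, n_pericus):
    """Baseline: spread layers over PeriCUs without regard to adjacency."""
    return [layer % n_pericus for layer in range(num_layers)]


def cross_pericu_boundaries(owner):
    """Count adjacent layer pairs served by different PeriCUs."""
    return sum(1 for a, b in zip(owner, owner[1:]) if a != b)


if __name__ == "__main__":
    n = num_pericus(power_budget_mw=100, pericu_power_mw=30, num_layers=5)
    for name, assign in (("contiguous", contiguous_assignment),
                         ("round-robin", round_robin_assignment)):
        owner = assign(5, n)
        print(name, owner, cross_pericu_boundaries(owner))
```

In this toy model, keeping adjacent layers on the same PeriCU lowers the number of cross-PeriCU boundaries, which is the intuition behind pairing adjacent layers; the real scheme additionally organizes execution into cycle frames and applies retiming, neither of which is modeled here.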


