Rapid In-Memory Matrix Multiplication Using Associative Processor

Mohamed Ayoub Neggaz¹, Hasan Erdem Yantır², Smail Niar¹, Ahmed Eltawil² and Fadi Kurdahi²
¹ LAMIH, University of Valenciennes, France
² CECS, University of California, Irvine, USA

ABSTRACT


Memory hierarchy latency is one of the main problems that prevent processors from achieving high performance. To eliminate the need to load and store large sets of data, Resistive Associative Processors (ReAPs) have been proposed as a solution to the von Neumann bottleneck. In a ReAP, logic and memory structures are combined to allow in-memory computation. In this paper, we propose a new algorithm that computes matrix multiplication inside the memory and exploits the benefits of the ReAP. The proposed approach is based on Cannon's algorithm and uses a series of rotations without duplicating the data. It runs in O(n) time, where n is the dimension of the matrix. The method also applies to a large set of applications based on row-by-column matrix operations. Experimental results show a several-orders-of-magnitude increase in performance and a reduction in energy and area compared with the latest FPGA and CPU implementations.

Keywords: Resistive Associative Processor, Linear Algebra, Matrix Multiplication, FPGAs, Memristor.
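
The rotation scheme mentioned in the abstract can be illustrated with a minimal sketch of the textbook Cannon's algorithm for two square n-by-n matrices. This is a sequential software emulation written only for illustration, not the authors' ReAP implementation; the function names rotate_row_left and cannon_matmul are hypothetical. The sketch performs n rotation steps, each consisting of one element-wise multiply-accumulate over the whole matrix. On hardware that executes each element-wise step across all cells at once, the number of steps would grow linearly with n, which is consistent with the O(n) bound stated in the abstract.

```python
def rotate_row_left(row, k):
    """Return a copy of row rotated left by k positions (hypothetical helper)."""
    k %= len(row)
    return row[k:] + row[:k]


def cannon_matmul(a, b):
    """Multiply two square matrices (lists of lists) with Cannon-style rotations.

    Sequential emulation: the inner i/j loops stand in for what a parallel
    in-memory architecture would do as a single element-wise step.
    """
    n = len(a)

    # Initial skew: row i of A is rotated left by i positions,
    # column j of B is rotated up by j positions.
    a = [rotate_row_left(a[i], i) for i in range(n)]
    b = [[b[(i + j) % n][j] for j in range(n)] for i in range(n)]

    c = [[0] * n for _ in range(n)]
    for _ in range(n):
        # Element-wise multiply-accumulate over every position.
        for i in range(n):
            for j in range(n):
                c[i][j] += a[i][j] * b[i][j]
        # Rotate every row of A left by one and every column of B up by one.
        a = [rotate_row_left(row, 1) for row in a]
        b = b[1:] + b[:1]
    return c
```

For example, cannon_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]]) returns [[19, 22], [43, 50]], the same result as the ordinary row-by-column product. The operands are only rotated, never replicated, which mirrors the "without duplicating the data" property described above.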


