PX‐CGRA: Polymorphic Approximate Coarse‐Grained Reconfigurable Architecture
Omid Akbari1,3,a, Mehdi Kamal1,b, Ali Afzali‐Kusha1,c, Massoud Pedram2 and Muhammad Shafique3
1School of Electrical and Computer Engineering, University of Tehran, Iran
(a) akbari.o@ut.ac.ir
(b) mehdikamal@ut.ac.ir
(c) afzali@ut.ac.ir
2Department of Electrical Engineering, University of Southern California, USA
pedram@usc.edu
3Institute of Computer Engineering, Vienna University of Technology (TU Wien), Austria
muhammad.shafique@tuwien.ac.at
ABSTRACT
Coarse‐Grained Reconfigurable Architectures (CGRAs) provide a tradeoff between the energy efficiency of Application‐Specific Integrated Circuits (ASICs) and the flexibility of General‐Purpose Processors (GPPs). State‐of‐the‐art CGRAs support only exact architectures and precise application execution. However, a majority of the streaming applications that are amenable to CGRAs, such as multimedia and digital signal processing, are inherently error resilient. These applications can therefore benefit greatly from the emerging trend of Approximate Computing, which leverages this error resilience to provide higher energy efficiency in proportion to the tolerable accuracy loss (which can even be constrained). This paper, for the first time, introduces the novel concept of the Polymorphic Approximate CGRA (PX‐CGRA), which employs heterogeneous tiles of Polymorphic‐Approximated ALU Clusters (PACs) connected in a 2‐D mesh. Depending on its selected configuration, each PAC can implement different approximate modes as well as accurate modes, according to the run‐time requirements of the executing applications. To design an efficient PX‐CGRA, we propose a bottom‐up design flow. In addition, we discuss the flow for mapping applications onto PX‐CGRA, including the accuracy‐level mapping, scheduling, and binding steps. To comprehensively evaluate the efficacy of the proposed CGRA, the complete PX‐CGRA architecture is synthesized in different sizes and with different PAC configurations using a 15‐nm FinFET technology. Our results show energy efficiency improvements of up to 15%‐45% for output quality degradations of 5%‐35%, respectively, compared to the state‐of‐the‐art exact‐mode CGRA. Our proposed architecture and design methodology enable a new era of accuracy‐configurable CGRAs that provide significant energy gains.
Keywords: Coarse‐Grained Reconfigurable Architecture, Approximate Computing, Heterogeneous, Energy‐Efficiency, Dark Silicon, Adder, Multiplier, Quality, Design.