Subgraph Decoupling and Rescheduling for Increased Utilization in CGRA Architecture

Chen Yin^a, Qin Wang, Jianfei Jiang, Weiguang Sheng, Guanghui He, Zhigang Mao and Naifeng Jing^b
Department of Micro/Nano Electronics, Shanghai Jiao Tong University, Shanghai, China
^a yinchen@sjtu.edu.cn, ^b sjtuj@sjtu.edu.cn

ABSTRACT


As coarse-grained reconfigurable array (CGRA) architectures shift towards general-purpose computing, complex control flows such as nested loops, conditional branches and data dependences become difficult to handle: they break the intact dataflow graph (DFG) into multiple regions under inconsistent control, which reduces processing element (PE) array utilization. This paper proposes subgraph decoupling and rescheduling, which decouples the inconsistent regions into control-independent subgraphs. Each subgraph can be rescheduled with zero-cost domino context switching and parallelized to fully utilize the PE resources. We then propose lightweight hardware changes to a general CGRA architecture to support this design. Experimental results show that our proposal improves performance and energy efficiency by 1.35× and 1.18×, respectively, over a static-mapped CGRA (Plasticine), and by 1.27× and 1.45×, respectively, over an instruction-driven CGRA (TIA).


