WavePro: Clock-less Wave-Propagated Pipeline Compiler for Low-Power and High-Throughput Computation
Yehuda Kra (a), Tzachi Noy (b) and Adam Teman (c)
Faculty of Engineering, Bar-Ilan University, Ramat Gan, Israel
(a) yehuda.kra@biu.ac.il
(b) tzachi.noy@biu.ac.il
(c) adam.teman@biu.ac.il
ABSTRACT
Clock-less wave-propagated pipelining is a long-known approach to achieving high throughput without the overhead of costly sampling registers. However, due to many design challenges, which have only increased with technology scaling, this approach has never been widely adopted and has generally been limited to small, highly specific demonstrations. This paper addresses this barrier by presenting WavePro, a generic and scalable algorithm capable of skew balancing any combinatorial logic netlist for the application of wave pipelining. The algorithm was implemented in the WavePro Compiler automation utility, which interfaces with industry delay-extraction and standard timing-analysis tools to produce a sign-off quality result. The utility is demonstrated on a dot-product accelerator in a 65 nm CMOS technology, using a vendor-provided standard cell library and commercial timing-analysis tools. By reducing the worst-case output skew by over 70%, the test case achieved throughput equivalent to that of an 8-stage sequentially pipelined implementation, with power savings of almost 3×.
Keywords: Wave Propagation, Clock-less Wave Pipeline, High Throughput, Low Power.