7.4 Low Power Design: From Highly-Optimized Power Delivery Networks to CNN Accelerators


Time

Label

Presentation Title
Authors

14:30

7.4.1

DETAILED PLACEMENT FOR IR DROP MITIGATION BY POWER STAPLE INSERTION IN SUB-10NM VLSI
Speaker:
Minsoo Kim, UC San Diego, US
Authors:
Sun ik Heo¹, Andrew Kahng², Minsoo Kim³, Lutong Wang³ and Chutong Yang³
¹Samsung Electronics Co., Ltd., KR; ²University of California San Diego, US; ³UC San Diego, US
Abstract
The power delivery network (PDN) is one of the most challenging topics in modern VLSI design. Due to aggressive technology node scaling, resistance of back-end-of-line (BEOL) layers increases dramatically in sub-10nm VLSI, causing high supply voltage (IR) drop. To solve this problem, pre-placed or post-placed power staples are inserted in pin-access layers to connect adjacent power rails and reduce PDN resistance, at the cost of reduced routing flexibility or reduced power staple insertion opportunity. In this work, we propose dynamic programming-based single-row and double-row detailed placement optimizations to maximize power staple insertion in a post-placement flow. We further propose metaheuristics to improve the quality of results. Compared to the traditional post-placement flow, we achieve up to 13.2% (10mV) reduction in IR drop, with almost no WNS degradation.

15:00

7.4.2

OPTIMIZING THE ENERGY EFFICIENCY OF POWER SUPPLY IN HETEROGENEOUS MULTICORE CHIPS WITH INTEGRATED SWITCHED-CAPACITOR CONVERTERS
Speaker:
Lu Wang, ShanghaiTech University, CN
Authors:
Lu Wang¹, Leilei Wang¹, Dejia Shang¹, Cheng Zhuo² and Pingqiang Zhou¹
¹ShanghaiTech University, CN; ²Zhejiang University, CN
Abstract
Energy efficiency is a major concern in heterogeneous multi-core chips. Because switched-capacitor converters (SCCs) offer a wide range of output voltages and potentially high conversion efficiency, they are widely used in multi-core chips. In this paper we propose the optimization of Metal-Insulator-Metal (MIM) capacitance resource allocation and converter ratio selection for SCCs to improve the power efficiency by transforming the mixed integer nonlinear programming (MINLP) problems into a series of convex problems. The experimental results show that our approach can achieve a 9%-13% improvement in power efficiency and can be applied to more complicated heterogeneous multicore scenarios.

15:30

7.4.3

POWER DELIVERY PATHFINDING FOR EMERGING DIE-TO-WAFER INTEGRATION TECHNOLOGY
Speaker:
Seungwon Kim, Ulsan National Institute of Science and Technology, KR
Authors:
Andrew B. Kahng¹, Seokhyeong Kang², Seungwon Kim³, Kambiz Samadi⁴ and Bangqi Xu¹
¹UC San Diego, US; ²Pohang University of Science and Technology, KR; ³Ulsan National Institute of Science and Technology (UNIST), KR; ⁴Qualcomm Technologies, Inc., US
Abstract
In advanced technology nodes, emerging die-to-wafer (D2W) integration technology is a promising "More Than Moore" lever for continued scaling of system capability and value. In D2W 3D IC implementation, the power delivery network (PDN) is crucial to meeting design specifications. However, determining the optimal PDN design is nontrivial. On the one hand, to meet the IR drop requirement, denser power mesh is desired. On the other hand, to meet the timing requirement for a high-utilization design, more routing resource should be available for signal routing. Moreover, additional competition between signal routing and power routing is caused by inter-tier vertical interconnects in 3D IC. In this paper, we propose a power delivery pathfinding methodology for emerging die-to-wafer integration, which seeks to identify an optimal or near-optimal PDN for a given design and PDN specification. Our pathfinding methodology exploits models for routability and worst IR drop, which helps reduce iterations between PDN design and circuit design in 3D IC implementation. We present validations with real design examples and a 28nm foundry technology.

15:45

7.4.4

ENERGY-EFFICIENT CONVOLUTIONAL NEURAL NETWORKS VIA RECURRENT DATA REUSE
Speaker:
Luca Mocerino, Politecnico di Torino, IT
Authors:
Luca Mocerino, Valerio Tenace and Andrea Calimera, Politecnico di Torino, IT
Abstract
Deep learning (DL) algorithms have substantially improved in terms of accuracy and efficiency. Convolutional Neural Networks (CNNs) are now able to outperform traditional algorithms in computer vision tasks such as object classification, detection, recognition, and image segmentation. They represent an attractive solution for many embedded applications which may take advantage of machine learning at the edge. Needless to say, the key to success lies in the availability of efficient hardware implementations which meet the stringent design constraints. Inspired by the way human brains process information, this paper presents a method that improves the processing efficiency of CNNs by leveraging their repetitiveness. More specifically, we introduce (i) a clustering methodology that maximizes weight/activation reuse, and (ii) the design of a heterogeneous processing element which integrates a Floating-Point Unit (FPU) with an associative memory that manages recurrent patterns. The experimental analysis reveals that the proposed method achieves substantial energy savings with low accuracy loss, thus providing a practical design option that might find application in the growing segment of edge computing.
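The reuse idea in this abstract can be illustrated with a small sketch. It is not the authors' clustering method or processing element: the uniform quantization in `cluster_weights`, the `ReuseMultiplier` class, and the dictionary standing in for an associative memory are all assumptions chosen for brevity. The point it shows is that once weights collapse onto a few shared values, the same (weight, activation) products recur and can be looked up instead of recomputed.

```python
def cluster_weights(weights, step):
    """Snap each weight to the nearest multiple of `step`.

    Uniform quantization is used here only as a simple stand-in for a
    clustering step (assumption, not the method from the paper).
    """
    return [round(w / step) * step for w in weights]


class ReuseMultiplier:
    """Multiplier backed by a lookup table (hypothetical helper).

    The dict plays the role of an associative memory: a repeated
    (weight, activation) pair is served from the table instead of
    being multiplied again.
    """

    def __init__(self):
        self.table = {}
        self.hits = 0
        self.misses = 0

    def mul(self, w, x):
        key = (w, x)
        if key in self.table:
            self.hits += 1
        else:
            self.misses += 1
            self.table[key] = w * x  # only computed on a miss
        return self.table[key]


# Illustrative values only; they are not taken from the paper.
weights = [0.26, 0.24, 0.51, 0.49, 0.27]
activations = [2.0, 2.0, 3.0, 3.0, 2.0]

clustered = cluster_weights(weights, step=0.25)
pe = ReuseMultiplier()
dot = sum(pe.mul(w, x) for w, x in zip(clustered, activations))

print(clustered)            # [0.25, 0.25, 0.5, 0.5, 0.25]
print(dot)                  # 4.5
print(pe.hits, pe.misses)   # 3 2
```

In this toy case three of the five multiplications are served from the table, at the price of a small error from snapping the weights (the unclustered dot product would be 4.54). How that trade-off plays out for real networks is what the paper's experimental analysis addresses.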

16:00

IP3-16, 717

ADAPTIVE WORD REORDERING FOR LOW-POWER INTER-CHIP COMMUNICATION
Speaker:
Eleni Maragkoudaki, University of Manchester, GB
Authors:
Eleni Maragkoudaki¹, Przemyslaw Mroszczyk² and Vasilis Pavlidis³
¹University of Manchester, GB; ²Qualcomm, IE; ³The University of Manchester, GB
Abstract
The energy for data transfer has an increasing effect on the total system energy as technology scales, often overtaking computation energy. To reduce the power of inter-chip interconnects, an adaptive encoding scheme called Adaptive Word Reordering (AWR) is proposed that effectively decreases the number of signal transitions, leading to a significant power reduction. AWR outperforms other adaptive encoding schemes in terms of decrease in transitions, yielding up to 73% reduction in switching. Furthermore, complex bit transition computations are represented as delays in the time domain to limit the power overhead due to encoding. The saved power outweighs the overhead beyond a moderate wire length where the I/O voltage is assumed equal to the core voltage. For a typical I/O voltage, the decrease in power is significant, reaching 23% at just 1 mm.
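To make the notion of "signal transitions" concrete, the sketch below counts the bit toggles on a parallel link when a block of words is sent in a given order, and then tries a simple reordering. This is not the AWR scheme from the paper: the greedy nearest-Hamming-distance ordering and the function names `transitions`, `total_transitions`, and `greedy_reorder` are assumptions made purely for illustration.

```python
def transitions(a: int, b: int) -> int:
    """Number of wires that toggle when word b follows word a."""
    return bin(a ^ b).count("1")


def total_transitions(words, start=0):
    """Total toggles when `words` are sent in order from bus state `start`."""
    total, prev = 0, start
    for w in words:
        total += transitions(prev, w)
        prev = w
    return total


def greedy_reorder(words, start=0):
    """Hypothetical greedy ordering (assumption, not the AWR algorithm):
    always send next the remaining word with the fewest toggles
    relative to the word sent last.
    """
    remaining = list(words)
    order, prev = [], start
    while remaining:
        nxt = min(remaining, key=lambda w: transitions(prev, w))
        remaining.remove(nxt)
        order.append(nxt)
        prev = nxt
    return order


# Illustrative 4-bit words; not data from the paper.
block = [0b1111, 0b0000, 0b1110, 0b0001]
reordered = greedy_reorder(block)

print(total_transitions(block))      # 15
print(reordered)                     # [0, 1, 15, 14]
print(total_transitions(reordered))  # 5
```

Any practical scheme also has to tell the receiver how to restore the original order, and computing the ordering costs energy itself; the sketch ignores both, which is why the abstract's discussion of encoding overhead matters.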

16:00

End of session
Coffee Break in Exhibition Area

Coffee Breaks in the Exhibition Area

On all conference days (Tuesday to Thursday), coffee and tea will be served during the coffee breaks at the below-mentioned times in the exhibition area.

Lunch Breaks (Lunch Area)

On all conference days (Tuesday to Thursday), a seated lunch (lunch buffet) will be offered in the Lunch Area to fully registered conference delegates only. There will be badge control at the entrance to the lunch break area.

Tuesday, March 26, 2019

Coffee Break 10:30 - 11:30
Lunch Break 13:00 - 14:30
Keynote Lecture "Leonardo da Vinci, Humanism and Engineering between Florence and Milan" by Claudio Giorgione in room 1 13:50 - 14:20
Coffee Break 16:00 - 17:00

Wednesday, March 27, 2019

Coffee Break 10:00 - 11:00
Lunch Break 12:30 - 14:30
Keynote Lecture "Heterogeneous, High Scale Computing in the Era of Intelligent, Cloud-Connected" by David Pellerin, Amazon, US in room 1 13:50 - 14:20
Coffee Break 16:00 - 17:00

Thursday, March 28, 2019

Coffee Break 10:00 - 11:00
University Booth Best Demo Award Presentation at the University Booth 10:30
Lunch Break 12:30 - 14:00
Keynote Lecture "A Fundamental Look at Models and Intelligence" by Edward A. Lee, University of California, Berkeley, US in room 1 13:20 - 13:50
Coffee Break 15:30 - 16:00