Communication-efficient View-Pooling for Distributed Multi-View Neural Networks
Manik Singhal, Vijay Raghunathan and Anand Raghunathan
School of Electrical and Computer Engineering, Purdue University
{msingha, vr, raghunathan}@purdue.edu
ABSTRACT
Multi-view object detection, or the problem of detecting an object using multiple viewpoints, is an important problem in computer vision with varied applications such as distributed smart cameras and collaborative drone swarms. Multi-view object detection algorithms based on deep neural networks (DNNs) achieve high accuracy through view pooling, i.e., aggregating features corresponding to the different views. However, when these algorithms are realized on networks of edge devices, the communication cost incurred by view pooling often dominates the overall latency and energy consumption. In this paper, we propose techniques for communication-efficient view pooling that improve the efficiency of distributed multi-view object detection, and apply them to state-of-the-art multi-view DNNs. First, we propose significance-aware feature selection, which identifies and communicates only those features from each view that are likely to impact the pooled result (and hence, the final output of the DNN). Second, we propose multi-resolution view pooling, which divides views into dominant and non-dominant views, and down-scales the features from non-dominant views using an additional network layer before communicating them for pooling. The dominant and non-dominant views are pooled separately, and the results are jointly used to derive the final classification. We implement and evaluate the proposed pooling schemes on a model test-bed of twelve Raspberry Pi 3B+ devices and show that they achieve a 9×–36× reduction in data communicated and a 1.8× reduction in inference latency, with no degradation in accuracy.
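The two pooling schemes summarized above can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, the top-k magnitude criterion stands in for the paper's significance metric, and a simple average-pooling step stands in for the additional learned down-scaling layer; max-pooling is assumed as the view-pooling operator.

```python
import numpy as np

def significance_aware_select(features, k):
    """Keep only the k highest-magnitude features of a view and zero the
    rest, so that only k values (plus their indices) need to be
    communicated for pooling.  The magnitude criterion is a stand-in for
    the paper's significance measure."""
    idx = np.argsort(np.abs(features))[-k:]   # indices of the k largest-magnitude features
    sparse = np.zeros_like(features)
    sparse[idx] = features[idx]
    return sparse

def multi_resolution_pool(view_features, dominant_ids, scale=4):
    """Pool dominant views at full resolution and non-dominant views at a
    reduced resolution, then concatenate both pooled results for the
    downstream classifier.  Average pooling stands in for the paper's
    learned down-scaling layer."""
    dom = [f for i, f in enumerate(view_features) if i in dominant_ids]
    non = [f for i, f in enumerate(view_features) if i not in dominant_ids]
    # Full-resolution max-pool over dominant views.
    dom_pooled = np.max(np.stack(dom), axis=0)
    # Down-scale non-dominant views by `scale` before pooling them.
    non_small = [f.reshape(-1, scale).mean(axis=1) for f in non]
    non_pooled = np.max(np.stack(non_small), axis=0)
    # Jointly use both pooled results (here: concatenation).
    return np.concatenate([dom_pooled, non_pooled])
```

With twelve views, each device would transmit either its k selected features or its down-scaled feature vector, rather than the full feature map, which is the source of the communication savings reported above.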