A Distributed Framework for Low-Latency OpenVX over the RDMA NoC of a Clustered Manycore

Abstract : OpenVX is a standard proposed by the Khronos Group for cross-platform acceleration of computer vision and deep learning applications. OpenVX abstracts the complexity of the target processor architecture and automates the implementation of processing pipelines through high-level optimizations. While highly efficient OpenVX implementations exist for shared-memory multi-core processors, targeting OpenVX to clustered manycore processors is challenging, because such processors comprise multiple compute units, or clusters, each fitted with an on-chip local memory shared by several cores. This paper describes an efficient implementation of OpenVX that targets clustered manycore processors. We propose a framework that includes computation graph analysis, kernel fusion techniques, RDMA-based tiling into local memories, optimization passes, and a distributed execution runtime. This framework is implemented and evaluated on the 2nd-generation Kalray MPPA® clustered manycore processor. Experimental results show that super-linear speed-ups are obtained for multi-cluster execution by leveraging the bandwidth of on-chip memories and the capabilities of asynchronous RDMA engines.
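
The combination of kernel fusion and tiling into local memories mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' framework and does not use the OpenVX API: the two pointwise kernels, the function names, and the tile sizes are all hypothetical, and plain synchronous list copies stand in for the asynchronous RDMA transfers of a real clustered manycore. The sketch only shows that processing an image tile by tile, with both kernels applied while the tile sits in a small local buffer, gives the same result as two full passes over the image.

```python
# Illustrative sketch only: kernel names and tile sizes are hypothetical, and
# plain list copies stand in for the asynchronous RDMA transfers of a real target.

def brighten(px):
    """Pointwise kernel 1: add a constant, saturating at 255."""
    return min(px + 40, 255)

def threshold(px):
    """Pointwise kernel 2: binarize around 128."""
    return 255 if px >= 128 else 0

def run_unfused(image):
    """Reference pipeline: each kernel makes a full pass over the image and
    an intermediate image is materialized between the two passes."""
    intermediate = [[brighten(px) for px in row] for row in image]
    return [[threshold(px) for px in row] for row in intermediate]

def run_fused_tiled(image, tile_h, tile_w):
    """Fused pipeline: copy one tile into a small local buffer, apply both
    kernels there, then copy the result tile back."""
    height, width = len(image), len(image[0])
    output = [[0] * width for _ in range(height)]
    for y0 in range(0, height, tile_h):
        for x0 in range(0, width, tile_w):
            y1, x1 = min(y0 + tile_h, height), min(x0 + tile_w, width)
            # "Get": stand-in for an RDMA read into cluster-local memory.
            local = [row[x0:x1] for row in image[y0:y1]]
            # Fused compute: the intermediate values never leave the local buffer.
            local = [[threshold(brighten(px)) for px in row] for row in local]
            # "Put": stand-in for an RDMA write back to global memory.
            for dy, row in enumerate(local):
                output[y0 + dy][x0:x1] = row
    return output

def make_test_image(height, width):
    """Deterministic gradient image with values in 0..255."""
    return [[(7 * x + 13 * y) % 256 for x in range(width)] for y in range(height)]
```

Because both kernels are pointwise, tiles need no overlapping border; kernels with a neighborhood (for example a 3x3 filter) would require each tile to be fetched with a halo, which this sketch does not cover.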
Document type :
Conference papers

https://hal-univ-rennes1.archives-ouvertes.fr/hal-02049414
Contributor : Laurent Jonchère
Submitted on : Monday, April 8, 2019 - 1:54:14 PM
Last modification on : Monday, August 19, 2019 - 12:52:02 PM
Long-term archiving on : Wednesday, July 10, 2019 - 12:59:06 PM

File

PID5495673.pdf
Files produced by the author(s)

Citation

Julien Hascoet, Benoît Dupont de Dinechin, Karol Desnos, Jean-Francois Nezan. A Distributed Framework for Low-Latency OpenVX over the RDMA NoC of a Clustered Manycore. IEEE High Performance Extreme Computing Conference (HPEC 2018), Sep 2018, Waltham, MA, United States. ⟨10.1109/hpec.2018.8547736⟩. ⟨hal-02049414⟩
