The Open Neural Network Compiler (ONNC) project aims to provide a compiler to connect Open Neural Network Exchange Format (ONNX) to every deep learning accelerator (DLA). ONNX is a standard format for representing deep learning models that enables models to be correctly transferred between frameworks, like caffe, CNTK, mxnet, pytorch and TensorFlow. ONNX guarantees interoperability between frameworks, however, the industry still needs a backer to guarantee executability between DLAs - to ensure every DLA can execute ONNX models correctly.
ONNC is such a backer for DLA vendors. It is a kind of cross compiler that transforms ONNX models into binary machine code of DLAs.
Every DLA has its own unique and delicate design in its memory for fast data movement. A compiler must provide sufficient flexibility to handle with the wide range of varieties. ONNC leverages the IR design of ONNX and provides rich and effective algorithms to eliminate overhead of data movement. And the best is that DLA vendors can easily reuse these algorithms by just describing its own unique physical cost model. Skymizer hopes DLA vendors can be free from re-inventing these intricated optimization algorithms.
In this talk, we not only introduce ONNC framework, we will dive into ONNC internals. We will explain our plan to support uTensor backend for ARM Cortex-M and discuss some technical issues.