Yesterday, Facebook open-sourced QNNPACK, which stands for Quantized Neural Networks PACKage. It is a mobile-optimized library for the low-intensity convolutions used in state-of-the-art neural networks, and it provides implementations of convolutional, deconvolutional, and fully connected operators on quantized 8-bit tensors.
This library makes it possible to run advanced computer vision models such as Mask R-CNN and DensePose on phones in real time, and to perform image classification in less than 100 ms. QNNPACK is currently integrated into PyTorch 1.0 with the Caffe2 graph representation and is usable via the Caffe2 model representation.
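QNNPACK itself is a C library, but to give a rough idea of what "quantized 8-bit tensors" means in practice, here is a minimal NumPy sketch of the common affine (scale and zero-point) quantization scheme such libraries build on. The function names and parameter values below are our own illustration, not part of QNNPACK's API.

```python
import numpy as np

def quantize(x, scale, zero_point):
    """Map float values to uint8 so that x ~ scale * (q - zero_point)."""
    q = np.round(x / scale) + zero_point
    return np.clip(q, 0, 255).astype(np.uint8)

def dequantize(q, scale, zero_point):
    """Recover approximate float values from the uint8 representation."""
    return scale * (q.astype(np.int32) - zero_point)

x = np.random.randn(4, 4).astype(np.float32)
scale, zero_point = 0.05, 128          # example parameters, chosen by hand here
q = quantize(x, scale, zero_point)
print(np.abs(dequantize(q, scale, zero_point) - x).max())  # small quantization error
```

Storing activations and weights this way cuts memory traffic by roughly 4x compared with 32-bit floats, which is what makes 8-bit inference attractive on mobile CPUs.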
Running state-of-the-art artificial intelligence on mobile phones is not easy: it requires careful adaptation to get optimized performance out of the hardware. Until now, there was no performant open source implementation of several common neural network primitives, and as a result promising research models such as ResNeXt, CondenseNet, and ShuffleNet were underused. QNNPACK enables developers to use these models by providing high-performance implementations of convolutional, deconvolutional, and fully connected operations on quantized tensors.
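As a hedged illustration of how such a quantized operator works internally, the sketch below implements a fully connected layer on 8-bit inputs and weights with 32-bit integer accumulation, followed by requantization of the result back to uint8. This mirrors the general approach of 8-bit inference libraries; it is not QNNPACK's actual kernel, and all names and values are ours.

```python
import numpy as np

def quantized_fc(q_x, x_scale, x_zp, q_w, w_scale, w_zp, y_scale, y_zp):
    # Accumulate in int32 to avoid overflow, subtracting zero points first.
    acc = (q_x.astype(np.int32) - x_zp) @ (q_w.astype(np.int32) - w_zp).T
    # Requantize: a combined scale maps the int32 accumulator back to uint8.
    y = np.round(acc * (x_scale * w_scale / y_scale)) + y_zp
    return np.clip(y, 0, 255).astype(np.uint8)

q_x = np.random.randint(0, 256, size=(1, 64), dtype=np.uint8)   # activations
q_w = np.random.randint(0, 256, size=(10, 64), dtype=np.uint8)  # weight rows
out = quantized_fc(q_x, 0.02, 128, q_w, 0.01, 120, y_scale=0.1, y_zp=128)
print(out.shape)  # (1, 10)
```

The heavy lifting in a real library goes into doing this integer matrix multiplication with vectorized, cache-friendly kernels rather than the plain loop NumPy hides here.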
According to the Facebook research blog, QNNPACK-based Caffe2 operators are approximately 2x faster than TensorFlow Lite on the quantized state-of-the-art MobileNet v2 architecture. The library also speeds up operations, such as depthwise convolutions, that advanced neural network architectures rely on.
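For readers unfamiliar with the term, the following small NumPy sketch (illustrative only, with hypothetical shapes) shows what a depthwise convolution does: each input channel is convolved with its own single filter, so there is comparatively little arithmetic per byte of data loaded, which is why these layers need dedicated optimization on mobile CPUs.

```python
import numpy as np

def depthwise_conv2d(x, w):
    """x: (C, H, W) input; w: (C, kh, kw), one filter per channel; valid padding."""
    c, h, width = x.shape
    _, kh, kw = w.shape
    out = np.zeros((c, h - kh + 1, width - kw + 1), dtype=x.dtype)
    for ch in range(c):                      # every channel is handled independently
        for i in range(out.shape[1]):
            for j in range(out.shape[2]):
                out[ch, i, j] = np.sum(x[ch, i:i + kh, j:j + kw] * w[ch])
    return out

x = np.random.randn(32, 8, 8).astype(np.float32)   # 32 input channels
w = np.random.randn(32, 3, 3).astype(np.float32)   # one 3x3 filter per channel
print(depthwise_conv2d(x, w).shape)  # (32, 6, 6)
```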
Along with QNNPACK, Facebook has also open-sourced a quantized Caffe2 MobileNet v2 model, which delivers 1.3 percent higher accuracy than the corresponding TensorFlow model.
To learn more about QNNPACK, check out the official announcement on the Facebook blog.