A typical neural network contains a significant amount of redundant information. This lets us apply both lossless and lossy compression to it, often with fairly good results.
Huffman coding is a lossless compression scheme that is commonly cited in research papers on CNN compression. You can also use general-purpose libraries such as Apple's Compression or Facebook's Zstandard (zstd), which deliver state-of-the-art compression. Apple's Compression library contains four compression algorithms (three common and one Apple-specific):
- LZ4 is the fastest of the four.
- ZLIB implements DEFLATE, the algorithm behind standard zip and gzip archives.
- LZMA is slower but delivers the best compression.
- LZFSE is a bit faster than ZLIB and delivers slightly better compression. It is optimized for energy efficiency on Apple hardware.
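To get a feel for the speed versus ratio trade-off described in the list above, here is a minimal sketch in Python rather than in Apple's Compression API. It assumes that Python's standard `zlib` and `lzma` modules are acceptable stand-ins for the ZLIB and LZMA entries (LZ4 and LZFSE are not in the Python standard library, so they are left out). The helper `make_quantized_weights` is hypothetical: it fabricates a byte buffer with only 16 distinct values to mimic the redundancy of a quantized layer, and is not real model data.

```python
import lzma
import random
import time
import zlib


def make_quantized_weights(count, levels=16, seed=0):
    """Hypothetical stand-in for a quantized layer: one byte per weight, `levels` distinct values."""
    rng = random.Random(seed)
    return bytes(rng.randrange(levels) for _ in range(count))


def measure(compress, payload):
    """Return (compressed size in bytes, elapsed seconds) for one compression call."""
    start = time.perf_counter()
    packed = compress(payload)
    return len(packed), time.perf_counter() - start


weights = make_quantized_weights(100_000)
codecs = {"zlib": zlib.compress, "lzma": lzma.compress}
results = {name: measure(fn, weights) for name, fn in codecs.items()}

print(f"raw : {len(weights)} bytes")
for name, (size, seconds) in results.items():
    print(f"{name}: {size} bytes, {seconds * 1000:.1f} ms")

# Both codecs are lossless: decompressing returns the original bytes.
assert zlib.decompress(zlib.compress(weights)) == weights
assert lzma.decompress(lzma.compress(weights)) == weights
```

The exact sizes and timings depend on the data and the machine, so treat the printed numbers as illustrative only. Real weight files will behave differently from this synthetic buffer.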
Here is a code snippet that compresses data using the LZFSE algorithm from the Compression library...