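This lossless data compressor uses the traditional predictive approach: a model estimates the probability of each possible next symbol, and the actual symbol is then encoded using those probabilities.
The encoding step relies on arithmetic coding, a standard compression technique, and the neural network makes it adaptive by adjusting its predictions to the data it has already seen. Here’s an example from a Reddit user that explains how this works: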
“The rough frequency of `e' in English is about 50%. But if you just saw this partial sentence "I am going to th", the probability/frequency of `e' skyrockets to, say, 98%. In standard arithmetic coding scheme, you would still parametrize you encoder with 50% to encode the next "e" despite it's very likely (~98%) that "e" is the next character (you are using more bits than you need in this case), while with the help of a neural network, the frequency becomes adaptive.”
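To put numbers on the quote: an ideal arithmetic coder spends about -log2(p) bits on a symbol that the model assigns probability p. The short sketch below applies that formula to the two illustrative figures from the quote (50% and 98%). The function name `ideal_bits` is made up for this example and is not part of the compressor.

```python
import math

def ideal_bits(probability: float) -> float:
    """Ideal cost, in bits, of coding a symbol that the model gives this probability."""
    return -math.log2(probability)

# Figures taken from the quote above; they are illustrative, not measured.
static_cost = ideal_bits(0.50)    # fixed frequency table
adaptive_cost = ideal_bits(0.98)  # context-aware prediction

print(f"static model:   {static_cost:.3f} bits")
print(f"adaptive model: {adaptive_cost:.3f} bits")
```

With these figures, the static model pays a full bit for the "e", while the adaptive one pays only a few hundredths of a bit. That gap, summed over a whole file, is where the better prediction turns into a smaller output.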
To ensure that both the decoder and encoder are using the exact same model, the authors have developed a custom C library called LibNC. This library is responsible for implementing the various operations needed by the models. It has no dependency on any other libraries and has a C API.
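The need for identical models on both sides can be seen in a toy adaptive arithmetic coder. The sketch below is only an illustration under simplifying assumptions: it uses a simple count-based model over a hypothetical three-letter alphabet instead of a neural network, exact fractions instead of a fixed-precision coder, and none of its names (`AdaptiveModel`, `encode`, `decode`) come from LibNC.

```python
from fractions import Fraction

ALPHABET = "abc"  # hypothetical alphabet, chosen only for this example

class AdaptiveModel:
    """Toy stand-in for the predictor: probabilities follow symbol counts."""
    def __init__(self):
        self.counts = {s: 1 for s in ALPHABET}

    def intervals(self):
        """Map each symbol to its [low, high) slice of the unit interval."""
        total = sum(self.counts.values())
        low, out = Fraction(0), {}
        for s in ALPHABET:
            p = Fraction(self.counts[s], total)
            out[s] = (low, low + p)
            low += p
        return out

    def update(self, symbol):
        self.counts[symbol] += 1

def encode(message):
    model = AdaptiveModel()
    low, width = Fraction(0), Fraction(1)
    for s in message:
        lo, hi = model.intervals()[s]
        low, width = low + width * lo, width * (hi - lo)
        model.update(s)  # learn from the symbol just coded
    return low + width / 2  # any point inside the final interval works

def decode(code, length):
    model = AdaptiveModel()
    low, width = Fraction(0), Fraction(1)
    out = []
    for _ in range(length):
        target = (code - low) / width
        for s, (lo, hi) in model.intervals().items():
            if lo <= target < hi:
                out.append(s)
                low, width = low + width * lo, width * (hi - lo)
                model.update(s)  # must mirror the encoder's update exactly
                break
    return "".join(out)

message = "abacabaaab"
code = encode(message)
print(decode(code, len(message)) == message)
```

The decoder recovers the message only because it replays the same sequence of predictions and updates as the encoder. If the two models disagreed at any step, the intervals would diverge and every later symbol could be decoded wrongly.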
The performance of the models was evaluated on the enwik8 Hutter Prize benchmark. The models are slower: 1.5x slower for the LSTM model and 3x slower for the Transformer model. However, their description is simple and their memory consumption is reasonable compared with other compressors that give a similar compression ratio.
As for the compression ratio, the models have yet to reach the performance of CMIX, a lossless data compressor that achieves a very high compression ratio at the cost of high CPU and memory usage. In all the experiments, the Transformer model performs worse than the LSTM model, even though Transformers give the best performance in language modeling benchmarks.
To learn more, check out the paper, Lossless Data Compression with Neural Networks.