Thanks to the efforts of the Facebook AI Research team, there is a way to get vastly smaller models (in terms of the space they occupy on disk), as you saw in the Model quantization section in Chapter 2, Creating Models Using FastText Command Line. Models that take up hundreds of MB can be quantized down to just a couple of MB. For example, consider the DBpedia model released by Facebook, which can be accessed at https://fasttext.cc/docs/en/supervised-models.html: the regular model (the BIN file) is 427 MB, while the quantized model (the FTZ file) is only 1.7 MB.
This reduction in size is achieved by discarding some of the information encoded in the larger BIN files. The problem that needs to be solved here is how to keep the information that is important and how to identify the information...
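The main technique behind this reduction is product quantization of the embedding matrix: each vector is split into sub-vectors, and every sub-vector is replaced by the index of its nearest centroid in a small learned codebook, so that only the compact codebooks and the per-vector index codes need to be stored. The following pure-Python sketch illustrates the idea; the function names are hypothetical, and the naive k-means used here stands in for fastText's far more optimized implementation:

```python
import random

def product_quantize(vectors, num_subvectors, num_centroids, iters=10):
    """Compress vectors by splitting each into sub-vectors and replacing
    every sub-vector with the index of its nearest codebook centroid."""
    dim = len(vectors[0])
    sub_dim = dim // num_subvectors
    codebooks = []
    codes = [[] for _ in vectors]
    for s in range(num_subvectors):
        # Slice out the s-th sub-vector of every input vector
        subs = [v[s * sub_dim:(s + 1) * sub_dim] for v in vectors]
        # Naive k-means clustering in this sub-space
        centroids = random.sample(subs, num_centroids)
        for _ in range(iters):
            assign = [min(range(num_centroids),
                          key=lambda c: sum((a - b) ** 2
                                            for a, b in zip(sub, centroids[c])))
                      for sub in subs]
            for c in range(num_centroids):
                members = [subs[i] for i, a in enumerate(assign) if a == c]
                if members:
                    centroids[c] = [sum(col) / len(members)
                                    for col in zip(*members)]
        codebooks.append(centroids)
        # Store only the centroid index for this sub-vector
        for i, a in enumerate(assign):
            codes[i].append(a)
    return codebooks, codes

def reconstruct(codebooks, code):
    """Approximate the original vector from its codebook indices."""
    vec = []
    for centroids, idx in zip(codebooks, code):
        vec.extend(centroids[idx])
    return vec
```

Instead of `dim` floats per vector, each vector is now stored as `num_subvectors` small integers, which is where the dramatic size reduction comes from; the price is that reconstruction is only approximate, which is exactly the "thrown out" information mentioned above.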