Optimus as a cohesive API
The main goal of Optimus is to provide a cohesive API so that you can handle data and create ML models in the simplest way possible. In Optimus, the ml accessor gives you access to the ML algorithms that Optimus implements.
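The accessor idea can be illustrated in plain Python. The following is a minimal sketch and not Optimus's actual implementation: ToyFrame, ToyMLAccessor, and the mean_baseline method are hypothetical names used only to show how ML helpers can be grouped under a single ml namespace on a dataframe-like object.

```python
from statistics import mean


class ToyMLAccessor:
    """Hypothetical namespace that groups ML helpers for one frame."""

    def __init__(self, frame):
        self._frame = frame

    def mean_baseline(self, target):
        # Trivial "model": always predict the mean of the target column.
        value = mean(self._frame.columns[target])
        return lambda row: value


class ToyFrame:
    """Hypothetical stand-in for a dataframe; columns is a dict of lists."""

    def __init__(self, columns):
        self.columns = columns

    @property
    def ml(self):
        # Accessing frame.ml returns the ML namespace bound to this frame.
        return ToyMLAccessor(self)


frame = ToyFrame({"price": [10.0, 20.0, 30.0]})
model = frame.ml.mean_baseline("price")
prediction = model({"price": None})
```

The point of the pattern is that data handling stays on the frame itself while every ML entry point sits behind one attribute, so the same call shape can be kept regardless of which engine backs the frame.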
ML algorithms can be hard to implement in parallel; for example, density-based spatial clustering of applications with noise (DBSCAN) is not implemented in Spark. For Optimus, we implemented the algorithms that were common to all the libraries, plus the ones that we considered must-haves but that were missing from a specific library. First, let's see which library powers each Optimus engine:
- pandas uses scikit-learn.
- Dask uses Dask-ML.
- cuDF uses cuML.
- Dask cuDF uses cuML.
- Vaex uses vaex.ml.
- Spark uses MLlib.
- Ibis has no ML library available yet.
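The list above can be written down as a small lookup table. This is only a sketch for reference: the ML_BACKENDS dictionary, its lowercase keys, and the ml_backend_for helper are hypothetical and are not part of the Optimus API.

```python
from typing import Optional

# Engine name -> ML library, transcribed from the list above.
# Names and keys are illustrative assumptions, not Optimus internals.
ML_BACKENDS = {
    "pandas": "scikit-learn",
    "dask": "Dask-ML",
    "cudf": "cuML",
    "dask_cudf": "cuML",
    "vaex": "vaex.ml",
    "spark": "MLlib",
    "ibis": None,  # no ML library available yet
}


def ml_backend_for(engine: str) -> Optional[str]:
    """Return the ML library behind an engine, or None if there is none."""
    key = engine.strip().lower()
    if key not in ML_BACKENDS:
        raise ValueError(f"Unknown engine: {engine!r}")
    return ML_BACKENDS[key]
```

For instance, ml_backend_for("spark") returns "MLlib", while ml_backend_for("ibis") returns None because no ML library backs that engine yet.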
With that said, let's see which algorithms are implemented in each library. Have a look at the...