Classifying beyond the basics
Databricks AutoML is a solid starting point for classification, regression, and forecasting models. Beyond tree-based models, gradient-boosted models, and logistic regression, there are more advanced classification techniques that you can use with the lakehouse, which is designed to work with virtually any open source ML model.
The Databricks ML runtimes include pre-built DL infrastructure and libraries such as PyTorch, TensorFlow, and Hugging Face transformers. DL models are computationally intensive, so distributed DL (DDL) frameworks such as Horovod work in conjunction with these DL libraries to spread training across multiple workers. If you are working with PyTorch, be sure to read the PyTorch on Databricks – Introducing the Spark PyTorch Distributor blog (https://www.databricks.com/blog/2023/04/20/pytorch-databricks-introducing-spark-pytorch-distributor.html).
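To make the Spark PyTorch Distributor idea concrete, here is a minimal sketch of how a small PyTorch classifier might be launched with TorchDistributor. It assumes PySpark 3.4 or later, an active Spark session (as in a Databricks notebook), and CPU-only workers. The toy data and the names train_fn and run_distributed are hypothetical and exist only for illustration.

```python
def train_fn(epochs=2, lr=0.05):
    """Training loop that runs once in every worker process."""
    # Imports live inside the function because the function is serialized
    # and executed in separate worker processes.
    import torch
    import torch.distributed as dist
    from torch.nn.parallel import DistributedDataParallel as DDP

    # The distributor starts workers with the rank and world-size environment
    # variables already set; "gloo" is the CPU backend ("nccl" is for GPUs).
    dist.init_process_group(backend="gloo")

    # Toy binary classification data: the label is 1 when the features sum
    # to more than 0. Every worker builds the same data here for brevity;
    # a real job would shard it, for example with DistributedSampler.
    torch.manual_seed(0)
    X = torch.randn(256, 4)
    y = (X.sum(dim=1) > 0).float().unsqueeze(1)

    model = DDP(torch.nn.Linear(4, 1))
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = torch.nn.BCEWithLogitsLoss()

    for _ in range(epochs):
        optimizer.zero_grad()
        loss = loss_fn(model(X), y)
        loss.backward()  # DDP averages gradients across workers in this call
        optimizer.step()

    final_loss = loss.item()
    dist.destroy_process_group()
    return final_loss


def run_distributed(num_processes=2):
    """Launch train_fn on the driver node with num_processes workers."""
    # Assumption: PySpark >= 3.4, where TorchDistributor was introduced.
    from pyspark.ml.torch.distributor import TorchDistributor

    distributor = TorchDistributor(
        num_processes=num_processes,
        local_mode=True,  # run all processes on the driver node
        use_gpu=False,    # CPU-only, matching the "gloo" backend above
    )
    # Positional arguments after train_fn are forwarded to it.
    return distributor.run(train_fn, 2, 0.05)
```

In a notebook attached to an ML runtime cluster, calling run_distributed() would start the workers and return the value produced by train_fn. Setting local_mode=False and use_gpu=True (with the "nccl" backend) is the usual direction for multi-node GPU training, though the right settings depend on your cluster.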
Another exciting type of...