Packaging dependencies with MLflow models
In a Databricks environment, files such as model weights and tokenizer caches commonly reside in DBFS. However, for better performance, it is recommended to bundle these artifacts directly within the model artifact itself. This ensures that all dependencies are statically captured when the model is logged, rather than fetched from DBFS at deployment time.
The log_model() method allows you to log not only the model itself but also its dependent files and artifacts. The function takes an artifacts parameter in which you can specify paths to these additional files.
Here is an example of how to log custom artifacts with your model. Note that log_model() also requires an artifact path and, for a custom model, a python_model instance such as the CustomMLflowModel class defined below; "custom_model" here is just a placeholder name:

import mlflow

mlflow.pyfunc.log_model(
    artifact_path="custom_model",
    python_model=CustomMLflowModel(),
    artifacts={
        "model-weights": "/dbfs/path/to/file",
        "tokenizer_cache": "./tokenizer_cache",
    },
)

When the model is logged, MLflow copies each of these files into the model's artifact directory, so they travel with the model wherever it is deployed.
In custom Python models logged with MLflow, you can access these dependencies within the model's code using the context.artifacts attribute:
class CustomMLflowModel(mlflow.pyfunc.PythonModel):
    def load_context(self, context):
        # context.artifacts maps each key passed to log_model()
        # to the local path where MLflow placed the copied file
        self.weights_path = context.artifacts["model-weights"]
        self.tokenizer_path = context.artifacts["tokenizer_cache"]

    def predict(self, context, model_input):
        # Use the loaded artifacts to produce predictions
        ...
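Once the model is logged, loading it back (for example, in a serving endpoint or another notebook) downloads the bundled artifacts and calls load_context() with their local paths before any prediction is made. Here is a minimal sketch, assuming the log_model() call above was captured in a model_info result and using a hypothetical input DataFrame:

import mlflow
import pandas as pd

# log_model() returns a ModelInfo whose model_uri points at the logged model
loaded_model = mlflow.pyfunc.load_model(model_info.model_uri)

# load_context() has already resolved the artifact paths by this point
predictions = loaded_model.predict(pd.DataFrame({"text": ["example input"]}))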