Feature stores
A feature store is a repository of features that have been created, versioned, and made ready for model training. Recording and storing features is critical for reproducibility. I have seen many cases where data scientists created models with no documentation on how to retrain them other than a mess of complex code. Like a model store, a feature store acts as a catalog. Feature stores are normally organized into databases and feature tables.
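To make the database and feature table organization concrete, here is a minimal, library-free sketch in plain Python. It is only an illustration of the idea, not how Databricks implements its feature store: the `FeatureStore` and `FeatureTable` classes, the `demo_db` and `user_features` names, and the `age` and `purchases_30d` columns are all hypothetical, and versioning is left out for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class FeatureTable:
    """Hypothetical feature table: rows of features indexed by a key column."""
    name: str
    key: str
    rows: dict = field(default_factory=dict)  # key value -> feature dict

    def write(self, records):
        # Insert or overwrite one row per key value.
        for record in records:
            self.rows[record[self.key]] = dict(record)

    def lookup(self, key_value, feature_names):
        # Return only the requested features for a single key value.
        row = self.rows[key_value]
        return {name: row[name] for name in feature_names}


@dataclass
class FeatureStore:
    """Hypothetical catalog mapping 'database.table' names to feature tables."""
    tables: dict = field(default_factory=dict)

    def create_table(self, database, table, key):
        # Register a new, empty feature table under its qualified name.
        full_name = f"{database}.{table}"
        self.tables[full_name] = FeatureTable(name=full_name, key=key)
        return self.tables[full_name]


# Illustrative data only; these values are made up for the sketch.
store = FeatureStore()
users = store.create_table("demo_db", "user_features", key="user_id")
users.write([
    {"user_id": 1, "age": 34, "purchases_30d": 5},
    {"user_id": 2, "age": 27, "purchases_30d": 0},
])
result = users.lookup(2, ["purchases_30d"])
```

The point of the sketch is that features live in named tables inside a database, and each row is retrieved by a key rather than by rerunning the code that originally computed it.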
Let’s jump right in and go through the Databricks Feature Store APIs:
- First, let’s import the necessary libraries:
from databricks import feature_store
from databricks.feature_store import FeatureLookup
import random
- Now, let’s create our name and record our database and schema. We are using the users DataFrame. We will also set our lookup_key, which in this case is user_id. A lookup key is just the value that identifies the feature store when we’re searching for...