Diving into feature stores and the problems they solve
As more teams in an organization start to use AI and ML to solve business use cases, they need a centralized, reusable, and easily discoverable repository of features. This repository is called a feature store.
All the curated features live in centralized, governed, access-controlled storage, such as a curated data lake. Each data science team can be granted access to feature tables based on its needs. Just as we can track data lineage in an enterprise data lake, we can track the lineage of a feature table logged in Databricks Feature Store. We can also see all the downstream models that consume features from a registered feature table.
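To make these ideas concrete, here is a minimal sketch in plain Python of a registry that records upstream lineage, per-team access, and downstream consumers for each feature table. It is a toy illustration only and is not the Databricks Feature Store API; the names ToyFeatureRegistry, FeatureTable, register, read, and downstream_models are all hypothetical, as are the table, team, and model names in the usage lines.

```python
from dataclasses import dataclass, field


@dataclass
class FeatureTable:
    """Hypothetical record for one registered feature table."""
    name: str
    source_tables: list   # upstream lineage: where the features came from
    allowed_teams: set    # teams that have been granted read access
    consumers: set = field(default_factory=set)  # downstream models


class ToyFeatureRegistry:
    """Toy in-memory registry; a real feature store persists this metadata."""

    def __init__(self):
        self._tables = {}

    def register(self, name, source_tables, allowed_teams):
        # Record the table together with its lineage and access list.
        table = FeatureTable(name, list(source_tables), set(allowed_teams))
        self._tables[name] = table
        return table

    def read(self, name, team, model):
        # Enforce access control, then record which model consumed the table.
        table = self._tables[name]
        if team not in table.allowed_teams:
            raise PermissionError(f"{team} has no access to {name}")
        table.consumers.add(model)
        return table

    def downstream_models(self, name):
        # List every model that has read features from this table.
        return sorted(self._tables[name].consumers)


# Hypothetical usage: one team registers a table, a model consumes it.
registry = ToyFeatureRegistry()
registry.register("customer_features", ["raw.orders"], {"churn_team"})
registry.read("customer_features", team="churn_team", model="churn_model_v1")
print(registry.downstream_models("customer_features"))
```

The point of the sketch is that lineage and consumer tracking fall out of routing every registration and read through a single registry, which is what lets a feature store answer both "where did this feature come from?" and "which models depend on it?".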
Large organizations can have hundreds of data science teams tackling different business questions, and each team may have its own domain knowledge and expertise. Feature engineering often requires heavy processing. Without a feature store...