9.2 BART models
A Bayesian additive regression trees (BART) model is a sum of m trees that we use to approximate a function [Chipman et al., 2010]. To complete the model, we need to set priors over trees. The main function of such priors is to prevent overfitting while retaining the flexibility that trees provide. Priors are designed to keep the individual trees relatively shallow and the values at the leaf nodes relatively small.
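To make the sum-of-trees idea concrete, here is a minimal sketch in plain Python. It is not how PyMC-BART represents or samples trees: the Stump class, the thresholds, and the leaf values are all hypothetical and hand-picked for illustration, and no priors or fitting are involved. It only shows how several shallow trees with small leaf values can add up to a more flexible function.

```python
from dataclasses import dataclass


@dataclass
class Stump:
    """A hypothetical depth-1 regression tree: one split, two leaf values."""
    threshold: float
    left_value: float   # returned when x < threshold
    right_value: float  # returned otherwise

    def predict(self, x: float) -> float:
        return self.left_value if x < self.threshold else self.right_value


def sum_of_trees(trees, x):
    """Evaluate the additive model: the sum of every tree's prediction at x."""
    return sum(tree.predict(x) for tree in trees)


# Three shallow trees with small leaf values (hand-picked, not fitted).
trees = [
    Stump(threshold=1.0, left_value=0.0, right_value=0.5),
    Stump(threshold=2.0, left_value=0.0, right_value=0.5),
    Stump(threshold=3.0, left_value=0.0, right_value=0.5),
]

# Each tree alone is a single step; their sum is a three-step staircase.
predictions = [sum_of_trees(trees, x) for x in (0.5, 1.5, 2.5, 3.5)]
print(predictions)  # [0.0, 0.5, 1.0, 1.5]
```

No single tree here can represent the staircase, but the sum can. In an actual BART model the tree structures and leaf values are not fixed by hand like this; they are unknown quantities governed by the priors described above.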
PyMC does not support BART models directly, but we can use PyMC-BART, a Python module that extends PyMC to support them. PyMC-BART offers:
A BART random variable that works much like other distributions in PyMC, such as pm.Normal, pm.Poisson, etc.
A sampler called PGBART, since trees cannot be sampled with PyMC’s default step methods such as NUTS or Metropolis.
The following utility functions, which help us work with the results of a BART model:
pmb.plot_pdp: A function to generate partial dependence plots [Friedman, ...