9.5 Choosing the number of trees
The number of trees (m) controls the flexibility of the BART function. As a rule of thumb, the default value of 50 should be enough to get a good approximation, and larger values, like 100 or 200, should provide a more refined answer. It is usually hard to overfit by increasing the number of trees, because the larger the number of trees, the smaller the values at the leaf nodes.
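To make the last point concrete, here is a minimal sketch of how the prior scale of the leaf values shrinks with m. It assumes the prior proposed by Chipman et al. [2010], where the response is rescaled to [-0.5, 0.5] and each leaf value gets a Normal prior with standard deviation 0.5 / (k * sqrt(m)), with k = 2 as their default. The function name `leaf_prior_sd` is ours, and a particular BART implementation may scale its leaf values differently.

```python
import math


def leaf_prior_sd(m, k=2.0):
    """Prior standard deviation of a single leaf value.

    Assumes the Chipman et al. [2010] prior with the response rescaled
    to [-0.5, 0.5]: sigma_mu = 0.5 / (k * sqrt(m)).
    """
    return 0.5 / (k * math.sqrt(m))


for m in (50, 100, 200):
    sd_leaf = leaf_prior_sd(m)
    # The sum of m independent leaf values has sd sqrt(m) * sd_leaf = 0.5 / k,
    # which does not depend on m.
    sd_sum = math.sqrt(m) * sd_leaf
    print(f"m={m:3d}  leaf sd={sd_leaf:.4f}  sd of sum={sd_sum:.4f}")
```

Under this assumed prior, adding trees makes each individual tree contribute less, while the prior scale of the sum of trees stays fixed. This is one way to see why increasing m rarely leads to overfitting.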
In practice, you may be worried about overshooting m because the computational cost of BART, in both time and memory, increases with the number of trees. One way to tune m is to perform K-fold cross-validation, as recommended by Chipman et al. [2010]. Another option is to approximate cross-validation by using LOO, as discussed in Chapter 5. We have observed that LOO can indeed help to provide a reasonable value of m [Quiroga et al., 2022].
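The K-fold option can be sketched in a few lines of plain Python. This is only an illustration of the bookkeeping, not a BART implementation: `fit_predict` is a hypothetical callback that you would write yourself to fit a BART model with m trees on the training fold and return point predictions for the held-out fold. The helper names and the use of squared error as the score are also our assumptions.

```python
import random
import statistics


def kfold_indices(n, k, seed=0):
    """Shuffle range(n) and split it into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cv_score(m, X, y, fit_predict, k=5, seed=0):
    """Mean held-out squared error for a model with m trees."""
    errors = []
    for test_idx in kfold_indices(len(y), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        # fit_predict is a hypothetical user-supplied function:
        # (m, X_train, y_train, X_test) -> list of predictions for X_test
        preds = fit_predict(
            m,
            [X[i] for i in train_idx],
            [y[i] for i in train_idx],
            [X[i] for i in test_idx],
        )
        errors.extend((p - y[i]) ** 2 for p, i in zip(preds, test_idx))
    return statistics.fmean(errors)


def select_m(candidates, X, y, fit_predict, k=5, seed=0):
    """Return the candidate m with the lowest K-fold error, plus all scores."""
    scores = {m: cv_score(m, X, y, fit_predict, k, seed) for m in candidates}
    best = min(scores, key=scores.get)
    return best, scores
```

A call such as `select_m([20, 50, 100, 200], X, y, fit_predict)` refits the model once per fold and per candidate, which is exactly why K-fold cross-validation becomes expensive for BART and why an approximation such as LOO is attractive. Because the cost grows with m, it can also be reasonable to prefer the smallest candidate whose score is close to the best one.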