Occam's razor is an idea that appears not just in data science but in science in general. It's a problem-solving principle suggesting that when a simple model and a complex model both explain the same phenomenon, we should prefer the simple one. The idea is that a model without extraneous complexity or features is more likely to be correct than an overly complicated one. The hope with regularization methods such as Bayesian ridge regression is to obtain models that are only as complex as they need to be, and that still do a decent job of explaining the data without overfitting.
By comparison, out of the box, OLS is prone to overfitting. Let's take a look at the following base function, which we'll use to generate a dataset:
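(Shown here as a minimal sketch: a sine curve sampled at random points with Gaussian noise added. The function name, noise level, and sample size are illustrative assumptions, not a definitive choice.)

```python
import numpy as np

# Assumed base function for illustration: a single period of a sine curve.
def base_function(x):
    return np.sin(2 * np.pi * x)

rng = np.random.default_rng(0)

# Draw a small random sample of x-values and add Gaussian noise to the outputs,
# simulating noisy observations of the underlying function.
X = rng.uniform(0, 1, size=20)
y = base_function(X) + rng.normal(scale=0.2, size=20)
```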
Here, we can see randomly selected points from this function, with noise added...
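One way to see the contrast described above is to fit the same noisy sample with plain OLS and with Bayesian ridge regression on a flexible polynomial basis. The sketch below assumes scikit-learn, a degree-15 polynomial expansion, and BayesianRidge with its default priors; these are illustrative choices rather than the exact setup used here.

```python
import numpy as np
from sklearn.linear_model import BayesianRidge, LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)

# Noisy sample from an assumed sine base function, as in the sketch above.
X = rng.uniform(0, 1, size=(20, 1))
y = np.sin(2 * np.pi * X).ravel() + rng.normal(scale=0.2, size=20)

# High-degree polynomial features with plain OLS: nothing constrains the
# coefficients, so the fit is free to chase the noise in the sample.
ols = make_pipeline(PolynomialFeatures(degree=15), LinearRegression())
ols.fit(X, y)

# The same features with Bayesian ridge regression: the priors shrink the
# coefficients, pulling the fit back toward a simpler, smoother curve.
bayes = make_pipeline(PolynomialFeatures(degree=15), BayesianRidge())
bayes.fit(X, y)

# Compare predictions over a dense grid; the OLS curve typically swings far
# outside the data range, while the Bayesian ridge curve stays tame.
X_grid = np.linspace(0, 1, 200).reshape(-1, 1)
print("OLS prediction range:           ",
      ols.predict(X_grid).min(), "to", ols.predict(X_grid).max())
print("Bayesian ridge prediction range:",
      bayes.predict(X_grid).min(), "to", bayes.predict(X_grid).max())
```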