Multiple R-squared
While calculating R² previously, we saw that it is the proportion of the variance in y that is explained by the model:

$$R^2 = 1 - \frac{\operatorname{var}(\varepsilon)}{\operatorname{var}(y)}$$
Since var(ε) is the mean squared error and var(y) is the mean squared difference from the mean, we can multiply both terms by the sample size and arrive at the following alternative equation for R²:

$$R^2 = 1 - \frac{\sum (y - \hat{y})^2}{\sum (y - \bar{y})^2}$$
The ratio in this equation is simply the sum of squared residuals over the sum of squared differences from the mean. Incanter contains the incanter.core/sum-of-squares function, which makes this very simple to express:
(defn r-squared [coefs x y]
  (let [fitted      (i/mmult x coefs)
        residuals   (i/minus y fitted)
        differences (i/minus y (s/mean y))
        rss         (i/sum-of-squares residuals)
        ess         (i/sum-of-squares differences)]
    (- 1 (/ rss ess))))
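As a quick sanity check of the formula itself, here is a minimal sketch in plain Clojure that needs no Incanter. The names sum-sq and r-squared-plain are hypothetical helpers introduced only for this illustration, and the function takes already-fitted values rather than a coefficient matrix.

```clojure
;; Illustrative sketch only: sum-sq and r-squared-plain are hypothetical
;; names, not part of Incanter.

(defn sum-sq
  "Sum of the squares of the numbers in xs."
  [xs]
  (reduce + (map #(* % %) xs)))

(defn r-squared-plain
  "R-squared from sequences of fitted and observed values:
   1 minus (sum of squared residuals / sum of squared differences from the mean)."
  [fitted y]
  (let [y-bar       (/ (reduce + y) (count y))
        residual-ss (sum-sq (map - y fitted))
        total-ss    (sum-sq (map #(- % y-bar) y))]
    (- 1 (/ residual-ss total-ss))))

;; A perfect fit leaves no residuals, so R-squared equals 1.
(r-squared-plain [1 2 3 4] [1 2 3 4])

;; Predicting the mean everywhere explains nothing, so R-squared equals 0.
(r-squared-plain [5/2 5/2 5/2 5/2] [1 2 3 4])

;; A partial fit: residual sum of squares 1, total sum of squares 5,
;; so R-squared is exactly 4/5 thanks to Clojure's ratio arithmetic.
(r-squared-plain [3/2 3/2 7/2 7/2] [1 2 3 4])
```

The Incanter-based r-squared function performs the same calculation, but derives the fitted values by multiplying the feature matrix by the coefficients.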
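In r-squared, we use the variable name rss for the residual sum of squares. The name ess is read here as the explained sum of squares, although the quantity it holds, the sum of squared differences from the mean, is more commonly called the total sum of squares. We can calculate R² in matrix form for our new model as follows: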
(defn ex-3-21 [] (let [data (swimmer-data) x (->> (feature-matrix ["Height...