Performing the t-test
What sets the t-test apart is the probability distribution from which the p-value is calculated. Having calculated our t-statistic, we look up its value in the t-distribution parameterized by the degrees of freedom of our data:
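(defn t-test [a b]
  ;; s and i are assumed to alias incanter.stats and incanter.core
  (let [df (+ (count a) (count b) -2)]
    ;; one-tailed p-value: area of the t-distribution above |t|
    (- 1 (s/cdf-t (i/abs (t-stat a b)) :df df))))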
The degrees of freedom are two less than the combined size of the two samples, which comes to 298 for our data.
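As a minimal sketch of the degrees-of-freedom arithmetic on its own, the calculation can be pulled out into a small helper. The names degrees-of-freedom, sample-a, and sample-b are hypothetical and do not come from the text. The even split of 150 observations per sample is also an assumption, since the text only implies a combined size of 300.

```clojure
;; Degrees of freedom for a two-sample t-test: n_a + n_b - 2.
(defn degrees-of-freedom [a b]
  (+ (count a) (count b) -2))

;; Hypothetical placeholder samples of 150 observations each.
;; The 150/150 split is assumed; only the combined size matters here.
(def sample-a (repeat 150 0))
(def sample-b (repeat 150 0))

(degrees-of-freedom sample-a sample-b) ;; => 298
```

Only the sample sizes enter the calculation, so the values inside each sample are irrelevant to the result.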
Recall that we are performing a hypothesis test, so let's state our null and alternative hypotheses:
H0: This sample is drawn from a population with a supplied mean
H1: This sample is drawn from a population with a greater mean
Let's run the example:
(defn ex-2-16 []
  (let [data (->> (load-data "new-site.tsv")
                  (:rows)
                  (group-by :site)
                  (map-vals (partial map :dwell-time)))
        a (get data 0)
        b (get data 1)]
    (t-test a b)))

;; 0.0503
This...