Chi-squared multiple significance testing
Not all categorical variables are dichotomous, as sex (male or female) and survival (survived or perished) are. Although we would expect a categorical variable to have a finite number of categories, there is no hard upper limit on the number of categories a particular attribute can have.
We could use other categorical variables to separate out the passengers on the Titanic, such as the class in which they were traveling. There were three class levels on the Titanic, and the frequency-table function we constructed at the beginning of this chapter is already able to handle multiple classes.
(defn ex-4-12 []
  (->> (load-data "titanic.tsv")
       (frequency-table :count [:survived :pclass])))
This code generates the following frequency table:
| :pclass | :survived | :count |
|---------+-----------+--------|
| third   | y         | 181    |
| third   | n         | 528    |
| second  | y         | 119    |
| second  | n         | 158    |
| first   | n         | 123    |
| ...
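To see why a grouping like this copes with any number of categories, here is a minimal sketch that uses only core Clojure. It is not the chapter's frequency-table function: count-by and sample-rows are hypothetical names introduced for illustration, and the rows are made-up stand-ins rather than the contents of titanic.tsv.

```clojure
;; Hypothetical stand-in rows for illustration; NOT the Titanic data.
(def sample-rows
  [{:pclass "first"  :survived "y"}
   {:pclass "first"  :survived "n"}
   {:pclass "second" :survived "n"}
   {:pclass "third"  :survived "n"}
   {:pclass "third"  :survived "n"}
   {:pclass "third"  :survived "y"}])

(defn count-by
  "Counts the rows for each distinct combination of the keys in ks.
  Illustrative helper only; an assumption, not the chapter's frequency-table."
  [ks rows]
  ;; select-keys narrows each row to the grouping columns, and
  ;; frequencies tallies how often each resulting map occurs.
  (frequencies (map #(select-keys % ks) rows)))

;; Groups by survival and class. Nothing here depends on how many
;; distinct values :pclass takes.
(count-by [:survived :pclass] sample-rows)
```

Because the grouping is driven by the distinct values found in the data, a third (or tenth) class simply adds more keys to the resulting map.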