Rule models
We can best understand rule models using the principles of discrete mathematics. Let's review some of these principles.
Let X be a set of features, the feature space, and C be a set of classes. We can define the ideal classifier for X as follows:
c: X → C
A set of examples in the feature space, each labeled by c, is defined as follows:
D = {(x1, c(x1)), ..., (xn, c(xn))} ⊆ X × C
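As a minimal sketch of these definitions, assume a toy feature space of small integers and a hypothetical ideal classifier that labels each instance by parity. Both are illustrative choices, not taken from the text. D can then be built as a set of pairs:

```python
# Hypothetical toy setup: X is a small set of integers, C is {"even", "odd"}.
X = set(range(10))
C = {"even", "odd"}

def c(x):
    """Illustrative 'ideal classifier' c: X -> C (an assumption for this sketch)."""
    return "even" if x % 2 == 0 else "odd"

# A sample of instances x1, ..., xn drawn from X.
sample = [1, 2, 3, 6, 7]

# D = {(x1, c(x1)), ..., (xn, c(xn))}, a subset of X × C.
D = {(x, c(x)) for x in sample}

# Check that every example really lies in X × C.
is_subset = all(x in X and label in C for x, label in D)
```

Here is_subset confirms that every pair in D is an element of X × C.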
A splitting of X is a partition of X into a set of mutually exclusive subsets X1, ..., Xs, so we can say the following:
X = X1 ∪ ... ∪ Xs
This induces a splitting of D into D1, ..., Ds, where Dj = {(x, c(x)) ∈ D | x ∈ Xj} for j = 1, ..., s.
In other words, Dj is the subset of examples in D whose instances fall in Xj, each carrying the label assigned by the ideal classifier c.
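The induced splitting can be sketched in a few lines. The toy feature space, the parity classifier, and the three subsets below are assumptions made up for illustration, as are the helper names is_partition and induced_split:

```python
# Hypothetical toy setup (not from the text): integers 0..9 labeled by parity.
X = set(range(10))

def c(x):
    return "even" if x % 2 == 0 else "odd"

D = {(x, c(x)) for x in [1, 2, 3, 6, 7]}

# A splitting of X into s = 3 mutually exclusive subsets X1, X2, X3.
parts = [{0, 1, 2}, {3, 4, 5, 6}, {7, 8, 9}]

def is_partition(parts, X):
    """True if the parts are pairwise disjoint and their union is X."""
    union = set().union(*parts)
    total = sum(len(p) for p in parts)
    # Equal sizes rule out any element appearing in two parts.
    return union == X and total == len(X)

def induced_split(D, parts):
    """Return [D1, ..., Ds] with Dj = {(x, c(x)) in D | x in Xj}."""
    return [{(x, label) for (x, label) in D if x in Xj} for Xj in parts]

D_parts = induced_split(D, parts)
```

Because the subsets Xj are mutually exclusive and cover X, every example in D lands in exactly one Dj, so the Dj together reassemble D.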
In the following table, we define a number of measurements using sums of indicator functions. An indicator function is written I[...] and is equal to one if the statement between the square brackets is true and zero...