Our first examples
Let's begin with a few simple examples to understand what is going on.
For some of us, it is tempting to reach straight for the shiniest algorithms and hyper-parameter optimization instead of doing the less glamorous work of understanding things step by step.
A simple 2D example
Let's develop our intuition of how the autoencoder works with a simple two-dimensional example.
We first generate 10,000 points from a two-dimensional normal distribution with mean 0 and identity covariance, so each coordinate has variance 1 and the two coordinates are uncorrelated:
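library(MASS)   # provides mvrnorm()
library(keras)

# Identity covariance: unit variance per coordinate, no correlation
Sigma <- matrix(c(1,0,0,1),2,2)
n_points <- 10000

# Draw the sample and store it as a data frame with columns V1 and V2
df <- mvrnorm(n=n_points, rep(0,2), Sigma)
df <- as.data.frame(df)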
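A quick sanity check can confirm that the sample matches the distribution we asked for. The sketch below is not part of the original listing: it fixes a seed (the value 123 is arbitrary) so that the run is reproducible, generates the data the same way, and compares the sample mean and covariance with their targets.

```r
library(MASS)

set.seed(123)  # arbitrary seed, added here only for reproducibility

Sigma <- matrix(c(1, 0, 0, 1), 2, 2)   # identity covariance
n_points <- 10000
df <- as.data.frame(mvrnorm(n = n_points, mu = rep(0, 2), Sigma = Sigma))

# mvrnorm() returns a matrix without column names here,
# so as.data.frame() names the columns V1 and V2
print(names(df))

# Sample means should be close to 0
print(colMeans(df))

# Sample covariance should be close to the identity matrix
print(round(cov(df), 2))
```

With 10,000 points, the means and the off-diagonal covariance should land within a few hundredths of 0, and the variances within a few hundredths of 1.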
The distribution of the values should look as follows:
Distribution of the variable V1 we just generated; the variable V2 looks fairly similar.
Distribution of the variables V1 and V2 we generated.
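The plotting code behind these two figures is not shown. A minimal sketch that produces comparable plots with base R graphics might look like the following. It assumes the first figure is a histogram of V1 and the second a scatter plot of V1 against V2; the seed, the number of bins, and the styling are likewise assumptions.

```r
library(MASS)

set.seed(123)  # arbitrary seed for reproducibility (an assumption)
Sigma <- matrix(c(1, 0, 0, 1), 2, 2)
df <- as.data.frame(mvrnorm(n = 10000, mu = rep(0, 2), Sigma = Sigma))

# Histogram of a single coordinate; V2 can be plotted the same way
hist(df$V1, breaks = 50, main = "Distribution of V1", xlab = "V1")

# Scatter plot of both coordinates; small points and asp = 1
# keep the round shape of the point cloud visible
plot(df$V1, df$V2, pch = ".", asp = 1,
     main = "V1 versus V2", xlab = "V1", ylab = "V2")
```

Setting asp = 1 matters for this example: with equal axis scaling, an uncorrelated sample with equal variances shows up as a round cloud, which makes any outliers added later easier to spot by eye.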
Let's spice things up a bit and add some outliers to the mixture. In many fraud applications, the fraud rate is about 1–5%, so we generate 1% of our samples as coming from a normal...