Using auto-encoders for anomaly detection
Now that we have built an auto-encoder and accessed the features of its inner layers, we will look at how auto-encoders can be used for anomaly detection. The premise is simple: we take the reconstructed outputs from the decoder and find the instances with the largest reconstruction error, that is, the instances that are the most difficult for the decoder to reconstruct. The code used here is in Chapter9/anomaly.R, and we will use the UCI HAR dataset that was introduced in Chapter 2, Training a Prediction Model. If you have not already downloaded the data, go back to that chapter for instructions on how to do so. The first part of the code loads the data and subsets the features, keeping only those with mean, sd, or skewness in their names:
library(keras)
library(ggplot2)
train.x <- read.table("UCI HAR Dataset/train/X_train.txt")
train.y <- read.table("UCI HAR Dataset/train/y_train.txt")[...
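To make the premise of ranking instances by reconstruction error concrete, the following is a minimal base-R sketch. It is not the code from Chapter9/anomaly.R: it uses simulated data, and it uses a PCA reconstruction from prcomp as a stand-in for the auto-encoder so that it runs without keras. The names x, x.hat, recon.error, anomalies, and shift are assumptions made for illustration only.

```r
# Minimal sketch: rank instances by reconstruction error.
# Assumption: PCA (prcomp) stands in for the auto-encoder so this runs in base R.
set.seed(42)

# Simulated data: 200 instances, 5 features driven by 2 latent factors plus noise
n <- 200
latent <- matrix(rnorm(n * 2), ncol = 2)
loadings <- rbind(c(1, 1, 1, 0, 0),
                  c(0, 0, 1, 1, 1))
x <- latent %*% loadings + matrix(rnorm(n * 5, sd = 0.05), ncol = 5)

# Inject three anomalies by shifting rows in a direction that the
# two latent factors cannot produce (hypothetical row indices)
anomalies <- c(10, 50, 120)
shift <- c(3, -3, 0, 3, -3)
x[anomalies, ] <- sweep(x[anomalies, , drop = FALSE], 2, shift, "+")

# "Encode" to k components, then "decode" back to the original feature space
pca <- prcomp(x, center = TRUE, scale. = FALSE)
k <- 2
codes <- pca$x[, 1:k, drop = FALSE]
x.hat <- codes %*% t(pca$rotation[, 1:k, drop = FALSE])
x.hat <- sweep(x.hat, 2, pca$center, "+")

# Per-instance reconstruction error: mean squared error across features
recon.error <- rowMeans((x - x.hat)^2)

# The instances that are hardest to reconstruct come first
top <- order(recon.error, decreasing = TRUE)[1:3]
print(top)
print(round(recon.error[top], 3))
```

In this toy setup, the three injected rows should have the largest errors. With an actual auto-encoder, x.hat would instead come from the model's predictions on the input matrix; the error calculation and the ranking would stay the same.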