Implementation of One-Class SVM
The one-class SVM algorithm ignores examples that lie far from, or deviate from, the bulk of the observations during training. Only the densest regions of the data are used for (unsupervised) learning, so the approach is effective for problems where very few deviations from normal are expected.
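To make the idea concrete, here is a minimal sketch, assuming scikit-learn's `OneClassSVM` and a toy two-dimensional Gaussian cluster. The cluster, the query points, and the `nu` and `gamma` values are illustrative assumptions, not taken from the article. The model is fitted without any labels and then asked about one point inside the dense region and one far away from it.

```python
import numpy as np
from sklearn.svm import OneClassSVM

# Toy data (assumption): a single dense cluster around the origin.
rng = np.random.default_rng(0)
X_dense = rng.normal(loc=0.0, scale=1.0, size=(200, 2))

# fit() takes only the observations; no labels are passed.
# nu=0.05 is an illustrative choice for the tolerated share of training outliers.
detector = OneClassSVM(kernel="rbf", gamma="scale", nu=0.05)
detector.fit(X_dense)

# One query in the dense region, one far from every training point.
queries = np.array([[0.0, 0.0], [8.0, 8.0]])
labels = detector.predict(queries)  # +1 means inlier, -1 means outlier
print(labels)
```

The point at the centre of the cluster should be reported as an inlier (+1), while the distant point is flagged as an outlier (-1), because the RBF kernel gives it almost no similarity to any training observation.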
A synthetic dataset is created to demonstrate the one-class SVM. About 2% of the synthetic data falls in the minority class (outliers), denoted by 1, and 98% in the majority class (inliers), denoted by 0. We use the RBF kernel to map the data into a high-dimensional space. The Python code (with the scikit-learn library) runs as follows:
import numpy as np
import pandas as pd
from collections import Counter
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.svm import OneClassSVM
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report
X, y = make_classification...
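The listing above is cut off at the `make_classification` call, so the article's actual arguments are not known. As a hedged sketch of how the described workflow could look end to end, the example below assumes a dataset of 10,000 samples with two features, an 80/20 train/test split, and `nu=0.02` to match the stated 2% outlier share. All of these values, and the variable names, are assumptions made for illustration.

```python
from collections import Counter

from sklearn.datasets import make_classification
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.svm import OneClassSVM

# Assumed dataset parameters (the original call is truncated).
# weights=[0.98, 0.02] gives roughly 98% inliers (0) and 2% outliers (1).
X, y = make_classification(
    n_samples=10_000,
    n_features=2,
    n_informative=2,
    n_redundant=0,
    n_clusters_per_class=1,
    weights=[0.98, 0.02],
    flip_y=0,
    random_state=42,
)
print(Counter(y))  # class balance check

# Stratify so both splits keep the same outlier proportion.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# nu upper-bounds the fraction of training points treated as errors;
# setting it to the expected outlier share is an assumption, not a rule.
model = OneClassSVM(kernel="rbf", gamma="scale", nu=0.02)
model.fit(X_train)  # unsupervised: y_train is not used

# predict() returns +1 for inliers and -1 for outliers;
# remap to the article's convention (0 = inlier, 1 = outlier).
raw_pred = model.predict(X_test)
y_pred = (raw_pred == -1).astype(int)

print(classification_report(y_test, y_pred, zero_division=0))
```

Because the labels are used only for evaluation, the report shows how well the purely density-based boundary recovers the known outliers. On a dataset this imbalanced, precision and recall for class 1 are more informative than overall accuracy.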