LDA – the difference from PCA
LDA and PCA are both linear transformation methods. PCA yields directions (principal components, or PCs) that maximize the variance of the data, whereas LDA yields directions that maximize the separation between classes. PCA is unsupervised: it disregards class labels entirely.
LDA is a supervised dimensionality reduction method that projects the data onto a subspace in a way that maximizes the separability between classes (groups); hence, it is used for pattern classification problems. LDA works well for data with multiple classes; however, it assumes that the classes are normally distributed and share the same covariance. PCA tends to work well if the number of samples in each class is relatively small. In both cases, though, the number of observations ought to be much larger than the number of dimensions for the results to be meaningful.
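The contrast can be made concrete with a small sketch. The toy data below is made up for illustration: two classes that spread widely along x but differ only by a shift along y. The helper names (`scatter`, `pca_direction`, `lda_direction`) are hypothetical, the code handles only two classes in two dimensions, and it uses the standard two-class Fisher formula w = Sw^-1 (m1 - m0) rather than any particular library's implementation.

```python
import math

# Toy 2-D data (invented for this sketch): both classes vary a lot along x,
# but the only thing that separates them is a shift of about 1 along y.
class0 = [(-4, 0.1), (-2, -0.1), (0, 0.0), (2, 0.1), (4, -0.1)]
class1 = [(-4, 1.1), (-2, 0.9), (0, 1.0), (2, 1.1), (4, 0.9)]


def mean(points):
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def scatter(points):
    """Return (sxx, sxy, syy): the 2x2 scatter matrix around the mean."""
    mx, my = mean(points)
    sxx = sum((x - mx) ** 2 for x, _ in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    syy = sum((y - my) ** 2 for _, y in points)
    return sxx, sxy, syy


def pca_direction(points):
    """Leading eigenvector of the total scatter matrix (labels are ignored)."""
    sxx, sxy, syy = scatter(points)
    # Major-axis angle of a symmetric 2x2 matrix [[sxx, sxy], [sxy, syy]].
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    return (math.cos(theta), math.sin(theta))


def lda_direction(a, b):
    """Two-class Fisher direction w = Sw^-1 (mean_b - mean_a), unit length."""
    sa, sb = scatter(a), scatter(b)
    # Within-class scatter Sw is the sum of the per-class scatter matrices.
    sxx, sxy, syy = (sa[i] + sb[i] for i in range(3))
    det = sxx * syy - sxy * sxy
    (ax, ay), (bx, by) = mean(a), mean(b)
    dx, dy = bx - ax, by - ay
    # Inverse of [[sxx, sxy], [sxy, syy]] is [[syy, -sxy], [-sxy, sxx]] / det.
    wx = (syy * dx - sxy * dy) / det
    wy = (-sxy * dx + sxx * dy) / det
    norm = math.hypot(wx, wy)
    return (wx / norm, wy / norm)


pca_dir = pca_direction(class0 + class1)   # uses pooled data, no labels
lda_dir = lda_direction(class0, class1)    # uses the class labels

print("PCA direction:", pca_dir)  # close to the x-axis (largest variance)
print("LDA direction:", lda_dir)  # close to the y-axis (best class separation)
```

On this data, projecting onto the PCA direction would mix the two classes almost completely, while projecting onto the LDA direction keeps them apart, which is the difference the text describes.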
LDA seeks a projection that discriminates data in the best possible way, unlike PCA, which seeks a projection that preserves maximum information in...