Linear Projections

One of the most straightforward solutions for classification is to project the data linearly and to perform Naive Bayes classification on the projected data. Here we present several ways of projecting the data linearly onto one or two dimensions.
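
To make the pipeline concrete, here is a minimal sketch of "project, then classify" assuming Python with NumPy and scikit-learn (an illustrative assumption, not this tool's own code): the data is projected onto one dimension with PCA and the projections are classified with Gaussian Naive Bayes.

    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.naive_bayes import GaussianNB

    # Toy two-class data: two Gaussian blobs in 2D.
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(0.0, 1.0, (100, 2)),
                   rng.normal(3.0, 1.0, (100, 2))])
    y = np.repeat([0, 1], 100)

    # Project linearly onto one dimension, then classify the projections.
    Z = PCA(n_components=1).fit_transform(X)
    clf = GaussianNB().fit(Z, y)
    print("training accuracy:", clf.score(Z, y))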

Principal Component Analysis (PCA)
PCA seeks the directions of maximum variance (the eigenvectors of the data covariance matrix) and projects the data onto the first principal component (the eigenvector with the largest eigenvalue). No distinction is made between samples belonging to different classes. More information on Wikipedia.
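
The eigenvector interpretation can be spelled out directly. The following NumPy sketch (illustrative only, not this tool's implementation) computes the first principal component from the covariance matrix and projects onto it.

    import numpy as np

    def first_pc_projection(X):
        """Project X onto the eigenvector of its covariance matrix
        with the largest eigenvalue (the first principal component)."""
        Xc = X - X.mean(axis=0)                 # center the data
        cov = np.cov(Xc, rowvar=False)          # data covariance matrix
        eigvals, eigvecs = np.linalg.eigh(cov)  # eigh: for symmetric matrices
        pc1 = eigvecs[:, np.argmax(eigvals)]    # direction of maximum variance
        return Xc @ pc1                         # 1D projection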

Linear Discriminant Analysis (LDA)
LDA models each class of samples separately and finds the direction that maximizes the separation between the two distributions. In its basic form, LDA models each class as a Gaussian distribution with a variance common to both classes. More information on Wikipedia.
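
A minimal sketch using scikit-learn's LinearDiscriminantAnalysis (an illustrative assumption; note that with two classes LDA yields at most one discriminant direction):

    import numpy as np
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

    rng = np.random.default_rng(1)
    X = np.vstack([rng.normal(0.0, 1.0, (100, 2)),
                   rng.normal(2.0, 1.0, (100, 2))])
    y = np.repeat([0, 1], 100)

    # With two classes, LDA gives a single projection direction.
    lda = LinearDiscriminantAnalysis(n_components=1)
    Z = lda.fit_transform(X, y)   # 1D projection that separates the classes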

Fisher Linear Discriminant
Fisher-LDA extends LDA by modeling each class as a Gaussian distribution with its own variance (instead of a single variance common to both distributions, as in standard LDA). If the two classes have similar distributions, there will be no visible difference between LDA and Fisher-LDA. More information on Wikipedia.
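
One common closed form of the Fisher direction is w = Sw^-1 (m1 - m2), up to scale, where m1 and m2 are the class means and Sw is the sum of the per-class scatter matrices. A NumPy sketch of this formulation (illustrative, not necessarily this tool's exact computation):

    import numpy as np

    def fisher_direction(X1, X2):
        """Direction maximizing between-class separation relative to
        the per-class within-class scatter: w = Sw^-1 (m1 - m2)."""
        m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
        S1 = np.cov(X1, rowvar=False) * (len(X1) - 1)  # class scatter matrices
        S2 = np.cov(X2, rowvar=False) * (len(X2) - 1)
        w = np.linalg.solve(S1 + S2, m1 - m2)          # Sw = S1 + S2
        return w / np.linalg.norm(w)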

Independent Component Analysis (ICA)
ICA looks for the directions that maximize the statistical independence of the projected data. While in the previous methods the components are always orthogonal, ICA can project the data along non-orthogonal directions. More information on Wikipedia.
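
A minimal sketch using scikit-learn's FastICA (one common ICA algorithm, assumed here for illustration and not necessarily the one used by this tool), recovering two independent sources mixed through a non-orthogonal matrix:

    import numpy as np
    from sklearn.decomposition import FastICA

    rng = np.random.default_rng(2)
    # Mix two independent sources with a non-orthogonal mixing matrix.
    S = np.column_stack([rng.laplace(size=500), rng.uniform(-1, 1, 500)])
    A = np.array([[1.0, 0.5],
                  [0.3, 1.0]])
    X = S @ A.T

    ica = FastICA(n_components=2, random_state=0)
    Z = ica.fit_transform(X)   # recovered sources (up to scale and order)
    # ica.mixing_ holds the estimated, generally non-orthogonal, directions.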

Kernel Parameters
More information on Wikipedia.

Naive Bayes
Regardless of the method used for projection (if any), the data is separated into positive and negative classes, and the probability of a sample belonging to each class is computed separately, treating the features as conditionally independent given the class (the "naive" assumption). The response of the classifier in this implementation is a Maximum A Posteriori (MAP) decision rule. More information on Wikipedia.
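
The MAP rule picks the class whose posterior, i.e. prior times likelihood, is largest. A hand-rolled sketch for one-dimensional Gaussian class models (illustrative only; the parameter names such as class_means are hypothetical, not this tool's API):

    import numpy as np
    from scipy.stats import norm

    def map_predict(z, priors, class_means, class_stds):
        """MAP decision for a 1D sample z: argmax over classes c of
        P(c) * p(z | c), computed in log space for numerical stability."""
        log_post = [np.log(p) + norm.logpdf(z, m, s)
                    for p, m, s in zip(priors, class_means, class_stds)]
        return int(np.argmax(log_post))

    # Example: two classes with equal priors; 1.2 is closer to the first mean.
    print(map_predict(1.2, priors=[0.5, 0.5],
                      class_means=[0.0, 3.0], class_stds=[1.0, 1.0]))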

The interface presents two buttons: one to visualize the projection of the data into component space, and one to switch the data displayed in the canvas between the source samples and the projected samples.