Fits a k-means clustering model to training data. K-means is an unsupervised learning algorithm.
Syntax
parameters = kmeansfit(X)
parameters = kmeansfit(X,options)
Inputs
 X
 Training data.
 Type: double
 Dimension: vector | matrix
 options
 Type: struct

 n_clusters
 Number of clusters to find (default: 8).
 Type: integer
 Dimension: scalar
 init
 Method for initializing the centroids.
 'kmeans++' (default): selects initial cluster centers that tend to be far apart from one another, which speeds up convergence.
 'random': chooses n_clusters observations (rows) at random from the data as the initial centroids.
 Type: char
 Dimension: string
 n_init
 Number of times the k-means algorithm is run with different centroid seeds. The final result is the run with the lowest inertia among the n_init runs (default: 10).
 Type: integer
 Dimension: scalar
 max_iter
 Maximum number of iterations of the k-means algorithm in a single run (default: 300).
 Type: integer
 Dimension: scalar
 tol
 Relative tolerance with regard to inertia used to declare convergence (default: 1e-4).
 Type: double
 Dimension: scalar
 random_state
 Seed for the random number generation used in centroid initialization. Set this parameter to make the results reproducible.
 Type: integer
 Dimension: scalar
 algorithm
 K-means algorithm to use.
 'full': the classical EM-style algorithm.
 'elkan': a more efficient variant of the classical algorithm that uses the triangle inequality; it does not currently support sparse data.
 'auto' (default): chooses 'elkan' for dense data and 'full' for sparse data.
 Type: char
 Dimension: string
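
The two init strategies can be illustrated with a short, language-neutral sketch. The Python below is not the kmeansfit implementation. It assumes the usual k-means++ rule, in which each new center is sampled with probability proportional to its squared distance from the nearest center already chosen. The helper names kmeanspp_seeds and random_seeds are hypothetical, and the seed argument plays the role of random_state.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeanspp_seeds(rows, k, seed=None):
    """Pick k initial centers by squared-distance-weighted sampling.

    Assumption: this mirrors the textbook k-means++ seeding rule.
    Rows must not all coincide, otherwise every weight is zero and
    random.choices raises ValueError.
    """
    rng = random.Random(seed)
    # First center: one row chosen uniformly at random.
    centers = [rows[rng.randrange(len(rows))]]
    while len(centers) < k:
        # Weight of a row = squared distance to its nearest chosen center,
        # so rows far from every current center are more likely to be picked.
        weights = [min(sq_dist(r, c) for c in centers) for r in rows]
        centers.append(rng.choices(rows, weights=weights, k=1)[0])
    return centers


def random_seeds(rows, k, seed=None):
    """Pick k distinct rows uniformly at random (the 'random' strategy)."""
    return random.Random(seed).sample(rows, k)
```

With a fixed seed, both helpers return the same centers on every call, which is the kind of reproducibility random_state is meant to provide.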
Outputs
 parameters
 Contains all the values passed to kmeansfit as options, plus the following key-value pairs.
 Type: struct

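 cluster_centers
 Coordinates of the cluster centers, one row per cluster.
 Type: double
 Dimension: matrix
 labels
 Cluster label of each training sample.
 Type: double
 Dimension: vector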
 inertia
 Sum of squared distances of samples to their closest cluster center.
 Type: double
 Dimension: scalar
 n_iter
 Number of iterations run.
 Type: integer
 Dimension: scalar
 n_samples
 Number of rows in the training data.
 Type: integer
 Dimension: scalar
 n_features
 Number of columns in the training data.
 Type: integer
 Dimension: scalar
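
As a rough illustration of how the labels and inertia outputs relate to the cluster centers, the following Python sketch assigns each row to its closest center and sums the squared distances. It is not the kmeansfit code; the function names are hypothetical, and the labels are 0-based here, which may differ from the indexing kmeansfit uses.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points given as sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign_labels(rows, centers):
    """Return, for each row, the index of its closest center (0-based)."""
    return [
        min(range(len(centers)), key=lambda j: sq_dist(row, centers[j]))
        for row in rows
    ]


def inertia(rows, centers, labels):
    """Sum of squared distances of rows to their assigned center."""
    return sum(sq_dist(row, centers[lab]) for row, lab in zip(rows, labels))


# Four samples (rows) with two features each, and two cluster centers.
rows = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
centers = [(0.0, 1.0), (10.0, 1.0)]

labels = assign_labels(rows, centers)
print(labels)                           # [0, 0, 1, 1]
print(inertia(rows, centers, labels))   # 4.0
```

Each sample lies at distance 1 from its center, so the four squared distances add up to an inertia of 4.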
Example
Usage of kmeansfit with options
rand('seed', 2);
XTrain = rand(14, 5);
XTest = rand(2, 5);
options = struct;
options.n_clusters = 2;
parameters = kmeansfit(XTrain, options);
> parameters
parameters = struct [
algorithm: auto
cluster_centers: [Matrix] 2 x 5
0.25669 0.37129 0.78008 0.28967 0.55561
0.57313 0.57262 0.30554 0.31799 0.40330
init: kmeans++
interia: 2.4113899
...
Comments
If the algorithm stops before fully converging (because of tol or max_iter), labels and cluster_centers are not consistent: the cluster_centers are not the means of the points assigned to each cluster. In that case the labels are reassigned after the last iteration so that they are consistent with prediction on the training set.
The output parameters should be passed as input to the kmeanspredict function.
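
The inconsistency after an early stop can be reproduced with a minimal sketch of the classical assign/update loop. The Python below is an illustration under stated assumptions, not the kmeansfit implementation: it stops only on max_iter or when the centers no longer change (tol is left out), it reassigns labels once after the loop, and all function names are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points given as sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(row, centers):
    """Index of the center closest to row."""
    return min(range(len(centers)), key=lambda j: sq_dist(row, centers[j]))


def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))


def members_of(rows, labels, j):
    """Rows currently assigned to cluster j."""
    return [r for r, lab in zip(rows, labels) if lab == j]


def lloyd(rows, centers, max_iter):
    """Run at most max_iter assign/update rounds, then reassign labels."""
    for _ in range(max_iter):
        labels = [nearest(r, centers) for r in rows]
        # Move each center to the mean of its members (keep it if empty).
        new_centers = [
            mean(members_of(rows, labels, j)) if members_of(rows, labels, j)
            else centers[j]
            for j in range(len(centers))
        ]
        if new_centers == centers:   # converged: nothing moved
            break
        centers = new_centers
    # Final reassignment so labels agree with the returned centers.
    labels = [nearest(r, centers) for r in rows]
    return centers, labels


def centers_are_cluster_means(rows, centers, labels):
    """True if every non-empty cluster's center equals its members' mean."""
    return all(
        mean(members_of(rows, labels, j)) == c
        for j, c in enumerate(centers)
        if members_of(rows, labels, j)
    )


rows = [(0.0,), (1.0,), (2.0,), (10.0,)]
start = [(0.0,), (1.0,)]

early_centers, early_labels = lloyd(rows, start, max_iter=1)
full_centers, full_labels = lloyd(rows, start, max_iter=300)

print(centers_are_cluster_means(rows, early_centers, early_labels))  # False
print(centers_are_cluster_means(rows, full_centers, full_labels))    # True
```

After a single iteration the centers are the means of the previous assignment, while the reassigned labels already describe a different grouping, so the two disagree. Once the loop runs to convergence, the centers are again the means of their clusters.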