pcafit

Attention: Available only with Activate commercial edition.

Syntax

parameters = pcafit(X)

parameters = pcafit(X,options)

Inputs

X

Training data.

Type: double

Dimension: vector | matrix

options

Type: struct

n_components: Number of components to keep.
  If n_components is not set, all components are kept, that is, min(n_samples, n_features).
  If n_components is between 0 and 1 (exclusive) and svd_solver = 'full', the number of components is chosen so that the explained variance exceeds the fraction specified by n_components.
  If svd_solver = 'arpack', the number of components must be strictly less than min(n_samples, n_features), so an unset n_components becomes min(n_samples, n_features) - 1.
  Type: integer
  Dimension: scalar

svd_solver: Solver used to compute the SVD.
  'auto' (default): the solver is selected by a default policy based on the dimensions of X and on n_components. If the input data is larger than 500x500 and the number of components to extract is lower than 80% of the smallest dimension of the data, the more efficient 'randomized' method is used. Otherwise the exact full SVD is computed and optionally truncated afterwards.
  'full': runs the exact full SVD by calling the standard LAPACK solver and selects the components by postprocessing.
  'arpack': runs an SVD truncated to n_components by calling the ARPACK solver. It strictly requires 0 < n_components < min(n_samples, n_features).
  'randomized': runs a randomized SVD by the method of Halko et al.
  Type: char
  Dimension: string

tol: Tolerance for singular values computed by svd_solver = 'arpack' (default: 0).
  Type: double
  Dimension: scalar

iterated_power: Number of iterations for the power method used by svd_solver = 'randomized'. Must be in the range [0, infinity). If not set, it is computed automatically.
  Type: integer
  Dimension: scalar

random_state: Determines random number generation for svd_solver = 'arpack' or 'randomized'.
  Type: integer
  Dimension: scalar
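
The option fields described above can be combined as in the following sketch. It reuses the data from the Example section and only sets fields documented in this list. The variable names options2 and parameters2 are illustrative, and no output is shown because the results were not run here. Judging by the explained_variance_ratio reported in the Example section, a single component would likely satisfy the 0.95 threshold, but treat that as an expectation rather than a guarantee.

```
% Sketch only: same training data as in the Example section.
X = [-1, -1; -2, -1; -3, -2; 1, 1; 2, 1; 3, 2];

% Keep enough components to explain more than 95% of the variance.
% A fractional n_components is documented for svd_solver = 'full'.
options = struct;
options.n_components = 0.95;
options.svd_solver = 'full';
parameters = pcafit(X, options);

% The number of components actually kept is reported in the output struct.
disp(parameters.n_components);

% Truncated SVD with ARPACK: n_components must be strictly less than
% min(n_samples, n_features), which is 2 for this 6 x 2 matrix.
options2 = struct;
options2.n_components = 1;
options2.svd_solver = 'arpack';
options2.random_state = 0;   % fixed seed so repeated runs use the same random numbers
parameters2 = pcafit(X, options2);
```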

Outputs

parameters

Contains all the values passed to pcafit as options, plus the following fields.

Type: struct

components: Principal axes in feature space, representing the directions of maximum variance in the data. The components are sorted by explained_variance.
  Type: double
  Dimension: matrix

explained_variance: Amount of variance explained by each of the selected components. It is equal to the n_components largest eigenvalues of the covariance matrix of X.
  Type: double
  Dimension: vector

explained_variance_ratio: Fraction of the total variance explained by each of the selected components. If n_components is not set, all components are stored and the ratios sum to 1.0.
  Type: double
  Dimension: vector

singular_values: The singular values corresponding to each of the selected components. They are equal to the 2-norms of the n_components variables in the lower-dimensional space.
  Type: double
  Dimension: vector

mean: Per-feature empirical mean, estimated from the training set.
  Type: double
  Dimension: vector

n_components: Estimated number of components when n_components is set between 0 and 1 while fitting with svd_solver = 'full'.
  Type: integer
  Dimension: scalar
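
As a worked check of the explained_variance and explained_variance_ratio definitions, the derivation below uses the six samples from the Example section. It assumes the covariance is the sample covariance with an n - 1 denominator. That convention is not stated on this page, but it reproduces the values in the example output. Both feature means are zero for this data, so the centered matrix equals X. The singular values on the last line follow from the same assumption and do not appear in the example output.

```latex
\begin{aligned}
n &= 6, \qquad \bar{x} = (0,\ 0),\\
C &= \frac{1}{n-1}\, X^{\top} X
   = \frac{1}{5}\begin{pmatrix} 28 & 18 \\ 18 & 12 \end{pmatrix}
   = \begin{pmatrix} 5.6 & 3.6 \\ 3.6 & 2.4 \end{pmatrix},\\
\lambda_{1,2} &= \frac{\operatorname{tr} C}{2}
   \pm \sqrt{\left(\frac{\operatorname{tr} C}{2}\right)^{2} - \det C}
   = 4 \pm \sqrt{16 - 0.48}
   \approx 7.93954,\ 0.06046,\\
\frac{\lambda_1}{\lambda_1 + \lambda_2} &\approx 0.99244, \qquad
\frac{\lambda_2}{\lambda_1 + \lambda_2} \approx 0.00756,\\
\sigma_i &= \sqrt{(n-1)\,\lambda_i} \approx 6.3006,\ 0.5498 .
\end{aligned}
```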

Example

Usage of pcafit with options

X = [-1, -1; -2, -1; -3, -2; 1, 1; 2, 1; 3, 2];
options = struct;
options.n_components = 2;
parameters = pcafit(X, options);

> parameters
parameters = struct [
  components: [Matrix] 2 x 2
  -0.83849  -0.54491
   0.54491  -0.83849
  explained_variance: [Matrix] 1 x 2
  7.93954  0.06046
  explained_variance_ratio: [Matrix] 1 x 2
  0.99244  0.00756
]

Comments

Pass the output 'parameters' as input to the pcatransform function.