pcafit

Principal Component Analysis.

Attention: Available only with the Activate commercial edition.

Syntax

parameters = pcafit(X)

parameters = pcafit(X,options)

Inputs

X
Training data.
Type: double
Dimension: vector | matrix
options
Type: struct
n_components
Number of components to keep. If n_components is not set, all components are kept: n_components = min(n_samples, n_features).
If n_components is between 0 and 1 (exclusive) and svd_solver = 'full', the number of components is chosen so that the cumulative explained variance exceeds the fraction specified by n_components.
If svd_solver = 'arpack', the number of components must be strictly less than min(n_samples, n_features), so an unset n_components resolves to min(n_samples, n_features) - 1.
Type: integer
Dimension: scalar
svd_solver
'auto' (default): the solver is selected by a default policy based on the dimensions of X and n_components. If the input data is larger than 500x500 and the number of components to extract is lower than 80% of the smallest dimension of the data, the more efficient 'randomized' method is used. Otherwise, the exact full SVD is computed and optionally truncated afterwards.
'full': runs exact full SVD calling the standard LAPACK solver and selects the components by postprocessing.
'arpack': runs SVD truncated to n_components by calling the ARPACK solver. Requires 0 < n_components < min(n_samples, n_features).
'randomized': runs randomized SVD by the method of Halko et al.
Type: char
Dimension: string
tol
Tolerance for singular values computed by svd_solver = 'arpack' (default: 0).
Type: double
Dimension: scalar
iterated_power
Number of iterations for the power method used by svd_solver = 'randomized'. Must be in the range [0, infinity). If not set, it is computed automatically.
Type: integer
Dimension: scalar
random_state
Determines random number generation for svd_solver = 'arpack' or 'randomized'.
Type: integer
Dimension: scalar
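
The interaction between n_components and svd_solver described above can be sketched as follows. This is a plain Python illustration of the documented rules, not the pcafit implementation: choose_solver and resolve_n_components are hypothetical helpers, "larger than 500x500" is read as both dimensions exceeding 500, and rejecting a fractional n_components for solvers other than 'full' is an assumption.

```python
def choose_solver(n_samples, n_features, n_components):
    """Hypothetical helper mirroring the documented 'auto' policy."""
    smallest = min(n_samples, n_features)
    # Assumption: "larger than 500x500" means both dimensions exceed 500.
    is_large = n_samples > 500 and n_features > 500
    if is_large and n_components is not None and 1 <= n_components < 0.8 * smallest:
        return "randomized"
    return "full"


def resolve_n_components(n_components, svd_solver, n_samples, n_features, ratios=None):
    """Hypothetical helper: number of components kept under the documented rules.

    ratios holds the per-component fractions of explained variance in
    descending order; it is only needed when n_components is a fraction.
    """
    smallest = min(n_samples, n_features)
    if n_components is None:
        # Not set: keep everything, except that 'arpack' needs strictly fewer.
        return smallest - 1 if svd_solver == "arpack" else smallest
    if 0 < n_components < 1:
        if svd_solver != "full":
            # Assumption: the fractional form is only meaningful with 'full'.
            raise ValueError("fractional n_components requires svd_solver = 'full'")
        cumulative = 0.0
        for count, ratio in enumerate(ratios, start=1):
            cumulative += ratio
            if cumulative > n_components:
                return count
        return len(ratios)
    if svd_solver == "arpack" and not 0 < n_components < smallest:
        raise ValueError("'arpack' requires 0 < n_components < min(n_samples, n_features)")
    return int(n_components)


print(choose_solver(1000, 600, 50))                                    # randomized
print(resolve_n_components(None, "arpack", 6, 2))                      # 1
print(resolve_n_components(0.95, "full", 6, 2, [0.99244, 0.00756]))    # 1
```

With the ratios from the example on this page, a threshold of 0.95 keeps a single component because the first one alone explains more than 95% of the variance.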

Outputs

parameters
Contains all the values passed to pcafit as options, plus the following key-value pairs.
Type: struct
components
Principal axes in feature space, representing the directions of maximum variance in the data. The components are sorted by explained_variance.
Type: double
Dimension: matrix
explained_variance
Amount of variance explained by each of the selected components. Equal to the n_components largest eigenvalues of the covariance matrix of X.
Type: double
Dimension: vector
explained_variance_ratio
Fraction of the total variance explained by each of the selected components. If n_components is not set, all components are stored and the ratios sum to 1.0.
Type: double
Dimension: vector
singular_values
The singular values corresponding to each of the selected components. The singular values are equal to the 2-norms of the n_components variables in the lower-dimensional space.
Type: double
Dimension: vector
mean
Per feature empirical mean, estimated from the training set.
Type: double
Dimension: vector
n_components
Estimated number of components when n_components is set between 0 and 1 while fitting with svd_solver = 'full'.
Type: integer
Dimension: scalar
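
The relationships between mean, explained_variance, explained_variance_ratio and singular_values can be checked by hand on a small data set. The sketch below is plain Python, not a call to pcafit. It uses the six-point matrix from the example on this page and the closed-form eigenvalues of a symmetric 2 x 2 matrix, and it assumes the covariance matrix is normalised by n_samples - 1, which is consistent with the values printed in that example.

```python
import math

X = [(-1, -1), (-2, -1), (-3, -2), (1, 1), (2, 1), (3, 2)]
n = len(X)

# Per-feature empirical mean (the 'mean' output).
mean = [sum(row[j] for row in X) / n for j in range(2)]

# Entries of the sample covariance matrix, normalised by n - 1 (assumption).
centered = [(x - mean[0], y - mean[1]) for x, y in X]
sxx = sum(x * x for x, _ in centered) / (n - 1)
syy = sum(y * y for _, y in centered) / (n - 1)
sxy = sum(x * y for x, y in centered) / (n - 1)

# Closed-form eigenvalues of a symmetric 2 x 2 matrix, largest first.
half_trace = (sxx + syy) / 2
radius = math.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
explained_variance = [half_trace + radius, half_trace - radius]

# Each eigenvalue as a fraction of the total variance.
total = sum(explained_variance)
explained_variance_ratio = [v / total for v in explained_variance]

# 2-norm of the data projected on each component: sqrt((n - 1) * eigenvalue).
singular_values = [math.sqrt((n - 1) * v) for v in explained_variance]

print([round(v, 5) for v in explained_variance])        # [7.93954, 0.06046]
print([round(v, 5) for v in explained_variance_ratio])  # [0.99244, 0.00756]
print(singular_values)
```

Because both components are kept here, the ratios sum to 1.0.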

Example

Usage of pcafit with options

X = [-1, -1; -2, -1; -3, -2; 1, 1; 2, 1; 3, 2];
options = struct;
options.n_components = 2;
parameters = pcafit(X, options);
      
> parameters
parameters = struct [
  components: [Matrix] 2 x 2
  -0.83849  -0.54491
   0.54491  -0.83849
  explained_variance: [Matrix] 1 x 2
  7.93954  0.06046
  explained_variance_ratio: [Matrix] 1 x 2
  0.99244  0.00756
]
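
The printed components can be sanity-checked independently of pcafit. The plain Python sketch below copies the rounded values from the output above and assumes each row of components is one principal axis. Under that reading the rows should have unit length, be mutually orthogonal, and give projections whose sample variances are close to the printed explained_variance values.

```python
import math
import statistics

X = [(-1, -1), (-2, -1), (-3, -2), (1, 1), (2, 1), (3, 2)]

# Rounded values copied from the printed output; one axis per row (assumption).
components = [(-0.83849, -0.54491), (0.54491, -0.83849)]

# Length of each axis and the dot product between the two axes.
norms = [math.hypot(a, b) for a, b in components]
dot = sum(p * q for p, q in zip(components[0], components[1]))

# Both columns of X already have zero mean, so no centering is needed
# before projecting each sample onto each axis.
scores = [[x * a + y * b for x, y in X] for a, b in components]

# Sample variance (n - 1 normalisation) of the data along each axis.
variances = [statistics.variance(s) for s in scores]

print(norms)      # both close to 1
print(dot)        # close to 0
print(variances)  # close to the printed explained_variance values
```

Small deviations from the printed explained_variance are expected because the component values are rounded to five decimals.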

Comments

Pass the output 'parameters' as input to the pcatransform function.