Logistic Regression classifier.
Attention: Available only with Twin Activate commercial edition.
Syntax
parameters = logisticfit(X,y)
parameters = logisticfit(X,y,options)
Inputs
- X
- Training data.
- Type: double
- Dimension: vector | matrix
- y
- Target values.
- Type: double
- Dimension: vector | matrix
- options
- Type: struct
- penalty
- Norm used in the penalization (default: 'l2'). The 'newton-cg', 'sag' and 'lbfgs' solvers support only the 'l2' penalty.
- Type: char
- Dimension: string
- dual
- Dual or primal formulation. Dual formulation is only implemented for l2 penalty with liblinear solver. Prefer dual = false (default) when n_samples > n_features.
- Type: logical
- Dimension: Boolean
- tol
- Tolerance for stopping criteria (default: 1e-4).
- Type: double
- Dimension: scalar
- C
- Inverse of regularization strength. It must be a positive float (default: 1). Like in Support Vector Machines, smaller values specify stronger regularization.
- Type: double
- Dimension: scalar
- random_state
- Seed used to shuffle the data when solver = 'sag' or 'liblinear'.
- Type: integer
- Dimension: scalar
- solver
- Algorithm to use in the optimization problem. Allowed solvers: 'newton-cg', 'lbfgs' (default), 'liblinear', 'sag', 'saga'.
- Type: char
- Dimension: string
- max_iter
- Maximum number of iterations taken for the solvers to converge (default: 100). Used only by the 'newton-cg', 'sag' and 'lbfgs' solvers.
- Type: integer
- Dimension: scalar
- multi_class
- If 'ovr' is chosen, then a binary problem is fit for each label. If 'multinomial', then the loss minimized is the multinomial loss fit across the entire probability distribution, even when the data is binary. 'multinomial' is unavailable when solver = 'liblinear'. 'auto' (default) selects 'ovr' if the data is binary, or if solver = 'liblinear', and otherwise selects 'multinomial'.
- Type: char
- Dimension: string
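The options above are passed as fields of a single struct. The following is a minimal sketch in OML, not a recommended configuration: the field values are illustrative, the data file is the one used in the Example section, and the spelling 'l1' is assumed by analogy with the documented 'l2'.

```octave
% Load the banknote data used in the Example section of this page.
data = dlmread('banknote_authentication.txt', ',');
X = data(:, 1:2);
y = data(:, end);

% Illustrative settings only; field names are those listed under "options".
options = struct;
options.solver = 'liblinear';  % per the Comments, 'liblinear' handles the L1 penalty
options.penalty = 'l1';        % assumed spelling, by analogy with 'l2'
options.C = 0.5;               % smaller than the default 1, so stronger regularization
options.tol = 1e-5;            % tighter than the default 1e-4

parameters = logisticfit(X, y, options);
```

max_iter is left unset here because it is documented as applying only to the 'newton-cg', 'sag' and 'lbfgs' solvers.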
Outputs
- parameters
- Contains all the values passed to logisticfit as options, plus the key-value pairs listed below.
- Type: struct
- scorer
- Function handle pointing to the 'accuracy' function.
- Type: function handle
- intercept
- Intercept (bias) added to the decision function.
- Type: double
- Dimension: scalar
- coef
- Coefficients of the features in the decision function.
- Type: double
- Dimension: cell
- classes
- A list of class labels known to the classifier.
- Type: double
- Dimension: matrix
- n_samples
- Number of rows in the training data.
- Type: integer
- Dimension: scalar
- n_features
- Number of columns in the training data.
- Type: integer
- Dimension: scalar
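To make the roles of intercept and coef concrete, the sketch below evaluates a binary decision function by hand in OML. It assumes the standard logistic (sigmoid) form for a two-class model; this page does not spell out the formula, so treat it as an illustration rather than a description of what logisticpredict does internally. The intercept and coefficients are copied from the example output on this page; Xnew is hypothetical.

```octave
% Values copied from the example output on this page.
intercept = 0.812773588;
coef = [-1.10461 -0.27259];

% Hypothetical new observations, one per row, with the same two features.
Xnew = [0 0; 2.5 -1.0];

% Linear decision function followed by the logistic (sigmoid) link.
z = intercept + Xnew * coef';
p1 = 1 ./ (1 + exp(-z));   % estimated probability of the second class (label 1)

% Threshold at 0.5 to obtain a 0/1 label for each row.
labels = p1 >= 0.5;
```

For multiclass fits the decision function has one set of coefficients per class, so this two-class shortcut does not apply directly.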
Example
Usage of logisticfit
data = dlmread('banknote_authentication.txt', ',');
X = data(:, 1:2);
y = data(:, end);
parameters = logisticfit(X, y)
parameters = struct [
C: 1
classes: [Matrix] 1 x 2
0 1
coef: [Matrix] 1 x 2
-1.10461 -0.27259
dual: 0
intercept: 0.812773588
max_iter: 100
multi_class: auto
n_features: 2
n_samples: 1372
params: [Matrix] 1 x 3
0.81277 -1.10461 -0.27259
penalty: l2
solver: lbfgs
tol: 0.0001
]
Comments
The output 'parameters' should be passed as input to the logisticpredict function.
For small datasets, 'liblinear' is a good choice, whereas 'sag' and 'saga' are faster for large ones.
For multiclass problems, only 'newton-cg', 'sag', 'saga' and 'lbfgs' handle multinomial loss; 'liblinear' is limited to one-versus-rest schemes. 'newton-cg', 'lbfgs' and 'sag' handle only the L2 penalty, whereas 'liblinear' and 'saga' also handle the L1 penalty.
Note that fast convergence of 'sag' and 'saga' is guaranteed only when the features have approximately the same scale. Scale or standardize the data before fitting with these solvers.
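One common way to bring features onto a similar scale is to standardize each column to zero mean and unit standard deviation. The sketch below does this with basic matrix operations in OML; the matrix X is hypothetical, and this is only one possible preprocessing choice, not one prescribed by this page.

```octave
% Hypothetical feature matrix: rows are samples, columns are features
% that live on very different scales.
X = [1 1000; 2 3000; 3 2000; 4 4000];

n = size(X, 1);
mu = mean(X);            % column means, 1 x n_features
sigma = std(X);          % column standard deviations, 1 x n_features
sigma(sigma == 0) = 1;   % guard constant columns against division by zero

% Subtract each column mean and divide by each column standard deviation.
Xs = (X - ones(n, 1) * mu) ./ (ones(n, 1) * sigma);
```

Xs would then be passed to logisticfit in place of X. The same mu and sigma should be reused to transform any data later given to logisticpredict, so that training and prediction data share one scale.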