Logistic Regression classifier.
Syntax
parameters = logisticfit(X,y)
parameters = logisticfit(X,y,options)
Inputs
 X
 Training data.
 Type: double
 Dimension: vector | matrix
 y
 Target values.
 Type: double
 Dimension: vector | matrix
 options
 Type: struct

 penalty
 Norm used in the penalization. The 'newton-cg', 'sag' and 'lbfgs' solvers support only the l2 penalty (default).
 Type: char
 Dimension: string
 dual
 Dual or primal formulation. Dual formulation is only implemented for l2 penalty with liblinear solver. Prefer dual = false (default) when n_samples > n_features.
 Type: logical
 Dimension: Boolean
 tol
 Tolerance for the stopping criteria (default: 1e-4).
 Type: double
 Dimension: scalar
 C
 Inverse of regularization strength. Must be a positive value (default: 1). As in support vector machines, smaller values specify stronger regularization.
 Type: double
 Dimension: scalar
 random_state
 Seed used to shuffle the data when solver = 'sag' or 'liblinear'.
 Type: integer
 Dimension: scalar
 solver
 Algorithm used to solve the optimization problem. Allowed solvers: 'newton-cg', 'lbfgs' (default), 'liblinear', 'sag', 'saga'.
 Type: char
 Dimension: string
 max_iter
 Maximum number of iterations allowed for the solver to converge (default: 100). Used only by the 'newton-cg', 'sag' and 'lbfgs' solvers.
 Type: integer
 Dimension: scalar
 multi_class
 If 'ovr' is chosen, then a binary problem is fit for each label. If 'multinomial', then the loss minimized is the multinomial loss fit across the entire probability distribution, even when the data is binary. 'multinomial' is unavailable when solver = 'liblinear'. 'auto' (default) selects 'ovr' if the data is binary, or if solver = 'liblinear', and otherwise selects 'multinomial'.
 Type: char
 Dimension: string
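The roles of C and the l2 penalty can be made concrete with the objective that this parameterization conventionally refers to. The formula below is a sketch that assumes logisticfit follows the usual formulation for a binary problem with labels recoded as y_i in {-1, +1}; the symbols w (coefficients), b (intercept) and n (number of samples) are notation introduced here, not names from this page.

```latex
\min_{w,\,b} \;\; \frac{1}{2}\, w^{\top} w \;+\; C \sum_{i=1}^{n} \log\!\left(1 + \exp\!\left(-y_i \left(x_i^{\top} w + b\right)\right)\right)
```

Because C multiplies the data-fit term, a smaller C gives the penalty term relatively more weight, which is why smaller values mean stronger regularization.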
Outputs
 parameters
 Contains all the values passed to logisticfit as options. Additionally, it has the key-value pairs below.
 Type: struct

 scorer
 Function handle pointing to 'accuracy' function.
 Type: function handle
 intercept
 Intercept (bias) added to the decision function.
 Type: double
 Dimension: scalar
 coef
 Coefficients of the features in the decision function.
 Type: double
 Dimension: cell
 classes
 A list of class labels known to the classifier.
 Type: double
 Dimension: matrix
 n_samples
 Number of rows in the training data.
 Type: integer
 Dimension: scalar
 n_features
 Number of columns in the training data.
 Type: integer
 Dimension: scalar
Example
Usage of logisticfit
data = dlmread('banknote_authentication.txt', ',');
X = data(:, 1:2);
y = data(:, end);
parameters = logisticfit(X, y)
parameters = struct [
C: 1
classes: [Matrix] 1 x 2
0 1
coef: [Matrix] 1 x 2
1.10461 0.27259
dual: 0
intercept: 0.812773588
max_iter: 100
multi_class: auto
n_features: 2
n_samples: 1372
params: [Matrix] 1 x 3
0.81277 1.10461 0.27259
penalty: l2
solver: lbfgs
tol: 0.0001
]
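The three-argument syntax can be sketched in the same way. The snippet below reuses the data file from the example above and sets a few of the option fields listed under Inputs. The chosen values are illustrative only, and the spelling 'l1' for the L1 penalty is an assumption made by analogy with the 'l2' value shown in the output above.

```octave
% Load the same data as in the example above.
data = dlmread('banknote_authentication.txt', ',');
X = data(:, 1:2);
y = data(:, end);

% Assigning fields creates the options struct; unset fields keep their defaults.
options.solver = 'liblinear';   % documented as a good choice for small datasets
options.penalty = 'l1';         % assumed spelling of the L1 penalty value
options.C = 0.5;                % smaller than the default 1: stronger regularization
options.random_state = 0;       % used by 'liblinear' when shuffling the data

parameters = logisticfit(X, y, options);
```

The returned struct should echo the supplied options (for example, parameters.solver) alongside the fitted coef and intercept fields.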
Comments
Pass the output 'parameters' as input to the logisticpredict function.
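A minimal sketch of that hand-off follows. The calling form logisticpredict(parameters, X) and the assumption that it returns one predicted label per row of X are not stated on this page, so check the logisticpredict reference before relying on them.

```octave
% Fit on the data used in the example above.
data = dlmread('banknote_authentication.txt', ',');
X = data(:, 1:2);
y = data(:, end);
parameters = logisticfit(X, y);

% Assumed signature: fitted parameters first, then the data to classify.
ypred = logisticpredict(parameters, X);

% Fraction of training rows whose predicted label matches the target.
acc = mean(ypred(:) == y(:));
```

Accuracy measured on the training data is optimistic; a held-out subset gives a fairer estimate.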
For small datasets, 'liblinear' is a good choice, whereas 'sag' and 'saga' are faster for large ones.
For multiclass problems, only 'newton-cg', 'sag', 'saga' and 'lbfgs' handle multinomial loss; 'liblinear' is limited to one-versus-rest schemes. 'newton-cg', 'lbfgs' and 'sag' handle only the L2 penalty, whereas 'liblinear' and 'saga' also handle the L1 penalty.
Note that fast convergence of 'sag' and 'saga' is only guaranteed on features with approximately the same scale. You can rescale the data beforehand using preprocessing methods.
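One common way to bring features onto a similar scale is standardization (zero mean, unit standard deviation per column). The sketch below does this by hand with basic matrix operations rather than a dedicated preprocessing function; the toy matrix and the names mu, sigma and Xs are illustrative assumptions.

```octave
% Toy data: two features on very different scales (illustrative values only).
X = [1 1000; 2 3000; 3 2000; 4 4000];

n = size(X, 1);
mu = mean(X);             % 1 x n_features row of column means
sigma = std(X);           % 1 x n_features row of column standard deviations
sigma(sigma == 0) = 1;    % guard constant columns against division by zero

% Replicate the row vectors down the rows so the arithmetic is elementwise.
Xs = (X - ones(n, 1) * mu) ./ (ones(n, 1) * sigma);
```

Xs would then replace X in the call to logisticfit with solver set to 'sag' or 'saga'. The same mu and sigma must be applied to any new data before it is passed to logisticpredict.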