fitrsvm
Fit a support vector machine regression model
Syntax
Mdl = fitrsvm(Tbl,ResponseVarName)
Mdl = fitrsvm(___,Name,Value)

Description
fitrsvm trains or cross-validates a support vector machine (SVM) regression model on a low- through moderate-dimensional predictor data set. fitrsvm supports mapping the predictor data using kernel functions, and supports SMO, ISDA, or L1 soft-margin minimization via quadratic programming for objective-function minimization.
To train a linear SVM regression model on a high-dimensional data set, that is, a data set that includes many predictor variables, use fitrlinear instead.
To train an SVM model for binary classification, see fitcsvm for low- through moderate-dimensional predictor data sets, or fitclinear for high-dimensional data sets.
Mdl = fitrsvm(Tbl,ResponseVarName) returns a full, trained support vector machine (SVM) regression model Mdl trained using the predictor values in the table Tbl and the response values in Tbl.ResponseVarName.
Mdl = fitrsvm(___,Name,Value) returns an SVM regression model with additional options specified by one or more name-value pair arguments, using any of the previous syntaxes. For example, you can specify the kernel function or train a cross-validated model.
Examples
Train Linear Support Vector Machine Regression Model
Train a support vector machine (SVM) regression model using sample data stored in matrices.
Load the carsmall data set.
load carsmall
rng 'default'  % For reproducibility
Specify Horsepower and Weight as the predictor variables (X) and MPG as the response variable (Y).
X = [Horsepower,Weight];
Y = MPG;
Train a default SVM regression model.
Mdl = fitrsvm(X,Y)
Mdl = 
  RegressionSVM
             ResponseName: 'Y'
    CategoricalPredictors: []
        ResponseTransform: 'none'
                    Alpha: [75x1 double]
                     Bias: 57.3800
         KernelParameters: [1x1 struct]
          NumObservations: 93
           BoxConstraints: [93x1 double]
          ConvergenceInfo: [1x1 struct]
          IsSupportVector: [93x1 logical]
                   Solver: 'SMO'

  Properties, Methods
Mdl is a trained RegressionSVM model.
Check the model for convergence.
Mdl.ConvergenceInfo.Converged
ans = logical
   0
0 indicates that the model did not converge.
Retrain the model using standardized data.
MdlStd = fitrsvm(X,Y,'Standardize',true)
MdlStd = 
  RegressionSVM
             ResponseName: 'Y'
    CategoricalPredictors: []
        ResponseTransform: 'none'
                    Alpha: [77x1 double]
                     Bias: 22.9131
         KernelParameters: [1x1 struct]
                       Mu: [109.3441 2.9625e+03]
                    Sigma: [45.3545 805.9668]
          NumObservations: 93
           BoxConstraints: [93x1 double]
          ConvergenceInfo: [1x1 struct]
          IsSupportVector: [93x1 logical]
                   Solver: 'SMO'

  Properties, Methods
Check the model for convergence.
MdlStd.ConvergenceInfo.Converged
ans = logical
   1
1 indicates that the model did converge.
Compute the resubstitution (in-sample) mean squared error for the new model.
lStd = resubLoss(MdlStd)
lStd = 17.0256
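With a converged model, you can then predict responses for new observations by using the predict function. A minimal sketch (the horsepower and weight values below are hypothetical):
predictedMPG = predict(MdlStd,[150 3000])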
train support vector machine regression model
train a support vector machine regression model using the abalone data from the uci machine learning repository.
download the data and save it in your current folder with the name 'abalone.csv'.
url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/abalone/abalone.data'; websave('abalone.csv',url);
read the data into a table. specify the variable names.
varnames = {'sex'; 'length'; 'diameter'; 'height'; 'whole_weight';...
'shucked_weight'; 'viscera_weight'; 'shell_weight'; 'rings'};
tbl = readtable('abalone.csv','filetype','text','readvariablenames',false);
tbl.properties.variablenames = varnames;the sample data contains 4177 observations. all the predictor variables are continuous except for sex, which is a categorical variable with possible values 'm' (for males), 'f' (for females), and 'i' (for infants). the goal is to predict the number of rings (stored in rings) on the abalone and determine its age using physical measurements.
train an svm regression model, using a gaussian kernel function with an automatic kernel scale. standardize the data.
rng default % for reproducibility mdl = fitrsvm(tbl,'rings','kernelfunction','gaussian','kernelscale','auto',... 'standardize',true)
Mdl = 
  RegressionSVM
           PredictorNames: {'Sex'  'Length'  'Diameter'  'Height'  'Whole_weight'  'Shucked_weight'  'Viscera_weight'  'Shell_weight'}
             ResponseName: 'Rings'
    CategoricalPredictors: 1
        ResponseTransform: 'none'
                    Alpha: [3635×1 double]
                     Bias: 10.8144
         KernelParameters: [1×1 struct]
                       Mu: [0 0 0 0.5240 0.4079 0.1395 0.8287 0.3594 0.1806 0.2388]
                    Sigma: [1 1 1 0.1201 0.0992 0.0418 0.4904 0.2220 0.1096 0.1392]
          NumObservations: 4177
           BoxConstraints: [4177×1 double]
          ConvergenceInfo: [1×1 struct]
          IsSupportVector: [4177×1 logical]
                   Solver: 'SMO'

  Properties, Methods
The Command Window shows that Mdl is a trained RegressionSVM model and displays a property list.
Display the properties of Mdl using dot notation. For example, check to confirm whether the model converged and how many iterations it completed.
conv = Mdl.ConvergenceInfo.Converged
conv = logical
   1
iter = Mdl.NumIterations
iter = 2759
The returned results indicate that the model converged after 2759 iterations.
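As with any RegressionSVM model, you can also gauge the in-sample fit. A minimal sketch using the resubstitution loss (the resulting value depends on the downloaded data):
lossAbalone = resubLoss(Mdl)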
Cross-Validate SVM Regression Model
Load the carsmall data set.
load carsmall
rng 'default'  % For reproducibility
Specify Horsepower and Weight as the predictor variables (X) and MPG as the response variable (Y).
X = [Horsepower Weight];
Y = MPG;
Cross-validate two SVM regression models using 5-fold cross-validation. For both models, specify to standardize the predictors. Train one model using the default linear kernel, and the other using the Gaussian kernel.
MdlLin = fitrsvm(X,Y,'Standardize',true,'KFold',5)
MdlLin = 
  RegressionPartitionedSVM
    CrossValidatedModel: 'SVM'
         PredictorNames: {'x1'  'x2'}
           ResponseName: 'Y'
        NumObservations: 94
                  KFold: 5
              Partition: [1x1 cvpartition]
      ResponseTransform: 'none'

  Properties, Methods
MdlGau = fitrsvm(X,Y,'Standardize',true,'KFold',5,'KernelFunction','gaussian')
MdlGau = 
  RegressionPartitionedSVM
    CrossValidatedModel: 'SVM'
         PredictorNames: {'x1'  'x2'}
           ResponseName: 'Y'
        NumObservations: 94
                  KFold: 5
              Partition: [1x1 cvpartition]
      ResponseTransform: 'none'

  Properties, Methods
MdlLin.Trained
ans = 5×1 cell array
    {1×1 classreg.learning.regr.CompactRegressionSVM}
    {1×1 classreg.learning.regr.CompactRegressionSVM}
    {1×1 classreg.learning.regr.CompactRegressionSVM}
    {1×1 classreg.learning.regr.CompactRegressionSVM}
    {1×1 classreg.learning.regr.CompactRegressionSVM}
MdlLin and MdlGau are RegressionPartitionedSVM cross-validated models. The Trained property of each model is a 5-by-1 cell array of CompactRegressionSVM models. The models in the cell array store the results of training on four folds of observations while leaving one fold of observations out.
Compare the generalization error of the models. In this case, the generalization error is the out-of-sample mean squared error.
mseLin = kfoldLoss(MdlLin)
mseLin = 17.4417
mseGau = kfoldLoss(MdlGau)
mseGau = 16.7333
The SVM regression model using the Gaussian kernel performs better than the one using the linear kernel.
Create a model suitable for making predictions by passing the entire data set to fitrsvm, and specify all name-value pair arguments that yielded the better-performing model. However, do not specify any cross-validation options.
MdlGau = fitrsvm(X,Y,'Standardize',true,'KernelFunction','gaussian');
To predict the MPG of a set of cars, pass MdlGau and a matrix containing the horsepower and weight measurements of the cars to predict.
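For example, a minimal sketch of that prediction step using a numeric matrix of hypothetical [horsepower, weight] measurements:
predictedMPG = predict(MdlGau,[130 3100; 165 3600])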
Optimize SVM Regression
This example shows how to optimize hyperparameters automatically using fitrsvm. The example uses the carsmall data.
Load the carsmall data set.
load carsmall
Specify Horsepower and Weight as the predictor variables (X) and MPG as the response variable (Y).
X = [Horsepower Weight];
Y = MPG;
Delete rows of X and Y where either array has missing values.
R = rmmissing([X Y]);
X = R(:,1:end-1);
Y = R(:,end);
Find hyperparameters that minimize five-fold cross-validation loss by using automatic hyperparameter optimization.
For reproducibility, set the random seed and use the 'expected-improvement-plus' acquisition function.
rng default
Mdl = fitrsvm(X,Y,'OptimizeHyperparameters','auto',...
    'HyperparameterOptimizationOptions',struct('AcquisitionFunctionName',...
    'expected-improvement-plus'))
|====================================================================================================================|
| Iter | Eval   | Objective:  | Objective  | BestSoFar  | BestSoFar  | BoxConstraint| KernelScale  | Epsilon      |
|      | result | log(1+loss) | runtime    | (observed) | (estim.)   |              |              |              |
|====================================================================================================================|
|    1 | Best   |      2.9063 |    0.53469 |     2.9063 |     2.9063 |       28.362 |       911.02 |     0.016304 |
|    2 | Accept |        5.09 |     8.9835 |     2.9063 |     3.0286 |     0.036953 |     0.053504 |     0.034503 |
|    3 | Accept |      4.1988 |    0.11067 |     2.9063 |     2.9065 |    0.0062352 |    0.0013479 |       81.308 |
|    4 | Accept |      4.1988 |    0.13082 |     2.9063 |     3.0166 |        680.3 |       293.54 |       162.86 |
|    5 | Accept |       2.911 |    0.14758 |     2.9063 |     2.9074 |       25.798 |       951.96 |     0.012952 |
|    6 | Accept |      2.9133 |   0.083699 |     2.9063 |     2.9074 |        1.637 |       288.31 |    0.0095253 |
|    7 | Accept |      2.9172 |   0.079151 |     2.9063 |     2.9061 |      0.64941 |       336.58 |     0.053186 |
|    8 | Best   |      2.9049 |   0.072264 |     2.9049 |     2.9017 |       14.446 |       504.88 |     0.016845 |
|    9 | Accept |      4.1457 |    0.11639 |     2.9049 |     2.9044 |    0.0070799 |       998.03 |        0.096 |
|   10 | Accept |      2.9231 |    0.21541 |     2.9049 |     2.9044 |       4.2345 |       988.52 |      0.16944 |
|   11 | Accept |      2.9062 |    0.19738 |     2.9049 |     2.9042 |       2.3059 |       153.63 |      0.20208 |
|   12 | Accept |      2.9127 |   0.070551 |     2.9049 |     2.9064 |       2.7788 |        381.1 |     0.053787 |
|   13 | Accept |      4.1988 |    0.14893 |     2.9049 |     2.9053 |       11.863 |       21.429 |       34.968 |
|   14 | Best   |      2.9021 |   0.066902 |     2.9021 |     2.9021 |      0.54896 |       85.043 |      0.01929 |
|   15 | Accept |      2.9195 |   0.089052 |     2.9021 |     2.9021 |       3.5638 |       95.584 |    0.0099706 |
|   16 | Accept |      2.9334 |    0.13963 |     2.9021 |     2.9022 |       849.98 |       954.92 |    0.0093444 |
|   17 | Accept |      2.9508 |    0.31428 |     2.9021 |     2.9022 |       208.43 |       270.91 |    0.0095527 |
|   18 | Accept |      2.9079 |   0.097408 |     2.9021 |     2.9032 |      0.82463 |       133.01 |     0.059944 |
|   19 | Accept |      2.9051 |    0.15627 |     2.9021 |     2.9048 |        179.8 |       994.15 |     0.098141 |
|   20 | Accept |      2.9038 |    0.16884 |     2.9021 |     2.9049 |       19.865 |       523.62 |      0.17466 |
|====================================================================================================================|
| Iter | Eval   | Objective:  | Objective  | BestSoFar  | BestSoFar  | BoxConstraint| KernelScale  | Epsilon      |
|      | result | log(1+loss) | runtime    | (observed) | (estim.)   |              |              |              |
|====================================================================================================================|
|   21 | Accept |      2.9038 |    0.14472 |     2.9021 |     2.9052 |       165.02 |       990.85 |     0.011034 |
|   22 | Accept |      2.9047 |   0.086646 |     2.9021 |     2.9031 |      0.72748 |       131.25 |    0.0097564 |
|   23 | Accept |      2.9044 |   0.091078 |     2.9021 |     2.9031 |       73.581 |       730.36 |     0.039214 |
|   24 | Accept |      2.9279 |    0.22897 |     2.9021 |     2.9031 |       2.2772 |       372.34 |      0.31532 |
|   25 | Accept |      4.1988 |   0.095941 |     2.9021 |     2.9031 |       963.08 |    0.0011726 |       584.82 |
|   26 | Accept |      2.9388 |    0.14515 |     2.9021 |      2.903 |       914.63 |       862.93 |     0.059825 |
|   27 | Accept |      5.8854 |     10.011 |     2.9021 |      2.901 |       816.71 |       4.1904 |    0.0098651 |
|   28 | Accept |      2.9169 |   0.081832 |     2.9021 |     2.9012 |     0.001186 |       14.022 |     0.010653 |
|   29 | Accept |      2.9146 |    0.22186 |     2.9021 |      2.901 |     0.014717 |       28.169 |    0.0093332 |
|   30 | Accept |      3.1173 |   0.081241 |     2.9021 |     2.9011 |    0.0010229 |       12.263 |       6.8846 |

__________________________________________________________
Optimization completed.
MaxObjectiveEvaluations of 30 reached.
Total function evaluations: 30
Total elapsed time: 43.3303 seconds
Total objective function evaluation time: 23.1117

Best observed feasible point:
    BoxConstraint    KernelScale    Epsilon
    _____________    ___________    _______
       0.54896         85.043       0.01929

Observed objective function value = 2.9021
Estimated objective function value = 2.9037
Function evaluation time = 0.066902

Best estimated feasible point (according to models):
    BoxConstraint    KernelScale    Epsilon 
    _____________    ___________    ________
        73.581         730.36       0.039214

Estimated objective function value = 2.9011
Estimated function evaluation time = 0.15408
Mdl = 
  RegressionSVM
                         ResponseName: 'Y'
                CategoricalPredictors: []
                    ResponseTransform: 'none'
                                Alpha: [93x1 double]
                                 Bias: 46.4648
                     KernelParameters: [1x1 struct]
                      NumObservations: 93
    HyperparameterOptimizationResults: [1x1 BayesianOptimization]
                       BoxConstraints: [93x1 double]
                      ConvergenceInfo: [1x1 struct]
                      IsSupportVector: [93x1 logical]
                               Solver: 'SMO'

  Properties, Methods

The optimization searched over BoxConstraint, KernelScale, and Epsilon. The output is the regression model with the minimum estimated cross-validation loss.
Input Arguments
Tbl — Predictor data
table
Sample data used to train the model, specified as a table. Each row of Tbl corresponds to one observation, and each column corresponds to one predictor variable. Optionally, Tbl can contain one additional column for the response variable. Multicolumn variables and cell arrays other than cell arrays of character vectors are not allowed.
If Tbl contains the response variable, and you want to use all remaining variables in Tbl as predictors, then specify the response variable using ResponseVarName.
If Tbl contains the response variable, and you want to use only a subset of the remaining variables in Tbl as predictors, then specify a formula using formula.
If Tbl does not contain the response variable, then specify a response variable using Y. The length of the response variable and the number of rows of Tbl must be equal.
If a row of Tbl or an element of Y contains at least one NaN, then fitrsvm removes those rows and elements from both arguments when training the model.
To specify the names of the predictors in the order of their appearance in Tbl, use the PredictorNames name-value pair argument.
Data Types: table
ResponseVarName — Response variable name
name of variable in Tbl
Response variable name, specified as the name of a variable in Tbl. The response variable must be a numeric vector.
You must specify ResponseVarName as a character vector or string scalar. For example, if Tbl stores the response variable Y as Tbl.Y, then specify it as 'Y'. Otherwise, the software treats all columns of Tbl, including Y, as predictors when training the model.
Data Types: char | string
formula — Explanatory model of response variable and subset of predictor variables
character vector | string scalar
Explanatory model of the response variable and a subset of the predictor variables, specified as a character vector or string scalar in the form "Y~x1+x2+x3". In this form, Y represents the response variable, and x1, x2, and x3 represent the predictor variables.
To specify a subset of variables in Tbl as predictors for training the model, use a formula. If you specify a formula, then the software does not use any variables in Tbl that do not appear in formula.
The variable names in the formula must be both variable names in Tbl (Tbl.Properties.VariableNames) and valid MATLAB® identifiers. You can verify the variable names in Tbl by using the isvarname function. If the variable names are not valid, then you can convert them by using the matlab.lang.makeValidName function.
Data Types: char | string
Y — Response data
numeric vector
Response data, specified as an n-by-1 numeric vector. The length of Y and the number of rows of Tbl or X must be equal.
If a row of Tbl or X, or an element of Y, contains at least one NaN, then fitrsvm removes those rows and elements from both arguments when training the model.
To specify the response variable name, use the ResponseName name-value pair argument.
Data Types: single | double
X — Predictor data
numeric matrix
Predictor data to which the SVM regression model is fit, specified as an n-by-p numeric matrix. n is the number of observations and p is the number of predictor variables.
The length of Y and the number of rows of X must be equal.
If a row of X or an element of Y contains at least one NaN, then fitrsvm removes those rows and elements from both arguments.
To specify the names of the predictors in the order of their appearance in X, use the PredictorNames name-value pair argument.
Data Types: single | double
Name-Value Arguments
Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.
Before R2021a, use commas to separate each name and value, and enclose Name in quotes.
Example: 'KernelFunction','gaussian','Standardize',true,'CrossVal','on' trains a 10-fold cross-validated SVM regression model using a Gaussian kernel and standardized training data.
Note
You cannot use any cross-validation name-value argument together with the 'OptimizeHyperparameters' name-value argument. You can modify the cross-validation for 'OptimizeHyperparameters' only by using the 'HyperparameterOptimizationOptions' name-value argument.
BoxConstraint — Box constraint
positive scalar value
Box constraint for the alpha coefficients, specified as the comma-separated pair consisting of 'BoxConstraint' and a positive scalar value.
The absolute value of the alpha coefficients cannot exceed the value of BoxConstraint.
The default BoxConstraint value for the 'gaussian' or 'rbf' kernel function is iqr(Y)/1.349, where iqr(Y) is the interquartile range of the response variable Y. For all other kernels, the default BoxConstraint value is 1.
Example: 'BoxConstraint',10
Data Types: single | double
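To inspect the Gaussian-kernel default before training, you can compute it directly. A minimal sketch, assuming a numeric response vector Y is already in the workspace:
defaultBox = iqr(Y)/1.349  % Default BoxConstraint for 'gaussian' or 'rbf' kernels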
KernelFunction — Kernel function
'linear' (default) | 'gaussian' | 'rbf' | 'polynomial' | function name
Kernel function used to compute the Gram matrix, specified as the comma-separated pair consisting of 'KernelFunction' and a value in this table.
| Value | Description | Formula |
|---|---|---|
| 'gaussian' or 'rbf' | Gaussian or radial basis function (RBF) kernel | G(x_j,x_k) = exp(-‖x_j - x_k‖^2) |
| 'linear' | Linear kernel | G(x_j,x_k) = x_j'*x_k |
| 'polynomial' | Polynomial kernel. Use 'PolynomialOrder' to specify a polynomial kernel of order q. | G(x_j,x_k) = (1 + x_j'*x_k)^q |
You can set your own kernel function, for example, kernel, by setting 'KernelFunction','kernel'. kernel must have the following form:
function G = kernel(U,V)
where:
U is an m-by-p matrix.
V is an n-by-p matrix.
G is an m-by-n Gram matrix of the rows of U and V.
kernel.m must be on the MATLAB path.
It is a good practice to avoid using generic names for kernel functions. For example, call a sigmoid kernel function 'mysigmoid' rather than 'sigmoid'.
Example: 'KernelFunction','gaussian'
Data Types: char | string
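For illustration, here is a hedged sketch of a custom kernel with the required signature (the sigmoid form and its gamma and c values are arbitrary choices, not defaults of fitrsvm):
function G = mysigmoid(U,V)
% Sigmoid kernel with hypothetical slope gamma and intercept c
gamma = 1;
c = -1;
G = tanh(gamma*U*V' + c);
end
Save the function as mysigmoid.m on the MATLAB path, and then train with, for example, fitrsvm(X,Y,'KernelFunction','mysigmoid').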
KernelScale — Kernel scale parameter
1 (default) | 'auto' | positive scalar
Kernel scale parameter, specified as the comma-separated pair consisting of 'KernelScale' and 'auto' or a positive scalar. The software divides all elements of the predictor matrix X by the value of KernelScale. Then, the software applies the appropriate kernel norm to compute the Gram matrix.
If you specify 'auto', then the software selects an appropriate scale factor using a heuristic procedure. This heuristic procedure uses subsampling, so estimates can vary from one call to another. Therefore, to reproduce results, set a random number seed using rng before training (see the sketch following this entry).
If you specify KernelScale and your own kernel function, for example, 'KernelFunction','kernel', then the software throws an error. You must apply scaling within kernel.
Example: 'KernelScale','auto'
Data Types: double | single | char | string
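A minimal sketch of the reproducibility pattern for 'auto', assuming a predictor matrix X and response Y in the workspace:
rng(1)  % Seed the generator so the subsampling heuristic is reproducible
Mdl = fitrsvm(X,Y,'KernelFunction','gaussian','KernelScale','auto');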
PolynomialOrder — Polynomial kernel function order
3 (default) | positive integer
Polynomial kernel function order, specified as the comma-separated pair consisting of 'PolynomialOrder' and a positive integer.
If you set 'PolynomialOrder' and KernelFunction is not 'polynomial', then the software throws an error.
Example: 'PolynomialOrder',2
Data Types: double | single
KernelOffset — Kernel offset parameter
nonnegative scalar
Kernel offset parameter, specified as the comma-separated pair consisting of 'KernelOffset' and a nonnegative scalar.
The software adds KernelOffset to each element of the Gram matrix.
The defaults are:
0 if the solver is SMO (that is, you set 'Solver','SMO')
0.1 if the solver is ISDA (that is, you set 'Solver','ISDA')
Example: 'KernelOffset',0
Data Types: double | single
Epsilon — Half the width of epsilon-insensitive band
iqr(Y)/13.49 (default) | nonnegative scalar value
Half the width of the epsilon-insensitive band, specified as the comma-separated pair consisting of 'Epsilon' and a nonnegative scalar value.
The default Epsilon value is iqr(Y)/13.49, which is an estimate of a tenth of the standard deviation using the interquartile range of the response variable Y. If iqr(Y) is equal to zero, then the default Epsilon value is 0.1.
Example: 'Epsilon',0.3
Data Types: single | double
Standardize — Flag to standardize predictor data
false (default) | true
Flag to standardize the predictor data, specified as the comma-separated pair consisting of 'Standardize' and true (1) or false (0).
If you set 'Standardize',true:
The software centers and scales each column of the predictor data (X) by the weighted column mean and standard deviation, respectively (for details on weighted standardizing, see Algorithms). MATLAB does not standardize the data contained in the dummy variable columns generated for categorical predictors.
The software trains the model using the standardized predictor matrix, but stores the unstandardized data in the model property X.
Example: 'Standardize',true
Data Types: logical
Solver — Optimization routine
'ISDA' | 'L1QP' | 'SMO'
Optimization routine, specified as the comma-separated pair consisting of 'Solver' and a value in this table.
| Value | Description |
|---|---|
| 'ISDA' | Iterative single data algorithm (see [3]) |
| 'L1QP' | Uses quadprog (Optimization Toolbox) to implement L1 soft-margin minimization by quadratic programming. This option requires an Optimization Toolbox™ license. For more details, see Quadratic Programming Definition (Optimization Toolbox). |
| 'SMO' | Sequential minimal optimization (see [2]) |
The defaults are:
'ISDA' if you set 'OutlierFraction' to a positive value
'SMO' otherwise
Example: 'Solver','ISDA'
Alpha — Initial estimates of alpha coefficients
numeric vector
Initial estimates of alpha coefficients, specified as the comma-separated pair consisting of 'Alpha' and a numeric vector. The length of Alpha must be equal to the number of rows of X.
Each element of Alpha corresponds to an observation in X.
Alpha cannot contain any NaNs.
If you specify Alpha and any one of the cross-validation name-value pair arguments ('CrossVal', 'CVPartition', 'Holdout', 'KFold', or 'Leaveout'), then the software returns an error.
If Y contains any missing values, then remove all rows of Y, X, and Alpha that correspond to the missing values. That is, enter:
idx = ~isnan(Y);
Y = Y(idx);
X = X(idx,:);
Alpha = Alpha(idx);
Then pass Y, X, and Alpha as the response, predictors, and initial alpha estimates, respectively.
The default is zeros(size(Y,1),1).
Example: 'Alpha',0.1*ones(size(X,1),1)
Data Types: single | double
CacheSize — Cache size
1000 (default) | 'maximal' | positive scalar
Cache size, specified as the comma-separated pair consisting of 'CacheSize' and 'maximal' or a positive scalar.
If CacheSize is 'maximal', then the software reserves enough memory to hold the entire n-by-n Gram matrix.
If CacheSize is a positive scalar, then the software reserves CacheSize megabytes of memory for training the model.
Example: 'CacheSize','maximal'
Data Types: double | single | char | string
ClipAlphas — Flag to clip alpha coefficients
true (default) | false
Flag to clip alpha coefficients, specified as the comma-separated pair consisting of 'ClipAlphas' and either true or false.
Suppose that the alpha coefficient for observation j is αj and the box constraint of observation j is cj, j = 1,...,n, where n is the training sample size.
| Value | Description |
|---|---|
| true | At each iteration, if αj is near 0 or near cj, then MATLAB sets αj to 0 or to cj, respectively. |
| false | MATLAB does not change the alpha coefficients during optimization. |
MATLAB stores the final values of α in the Alpha property of the trained SVM model object.
ClipAlphas can affect SMO and ISDA convergence.
Example: 'ClipAlphas',false
Data Types: logical
NumPrint — Number of iterations between optimization diagnostic message output
1000 (default) | nonnegative integer
Number of iterations between optimization diagnostic message output, specified as the comma-separated pair consisting of 'NumPrint' and a nonnegative integer.
If you specify 'Verbose',1 and 'NumPrint',numprint, then the software displays all optimization diagnostic messages from SMO and ISDA every numprint iterations in the Command Window.
Example: 'NumPrint',500
Data Types: double | single
OutlierFraction — Expected proportion of outliers in training data
0 (default) | numeric scalar in the interval [0,1)
Expected proportion of outliers in the training data, specified as the comma-separated pair consisting of 'OutlierFraction' and a numeric scalar in the interval [0,1). fitrsvm removes observations with large gradients, ensuring that fitrsvm removes the fraction of observations specified by OutlierFraction by the time convergence is reached. This name-value pair is only valid when 'Solver' is 'ISDA'.
Example: 'OutlierFraction',0.1
Data Types: single | double
RemoveDuplicates — Flag to replace duplicate observations with single observations
false (default) | true
Flag to replace duplicate observations with single observations in the training data, specified as the comma-separated pair consisting of 'RemoveDuplicates' and true or false.
If RemoveDuplicates is true, then fitrsvm replaces duplicate observations in the training data with a single observation of the same value. The weight of the single observation is equal to the sum of the weights of the corresponding removed duplicates (see Weights).
Tip
If your data set contains many duplicate observations, then specifying 'RemoveDuplicates',true can decrease convergence time considerably.
Data Types: logical
Verbose — Verbosity level
0 (default) | 1 | 2
Verbosity level, specified as the comma-separated pair consisting of 'Verbose' and 0, 1, or 2. The value of Verbose controls the amount of optimization information that the software displays in the Command Window, and saves the information as a structure to Mdl.ConvergenceInfo.History.
This table summarizes the available verbosity level options.
| Value | Description |
|---|---|
| 0 | The software does not display or save convergence information. |
| 1 | The software displays diagnostic messages and saves convergence criteria every numprint iterations, where numprint is the value of the name-value pair argument 'NumPrint'. |
| 2 | The software displays diagnostic messages and saves convergence criteria at every iteration. |
Example: 'Verbose',1
Data Types: double | single
CategoricalPredictors — Categorical predictors list
vector of positive integers | logical vector | character matrix | string array | cell array of character vectors | 'all'
Categorical predictors list, specified as one of the values in this table.
| Value | Description |
|---|---|
| Vector of positive integers | Each entry in the vector is an index value indicating that the corresponding predictor is categorical. The index values are between 1 and p, where p is the number of predictors used to train the model. |
| Logical vector | A true entry means that the corresponding predictor is categorical. The length of the vector is p. |
| Character matrix | Each row of the matrix is the name of a predictor variable. The names must match the entries in PredictorNames. Pad the names with extra blanks so each row of the character matrix has the same length. |
| String array or cell array of character vectors | Each element in the array is the name of a predictor variable. The names must match the entries in PredictorNames. |
| "all" | All predictors are categorical. |
By default, if the predictor data is in a table (Tbl), fitrsvm assumes that a variable is categorical if it is a logical vector, categorical vector, character array, string array, or cell array of character vectors. If the predictor data is a matrix (X), fitrsvm assumes that all predictors are continuous. To identify any other predictors as categorical predictors, specify them by using the CategoricalPredictors name-value argument.
For the identified categorical predictors, fitrsvm creates dummy variables using two different schemes, depending on whether a categorical variable is unordered or ordered. For an unordered categorical variable, fitrsvm creates one dummy variable for each level of the categorical variable. For an ordered categorical variable, fitrsvm creates one less dummy variable than the number of categories. For details, see Automatic Creation of Dummy Variables.
Example: 'CategoricalPredictors','all'
Data Types: single | double | logical | char | string | cell
PredictorNames — Predictor variable names
string array of unique names | cell array of unique character vectors
Predictor variable names, specified as a string array of unique names or cell array of unique character vectors. The functionality of PredictorNames depends on the way you supply the training data.
If you supply X and Y, then you can use PredictorNames to assign names to the predictor variables in X.
The order of the names in PredictorNames must correspond to the column order of X. That is, PredictorNames{1} is the name of X(:,1), PredictorNames{2} is the name of X(:,2), and so on. Also, size(X,2) and numel(PredictorNames) must be equal.
By default, PredictorNames is {'x1','x2',...}.
If you supply Tbl, then you can use PredictorNames to choose which predictor variables to use in training. That is, fitrsvm uses only the predictor variables in PredictorNames and the response variable during training.
PredictorNames must be a subset of Tbl.Properties.VariableNames and cannot include the name of the response variable.
By default, PredictorNames contains the names of all predictor variables.
A good practice is to specify the predictors for training using either PredictorNames or formula, but not both.
Example: "PredictorNames",["SepalLength","SepalWidth","PetalLength","PetalWidth"]
Data Types: string | cell
ResponseName — Response variable name
"Y" (default) | character vector | string scalar
Response variable name, specified as a character vector or string scalar.
If you supply Y, then you can use ResponseName to specify a name for the response variable.
If you supply ResponseVarName or formula, then you cannot use ResponseName.
Example: "ResponseName","response"
Data Types: char | string
ResponseTransform — Response transformation
'none' (default) | function handle
Response transformation, specified as either 'none' or a function handle. The default is 'none', which means @(y)y, or no transformation. For a MATLAB function or a function you define, use its function handle for the response transformation. The function handle must accept a vector (the original response values) and return a vector of the same size (the transformed response values).
Example: Suppose you create a function handle that applies an exponential transformation to an input vector by using myfunction = @(y)exp(y). Then, you can specify the response transformation as 'ResponseTransform',myfunction.
Data Types: char | string | function_handle
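Putting that example in context, a minimal sketch assuming X and Y are in the workspace:
myfunction = @(y)exp(y);  % Hypothetical exponential response transformation
Mdl = fitrsvm(X,Y,'ResponseTransform',myfunction);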
Weights — Observation weights
ones(size(X,1),1) (default) | vector of numeric values
Observation weights, specified as the comma-separated pair consisting of 'Weights' and a vector of numeric values. The size of Weights must equal the number of rows in X. fitrsvm normalizes the values of Weights to sum to 1.
Data Types: single | double
CrossVal — Cross-validation flag
'off' (default) | 'on'
Cross-validation flag, specified as the comma-separated pair consisting of 'CrossVal' and either 'on' or 'off'.
If you specify 'on', then the software implements 10-fold cross-validation.
To override this cross-validation setting, use one of these name-value pair arguments: CVPartition, Holdout, KFold, or Leaveout. To create a cross-validated model, you can use only one cross-validation name-value pair argument at a time.
Alternatively, you can cross-validate the model later using the crossval method.
Example: 'CrossVal','on'
CVPartition — Cross-validation partition
[] (default) | cvpartition partition object
Cross-validation partition, specified as a cvpartition partition object created by cvpartition. The partition object specifies the type of cross-validation and the indexing for the training and validation sets.
To create a cross-validated model, you can specify only one of these four name-value arguments: CVPartition, Holdout, KFold, or Leaveout.
Example: Suppose you create a random partition for 5-fold cross-validation on 500 observations by using cvp = cvpartition(500,'KFold',5). Then, you can specify the cross-validated model by using 'CVPartition',cvp.
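Put together as a brief sketch (assuming X and Y contain 500 observations):
cvp = cvpartition(500,'KFold',5);        % Define the 5-fold partition
CVMdl = fitrsvm(X,Y,'CVPartition',cvp);  % Train one model per fold
mse = kfoldLoss(CVMdl)                   % Out-of-fold mean squared error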
Holdout — Fraction of data for holdout validation
scalar value in the range (0,1)
Fraction of the data used for holdout validation, specified as a scalar value in the range (0,1). If you specify 'Holdout',p, then the software completes these steps:
1. Randomly select and reserve p*100% of the data as validation data, and train the model using the rest of the data.
2. Store the compact, trained model in the Trained property of the cross-validated model.
To create a cross-validated model, you can specify only one of these four name-value arguments: CVPartition, Holdout, KFold, or Leaveout.
Example: 'Holdout',0.1
Data Types: double | single
KFold — Number of folds
10 (default) | positive integer value greater than 1
Number of folds to use in a cross-validated model, specified as a positive integer value greater than 1. If you specify 'KFold',k, then the software completes these steps:
1. Randomly partition the data into k sets.
2. For each set, reserve the set as validation data, and train the model using the other k – 1 sets.
3. Store the k compact, trained models in a k-by-1 cell vector in the Trained property of the cross-validated model.
To create a cross-validated model, you can specify only one of these four name-value arguments: CVPartition, Holdout, KFold, or Leaveout.
Example: 'KFold',5
Data Types: single | double
Leaveout — Leave-one-out cross-validation flag
'off' (default) | 'on'
Leave-one-out cross-validation flag, specified as 'on' or 'off'. If you specify 'Leaveout','on', then for each of the n observations (where n is the number of observations, excluding missing observations, specified in the NumObservations property of the model), the software completes these steps:
1. Reserve the one observation as validation data, and train the model using the other n – 1 observations.
2. Store the n compact, trained models in an n-by-1 cell vector in the Trained property of the cross-validated model.
To create a cross-validated model, you can specify only one of these four name-value arguments: CVPartition, Holdout, KFold, or Leaveout.
Example: 'Leaveout','on'
DeltaGradientTolerance — Tolerance for gradient difference
0 (default) | nonnegative scalar
Tolerance for the gradient difference between upper and lower violators obtained by SMO or ISDA, specified as the comma-separated pair consisting of 'DeltaGradientTolerance' and a nonnegative scalar.
Example: 'DeltaGradientTolerance',1e-4
Data Types: single | double
GapTolerance — Feasibility gap tolerance
1e-3 (default) | nonnegative scalar
Feasibility gap tolerance obtained by SMO or ISDA, specified as the comma-separated pair consisting of 'GapTolerance' and a nonnegative scalar.
If GapTolerance is 0, then fitrsvm does not use this parameter to check convergence.
Example: 'GapTolerance',1e-4
Data Types: single | double
IterationLimit — Maximal number of numerical optimization iterations
1e6 (default) | positive integer
Maximal number of numerical optimization iterations, specified as the comma-separated pair consisting of 'IterationLimit' and a positive integer.
The software returns a trained model regardless of whether the optimization routine successfully converges. Mdl.ConvergenceInfo contains convergence information.
Example: 'IterationLimit',1e8
Data Types: double | single
KKTTolerance — Tolerance for KKT violation
0 (default) | nonnegative scalar value
Tolerance for Karush-Kuhn-Tucker (KKT) violation, specified as the comma-separated pair consisting of 'KKTTolerance' and a nonnegative scalar value.
This name-value pair applies only if 'Solver' is 'SMO' or 'ISDA'.
If KKTTolerance is 0, then fitrsvm does not use this parameter to check convergence.
Example: 'KKTTolerance',1e-4
Data Types: single | double
ShrinkagePeriod — Number of iterations between reductions of active set
0 (default) | nonnegative integer
Number of iterations between reductions of the active set, specified as the comma-separated pair consisting of 'ShrinkagePeriod' and a nonnegative integer.
If you set 'ShrinkagePeriod',0, then the software does not shrink the active set.
Example: 'ShrinkagePeriod',1000
Data Types: double | single
OptimizeHyperparameters — Parameters to optimize
'none' (default) | 'auto' | 'all' | string array or cell array of eligible parameter names | vector of optimizableVariable objects
Parameters to optimize, specified as the comma-separated pair consisting of 'OptimizeHyperparameters' and one of the following:
'none' — Do not optimize.
'auto' — Use {'BoxConstraint','KernelScale','Epsilon'}.
'all' — Optimize all eligible parameters.
String array or cell array of eligible parameter names.
Vector of optimizableVariable objects, typically the output of hyperparameters.
The optimization attempts to minimize the cross-validation loss (error) for fitrsvm by varying the parameters. To control the cross-validation type and other aspects of the optimization, use the HyperparameterOptimizationOptions name-value pair.
Note
The values of 'OptimizeHyperparameters' override any values you specify using other name-value arguments. For example, setting 'OptimizeHyperparameters' to 'auto' causes fitrsvm to optimize hyperparameters corresponding to the 'auto' option and to ignore any specified values for the hyperparameters.
The eligible parameters for fitrsvm are:
BoxConstraint — fitrsvm searches among positive values, by default log-scaled in the range [1e-3,1e3].
KernelScale — fitrsvm searches among positive values, by default log-scaled in the range [1e-3,1e3].
Epsilon — fitrsvm searches among positive values, by default log-scaled in the range [1e-3,1e2]*iqr(Y)/1.349.
KernelFunction — fitrsvm searches among 'gaussian', 'linear', and 'polynomial'.
PolynomialOrder — fitrsvm searches among integers in the range [2,4].
Standardize — fitrsvm searches among 'true' and 'false'.
Set nondefault parameters by passing a vector of optimizableVariable objects that have nondefault values. For example:
load carsmall
params = hyperparameters('fitrsvm',[Horsepower,Weight],MPG);
params(1).Range = [1e-4,1e6];
Pass params as the value of OptimizeHyperparameters.
By default, the iterative display appears at the command line, and plots appear according to the number of hyperparameters in the optimization. For the optimization and plots, the objective function is log(1 + cross-validation loss). To control the iterative display, set the Verbose field of the 'HyperparameterOptimizationOptions' name-value argument. To control the plots, set the ShowPlots field of the 'HyperparameterOptimizationOptions' name-value argument.
For an example, see Optimize SVM Regression.
Example: 'OptimizeHyperparameters','auto'
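A hedged sketch of the full pattern, widening the BoxConstraint search range before optimizing:
load carsmall
X = [Horsepower,Weight];
params = hyperparameters('fitrsvm',X,MPG);
params(1).Range = [1e-4,1e6];  % params(1) corresponds to BoxConstraint
Mdl = fitrsvm(X,MPG,'OptimizeHyperparameters',params);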
HyperparameterOptimizationOptions — Options for optimization
structure
Options for optimization, specified as a structure. This argument modifies the effect of the OptimizeHyperparameters name-value argument. All fields in the structure are optional.
| Field Name | Values | Default |
|---|---|---|
| Optimizer | 'bayesopt' (Bayesian optimization), 'gridsearch' (grid search with NumGridDivisions values per dimension), or 'randomsearch' (random search among MaxObjectiveEvaluations points). | 'bayesopt' |
| AcquisitionFunctionName | 'expected-improvement-per-second-plus', 'expected-improvement', 'expected-improvement-plus', 'expected-improvement-per-second', 'lower-confidence-bound', or 'probability-of-improvement'. Acquisition functions whose names include per-second do not yield reproducible results, because the optimization depends on the runtime of the objective function. | 'expected-improvement-per-second-plus' |
| MaxObjectiveEvaluations | Maximum number of objective function evaluations. | 30 for 'bayesopt' and 'randomsearch', and the entire grid for 'gridsearch' |
| MaxTime | Time limit, specified as a positive real scalar. The time limit is in seconds, as measured by tic and toc. The run time can exceed MaxTime because MaxTime does not interrupt function evaluations. | Inf |
| NumGridDivisions | For 'gridsearch', the number of values in each dimension. The value can be a vector of positive integers giving the number of values for each dimension, or a scalar that applies to all dimensions. This field is ignored for categorical variables. | 10 |
| ShowPlots | Logical value indicating whether to show plots. If true, this field plots the best observed objective function value against the iteration number. If you use Bayesian optimization (Optimizer is 'bayesopt'), then this field also plots the best estimated objective function value. The best observed objective function values and best estimated objective function values correspond to the values in the BestSoFar (observed) and BestSoFar (estim.) columns of the iterative display, respectively. You can find these values in the properties ObjectiveMinimumTrace and EstimatedObjectiveMinimumTrace of Mdl.HyperparameterOptimizationResults. If the problem includes one or two optimization parameters for Bayesian optimization, then ShowPlots also plots a model of the objective function against the parameters. | true |
| SaveIntermediateResults | Logical value indicating whether to save results when Optimizer is 'bayesopt'. If true, this field overwrites a workspace variable named 'BayesoptResults' at each iteration. The variable is a BayesianOptimization object. | false |
| Verbose | Display at the command line: 0 (no iterative display), 1 (iterative display), or 2 (iterative display with extra information). For details, see the bayesopt Verbose name-value argument. | 1 |
| UseParallel | Logical value indicating whether to run Bayesian optimization in parallel, which requires Parallel Computing Toolbox™. Due to the nonreproducibility of parallel timing, parallel Bayesian optimization does not necessarily yield reproducible results. | false |
| Repartition | Logical value indicating whether to repartition the cross-validation at every iteration. If this field is false, the optimizer uses a single partition for the optimization. The setting true usually gives the most robust results because it takes partitioning noise into account; however, for good results, true requires at least twice as many function evaluations. | false |
| Use no more than one of the following three options. | | |
| CVPartition | A cvpartition object, as created by cvpartition. | 'KFold',5 if you do not specify a cross-validation field |
| Holdout | A scalar in the range (0,1) representing the holdout fraction. | |
| KFold | An integer greater than 1. | |
Example: 'HyperparameterOptimizationOptions',struct('MaxObjectiveEvaluations',60)
Data Types: struct
Output Arguments
Mdl — Trained SVM regression model
RegressionSVM model | RegressionPartitionedSVM cross-validated model
Trained SVM regression model, returned as a RegressionSVM model or a RegressionPartitionedSVM cross-validated model.
If you set any of the name-value pair arguments KFold, Holdout, Leaveout, CrossVal, or CVPartition, then Mdl is a RegressionPartitionedSVM cross-validated model. Otherwise, Mdl is a RegressionSVM model.
Limitations
fitrsvm supports low- through moderate-dimensional data sets. For a high-dimensional data set, use fitrlinear instead.
Tips
Unless your data set is large, always try to standardize the predictors (see Standardize). Standardization makes predictors insensitive to the scales on which they are measured.
It is good practice to cross-validate using the KFold name-value pair argument. The cross-validation results determine how well the SVM model generalizes.
Sparsity in support vectors is a desirable property of an SVM model. To decrease the number of support vectors, set the BoxConstraint name-value pair argument to a large value. This action also increases the training time.
For optimal training time, set CacheSize as high as the memory limit on your computer allows.
If you expect many fewer support vectors than observations in the training set, then you can significantly speed up convergence by shrinking the active set using the name-value pair argument 'ShrinkagePeriod'. It is good practice to use 'ShrinkagePeriod',1000.
Duplicate observations that are far from the regression line do not affect convergence. However, just a few duplicate observations that occur near the regression line can slow down convergence considerably. To speed up convergence, specify 'RemoveDuplicates',true if:
your data set contains many duplicate observations, or
you suspect that a few duplicate observations fall near the regression line.
However, to maintain the original data set during training, fitrsvm must temporarily store separate data sets: the original and one without the duplicate observations. Therefore, if you specify true for data sets containing few duplicates, then fitrsvm consumes close to double the memory of the original data.
After training a model, you can generate C/C++ code that predicts responses for new data. Generating C/C++ code requires MATLAB Coder™. For details, see Introduction to Code Generation.
Algorithms
For the mathematical formulation of linear and nonlinear SVM regression problems and the solver algorithms, see Understanding Support Vector Machine Regression.
NaN, <undefined>, empty character vector (''), empty string (""), and <missing> values indicate missing data values. fitrsvm removes entire rows of data corresponding to a missing response. When normalizing weights, fitrsvm ignores any weight corresponding to an observation with at least one missing predictor. Consequently, observation box constraints might not equal BoxConstraint.
fitrsvm removes observations that have zero weight.
If you set 'Standardize',true and 'Weights', then fitrsvm standardizes the predictors using their corresponding weighted means and weighted standard deviations. That is, fitrsvm standardizes predictor j (x_j) using

    x_j* = (x_j - μ_j*) / σ_j*

where x_jk is observation k (row) of predictor j (column), μ_j* is the weighted mean of predictor j, and σ_j* is the corresponding weighted standard deviation.
If your predictor data contains categorical variables, then the software generally uses full dummy encoding for these variables. The software creates one dummy variable for each level of each categorical variable.
The PredictorNames property stores one element for each of the original predictor variable names. For example, assume that there are three predictors, one of which is a categorical variable with three levels. Then PredictorNames is a 1-by-3 cell array of character vectors containing the original names of the predictor variables.
The ExpandedPredictorNames property stores one element for each of the predictor variables, including the dummy variables. For example, assume that there are three predictors, one of which is a categorical variable with three levels. Then ExpandedPredictorNames is a 1-by-5 cell array of character vectors containing the names of the predictor variables and the new dummy variables.
Similarly, the Beta property stores one beta coefficient for each predictor, including the dummy variables.
The SupportVectors property stores the predictor values for the support vectors, including the dummy variables. For example, assume that there are m support vectors and three predictors, one of which is a categorical variable with three levels. Then SupportVectors is an m-by-5 matrix.
The X property stores the training data as originally input. It does not include the dummy variables. When the input is a table, X contains only the columns used as predictors.
For predictors specified in a table, if any of the variables contain ordered (ordinal) categories, the software uses ordinal encoding for these variables.
For a variable having k ordered levels, the software creates k – 1 dummy variables. The jth dummy variable is -1 for levels up to j, and +1 for levels j + 1 through k.
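For example, for a variable with k = 3 ordered levels L1 < L2 < L3, the software creates two dummy variables: an observation at level L1 is encoded as [-1 -1], at level L2 as [+1 -1], and at level L3 as [+1 +1].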
The names of the dummy variables stored in the ExpandedPredictorNames property indicate the first level with the value +1. The software stores k – 1 additional predictor names for the dummy variables, including the names of levels 2, 3, ..., k.
All solvers implement L1 soft-margin minimization.
Let p be the proportion of outliers that you expect in the training data. If you set 'OutlierFraction',p, then the software implements robust learning. In other words, the software attempts to remove 100p% of the observations when the optimization algorithm converges. The removed observations correspond to gradients that are large in magnitude.
References
[1] Clark, D., Z. Schreter, and A. Adams. "A Quantitative Comparison of Dystal and Backpropagation." Submitted to the Australian Conference on Neural Networks, 1996.
[2] Fan, R.-E., P.-H. Chen, and C.-J. Lin. "Working Set Selection Using Second Order Information for Training Support Vector Machines." Journal of Machine Learning Research, Vol. 6, 2005, pp. 1889–1918.
[3] Kecman, V., T.-M. Huang, and M. Vogt. "Iterative Single Data Algorithm for Training Kernel Machines from Huge Data Sets: Theory and Performance." In Support Vector Machines: Theory and Applications. Edited by Lipo Wang, 255–274. Berlin: Springer-Verlag, 2005.
[4] Lichman, M. UCI Machine Learning Repository, [http://archive.ics.uci.edu/ml]. Irvine, CA: University of California, School of Information and Computer Science.
[5] Nash, W. J., T. L. Sellers, S. R. Talbot, A. J. Cawthorn, and W. B. Ford. "The Population Biology of Abalone (Haliotis species) in Tasmania. I. Blacklip Abalone (H. rubra) from the North Coast and Islands of Bass Strait." Sea Fisheries Division, Technical Report No. 48, 1994.
[6] Waugh, S. "Extending and Benchmarking Cascade-Correlation: Extensions to the Cascade-Correlation Architecture and Benchmarking of Feed-forward Supervised Artificial Neural Networks." University of Tasmania Department of Computer Science thesis, 1995.
Extended Capabilities
Automatic Parallel Support
Accelerate code by automatically running computation in parallel using Parallel Computing Toolbox™.
To perform parallel hyperparameter optimization, use the 'HyperparameterOptimizationOptions',struct('UseParallel',true) name-value argument in the call to the fitrsvm function.
For more information on parallel hyperparameter optimization, see Parallel Bayesian Optimization.
For general information about parallel computing, see Run MATLAB Functions with Automatic Parallel Support (Parallel Computing Toolbox).
GPU Arrays
Accelerate code by running on a graphics processing unit (GPU) using Parallel Computing Toolbox™.
Usage notes and limitations:
You cannot specify the KernelFunction name-value argument as a custom kernel function.
You can specify the Solver name-value argument as "SMO" only.
You cannot specify the OutlierFraction or ShrinkagePeriod name-value argument.
The predictor data cannot contain infinite values.
fitrsvm fits the model on a GPU if any of the following apply:
The input argument X is a gpuArray object.
The input argument Y is a gpuArray object.
The input argument Tbl contains gpuArray predictor or response variables.
Version History
Introduced in R2015b
R2023a: fitrsvm now accepts gpuArray inputs (requires Parallel Computing Toolbox)
Starting in R2023a, fitrsvm supports GPU arrays with some limitations.