Optimizing SMO with RBFKernel (C and gamma)

There are two parameters to tune when using an RBF kernel with Support Vector Machines: C and γ. It is not known beforehand which C and γ are best for a given problem; consequently, some kind of model selection (parameter search) must be done. The goal is to identify a good (C, γ) pair so that the classifier can accurately predict unknown data (i.e., testing data).
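To make the size of such a parameter search concrete, here is a minimal sketch of a candidate grid. The exponent ranges (C from 2^-5 to 2^15, γ from 2^-15 to 2^3, in steps of 2) are an assumption borrowed from the commonly cited LIBSVM practical guide, not something prescribed by Weka, and `log2_grid` is a hypothetical helper name.

```python
from itertools import product


def log2_grid(start_exp, stop_exp, step=2):
    """Return [2**start_exp, 2**(start_exp + step), ..., 2**stop_exp]."""
    return [2.0 ** e for e in range(start_exp, stop_exp + 1, step)]


# Assumed ranges (LIBSVM guide convention); adjust for your own data.
c_values = log2_grid(-5, 15)
gamma_values = log2_grid(-15, 3)

# Every (C, gamma) pair is one model to train and evaluate.
candidate_pairs = list(product(c_values, gamma_values))
print(len(c_values), len(gamma_values), len(candidate_pairs))
```

Each pair in `candidate_pairs` costs one full training run (times the number of cross-validation folds), which is why an exhaustive search gets slow on large datasets.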

weka.classifiers.meta.GridSearch is a meta-classifier for tuning a pair of parameters. However, it seems to take ages to finish when the dataset is rather large. What would you suggest to bring down the time required for this task?

According to A User's Guide to Support Vector Machines:

C: the soft-margin constant. A smaller value of C allows the classifier to ignore points close to the boundary, which increases the margin.

γ > 0 is a parameter that controls the width of the Gaussian.
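As an illustration of what "width" means here, the sketch below assumes the usual RBF form K(x, y) = exp(-γ·‖x − y‖²); the function name `rbf_kernel` and the sample points are hypothetical. A larger γ makes the kernel value drop off faster with distance, i.e. a narrower Gaussian.

```python
import math


def rbf_kernel(x, y, gamma):
    """Gaussian (RBF) kernel: exp(-gamma * squared Euclidean distance)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


x, y = (0.0, 0.0), (1.0, 1.0)  # squared distance between x and y is 2

# Similarity of the same two points under increasingly narrow kernels.
for gamma in (0.01, 1.0, 100.0):
    print(gamma, rbf_kernel(x, y, gamma))
```

With a small γ the two points still look similar to the classifier; with a large γ they look essentially unrelated, which tends to produce more complex decision boundaries.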

Answer

Hastie et al.'s SVMPath explores the entire regularization path for C and requires only about the same computational cost as training a single SVM model. From their paper:

Our R function SvmPath computes all 632 steps in the mixture example (n+ = n− = 100, radial kernel, γ = 1) in 1.44(0.02) secs on a pentium 4, 2Ghz linux machine; the svm function (using the optimized code libsvm, from the R library e1071) takes 9.28(0.06) seconds to compute the solution at 10 points along the path. Hence it takes our procedure about 50% more time to compute the entire path, than it costs libsvm to compute a typical single solution.
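The "about 50% more time" figure can be checked from the numbers in the quote. This is only a back-of-the-envelope sketch using the mean timings quoted above (the variable names are mine):

```python
# Timings quoted from the paper (mean seconds).
path_secs = 1.44          # SvmPath: entire path, all 632 steps
libsvm_secs_10 = 9.28     # libsvm: 10 solutions along the path

# Average cost of one libsvm solution.
per_solution_secs = libsvm_secs_10 / 10

# How much more expensive the full path is than one libsvm fit.
ratio = path_secs / per_solution_secs
print(per_solution_secs, ratio)
```

The ratio comes out at roughly 1.55, which is consistent with the authors' "about 50% more" for the whole path versus a single solution.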

They released a GPL-licensed implementation of the algorithm in R, which you can download from CRAN (the svmpath package).

Using SVMPath should allow you to quickly find a good C value for any given γ. You would still need separate training runs for different γ values, but this should be much faster than doing a separate run for each (C, γ) pair.
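A rough sketch of the saving, under stated assumptions: the grid sizes (11 values of C, 10 of γ) are hypothetical, and the cost of one path computation is taken as about 1.55 single fits, extrapolated from the benchmark quoted above. Real timings will depend on the dataset and implementation.

```python
def search_cost(n_c, n_gamma, path_cost_factor=1.55):
    """Estimate search cost in units of 'one ordinary SVM fit'.

    path_cost_factor is an assumption: the cost of one full
    regularization path relative to a single fit.
    """
    grid_cost = n_c * n_gamma               # one fit per (C, gamma) pair
    path_cost = n_gamma * path_cost_factor  # one path per gamma value
    return grid_cost, path_cost


# Hypothetical grid sizes, for illustration only.
grid_cost, path_cost = search_cost(n_c=11, n_gamma=10)
print(grid_cost, path_cost, grid_cost / path_cost)
```

Under these assumptions the per-γ path approach needs about 15.5 fit-equivalents instead of 110, and the gap widens as more C values are considered.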




