libsvm model file format

According to this FAQ, the model format in libsvm should be straightforward, and in fact it is when I call svm-train on its own. As an example, the first SV for the a1a dataset is:

 1 3:1 11:1 14:1 19:1 39:1 42:1 55:1 64:1 67:1 73:1 75:1 76:1 80:1 83:1

On the other hand, if I use the easy.py script, my first SV ends up being:

 512 1:-1 2:-1 3:1 4:-1 5:-1 6:-1 7:-1 8:-1 9:-1 10:-1 11:1 13:-1 14:1 15:-1 16:-1 17:-1 18:-1 19:1 20:-1 21:-1 22:-1 23:-1 24:-1 25:-1 26:-1 27:-1 28:-1 29:-1 30:-1 31:-1 32:-1 33:-1 34:-1 35:-1 36:-1 37:-1 38:-1 39:1 40:-1 41:-1 42:1 43:-1 44:-1 45:-1 46:-1 47:-1 48:-1 49:-1 50:-1 51:-1 52:-1 53:-1 54:-1 55:1 56:-1 57:-1 58:-1 59:-1 61:-1 62:-1 63:-1 64:1 65:-1 66:-1 67:1 68:-1 69:-1 70:-1 71:-1 72:-1 73:1 74:-1 75:1 76:1 77:-1 78:-1 79:-1 80:1 81:-1 82:-1 83:1 84:-1 85:-1 86:-1 87:-1 88:-1 90:-1 91:-1 92:-1 93:-1 94:-1 95:-1 97:-1 98:-1 99:-1 100:-1 101:-1 102:-1 103:-1 104:-1 105:-1 106:-1 107:-1 108:-1 109:-1 110:-1 112:-1 113:-1 114:-1 115:-1 117:-1 118:-1 119:-1 

which is an instance that doesn't exist at all in my training set! In fact, if I do:

 $ grep "119:" a1a
 -1 1:1 6:1 18:1 22:1 36:1 42:1 49:1 66:1 67:1 73:1 74:1 76:1 80:1 119:1 
 -1 1:1 6:1 18:1 26:1 35:1 43:1 53:1 65:1 67:1 73:1 74:1 76:1 80:1 119:1 
 -1 2:1 6:1 15:1 19:1 39:1 42:1 55:1 62:1 67:1 72:1 74:1 76:1 78:1 119:1 
 -1 4:1 6:1 16:1 21:1 35:1 44:1 49:1 64:1 67:1 72:1 74:1 76:1 78:1 119:1 
 -1 2:1 6:1 14:1 30:1 35:1 42:1 49:1 65:1 67:1 72:1 74:1 76:1 78:1 119:1 
 -1 2:1 6:1 17:1 20:1 37:1 40:1 57:1 63:1 67:1 73:1 74:1 76:1 80:1 119:1 
 -1 5:1 6:1 18:1 22:1 36:1 40:1 54:1 61:1 67:1 72:1 75:1 76:1 80:1 119:1 
 -1 5:1 6:1 17:1 26:1 35:1 42:1 53:1 62:1 67:1 73:1 74:1 76:1 80:1 119:1 

There isn't any instance with 119:-1. Even if it were just a matter of swapping +1 with -1, there isn't any instance with both 119:1 and 118:1 either (missing attributes are zeros).

If I apply this source code modification, I can clearly see that in the former case (only svm-train involved) the first SV is also the first instance. But in the latter case (i.e. with the easy.py script), the output that should tell me which instance is the SV is swallowed by grid.py.

What's going on here?

Best answer

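I think the culprit here is probably the call easy.py makes to svm-scale, which scales each attribute to lie within [-1,1]. The training examples passed to svm-train are therefore not the same ones that are in your training file. For a binary attribute that takes both values in the training set, a 0 (including a missing attribute, which is implicitly 0) becomes -1 and a 1 stays 1, which is why the SV in the model looks dense.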
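
To make the effect concrete, here is a minimal Python sketch of that kind of linear scaling. It is not svm-scale's actual code: the helper names `parse_line` and `scale_features` are hypothetical, and the per-attribute ranges are assumed to be (0, 1) for a toy set of five attributes. Skipping constant attributes and dropping values that scale to exactly 0 is how I understand svm-scale to behave, so treat those two details as assumptions as well.

```python
def parse_line(line):
    """Parse one sparse libsvm line into (label, {index: value})."""
    label, *pairs = line.split()
    feats = {}
    for pair in pairs:
        idx, val = pair.split(":")
        feats[int(idx)] = float(val)
    return label, feats


def scale_features(feats, ranges, lower=-1.0, upper=1.0):
    """Linearly map each attribute from its (min, max) range to [lower, upper].

    Attributes missing from `feats` are treated as 0 before scaling.
    """
    out = {}
    for idx in sorted(ranges):
        lo, hi = ranges[idx]
        if lo == hi:
            continue  # assumption: constant attributes are left out
        value = feats.get(idx, 0.0)
        scaled = lower + (upper - lower) * (value - lo) / (hi - lo)
        if scaled != 0:  # assumption: keep the output sparse
            out[idx] = scaled
    return out


# Toy example: the first few attributes of the first a1a instance,
# with every attribute assumed to range over 0..1 in the training set.
label, feats = parse_line("1 3:1 11:1 14:1")
ranges = {i: (0.0, 1.0) for i in range(1, 6)}
scaled = scale_features(feats, ranges)
print(" ".join(f"{i}:{v:g}" for i, v in scaled.items()))
# prints: 1:-1 2:-1 3:1 4:-1 5:-1
```

The printed line matches the start of the SV that easy.py produced, which is consistent with the scaled file, not the original a1a file, being what svm-train saw.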
