Resample Filter of WEKA - How to interpret the result

I am currently struggling with a machine learning problem in which I have to deal with highly unbalanced data sets. There are six classes (1, 2, ..., 6). Unfortunately, there are, for example, 150 examples/instances for class 1, 90 instances for class 2, and only 20 for class 3. None of the other classes can be "trained", since there are no instances available for them.

So far, I have figured out that WEKA (the machine learning toolkit I am using) provides a supervised "Resample" filter. When I apply this filter with noReplacement=false and biasToUniformClass=1.0, the result is a data set in which the number of instances is nice and almost equal for classes 1 to 3 (the others stay empty).

My question is now: how do WEKA and this filter generate "new"/additional instances for the different classes?

Thank you very much in advance for any hints or suggestions.

Cheers Julian

Answers

It doesn't. It's resampling existing instances. If you have one class-2 instance and ask for a resampling with a bias of 1.0, you can expect N copies of that instance and N instances of each other class for which there is already data.

WEKA's supervised Resample filter adds instances to a class. This is done simply by adding instances of the class that has only a few instances to the result data set multiple times.

The resulting data set is therefore strongly biased toward the few samples available for such a class, since each of them appears many times.
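To make the duplication concrete, here is a minimal Python sketch of class-balanced resampling with replacement. It is a simplified model and not WEKA's actual code: the function name `resample_with_bias`, the per-class target formula, and the toy data are assumptions chosen to mirror the 150/90/20 split from the question.

```python
import random
from collections import Counter

def resample_with_bias(instances, bias=1.0, seed=1):
    """Draw a new data set with replacement from (features, label) pairs.

    bias=0.0 keeps the original class proportions, bias=1.0 aims for a
    uniform distribution over the classes that actually have instances.
    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    by_class = {}
    for inst in instances:
        by_class.setdefault(inst[1], []).append(inst)

    total = len(instances)
    n_classes = len(by_class)  # empty classes never show up here
    result = []
    for label, members in by_class.items():
        share = (1 - bias) * len(members) / total + bias / n_classes
        target = int(total * share)
        # Only existing instances are drawn; nothing new is generated.
        result.extend(rng.choice(members) for _ in range(target))
    return result

# Toy data mirroring the question: 150 / 90 / 20 instances for classes 1-3.
data = ([((i,), 1) for i in range(150)]
        + [((i,), 2) for i in range(90)]
        + [((i,), 3) for i in range(20)])

resampled = resample_with_bias(data)
counts = Counter(label for _, label in resampled)
distinct_class3 = {features for features, label in resampled if label == 3}
print(counts)
print(len(distinct_class3), "distinct class-3 instances in the result")
```

Every class ends up with the same number of rows, but class 3 still contains at most its original 20 distinct instances, each repeated several times.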

Try the SMOTE filter on the Preprocess tab.

It balances your data set by generating new, synthetic data for the minority class.
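For contrast with plain resampling, the core idea behind SMOTE can be sketched in a few lines of Python: pick a minority instance, pick one of its nearest minority neighbours, and create a point somewhere on the line between them. This is a rough illustration only, not WEKA's SMOTE filter; the function `smote_sketch`, its parameters, and the sample points are hypothetical, and it assumes purely numeric features and at least two minority instances.

```python
import random

def smote_sketch(minority, n_new, k=5, seed=1):
    """Create n_new synthetic points by interpolating between minority points.

    minority: list of numeric feature tuples, all from the same class.
    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of base (squared Euclidean distance).
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, other)))
    return synthetic

minority = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2), (1.2, 2.5)]
new_points = smote_sketch(minority, n_new=6, k=2)
for point in new_points:
    print(point)
```

Unlike the Resample filter, the generated points are generally not copies of existing instances, although they always lie between two existing minority instances.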




