I am currently struggling with a machine learning problem in which I have to deal with highly unbalanced data sets. There are six classes (1, 2, ..., 6). Unfortunately, class 1 has 150 examples/instances, class 2 has 90, and class 3 has only 20. The remaining classes can't be "trained" at all, since there are no instances available for them.
So far, I have figured out that WEKA (the machine learning toolkit I am using) provides a supervised "Resample" filter. When I apply this filter with noReplacement=false and biasToUniformClass=1.0, the result is a data set in which the number of instances is almost equal for classes 1 to 3 (the other classes stay empty).
My question is now: how do WEKA and this filter generate the "new"/additional instances for the different classes?
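
To make the question concrete, here is a minimal Python sketch of what I assume might be happening: existing instances are simply drawn again with replacement until every non-empty class has roughly the same size. This is only my guess and not WEKA's actual code; the function name resample_with_replacement and its parameters are hypothetical and merely mirror the filter options mentioned above.

```python
import random
from collections import Counter


def resample_with_replacement(labels, bias_to_uniform=1.0, seed=1):
    """Return indices into `labels`, drawn with replacement per class.

    ASSUMPTION: this is a guess at the behaviour, not WEKA's implementation.
    The target size of each class blends its original count with a uniform
    share over the non-empty classes, weighted by `bias_to_uniform`.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    classes = sorted(counts)          # only classes that actually have instances
    uniform_share = len(labels) / len(classes)

    picked = []
    for c in classes:
        target = (1.0 - bias_to_uniform) * counts[c] + bias_to_uniform * uniform_share
        pool = [i for i, y in enumerate(labels) if y == c]
        # Drawing with replacement: the same instance may be chosen many times.
        picked.extend(rng.choice(pool) for _ in range(round(target)))
    return picked


# Class sizes from my data set: 150, 90 and 20 instances for classes 1 to 3.
labels = [1] * 150 + [2] * 90 + [3] * 20
picked = resample_with_replacement(labels)
new_counts = Counter(labels[i] for i in picked)
print(new_counts)
```

If this guess is right, class 3 would end up with about as many rows as the others, but they would all be copies of the same 20 original instances rather than genuinely new ones.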
Thank you very much in advance for any hints or suggestions.
Cheers Julian