You could use a HashMultiset from google-collections (now part of Guava):
import com.google.common.collect.*;
import com.google.common.collect.Multiset.Entry;
...
final Multiset<String> words = HashMultiset.create();
words.addAll(...);
Ordering<Entry<String>> byIncreasingCount = new Ordering<Entry<String>>() {
  @Override public int compare(Entry<String> left, Entry<String> right) {
    // subtraction cannot overflow because counts are never negative
    return left.getCount() - right.getCount();
  }
};
Entry<String> maxEntry = byIncreasingCount.max(words.entrySet());
return maxEntry.getElement();
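EDIT: oops, I thought you wanted only the single most common word. But it sounds like you want the several most common, so you could replace max with sortedCopy, and now you have a list of all the entries in increasing order of count. The most common words come last; calling byIncreasingCount.reverse() before sortedCopy puts them first instead.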
To find the number of distinct words: words.elementSet().size()
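If pulling in the library is not an option, the same idea can be sketched with only the JDK. This is a stand-in for illustration, not how HashMultiset is implemented: the class name WordCounts and its methods are hypothetical, a plain HashMap plays the role of the multiset, and ties are broken alphabetically just to make the order deterministic.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of google-collections.
class WordCounts {
    // Counts how often each word occurs, like adding every word to a multiset.
    public static Map<String, Integer> count(Iterable<String> words) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : words) {
            counts.merge(word, 1, Integer::sum);
        }
        return counts;
    }

    // Returns the entries ordered from most to least frequent.
    public static List<Map.Entry<String, Integer>> byDecreasingCount(Map<String, Integer> counts) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            // Ties are broken alphabetically (an assumption made for this sketch).
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        return entries;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = count(Arrays.asList("a", "b", "a", "c", "b", "a"));
        for (Map.Entry<String, Integer> entry : byDecreasingCount(counts)) {
            System.out.println(entry.getKey() + " x " + entry.getValue());
        }
        // The number of distinct words is the number of keys.
        System.out.println("distinct: " + counts.size());
    }
}
```

Taking the first few elements of the sorted list gives the several most common words, and counts.size() corresponds to words.elementSet().size() above.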