I have 1M words in my dictionary. Whenever a user issue a query on my website, I will see if the query contains the words in my dictionary and increment the counter corresponding to them individually. Here is the example, say if a user type in "Obama is a president" and "Obama" and "president" are in my dictionary, then I should increment the counter by 1 for "Obama" and "president".
And from time to time, I want to see the top 100 words (most queried words). If I use Hbase to store the counter, what schema should I use? -- I have not come up an efficient one yet.
If I use word in my dictionary as row key, and "counter" as column key, then updating counter(increment) is very efficient. But it s very hard to sort and return the top 100.
Anyone can give a good advice? Thanks.