Is the following architecture possible in Hadoop MapReduce?
A distributed key-value store (HBase) is used, so each value has an associated timestamp. Map and Reduce tasks are executed iteratively. In each iteration, Map should take as input the values that were added to the store in the previous iteration (perhaps the ones with the latest timestamp?). Reduce should take Map's output as well as the pairs from the store whose keys match the keys that Reduce has to process in the current iteration. The output of Reduce goes back to the store.
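To make the intended data flow concrete, here is a minimal plain-Java sketch of the loop I have in mind. It does not use Hadoop or HBase at all: `Store`, `Cell`, `writtenAt`, `allValues` and `runIteration` are hypothetical names made up for illustration, the store is an in-memory map, the iteration number doubles as the timestamp, and the map (identity) and reduce (sum) functions are just placeholders.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class IterativeStoreSketch {

    /** One timestamped version of a value, loosely analogous to a versioned cell. */
    static final class Cell {
        final long timestamp;
        final int value;

        Cell(long timestamp, int value) {
            this.timestamp = timestamp;
            this.value = value;
        }
    }

    /** In-memory stand-in for the key-value store (an assumption, not HBase). */
    static final class Store {
        private final Map<String, List<Cell>> rows = new TreeMap<>();

        void put(String key, int value, long timestamp) {
            rows.computeIfAbsent(key, k -> new ArrayList<>()).add(new Cell(timestamp, value));
        }

        /** Values written with exactly this timestamp: the Map input of the next iteration. */
        Map<String, Integer> writtenAt(long timestamp) {
            Map<String, Integer> out = new TreeMap<>();
            for (Map.Entry<String, List<Cell>> row : rows.entrySet()) {
                for (Cell c : row.getValue()) {
                    if (c.timestamp == timestamp) {
                        out.put(row.getKey(), c.value);
                    }
                }
            }
            return out;
        }

        /** All stored values for a key, regardless of timestamp: the extra Reduce input. */
        List<Integer> allValues(String key) {
            List<Integer> out = new ArrayList<>();
            for (Cell c : rows.getOrDefault(key, new ArrayList<>())) {
                out.add(c.value);
            }
            return out;
        }
    }

    /** Runs one iteration; results are written with timestamp == iteration. */
    static void runIteration(Store store, long iteration) {
        // Map phase: read only what the previous iteration wrote.
        Map<String, List<Integer>> shuffled = new TreeMap<>();
        for (Map.Entry<String, Integer> e : store.writtenAt(iteration - 1).entrySet()) {
            // Placeholder map function: emit the value unchanged under the same key.
            shuffled.computeIfAbsent(e.getKey(), k -> new ArrayList<>()).add(e.getValue());
        }

        // Reduce phase: combine Map output with the stored values for the same key.
        Map<String, Integer> results = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : shuffled.entrySet()) {
            int sum = 0;
            for (int v : e.getValue()) {
                sum += v;
            }
            for (int v : store.allValues(e.getKey())) {
                sum += v;
            }
            results.put(e.getKey(), sum);
        }

        // Write the Reduce output back to the store, stamped with this iteration.
        for (Map.Entry<String, Integer> e : results.entrySet()) {
            store.put(e.getKey(), e.getValue(), iteration);
        }
    }

    public static void main(String[] args) {
        Store store = new Store();
        store.put("a", 1, 0);
        store.put("b", 2, 0);
        runIteration(store, 1);
        runIteration(store, 2);
        System.out.println(store.writtenAt(2)); // prints {a=5, b=10}
    }
}
```

In a real job, I assume `writtenAt` would correspond to however Map's input is restricted to the previous iteration's writes, and `allValues` to a per-key lookup made from inside Reduce; how to wire those two reads into Hadoop is what I am asking about.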
If this is possible, which classes or methods (e.g., InputFormat, run() of the Reducer) should be extended so that the above operation takes place instead of the regular one? If this is not possible, are there any alternatives that achieve the same result?