How do I output to multiple Cassandra column families in a single Hadoop job?

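A Cassandra data model often needs to update more than one column family to support a single logical "write" (for example, updating both sides of a bidirectional index). How can I do this from Hadoop when the job configuration only allows one output column family to be specified?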

Best answer

This is possible on Cassandra 1.1 by applying the patch from the following issue:

https://issues.apache.org/jira/browse/CASSANDRA-4208

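Once you have that, instead of calling ConfigHelper.setOutputColumnFamily(), you call ConfigHelper.setKeyspace(). You can then use the MultipleOutputs API to declare your output column families (CFs) in the job configuration, like this: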

MultipleOutputs.addNamedOutput(job, "ColumnFamily1", ColumnFamilyOutputFormat.class, ByteBuffer.class, List.class);
MultipleOutputs.addNamedOutput(job, "ColumnFamily2", ColumnFamilyOutputFormat.class, ByteBuffer.class, List.class);

When you are ready to write output, just pass the name of the CF as the output name:

output.write("ColumnFamily1", key, Collections.singletonList(mutation));

Here, output is a reference to the MultipleOutputs instance in your reducer.
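
The routing idea can be tried without a cluster. What follows is a minimal, self-contained sketch in plain Java, not the Hadoop or Cassandra API: NamedOutputs is a hypothetical in-memory stand-in for MultipleOutputs, and the writeBothSides helper and the user/group data are made up for illustration. It only shows how one logical write fans out to two named outputs, one per side of a bidirectional index.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class NamedOutputSketch {

    /** Hypothetical in-memory stand-in for MultipleOutputs: one row list per named output. */
    public static final class NamedOutputs {
        private final Map<String, List<String[]>> rowsByOutput = new LinkedHashMap<>();

        /** Declares an output name up front, mirroring the job-setup step. */
        public void addNamedOutput(String name) {
            rowsByOutput.put(name, new ArrayList<>());
        }

        /** Routes one (key, column) pair to the named output; undeclared names are rejected. */
        public void write(String name, String key, String column) {
            List<String[]> rows = rowsByOutput.get(name);
            if (rows == null) {
                throw new IllegalArgumentException("Undeclared named output: " + name);
            }
            rows.add(new String[] {key, column});
        }

        /** Returns the rows written to the named output, or null if it was never declared. */
        public List<String[]> rows(String name) {
            return rowsByOutput.get(name);
        }
    }

    /** One logical write: record the relationship in both directions. */
    public static void writeBothSides(NamedOutputs output, String user, String group) {
        output.write("ColumnFamily1", user, group); // user -> group side of the index
        output.write("ColumnFamily2", group, user); // group -> user side of the index
    }

    public static void main(String[] args) {
        NamedOutputs output = new NamedOutputs();
        output.addNamedOutput("ColumnFamily1");
        output.addNamedOutput("ColumnFamily2");

        writeBothSides(output, "alice", "admins");

        for (String name : new String[] {"ColumnFamily1", "ColumnFamily2"}) {
            for (String[] row : output.rows(name)) {
                System.out.println(name + ": " + row[0] + " -> " + row[1]);
            }
        }
    }
}
```

In a real job the two write calls would go through the MultipleOutputs instance held by the reducer, with a ByteBuffer key and a list of mutations as the value, as in the snippets above.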
