Cassandra + Pig with wide columns

I am currently working on a recommender application, using Cassandra with Hadoop and Pig for map/reduce jobs. To take advantage of the properties of column names, our team has decided to store data using valueless columns and aggregated column names. For example, all hits for a specific piece of content are stored in a column family with a single row, and each column is a hit for that content, using the following structure:

rowkey =  single_row  {
    id_content:hit_date, -
    .
    .
    .
}

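With this approach we get wide rows rather than skinny rows. The question is: how do I need to manipulate the data in Pig so that it can be stored in Cassandra this way?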

Answer

I am not sure from your description whether you are using composite columns, or whether you are simply concatenating id_content and hit_date together.

For normal (i.e. non-composite) columns, the schema is:

(key, {(col_name, col_value), ...})
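
As a rough illustration of that shape (modelled in plain Python rather than Pig Latin), the sketch below turns a list of hits into one `(key, bag)` pair in which every column name is `id_content:hit_date` and the value is empty. The function name `to_wide_row`, the sample hits, and the use of an empty string for the valueless column are assumptions made for illustration; they are not taken from the question or from Cassandra's Pig support.

```python
def to_wide_row(row_key, hits):
    """Model the (key, {(col_name, col_value), ...}) shape for one wide row.

    hits is an iterable of (id_content, hit_date) pairs. Each hit becomes a
    valueless column whose name is the string "id_content:hit_date".
    """
    # A Pig bag is an unordered collection of tuples; a sorted list stands in
    # for it here so the result is easy to read and compare.
    bag = sorted((f"{id_content}:{hit_date}", "") for id_content, hit_date in hits)
    return (row_key, bag)


# Hypothetical sample data: two hits, each an (id_content, hit_date) pair.
sample_hits = [("42", "2012-01-05"), ("7", "2012-01-03")]
row = to_wide_row("single_row", sample_hits)
print(row)
```

In Pig terms, the first element corresponds to the row key and the second to the bag of `(col_name, col_value)` tuples that the relation would need to contain before being stored.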

For composite columns, I believe the schema is:

(key, {((col_name_part_1, col_name_part_2), col_value), ...})
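
To make the difference from the non-composite case concrete, here is a small Python model (again, not Pig Latin) of the composite shape: the column name is itself a tuple of its parts rather than one concatenated string. The function name `to_composite_row`, the sample hits, and the empty-string value are hypothetical and chosen only for illustration.

```python
def to_composite_row(row_key, hits):
    """Model the (key, {((part_1, part_2), col_value), ...}) shape.

    hits is an iterable of (id_content, hit_date) pairs. Each hit becomes a
    valueless column whose name is the tuple (id_content, hit_date).
    """
    # The column name stays a nested tuple instead of a joined string.
    bag = sorted(((id_content, hit_date), "") for id_content, hit_date in hits)
    return (row_key, bag)


# Hypothetical sample data: two hits, each an (id_content, hit_date) pair.
sample_hits = [("42", "2012-01-05"), ("7", "2012-01-03")]
composite_row = to_composite_row("single_row", sample_hits)
print(composite_row)
```

The only structural change from the non-composite case is that the first element of each inner tuple is a nested tuple holding the name parts.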

This assessment (for composite columns) is based on reading the patch submitted at https://issues.apache.org/jira/browse/CASSANDRA-3684.




