I am currently working on a recommender application, using Cassandra with Hadoop and Pig for map/reduce jobs. To take advantage of column-name properties, our team has decided to store the data as valueless columns with aggregate column names. For example, all hits for a given piece of content are stored in a column family with a single row, and each column represents one hit, using the following structure:
rowkey = single_row {
    id_content:hit_date,    (valueless column, one per hit)
    ...
}
Using this approach we get wide rows instead of skinny ones. The question is: how do I need to manipulate the data in Pig so that it gets stored in Cassandra following this scheme?
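To make the question concrete, this is roughly what I imagine the Pig script would look like (keyspace, column family name, and input path are made up, and I am not sure this is the right way to build the bag of columns that CassandraStorage expects):

    -- hypothetical input: one (id_content, hit_date) pair per hit
    hits = LOAD 'hits_input' USING PigStorage(',')
           AS (id_content:chararray, hit_date:chararray);

    -- build the aggregate column name id_content:hit_date with an empty value
    cols = FOREACH hits GENERATE
               'single_row' AS rowkey,
               CONCAT(CONCAT(id_content, ':'), hit_date) AS colname,
               '' AS colvalue;

    -- put every column under the single row key as a bag of (name, value) tuples
    grouped  = GROUP cols BY rowkey;
    to_store = FOREACH grouped GENERATE
                   group AS rowkey,
                   cols.(colname, colvalue) AS columns;

    -- MyKeyspace/ContentHits are placeholder names
    STORE to_store INTO 'cassandra://MyKeyspace/ContentHits'
        USING org.apache.cassandra.hadoop.pig.CassandraStorage();

Is grouping the data into a bag of (name, value) tuples like this the intended way to write wide rows from Pig, or is there a better pattern for this kind of schema?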