Time UUID type in pycassa

I'm having problems using the time_uuid type as a key in my column family. I want to store my records and have them ordered by when they were inserted, so I figured that time_uuid would be a good way to go. This is how I've set up my column family:

sys.create_column_family("keyspace", "records", comparator_type=TIME_UUID_TYPE)

When I try to insert, I do this:

q=pycassa.ColumnFamily(pycassa.connect("keyspace"), "records")
myKey=pycassa.util.convert_time_to_uuid(datetime.datetime.utcnow())
q.insert(myKey, {'somedata': 'somevalue'})

However, when I insert data, I always get an error:

Argument for a v1 UUID column name or value was neither a UUID, a datetime, or a number.

If I change the comparator_type to UTF8_TYPE, it works, but the items are not returned in the order I expect. What am I doing wrong?

Best answer

The comparator for a column family is used for ordering the columns within each row. You are seeing that error because the column name 'somedata' is valid UTF-8 but not a valid UUID.
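To make the failure concrete, here is a minimal sketch of the kind of check the error message describes. The function `acceptable_time_uuid_name` is hypothetical and written only for illustration; it is not pycassa's actual validation code.

```python
import datetime
import uuid


def acceptable_time_uuid_name(name):
    """Hypothetical check mirroring the error message: a TimeUUID column
    name must be a UUID, a datetime, or a number."""
    return isinstance(name, (uuid.UUID, datetime.datetime, int, float))


# The column name from the question is a plain string, so it is rejected.
string_ok = acceptable_time_uuid_name("somedata")

# A datetime or a version 1 UUID would be accepted as a column name.
datetime_ok = acceptable_time_uuid_name(datetime.datetime(2011, 5, 1, 12, 0, 0))
uuid_ok = acceptable_time_uuid_name(uuid.uuid1())

print(string_ok, datetime_ok, uuid_ok)  # False True True
```

Note that the row key (myKey in the question) is not what the comparator checks; the comparator only applies to the column names inside the dictionary.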

The ordering of the rows stored in Cassandra is determined by the partitioner. Most likely you are using RandomPartitioner, which distributes load evenly across your cluster but does not allow for meaningful range queries (the rows will be returned in a random order).

http://wiki.apache.org/cassandra/FAQ#range_rp
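The effect of the partitioner can be sketched without a cluster. RandomPartitioner places rows by a token derived from the MD5 hash of the row key. The snippet below is a simplification (it uses the raw 128-bit digest as the token, and the sample keys are made up), but it shows why hash order has no relation to key order.

```python
import hashlib


def random_partitioner_token(key):
    """Simplified token: the MD5 digest of the row key read as an integer."""
    return int.from_bytes(hashlib.md5(key).digest(), "big")


# Made-up row keys that are already in byte (and chronological) order.
keys = [b"2011-01-01", b"2011-01-02", b"2011-01-03", b"2011-01-04"]

# Order an order-preserving partitioner would return.
byte_order = sorted(keys)

# Order implied by hashing the keys; unrelated to the key values.
token_order = sorted(keys, key=random_partitioner_token)

print(byte_order)
print(token_order)
```

With hashed tokens, neighbouring keys land at unrelated positions, which is why a range scan over row keys returns them in what looks like random order.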

Answers

The problem is that in your data model, you are using the time as a row key. Although this is possible, you won't get a meaningful ordering unless you also use the ByteOrderedPartitioner.

For this reason, most people insert time-ordered data using the time as a column name, not a row key. In this model, your insert statement would look like:

q.insert(someKey, {datetime.datetime.utcnow(): 'somevalue'})

where someKey is a key that relates to the entire time series that you're inserting (for example, a username). (Note that you don't have to convert the time to a UUID; pycassa does it for you.) To store more than a single value per timestamp, use a supercolumn or a composite key.
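As a rough illustration of this model, the sketch below simulates a single row whose column names are version 1 UUIDs and reads the columns back in timestamp order, which is what a TimeUUID comparator sorts on. It uses only the standard library; the helper `time_uuid_from_datetime` is hypothetical and stands in for the conversion pycassa performs, and the plain dict is only a stand-in for a Cassandra row.

```python
import datetime
import uuid

# 100 ns intervals between the UUID epoch (1582-10-15) and the Unix epoch.
UUID_EPOCH_OFFSET = 0x01B21DD213814000


def time_uuid_from_datetime(dt, clock_seq=0, node=0):
    """Hypothetical helper: build a version 1 UUID from a naive UTC datetime."""
    delta = dt - datetime.datetime(1970, 1, 1)
    # Integer arithmetic avoids float rounding of the timestamp.
    micros = (delta.days * 86400 + delta.seconds) * 1000000 + delta.microseconds
    ticks = micros * 10 + UUID_EPOCH_OFFSET
    fields = (
        ticks & 0xFFFFFFFF,        # time_low
        (ticks >> 32) & 0xFFFF,    # time_mid
        (ticks >> 48) & 0x0FFF,    # time_hi (version bits are set below)
        (clock_seq >> 8) & 0x3F,   # clock_seq_hi (variant bits are set below)
        clock_seq & 0xFF,          # clock_seq_low
        node,
    )
    return uuid.UUID(fields=fields, version=1)


# Simulate one row: column names are time UUIDs, inserted out of order.
base = datetime.datetime(2011, 5, 1, 12, 0, 0)
row = {}
for minutes, value in [(2, "third"), (0, "first"), (1, "second")]:
    name = time_uuid_from_datetime(base + datetime.timedelta(minutes=minutes))
    row[name] = value

# Read the columns back ordered by the timestamp embedded in each UUID.
ordered_values = [row[name] for name in sorted(row, key=lambda u: u.time)]
print(ordered_values)  # ['first', 'second', 'third']
```

Because the ordering lives inside the row, it does not depend on which partitioner the cluster uses.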

If you really want to store the time in your row keys, then you need to specify key_validation_class, not comparator_type. comparator_type sets the type of the column names, while key_validation_class sets the type of the row keys.

sys.create_column_family("keyspace", "records", key_validation_class=TIME_UUID_TYPE)

Remember the rows will not be sorted unless you also use the ByteOrderedPartitioner.




