How do I get all the keys that are stored in the Cassandra column family with pycassa?

Does anyone have experience working with pycassa? I have a question about it: how do I get all the keys that are stored in the database?

In this small snippet we have to supply the keys in order to get the associated columns (here the keys are 'foo' and 'bar'). That is fine, but my requirement is to get all the keys (only the keys) at once, as a Python list or a similar data structure.

cf.multiget(['foo', 'bar'])
{'foo': {'column1': 'val2'}, 'bar': {'column1': 'val3', 'column2': 'val4'}}

Thanks.

Best answer

Try:

    list(cf.get_range().get_keys())

More good stuff here: http://github.com/vomjom/pycassa

Answers

You can try: cf.get_range(column_count=0,filter_empty=False).

# get_range() returns a generator of (key, columns) pairs; print only the keys.
for key, _columns in cf.get_range(column_count=0, filter_empty=False):
    print(key)

get_range([start][, finish][, columns][, column_start][, column_finish][, column_reversed][, column_count][, row_count][, include_timestamp][, super_column][, read_consistency_level][, buffer_size])

Get an iterator over rows in a specified key range.

http://pycassa.github.com/pycassa/api/pycassa/columnfamily.html#pycassa.columnfamily.ColumnFamily.get_range

A minor improvement on Santhosh's solution:

dict(cf.get_range(column_count=0,filter_empty=False)).keys()

If you care about order:

from collections import OrderedDict
OrderedDict(cf.get_range(column_count=0, filter_empty=False)).keys()

get_range returns a generator. We can create a dict from the generator and get the keys from that.

column_count=0 limits each result to the row key. Because such results have no columns, we also need filter_empty=False; otherwise they are filtered out.

With filter_empty=False we do get the results, but empty rows and range ghosts may now be included.

If we don't mind more overhead, fetching just the first column of each row avoids the empty rows and range ghosts.

dict(cf.get_range(column_count=1)).keys()
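
The two variants above can be wrapped in one small helper. This is only a sketch: iter_keys and FakeColumnFamily are hypothetical names, not part of pycassa, and the fake class is a toy in-memory model of the column_count / filter_empty behaviour as described in this answer, so the example runs without a Cassandra cluster. With a real pycassa ColumnFamily you would pass that object to iter_keys instead.

```python
class FakeColumnFamily(object):
    """Hypothetical in-memory stand-in for a column family (NOT pycassa)."""

    def __init__(self, rows):
        self._rows = rows  # list of (key, columns_dict) pairs

    def get_range(self, column_count=100, filter_empty=True):
        # Toy model: truncate each row to column_count columns and,
        # if filter_empty is set, drop rows left with no columns.
        for key, columns in self._rows:
            kept = dict(list(columns.items())[:column_count])
            if filter_empty and not kept:
                continue
            yield key, kept


def iter_keys(cf, skip_empty=False):
    """Lazily yield row keys from cf.get_range()."""
    if skip_empty:
        # Fetch one column per row so that empty rows are filtered out.
        rows = cf.get_range(column_count=1)
    else:
        # Fetch no columns and keep every row, including empty ones.
        rows = cf.get_range(column_count=0, filter_empty=False)
    for key, _columns in rows:
        yield key


cf = FakeColumnFamily([
    ("foo", {"column1": "val2"}),
    ("ghost", {}),  # models a deleted row that still appears in range scans
    ("bar", {"column1": "val3", "column2": "val4"}),
])
all_keys = list(iter_keys(cf))                    # includes "ghost"
live_keys = list(iter_keys(cf, skip_empty=True))  # skips "ghost"
```

Because iter_keys is itself a generator, no intermediate dict is built; call list() on it only if you really need all keys in memory at once.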

There's a problem with Santhosh's and kzarns' answers: you're building a potentially huge dict in memory only to discard it immediately. A better approach is to use a list comprehension:

keys = [c[0] for c in cf.get_range(column_count=0, filter_empty=False)]

This iterates over the generator returned by get_range, keeps only the key from each row, and stores the keys in a list.

If the list of keys were also potentially too large to keep in memory all at once, and you only need to iterate over it once, use a generator expression instead of a list comprehension:

kgen = (c[0] for c in cf.get_range(column_count=0, filter_empty=False))
# You can iterate over kgen, but do not treat it as a list; it isn't one!
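
If the keys have to be processed in chunks (for example to pass them on to multiget a few at a time), a generator like this can be consumed in fixed-size batches without ever holding the full key list. The following is a minimal sketch using only the standard library; in_batches is a hypothetical helper name, and the small hard-coded generator merely stands in for the one built from cf.get_range().

```python
from itertools import islice


def in_batches(keys, size):
    """Yield lists of up to `size` keys from any iterable, one batch at a time."""
    iterator = iter(keys)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


# Stand-in for the key generator; real keys would come from cf.get_range().
sample_keys = (key for key in ["k1", "k2", "k3", "k4", "k5"])
batches = list(in_batches(sample_keys, 2))
```

Only one batch is held in memory at a time while iterating; list() is used here just to make the result easy to inspect.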



