How to get two random records with Django

How do I get two distinct random records using Django? I've seen questions about how to get one, but I need to get two random records, and they must differ.

Best answer

If you specify the random operator in the ORM, I'm pretty sure it will give you two distinct random results, won't it?

MyModel.objects.order_by('?')[:2]  # 2 random results.

Other answers

The order_by('?')[:2] solution suggested by other answers is actually an extraordinarily bad thing to do for tables that have large numbers of rows. It results in an ORDER BY RAND() SQL query. As an example, here's how MySQL handles that (the situation is not much different for other databases). Imagine your table has one billion rows:

  1. To accomplish ORDER BY RAND(), it needs a RAND() column to sort on.
  2. To do that, it needs a new table (the existing table has no such column).
  3. To do that, MySQL creates a new, temporary table with the new column and copies the existing ONE BILLION ROWS OF DATA into it.
  4. As it does so, it does as you asked and runs RAND() for every row to fill in that value. Yes, you've instructed MySQL to GENERATE ONE BILLION RANDOM NUMBERS. That takes a while. :)
  5. A few hours/days later, when it's done, it now has to sort it. Yes, you've instructed MySQL to SORT THIS ONE BILLION ROW, WORST-CASE-ORDERED TABLE (worst-case because the sort key is random).
  6. A few days/weeks later, when that's done, it faithfully grabs the two measly rows you actually needed and returns them to you. Nice job. ;)

Note: just for a little extra gravy, be aware that MySQL will initially try to create that temp table in RAM. When that's exhausted, it puts everything on hold to copy the whole thing to disk, so you get that extra knife-twist of an I/O bottleneck for nearly the entire process.

Doubters should look at the generated query to confirm that it's ORDER BY RAND(), then Google for "order by rand()" (with the quotes).
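
One way to look at such a query without a Django project is Python's built-in sqlite3 module. This is only a sketch: the items table and its columns are made up for illustration, and SQLite spells the function RANDOM() rather than RAND(), so it shows the same idea on a different database than the MySQL case described above. The plan usually lists a scan of the whole table plus a temporary structure for the sort, though the exact wording varies by SQLite version.

```python
import sqlite3

# Hypothetical in-memory table standing in for a real model's table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO items (name) VALUES (?)",
    [(f"item{i}",) for i in range(1000)],
)

sql = "SELECT id, name FROM items ORDER BY RANDOM() LIMIT 2"

# Ask SQLite how it intends to run the query; the last column of each
# plan row is the human-readable description of that step.
plan = [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]
for step in plan:
    print(step)

# The query itself still works; the cost is what grows with table size.
rows = conn.execute(sql).fetchall()
```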

A much better solution is to trade that one really expensive query for three cheap ones (limit/offset instead of ORDER BY RAND()):

import random

# Assumes the table holds at least two rows.
last = MyModel.objects.count() - 1

index1 = random.randint(0, last)
# Here's one simple way to keep an even distribution for
# index2 while still guaranteeing it won't match index1.
index2 = random.randint(0, last - 1)
if index2 == index1:
    index2 = last

# This syntax generates "LIMIT 1 OFFSET indexN" queries,
# so each returns a single record with no extraneous data.
MyObj1 = MyModel.objects.all()[index1]
MyObj2 = MyModel.objects.all()[index2]
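
The same count-then-offset idea can be tried outside Django with the standard sqlite3 module. The sketch below is an illustration under assumptions, not the answer's own code: the items table, the helper names two_distinct_indices and two_random_rows, and the explicit ORDER BY id (added so both offsets refer to the same row order) are all hypothetical.

```python
import random
import sqlite3


def two_distinct_indices(count):
    """Return two different indices in range(count), every ordered pair equally likely."""
    if count < 2:
        raise ValueError("need at least two rows")
    last = count - 1
    index1 = random.randint(0, last)
    # Draw from one slot fewer; a collision is remapped to the slot left out.
    index2 = random.randint(0, last - 1)
    if index2 == index1:
        index2 = last
    return index1, index2


def two_random_rows(conn, table="items"):
    """Run one COUNT query and two LIMIT 1 OFFSET queries (three cheap queries)."""
    # The table name is interpolated directly, so never pass untrusted input here.
    count = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    i1, i2 = two_distinct_indices(count)
    query = f"SELECT id, name FROM {table} ORDER BY id LIMIT 1 OFFSET ?"
    return (
        conn.execute(query, (i1,)).fetchone(),
        conn.execute(query, (i2,)).fetchone(),
    )


# Hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO items (name) VALUES (?)",
    [(f"item{i}",) for i in range(100)],
)
row1, row2 = two_random_rows(conn)
```

The remapping step is what keeps the second draw uniform: index2 is drawn from all positions except the last, and the one value that would collide with index1 is swapped for that last position.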

For future readers.

Get the list of ids of all records:

import random

my_ids = MyModel.objects.values_list('id', flat=True)
my_ids = list(my_ids)

Then pick n random ids from all of the above ids:

n = 2
rand_ids = random.sample(my_ids, n)

And get records for these ids:

random_records = MyModel.objects.filter(id__in=rand_ids)
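
Two details of this approach are worth a sketch. random.sample raises ValueError when asked for more items than the list holds, and an SQL IN filter makes no promise to return rows in the order the ids were sampled. The helpers below are hypothetical (pick_random_ids and in_sampled_order are not Django APIs), and plain dicts stand in for model instances.

```python
import random


def pick_random_ids(all_ids, n):
    """Sample up to n distinct ids, clamping n to the number available."""
    ids = list(all_ids)
    return random.sample(ids, min(n, len(ids)))


def in_sampled_order(records, sampled_ids, key=lambda record: record["id"]):
    """Reorder fetched records to match the order of the sampled ids."""
    by_id = {key(record): record for record in records}
    return [by_id[i] for i in sampled_ids if i in by_id]


# Dicts stand in for model instances fetched with an id__in filter.
fetched = [{"id": 1}, {"id": 2}, {"id": 3}]
ordered = in_sampled_order(fetched, [3, 1])
```

Keep in mind that this approach loads every id into memory first, so it suits small and medium tables better than very large ones.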

Object.objects.order_by('?')[:2]

This returns two records in random order. You can add

distinct()

if there are records with the same value in your dataset.

To sample n random values from a sequence, the random library can be used:

random.sample(range(0, last), 2)

This draws 2 distinct values from the sequence 0 to last - 1, so last should be the row count, not the final index, if every row is to be eligible.

from django.db import models
from random import randint
from django.db.models.aggregates import Count


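class ProductManager(models.Manager):
    def random(self, count=5):
        # Pick a random start offset, then return `count` consecutive rows.
        # Assumes the table holds at least `count` rows.
        total = self.aggregate(count=Count('id'))['count']
        index = randint(0, total - count)
        return self.all()[index:index + count]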
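
Note that this manager returns a contiguous run of rows starting at a random offset, not independently sampled rows, and randint fails if the table holds fewer than count rows. A minimal sketch of a guarded offset calculation, using a hypothetical helper random_window that is not part of the answer:

```python
import random


def random_window(total, count=5):
    """Return (start, stop) slice bounds for a random run of up to `count` rows."""
    count = min(count, total)  # never ask for more rows than exist
    start = random.randint(0, total - count)
    return start, start + count


start, stop = random_window(total=12, count=5)
```

The resulting bounds could then be used as the slice in the manager, for example self.all()[start:stop].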

This lets you fetch any number of objects:

class ModelName(models.Model):

    # Define model fields etc


    @classmethod
    def get_random(cls, n=2):
        """Return n distinct random objects. Pass the number when calling."""
        import random

        n = int(n)  # Number of objects to return
        count = cls.objects.count()
        # range(count) covers every valid index, 0 through count - 1.
        selection = random.sample(range(count), n)
        # One LIMIT 1 OFFSET query per selected index.
        return [cls.objects.all()[index] for index in selection]



