English 中文(简体)
Looking for a clear and concise web page explaining why lower bits of random numbers are usually not that random
原标题:
  • 时间:2010-04-01 14:15:18
  •  标签:
  • random
  • bit

I am putting together an internal "every developer should know" wiki page.

I saw many discussions regarding rand() % N, but not a single web page that explains it all.

For instance, I am curious if this problem is only C- and Linux-specific, or if it also applies to Windows, C++,. Java, .Net, Python, Perl.

Please help me get to the bottom of this. Also, just how non-random do the numbers get? Thank you!

最佳回答

I don t have a web page to refer you to but I might have a "back of the envelope" explanation that would help. The way simple random number generators work is by following the steps

  1. Use the last number generated n or a seed number.
  2. Multiply that number by a special large number
  3. Add another special large number
  4. Divide that by a third special large number and throw away the remainder
  5. Return the result

Now if you think about what happens in all but step 4 you are doing operations where only the lower bits can alter the lower bits of the result. Adding 1001 and 100...00001 will end in ...02 (Ha you though I was talking base 2, really these number are base 12 for giggles.) regardless of what is on the high end of the calculation. Similarly, when you multiply it will end in a 1, no matter what.

There is a similar problem at the top end as well, a billion times a billion will invariably dominate the contribution of the hundreds places of wither number. This points to the fact that the middle is where the good stuff happens. Lots of bits interact here--high, middle, and low.

That is the purpose of the division step, it cuts off the bottom chunk of the result where there was not as much interaction. The top chunk is not usually chopped off because the computer drops the upper bits when the multiplications do not fit into a machine word any more.

In the end though the cut off points are somewhat arbitrary and you can be more picky than the people who designed the algorithm and still chop off a few more bits.

For you question of how bad they can be, they can be really bad. The easiest way to see this is to group individual numbers into tuples and graph them. So if you had random numbers a, b, c, d, ... graph (a,b), (c,d), ... and look at the results. This is called a Spectral Test and Rand fails it beautifully. This one I have a link for try http://random.mat.sbg.ac.at/results/karl/spectraltest/

问题回答

Check out http://en.wikipedia.org/wiki/Linear_congruential_generator, which is likely the algorithm used for most built-in random number generators.

Scrolling down, look for the paragraph beginning with "A further problem of LCGs is that the lower-order bits of the generated sequence have a far shorter period.." for some insight into rand() % N.





相关问题
Weighted random numbers

I m trying to implement a weighted random numbers. I m currently just banging my head against the wall and cannot figure this out. In my project (Hold em hand-ranges, subjective all-in equity ...

Comprehensive information about hash salts

There are a lot of questions about salts and best practices, however most of them simply answer very specific questions about them. I have several questions which feed into one another. Assuming a ...

Generate unique names?

I am working on a php site in which we have to upload images from users.i have to rename that file for preventing conflicts in the name of the image. uniqid(rand(), true); and adding a large random ...

How to get two random records with Django

How do I get two distinct random records using Django? I ve seen questions about how to get one but I need to get two random records and they must differ.

What is suggested seed value to use with random.seed()?

Simple enough question: I m using python random module to generate random integers. I want to know what is the suggested value to use with the random.seed() function? Currently I am letting this ...

热门标签