Too Random

faith mountain statue

J-Walk recently held a raffle and he describes his procedure for selecting the winner. It involves Excel, but what would you expect from Mr. Spreadsheet?

The interesting part, to me, was step nine:

9. I will press F9 five times to regenerate the random numbers.

He put =RAND() in a column and intended to sort based on that column. F9, you’ll recall, is the shortcut key to recalculate a worksheet. But why recalculate five times? Does that make the numbers more random?

The reason that it interests me is because I’m guilty of the same thing. A fantasy football league I ran a few years ago had a rule that called for a raffle if two or more teams wanted the same player. Each team was allotted a number of chances based on their ranking to give the lower ranked team a better chance. If the second ranked team and the seventh ranked team were in the raffle, the second ranked team would get two of the nine chances and the other team would get seven of the nine. I used a similar set up as John in that I used =RAND() in a column and sorted. I only recalculated three times though, not five.

How many times should you recalculate? The answer is none. The number returned by =RAND() is as random as it’s ever going to get when you enter the formula. John could have entered the formula in the first cell, filled the formula down the column, and sorted and the list would have been as random as Excel can muster.

Why isn’t that enough?

Posted in Uncategorized

23 thoughts on “Too Random

  1. I wanted to simulate all of the entries being thrown into a big hopper, and then an attractive spokesmodel turning the handle to mix them all up really well.

    Also, when people throw dice, they shake them up first.

  2. Speaking of random numbers, it sure is convenient having RANDBETWEEN built into Excel 2007. I never used that function in the past because I rarely have Analysis Toolpak installed.

  3. Like Stacie, I’m guilty of hitting Save, over and over, as I think about what I should change next. Maybe I think that if I Save the spreadsheet I’m thinking about, it’ll somehow magically also Save my thoughts. My track record for the success of this method is conclusive proof it doesn’t work…!

    Refreshing a RAND is the same thought process as getting a bunch of raffle tickets on a table, then mixing them up, then putting them into the bag/hat/pair of dungarees, then shaking them about again, and then one final shake for good luck before the winner is drawn by someone who can’t see into the bag, and searches around before picking one out. Despite the fact that it makes it no more random at all, we tend to think that the more “mixed up” the available options are, the more random the result will be. Maybe it’s an evolved survival trait, where cavemen who didn’t mix up their rocks enough never survived. Or something.

  4. And for those of you that want to get really technical, until XL2003 arrived, the numbers generated by RAND apparently were not random enough for “power users.” Being a CPA, I’ve had issues in the past using XL to generate random numbers because I read somewhere years ago that the RAND function was not truly random, but based on the internal system clock, I believe. But I followed orders and did it anyway. But now I feel better since we were not generating the “millions” of random numbers that a power user would, but usually somewhere between 25 and 100. And now I apparently don’t have to worry about it at all since the RAND function in XL2003 meets some type of randomness test…that’s a load off my mind.

    I’m just glad we don’t use statistical sampling!

  5. I once was involved a survey in which people were asked to choose and rank their top 3 reasons for something (out of a set of 12 reasons). Because the survey was not particularly well written and many people did not bother to read the instructions about half of the surveys had some ranking information but not in the 1, 2, 3 format that was asked for. I decided that I could use the information by imputing a ranking in many of the cases. For example, if the person responded with 3 “1’s” I used a random function in VB to “create” a 1,2,3 ranking.

    The interesting thing was that I did not just run a random generator and create ranks. Oh no. I ALSO created a loop that would “randomly” shuffle “random” ranks in the ordering “random” number of times.

  6. J-Walk, when we got Excel 2003 (eventually) Analysis Toolpak was installed. I de-installed it and there seemed to be an appreciable improvement. As the Toolpak is built into 2007 (as I understand it), I’m a bit worried about performance (not that we’ll get 2007 for a while – if ever).

    If you didn’t usually have the Toolpak installed in 2003, how do you rate the performance of 2007 over 2003?

  7. This is like the “Deal or No Deal” TV show. The money is randomly distributed throughout the brief cases, yet people randomly pick their brief case numbers. The randomness shows up in that the left and right money columns stay balanced for choices.

    Makes as much sense to go in “1-2-3? order, but human nature says otherwise ;-)


  8. J-Walk and Haffy,

    Why do you have the Analysis Toopak uninstalled… isn’t that the one with =OEMonth() and other useful worksheet functions? Haffy, there is a performance impact to having it installed… I’ve never noticed, what happens?

  9. Mike R,

    Until now, the Toolpak wasn’t installed by default on our PCs at work, so you could have problems sharing workbooks with people who didn’t have it installed. In any case, although I found some functions to be useful, I could always find another way of doing it.

    The performance impact was simple – the PC always went much slower. There may be another solution, but uninstalling did the job for me. Maybe I’ll give it another shot and report back.

    Then again, perhaps I shouldn’t try to run Excel 2003 on a Sinclair ZX Spectrum?

  10. In Excel 2007, only the ATP functions are integrated into Excel. The ATP add-in is still required if you want to use any of the statistical procedures.

    I usually keep ATP uninstalled because it’s one less thing to clutter up the VB Editor window. I’ve never noticed any performance hit by having it installed.

  11. Here is an odd story that was told to me so as to whether it is true or not… but it makes a good story none the less. In the Bars and such there is a betting game you can play called Keno. It works similar to a lottery. It too used a computer random number generator. One particularily bright individual knowing the computers can not really generate random numbers figured out the algorythm being used and took his new found knowledge and use it to win the jackpot… twice… in a row… The makers of Keno who were on the hook for the rather hefty pay out sued the gentleman insisting that he cheated. The Judge sided with the winner noting that there is nothing illegal about beating someone at their own game. Keno changed the way they generate their “Random” numbers…

  12. Half of that story is true. I saw it on the Discovery Channel. But you are missing a few key details. The “bright individual” was a techinical expert of the Nevada Gaming Control Board and defeated the algorithem because he had access to code.

    He did spend time jail, because he did cheat.

  13. I’ve commented about random numbers on this blog before, so here’s another.

    The comment above that recalculating the sheet doesn’t make RAND() any more random is correct. I think shuffling the cards or rolling the dice does make it more random, as there is some physical aspect to creating the result.

    Random numbers are too important to be left to chance. That’s an axim often used in the practice of computer simulations. I do some of this at work, and coworkers of mine still get skittish thinking about Excel’s RAND() function. I pretty much only use it to generate a bogus set of data when needed for display purposes. :)

    Someone mentioned basing a random number generator off of the clock. This has been shown to be one of the poorest ways to generate a stream of random numbers. Don’t do this in your coding projects.

    Also, I like jkpieterse’s comment: no matter how good the generator is, the operator can always game the result!


  14. 1) One way of making the “Multiple F9 pressing” give better (more random) random numbers would be to time the intervals between the presses (is there an F9 event?) and use that in someway as the seed for the final randomisation step.

    2) Re performance of the Analysis Toolpack. In Excel 2002 some debug.print statements had been left active by the developers. These were easy enough to turn off once you knew the W**!! password and did speed things up. Have they fixed that now?

  15. Regarding the performance of XL 2007, it’s slow. About half the speed of 2003 in general, much slower than that for some things. I don’t know whether it has anything to do with the analysis toolpack functions, but I hope they fix it soon.

  16. Mpemba,
    You wrote “W**!!” like it represents a bad word.
    Just curious what the heck that represents. As the ugly american, I understand
    “F**!!” , “S**!!” , “A**!!”. I can’t imagine what W**!! means. Wanker?

  17. OK, having generated a set of random numbers (20r x 36c), how do I lock those numbers in? Everytime I do something to the worksheet the list of numbers changes!!

Leave a Reply

Your email address will not be published. Required fields are marked *