Every time I post code that includes picking a random integer, Tushar reminds me that my methods produce biased results. The last time, I swore to myself I would do it right. Here’s how the old me would have done it:
Sub RandTest1()
Dim lRand As Long
Dim i As Long
Dim aResult() As Long
Dim dLow As Double
Dim dHigh As Double
dLow = 1
dHigh = 4
ReDim aResult(dLow To dHigh) As Long
For i = 1 To 10000
lRand = Round(Rnd * (dHigh – dLow) + dLow, 0)
aResult(lRand) = aResult(lRand) + 1
Next i
For i = 1 To 4
Debug.Print i & “: “ & aResult(i)
Next i
End Sub
Dim lRand As Long
Dim i As Long
Dim aResult() As Long
Dim dLow As Double
Dim dHigh As Double
dLow = 1
dHigh = 4
ReDim aResult(dLow To dHigh) As Long
For i = 1 To 10000
lRand = Round(Rnd * (dHigh – dLow) + dLow, 0)
aResult(lRand) = aResult(lRand) + 1
Next i
For i = 1 To 4
Debug.Print i & “: “ & aResult(i)
Next i
End Sub
Because I’m using Round(), it will only pick the edges ~50% as often as the middle numbers. It seems that if I simply increase my range, that is reduce dLow by 0.5 and increase dHigh by .49, my distribution will be right. Technically it won’t be right, but it will be close enough for government work. Whereas the old method wasn’t even close enough for that.
Sub RandTest2()
Dim lRand As Long
Dim i As Long
Dim aResult() As Long
Dim dLow As Double
Dim dHigh As Double
dLow = 1
dHigh = 4
ReDim aResult(dLow To dHigh) As Long
dLow = dLow – 0.5
dHigh = dHigh + 0.49
For i = 1 To 10000
lRand = Round(Rnd * (dHigh – dLow) + dLow, 0)
aResult(lRand) = aResult(lRand) + 1
Next i
For i = 1 To 4
Debug.Print i & “: “ & aResult(i)
Next i
End Sub
Dim lRand As Long
Dim i As Long
Dim aResult() As Long
Dim dLow As Double
Dim dHigh As Double
dLow = 1
dHigh = 4
ReDim aResult(dLow To dHigh) As Long
dLow = dLow – 0.5
dHigh = dHigh + 0.49
For i = 1 To 10000
lRand = Round(Rnd * (dHigh – dLow) + dLow, 0)
aResult(lRand) = aResult(lRand) + 1
Next i
For i = 1 To 4
Debug.Print i & “: “ & aResult(i)
Next i
End Sub
Here’s the results from running both
Looks like a winner to me. Am I missing anything?
Why not just truncate a text string and pass it back to an int? – am I missing something ;-)
Why not use the RandBetween worksheet function?
This is why numeric programming isn’t trivial, no matter how easy it may seem.
Don’t round, truncate.
f = lo + Int(Rnd * (hi – lo + 1))
End Function
Putting this differently, is
R = Round(Rnd * (H – L) + L, 0)
really clearer than
R = L + Int(Rnd * (H – L + 1))
?
I suppose the ‘+ 1’ may not be immediately obvious. It arises from there being only N – 1 intervals bounded by N distinct boundary points. This N – 1 to N relationship between intervals and boundary points is fundamental to numeric programming.
Sub RandTest2() is still biased. The upper bound will lack favor by about 1% in long runs, owing to the lack of precision in adding 0.49. Other evils creep in too, such as lRand = 0. Adjusting the precision of the offset seems futile. Not sure if that’s close enough for the guv?
fzz’s is the way I’ve done it since way back, Apple //e days.
You are indeed missing something!
With RandTest2 you will get
a) value 4 with probability (0.99/3.99)=24.812%; and
b) values 1..3 with probability (each)25.0627%
so not an accuate uniform distribution
Best to use ‘int’ rather than ’round’. Try this instead:
Sub RandTest3()
Dim lRand As Long
Dim i As Long
Dim aResult() As Long
Dim dLow As Double
Dim dHigh As Double
dLow = 1
dHigh = 4
ReDim aResult(dLow – 1 To dHigh) As Long
dLow = dLow
dHigh = dHigh
For i = 1 To 10000
lRand = Int(Rnd * (dHigh – dLow + 1) + dLow)
aResult(lRand) = aResult(lRand) + 1
Next i
For i = 1 To 4
Debug.Print i & “: ” & aResult(i)
Next i
End Sub
Dick –
fzz beat me to it, and Ross hinted at it first.
Your second procedure is essentially calculating Low + Int(High – Low + 0.99)
For an accountant, isn’t that somewhat precise? I guess if you’re an accountant for Enron, it’s orders of magnitude too precise.
So, I’m really curious now. Why go through all the trouble of creating a random function when one exists? Is the built in function RANDBETWEEN biased? Are there advantages to building your own?
Speed is one big advantage to writing your own. A few quick tests on my machine showed that my random function (like fzz’s) is about 30-50 times faster than using the RandBetween Worksheetfunction. Also, previous to 2007 (at least in 2003), you needed to have the Analysis Tool Pak installed for RandBetween. Writing your own function helps improve compatibility.
-Josh
@JoshG
Thanks for the explanation. Good stuff to know.
According to “Excel for Scientists and Engineers” (E. Joseph Billo) a new algorithm for random numbers was introduced in Excel 2003, because the old one wasn’t very good. Does anyone know if VBA uses the new algorithm or the old one (or something different)?
they’re all biased! use this: http://www.random.org/clients/http/
Dick wrote: “Because I’m using Round(), it will only pick the edges ~50% as often as the middle numbers. It seems that if I simply increase my range, that is reduce dLow by 0.5 and increase dHigh by .49, my distribution will be right. Technically it won’t be right, but it will be close enough for government work. Whereas the old method wasn’t even close enough for that.”
So, why not just use INT in Excel or Int in VBA and get the correct result in the first place — subject, of course, to the limitations in RAND (or Rnd). It’s a lot easier than trying to fudge ROUND/Round. {grin}
Hey,
Does this give unique random or there are repeats ??
I am looking @ creating a Quizz which pick unique random questions from different sections,however this below picks up repeats tooo
For j = 1 To SA
RD = WorksheetFunction.RandBetween(2, ACount)
Worksheets(“Quiz Paper”).Cells(j + 10, 1) = Worksheets(“Section A”).Cells(RD, 1)
Worksheets(“Quiz Paper”).Cells(j + 10, 2) = Worksheets(“Section A”).Cells(RD, 2)
Worksheets(“Quiz Paper”).Cells(j + 10, 3) = “Section A”
Worksheets(“Quiz Paper”).Cells(j + 10, 4) = Worksheets(“Section A”).Cells(RD, 3)
Next j
Busybee:
See
Select elements at random without repetition
http://www.tushar-mehta.com/excel/newsgroups/rand_selection/
@Tushar
I did go thru the page.. am just confused newbie…will it work as a worksheet function ?
I am just not sure which is one is right for me :-(
Thx in advance.