A favorite peeve of mine is code with “bushy trees.” I first saw this phrase in Kernighan and Plauger’s Elements of Programming Style.
Recently, I saw some code that checked if a RefEdit control referred to a single cell that contained a non negative integer. I cleaned up the formatting some since the original indentation style might be best described as “random tabs.” But, code formatting is not the subject of this post.
The IFs are fairly easy to understand since they essentially follow how most people would think to validate the string parameter. The tricky part comes in ensuring that one has the correct Else clause at the correct indentation level. In this case, that too might not be too difficult since the Else clauses are fairly trivial, each consisting of a single boolean assignment. But, imagine how much more difficult the task would be if there were further If clauses or loop structures in some of the Else clauses!
Public Function getNonNegInt(aRefEditText As String, _
ByRef Rslt As Integer) As Boolean
Dim aRng As Range
Const MaxInt As Integer = 32767
getNonNegInt = True
On Error Resume Next
Set aRng = Range(aRefEditText)
On Error GoTo 0
If Not aRng Is Nothing Then
If aRng.Cells.Count = 1 Then
If IsNumeric(aRng.Value) Then
If CDbl(aRng.Value) >= 0 Then
If CDbl(aRng.Value) = CLng(aRng.Value) Then
If CLng(aRng.Value) < = MaxInt Then
Rslt = CInt(aRng.Value)
Else
getNonNegInt = False
End If
Else
getNonNegInt = False
End If
Else
getNonNegInt = False
End If
Else
getNonNegInt = False
End If
Else
getNonNegInt = False
End If
Else
getNonNegInt = False
End If
End Function
Code Sample 1
As a first pass, one could remove all the boolean assignments by first setting the function value to false and not to true as in Code Sample 1. Then only if we have an acceptable value do we return a True value.
Public Function getNonNegInt(aRefEditText As String, _
ByRef Rslt As Integer) As Boolean
Dim aRng As Range
Const MaxInt As Integer = 32767
On Error Resume Next
Set aRng = Range(aRefEditText)
On Error GoTo 0
getNonNegInt = False
If Not aRng Is Nothing Then
If aRng.Cells.Count = 1 Then
If IsNumeric(aRng.Value) Then
If CDbl(aRng.Value) >= 0 Then
If CDbl(aRng.Value) = CLng(aRng.Value) Then
If CLng(aRng.Value) < = MaxInt Then
Rslt = CInt(aRng.Value)
getNonNegInt = True
End If
End If
End If
End If
End If
End If
End Function
Code Sample 2
However, this still doesn’t help with all the nested If and End If clauses.
So, how does one clean up this deeply nested code? How about if we reverse the tests? Instead of testing if the range is not nothing, test if it is nothing. Instead of testing if the range contains 1 cell, test if it contains more than 1 cell. And, so on. The result as shown in Code Sample 3 is a single If statement (with multiple ElseIf clauses) that is ‘flat’ — with no confusing nesting!
Public Function getNonNegInt(aRefEditText As String, _
ByRef Rslt As Integer) As Boolean
Dim aRng As Range
Const MaxInt As Integer = 32767
getNonNegInt = False
On Error Resume Next
Set aRng = Range(aRefEditText)
On Error GoTo 0
If aRng Is Nothing Then
ElseIf aRng.Cells.Count > 1 Then
ElseIf Not IsNumeric(aRng.Value) Then
ElseIf CDbl(aRng.Value) < 0 Then
ElseIf CDbl(aRng.Value) <> CLng(aRng.Value) Then
ElseIf CLng(aRng.Value) > MaxInt Then
Else
Rslt = CInt(aRng.Value)
getNonNegInt = True
End If
End Function
Code Sample 3
The code above uses a very powerful concept — that of a ‘null clause.’
In most cases, when we have a If…Then, we have some statement in the True branch of the If statement. It might be a series of assignments or it might be another If or a loop of some sort but there is something in the True branch. For example, in Code Sample 1 and Code Sample 2 above, there are two assignments.
Rslt = CInt(aRng.Value)
getNonNegInt = True
End If
However, in Code Sample 3, each Then is followed not by a statement but by the ElseIf clause. This results in a null statement (or empty block) in the True branch. This is perfectly legal in every programming language I’ve used and in this case it serves a very powerful role in simplifying the code.
For the sake of completeness, we will look at one more way of coding the above. In this particular scenario since there is no further processing after the string is validated, one could use use a series of Ifs that simply go to an exit point for bad data. But, this would not work for scenarios in which we wanted to do additional processing after the string of Ifs.
Public Function getNonNegInt(aRefEditText As String, _
ByRef Rslt As Integer) As Boolean
Dim aRng As Range
Const MaxInt As Integer = 32767
On Error Resume Next
Set aRng = Range(aRefEditText)
On Error GoTo 0
If aRng Is Nothing Then GoTo ErrXIT
If aRng.Cells.Count > 1 Then GoTo ErrXIT
If Not IsNumeric(aRng.Value) Then GoTo ErrXIT
If CDbl(aRng.Value) < 0 Then GoTo ErrXIT
If CDbl(aRng.Value) <> CLng(aRng.Value) Then GoTo ErrXIT
If CLng(aRng.Value) > MaxInt Then GoTo ErrXIT
Rslt = CInt(aRng.Value)
getNonNegInt = True
Exit Function
ErrXIT:
getNonNegInt = False
End Function
Code Sample 4
Yes the code sample 2 example is one I like to use but I tend to rem it up a bit, so it is clear what is going on when I come back to it after six months
If aRng Is Nothing Then
‘do nowt
ElseIf aRng.Cells.Count > 1 Then
‘do nowt
etc.
@Tushar –
I would do it like this:
ByRef rslt As Integer) As Boolean
Dim aRng As Range
On Error GoTo ErrXIT:
Set aRng = Range(aRefEditText)
If Application.Evaluate(“(“ & aRefEditText & “>0)*(“ & aRefEditText & “<32767)”) Then
rslt = aRng
getNonNegInt = True
End If
ErrXIT:
End Function
Regards,
Daniel Ferry
excelhero.com
Nice article Tushar.
@Daniel
Surely your solution also needs to test for the cell value being the same as the CLng case? Your evaluate would become something like
Application.Evaluate(“(” & aRefEditText & “>0)*(” & aRefEditText & “<32767) * (CLng(” & aRefEditText & “)=” & aRefEditText &”)”)
@Daniel, I might be tempted to use this slightly shorter function (which, as a personal choice, defaults to processing the first cell of a multi-cell range address is such a range is passed into the first argument)…
ByRef rslt As Integer) As Boolean
Dim CellValue As String
CellValue = Range(aRefEditText)(1).Value
getNonNegInt = IIf(Len(CellValue) > 0 And Not CellValue Like “*[!0-9]*”, CellValue < 32767, False)
If getNonNegInt Then rslt = CellValue
End Function
@Daniel and Myself,
Actually, I might consider passing the first argment ByVal and end up not creating that intermediate variable (and saving another line of code)…
RefText = Range(RefText)(1).Value
getNonNegInt = IIf(Len(RefText) > 0 And Not RefText Like “*[!0-9]*”, RefText < 32767, False)
If getNonNegInt Then rslt = CellValue
End Function
Another one
ByRef Rslt As Integer) As Boolean
Dim aRng As Range
Const MaxInt As Integer = 32767
On Error Resume Next
Set aRng = Range(aRefEditText)
On Error GoTo 0
Select Case True
Case aRng Is Nothing, aRng.Cells.Count > 1, Not IsNumeric(aRng.Value), _
CDbl(aRng.Value) < 0, CDbl(aRng.Value) <> CLng(aRng.Value), _
CLng(aRng.Value) > MaxInt
Case Else
Rslt = CInt(aRng.Value)
getNonNegInt2 = True
End Select
End Function
@Rick –
Nice. But won’t your method fail if RefText does not actually describe a valid range?
Four of the lines in my procedure were there to deal with that possibility. If we are to ignore that, mine could have been:
If Application.Evaluate(“(“ & aRefEditText & “>0)*(“ & aRefEditText & “<32767)”) Then
rslt = aRng: getNonNegInt = True
End If
End Function
Damn, damn, damn! Okay, a minor change to my functions is required. The IIf function call fails as written if the referenced cell does not contain a number; and the first assignment will fail if the reference cell contains an error value. Here is modified code for both of my posted functions that corrects these problems (still keeping the same number of code lines in each)…
First posted version
————————————
ByRef rslt As Integer) As Boolean
Dim CellValue As String
CellValue = Range(aRefEditText)(1).Text
getNonNegInt = IIf(Len(CellValue) > 0 And Not CellValue Like “*[!0-9]*”, Val(CellValue) < 32767, False)
If getNonNegInt Then rslt = CellValue
End Function
Second posted version
————————————
RefText = Range(RefText)(1).Text
getNonNegInt = IIf(Len(RefText) > 0 And Not RefText Like “*[!0-9]*”, Val(RefText) < 32767, False)
If getNonNegInt Then rslt = RefText
End Function
And this version forces the use of the first cell:
If Application.Evaluate(“(index(” & aRefEditText & “,1,1)>0)*(index(” & aRefEditText & “,1,1)
@Daniel,
Hmm, I didn’t think about the argument not describing a real range. Okay, I guess that will cost me two additional lines of code…
First posted version
—————————
ByRef rslt As Integer) As Boolean
Dim CellValue As String
On Error GoTo Whoops
CellValue = Range(aRefEditText)(1).Text
getNonNegInt = IIf(Len(CellValue) > 0 And Not CellValue Like “*[!0-9]*”, Val(CellValue) < 32767, False)
If getNonNegInt Then rslt = CellValue
Whoops:
End Function
Second posted version
—————————
On Error GoTo Whoops
RefText = Range(RefText)(1).Text
getNonNegInt = IIf(Len(RefText) > 0 And Not RefText Like “*[!0-9]*”, Val(RefText) < 32767, False)
If getNonNegInt Then rslt = RefText
Whoops:
End Function
Try that once more…
If Application.Evaluate(“(index(“ & aRefEditText & “,1,1)>0)*(index(“ & aRefEditText & “,1,1)<32767)”) Then
rslt = Range(aRefEditText)(1): getNonNegInt = True
End If
End Function
And with the range error back in play…
On Error GoTo Whoops
If Application.Evaluate(“(index(“ & aRefEditText & “,1,1)>0)*(index(“ & aRefEditText & “,1,1)<32767)”) Then
rslt = Range(aRefEditText)(1): getNonNegInt = True
End If
Whoops:
End Function
I would say both our methods are excellent, totally different roads to the same destination…
In case folks missed the tags I used, they did *not* include Excel!
This post was about programming techniques: bushy trees, reversing tests, null clauses, and empty blocks. Towards that goal I used an *example* of a RefEdit control in the context of Excel. That was an example, not an end on to itself.
That said, Daniel, your code doesn’t check for an integer. You also have boundary condition errors. Also, over the years, I’ve been burned by Evaluate and use it only as a last resort. I have no idea how well it would fare if the consumer selected refedit fields from multiple workbooks/worksheets.
Rick, the original example did not allow the customer to specify multiple cells. Selecting just the first might reduce the number of characters a developer has to type but it *ignores* bad input! Also, format a cell with 1, or more, decimals and the code flags an integer value as an error. Finally, there is a boundary condition error.
I realize there are those who believe size matters, but is it really smaller is better? {vbg}
There are times when explicitly and methodically testing the input pays off big time. Just for the record, there is a bug in the original code I posted but it was in the original code and I did not bother fixing since it was irrelevant to the concept of this post.
The advantage of the the original code is that, with very little additional work, it is transportable to other environments for validating non negative integers that fit in an integer data type. It can also be *easily* adapted to include more specific error information. But, most of all, it is concept that matters. Avoid bushy trees, think of the benefit of reversing test conditions, and leverage null clauses.
@Daniel,
Yes, two totally different roads to reach the same destination. I notice, however, that you joined multiple statements on a single line using the colon operator. If the colon operator is “allowed”, then I can reduce my functon to a two-liner…
On Error GoTo Whoops: RefText = Range(RefText)(1).Text: getNonNegInt = IIf(Val(RefText) > 0 And Not RefText Like “*[!0-9]*”, Val(RefText) < 32767, False)
Whoops: If getNonNegInt Then rslt = RefText
End Function
@Daniel… just to note, there was the symbol for “very big grin” at the end of my first sentence, but I forgot about these comment processors and their penchant for “eating” anything between “less than, greater than” symbols, so it did not show up in my posted message. Rest assured, there was DEFINITELY supposed to be a “very big grin” symbol introducing my “two-liner”.
Tushar –
Thanks for the response. I did indeed forget to check for the integer requirement. Over the years my path has gone opposite yours as I have gone from hating Evaluate to loving it. I use it frequently, especially the shortcut version, [].
Here is a revised version that I believe accounts for all of the requirements:
Dim aRng As Range
Dim t As String
On Error GoTo ErrXIT:
Set aRng = Range(aRefEditText)
t = aRefEditText
If Application.Evaluate(“(“ & t & “>0)*(“ & t & “<32767)” * (Int(” & t & “) = ” & t & “) * (Rows(” & t & “) = 1) * (Columns(” & t & “) = 1)) Then
rslt = aRng
getNonNegInt3 = True
End If
ErrXIT:
End Function
To point to the article tags as not including “Excel” when the article is posted on an Excel blog, and then proceed to use Excel-specific language items, and then not mention that the point was a general purpose routine that could be used or quickly converted to other platforms, is asking the reader to have ESP.
@Tushar,
I understand your point about your code being for example purposes and your point about lengthy versus brief code, but just in case anyone following this thread is interested, I believe the following revision to my function addresses the problems you found with how my code functioned…
On Error GoTo Whoops
RefText = Range(RefText).Value
getNonNegInt = IIf(Len(RefText) > 0 And RefText > 0 And Not RefText Like “*[!0-9]*”, RefText < 32768, False)
Whoops: If getNonNegInt Then rslt = RefText
End Function
I would go with a modification of Code Sample 1:
On Error Resume Next
Set rng = Range(str)
On Error GoTo 0
If Not rng Is Nothing Then _
If rng.Cells.Count = 1 Then _
If IsNumeric(rng.Value) Then _
If CDbl(rng.Value) >= 0 Then _
If CDbl(rng.Value) = CLng(rng.Value) Then _
If CLng(rng.Value) <= cMaxInt Then _
bln = True
If bln Then res = CInt(rng.Value)
Tushar, thanks for the interesting post. I’ve used the null clause method before, although not to such an extent. I tend to put a comment in there like “‘continue”, to remember why it’s there. I like Dick’s Select Case True version – it’s easy to read, and would be easy to modify. (I see that syntax is still allowed in VB.Net, although it causes some teeth-gnashing.)
It’s refreshing to see a code war nowadays :)
Thinking about this more, I think when you’ve got a bunch of conditions any one of which leads to failure, it’s just a big OR statement, which is why Dick’s Select Case True appeals to me.
Very neat examples and discussion – but I wonder which of these is easiest to maintain, especially for someone else to maintain (for obvious reasons we all know about). And my answer would be Code Sample 2.
I like Dick’s suggestion, but I still think someone would scratch their head for a while before figuring it out.
I guess comments could help speed the thinking, but if we are writing comments, then is it worth it – or should we just write the code in a way that would be understandable without the comments (I agree this is not possible in all situations, but in these situations there is a reasonable alternative)
–Charlie
Can’t use Evaluate if passing the function a textnum like “10? or a contrived textnumarray like “{1,2,3}” should throw an error. That is, if the string argument MUST be a textref, it’s NECESSARY to try to convert it to a range object.
As for the actual coding, why return success/failure? The same Kernighan & Plauger wrote another book titled Software Tools, which was for a FORTRAN dialect, and a derivative Software Tools in Pascal, which was a screed against the poor design of Pascal’s I/O library which included asinine functions like this where the variable receiving the input was passed as a parameter to the library I/O function. Kinda inconsistent with these authors’ entire work to focus just on the bushy trees and ignore the suboptimal function interface.
Lack of shortcircuit boolean evaluation in VBA necessitates multiple nested If blocks. Even so this could be cleanly handled by creating the range object, then derefencing it properly.
Const MAXINT As Long = 100000
Dim rng As Range, v As Variant
foo = -1 ‘defaults to error return value
On Error GoTo ExitFunction
‘throws a runtime error when textref isn’t valid range address
Set rng = Range(textref)
‘exits when rng isn’t a single cell
‘throws a runtime error when rng isn’t numeric
If rng.Count = 1 Then v = CDbl(rng.Value2) Else GoTo ExitFunction
‘finally just check whole number between 0 and MAXINT
If v >= 0 And v <= MAXINT And v = Int(v) Then foo = v
ExitFunction:
End Function
@fzz –
Agreed, but Tushar made it clear that the purpose was to validate a cell value – so Evaluate is a good choice.
That being said, I like your approach and would choose it myself if I were not to use Evaluate.
Regards,
Daniel Ferry
excelhero.com
@Daniel –
Agreed, validate a cell value. And if the function took a Range object as it’s 1st argument rather than a string textref, there’d be no reason for the initial validation checking that the first argument referred to a range. Evaluate, OTOH, won’t throw an error converting “10? to 10 or “{1,2,3,4}” to an array even though neither string refers to a range. That begs the essential question whether the string 1st argument should be tested to confirm that it refers to a range. If not, Evaluate would be expedient. If so, Evaluate would be inadequate.
Perhaps worse than the syntax of passing the ‘return value’ as a ByRef parameter is passing the 1st reference as a textref rather than the intended Range object. If the 1st arg should be a range reference, WTH not make the argument a Range object and put that bit of validity checking into the calling procedure? If that’s not allowed or practical for some reason, this is an exquisite example of when encapsulating a type conversion inside another function is warranted.
Const MAXINT As Long = 100000
Dim rng As Range, v As Variant
foo = -1 ‘defaults to error return value
Set rng = textref2range(TextRef)
If rng Is Nothing Then Exit Function
If rng.Cells.Count > 1 Then Exit Function
v = CDbl(rng.Value2)
If v >= 0 And v <= MAXINT And v = Int(v) Then foo = v
End Function
Function textref2range(TextRef As String) As Range
On Error Resume Next
Set textref2range = Range(TextRef)
End Function
This is closer to the true spirit of the collective writings of messrs Kernighan & Plaugher: multiple return/exit statements for different exceptions are clearer than nested If blocks.
I’m with fzz on this one:
This is closer to the true spirit of the collective writings of messrs Kernighan & Plaugher: multiple return/exit statements for different exceptions are clearer than nested If blocks.
I write a lot of code with lengthy validation processes and my approach as always been a sequence of ‘drop-through-or-exit’ tests:
Exit Sub
End If
If target.value < 0 Then
Exit Sub
End If
… And so on.
I do, however, make frequent use of Null ‘IF’ clauses, using a comment to show that I’m doing nothing:
‘ No action: range 2 is not in range 1
Else
arr1 = rng1.value2
arr2 = rng2.value2
Endif
That being said, Tushar’s sequential ElseIf is functionally equivalent to the ‘drop-through-or-exit’ approach, with the additional benefit that you stay within the function and can handle the ‘rejected’ values with specific error messages.
A nice example.
Anyone got a good example out there of short-cutting (or ‘shortcircuiting’) a more complex decison tree in VBA? Constructing a binary token and applying bitwise operations is conceptually simple – we all do it, even if we don’t realise it, when we pass parameters to the msgBox function:
…However, constructing practical examples is much harder, and I’ve never seen a satisfactory solution to a decision tree with orthogonal tests: a common situation in which VBA coders always end up duplicating code in at least one ‘Else’ clause, no matter how the IF statements are nested.
Orthogonality… This might not be expressed as clearly as it could be: – Tushar’s example can be thought of as a simple ‘linear’ test set, as we can drop straight through with a conceptual ‘valid=true’ or ‘valid=false’. An ‘orthogonal’ test forces us to ‘branch off’ in another direction with (say) ‘CurrentValue=true’ and ‘CurrentValue=false’, with each of those two choices containing separate ‘valid=true’ and ‘valid=false’ clauses.
Add in (say) the test ‘ExtendedCharset=true’ and ‘ExtendedCharset=false’, and you’ll find yourself with eight separate cases – but there might sets of answers that are handled the same way (eg:three cases might go to a common error heandler)… But, in a genuinely orthogonal test set, you can only ever get one result set into the same ‘Else’ clause: all the others will end up in clauses containing duplicated code.
public boolean [] foo( int [] a) { // code }