Testing Strings Using Left

The “wrong” way:

Sub FindTotals()
   
    Dim rCell As Range
   
    For Each rCell In Sheet1.Columns(1).Cells
        If Left$(rCell, 5) = “Total” Then
            ‘Do something
       End If
    Next rCell
   
End Sub

The “right” way:

Sub FindTotals2()
   
    Dim rCell As Range
   
    Const sFIND As String = “Total”
   
    For Each rCell In Sheet1.Columns(1).Cells
        If Left$(rCell, Len(sFIND)) = sFIND Then
            ‘Do something
       End If
    Next rCell
   
End Sub

Both “right” and “wrong” are in quotes because, like many best practices, it can be a matter of taste and preference. In the second method, I can change the string in one place and I’m done. In the first, if I change the string, I also have to change the second argument of Left$().

Posted in Uncategorized

20 thoughts on “Testing Strings Using Left

  1. Just another possibility (with a different emphasis than yours…):

    Sub FindTotals3()

    Const sFIND As String =”Total*”
    Dim rCell As Range

    For Each rCell In Sheet1.Columns(1).Cells
    If rCell.Text Like sFIND Then
    ‘Do something
    End If
    Next rCell

    End Sub

  2. Since I spend a ton of time working on “other people’s” code (read “I have this macro that somebody else did a long time ago that used to do X, but now it behaves funny… can you please…”), I join you in screaming from the rooftops that if your code does not have a constant for everything that can be used more than once then you’re doing it “wrong”. Unfortunately, the cell that contains the data you are comparing your constant to likely needs any combination of “Clean” or “Trim” or “Format” or… but I digress.

  3. J.E. McGimpsey beat me to mentioning the Like operator, so I’ll content myself with the InStr approach.

    Const s2m As String =“Total”
    Dim c As Range

    For Each c In SomeRange
      If InStr(1, c.Text, s2m) = 1 Then
        ‘Do something
     End If
    Next rCell

  4. Depending on what appears in

    ‘Do Something

    both methods may be “wrong”. Often it makes sense to read your search range into an array, and query the array elements in VBA.

  5. Well, it’s “wrong” in the sense that it needlessly checks possibly empty cells, or cells that may contain only numbers or formulas.

    Usually I code like this when I’m too lazy to come up with a “real” solution, or when speed isn’t that important (i.e. when I’m coding for someone else LOL!)

    But why even use a loop when you can use a native feature like Find which is probably much faster?

    If you have to use a loop, at least use .SpecialCells(xlCellTypeConstants, 2) to limit the operation to cells that contain text values. And use WorksheetFunction.CountA to limit the number of iterations to only the number of times the word “Total” appears in that column. That should make the loop as efficient as possible.

    Thx,
    JP

  6. I might be inclined to create a new function:

    Public Function BeginsWith(ByVal SearchIn As String, ByVal SearchFor As String) As Boolean

  7. Ahem. As I was trying to say before I inadvertently hit whatever key combination presses “Submit”…

    Public Function BeginsWith(ByVal SearchIn As String, ByVal SearchFor As String) As Boolean
    BeginsWith = (Left(SearchIn, Len(SearchFor)) = SearchFor)
    End Function

    Although to be honest I probably over-do this one-line function business.

  8. Or use the Find function.

    Sub FindTotals3()
       
        Dim rCell As Range
        Dim strFirstAddress As String
       
        Const sFIND As String = “Total*”
       
        With Sheet1.Columns(1)
            Set rCell = .Find(sFIND, lookat:=xlWhole, LookIn:=xlValues)
            If Not rCell Is Nothing Then
                strFirstAddress = rCell.Address
                Do
                    ‘ Do Something
                   rCell.Font.Bold = Not rCell.Font.Bold
                    Set rCell = .FindNext(rCell)
                Loop While Not rCell Is Nothing And rCell.Address  strFirstAddress
            End If
        End With
       
    End Sub
  9. What I was getting at was the speed of looping of worksheet cells vs. looping in a VB array. However, it looks like I neglected SpecialCells and Find.

    The speed of a routine isn’t too important if it takes a relatively long time for the user to scratch his head and decide what to look at next, compared to the time the routine takes to provide the next view. That’s why I usually have no problem with late binding or with using Variants.

  10. Since you guys are so interested in the periphery, I’ll make up a backstory for you. All the cells in column A have text in them. All but 10 of the 65,536 strings start with “Total”. Does that make my loop reasonably efficient?

  11. Looping wasn’t the focus of the original post. Finding matches for “Total” at the beginning of strings was. For that, either J.E. McGimpsey’s Like construct or my InStr construct are better than the originally posed ‘right way’. The Like approach is more flexible, but the InStr approach is minimal.

  12. LOL Dick all you did was create an issue that fits your solution.

    But I think Find and CountA would still work faster. Maybe if I’m inspired tomorrow I’ll try it out.

    –JP

  13. Several comments.

    A best practice is *not* a matter or taste and preference. Something either is the right way of doing something or it is not.

    All one has to do is look at the relative costs of doing things differently. Code must be easy to understand and maintain. That is the number one priority. And, that has to be without silly — yes, silly — comments like “Declare a variable” or “Loop through all the cells” or other equally inane comments that one is likely to see in various sample (and real) code. If one must document code, document the intent, not the implementation.

    Of course, code must work in real time but the definition of real time varies widely. If something must run while a customer is waiting for a transaction to complete, the speed requirement is very different from something that has to run over a weekend once a year.

    In any case, what I call micro-optimizations are rarely useful for speed improvements since at most they yield percentage improvements. Coming up with a better algorithm will almost always result in a *magnitude* improvement.

    True, I often transfer data from Excel to VBA using the array=range.value or array=transpose(range.value) statements as well as the other way around but that is often more for flexibility than for speed. This way whatever core function I am writing always operates on an array passed to it as an argument. Auxiliary functions map ranges-to-from-arrays. Now, the core function is usable either with an Excel range or as a stand-alone VBA function. The overall code is self-documented, highly modular, and trivial to drop as a black box into other projects! Since the code is elegantly trivial it is also completely self-documented. For an example, see http://www.tushar-mehta.com/excel/tips/powerset.html In this case, the functions towards the bottom of the page are the interface functions to the core routines. One example is the UDFSubset function. Does anyone need to be told it’s meant for use as a UDF? It reads:

    Public Function UDFSubset(Arr, NbrItems As Integer, Delim As String)
        Arr = fixInput(Arr)
        UDFSubset = createOutput(createSubset(Arr, NbrItems, Delim))
        End Function

    What does fixInput do? Check it out. It requires no comments to understand. Basically, if Arr is a range, it maps it to an array, transposing it if appropriate. createSubset is the core routine that can also be called by another VBA routine to create the required subset. Finally, createOutput maps the result of createSubset to a range if the caller is an Excel range.

    Don’t like the fact that the functions argument is misnamed as Arr? Change it to ArrOrRng, fix the 2 lines in the function, and rest assured that you do not have to look at any of the other routines.

    Don’t like the fact that the function reuses Arr? Just add a local variant, assign the result of fixInput to it and be completely confident that no other routine will be affected by your decision!

    Why do I raise the above example? Because it is similar to Dick’s original intent in making this post — at least as I read it. It makes the intent of the code obvious and it makes maintenance that much easier. I would have written Dick’s code somewhat differently to make it more obvious that the scope of the constant is limited.

        For Each rCell In Sheet1.Columns(1).Cells
            Const sFIND As String = “Total”
            If Left(rCell.Value, Len(sFIND)) = sFIND Then
                ‘Do something
               End If
            Next rCell

    Unfortunately, the VB(A) compiler doesn’t honor scopes smaller than a routine (other languages do). Consequently, sFIND is available through the entire subroutine. To avoid the temptation that someone (including myself at some point in the future) might reuse sFIND, I tend to write

        For Each rCell In Sheet1.Columns(1).Cells
            If Left(rCell.Value, Len(“Total”)) = “Total” Then
                ‘Do something
               End If
            Next rCell

    Anyone reading or changing this code who misses the two occurrences of the literal “Total” shouldn’t be be allowed anywhere near the code in the first place!

    Of course, if I were working on this task, I would have modularized it.

    Option Explicit
    Sub doSomething(aCell As Range)
        Debug.Print aCell.Address
        End Sub
    Sub processTotals(aRng As Range)
        Dim rCell As Range
        For Each rCell In aRng
            If Left(rCell.Value, Len(“Total”)) = “Total” Then doSomething rCell
            Next rCell
        End Sub
    Sub MainControl()
        With ActiveSheet
        processTotals Application.Intersect(.UsedRange, .Columns(1))
            End With
        End Sub

    Improve the above? Only if the performance was *actually* unacceptable.

  14. OK, should anyone using multiple instances of the same text constant be allowed anywhere near a language compiler or interpreter of any kind?

    There’s a reason local, temporary variables have been implemented in most languages. That VBA doesn’t provide reasonable scoping isn’t a good excuse to avoid using temporary variables.

    But that still leave the question whether

    Left(somestring, Len(anotherstring)) = anotherstring

    is clearer than either

    somestring Like anotherstring & “*”

    or

    InStr(somestring, anotherstring) = 1

    The Like approach is closest to scripting languages with built-in regular expression support, and there is NO BETTER WAY to process text than using regular expressions. The InStr approach has the benefit of only needing to refer once each to somestring and anotherstring, and beyond that all that’s needed is an understanding that 1 is the initial character position in VBA strings.

  15. Tushar, is there any particular reason why you indent the closing to your “For”, “If” and “Sub” statements? Code indented like that drives me mad when reading it. Is that really the “Right Way”?

    Charles

  16. Tushar’s style makes me a little crazy too. But I run the Smart Indenter on it, and it’s normal again. I guess it’s what you’re used to.

  17. Charles Chickering wrote ‘Tushar, is there any particular reason why you indent the closing to your “For”, “If” and “Sub” statements? Code indented like that drives me mad when reading it. Is that really the “Right Way”?’

    Yes, though I sympathize with your problem. When I switched to this method of indenting 25+ years ago, I had to make an adjustment to it. But, the productivity improvement was worth the effort. A few of my colleagues from that era also made the switch and stayed with the new approach.

    Here’s the reasoning. The keywords that flag the end of block structures are for the compiler, not for people. People rely on indentation and the judicious use of white space to read code. So, why highlight what’s important to the compiler? It just distracts human beings.

    Why should the End Sub be at the same indent level as the Sub? I *know* where the subroutine ends. It ends before the next statement at the same indent level. I just have to scan down the white space starting below the Sub statement. Similarly, I *know* where every block structure ends. It ends at the line prior to the next statement that starts at the same indent level as the line starting the block. I don’t need to see the End If or the Next X or whatever.

    [And, before people jump all other this one exception, yes, I know that a VB label has to be in column 1; it’s something I’ve learned to live with.]

    OK, so what if the block ending statement is missing? Again, 35 years ago it was important to know that since I might well have to submit a deck of punched cards for overnight processing only to discover the next day that I was missing a End If. But, today? The compiler will tell me that as soon as I compile or execute the code! Not to mention that modern editors will insert the block-end statement as soon as one types in the block-start statement.

    Conceptually, this issue is along the same line as something most developers “know” is “best practice” when it comes to naming variables. They use a leading character (and sometimes several) to identify the data type of a variable. I’ve refused to do it for nearly 30 years and was thrilled to discover that Microsoft has apparently dropped that convention with .Net.

    The whole idea is rather archaic. I suspect it made sense back in the days when the number of data types were limited and the software tools were (relatively) primitive. The cost of finding the type of a variable was high if one had to visually scan printed pages to find the declaration. But in today’s world when one can create custom types and classes and has much more sophisticated editors at one’s disposal? It’s hard to imagine it making sense.

    Not to mention that I find the information less than helpful. Do I care if the computer stores the variable representing a month in one byte, 2 bytes, or 4? What I want is for it to contain integer values between 1 and 12. So, what does the l in the variable lMth tell me? That the variable can legitimately contain the value -14567? Obviously, that’s bad information.

    What would be great is a mechanism to enforce a legitimate value. And, I will be happy to let the compiler and/or OS decide how best to allocate memory for that variable. Frankly, that was a major disappointment with .Net. I had hoped it would allow declarations like Mth: Integer[1..12] or Dim RandVar: Real[0..1)


Posting code? Use <pre> tags for VBA and <code> tags for inline.

Leave a Reply

Your email address will not be published.