If it ain’t broke…fix it Part 1 of many

Since this is my first post, I wonder if I should take a moment and introduce myself. But, then, it might be easier to let the author profile and my posts themselves do the talking…

One of the clichés that has guided developers through the decades has been “if it ain’t broke don’t fix it.” But when a system is based on good development methodologies, preemptive changes can have significant benefits. This post discusses one such preemptive measure VB(A) developers should consider. Many, if not most, VB(A) developers use 1-based arrays or even arrays declared with bounds that are highly relevant to their application. For example, someone who has to create an array for the years 2001 to 2010 might use

.Net, on the other hand, requires all arrays to be zero-based. Further, since the lower bound must be zero, the array declaration does not allow the specification of a lower bound. Essentially, the syntax

declares an array of 10 elements from a lower bound of zero to an upper bound of 9. Clearly, when it comes to declaring arrays .Net supports only one of the many flavors, if you will, that VB(A) supports. To the extent that one plans to migrate to VB.Net or develop in both VB(A) and VB.Net, adopting .Net protocols now will help minimize transition difficulties. In VBA, one can use

to accomplish essentially what .Net requires.

Is that all there is to making a preemptive change to zero-based arrays? No, not really. One of the first things to consider is that one needs to think differently. Most of us grew up learning to count starting from 1. Now, of course, we need to start counting from zero. The first element of an array is at index zero and not at index 1.

In addition, when it comes to changing existing code, there are other issues to consider. If the developer has code like

it will now fail.

On the other hand, if the developer had practiced safe programming all along, the code would have used references to the array’s bounds in the first place and would require no change!

In the next post, we will look at the handling the interactions between VB(A)/.Net zero-based arrays and Excel’s 1-based collections.

Posted in Uncategorized

18 thoughts on “If it ain’t broke…fix it Part 1 of many

  1. I’m not entirely clear what your point is. You seem to be saying that because the .NET programming languages require zero-bound arrays that this is the best way to model every solution that requires an array. Just because Microsoft decided to make it so does not make it a good idea.

    Arrays with non-zero lower bounds often model the actual business problem far better than zero-bound arrays. This is a perfect example of the type of programming language abstraction that leaves more brain cycles available to solve the business problem rather than forcing you to squeeze your problem into the implementation of the programming language.

    The whole point of VB/VBA was to create a programming language that was abstracted far enough from the underlying machine code to allow rapid solution development for common business logic problems by people who were not primarily computer programmers.

    If you are trying to say that arbitrary array bounds are theoretically OK but they will be obsoleted by the .NET programming languages then you should say that. If you are saying there is a fundamental problem with arbitrary array bounds then I can’t think of any reason this is true and you should explain your reasoning in more detail.

  2. “.Net, on the other hand, requires all arrays to be zero-based.”

    That is not correct as VB.NET actually allow us to create non-zero based arrays.

    In practice and in particular for one-dimensional arrays it’s highly recommended to use zero-based arrays.

    It would been more interesting if Your case had cover the transition of VB/VBA-solutions to .NET in view of arrays, lists and collections.

    Kind regards,
    Dennis

  3. Enforced zero-based arrays is one of the (lesser) ways that Classic VB programmers felt betrayed by the upheaval caused by the introduction of VB.Net. It’s one of the obvious ways that forwards compatibility of existing code was broken.

    This just popped into my head. I use variants a lot to extract data from worksheet ranges. The variant becomes a 2-D 1-based array. Does VB.Net/VSTO also get a 1-based or a 0-based array of values?

    Man, that would trash a LOT of existing code.

  4. Rob, as always, I appreciate your sensible approach. I feel like you are speaking on my behalf, somebody who wants to be able to address the problem at hand without unneccessary overhead. VBA is a great fit for the way I work. Much as I yearn to master .Net, I’m not sure my remaining “brain cycles” are up to it.

    Jon, would Contrarian’s suggestion re: UBound/LBound eliminate your theoretical problem or is there more to it than that?

    I notice that Filesearch.Foundfiles returns a 1-based array, whereas GetAllSettings returns 0-based and, as I think Dennis was referring to, collections are 1-based.

  5. Doug –

    I already use LBound to UBound in most of my code, but that’s so the called routines don’t care what kind of array they receive, and can be reused flexibly. But if I’m just slurping the values in a range, I generally haven’t gone to that bother.

    – Jon

  6. Rob: I am not staking out a position vis-a-vis VB(A) vs .Net. My point is that array management is more restrictive in .Net. So, it might make more sense for those — and only for those — interested in migrating to it or developing software for both platforms to acclimate themselves to those aspects of .Net that they can even while still working with VB(A).

    True, there are applications that can be better modeled with lower bounds that are *neither* zero nor one. But how many developers actually do that? And, irrespective of those few, the reality is that even with VB(A) we *have* to deal with zero-based arrays. Various VB(A) functions return zero based arrays. Various userform (MSForms) components are zero-based. Various components available through the Scripting or FileSystemObject interfaces work with zero-based arrays. So, it isn’t like one can ignore zero-based structures.

    Essentially, the point is that one can ease the pain of migration if one starts adopting — and even retrofitting — those program elements that will have to be changed anyway.

    Dennis: Yes, .Net works with 1-based arrays. And, I would love to know how to declare one with code. Dim x(1 to 10) doesn’t compile. And, then there is http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnvb600/html/vb6tovbdotnet.asp

    Jon: The Excel object model continues to return 1-based arrays in VB.Net. For example,
    Dim x As Object
    x = CType(ThisApplication.ActiveWorkbook.Worksheets(1), Excel.Worksheet).Range(“a1:a100?).Value
    returns a 2D array 1 to 100, 1 to 1. But, since the developer cannot create a 1-based array, if you want to transfer data to/from X and your own array, you will have to deal with mismatched bounds.

    Doug: Quite true. As I’ve already mentioned a few times over, for those who don’t plan to use .Net, there’s no reason to do anything different. However, for those who do, doesn’t it make more sense to learn to work with those differences that we can (such as zero-based arrays) in an otherwise familiar environment? Lacking a Computer Science (or similar) background, it’s more difficult than one might think because it goes against a lot of what we’ve learnt through the years. Counting from one is ingrained in us. Ask a child to count out the numbers. How many start “zero, one, two…?”

    Further complicating matters is that the Excel OM will continue to return 1-based results in many cases. The developer will have to deal with some structures that are zero based and others that are 1-based. Why not learn how to manage those differences now and thereby ease the pain of the actual transition?

    But, more to the title of the post, let’s not get caught up with VB vs VB.Net issues. Preemptively adopting zero-based arrays is only one example that falls in the “if it ain’t broke…fix it” category.

  7. If .Net provides associative arrays like WSH Dictionary or Perl/Python/Ruby hashes, then this entire discussion misses the point. There are better data structures than non-0/1-based arrays to represent identifiers for logical records where those identifiers ideally don’t begin with 0 or 1.

    On a different tack, if you want to use .Net, you need to start thinking in C terms.

  8. In retrospect, it would have been a good idea to introduce yourself, the topics you intend to post about, and what you hope to achieve. If your intention is to simply question the intrenched norms of the community, then you’re welcome to do that and I’ll join in the debate. If this is the first in a series of “Preparing for .Net” posts, that’s fine too and I’ll ignore them at my leisure. In all cases, though, it would be a good idea to include some sound justification for your assertions if you want them to be taken seriously (and I don’t think “Because .Net does it that way” is a justification that will garner much respect in these parts).

    And just to be picky, I notice that in your ‘recommended’ sample code of using LBound and UBound, you have introduced a logic error – the new result is the ten numbers 0-9 rather than 1-10. It’s exactly that sort of logic error that means doing such a seemingly innocuous change requires a complete and full retest of every bit of an application.

  9. Check out the following:

    ‘Number of items per dimension
    Dim NumberofItems() As Integer = {3000, 500}
    ‘The lower bound for each dimension
    Dim LowerBounds() As Integer = {2001, -1000}
    ‘Creating a generic array where we set the bounds
    Dim Arrobj As Array = Array.CreateInstance(GetType(Integer), NumberofItems, LowerBounds)
    ‘Assign the array
    Dim Arr(,) As Integer = CType(Arrobj, Integer(,))

    ‘Show the outcome
    MsgBox(“Starts at: ” & Arr.GetLowerBound(0) & ” Ends at ” & Arr.GetUpperBound(0))
    MsgBox(“Starts at: ” & Arr.GetLowerBound(1) & ” Ends at ” & Arr.GetUpperBound(1))

    Kind regards,
    Dennis

  10. I will go ahead and say that zero-based arrays are sometimes easier to work with. I was forced to learn some C to do some work in a specialized application. I took a C++ course a few years back, but absorbed very little of it. In revisiting C, I found it easier to write most loops when things are zero based.

    I have since started applying that to new VBA code I’ve written. Contrarian has a good point for people that work in other languages besides VBA, whether that is VB.Net, C#, or what have you: zero-based arrays are far more universal. It will be easier to write code in all of them if you use similar structures.

    For those who only work in VBA, this won’t be a consideration. It will still be easier to keep doing what you were doing.

    -Jason

  11. Its an interesting article, and I see the logic.
    Heres my issue though:
    what are the real world saleable benefits of moving to .net with Excel? And what are the costs?

    One of the reasons ‘if it aint broke don’t fix it’ came to pass is that there is no such thing as ‘if it aint broke.. fix it’, its really ‘if it aint broke meddle pointlessly till it is’. As in your mangling up your MyArr() to be off by one throughout. If its that hard to get a 3 line example right is it any wonder those of us maintaining 10’s of thousands of lines of live commercial code, dont share your enthusiasm for change to proven code.

    If .net were that good we’d all be on it – look at the stampede from XLM to VBA in 95-7. The reality is .net has been out since 2001 and the Excel story is still weak. And the take up is absolutely tiny, I’d be stunned if someone could show more than 5% of Excel/VBA developers had delivered any commercial code including .net work, and thats after 5 years. There are probably even less ‘mainstream developers'(old VB6/MFC types) interested in targetting Excel, they all seem to prefer web apps.

    I use .net and it is great for server based work, on the client though, I think its a long way from compelling.
    I’ll sprint to .net as soon as the job ads on http://www.jobserve.com indicate paying clients are taking it seriously (unless I (or .net) have already retired).
    Until then I’ll carry on keeping my code as close to the business as I can, thanks.

    I hope your next article isn’t going to advise us to use the old data type postfix characters because the migration tool copes with them better?

    BTW I (and all the other posters) have chosen not be anonymous – youre not an academic by any chance are you? they seem to favour this secrecy nonsense.

    Cheers
    Simon

  12. Well, that was a less than august introduction. Two rookie mistakes compounded by an error in judgement.

    First, Stephen, you are, obviously, right in the code having a bug. It was introduced not when the code was originally written — it read MyArr(I)=1 — but while editing the blog write-up. For a reason that escapes me now, “I” replaced the one. Not that it matters since the result hurt the issue I was wanted to focus on.

    Second, I did not realize how passionate people are about not Net. Using it as an example turned out to be the wrong decision. It needlessly distracted from the issues I wanted to raise. To me .Net is simply *one* of the reasons why I do things the way I do. Clearly, that did not come through in my post.

    Third, my original instinct, indeed, was to start with an introduction. But, it quickly became apparent that by choosing to remain anonymous I could not call on my past experiences or accomplishments. Consequently, every introduction came across as too long or too abstract or too pretentious or some combination of the three. Hence, it appeared best to let my posts speak for themselves.

    The rough start aside, I passionately believe in the topics I want to raise. Hopefully, future posts will do a better job of focusing on the core issues and not get so easily side-tracked.

  13. Hi Contrarian,

    Just a clarification, I don’t think (m)any people here are passionately “not .Net”, but we are passionately against “having .Net rammed down our throats when there’s little compelling reason for us to adopt it”.

    So how about you start again with an introduction not about yourself, but about the purpose and intention of your posts.

    Regards

    Stephen Bullen

  14. I take a more pragmatic view: we’ll deal with it when we have to, and .NET isn’t here yet. Or at least, it isn’t compulsory, and you can be certain that the big corporations won’t migrate their spreadsheets to a new Excel platform that lacks support for some kind of simple programming language – something to allow users to define their own functions and automate simple tasks, and allow developers to do more complex things and import extended functionality.

    Yes, there’s a move among the developers to build .net components and withdraw embedded VBA code from the workbook, placing it in add-ins or external classes called with a COM interface.

    But we’ve *always* been doing that in the big financial institutions: start with a spreadsheet written by some swaps trader, add a year’s worth of bloat and feature creep, discover that dealing rooms on three continents are dependant on it, then hand the heap over to IT… Who will first rationalise and debug the code, then ‘hollow out’ the workbook so that it becomes a traders’ scratchpad and presentation layer for the real calculations, with the heavy lifting done by a database engine for the data retrieval and by C++ components for the maths.

    So we come to the specific issue: zero-based arrays. Firstly, the pros have always mistrusted hardcoded assumptions about array bounds and always used the ‘For Lbound(arrX) to Ubound(arrX)’ construct, because you never really know what kind of array is going to come back from an Excel component.

  15. Secondly, If I really need to do it – say the code is full of nonzero-based arrays – I’ll just write a ShiftArray class with a default property that bhaves eaxactly like a nonzero vector or n-dimensional array; it’ll expose the standard ReDim and Erase methods, and hide the nonintuitive .Net array.

    That way, I preserve the close match between business logic and coded logic; that’s what VBA is *for*, the Basic family of programming languages is all about clarity and maintainability, and giving other developers the ability to understand your code at a glance. And if I really wanted the performance I’d have coded it up in another language anyway.

  16. Not sure I have a whole lot to add to this discussion…

    I almost felt an obligation to post ‘something’ for fear of the minute possibility that someone might think I was the Contrarian.

    I find that 1 based arrays are typically more useful (ie: Ranges), although I don’t usually dictate that with Option Base 1, rather I’ve had the idea drilled into me that the array boundaries should always be explicitly stated as in: Dim xArray(1 to 2, 1 to 4)

    …this has served me well. I would be surprised and disappointed if there was no way in the future to somehow explicitly dimension an array. That just strikes me as a step backwards?

    -AB

  17. So who said .Net was a step or two forward?

    Besides, (getting repetitive) if the goal is to have semantically meaningful indices into collections, there are better alternatives than arrays. If the need is for collections that allow implementations as close as possible to descriptions in mathematics texts, then base-1 should be the standard. Base-0 arrays are close to the hardware since most underlying implementations of array dereferencing involves offsetting from a base memory address with first item at offset 0.

    There were and are good reasons for C to have used 0-based arrays, and it’s fact that C# derives from C and was the first .Net language. It’s just unfortunate that this governed the architecture and basic syntax of .Net.

  18. >the ‘For Lbound(arrX) to Ubound(arrX)’ construct
    All very well and good – but it assumes we always want to handle every element ;o)

    >explicitly stated as in: Dim xArray(1 to 2, 1 to 4)
    Seems good to me and the way I ALWAYS do it anyway: zero-based (or even 1 based) arrays are ambiguous with so many silly conventions around.

    It would seem to me that an advanced compiler (meaning progress: the future) would be able to handle whatever the user threw at it. Enabling the user to model real world things (which might start with 6 and end with 23) without having to themselves deal with offsets and un-offsets. Surely that’s what higher level languages should be after? Internally – then the compiled .net code could be zero based or whatever (and thus totally consistent amongst the various .net languages).

    So does excel 12 start with row 0 ?


Posting code? Use <pre> tags for VBA and <code> tags for inline.

Leave a Reply

Your email address will not be published.