A quick tip – Working with arrays with unknown bounds

Two recent instances required transferring data from one array to another. One was a solution to speed up the use of COUNTIF for each element in a large range testing against another large range. In the process I had to build several building blocks including a replacement for the native Excel TRANSPOSE function, which unfortunately is limited to 65,536 elements.
COUNTIF for a large unsorted array and many searches

In the other I was illustrating how to build an “abstraction layer” between an Excel range and a VBA array (http://www.dailydoseofexcel.com/archives/2006/12/04/writing-to-a-range-using-vba/#comment-41826).

Transferring data from one array to another requires a loop to transfer data one element at a time. While one might be tempted to focus on an array assignment (ToArr = FromArr) or even the CopyMemory (RtlMoveMemory) API, neither would work for transposing an arbitrary array or for performing some mathematical operation while transferring the data from one array to the other.

It would be fairly easy to write a For loop if we knew that the bounds of both arrays were the same, which, in turn, is fairly easy to do in VB(A) since it supports the ability to declare both the lower bound and the upper bound of an array. However, in some cases, we might be given both the source and the destination array and not have the ability to change their respective dimensions, Moreover, other languages, including VB.Net do not support an explicit lower bound — it is always zero. Consequently, for the sake of generalization the code has to deal with any lower bound, and to minimize the pain of moving to .Net, I’ve started declaring all my arrays with a lower bound of zero.

Given the above, here’s what we have: An unknown lower bound of the source array, a possibly unknown lower bound for the destination array, and any array we declare will have a lower bound of zero. So, when we index these two source and destination arrays, we need two indices, one for each array. Further, since a For loop cannot have 2 control variables, we have to keep track of one of the indices by hand.

Option Explicit
Option Base 0
Function ArrLen(Arr, Optional ByVal aDim As Integer = 1)
    ArrLen = UBound(Arr, aDim) – LBound(Arr, aDim) + 1
    End Function
Sub xferArr(FromArr(), ByRef ToArr())
    ReDim ToArr(ArrLen(FromArr) – 1)
    Dim I As Long, J As Long: J = LBound(ToArr)
    For I = LBound(FromArr) To UBound(FromArr)
        ToArr(J) = FromArr(I)
        J = J + 1
        Next I
    End Sub

While the above should be easy to understand, I don’t like the fact that one has to keep track of the J index separately from the loop control variable (the I index). An alternative that uses a single index is below. Given the index to the source array it computes the destination index as follows. From the source index subtract the lower bound of the source array. This gives the index treating the source array as if it were zero-based, irrespective of its actual lower bound. This index would work with any array that were zero-based, including the destination array – if it were zero-based. To adjust to its actual lower bound, simply add the zero-based index to the array’s actual lower bound. This gives the actual index for the destination array.

Sub xferArr(FromArr(), ByRef ToArr())
    ReDim ToArr(ArrLen(FromArr) – 1)
    Dim I As Long
    For I = LBound(FromArr) To UBound(FromArr)
        ToArr(I – LBound(FromArr) + LBound(ToArr)) = FromArr(I)
        Next I
    End Sub

One might be tempted to argue that the 2 lower bounds are repeatedly computed inside the loop. I don’t know how expensive an operation that is but there are three ways to address the issue. One, don’t worry about it unless it proves to be a real bottleneck. Two, hope the compiler is an optimizing compiler and smart enough to recognize neither function changes inside the loop. Three, explicitly declare two variables, one for each lower bound and assign the values just before starting the loop.

Sub xferArr(FromArr(), ByRef ToArr())
    ReDim ToArr(ArrLen(FromArr) – 1)
    Dim LBF As Long, LBT As Long
    LBF = LBound(FromArr): LBT = LBound(ToArr)
    Dim I As Long
    For I = LBound(FromArr) To UBound(FromArr)
        ToArr(I – LBF + LBT) = FromArr(I)
        Next I
    End Sub

In a follow up link to the one referenced above on creating a data abstraction layer, I show how to use the above technique when dealing with three arrays. See http://www.dailydoseofexcel.com/archives/2006/12/04/writing-to-a-range-using-vba/#comment-41827

On a slightly different, though related, note I think Microsoft, like the designers of other modern languages, made a big mistake removing the ability to explicitly declare both bounds of an array dimension. There are many instances where this feature would lead to code that would be much more natural to understand. One simple example is working with ages. If one wanted to deal with, say, financial planning for retirement, it is a lot more natural to have an array X(65 to 80) rather than X(14) and then having to implicitly (or explicitly) map X(0) to age 65 and X(14) to age 80. One may also want negative indices when dealing with height (and depth) relative to, say, sea level or with temperatures that can go below zero. See also
On Arrays with non zero lower bounds etc
Posted by Gary Cornell (http://ablog.apress.com/?p=31)

Posted in Uncategorized

6 thoughts on “A quick tip – Working with arrays with unknown bounds

  1. Hi Tushar –

    Not sure I follow your last note. You can Dim array bounds to suit. From the Help for LBound and UBound, for example:

    Dim A(1 To 100, 0 To 3, -3 To 4)

    UBound(A, 1) = 100
    UBound(A, 2) = 3
    UBound(A, 3) = 4

    LBound(A, 1) = 1
    LBound(A, 2) = 0
    LBound(A, 3) = -3


  2. Michael,
    I think Tushar’s last note was in reference to some compiled languages such as VB.NET, which does not allow you to explicitely declare a lower bound – it must be zero. I also don’t understand why they added this “feature” to VB.NET, and it still causes me a few problems when I try to translate code from VBA to VB.NET.


  3. Declaring lower and upper bounds is a matter of personal choice. While it is more natural to you, it would be unnatural to me. In fact, it would upset me to find it.
    I think it was a good design choice to force zero-based arrays in .NET. The consistency brings it’s own benefits.

    While I’m here on my box, I also think Excel should also introduce row zero. *ducks*

Leave a Reply

Your email address will not be published. Required fields are marked *