In Adding Functionality – 2a I laid out the UI changes to support sorting of the results of Dick Kusleika’s VBE Find.

In this post, I develop the actual sort routine.

To recap, the task is to sort a 2D matrix on any number of columns. The sort columns are listed in a 1D array, with each entry in this array identifying one sort column. So, the sort routine’s signature is

End Sub

The routine uses what is called the Bubble Sort. It is fast for “small” matrices, and will be more than adequate for our needs.

The Bubble Sort essentially goes through the matrix row by row. For each row, it compares that row with every other row between itself and the end of the matrix. If necessary, it swaps the two rows. Consequently, once we are done with a particular row, we never need to reexamine it again. And, that’s why it’s called a Bubble Sort. The results “bubble” to their correct spot.

If ArrLen(DataArr) = 0 Or ArrLen(SortCols) = 0 Then Exit Sub

‘Swap the rows of DataArr as needed to perform an ascending sort

Dim I As Long, J As Long

For I = LBound(DataArr) To UBound(DataArr) – 1

For J = I + 1 To UBound(DataArr)

if row_I_key > row_J_key then swapRow DataArr, I, J

Next J

Next I

End Sub

That’s it. At its core, that’s the Bubble Sort. If we had a single key to sort on, we’d use the test `DataArr(I,key-column) > DataArr (J, key-column)`

. However, in this case, we have to cater to possibly multiple keys in the SortCols array. So, we write a slightly more complex comparison routine — and in the process also convert a number provided as a string to a numeric value.

To process multiple keys, we start with the ‘outermost’ key. If we can decide on the relative positions of the 2 records, we don’t have to check the next key. However, if the 2 key values are equal, we check the next key.

Option Base 0

Option Compare Text

Function ArrLen(Arr, Optional whatDim As Integer = 1)

ArrLen = UBound(Arr, whatDim) – LBound(Arr, whatDim) + 1

End Function

Sub swapRow(DataArr, R1 As Long, R2 As Long)

Dim J As Long

For J = LBound(DataArr, 2) To UBound(DataArr, 2)

Dim Temp

If TypeOf DataArr(R1, J) Is Object Then

Set Temp = DataArr(R1, J)

Set DataArr(R1, J) = DataArr(R2, J)

Set DataArr(R2, J) = Temp

Else

Temp = DataArr(R1, J): DataArr(R1, J) = DataArr(R2, J): DataArr(R2, J) = Temp

End If

Next J

End Sub

Sub doSort(DataArr(), SortCols() As Integer)

If ArrLen(DataArr) = 0 Or ArrLen(SortCols) = 0 Then Exit Sub

‘Swap the rows of DataArr as needed to perform an ascending sort

Dim I As Long, J As Long

For I = LBound(DataArr) To UBound(DataArr) – 1

For J = I + 1 To UBound(DataArr)

Dim SortColIdx As Integer

SortColIdx = LBound(SortCols)

Dim PairDone As Boolean: PairDone = False

Do

Dim X1, X2, SortCol As Long

SortCol = SortCols(SortColIdx)

X1 = DataArr(I, SortCol): X2 = DataArr(J, SortCol)

If IsNumeric(X1) And IsNumeric(X2) Then X1 = CDbl(X1): X2 = CDbl(X2)

If X1 > X2 Then

swapRow DataArr, I, J

PairDone = True

ElseIf X1 < X2 Then

PairDone = True

Else

SortColIdx = SortColIdx + 1

End If

Loop Until PairDone Or SortColIdx > UBound(SortCols)

Next J

Next I

End Sub

The algorithm I eventually implemented uses an ‘index array’ to track the final sequence of the rows in the data matrix. Use of an index array means that each swap involves just 1 element rather than every column in the data matrix.

Here’s where one would see a benefit. Suppose we have 3 records with the keys 33,22, and 11, respectively. Then, the final, and desired, sequence would be the records with keys 11,22, and 33.

With a Bubble Sort we would get the following swaps: (33,22,11) -> (22,33,11) -> (22,11,33) -> (11,22,33). So, we have to swap the rows 4 times, with each swap requiring an exchange of all the columns in the 2 rows, one at a time.

Alternatively, we could build an array with the record numbers in it, i.e., (1,2,3). Now, we change only the contents of this array leaving the original array alone. So, the four swaps would yield the index array (1,2,3) -> (2,1,3) -> (2,3,1) -> (3,2,1). This gives us the sequence of the final output, i.e., row 3, row 2, and finally row 1. So, we would have 4 swaps with the index array and 3 sequence moves for the data matrix.

[…] This post was mentioned on Twitter by gorocube, Excel MVP. Excel MVP said: Adding Functionality – 2b http://goo.gl/fb/0ZnnB […]

I took a different approach to sort a 2D-array on several ‘fields/columns’

I have a general sorting routine ‘function sorteer_000’ that sorts a 1-dimensional array.

If I want to sort a 2D-arrray I transform the 2-D array to a 1-dimensional array using ‘function sorteer_001’.

In that routine I add the sorting datafields to the beginning of each item.

Then I pass the resulting 1-dimensional array to the sorting routine ‘function sorteer_000’

The resulting sorted 1-dimensional array will be retransformed to the original 2_D array.

Example:

the first row of the array consists of one|blue|88 (| as fieldseparator)

if the array has to be sorted on 1. the 3rd field, 2 the first field; I transform this row to :

88one_one|blue|88

when the sorting routine is finished the stringpart after the underscore will be put into the 2D_array.

The start_macro

sr = ActiveWorkbook.Sheets(1).Cells(1).CurrentRegion

sn = Array(3, 1)

ActiveWorkbook.Sheets(1).Cells(1).Offset(,5).Resize(UBound(sr), UBound(sr, 2)) = F_sorteer_001(sp, sr)

End Sub

The transforming 1D to 2D; 2D to 1D array

For J = LBound(sq) To UBound(sq)

c00 = “”

For jj = 0 To UBound(sn)

c00 = c00 & sq(J, sn(jj))

Next

For jj = LBound(sq) To UBound(sq, 2)

c01 = c01 & IIf(jj = LBound(sq), c00 & “_”, “~”) & sq(J, jj) & IIf(jj = UBound(sq, 2), vbCr, “”)

Next

Next

sn = F_sorteer_000(Filter(Split(c01, vbCr), “_”))

For J = 0 To UBound(sn)

sr = Split(Split(sn(J), “_”)(1), “~”)

For jj = 0 To UBound(sr)

sq(J + LBound(sq), jj + LBound(sq)) = sr(jj)

Next

Next

F_sorteer_001 = sq

End Function

The sorting function

sn = Split(sq(IIf(sq(0) < sq(1), 0, 1)) & “|” & sq(IIf(sq(0) < sq(1), 1, 0)), “|”)

For J = 2 To UBound(sq)

If Not (sq(J) < sn(0) Or sq(J) > sn(UBound(sn))) Then

y = 0

If UBound(sn) > 10 Then y = IIf(sq(J) > sn(UBound(sn) 2 + 1), UBound(sn) 2, 0)

For jj = y To UBound(sn) – 1

If (sq(J) > sn(jj) And sq(J) < sn(jj + 1)) Then

sn(jj) = sn(jj) & “|” & sq(J)

Exit For

End If

Next

End If

sn = Split(IIf(sq(J) < sn(0), sq(J) & “|”, “”) & Join(sn, “|”) & IIf(sq(J) > sn(UBound(sn)), “|” & sq(J), “”), “|”)

Next

F_sorteer_000 = sn

End Function

As far as I can see this sorting method needs much less loops than the ‘bubble-sort’ method.

1) You have an interesting way of writing compact code, Hans. Compared to more obvious ways of doing the same, whether it is more understandable is debatable. As far as machine-efficiency goes, I doubt the compact style is better. Take this simple example:

It requires 6 array index operations, 1 string comparison in each of 2 IIf functions, 3 concatenations, and 1 string-to-array conversion creating an array in a variant. An optimizing compiler might reduce this to 4 array index operations, 1 string comparison with the result stored in a temporary variable, 2 IIf functions using this temporary variable, 3 concatenations, and 1 string-to-array conversion creating an array in a variant. [I might be wrong but I don’t see any way the compiler could do any better.]

By contrast, the more obvious and IMO eminently more readable

requires 4 array index operations, 1 string comparison, and 2 array indexed assignments. An optimizing compiler will do even better! It also has a lot lower probability of having a bug in the code.

2) Sorting on concatenated fields of variable length values of variable data types is *guaranteed* to fail.

Hans: Have you ever done any tests on how much time your sort takes versus a more conventionally coded bubble sort?