Excel VBA Masterclass

Posted on April 27, 2020 by jkpieterse

Hi there. Hope you and yours are well in these difficult times!

This is just a very short announcement that I’ll be doing an on-line version of my Excel VBA Masterclass.

The training is scheduled for May 18, 20, 26, 28, June 2, 4 and I’ll be using Microsoft Teams to deliver it to your homes!

Regards,

Jan Karel Pieterse

Retrieving Data from Add-in Worksheets

Posted on December 9, 2015December 8, 2015 by Dick Kusleika

Add-ins have worksheets. You just can’t see them. But you can store information on them and it’s a good way to store settings, preferences, and other data. When you want to get to that data, you can go to the Properties for ThisWorkbook and change the IsAddin property to False. Now you can see the worksheets and change the data if necessary.

When you’re done, go back to the VBE and change the IsAddin property back to True before you save your changes. Don’t forget that part; it’s important.

I have a list of vendor codes stored on a worksheet in an addin. I need to see the list, but not change it. I didn’t want to go through all the IsAddin rigmarole, so I did this in the Immediate Window.

wshvendors.ListObjects(1).ListColumns(1).DataBodyRange.value

1	wshvendors.ListObjects(1).ListColumns(1).DataBodyRange.value

That part returns a two-dimensional array of all the values in the first column

application.transpose(...)

1	application.transpose(...)

That turns a two-dimensional array into a one-dimensional array.

?join(...,",")

1	?join(...,",")

That turns an array into a string with commas between the values. In retrospect, I should have used

?join(application.transpose(wshvendors.ListObjects(1).ListColumns(1).DataBodyRange.value),vbnewline)

1	?join(application.transpose(wshvendors.ListObjects(1).ListColumns(1).DataBodyRange.value),vbnewline)

to get each code on its own line. Here’s a way to put it into a range, if that’s where you’re going with it anyway.

wshvendors.ListObjects(1).ListColumns(1).DataBodyRange.Copy workbooks.Add.Worksheets(1).cells(1,1)

1	wshvendors.ListObjects(1).ListColumns(1).DataBodyRange.Copy workbooks.Add.Worksheets(1).cells(1,1)

MaxMinFair Rewrite

Posted on September 18, 2015 by Dick Kusleika

I read Charles William’s MaxMinFair algorithm and I didn’t like his approach. That’s typical. I’ll read somebody’s code and think “They’re making that too hard”. Then I’ll set about rewriting it. In this case, as in most cases, it turns out that it is that hard, but I wasn’t going to convince myself until I tried it. I ended up with a different approach that’s not shorter, not easier to read, and not easier to follow. Oh well, here it is anyway.

Function MaxMinFairDK(Supply As Double, Demands As Variant) As Variant
    
    Dim dPrior As Double
    Dim vaReturn As Variant
    Dim dAvailable As Double
    Dim i As Long, j As Long
    Dim dTemp As Double
    Dim wf As WorksheetFunction
    
    On Error GoTo ErrHandler
    Set wf = Application.WorksheetFunction
    If IsObject(Demands) Then Demands = Demands.Value2 'make range array
    dAvailable = Abs(Supply) 'ignore negative supplies
    
    If Not IsArray(Demands) Then
        'One demand = min of supply or demand
        MaxMinFairDK = Array(dAvailable, Demands)(Abs(dAvailable > Demands))
    Else
        'Excel returns NA when you use too many columns
        If UBound(Demands, 2) > 1 Then Err.Raise xlErrNA
        'Assume everybody gets everything they want
        ReDim vaReturn(LBound(Demands, 1) To UBound(Demands, 1), 1 To 1)
        vaReturn = Demands
        
        For i = UBound(Demands, 1) To LBound(Demands, 1) Step -1
            'If there's enough, do nothing except reduce what's available
            If dAvailable / i > (wf.Large(Demands, i) - dPrior) Then
                dAvailable = dAvailable - ((wf.Large(Demands, i) - dPrior) * i)
                dPrior = wf.Large(Demands, i)
            Else
                'Once there's not enough, everyone splits what's left
                For j = LBound(Demands, 1) To UBound(Demands, 1)
                    If Demands(j, 1) > dPrior Then
                        vaReturn(j, 1) = dPrior + (dAvailable / i)
                    End If
                Next j
                Exit For
            End If
        Next i
        
        MaxMinFairDK = vaReturn
    End If
    
ErrExit:
    Exit Function
    
ErrHandler:
    MaxMinFairDK = CVErr(Err.Number)
    Resume ErrExit
    
End Function

Function MaxMinFairDK(Supply As Double, Demands As Variant) As Variant

Dim dPrior As Double

Dim vaReturn As Variant

Dim dAvailable As Double

Dim i As Long, j As Long

Dim dTemp As Double

Dim wf As WorksheetFunction

On Error GoTo ErrHandler

Set wf = Application.WorksheetFunction

If IsObject(Demands) Then Demands = Demands.Value2 'make range array

dAvailable = Abs(Supply) 'ignore negative supplies

If Not IsArray(Demands) Then

'One demand = min of supply or demand

MaxMinFairDK = Array(dAvailable, Demands)(Abs(dAvailable > Demands))

Else

'Excel returns NA when you use too many columns

If UBound(Demands, 2) > 1 Then Err.Raise xlErrNA

'Assume everybody gets everything they want

ReDim vaReturn(LBound(Demands, 1) To UBound(Demands, 1), 1 To 1)

vaReturn = Demands

For i = UBound(Demands, 1) To LBound(Demands, 1) Step -1

'If there's enough, do nothing except reduce what's available

If dAvailable / i > (wf.Large(Demands, i) - dPrior) Then

dAvailable = dAvailable - ((wf.Large(Demands, i) - dPrior) * i)

dPrior = wf.Large(Demands, i)

Else

'Once there's not enough, everyone splits what's left

For j = LBound(Demands, 1) To UBound(Demands, 1)

If Demands(j, 1) > dPrior Then

vaReturn(j, 1) = dPrior + (dAvailable / i)

End If

Next j

Exit For

End If

Next i

MaxMinFairDK = vaReturn

End If

ErrExit:

Exit Function

ErrHandler:

MaxMinFairDK = CVErr(Err.Number)

Resume ErrExit

End Function

In Charles’s implementation, he allocates an equal amount of the supply to each node, then takes back what that node didn’t need and puts it back in the available pool. When I was looking at the results, I was thinking that the smallest n nodes simply get their demand and only when there’s not enough to go around do we need to do something different than allocate the full demand.

In my implementation, I start by giving everyone what they demand. Then I start with the smallest demand, and if I can accommodate that amount for everyone, I just reduce the amount available and move to the second smallest demand. At some point (the sixth smallest demand in Charles’s data) I can’t meet that demand and still give everyone an equal share. At that point, I give anyone who hasn’t had their demand met an equal amount – the amount that’s already been distributed plus an equal share of what’s left.

Rank	Demand	Incremental Demand	Allocated	Remaining
				18.30
7	0.70	0.70	4.90	13.40
6	1.00	0.30	1.80	11.60
5	1.30	0.30	1.50	10.10
4	2.00	0.70	2.80	7.30
3	3.50	1.50	4.50	2.80
2	7.40	3.90	7.80	(5.00)
1	10.00	2.60	2.60	(7.60)

In the first iteration, I hand out 0.70 to everyone because I have enough supply to do that. In the second iteration, I had out the differential, 0.30, to everyone who’s left because I have enough supply remaining. When I get to #2, I can’t hand out 3.90 to the remaining two nodes because I don’t have enough supply. I’ve allocated up to 3.5 to anyone who’s demanded it, so the last two get the 3.5 plus half of the 2.8 that remains.

Although I didn’t accomplish anything, it was still a fun exercise.

Joining Two Dimensional Arrays

Posted on January 28, 2015January 27, 2015 by Dick Kusleika

The Join function takes an array and smushes it together into a String. I love the Join function. The only thing I don’t like about it is when I forget that it doesn’t work on 2d arrays. Join only works with 1-dimensional arrays. The last time my memory failed me, I decided to write my own. And here it is.

Public Function Join2D(ByVal vArray As Variant, Optional ByVal sWordDelim As String = " ", Optional ByVal sLineDelim As String = vbNewLine) As String
    
    Dim i As Long, j As Long
    Dim aReturn() As String
    Dim aLine() As String
    
    ReDim aReturn(LBound(vArray, 1) To UBound(vArray, 1))
    ReDim aLine(LBound(vArray, 2) To UBound(vArray, 2))
    
    For i = LBound(vArray, 1) To UBound(vArray, 1)
        For j = LBound(vArray, 2) To UBound(vArray, 2)
            'Put the current line into a 1d array
            aLine(j) = vArray(i, j)
        Next j
        'Join the current line into a 1d array
        aReturn(i) = Join(aLine, sWordDelim)
    Next i
    
    Join2D = Join(aReturn, sLineDelim)
    
End Function

Public Function Join2D(ByVal vArray As Variant, Optional ByVal sWordDelim As String = " ", Optional ByVal sLineDelim As String = vbNewLine) As String

Dim i As Long, j As Long

Dim aReturn() As String

Dim aLine() As String

ReDim aReturn(LBound(vArray, 1) To UBound(vArray, 1))

ReDim aLine(LBound(vArray, 2) To UBound(vArray, 2))

For i = LBound(vArray, 1) To UBound(vArray, 1)

For j = LBound(vArray, 2) To UBound(vArray, 2)

'Put the current line into a 1d array

aLine(j) = vArray(i, j)

Next j

'Join the current line into a 1d array

aReturn(i) = Join(aLine, sWordDelim)

Next i

Join2D = Join(aReturn, sLineDelim)

End Function

It’s pretty simple. It loops through the first dimension (the row dimension) and joins each line with sLineDelim. Inside that loop, it joins each element in the second dimension with sWordDelim. What this function doesn’t do is automatically insert itself into only the projects I want. That requires me to remember that I wrote it and where I put it. In reality, I’ll probably reinvent the wheel the next time I need it.

Here’s my extensive testing procedure.

Sub TEST_Join2d()
    
    Dim a(1 To 2, 1 To 2) As String
    
    a(1, 1) = "The"
    a(1, 2) = "Quick"
    a(2, 1) = "Brown"
    a(2, 2) = "Fox"
    
    Debug.Print Join2D(a)
    Debug.Print
    Debug.Print Join2D(a, ",")
    Debug.Print
    Debug.Print Join2D(a, , "|")
    Debug.Print
    Debug.Print Join2D(a, ";", "||")
    
End Sub

Sub TEST_Join2d()

Dim a(1 To 2, 1 To 2) As String

a(1, 1) = "The"

a(1, 2) = "Quick"

a(2, 1) = "Brown"

a(2, 2) = "Fox"

Debug.Print Join2D(a)

Debug.Print

Debug.Print Join2D(a, ",")

Debug.Print

Debug.Print Join2D(a, , "|")

Debug.Print

Debug.Print Join2D(a, ";", "||")

End Sub

Sending Images via WinSCP

Posted on February 11, 2014February 9, 2014 by Dick Kusleika

Since my recent move to Digital Ocean for hosting, I’ve had to make a change to how I upload images for this blog. I used to create an FTP file and a batch file, but as far as I know that doesn’t support SFTP. I’m using WinSCP to transfer files manually and learned that it has a command line interface. I made a procedure called SendViaSCP to replace my SendViaFTP.

Public Sub SendViaSCP(vFname As Variant)


    Dim aScript() As String

    Dim i As Long
    ReDim aScript(1 To 4 + UBound(vFname))
    aScript(1) = "option batch abort"

    aScript(2) = "option confirm off"

    aScript(3) = "open sftp://username:password@000.000.000.000"

    aScript(UBound(aScript)) = "exit"
    For i = LBound(vFname) To UBound(vFname)

        aScript(3 + i) = "put " & Dir(vFname(i)) & " /home/wordpress/public_html/blogpix/"

    Next i
    Open "winscpup.txt" For Output As #1

    Print #1, Join(aScript, vbNewLine)

    Close #1
    Shell "winscpup.bat"

End Sub

The vFname arguments is a variant array that holds all of the files I selected from Application.GetOpenFileName. The aScript array holds three lines of setup, a line for each file, and an exit line.

The commands are joined together and written to a batch file and the batch file is run. It doesn’t solve the problem that Billkamm and Haines solved of having your username and password in a batch file, but I can live with it.

You might be wondering why I don’t just use the file upload functions in WordPress. What fun would that be?

Sparkline Gauge

Posted on December 7, 2012March 20, 2018 by Dick Kusleika

I have a list of labels and a list of values. I also have a value from somewhere in the range of my values list.

I want to show where the value falls in the list of values with a red mark. The labels need to be proportional to their values. An XY chart seemed like the obvious answer, but would require the XY Chart Label add-in or some better charting skills on my part. Also, there may be a few of these on a sheet, so I wanted to keep it more light-weight. Sparklines move nicely with cells, so I tried that route.

First, I needed to get the labels spread proportionally across a cell. I made the cell 20 characters wide by typing '00000000000000000000, setting the font to Courier New, and resizing the column. Cell C6 shows the positions. To get the proper spread of labels, I used a UDF, shown below.

Public Function PropAxis(rLabels As Range, rValues As Range) As String
    
    Dim vaLabels As Variant
    Dim vaValues As Variant
    Dim i As Long, j As Long
    Dim dSpan As Double
    Dim dIncrement As Double
    Dim lPosition As Long
    Dim aReturn(1 To 20) As String
    
    'Put ranges in an array
    vaLabels = rLabels.Value
    vaValues = rValues.Value
    
    'Find the span of the values range
    dSpan = vaValues(UBound(vaValues, 1), 1) - vaValues(LBound(vaValues, 1), 1)
    
    'initialize the array with spaces
    For i = LBound(aReturn) To UBound(aReturn)
        aReturn(i) = Space(1)
    Next i
    
    'put the first and last labels in the array
    aReturn(1) = vaLabels(1, 1)
    aReturn(UBound(aReturn)) = vaLabels(UBound(vaLabels, 1), 1)
    
    'Put the middle labels proportionally
    For i = LBound(vaLabels) + 1 To UBound(vaLabels) - 1
        dIncrement = (vaValues(i, 1) - vaValues(1, 1)) / dSpan
        dIncrement = dIncrement * (UBound(aReturn) - 1)
        lPosition = Round(dIncrement, 0)
        
        'If they're too close, just move one over
        If aReturn(lPosition) <> Space(1) And lPosition < UBound(aReturn) Then
            lPosition = lPosition + 1
        End If
        
        aReturn(lPosition) = vaLabels(i, 1)
    Next i
    
    PropAxis = Join(aReturn, vbNullString)
    
End Function

Public Function PropAxis(rLabels As Range, rValues As Range) As String

Dim vaLabels As Variant

Dim vaValues As Variant

Dim i As Long, j As Long

Dim dSpan As Double

Dim dIncrement As Double

Dim lPosition As Long

Dim aReturn(1 To 20) As String

'Put ranges in an array

vaLabels = rLabels.Value

vaValues = rValues.Value

'Find the span of the values range

dSpan = vaValues(UBound(vaValues, 1), 1) - vaValues(LBound(vaValues, 1), 1)

'initialize the array with spaces

For i = LBound(aReturn) To UBound(aReturn)

aReturn(i) = Space(1)

Next i

'put the first and last labels in the array

aReturn(1) = vaLabels(1, 1)

aReturn(UBound(aReturn)) = vaLabels(UBound(vaLabels, 1), 1)

'Put the middle labels proportionally

For i = LBound(vaLabels) + 1 To UBound(vaLabels) - 1

dIncrement = (vaValues(i, 1) - vaValues(1, 1)) / dSpan

dIncrement = dIncrement * (UBound(aReturn) - 1)

lPosition = Round(dIncrement, 0)

'If they're too close, just move one over

If aReturn(lPosition) <> Space(1) And lPosition < UBound(aReturn) Then

lPosition = lPosition + 1

End If

aReturn(lPosition) = vaLabels(i, 1)

Next i

PropAxis = Join(aReturn, vbNullString)

End Function

The values still have to be reasonably spread out or this will overwrite some of them. But it worked for my purposes. The formula =PropAxis(F3:F8,G3:G8) is in C3.

Next I set up the range I2:AB3 to feed the sparkline. Row 2 represents the 20 characters in the cell. Row three should be all zeros except one, which will be a 1. Then a win/loss sparkline will show the one ‘win’ as a mark.

In H2, I calculate which character gets the win. Here’s how that formula progressed:

=ROUND((G2-G3)/(G8-G3)*20,0)

That figures where the value is in the list of values as a percentage and multiplies by the number of characters in the cell (20). I wanted to protect against a value that was not in the range, so I modified it to this:

=MIN(MAX(ROUND((G2-G3)/(G8-G3)*20,0),1),20)

Now it will never be less than 1 or greater than 20. Because of rounding, the mark may be one character off of an exact match. That’s probably not a problem – it’s obviously not supposed to be hyper-accurate since I’m only using 20 characters – but it just doesn’t look right. So I handle exact matches as special cases.

=IF(ISNA(MATCH(G2,G3:G8,FALSE)),MIN(MAX(ROUND((G2-G3)/(G8-G3)*20,0),1),20),FIND(INDEX(F3:F8,MATCH(G2,G3:G8,FALSE),1),C3))

If there isn’t an exact match, find the percentage and get close. If it is an exact match, find the character’s position in the string and put the mark there. Now in Row 3 of my sparkline data range, I use this number to find the ‘win’.

=IF($H$2=I2,1,0)

I thought this whole process would be easier with sparklines. But it’s not really any more light-weight than just a chart. Anybody else want to take a crack at it?

You can download SparklineProportion.zip

Retrieving Lost Comments

Posted on November 5, 2012March 20, 2018 by Dick Kusleika

I’ve restored a few posts in the last few months that were lost. I didn’t restore any of the comments. Honestly, I should have but I didn’t even think about it. But when I went to restore the In Cell Charting post, I noticed there were 85 comments. That seemed worth my while.

First I set a reference to Microsoft XML, v6.0 and Microsoft HTML Object Library. Here’s the main procedure.

Public Sub CreateCommentSQL()
    
    Dim xHttp As MSXML2.XMLHTTP
    Dim hDoc As MSHTML.HTMLDocument
    Dim hOLComments As MSHTML.HTMLOListElement
    Dim hLIComments As MSHTML.IHTMLElementCollection
    Dim hLIComment As MSHTML.HTMLLIElement
    Dim clsComments As CComments
    Dim clsComment As CComment
    
    'Go get the lost comments from the wayback machine
    Set xHttp = New MSXML2.XMLHTTP
    xHttp.Open "GET", "http://web.archive.org/web/20100418043617/http://www.dailydoseofexcel.com/archives/2006/02/05/in-cell-charting/"
    xHttp.send
    
    'Wait until the page loads
    Do: DoEvents: Loop Until xHttp.readyState = 4
    
    Set clsComments = New CComments
    
    'Load the document
    Set hDoc = New MSHTML.HTMLDocument
    hDoc.body.innerHTML = xHttp.responseText
    
    'The ordered list has an id of "comments"
    Set hOLComments = hDoc.getElementsByName("comments")(0)
    'Get all the listindex elements
    Set hLIComments = hOLComments.getElementsByTagName("li")
    
    For Each hLIComment In hLIComments
        Set clsComment = New CComment
        With clsComment
            .AddNameFromCite hLIComment.getElementsByTagName("cite")(0)
            .AddDate hLIComment.getElementsByClassName("comment-meta commentmetadata")(0)
            .AddContent hLIComment.getElementsByTagName("p")
        End With
        clsComments.Add clsComment
    Next hLIComment
    
    clsComments.CreateSQLFile
    
End Sub

Public Sub CreateCommentSQL()

Dim xHttp As MSXML2.XMLHTTP

Dim hDoc As MSHTML.HTMLDocument

Dim hOLComments As MSHTML.HTMLOListElement

Dim hLIComments As MSHTML.IHTMLElementCollection

Dim hLIComment As MSHTML.HTMLLIElement

Dim clsComments As CComments

Dim clsComment As CComment

'Go get the lost comments from the wayback machine

Set xHttp = New MSXML2.XMLHTTP

xHttp.Open "GET", "http://web.archive.org/web/20100418043617/http://www.dailydoseofexcel.com/archives/2006/02/05/in-cell-charting/"

xHttp.send

'Wait until the page loads

Do: DoEvents: Loop Until xHttp.readyState = 4

Set clsComments = New CComments

'Load the document

Set hDoc = New MSHTML.HTMLDocument

hDoc.body.innerHTML = xHttp.responseText

'The ordered list has an id of "comments"

Set hOLComments = hDoc.getElementsByName("comments")(0)

'Get all the listindex elements

Set hLIComments = hOLComments.getElementsByTagName("li")

For Each hLIComment In hLIComments

Set clsComment = New CComment

With clsComment

.AddNameFromCite hLIComment.getElementsByTagName("cite")(0)

.AddDate hLIComment.getElementsByClassName("comment-meta commentmetadata")(0)

.AddContent hLIComment.getElementsByTagName("p")

End With

clsComments.Add clsComment

Next hLIComment

clsComments.CreateSQLFile

End Sub

I looked at the source for the web page to figure out how it was laid out and how to get at the data I needed. The CComment and CComments classes store the data as I loop through the list index items in the comment list. The first CComment method is AddNameFromCite. I didn’t even know there was a Cite tag in HTML (but you could fill a warehouse with what I don’t know about HTML).

Public Sub AddNameFromCite(ByVal hCite As MSHTML.HTMLPhraseElement)
    
    Dim hAnchor As MSHTML.HTMLAnchorElement
    
    Me.Author = hCite.innerText
    
    Set hAnchor = hCite.getElementsByTagName("a")(0)
    
    If Not hAnchor Is Nothing Then
        Me.AuthorLink = Mid$(hAnchor.href, InStr(2, hAnchor.href, "http://"), Len(hAnchor.href))
    End If
    
End Sub

Public Sub AddNameFromCite(ByVal hCite As MSHTML.HTMLPhraseElement)

Dim hAnchor As MSHTML.HTMLAnchorElement

Me.Author = hCite.innerText

Set hAnchor = hCite.getElementsByTagName("a")(0)

If Not hAnchor Is Nothing Then

Me.AuthorLink = Mid$(hAnchor.href, InStr(2, hAnchor.href, "http://"), Len(hAnchor.href))

End If

End Sub

I made this a method because I generally reserve properties to getting/setting values. If I change more than one property or do any extensive manipulation, I go with a method instead of a property. I’m not uber-consistent about it though. The comment author’s name is the innertext of the PhraseElement (that’s what a Cite is, at least according to the TypeName function). To get the AuthorLink, I need to find the anchor and get the href attribute. Because the wayback machine put its own URL in from of other URLs, I had to find the second instance of “http://” to get the real link. Next the AddDate method.

Public Sub AddDate(ByVal hDiv As MSHTML.HTMLDivElement)
    
    Dim sDate As String
    Dim vaDate As Variant
    
    sDate = hDiv.innerText
    vaDate = Split(sDate, " at ")
    
    Me.CommentDate = DateValue(vaDate(0)) + TimeValue(vaDate(1))
    
End Sub

Public Sub AddDate(ByVal hDiv As MSHTML.HTMLDivElement)

Dim sDate As String

Dim vaDate As Variant

sDate = hDiv.innerText

vaDate = Split(sDate, " at ")

Me.CommentDate = DateValue(vaDate(0)) + TimeValue(vaDate(1))

End Sub

This really should have been a property instead of a method, but oh well. The innertext of the DivElement is something like “January 1, 2010 at 6:16 am”. I split that string on the “at” and used DateValue and TimeValue to build a date. Finally the content of the comment.

Public Sub AddContent(ByVal hParas As MSHTML.IHTMLElementCollection)
    
    Dim hPara As MSHTML.HTMLParaElement
    Dim sContent As String
    
    For Each hPara In hParas
        sContent = sContent & hPara.innerText
    Next hPara
    
    Me.Content = sContent
    
End Sub

Public Sub AddContent(ByVal hParas As MSHTML.IHTMLElementCollection)

Dim hPara As MSHTML.HTMLParaElement

Dim sContent As String

For Each hPara In hParas

sContent = sContent & hPara.innerText

Next hPara

Me.Content = sContent

End Sub

I passed in a collection of elements that are ParaElements (tag=p=paragraph). Then I looped through them and concatenated a string for the content. By looping through just the p elements, I skip all the comment meta crap that is auto-generated by WordPress and just get to the text.

At this point I have 85 CComment objects and I’m ready to build the SQL string.

Public Sub CreateSQLFile()
    
    Dim sFile As String, lFile As Long
    Dim clsComment As CComment
    Dim sSql As String
    Dim aSql() As String
    Dim lCnt As Long
    
    ReDim aSql(1 To Me.Count)
    
    'Build the first part of the sql string with the column names
    sSql = "INSERT INTO `wp_comments` (`comment_post_ID`, `comment_author`, `comment_author_email`, `comment_author_url`,"
    sSql = sSql & Space(1) & "`comment_author_IP`, `comment_date`, `comment_date_gmt`, `comment_content`, `comment_karma`,"
    sSql = sSql & Space(1) & "`comment_approved`, `comment_agent`, `comment_type`, `comment_parent`, `user_id`) VALUES" & vbNewLine
    
    'Put all the comment values in an array
    For Each clsComment In Me
        lCnt = lCnt + 1
        aSql(lCnt) = clsComment.SQLInsert
    Next clsComment
    
    'put it all together
    sSql = sSql & Join(aSql, ", " & vbNewLine) & ";"
    
    'write it to a sql file
    sFile = ThisWorkbook.Path & Application.PathSeparator & "wp_incellcomments.sql"
    lFile = FreeFile
    
    Open sFile For Output As lFile
    Print #lFile, sSql
    
    Close lFile
    
End Sub

Public Sub CreateSQLFile()

Dim sFile As String, lFile As Long

Dim clsComment As CComment

Dim sSql As String

Dim aSql() As String

Dim lCnt As Long

ReDim aSql(1 To Me.Count)

'Build the first part of the sql string with the column names

sSql = "INSERT INTO `wp_comments` (`comment_post_ID`, `comment_author`, `comment_author_email`, `comment_author_url`,"

sSql = sSql & Space(1) & "`comment_author_IP`, `comment_date`, `comment_date_gmt`, `comment_content`, `comment_karma`,"

sSql = sSql & Space(1) & "`comment_approved`, `comment_agent`, `comment_type`, `comment_parent`, `user_id`) VALUES" & vbNewLine

'Put all the comment values in an array

For Each clsComment In Me

lCnt = lCnt + 1

aSql(lCnt) = clsComment.SQLInsert

Next clsComment

'put it all together

sSql = sSql & Join(aSql, ", " & vbNewLine) & ";"

'write it to a sql file

sFile = ThisWorkbook.Path & Application.PathSeparator & "wp_incellcomments.sql"

lFile = FreeFile

Open sFile For Output As lFile

Print #lFile, sSql

Close lFile

End Sub

Just a bunch string building and putting in a file that I can import into PHPMyAdmin. In the CComment class, the values are put together like this

Public Property Get SQLInsert() As String
    
    Dim sReturn As String
    Dim aReturn(1 To 14) As Variant
    
    Const sNOVALUE As String = "''"
    Const sSQ As String = "'"
    
    aReturn(1) = "7534"
    aReturn(2) = sSQ & EscSq(Me.Author) & sSQ
    aReturn(3) = sNOVALUE
    aReturn(4) = sSQ & EscSq(Me.AuthorLink) & sSQ
    aReturn(5) = sNOVALUE
    aReturn(6) = sSQ & Format(Me.CommentDate, "yyyy-mm-dd hh:mm:ss") & sSQ
    aReturn(7) = aReturn(6)
    aReturn(8) = sSQ & EscSq(Me.ContentScrubbed) & sSQ
    aReturn(9) = 0
    aReturn(10) = sSQ & "1" & sSQ
    aReturn(11) = sNOVALUE
    aReturn(12) = sNOVALUE
    aReturn(13) = 0
    aReturn(14) = 0
    
    sReturn = "(" & Join(aReturn, ", ") & ")"
    
    SQLInsert = sReturn
    
End Property

Public Property Get SQLInsert() As String

Dim sReturn As String

Dim aReturn(1 To 14) As Variant

Const sNOVALUE As String = "''"

Const sSQ As String = "'"

aReturn(1) = "7534"

aReturn(2) = sSQ & EscSq(Me.Author) & sSQ

aReturn(3) = sNOVALUE

aReturn(4) = sSQ & EscSq(Me.AuthorLink) & sSQ

aReturn(5) = sNOVALUE

aReturn(6) = sSQ & Format(Me.CommentDate, "yyyy-mm-dd hh:mm:ss") & sSQ

aReturn(7) = aReturn(6)

aReturn(8) = sSQ & EscSq(Me.ContentScrubbed) & sSQ

aReturn(9) = 0

aReturn(10) = sSQ & "1" & sSQ

aReturn(11) = sNOVALUE

aReturn(12) = sNOVALUE

aReturn(13) = 0

aReturn(14) = 0

sReturn = "(" & Join(aReturn, ", ") & ")"

SQLInsert = sReturn

End Property

I really like this method of building a string – putting it into an array and using Join – so I think I’ll start using it. The EscSq function turns any single quotes into two single quotes. The ContentScrubbed property converts any vbNewLines into \r\n. I exported some existing comments from MySQL to see how all this stuff went together. In the end, I ended up with a file that looks like this.

phpMyAdmin kept erroring out that the file was using too much memory. It’s 51kb, so I knew that wasn’t true. But the helpful people at HostGator imported it for me and set me up with console access so I can do it myself next time. I just need to learn the commandline stuff for importing.

I took a quick look through through the comments and they look alright. It’s hard to tell what I screwed up formatting-wise because some people use code tags and most don’t. But the info appears to be there and that’s the most important thing. I guess since I have this set up, I should go back and make sure any other lost posts get their comments too.

As always, if you see something that’s not right on the site, shoot me an email. I have a few hundred posts that still look like crap, but are readable and I’m fixing them as I see them.

String Diffing

Posted on September 13, 2012March 25, 2018 by Dick Kusleika

I’ve wanted to have some wiki-like diffing in my userform textboxes for a while now. Since I’ve been using wikis almost daily, I want the revisioning feature in everything I do. I’m not there yet, but I decided to see what kind of algorithm I would need to do it. I read the Wikipedia article on longest common subsequence and played around with it a little.

Public Function LCSTable(ByRef aOriginal() As String, ByRef aRevised() As String) As Variant
    
    Dim aReturn() As Long
    Dim i As Long, j As Long
    
    'aOriginal and aRevised should be 1-based.  Here we make a matrix with a gutter
    'row and column that will always be zero.
    ReDim aReturn(0 To UBound(aOriginal), 0 To UBound(aRevised))
    
    For i = 1 To UBound(aOriginal)
        For j = 1 To UBound(aRevised)
            'If the elements match, bump up the count from the element
            'one up and one left
            If aOriginal(i) = aRevised(j) Then
                aReturn(i, j) = aReturn(i - 1, j - 1) + 1
            Else
            'If they don't match, copy the largest from either above or from the left
                aReturn(i, j) = Application.WorksheetFunction.Max(aReturn(i, j - 1), aReturn(i - 1, j))
            End If
        Next j
    Next i
    
    LCSTable = aReturn
    
End Function

Public Function LCSTable(ByRef aOriginal() As String, ByRef aRevised() As String) As Variant

Dim aReturn() As Long

Dim i As Long, j As Long

'aOriginal and aRevised should be 1-based. Here we make a matrix with a gutter

'row and column that will always be zero.

ReDim aReturn(0 To UBound(aOriginal), 0 To UBound(aRevised))

For i = 1 To UBound(aOriginal)

For j = 1 To UBound(aRevised)

'If the elements match, bump up the count from the element

'one up and one left

If aOriginal(i) = aRevised(j) Then

aReturn(i, j) = aReturn(i - 1, j - 1) + 1

Else

'If they don't match, copy the largest from either above or from the left

aReturn(i, j) = Application.WorksheetFunction.Max(aReturn(i, j - 1), aReturn(i - 1, j))

End If

Next j

Next i

LCSTable = aReturn

End Function

This code is called LCSLength in the article. It returns a matrix (2d array) with counts of matching elements at each position. For instance, if you’re diffing “Dick” and “Rick”, they have three letters in common and this table will compute that. It looks like this

		R	i	c	k
	0	0	0	0	0
D	0	0	0	0	0
i	0	0	1	1	1
c	0	0	1	2	2
k	0	0	1	2	3

The rest of the functions use this table to figure out what’s what.

Public Function LCSString(ByRef vaTable As Variant, ByRef aOriginal() As String, ByRef aRevised() As String, ByVal i As Long, ByVal j As Long) As String
    
    Dim sReturn As String
    
    If i = 0 Or j = 0 Then
        sReturn = ""
    ElseIf aOriginal(i) = aRevised(j) Then
        sReturn = LCSString(vaTable, aOriginal, aRevised, i - 1, j - 1) & aOriginal(i)
    Else
        If vaTable(i, j - 1) > vaTable(i - 1, j) Then
            sReturn = LCSString(vaTable, aOriginal, aRevised, i, j - 1)
        Else
            sReturn = LCSString(vaTable, aOriginal, aRevised, i - 1, j)
        End If
    End If
    
    LCSString = sReturn
    
End Function

Public Function LCSString(ByRef vaTable As Variant, ByRef aOriginal() As String, ByRef aRevised() As String, ByVal i As Long, ByVal j As Long) As String

Dim sReturn As String

If i = 0 Or j = 0 Then

sReturn = ""

ElseIf aOriginal(i) = aRevised(j) Then

sReturn = LCSString(vaTable, aOriginal, aRevised, i - 1, j - 1) & aOriginal(i)

Else

If vaTable(i, j - 1) > vaTable(i - 1, j) Then

sReturn = LCSString(vaTable, aOriginal, aRevised, i, j - 1)

Else

sReturn = LCSString(vaTable, aOriginal, aRevised, i - 1, j)

End If

LCSString = sReturn

End Function

This function (called backtrack in the article) traces back through the table and outputs the longest common subsequence. It’s a recursive function (it calls itself) and continually appends letters (or other elements) on to the return string.

When both the i and j counters are zero, it stops calling itself. Otherwise, if the two letters match, it appends the current letter to the end and calls itself using the element up and to the left. If there’s no match, it goes to the larger of the element above (i-1) and the one to the left (j-1). By following the path of the larger numbers through the matrix, it can find the common letters. It’s originally called with the largest i and j – in the above table, it’s called looking at the 3 (the bottom right cell). Here’s how it tracks through the matrix (I’ll use cell references, but it’s not really cells).

F6: k=k, so add k to the end of the string.
E5: c=c, so add c to the end of the string.
D4: i=i, so add i to the end of the string.
C3: D <> R so find the larger of C2 or B3
C2: i=0 so that’s it.
Return “ick”

Thrilling, isn’t it?

Public Sub PrintDiff(ByRef vaTable As Variant, ByRef aOriginal() As String, ByRef aRevised() As String, ByVal i As Long, ByVal j As Long)
    
    If i > 0 Or j > 0 Then
        If i = 0 Then
            PrintDiff vaTable, aOriginal, aRevised, i, j - 1
            Debug.Print "+" & Space(1) & aRevised(j)
        ElseIf j = 0 Then
            PrintDiff vaTable, aOriginal, aRevised, i - 1, j
            Debug.Print "-" & Space(1) & aOriginal(i)
        Else
            If aOriginal(i) = aRevised(j) Then
                PrintDiff vaTable, aOriginal, aRevised, i - 1, j - 1
                Debug.Print Space(2) & aOriginal(i)
            ElseIf vaTable(i, j - 1) >= vaTable(i - 1, j) Then
                PrintDiff vaTable, aOriginal, aRevised, i, j - 1
                Debug.Print "+" & Space(1) & aRevised(j)
            ElseIf vaTable(i, j - 1) < vaTable(i - 1, j) Then
                PrintDiff vaTable, aOriginal, aRevised, i - 1, j
                Debug.Print "-" & Space(1) & aOriginal(i)
            Else
                Debug.Print
            End If
        End If
    End If
        
End Sub

Public Sub PrintDiff(ByRef vaTable As Variant, ByRef aOriginal() As String, ByRef aRevised() As String, ByVal i As Long, ByVal j As Long)

If i > 0 Or j > 0 Then

If i = 0 Then

PrintDiff vaTable, aOriginal, aRevised, i, j - 1

Debug.Print "+" & Space(1) & aRevised(j)

ElseIf j = 0 Then

PrintDiff vaTable, aOriginal, aRevised, i - 1, j

Debug.Print "-" & Space(1) & aOriginal(i)

Else

If aOriginal(i) = aRevised(j) Then

PrintDiff vaTable, aOriginal, aRevised, i - 1, j - 1

Debug.Print Space(2) & aOriginal(i)

ElseIf vaTable(i, j - 1) >= vaTable(i - 1, j) Then

PrintDiff vaTable, aOriginal, aRevised, i, j - 1

Debug.Print "+" & Space(1) & aRevised(j)

ElseIf vaTable(i, j - 1) < vaTable(i - 1, j) Then

PrintDiff vaTable, aOriginal, aRevised, i - 1, j

Debug.Print "-" & Space(1) & aOriginal(i)

Else

Debug.Print

End If

End Sub

This is another recursive function working backward through the matrix. When it finds a match, there’s no prefix. If it’s a new element (in Revised, but not Original) the prefix is a “+”. If it’s a deleted element, you get a “-“. This prints the results to the immediate window. Let’s look at some examples.

Public Sub DiffLetters()
    
    Dim sOriginal As String
    Dim sRevised As String
    Dim vaTable As Variant
    Dim i As Long
    Dim aOriginal() As String
    Dim aRevised() As String
    
    'Create strings
    sOriginal = "Richard J. Kusleika"
    sRevised = "Richard Kusleika Jr."
    
    'Make an array of letters
    ReDim aOriginal(1 To Len(sOriginal))
    For i = 1 To Len(sOriginal)
        aOriginal(i) = Mid$(sOriginal, i, 1)
    Next i
    
    ReDim aRevised(1 To Len(sRevised))
    For i = 1 To Len(sRevised)
        aRevised(i) = Mid$(sRevised, i, 1)
    Next i
    
    'Create the longest common sequence matrix
    vaTable = LCSTable(aOriginal, aRevised)
    
    'Print the longest common sequence
    Debug.Print LCSString(vaTable, aOriginal, aRevised, UBound(aOriginal), UBound(aRevised))
    
    'Show the diff between the letters
    PrintDiff vaTable, aOriginal, aRevised, UBound(aOriginal), UBound(aRevised)
    
End Sub

Public Sub DiffLetters()

Dim sOriginal As String

Dim sRevised As String

Dim vaTable As Variant

Dim i As Long

Dim aOriginal() As String

Dim aRevised() As String

'Create strings

sOriginal = "Richard J. Kusleika"

sRevised = "Richard Kusleika Jr."

'Make an array of letters

ReDim aOriginal(1 To Len(sOriginal))

For i = 1 To Len(sOriginal)

aOriginal(i) = Mid$(sOriginal, i, 1)

Next i

ReDim aRevised(1 To Len(sRevised))

For i = 1 To Len(sRevised)

aRevised(i) = Mid$(sRevised, i, 1)

Next i

'Create the longest common sequence matrix

vaTable = LCSTable(aOriginal, aRevised)

'Print the longest common sequence

Debug.Print LCSString(vaTable, aOriginal, aRevised, UBound(aOriginal), UBound(aRevised))

'Show the diff between the letters

PrintDiff vaTable, aOriginal, aRevised, UBound(aOriginal), UBound(aRevised)

End Sub

This shows the diff on a letter-by-letter basis.

The first line is a printout of the longest common subsequence. The rest is a letter-by-letter diff that shows which elements were added, deleted, and unchanged. We can also diff on words.

Public Sub DiffWords()
    
    Dim sOriginal As String
    Dim sRevised As String
    Dim aOriginal() As String
    Dim aRevised() As String
    Dim vaTable As Variant
    
    sOriginal = "Richard J. Kusleika"
    sRevised = "Richard Kusleika Jr."
    
    aOriginal = Split(Space(1) & sOriginal, Space(1))
    aRevised = Split(Space(1) & sRevised, Space(1))
    
    vaTable = LCSTable(aOriginal, aRevised)
    
    Debug.Print LCSString(vaTable, aOriginal, aRevised, UBound(aOriginal), UBound(aRevised))
    
    PrintDiff vaTable, aOriginal, aRevised, UBound(aOriginal), UBound(aRevised)
    
End Sub

Public Sub DiffWords()

Dim sOriginal As String

Dim sRevised As String

Dim aOriginal() As String

Dim aRevised() As String

Dim vaTable As Variant

sOriginal = "Richard J. Kusleika"

sRevised = "Richard Kusleika Jr."

aOriginal = Split(Space(1) & sOriginal, Space(1))

aRevised = Split(Space(1) & sRevised, Space(1))

vaTable = LCSTable(aOriginal, aRevised)

Debug.Print LCSString(vaTable, aOriginal, aRevised, UBound(aOriginal), UBound(aRevised))

PrintDiff vaTable, aOriginal, aRevised, UBound(aOriginal), UBound(aRevised)

End Sub

Instead of filling the array with letters, I split the string on spaces to get words. Note that I put a leading space in front of each string before the split. The array needs to be 1-based and the Split function is zero based. The array doesn’t actually need to be 1-based, but the first row and column is ignored, so I made sure that it was something I didn’t care about. Once the arrays are filled, everything is the same.

Traditionally, diffing text is done line-by-line. So let’s do that. I found an example essay and made two files; OriginalDiff.txt and RevisedDiff.txt. I changed one thing in Revised and used this code to diff them.

Public Sub DiffLines()
    
    Dim sFile As String
    Dim lFile As Long
    Dim aOriginal() As String
    Dim aRevised() As String
    Dim lCnt As Long
    Dim vaTable As Variant
    
    sFile = Environ$("USERPROFILE") & "\Dropbox\Excel\OrignalDiff.txt"
    lFile = FreeFile
    
    Open sFile For Input As lFile
    Do While Not EOF(lFile)
        lCnt = lCnt + 1
        ReDim Preserve aOriginal(1 To lCnt)
        Line Input #lFile, aOriginal(lCnt)
    Loop
    Close lFile
    
    sFile = Environ$("USERPROFILE") & "\Dropbox\Excel\RevisedDiff.txt"
    
    lCnt = 0
    Open sFile For Input As lFile
    Do While Not EOF(lFile)
        lCnt = lCnt + 1
        ReDim Preserve aRevised(1 To lCnt)
        Line Input #lFile, aRevised(lCnt)
    Loop
    Close lFile
    
    vaTable = LCSTable(aOriginal, aRevised)
    
    PrintDiff vaTable, aOriginal, aRevised, UBound(aOriginal), UBound(aRevised)
    
End Sub

Public Sub DiffLines()

Dim sFile As String

Dim lFile As Long

Dim aOriginal() As String

Dim aRevised() As String

Dim lCnt As Long

Dim vaTable As Variant

sFile = Environ$("USERPROFILE") & "\Dropbox\Excel\OrignalDiff.txt"

lFile = FreeFile

Open sFile For Input As lFile

Do While Not EOF(lFile)

lCnt = lCnt + 1

ReDim Preserve aOriginal(1 To lCnt)

Line Input #lFile, aOriginal(lCnt)

Loop

Close lFile

sFile = Environ$("USERPROFILE") & "\Dropbox\Excel\RevisedDiff.txt"

lCnt = 0

Open sFile For Input As lFile

Do While Not EOF(lFile)

lCnt = lCnt + 1

ReDim Preserve aRevised(1 To lCnt)

Line Input #lFile, aRevised(lCnt)

Loop

Close lFile

vaTable = LCSTable(aOriginal, aRevised)

PrintDiff vaTable, aOriginal, aRevised, UBound(aOriginal), UBound(aRevised)

End Sub

And that’s as far as I got. Next, I need to put the diffs into a database so I can display diffs and revert to prior versions. Or, quite possibly, I’ll lose interest because I don’t have a burning need for this. It’s just something I’ve wanted to do.

You can download Diffing.zip