Retrieving Data from Add-in Worksheets

Add-ins have worksheets. You just can’t see them. But you can store information on them and it’s a good way to store settings, preferences, and other data. When you want to get to that data, you can go to the Properties for ThisWorkbook and change the IsAddin property to False. Now you can see the worksheets and change the data if necessary.

When you’re done, go back to the VBE and change the IsAddin property back to True before you save your changes. Don’t forget that part; it’s important.

I have a list of vendor codes stored on a worksheet in an addin. I need to see the list, but not change it. I didn’t want to go through all the IsAddin rigmarole, so I did this in the Immediate Window.

That part returns a two-dimensional array of all the values in the first column

That turns a two-dimensional array into a one-dimensional array.

That turns an array into a string with commas between the values. In retrospect, I should have used

to get each code on its own line. Here’s a way to put it into a range, if that’s where you’re going with it anyway.

MaxMinFair Rewrite

I read Charles William’s MaxMinFair algorithm and I didn’t like his approach. That’s typical. I’ll read somebody’s code and think “They’re making that too hard”. Then I’ll set about rewriting it. In this case, as in most cases, it turns out that it is that hard, but I wasn’t going to convince myself until I tried it. I ended up with a different approach that’s not shorter, not easier to read, and not easier to follow. Oh well, here it is anyway.

In Charles’s implementation, he allocates an equal amount of the supply to each node, then takes back what that node didn’t need and puts it back in the available pool. When I was looking at the results, I was thinking that the smallest n nodes simply get their demand and only when there’s not enough to go around do we need to do something different than allocate the full demand.

In my implementation, I start by giving everyone what they demand. Then I start with the smallest demand, and if I can accommodate that amount for everyone, I just reduce the amount available and move to the second smallest demand. At some point (the sixth smallest demand in Charles’s data) I can’t meet that demand and still give everyone an equal share. At that point, I give anyone who hasn’t had their demand met an equal amount – the amount that’s already been distributed plus an equal share of what’s left.

Rank Demand Incremental Demand Allocated Remaining
        18.30
7 0.70 0.70 4.90 13.40
6 1.00 0.30 1.80 11.60
5 1.30 0.30 1.50 10.10
4 2.00 0.70 2.80 7.30
3 3.50 1.50 4.50 2.80
2 7.40 3.90 7.80 (5.00)
1 10.00 2.60 2.60 (7.60)

In the first iteration, I hand out 0.70 to everyone because I have enough supply to do that. In the second iteration, I had out the differential, 0.30, to everyone who’s left because I have enough supply remaining. When I get to #2, I can’t hand out 3.90 to the remaining two nodes because I don’t have enough supply. I’ve allocated up to 3.5 to anyone who’s demanded it, so the last two get the 3.5 plus half of the 2.8 that remains.

Although I didn’t accomplish anything, it was still a fun exercise.

Joining Two Dimensional Arrays

The Join function takes an array and smushes it together into a String. I love the Join function. The only thing I don’t like about it is when I forget that it doesn’t work on 2d arrays. Join only works with 1-dimensional arrays. The last time my memory failed me, I decided to write my own. And here it is.

It’s pretty simple. It loops through the first dimension (the row dimension) and joins each line with sLineDelim. Inside that loop, it joins each element in the second dimension with sWordDelim. What this function doesn’t do is automatically insert itself into only the projects I want. That requires me to remember that I wrote it and where I put it. In reality, I’ll probably reinvent the wheel the next time I need it.

Here’s my extensive testing procedure.

Sending Images via WinSCP

Since my recent move to Digital Ocean for hosting, I’ve had to make a change to how I upload images for this blog. I used to create an FTP file and a batch file, but as far as I know that doesn’t support SFTP. I’m using WinSCP to transfer files manually and learned that it has a command line interface. I made a procedure called SendViaSCP to replace my SendViaFTP.

Public Sub SendViaSCP(vFname As Variant)

Dim aScript() As String
Dim i As Long

ReDim aScript(1 To 4 + UBound(vFname))

aScript(1) = "option batch abort"
aScript(2) = "option confirm off"
aScript(3) = "open sftp://username:password@000.000.000.000"
aScript(UBound(aScript)) = "exit"

For i = LBound(vFname) To UBound(vFname)
aScript(3 + i) = "put " & Dir(vFname(i)) & " /home/wordpress/public_html/blogpix/"
Next i

Open "winscpup.txt" For Output As #1
Print #1, Join(aScript, vbNewLine)
Close #1

Shell "winscpup.bat"

End Sub

The vFname arguments is a variant array that holds all of the files I selected from Application.GetOpenFileName. The aScript array holds three lines of setup, a line for each file, and an exit line.

The commands are joined together and written to a batch file and the batch file is run. It doesn’t solve the problem that Billkamm and Haines solved of having your username and password in a batch file, but I can live with it.

You might be wondering why I don’t just use the file upload functions in WordPress. What fun would that be?

Sparkline Gauge

I have a list of labels and a list of values. I also have a value from somewhere in the range of my values list.

I want to show where the value falls in the list of values with a red mark. The labels need to be proportional to their values. An XY chart seemed like the obvious answer, but would require the XY Chart Label add-in or some better charting skills on my part. Also, there may be a few of these on a sheet, so I wanted to keep it more light-weight. Sparklines move nicely with cells, so I tried that route.

First, I needed to get the labels spread proportionally across a cell. I made the cell 20 characters wide by typing '00000000000000000000, setting the font to Courier New, and resizing the column. Cell C6 shows the positions. To get the proper spread of labels, I used a UDF, shown below.

The values still have to be reasonably spread out or this will overwrite some of them. But it worked for my purposes. The formula =PropAxis(F3:F8,G3:G8) is in C3.

Next I set up the range I2:AB3 to feed the sparkline. Row 2 represents the 20 characters in the cell. Row three should be all zeros except one, which will be a 1. Then a win/loss sparkline will show the one ‘win’ as a mark.

In H2, I calculate which character gets the win. Here’s how that formula progressed:

=ROUND((G2-G3)/(G8-G3)*20,0)

That figures where the value is in the list of values as a percentage and multiplies by the number of characters in the cell (20). I wanted to protect against a value that was not in the range, so I modified it to this:

=MIN(MAX(ROUND((G2-G3)/(G8-G3)*20,0),1),20)

Now it will never be less than 1 or greater than 20. Because of rounding, the mark may be one character off of an exact match. That’s probably not a problem – it’s obviously not supposed to be hyper-accurate since I’m only using 20 characters – but it just doesn’t look right. So I handle exact matches as special cases.

=IF(ISNA(MATCH(G2,G3:G8,FALSE)),MIN(MAX(ROUND((G2-G3)/(G8-G3)*20,0),1),20),FIND(INDEX(F3:F8,MATCH(G2,G3:G8,FALSE),1),C3))

If there isn’t an exact match, find the percentage and get close. If it is an exact match, find the character’s position in the string and put the mark there. Now in Row 3 of my sparkline data range, I use this number to find the ‘win’.

=IF($H$2=I2,1,0)

I thought this whole process would be easier with sparklines. But it’s not really any more light-weight than just a chart. Anybody else want to take a crack at it?

You can download SparklineProportion.zip

Retrieving Lost Comments

I’ve restored a few posts in the last few months that were lost. I didn’t restore any of the comments. Honestly, I should have but I didn’t even think about it. But when I went to restore the In Cell Charting post, I noticed there were 85 comments. That seemed worth my while.

First I set a reference to Microsoft XML, v6.0 and Microsoft HTML Object Library. Here’s the main procedure.

I looked at the source for the web page to figure out how it was laid out and how to get at the data I needed. The CComment and CComments classes store the data as I loop through the list index items in the comment list. The first CComment method is AddNameFromCite. I didn’t even know there was a Cite tag in HTML (but you could fill a warehouse with what I don’t know about HTML).

I made this a method because I generally reserve properties to getting/setting values. If I change more than one property or do any extensive manipulation, I go with a method instead of a property. I’m not uber-consistent about it though. The comment author’s name is the innertext of the PhraseElement (that’s what a Cite is, at least according to the TypeName function). To get the AuthorLink, I need to find the anchor and get the href attribute. Because the wayback machine put its own URL in from of other URLs, I had to find the second instance of “http://” to get the real link. Next the AddDate method.

This really should have been a property instead of a method, but oh well. The innertext of the DivElement is something like “January 1, 2010 at 6:16 am”. I split that string on the “at” and used DateValue and TimeValue to build a date. Finally the content of the comment.

I passed in a collection of elements that are ParaElements (tag=p=paragraph). Then I looped through them and concatenated a string for the content. By looping through just the p elements, I skip all the comment meta crap that is auto-generated by WordPress and just get to the text.

At this point I have 85 CComment objects and I’m ready to build the SQL string.

Just a bunch string building and putting in a file that I can import into PHPMyAdmin. In the CComment class, the values are put together like this

I really like this method of building a string – putting it into an array and using Join – so I think I’ll start using it. The EscSq function turns any single quotes into two single quotes. The ContentScrubbed property converts any vbNewLines into \r\n. I exported some existing comments from MySQL to see how all this stuff went together. In the end, I ended up with a file that looks like this.

phpMyAdmin kept erroring out that the file was using too much memory. It’s 51kb, so I knew that wasn’t true. But the helpful people at HostGator imported it for me and set me up with console access so I can do it myself next time. I just need to learn the commandline stuff for importing.

I took a quick look through through the comments and they look alright. It’s hard to tell what I screwed up formatting-wise because some people use code tags and most don’t. But the info appears to be there and that’s the most important thing. I guess since I have this set up, I should go back and make sure any other lost posts get their comments too.

As always, if you see something that’s not right on the site, shoot me an email. I have a few hundred posts that still look like crap, but are readable and I’m fixing them as I see them.

String Diffing

I’ve wanted to have some wiki-like diffing in my userform textboxes for a while now. Since I’ve been using wikis almost daily, I want the revisioning feature in everything I do. I’m not there yet, but I decided to see what kind of algorithm I would need to do it. I read the Wikipedia article on longest common subsequence and played around with it a little.

This code is called LCSLength in the article. It returns a matrix (2d array) with counts of matching elements at each position. For instance, if you’re diffing “Dick” and “Rick”, they have three letters in common and this table will compute that. It looks like this

R i c k
0 0 0 0 0
D 0 0 0 0 0
i 0 0 1 1 1
c 0 0 1 2 2
k 0 0 1 2 3

The rest of the functions use this table to figure out what’s what.

This function (called backtrack in the article) traces back through the table and outputs the longest common subsequence. It’s a recursive function (it calls itself) and continually appends letters (or other elements) on to the return string.

When both the i and j counters are zero, it stops calling itself. Otherwise, if the two letters match, it appends the current letter to the end and calls itself using the element up and to the left. If there’s no match, it goes to the larger of the element above (i-1) and the one to the left (j-1). By following the path of the larger numbers through the matrix, it can find the common letters. It’s originally called with the largest i and j – in the above table, it’s called looking at the 3 (the bottom right cell). Here’s how it tracks through the matrix (I’ll use cell references, but it’s not really cells).

  1. F6: k=k, so add k to the end of the string.
  2. E5: c=c, so add c to the end of the string.
  3. D4: i=i, so add i to the end of the string.
  4. C3: D <> R so find the larger of C2 or B3
  5. C2: i=0 so that’s it.
  6. Return “ick”

Thrilling, isn’t it?

This is another recursive function working backward through the matrix. When it finds a match, there’s no prefix. If it’s a new element (in Revised, but not Original) the prefix is a “+”. If it’s a deleted element, you get a “-“. This prints the results to the immediate window. Let’s look at some examples.

This shows the diff on a letter-by-letter basis.

The first line is a printout of the longest common subsequence. The rest is a letter-by-letter diff that shows which elements were added, deleted, and unchanged. We can also diff on words.

Instead of filling the array with letters, I split the string on spaces to get words. Note that I put a leading space in front of each string before the split. The array needs to be 1-based and the Split function is zero based. The array doesn’t actually need to be 1-based, but the first row and column is ignored, so I made sure that it was something I didn’t care about. Once the arrays are filled, everything is the same.

Traditionally, diffing text is done line-by-line. So let’s do that. I found an example essay and made two files; OriginalDiff.txt and RevisedDiff.txt. I changed one thing in Revised and used this code to diff them.

And that’s as far as I got. Next, I need to put the diffs into a database so I can display diffs and revert to prior versions. Or, quite possibly, I’ll lose interest because I don’t have a burning need for this. It’s just something I’ve wanted to do.

You can download Diffing.zip

Find Matching Data in Array Speed Test

JP has a good post about finding exact matches in arrays. I use a similar method. I Join the array with delimiters around all the values, then use Instr to see if it’s there. Here’s my code:

Function IsInArrayDK(vArr As Variant, sValueToCheck As String, _
Optional bMatch As Boolean = True) As Boolean

Dim bReturn As Boolean
Dim sWordList As String

Const sDELIM As String = "|"

'See if it's a match even if only a substring
bReturn = UBound(Filter(vArr, sValueToCheck)) > -1

'If a match and need exact
'If exact match not needed, the line above provides the return value
If bReturn And bMatch Then
'put pipes around all the values
sWordList = sDELIM & Join(vArr, sDELIM) & sDELIM
'See if the values with pipes is there
bReturn = InStr(1, sWordList, sDELIM & sValueToCheck & sDELIM) > 0
End If

IsInArrayDK = bReturn

End Function

To test, I filled an array with 100,000 random strings, picked one of the strings to find, then timed JP’s funciton, my function, and the non-optimized method. The non-optimized method simply loops through the array and checks for values.

Function IsInArrayLoop(vArr As Variant, sValueToCheck As String, _
Optional bMatch As Boolean = True) As Boolean

Dim bReturn As Boolean
Dim i As Long

For i = LBound(vArr) To UBound(vArr)
If bMatch Then
If vArr(i) = sValueToCheck Then
bReturn = True
Exit For
End If
Else
If InStr(1, vArr(i), sValueToCheck) > 0 Then
bReturn = True
Exit For
End If
End If
Next i

IsInArrayLoop = bReturn

End Function

The code to fill the array converts Rnd to a string and puts it in the array. Then I pick one of the values (first, middle, and last) as the value I want to check.

Sub FillArray(ByRef vArr As Variant, ByVal lPlace As Long, ByRef sValue As String)

Dim i As Long

For i = 1 To 100000
vArr(i) = CStr(Int(Rnd * 10000000))
If i = lPlace Then
sValue = vArr(i)
End If
Next i

End Sub

I used the same API timer that JP uses when he does speed tests.

Public Declare Function timeGetTime Lib "winmm.dll" () As Long

And finally, the sub to test loops through the early, middle, and late values-to-check and times them.

Sub TestArray()

Dim aNames(1 To 100000) As Variant
Dim i As Long
Dim bResult As Boolean
Dim lStart As Long, lEnd As Long
Dim sValueToCheck As String
Dim aPlace(1 To 3, 1 To 2) As Variant
Dim sTable As String, sRow As String

'name the tests and determine where the value to check is in the array
aPlace(1, 1) = "Value Early:": aPlace(1, 2) = 1
aPlace(2, 1) = "Value Middle:": aPlace(2, 2) = 50000
aPlace(3, 1) = "Value Late:": aPlace(3, 2) = 99999

'The results go in an html table
sRow = Tag(Tag("Milliseconds", "td") & Tag("JP", "td") & Tag("DK", "td") & Tag("Loop", "td"), "tr") & vbNewLine
sTable = sRow

For i = 1 To 3
sRow = Tag(aPlace(i, 1), "td")
FillArray aNames, aPlace(i, 2), sValueToCheck

lStart = timeGetTime
bResult = IsInArrayJP(aNames, sValueToCheck, True)
lEnd = timeGetTime
sRow = sRow & Tag(lEnd - lStart, "td")

lStart = timeGetTime
bResult = IsInArrayDK(aNames, sValueToCheck, True)
lEnd = timeGetTime
sRow = sRow & Tag(lEnd - lStart, "td")

lStart = timeGetTime
bResult = IsInArrayLoop(aNames, sValueToCheck, True)
lEnd = timeGetTime
sRow = sRow & Tag(lEnd - lStart, "td")

sTable = sTable & Tag(sRow, "tr") & vbNewLine
Next i

Debug.Print Tag(sTable, "table", , True)

End Sub

The results:

Milliseconds JP DK Loop
Value Early: 53 53 0
Value Middle: 48 53 11
Value Late: 49 54 22

 
JP’s and mine are a wash and the loop is fastest. I guess I should just use that.