Importing Fixed Width Text Files with ActiveX Data Objects

Posted on May 10, 2013March 17, 2018 by Dick Kusleika

A few weeks ago I was struggling with fixed width text files. Remember 1995? Me neither, but I’m living it every day. I happened upon a stackoverflow.com post that dealt with using ADO for this purpose. I don’t remember the post, so no link. It did, however, lead me to this MSDN article, which was very helpful.

If you haven’t been following along, I wrote a post about creating a sample fixed-width file and one about importing said file. This post is about importing that file with ADO. You may remember ADO from such database objects as Connection, Command, and Recordset. But, like me, you may never have considered using to import text files; or even knew that it could.

Before I get into the specifics, there are two things that endeared me to this method. First, it allows me to only import the data I want and easily ignores headers, totals, and other non-record data. Second, it’s super fast. I had a huge text file that took several minutes to read using the Input$ function. I got it down to 90 seconds using ADO. Ninety seconds still stinks, but it beats having to get a coffee every time the code runs. The bonus third reason I love ADO is that I can replicate it for different text files easily. Usually, fixed width text files are not arranged as normalized data, so there are some challenges. But I went from setting this up for one specific report to a dozen reports very quickly. Let’s see how it’s done.

Go read the MSDN article if you want the specifics, but basically you need a file called Schema.ini that tells ADO what’s what in your file. I don’t take a crap without a class module, so we’ll be using class modules here. Didn’t this use to be a family friendly blog?

If you couldn’t tell, the text file in question contains transactions from a general ledger. That means I need a Transaction class to hold each of them. Using the column headers from the file, I create a CTransaction class module.

Option Explicit

Public TransactionID As Long
Public Entry As String
Public Period As Long
Public PostDate As Date
Public GLAccount As String
Public Description As String
Public Srce As String
Public Cflow As Boolean
Public Ref As String
Public Post As Boolean
Public Debit As Double
Public Credit As Double
Public Alloc As Boolean

Option Explicit

Public TransactionID As Long

Public Entry As String

Public Period As Long

Public PostDate As Date

Public GLAccount As String

Public Description As String

Public Srce As String

Public Cflow As Boolean

Public Ref As String

Public Post As Boolean

Public Debit As Double

Public Credit As Double

Public Alloc As Boolean

My VBHelpers add-in quickly converts those to properties and creates a CTransactions parent class. Next, I create an MEntryPoints standard module and insert the following code.

Public Sub ImportGLTransactions()
    
    Dim clsTransactions As CTransactions
    Dim sh As Worksheet
    Dim sFile As String
    
    sFile = Application.GetOpenFilename("*.txt, *.txt")
    
    If sFile <> "False" Then
        Set clsTransactions = New CTransactions
        clsTransactions.FillFromFile sFile
        Set sh = Workbooks.Add.Sheets(1)
        clsTransactions.WriteToRange sh.Range("A1")
    End If

End Sub

Public Sub ImportGLTransactions()

Dim clsTransactions As CTransactions

Dim sh As Worksheet

Dim sFile As String

sFile = Application.GetOpenFilename("*.txt, *.txt")

If sFile <> "False" Then

Set clsTransactions = New CTransactions

clsTransactions.FillFromFile sFile

Set sh = Workbooks.Add.Sheets(1)

clsTransactions.WriteToRange sh.Range("A1")

End If

End Sub

I don’t have a FillFromFile method or a WriteToRange method, but I like to write my main procedure as if I already had those. You’ll need a reference to Microsoft ActiveX Data Objects 2.8 Library (although any version close to that will do). The FillFromFile method is pretty simple. It creates and ADO Connection and an ADO Recordset, then loops through the recordset adding CTransaction instances as it reads them in. It’s treating our text file as if it’s a database with field names and everything.

Public Sub FillFromFile(ByVal sFile As String)
    
    Dim adCn As ADODB.Connection, adRs As ADODB.Recordset
    Dim vaConn As Variant, aSql(1 To 4) As String
    Dim sPATH As String
    Dim clsTransaction As CTransaction
    
    sPATH = Replace$(sFile, Dir$(sFile), vbNullString)
    
    'We'll talk about this line later
    MakeSchema sFile, sPATH, Me.Schema
    
    'Create a connection string and SQL statement
    vaConn = GetConnectionString(sPATH)
    aSql(1) = "SELECT"
    aSql(2) = Join(Me.Columns, ",")
    aSql(3) = "FROM [" & Dir$(sFile) & "]"
    aSql(4) = "WHERE PostDate Like ""[0-1][0-9]/[0-9][0-9]/[0-9][0-9][0-9][0-9]"""
    
    'Open the connection and the recordset
    Set adCn = New ADODB.Connection
    adCn.Open Join(vaConn, ";")
    
    Set adRs = New ADODB.Recordset
    adRs.Open Join(aSql, Space(1)), adCn, adOpenStatic, adLockReadOnly, adCmdText
                    
    'Loop through the rs and create CTransaction instances
    If Not adRs.BOF And Not adRs.EOF Then
        Do While Not adRs.EOF
            Set clsTransaction = New CTransaction
            clsTransaction.FillFromRecordset adRs
            Me.Add clsTransaction
            adRs.MoveNext
        Loop
    End If
    
    adRs.Close
    adCn.Close
    
End Sub

Public Sub FillFromFile(ByVal sFile As String)

Dim adCn As ADODB.Connection, adRs As ADODB.Recordset

Dim vaConn As Variant, aSql(1 To 4) As String

Dim sPATH As String

Dim clsTransaction As CTransaction

sPATH = Replace$(sFile, Dir$(sFile), vbNullString)

'We'll talk about this line later

MakeSchema sFile, sPATH, Me.Schema

'Create a connection string and SQL statement

vaConn = GetConnectionString(sPATH)

aSql(1) = "SELECT"

aSql(2) = Join(Me.Columns, ",")

aSql(3) = "FROM [" & Dir$(sFile) & "]"

aSql(4) = "WHERE PostDate Like ""[0-1][0-9]/[0-9][0-9]/[0-9][0-9][0-9][0-9]"""

'Open the connection and the recordset

Set adCn = New ADODB.Connection

adCn.Open Join(vaConn, ";")

Set adRs = New ADODB.Recordset

adRs.Open Join(aSql, Space(1)), adCn, adOpenStatic, adLockReadOnly, adCmdText

'Loop through the rs and create CTransaction instances

If Not adRs.BOF And Not adRs.EOF Then

Do While Not adRs.EOF

Set clsTransaction = New CTransaction

clsTransaction.FillFromRecordset adRs

Me.Add clsTransaction

adRs.MoveNext

Loop

End If

adRs.Close

adCn.Close

End Sub

We’ll hold off on how to create the Schema file for now. The Connection string is created with this little utility. You pass in the path and returns an array of strings ready to be joined.

Function GetConnectionString(ByVal sPATH As String) As Variant
    
    Dim aConn(1 To 3) As String
    
    aConn(1) = "Provider=Microsoft.Jet.OLEDB.4.0"
    aConn(2) = "Data Source=" & sPATH
    aConn(3) = "Extended Properties=""text;HDR=No;FMT=FixedLength"""

    GetConnectionString = aConn
    
End Function

Function GetConnectionString(ByVal sPATH As String) As Variant

Dim aConn(1 To 3) As String

aConn(1) = "Provider=Microsoft.Jet.OLEDB.4.0"

aConn(2) = "Data Source=" & sPATH

aConn(3) = "Extended Properties=""text;HDR=No;FMT=FixedLength"""

GetConnectionString = aConn

End Function

I’ve recently starting using arrays and Joins to concatenate strings of any length. I find it makes the code much more readable and manageable once you get used to it. Let’s talk about that SQL statement. In Schema.ini, I’ve defined column names and column widths. We’ll look at it in a moment. The SQL statement selects all the columns from the text file based on some criteria. The first section of the SQL statement is the SELECT keyword. For the second section, I have a property that returns an array of columns. You could just as easily use “SELECT * FROM”, but I was recently shown the benefit of following the never-select-astrisk rule, so I’m trying to be good.

Public Property Get Columns() As Variant
    
    Columns = Array("Entry", "Period", "PostDate", "GLAccount", "Description", "Srce", "Cflow", "Ref", "Post", "Debit", "Credit", "Alloc")
    
End Property

Public Property Get Columns() As Variant

Columns = Array("Entry", "Period", "PostDate", "GLAccount", "Description", "Srce", "Cflow", "Ref", "Post", "Debit", "Credit", "Alloc")

End Property

Just an array of column names used in the SQL statment and in Schema.ini. The third section of the SQL statement is the FROM keyword followed by the file name in brackets. The Dir$ function strips the path out of the fullname and returns only the file name. You don’t need the path in the SQL statement because Schema.ini is in the same directory as the text file. It has to be, so it’s not looking anywhere else.

The final section of the SQL statement is the WHERE clause. This is where you have to get a little creative. As I scan down my text file, I need to find some characteristic of “good” rows that is not present in “bad” rows. For this example, it was pretty easy. Every row that I want has a real date in PostDate and every row that I don’t want doesn’t. They aren’t all that easy. Would you like to see some examples of WHERE clauses I’ve used? Well, would you?

aSql(4) = "WHERE (Tax Like ""__/__/____"" And Not(IsNull(Vendor))) Or Account Like ""[1-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]"""
aSql(4) = "WHERE PaymentDate Like ""__/__/____"" And Not DiscountAmount Like ""%[ ][ ].00"""
aSql(4) = "WHERE PostDate Like ""__/__/____"" Or GLAccount Like ""[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]"""
aSql(4) = "WHERE InvoiceDate Like ""__/__/____"" AND ItemID <> ""Invoice Totals"""
aSql(4) = "WHERE Kitted = ""YES"" Or Kitted = ""NO"""
aSql(4) = "WHERE TranDate Like ""__/__/____"" Or (LotNo Like ""[A-Z][A-Z]%"" And TranDate Like ""     [A-Z]%"" And Not LotNo Like ""Item ID%"")"

aSql(4) = "WHERE (Tax Like ""__/__/____"" And Not(IsNull(Vendor))) Or Account Like ""[1-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]"""

aSql(4) = "WHERE PaymentDate Like ""__/__/____"" And Not DiscountAmount Like ""%[ ][ ].00"""

aSql(4) = "WHERE PostDate Like ""__/__/____"" Or GLAccount Like ""[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]"""

aSql(4) = "WHERE InvoiceDate Like ""__/__/____"" AND ItemID <> ""Invoice Totals"""

aSql(4) = "WHERE Kitted = ""YES"" Or Kitted = ""NO"""

aSql(4) = "WHERE TranDate Like ""__/__/____"" Or (LotNo Like ""[A-Z][A-Z]%"" And TranDate Like "" [A-Z]%"" And Not LotNo Like ""Item ID%"")"

Some of those are pretty gnarly. Our example file has lots of options, not just PostDate. We have three Yes/No fields and we could use any of those. It’s pretty unlikely that header information or totals rows are going to have a Yes or No in that same position. The idea is only get the rows you want. For some of the dates I used "__/__/____" and I think it’s pretty safe. But for this example, I used "[0-1][0-9]/[0-9][0-9]/[0-9][0-9][0-9][0-9]" which is a little more specific.

With a good connection string and SQL statement, I open the connection, open the recordset, and start looping. There’s not much to filling the CTransaction class via the FillFromRecordset method. It uses an Nz function that I wrote to avoid Null problems and return a default. For numbers and dates, I specify that I want to return a zero in place of a Null. For strings, it automatically returns vbNullString. The FillFromRecordset procedure is a method in CTransaction. The Nz function is in a standard module.

Public Sub FillFromRecordset(ByRef adRs As ADODB.Recordset)
    
    Me.Entry = Nz(adRs.Fields("Entry"))
    Me.Period = Nz(adRs.Fields("Period"), 0)
    Me.PostDate = Nz(adRs.Fields("PostDate"), 0)
    Me.GLAccount = Nz(adRs.Fields("GLAccount"))
    Me.Description = Nz(adRs.Fields("Description"))
    Me.Srce = Nz(adRs.Fields("Srce"))
    Me.Cflow = Nz(adRs.Fields("Cflow")) = "Yes"
    Me.Ref = Nz(adRs.Fields("Ref"))
    Me.Post = Nz(adRs.Fields("Post")) = "Yes"
    Me.Debit = Nz(adRs.Fields("Debit"), 0)
    Me.Credit = Nz(adRs.Fields("Credit"), 0)
    Me.Alloc = Nz(adRs.Fields("Alloc")) = "Yes"
    
End Sub

Function Nz(fldTest As ADODB.Field, _
    Optional vDefault As Variant) As Variant
   
    If IsNull(fldTest.Value) Then
        If IsMissing(vDefault) Then
            Select Case fldTest.Type
                Case adBSTR, adGUID, adChar, adWChar, adVarChar, adVarWChar
                    Nz = vbNullString
                Case Else
                    Nz = 0
            End Select
        Else
            Nz = vDefault
        End If
    Else
        Nz = fldTest.Value
    End If
   
End Function

Public Sub FillFromRecordset(ByRef adRs As ADODB.Recordset)

Me.Entry = Nz(adRs.Fields("Entry"))

Me.Period = Nz(adRs.Fields("Period"), 0)

Me.PostDate = Nz(adRs.Fields("PostDate"), 0)

Me.GLAccount = Nz(adRs.Fields("GLAccount"))

Me.Description = Nz(adRs.Fields("Description"))

Me.Srce = Nz(adRs.Fields("Srce"))

Me.Cflow = Nz(adRs.Fields("Cflow")) = "Yes"

Me.Ref = Nz(adRs.Fields("Ref"))

Me.Post = Nz(adRs.Fields("Post")) = "Yes"

Me.Debit = Nz(adRs.Fields("Debit"), 0)

Me.Credit = Nz(adRs.Fields("Credit"), 0)

Me.Alloc = Nz(adRs.Fields("Alloc")) = "Yes"

End Sub

Function Nz(fldTest As ADODB.Field, _

Optional vDefault As Variant) As Variant

If IsNull(fldTest.Value) Then

If IsMissing(vDefault) Then

Select Case fldTest.Type

Case adBSTR, adGUID, adChar, adWChar, adVarChar, adVarWChar

Nz = vbNullString

Case Else

Nz = 0

End Select

Else

Nz = vDefault

End If

Else

Nz = fldTest.Value

End If

End Function

Now on to Schema.ini, at long last. I have a MakeSchema procedure in a standard module that simply creates the file where it’s supposed to. One of the arguments to MakeSchema is a string for the contents of the file. That comes from the Schema property of the CTranscations class (shown as Me.Schema in the FillFromFile method above). The Schema property takes the columns from the Columns property and puts them together with column widths to create the string.

Public Property Get Schema() As String
    
    Dim aReturn() As String
    Dim vaNames As Variant
    Dim vaWidths As Variant
    Dim i As Long
    
    vaNames = Me.Columns
    vaWidths = Array(8, 4, 12, 13, 27, 5, 4, 10, 4, 19, 22, 4)
    ReDim aReturn(LBound(vaNames) To UBound(vaNames))
    
    For i = LBound(vaNames) To UBound(vaNames)
        aReturn(i) = "Col" & i + 1 & "=" & vaNames(i) & Space(1) & "Text Width" & Space(1) & vaWidths(i)
    Next i
    
    Schema = Join(aReturn, vbNewLine)
    
End Property

Public Property Get Schema() As String

Dim aReturn() As String

Dim vaNames As Variant

Dim vaWidths As Variant

Dim i As Long

vaNames = Me.Columns

vaWidths = Array(8, 4, 12, 13, 27, 5, 4, 10, 4, 19, 22, 4)

ReDim aReturn(LBound(vaNames) To UBound(vaNames))

For i = LBound(vaNames) To UBound(vaNames)

aReturn(i) = "Col" & i + 1 & "=" & vaNames(i) & Space(1) & "Text Width" & Space(1) & vaWidths(i)

Next i

Schema = Join(aReturn, vbNewLine)

End Property

The widths array is simply how many characters wide each column is. The lines in my file are 132 characters long. Counting them is a pain. Usually, I grab a couple of representative lines from the text file and put them in a spreadsheet. Here’s how I came up with the column widths for this file.

Lines 2-5 are Courier New 9pt and the first two lines are typed – no fancy formula to get those numbers. I like to get a couple of representative lines so I don’t miss anything. Then I go put pipes where I want the column breaks to be and put this formula in A7

=FIND("|",$A$4,A6+1)

and fill down until I get an error. Column B is just the difference. Finally, the MakeSchema utility takes that string and puts into a file.

Public Sub MakeSchema(ByVal sFile As String, ByVal sPATH As String, ByVal sCols As String)
    
    Dim lFile As Long
    Dim aWrite(1 To 4) As String
    
    Const sSCHEMA As String = "Schema.ini"
    
    aWrite(1) = "[" & Dir$(sFile) & "]"
    aWrite(2) = "Format=FixedLength"
    aWrite(3) = vbNullString
    aWrite(4) = sCols
    
    lFile = FreeFile
    Open sPATH & sSCHEMA For Output As lFile
    Print #lFile, Join(aWrite, vbNewLine)
    Close lFile
    
End Sub

Public Sub MakeSchema(ByVal sFile As String, ByVal sPATH As String, ByVal sCols As String)

Dim lFile As Long

Dim aWrite(1 To 4) As String

Const sSCHEMA As String = "Schema.ini"

aWrite(1) = "[" & Dir$(sFile) & "]"

aWrite(2) = "Format=FixedLength"

aWrite(3) = vbNullString

aWrite(4) = sCols

lFile = FreeFile

Open sPATH & sSCHEMA For Output As lFile

Print #lFile, Join(aWrite, vbNewLine)

Close lFile

End Sub

And the Schema.ini file looks like this:

The final piece is writing all of the CTransaction objects to a range. The CTransactions collection class has a WriteToRange method that calls an OutputRange property. It’s pretty straightforward.

Public Sub WriteToRange(rStart As Range)
    
    Dim vaWrite As Variant
    
    vaWrite = Me.OutputRange
    rStart.Resize(UBound(vaWrite, 1), UBound(vaWrite, 2)).Value = vaWrite
    rStart.CurrentRegion.EntireColumn.AutoFit
    
End Sub

Public Property Get OutputRange() As Variant
    
    Dim aReturn() As Variant
    Dim clsTransaction As CTransaction
    Dim lCnt As Long
    Dim vaHead As Variant
    Dim i As Long
    
    ReDim aReturn(1 To Me.Count + 1, 1 To 12)
    
    vaHead = Me.Columns
    lCnt = lCnt + 1
    For i = LBound(vaHead) To UBound(vaHead)
        aReturn(lCnt, i + 1) = vaHead(i)
    Next i
    
    For Each clsTransaction In Me
        lCnt = lCnt + 1
        With clsTransaction
            aReturn(lCnt, 1) = "'" & .Entry
            aReturn(lCnt, 2) = .Period
            aReturn(lCnt, 3) = .PostDate
            aReturn(lCnt, 4) = "'" & .GLAccount
            aReturn(lCnt, 5) = "'" & .Description
            aReturn(lCnt, 6) = "'" & .Srce
            aReturn(lCnt, 7) = .Cflow
            aReturn(lCnt, 8) = "'" & .Ref
            aReturn(lCnt, 9) = .Post
            aReturn(lCnt, 10) = .Debit
            aReturn(lCnt, 11) = .Credit
            aReturn(lCnt, 12) = .Alloc
        End With
    Next clsTransaction
    
    OutputRange = aReturn
    
End Property

Public Sub WriteToRange(rStart As Range)

Dim vaWrite As Variant

vaWrite = Me.OutputRange

rStart.Resize(UBound(vaWrite, 1), UBound(vaWrite, 2)).Value = vaWrite

rStart.CurrentRegion.EntireColumn.AutoFit

End Sub

Public Property Get OutputRange() As Variant

Dim aReturn() As Variant

Dim clsTransaction As CTransaction

Dim lCnt As Long

Dim vaHead As Variant

Dim i As Long

ReDim aReturn(1 To Me.Count + 1, 1 To 12)

vaHead = Me.Columns

lCnt = lCnt + 1

For i = LBound(vaHead) To UBound(vaHead)

aReturn(lCnt, i + 1) = vaHead(i)

Next i

For Each clsTransaction In Me

lCnt = lCnt + 1

With clsTransaction

aReturn(lCnt, 1) = "'" & .Entry

aReturn(lCnt, 2) = .Period

aReturn(lCnt, 3) = .PostDate

aReturn(lCnt, 4) = "'" & .GLAccount

aReturn(lCnt, 5) = "'" & .Description

aReturn(lCnt, 6) = "'" & .Srce

aReturn(lCnt, 7) = .Cflow

aReturn(lCnt, 8) = "'" & .Ref

aReturn(lCnt, 9) = .Post

aReturn(lCnt, 10) = .Debit

aReturn(lCnt, 11) = .Credit

aReturn(lCnt, 12) = .Alloc

End With

Next clsTransaction

OutputRange = aReturn

End Property

I put apostrophes in front of the strings so they don’t get converted just because they look like a number or a date. And when I’m done, I sum up the Debit column and compare it to the text file.

It’s a match! I must have done something right. No comments for you accountants about how my debits don’t equal my credits. You try to make fake data this internally consistent. :)

You can download FixedWidthADO.zip

Importing Fixed Width Text Files

Posted on May 8, 2013February 24, 2017 by Dick Kusleika

I’ve been working with fixed width text files quite a bit lately. The built-in method for importing these files is terrible. First, it’s yet another wizard and I think it would be better as a single form (but then I think everything is better as a single form). The worst part, by far, is the tiny window that shows a preview of your data.

After a few days of using this wizard, it was pretty obvious that I was going to write my own, if for no other reason than to make that window bigger. My importer would be significantly different than Excel’s. Microsoft has to make their method flexible and universal, but I don’t. I know certain things about my text files and can build in some assumptions to make things better. For one, my files have repeating page headers. While Excel allows you to start your import at something other than first row to skip all that crap, it doesn’t do me any good because there’s just more crap to come.

Step 2 of the wizard allows you to add, delete, and move the dividing lines that delineate your columns. I find that Excel does a pretty poor job at placing these lines. But I grant that I haven’t used this on a wide variety of files so it’s entirely possible that their algorithm is the best – it’s just not the best for the types of files I’m using.

Did you think I was going to blow by this step without commenting on the lack of keyboard support? Not a chance. To move the lines, click them and drag them. To add a line, single click. To delete a line, double click. Actually, to delete a line, double click next to the line you want to delete, then double click that new line to delete it, then carefully double click on the original line to delete it. You know what I’m talking about. Click, click, click. Where’s the keyboard love?

The lack of large enough preview window really hurts on this step. I’ll discuss determining where the column lines should go later in this post. Normally after step 2, I just hit finish. But let’s take a look at the last step anyway.

I don’t use this step much because the defaults are really good. As much as I don’t like wizards, I have to give MS props for putting a Finish button on step 2 so I can skip step 3. Now that you’ve defined your columns, this step allows you to specify a data type. The General type works well for most situations, but if you have some text that happens to look like a number, it’s best to set the column to the Text format.

Under the Advanced button, you can switch what decimals and commas mean and, most importantly to me, tell it how to handle trailing minus signs.

So back to how nothing’s ever good enough and I can do everything better. I’m only dealing with fixed width files and I’m always starting on the first row, so step 1 of the wizard is gone. Step 2 and 3 could be combined, I think. It should draw a combobox over each column that let’s you choose the format. When you add or delete columns, it redraws the boxes. The best use of resources is getting the columns right in the first place. If you don’t have to move columns because it guesses so well, then the whole thing becomes a breeze.

Let’s look at this sample file in terms of columns. In order to get all the numbers to show, this chart is kind of big.

I wrote a little macro to analyze the file and report how many characters are in each column.

Sub ReadTextFile()
    
    Dim sFile As String
    Dim lFile As Long
    Dim sInput As String
    Dim vaLines As Variant
    Dim i As Long, j As Long
    Dim aChars() As Long
    Dim bLow As Boolean
    
    sFile = "C:\Users\dkusleika\Dropbox\Excel\FixedWidthExample.txt"
    lFile = FreeFile
        
    Open sFile For Input As lFile
    sInput = Input$(LOF(lFile), lFile)
    Close lFile
    
    vaLines = Split(sInput, vbNewLine)
    ReDim aChars(1 To Len(vaLines(0)) + 1, 1 To 1)
    
    For i = LBound(vaLines) To UBound(vaLines)
        For j = 1 To Len(vaLines(i))
            If Mid$(vaLines(i), j, 1) <> Space$(1) Then
                aChars(j, 1) = aChars(j, 1) + 1
            End If
        Next j
    Next i
    
    Sheet1.Range("B2").Resize(UBound(aChars, 1), 1).Value = aChars
    
End Sub

Sub ReadTextFile()

Dim sFile As String

Dim lFile As Long

Dim sInput As String

Dim vaLines As Variant

Dim i As Long, j As Long

Dim aChars() As Long

Dim bLow As Boolean

sFile = "C:\Users\dkusleika\Dropbox\Excel\FixedWidthExample.txt"

lFile = FreeFile

Open sFile For Input As lFile

sInput = Input$(LOF(lFile), lFile)

Close lFile

vaLines = Split(sInput, vbNewLine)

ReDim aChars(1 To Len(vaLines(0)) + 1, 1 To 1)

For i = LBound(vaLines) To UBound(vaLines)

For j = 1 To Len(vaLines(i))

If Mid$(vaLines(i), j, 1) <> Space$(1) Then

aChars(j, 1) = aChars(j, 1) + 1

End If

Next j

Next i

Sheet1.Range("B2").Resize(UBound(aChars, 1), 1).Value = aChars

End Sub

Then I charted them. It seems pretty clear that there’s a break around 7, 12, 23, and 35. The points at 41 and 45 are a little less clear, but starting at 47, you see a clear downward trend. This is the tell-tale sign of left-justified text. Similarly, 87-104 is a right-justified number. The headers muddy up the waters a bit because they contain data that’s no good to me, but still adds to the character count. As I mentioned before, I’m not building a general-purpose solution and it just so happens I can remove the headers. So I did.

Sub ReadTextFileNoHeaders()
    
    Dim sFile As String
    Dim lFile As Long
    Dim sInput As String
    Dim vaLines As Variant
    Dim i As Long, j As Long
    Dim aChars() As Long
    Dim bHeader As Boolean
    
    sFile = "\\99991-dc01\99991\dkusleika\My Documents\FixedWidthExample.txt"
    lFile = FreeFile
        
    Open sFile For Input As lFile
    sInput = Input$(LOF(lFile), lFile)
    Close lFile
    
    vaLines = Split(sInput, vbNewLine)
    ReDim aChars(1 To Len(vaLines(0)) + 1, 1 To 1)
    
    For i = LBound(vaLines) To UBound(vaLines)
        If Len(vaLines(i)) > 0 Then
            If Asc(Left$(vaLines(i), 1)) = 12 Then
                bHeader = True
            ElseIf vaLines(i) = String(132, "-") Then
                bHeader = False
            End If
        End If
        
        If Not bHeader Then
            For j = 1 To Len(vaLines(i))
                If Mid$(vaLines(i), j, 1) <> Space$(1) Then
                    aChars(j, 1) = aChars(j, 1) + 1
                End If
            Next j
        End If
    Next i
    
    Sheet3.Range("B2").Resize(UBound(aChars, 1), 1).Value = aChars
    
End Sub

Sub ReadTextFileNoHeaders()

Dim sFile As String

Dim lFile As Long

Dim sInput As String

Dim vaLines As Variant

Dim i As Long, j As Long

Dim aChars() As Long

Dim bHeader As Boolean

sFile = "\\99991-dc01\99991\dkusleika\My Documents\FixedWidthExample.txt"

lFile = FreeFile

Open sFile For Input As lFile

sInput = Input$(LOF(lFile), lFile)

Close lFile

vaLines = Split(sInput, vbNewLine)

ReDim aChars(1 To Len(vaLines(0)) + 1, 1 To 1)

For i = LBound(vaLines) To UBound(vaLines)

If Len(vaLines(i)) > 0 Then

If Asc(Left$(vaLines(i), 1)) = 12 Then

bHeader = True

ElseIf vaLines(i) = String(132, "-") Then

bHeader = False

End If

If Not bHeader Then

For j = 1 To Len(vaLines(i))

If Mid$(vaLines(i), j, 1) <> Space$(1) Then

aChars(j, 1) = aChars(j, 1) + 1

End If

Next j

End If

Next i

Sheet3.Range("B2").Resize(UBound(aChars, 1), 1).Value = aChars

End Sub

The ASCII code for the page break character is 12. All of my headers end in a string of 132 dashes. That’s damn convenient. Look what happens when I remove the headers.

Based on that, I think I could predict the column breaks perfectly. But there’s one aspect of this file that allows me to say that. I don’t have any truly variable length, left-justified text followed by any truly variable length, right-justified number. For instance, if the description column was immediately followed by the debit column, it might be difficult to determine exactly where to break it.

How did I solve that problem? I didn’t. This is as far as I got writing my own text import wizard. Out of nowhere, I read something about reading text files with ADO and all my problems were solved. I had used external data tables to read text files, but never ADO. I’ll post about how I’m importing text files with ADO in my next post.

Sample Fixed Width Text File

Posted on April 12, 2013March 18, 2018 by Dick Kusleika

I have a post or two in my head that deals with fixed width text files, something I’ve been dealing with quite a bit. Like me, you probably don’t run into fixed width text files in your personal life. It’s usually some crappy computer program at your job that forces you to deal with them. The problem is that I can’t simply throw around private company information, so I had to replicate a real fixed width file with obfuscated information. Before I post about all the lovely things I’ve been doing with these files, I thought I’d post about how I made the sample.

fixed width text file sample

Sub MakeSampleFile()
    
    Dim sFile As String, sOutFile As String
    Dim lFile As Long
    Dim sInput As String
    Dim vaLines As Variant
    Dim i As Long
    Dim aOutput() As String
    Dim aLine() As String
    Dim dPdDebit As Double, dPdCredit As Double
    Dim dTotDebit As Double, dTotCredit As Double
    Dim dThisDebit As Double, dThisCredit As Double
    
    sFile = "\\99991-dc01\99991\dkusleika\My Documents\AJE_Cost.txt"
    sOutFile = "C:\Users\dkusleika\Dropbox\Excel\FixedWidthExample2.txt"
    lFile = FreeFile
        
    'Open the input file, read it all in, split it by line
    Open sFile For Input As lFile
    sInput = Input$(LOF(lFile), lFile)
    Close lFile
    vaLines = Split(sInput, vbNewLine)
    ReDim aOutput(LBound(vaLines) To UBound(vaLines))
    
    'Loop through the lines of the import file
    For i = LBound(vaLines) To UBound(vaLines)
        ReDim aLine(1 To 10)
        'Lines with dates need special processing
        If IsDate(Mid(vaLines(i), 13, 10)) Then
            aLine(1) = Left$(vaLines(i), 24)
            aLine(2) = GetGLAccount 'make up a GL account
            aLine(3) = GetDescription(Mid$(vaLines(i), 38, 64 - 38 + 1)) 'Desc with random letters
            aLine(4) = "S1" & Space(3)
            aLine(5) = GetYesNo 'Get a Yes or No randomly
            aLine(6) = GetDescription(Mid$(vaLines(i), 74, 83 - 74 + 1))
            aLine(7) = GetYesNo
            
            'Get a random debit at the same scale as the current debit,
            'and keep track of it for totals
            dThisDebit = GetRandomNumber(Mid$(vaLines(i), 88, 106 - 88 + 1))
            If dThisDebit = 0 Then
                aLine(8) = Pad(Space(1), 17, vbNullString)
            Else
                aLine(8) = Pad(Format(dThisDebit, "#,##0.00"), 17, vbNullString)
            End If
            dPdDebit = dPdDebit + dThisDebit
            
            'Same for credit
            dThisCredit = GetRandomNumber(Mid$(vaLines(i), 107, 128 - 107 + 1))
            If dThisCredit = 0 Then
                aLine(9) = Pad(Space(1), 20, vbNullString) & Space(4)
            Else
                aLine(9) = Pad(Format(dThisCredit, "#,##0.00"), 20, vbNullString) & Space(4)
            End If
            dPdCredit = dPdCredit + dThisCredit
            
            aLine(10) = GetYesNo
        
        'Period balances need special processing
        ElseIf vaLines(i) Like "*BALANCE PERIOD*" Then
            aLine(1) = Left$(vaLines(i), 88)
            aLine(2) = Pad(Format(dPdDebit, "#,##0.00"), 16, vbNullString)
            aLine(3) = Pad(Format(dPdCredit, "#,##0.00"), 20, vbNullString)
            dTotDebit = dTotDebit + dPdDebit
            dTotCredit = dTotCredit + dPdCredit
            dPdDebit = 0: dPdCredit = 0
            
        'Ending balances need special processing
        ElseIf vaLines(i) Like "*BALANCE*" Then
            aLine(1) = Left$(vaLines(i), 88)
            aLine(2) = Pad(Format(dTotDebit, "#,##0.00"), 16, vbNullString)
            aLine(3) = Pad(Format(dTotCredit, "#,##0.00"), 20, vbNullString)
        
        'Headers, blanks, and other stuff comes straight over as is
        Else
            aLine(1) = vaLines(i)
        End If
        aOutput(i) = Join(aLine, vbNullString)
    Next i
    
    lFile = FreeFile
    Open sOutFile For Output As lFile
    Print #lFile, Join(aOutput, vbNewLine)
    Close lFile
    
End Sub

Sub MakeSampleFile()

Dim sFile As String, sOutFile As String

Dim lFile As Long

Dim sInput As String

Dim vaLines As Variant

Dim i As Long

Dim aOutput() As String

Dim aLine() As String

Dim dPdDebit As Double, dPdCredit As Double

Dim dTotDebit As Double, dTotCredit As Double

Dim dThisDebit As Double, dThisCredit As Double

sFile = "\\99991-dc01\99991\dkusleika\My Documents\AJE_Cost.txt"

sOutFile = "C:\Users\dkusleika\Dropbox\Excel\FixedWidthExample2.txt"

lFile = FreeFile

'Open the input file, read it all in, split it by line

Open sFile For Input As lFile

sInput = Input$(LOF(lFile), lFile)

Close lFile

vaLines = Split(sInput, vbNewLine)

ReDim aOutput(LBound(vaLines) To UBound(vaLines))

'Loop through the lines of the import file

For i = LBound(vaLines) To UBound(vaLines)

ReDim aLine(1 To 10)

'Lines with dates need special processing

If IsDate(Mid(vaLines(i), 13, 10)) Then

aLine(1) = Left$(vaLines(i), 24)

aLine(2) = GetGLAccount 'make up a GL account

aLine(3) = GetDescription(Mid$(vaLines(i), 38, 64 - 38 + 1)) 'Desc with random letters

aLine(4) = "S1" & Space(3)

aLine(5) = GetYesNo 'Get a Yes or No randomly

aLine(6) = GetDescription(Mid$(vaLines(i), 74, 83 - 74 + 1))

aLine(7) = GetYesNo

'Get a random debit at the same scale as the current debit,

'and keep track of it for totals

dThisDebit = GetRandomNumber(Mid$(vaLines(i), 88, 106 - 88 + 1))

If dThisDebit = 0 Then

aLine(8) = Pad(Space(1), 17, vbNullString)

Else

aLine(8) = Pad(Format(dThisDebit, "#,##0.00"), 17, vbNullString)

End If

dPdDebit = dPdDebit + dThisDebit

'Same for credit

dThisCredit = GetRandomNumber(Mid$(vaLines(i), 107, 128 - 107 + 1))

If dThisCredit = 0 Then

aLine(9) = Pad(Space(1), 20, vbNullString) & Space(4)

Else

aLine(9) = Pad(Format(dThisCredit, "#,##0.00"), 20, vbNullString) & Space(4)

End If

dPdCredit = dPdCredit + dThisCredit

aLine(10) = GetYesNo

'Period balances need special processing

ElseIf vaLines(i) Like "*BALANCE PERIOD*" Then

aLine(1) = Left$(vaLines(i), 88)

aLine(2) = Pad(Format(dPdDebit, "#,##0.00"), 16, vbNullString)

aLine(3) = Pad(Format(dPdCredit, "#,##0.00"), 20, vbNullString)

dTotDebit = dTotDebit + dPdDebit

dTotCredit = dTotCredit + dPdCredit

dPdDebit = 0: dPdCredit = 0

'Ending balances need special processing

ElseIf vaLines(i) Like "*BALANCE*" Then

aLine(1) = Left$(vaLines(i), 88)

aLine(2) = Pad(Format(dTotDebit, "#,##0.00"), 16, vbNullString)

aLine(3) = Pad(Format(dTotCredit, "#,##0.00"), 20, vbNullString)

'Headers, blanks, and other stuff comes straight over as is

Else

aLine(1) = vaLines(i)

End If

aOutput(i) = Join(aLine, vbNullString)

Next i

lFile = FreeFile

Open sOutFile For Output As lFile

Print #lFile, Join(aOutput, vbNewLine)

Close lFile

End Sub

That procedure is a bit longer than I like, but it’s not something I wanted to spend a lot of time on. It reads in the legitimate file, then goes line-by-line through it. When it encounters a line that needs changing, it creates fake data to put in place of the real data. I want the totals to match so I have something to compare it to after I parse the text file (in a later post), so I have to keep track of the fake numbers I make up and total them appropriately.

The GL Account is a 10 digit number that start with 1-5 and can have any other digits after that.

Function GetGLAccount() As String
    
    Dim i As Long
    Dim aOut(1 To 10) As String
    
    aOut(1) = Int((5 - 1 + 1) * Rnd + 1)
    For i = 2 To 10
        aOut(i) = Int((9 - 0 + 1) * Rnd + 0)
    Next i
    
    GetGLAccount = Join(aOut, vbNullString) & Space(3)
    
End Function

Function GetGLAccount() As String

Dim i As Long

Dim aOut(1 To 10) As String

aOut(1) = Int((5 - 1 + 1) * Rnd + 1)

For i = 2 To 10

aOut(i) = Int((9 - 0 + 1) * Rnd + 0)

Next i

GetGLAccount = Join(aOut, vbNullString) & Space(3)

End Function

The description obfuscator respects spaces, but replaces anything else with an upper case letter.

Function GetDescription(sDesc As String) As String
    
    Dim aOut() As String
    Dim lLen As Long
    Dim i As Long
    
    ReDim aOut(1 To Len(sDesc))
    
    For i = 1 To Len(sDesc)
        If Mid$(sDesc, i, 1) = Space(1) Then
            aOut(i) = Space(1)
        Else
            aOut(i) = Chr$(Int((90 - 65 + 1) * Rnd + 65))
        End If
    Next i
    
    GetDescription = Join(aOut, vbNullString)
    
End Function

Function GetDescription(sDesc As String) As String

Dim aOut() As String

Dim lLen As Long

Dim i As Long

ReDim aOut(1 To Len(sDesc))

For i = 1 To Len(sDesc)

If Mid$(sDesc, i, 1) = Space(1) Then

aOut(i) = Space(1)

Else

aOut(i) = Chr$(Int((90 - 65 + 1) * Rnd + 65))

End If

Next i

GetDescription = Join(aOut, vbNullString)

End Function

The Yes/No generator gives me a Yes about 80% of the time. You have to stay positive, you know.

Function GetYesNo() As String
    
    If Rnd < 0.8 Then
        GetYesNo = "Yes "
    Else
        GetYesNo = "No  "
    End If
    
End Function

Function GetYesNo() As String

If Rnd < 0.8 Then

GetYesNo = "Yes "

Else

GetYesNo = "No "

End If

End Function

For the debits and credits, I wanted to stay somewhat realistic, so I kept the same scale as whatever number is there. To determine the scale, I remove all spaces, commas, and periods from the screen, then take the length x 2. A number like $9,453.65 will have a scale of 4 and will produce a number between 1,000 and 9,999. Rnd, as you know, generates a number between 0 and 1. I multiply Rnd by 10^Scale to get the right number of digits, then round it to two decimal places.

Function GetRandomNumber(sNum As String) As Double
    
    Dim lScale As Long
    Dim dReturn As Double
    
    lScale = Len(Trim$(Replace$(Replace$(sNum, ".", vbNullString), ",", vbNullString))) - 2
    
    If lScale > 0 Then
        dReturn = Round(Rnd * 10 ^ (lScale), 2)
    End If
    
    GetRandomNumber = dReturn
    
End Function

Function GetRandomNumber(sNum As String) As Double

Dim lScale As Long

Dim dReturn As Double

lScale = Len(Trim$(Replace$(Replace$(sNum, ".", vbNullString), ",", vbNullString))) - 2

If lScale > 0 Then

dReturn = Round(Rnd * 10 ^ (lScale), 2)

End If

GetRandomNumber = dReturn

End Function

Finally, I have a Pad function to fill out spaces around numbers. First, Pad truncates the string to the right width in case it’s already too long. If it’s not too long, spaces are inserted in front of it to fill out the right width. Then there’s the sAfter argument that you’re probably wondering about. I don’t have trailing negatives in this report, but I have had them in some others. I need to be able to stick a negative or a space after the string and used the sAfter argument to do it. I believe they’re all null strings in this example.

Making realistic sample data sucks.

You can download FixedWidthExample2.txt

Opening a PDF from VBA

Posted on March 18, 2013March 19, 2018 by Dick Kusleika

Several years ago I needed to open the newest CSV file from a particular directory. Now I’m faced with a similar problem. My accounting system produces PDFs in some kind of proprietary PDF reader. It doesn’t have near the feature set of Foxit, my preferred reader. Instead of fighting it, now I immediately save the pdf and open it in Foxit.

Drawing from that previous post, I made a function to find the name of the most recently created PDF.

Function GetNewestPDFFileName()

    Dim fso As Scripting.FileSystemObject
    Dim fsoFile As Scripting.File
    Dim fsoFldr As Scripting.Folder
    Dim dtNew As Date, sNew As String
    
    Const sTYPE As String = " PDF "
    Const sFLDR As String = "\\99991-dc01\99991\dkusleika\My Documents\"
    
    Set fso = New Scripting.FileSystemObject
    Set fsoFldr = fso.GetFolder(sFLDR)
    
    For Each fsoFile In fsoFldr.Files
        If fsoFile.DateCreated > dtNew And InStr(1, fsoFile.Type, sTYPE) > 0 Then
            sNew = fsoFile.Path
            dtNew = fsoFile.DateCreated
        End If
    Next fsoFile

    GetNewestPDFFileName = sNew
    
End Function

Function GetNewestPDFFileName()

Dim fso As Scripting.FileSystemObject

Dim fsoFile As Scripting.File

Dim fsoFldr As Scripting.Folder

Dim dtNew As Date, sNew As String

Const sTYPE As String = " PDF "

Const sFLDR As String = "\\99991-dc01\99991\dkusleika\My Documents\"

Set fso = New Scripting.FileSystemObject

Set fsoFldr = fso.GetFolder(sFLDR)

For Each fsoFile In fsoFldr.Files

If fsoFile.DateCreated > dtNew And InStr(1, fsoFile.Type, sTYPE) > 0 Then

sNew = fsoFile.Path

dtNew = fsoFile.DateCreated

End If

Next fsoFile

GetNewestPDFFileName = sNew

End Function

Once I have the name, a simple FollowHyperlink method will get me where I want to go. Oh, except that hyperlinks are bad and Excel needs to show me a warning. That’s not going to work. Instead, I take the long way around. I create a batch file to open the PDF and run that.

Sub OpenNewestPDF()
    
    Dim sNew As String
    Dim sFile As String, lFile As Long
    
    sNew = GetNewestPDFFileName
    
    sFile = "\\99991-dc01\99991\dkusleika\My Documents\OpenPDF.bat"
    lFile = FreeFile
    Open sFile For Output As lFile
    Print #lFile, "K:" & vbNewLine & "start " & Dir(sNew)
    Close lFile
    
    'ThisWorkbook.FollowHyperlink snew
    Shell sFile
    
End Sub

Sub OpenNewestPDF()

Dim sNew As String

Dim sFile As String, lFile As Long

sNew = GetNewestPDFFileName

sFile = "\\99991-dc01\99991\dkusleika\My Documents\OpenPDF.bat"

lFile = FreeFile

Open sFile For Output As lFile

Print #lFile, "K:" & vbNewLine & "start " & Dir(sNew)

Close lFile

'ThisWorkbook.FollowHyperlink snew

Shell sFile

End Sub

The file opens and there’s no warning to click through. Coincidentally, JW has been working around that same security measure, only for MP3 files. There’s an interesting approach.

Sub OpenNewestPDF2()
    
    Dim oleo As OLEObject
    Dim sFile As String
    
    sFile = GetNewestPDFFileName
    
    Set oleo = wshPdf.OLEObjects.Add(, sFile, True)
    oleo.Verb
    oleo.Delete
    
End Sub

Sub OpenNewestPDF2()

Dim oleo As OLEObject

Dim sFile As String

sFile = GetNewestPDFFileName

Set oleo = wshPdf.OLEObjects.Add(, sFile, True)

oleo.Verb

oleo.Delete

End Sub

It works and no message. Good one John.

Finally, I wanted a third method. fzz commented that I should use a console command because that’s what consoles are good at. I made a batch file following his example:

for /F %%a in ('dir /b/o-d "K:*.pdf"') do (start %%a & exit)

No warnings, obviously, and even though the VBA above is lightning quick, I think we can all appreciate that this is the quickest and most direct way. I’m having a problem running from VBA though.

Sub OpenNewestPDF3()
    
    Shell "K:\OpenPDF2.bat"
    'ShellExecute 0, "OPEN", "K:\OpenPDF2.bat", "", "", 0
    
End Sub

Sub OpenNewestPDF3()

Shell "K:\OpenPDF2.bat"

'ShellExecute 0, "OPEN", "K:\OpenPDF2.bat", "", "", 0

End Sub

It got the file name right, but says it can’t find it. As you can see, I tried ShellExecute too. Same result. It doesn’t matter. I have a batch file, so I don’t need Excel. I put a shortcut to the batch file on my desktop and set the shortcut key. Now I can open it regardless of which applications are open or have the focus.

Monitor Worksheet Changes via RSS

Posted on March 16, 2012March 16, 2012 by Dick Kusleika

I love RSS. If you have a website and don’t have a feed, I don’t follow it. I monitor stuff via Google Reader and it’s a great way to keep up on a lot of sites. So it should only follow that monitoring spreadsheet changes via RSS would be great too. Right? Well, not really, but that never stopped me before.

First some caveats. This is just an experiment and not meant for general use. There may actually be some bugs in it, if you can believe that. If you want to modify the code for your own use, be warned that if you monitor too many cells it might be slow. Or you might get so much information that it’s worthless. OK, now that that’s over.

An RSS feed is an XML file that sits on a web server. This blog has such an XML file. When I post this blog entry, WordPress will update the XML file with an entry for this post. Occasionally Google Reader will check the XML file and see if there’s anything new. If there is, it will display the new stuff for anyone who has subscribed to the feed.

It’s fairly trivial to create an XML file, even from Excel. The potentially difficult part is putting that file on a web server. Oh, except for one little thing. I have a webserver right on my computer and you might too. Mine is called “C:\Users\dick\Dropbox\Public\”. Did I trick you? Dropbox has a Public folder and you can get a “public link” from any file in that folder, including the XML file we’re about to create. What a simple way to publish something to the intertubes. Here’s how you get that public link.

Alright, enough screwing around. Let’s get to the code.

I start with a class module called CChange (and its parent CChanges). CChange has the following read/write properties:

Address - the cell address we're watching for changes and a way to uniquely identify the instance. OldValue - the value in the cell before it changed. NewValue - the value in the cell after it changed. Modified - a time stamp when the changed occurred.

I’ve named a range on the sheet called RSSWatch. When the workbook opens, a CChange object is created for every cell in that range.

Sub Auto_Open()


    Set gclsChanges = New CChanges
    gclsChanges.Initialize
End Sub
Public Sub Initialize()
    Dim clsChange As CChange

    Dim rCell As Range
    For Each rCell In Sheet1.Range(gsNAMEDRNG).Cells

        Set clsChange = New CChange

        With clsChange

            .Address = rCell.Parent.Name & "!" & rCell.Address

            .OldValue = rCell.Value

            .NewValue = .OldValue

        End With

        Me.Add clsChange

    Next rCell

End Sub

These CChange objects are just sitting out there waiting to record any changes. They all have a Modified date of 12:00:00 AM (because I didn’t set anything) and in this state they won’t be written to the XML file. In the ThisWorkbook module, I use the Workbook_SheetChange event to monitor my range for changes. You might notice that I switch pretty liberally between sheet-specific references and general references. For instance, in the above Initialize method, I limit my range to Sheet1. When I went to code the event, I thought that someday I would want this to monitor different ranges on different sheets, so I used a Workbook level event. It’s totally inconsistent, but it will be helpful if you just ignore it.

Private Sub Workbook_SheetChange(ByVal Sh As Object, ByVal Target As Range)


    Dim clsChange As CChange

    Dim rCell As Range

    Dim rRng As Range
    On Error Resume Next

        Set rRng = Sh.Range(gsNAMEDRNG)

    On Error GoTo 0
    If Not rRng Is Nothing Then

        If Not Intersect(Target, rRng) Is Nothing Then

            For Each rCell In Target.Cells

                Set clsChange = gclsChanges.Change(Sh.Name & "!" & rCell.Address)

                If Not clsChange Is Nothing Then

                    clsChange.NewValue = rCell.Value

                    clsChange.Modified = Now

                End If

            Next rCell

        End If

    End If

End Sub

For every cell that has changed, I find it’s CChange brother and change the NewValue and Modified properties. I haven’t written the XML file yet. I still just have a bunch of CChange objects, except that at least one of them has a Modified property that will make it eligible to be included in the file. Changes get made and recorded and only the last value and time are saved. Then, when the workbook is saved, any eligible CChange objects are written to the file.

Private Sub Workbook_BeforeSave(ByVal SaveAsUI As Boolean, Cancel As Boolean)


    If Not gbDEBUG Then

        gclsChanges.WriteRSS

    End If

End Sub

And that’s pretty much all there is. Just kidding, there’s lots more. Before we look at more code, let’s take a look at what the file might look like after a few changes.

I’m sure you’re all experts on XML, so I’ll spare you the details, but the general hierarchy goes like this

rss channel title link description language lastBuildDate ttl item - one or more of these guys title link description pubDate /item /channel /rss

Now that you know what the file looks like, let’s write one. I called the WriteRSS method from the Workbook_BeforeSave event.

Public Sub WriteRSS()


          Dim xmlDoc As MSXML2.DOMDocument

          Dim xmlChannel As MSXML2.IXMLDOMElement

          Dim xmlLastBuild As MSXML2.IXMLDOMElement

          Dim xmlItem As MSXML2.IXMLDOMElement

          Dim clsChange As CChange

          Dim dtMax As Date
10        If Me.HasChanges Then

20            If Not Me.FileExists Then Me.CreateFile
30            Set xmlDoc = New MSXML2.DOMDocument

40            xmlDoc.Load gsPATH & Me.Filename

50            Set xmlChannel = xmlDoc.SelectSingleNode(gsXRSS).SelectSingleNode(gsXCHANNEL)

60            Set xmlLastBuild = xmlChannel.SelectSingleNode(gsXBUILD)

70            Me.LastBuildDate = ConvertDate(xmlLastBuild.Text)

80            dtMax = Me.LastBuildDate
90            For Each clsChange In Me

100               If clsChange.ShouldWrite Then

110                   Set xmlItem = clsChange.xmlItem(xmlDoc)

120                   xmlChannel.appendChild xmlItem
130                   If clsChange.Modified > dtMax Then dtMax = clsChange.Modified

140               End If

150           Next clsChange
160           Me.LastBuildDate = dtMax

170           xmlLastBuild.Text = Format(Me.LastBuildDate, gsFMTDATE)

180           FormatXMLDoc xmlDoc

190           xmlDoc.Save gsPATH & Me.Filename

200       End If

End Sub

Generally, I’m creating an XML file if it doesn’t exist, reading that file in, appending Items to it for any changes, and writing that file back out. The first thing I do is make sure there’s something to write. In line 10, I call the HasChanges property, which loops through all the CChange instances to see what’s eligible. If nothing has changed, there’s no need to create the XML file.

Line 20: If there isn’t already a file, I need to make one. I check to see if it’s out there.

Public Property Get FileExists() As Boolean


    FileExists = Len(Dir(gsPATH & Me.Filename)) > 0
End Property
Public Sub CreateFile()
    Dim xmlDoc As MSXML2.DOMDocument

    Dim xmlRss As MSXML2.IXMLDOMElement

    Dim xmlVer As MSXML2.IXMLDOMAttribute

    Dim xmlChannel As MSXML2.IXMLDOMElement

    Dim xmlNode As MSXML2.IXMLDOMElement
    Set xmlDoc = New MSXML2.DOMDocument
    Set xmlRss = xmlDoc.createElement(gsXRSS)

    Set xmlVer = xmlDoc.createAttribute(gsXVER)

    xmlVer.Value = gsRSSVERSION

    xmlRss.Attributes.setNamedItem xmlVer
    Set xmlChannel = xmlDoc.createElement(gsXCHANNEL)
    Set xmlNode = xmlDoc.createElement(gsXTITLE)

    xmlNode.Text = Me.Filename

    xmlChannel.appendChild xmlNode
    Set xmlNode = xmlDoc.createElement(gsXLINK)

    xmlNode.Text = gsLINK

    xmlChannel.appendChild xmlNode
    Set xmlNode = xmlDoc.createElement(gsXDESC)

    xmlNode.Text = "Changes made to " & ThisWorkbook.Name

    xmlChannel.appendChild xmlNode
    Set xmlNode = xmlDoc.createElement(gsXLANG)

    xmlNode.Text = gsLANG

    xmlChannel.appendChild xmlNode
    Set xmlNode = xmlDoc.createElement(gsXBUILD)

    xmlNode.Text = Format(Now - 1, gsFMTDATE)

    xmlChannel.appendChild xmlNode
    Set xmlNode = xmlDoc.createElement(gsXTTL)

    xmlNode.Text = glTTL

    xmlChannel.appendChild xmlNode
    xmlRss.appendChild xmlChannel
    xmlDoc.appendChild xmlRss
    xmlDoc.Save gsPATH & Me.Filename

End Sub

You’ll need to set a reference to Microsoft XML, v6.0 or similar. CreateFile sets up everything in the file that’s not an Item, like the title, link, description, language, etc. The basics of XML generation are 1) create a new node and 2) append it to its parent node.

Line 40: I read in the existing file or the one I just created. The Load method populates the XMLDOMDocument with all the hierarchies and data. Line 50 and 60 go find specific nodes in the file that I’m interested in. I want the Channel node because I’ll be appending Items to it. I want the lastBuildDate node so I can compare that to CChange Modified properties and only write new changes.

Line 70: The RSS example I got off the web showed dates formatted like Tue, 06 Mar 2012 21:28:01 CST. Neither Excel’s CDate or Datevalue functions can convert that to a date, so I have to strip off the day and the timezone, which I do in a separate function.

Line 80: Ultimately I want to make my lastBuildDate in the XML file equal to the latest date of all the CChange objects. I’m initializing the maximum variable here. If I don’t, every cell’s Modified would be greater than this date.

Line 100: For every CChange, I check to see if it needs to be written to the file.

Public Property Get ShouldWrite() As Boolean


    ShouldWrite = Me.Modified >= gclsChanges.LastBuildDate

End Property

If it’s been modified since the last time I wrote the file, it gets written this time. I really should also check to see if OldValue is different than NewValue. Right now if you change a cell and then change it back, it will still show up as a change. Sounds like a V2 enhancement.

The rest of the loop creates an Item, appends it to the Channel, and updates the maximum Modified date. The xmlItem property returns an object that can be appended.

Public Property Get xmlItem(xmlDoc As MSXML2.DOMDocument) As MSXML2.IXMLDOMElement


    Dim xmlReturn As MSXML2.IXMLDOMElement

    Dim xmlSubItem As MSXML2.IXMLDOMElement
    Set xmlReturn = xmlDoc.createElement(gsXITEM)
    Set xmlSubItem = xmlDoc.createElement(gsXTITLE)

    xmlSubItem.Text = Me.Address

    xmlReturn.appendChild xmlSubItem
    Set xmlSubItem = xmlDoc.createElement(gsXLINK)

    xmlSubItem.Text = gsLINK

    xmlReturn.appendChild xmlSubItem
    Set xmlSubItem = xmlDoc.createElement(gsXDESC)

    xmlSubItem.Text = Me.Description

    xmlReturn.appendChild xmlSubItem
    Set xmlSubItem = xmlDoc.createElement(gsXPUBDATE)

    xmlSubItem.Text = Format(Me.Modified, gsFMTDATE)

    xmlReturn.appendChild xmlSubItem
    Set xmlItem = xmlReturn

End Property

Not much to this – make a node and append it. The Description property is a read-only property that makes a nice English sentence describing what happened.

Line 160: I change the LastBuildDate to equal the max, then change that node in the XML file.

Line 180: This code I stole from VB Helper. It adds the line breaks and indentation that, while not necessary, is really helpful when debugging. No, I didn’t write this code perfectly the first time.

Finally I save the modified XML document. Dropbox publishes to the web and Google Reader reads it. Here’s what the last change looks like in the reader.

If I ever look at this again, here’s what I’d do in version 2

Put all the changes in one Item rather than one per cell
Check to see if a cell was changed and changed back and exclude it
Get the timezone from the Windows API rather than hardcoding it in a constant
Modify to use mulitple ranges on multiple sheets
Put the code in an add-in and look for workbooks to monitor
Change the Description to be easier to read

I like this Public Dropbox folder. I’m thinking of using it as a version control system to keep my add-ins up-to-date from multiple computers. It’s a shame that some companies block it. Oh well, your thoughts on the RSS code are welcome.

You can download RSSChanges.zip