Importing Fixed Width Text Files with ActiveX Data Objects

A few weeks ago I was struggling with fixed width text files. Remember 1995? Me neither, but I’m living it every day. I happened upon a stackoverflow.com post that dealt with using ADO for this purpose. I don’t remember the post, so no link. It did, however, lead me to this MSDN article, which was very helpful.

If you haven’t been following along, I wrote a post about creating a sample fixed-width file and one about importing said file. This post is about importing that file with ADO. You may remember ADO from such database objects as Connection, Command, and Recordset. But, like me, you may never have considered using to import text files; or even new that it could.

Before I get into the specifics, there are two things that endeared me to this method. First, it allows me to only import the data I want and easily ignores headers, totals, and other non-record data. Second, it’s super fast. I had a huge text file that took several minutes to read using the Input$ function. I got it down to 90 seconds using ADO. Ninety seconds still stinks, but it beats having to get a coffee every time the code runs. The bonus third reason I love ADO is that I can replicate it for different text files easily. Usually, fixed width text files are not arranged as normalized data, so there are some challenges. But I went from setting this up for one specific report to a dozen reports very quickly. Let’s see how it’s done.

Go read the MSDN article if you want the specifics, but basically you need a file called Schema.ini that tells ADO what’s what in your file. I don’t take a crap without a class module, so we’ll be using class modules here. Didn’t this use to be a family friendly blog?

If you couldn’t tell, the text file in question contains transactions from a general ledger. That means I need a Transaction class to hold each of them. Using the column headers from the file, I create a CTransaction class module.

Option Explicit

Public TransactionID As Long
Public Entry As String
Public Period As Long
Public PostDate As Date
Public GLAccount As String
Public Description As String
Public Srce As String
Public Cflow As Boolean
Public Ref As String
Public Post As Boolean
Public Debit As Double
Public Credit As Double
Public Alloc As Boolean

My VBHelpers add-in quickly converts those to properties and creates a CTransactions parent class. Next, I create an MEntryPoints standard module and insert the following code.

Public Sub ImportGLTransactions()
   
    Dim clsTransactions As CTransactions
    Dim sh As Worksheet
    Dim sFile As String
   
    sFile = Application.GetOpenFilename("*.txt, *.txt")
   
    If sFile <> "False" Then
        Set clsTransactions = New CTransactions
        clsTransactions.FillFromFile sFile
        Set sh = Workbooks.Add.Sheets(1)
        clsTransactions.WriteToRange sh.Range("A1")
    End If

End Sub

I don’t have a FillFromFile method or a WriteToRange method, but I like to write my main procedure as if I already had those. You’ll need a reference to Microsoft ActiveX Data Objects 2.8 Library (although any version close to that will do). The FillFromFile method is pretty simple. It creates and ADO Connection and an ADO Recordset, then loops through the recordset adding CTransaction instances as it reads them in. It’s treating our text file as if it’s a database with field names and everything.

Public Sub FillFromFile(ByVal sFile As String)
   
    Dim adCn As ADODB.Connection, adRs As ADODB.Recordset
    Dim vaConn As Variant, aSql(1 To 4) As String
    Dim sPATH As String
    Dim clsTransaction As CTransaction
   
    sPATH = Replace$(sFile, Dir$(sFile), vbNullString)
   
    'We'll talk about this line later
    MakeSchema sFile, sPATH, Me.Schema
   
    'Create a connection string and SQL statement
    vaConn = GetConnectionString(sPATH)
    aSql(1) = "SELECT"
    aSql(2) = Join(Me.Columns, ",")
    aSql(3) = "FROM [" & Dir$(sFile) & "]"
    aSql(4) = "WHERE PostDate Like ""[0-1][0-9]/[0-9][0-9]/[0-9][0-9][0-9][0-9]"""
   
    'Open the connection and the recordset
    Set adCn = New ADODB.Connection
    adCn.Open Join(vaConn, ";")
   
    Set adRs = New ADODB.Recordset
    adRs.Open Join(aSql, Space(1)), adCn, adOpenStatic, adLockReadOnly, adCmdText
                   
    'Loop through the rs and create CTransaction instances
    If Not adRs.BOF And Not adRs.EOF Then
        Do While Not adRs.EOF
            Set clsTransaction = New CTransaction
            clsTransaction.FillFromRecordset adRs
            Me.Add clsTransaction
            adRs.MoveNext
        Loop
    End If
   
    adRs.Close
    adCn.Close
   
End Sub

We’ll hold off on how to create the Schema file for now. The Connection string is created with this little utility. You pass in the path and returns an array of strings ready to be joined.

Function GetConnectionString(ByVal sPATH As String) As Variant
   
    Dim aConn(1 To 3) As String
   
    aConn(1) = "Provider=Microsoft.Jet.OLEDB.4.0"
    aConn(2) = "Data Source=" & sPATH
    aConn(3) = "Extended Properties=""text;HDR=No;FMT=FixedLength"""

    GetConnectionString = aConn
   
End Function

I’ve recently starting using arrays and Joins to concatenate strings of any length. I find it makes the code much more readable and manageable once you get used to it. Let’s talk about that SQL statement. In Schema.ini, I’ve defined column names and column widths. We’ll look at it in a moment. The SQL statement selects all the columns from the text file based on some criteria. The first section of the SQL statement is the SELECT keyword. For the second section, I have a property that returns an array of columns. You could just as easily use “SELECT * FROM”, but I was recently shown the benefit of following the never-select-astrisk rule, so I’m trying to be good.

Public Property Get Columns() As Variant
   
    Columns = Array("Entry", "Period", "PostDate", "GLAccount", "Description", "Srce", "Cflow", "Ref", "Post", "Debit", "Credit", "Alloc")
   
End Property

Just an array of column names used in the SQL statment and in Schema.ini. The third section of the SQL statement is the FROM keyword followed by the file name in brackets. The Dir$ function strips the path out of the fullname and returns only the file name. You don’t need the path in the SQL statement because Schema.ini is in the same directory as the text file. It has to be, so it’s not looking anywhere else.

The final section of the SQL statement is the WHERE clause. This is where you have to get a little creative. As I scan down my text file, I need to find some characteristic of “good” rows that is not present in “bad” rows. For this example, it was pretty easy. Every row that I want has a real date in PostDate and every row that I don’t want doesn’t. They aren’t all that easy. Would you like to see some examples of WHERE clauses I’ve used? Well, would you?

aSql(4) = "WHERE (Tax Like ""__/__/____"" And Not(IsNull(Vendor))) Or Account Like ""[1-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]"""
aSql(4) = "WHERE PaymentDate Like ""__/__/____"" And Not DiscountAmount Like ""%[ ][ ].00"""
aSql(4) = "WHERE PostDate Like ""__/__/____"" Or GLAccount Like ""[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]"""
aSql(4) = "WHERE InvoiceDate Like ""__/__/____"" AND ItemID <> ""Invoice Totals"""
aSql(4) = "WHERE Kitted = ""YES"" Or Kitted = ""NO"""
aSql(4) = "WHERE TranDate Like ""__/__/____"" Or (LotNo Like ""[A-Z][A-Z]%"" And TranDate Like ""     [A-Z]%"" And Not LotNo Like ""Item ID%"")"

Some of those are pretty gnarly. Our example file has lots of options, not just PostDate. We have three Yes/No fields and we could use any of those. It’s pretty unlikely that header information or totals rows are going to have a Yes or No in that same position. The idea is only get the rows you want. For some of the dates I used “__/__/____” and I think it’s pretty safe. But for this example, I used “[0-1][0-9]/[0-9][0-9]/[0-9][0-9][0-9][0-9]” which is a little more specific.

With a good connection string and SQL statement, I open the connection, open the recordset, and start looping. There’s not much to filling the CTransaction class via the FillFromRecordset method. It uses an Nz function that I wrote to avoid Null problems and return a default. For numbers and dates, I specify that I want to return a zero in place of a Null. For strings, it automatically returns vbNullString. The FillFromRecordset procedure is a method in CTransaction. The Nz function is in a standard module.

Public Sub FillFromRecordset(ByRef adRs As ADODB.Recordset)
   
    Me.Entry = Nz(adRs.Fields("Entry"))
    Me.Period = Nz(adRs.Fields("Period"), 0)
    Me.PostDate = Nz(adRs.Fields("PostDate"), 0)
    Me.GLAccount = Nz(adRs.Fields("GLAccount"))
    Me.Description = Nz(adRs.Fields("Description"))
    Me.Srce = Nz(adRs.Fields("Srce"))
    Me.Cflow = Nz(adRs.Fields("Cflow")) = "Yes"
    Me.Ref = Nz(adRs.Fields("Ref"))
    Me.Post = Nz(adRs.Fields("Post")) = "Yes"
    Me.Debit = Nz(adRs.Fields("Debit"), 0)
    Me.Credit = Nz(adRs.Fields("Credit"), 0)
    Me.Alloc = Nz(adRs.Fields("Alloc")) = "Yes"
   
End Sub

Function Nz(fldTest As ADODB.Field, _
    Optional vDefault As Variant) As Variant
   
    If IsNull(fldTest.Value) Then
        If IsMissing(vDefault) Then
            Select Case fldTest.Type
                Case adBSTR, adGUID, adChar, adWChar, adVarChar, adVarWChar
                    Nz = vbNullString
                Case Else
                    Nz = 0
            End Select
        Else
            Nz = vDefault
        End If
    Else
        Nz = fldTest.Value
    End If
   
End Function

Now on to Schema.ini, at long last. I have a MakeSchema procedure in a standard module that simply creates the file where it’s supposed to. One of the arguments to MakeSchema is a string for the contents of the file. That comes from the Schema property of the CTranscations class (shown as Me.Schema in the FillFromFile method above). The Schema property takes the columns from the Columns property and puts them together with column widths to create the string.

Public Property Get Schema() As String
   
    Dim aReturn() As String
    Dim vaNames As Variant
    Dim vaWidths As Variant
    Dim i As Long
   
    vaNames = Me.Columns
    vaWidths = Array(8, 4, 12, 13, 27, 5, 4, 10, 4, 19, 22, 4)
    ReDim aReturn(LBound(vaNames) To UBound(vaNames))
   
    For i = LBound(vaNames) To UBound(vaNames)
        aReturn(i) = "Col" & i + 1 & "=" & vaNames(i) & Space(1) & "Text Width" & Space(1) & vaWidths(i)
    Next i
   
    Schema = Join(aReturn, vbNewLine)
   
End Property

The widths array is simply how many characters wide each column is. The lines in my file are 132 characters long. Counting them is a pain. Usually, I grab a couple of representative lines from the text file and put them in a spreadsheet. Here’s how I came up with the column widths for this file.

Lines 2-5 are Courier New 9pt and the first two lines are typed – no fancy formula to get those numbers. I like to get a couple of representative lines so I don’t miss anything. Then I go put pipes where I want the column breaks to be and put this formula in A7

=FIND("|",$A$4,A6+1)

and fill down until I get an error. Column B is just the difference. Finally, the MakeSchema utility takes that string and puts into a file.

Public Sub MakeSchema(ByVal sFile As String, ByVal sPATH As String, ByVal sCols As String)
   
    Dim lFile As Long
    Dim aWrite(1 To 4) As String
   
    Const sSCHEMA As String = "Schema.ini"
   
    aWrite(1) = "[" & Dir$(sFile) & "]"
    aWrite(2) = "Format=FixedLength"
    aWrite(3) = vbNullString
    aWrite(4) = sCols
   
    lFile = FreeFile
    Open sPATH & sSCHEMA For Output As lFile
    Print #lFile, Join(aWrite, vbNewLine)
    Close lFile
   
End Sub

And the Schema.ini file looks like this:

The final piece is writing all of the CTransaction objects to a range. The CTransactions collection class has a WriteToRange method that calls an OutputRange property. It’s pretty straightforward.

Public Sub WriteToRange(rStart As Range)
   
    Dim vaWrite As Variant
   
    vaWrite = Me.OutputRange
    rStart.Resize(UBound(vaWrite, 1), UBound(vaWrite, 2)).Value = vaWrite
    rStart.CurrentRegion.EntireColumn.AutoFit
   
End Sub

Public Property Get OutputRange() As Variant
   
    Dim aReturn() As Variant
    Dim clsTransaction As CTransaction
    Dim lCnt As Long
    Dim vaHead As Variant
    Dim i As Long
   
    ReDim aReturn(1 To Me.Count + 1, 1 To 12)
   
    vaHead = Me.Columns
    lCnt = lCnt + 1
    For i = LBound(vaHead) To UBound(vaHead)
        aReturn(lCnt, i + 1) = vaHead(i)
    Next i
   
    For Each clsTransaction In Me
        lCnt = lCnt + 1
        With clsTransaction
            aReturn(lCnt, 1) = "'" & .Entry
            aReturn(lCnt, 2) = .Period
            aReturn(lCnt, 3) = .PostDate
            aReturn(lCnt, 4) = "'" & .GLAccount
            aReturn(lCnt, 5) = "'" & .Description
            aReturn(lCnt, 6) = "'" & .Srce
            aReturn(lCnt, 7) = .Cflow
            aReturn(lCnt, 8) = "'" & .Ref
            aReturn(lCnt, 9) = .Post
            aReturn(lCnt, 10) = .Debit
            aReturn(lCnt, 11) = .Credit
            aReturn(lCnt, 12) = .Alloc
        End With
    Next clsTransaction
   
    OutputRange = aReturn
   
End Property

I put apostrophes in front of the strings so they don’t get converted just because they look like a number or a date. And when I’m done, I sum up the Debit column and compare it to the text file.

It’s a match! I must have done something right. No comments for you accountants about how my debits don’t equal my credits. You try to make fake data this internally consistent. :)

You can download FixedWidthADO.zip

Importing Fixed Width Text Files

I’ve been working with fixed width text files quite a bit lately. The built-in method for importing these files is terrible. First, it’s yet another wizard and I think it would be better as a single form (but then I think everything is better as a single form). The worst part, by far, is the tiny window that shows a preview of your data.

After a few days of using this wizard, it was pretty obvious that I was going to write my own, if for no other reason than to make that window bigger. My importer would be significantly different than Excel’s. Microsoft has to make their method flexible and universal, but I don’t. I know certain things about my text files and can build in some assumptions to make things better. For one, my files have repeating page headers. While Excel allows you to start your import at something other than first row to skip all that crap, it doesn’t do me any good because there’s just more crap to come.

Step 2 of the wizard allows you to add, delete, and move the dividing lines that delineate your columns. I find that Excel does a pretty poor job at placing these lines. But I grant that I haven’t used this on a wide variety of files so it’s entirely possible that their algorithm is the best – it’s just not the best for the types of files I’m using.

Did you think I was going to blow by this step without commenting on the lack of keyboard support? Not a chance. To move the lines, click them and drag them. To add a line, single click. To delete a line, double click. Actually, to delete a line, double click next to the line you want to delete, then double click that new line to delete it, then carefully double click on the original line to delete it. You know what I’m talking about. Click, click, click. Where’s the keyboard love?

The lack of large enough preview window really hurts on this step. I’ll discuss determining where the column lines should go later in this post. Normally after step 2, I just hit finish. But let’s take a look at the last step anyway.

I don’t use this step much because the defaults are really good. As much as I don’t like wizards, I have to give MS props for putting a Finish button on step 2 so I can skip step 3. Now that you’ve defined your columns, this step allows you to specify a data type. The General type works well for most situations, but if you have some text that happens to look like a number, it’s best to set the column to the Text format.

Under the Advanced button, you can switch what decimals and commas mean and, most importantly to me, tell it how to handle trailing minus signs.

So back to how nothing’s ever good enough and I can do everything better. I’m only dealing with fixed width files and I’m always starting on the first row, so step 1 of the wizard is gone. Step 2 and 3 could be combined, I think. It should draw a combobox over each column that let’s you choose the format. When you add or delete columns, it redraws the boxes. The best use of resources is getting the columns right in the first place. If you don’t have to move columns because it guesses so well, then the whole thing becomes a breeze.

Let’s look at this sample file in terms of columns. In order to get all the numbers to show, this chart is kind of big.

I wrote a little macro to analyze the file and report how many characters are in each column.

Sub ReadTextFile()
   
    Dim sFile As String
    Dim lFile As Long
    Dim sInput As String
    Dim vaLines As Variant
    Dim i As Long, j As Long
    Dim aChars() As Long
    Dim bLow As Boolean
   
    sFile = "C:\Users\dkusleika\Dropbox\Excel\FixedWidthExample.txt"
    lFile = FreeFile
       
    Open sFile For Input As lFile
    sInput = Input$(LOF(lFile), lFile)
    Close lFile
   
    vaLines = Split(sInput, vbNewLine)
    ReDim aChars(1 To Len(vaLines(0)) + 1, 1 To 1)
   
    For i = LBound(vaLines) To UBound(vaLines)
        For j = 1 To Len(vaLines(i))
            If Mid$(vaLines(i), j, 1) <> Space$(1) Then
                aChars(j, 1) = aChars(j, 1) + 1
            End If
        Next j
    Next i
   
    Sheet1.Range("B2").Resize(UBound(aChars, 1), 1).Value = aChars
   
End Sub

Then I charted them. It seems pretty clear that there’s a break around 7, 12, 23, and 35. The points at 41 and 45 are a little less clear, but starting at 47, you see a clear downward trend. This is the tell-tale sign of left-justified text. Similarly, 87-104 is a right-justified number. The headers muddy up the waters a bit because they contain data that’s no good to me, but still adds to the character count. As I mentioned before, I’m not building a general-purpose solution and it just so happens I can remove the headers. So I did.

Sub ReadTextFileNoHeaders()
   
    Dim sFile As String
    Dim lFile As Long
    Dim sInput As String
    Dim vaLines As Variant
    Dim i As Long, j As Long
    Dim aChars() As Long
    Dim bHeader As Boolean
   
    sFile = "\\99991-dc01\99991\dkusleika\My Documents\FixedWidthExample.txt"
    lFile = FreeFile
       
    Open sFile For Input As lFile
    sInput = Input$(LOF(lFile), lFile)
    Close lFile
   
    vaLines = Split(sInput, vbNewLine)
    ReDim aChars(1 To Len(vaLines(0)) + 1, 1 To 1)
   
    For i = LBound(vaLines) To UBound(vaLines)
        If Len(vaLines(i)) > 0 Then
            If Asc(Left$(vaLines(i), 1)) = 12 Then
                bHeader = True
            ElseIf vaLines(i) = String(132, "-") Then
                bHeader = False
            End If
        End If
       
        If Not bHeader Then
            For j = 1 To Len(vaLines(i))
                If Mid$(vaLines(i), j, 1) <> Space$(1) Then
                    aChars(j, 1) = aChars(j, 1) + 1
                End If
            Next j
        End If
    Next i
   
    Sheet3.Range("B2").Resize(UBound(aChars, 1), 1).Value = aChars
   
End Sub

The ASCII code for the page break character is 12. All of my headers end in a string of 132 dashes. That’s damn convenient. Look what happens when I remove the headers.

Based on that, I think I could predict the column breaks perfectly. But there’s one aspect of this file that allows me to say that. I don’t have any truly variable length, left-justified text followed by any truly variable length, right-justified number. For instance, if the description column was immediately followed by the debit column, it might be difficult to determine exactly where to break it.

How did I solve that problem? I didn’t. This is as far as I got writing my own text import wizard. Out of nowhere, I read something about reading text files with ADO and all my problems were solved. I had used external data tables to read text files, but never ADO. I’ll post about how I’m importing text files with ADO in my next post.

Queen’s day sale

As some of you may know, I’m Dutch and live in “The Kingdom Of The Netherlands”.
Well, tomorrow is an important day for the Dutch people: Queen’s Day.

More so, tomorrow is an extra special day.

Not because our Queen abdicates to make room for King Willem-Alexander, but because my 24 hour “Queen’s Day sale” is held. :-)

Anyone who places an order for my RefTreeAnalyser between April 30, 2013, 8:00 AM GMT and May 1st, 2013 8:00 AM GMT will receive a 50 % rebate on the list price.

### Update: Well, it’s been a great day yesterday and the Queen’s day sale has ended.###

I’m also going to give away some of the licenses: I’ll randomly select 1 out of every 10 orders placed in that period of time. These people will receive their license for free!

Note that you can head over to my website now and download and install the tool to try it, most of it is fully functional without the license code, you just get a nag screen now and then.

Have a nice Queen’s day (which was held in my home town in 2011) and enjoy a taste of our “orange fever“!

Regards,

Jan Karel Pieterse
www.jkp-ads.com

Draw a circle in an Excel chart

By default, Excel has a limited number of charts. That does not mean that those are the only charts one can create. It turns out that with a little imagination and creativity, we can format and configure the default charts so that the effect is like many other kinds of charts.

One of the more versatile of charts is the XY Scatter chart. We can use it as the base for many data visualization tasks. Recently, for a client, I used one to create a radial org chart as in Figure 1. Such a chart is also called a Node-Link chart or a Reingold–Tilford Tree.

image001
Figure 1 – Example of a radial org graph created in an Excel XY Scatter chart
after removal of all identifiable information and the obfuscation of data
necessary to protect the client’s confidentiality.

A lot of “out of the box” work into making this chart. One key element was the set of equidistant concentric circles that provide a visual reference for the small colored dots. This note demonstrates how to create concentric circles in an Excel XY Scatter chart.

For a version in a page by itself (i.e., not in a scrollable iframe as below) visit http://www.tushar-mehta.com/publish_train/xl_vba_cases/0610%20draw%20circle.shtml

Tushar Mehta

Analyze ‘free’ money

Recently, I received yet another promotion from yet another company offering me money at zero percent interest with the predictable asterisk next to the zero percent. Instead of just shredding the offer I decided to create a downloadable Excel template to analyze the offer, which was an interest-free 18 month loan for a 4% transaction fee with a minimum $10 fee.

Obviously, the transaction fee makes sure that the money is not ‘free.’ So, how does one calculate the cost of the loan? I settled on an “effective interest rate.”

image001

For a version in a page by itself (i.e., not in a scrollable iframe as below) visit http://www.tushar-mehta.com/publish_train/xl_vba_cases/0920%20free%20money.shtml

Tushar Mehta

Date Picker Add-in for Excel

Hi all

Post Updated : 28-April-2013

I upload a Date Picker add-in based on John Walkenbach’s Date Picker on my site for the Mac and Windows.

This Date Picker have local Day letters in the header and also local Date formats in the format dropdown and have a option to use 3 week number systems(also ISO weeks).

Windows version :

http://www.rondebruin.nl/win/addins/datepicker.htm

Mac version :

http://www.rondebruin.nl/mac/addins/datepicker.htm

Have a nice day

Stop Replying to Yourself

In this edition of How to Make Outlook More Like GMail, I address the problem of replying to a message that you have sent. Inexplicably, Outlook creates a message addressed to you. That make sense from the standpoint that a reply to should be addressed to the sender. But then there’s the whole common sense thing. You know that thing where you wouldn’t send an email to yourself.

I’m a chronic reply-er to my own sent messages. I send a message to someone that omits some key information like, oh I don’t know, the attachment, and need to send a quick follow up.

In general, I need to monitor newly created email messages, see if they’re addressed to me, and then figure out to whom they should be addressed. That starts with creating an Inspectors variable declared WithEvents. The WithEvents keyword exposes the events of the object declared. In the ThisOutlookSession module:

Private WithEvents olInsp As Inspectors

Private Sub Application_Startup()
   
    Set olInsp = Application.Inspectors
   
End Sub

Now by using the dropdown boxes at the top of the code module, I can select olInsp from the left and NewInspector (the only event for this type of object) from the right. I cobbled this next bit of code from various places, so not only will I not be giving proper attribution, but I don’t know what it all means either.

Private Sub olInsp_NewInspector(ByVal Inspector As Inspector)
   
    Dim olMi As MailItem
    Dim olParent As MailItem
    Dim olRecip As Recipient
   
    If Inspector.currentItem.Size = 0 And Inspector.currentItem.Class = olMail Then
       
        Set olMi = Inspector.currentItem
       
        If olMi.Recipients.Count = 1 Then
            If olMi.Recipients.Item(1).Name = "Dick Kusleika" Then
                Set olParent = FindParentMessage(olMi)
               
                If Not olParent Is Nothing Then
                    olMi.Recipients.Remove 1
                   
                    For Each olRecip In olParent.Recipients
                        olMi.Recipients.Add olRecip.Name
                    Next olRecip
                   
                    olMi.Recipients.ResolveAll
                End If
            End If
        End If
    End If
   
End Sub

I’m not sure why (or if) I need to check that the size of the item is zero, but there it is. The second part of the If statement makes sure it’s email because an Inspector can hold any type of item. Next I check that there is only one Recipient and that it’s me. The FindParentMessage sub is shown below and it finds the message I’m replying to.

As long as everything works up to this point, I’m removed as the only Recipient and every recipient from the parent message is added. To find the parent message, I use the following code, which I borrowed from Sue Mosher. Sue says

The key to finding the parent message for a reply or forward is knowing that all messages in a conversation have the same ConversationTopic value, while the ConversationIndex is increased by 5 bytes with each exchange.

Function FindParentMessage(msg As Outlook.MailItem) As Outlook.MailItem
   
    Dim sFind As String
    Dim sIndex As String
    Dim olFldr As Outlook.MAPIFolder
    Dim olItems As Outlook.Items
    Dim olMi As Object
   
    sIndex = Left$(msg.ConversationIndex, Len(msg.ConversationIndex) - 10)
   
    If Application.ActiveInspector Is Nothing Then
        Set olFldr = Application.ActiveExplorer.Selection.Item(1).Parent
    Else
        Set olFldr = Application.ActiveInspector.currentItem.Parent
    End If
   
    sFind = "[ConversationTopic] = " & Chr$(34) & msg.ConversationTopic & Chr$(34)
   
    Set olItems = olFldr.Items.Restrict(sFind)

    For Each olMi In olItems
        If olMi.Class = olMail Then
            If olMi.ConversationIndex = sIndex Then
                Set FindParentMessage = olMi
                Exit For
            End If
        End If
    Next olMi
   
End Function

I could reply from an open message (ActiveInspector) or from a list of messages (ActiveExplorer) and I get the folder depending on which it was. I find all of the messages in that folder with the same ConversationTopic, then loop through them to determine which has the correct ConversationIndex.

And that’s it. Now when I reply to a message I’ve sent, it’s addressed properly.

Excel corrupts certain workbooks in migrating from 2003 to 2007

I got a email from a client asking for help because Excel was “destroying,” to use his terminology, his 2003 workbook after conversion to the 2007 format. And, after analyzing the kind of change Excel made, I had to agree.

The following in 2003

badnames 1
Figure 1

becomes, in 2013 (and in 2010),

badnames 5
Figure 2

The basic problem is that names that are legitimate names in Excel 2003 may become unacceptable in 2007 (or later). But, a more devastating problem is with a formula using a name with a dot in it. Even though it is completely legitimate, Excel changes the dot to a colon. This causes the formula =SW1.SW2 to become =SW1:SW2. Don’t ask me why. It just does. The result is the formula is all wrong and destroys the integrity of the workbook.

It appears that the cause may be Excel trying to help manage the transition of a XLS workbook into the newer format. In 2007, Microsoft increased the number of columns from 256 to 16,384. Consequently, the reference to last column went from IV to XFD. So, a name such as SW1, completely OK in 2003, became unacceptable in 2007. On converting a XLS file to a XLSX file, Excel will convert such names by adding an underscore at the start of the name. But, it seems to go beyond that, converting formula references to certain names with dots in them to a colon. This happens if both the tokens to the left and to the right of the dot could be legitimate cell references. So, Excel converts the formula =XFD1.XFD2 to =XFD1:XFD2 but it will leave =XFD1.XFE2 alone.

To replicate the problem:

  • Start with Excel 2003. Create a workbook and add the names shown in the Figure 1. Save and close the workbook.
  • Open the workbook in Excel 2013. Save it as a XLSX file. Acknowledge the warning message (see Figure 3),

    badnames 3
    Figure 3

  • Close and reopen the new XLSX workbook. The formulas will have the errors shown in Figure 2.

The safest way to work around this problem is to add an underscore before every name in the workbook before making the transition to the 2007 format. Obviously, the quickest way to do this would be with a very simple VBA procedure. But, through trial and error I discovered the code will not work in 2003. It runs without any problems but it doesn’t do anything!

So, the correct way to use the code is the following sequence.

  • Open the XLS file in 2013 (or 2010).
  • Run the macro below.
    Option Explicit

    Sub fixNames()
        Dim aName As Name
        For Each aName In ActiveWorkbook.Names
            With aName
            If Left(.Name, 1) <> "_" Then _
                .Name = "_" & .Name
            End With
            Next aName
        End Sub
  • Now, save the file in the newer format. If your original workbook had no code in it, save the file as a XLSX file and acknowledge the warning that the VB project will be lost.
  • Close and reopen the file. You should see the correct data with all the names now starting with an underscore.

    badnames 7
    Figure 4

Tushar Mehta

Almost There

Look how close I am. It’s like the odometer rolling over on my car.

Sample Fixed Width Text File

I have a post or two in my head that deals with fixed width text files, something I’ve been dealing with quite a bit. Like me, you probably don’t run into fixed width text files in your personal life. It’s usually some crappy computer program at your job that forces you to deal with them. The problem is that I can’t simply throw around private company information, so I had to replicate a real fixed width file with obfuscated information. Before I post about all the lovely things I’ve been doing with these files, I thought I’d post about how I made the sample.

fixed width text file sample

Sub MakeSampleFile()
   
    Dim sFile As String, sOutFile As String
    Dim lFile As Long
    Dim sInput As String
    Dim vaLines As Variant
    Dim i As Long
    Dim aOutput() As String
    Dim aLine() As String
    Dim dPdDebit As Double, dPdCredit As Double
    Dim dTotDebit As Double, dTotCredit As Double
    Dim dThisDebit As Double, dThisCredit As Double
   
    sFile = "\\99991-dc01\99991\dkusleika\My Documents\AJE_Cost.txt"
    sOutFile = "C:\Users\dkusleika\Dropbox\Excel\FixedWidthExample2.txt"
    lFile = FreeFile
       
    'Open the input file, read it all in, split it by line
    Open sFile For Input As lFile
    sInput = Input$(LOF(lFile), lFile)
    Close lFile
    vaLines = Split(sInput, vbNewLine)
    ReDim aOutput(LBound(vaLines) To UBound(vaLines))
   
    'Loop through the lines of the import file
    For i = LBound(vaLines) To UBound(vaLines)
        ReDim aLine(1 To 10)
        'Lines with dates need special processing
        If IsDate(Mid(vaLines(i), 13, 10)) Then
            aLine(1) = Left$(vaLines(i), 24)
            aLine(2) = GetGLAccount 'make up a GL account
            aLine(3) = GetDescription(Mid$(vaLines(i), 38, 64 - 38 + 1)) 'Desc with random letters
            aLine(4) = "S1" & Space(3)
            aLine(5) = GetYesNo 'Get a Yes or No randomly
            aLine(6) = GetDescription(Mid$(vaLines(i), 74, 83 - 74 + 1))
            aLine(7) = GetYesNo
           
            'Get a random debit at the same scale as the current debit,
            'and keep track of it for totals
            dThisDebit = GetRandomNumber(Mid$(vaLines(i), 88, 106 - 88 + 1))
            If dThisDebit = 0 Then
                aLine(8) = Pad(Space(1), 17, vbNullString)
            Else
                aLine(8) = Pad(Format(dThisDebit, "#,##0.00"), 17, vbNullString)
            End If
            dPdDebit = dPdDebit + dThisDebit
           
            'Same for credit
            dThisCredit = GetRandomNumber(Mid$(vaLines(i), 107, 128 - 107 + 1))
            If dThisCredit = 0 Then
                aLine(9) = Pad(Space(1), 20, vbNullString) & Space(4)
            Else
                aLine(9) = Pad(Format(dThisCredit, "#,##0.00"), 20, vbNullString) & Space(4)
            End If
            dPdCredit = dPdCredit + dThisCredit
           
            aLine(10) = GetYesNo
       
        'Period balances need special processing
        ElseIf vaLines(i) Like "*BALANCE PERIOD*" Then
            aLine(1) = Left$(vaLines(i), 88)
            aLine(2) = Pad(Format(dPdDebit, "#,##0.00"), 16, vbNullString)
            aLine(3) = Pad(Format(dPdCredit, "#,##0.00"), 20, vbNullString)
            dTotDebit = dTotDebit + dPdDebit
            dTotCredit = dTotCredit + dPdCredit
            dPdDebit = 0: dPdCredit = 0
           
        'Ending balances need special processing
        ElseIf vaLines(i) Like "*BALANCE*" Then
            aLine(1) = Left$(vaLines(i), 88)
            aLine(2) = Pad(Format(dTotDebit, "#,##0.00"), 16, vbNullString)
            aLine(3) = Pad(Format(dTotCredit, "#,##0.00"), 20, vbNullString)
       
        'Headers, blanks, and other stuff comes straight over as is
        Else
            aLine(1) = vaLines(i)
        End If
        aOutput(i) = Join(aLine, vbNullString)
    Next i
   
    lFile = FreeFile
    Open sOutFile For Output As lFile
    Print #lFile, Join(aOutput, vbNewLine)
    Close lFile
   
End Sub

That procedure is a bit longer than I like, but it’s not something I wanted to spend a lot of time on. It reads in the legitimate file, then goes line-by-line through it. When it encounters a line that needs changing, it creates fake data to put in place of the real data. I want the totals to match so I have something to compare it to after I parse the text file (in a later post), so I have to keep track of the fake numbers I make up and total them appropriately.

The GL Account is a 10 digit number that start with 1-5 and can have any other digits after that.

Function GetGLAccount() As String
   
    Dim i As Long
    Dim aOut(1 To 10) As String
   
    aOut(1) = Int((5 - 1 + 1) * Rnd + 1)
    For i = 2 To 10
        aOut(i) = Int((9 - 0 + 1) * Rnd + 0)
    Next i
   
    GetGLAccount = Join(aOut, vbNullString) & Space(3)
   
End Function

The description obfuscator respects spaces, but replaces anything else with an upper case letter.

Function GetDescription(sDesc As String) As String
   
    Dim aOut() As String
    Dim lLen As Long
    Dim i As Long
   
    ReDim aOut(1 To Len(sDesc))
   
    For i = 1 To Len(sDesc)
        If Mid$(sDesc, i, 1) = Space(1) Then
            aOut(i) = Space(1)
        Else
            aOut(i) = Chr$(Int((90 - 65 + 1) * Rnd + 65))
        End If
    Next i
   
    GetDescription = Join(aOut, vbNullString)
   
End Function

The Yes/No generator gives me a Yes about 80% of the time. You have to stay positive, you know.

Function GetYesNo() As String
   
    If Rnd < 0.8 Then
        GetYesNo = "Yes "
    Else
        GetYesNo = "No  "
    End If
   
End Function

For the debits and credits, I wanted to stay somewhat realistic, so I kept the same scale as whatever number is there. To determine the scale, I remove all spaces, commas, and periods from the screen, then take the length x 2. A number like $9,453.65 will have a scale of 4 and will produce a number between 1,000 and 9,999. Rnd, as you know, generates a number between 0 and 1. I multiply Rnd by 10^Scale to get the right number of digits, then round it to two decimal places.

Function GetRandomNumber(sNum As String) As Double
   
    Dim lScale As Long
    Dim dReturn As Double
   
    lScale = Len(Trim$(Replace$(Replace$(sNum, ".", vbNullString), ",", vbNullString))) - 2
   
    If lScale > 0 Then
        dReturn = Round(Rnd * 10 ^ (lScale), 2)
    End If
   
    GetRandomNumber = dReturn
   
End Function

Finally, I have a Pad function to fill out spaces around numbers. First, Pad truncates the string to the right width in case it’s already too long. If it’s not too long, spaces are inserted in front of it to fill out the right width. Then there’s the sAfter argument that you’re probably wondering about. I don’t have trailing negatives in this report, but I have had them in some others. I need to be able to stick a negative or a space after the string and used the sAfter argument to do it. I believe they’re all null strings in this example.

Making realistic sample data sucks.

You can download FixedWidthExample2.txt