Import Text in VBA

VBA provides some file operation statements that can be used to read from or write to a text file. Most text files are easily imported into Excel using File > Open and the Text Import Wizard. Sometimes the file contains so much junk to clean up that it’s easier to write your own procedure.

In this example, I have a carat (^) delimited text file with some extraneous tabs and linefeeds.

ITF1

I want to read this file in one line at a time, strip out the characters I don’t want, then write the remaining text to cells.

Sub GetTextFile()
   
    Dim sFile As String
    Dim sInput As String
    Dim lFNum As Long
    Dim vaFields As Variant
    Dim i As Long
    Dim lRow As Long
    Dim vaStrip As Variant
   
    Const sDELIM = “^” ‘Set the delimeter
   
    lFNum = FreeFile
    sFile = “C:CaratDelim.txt”
    vaStrip = Array(vbLf, vbTab) ‘list the text to strip
   
    ‘Open the file
   Open sFile For Input As lFNum
   
    ‘Loop through the file until the end
   Do While Not EOF(lFNum)
        Line Input #lFNum, sInput ‘input the current line
       
        ‘remove the unwanted text
       For i = LBound(vaStrip) To UBound(vaStrip)
            sInput = Replace(sInput, vaStrip(i), “”)
        Next i
       
        ‘split the text based on the delimeter
       vaFields = Split(sInput, sDELIM)
        lRow = lRow + 1
       
        ‘Write to the worksheet
       For i = 0 To UBound(vaFields)
            Sheet1.Cells(lRow, i + 1).Value = vaFields(i)
        Next i
    Loop
   
    Close lFNum
   
End Sub

The result is a clean table in Excel.

ITF2

Of course, if the real life file was as simple as this example, you would just import it normally and do some finding and replacing. But the bigger the file and the more complex the text you need to remove, the more attractive a technique like this becomes.

Five Thirty Eight

This is the 538th post on Daily Dose of Excel. I meant to post when it was at 500, but somehow it slipped away from me. So, instead of celebrating 500, we’ll celebrate 538.

Five hundred thirty-eight might be the most boring number in the entire universe. The most interesting result of a Google search is a radio station from the Netherlands. Why, I wonder, is there so much English on their site? Don’t they have words for “Now on Air”?

It goes down hill from there. There’s Form I-538 Certification by Designated School Official from the INS.

This form is used to certify aliens seeking off-campus employment as a student and that the employment is related to the student alien’s field of study.

Zzzzzz. Oh, excuse me. I must have dozed off there. Who better than Internal Revenue to rival Immigration for the boredom champion. They have Publication 538 Accounting Periods and Methods (pdf). I’m an accountant and I think this is boring.

This publication explains some of the rules for accounting periods and accounting methods. In many cases, however, you may have to refer to the cited sources for a fuller explanation of the topic.

They can’t explain how long a freakin’ year is 33 pages? Finally, what Bibliographic Formats and Standards page would be complete without 538 System Details Note. Not the Online Computer Library Center, that’s for sure.

1st Indicator Undefined. The 1st indicator position is undefined and contains a blank

You go, OCLC! Daily Dose’s anniversary is coming up next month. I’d better mark it in my Outlook Calendar so I don’t forget to post THAT day too. Thanks for reading, everybody.

Comment Spam

It’s literally coming faster than I can delete it. I need to do something drastic. Most of it doesn’t hit the site, but I still have to delete it. I spent an hour deleting comment spam on Sunday. I’m looking at a few options and these are the ones on which I would like your opinion:

  • Preview – force users to preview their comments before they post. This prevents bots from flooding the posts with spam. If you’ve read JWalkblog, you know how this works.
  • Registered Users – Only users who have registered would be able to comment. I think 90% of the commenters are regular readers and are already registered. However, it’s one more username, password that you have to remember.
  • Close Comments – I’m seriously considering closing comments on posts that are one month old. Very few of those posts get comments that aren’t spam. Some do and some add valuable information which would be lost under this method.

I’m looking at other alternatives that are less intrusive to your reading experience. I will likely use a combination of techniques. If you have an opinion on the above three or any other ideas, please leave a comment.

Summing Text

Paul recently asked at j-walkblog.com

…In Excel, if I put “Paul” in cell A1, and 100 in cell A2, and in cell A3, I put in the formula “=A1+A2? I get #VALUE, however if I put in “=SUM(A1:A2)” I get 100. Isn’t summing adding? Why does one return a number and the other gimme #VALUE?

Here’s Dick’s theory: When you use an operator (e.g. +), you supply the numbers (operands?) on which to operate explicitly. “=A1+A2? – A1 and A2 are explicitly entered by the user, so there’s no need to “handle” non-numeric entries. If you don’t want an error, don’t include that cell reference in your formula. With SUM, however, you supply a range, not individual cells. In order to make SUM easier to use, it handles non-numeric entries in a specific and consistent way. If you had to split up all the ranges just to avoid some text, you SUM formulas would be cumberSUM formulas.

This is how it probably happened:

The developer started by programming the plus sign. He programmed it to return an error when one of the operands was non-numeric. Later, he invented the SUM function. He programmed that to return an error when a value in the array (read: the value of a cell in the range) was non-numeric. Then it went to alpha testing and a bunch of people said, “Every time I use SUM I get a VALUE error. Can’t you make it easier to use SUM?” To which he said, “Get out of my office, you whiners!” Later, after he had a sufficient amount of caffeine, he reconsidered and added special handling to the SUM function.

Data Validation and Hyperlinks

I hate putting posts in the Bugs category. I don’t know why I even made this category. Well, here it goes anyway – let me know if you’ve seen this before.

I had some data validation set up in a cell. The cell is supposed to end up being a web address and, of course, excel automatically converts it to a hyperlink. That seems to circumvent the data validation. Here’s the steps:

Set up Data Validation in a cell

DVH1

Now enter some text in that cell that will be sure to be converted to a hyperlink and also that will fail the validation

DVH2

Hit enter and you should get this message

DVH3

Now click Retry, then press Esc and you’ll get invalid text in the cell

DVH4

Seems like a bug to me. Another way to beat Data Validation, what a surprise.

BTW: Excel 2000, WinXP Pro

Freezing Column or Row Headers

Otherwise know as Freeze Panes. This feature, under the Windows menu, allows you to lock certain columns, rows, or both in place so that they never scroll out of view.

FreezePanes1

Start by locating the split bars. I don’t their technical name, but it was either split bars or doohickies. I went with the former. The horizontal split bar is just above the arrow on the vertical scroll bar. When you hover over it, the cursor changes to a two-headed arrow.

FreezPanes2 FreezPanes3

The vertical split bar is just to right of the horizontal split bar and has the same function. These bars can be moved to split the window into two or four panes.

FreezePanes4

You can scroll inside any of these panes, which can be useful, but is more often just confusing. To lock those panes into place so that their contents can’t be scrolled, use the Freeze Panes option.

FreezePanes5

The thick split pane bars now become thin freeze pane bars. You can click inside those panes, you just can’t scroll around in there and get lost. To remove, go back to the Windows menu and choose Unfreeze panes.

Using ADO in Functions

Normally, Data > Get External Data is a fine option for using data from an external source in you worksheets. There are situations, though, where the structure of an external data table isn’t what you want. You may want to use pieces of the data in various places, rather than in a tidy table.

In a situation I recently encountered, I needed the user to specify a key field of a recordset and have certain information returned. Normally, I would create a hidden worksheet and put an external data table on it with the whole recordset. Then, I would use VLOOKUPs to get at the data based on the user’s entry. The problem was that the recordset was more than 65,000 rows. I tried to think of a way to limit the data in the recordset, but I couldn’t predict what the users would enter, and therefore needed access to the whole thing.

I ended up using a user-defined function with ADO automation. Here’s generally what it looked like. This example uses the Products table of Northwind.mdb.

Dim adoCn As ADODB.Connection
Dim adoRs As ADODB.Recordset
   
Function GetFields(sKey As String, lField As Long) As Variant
   
    Dim sCon As String, sSql As String
   
    ‘Create recordset if first call
   If adoCn Is Nothing Or adoRs Is Nothing Then
        sCon = “DSN=MS Access Database;” & _
            “DBQ=C:Program FilesMicrosoft Office 2000OfficeSamplesNorthwind.mdb;” & _
            “DefaultDir=C:Program FilesMicrosoft Office 2000OfficeSamples;” & _
            “DriverId=25;FIL=MS Access;MaxBufferSize=2048;PageTimeout=5;”
        sSql = “SELECT ProductID, ProductName, QuantityPerUnit, Products.UnitPrice “ & _
            “FROM Products”
       
        Set adoCn = New ADODB.Connection
        adoCn.Open sCon
       
        Set adoRs = New ADODB.Recordset
        adoRs.CursorType = adOpenDynamic
        adoRs.CursorLocation = adUseClient
        adoRs.Open sSql, adoCn
    End If
   
    adoRs.MoveFirst
    adoRs.Find “ProductID=” & sKey
   
    If adoRs.EOF Or adoRs.BOF Then
        GetFields = “Not found”
    Else
        GetFields = adoRs.Fields(lField).Value
    End If
   
End Function

And it was used, more or less, like this:

ADOUDF1

Parsing SQL

Last winter, Jan Karel and I worked a little on the QueryManager add-in. I was interested in that project because I wanted an easy way to add a parameter to an external data query. After the first round, the method I wrote to add a parameter was just too hokey. It merely added a criterion with a question mark, and didn’t give the user a whole lot of options. Once the question mark was added to the SQL, it was up to the user to define the parameter in the UI. We needed something that did the whole job.

On subsequent rounds, everything was getting complicateder and complicateder. I ended up just parsing the whole SQL statement to try to accomplish what I wanted. The problem was I don’t know SQL well enough to be parsing it for a general purpose utility. From what I read, even people who do know SQL found the task difficult. I like the utility that Jan Karel developed, but I was never (and am not) satisfied with my end of it.

I should just quit thinking about, but it haunts me. My latest foray into SQL Parsing was going to be brilliant. I was going to use DAO automation to create a query in Jet, then read the contents of the MSysQueries table (a hidden table in Access). If you use queries in Access and you want to see how Access stores the details, go to Tools > Options and check the Hidden Objects check box. Then open the MSysQueries table and you can see the various components of a query. I wanted to put this table into a recordset so I could manipulate the pieces and reassemble them, but I ran into access restrictions. I don’t know enough about Access security to even understand the problem, much less the solution.

I was perusing the References dialog in the VBE earlier. I was actually looking for a mozilla dll to see what I could do with it and came upon this:

ParseSQL1

An SQL parsing library? Sweet! I haven’t got it all figured out and from what I’ve done, I wouldn’t necessarily call this robust. It seems to be a step in the right direction, though.

Sub TestParse()
   
    Dim prsApp As MSSQLParser.vbSQLParser
    Dim sSql As String
    Dim sTemp As String
    Dim vaParsed As Variant
    Dim i As Long
   
    Const sSPLIT As String = “”
   
    Set prsApp = New MSSQLParser.vbSQLParser
   
    sSql = “SELECT * FROM TblCustomers WHERE TblCustomers.CustId = ‘1234’ “
    sSql = sSql & “AND TblCustomers.State = ‘NE'”
   
    sTemp = prsApp.ParseSQLSyntax(sSql, vbSqlServerSyntax)
   
    vaParsed = Split(sTemp, sSPLIT)
   
    For i = LBound(vaParsed) To UBound(vaParsed)
        Debug.Print i, vaParsed(i)
    Next i
   
End Sub

ParseSQL2

I don’t have this thing quite figured out, but here’s what I’ve gathered so far.

  • cf1 = SQL keywords like SELECT, FROM, WHERE
  • cf5 = punctuations, mathematical operators, and logical operators
  • cf6 = string literals
  • cf7 = numeric literals
  • cf10 = This appears to be a catch all. I thought it was table names and fields, but after misspelling SELECT, it put it as cf10. So now I think this is just anything it doesn’t recognize.

Kind of a strange parsing system, I think. Not really as useful as MSysQueries, but at least it can identify SQL keywords, which is more than I can say for myself. I’m not sure if this will turn into something useful for me, but it has renewed my interest in our QueryManager project. Now I need to figure out how to get Rob involved and take the best of his Query Editor program and ours.