I have a post or two in my head that deals with fixed width text files, something I’ve been dealing with quite a bit. Like me, you probably don’t run into fixed width text files in your personal life. It’s usually some crappy computer program at your job that forces you to deal with them. The problem is that I can’t simply throw around private company information, so I had to replicate a real fixed width file with obfuscated information. Before I post about all the lovely things I’ve been doing with these files, I thought I’d post about how I made the sample.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 |
Sub MakeSampleFile() Dim sFile As String, sOutFile As String Dim lFile As Long Dim sInput As String Dim vaLines As Variant Dim i As Long Dim aOutput() As String Dim aLine() As String Dim dPdDebit As Double, dPdCredit As Double Dim dTotDebit As Double, dTotCredit As Double Dim dThisDebit As Double, dThisCredit As Double sFile = "\\99991-dc01\99991\dkusleika\My Documents\AJE_Cost.txt" sOutFile = "C:\Users\dkusleika\Dropbox\Excel\FixedWidthExample2.txt" lFile = FreeFile 'Open the input file, read it all in, split it by line Open sFile For Input As lFile sInput = Input$(LOF(lFile), lFile) Close lFile vaLines = Split(sInput, vbNewLine) ReDim aOutput(LBound(vaLines) To UBound(vaLines)) 'Loop through the lines of the import file For i = LBound(vaLines) To UBound(vaLines) ReDim aLine(1 To 10) 'Lines with dates need special processing If IsDate(Mid(vaLines(i), 13, 10)) Then aLine(1) = Left$(vaLines(i), 24) aLine(2) = GetGLAccount 'make up a GL account aLine(3) = GetDescription(Mid$(vaLines(i), 38, 64 - 38 + 1)) 'Desc with random letters aLine(4) = "S1" & Space(3) aLine(5) = GetYesNo 'Get a Yes or No randomly aLine(6) = GetDescription(Mid$(vaLines(i), 74, 83 - 74 + 1)) aLine(7) = GetYesNo 'Get a random debit at the same scale as the current debit, 'and keep track of it for totals dThisDebit = GetRandomNumber(Mid$(vaLines(i), 88, 106 - 88 + 1)) If dThisDebit = 0 Then aLine(8) = Pad(Space(1), 17, vbNullString) Else aLine(8) = Pad(Format(dThisDebit, "#,##0.00"), 17, vbNullString) End If dPdDebit = dPdDebit + dThisDebit 'Same for credit dThisCredit = GetRandomNumber(Mid$(vaLines(i), 107, 128 - 107 + 1)) If dThisCredit = 0 Then aLine(9) = Pad(Space(1), 20, vbNullString) & Space(4) Else aLine(9) = Pad(Format(dThisCredit, "#,##0.00"), 20, vbNullString) & Space(4) End If dPdCredit = dPdCredit + dThisCredit aLine(10) = GetYesNo 'Period balances need special processing ElseIf vaLines(i) Like "*BALANCE PERIOD*" Then aLine(1) = Left$(vaLines(i), 88) aLine(2) = Pad(Format(dPdDebit, "#,##0.00"), 16, vbNullString) aLine(3) = Pad(Format(dPdCredit, "#,##0.00"), 20, vbNullString) dTotDebit = dTotDebit + dPdDebit dTotCredit = dTotCredit + dPdCredit dPdDebit = 0: dPdCredit = 0 'Ending balances need special processing ElseIf vaLines(i) Like "*BALANCE*" Then aLine(1) = Left$(vaLines(i), 88) aLine(2) = Pad(Format(dTotDebit, "#,##0.00"), 16, vbNullString) aLine(3) = Pad(Format(dTotCredit, "#,##0.00"), 20, vbNullString) 'Headers, blanks, and other stuff comes straight over as is Else aLine(1) = vaLines(i) End If aOutput(i) = Join(aLine, vbNullString) Next i lFile = FreeFile Open sOutFile For Output As lFile Print #lFile, Join(aOutput, vbNewLine) Close lFile End Sub |
That procedure is a bit longer than I like, but it’s not something I wanted to spend a lot of time on. It reads in the legitimate file, then goes line-by-line through it. When it encounters a line that needs changing, it creates fake data to put in place of the real data. I want the totals to match so I have something to compare it to after I parse the text file (in a later post), so I have to keep track of the fake numbers I make up and total them appropriately.
The GL Account is a 10 digit number that start with 1-5 and can have any other digits after that.
1 2 3 4 5 6 7 8 9 10 11 12 13 |
Function GetGLAccount() As String Dim i As Long Dim aOut(1 To 10) As String aOut(1) = Int((5 - 1 + 1) * Rnd + 1) For i = 2 To 10 aOut(i) = Int((9 - 0 + 1) * Rnd + 0) Next i GetGLAccount = Join(aOut, vbNullString) & Space(3) End Function |
The description obfuscator respects spaces, but replaces anything else with an upper case letter.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 |
Function GetDescription(sDesc As String) As String Dim aOut() As String Dim lLen As Long Dim i As Long ReDim aOut(1 To Len(sDesc)) For i = 1 To Len(sDesc) If Mid$(sDesc, i, 1) = Space(1) Then aOut(i) = Space(1) Else aOut(i) = Chr$(Int((90 - 65 + 1) * Rnd + 65)) End If Next i GetDescription = Join(aOut, vbNullString) End Function |
The Yes/No generator gives me a Yes about 80% of the time. You have to stay positive, you know.
1 2 3 4 5 6 7 8 9 |
Function GetYesNo() As String If Rnd < 0.8 Then GetYesNo = "Yes " Else GetYesNo = "No " End If End Function |
For the debits and credits, I wanted to stay somewhat realistic, so I kept the same scale as whatever number is there. To determine the scale, I remove all spaces, commas, and periods from the screen, then take the length x 2. A number like $9,453.65 will have a scale of 4 and will produce a number between 1,000 and 9,999. Rnd, as you know, generates a number between 0 and 1. I multiply Rnd by 10^Scale to get the right number of digits, then round it to two decimal places.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 |
Function GetRandomNumber(sNum As String) As Double Dim lScale As Long Dim dReturn As Double lScale = Len(Trim$(Replace$(Replace$(sNum, ".", vbNullString), ",", vbNullString))) - 2 If lScale > 0 Then dReturn = Round(Rnd * 10 ^ (lScale), 2) End If GetRandomNumber = dReturn End Function |
Finally, I have a Pad
function to fill out spaces around numbers. First, Pad
truncates the string to the right width in case it’s already too long. If it’s not too long, spaces are inserted in front of it to fill out the right width. Then there’s the sAfter
argument that you’re probably wondering about. I don’t have trailing negatives in this report, but I have had them in some others. I need to be able to stick a negative or a space after the string and used the sAfter
argument to do it. I believe they’re all null strings in this example.
Making realistic sample data sucks.
You can download FixedWidthExample2.txt
YES! Some of the files I have to deal with are fixed width for upload to the IRS. They are so much fun to work with and Excel does not play nice with them.
Here is an example: http://www.irs.gov/pub/irs-pdf/p1220.pdf
Around page 80 the real fun starts.
They add a bonus challenge by having the first row or two of the reports be completely different sizes and information then the rest of the report.
Did you considering generating all of the data with formulas, and exporting as a PRN file?
No, but that’s a good thought. How would you do the description part with a formula?
I was thinking the same thing as John (create it and PRN it.)
I’d use a UDF that does the obfuscation for the description. That’d be a hairy set of formulas, though if you set up a range of randomly-created letter combos in various lengths you could probably create a lookup approach and assemble them formulaically. Too late on Friday to solve that one…
Random description….
=MultConc(CHAR(LARGE(ROW(65:90),ROW(INDIRECT(“1:”&RANDBETWEEN(1,26))))))
MultConc being a UDF that Excel should have always had
Hello,
MultConc, is it a new function ?
or is it a copy of L.L. MCONCAT ?