I have several years of vendor invoices, in text file format, in some directories on a share. I need to search through these text files to find an order number, manifest number, or some other piece of information. I can’t search everything because it would take too long. And I don’t have control over the server, so if there is some indexing that could be done, I can’t do it. I’m stuck with good old VBA.
The folders are yyyymmdd (ex: 20150725 for July 25th) and corresponds to the invoice dates for any invoices in the file. Each file starts with a three letter abbreviation of the vendors name. Invoice date and vendor name are the only two pieces of information I can use to limit the search. The final piece of information is, of course, the search term. Here’s what the form looks like
I have a table of vendors and codes to populate the Vendor combobox. The QuickDate combobox populates the Date Range textboxes and contains common date ranges, namely, Last Month, This Month, Last Quarter, This Quarter, Last Year, This Year. I can change the dates to whatever I want if there isn’t a Quick Date that suits me. The Search Terms textbox takes a space separated list of terms to search for.
And now the fun part. The code. This converts the Quick Dates into real dates
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 |
Private Sub cbxQuick_Change() Dim dtStart As Date, dtEnd As Date Select Case Me.cbxQuick.Value Case "Last Month" dtStart = DateSerial(Year(Now), Month(Now) - 1, 1) dtEnd = DateSerial(Year(Now), Month(Now), 0) Case "This Month" dtStart = DateSerial(Year(Now), Month(Now), 1) dtEnd = DateSerial(Year(Now), Month(Now) + 1, 0) Case "Last Quarter" dtStart = DateSerial(Year(Now), Month(Now) - (((Month(Now) - 1) Mod 3) + 3), 1) dtEnd = DateSerial(Year(dtStart), Month(dtStart) + 3, 0) Case "This Quarter" dtStart = DateSerial(Year(Now), Month(Now) - (((Month(Now) - 1) Mod 3)), 1) dtEnd = DateSerial(Year(dtStart), Month(dtStart) + 3, 0) Case "Last Year" dtStart = DateSerial(Year(Now) - 1, 1, 1) dtEnd = DateSerial(Year(Now), 1, 0) Case "This Year" dtStart = DateSerial(Year(Now), 1, 1) dtEnd = DateSerial(Year(Now) + 1, 1, 0) End Select Me.tbxStartDate.Text = Format(dtStart, "mm/dd/yyyy") Me.tbxEndDate.Text = Format(dtEnd, "mm/dd/yyyy") End Sub |
This makes sure a real date is entered, but provides for 6 or 8 digit date entry.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 |
Private Sub tbxEndDate_Exit(ByVal Cancel As MSForms.ReturnBoolean) If IsDate(Me.tbxEndDate.Value) Then tbxEndDate.Text = FormatDateTime(tbxEndDate.Value, vbShortDate) ElseIf Len(tbxEndDate.Text) = 6 Then tbxEndDate.Text = DateSerial(Right(tbxEndDate.Text, 2), Left(tbxEndDate.Text, 2), Mid(tbxEndDate.Text, 3, 2)) ElseIf Len(tbxEndDate.Text) = 8 Then tbxEndDate.Text = DateSerial(Right(tbxEndDate.Text, 4), Left(tbxEndDate.Text, 2), Mid(tbxEndDate.Text, 3, 2)) Else MsgBox "You must enter a valid date." Cancel = True End If End Sub |
And the big one, the actual search. This is pretty long and needs to be refactored, but it works for now.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 |
Private Sub cmdSearch_Click() Dim vaTerms As Variant Dim i As Long, j As Long Dim aFolders() As String Dim sFolder As String, sFile As String, lFile As Long Dim lCnt As Long Dim dtFolder As Date Dim sText As String Const sPATH As String = "\\yourserver\rawdata\" Me.lbxResults.Clear ReDim aFolders(1 To 1000) sFolder = Dir(sPATH & "*", vbDirectory) 'get a list of folders in the date range Do While Len(sFolder) > 0 If Len(sFolder) = 8 Then dtFolder = DateSerial(Left$(sFolder, 4), Mid$(sFolder, 5, 2), Right$(sFolder, 2)) If dtFolder >= CDate(Me.tbxStartDate.Text) And dtFolder <= CDate(Me.tbxEndDate.Text) Then lCnt = lCnt + 1 aFolders(lCnt) = sFolder sFolder = Dir End If End If sFolder = Dir Loop ReDim Preserve aFolders(1 To lCnt) lCnt = 0 vaTerms = Split(Me.tbxSearch.Text, Space(1)) 'Make a dummy result Me.lbxResults.AddItem vbNullString For i = LBound(aFolders) To UBound(aFolders) sFolder = sPATH & aFolders(i) & "\" sFile = Dir(sFolder & Me.cbxVendor.Value & "*.IN?") Do While Len(sFile) > 0 'Show the current folder as a result Me.lbxResults.Column(0, 0) = sFolder & sFile Me.Repaint 'Open the file and read in all the text lFile = FreeFile Open sPATH & aFolders(i) & "\" & sFile For Binary As lFile sText = Space$(LOF(lFile)) Get #1, , sText Close lFile 'Loop through the space separated search terms and see if 'they're in the file For j = LBound(vaTerms) To UBound(vaTerms) If InStr(1, sText, vaTerms(j), vbTextCompare) > 0 Then 'This is the animation part Me.lbxResults.AddItem vbNullString, 0 Me.lbxResults.TopIndex = 0 lCnt = lCnt + 1 DoEvents Exit For End If Next j sFile = Dir Loop Next i 'Get rid of the dummy Me.lbxResults.RemoveItem 0 End Sub |
It takes about 60 seconds per month to search the files. That’s a long time so it’s necessary to entertain the user while he waits. The top entry in the results listbox is whatever the current file is. It rapidly changes the display as it loops through the folder. When there’s a hit, that file becomes the second entry and any prior hits move down. This little animation lets the user know that it’s still working and gives him a list of what hits have been found already.
You can download SearchTextFiles.zip
Dick,
Very impressive. But had you thought of using Power Query?
I hadn’t considered that. I installed PQ in Feb of 2014 but uninstalled it shortly thereafter. Every day a little window would popup asking me to make PQ better, but what it really did was remind me that I was using computer resources to load something that I never use. Probably not what MS had in mind with that annoying reminder. Anyway, I’d be interested to see how that would work and what the results look like. If anyone wants to bite that off and doesn’t have somewhere to put it, I can put it up here.
I thought about parsing everything periodically and putting it into SQL Server. We don’t add that many new ones and once it’s parsed, it’s parsed. But I’d have to write a parser for 20 different formats and I don’t want to.
A simplewr & probably faster search routine.
the initialize event can be written as:
Whenever I have to search folders for a text snippet, I’m amazed at how fast & good Agent Ransack (www.mythicsoft.com) can parse through directories of information and find the needle I need. I can’t customize it in VBA, but it just works.
hi
i want search and find into several word doucument
please help me
its very important for me
tanks