Web queries are one of the most useful and coolest things for me in Excel. I use them *a lot*, because they allow me to combine the front end of Excel with back end information stored in different web servers.
However, sometimes they can be a little limiting. There’s the option of automating Internet Explorer (which I try to use as the ultimate resource, because it’s the slowest option from what I’ve seen), but there are some different alternatives as well.
The XMLHTTP object (which you can use in early binding by adding a reference to ‘Microsoft XML, vX.X’, where X.X is one of the available versions (I have versions 2.6, 3.0 and 5.0 installed on this computer). I’ve used 3.0 (for no particular reason), and I’m not sure what the difference between the versions is, but that one seems to work for what I’ve needed so far.
Ok, so, let’s get the links from a web page using a POST method. One page that I know uses this is MrExcel’s message board (I tried using this blog’s search page, but I got an error…). The sub that I used is this:
‘We’ll search the MrExcel board here
Dim Url As String
Dim HtmlBody As String
Dim Col As Collection
‘The url of the page
Url = “http://www.mrexcel.com/board2/search.php”
Url = Url & “?search_keywords=querytable+refresh”
‘Get the html document
HtmlBody = UseXMLHTTP(Url)
‘You can parse the HTML here, for example, retrieve all the links
Set Col = New Collection
Do While InStr(1, HtmlBody, “<a href”) > 0
HtmlBody = Mid$(HtmlBody, InStr(1, HtmlBody, “</a><a href”))
Col.Add Left$(HtmlBody, InStr(1, HtmlBody, “</a>”) + 3)
HtmlBody = Mid$(HtmlBody, InStr(1, HtmlBody, “</a>”) + 3)
Loop
MsgBox “Found “ & Col.Count & ” links !”
End Sub
Of course, the function that really matters here is the UseXMLHTTP, which follows:
Dim oHttp As Object
Dim Parameters As String
Dim i As Long
On Error Resume Next
‘create the object, error out if not found in the machine
Set oHttp = CreateObject(“Microsoft.XMLHTTP”)
If Not oHttp Is Nothing Then
‘Make sure we have some parameters in the URL
i = InStr(1, Url, “?”)
If i > 0 Then
‘Extract the parameters from the URL
Parameters = Mid$(Url, i + 1)
‘Now remove them from the URL, so we can send it to the XMLHTTP object
‘I guess we could receive two variables, but I’m lazy tonight <g>
Url = Left$(Url, i – 1)
‘Use GET or POST as the first parameter of the .Open method (depends on the webpage)
oHttp.Open “POST”, Url, False
‘Using the .setRequestHeader you can add more headers as needed
oHttp.setRequestHeader “Content-Type”, “application/x-www-form-urlencoded”
‘Send the parameters (similar to the .Refresh method of the querytable)
‘it’s here where the page is actually loaded
oHttp.Send CStr(Parameters)
‘Get the response back
‘You can check the .status or the .statusText of the oHttp object, to see if the
‘request was succesful or not
If oHttp.statusText = “OK” Then
UseXMLHTTP = oHttp.responseText
End If
End If
‘Destroy the object
Set oHttp = Nothing
End If
End Function
Reading the HTML body is one of the things that you can do (most useful to me when the webpage returns a text file, because you don’t have to mess around with the HTML), and is very simple to do so. But there are two other things that you can do. One, is read an XML document directly from the request, using the .responseXML
method, like I showed here. I’ve been using this feature (since I finally decided to jump in the XML train…) and I really enjoy the simplicity that this provides. You get everything in one single object.
The other thing is that you can (at least in theory, because I haven’t tried it live…) is that you can retrieve a file using the .responseStream
method, that could be sent from an ASP’s IO.Stream (keep in mind that I’m starting to talk about stuff that I don’t really know here, ok ?)
Juan
Very cool, Your explanination was spot on and simple enough for an xlm challanged person such as my self
Thanks for taking the time to post this.
JP,
You can retrieve a file from the Web with the XMLHTTP object. Here’s some code that implements a “file copy” via HTTP:
http://groups.google.com/group/microsoft.public.excel.programming/browse_thread/thread/d86562f72b3775a4/8b93dbf506ee96fd
Very cool Jake, thanks!
This “should” mean too that I could download the file asynchronously as well ? (if I set the third parameter of the .Open to True ?
Right now I’m using the URLDownloadToFile, which I understand supports asynch downloads as well, but the implementation seems to be impossible.
I guess I’ll try that ! thanks again!
Yes, it does work… I’ll try to clean the code a little and post it.
Hi,
Just trying to follow this, looks very useful but I can’t yet get it to function…When I paste the code into a module I’m getting “an invalid procedure call” error at this line:
HtmlBody = Mid$(HtmlBody, InStr(1, HtmlBody, “
Graham
Replace this line
HtmlBody = Mid$(HtmlBody, InStr(1, HtmlBody, “</a><a href”))
with
HtmlBody = Mid$(HtmlBody, InStr(1, HtmlBody, “<a href”))
Another method to download a file from the internet with XMLHTTP – but without ADO – below
Sub GetXLs()
Dim WebUrlStr As String, LocalFile As String
Dim oXMLHTTP As Object, bArray() As Byte, hfile As Integer
Dim newWB As Workbook
‘credit to Matt Vidas from Expert-Exchange & VBAeXpress
Set oXMLHTTP = CreateObject(“Microsoft.XMLHTTP”)
‘download the file from the web to the hardrive
WebUrlStr = “http://members.iinet.net.au/~brettdj/EE/Ltt.xls”
LocalFile = “c:Ltt.xls”
oXMLHTTP.Open “GET”, WebUrlStr, False
oXMLHTTP.send
bArray = oXMLHTTP.ResponseBody
hfile = 1
Open LocalFile For Binary As #hfile
Put #hfile, , bArray
Close #hfile
Set newWB = Workbooks.Open(LocalFile)
End Sub
Cheers
Dave
Does anyone know how to simply download an html file from the internet? I just want to be able to download html from any website to use in excel… The methods posted above don’t seem to work for me when I change the URL to something like Yahoo.com
This one makes sence “One’s first step in wisdom is to kuesstion everything – and one’s last is to come to terms with everything.”