HTML Tables

Nat asks:

How do I refer to a table on the web? How to I refer to specific colums/row/cells in a web table?

Here’s an example that may help you out. It goes to Yahoo’s finance page (where I knew there would be a table) and finds the table with the mortgage rate. Then it finds the row for 15 year mortgages and writes the whole row to an Excel sheet. MSDN has help for these HTML objects here. Look for ‘document’, ‘table’, and ‘td’.

Sub GetTableRow()
   
    Dim ieApp As InternetExplorer
    Dim ieDoc As Object
    Dim ieTbl As Object
    Dim i As Long, j As Long
   
    Const sURL As String = “http://finance.yahoo.com/loan”
   
    ‘Create new IE instance
   Set ieApp = New InternetExplorer
   
    ‘Go to url
   ieApp.Visible = True
    ieApp.Navigate sURL
   
    ‘Wait for page to load
   Do
        DoEvents
    Loop Until ieApp.ReadyState = READYSTATE_COMPLETE
   
    Set ieDoc = ieApp.Document
   
    ‘Loop through all the elements in the document via the ‘all’ property
   For i = 0 To ieDoc.all.Length – 1
        ‘Only look at tables
       If TypeName(ieDoc.all(i)) = “HTMLTable” Then
            Set ieTbl = ieDoc.all(i)
            ‘I want to check the 3rd row (.Rows(2)) and will get an error if there
           ‘are less than three rows.
           If ieTbl.Rows.Length > 2 Then
                ‘Here’s the text in the first cell of the third row that tells me
               ‘I have the right table
               If ieTbl.Rows(2).Cells(0).innertext = “15-yr Fixed” Then
                    ‘Loop through the cells and write to the sheet
                   For j = 0 To ieTbl.Rows(2).Cells.Length – 1
                        Sheet1.Cells(1, j + 1).Value = ieTbl.Rows(2).Cells(j).innertext
                    Next j
                End If
            End If
        End If
    Next i

    ‘Alternative method – when you figure out that i = 132 in the above loop
   ‘you can refer to that element directly.  But beware, web sites change and
   ‘they don’t have the decency to even call you.
   
‘    Set ieTbl = ieDoc.all(132)
‘    For j = 0 To ieTbl.Rows(2).Cells.Length – 1
‘        Sheet1.Cells(1, j + 1).Value = ieTbl.Rows(2).Cells(j).innertext
‘    Next j

   ieApp.Quit
    Set ieApp = Nothing
   
End Sub

Posted in Uncategorized

18 thoughts on “HTML Tables

  1. Dick –

    For a similar need, I wrote this macro to validate a long list of suspect email addresses against the corporate directory internal website. The email address occurs in a row higher than 23, in Column D adjacent to search string “Email Address:” in Column C

    The Jump error handles times-out from the website. As the title implies, I tried this 3 times.

    FWIW.

    Sub DirectoryCall3()

    ‘ Macro written March 2006

    Const cPrefix As String = “URL;http://home.directory.myco.com/singlematch.epl?search_string=
    Const cSuffix As String = “&search_type=email”
    Const cTitle As String = “WebSite Directory Search”

    Dim r As Range
    Dim Cell As Range
    Dim Col As Range
    Dim FindCell As Range

    Dim L1 As Long
    Dim L2 As Long
    Dim L3 As Long

    Dim k As Integer
    Dim calcmode As Integer
    Dim StatusBar As Boolean

    calcmode = Application.Calculation
    StatusBar = Application.DisplayStatusBar

    Application.DisplayStatusBar = True
    Application.Calculation = xlCalculationManual

    Call AddWks ‘Adds TEMP

    L2 = Sheets(1).Range(“F65536?).End(xlUp).Row

    L1 = Application.InputBox(“Enter the Row to start, or Zero to Cancel.” & vbLf _
    & “The Range’s bottom row is Row ” & L2 & “.”, cTitle, 2, , , , , 1)

    If L1 = 0 Then Exit Sub

    L3 = L1

    On Error GoTo Jump
    Jump:
    Set r = Sheets(1).Range(“F” & L3, “F” & L2)

    Application.StatusBar = “Connecting to MyCo’s Directory Services Search WebSite.”

    For Each Cell In r
    Cell.Select
    Cell.Hyperlinks.Delete
    Cell.Value = VBA.Replace(Cell.Value, Chr(160), vbNullString, 1)

    With Cell.Borders
    .LineStyle = xlContinuous
    .Weight = xlThin
    .ColorIndex = xlAutomatic
    End With

    L3 = Cell.Row – 1
    Sheets(“TEMP”).UsedRange.Clear
    With Sheets(“TEMP”).QueryTables.Add(Connection:= _
    cPrefix & Cell.Value & cSuffix, Destination:=Sheets(“TEMP”).Range(“A1?))
    .Name = “test”
    .FieldNames = True
    .RowNumbers = False
    .FillAdjacentFormulas = False
    .PreserveFormatting = True
    .RefreshOnFileOpen = False
    .BackgroundQuery = True
    .RefreshStyle = xlInsertDeleteCells
    .SavePassword = False
    .SaveData = True
    .AdjustColumnWidth = True
    .RefreshPeriod = 0
    .WebSelectionType = xlAllTables
    .WebFormatting = xlWebFormattingNone
    .WebPreFormattedTextToColumns = True
    .WebConsecutiveDelimitersAsOne = True
    .WebSingleBlockTextImport = False
    .WebDisableDateRecognition = False
    .Refresh BackgroundQuery:=False
    End With

    k = 23
    Set Col = Worksheets(“TEMP”).Range(“C” & k, “C” & Worksheets(“TEMP”).UsedRange.Rows.Count)
    For Each FindCell In Col
    If Left(FindCell.Value, 14) = “Email Address:” Then Exit For
    k = k + 1
    Next FindCell

    If Len(Sheets(“TEMP”).Range(“D” & k).Value) > 0 And _
    Len(Sheets(“TEMP”).Range(“D” & k + 1).Value) > 0 Then
    Cell.Offset(0, 2).Value = Sheets(“TEMP”).Range(“D” & k).Value ‘Valid Email Address
    Else
    Cell.Offset(0, 2).Formula = “=NA()”
    End If

    Next Cell

    Set r = Nothing
    Set Cell = Nothing

    Application.StatusBar = vbNullString
    Application.DisplayStatusBar = StatusBar
    Application.Calculation = calcmode
    Call Filter

    End Sub

  2. Or, just import the whole table and delete what you don’t need.

    Sub GetYahooMortgageRates()
        With ActiveSheet.QueryTables.Add(Connection:= _
            “URL;http://finance.yahoo.com/loan, Destination:=Range(“$A$1”))
            .WebFormatting = xlWebFormattingNone
            .WebTables = “2”
            .Refresh BackgroundQuery:=False
        End With
    End Sub
  3. I would agree with John that it’s a lot easier to just import the whole page but then I would search that imported page specifically for what I was looking for.

    Regarding the “web sites change and they don’t have the decency to even call you”….
    Usually the information that you’re looking for will always be there but maybe not where you last found it.

    I have a neat little workbook that imports the last 7 days worth of “Quick Draw” (AKA Lotto Crack) numbers and lets me know if I would have won or lost had I played at a particular time.
    I must say that having a workbook like this for someone who always plays the same numbers is probably the worst thing that could ever happen to them (me).

    Getting back to the original topic…..there is usually some unique string before and after the information that your looking for on the page. Import it, define your range, copy it and then delete the page.

    I’ve used this method on my “Quick Draw” workbook and it’s been working flawlessly for years.

    Even if they redesign the page, the unique strings that you’re using to define your range will probably still be there.

  4. John Wilson

    I was wondering if you could possibly post your “Quick Draw” workbook? I download hockey stats and depending on the web site they can change addresses quite frequently.

    Thanks in advance

    Bart

  5. Bart,

    Here’s a link to it:
    http://nbpl.home.att.net/QuickDraw.xls
    It’s crude, but it works (actually, it won’t work when opened from the web?? but will if you download it).

    You said that they change the (web) “addresses” for your hockey stats?
    This method won’t work for changed addresses but sometimes there’s a pattern to the addresses that you might be able duplicate.

    Good luck with your project,
    John

  6. I have a free open-source add-in I wrote that allows the usage of user-defined functions to pull financial data off the web into EXCEL. It can be found in the files area at:

    http://finance.groups.yahoo.com/group/smf_addin/

    In addition to the functions that are preprogrammed to get specific financial data, there are also several general functions that allow you to pull data from a web page:

    RCHGetWebData() returns the HTML source of a web page (subject to EXCEL string length limits).
    RCHGetTableCell() returns the data within a particular table cell of a web page.
    RCHGetHTMLTable() returns an entire table (or a particular row of a table) from a web page.

    Documentation of the functions and templates showing examples of their usage can also be found in the files area of the Yahoo group.

    I originally wrote the add-in functions because of problems I was having with EXCEL Web Queries.

  7. …more on add-in…

    To retrieve the table in question, one would array-enter the following formula over a 16 row by 3 column range:

    =RCHGetHTMLTable(“http://finance.yahoo.com/loan”,”National Averages”,-1,””,1)

    …which basically say:

    – Get the HTML source code from website http://finance.yahoo.com/loan
    – Look for the text string “National Averages” within that source code
    – Look backwards from there for a start table tag (-1 = direction and # of tags) to find the start of the table to extract
    – Look for the text string “National Averages” within that source code (a null string indicates to reuse the string from the second parameter again)
    – Look forwards from there for an end table tag (+1 = direction and # of tags) to find the end of the table to extract

    …and then a 2-dimensional array is returned with the table elements, which is why the formula needs to be array-entered (otherwise a UDF could only return data to a single cell).

    Now, if all I wanted was the “Rate” and “APR” of the 30-year rate, I could have used these two formulas:

    =RCHGetTableCell(“http://finance.yahoo.com/loan”,1,”30-yr Fixed”)
    =RCHGetTableCell(“http://finance.yahoo.com/loan”,2,”30-yr Fixed”)

    The first formula basically says:

    – Get the HTML source code from website http://finance.yahoo.com/loan
    – Return the next table cell after finding the string “30-yr Fixed”

    The second one just gets the second table cell after that string. That formula can have up to four search strings to position a pointer onto the page, then multiple rows and table cells can be skipped, forward or backward, to get to the desired table cell to retrieve.

  8. Here is a function that I wrote to lookup Quotes on Yahoo:

    I just type “=GetQuote(“FFFFX”)” into a cell and I get the value.

    Dave Miller

    ====================================================================

    Public Const URL As String = “http://finance.yahoo.com/q?s=”
    Function GetQuote(sTicker As String) As String
    Dim ie As Object
    Dim sHTML As String

    On Error GoTo GetQuote_Err
    Application.ScreenUpdating = False

    Set ie = CreateObject(“InternetExplorer.Application”)

    If IsNull(sTicker) Then GoTo GetQuote_Err

    With ie
    .Visible = False
    .navigate (URL & sTicker)
    Do Until .readystate = 4
    DoEvents
    Loop
    sHTML = .document.body.innerHTML
    GetQuote = Mid(sHTML, InStr(sHTML, “yfs_l10_” & LCase(sTicker)) + Len(“yfs_l10_” & _
    sTicker & Chr(34) & “>”) – 1, 8)
    If InStr(GetQuote, ” 0 Then
    GetQuote = Left(GetQuote, InStr(GetQuote, “

  9. Try that agian!

    Public Const URL As String = “http://finance.yahoo.com/q?s=”
    Function GetQuote(sTicker As String) As String
    ‘=======================================================
    ‘Fuction written by: David Miller
    ‘Returns a !Yahoo Finanace quote
    ‘=======================================================
    Dim ie As Object
    Dim sHTML As String
    On Error GoTo GetQuote_Err
    Application.ScreenUpdating = False
    Set ie = CreateObject(“InternetExplorer.Application”)
    If IsNull(sTicker) Then GoTo GetQuote_Err
    With ie
    .Visible = False
    .navigate (URL & sTicker)
    Do Until .readystate = 4
    DoEvents
    Loop
    sHTML = .document.body.innerHTML
    GetQuote = Mid(sHTML, InStr(sHTML, “yfs_l10_” & LCase(sTicker)) + Len(“yfs_l10_” & sTicker & Chr(34) & “>”) – 1, 8)
    If InStr(GetQuote, ” 0 Then
    GetQuote = Left(GetQuote, InStr(GetQuote, “

  10. Hi there,
    i got a time problem macro, let me know if there’s another way to make it easier, it’s a find process to vinculate values from another workbook…

    Sub abre(direccion, iatadestino, aaa)

    Set exc = New Excel.Application
    exc.Workbooks.Open (direccion)
    Set ccc = exc.Worksheets(“Hoja1?)
    x = 4
    While aaa.Cells(x, 2) Empty
    ID = aaa.Cells(x, 4)
    u = 13
    While ccc.Cells(u, 1) Empty
    IO = ccc.Cells(u, 2)
    If IO = ID Then
    aaa.Cells(x, 23) = ccc.Cells(u, 10)
    aaa.Cells(x, 24) = ccc.Cells(u, 11)
    GoTo 1
    End If
    u = u + 1
    Wend
    1 x = x + 1
    Wend
    ‘exc.Application.DisplayAlerts = False
    exc.Quit

    thanks in advance!!
    Remy

  11. John,
    I wonder if you switch up your numbers a combination of these:(7/8 per game)
    7265755168783567119

    I bet you’d have more luck at winning the big ones. They are now averaging over 4.5 games per showing as compared to the overall average of 3.5 games. It stands to reason that if everything evens out eventually, that these numbers will come out more often and the other numbers will slow down.

    I’d love to know how it works out for ya, what state is this?

  12. Dave,

    I liked the succinct style of your code, but am have trouble with it! You may have inadvertently missed the ‘End With’ closure. But even after correcting that omission I still am having error at:
    sHTML = .document.body.innerHTML

    Are any additional VB ‘Tools:References’ required? Where should the ‘End With’ be inserted? do you get the code to run as published?

    I am running vista with Excel 2003, and 3 references ‘Visual Basic For Applications’, Microsoft Excel 11.0 Object Library’ and Microsoft Forms 2.0 Object Library’.

    Collette

  13. Dick,

    Tried your code but could not get passed ‘Dim ieApp As InternetExplorer’. Is there a ‘Tools:Reference’ required to execute this statement?

    Note that I can get ‘Set ieApp = CreateObject(“InternetExplorer.Application”)’ to work OK if I declare ‘Dim ieApp as Object’ but otherwise compiler balks at your code.

    Any help appreciated.

    Collette

  14. Hi Randy,
    Can you please upload the add-in again to a google group if possible or mail it to me.
    Thanks for you help

Leave a Reply

Your email address will not be published. Required fields are marked *