Avoiding Date Conversion When Pasting an HTML Table

From Get Data From a Website…, MD comments:

However, the info in the table I copy is formatted ## / ## whereby # represents a number.
So when it tries to paste 10 / 10 in the worksheet for example, it auto-changes to 10/okt in my sheet.

This is a common problem that’s hard to solve. There’s no setting I can find in Excel that tells it to stop converting things that look like dates into dates. Since I’m reading in the data and putting it in the clipboard, I can just message the data before I do it. That’s not so easy when you just want to copy and paste, but if you’re using code, you may find the technique useful.

In the above linked post, I automated Internet Explorer to login to a website. I don’t automate Internet Explorer any more, preferring XML instead. But it doesn’t matter which you use. It all ends up in an HTMLDocument, so it’s the same from there. Also, this example doesn’t log into a webpage. It uses Contextures’ Sample Data.

Sub GetTableNoDateConversion()

Dim hDoc As MSHTML.HTMLDocument
Dim hTable As MSHTML.HTMLTable
Dim hCell As MSHTML.HTMLTableCell
Dim doClip As MSForms.DataObject

'Get the webpage
Set xHttp = New MSXML2.XMLHTTP
xHttp.Open "GET", "http://www.contextures.com/xlSampleData01.html"

'Wait for it to load
Do: DoEvents: Loop Until xHttp.readyState = 4

'Put it in a document
Set hDoc = New MSHTML.HTMLDocument
hDoc.body.innerHTML = xHttp.responseText

'Find the third table
Set hTable = hDoc.getElementsByTagName("table").Item(2)

'Fix anything that looks like a date
For Each hCell In hTable.Cells
If IsDate(hCell.innerText) Then
hCell.innerText = "'" & hCell.innerText
End If
Next hCell

'put it in the clipboard
Set doClip = New MSForms.DataObject
doClip.SetText "" & hTable.outerHTML & ""

'paste it to the sheet
Sheet1.PasteSpecial "Unicode Text"

'Make the leading apostrophes go away
Sheet1.Range("A1").CurrentRegion.Value = Sheet1.Range("A1").CurrentRegion.Value

End Sub

Once I get the table into an HTMLTable object, I loop through all the HTMLTableCells to see if any of them looks like a date. If so, I put an apostrophe in front of it. The apostrophe is the Excel way to say “No matter what I type next, assume it’s text”. Except when you’re pasting special as Unicode Text. In that case, it doesn’t hide the apostrophe like it should. So the last line of the code is the equivalent of pressing F2 and Enter all the say down the column and forcing Excel to reevaluate its decision to ignore that apostrophe.

8 thoughts on “Avoiding Date Conversion When Pasting an HTML Table

  1. Did you try ?

    Sub M_snb()
    With CreateObject("MSXML2.XMLHTTP")
    .Open "post", "http://www.contextures.com/xlSampleData01.html"
    x3 = .responsetext
    End With

    With New MSHTML.HTMLDocument
    .body.innerHTML = x3
    With .getElementsByTagName("table").Item(2)
    ReDim sn(.Rows.Length - 1, .Cells.Length \ .Rows.Length - 1)

    For Each cl In .Cells
    sn(j \ (UBound(sn, 2) + 1), j Mod (UBound(sn, 2) + 1)) = cl.innerText
    j = j + 1
    End With
    End With

    Cells(1).Resize(UBound(sn) + 1, UBound(sn, 2) + 1) = sn
    End Sub

  2. I presumed he wanted the formatting, so I used the clipboard. Also, your code converts dates, but you could use the same technique of putting an apostrophe in front of things that look like dates.

  3. Your code doesn’t work outside the US. I’m not interested in a string ‘1/9/2012 where I expect 09-01-2012.

  4. Thanks for posting this method, i have just needed to implement something similar to allow for different user date settings.

    A bit of digging revealed there is one setting for this purpose under GetExternal Data From Web and then under Options there is a checkbox for “disable date recognition”.

    Another possibility is to preformat the cells with number format set to text before pasting. Any numeric cells can then be changed by clicking the green triangle in the corner and choosing “Convert To Number”.

    One thing i’m not too keen on is prefixing a value with an apostrophe which seems to set a hidden cell property not exposed to vba. This can be seen since the apostrophe reappears when you set the cell numberformat to text and then press F2 enter. The only way i have found to clear this property is to clear all cell formats.

  5. Lori: I think that option only works for web queries, not copying and pasting, but it’s a good find.

    When I set the cells to Text format, the paste overwrites it and makes it a date format.

    You can check the apostrophe with ActiveCell.PrefixCharacter, but I agree it’s not my favorite hack.

  6. Dick, thanks for coming back. The point i wanted to make was a web query is another way to do the same thing and the code can just be recorded. Try it on the same sample data from the contextures website then right-click the imported table, select data range properties and uncheck save query definition.

    And yeah it’s true when you paste text as html the formats will get overwritten, the text number format option only works for plain text. Also agree it is the prefixcharacter that is set which is readonly but there is no way to change it back it seems.

Posting code? Use <pre> tags for VBA and <code> tags for inline.

Leave a Reply

Your email address will not be published.