Different approaches to retrieve data from underlying workbooks

Different approaches to retrieve data from underlying workbooks

One of the more frequent questions I see on different public forums is “How can I summarize data from several workbooks into a master worksheet?” In general the original posters have made a basic structure, which include fixed folder(s) and the same layout for all the underlying workbooks.

Open – Read – Close workbooks
This is the basic approach to just simple open each workbook and pick up the wanted information as the following example shows:

Const stDir As String = “c:DDESources”
Dim wsSource As Worksheet
Dim wsTarget As Worksheet
Dim stFile As String
Dim lnCounter As Long

Sub Open_Read_Close_Workbooks()

Set wsTarget = ActiveWorkbook.Worksheets(1)

stFile = Dir(stDir & “*.xls”)

Application.ScreenUpdating = False

Do While stFile <> “”
    Application.Workbooks.Open Filename:=stFile, ReadOnly:=True
    Set wsSource = ActiveWorkbook.Worksheets(1)
    With wsTarget
        lnCounter = .Cells(.Rows.Count, “A”).End(xlUp).Row + 1
        .Cells(lnCounter, 1).Value = wsSource.Range(“A2”).Value
    End With
    ActiveWorkbook.Close
    stFile = Dir
Loop

End Sub

Read workbooks using Application.ExecuteExcel4Macro
This approach can be both flexible and fast. The following example shows how it can be applied:

Sub Execute_Excel4_Macro()
Dim stSource As String

Set wsTarget = ActiveWorkbook.Worksheets(1)

stFile = Dir(stDir & “*.xls”)

Do While stFile <> “”
    stSource = “‘” & stDir & “[“ & stFile & “]Sheet1′!R2C1:R2C1”
    With wsTarget
        lnCounter = .Cells(.Rows.Count, “A”).End(xlUp).Row + 1
        .Cells(lnCounter, 1).Value = Application.ExecuteExcel4Macro(stSource)
    End With
    stFile = Dir
Loop

End Sub

Read workbooks using ADO & SQL
In order to use this approach we need to involve an external library, Microsoft ActiveX Data Object Library, where the version is 2.5 and later.

Sub ADO_SQL()
Dim rst As Object
Dim stCon As String
Dim stSQL As String

Set rst = CreateObject(“ADODB.Recordset”)

Set wsTarget = ActiveWorkbook.Worksheets(1)

stFile = Dir(stDir & “*.xls”)

Do While stFile <> “”
    stCon = “Provider=Microsoft.Jet.OLEDB.4.0;” & _
                    “Data Source=” & stDir & “” & stFile & “;” & _
                    “Extended Properties=’Excel 8.0;HDR=No'”
    stSQL = “SELECT * From [Sheet1$A2:A2]”
    rst.Open stSQL, stCon, 3, 3, 1
    With wsTarget
        lnCounter = .Cells(.Rows.Count, “A”).End(xlUp).Row
        .Cells(lnCounter, 1).CopyFromRecordset rst
    End With
    stSQL = Empty
    stCon = Empty
    rst.Close
    stFile = Dir
Loop

Set rst = Nothing

End Sub

VSTO 2005 solution
Just for fun I include a solution based on ExcelExecute4Macro:

Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As _ System.EventArgs) Handles Button1.Click
        ‘Option Strict On is used in this example and it forbid us to use  late binding.
       Const stDir As String = “c:DDESources”
        Dim stSource, stFile As String
        Dim iCounter As Integer

        stFile = Dir(stDir & “*.xls”)

        Do While stFile <> “”
            stSource = “‘” & stDir & “[“ & stFile & “]Sheet1′!R2C1:R2C1”
            With Me
                iCounter = .Rows.CurrentRegion.Count + 1
                .Cells(iCounter, 1) = Globals _                      
                                      .ThisWorkbook.Application. _
                                      ExecuteExcel4Macro(stSource)
            End With
            stFile = Dir()
        Loop

End Sub

To sum up
Personally I prefer the ExcecuteExcel4Macro approach as it is fast, create no overhead and don’t require opening workbooks. However there are situation where we need to manipulate properties of worksheets and therefore need to open workbooks.

The negative with ExcecuteExcel4Macro is that it can be little bit messy like the following snippet code shows:

With wsTarget
        .Cells(1 + i, 1).Offset(0, j).Value =  _                          
                                    Application.ExecuteExcel4Macro(“VLOOKUP(““” & _
                                   vaObjects(i, 1) & “”“,” & stFilename & “)”)
End With

When to use ADO/SQL instead of ExecuteExcel4Macro? One possible situation is when we have larger amount of data to be “pumped” to a master sheet.

Anyway, what do You use and what are the scenarios where You apply them?

Edit note:
Ron de Bruin has some good and a more flexible ADO-approach then the above at the following place:
Copy a range from closed workbooks(ADO)

Kind regards,
Dennis

Posted in Uncategorized

18 thoughts on “Different approaches to retrieve data from underlying workbooks

  1. Great stuff these past few days! Thanks to all the writers, this is a great blog. I just started learning VBA unfortunately but I think I’m getting there with the help of this site.

  2. Ron,

    Yes – Thanks for reminding me :)

    I recalled that “someone” had some ADO-samples but I didn’t associated it with You in the first place (propably due to Your great work with Excel & Outlook ;) )

    It also reflect another issue, all aspects of Excel is more or less already well known in the global Excel community.

    Kind regards,
    Dennis

  3. Ron,

    Well, no wonder why employees prefer that approach while external consultant would like to use it but the customer is not prepared to pay for the ehum coffee-time ;)

    Now I wait to see Your link to Your ExecuteExcel4Macro page as well :)

    Kind regards,
    Dennis

  4. I’ve used two methods. J-Walk’s tip #82 at

    http://j-walk.com/ss/excel/tips/tip82.htm

    another version of the Excel 4 macro. Very handy for reading in data and populating re-demensioned arrays. I first set a variable that gets the count of the data from a calculation inside the closed workbook, and then re-dim to that variable and populate arrays in a loop, reading from the closed workbook.

    When I want to bring in large amounts of data from a closed book, I use ADO as found at:

    http://erlandsendata.no/english/index.php?d=envbadacimportwbado

    Need to understand that if not getting from the first worksheet, you have to have a named range to retrieve from, per the comments at the top of the web-page.

    Both methods work extremely well.

    …mrt

  5. The ADO approach can be rationalized with the following approach:

    SELECT data_col
    FROM [Excel 8.0;DATABASE=C:db1.xls;].[Test$]
    UNION ALL
    SELECT data_col
    FROM [Excel 8.0;DATABASE=C:db2.xls;].[Test$]
    UNION ALL
    SELECT data_col
    FROM [Excel 8.0;DATABASE=C:db3.xls;].[Test$]
    UNION ALL
    SELECT data_col
    FROM [Excel 8.0;DATABASE=C:db4.xls;].[Test$]

    In other words, you can compile a single SQL query and get all the data in one hit. Instead of UNION ALL you could use UNION and eliminate some duplicates along the way. Now the question is, which limit will you hit first: the maximum UNIONs in a Jet query for my approach or the maximum row limit in Excel for yours ;)?

    Jamie.

    –

  6. Jamie,

    That’s quite impressive and it works excellent – thanks.

    In order to answer Your question I believe that it will be a third option (!) – my ability to keep tracks when creating it ;)

    Kind regards,
    Dennis

  7. Hi Dennis,

    Just for the record, if you need to reference a particular cell or range from another workbook you could also do it without using VBA.
    For example:

    Retrieve a cell value from another workbook:

    =’D:SomeDir[OtherWKB.xls]Sheet1′!A1

    Or to sum up a range in another workbook:

    =SUM(‘D:SomeDir[OtherWKB.xls]Sheet1’!$A$1:$A$9)

    These formula’s will work fine even if the source workbook is closed.
    To my knowledge not all functions can be used on a closed workbooks though. For example the OFFSET function will only work on an opened workbook. So using a dynamic range in your sum (that uses the OFFSET function) won’t work on a closed workbook.

    Rembo

  8. Remco,

    I know but since I dislike all kind of links I gracefully avoid them and it will never be part of any solutions I deliver.

    Kind regards,
    Dennis

  9. Another bit of information about the ExecuteExcel4Macro solution:

    In my experience, this solution is not able to import data from sourcerows above 16384. It will return errors in stead of the expected content.

    Another bit of information on the ADO solution:

    When a column you want to import has got mixed types (alternating between text and numbers) and it is starting with numbers, it does leave the cells in which you would expect the text EMPTY.

    At least… that’s my experience (maybe I make some mistake?)

    Kind regards,

    Marcel Kreijne

  10. I found that the limit of 16375 (not 16384) rows is not caused by the ExecuteExcel4Macro itself. It is a limit to the rownumber that can be linked to in a closed workbook. At least, I cannot build a link to a cell above row 16375 in a closed workbook, without getting an error…

  11. Using ado or similar seems to pull the data very well, but what about obtaining the whole sheet and its characteristics.
    I would be interested to see a vba solution to importing a worksheet from a closed workbook that would include the original worksheet name and all the worksheet attributes.

  12. Hi

    Very helpful indeed. I was wondering though, is it possible to get the actual Formula value from a closed workbook, or just the numeric/text value?

  13. Hey Brian,

    I’m pretty sure that you cannot. You’d have to open the workbook to do that. You could open a copy of the workbook, for example, if you had to be careful of other processes that might be writing to the workbook at the same time.

    But ADO retrieves values only, not formulas.

    Btw, if you are the same Brian as here: http://www.xtremevbtalk.com/showthread.php?p=1282370#post1282370, then feel free to continue within the thread if you have more questions…

    Mike

  14. Hi
    I’m looking for a piece of code whih is checking the file name first before opening and operating on it. File name format is filenameyyyymmdd.xls but the problem is files in that directory are saved irregularly (couple times a week) and I need to retrieve the latest one.
    Any ideas?


Posting code? Use <pre> tags for VBA and <code> tags for inline.

Leave a Reply

Your email address will not be published.