Populating Class Properties

Depending on the requirements, I’ll choose a method for populating my custom collection classes using from data from a worksheet.

In this example, I’m using a list of the best selling albums of all time.

My Album class has properties as follows:

The code in my main routine is:

Filling the collection is just a matter of reading each row and popping the contained values into the right property.
The difficulty is knowing which columns relate to what properties. It’s a mapping problem – mapping columns to properties.

I could make an assumption about the positions of the columns and assume each is a known index.

I don’t really like this because I’ve been taught and have tried to follow the approach that offsets/indices like these should be defined as constants.
I’ll modify my FillFromSheet code.

Seems roughly better, but this too has problems. It can be a pain when you want to change the column order or insert a new column. You’d have to go through the code and update all of the numbers. While this is ok for 5, as I have here, changing out 50 columns is a chore.

So, I use an Enum block to determine column positions. At the top of the class module:

And another modification to my FillFromSheet code:

That works well. If I rearrange my columns on the worksheet, the only code change needed is a swap of items in the Enum block – a painless and quick update!

But that’s only if the changes to column order are in my control. What if an end user changes the column order? Do I really want them poking around in my Enum code too? I’d usually stop now and decide that if the end user starts screwing with column positions, it’s fair enough that the code should break. However, if I wanted to go the extra mile, I’d have to find the column index by searching for the text in the column headers.

Excel 2007 (and 2003 to an extent) has a feature called Tables, otherwise known as ListObjects.

My code in the main routine changes to:

… and the code in my collection class is:

In the preceding code I created a collection of key-value pairs. The key is the column label, the value is column index. I use that as a lookup when populating my properties.

That’s just a few approaches to reading a table of values into a collection. I’m interested in feedback. How do you map column positions to properties?

7 thoughts on “Populating Class Properties

  1. Wow, what a great post, Rob. Very informative and nicely presented.

    I’ve never done anything like that, so I have have nothing to offer. But I’m pretty sure you nailed it.

  2. To really go the extra mile, I would create names for each of the column headers and let the code get the column index for the named ranges. This approach buys triple immunity against: 1-column reordering; 2-column insertion; 3-changes in header text.
    No need to modify the code at all!

  3. Well said! Tables make data management so much easier, and your post nicely highlights one of the areas where they shine. I used to have lots of code that added dynamic named ranges to worksheets. Now I just convert ranges to tables (unless the data was imported, in which case Excel creates a table automatically) and work with the built-in table names.

  4. This post identifies some important issues when dealing with data in the real world. There’s noise. IT people may want the data to be clean and pretty, but that is rarely the case.

    One can take the steps described in this post a bit further. I was working with a VP and his group at the HQ of a multinational client. It received data from around the globe. In spite of the best organizational efforts, there was no way to ensure how the data would be labeled when it arrived at HQ. The only certainty was that the data would be in an Excel worksheet and that, at a minimum, all the required columns would be present, and that each column would have a label.

    Each column label was supposed to be standard but as processes changed and people changed, the names could evolve, especially in the area of the Brit s and ou, the American z and u, and similar spelling variants in other countries.

    There was also uncertainty as to where the table would start — one or more leading blank rows and maybe even some non-blank rows containing comments from the sender!

    So, the solution I came up with was a step (or two or three {grin}) beyond the last item in the post. The post-processing column labels were under the control of the group I was working with. That was the only thing we could be certain of — well, that and the fact that all the necessary columns would be somewhere in the incoming data.

    So, I developed the following algorithm:

    Start with a new table, call it Table 1, which was a 1-N relationship. This table was 2 columns in a Excel worksheet (and, no, not a table or list as Excel understands them but just data in 2 column ranges). Column 1 was the known column labels. Column 2 contained the names provided at one time or another by the various global subsidiaries.

    So, while working with a new incoming file:

    See if some row satisfied the following 2 criteria:

    For the 1st entry in Table 1 column 1, using each corresponding entry in column 2, search for a cell containing that value.

    If found, cross-check that every entry in column 1 table 1 had a corresponding entry from column 2 in the same row.

    If some row satisfied both of the above criteria, proceed with further analysis.

    If no row satisfied the above, the HQ person was processing the wrong file (it had been known to happen {grin}) or the file had yet another name for at least one column label. If the latter, update table 1 and rerun the process.

  5. Tushar,

    I have used vertical market software that has to deal with various spellings of place names (and other names) and their aliases that can show up in EDI messages. The maker of this software does something like what you describe to map the inbound words to something that is acceptable to the company using the software. They call it a synonym table. It works well but does require some user maintenance at times. Note none of this is pertinent to Excel, just that the concept you describe works for the situations you describe.

  6. I usually did this to look for a field index or append the newfield

    I used recently an approach that this is similar, i wanted to deal with fields (whose columns will be possibly switched)

    the rest of this macro didn’t work very well :

  7. I’ve built several Excel applications that require manipulating Excel tables. I started a blog to describe my technique: BriargateExcel.com. Please see the blog for a full listing of the code and a more complete explanation of the approach.
    All the code for one table is in one module. I restrict myself to putting only the table manipulation code in that module. This maximizes the cohesion of each module.
    Briefly, the module starts by defining constants containing the words in the header row of the table. Then, I use those constants to search for the column of the table containing the header words. This allows the user to switch the columns around as long as the header words remain the same. (I like Tushar Mehta’s approach that allows alternate spellings of the column headers. I’m thinking of adding it to my code to increase its robustness.)
    Once the module knows the table columns containing the header text, it initializes range variables for each column of the table. The module contains public procedures that search for a primary key in one column and return the corresponding data in another column.
    I encapsulate all this code in a separate module for each table. By insisting that all table access go through the table manipulation module, I protect the integrity of the table. In other words, coupling is minimized.

Leave a Reply

Your email address will not be published. Required fields are marked *