End Shift Down

I’m sick of writing this

Sheet1.Range("B2",Sheet1.Cells(Sheet1.Rows.Count,2).End(xlUp))

I want a new argument to the End property. I want to type

Sheet1.Range("B2").End(xlShiftDown)

I would prefer if this new argument mirrored the code above. That is, I would prefer if it returned a range from the current cell until the last cell in that column. That’s as opposed to returning a range from the current cell to the cell just above the first blank cell, as you should get if you used Ctrl+Shift+Down on the keyboard. But I’d be happy either way.

The other thing this new argument would have to do is know when B2 is the only cell or there is nothing in B2:B?. Oh, screw it. Here’s a table of what I want.

End(xlUp) End(xlDown) End(xlShiftDown)
B2:B10 contains contiguous values B2:B10 B2:B10 B2:B10
B2:B10 contains a blank at B5 B2:B10 B2:B4 B2:B10
B2:B10 has a blank at B2 B2:B10 B2:B3 B2:B10
B2 is the only cell in B B2:B2 B2:B1048576 B2:B2
Column B is empty B1:B2 B2:B1048576 B2:B2

I guess that wasn’t so complicated. It should act just like going up from the last cell except when the column is empty. In that case, it should just return the one cell.

Who’s with me?

The Range.Find method and a FindAll function

Two things that could be better about the Range.Find method have been 1) up-to-date and correct documentation, and 2) adding the UI’s ‘Find All’ capability to the Excel Object Model. As of Office 2013 neither has happened.

Consequently, every time I want to use the Find method, I continue to have to jump through hoops to figure out the correct values for the different arguments.

I also discovered that FindNext does not work as expected when one wants to search for cells that meet certain format criteria. Consequently, I updated my long available FindAll function so that it works correctly with format criteria.

For a version in a page by itself (i.e., not in a scrollable iframe as below) visit http://www.tushar-mehta.com/publish_train/xl_vba_cases/1001%20range.find%20and%20findall.shtml

Tushar Mehta

Range Variables and Deleted Cells

Here’s a stackoverflow question that I’ve never seen before. And I’ve seen a lot of questions.

Set a Range variable that holds its state, such as a module-level variable. Then delete the row. Then inspect that variable in the Locals Window.

Option Explicit

Dim r As Range

Sub starttest()

Set r = Sheet1.Range("A1")

End Sub

Sub stoptest()

Stop

End Sub

I ran starttest, deleted row 1, then ran stoptest. The Locals Window looks like this:

It still points to a memory address and is not Nothing:

I think I’ve never seen this before because I’ve never had a range variable other than as a local variable in a procedure. It seems strange to me that I’ve never encountered that. I guess it could be useful to check if range was deleted. Anyway, I thought it was interesting.

A Case for Value as a Default Property

Charles Williams had an interesting post1 the other day about the Text, Value, and Value2 properties of the Range object. In it, he wrote:

So .Value2 really should be the default, and is definitely the one to use 99% of the time.

I disagree. I never use Value2.

The Value2 property returns a Double data type (64-bit floating point). The Value property returns the same thing except when the cell is formatted as Date or Currency. When it’s currency, Value returns a Currency data type (64-bit scaled integer), and when it’s a date, Value returns a Date date type (another 64-bit floating point). Who cares (besides Charles and me) and what does it mean?

Let’s imagine that we’re writing our own spreadsheet program to get a better understanding of what Excel is doing. I don’t actually know what Excel does under the hood, so this is all conjecture. In our spreadsheet application that we’re writing from scratch, we’ll store all numbers in memory as IEEE double precision floating point number, or Doubles. There is a case when the user may want more calculation accuracy than floating point allows and is willing to sacrifice a little precision (I’m looking at you fellow accountants). To offer that feature, we’ll allow the user to identify certain numbers as scaled integers with 15 digits to the left of the decimal and four digits to the right. Let’s call that data type Currency. We’re still going to store the value as a Double, but we’ll do the conversion prior to any calculations to ensure the accuracy.

As spreadsheet writers, we have a decision to make. I said we’d allow the user to identify certain numbers as Currency, but how exactly do they identify it. We could add a property called DataType to the Range class. OK, but how does the user set that property? Most Range properties are set in the Format Cells dialog box. We have a few tabs on that dialog already, so let’s add another one called Data Type. The user can set the data type and we’ll do the floating point to scaled integer conversion when the DataType property is set to Currency. That’s when the usability people come in.

The usability people begin by hurling epithets at us regarding our lack of sex lives and penchant for role playing games. They say that normal people (i.e. non-programmers) don’t have any idea what a “data type” is and if we make them learn, our sales will go down by 14.1% (a totally made up number because marketing people can’t do math). Surely, they go on, there is a better way to identify Currency values. Then they get out the corporate programming guidelines and remind us that no dialog box can have more than six tabs. We already have six tabs on the Format Cells dialog, so we can’t add another (Tools – Options hasn’t been invented yet).

Time to compromise. One of our bright, young interns suggests that we make Currency a format. As programmers, we are incensed. We already have an Accounting format and a Currency format will be confusing. And besides, Currency is a data type, not a format. Due to our lack of persuasiveness or the caffeine withdrawals, we agree to add it is a format. Because of the Currency debacle we go from two levels to three levels for data display. The bottom level is the raw Double that’s stored in memory, just like before. The top level is the text representation of the number with all the commas and periods and such, just like before. We need to add a pre-processor level. At this level, we’ll check to see if the Range object has it’s NumberFormat property set to Currency and, if so, we’ll convert from floating point to scaled integer. Once converted, we’ll send to the formatting layer to add the text goodies.

The deed is done and we all feel a little dirty. We code the changes and send them over to the VBA folks. The guy building the object model already has a Value property and a Text property. Now that we’ve added another layer, he doesn’t know which value to return for the Value property. There’s really not much debate – it has to be the value that comes from the pre-processor. If people “format” a value as Currency, they’re going to expect a scaled integer from the Value property even if they don’t know what the hell a scaled integer is. Object model guy then asks if he should expose the raw value. Of course. We’re all about exposing ourselves. We call in the dullest intern on staff to come up with the name and, true to form, he produces Value2. Positively inspired.

And scene.

You may disagree with some of the decision made in the above dramatic reenactment, but you have to realize they’re all connected. I personally disagree with calling Currency a format. But whether we call it format or a data type, doesn’t change the fact that we have to convert it. You can disagree with the decision to store all numbers as Doubles, and just store Currency as a scaled integer in memory. That doesn’t really change how the user identifies Currency values and you can’t simply store everything as currency because then everything would be slow.

Back to my assertion that Value is the appropriate default (or, put another way, that Value points to the right layer). I contend that if you “format” something as Currency or Date, that you do so knowingly and for a specific purpose. One of the knocks on Value is that you lose precision for Currency because Currency only goes four digits out to the right. I say you don’t lose precision. The precision that’s there is a myth. It’s only there because we had to convert the number to a float. When we chose Currency, we consciously forfeited all rights to precision beyond the 10,000th place (or should have). The fact that we can convert that number to a Double and see what looks like precision, doesn’t make it there.

For the Date type, I have a different argument. It’s not that there’s any problem with the data manipulation when converting from Double to Date, it’s just that it takes longer. If I’m reading in a date and spitting it back out, I agree that Value2 is probably the better choice. However, if that’s all I’m really doing I might want to address what value I’m adding. In most cases, I’m manipulating the date in such a way that I care that it’s a date. What I don’t want to do is read in a double using Value2 (avoiding the conversion overhead), then have to convert that number to a date myself to manipulate it, then convert it back to a double. In almost all cases, I want a Date or Currency typed number when I “format” it that way.

I suspect that most people who prefer Value2, in fact, disagree with the decision to confuse formatting and data types in the user interface. And if so, then we agree on that. Let me see if I can reword part of Charles’ statement so that we both agree with it: When performance matters, Value2 should be used 100% of the time there are no dates or currency, and Value should be used 100% of the time there are. I wonder if that will fly.

Footnotes
1In fact, all of Charles’ posts are interesting, so if you’re not subscribed to his RSS feed2 yet, get on it.
2If you’re still not reading blogs via RSS, be sure to say hello to my mom at the next Rotary meeting.

Testing for Empty Cells

To test for empty cells, use the IsEmpty function. IsEmpty takes one argument, a variable, and returns True if that variable contains nothing (Technically, I think it returns True if the variable is uninitialized). While it’s true that you can pass variables to IsEmpty, you can also pass object’s properties, specifically the Value property of the Range object.

Many times you will see programmers test for a zero length string, like this

If Range(“A10”).Value = “” Then

In 99% of the cases, that will work. However, if the cell contains a single quote and nothing else, then it will contain a zero length string, but will not really be empty. To test for true emptiness, use IsEmpty on the Value property.

Sub TestForEmpty()

Dim sPrompt As String
Dim rRng As Range

Set rRng = Sheet1.Range("A10")
sPrompt = "Range contains "

rRng.ClearContents

MsgBox sPrompt & "nothing = " & IsEmpty(rRng.Value) & vbNewLine & _
sPrompt & "a zero length string = " & CBool(rRng.Value = "")

rRng.Value = "'" 'single quote

MsgBox sPrompt & "nothing = " & IsEmpty(rRng.Value) & vbNewLine & _
sPrompt & "a zero length string = " & CBool(rRng.Value = "")

End Sub

Update: A newsgroup post by Otto Moehrbach prompted me to look a little deeper into how IsEmpty works. I wanted to determine why a cell with a formula that returns an empty string behaves differently than an empty cell. I set up a watch for A1 and B1 to see what was going on. A1 contains the formula =”” and B1 contains nothing at all.

watch window showing status of A1 and B1

A1 is a Variant/String and B1 is a Variant/Empty which obviously accounts for the difference. There’s nothing too shocking here, it’s just an interesting glimpse into the inner workings of VBA. Although the “interesting” part is debatable.