Handicap History

Back in 2005, I created a spreadsheet to calculate my USGA handicap and I’ve been using it ever since (175 scores posted). It calculates my current handicap but doesn’t give me any historical information. Until now.

GHIN calculates handicaps on a schedule, not in real time. That means many of the handicaps on my spreadsheet aren’t official. They do that so that one score doesn’t swing the calculation too much. By calculating approximately monthly, they give you a chance to post a few scores between calculations. Here’s what my spreadsheet looks like today:

The formulas up to column J are the same as they were in 2005. I’ve added formulas in K through O.

K181: =SUMPRODUCT(((H162:H181+(ROW(H162:H181)/100000))<=SMALL(H162:H181+(ROW(H162:H181)/100000),10))*(H162:H181))

The part that says +(ROW(H162:H181)/100000) is used to add a small number so that I can break ties but won't affect the score. Otherwise it's summing up the 10 smallest of the last 20. I started 20 rows down so I wouldn't have to mess with less than 20 scores. That's too much work for not enough payoff.

L181: =SUMPRODUCT(--((H162:H181+(ROW(H162:H181)/100000))<=SMALL(H162:H181+(ROW(H162:H181)/100000),10)))

I don't know why I wrote this one - it's always 10. Except that sometimes it was 11 before I added the +ROW stuff to break ties. It's principally the same formula as the previous one except for the last part. And it includes the double-negation to coerce the Trues and Falses into numbers.

M181: =TRUNC(K181/L181*0.96,1)

Divide one by t'other and take 96%.

N181: =IF((M181+(ROW(M181)/100000))=MIN(IF(((YEAR($D$27:$D$181)=YEAR(D181))*($M$27:$M$181))=0,"",($M$27:$M$181+(ROW($M$27:$M$181)/100000)))),M181,NA())
O181: =IF((M181+(ROW(M181)/100000))=MAX((YEAR($D$27:$D$181)=YEAR(D181))*(($M$27:$M$181+(ROW($M$27:$M$181)/100000)))),M181,NA())

Once I got that part done, the data was begging to be graphed. Once on a chart, I could see that I needed some Min's and Max's to clutter it up. These last two formulas compute the minimum and maximum indexes by year and return that value on the row where it exists. If it doesn't exist on a particular row, it returns NA() so it doesn't show up on the chart.

To make the chart, I selected all the data from D:O and created an XY chart. Then I deleted all the series that I didn't want. Sorry for the 2003isness of the chart. Some of the steps I took to create this chart are:

  • CumIndex: Remove markers, make Olive Green line, add Polynomial2 trendline
  • Trendline: Make 25% grey
  • Major Gridlines: Make 25% grey
  • Min: Remove lines, make makers green, make data labels = Y values, put data labels on bottom
  • Max: Remove lines, make makers red, make data labels = Y values, put data labels on top
  • Y axis: Make minimum of 4 and maximum of 16 to get rid of some white space
  • X axis: Adjust min and max to tighten it up
  • Plot area: Remove fill

Oh, how I pine for the days of a 7.5 handicap index. The season in Nebraska ends November 15, so I have a little more than two months to get that down. I'd like to get it down to, at most, an 8.0 by then.

You can download Handicap2.zip

Model business dependencies in an Excel worksheet

The Civilization game dependency tree
How to build a dependency tree when there is no obvious connection between the business model and an Excel model.

A long time ago I spend a lot of time playing the strategy game Civilization. For those not familiar with the game, one of its features was “advances.” Each advance brought with it certain additional capabilities and benefits. Of course, there was a requirement before one could acquire an advance, particularly a set of pre-requisites. For example, pre-requisites for the “Navigation” advance were the “Seafaring” and the “Astronomy” advances. In turn, the Seafaring advance required one to already know “Pottery” and “Map Making”.

I created an Excel worksheet that let me use Excel’s Precedent arrows to understand the optimal path to specific advances as in Figure 1. The advances are shown in red and the benefit(s) of each advance are in black, blue and green. The Excel blue arrows show the pre-requisites for the Seafaring advance.

Figure 1

While the dated worksheet may be of limited value even to Civilization enthusiasts, the technique for creating the dependency tree is unique enough to be of value to Excel consumers. What makes it of value is that the game dependencies (in the context of work this would be the dependencies in the business model) had nothing to do with what Excel considers as dependencies! Consequently, to use Excel’s Trace Precedents feature I had to somehow map the model dependency into Excel formula dependencies.

For a version in a page by itself (i.e., not in a scrollable iframe as below) visit http://www.tushar-mehta.com/publish_train/xl_vba_cases/0908%20CIV%20game%20dependency%20tree.shtml

Tushar Mehta

Working With Circular references in Excel

Have you ever experienced the dreaded “Circular reference warning” popping up when you opened an Excel file or entered a formula?

Excel detects a circular reference when a chain of calculations visits the same cell more than once. Many users get (very) confused by this message and have no idea what it is about. I’ll tried to demystify that message in a new article:

Working with Circular references in Excel

The article contains these chapters:

Types of circular references

Simplistically speaking there are only a few types of circular references to discern:

Deliberate circular references

Accidental circular references

Calculation settings

If you want to work with circular references, the calculation settings of Excel are very important. This page gives you some pointers!

Properly setting up circular references

Whereas I am no fan of using circular references, they can be beneficial to your model and really solve the problem you are trying to solve. So here is some advice on how to properly work with them.

Reasons why circular references may not be detected

This page shows a couple of reasons why circles are not detected.

Enjoy!

Regards,

Jan Karel Pieterse

www.jkp-ads.com

Does the location of a formula affect its accuracy?

Someone in Microsoft’s Excel forum had a question for which I don’t have an adequate response.

In A2:A7 enter the values:
6494.55
1311.36
198.08
124.9
131.81
131.81

In A1 sum the values with =SUM(A2:A7)

In A9 compute the same result by double counting all the values and subtracting the total, i.e., =SUM(A1:A7)-A1

In A10 check if A1 and A9 are the same. =A9=A1. The result is FALSE, i.e., they are not the same.

8392.51
6494.55
1311.36
198.08
124.9
131.81
131.81

8392.51
FALSE

Turns out that A9 has a 1 in the 11th decimal place. 8392.510000000010

Fair enough. There’s the 15 digits of accuracy thing going on and we have a rounding error.

Now, build the same model except this time put the A1 formula in A8 (and adjust A9 and A10 accordingly). The result is A10 shows TRUE!

6494.55
1311.36
198.08
124.9
131.81
131.81
8392.51
8392.51
TRUE

As far as rounding errors go, there isn’t one. This time A9 contains 8392.510000000000

Like I wrote at the beginning. I don’t know why the two models yield different results.

Working with the MAXIF() and LARGEIF() functions

Oh, Wait! Excel doesn’t have those functions. We’ll have to roll our own. We’re going to use the Double Unary tip found here. In Excel, 1*TRUE = 1, 0+TRUE = 1, TRUE^1 = 1, and minus-minus TRUE (–TRUE, the double unary) = 1. It’s two sign changes, and thought to be the best way to turn an array of TRUE/FALSE into ones/zeros. Multiplying an array of parameters by an array of ones and zeros leaves only the TRUE parameters non-zero.

Back to our list of the 122 major teams in professional sports, the Boss has decided that sorting by name length, then alphabetically, isn’t going to do. He wants it by name length by stadium size. So, thanks to Wikipedia, your data looks like this, Column(F) is arena size:

  D E F
1 Anaheim Ducks 13 17,174
2 Arizona Cardinals 17 63,400
3 Arizona Diamondbacks 20 48,633
4 Atlanta Braves 14 50,097
5 Atlanta Falcons 15 71,228
6 Atlanta Hawks 13 18,729
7 Baltimore Orioles 17 45,363
8 Baltimore Ravens 16 71,008
9 Boston Bruins 13 17,565
10 Boston Celtics 14 18,624

Using fzz’s comment about the ROW() function, we’ll sort the length by G1=LARGE($E$1:$E$122,ROWS($G$1:$G1)) filled down. The table looks much like before:

  D E F G
1 Anaheim Ducks 13 17,174 29
2 Arizona Cardinals 17 63,400 22
3 Arizona Diamondbacks 20 48,633 22
4 Atlanta Braves 14 50,097 22
5 Atlanta Falcons 15 71,228 21
6 Atlanta Hawks 13 18,729 21
7 Baltimore Orioles 17 45,363 21
8 Baltimore Ravens 16 71,008 20
9 Boston Bruins 13 17,565 20
10 Boston Celtics 14 18,624 20

We know there are only three teams of name length 22. A formula – -($E$1:$E$122=22), using the double unary operation, gives us an array of 119 zeros, and just 3 ones that line up on 22. If we multiply that array by the array of arena capacities (F1:F122) we have 119 multiplications by zero, and three multiplications by 1. A MAX() function or an equivalent LARGE(,1) function on that array product returns the capacity of largest stadium hosting a team name 22 letters long. Those equations formulas, array entered, would look like this:

  • {=LARGE(- -($E$1:$E$122=22)*$F$1:$F$122,1)}
  • {=MAX(- -($E$1:$E$122=22)*$F$1:$F$122)}

We want to modify those for different teams. Changing the =22 to =$G1 is a start, and we’ve gone about as far as we can with our ‘MAXIF().’ It’s going to return the same thing three times. A scheme using COUNTIF() starting from the top and working down will improve the ‘LARGEIF()’. Using mixed references, $G$1:$G1 will grow as we fill down. COUNTIF($G$1:$G2,$G2) will only count one 22, COUNTIF($G$1:$G3,$G3) will count two 22s, and COUNTIF($G$1:$G4,$G4) will count 3, and that’s all there are. This array-entered equationformula then, filled down, is ‘LARGEIF()’:

  • {=LARGE(- -($E$1:$E$122=$G1)*$F$1:$F$122,COUNTIF($G$1:$G1,$G1))}

In I1 filldown =MATCH(H1,$F$1:$F$122,0) and in J1 filldown =INDEX(D:D,I1). Your table should look like this:

  D E F G H I J K
1 Anaheim Ducks 13 17,174 29 45,389 52 Los Angeles Angels of Anaheim  
2 Arizona Cardinals 17 63,400 22 19,980 95 Portland Trail Blazers  
3 Arizona Diamondbacks 20 48,633 22 19,356 63 Minnesota Timberwolves  
4 Atlanta Braves 14 50,097 22 18,144 30 Columbus Blue Jackets  
5 Atlanta Falcons 15 71,228 21 43,651 89 Philadelphia Phillies  
6 Atlanta Hawks 13 18,729 21 19,596 42 Golden State Warriors  
7 Baltimore Orioles 17 45,363 21 18,203 83 Oklahoma City Thunder  
8 Baltimore Ravens 16 71,008 20 68,756 69 New England Patriots  
9 Boston Bruins 13 17,565 20 67,164 49 Jacksonville Jaguars  
10 Boston Celtics 14 18,624 20 65,857 108 Tampa Bay Buccaneers  

Column(I) shows the indices reordered by size of the arena.

There are some problems with this approach. It only works with positive numbers, such as our capacities. Negative numbers will be less than a FALSE, and a zero will be returned in their place. And if ever capacities are equal, it will always return only the first (same old problem). Nicely here, hockey and basketball held in the same arena draw to different capacity. The file is available at http://wl.filegenie.com/~JMOprof/LargeIF.xls

…mrt
©¿©¬

Finding the 2nd, 3rd, 4th … MATCH()

Excel’s MATCH() function will return the relative location of the first occurrence that satisfies the match criterion (the lookup_value) within the specified array (the lookup_array).  Most often, this is what you want.  For instance, if you have duplicate entries, returning the first match is no different than returning the last match.  However, if you have unique entries with the duplicated criterion, how do you return all the unique values?

For example, you have a list of the 122 major professional sports teams (32 NFL, and 30 MLB, NBA, and NHL) sorted alphabetically in Column D.  In Column E you have code that returns the length of the team name, i.e. =LEN(D1).  Your data might look like this:

D E
1 Anaheim Ducks 13
2 Arizona Cardinals 17
3 Arizona Diamondbacks 20
4 Atlanta Braves 14
5 Atlanta Falcons 15
6 Atlanta Hawks 13
7 Baltimore Orioles 17
8 Baltimore Ravens 16
9 Boston Bruins 13
10 Boston Celtics 14

For no good reason we want to create a descending sort on name length. You can do it from the Sort selection, but we also want to keep the alpha-sort. In F1 you put =LARGE($E$1:$E$122,ROW()) and fill down. The ROW() function will cause the return of the first largest, then the second largest etc. Then you want to know how far down Column E those descending numbers fall, so that you can Index the name list that amount. In G1 put the code =MATCH(F1,$E$1:$E$122,0) and fill down. You’ll see from the data that the longest name is 29 letters, and it’s in ROW(52).

D E F G
1 Anaheim Ducks 13 29 52
2 Arizona Cardinals 17 22 30
3 Arizona Diamondbacks 20 22 30
4 Atlanta Braves 14 22 30
5 Atlanta Falcons 15 21 42
6 Atlanta Hawks 13 21 42
7 Baltimore Orioles 17 21 42
8 Baltimore Ravens 16 20 3
9 Boston Bruins 13 20 3
10 Boston Celtics 14 20 3

You can begin to see the problem: There are three teams with a 22-letter name, three with a 21-letter name, and at least 3 (in fact 13) with a 20-letter name. If in H1 we put =INDEX(D:D,G1) and filldown, we get:

D E F G H
1 Anaheim Ducks 13 29 52 Los Angeles Angels of Anaheim
2 Arizona Cardinals 17 22 30 Columbus Blue Jackets
3 Arizona Diamondbacks 20 22 30 Columbus Blue Jackets
4 Atlanta Braves 14 22 30 Columbus Blue Jackets
5 Atlanta Falcons 15 21 42 Golden State Warriors
6 Atlanta Hawks 13 21 42 Golden State Warriors
7 Baltimore Orioles 17 21 42 Golden State Warriors
8 Baltimore Ravens 16 20 3 Arizona Diamondbacks
9 Boston Bruins 13 20 3 Arizona Diamondbacks
10 Boston Celtics 14 20 3 Arizona Diamondbacks

This is exactly what we wanted to avoid. So, using Col(H) now as a helper column, in H1 put =G1. In H2 put

  • =IF(G2<>G1,G2,H1+MATCH(F2,INDIRECT(“E”&(H1+1)&”:E122″),0))

and fill down. If the value did not repeat, just copy it over; and if it didn’t, we’re making a “sliding range” that starts one row after the start of a repeated pair, ends at the end of the data, and “slides” down Column E. The sliding range is made by the INDIRECT() function. It builds the range one row down (H1+1) from the first of a repeating match. We add H1 to that result, and we have our index down Column D. This may be better seen in H3, adjacent to G3, where the first repeat starts.

  • =IF(G3<>G2,G3,H2+MATCH(F3,INDIRECT(“E”&(H2+1)&”:E122″),0))

The IF() is false. We are looking for the relative location of the lookup_value (22) in a new lookup_range (E31:E122). It’s 33 rows in. We add the original 30 to that, and the index is now 63. A similar thing happens in H4. H5 is G5 copied over. In Column I, I1 =INDEX(D:D,H1), filled down. The resulting table:

D E F G H I
1 Anaheim Ducks 13 29 52 52 Los Angeles Angels of Anaheim
2 Arizona Cardinals 17 22 30 30 Columbus Blue Jackets
3 Arizona Diamondbacks 20 22 30 63 Minnesota Timberwolves
4 Atlanta Braves 14 22 30 95 Portland Trail Blazers
5 Atlanta Falcons 15 21 42 42 Golden State Warriors
6 Atlanta Hawks 13 21 42 83 Oklahoma City Thunder
7 Baltimore Orioles 17 21 42 89 Philadelphia Phillies
8 Baltimore Ravens 16 20 3 3 Arizona Diamondbacks
9 Boston Bruins 13 20 3 49 Jacksonville Jaguars
10 Boston Celtics 14 20 3 53 Los Angeles Clippers

The names are now sorted by length, by alphabet. INDIRECT() is volatile, so there is a penalty for this. For extra credit, and no peeking, what’s the shortest name?

…mrt
©¿©¬