Summing the digits of a number is a chore I’ve been doing alot lately. Originally I’d parse the number out over the columns. And since SUM() ignores text, I’d turn the characters into digits by applying an arithmatic identity operation, like this:
- =- -MID($A10,COLUMN(),1)
That’s double minus signs before the MID() function. The reasons for picking that identity operation are here at XLDYNAMIC’s website, about half-way down.
And filling right. But if the numbers were of uneven length, filling down would throw a #VALUE! error for all but the longest number. Contrary to what the Help advises, I find that SUM() does not ignore error values. So I was ending up with this formula so I could fill down:
- =IF(ISERR(MID($A10,COLUMN(),1)),0,MID($A10,COLUMN(),1))
That’s double ugly, and a cell-eater to boot. I did a Google search and found Microsoft Knowledge Base article 214053 on this topic. Here’s what it says:
Formula 1: Sum the Digits of a Positive Number
To return the sum of the digits of a positive number contained in cell A10, follow these steps:
- Start Excel 2000.
- Type 123456 in cell A10.
- Type the following formula in cell B10:
=SUM(VALUE(MID(A10,ROW(A1:OFFSET(A1,LEN(A10)-1,0)),1)))- Press CTRL+SHIFT+ENTER to enter the formula as an array formula.
- The formula returns the value 21.
Ignoring Step 1, I looked at Step 4 and thought, from hanging around here, that we can do better. But to do better, let’s first look at the formula from the inside out. OFFSET() returns a reference one row less then A10 is long (more on OFFSET() later). ROW() then returns an array of row numbers starting from 1 (the row of A1It’s the 1 that’s important, not the A) to the bottom of the offset. The array has as many elements as the length of the number in A10. MID() then creates an array of each digit as text. VALUE() turns the text into numbers, and then SUM(), array entered, sums the array of values.
While my formula was double-ugly, this one is just ugly. To impove it, from the outside in:
- Replace SUM() with SUMPRODUCT(). The formula no longer has to be array-entered, and it works just as well.
- Replace VALUE() with the double minus
- Instead of using LEN(A10)-1 as a row offset, use LEN(A10) as a height parameter.
- Make the reference to A1 absolute with respect to row, allowing fill-down.
The new formula is:
- =SUMPRODUCT(- -MID(A10,ROW(OFFSET(A$1,,,LEN(A10))),1))
Much prettier, and even not counting curly-braces, two characters shorter. The Knowledge Base goes on to give this as the formula for summing the digits of a negative number:
- =SUM(VALUE(MID(A11,ROW(A2:OFFSET(A2,LEN(A11)-2,0)),1))) also array-entered.
This is the better version, simply entered:
- =SUMPRODUCT(MID(A11,ROW(OFFSET(A$2,,,LEN(A11)-1)),1))
You have to start the array at 2 (via A$2) to skip the negative sign, and then also shorten the length by one for the same reason. This one is the same length as Microsoft’s. If you want one formula for all numbers, this one has no counterpart in the Knowledge base:
- =SUMPRODUCT(- -MID(ABS(A11),ROW(OFFSET($A$1,,,LEN(ABS(A11)))),1))
It uses the absolute value ABS() function for the obvious reason. It only works for true numbers. It will not handle long text strings as numbers, such as you may have for credit cards or international phone numbers. For those, either use the earlier one, or use SUBSTITUTE(A11,”-“,””) in place of ABS(A11). Now we’re the ones getting getting ugly.
…mrt
Here’s another one (array entered):
=SUM(IFERROR(MID(A1,ROW(INDIRECT(“1:”&LEN(A1))),1)*1,0))
It ignores everything except the numeric characters in the cell. It uses IFERROR, so it works only with Excel 2007 or later.
I’ve always thought of this as a solution in search of a problem. What do you use it for. Michael?
THis array formula works pre-2007 as well for mixed text and number strings
=SUM(IF(ISNUMBER(MID(A1,ROW($1:$256),1)),MID(A1,ROW($1:$256),1),0))
I’d be tempted to use Longre’s MOREFUNC add-in, as in
=EVAL(REGEX.SUBSTITUTE(REGEX.SUBSTITUTE(x,”D”,””),”(.)”,”+[1]”))
To sum only the decimal numerals in an arbitrary string, use
=SUMPRODUCT(LOOKUP(MID(E1,s,1),d&””,N(d)))
where s is defined as =ROW(INDEX($1:$6656,1,1):INDEX($1:$6656,255,1)) and d is defined as ={“”,1,2,3,4,5,6,7,8,9,”A”}.
or you could use a VBA function. i don’t understand the obsession with pure formula.
I agree with Rob, a UDF will work better and probably faster:
Public Function SumDigits(Number As Variant) As Variant
Dim aByte() As Byte
Dim j As Long
aByte = CStr(Number)
For j = LBound(aByte) To UBound(aByte) – 1 Step 2
SumDigits = SumDigits + Val(Chr(aByte(j)))
Next j
End Function
Well, I just timed it on XP SP3, Excel 2003 SP3, and sure your UDF is faster than my formula, and probably faster than Walk’s, but ours caters for non-numeric characters. If the source is all digits as I read Micheal’s post then the last formula that he gives is faster than the UDF by some 7%.
Declaring the SumDigits function as a Double and using Chr$ instead of Chr makes it about 10% faster.
Hi John –
Well, the thought was to create a Checksum number for data being created as an input file to a server. What it does in reality is say that one of these two numbers is wrong. And since that hasn’t happened yet, it is still a solution seeking a problem ;-) We’ve got belts and suspenders. How often does an output file get written wrong or an input file read wrong these days?
Rob – amplifying Dick K’s MVP remarks about new users, I provide corporate XL support, and I find there are users who steer clear of UDF’s. That’s a plateau in their learning still in their future. They have to change their security setting, learn about modules, etc. But give them a formula, and they’re all set.
…mrt
“But give them a formula, and they’re all set.”
Until they need to do something slightly different, and your services are again required.
To me, this is nothing more than “Look Ma, No Macros”.
I think it’s clever, but unsuitable for business.
[…] Adding Every Other Cell (also at the Microsoft Office Blog: Adding Every Other Cell in a Column ) and Summing the Digits of a Number […]
Rob sez: “i don’t understand the obsession with pure formula.”
If you use a formula, you can be assured that it will actually work. As you know, Microsoft wants people to fear VBA macros and does everything possible to make the user disable macros when a file is opened. And, if you’re doing thousands of such calculations, it’s likely that there will be a noticeable performance hit if you use VBA functions rather than formulas.
John – you say “you can be assured that it will actually work”. You mean “work” as in give the correct answer?
Which version of Excel are you using, and where can I get a copy? :)
One of the reasons I prefer a UDF to an array with convoluted logic is because it is easier to see how they work, and thus easier to see what’s going wrong, when something goes wrong.
FWIW, I’ve posted my effort at this problem and the Sum-skip problem at: http://newtonexcelbach.wordpress.com/2009/10/05/three-udfs/
Doug – and you say “You mean “work” as in give the correct answer?”
No, he means it doesn’t work because the user has a low security setting or doesn’t understand what enabling macros means so doesn’t, or doesn’t work because the IT department don’t allow macros/scripts.
Bob – OK, that makes sense. Fair point then.
The UDF also handles non-numeric characters.
The version below is about 1/3 faster and is faster than the numerics-only SUMPRODUCT formula, but both are probably fast enough for most real-life purposes.
I use the Byte array technique for verifying Check Characters, since many check character algorithms are based on non-numeric characters, and its fast enough to handle large ranges.
Putting UDFs into an XLA or Automation adding packaged with an Installer generally gets around the end-user’s lack of understanding about Macro warnings (BTW when was the last time anyone saw a Macro virus?), but does not solve the problem with paranoid IT departments.
Dim aByte() As Byte
Dim j As Long
On Error GoTo Fail
aByte = CStr(Number.Value2)
For j = LBound(aByte) To UBound(aByte) – 1 Step 2
If aByte(j) 47 And aByte(j) 58 Then
SumDigits2 = SumDigits2 + aByte(j) – 48
End If
Next j
Exit Function
Fail:
SumDigits2 = -1
End Function
Hmm.. tag problem with less than and greater than
Good morning, Charles –
Yep, there is an elusive VB-tag bug. It treats less-than/greater-than pairs in that order as HTML markers and ignores/throws away everything between.
In your case, it just discarded the not-equals I think. In other instances, when the less-than is lines away from the greater-than, it will shorten your code for you. Quite helpful that way. ;-)
The workarounds are to use things like LT, LE, GT, GE and !=, or post the code not inside the VB tags. I always give a warning about angle-bracket substitutions, and trust the IDE to flag those instances. A pita, I know. It doesn’t show up in normal posts.
..mrt
one more: =SUM(LEN(SUBSTITUTE(A1,{1,2,3,4,5,6,7,8,9,””},””))*-{1,2,3,4,5,6,7,8,9,-45})
don’t know how it compares in speed, but is efficient in that it doesn’t depend on other ranges so only recalculates when A1 changes. lhm
Lori: That’s slick. It works with positive and negative numbers, with an empty cell, and with decimal numbers. It evens adds the numbers in a string that contains non-numeric characters. Of course, whether that’s good or bad is not clear…
If you are willing to accept an error being generated if a cell does not contain a real number (whether as text or as an actual number), 12AB34 as an example, and if you textual numbers are no more than 29 significant digits long, then this one-liner UDF should work…
Function SumDigits2(S As Variant) As Long
If S “” Then SumDigits2 = Evaluate(Format(Replace(S, “.”, “”), “0? & Replace(String(28, “x”), “x”, “+0?)))
End Function
Let’s try that formula again, this time using .ne. in place of the “not equal” sign (less than symbol followed by greater than symbol) that the Comment Processor “ate” (so if copy/pasting this, make the substition back to the “not equal” sign)…
Function SumDigits2(S As Variant) As Long
If S .ne. “” Then SumDigits2 = Evaluate(Format(Replace(S, “.”, “”), “0? & Replace(String(28, “x”), “x”, “+0?)))
End Function
If S NEQ “” Then SumDigits2 = Evaluate(Format(Replace(S, “.”, “”), Replace(string(len(s),“x”), “x”, “+0”)))
End Function
Hans,
Your function will not be able to handle more than 29 total (text) digits because of a limitation in the Format function… it appears to use the Decimal sub-type of the Variant data type to process its first argument and the Decimal sub-type is limited to a maximum of 29 digits total digits providing there is no decimal point (which is the reason for the first Replace function call). For example, try this macro out with your function and you will see the problem (displayed in the Immediate Window)…
Dim X As Long, TestNumber As String
For X = 1 To 3
TestNumber = “123456789012345678901234567” & Mid(“890”, 1, X)
Debug.Print “Sum “ & TestNumber & ” digits is “ & SumDigits2(TestNumber)
Next
End Sub
The first two digit sums will be correct but the third one (where only a zero is being affixed to the end of the second one) will be obviously incorrect. However, I do note your elimination of the concatenation of the leading zero in favor of including it inside the second Replace function call is a good way to simplify the function further…
If S “” Then SumDigits2 = Evaluate(Format(Replace(S, “.”, “”), Replace(String(29, “x”), “x”, “+0”)))
End Function
Arrgh! I did it again… I forgot about the “not equal” sign problem…
If S “” Then SumDigits2 = Evaluate(Format(Replace(S, “.”, “”), Replace(String(29, “x”), “x”, “+0”)))
End Function
What! I did it again!! Okay, I changed the test so that I can’t screw it up again ;-)
If Len(S) Then SumDigits2 = Evaluate(Format(Replace(S, “.”, “”), Replace(String(29, “x”), “x”, “+0”)))
End Function
Rick,
to avoid limits and ‘impossible’ tests:
c1 = Replace(String(29, “+”), “+”, “+0”)
For j = 1 To Len(Replace(S, “.”, “”)) Step 29
SumDigits2 = SumDigits2 + Evaluate(Format(Mid(Replace(S, “.”, “”), j, 29), c1))
Next
End Function
I wrote my original function more as an interesting case because it was a one-liner solution. If you want to go to multiple lines and, especially, a loop solution, then I would use the code Charles Williams posted earlier… the Byte array method would be among the fastest possible loop solutions that could be constructed.