# Dynamically extracting the nth numerical element without delimiters

ExcelXOR has a great post on using formulas to extract all numbers from a string where:

• The string in question consists of a mixture of numbers, letters and special characters
• The numbers may appear anywhere within that string
• Decimals within the string are to be returned as such
• The desired result is to have all numbers returned to separate cells

That’s a tall order with formulas. Here’s what ExcelXOR came up with:

``` =0+MID(“α”&s&”α0″,1+SMALL(IF(MMULT(0+(ABS(51.5-CODE(MID(SUBSTITUTE(“α”&s&”α0″,”/”,”α”),ROW(INDIRECT(“1:”&LEN(“α”&s&”α0″)-1)) +{0,1},1)))>6)*{2,1},{1;1})=2,ROW(INDIRECT(“1:”&LEN(“α”&s&”α0″)-1))),e),SUM(SMALL(IF(ISNUMBER(MATCH(MMULT(0+(ABS(51.5-CODE( MID(SUBSTITUTE(“α”&s&”α0″,”/”,”α”),ROW(INDIRECT(“1:”&LEN(“α”&s&”α0″)-1))+{0,1},1)))>6)*{2,1},{1;1}),{1,2},0)),ROW(INDIRECT( “1:”&LEN(“α”&s&”α0″)-1))),2*e+{-1,0})*{-1,1})) ```

…where s is the string you want to break apart, and e the element you want returned.

That fatboy runs to 415 characters. Which is a heck of a lot less than my first effort:
``` =SUM(IFERROR(--REPT(MID(s&"|",ROW(OFFSET(\$A\$1,,,LEN(s)))*COLUMN(OFFSET(\$A\$1,,,,LEN(s)))^0,COLUMN(OFFSET(\$A\$1,,,,LEN(s)))*ROW( OFFSET(\$A\$1,,,LEN(s)))^0),(IF((ISNUMBER(-MID(s&"|",ROW(OFFSET(\$A\$1,,,LEN(s)))*COLUMN(OFFSET(\$A\$1,,,,LEN(s)))^0,COLUMN(OFFSET( \$A\$1,,,,LEN(s)))*ROW(OFFSET(\$A\$1,,,LEN(s)))^0))*NOT(ISNUMBER(-MID("|"&s&"|",ROW(OFFSET(\$A\$1,,,LEN(s)))*COLUMN(OFFSET(\$A\$1,,,, LEN(s)))^0,COLUMN(OFFSET(\$A\$1,,,,LEN(s)))*ROW(OFFSET(\$A\$1,,,LEN(s)))^0)))*NOT(ISNUMBER(-MID(s&"|",1+ROW(OFFSET(\$A\$1,,,LEN(s))) *COLUMN(OFFSET(\$A\$1,,,,LEN(s)))^0,COLUMN(OFFSET(\$A\$1,,,,LEN(s)))*ROW(OFFSET(\$A\$1,,,LEN(s)))^0))))>0,ROW(OFFSET(\$A\$1,,,LEN(s))) *COLUMN(OFFSET(\$A\$1,,,,LEN(s)))^0) = SMALL(IF((ISNUMBER(-MID(s&"|",ROW(OFFSET(\$A\$1,,,LEN(s)))*COLUMN(OFFSET(\$A\$1,,,,LEN(s)))^0, COLUMN(OFFSET(\$A\$1,,,,LEN(s)))*ROW(OFFSET(\$A\$1,,,LEN(s)))^0))*NOT(ISNUMBER(-MID("|"&s&"|",ROW(OFFSET(\$A\$1,,,LEN(s)))*COLUMN( OFFSET(\$A\$1,,,,LEN(s)))^0,COLUMN(OFFSET(\$A\$1,,,,LEN(s)))*ROW(OFFSET(\$A\$1,,,LEN(s)))^0)))*NOT(ISNUMBER(-MID(s&"|",1+ROW(OFFSET (\$A\$1,,,LEN(s)))*COLUMN(OFFSET(\$A\$1,,,,LEN(s)))^0,COLUMN(OFFSET(\$A\$1,,,,LEN(s)))*ROW(OFFSET(\$A\$1,,,LEN(s)))^0))))>0,ROW(OFFSET (\$A\$1,,,LEN(s)))*COLUMN(OFFSET(\$A\$1,,,,LEN(s)))^0),e))),0)) ```

…although there is some pretty nifty stuff going on in there, that I’ll bore you with at a later date…including a MID that breaks a string down into ALL possible slices of text:
`=MID(s&"|",ROW(OFFSET(\$A\$1,,,LEN(s)))*COLUMN(OFFSET(\$A\$1,,,,LEN(s)))^0,COLUMN(OFFSET(\$A\$1,,,,LEN(s)))*ROW(OFFSET(\$A\$1,,,LEN(s)))^0)`

I couldn’t rest with just that. Literally. It was too heavy. So I had another crack. The result: Here’s a generic ExtractNumbers formula I just put together. And by ‘just’ I mean I only just managed it, and it took an entire weekend, I ignored the kids, and forgot to bathe. (That last one is pretty much a given, and I can’t really blame Excel).

``` =MID(s,SMALL( IF( ISERROR( -MID( TEXT( MID("||"&s,ROW( OFFSET( \$A\$1,,,LEN( s))),2),"|"),2,1)),FALSE,ROW( OFFSET( \$A\$1,,,LEN( s)))),e) -1,SUM( SMALL(IF( ISERROR( -MID( TEXT( MID("|"&s&"|",ROW( OFFSET( \$A\$1,,,LEN( s)+1)),2),"|"),{1,2},1)),FALSE,ROW( OFFSET( \$A\$1,,,LEN( s)+1))),{1,2}+(e-1)*2)*{-1,1})) ```

…again, where:
S = the string you want to break apart
E = the number element you want to return

It will handle numbers with decimal places provided there is a digit to the left of the decimal place e.g. SomeText5.745 and NOT SomeText.745, and it’s a much more svelte 277 characters in length. Isn’t she a beauty? A lot of the inspiration for the approach came from Excel Ninja Sajan, over at the awesome Formula Challenges section of Chandoo’s forum.

In that incarnation, you can use it to extract just one element of a specific number:

Or if you prefer, you can use this version:
```=MID(\$A28,SMALL(IF(ISERROR(-MID(TEXT(MID(“||”&\$A28,ROW(OFFSET(\$A\$1,,,LEN(\$A28))),2),”|”),2,1)),FALSE,ROW(OFFSET(\$A\$1,,,LEN (\$A28)))),COLUMNS(\$B28:B28))-1,SUM(SMALL(IF(ISERROR(-MID(TEXT(MID(“|”&\$A28&”|”,ROW(OFFSET(\$A\$1,,,LEN(\$A28)+1)),2),”|”),{1,2},1)), FALSE,ROW(OFFSET(\$A\$1,,,LEN(\$A28)+1))),{1,2}+(COLUMNS(\$B28:B28)-1)*2)*{-1,1}))```

…you can use it to extract numbers into separate columns, where \$A28 holds the string to be split, and B28 holds the first column that you want to extract a number to. Like so:

I don’t know how either of these perform against a UDF, let alone each other. Anyone got a lean, mean, UDF-based splitting machine that we can test it against?

Here’s a sample file:
ExtractNumber_20141120

—Edit 21 Nov 2014—
It turns out that my above formula fails for a few specific special character & number combinations. Here’s a table, showing in which cases Excel will still treat a number as a number when you pad it out with a non number. For completeness I do three tests:

• Special character before the number
• Special character after the number
• Special character on either side of the number

—Edit 8 Dec 2014—
Crikey…after chaining myself to the computer since the last update, I finally managed to cobble together a formula that will extract all numbers in practically any shape or form that the local version of Excel deems as a number. That is, given a string like this:
Jeff Weir Age: 43 DOB: 25/4/71 Salary: \$100,000 StartTime: 8:30
…It returns this
43
25/4/71
100000
8:30

It ain’t pretty. If you like UDFs, then you’ll agree with my pal Gareth Hayter that this is pure formulabation. Unless you like formulas, in which case it’s orgasmic. (You sick, sick analyst, you. Don’t let your boss catch you playing with it in the office…)

``` =MID( String,SMALL( IF( ( FREQUENCY( ISNUMBER( -MID( String,ROW( OFFSET( \$A\$1,,,LEN( String)))*COLUMN( OFFSET( \$A\$1,,,,LEN( String)))^0,COLUMN( OFFSET( \$A\$1,,,,LEN( String)))*ROW( OFFSET( \$A\$1,,,LEN( String)))^0))*( COLUMN( OFFSET( \$A\$1,,,,LEN( String)))+ROW( OFFSET( \$A\$1,,,LEN( String)))-1),ROW( OFFSET( \$A\$1,,,LEN( String)+1))-1)=1)*ISNUMBER( -MID( "|"&String&"|",ROW( OFFSET( \$A\$1,,,LEN( String)+2)),1))=1,ROW( OFFSET( \$A\$1,,,LEN( String)+2))-1,FALSE),ROW( OFFSET( \$A\$1,,,LEN( String)))),SMALL( IF( ( FREQUENCY( ISNUMBER( -MID( String&"|",ROW( OFFSET( \$A\$1,,,LEN( String)))*COLUMN( OFFSET( \$A\$1,,,,LEN( String)))^0,COLUMN( OFFSET( \$A\$1,,,,LEN( String)))*ROW( OFFSET( \$A\$1,,,LEN( String)))^0))*( LEN( String)+1-ROW( OFFSET( \$A\$1,,,LEN( String)))),MOD( LEN( String)+2-ROW( OFFSET( \$A\$1,,,LEN( String)+1)),LEN( String)+1))=1)*ISNUMBER( -MID( "|"&String,ROW( OFFSET( \$A\$1,,,LEN( String)+2)),1))=1,ROW( OFFSET( \$A\$1,,,LEN( String)+2))-1,FALSE),ROW( OFFSET( \$A\$1,,,LEN( String))))-SMALL( IF( ( FREQUENCY( ISNUMBER( -MID( String,ROW( OFFSET( \$A\$1,,,LEN( String)))*COLUMN( OFFSET( \$A\$1,,,,LEN( String)))^0,COLUMN( OFFSET( \$A\$1,,,,LEN( String)))*ROW( OFFSET( \$A\$1,,,LEN( String)))^0))*( COLUMN( OFFSET( \$A\$1,,,,LEN( String)))+ROW( OFFSET( \$A\$1,,,LEN( String)))-1),ROW( OFFSET( \$A\$1,,,LEN( String)+1))-1)=1)*ISNUMBER( -MID( "|"&String&"|",ROW( OFFSET( \$A\$1,,,LEN( String)+2)),1))=1,ROW( OFFSET( \$A\$1,,,LEN( String)+2))-1,FALSE),ROW( OFFSET( \$A\$1,,,LEN( String))))+1) ```

Here’ a file showing how this bit of formulabation comes together: Dynamic Split Numbers_20141208 v6

## 36 thoughts on “Dynamically extracting the nth numerical element without delimiters”

1. Looks like a great effort, Jeff!

Just wondering if you can tweak it to return the numbers at the very end as well? Also, it doesn’t seem to handle all cases with special characters, for example the string:

returns, consecutively: 9 0 5:C 7 #NUM!

(P.S. Thanks a lot for all the kind words and links on your wonderful site! I recall spending whole afternoons pouring through all your articles, along with David Hager's E-Letter Archive series and several others, all of which led to the wonder I now see in Excel formulas. So I'd just like to say many, many thanks!)

2. Hi XOR. Nice to meet you, and thanks for the kind words. I thought it was returning numbers on the end, as per the screenshot of the table above. Although now I see it trips up on your sample string:

```';9>B=0AD
I'm not sure why. I'll take a look, see if I can tweak it.
Hey, where the heck did you come across that INDEX deferencing syntax I talked about in my previous post? I've not seen that anywhere. Inspired. I discovered something similar independently using some combination of OFFSET(OFFSET but can't recall the specifics just now.
```
3. This UDF:

```Function ExtractNumerics(Source) Dim sIn As String Dim sOut() As String Dim lCols As Long Dim lStart As Long Dim lLen As Long Dim sPart As String Select Case TypeName(Source) Case "Range" sIn = CStr(Source.Value) & "| " Case "String" sIn = Source & "| " End Select ReDim sOut(1 To 1, 1 To 1) For lStart = 1 To Len(sIn) For lLen = 1 To Len(sIn) - lStart sPart = Mid(sIn, lStart, lLen) If IsNumeric(sPart) Then If Not IsNumeric(Mid(sIn, lStart, lLen + 1)) Then lCols = lCols + 1 ReDim Preserve sOut(1 To 1, 1 To lCols) sOut(1, lCols) = CDbl(sPart) lStart = lStart + lLen Exit For End If Else Exit For End If Next Next ExtractNumerics = sOut End Function```

``` ```

`is about twice as fast as the formula in the block of cells starting at cell C35.`

4. c00: the string to parse
y : the nth numerical item

```Function F_snb(c00, y) As String c01 = "| " & c00```

``` For j = Len(c01) To 2 Step -1 If IsNumeric(Mid(c01, j, 1)) + IsNumeric(Mid(c01, j - 1, 1)) = -1 Then c01 = Left(c01, j - 1) & " " & IIf(IsNumeric(Mid(c01, j, 1)), "", "|") & Mid(c01, j) Next ```

``` sn = Filter(Split(c01), "|", False) If y < UBound(sn) + 1 Then F_snb = sn(y) End Function```

5. @snb: so how does yours compare to the Excel formulas regarding calc speed?

6. @XOR LX. Have worked out why my formula is failing, and in that specific case it’s because Excel considers : to be a number when at the end of a number, as it denotes a time. I’ve added a table to the end of my post detailing a few special cases. This raises an interesting thought: perhaps your formula needs to be changed to return all these special cases as numbers. Then again, perhaps it already does…I haven’t checked yet…still to busy fixing mine!

7. @Charles: Yeah, I saw your post on the GROUPS function. I haven’t had the time to delve at all into the FastExecl functions despite purchasing it probably over a year ago now. Largely because I’m concentrating on writing materials that will be available in the public domain. So for instance, in my book I’ll be giving FastExcel a mention, but I’ll also be providing cheaper – albeit slower – alternatives to some common issues that users have, such as this one.

I think it’s staggering what you’ve done with FastExcel. But I can also see that many users will turn to FastExcel functions when their workbook starts going slow, when they should firsttake a look at their workbook layout, and then see if FastExcel functions are even needed for their particular situation. Because using FastExcel functions seriously limits the audience for the spreadsheet.

In fact, Narayan put it quite well in this comment at http://forum.chandoo.org/threads/excel-can-not-handle-it.20333/#post-122661 :

There are two aspects to workbook heaviness ; one where the workbook is a well-optimized one , but is burdened with the tasks that it has been developed to handle ; the other where the workbook has grown organically over time the way a tree develops roots viz. haphazardly and in all directions.

A workbook is supposed to be planned well , not just for immediate needs , but anticipating what might come in the future. I leave it to you to decide into which category your workbook fits.
However , once a workbook has assumed such dimensions as yours has , the way to optimization and speed is uncertain.

No doubt the profiler you’ve built into FastExecl will probably do the op at that post just as much good as the FastExcel functions. This isn’t a critique of FastExcel in any way…in fact the best thing that could happen is for MS to pay you enough money so that you pass over your IP to them, and they then give it to all of us as part of the Excel bundle.

8. @Jeff Weir

Yes – the issue you’re seeing is the result of you performing tests on strings of length 2.

When testing for numericalness within strings, anything other than checking on a purely character-by-character process is a risky business, and certainly not the most rigorous approach.

A similar issue occurs with the (sadly oft-recommended) solution for extracting consecutive numbers from a string using LOOKUP, e.g. for consecutive numbers from the start of a string :

=-LOOKUP(1,-LEFT(A1,ROW(INDIRECT(“1:”&LEN(A1)))))

which works in 99.9% of cases, though, since it necessarily relies upon checking strings of lengths greater than one, there will always be the risk (albeit small) of encountering one of a few alphanumeric strings which will cause incorrect results, namely any appropriate mixing of letters and numbers which also happens to qualify as a valid Excel date string within the version being used, or as a valid form of scientific notation.

See here for more:

http://excelxor.com/2014/09/14/extracting-numbers-from-a-string-1-consecutive-numbers-at-start/

Perhaps you can tweak it somehow, though in general it might be best to re-work it so that, as my formula does, it operates on a strictly character-by-character basis, even if this means an increase in formula length.

Regards

9. Sorry! Forgot Code tags!!

10. I completely rely on Jeff’s speed testing skills ..

Meanwhile:

```Function F_snb(c00, y) As String sn = Split(StrConv(c00, 64), Chr(0))```

``` For j = 0 To UBound(sn) If Not IsNumeric(sn(j)) Then sn(j) = "" If j > 0 Then If IsNumeric(sn(j - 1)) Then sn(j - 1) = sn(j - 1) & "~" End If Next sn = Split(Join(sn, ""), "~") ```

``` If y <= UBound(sn) Then F_snb = sn(y - 1) End Function```

11. @ Jeff,

Could you please replace my latest contribution by:

```Function F_snb(c00, y) As String On Error Resume Next```

``` For j = 1 To Len(c00) If Not IsNumeric(Mid(c00, j, 1)) Then Mid(c00, j, 1) = " " Next ```

``` F_snb = Split(Application.Trim(c00))(y - 1) End Function```

12. Even a 1-liner does the trick:

```Function F_snb(c00, y) As String On Error Resume Next F_snb = Split(Application.Trim(Join(Evaluate("transpose(If(isnumber(-mid(""" & c00 & """,row(offset(A1,,," & Len(c00) & ")),1)),mid(""" & c00 & """,row(offset(A1,,," & Len(c00) & ")),1),"" ""))"), "")))(y - 1) End Function```

13. Lori says:

@XOR LX: One possiblity for restricting numeric interpretations is to append something to the string, for instance:

`=-LOOKUP(1,-(LEFT(A1,ROW(INDIRECT("1:"&LEN(A1))))&"**0"))`

will only allow for decimals (not dates, times, fractions or exponentials). Another possibility that only allow for integer values is to append instead &” 0/1″.

@jeff: Clever use of TEXT for identifying numeric pairs, i will make a note of that and then likely forget it. But I really think it’s hard to do any better than Excelxor’s solution which appears to be quite robust and clearly a lot of thought has gone into it. Testing the solutions with the following string “1.2.3″ shows up some differences:

jeff: 1.2.3
xor lx: #NUM!
jkp: {1.2,3}
snb: {1,2,3}

There’s a good case to be made for throwing where there is ambiguity, also it looks like snb’s may need adapting to accept decimals.

14. @Lori

Ingenious. How does Excel interpret e.g. the string “123**0” as 123?

On a side note, did anyone know that two (not one, or three) consecutive asterisks, appropriately-placed, will perform the equivalent role of the letter “E” (or “e”) in generating scientific notation?

For example, typing both:

“1**3”

and

“1E3”

will result in a (numerical) cell value of 1000.

Curiouser and curiouser!

Cheers

15. Ah. Stupidity extreme. I believe I answered my own question!

Apologies. More coffee needed!

16. Lori says:

Using “e0” allows for misinterpretation eg in your example “123jun”. “E+0” might be ok, but the “E” might also be interpreted differently in different locales, so i thought “**0” was safer.

17. So it is a known feature, this syntax? Never seen it documented, myself.

Wonderful in any case!

18. Dick Kusleika says:

Not known to me. Until now.

19. PeterB says:

@snb How is a 201 character ‘one-liner’ helpful?

20. Lori says:

i don’t know where the ** syntax comes from, perhaps it’s there for Lotus 123 compatibility? i haven’t seen it documented either but i’m fairly certain i came across it in exactly these circumstances where a text string had some numbers and asterisks that were misinterpreted in the data cleansing process. The cause took some time to track down, and in the end i came up with the idea of using the same syntax to guard against it.

Of course it depends on your data, but i would definitely not trust the numeric parser on a random string for reasons like this and the naive formulas posted in forums should not be advocated. Even if the formula were 99.9% effective, on data of over a thousand rows you’d expect it to be unreliable. I’d think it was less than that, for instance simple strings such as “1 a” and “1 p” get converted to 0.041666667 (1am) and 0.541666667 (1pm).

21. Harlan Grove says:

Longre’s MOREFUNC.XLL can still be found on the web. The REGEX.MID function it provides makes this fairly easy. The regular expression to find anything which could be interpreted as a number (without E+/- scaling) is [-+]?(\d+(\.\d+)?|\.\d+).

22. I had a UDF using regular expressions that extracted a number from anywhere in a text string:
http://newtonexcelbach.wordpress.com/2014/04/12/extracting-numbers-with-regular-expressions/
The original code came from Winston Snyder at:
http://dataprose.org/2013/12/scrub-your-data-expressively/

Prompted by this thread I have now modified it to extract all the separate values in a text string and return all of them as an array, or any specified value. It also allows any character to be specified as the decimal separator (default .) and any character to be the thousands separator, which is ignored (default ,); so for “a 12,345.678 or 876,543.12, 1” it returns 12345.678 876543.12 1.

Here is the code

``` Function ExtractNums(NumStrings As Variant, Optional Position As Long = 0, Optional MaxNum As Long = 10, _ Optional DecString As String = ".", Optional IgnoreString As String = ",") As Variant```

``` 'Based on code by Winston Snyder: http://dataprose.org/2013/12/scrub-your-data-expressively/ 'Extract one or more numbers from anywhere in an input string, UDF Version Dim RE As Object, Match As Object, Matches As Object, strNumber As String, i As Long, j As Long, k As Long Dim numrows As Long, Row As Long, NumString As String, OutA() As Variant, NumMatches As Long Dim NumChar As Long, PrevMatch As Boolean, Digit As String, REPatt As String 'Initialize variables Set RE = CreateObject("VBScript.RegExp") If TypeName(NumStrings) = "Range" Then NumStrings = NumStrings.Value2 If TypeName(NumStrings) = "String" Then numrows = 1 Else numrows = UBound(NumStrings) End If ReDim OutA(1 To numrows, 1 To MaxNum) On Error Resume Next For Row = 1 To numrows If IsArray(NumStrings) = True Then NumString = NumStrings(Row, 1) Else NumString = NumStrings End If 'Criteria for Regular Expression REPatt = "[0-9" & DecString & "]" With RE .Pattern = REPatt '"[0-9.]" .Global = True Set Matches = .Execute(NumString) End With 'Loop Matches collection and compare with each character of original string to build strings of all numbers in the sample string strNumber = "" NumMatches = Matches.Count NumChar = Len(NumString) j = 0 k = 1 PrevMatch = False For i = 1 To NumChar Digit = Matches(j) If Mid(NumString, i, 1) <> IgnoreString Then If Digit = Mid(NumString, i, 1) Then strNumber = strNumber + Matches(j) j = j + 1 PrevMatch = True ElseIf PrevMatch = True Then If Position = 0 Then OutA(Row, k) = CDbl(strNumber) ElseIf Position = k Then OutA(Row, 1) = CDbl(strNumber) Exit For End If strNumber = "" k = k + 1 PrevMatch = False End If End If Next i If PrevMatch = True Then If Position = 0 Then OutA(Row, k) = CDbl(strNumber) ElseIf Position = k Then OutA(Row, 1) = CDbl(strNumber) End If End If Next Row ExtractNums = OutA ```

``` 'Tidy up 'Destroy objects Set Matches = Nothing Set RE = Nothing End Function ```

23. Revised code that will:
– Recognise numbers starting with + or –
– Recognise numbers in E notation (with upper or lower case E)
– Return #NA if the requested number index does not exist
– Not treat E or e as part of a number unless it is preceded by a digit

``` Function ExtractNums(NumStrings As Variant, Optional Position As Long = 0, Optional MaxNum As Long = 10, _ Optional DecString As String = ".", Optional IgnoreString As String = ",") As Variant```

``` 'Based on code by Winston Snyder: http://dataprose.org/2013/12/scrub-your-data-expressively/ 'Extract one or more numbers from anywhere in an input string, UDF Version Dim RE As Object, Match As Object, Matches As Object, strNumber As String, i As Long, j As Long, k As Long Dim numrows As Long, Row As Long, NumString As String, MatchString As String, OutA() As Variant, NumMatches As Long Dim NumChar As Long, PrevMatch As Boolean, Digit As String, REPatt As String Const ExpString As String = "E", pmString As String = "+-" 'Initialize variables Set RE = CreateObject("VBScript.RegExp") If TypeName(NumStrings) = "Range" Then NumStrings = NumStrings.Value2 If TypeName(NumStrings) = "String" Then numrows = 1 Else numrows = UBound(NumStrings) End If ReDim OutA(1 To numrows, 1 To MaxNum) On Error Resume Next For Row = 1 To numrows If IsArray(NumStrings) = True Then NumString = NumStrings(Row, 1) Else NumString = NumStrings End If 'Criteria for Regular Expression REPatt = "[0-9" & DecString & ExpString & LCase(ExpString) & pmString & "]" With RE .Pattern = REPatt .Global = True Set Matches = .Execute(NumString) End With ```

``` 'Loop Matches collection and compare with each character of original string to build strings of all numbers in the sample string strNumber = "" NumMatches = Matches.Count NumChar = Len(NumString) j = 0 k = 1 PrevMatch = False For i = 1 To NumChar MatchString = Mid(NumString, i, 1) If PrevMatch = False And UCase(MatchString) = ExpString Then j = j + 1 ElseIf MatchString <> IgnoreString Then Digit = Matches(j) If Digit = MatchString Then strNumber = strNumber + Matches(j) j = j + 1 PrevMatch = True ElseIf PrevMatch = True Then If Position = 0 Then OutA(Row, k) = CDbl(strNumber) ElseIf Position = k Then OutA(Row, 1) = CDbl(strNumber) Exit For End If strNumber = "" k = k + 1 PrevMatch = False End If End If Next i If PrevMatch = True Then If Position = 0 Then OutA(Row, k) = CDbl(strNumber) ElseIf k = Position Then OutA(Row, 1) = CDbl(strNumber) End If End If If k < Position Then OutA(Row, 1) = CVErr(xlErrNA) Next Row ExtractNums = OutA 'Tidy up 'Destroy objects Set Matches = Nothing Set RE = Nothing End Function ```

24. Thanks all for the code…I’ll try to find some time to do some speed testing soon.

25. Crikey…after chaining myself to the computer since the last update, I finally managed to cobble together a formula that will extract all numbers in practically any shape or form that the local version of Excel deems as a number. That is, given a string like this:
Jeff Weir Age: 43 DOB: 25/4/71 Salary: \$100,000 StartTime: 8:30
…It returns this
43
25/4/71
100000
8:30

It ain’t pretty. If you like UDFs, then you’ll agree with my pal Gareth Hayter that this is pure formulabation. Unless you like formulas, in which case it’s orgasmic. (You sick, sick analyst, you. Don’t let your boss catch you playing with it in the office…)

Link to sample file in the original post above.

26. In this parat of the world your details would have to be written as:

43 DOB: 25-04-1971 Salary: €100.000,- StartTime: 8.30

27. Jeff Weir says:

Hi snb. You’re in luck: in my sample file you’l note I pre-clean the string with a formula like this:
=SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(B2,” “,”|”),”\$”,””),”,”,””)

I think all you need to do is change it to this:
=SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(B2,” “,”|”),”€”,””),”.”,””)

…although further tweaks might be necessary, which I’ll leave to you.

Note that this ‘pre-cleaning’ could just as easily be performed within the main formula itself. I just thought it was simpler to keep this outside the main formula…

28. I don’t think I will either use or tweak an unnecessary overcomplicated approach to a simple task. ;)

29. snb: I wouldn’t use it either. Hence my “formualbation” comment. Let me be clear with my intent here: to see if I could do something for the sake of it. To quote a fellow New Zealander, I did it to see if I could “Knock the bastard off”. (Aside: That quote comes from Edmond Hillary, who climbed Everest “because it was there”.) To me, formula challenges are a bit like that: a pointless waste of time really, apart from the challenge.

That said, here’s where I feel duty bound to point out that at least my unnecessary overcomplicated approach actually handles this “simple task”. The original task was to split out numbers including decimals. The revised task of that last formula I posted was to extract things that Excel recognizes as numbers be they integers, decimals, dates, or times. Unfortunately your UDFs do neither.

I’ll be posting my own UDF in due course. In fact, the main reason for me doing this is to illustrate in my book that a UDF can often be a much better approach than a complicated formula. And a well-optimized UDF can often thrash the pants off a poorly optimized one.

That said, I live in the world of SharePoint Server, where VBA is not an option. So I have to rely on formulas to do things all the time that would otherwise be much better suited to a UDF.

30. You don’t need to convince me:

```Sub M_snb() c00 = "Jeff Weir Age: 43 DOB: 25/4/71 Salary: \$100,000 StartTime: 8:30"```

``` For j = 1 To Len(c00) If InStr("0123456789/,:", Mid(c00, j, 1)) = 0 Then Mid(c00, j, 1) = " " Next sn = Split(Application.Trim(c00)) For j = 0 To UBound(sn) If Val(sn(j)) = 0 Then sn(j) = "" Next MsgBox Join(Split(Application.Trim(Join(sn))), vbLf) End Sub ```

`Isn't it time to emigrate to non Sharepoint country ?`

31. There’s not too many jobs around where I can work from home in my Pyjamas. This SharePoint gig is one of them.

Your sub is getting there, but doesn’t handle decimal points. While we’re at it, let’s throw in some scientific notation such as 1E3 or 1**3 just to give you a UDF Everest to climb. I’ll keep adding edge cases until a) we reach the summit or b) you tell me to climb it myself.

32. Here’s some more edge cases that my UDF handles: percents, times with am or pm in them, dates like 1 January 2010.

33. Nice !

Posting code? Use <pre> tags for VBA and <code> tags for inline.