Deutsche Version / German version
A very common task is to extract a specific element (sub-string) from a longer string values. This first part shows some functions to extract the numeric part of a string while the next part will concentrate on getting the nth substring.
So let’s start with an assumption: The string to be processed does not contain 2 or more numeric parts but only one. e.g.
“This is part 123 of 234 parts”
is not allowed. Though
“This is part 123 of all parts”
is a valid string for the following functions.
First we try to evaluate the starting and ending positions of the numeric part. But before we can start using the formulas we have to define a name (Menu ‘Insert – Name – Define’) to make them more readable:
- Name: seq
- RefersTo Formula: =ROW(INDIRECT(“1:1024”))
We start with evaluating the beginning and ending position of our numeric part. For our example
“This is part 123 of all parts”
we should get 14 and 16 as result. Formulas to achieve this result are:
- Getting the starting position of the numeric part:
- Array formula: =MIN(IF(ISNUMBER(-MID(A1,seq,1)),seq))
- Non-array formula: LOOKUP(2,1/MID(A1,1024-seq,1),1024-seq)
- Getting the last position of the numeric part:
- Array formula: =MAX(IF(ISNUMBER(-MID(A1,seq,1)),seq))
- Non-array formula: =LOOKUP(2,1/(ISNUMBER(-MID(A1,seq,1))),seq)
To extract the numeric part all we have to do is combine the above functions together with the MID function:
- Array formula:
=IF(MIN(IF(ISNUMBER(-MID(A1,seq,1)),seq)),–MID(A1,MIN(IF(ISNUMBER(-MID(A1,seq,1)),seq)),MAX(IF(ISNUMBER(-MID(A1,seq,1)),seq))-MIN(IF(ISNUMBER(-MID(A1,seq,1)),seq))+1),””) - Non-array formula:
=IF(ISNUMBER(LOOKUP(2,1/MID(A1,seq,1),seq)),MID(A1,LOOKUP(2,1/MID(A1,1024-seq,1),1024-seq),LOOKUP(2,1/MID(A1,seq,1),seq)-LOOKUP(2,1/MID(A1,1024-seq,1),1024-seq)+1),””)
I have to admit these formulas do not look simple and of course you can achieve the same goal with some VBA code quite easily. But in case you can’t use VBA this provides you with a non-macro alternative.
If performance is an issue you may consider using the following array formula (nearly twice as fast than the other two formula combinations):
=–MID(A1,MATCH(TRUE,ISNUMBER(-MID(A1,seq,1)),0),ROWS(seq)-SUM(–ISERROR(-MID(A1,seq,1))))
The above formulas can be simplified if you can make some additional assumptions about your string value:
- Numeric part is always at the beginning of your string:
- Array formula:
=IF(MIN(IF(ISNUMBER(-MID(A1,seq,1)),seq)),–LEFT(A1,MAX(IF(ISNUMBER(-MID(A1,seq,1)),seq))),””) - Non-array formula:
=IF(ISNUMBER(LOOKUP(2,1/(ISNUMBER(-MID(A1,seq,1))),seq)),–LEFT(A1,LOOKUP(2,1/(ISNUMBER(-MID(A1,seq,1))),seq)),””) - Or a very nice non-array alternative (I think first posted by Aladin Akyurek):
=–LEFT(A1,SUMPRODUCT((LEN(A1)-LEN(SUBSTITUTE(A1,{“.”,0,1,2,3,4,5,6,7,8,9},””)))))
- Array formula:
- Numeric part is always at the end of your string:
- Array formula:
=IF(MIN(IF(ISNUMBER(–MID(A1,seq,1)),seq)),–MID(A1,MIN(IF(ISNUMBER(-MID(A1,seq,1)),seq)),20),””) - Non-array formula:
=–MID(A1, LOOKUP(2,1/MID(A1,1024-seq,1),1024-seq),20)
- Array formula:
This is only a compilation of some common used worksheet formulas for extracting numbers from strings. You may explore these functions in your own spreadsheet.
But keep in mind: For complicated string processing you may switch to VBA (together with using regular expressions).
Frank Kabel
Pages: 1 2