View previous topic :: View next topic 
Author 
Message 
Romisek Newbie
Joined: 23 Oct 2010 Posts: 2

Posted: Sat Oct 23, 2010 3:34 am Post subject: Extract number / value from alphanumeric string 


Hi, pls could anyone help me with this task?
I need to extract number / value from a short alphanumeric string as following example (a number always appears at the beginning and can be of different length decimal):
8.5hr
6rm
6.83hr
4rm
To get:
8.5
6
6.83
4
Can I use any Reg Exp function (but without a java script)?
I would also need it to function ideally on both Calc and Excel spreadsheets.
Many thx 

Back to top 


keme Moderator
Joined: 30 Aug 2004 Posts: 2910 Location: Egersund, Norway

Posted: Sat Oct 23, 2010 4:43 am Post subject: 


A few examples that should be easy to understand (but may fail if the string is very much different from the examples given):
=VALUE(LEFT(<text>;SEARCH("([:alpha:])";<text>)1)) should work for your examples.
A slightly more elaborate regex will catch a few other "suffixes". Searching for the first transition from number to text:
=VALUE(LEFT(<text>;SEARCH("([:digit:]([:space:][:alpha:]))";<text>)))
Note that this search term includes the last digit of your number in the match, so the "1" adjustment is not needed.
AFAIK those formulas will not work in Excel. (I don't think this is easily done in Excel without installing some regex addon, but I have limited experience with recent Excel versions so I may be wrong.) 

Back to top 


ken johnson Super User
Joined: 23 Apr 2009 Posts: 2032 Location: Sydney, Australia

Posted: Sat Oct 23, 2010 7:26 am Post subject: 


This array formula starts from the left side of such a numeric/alphabetic string and looks at progressively longer substrings, checking for an error returned by VALUE(substring).
If the error is detected the array formula returns a blank, otherwise it returns the length of the substring.
The maximum value in the resulting array is the number of characters making up the number contained in the string.
This maximum value is then used to extract the numerical characters using the LEFT function.
The VALUE function then converts the substring to a number.
With the numeric/alphabetic strings in column A, starting in A1, this formula in B1 worked in Calc...
Code:  VALUE(LEFT(A1;MAX(IF(ISERROR(VALUE(LEFT(A1;ROW(INDIRECT("$B$1:$B$"&LEN(A1))))));"";ROW(INDIRECT("$B$1:$B$"&LEN(A1))))))) 
Unfortunately, Excel could not cope. This part of the formula returned the number of characters in the numeric substring... Code:  MAX(IF(ISERROR(VALUE(LEFT(A1,ROW(INDIRECT("$B$1:$B"&LEN(A1)))))),"",ROW(INDIRECT("$B$1:$B"&LEN(A1)))))  but it refused to then use this value with the LEFT and VALUE functions.
To get Excel working I've had to do away with the ROW(INDIRECT("$B$1:$B$"&LEN(A1))) part and use instead ROW($B$1:$B$30). Here I have assumed that 30 would be a reasonable maximum number of characters in the longest numeric/alphabetic string in column A.
So, if the assumption that "no column A string is longer than 30 characters" is correct, and if the first string is in A1, then in Calc try... Code:  VALUE(LEFT(A1;MAX(IF(ISERROR(VALUE(LEFT(A1;ROW($B$1:$B$30))));"";ROW($B$1:$B$30)))))  entered into B1 using the Ctrl+Shift+Enter key combination, then hold down the Ctrl key while dragging the fill handle to copy the array formula into the rows below.
In Excel try... Code:  VALUE(LEFT(A1,MAX(IF(ISERROR(VALUE(LEFT(A1,ROW($C$1:$C$30)))),"",ROW($C$1:$C$30)))))  entered into B1 using the Ctrl+Shift+Enter key combination, then filled down to the rows below.
Ken Johnson _________________ If your problem has been solved please add "[Solved]" to the beginning of your first post title (edit button). 

Back to top 


Jens S Power User
Joined: 26 May 2008 Posts: 87 Location: Denmark

Posted: Sat Oct 23, 2010 11:34 am Post subject: 


You can use Regex in Search & Replace (ctrl+F) in Calc:
Search for: [^09\.]
Replace with: (nothing)
x Selected cells
x Regular expressions
Replace all
VBA can handle Regex in Excel, try this udf. Just type in B1: =Freenum(A1)
Function FreeNum(s As String) As String
Dim re As Object
Set re = CreateObject("vbscript.regexp")
With re
.Pattern = "[^09\.]"
.Global = True
.MultiLine = True
End With
FreeNum = re.Replace(s, "")
Set re = Nothing
End Function
Jens 

Back to top 


Romisek Newbie
Joined: 23 Oct 2010 Posts: 2

Posted: Sat Oct 30, 2010 3:24 am Post subject: 


Thank you all very much! I did not expect the answer so quickly.
Seeing the formulas, wow, such a basic task, retrieve a number from a string, my mobile phone can handle this, why it is difficult? There is no formula function for RE? Why? Because of compatibility with Excel?
Anyway, the VB macro seems to be the easiest way after all. Despite I wanted to stay away from "programming" it seems to be the easiest.
Any drawbacks using VB in Calc?
Is it compatible with Excel? I guess yes.
By using Star Basic I guess I would break it wouldn't I?
P.S.
Where I can put the FreeNum() macro please? What is UDF?
Thank you all again, you all helped a lot. 

Back to top 


Villeroy Super User
Joined: 04 Oct 2004 Posts: 10106 Location: Germany

Posted: Sat Oct 30, 2010 5:22 am Post subject: 


Quote:  my mobile phone can handle this, why it is difficult? 
Look at the programming code of your phone software (just in case it is open source).
A spreadsheet can be seen as a simplified, visual programming language with 3 different types of data: Number, text and error. In Excel boolean /true/false) constitutes a 4th type.
You to come to terms with this programming language when you master the following polarities:
Data vs. Formulas (later: array formulas)
Values vs. Formatting
Text vs. Number (including booleans, dates, times)
Absolute vs. Relative referencing
It can not be simplier without losing the versatility of a programming language. The spreadsheet language is not made for string manipulation in the first place. Virtually all scripting languages can extract patterns from strings more easily.
No, VB, VBA, StarBasic are different, much more importantly Excel and Calc are totally different applciations. You may use the same language but you have to tell different things.
Excel does not support regular expressions at all, making any kind of pattern matching much more complicated. VB/VBA has some loadable Regex Module. All this depends on Windows.
OOo comes with a Python runtime with a full set of libraries as a more professional alternative. You can also use JavaScript with Regexes. _________________ Rest in peace, oooforum.org
Get help on https://forum.openoffice.org 

Back to top 




You cannot post new topics in this forum You cannot reply to topics in this forum You cannot edit your posts in this forum You cannot delete your posts in this forum You cannot vote in polls in this forum

Powered by phpBB © 2001, 2005 phpBB Group
