adamsjw2 General User

Joined: 09 Jan 2003 Posts: 8
|
Posted: Wed Oct 05, 2005 8:16 am Post subject: How can I find characters such as paragraph mark or tab mark |
|
|
Hi All,
I need to know the formatting symbols such as paragraph marks, CRLF, tab etc., so that I can use find & replace to format documents. For example, text copied from the web often has a paragraph mark after each line, so the lines do not wrap to the margin. In MS Word, I can replace ^p with a space, for example, to join the lines.
TIA,
Jim A. |
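For readers who would rather clean the text before pasting it into a document, the join-the-lines idea in the question can be sketched outside the word processor. This is only a minimal Python sketch, not an OOo feature: the function name `join_wrapped_lines` is made up for illustration, and it assumes that a blank line marks a real paragraph boundary while a single line break is just hard wrapping.

```python
import re

def join_wrapped_lines(text):
    """Join hard-wrapped lines into paragraphs; blank lines still separate paragraphs."""
    # Normalise Windows (CRLF) and old Mac (CR) line endings to LF first.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Assumption: one or more blank lines mark a genuine paragraph boundary.
    paragraphs = re.split(r"\n\s*\n", text)
    joined = []
    for paragraph in paragraphs:
        # Inside a paragraph, every remaining line break becomes a single space.
        lines = [line.strip() for line in paragraph.split("\n") if line.strip()]
        if lines:
            joined.append(" ".join(lines))
    return "\n\n".join(joined)
```

The result can then be pasted into Writer, where each blank-line-separated block arrives as its own paragraph.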
 |
RGB Super User


Joined: 25 Nov 2003 Posts: 1743 Location: In Lombardy, near a glass of red Tuscany wine
|
Posted: Wed Oct 05, 2005 8:32 am Post subject: |
|
|
Search in the help for "regular expressions"
 |
adamsjw2 General User

Joined: 09 Jan 2003 Posts: 8
|
Posted: Wed Oct 05, 2005 9:41 am Post subject: |
|
|
RGB,
Thanks, but the help system isn't working on 2.0. I thought it was installed, but apparently not. I'm using the Debian package for OO 2.0.
Jim A. |
 |
JohnV Administrator

Joined: 07 Mar 2003 Posts: 8979 Location: Lexinton, Kentucky, USA
|
Posted: Wed Oct 05, 2005 10:18 am Post subject: |
|
|
Well, here is the list from Help, as best I can manage to copy and paste it in. Be sure to check Regular expressions, which is under More in version 2. Help does not make it clear that $ is used to find paragraph breaks.
Actually, much text copied from the web will have a line break instead of a paragraph break. \n is used to find line breaks, but for mysterious reasons it is also used in the Replace with box to insert a paragraph break, so you cannot find paragraph breaks and replace them with line breaks.
List of Regular Expressions
.
Represents any single character except for a line break or paragraph break. For example, the search term "sh.rt" returns both "shirt" and "short".
^
Only finds the search term if the term is at the beginning of a paragraph. Special objects such as empty fields or character-anchored frames, at the beginning of a paragraph are ignored. Example: "^Peter".
$
Only finds the search term if the term appears at the end of a paragraph. Special objects such as empty fields or character-anchored frames at the end of a paragraph are ignored. Example: "Peter$".
*
Finds zero or more of the characters in front of the "*". For example, "Ab*c" finds "Ac", "Abc", "Abbc", "Abbbc", and so on.
+
Finds one or more of the characters in front of the "+". For example, "AX.+4" finds "AXx4", but not "AX4".
The longest possible string that matches this search pattern in a paragraph is always found. If the paragraph contains the string "AX 4 AX4", the entire passage is highlighted.
?
Finds zero or one of the characters in front of the "?". For example, "Texts?" finds "Text" and "Texts" and "x(ab|c)?y" finds "xy", "xaby", or "xcy".
\
Search interprets the special character that follows the "\" as a normal character and not as a regular expression (except for the combinations \n, \t, \>, and \<). For example, "tree\." finds "tree.", not "treed" or "trees".
\n
Represents a line break that was inserted with the Shift+Enter key combination. To change a line break into a paragraph break, enter \n in the Search for and Replace with boxes, and then perform a search and replace.
\t
Represents a tab. You can also use this expression in the Replace with box.
\>
Only finds the search term if it appears at the end of a word. For example, "book\>" finds "checkbook", but not "bookmark".
\<
Only finds the search term if it appears at the beginning of a word. For example, "\<book" finds "bookmark", but not "checkbook".
^$
Finds an empty paragraph.
^.
Finds the first character of a paragraph.
&
Adds the string that was found by the search criteria in the Search for box to the term in the Replace with box when you make a replacement.
For example, if you enter "window" in the Search for box and "&frame" in the Replace with box, the word "window" is replaced with "windowframe".
You can also enter an "&" in the Replace with box to modify the Attributes or the Format of the string found by the search criteria.
[abc123]
Represents one of the characters that are between the brackets.
[a-e]
Represents any of the characters that are between a and e.
[a-eh-x]
Represents any of the characters that are between a-e and h-x.
[^a-s]
Represents any character that is not between a and s.
\xXXXX
Represents a special character based on its four-digit hexadecimal code (XXXX).
The code for the special character depends on the font used. You can view the codes by choosing Insert - Special Character.
|
Finds the terms that occur before or after the "|". For example, "this|that" finds "this" and "that".
{2}
Defines the number of times that the character in front of the opening bracket occurs. For example, "tre{2}" finds "tree".
{1,2}
Defines the number of times that the character in front of the opening bracket can occur. For example, "tre{1,2}" finds both "tree" and "treated".
{1,}
Defines the minimum number of times that the character in front of the opening bracket can occur. For example, "tre{2,}" (at least two "e" characters) finds "tree", "treee", and "treeeee".
( )
Defines the characters inside the parentheses as a reference. You can then refer to the first reference in the current expression with "\1", to the second reference with "\2", and so on.
For example, if your text contains the number 13487889 and you search using the regular expression (8)7\1\1, "8788" is found.
You can also use () to group terms, for example, "a(bc)?d" finds "ad" or "abcd".
[:digit:]
Represents a decimal digit.
[:space:]
Represents a white space character such as space.
[:print:]
Represents a printable character.
[:cntrl:]
Represents a nonprinting character.
[:alnum:]
Represents an alphanumeric character ([:alpha:] and [:digit:]).
[:alpha:]
Represents an alphabetic character.
[:lower:]
Represents a lowercase character if Match case is selected in Options.
[:upper:]
Represents an uppercase character if Match case is selected in Options. |
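
Several of the examples in the list can be tried outside OOo. The sketch below uses Python's `re` module, which is a different regular expression engine from the one in OOo, so it is only an approximation: `\<`, `\>`, the bare `[:digit:]` style classes and `&` in the Replace with box are not Python syntax (the last line uses `\g<0>`, Python's way of referring to the whole match). The helper name `find_all` is invented for this illustration.

```python
import re

def find_all(pattern, text):
    """Return every non-overlapping match of pattern in text as a list of strings."""
    return [match.group(0) for match in re.finditer(pattern, text)]

# "." stands for any single character.
print(find_all(r"sh.rt", "shirt short"))
# "*" means zero or more of the preceding character.
print(find_all(r"Ab*c", "Ac Abc Abbc"))
# "?" means zero or one of the preceding character or group.
print(find_all(r"Texts?", "Text Texts"))
print(find_all(r"x(ab|c)?y", "xy xaby xcy"))
# "\." is a literal full stop, not "any character".
print(find_all(r"tree\.", "tree. treed trees"))
# "\1" repeats whatever the first parenthesised group matched.
print(find_all(r"(8)7\1\1", "13487889"))
# Python counterpart of putting "&frame" in the Replace with box.
print(re.sub(r"window", r"\g<0>frame", "window"))
```

Anything that behaves differently here from the Help text should be checked in the Find & Replace dialog itself, with Regular expressions ticked.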
 |
adamsjw2 General User

Joined: 09 Jan 2003 Posts: 8
|
Posted: Wed Oct 05, 2005 10:31 am Post subject: |
|
|
JohnV,
That's awesome. Thanks so much!
Jim A. |
 |
Jaap General User

Joined: 30 Jan 2004 Posts: 12
|
Posted: Fri Oct 07, 2005 5:06 am Post subject: find and replace |
|
|
I would like to know how to find a - sign and replace it.
I want every - in the document to be the beginning of a new paragraph.
I cannot find a quick and easy way to do this.
Jaap
 |
Robert Tucker Moderator


Joined: 16 Aug 2004 Posts: 3367 Location: Manchester UK
|
Posted: Fri Oct 07, 2005 6:43 am Post subject: |
|
|
Edit>Find & Replace or Ctrl+F
Search for: -
Replace with: \n\n-
Click “More Options”
Check: Regular expressions
Click “Replace All” |
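
The same replacement can be sketched on plain text outside OOo. This is a rough Python illustration, not what the dialog does internally: the function name `split_at_dashes` is hypothetical, a "paragraph" is simply taken to be one list item, and every "-" is treated as a paragraph start, so hyphenated words would be split as well.

```python
def split_at_dashes(text):
    """Split text into paragraphs so that each "-" begins a new paragraph."""
    # Put a line break in front of every dash, as the Replace with box does.
    marked = text.replace("-", "\n-")
    # Drop empty pieces and surrounding spaces left over from the split.
    return [part.strip() for part in marked.split("\n") if part.strip()]
```

For instance, `split_at_dashes("Items: - apples - pears")` yields three pieces, the second and third each starting with a dash.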
 |
nicklawford7666 General User

Joined: 28 Nov 2005 Posts: 11
|
Posted: Mon Nov 28, 2005 11:11 pm Post subject: |
|
|
I have read all the previous msgs and tried it all.
I am a new user of OOo, starting with 2.0, with no previous experience of any OOo. My PC runs Windows ME. Macros are the reason I want OOo.
But before I can record a macro I need to know how functions work manually.
I have been using M$ Works, but this has no macro facility. In M$ Works (even in old versions like 4.5), if I wanted to search and replace tabs or paragraphs, I simply clicked the right box. (In M$ Word, you select them as special characters, into the macro.) What I have been doing recently, here, where I do not have M$ Office, is to save files from M$ Works as ASCII TXT files, then use Intel Aedit (which is now freeware and has a macro facility), then convert them back to Works format, and eventually to DOC if it is needed later for Office. I want to use OOo specifically for this reason: to avoid going all around the houses and do it all in one application.
Right away I am not seeing what is going on here. Because almost every macro I am going to create in OOo will need to do (along with other functions) 'Find and Replace' for multiple TAB or multiple <cr> (LF/CR) - or replace another character with either single TAB or single <cr> .
Now before I can create a macro I need to know how to do a simple manual replace. Despite looking at help and reading other posts on exactly this matter I am *totally* baffled. I am not stupid - but this is not working according to Help or documentation. Nor does following the steps (which are in fact the same as Help) as posted in another question help.
In simple terms my question is WHAT do I input into the Find and Replace boxes to represent :
TAB ?
<cr> ?
Help for Find and Replace, under Regular Expressions, says things like the expression \t means TAB, and the PDF files from OOo say the same thing. If I literally type \t in the Find and Replace boxes, then on executing the function it literally uses \t as the text string.
Does \t mean to do something else (i.e. not type a "\" followed by a "t")? If so, what it is is not explained in Help. One of the answers to questions here suggested $ means tab. I tried that too; it does not work. Again, if $ was meant to be some variable, then this was not explained.
I will also want to replace Unicode characters.
Now I know the Alt+keypad number trick from long ago. Had it not been for that, OOo would not have been helpful. Help, under Regular Expressions, says to type \xXXXX where XXXX is the 4-digit hex code seen in Insert Special Character. But again, what does that mean? If I type \xXXXX where XXXX is the hex code (and I tried the decimal equivalent too), it again puts in the text string \xXXXX, not the character. Only Alt+keypad+XXXX works (where XXXX is DECimal, not hex).
What am I not seeing here ?
((( I even thought of using 0009 for TAB which OOo accepts as input in a document - but try that in the Find and replace dialogue and it sees it as a normal tab and moves cursor to next input box. )))
I'm sorry if I am sounding dumb. I'm not. I'm just getting annoyed with this as it ought to be a very simple thing to do that I've been able to do without recourse to assistance on various different applications over 20 years.
-- |
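
Since the post mixes hexadecimal codes (as shown by Insert - Special Character and used in \xXXXX) with decimal codes (as typed with Alt+keypad), a small conversion sketch may help. It is plain Python arithmetic and says nothing about how OOo or Windows handle the input; the function name `describe_code_point` and the dictionary keys are invented for this example, and whether Alt+keypad accepts a given decimal value depends on Windows and the application.

```python
def describe_code_point(hex_code):
    """Describe a character given its hexadecimal code, e.g. "0009" for a tab."""
    value = int(hex_code, 16)  # hex string -> integer code point
    return {
        # The four-digit \xXXXX form described in the Help list.
        "search_term": "\\x" + hex_code.upper().zfill(4),
        # The same code written in decimal.
        "decimal": value,
        # The character itself.
        "character": chr(value),
    }
```

For a tab, `describe_code_point("0009")` gives the search term `\x0009` and the decimal value 9.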
 |
Robert Tucker Moderator


Joined: 16 Aug 2004 Posts: 3367 Location: Manchester UK
|
Posted: Tue Nov 29, 2005 2:26 am Post subject: |
|
|
nicklawford7666 wrote:
"If I literally type in \t in the Find and Replace boxes, then on executing the function it is literally using \t as the text string."

Have you clicked "More Options" and checked "Regular expressions"?
 |
Gabor Super User

Joined: 21 Sep 2003 Posts: 610 Location: Hungary (E-Europe)
|
 |
nicklawford7666 General User

Joined: 28 Nov 2005 Posts: 11
|
Posted: Tue Nov 29, 2005 3:19 pm Post subject: |
|
|
On clicking ''more options'' in Find and Replace only one of the 5 options is usable.
I can check - and have tested - 'Backwards' but :
'Current selection only'
'Regular expressions'
'Similarity search'
'Search for styles'
are all greyed out and I cannot check them. Is this the problem ?
I have looked through Help and PDF documents and cannot immediately see an option anywhere that I might have blocked this out.
For other reasons, I am wondering if I need to do a re-install ?
[1] At start-up of OOo I am locked in a cyclic loop with the ''register now / later / never / already'' pop up box. It pops up every time I start OOo , ignores all 4 inputs. I was ignoring this one as an irritating detail to sort out later.
[2] in Writer the drop down box of fonts appears to have rubbish characters in it at the bottom, beyond where the list of fonts seems to end.
--
Nick |
 |
nicklawford7666 General User

Joined: 28 Nov 2005 Posts: 11
|
Posted: Mon Dec 05, 2005 2:33 am Post subject: |
|
|
OK, one more step forwards here.
De-installed OOo completely - and re-installed from the same source.
This seems to have sorted most things. OOo now starts up much faster too, which I can't explain, and the other errors have gone.
''Find and Replace'' now works for TAB as per Help, and likewise for searching for Unicode characters.
That leaves me unable to fathom out how to do the same for paragraphs. I do not see how to do that directly.
I thought of using the HEX code - but since OOo saves ODT files in a compressed format, I cannot use Aedit or other such editor to open one and find the HEX code behind paragraph. Normally in all word processors (except I think ye olden dayes Word Star) I have used before it is two characters 000D 000A [ CR LF ] .
I tried that and it does not work.
What next ?
Hoping once I have this sorted out I shall be well under way with OOo. Seems to otherwise do all I want it to do and more ... for free. Already been plugging it amongst friends - especially those who have PC that come with only M$ Works (and parts of M$ Word) which is pretty limited these days.
--
Nick |
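
On the point about looking inside an ODT file: the compressed format is an ordinary ZIP archive, and the document text lives in a member called content.xml, where each paragraph is a `<text:p>` element rather than a CR LF pair. The following is a minimal Python sketch for inspecting that; the function name `list_paragraphs` is made up, and it ignores tabs and line breaks, which are stored as separate empty elements and so contribute no text.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace used for text elements in OpenDocument files.
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"

def list_paragraphs(odt_file):
    """Return the plain text of each <text:p> element in an ODT file's content.xml."""
    # odt_file may be a path or a binary file-like object.
    with zipfile.ZipFile(odt_file) as archive:
        root = ET.fromstring(archive.read("content.xml"))
    # Each paragraph is its own element, so there is no CR/LF code to search for.
    return ["".join(p.itertext()) for p in root.iter("{%s}p" % TEXT_NS)]
```

Calling `list_paragraphs("example.odt")` on a saved document (the file name is only a placeholder) returns one string per paragraph.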
 |
Robert Tucker Moderator


Joined: 16 Aug 2004 Posts: 3367 Location: Manchester UK
|
Posted: Mon Dec 05, 2005 3:12 am Post subject: |
|
|
This is considered by many to be a weak point of OOo, see:
http://www.oooforum.org/forum/viewtopic.phtml?t=27643
As things are, I think one has to accept that OOo searches only for strings within paragraphs (as I think is pointed out in another forum post). Why one shouldn't need to check that one has hit "enter" only twice between paragraphs rather than three or more times (not to mention the file-processing need to search for paragraphs), I would not like to try to answer.
 |
nicklawford7666 General User

Joined: 28 Nov 2005 Posts: 11
|
Posted: Tue Dec 06, 2005 5:43 pm Post subject: |
|
|
Hhmmmm ..... this is going to be a blocking point.
The files I wish to handle (for which I was going to create macros to handle the routine stuff) are not the result of a human typing more than 2 LF/CR paragraph marks. They are the ASCII output from data collection devices, which is no more than the equivalent of old dumb-screen output sent to file. Lots of large (up to 2-3 MB) files, with lots of multiple tabs (can do) and multiple CR/LF in every file. Despite what today's software developers might think, there are many, many such legacy devices out there that have no hope of being replaced. The devices might not output to a dumb green screen anymore, but they output data in the same format.
Looks like M$ office might have the upper hand here ?
Even Works word processor can search and replace CR/LF.
--
Nick |
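
One possible workaround for files like these is to tidy the raw ASCII before it is ever opened in Writer. The sketch below is a hedged Python illustration of that idea only; it is not an OOo macro, the function name `collapse_runs` is hypothetical, and it assumes that every run of repeated tabs or repeated line endings should shrink to a single one.

```python
import re

def collapse_runs(text):
    """Collapse repeated tabs and repeated line endings in device output."""
    # Treat CRLF and bare CR as LF so all line endings look the same.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # A run of two or more tabs becomes one tab.
    text = re.sub(r"\t{2,}", "\t", text)
    # A run of two or more line endings becomes one line ending.
    text = re.sub(r"\n{2,}", "\n", text)
    return text
```

When reading the device file, opening it with `open(path, newline="")` keeps the original CR and LF characters so that the function sees them unchanged.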
 |
JohnV Administrator

Joined: 07 Mar 2003 Posts: 8979 Location: Lexinton, Kentucky, USA
|
Posted: Wed Dec 07, 2005 6:53 am Post subject: |
|
|
Nick,
Please give this macro a try. It was programmed specifically for converting ASCII text files by stripping out excess paragraph breaks. There are other options provided for additional processing of the file after the main job is done. At the beginning of the actual code there are user-configurable settings, which you may want to use once you understand how the program works.
http://www.oooforum.org/forum/viewtopic.phtml?t=6429 |