bountonw General User

Joined: 20 Jan 2006 Posts: 18 Location: Saraburi, Thailand
|
Posted: Fri Feb 10, 2006 4:53 pm Post subject: replacing paragraph marks |
|
|
New to the community of open source, but have taken the plunge and gone Linux...
Frequently I edit materials typed by other people. I am slowly educating them on how to use styles etc., but often they format by hand, just spacing or double-returning to get the look that they want.
Sorry for the reference to MS Word. I don't mind doing things the new way, I just don't know how to yet.
In MS Word I could hit Ctrl F and then replace ^p^p with ^p. Doing this a couple of times would give me single paragraph marks all the way through. I could also search and replace for tabs (^t) and other things. I looked at the attributes list on the search screen, read a little in the help manual and searched the forums, but didn't see anything for paragraph replacements. Manually going through 30 pages of text and deleting each double paragraph is going to get old quick.
Appreciate help on this. |
|
howard OOo Advocate

Joined: 28 Feb 2004 Posts: 320 Location: Newfoundland
|
Posted: Fri Feb 10, 2006 5:36 pm Post subject: |
|
|
Bountonw,
What you are looking for are called "regular expressions" in OOo. You can see a list of them in the built-in help pages. You can use the "find and replace" (<CTRL+F>) mechanism for these, but you must check "regular expressions" in the drop-down menu you get when you click "more options". When I use my optical character reader, every line comes through as a separate paragraph. I remove these by replacing $ with <space> (it doesn't show of course, but it is there!). If you have some text selected, it assumes you want to change only items within this selection.
All this refers to the English version of OOo, if you are using a different one you'll have to translate it! _________________ Howard
3.41 (Vanilla) on various versions of PCLinuxOS and XP Home (hardly ever used!) |
|
bountonw General User

Joined: 20 Jan 2006 Posts: 18 Location: Saraburi, Thailand
|
Posted: Fri Feb 10, 2006 5:43 pm Post subject: |
|
|
| Thank you very much for pointing out where to look. I will study this. Is this the same regex that people use in programming? On the Linux pages, someone pointed out the benefit of regular expressions. If this is one and the same thing, I can lead two horses with one carrot. (Not into killing birds.) |
|
jgs17 Newbie

Joined: 10 Feb 2006 Posts: 2 Location: Michigan
|
Posted: Fri Feb 10, 2006 8:23 pm Post subject: |
|
|
I've had the same problem searching and replacing paragraph symbols to clean up many text sources.
I found the OOo "\t" option for manual tabs (like MSWord "^t"), but nothing for the carriage return, end of paragraph character (MSWord uses "^p"). Already searched the OOo help pages and internet.
Any workarounds if there is no direct character?
Thanks! _________________ J Smith
-------------------------
OOo 1.9.129 / Kubuntu 5.10 |
|
Robert Tucker Moderator


Joined: 16 Aug 2004 Posts: 3367 Location: Manchester UK
|
Posted: Fri Feb 10, 2006 11:11 pm Post subject: |
|
|
The weakness of OpenOffice in searching for end-of-line characters (or its avoidance of doing so) is well commented upon in these forums.
If one is talking about plain text files then the best piece of software I know to search for them is Bluefish. You just need to hold the mouse button down going from the end of one line to the beginning of the next and then copy and paste into the "find" box of the "find and replace" dialogue box. Voilà, it appears as a little box with the Unicode number in it. It works whether they are soft returns, hard returns, carriage returns - whatever. |
|
Gabor Super User

Joined: 21 Sep 2003 Posts: 610 Location: Hungary (E-Europe)
|
Posted: Sat Feb 11, 2006 1:12 am Post subject: try this |
|
|
If you wish to search for such control characters regularly, I would suggest downloading Ian Laurenson's:
http://homepages.paradise.net.nz/hillview/OOo/IannzFindReplace.sxw
which is a Writer file with a macro embedded. With it you can easily change almost anything you wish. The site itself is this:
http://homepages.paradise.net.nz/hillview/OOo/
You may find it useful to check out other macros there. Such as AltKeyHandler which allows you to use the Alt key, either alone or in combination with Shift and/or Control (since OOo developers are blind to the Alt key without any public explanation). It is also embedded in an sxw document, with full how-to. |
|
howard OOo Advocate

Joined: 28 Feb 2004 Posts: 320 Location: Newfoundland
|
Posted: Sat Feb 11, 2006 5:02 am Post subject: |
|
|
I obviously need to give more detailed instructions.
Press F1 - you will get into the itemised help menu.
Type "regular expressions" into the box at the top on the left.
Click on "list of" in the menu below - you will then get a list of regular expressions in the window on the right.
Putting any of these into "Search & Replace" will find and/or replace them when you have "regular expressions" ticked (checked) in the "more options" drop down menu.
One thing that isn't clear is that $ alone will find (and replace if required) the CR-LF at the end of paragraphs.
Also note that clicking on the ¶ symbol in the upper toolbar will reveal some of the more common regular expressions. _________________ Howard
3.41 (Vanilla) on various versions of PCLinuxOS and XP Home (hardly ever used!) |
|
BillP Super User

Joined: 07 Jan 2006 Posts: 2702
|
Posted: Sat Feb 11, 2006 7:00 am Post subject: |
|
|
| In Word, you can search for two consecutive paragraph marks and replace with one paragraph mark. You can't do it this way in Writer. Instead, you have to search for empty paragraphs and delete them. The regular expression for an empty paragraph is ^$. Use ctrl-F to open Find and Replace. Check Regular Expressions and put ^$ in the Search for box. Leave the Replace with box empty. Click the Replace all button and all empty paragraphs will be deleted. |
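For readers who think in programming terms, here is a minimal Python sketch of what the ^$ search does. It assumes the document is modelled as a plain list of paragraph strings, which is an illustration only and not how Writer stores text. The helper name `drop_empty_paragraphs` is made up for the example, and Python's `re` module is not the engine OOo uses.

```python
import re

# ^ is the start of the paragraph and $ is its end, so ^$ matches
# only a paragraph with nothing at all between the two.
EMPTY_PARAGRAPH = re.compile(r"^$")


def drop_empty_paragraphs(paragraphs: list[str]) -> list[str]:
    """Return the paragraphs that are not completely empty (hypothetical helper)."""
    return [p for p in paragraphs if not EMPTY_PARAGRAPH.match(p)]


doc = ["Heading", "", "Body text.", " ", "Closing."]
print(drop_empty_paragraphs(doc))
# prints ['Heading', 'Body text.', ' ', 'Closing.']
```

Note that the paragraph holding a single space survives, because ^$ only matches a paragraph that contains no characters whatsoever.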
|
bountonw General User

Joined: 20 Jan 2006 Posts: 18 Location: Saraburi, Thailand
|
Posted: Sat Feb 11, 2006 8:03 am Post subject: |
|
|
I appreciate all of the responses that I am getting in this thread. I think in a regular document, I can probably do what I want. I copied an 11-page newsletter from the internet that was posted in HTML. In front of the extra paragraphs there is one space surrounded by a little yellow box. I managed to get rid of the space by searching for [:space:]$ and leaving the replace box blank. The little yellow box moved to include the paragraph mark.
It was no longer a blank paragraph and ^$ didn't work. I ran my mouse over it and found that it is a message which contains the following code
| Code: | | <!--[if !supportEmptyParas]--> |
This is very cryptic to me. Yes, I can manually delete each of these and get on with life, but that wouldn't tell me how to do it the next time.
I appreciate being pointed to regular expressions. Thank you. I experimented with various forms of [:cntrl:] trying to get rid of what I think is a comment window, without success. Any pointers? |
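One possible way around this, sketched outside Writer, is to clean the saved HTML before opening it. The sketch assumes the page was saved as a text file and that the markers look like the one quoted above, with a matching <!--[endif]--> closer (an assumption, since only the opening marker appears in the post). The helper name `clean_html_export` is hypothetical, and Python's `re` module is not the engine OOo uses.

```python
import re

# Matches conditional-comment markers such as <!--[if !supportEmptyParas]-->
# and, by assumption, the closing <!--[endif]-->.
CONDITIONAL_COMMENT = re.compile(r"<!--\[(?:if [^\]]*|endif)\]-->")


def clean_html_export(html: str) -> str:
    """Remove conditional-comment markers, then trailing spaces and tabs."""
    without_comments = CONDITIONAL_COMMENT.sub("", html)
    # With re.MULTILINE, $ matches at the end of every line, so this strips
    # whitespace left hanging before each line break.
    return re.sub(r"[ \t]+$", "", without_comments, flags=re.MULTILINE)


sample = "Some text.  \n<!--[if !supportEmptyParas]--> <!--[endif]-->\nMore text.\n"
print(repr(clean_html_export(sample)))
```

Lines that held nothing but a marker and a space come out completely empty, which is the case the ^$ search can then pick up.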
|
JohnV Administrator

Joined: 07 Mar 2003 Posts: 8984 Location: Lexinton, Kentucky, USA
|
Posted: Sat Feb 11, 2006 8:30 am Post subject: |
|
|
| Please try this macro which is designed to deal with documents having extra paragraph or line breaks. It even has a little section of code designed specifically to deal with a problem I have, but others may not, when I scan and OCR documents into OOo. http://www.oooforum.org/forum/viewtopic.phtml?t=6429 |
|
BillP Super User

Joined: 07 Jan 2006 Posts: 2702
|
Posted: Sat Feb 11, 2006 10:10 am Post subject: |
|
|
| bountonw wrote: | In front of the extra paragraphs there is one space surrounded by a little yellow box. I managed to get rid of the space by searching for [:space:]$ and leaving the replace box blank. The little yellow box moved to include the paragraph mark.
It was no longer a blank paragraph and ^$ didn't work. I ran my mouse over it and found that it is a message which contains the following code
| Code: | | <!--[if !supportEmptyParas]--> |
This is very cryptic to me. Yes, I can manually delete each of these and get on with life, but that wouldn't tell me how to do it the next time.
I appreciate being pointed to regular expressions. Thank you. I experimented with various forms of [:cntrl:] trying to get rid of what I think is a comment window, without success. Any pointers? |
The yellow box is probably a note. Pressing F5 opens the Navigator where notes are listed. If there is a box with a plus sign to the left of "Notes", click the box to expand the listing of notes. I have found that I can quickly delete all notes by single-clicking on one note in the Navigator to highlight it, then repeatedly pressing the Delete key until all notes have been deleted. |
|
RealGrouchy OOo Enthusiast


Joined: 25 Jan 2006 Posts: 144 Location: Ottawa, Canada
|
Posted: Sat Feb 11, 2006 10:17 am Post subject: |
|
|
I've got a little twist to this question:
When I typed up a selection of excerpts from a book I was reading, I put "(c###)" at the end of each line ("c" being a note of the author's name, since this was for personal reference only, and ### representing the page number).
When I converted the document to a spreadsheet, I wanted to find and remove all instances of "(c###)". But when I had it search with regular expressions to catch the numbers, it also took the brackets as part of the expression, and not as part of the text I wanted to find and replace.
Although this is all ancient history now, I'd like to know whether (and how) to do this in the future.
Thanks,
- RG> _________________ Quite simply, OOo Impress, does not.
XPsp2, OOo 2.3, SeaMonkey 1.1.7, IE v.6.6.6... |
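The usual cause of this is that parentheses are grouping operators in regular expressions, so they have to be escaped with a backslash to be matched literally. The sketch below shows the idea with Python's `re` module. Whether OOo's Find & Replace accepts exactly the same pattern is an assumption to check against its list of regular expressions. The name `strip_page_refs` is invented for the example.

```python
import re

# \( and \) match literal brackets; unescaped, ( and ) would form a group
# and the brackets in the text would never be matched.
# [ \t]* also removes any spaces or tabs sitting in front of the reference.
PAGE_REF = re.compile(r"[ \t]*\(c[0-9]+\)")


def strip_page_refs(text: str) -> str:
    """Delete every "(c###)" page reference from the text."""
    return PAGE_REF.sub("", text)


print(strip_page_refs("A memorable excerpt (c123)"))
# prints A memorable excerpt
```

Using [0-9]+ rather than a shorthand class keeps the pattern close to what most regular expression dialects understand.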
|
bountonw General User

Joined: 20 Jan 2006 Posts: 18 Location: Saraburi, Thailand
|
Posted: Sat Feb 11, 2006 4:09 pm Post subject: |
|
|
Thank you, BillP. I deleted all the notes. Each time I deleted one, my cursor would go back to the document, so I had to click back on the Navigator (116 times!)
I wish there was an easier way, but I am thankful for this navigation bar trick.
I can search for [:space:]$ but not ^[:space:]
I cannot replace with any regular expression. Let us say I want to search for each double space or triple space and replace it with a tab.
I tried searching for [:space:][:space:][:space:] (which is very inefficient) but got an error. I have tried several combinations of having /t or [:space:] in the replace line, only to get the literal /t or [:space:] inserted throughout the document. I suggest a different regular expression for space; nine keystrokes to replace one is not efficient. One should also be able to search for more than one of these in a row. (I can type the space key and have it replace, but not at the beginning of paragraphs. I have 3 spaces at the beginning of each paragraph and want to replace them with a tab.)
I am not complaining as I am not going to go back to MS word and am here for the long term. I am just pointing out a real life situation. I am not asking Writer to be the same as Word. It is better in many ways.
My humble newbie suggestion is that the regular expressions be revised a little or expanded to include a shortcut for space, and to include /r for carriage return.
If I don't know what I am talking about, sorry. I am new. I have been playing with the regular expressions and find them powerful and appreciate them. I just wish that I could replace with tabs, spaces, carriage returns, line breaks, etc. |
|
BillP Super User

Joined: 07 Jan 2006 Posts: 2702
|
Posted: Sat Feb 11, 2006 5:17 pm Post subject: |
|
|
| bountonw wrote: | | I tried searching for [:space:][:space:][:space:] (which is very inefficient) but got an error. I have tried several combinations of having /t or [:space:] in the replace line, only to get the literal /t or [:space:] inserted throughout the document. I suggest a different regular expression for space; nine keystrokes to replace one is not efficient. One should also be able to search for more than one of these in a row. (I can type the space key and have it replace, but not at the beginning of paragraphs. I have 3 spaces at the beginning of each paragraph and want to replace them with a tab.) |
I don't know the purpose of the regular expression [:space:]. To find 3 spaces, I just type 3 spaces in the Search for box, not [:space:][:space:][:space:]. I did this to replace 3 spaces at the beginning of a paragraph with a tab, and it worked for me. I am using OOo 2.0.1 on Windows XP SP2. |
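For comparison, here is the same replacement written as a Python sketch. It treats each line of a plain-text string as a paragraph, which is an assumption made for illustration, and `indent_to_tab` is a hypothetical helper name. Python's `re` module is not the engine OOo uses, so this shows the idea rather than the exact Writer syntax.

```python
import re


def indent_to_tab(text: str) -> str:
    """Replace exactly three leading spaces on each line with one tab."""
    # With re.MULTILINE, ^ anchors at the start of every line, so runs of
    # three spaces elsewhere in the text are left alone.
    return re.sub(r"^ {3}", "\t", text, flags=re.MULTILINE)


sample = "   First paragraph.\n   Second paragraph.\nNo indent   here."
print(repr(indent_to_tab(sample)))
# prints '\tFirst paragraph.\n\tSecond paragraph.\nNo indent   here.'
```

The quantifier {3} stands in for typing the space three times, and the ^ anchor is what limits the change to the beginning of a paragraph.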
|
jgs17 Newbie

Joined: 10 Feb 2006 Posts: 2 Location: Michigan
|
Posted: Sat Feb 11, 2006 9:55 pm Post subject: |
|
|
OK, per the above comments I was able to successfully remove CR characters (thanks!). The next problem is being able to _insert_ CR characters. Below is an example of the kind of text file I might get from a website, old text documents, etc. These have short text lines with a paragraph break at the end of each line and a double paragraph break between intended paragraphs. The goal is to have three text paragraphs with a single CR character at the end of each paragraph only (so all text within a paragraph will word wrap). I inserted “P” to represent the paragraph character shown when the toolbar button is clicked:
[Given]:
P
1asjfls kdjflks P
skdf jsdf P
P
2asd fsadf sdfadf P
sdfs dfsdf P
P
3asd fasd fsadf P
asdf sadf sadf P
P
[Desired output]:
P
1asjfls kdjflks skdf jsdf P
2asd fsadf sdfadf sdfs dfsdf P
3asd fasd fsadf asdf sadf sadf P
P
[So I try]:
A) Replace “^$” with “#####” (or any unique text to replace later) & check “regular expressions” – to mark real paragraphs.
1asjfls kdjflks P
skdf jsdf P
#####2asd fsadf sdfadf P
sdfs dfsdf P
#####3asd fasd fsadf P
asdf sadf sadf P
B) Replace “$” with “ ” (space) character & check “regular expressions” - to eliminate end of line CR and enable normal document word wrap.
1asjfls kdjflksskdf jsdf#####2asd fsadf sdfadfsdfs dfsdf#####3asd fasd fsadfasdf sadf sadf P
C) Replace “#####” with “^$” (c1) or with “$” (c2) & check “regular expressions” - to restore original desired paragraphs.
(c1) 1asjfls kdjflksskdf jsdf^$2asd fsadf sdfadfsdfs dfsdf^$3asd fasd fsadfasdf sadf sadf P
(c2) 1asjfls kdjflksskdf jsdf$2asd fsadf sdfadfsdfs dfsdf$3asd fasd fsadfasdf sadf sadf P
As you can see, no CR characters were replaced, just a standard $ inserted.
Suggestions for this last step? _________________ J Smith
-------------------------
OOo 1.9.129 / Kubuntu 5.10 |
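Nothing above shows a way to insert a paragraph break from the replace box, so one workaround is to do the whole clean-up on the plain text before it reaches Writer. The following is a minimal Python sketch of steps A to C done in one pass. It assumes the source is available as a plain-text string in which each P is a newline, and `unwrap_paragraphs` is a hypothetical helper name.

```python
import re


def unwrap_paragraphs(text: str) -> str:
    """Join hard-wrapped lines, keeping blank lines as paragraph boundaries."""
    # Two or more newlines in a row mark a real paragraph boundary (step A).
    blocks = re.split(r"\n{2,}", text.strip("\n"))
    # Inside a block, every line break becomes a single space (step B).
    paragraphs = [
        " ".join(line.strip() for line in block.split("\n"))
        for block in blocks
    ]
    # One newline per paragraph restores the intended breaks (step C).
    return "\n".join(paragraphs) + "\n"


given = (
    "\n1asjfls kdjflks\nskdf jsdf\n\n"
    "2asd fsadf sdfadf\nsdfs dfsdf\n\n"
    "3asd fasd fsadf\nasdf sadf sadf\n\n"
)
print(unwrap_paragraphs(given))
```

Because the paragraph boundaries are kept as list items rather than as a ##### marker, there is no final step that needs to turn text back into a paragraph break.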
|