OpenOffice.org Forum at OOoForum.org

replacing paragraph marks
 
bountonw
General User

Joined: 20 Jan 2006
Posts: 18
Location: Saraburi, Thailand

Posted: Fri Feb 10, 2006 4:53 pm    Post subject: replacing paragraph marks

New to the community of open source, but have taken the plunge and gone linux...

Frequently I edit materials typed by other people. I am slowly educating them on how to use styles, etc., but often they will format by hand, just spacing or double-returning to get the look that they want.

Sorry for the reference to MS Word, don't mind doing things the new way, just don't know how to yet.

In MS Word I could hit Ctrl F and then replace ^p^p with ^p. Doing this a couple of times would give me single paragraphs all the way through. I could also search and replace for tabs ^t and other things. I looked at the attributes list on search screen and read a little in the help manual and did a search of the forums, but didn't see anything for paragraph replacements. Manually going through 30 pages of text and deleting each double paragraph is going to get old quick.

Appreciate help on this.

howard
OOo Advocate

Joined: 28 Feb 2004
Posts: 320
Location: Newfoundland

Posted: Fri Feb 10, 2006 5:36 pm

Bountonw,

What you are looking for are called "regular expressions" in OOo. You can see a list of them in the built-in help pages. You can use the "find and replace" (<CTRL+F>) mechanism for these, but you must check "regular expressions" in the drop-down menu you get when you click "more options". When I use my optical character reader, all the lines are presented as paragraphs; I remove these by replacing $ with <space> (it doesn't show, of course, but it is there!). If you have some text selected, it assumes you want to change only items within this selection.

All this refers to the English version of OOo; if you are using a different one you'll have to translate it!
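
For comparison outside Writer, here is a minimal Python sketch of the same "turn every line ending into a space" idea applied to plain text. It assumes each OCR line ends with a newline; `join_lines` is a hypothetical helper name, and Python's `re` module is not the engine behind Writer's dialog, so this is only an analogy. One difference worth noting: in Python `$` matches a position rather than the break itself, so the sketch matches the newline character directly.

```python
import re

def join_lines(text: str) -> str:
    """Replace every line break with a single space, merging all lines into one."""
    # Match the newline itself (optionally preceded by a carriage return).
    return re.sub(r"\r?\n", " ", text)

sample = "first line\nsecond line\nthird line"
print(join_lines(sample))  # first line second line third line
```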
_________________
Howard
3.41 (Vanilla) on various versions of PCLinuxOS and XP Home (hardly ever used!)

bountonw
General User

Joined: 20 Jan 2006
Posts: 18
Location: Saraburi, Thailand

Posted: Fri Feb 10, 2006 5:43 pm

Thank you very much for pointing out where to look. I will study this. Is this the same regex that people use in programming? On the linux pages, someone pointed out the benefit of regular expressions. If this is one and the same thing, I can lead two horses with one carrot. (Not into killing birds.)

jgs17
Newbie

Joined: 10 Feb 2006
Posts: 2
Location: Michigan

Posted: Fri Feb 10, 2006 8:23 pm

I've had the same problem searching and replacing paragraph symbols to clean up many text sources.

I found the OOo "\t" option for manual tabs (like MSWord "^t"), but nothing for the carriage return, end of paragraph character (MSWord uses "^p"). Already searched the OOo help pages and internet.

Are there any workarounds if there is no direct character?

Thanks!
_________________
J Smith

-------------------------
OOo 1.9.129 / Kubuntu 5.10

Robert Tucker
Moderator

Joined: 16 Aug 2004
Posts: 3367
Location: Manchester UK

Posted: Fri Feb 10, 2006 11:11 pm

The weakness of OpenOffice in searching for end-of-line characters (or its avoidance of doing so) is well commented upon in these forums.

If one is talking about plain text files then the best piece of software I know to search for them is Bluefish. You just need to hold the mouse button down going from the end of one line to the beginning of the next and then copy and paste into the "find" box of the "find and replace" dialogue box. Voilà, it appears as a little box with the Unicode number in it. It works whether they are soft returns, hard returns, carriage returns - whatever.

Gabor
Super User

Joined: 21 Sep 2003
Posts: 610
Location: Hungary (E-Europe)

Posted: Sat Feb 11, 2006 1:12 am    Post subject: try this

If you wish to search for such control characters regularly, I would suggest downloading Ian Laurenson's:
http://homepages.paradise.net.nz/hillview/OOo/IannzFindReplace.sxw
which is a Writer file with a macro embedded. With it you can easily change almost anything you wish. The site itself is this:
http://homepages.paradise.net.nz/hillview/OOo/
You may find it useful to check out other macros there. Such as AltKeyHandler which allows you to use the Alt key, either alone or in combination with Shift and/or Control (since OOo developers are blind to the Alt key without any public explanation). It is also embedded in an sxw document, with full how-to.

howard
OOo Advocate

Joined: 28 Feb 2004
Posts: 320
Location: Newfoundland

Posted: Sat Feb 11, 2006 5:02 am

I obviously need to give more detailed instructions.

Press F1 - you will get into the itemised help menu.

Type "regular expressions" into the box at the top on the left.

Click on "list of" in the menu below - you will then get a list of regular expressions in the window on the right.

Putting any of these into "Search & Replace" will find and/or replace them when you have "regular expressions" ticked (checked) in the "more options" drop down menu.

One thing that isn't clear is that $ alone will find (and replace if required) the CR-LF at the end of paragraphs.

Also note that clicking on the ¶ symbol in the upper toolbar will reveal the nonprinting characters (paragraph marks, tabs and so on) that some of the more common regular expressions match.
_________________
Howard
3.41 (Vanilla) on various versions of PCLinuxOS and XP Home (hardly ever used!)

BillP
Super User

Joined: 07 Jan 2006
Posts: 2702

Posted: Sat Feb 11, 2006 7:00 am

In Word, you can search for two consecutive paragraph marks and replace with one paragraph mark. You can't do it this way in Writer. Instead, you have to search for empty paragraphs and delete them. The regular expression for an empty paragraph is ^$. Use ctrl-F to open Find and Replace. Check Regular Expressions and put ^$ in the Search for box. Leave the Replace with box empty. Click the Replace all button and all empty paragraphs will be deleted.
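
The two approaches contrasted above can be sketched on plain text in Python. This is only an illustration under the assumption that each paragraph is one newline-terminated line; the function names are hypothetical, and Python's `re` module behaves differently from Writer's dialog (here the pattern has to consume the newline of the empty line explicitly).

```python
import re

def drop_empty_paragraphs(text: str) -> str:
    """Delete completely empty lines, the plain-text analogue of searching for ^$."""
    # (?m) lets ^ match at the start of every line; the \n is the empty line's own break.
    return re.sub(r"(?m)^\n", "", text)

def collapse_breaks(text: str) -> str:
    """Word-style ^p^p -> ^p, done in one pass over any run of consecutive breaks."""
    return re.sub(r"\n{2,}", "\n", text)

sample = "one\n\ntwo\n\n\nthree\n"
print(repr(drop_empty_paragraphs(sample)))  # 'one\ntwo\nthree\n'
print(repr(collapse_breaks(sample)))        # 'one\ntwo\nthree\n'
```

Both functions give the same result on this sample; they differ only in whether the empty line or the run of breaks is the thing being matched.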

bountonw
General User

Joined: 20 Jan 2006
Posts: 18
Location: Saraburi, Thailand

Posted: Sat Feb 11, 2006 8:03 am

I appreciate all of the responses that I am getting in this thread. I think in a regular document, I probably can do what I want. I copied an 11-page newsletter from the internet that was posted in HTML. In front of the extra paragraphs there is one space surrounded by a little yellow box. I managed to get rid of the space by searching [:space:]$ and leaving the replace box blank. The little yellow box moved to include the paragraph mark.

It was no longer a blank paragraph and ^$ didn't work. I ran my mouse over it and found that it is a message which contains the following code

Code:
<!--[if !supportEmptyParas]-->


This is very cryptic to me. Yes, I can manually delete each of these and get on with life, but that wouldn't tell me how to do it the next time.

I appreciate being pointed to regular expressions. Thank you. I experimented with various forms of [:cntrl:] trying to get rid of what I think is a comment window, without success. Any pointers?

JohnV
Administrator

Joined: 07 Mar 2003
Posts: 8984
Location: Lexington, Kentucky, USA

Posted: Sat Feb 11, 2006 8:30 am

Please try this macro which is designed to deal with documents having extra paragraph or line breaks. It even has a little section of code designed specifically to deal with a problem I have, but others may not, when I scan and OCR documents into OOo. http://www.oooforum.org/forum/viewtopic.phtml?t=6429

BillP
Super User

Joined: 07 Jan 2006
Posts: 2702

Posted: Sat Feb 11, 2006 10:10 am

bountonw wrote:
In front of the extra paragraphs there is one space surrounded by a little yellow box. I managed to get rid of the space by searching [:space:]$ and leaving the replace box blank. The little yellow box moved to include the paragraph mark.

It was no longer a blank paragraph and ^$ didn't work. I ran my mouse over it and found that it is a message which contains the following code

Code:
<!--[if !supportEmptyParas]-->


This is very cryptic to me. Yes, I can manually delete each of these and get on with life, but that wouldn't tell me how to do it the next time.

I appreciate being pointed to regular expressions. Thank you. I experimented with various forms of [:cntrl:] trying to get rid of what I think is a comment window, without success. Any pointers?


The yellow box is probably a note. Pressing F5 opens the Navigator where notes are listed. If there is a box with a plus sign to the left of "Notes", click the box to expand the listing of notes. I have found that I can quickly delete all notes by single-clicking on one note in the Navigator to highlight it, then repeatedly pressing the Delete key until all notes have been deleted.
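
The note text quoted above looks like an HTML comment, so if the original HTML source is available, one possible workaround (an assumption, not something tested in this thread) is to strip comments before opening the file in Writer. The sketch below uses Python's `re` module; `strip_html_comments` is a hypothetical helper and the sample markup is made up for illustration, apart from the comment text quoted in the post.

```python
import re

# Non-greedy match so each comment is removed separately; DOTALL lets a comment span lines.
COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)

def strip_html_comments(html: str) -> str:
    """Remove every <!-- ... --> comment from an HTML string."""
    return COMMENT.sub("", html)

# Hypothetical sample: only the first comment's text comes from the post above.
sample = "<p>Text</p><p><!--[if !supportEmptyParas]-->&nbsp;<!--[endif]--></p>"
print(strip_html_comments(sample))  # <p>Text</p><p>&nbsp;</p>
```

A regular expression is a blunt tool for HTML; for anything beyond simple comment removal a real HTML parser would be the safer choice.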

RealGrouchy
OOo Enthusiast

Joined: 25 Jan 2006
Posts: 144
Location: Ottawa, Canada

Posted: Sat Feb 11, 2006 10:17 am

I've got a little twist to this question:

When I typed up a selection of excerpts from a book I was reading, I put "(c###)" at the end of the line ("C" being a note of the author's name, since this was for personal reference only, ### representing the page number).

When I converted the document to a spreadsheet, I wanted to search to find and remove all instances of "(c###)." But when I had it search for regular expressions to catch the numbers, it also took the brackets as part of the expression, and not part of the text I wanted to find and replace.

Although this is all ancient history now, I'd like to know how (and if) to do this in the future.

Thanks,

- RG
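
In most regular-expression dialects, parentheses are grouping operators, and a backslash in front of each one makes it match a literal bracket. The sketch below shows that idea in Python rather than in OOo's own dialog, so whether the identical pattern works in Find & Replace is an assumption to verify. `strip_page_refs` is a hypothetical helper, and the pattern assumes the marker is a lowercase "c" followed by digits.

```python
import re

# \( and \) match literal brackets; \d+ matches the page number; \s* eats the space before it.
PAGE_REF = re.compile(r"\s*\(c\d+\)")

def strip_page_refs(text: str) -> str:
    """Remove markers such as "(c123)" together with any whitespace in front of them."""
    return PAGE_REF.sub("", text)

print(strip_page_refs("A memorable sentence. (c123)"))  # A memorable sentence.
```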
_________________
Quite simply, OOo Impress, does not.

XPsp2, OOo 2.3, SeaMonkey 1.1.7, IE v.6.6.6...

bountonw
General User

Joined: 20 Jan 2006
Posts: 18
Location: Saraburi, Thailand

Posted: Sat Feb 11, 2006 4:09 pm

Thank you BillP. I deleted all the notes. When I deleted one, my cursor would go back to the document, so I had to click back on the navigation bar (116 times!)

I wish there was an easier way, but I am thankful for this navigation bar trick.

I can search for [:space:]$ but not ^[:space:]

I cannot replace with any regular expression. Let us say I want to search for each double space or triple space and replace it with a tab.

I tried searching [:space:][:space:][:space:] (which is very inefficient) but got an error. I have tried several combinations of having /t or [:space:] in the replace line, only to get a literal /t or [:space:] inserted throughout the document. I suggest a different regular expression for space; nine keystrokes to replace one is not efficient. One should also be able to search for more than one of these in a row. (I can type the space key and have it replace, but not at the beginning of paragraphs. I have 3 spaces at the beginning of each paragraph and want to replace them with a tab.)

I am not complaining as I am not going to go back to MS word and am here for the long term. I am just pointing out a real life situation. I am not asking Writer to be the same as Word. It is better in many ways.

My humble newbie suggestion is that the regular expressions be revised a little or expanded to include a shortcut for space, and to include /r for carriage return.

If I don't know what I am talking about, sorry. I am new. I have been playing with the regular expressions and find them powerful and appreciate them. I just wish that I could replace with tabs, spaces, carriage returns, line breaks, etc.

BillP
Super User

Joined: 07 Jan 2006
Posts: 2702

Posted: Sat Feb 11, 2006 5:17 pm

bountonw wrote:
I tried searching [:space:][:space:][:space:] (which is very inefficient) but got an error. I have tried several combinations of having /t or [:space:] in the replace line, only to get a literal /t or [:space:] inserted throughout the document. I suggest a different regular expression for space; nine keystrokes to replace one is not efficient. One should also be able to search for more than one of these in a row. (I can type the space key and have it replace, but not at the beginning of paragraphs. I have 3 spaces at the beginning of each paragraph and want to replace them with a tab.)


I don't know the purpose of the regular expression [:space:]. To find 3 spaces, I just type 3 spaces in the Search for box, not [:space:][:space:][:space:]. I did this to replace 3 spaces at the beginning of a paragraph with a tab, and it worked for me. I am using OOo 2.0.1 on Windows XP SP2.
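
As a plain-text illustration of the same replacement, the Python sketch below swaps exactly three leading spaces for a tab on every line. It assumes one paragraph per line; `leading_spaces_to_tab` is a hypothetical helper name, and the syntax is Python's, not necessarily what Writer's dialog accepts.

```python
import re

def leading_spaces_to_tab(text: str) -> str:
    """Replace three spaces at the start of each line with one tab character."""
    # (?m) makes ^ match at every line start; " {3}" means exactly three spaces.
    return re.sub(r"(?m)^ {3}", "\t", text)

sample = "   Indented paragraph.\nNot indented,   but with inner spaces.\n   Another one."
print(repr(leading_spaces_to_tab(sample)))
```

Spaces in the middle of a line are left alone because the pattern is anchored to the line start.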

jgs17
Newbie

Joined: 10 Feb 2006
Posts: 2
Location: Michigan

Posted: Sat Feb 11, 2006 9:55 pm

OK, per the above comments I was able to successfully remove CR characters (thanks!). The next problem was being able to _insert_ CR characters. Below are the steps for an example text file I might get from a website, old text documents, etc. They have short text lines with a paragraph break at the end of each line and a double paragraph break between intended paragraphs. The goal is to have three text paragraphs with a single CR character at just the end of each paragraph (so all text within a paragraph will word wrap). I inserted “P” to represent the paragraph character shown when the toolbar button is clicked:

[Given]:

P
1asjfls kdjflks P
skdf jsdf P
P
2asd fsadf sdfadf P
sdfs dfsdf P
P
3asd fasd fsadf P
asdf sadf sadf P
P

[Desired output]:

P
1asjfls kdjflks skdf jsdf P
2asd fsadf sdfadf sdfs dfsdf P
3asd fasd fsadf asdf sadf sadf P
P

[So I try]:
A) Replace “^$” with “#####” (or any unique text to replace later) & check “regular expressions” – to mark real paragraphs.

1asjfls kdjflks P
skdf jsdf P
#####2asd fsadf sdfadf P
sdfs dfsdf P
#####3asd fasd fsadf P
asdf sadf sadf P

B) Replace “$” with “ “ (space) character & check “regular expressions” - to eliminate end of line CR and enable normal document word wrap.

1asjfls kdjflksskdf jsdf#####2asd fsadf sdfadfsdfs dfsdf#####3asd fasd fsadfasdf sadf sadf P

C) Replace “#####” with “^$” (c1) or with “$” (c2) & check “regular expressions” - to restore original desired paragraphs.

(c1) 1asjfls kdjflksskdf jsdf^$2asd fsadf sdfadfsdfs dfsdf^$3asd fasd fsadfasdf sadf sadf P
(c2) 1asjfls kdjflksskdf jsdf$2asd fsadf sdfadfsdfs dfsdf$3asd fasd fsadfasdf sadf sadf P

As you can see, no CR characters were replaced, just a standard $ inserted.

Suggestions for this last step?
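
The intended result of steps A to C can be sketched on plain text in Python, which sidesteps the last step because a line break can be written directly in the replacement. This is a workaround outside Writer, not an answer for the Find & Replace dialog; `reflow` is a hypothetical helper, and the sample mirrors the [Given] text above with "P" taken to mean a newline.

```python
import re

def reflow(text: str) -> str:
    """Join hard-wrapped lines; blank lines mark the real paragraph boundaries."""
    # Split on blank lines, which play the role of the ##### markers from step A.
    blocks = re.split(r"\n\s*\n", text.strip())
    # Inside each block, collapse line breaks and repeated spaces to one space (step B).
    paragraphs = [" ".join(block.split()) for block in blocks]
    # Put a single break back between paragraphs (the step C that failed in the dialog).
    return "\n".join(paragraphs)

sample = (
    "\n1asjfls kdjflks \nskdf jsdf \n\n"
    "2asd fsadf sdfadf \nsdfs dfsdf \n\n"
    "3asd fasd fsadf \nasdf sadf sadf \n\n"
)
print(reflow(sample))
```

The leading and trailing empty paragraphs of the sample are dropped by `strip()`; keep them by adding a newline on each side if they matter.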
_________________
J Smith

-------------------------
OOo 1.9.129 / Kubuntu 5.10

All times are GMT - 8 Hours
Page 1 of 3