OpenOffice.org Forum at OOoForum.org
 

How to change the structure of this specific text.

 
teste
General User


Joined: 23 May 2010
Posts: 7

Posted: Wed May 26, 2010 2:53 pm    Post subject: How to change the structure of this specific text.

Hi people, I have a large block of text copied from a PDF file with this structure:

103119556, ABELAR VIEIRA ROSA NETO, 67.00, 1319; 103150727, ABELARDO MELO GOMES,
61.00, 2962; 103110766, ABIGAIL DA SILVA JOSE, 66.00, 1461; 103108055, ABNER FERREIRA
SANTOS DE SOUZA, 60.00, 3532; 103126985, ADAIL SOARES SIQUEIRA JUNIOR, 64.00, 2045;
103126151, ADALA MICHELINE GALVAO RUELA FELICIANO, 70.00, 845;

I would like to know how to reformat this text so that it looks like this:
103119556, ABELAR VIEIRA ROSA NETO, 67.00, 1319;
103150727, ABELARDO MELO GOMES, 61.00, 2962;
103110766, ABIGAIL DA SILVA JOSE, 66.00, 1461;
....


What are the steps to do this?
Could someone help me?

I look forward to an answer, thanks.
Taolin
Newbie


Joined: 26 May 2010
Posts: 3
Location: Texas

Posted: Wed May 26, 2010 6:50 pm

I followed your discussion on this in the OpenOffice.org IRC channel and now, reading the above, I see why none of the advice you were given could work.

All of the suggestions made were targeted at turning the ";" into a CR. The problem is that you *already* have a bunch of CRs in your data, which makes it impossible for every line to come out with the same number of fields.

The "\n" or "\r" business, whether it is done with SED or OO itself, will correctly replace every ";" with a CR, but it won't do what you are asking: leave the ";" at the end and make each new line begin with the next character after a ";". You have to get rid of the CRs that were added by the paste/save (whatever) from the PDF.

103119556, ABELAR VIEIRA ROSA NETO, 67.00, 1319;

will break correctly (although it will lose the ";") but then you also have an *existing* break after

MELO GOMES,

so that winds up on a line by itself, then the next line in the input file breaks into two lines at

66.00, 1461; 103108055, ABNER
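
To make the failure concrete, here is a small Python sketch (Python is only an illustration here, it is not what was suggested on IRC). It takes the first two pasted lines from the question, keeps their existing line breaks, and turns every "; " into a break. The variable names are made up for the example.

```python
# First two lines of the pasted text, with the line breaks that came from the PDF.
pasted = (
    "103119556, ABELAR VIEIRA ROSA NETO, 67.00, 1319; 103150727, ABELARDO MELO GOMES,\n"
    "61.00, 2962; 103110766, ABIGAIL DA SILVA JOSE, 66.00, 1461; 103108055, ABNER FERREIRA\n"
)

# Naive approach: every "; " becomes a line break, existing breaks are left alone.
naive = pasted.replace("; ", "\n")
naive_lines = naive.splitlines()

# The second record is cut in two: its name ends one line and its
# score and rank sit alone on the next line.
for line in naive_lines:
    print(repr(line))
```

The record for ABELARDO MELO GOMES ends up spread over two output lines, which is exactly the uneven field count described above.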

You will have to remove all of the existing CR (CR/LF, probably) before the OO Find/Replace or SED methods will work. SED reads its input one line at a time, so it is easier to join the lines with tr first, with something like

tr -s '\r\n' ' ' < originalfile.txt > joined.txt

then use the SED command that you were given in the IRC channel, with a g flag so that every ";" on the line is replaced and not just the first one
(sed -e 's/;/\n/g' joined.txt > result.txt [if I recall correctly; the \n in the replacement is GNU sed syntax]) and even then, I would not kill off the ";", but rather use sed -e 's/;/;\n/g' joined.txt > result.txt

That may leave you a few blanks to clean up, but that should be worst-case.
I am no SED expert, but I *am* sure of the cause of the failure of the advice you received on IRC.
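
For anyone who prefers a script, the same two-stage idea (first drop the existing breaks, then break after each ";") could be sketched in Python roughly as follows. This is only a sketch under assumptions: the function name one_record_per_line is made up, and it assumes every record ends with a ";" as in the sample.

```python
import re


def one_record_per_line(text: str) -> str:
    """Join the pasted lines, then start a new line after every ';'."""
    # Stage 1: replace each existing line break (CR/LF, LF or CR), plus any
    # blanks around it, with a single space so wrapped records are whole again.
    joined = re.sub(r"\s*[\r\n]+\s*", " ", text)
    # Stage 2: split at ';', trim the leftover blanks, and put the ';' back.
    # Note: a final record with no ';' would get one appended as well.
    records = [part.strip() + ";" for part in joined.split(";") if part.strip()]
    return "\n".join(records)


# The four pasted lines from the question, line breaks included.
pasted = (
    "103119556, ABELAR VIEIRA ROSA NETO, 67.00, 1319; 103150727, ABELARDO MELO GOMES,\n"
    "61.00, 2962; 103110766, ABIGAIL DA SILVA JOSE, 66.00, 1461; 103108055, ABNER FERREIRA\n"
    "SANTOS DE SOUZA, 60.00, 3532; 103126985, ADAIL SOARES SIQUEIRA JUNIOR, 64.00, 2045;\n"
    "103126151, ADALA MICHELINE GALVAO RUELA FELICIANO, 70.00, 845;"
)

result = one_record_per_line(pasted)
print(result)
```

Joining with a space rather than with nothing keeps wrapped names such as "ABNER FERREIRA" and "SANTOS DE SOUZA" from running together.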

Please feel welcome to contact me directly on IRC and I can help more.
JohnV
Administrator


Joined: 07 Mar 2003
Posts: 9183
Location: Lexington, Kentucky, USA

Posted: Wed May 26, 2010 8:20 pm

Use Find & Replace with Regular Expressions checked. The 1st step may not be needed if you have no line breaks.

Search = \n - find line breaks
Replace = \n - replace with paragraph breaks (looks strange but it is correct)

Search = <spacebar>$ - find a space followed by paragraph breaks
Replace = nothing - the space is removed but paragraph breaks will remain

Search = $ - find paragraph breaks
Replace = nothing - paragraph breaks will be deleted

Search = ;<spacebar> - find ; plus a space
Replace = \n - replace with paragraph breaks

You will have to fix the last line yourself because there is no space after the ;.
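
The effect of the last step, and the reason the last line needs fixing by hand, can be seen on a small sample. The sketch below uses Python string replacement as a stand-in for Writer's Find & Replace, so it is an analogy rather than what Writer does internally; the sample is three records from the question, already joined into one paragraph, and the variable names are made up.

```python
# Three records already joined into a single paragraph (earlier steps done).
joined = (
    "103119556, ABELAR VIEIRA ROSA NETO, 67.00, 1319; "
    "103150727, ABELARDO MELO GOMES, 61.00, 2962; "
    "103110766, ABIGAIL DA SILVA JOSE, 66.00, 1461;"
)

# Same idea as the last Find & Replace step: each "; " becomes a break.
step4_lines = joined.replace("; ", "\n").split("\n")

# The final ";" has no space after it, so it is not matched and stays
# on the last line, while the other lines have lost theirs.
for line in step4_lines:
    print(line)
```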
teste
General User


Joined: 23 May 2010
Posts: 7

Posted: Fri May 28, 2010 4:45 am

Hi JohnV, your suggestion works,

Thanks
All times are GMT - 8 Hours