OpenOffice.org Forum at OOoForum.org
 

convert PDF to OOo formats?
mantera
Guest





Posted: Mon Sep 15, 2003 6:40 am    Post subject: convert PDF to OOo formats?

Does anyone know a way to convert a PDF file to a format that OOo can edit?
DannyB
Moderator
Moderator


Joined: 02 Apr 2003
Posts: 3991
Location: Lawrence, Kansas, USA

Posted: Mon Sep 15, 2003 6:57 am

I do not know of any way.

I have been daydreaming for a long time about building a program that would read and parse a PDF, and then create an OOo drawing of what is in the PDF. Each page of the PDF would end up as a separate page of the Drawing. Text, pictures, geometric shapes, etc. You could then edit the drawing and re-export it as a PDF.

But like I said, daydreaming. I've read the PDF document spec before. There are still some aspects of OOo Draw that I need to learn better before I would begin a project like this. (Such as bezier curve shapes.)
_________________
Want to make OOo Drawings like the colored flower design to the left?
carl
Super User
Super User


Joined: 21 Apr 2003
Posts: 920
Location: Germany

Posted: Mon Sep 15, 2003 8:55 am

The Adobe product allows you to save as text.
_________________
carl
Using OpenOffice.org 2 on XP sp2
KirkJobSluder
Power User
Power User


Joined: 25 Apr 2003
Posts: 73

Posted: Mon Sep 15, 2003 10:40 am

A problem with PDF is that you never know how the text is encoded internally. It is popular within my department to scan entire articles and transmit PDFs which are basically a stack of bitmap files.

Depending on the internal coding, there are a couple of PDF->text utilities for both unix and windows. pdftotext is the one I use.
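For anyone who wants to script the pdftotext route, here is a minimal Python sketch. It assumes the pdftotext command-line tool (shipped with Xpdf and Poppler) is on the PATH; the function names `pdftotext_command` and `pdf_to_text` are made up for this example. The `-f`, `-l` and `-layout` options select the first page, the last page, and layout-preserving output. For a scanned PDF that is just a stack of bitmaps, expect empty or nearly empty output, since there is no text to extract.

```python
import shutil
import subprocess


def pdftotext_command(pdf_path, txt_path, first_page=None, last_page=None,
                      keep_layout=False):
    """Build the argument list for a pdftotext call (hypothetical helper)."""
    cmd = ["pdftotext"]
    if first_page is not None:
        cmd += ["-f", str(first_page)]   # first page to convert
    if last_page is not None:
        cmd += ["-l", str(last_page)]    # last page to convert
    if keep_layout:
        cmd.append("-layout")            # try to keep the physical layout
    cmd += [pdf_path, txt_path]
    return cmd


def pdf_to_text(pdf_path, txt_path, **options):
    """Run pdftotext if it is installed; return True on success."""
    if shutil.which("pdftotext") is None:
        return False                     # tool not found on the PATH
    result = subprocess.run(pdftotext_command(pdf_path, txt_path, **options))
    return result.returncode == 0
```

The resulting text file can then be opened in Writer like any other plain text document.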

PDF is a difficult format to translate from because it is highly optimized for consistent printing (in fact, its page description model is derived from PostScript).
ftack
Moderator
Moderator


Joined: 27 Jan 2003
Posts: 3102
Location: Belgium

Posted: Mon Sep 15, 2003 11:43 pm

Yes, it is a format designed for printing and viewing. A PDF is an "end product". You should have access to the source document and recreate the PDF to change it. The source document can be anything from a text document to LaTeX code to a Writer or Word document.
Adobe Acrobat lets you edit a PDF to some extent, but this is limited to correcting some typos and changes that do not affect the layout. We should not expect that Writer will at some point be able to open a PDF for editing, because the format is really not designed for that.
Guest






Posted: Wed Oct 29, 2003 2:50 pm    Post subject: Why I Want Writer to Open a pdf or Postscript File

ftack wrote:
Yes, it is a format designed for printing and viewing. A PDF is an "end product". You should have access to the source document and recreate the PDF to change it. The source document can be anything from a text document to LaTeX code to a Writer or Word document.
Adobe Acrobat lets you edit a PDF to some extent, but this is limited to correcting some typos and changes that do not affect the layout. We should not expect that Writer will at some point be able to open a PDF for editing, because the format is really not designed for that.


But there are instances when one might want to open a PDF in OpenOffice. For example, I have a 500-page PDF and I want to send only page 320 to someone. Or I have a very detailed roadmap in PDF, but it was created like a poster. When I try to print it on letter-size paper, all of the detail is mashed into this small size. I want to print a small portion at 500%. I don't see these examples as editing proper, but I need to open them in some type of editor to produce the results I want.
DannyB
Moderator
Moderator


Joined: 02 Apr 2003
Posts: 3991
Location: Lawrence, Kansas, USA

Posted: Wed Oct 29, 2003 3:14 pm

I would also point out that PDF is not a bitmap format.

Yes it can contain bitmaps, because obviously, a page can contain pictures. Some software, such as scanning software, generates PDFs whose pages are nothing but giant bitmaps.

But really, fundamentally, PDF is a vector graphics format. Some of the "objects" on a page, can be bitmaps. Even a single bitmap per page which takes up the entire page.

In any event, it should be possible to parse a PDF and generate a Draw document whose pages mimic the PDF contents, even if the PDF is just a large collection of scanned bitmaps in some cases. But in most cases, whenever a PDF did NOT come from a scanner, the PDF is vector graphics that are scalable.

I've skimmed through the PDF specification before. This is what got me to thinking many months ago about the possibility to import a PDF into Draw.
ftack
Moderator
Moderator


Joined: 27 Jan 2003
Posts: 3102
Location: Belgium

Posted: Thu Oct 30, 2003 4:32 am

From what DannyB says, there is a future for OOo Draw reading PDFs. In the meantime:

<quote>For example, I have a 500 page pdf and I want to send only page 320 to someone. </quote>

I'd load the PDF in Acrobat Viewer and print a single page to PDF using my Ghostscript/RedMon PDF writer. Or you could open it in Ghostview and save one page directly from Ghostview to PDF.
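If Ghostscript is installed anyway, the single-page extraction can also be done from a script instead of through a viewer. The sketch below only builds the command line; it assumes a Ghostscript executable named `gs` (on Windows the console executable has a different name, for example `gswin32c`), and the helper name and file names are made up for illustration.

```python
def extract_page_command(pdf_in, pdf_out, page, gs_executable="gs"):
    """Build a Ghostscript command that writes one page to a new PDF."""
    return [
        gs_executable,
        "-q",                       # suppress startup messages
        "-dNOPAUSE", "-dBATCH",     # do not prompt between pages, exit when done
        "-sDEVICE=pdfwrite",        # produce PDF output
        f"-dFirstPage={page}",      # first page to process
        f"-dLastPage={page}",       # last page to process (same page here)
        f"-sOutputFile={pdf_out}",
        pdf_in,
    ]


if __name__ == "__main__":
    # Hypothetical file names; page 320 is the example from the question.
    cmd = extract_page_command("manual.pdf", "page320.pdf", 320)
    print(" ".join(cmd))
    # To actually run it (requires Ghostscript to be installed):
    #   import subprocess; subprocess.run(cmd, check=True)
```

Keeping the command builder separate from the call makes it easy to check the arguments before running anything on a large file.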

<quote>Or I have a very detailed roadmap in pdf but it was created like a poster. ... I want to print a small portion at 500%. </quote>

This one's more difficult. I can think of displaying the file full screen, zooming in to the portion of interest and taking a screenshot. Perhaps one could create an EPS from the map (using a PostScript printer driver and printing to file), read it into Draw and enlarge it so that the portion of interest fills the page. Otherwise, print the map to a large bitmap, again using the Ghostscript/RedMon combo, and crop the selection of interest. The problem would probably be the very large intermediate graphic.
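To put rough numbers on the bitmap approach, here is a small Python sketch of the arithmetic. The poster size (24 x 36 inches) and the 300 DPI print resolution are assumptions picked for illustration, and the function names are made up; only the 500% zoom and the letter-size paper come from the question.

```python
def visible_region(paper_w_in, paper_h_in, zoom_percent):
    """Size (inches) of the part of the original that fills the paper at this zoom."""
    factor = zoom_percent / 100.0
    return paper_w_in / factor, paper_h_in / factor


def render_dpi_needed(print_dpi, zoom_percent):
    """Resolution to render the original at so the enlarged crop still has print_dpi."""
    return print_dpi * zoom_percent / 100.0


def bitmap_size(width_in, height_in, dpi, bytes_per_pixel=3):
    """Pixel dimensions and uncompressed size in bytes of the rendered bitmap."""
    w_px = round(width_in * dpi)
    h_px = round(height_in * dpi)
    return w_px, h_px, w_px * h_px * bytes_per_pixel


if __name__ == "__main__":
    print(visible_region(8.5, 11, 500))      # portion of the map that fits on letter paper
    dpi = render_dpi_needed(300, 500)        # resolution needed for the whole poster
    print(dpi, bitmap_size(24, 36, dpi))     # assumed 24 x 36 inch poster
```

With these assumed numbers, a letter page at 500% shows a region of about 1.7 x 2.2 inches of the original, and rendering the whole poster at the required 1500 DPI gives a 36000 x 54000 pixel image, close to 6 GB uncompressed at 3 bytes per pixel. Rendering only the region of interest avoids most of that.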
Guest






Posted: Thu Oct 30, 2003 4:54 am    Post subject: depends...

If the pdf file contains text and embedded pictures, the pdf import plugin for koffice under Linux does a good job. For plain text it does anyway;-)

You'd end up with a koffice file that you can export as a RTF file which you could open in OpenOffice. All embedded pictures are saved as png files if I remember right.

What you can always do is print to a PostScript file and then edit that PostScript file after opening it at a high enough resolution in tools like Photoshop or the GIMP under Unix.

For text-only files, pdf2txt does a quick and good job.

If you search freshmeat.net you will find lots of conversion tools. Some create a PNG file per page, which is basically a screenshot of each page of the PDF file; others combine HTML with embedded pictures.

Overall, one has to say that the quality of the conversion tools is often lousy. Try lots of them and pick the one that suits your needs best.

Juergen
DannyB
Moderator
Moderator


Joined: 02 Apr 2003
Posts: 3991
Location: Lawrence, Kansas, USA

Posted: Thu Oct 30, 2003 6:14 am    Post subject: Re: depends...

Anonymous wrote:
If the pdf file contains text and embedded pictures, the pdf import plugin for koffice under Linux does a good job. For plain text it does anyway;-)

You'd end up with a koffice file that you can export as a RTF file which you could open in OpenOffice. All embedded pictures are saved as png files if I remember right.


It sounds like KOffice imports a PDF as a "word processing" document not as a "drawing" document. I believe that OOo's Draw and NOT Writer is the correct destination for an imported PDF.

PDF is a language for placing black (or color) marks onto paper. These marks can consist of commands such as:
* draw text "FooBar" at position such and so in SansSerif 15 pt.
* draw a triangle at position such and so, fill with puke green
* draw a dashed line over there
Essentially, a vector graphics language.

Now just imagine how you would "render" each of these commands as a draw object, instead of into an array of pixels.

A PDF import module would definitely have to parse the PDF doc, but the "rendering" would not consist of needing a graphics engine to draw pixels, it would instead consist of generating the most appropriate Draw shape for each PDF markup command.
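To make the idea concrete, here is a toy Python sketch of that kind of "rendering". Everything in it is invented for illustration: the command dictionaries are not real PDF operators, and the shape descriptions are not the OOo Draw API. It only shows the principle of turning each drawing command into an editable object instead of pixels.

```python
# Toy page description mirroring the three example commands above.
# Keys, coordinates, and shape names are hypothetical.
EXAMPLE_PAGE = [
    {"op": "text", "string": "FooBar", "at": (72, 700),
     "font": "SansSerif", "size_pt": 15},
    {"op": "polygon", "points": [(100, 100), (200, 100), (150, 180)],
     "fill": "green"},
    {"op": "line", "start": (50, 50), "end": (300, 50), "dashed": True},
]


def command_to_shape(command):
    """Translate one drawing command into a description of an editable shape."""
    op = command["op"]
    if op == "text":
        return {"shape": "text", "string": command["string"],
                "position": command["at"], "font": command["font"],
                "size_pt": command["size_pt"]}
    if op == "polygon":
        return {"shape": "polygon", "points": list(command["points"]),
                "fill": command.get("fill")}
    if op == "line":
        return {"shape": "line", "start": command["start"],
                "end": command["end"], "dashed": command.get("dashed", False)}
    raise ValueError(f"unsupported command: {op!r}")


def page_to_shapes(commands):
    """Convert all commands on one page; each page would become one Draw page."""
    return [command_to_shape(c) for c in commands]


if __name__ == "__main__":
    for shape in page_to_shapes(EXAMPLE_PAGE):
        print(shape)
```

A real importer would have the hard part in front of this step (parsing the PDF content streams, fonts, and coordinate transforms) and would create actual Draw shapes at the end, but the mapping in the middle would follow the same pattern.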
Guest






Posted: Thu Oct 30, 2003 9:27 pm    Post subject: OCR the PDF

ScanSoft's most recent OmniPage claims to OCR PDF files directly, no print and scan required. I doubt they support OOo native formats, but .DOC and RTF are supported.

Of course, you may be able to simply print the PDF files to a 200 or 300 DPI image format (e.g. JPG) and then feed them to any OCR app.

It's the only solution I have ever seen.
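For the print-to-image step, Ghostscript can render the pages directly, so no printer driver is needed. The Python sketch below builds such a command and works out how large the page images get; it assumes a Ghostscript executable named `gs`, and the helper names and file names are made up. The `jpeg` device matches the JPG suggestion above; a lossless device such as `png16m` may be kinder to OCR, though that is a guess rather than something tested here.

```python
def rasterize_command(pdf_in, out_pattern, dpi=300, device="jpeg",
                      gs_executable="gs"):
    """Build a Ghostscript command that renders each page to an image file."""
    return [
        gs_executable,
        "-q", "-dNOPAUSE", "-dBATCH",   # quiet, no prompts, exit when done
        f"-sDEVICE={device}",           # output image format
        f"-r{dpi}",                     # rendering resolution in DPI
        f"-sOutputFile={out_pattern}",  # %03d is replaced by the page number
        pdf_in,
    ]


def page_pixels(width_in, height_in, dpi):
    """Pixel dimensions of one rendered page."""
    return round(width_in * dpi), round(height_in * dpi)


if __name__ == "__main__":
    print(" ".join(rasterize_command("scan.pdf", "page-%03d.jpg", dpi=300)))
    print(page_pixels(8.5, 11, 200), page_pixels(8.5, 11, 300))
```

A letter-size page comes out at 1700 x 2200 pixels at 200 DPI and 2550 x 3300 pixels at 300 DPI.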
dorpm
General User
General User


Joined: 13 Oct 2003
Posts: 29

Posted: Fri Oct 31, 2003 4:44 am

Look at http://www.watsys.unh.edu/Darlene/TechToolsFiles/Postscript-PDF/ps-epslist.html .

Last edited by dorpm on Tue Dec 07, 2004 12:01 pm; edited 1 time in total
chuck
General User
General User


Joined: 30 Nov 2003
Posts: 10

Posted: Sun Nov 30, 2003 7:14 am    Post subject: maps

The Macromedia draw program Freehand will do exactly what you want with your map and PDF pages. You can zoom into the map and print the zoomed section perfectly.
This is beyond the reach of anything Linux has or ever will!
Lee
Guest





Posted: Thu Jan 01, 2004 11:44 am    Post subject: Editing PDFs

It is worth noting that one can open a PDF file in the GIMP, the imaging software which is standard in pretty well all Linux distributions. It treats the PDF as an image, and you can then make changes to it using the tools available there. You can zoom in on a particular part of the image, clip out and copy the parts you want, and so on. You can also add text. If you have a PDF converter set up, you can then print back to PDF format.
Not perfect, but probably the best short term solution.
DannyB
Moderator
Moderator


Joined: 02 Apr 2003
Posts: 3991
Location: Lawrence, Kansas, USA

Posted: Thu Jan 01, 2004 12:41 pm

When the GIMP imports a PDF, won't it rasterize it into pixels?

Therefore, any editing you do is really just editing pixels. The PDF you then save from the GIMP is just more pixels. It has lost any notion of text, fonts, lines, vectors, tables, shapes, etc.
All times are GMT - 8 Hours
Page 1 of 4

 