OOoForum.org: The OpenOffice.org Forum
 

Converting Documents by using the OpenOffice API

 
gpeper
Newbie
Joined: 09 Aug 2004
Posts: 1
Location: Netherlands

Posted: Mon Aug 09, 2004 10:16 pm    Post subject: Converting Documents by using the OpenOffice API

Hi, although I have used OpenOffice/StarOffice for many years, I'm new to this forum.

Can anybody point me to how I can use the OO API to convert documents from one file format into another, e.g. an MS Word document into an OO *.sxw document or a PDF?

The main issue is that this should be done without launching the GUI part of OO.

For the OO file formats, I would prefer to export plain XML rather than have it packed (jarred) into the final OO file format.

I ask because I would like to develop some converters for office documents, and it looks like OO already has all the bells and whistles.

These converters will also be used to extract metadata from those files.
Since the OO file formats are very understandable XML, it seems a good idea to convert any office-like file format into an OO file and then extract the metadata by parsing the OO XML.

If anybody could give me some directions on how to continue, it would be a great help :)
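
The metadata idea in this post can be tried with the Python standard library alone, because an .sxw file is a ZIP package whose meta.xml part holds the document metadata. The following is only a minimal sketch: the function name read_metadata, the choice of fields, and the OOo 1.x namespace URIs are assumptions for illustration and should be checked against a real file.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace URIs assumed for OpenOffice.org 1.x packages (.sxw and friends).
NS = {
    "office": "http://openoffice.org/2000/office",
    "meta": "http://openoffice.org/2000/meta",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical selection of fields; meta.xml may contain more than these.
WANTED = {
    "title": "dc:title",
    "creator": "dc:creator",
    "date": "dc:date",
    "generator": "meta:generator",
}


def read_metadata(source):
    """Return a dict of metadata fields found in the package's meta.xml.

    `source` is a path or a binary file-like object holding the ZIP package.
    Fields that are missing or empty are simply left out of the result.
    """
    with zipfile.ZipFile(source) as package:
        root = ET.fromstring(package.read("meta.xml"))
    result = {}
    for key, tag in WANTED.items():
        node = root.find(".//" + tag, NS)
        if node is not None and node.text:
            result[key] = node.text
    return result
```

The same approach gives access to the plain, unpacked XML: reading "content.xml" from the package instead of "meta.xml" returns the document body.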
DannyB
Moderator
Joined: 02 Apr 2003
Posts: 3991
Location: Lawrence, Kansas, USA

Posted: Tue Aug 10, 2004 5:58 am

The GUI is an integral part of OOo. But you can drive OOo without displaying a document.

This message is about the simple kind of conversion that most people want to do.
1. Open document
2. Save in new format.
3. Close document

If you need some new output format that OOo cannot save into, then you must either (A) write a custom export filter (possibly using the XSLT filter dialog in the GUI to create a named filter from an existing XSLT file you must supply), or (B) use the API to inspect the structure of the document, and then use your language primitives or the API's I/O capabilities to write out a new file, byte by byte.

Conversion consists of three steps.
1. Open the original document. This may require use of an import filter if TypeDetection is unable to detect the type of the incoming document, or if you want some behavior other than the default. For instance, opening HTML by default opens in Web, but for conversion, you often want it opened in Writer, which has a larger number of export filters. So you would need to specify the html->writer import filter.
2. Save the document in the new format. You must specify an export filter, unless you are saving in native OOo format.
3. Close the document.
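
As a rough illustration of the filter choices in steps 1 and 2, here is a small Python sketch that picks filter names from file extensions. The function pick_filters and the tables are hypothetical helpers, not part of the OOo API. The filter names are ones commonly used for Writer documents in OOo 1.x, but treat them as assumptions and check them against the filter list for your version.

```python
import os

# Export filter names for Writer documents (assumed; verify for your version).
EXPORT_FILTERS = {
    ".pdf": "writer_pdf_Export",
    ".doc": "MS Word 97",
    ".rtf": "Rich Text Format",
    ".txt": "Text",
    ".html": "HTML (StarWriter)",
}

# Native Writer format: no export filter needs to be named.
NATIVE_EXTENSIONS = {".sxw"}

# Inputs where the default detection is not what a converter wants,
# e.g. HTML should open in Writer rather than in Web.
IMPORT_OVERRIDES = {
    ".html": "HTML (StarWriter)",
    ".htm": "HTML (StarWriter)",
}


def pick_filters(src_path, dst_path):
    """Return (import_filter, export_filter); None means "leave it to OOo"."""
    src_ext = os.path.splitext(src_path)[1].lower()
    dst_ext = os.path.splitext(dst_path)[1].lower()

    # None here lets TypeDetection work out the input type on its own.
    import_filter = IMPORT_OVERRIDES.get(src_ext)

    if dst_ext in NATIVE_EXTENSIONS:
        export_filter = None
    elif dst_ext in EXPORT_FILTERS:
        export_filter = EXPORT_FILTERS[dst_ext]
    else:
        raise ValueError("no export filter known for %r" % dst_ext)
    return import_filter, export_filter
```

For example, pick_filters("letter.doc", "letter.pdf") leaves the import filter to TypeDetection and names the PDF export filter, while an HTML input gets the Writer import filter forced.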

On step 1, you can specify the Hidden=True property so that when the document is opened, it does not appear on the display. If you do this, then it is important to do step 3, close the document, because the document window is invisible, and the user cannot close it.
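
A minimal Python (PyUNO) sketch of the three steps with Hidden=True might look like the following. It assumes you already have a Desktop object (com.sun.star.frame.Desktop) from a running office; obtaining it is not shown. The names convert and make_uno_property are made up for this example, while loadComponentFromURL, storeToURL and close are the API calls being illustrated.

```python
def make_uno_property(name, value):
    """Build a com.sun.star.beans.PropertyValue (needs PyUNO when called)."""
    import uno  # noqa: F401  (enables the com.sun.star imports)
    from com.sun.star.beans import PropertyValue

    prop = PropertyValue()
    prop.Name = name
    prop.Value = value
    return prop


def convert(desktop, src_url, dst_url, export_filter=None,
            import_filter=None, make_property=make_uno_property):
    """Open src_url hidden, store it to dst_url, and always close it.

    Both URLs are file URLs such as "file:///tmp/in.doc".
    export_filter=None stores in the document's native format.
    """
    # Step 1: open the document without showing a window.
    load_props = [make_property("Hidden", True)]
    if import_filter:
        load_props.append(make_property("FilterName", import_filter))
    doc = desktop.loadComponentFromURL(src_url, "_blank", 0, tuple(load_props))
    if doc is None:
        raise IOError("could not load %s" % src_url)

    try:
        # Step 2: save in the new format.
        store_props = ()
        if export_filter:
            store_props = (make_property("FilterName", export_filter),)
        doc.storeToURL(dst_url, store_props)
    finally:
        # Step 3: close, even on failure, since nobody can see the window.
        doc.close(True)
```

The make_property parameter exists only so the function can be exercised without an office installation; in real use the default is what you want.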

Here are some links that may help.....




Document Conversion
===================

List of many past conversion examples
http://www.oooforum.org/forum/viewtopic.php?t=4998

Filter list
http://www.oooforum.org/forum/viewtopic.php?t=3549

Excel to Calc conversion using the API
http://www.oooforum.org/forum/viewtopic.php?t=2668

A very similar one, converting Excel to Text
http://www.oooforum.org/forum/viewtopic.php?t=2819

Batch mode conversion
http://www.oooforum.org/forum/viewtopic.php?p=16056#16056

Document conversion
http://www.oooforum.org/forum/viewtopic.php?t=4163

VB: converting Excel files to txt files
http://www.oooforum.org/forum/viewtopic.php?t=3194
General Visual Basic document conversion of Text...
http://www.oooforum.org/forum/viewtopic.php?p=22034#22034

Converting Word -> PDF from the command line
http://www.oooforum.org/forum/viewtopic.php?t=3772
http://www.oooforum.org/forum/viewtopic.php?t=5513
http://www.oooforum.org/forum/viewtopic.php?t=3768

Convert Word --> Writer from the command line
http://www.oooforum.org/forum/viewtopic.php?p=24891#24891

Convert Excel -> PDF from the command line
http://www.oooforum.org/forum/viewtopic.php?t=5596
http://www.oooforum.org/forum/viewtopic.php?p=21050#21050

Convert SXC to CSV from commandline
http://www.oooforum.org/forum/viewtopic.php?t=6987

Convert PPT to HTML from command line...
http://www.oooforum.org/forum/viewtopic.php?t=5137

Convert PPT to HTML short example...
http://www.oooforum.org/forum/viewtopic.php?t=9437

see tail end of thread...
http://www.oooforum.org/forum/viewtopic.php?t=3772

Converting SXW -> PDF
http://www.oooforum.org/forum/viewtopic.php?t=3017

Draw export to PDF
http://www.oooforum.org/forum/viewtopic.php?t=3545

In Python...
http://www.oooforum.org/forum/viewtopic.php?t=3451

Thread about converting document to PDF in Java
http://www.oooforum.org/forum/viewtopic.php?t=1480

I wrote a batch document converter
http://www.oooforum.org/forum/viewtopic.php?t=3525
http://www.oooforum.org/forum/viewtopic.php?t=2810
http://www.oooforum.org/forum/viewtopic.php?p=10311#10311
you can get it here
http://www.ooomacros.org/user.php#95532
more discussion of it here...
http://www.oooforum.org/forum/viewtopic.php?t=5708

Macro to save in three formats
http://www.oooforum.org/forum/viewtopic.php?t=3612
Macro to save backups with timestamps
http://www.oooforum.org/forum/viewtopic.php?t=7674

Open HTML with Writer not Web in order to export
http://www.oooforum.org/forum/viewtopic.php?t=3973
http://www.oooforum.org/forum/viewtopic.php?p=44367#44367

Discussion that ends in DocConverter utility.
http://www.oooforum.org/forum/viewtopic.php?t=2668

Convert DBF into XLS, SXC, PDF and HTML
http://www.oooforum.org/forum/viewtopic.php?t=5728

Good Visual Basic code example...converting documents
http://www.oooforum.org/forum/viewtopic.php?t=7673

Draw exporting and printing
http://www.oooforum.org/forum/viewtopic.php?t=3620



Using OOo's source code to read / convert / write documents
in the formats supported by its filters.
http://www.oooforum.org/forum/viewtopic.php?t=5785




Import Export Filters list
==========================
http://www.oooforum.org/forum/viewtopic.php?t=3549
http://www.oooforum.org/forum/viewtopic.php?p=15416#15416

http://framework.openoffice.org/files/documents/25/897/filter_description.html

http://www.oooforum.org/forum/viewtopic.php?t=3545
http://www.oooforum.org/forum/viewtopic.php?t=3175
http://www.oooforum.org/forum/viewtopic.php?p=10311#10311

No documentation on filter options
http://www.oooforum.org/forum/viewtopic.php?t=2735
http://www.oooforum.org/forum/viewtopic.php?t=3458
http://www.oooforum.org/forum/viewtopic.php?t=5565
Some possible insight to filter options
http://www.oooforum.org/forum/viewtopic.php?t=6769


Developing Filters for Writer
http://www.oooforum.org/forum/viewtopic.php?p=20672#20672
http://ooo.ximian.com/text-tutorial.html

Using XSLT filter adapter to import some XML to Writer
http://www.oooforum.org/forum/viewtopic.php?t=5825

Building XML import/export filters using XSLT
http://www.oooforum.org/forum/viewtopic.php?t=6809


FilterOptions
=============

For dBase import filter
http://www.oooforum.org/forum/viewtopic.php?t=6058
http://www.oooforum.org/forum/viewtopic.php?t=5728

Exporting page range to PDF using FilterData
http://www.oooforum.org/forum/viewtopic.php?p=31247#31247