OOoForum.org Forum

Converting Documents by using the OpenOffice API


Joined: 09 Aug 2004
Posts: 1
Location: Netherlands

Posted: Mon Aug 09, 2004 10:16 pm    Post subject: Converting Documents by using the OpenOffice API

Hi, although I have used OpenOffice/StarOffice for many years, I'm new to this forum.

Can anybody point me to how I can use the OO API to convert documents from one file format into another, e.g. an MS Word document into an OO *.sxw document or a PDF?

The main issue is that this should be done without launching the GUI part of OO.

For the OO file formats, I would prefer to export plain XML rather than the final OO file, in which the XML is packed into a jar (zip) archive.

I ask because I would like to develop some converters for office documents, and it looks like OO already has all the bells and whistles.

These converters will also be used to extract metadata from those files.
Since the OO file formats are very understandable XML, it seems a good idea to convert any office-like file format into an OO file and then extract the metadata by parsing the OO XML.
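To make the metadata idea concrete, here is a minimal Python sketch, assuming the converted file is a zip package that carries its metadata in a member named meta.xml (as .sxw files do). The function name read_metadata is made up for illustration, and the sketch ignores XML namespaces so it does not depend on a particular format version.

```python
import zipfile
import xml.etree.ElementTree as ET


def read_metadata(path_or_file):
    """Return {local element name: text} for the simple elements in meta.xml.

    Hypothetical helper: it assumes the package contains a "meta.xml"
    member. Elements that carry only attributes (such as document
    statistics) are skipped, and if a name repeats (for example several
    keywords) only the first value is kept.
    """
    with zipfile.ZipFile(path_or_file) as zf:
        root = ET.fromstring(zf.read("meta.xml"))

    result = {}
    for elem in root.iter():
        text = (elem.text or "").strip()
        if text and len(elem) == 0:
            # "{namespace}title" -> "title"
            local_name = elem.tag.rsplit("}", 1)[-1]
            result.setdefault(local_name, text)
    return result
```

Calling read_metadata("report.sxw") on a real file should give a plain dictionary with keys such as title or creator, depending on what the document actually stores.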

If anybody could give me some directions on how to proceed, it would be a great help :)

Joined: 02 Apr 2003
Posts: 3991
Location: Lawrence, Kansas, USA

Posted: Tue Aug 10, 2004 5:58 am    Post subject:

The GUI is an integral part of OOo. But you can drive OOo without displaying a document.

This message is about the simple kind of conversion that most people want to do.
1. Open the document.
2. Save it in the new format.
3. Close the document.

If you need some new output format that OOo cannot save into, then you must either (A) write a custom export filter (possibly using the XSLT filter dialog in the GUI to create a named filter from an existing XSLT file that you must supply), or (B) use the API to inspect the structure of the document, and then use your language's primitives or the API's I/O capabilities to write out a new file, byte by byte.

Conversion consists of three steps.
1. Open the original document. This may require the use of an import filter if TypeDetection is unable to detect the type of the incoming document, or if you want some behavior other than the default. For instance, HTML opens in Writer/Web by default, but for conversion you often want it opened in Writer, which has a larger number of export filters. So you would need to specify the html->writer import filter.
2. Save the document in the new format. You must specify an export filter, unless you are saving in native OOo format.
3. Close the document.
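As a rough illustration of step 1, the Python sketch below picks an import filter only when the default would be the wrong one. The filter name "HTML (StarWriter)" is the one usually listed for opening HTML in Writer, but treat it as an assumption and check it against the filter list for your version. The helper names load_properties and to_property_values are made up for this example.

```python
from pathlib import Path

# Import filter that opens HTML in Writer rather than Writer/Web.
# Assumed name; verify it in the filter list of your OOo version.
WRITER_HTML_FILTER = "HTML (StarWriter)"


def load_properties(source):
    """Return the name/value pairs to pass when opening `source`.

    For most files an empty dict is enough, because TypeDetection
    picks the import filter by itself.
    """
    props = {}
    if Path(source).suffix.lower() in (".html", ".htm"):
        props["FilterName"] = WRITER_HTML_FILTER
    return props


def to_property_values(props):
    """Convert a dict into the tuple of PropertyValue structs UNO expects."""
    import uno  # only available where the OOo Python bridge is installed
    from com.sun.star.beans import PropertyValue

    values = []
    for name, value in props.items():
        pv = PropertyValue()
        pv.Name = name
        pv.Value = value
        values.append(pv)
    return tuple(values)
```

The tuple from to_property_values(load_properties("page.html")) is what you would hand to loadComponentFromURL as its last argument.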

On step 1, you can specify the Hidden=True property so that when the document is opened, it does not appear on the display. If you do this, then it is important to do step 3, close the document, because the document window is invisible, and the user cannot close it.
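Here is a sketch of the whole open / save / close cycle in Python with a hidden document. It assumes an office instance is already running and listening on a socket (for example started with -accept="socket,host=localhost,port=2002;urp;"), and that the script runs under a Python that has the UNO bridge. The filter table and the names export_filter_for and convert are made up for this example; the filter names are the ones commonly listed for Writer documents, so check them against the filter list for your version.

```python
from pathlib import Path

# Export filter names for Writer documents (assumed; verify for your version).
EXPORT_FILTERS = {
    ".pdf": "writer_pdf_Export",
    ".doc": "MS Word 97",
    ".sxw": "StarOffice XML (Writer)",
    ".rtf": "Rich Text Format",
    ".txt": "Text",
}


def export_filter_for(target):
    """Pick an export filter name from the target file's extension."""
    suffix = Path(target).suffix.lower()
    if suffix not in EXPORT_FILTERS:
        raise ValueError("no export filter known for %r" % suffix)
    return EXPORT_FILTERS[suffix]


def convert(source, target, port=2002):
    """Open `source` hidden, store it as `target`, and always close it."""
    import uno  # only available where the OOo Python bridge is installed
    from com.sun.star.beans import PropertyValue

    def prop(name, value):
        pv = PropertyValue()
        pv.Name = name
        pv.Value = value
        return pv

    # Connect to the running office instance.
    local_ctx = uno.getComponentContext()
    resolver = local_ctx.ServiceManager.createInstanceWithContext(
        "com.sun.star.bridge.UnoUrlResolver", local_ctx)
    ctx = resolver.resolve(
        "uno:socket,host=localhost,port=%d;urp;StarOffice.ComponentContext" % port)
    desktop = ctx.ServiceManager.createInstanceWithContext(
        "com.sun.star.frame.Desktop", ctx)

    # Step 1: open without showing a window.
    doc = desktop.loadComponentFromURL(
        Path(source).resolve().as_uri(), "_blank", 0, (prop("Hidden", True),))
    if doc is None:
        raise RuntimeError("could not load %s" % source)
    try:
        # Step 2: save in the new format.
        doc.storeToURL(Path(target).resolve().as_uri(),
                       (prop("FilterName", export_filter_for(target)),))
    finally:
        # Step 3: close, since nobody can close a hidden window by hand.
        doc.close(True)
```

A call such as convert("letter.doc", "letter.pdf") would then cover all three steps. The try/finally is there so the invisible document is closed even if the save fails.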

Here are some links that may help.....

Document Conversion

List of many past conversion examples

Filter list

Excel to Calc conversion using the API

A very similar one, converting Excel to Text

Batch mode conversion

Document conversion

VB: converting Excel files to txt files
General Visual Basic document conversion of Text...

Converting Word -> PDF from the command line

Convert Word --> Writer from the command line

Convert Excel -> PDF from the command line

Convert SXC to CSV from the command line

Convert PPT to HTML from command line...

Convert PPT to HTML short example...

see tail end of thread...

Converting SXW -> PDF

Draw export to PDF

In Python...

Thread about converting document to PDF in Java

I wrote a batch document converter
you can get it here
more discussion of it here...

Macro to save in three formats
Macro to save backups with timestamps

Open HTML with Writer not Web in order to export

Discussion that ends in DocConverter utility.

Convert DBF into XLS, SXC, PDF and HTML

Good Visual Basic code example...converting documents

Draw exporting and printing

Using OOo's source code to read / convert / write documents
in the formats supported by its filters.

Import Export Filters list

No documentation on filter options
Some possible insight to filter options

Developing Filters for Writer

Using XSLT filter adapter to import some XML to Writer

Building XML import/export filters using XSLT


For dBase import filter

Exporting page range to PDF using FilterData