burke Newbie

Joined: 08 Aug 2007 Posts: 2
Posted: Wed Aug 08, 2007 5:48 am Post subject: Headless Conversion -- Character Set Issues
Short version:
How can I restrict the output character set of an export while running OOo in headless mode?
Long version:
I'm converting various documents to HTML with OOo running as a server, started with the following command:
/usr/lib/openoffice/program/soffice -headless -display :99 -accept="socket,port=8100,host=localhost;urp;"
I'm using a package called jodconverter to interface my Ruby on Rails application with OpenOffice. It is called like so:
system("java -jar #{RAILS_ROOT}/lib/jodconverter/jodconverter-cli-2.2.0.jar /tmp/cjs-#{random}/f.#{ext} /tmp/cjs-#{random}/f.xhtml")
(Inside a double-quoted string, Ruby replaces #{expression} with the value of that expression.)
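For illustration, the same call can be built as an argument list, so the interpolated paths never pass through a shell. This is only a sketch: the helper name jodconverter_args and its parameters are made up for this example, and only the jar location and file names come from the command above.

```ruby
# Hypothetical helper (not part of jodconverter): builds the argument
# list for the java call shown above.
def jodconverter_args(rails_root, work_dir, ext)
  jar = File.join(rails_root, "lib", "jodconverter", "jodconverter-cli-2.2.0.jar")
  input  = File.join(work_dir, "f.#{ext}")   # e.g. /tmp/cjs-1234/f.doc
  output = File.join(work_dir, "f.xhtml")    # jodconverter picks the format from the extension
  ["java", "-jar", jar, input, output]
end
```

It would be used as system(*jodconverter_args(RAILS_ROOT, "/tmp/cjs-#{random}", ext)). With several arguments, Kernel#system runs the program directly instead of handing one string to the shell.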
Anyway, this works fine, but I get some fun garbage data. Have a look at http://ltc.umanitoba.ca:3006/article/show/15 . It looks fine, even in source view, but when you export to PDF (which uses a program external to OOo at the moment), there are odd characters that don't belong.
If I pass it through Iconv (UTF-8 to ISO-8859-1) immediately after converting to HTML, I get this: http://ltc.umanitoba.ca:3006/article/show/23 .
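A rough sketch of that re-encoding step, for reference. It uses String#encode rather than Iconv (Iconv was removed from Ruby's standard library in 2.0), and the method name to_latin1_with_refs is invented for this example. It assumes the input string is valid UTF-8. Characters that have no ISO-8859-1 equivalent are written as numeric character references instead of being dropped or raising an error.

```ruby
# Hypothetical helper: re-encode UTF-8 HTML as ISO-8859-1.
# Anything outside ISO-8859-1 (curly quotes, dashes, ...) becomes a
# numeric character reference such as &#x201C;, which HTML consumers
# can still decode.
def to_latin1_with_refs(html)
  html.encode(
    Encoding::ISO_8859_1,
    fallback: ->(char) { format("&#x%X;", char.ord) }
  )
end
```

For example, to_latin1_with_refs("caf\u00e9 \u201chi\u201d") keeps the é as a single ISO-8859-1 byte and turns the two curly quotes into &#x201C; and &#x201D;. Whether the external PDF tool copes with such references is something I haven't checked.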
Is there a character set option I can set, so that I don't have to whitelist allowed characters myself?