Friday, 15 October 2010

Converting Microsoft Office (Word, Excel) documents to PDFs in Java

This question is asked a lot on the JDC, and I'm sure it will continue to be asked now that the JDC has been merged with the OTN.

The stock answer is a link to Apache POI and, if you are lucky, some complaints about not googling before asking. Alas, this is not helpful: Apache POI is not a mature product and is focused primarily on getting data out of documents, not on rendering them.
A few other Java options exist for reading and writing Office documents, but most focus on getting or writing data, not on rendering it.

Three products that I know of can render Office documents:

JDocToPdf
Uses Apache POI to read the Word document and iText to write the PDF. Completely free, 100% Java but has some limitations.

Snowbound Imaging SDK
Snowbound appears to be a 100% Java solution and costs over $2,500. The evaluation download contains samples showing how to convert documents.

OpenOffice API
OpenOffice is a native office suite that exposes a Java API. The API supports reading Office documents and writing PDF documents. The SDK contains an example of document conversion (examples/java/DocumentHandling/DocumentConverter.java). To write PDFs you need to pass the "writer_pdf_Export" filter name rather than the "MS Word 97" one.
Alternatively, you can use JODConverter, a wrapper around this API.
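
Since the filter name depends on the kind of document being exported, a small helper can pick it from the file extension. The following is only a sketch: the class name PdfFilterChooser and the extension list are my own assumptions, and "calc_pdf_Export" and "impress_pdf_Export" are the Calc and Impress counterparts of the "writer_pdf_Export" filter mentioned above. It uses only the standard library, so it runs without OpenOffice installed.

```java
import java.util.Locale;

/** Hypothetical helper (not part of the OpenOffice SDK): maps a file name to a PDF export filter. */
final class PdfFilterChooser {

    private PdfFilterChooser() {
        // static helper, no instances
    }

    /**
     * Returns the OpenOffice PDF export filter name for the given file name.
     * The extension list is illustrative, not exhaustive.
     */
    static String filterFor(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            throw new IllegalArgumentException("No file extension: " + fileName);
        }
        // Compare extensions case-insensitively, so "budget.XLS" is treated like "budget.xls".
        String extension = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        switch (extension) {
            case "doc":
            case "docx":
            case "rtf":
            case "odt":
                return "writer_pdf_Export";   // text documents (Writer)
            case "xls":
            case "xlsx":
            case "ods":
                return "calc_pdf_Export";     // spreadsheets (Calc)
            case "ppt":
            case "pptx":
            case "odp":
                return "impress_pdf_Export";  // presentations (Impress)
            default:
                throw new IllegalArgumentException("Unsupported extension: " + extension);
        }
    }

    public static void main(String[] args) {
        for (String name : new String[] {"report.doc", "budget.XLS", "slides.ppt"}) {
            System.out.println(name + " -> " + filterFor(name));
        }
    }
}
```

In the SDK example, the returned string would take the place of the "MS Word 97" value, presumably as the "FilterName" property handed to the store call; check DocumentConverter.java for the exact spot.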