Saturday, 2 August 2008

Generating Word Documents In Java

UPDATE: I think I need to deprecate this post. Go with FOP.

Every now and again I see this question on the JDC, and as I recently needed to do it myself I thought it would be good to write down how.

Are you creating Word documents or editing existing Word documents? If the latter then here you go. Good luck.

If you need to create a new Word document then first it might be worth using RTF. Word 97 used RTF instead of Word's native format so don't worry; no-one will ever know our dirty little secret. The advantage of RTF is that it is a text file, not a scary binary file, and the specification is a free download. Java also has some primitive support for RTF (javax.swing.text.rtf.RTFEditorKit).
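To show what "it is a text file" means in practice, here is a minimal sketch that writes an RTF file Word should open, using nothing but the standard library. The class name MinimalRtf and the method helloDocument are made up for illustration, and the text passed in is assumed to be plain ASCII with no braces or backslashes.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of any library.
final class MinimalRtf {
    // Builds a tiny RTF document: header, one-entry font table, one paragraph.
    // Assumes text contains no '{', '}' or '\' characters.
    static String helloDocument(String text) {
        return "{\\rtf1\\ansi\\deff0{\\fonttbl{\\f0 Times New Roman;}}"
             + "\\f0\\fs24 " + text + "\\par}";
    }

    public static void main(String[] args) throws IOException {
        // Write to a temp file so the example has no side effects elsewhere.
        Path out = Files.createTempFile("hello", ".rtf");
        Files.write(out, helloDocument("Hello from Java").getBytes(StandardCharsets.US_ASCII));
        System.out.println("Wrote " + out);
    }
}
```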

I am going to guess that you don't simply wish to create a free-form Word document, but rather that you will be applying a model to a simple template.

For those who do not know what that means:

Let's give this a little context. We want to send out a bill, so we have some key details about our customer and the order:
customer_name=Bob MacBobby
items=(
  (name="blow up sheep", cost=12.5),
  (name="Fun with Goats, the motion picture", cost=9.99)
)
total_untaxed= 22.49
vat= 3.94
total= 26.43
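
In Java, a model like this can be built as a plain Map, which is one of the shapes Freemarker accepts as a data model. The following is a rough sketch only: the class name BillModel is invented, and the 17.5% VAT rate is my assumption, chosen because it reproduces the figures above.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model builder for the bill example.
final class BillModel {
    // Assumed VAT rate; the post only states the resulting amounts.
    static final BigDecimal VAT_RATE = new BigDecimal("0.175");

    // One order line with the keys the template expects: name and cost.
    static Map<String, Object> item(String name, String cost) {
        Map<String, Object> item = new LinkedHashMap<>();
        item.put("name", name);
        item.put("cost", new BigDecimal(cost));
        return item;
    }

    static Map<String, Object> build() {
        List<Map<String, Object>> items = new ArrayList<>();
        items.add(item("blow up sheep", "12.5"));
        items.add(item("Fun with Goats, the motion picture", "9.99"));

        // Sum the item costs, then derive VAT rounded to pence.
        BigDecimal untaxed = BigDecimal.ZERO;
        for (Map<String, Object> i : items) {
            untaxed = untaxed.add((BigDecimal) i.get("cost"));
        }
        BigDecimal vat = untaxed.multiply(VAT_RATE).setScale(2, RoundingMode.HALF_UP);

        Map<String, Object> model = new LinkedHashMap<>();
        model.put("customer_name", "Bob MacBobby");
        model.put("items", items);
        model.put("total_untaxed", untaxed);
        model.put("vat", vat);
        model.put("total", untaxed.add(vat));
        return model;
    }

    public static void main(String[] args) {
        System.out.println(build());
    }
}
```

BigDecimal is used rather than double so that the totals add up exactly.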


And we want:

Dear Bob MacBobby,

Item: blow up sheep
Cost: 12.5
Item: Fun with Goats, the motion picture
Cost: 9.99

Total (without tax): 22.49
VAT: 3.94
Total: 26.43


Doing this is very easy. First we need to go get a templating engine. I've used Freemarker before and it works nicely.

The only issue with this is that creating an RTF template can be a bit of a bitch. The way to do it is to create the document in, say, Word so that it looks how you want it to look, and save it as RTF. Where you need a variable, don't use the ${blah} that Freemarker requires; use an all-uppercase word such as INSERT_NAME_HERE instead. This is because RTF itself uses { }, so Word will escape any braces you type.


Now we need to edit it so it is a valid Freemarker template. Open the document in a plain text editor. Find each INSERT_XXX_HERE and replace it with the Freemarker syntax: ${xxx?rtf}. Take note of the ?rtf: it escapes any characters in the value that RTF would otherwise treat as markup.
Finally we need the loop. Simply add <#list …> to the line above the items you want to loop over, and </#list> to the line below. The template file is, alas, not a valid RTF document, but running Freemarker over it will create a nice valid RTF document.
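
If you would rather not do the find-and-replace by hand, it can be scripted. The sketch below is only an illustration: RtfTemplateHelper, toFreemarker and escapeRtf are names I made up, and it assumes every placeholder has the form INSERT_XXX_HERE and that the lowercased XXX is the model key you want. The escapeRtf method shows the kind of escaping ?rtf performs (a backslash in front of \, { and }); it is not Freemarker's own code.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for preparing an RTF file as a Freemarker template.
final class RtfTemplateHelper {
    // Matches markers such as INSERT_NAME_HERE or INSERT_CUSTOMER_NAME_HERE.
    private static final Pattern PLACEHOLDER = Pattern.compile("INSERT_([A-Z_]+?)_HERE");

    // Puts a backslash in front of the three characters RTF treats as syntax.
    static String escapeRtf(String text) {
        StringBuilder sb = new StringBuilder(text.length() + 8);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '\\' || c == '{' || c == '}') {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    // Rewrites each INSERT_XXX_HERE marker as ${xxx?rtf}.
    static String toFreemarker(String rtf) {
        Matcher m = PLACEHOLDER.matcher(rtf);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String var = m.group(1).toLowerCase(Locale.ROOT);
            // quoteReplacement stops '$' being read as a group reference.
            m.appendReplacement(out, Matcher.quoteReplacement("${" + var + "?rtf}"));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(toFreemarker("{\\b Dear INSERT_CUSTOMER_NAME_HERE}"));
        System.out.println(escapeRtf("curly {braces} and a \\ backslash"));
    }
}
```

The <#list …> lines still have to be added by hand, since only you know which rows repeat.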

{\rtf1\ansi\ansicpg1252\uc1\deff0\stshfdbch0\stshfloch0\stshfhich0\stshfbi0\deflang2057\deflangfe2057{\fonttbl{\f0\froman\fcharset0\fprq2{\*\panose 02020603050405020304}Times New Roman;}{\f36\froman\fcharset238\fprq2 Times New Roman CE;}
{\f37\froman\fcharset204\fprq2 Times New Roman Cyr;}{\f39\froman\fcharset161\fprq2 Times New Roman Greek;}{\f40\froman\fcharset162\fprq2 Times New Roman Tur;}{\f41\froman\fcharset177\fprq2 Times New Roman (Hebrew);}
{\f42\froman\fcharset178\fprq2 Times New Roman (Arabic);}{\f43\froman\fcharset186\fprq2 Times New Roman Baltic;}{\f44\froman\fcharset163\fprq2 Times New Roman (Vietnamese);}}{\colortbl;\red0\green0\blue0;\red0\green0\blue255;\red0\green255\blue255;
\red0\green255\blue0;\red255\green0\blue255;\red255\green0\blue0;\red255\green255\blue0;\red255\green255\blue255;\red0\green0\blue128;\red0\green128\blue128;\red0\green128\blue0;\red128\green0\blue128;\red128\green0\blue0;\red128\green128\blue0;
\red128\green128\blue128;\red192\green192\blue192;}{\stylesheet{\ql \li0\ri0\widctlpar\aspalpha\aspnum\faauto\adjustright\rin0\lin0\itap0 \fs24\lang2057\langfe2057\cgrid\langnp2057\langfenp2057 \snext0 Normal;}{\*\cs10 \additive \ssemihidden
Default Paragraph Font;}{\*\ts11\tsrowd\trftsWidthB3\trpaddl108\trpaddr108\trpaddfl3\trpaddft3\trpaddfb3\trpaddfr3\trcbpat1\trcfpat1\tscellwidthfts0\tsvertalt\tsbrdrt\tsbrdrl\tsbrdrb\tsbrdrr\tsbrdrdgl\tsbrdrdgr\tsbrdrh\tsbrdrv
\ql \li0\ri0\widctlpar\aspalpha\aspnum\faauto\adjustright\rin0\lin0\itap0 \fs20\lang1024\langfe1024\cgrid\langnp1024\langfenp1024 \snext11 \ssemihidden Normal Table;}}{\*\latentstyles\lsdstimax156\lsdlockeddef0}{\*\rsidtbl \rsid14418598\rsid15407471
\rsid16010694}{\*\generator Microsoft Word 11.0.5604;}{\info{\title Dear INSERT_NAME_HERE}{\author Michael Lee}{\operator Michael Lee}{\creatim\yr2008\mo8\dy2\hr21\min49}{\revtim\yr2008\mo8\dy2\hr21\min55}{\version1}{\edmins3}{\nofpages1}{\nofwords30}
{\nofchars177}{\*\company Your Company Name}{\nofcharsws206}{\vern24689}}\paperw11906\paperh16838 \widowctrl\ftnbj\aenddoc\noxlattoyen\expshrtn\noultrlspc\dntblnsbdb\nospaceforul\formshade\horzdoc\dgmargin\dghspace180\dgvspace180\dghorigin1701
\dgvorigin1984\dghshow1\dgvshow1\jexpand\viewkind4\viewscale100\pgbrdrhead\pgbrdrfoot\splytwnine\ftnlytwnine\htmautsp\nolnhtadjtbl\useltbaln\alntblind\lytcalctblwd\lyttblrtgr\lnbrkrule\nobrkwrptbl\snaptogridincell\allowfieldendsel\wrppunct
\asianbrkrule\rsidroot14418598\newtblstyruls\nogrowautofit \fet0\sectd \linex0\headery708\footery708\colsx708\endnhere\sectlinegrid360\sectdefaultcl\sftnbj {\*\pnseclvl1\pnucrm\pnstart1\pnindent720\pnhang {\pntxta .}}{\*\pnseclvl2
\pnucltr\pnstart1\pnindent720\pnhang {\pntxta .}}{\*\pnseclvl3\pndec\pnstart1\pnindent720\pnhang {\pntxta .}}{\*\pnseclvl4\pnlcltr\pnstart1\pnindent720\pnhang {\pntxta )}}{\*\pnseclvl5\pndec\pnstart1\pnindent720\pnhang {\pntxtb (}{\pntxta )}}{\*\pnseclvl6
\pnlcltr\pnstart1\pnindent720\pnhang {\pntxtb (}{\pntxta )}}{\*\pnseclvl7\pnlcrm\pnstart1\pnindent720\pnhang {\pntxtb (}{\pntxta )}}{\*\pnseclvl8\pnlcltr\pnstart1\pnindent720\pnhang {\pntxtb (}{\pntxta )}}{\*\pnseclvl9\pnlcrm\pnstart1\pnindent720\pnhang
{\pntxtb (}{\pntxta )}}\pard\plain \ql \li0\ri0\widctlpar\aspalpha\aspnum\faauto\adjustright\rin0\lin0\itap0 \fs24\lang2057\langfe2057\cgrid\langnp2057\langfenp2057 {\insrsid14418598 Dear }{\b\insrsid14418598\charrsid14418598 ${customer_name?rtf}}{
\insrsid16010694
\par }{\insrsid14418598
\par Thank you for your order. Now pay up.
\par
<#list items as item> 
\par }{\b\insrsid14418598\charrsid14418598 Item}{\insrsid14418598 :
${item.name?rtf}
\par Cost: ${item.cost}
</#list>
\par
\par Total (without tax): ${total_untaxed}
\par VAT: ${vat}
\par Total: ${total}
\par }}



This, alas, is not perfect. While static images are simple enough (just add them to the template file), I have no idea how to do dynamic images, which is why I'm now looking into Apache FOP.
