Friday, December 19, 2008

Using iText to convert TIFF to PDF and to combine multiple PDFs into one PDF

We recently had the requirement to take a Microsoft Excel file as well as one or more PDF "attachments" and 1) convert the Excel file to PDF, and 2) combine all the PDFs into a single PDF.

Unfortunately we were unable to find an open source Java solution for converting the Excel file to PDF.  If you only need the data in the Excel file, you can extract it using Apache POI and then generate a PDF from that data using iText, but this wasn't an appropriate solution for us.  As an interim alternative, we printed the Excel file to TIFF using the Microsoft Document Image Writer and then converted the TIFF to PDF using iText.  The results are not ideal, but workable.

Here are two Java methods that handle the TIFF conversion and the PDF merge.  Let me know if you find a free Java solution for converting Excel to PDF.

import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.net.MalformedURLException;
import java.util.ArrayList;
import java.util.List;

import com.lowagie.text.Document;
import com.lowagie.text.DocumentException;
import com.lowagie.text.Image;
import com.lowagie.text.PageSize;
import com.lowagie.text.pdf.PRAcroForm;
import com.lowagie.text.pdf.PdfCopy;
import com.lowagie.text.pdf.PdfImportedPage;
import com.lowagie.text.pdf.PdfReader;
import com.lowagie.text.pdf.PdfWriter;
import com.lowagie.text.pdf.SimpleBookmark;

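public class FileConversions {

    /**
     * Convert a TIFF file to a PDF.
     * @param tiffFile the TIFF image as a byte array
     * @return the generated PDF as a byte array
     * @throws DocumentException
     * @throws MalformedURLException
     * @throws IOException
     */
    public static byte[] convertTiffToPdf(byte[] tiffFile) 
            throws DocumentException, MalformedURLException, IOException {

        ByteArrayOutputStream outfile = new ByteArrayOutputStream();
        // Landscape A4 page to hold the printed spreadsheet image
        Document document = new Document(PageSize.A4.rotate());
        PdfWriter writer = PdfWriter.getInstance(document, outfile);
        document.open();
        Image tiff = Image.getInstance(tiffFile);
        // Scale the image down so it fits on the page
        tiff.scaleToFit(800, 600);
        document.add(tiff);
        // Closing the document flushes the PDF into the output stream
        document.close();
        return outfile.toByteArray();
    }
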
    /**
     * Combine multiple PDFs into a single PDF.
     * @param pdfs the PDFs to combine, in the order they should appear
     * @param combinedPdfFile the file the combined PDF is written to
     * @throws Exception
     */
    public static void combinePdfFiles(List<byte[]> pdfs, File combinedPdfFile) throws Exception {

        PdfReader reader = null;
        Document document = null;
        PdfCopy  writer = null;
        ArrayList master = new ArrayList();
        int pageOffset = 0;

        for (byte[] pdf : pdfs) {
            reader = new PdfReader(pdf);
            int n = reader.getNumberOfPages();
            // Collect bookmarks, shifted by the number of pages already copied
            List bookmarks = SimpleBookmark.getBookmark(reader);
            if (bookmarks != null) {
                if (pageOffset != 0) {
                    SimpleBookmark.shiftPageNumbers(bookmarks, pageOffset, null);
                }
                master.addAll(bookmarks);
            }
            pageOffset += n;

            if (document == null) {
                // step 1: creation of a document-object
                document = new Document(reader.getPageSizeWithRotation(1));
                // step 2: we create a writer that listens to the document
                writer = new PdfCopy(document, new FileOutputStream(combinedPdfFile));
                // step 3: we open the document
                document.open();
            }
            // step 4: we add content (page numbers are 1-based)
            PdfImportedPage page;
            for (int i = 0; i < n; ) {
                ++i;
                page = writer.getImportedPage(reader, i);
                writer.addPage(page);
            }
            PRAcroForm form = reader.getAcroForm();
            if (form != null) {
                writer.copyAcroForm(reader);
            }
        }
        if (!master.isEmpty()) {
            writer.setOutlines(master);
        }
        // step 5: we close the document
        if (document != null) {
            document.close();
        }
    }
}
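
The pageOffset bookkeeping in combinePdfFiles is easier to follow in isolation. Here is a minimal sketch of just the arithmetic, with no iText involved: PageOffsetSketch and its methods are hypothetical names made up for illustration, and the page counts are invented. It shows how many pages precede each input document in the combined file, which is the amount each document's bookmarks need to be shifted by.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of iText: it only illustrates the page arithmetic.
final class PageOffsetSketch {

    /** For each input document, the number of pages that precede it in the combined file. */
    static List<Integer> offsets(List<Integer> pageCounts) {
        List<Integer> result = new ArrayList<Integer>();
        int pageOffset = 0;
        for (int n : pageCounts) {
            result.add(pageOffset);
            pageOffset += n;
        }
        return result;
    }

    /** Maps a 1-based page inside one document to its 1-based page in the combined file. */
    static int combinedPage(List<Integer> pageCounts, int docIndex, int pageInDoc) {
        return offsets(pageCounts).get(docIndex) + pageInDoc;
    }

    public static void main(String[] args) {
        // Invented example: three PDFs with 3, 2 and 4 pages
        List<Integer> pageCounts = Arrays.asList(3, 2, 4);
        System.out.println(offsets(pageCounts));            // [0, 3, 5]
        System.out.println(combinedPage(pageCounts, 1, 2)); // 5
    }
}
```

So a bookmark pointing at page 2 of the second PDF ends up pointing at page 5 of the combined file, and the first PDF needs no shift at all, which is why the method skips the shift when pageOffset is 0.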

Thursday, December 18, 2008

Simple request benchmarking of a Ruby on Rails application using ApacheBench

You can use ApacheBench (ab), which comes with a default Apache install.  On Windows you can find the ab.exe executable in C:\Program Files\Apache Group\Apache2\bin.

Usage: ab [options] [http://]hostname[:port]/path
Options are:
  -n requests     Number of requests to perform
  -c concurrency  Number of multiple requests to make
  -t timelimit    Seconds to max. wait for responses
  -p postfile     File containing data to POST
  -T content-type Content-type header for POSTing
  -v verbosity    How much troubleshooting info to print
  -w              Print out results in HTML tables
  -i              Use HEAD instead of GET
  -x attributes   String to insert as table attributes
  -y attributes   String to insert as tr attributes
  -z attributes   String to insert as td or th attributes
  -C attribute    Add cookie, eg. 'Apache=1234. (repeatable)
  -H attribute    Add Arbitrary header line, eg. 'Accept-Encoding: gzip'
                  Inserted after all normal header lines. (repeatable)
  -A attribute    Add Basic WWW Authentication, the attributes
                  are a colon separated username and password.
  -P attribute    Add Basic Proxy Authentication, the attributes
                  are a colon separated username and password.
  -X proxy:port   Proxyserver and port number to use
  -V              Print version number and exit
  -k              Use HTTP KeepAlive feature
  -d              Do not show percentiles served table.
  -S              Do not show confidence estimators and warnings.
  -g filename     Output collected data to gnuplot format file.
  -e filename     Output CSV file with percentages served
  -h              Display usage information (this message)

Here are the simple benchmarking tests I ran against my Ruby on Rails website.  I wanted to compare the performance of RoR over CGI, with a new server instance created on each request, against requests proxied to a single long-running mongrel_rails server.  The tests make 10 individual requests, then 100 requests with 5 running concurrently.  Results are output in HTML.

ab -n 10 -c 1 -w > 10-requests.html
ab -n 100 -c 5 -w > 100-5-concurrent-requests.html

For your information I'm running my mongrel_rails using God on port 3000 and I am proxying requests using the standard RoR .htaccess file as follows:

RewriteEngine On
RewriteCond %{HTTP_HOST} ^$
RewriteRule ^(.*)${REQUEST_URI} [P,QSA,L]

I have to adopt such an arcane setup because my host HostGator only supports RoR over CGI, which is not very performant.

The test results basically tell me what I already knew: never, ever run a RoR app over CGI where you're starting the server on every request! Holy smokes! For 100 requests (5 concurrent) over CGI, the average total request time was 16437 ms (ouch!!!), serving 0.3 reqs/sec. Talking to a proxied mongrel server fared much better, with an average total request time of 565 ms, serving 8.61 reqs/sec.

Here is the output from the latter test.

This is ApacheBench, Version 2.0.41-dev <$Revision: $> apache-2.0
Copyright (c) 1996 Adam Twiss, Zeus Technology Ltd,
Copyright (c) 2006 The Apache Software Foundation,


Server Software:        Mongrel
Server Port:            80
Document Path:          /
Document Length:        4174 bytes
Concurrency Level:      5
Time taken for tests:   0.11609 seconds
Complete requests:      100
Failed requests:        0
Total transferred:      457200 bytes
HTML transferred:       417400 bytes
Requests per second:    8.61
Transfer rate:          39.38 kb/s received

Connection Times (ms)
              min   avg   max
Connect:       62   107  3062
Processing:   188   458   703
Total:        250   565  3765
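
As a rough sanity check on these figures: with a fixed number of requests in flight, throughput should be approximately the concurrency divided by the mean time per request. Here is a minimal sketch of that arithmetic, using the two average request times quoted above. ThroughputEstimate is a hypothetical name of my own, and the formula is only an approximation, so expect it to land near the reported numbers rather than on them.

```java
// Hypothetical helper: estimates throughput from concurrency and mean request time.
final class ThroughputEstimate {

    /** Approximate requests per second when `concurrency` requests are always in flight. */
    static double requestsPerSecond(int concurrency, double meanRequestMillis) {
        return concurrency / (meanRequestMillis / 1000.0);
    }

    public static void main(String[] args) {
        // Mean total request times taken from the two test runs described in this post
        System.out.println("CGI:     " + requestsPerSecond(5, 16437) + " reqs/sec");
        System.out.println("Mongrel: " + requestsPerSecond(5, 565) + " reqs/sec");
    }
}
```

This gives roughly 0.30 reqs/sec for CGI and roughly 8.85 reqs/sec for the proxied mongrel server, which is in line with the 0.3 and 8.61 that ab reported.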

Monday, December 15, 2008

Starting tomcat from Eclipse gives error "Can't load server.xml"

I am using Eclipse Ganymede and I am trying to start a Tomcat 5.5 and/or Tomcat 6.0 server.

Problem: When trying to start the Tomcat server from within Eclipse you get an error stating that Eclipse cannot load the server.xml file:

WARNING: Can't load server.xml from C:\projects\workspace-gannymede\.metadata\.plugins\org.eclipse.wst.server.core\tmp0\conf\server.xml
Dec 15, 2008 3:52:51 PM org.apache.catalina.startup.Catalina load
WARNING: Can't load server.xml from C:\projects\workspace-gannymede\.metadata\.plugins\org.eclipse.wst.server.core\tmp0\conf\server.xml
Dec 15, 2008 3:52:51 PM org.apache.catalina.startup.Catalina start
INFO: Server startup in 0 ms
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(
at sun.reflect.DelegatingMethodAccessorImpl.invoke(
at java.lang.reflect.Method.invoke(
at org.apache.catalina.startup.Bootstrap.start(
at org.apache.catalina.startup.Bootstrap.main(
Caused by: java.lang.NullPointerException
at org.apache.catalina.startup.Catalina.await(
at org.apache.catalina.startup.Catalina.start(
... 6 more

If I try to look at the server.xml file that Eclipse is trying to read, it does not exist on the filesystem.
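
If you want to confirm the same symptom on your machine, a quick check like the following will do. It is only a sketch: ServerXmlCheck is a hypothetical name of my own, and you pass it the directory Eclipse is using for the server (the tmp0 directory in the warning above, in my case).

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical diagnostic: checks whether a directory contains conf/server.xml.
final class ServerXmlCheck {

    /** True when baseDir/conf/server.xml exists and is a regular file. */
    static boolean hasServerXml(Path baseDir) {
        return Files.isRegularFile(baseDir.resolve("conf").resolve("server.xml"));
    }

    public static void main(String[] args) {
        // Defaults to the current directory when no argument is given
        Path baseDir = Paths.get(args.length > 0 ? args[0] : ".");
        System.out.println(baseDir.toAbsolutePath()
                + " has conf/server.xml: " + hasServerXml(baseDir));
    }
}
```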

Solution: I tried a number of things such as deleting all my server configurations as well as the contents of the .metadata\.plugins\org.eclipse.wst.server.core\ directory.  Nothing worked.  The solution is to create a new Server Runtime Environment (Window -> Preferences -> Server -> Runtime Environments).  Then in your Servers view (Window -> Show View -> Servers) add a new server configuration (right-click and select New -> Server) but DO NOT ADD ANY PROJECTS TO IT.  Double-click the new server configuration to bring up the Server Overview page.

Under Server Locations select Use Tomcat installation.  Save.  You should be able to start your server now (assuming you can start it fine from the command line).  Now add your project to the server and you're ready to go.

Friday, December 12, 2008

OC4J 10.1.3.* does not support JSF 1.2

I spent a very frustrating day wading through the version soup that is JavaServer Faces and trying to deploy a JSF 1.2 app on an OC4J 10.1.3 container.  To save you a lot of time, OC4J DOES NOT SUPPORT JSF 1.2.

This technical paper explains that OC4J 10.1.3 (all versions) supports Servlet 2.4, JSP 2.0 and JSF 1.1.

And this handy webpage explains that "JSF 1.2 is the latest release and it works with servlet 2.5 and jsp 2.1". So obviously it's not compatible.  It also details all the versions of Java technologies that each JSF version relies upon.
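
The mismatch comes down to a simple version comparison. Here is a minimal sketch of it, using only the version numbers quoted above. JsfCompatibility and its methods are hypothetical names of my own, not part of any JSF or OC4J API.

```java
// Hypothetical helper: compares the Servlet/JSP versions a container provides
// against the versions a JSF release needs.
final class JsfCompatibility {

    /** True when dotted version a (e.g. "2.4") is at least dotted version b (e.g. "2.5"). */
    static boolean atLeast(String a, String b) {
        String[] x = a.split("\\.");
        String[] y = b.split("\\.");
        for (int i = 0; i < Math.max(x.length, y.length); i++) {
            int xi = i < x.length ? Integer.parseInt(x[i]) : 0;
            int yi = i < y.length ? Integer.parseInt(y[i]) : 0;
            if (xi != yi) {
                return xi > yi;
            }
        }
        return true;
    }

    /** True when the container's Servlet and JSP versions both meet the requirements. */
    static boolean canRun(String containerServlet, String containerJsp,
                          String requiredServlet, String requiredJsp) {
        return atLeast(containerServlet, requiredServlet)
                && atLeast(containerJsp, requiredJsp);
    }

    public static void main(String[] args) {
        // OC4J 10.1.3 provides Servlet 2.4 / JSP 2.0; JSF 1.2 needs Servlet 2.5 / JSP 2.1
        System.out.println(canRun("2.4", "2.0", "2.5", "2.1")); // false
    }
}
```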

So don't waste your time like I did.