Web Reports with Adobe Acrobat Documents

Last Updated: 03/04/02

© Rick Strahl, West Wind Technologies, 1999 - 2002

Source files for this article:

http://www.west-wind.com/files/wwPDF.zip

Note: This document was originally created in early 1999 and still has references to older versions of the PDFWriter software. The document has been periodically updated for later versions, but there may still be references to older versions of Acrobat. The attached classes support Acrobat 5.0, AmyUni and ActivePDF as well as the older PDFWriter versions.

When you think about building output for Web applications or generating content for display in the browser, you most likely think of HTML as the only mechanism to generate that output. HTML has become synonymous with Web output, because any browser is capable of viewing the resulting document. While it’s a great way to get output to a wide audience, using HTML for generating output can be a tedious task, especially if you already have reports in your existing applications that generate that output. Migrating reports from the WYSIWYG (sort of) interface of the VFP (or another) report designer and shoehorning them into an HTML document usually results in unusable or, at the very least, unattractive output. HTML also cannot deal well with precisely positioned text, such as forms that must print on pre-printed templates or output that must match exact specs, like tax forms.

Fortunately, there are other options available. You can generate output from Visual FoxPro reports to Adobe Acrobat (PDF, Portable Document Format) documents. Acrobat is a tool that uses a PostScript-based document format to display data inside a special Acrobat viewer that is available as a freely downloadable add-in for browsers and other applications. The free reader package includes an ActiveX control for Internet Explorer (and your own applications), a plug-in for Netscape, and a standalone reader application that lets you view PDF documents by double-clicking on them in Explorer. PDF is a fairly widespread document format, which many companies use these days for distributing online documentation. What makes Acrobat nice is that it is a rich document format, which can include full type and image information embedded directly into the .PDF document. All this information is preserved when viewing or printing the document, assuming the appropriate type information is available on the client using the reader software; otherwise Acrobat uses smart font substitutions to keep the document as close to the original as possible.

Acrobat works on the Windows and Mac platforms, so you can create and view PDF files on either of these platforms easily. As the name implies, Acrobat is very portable. The PDF format also allows for additional features like editing documents (which requires the full retail Acrobat package), embedding of hyperlinks and forms and a host of other powerful features. See www.adobe.com for a full list of the features this package provides. In this article, however, I’ll focus only on creating straight print output from Visual FoxPro reports into PDF documents.

Different PDF Generation engines

Before I describe how the Acrobat PDFWriter works in version 5.0 of Acrobat, it's important to mention that there are various tools for generating PDF content. Adobe is the official holder of the PDF format, but they are not the only ones providing tools to create PDF files. In fact, generating PDF files with PDFWriter is bound by a very strict license agreement, which you should read carefully, especially if you plan to use it in Web applications.

Several other companies have also created PDF generation engines:

AmyUni
A small company specializing in custom printer drivers. Their PDF driver works very well and comes with a VFP FLL interface you can call directly from your applications. There are some configuration issues with this product if you want to use it in a high volume environment. This solution is very economical if it fits your needs and provides for Web Server printing.

http://www.amyuni.com/pdfpd.htm#pdfconvdesc

ActivePDF
This company specializes in PDF creation tools and their engine supports full multi-threading and many advanced configuration features that allow full programmatic control over the PDF generation process. This product is complete, complex (if you use all of the stuff it provides) and on the pricey side.

http://www.activepdf.com/en/Products/Server

There are even more products, some of them free, that also generate PDF files. I only have experience with Acrobat/Distiller and the two listed above, but if you're adventurous and want to find a cheaper/free solution it might be out there.

The PDFWriter Printer Driver makes it happen

Note: I wrote this article originally in early 1999 when there weren't a lot of choices, so it focused only on Acrobat. The class library provided with this article contains subclasses for the various PDF engines (wwPDF50, wwPDF40, wwPDF(3.x), wwDistiller, wwAmyUni, wwActivePDF). The article here refers to Acrobat 5.0 only, but the different classes are available in wwPDF.prg.

The cool thing about the .PDF format is that you can generate .PDF documents using the Adobe PDFWriter software, which captures output to the printer and converts it to a .PDF document. This makes it a snap for example to print a Word document, send the output to the Acrobat PDFWriter driver and generate a PDF file which anybody with the PDF viewer can view on the Web. You can do the same with a PageMaker document, a Rational Rose diagram, output from a photo package and of course a report from Visual FoxPro. Any print output can be captured and redirected to the PDFWriter driver.

The .PDF file format is also compressed which means that it’s actually very efficient in terms of size – in many cases .PDF reports can be smaller than comparable HTML documents even with images enclosed. It depends considerably on the size of the document (in general, the longer the document, the more compression) and graphical content. Graphics are compressed also, but if you have images on each page (even the same ones), your documents will not be small as they are embedded as binary data into the resulting PDF file rather than loading images dynamically one at a time as HTML does.

Acrobat is based on Adobe’s PostScript engine so the idea is simple: A printer redirection driver captures output from a PostScript printer driver directly and dumps the result to a .PDF file, which can then be viewed in the Acrobat reader. The PDFWriter software consists of a PostScript driver and capture software which converts the file to a PDF on the fly.

While the Acrobat Reader software is free and easily downloadable from the Web, the PDFWriter software is part of the full Adobe Acrobat package which you have to purchase (about US $200 retail). For Web applications this should not be a problem since you can use a single copy of the PDFWriter on the Web server.

Once you install the Adobe package you’ll have the printer driver available as an option from your printer dialog as shown in Figure 1.

Figure 1 - The Adobe PDFWriter printer driver is available as a printer output option. Here I’m running a REPORT FORM … PROMPT

When you send a document to the PDFWriter driver a dialog pops up and allows you to select a .PDF file to redirect the output to. This works well for interactive operation.

Viewing the document with a browser

PDF documents can then be viewed either using Adobe’s external Acrobat reader or directly in the browser using the Acrobat ActiveX control or plug-in (for Netscape). Internet Explorer and Netscape don’t include the Acrobat reader software by default, so users cannot view .PDF files until they download and install the Acrobat reader from the Web (a 4 MB download for Acrobat 4.0). Once the viewer is installed, however, .PDF documents can be viewed directly from the browser as shown in Figure 2.

Figure 2 - Read .PDF files in a browser - The Acrobat reader ActiveX control allows viewing of rich content directly in the browser. A document like this was probably created with page layout software like PageMaker and then printed to a .PDF file. Note that you can ‘flip’ through pages and search the document for text.

The viewer can display rich text and graphics exactly as you laid them out in your word processor or any other application. The viewer allows you to turn pages similar to the way the Report Designer preview works in Visual FoxPro. You can zoom in and out and even search for text inside of the document. Because PostScript output uses text and font information, the document is context sensitive and allows interaction. The Adobe Exchange software (part of the retail Acrobat package) even allows annotations and offers the ability to modify and save the document with the changes.

PDF with Visual FoxPro

In order to print to .PDF documents, you need to have the Adobe PDFWriter software from the full Adobe Acrobat software package installed. Sending VFP reports to .PDF is straightforward. Print a report interactively, select the Adobe PDFWriter driver from the printer prompt, specify a file and off you go.

From code you can do:

SET PRINTER TO NAME "Acrobat PDFWriter"
REPORT FORM MyReport TO PRINT

When running with the default Acrobat installation, PDFWriter pops up a dialog box asking for a filename to export your print output to. The dialog also lets you configure how output is to be presented. For example, you can set the compression options, the resolution (default is optimized for screen display) as well as orientation and font substitutions.
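
Before switching printers from code, it can be worth confirming that the driver is installed under the name you expect, because a missing or misspelled printer name typically makes SET PRINTER TO NAME fail. The following is a minimal sketch: IsPrinterInstalled is a hypothetical helper written for this illustration and is not part of wwPDF.prg, and the printer name is the one used in the snippet above.

```foxpro
*** Sketch only: IsPrinterInstalled is a hypothetical helper,
*** not part of wwPDF.prg
lcOldPrinter = SET("PRINTER",3)      && current VFP printer name

IF IsPrinterInstalled("Acrobat PDFWriter")
   SET PRINTER TO NAME "Acrobat PDFWriter"
   REPORT FORM MyReport NOCONSOLE TO PRINTER
   IF !EMPTY(lcOldPrinter)
      SET PRINTER TO NAME (lcOldPrinter)   && restore the previous printer
   ENDIF
ELSE
   WAIT WINDOW "Acrobat PDFWriter driver not found" NOWAIT
ENDIF
RETURN

*** Returns .T. if a printer with the given name is installed
FUNCTION IsPrinterInstalled
LPARAMETERS lcPrinterName
LOCAL lnCount, lnX
LOCAL ARRAY laPrinters[1]

*** APRINTERS() fills the array with one row per printer
*** (column 1 = printer name) and returns the row count
lnCount = APRINTERS(laPrinters)
FOR lnX = 1 TO lnCount
   IF UPPER(ALLTRIM(laPrinters[lnX,1])) == UPPER(ALLTRIM(lcPrinterName))
      RETURN .T.
   ENDIF
ENDFOR

RETURN .F.
ENDFUNC
```

Restoring the previous printer matters in a long-running server process, where other code may print to a real printer later on.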

Configuring PDFWriter for unattended operation

The above works fine in interactive applications, but for Web applications you cannot have any prompts or dialogs popping up or your Web application would hang.

Unfortunately it isn’t as simple as doing REPORT FORM … TO FILE <somefile.pdf>. The PDFWriter software appears to not have been designed smartly for unattended operation and the operation varies considerably depending on which version of the software you use and which operating system it runs under.

When I wrote this article originally I wrote it for Version 3.02 of the PDFWriter. A fairly complex mechanism of writing output options to an INI file was required to tell the software where to print the output to and to suppress the dialogs. It required finding the appropriate INI file and then rebooting to have the changes take effect, which is very, very messy. Then a few months later Acrobat 4.0 was released and while you still can’t simply use the TO FILE clause to create output, you can at least bypass the dialog under Windows NT. Under Windows 95/98, however, the old INI file mechanism is still required even with the 4.0 software. Now Acrobat 5.0 uses yet another scheme to handle the PDF output.

I highly recommend you don’t bother with the 3.0x PDFWriter software. While it works, the 4.0/5.0 driver is much faster and, under Windows NT at least, provides concurrent printing support. Since Web apps are likely to run under NT you'll get the benefits of the improved interface. The 5.0 reader is also much better: it is faster and more stable when running in the browser, which was a serious problem with Acrobat 3.x. For this reason, I’ll focus only on Acrobat 5.0 in this article, but the source provided with this article includes classes for all the different PDF creation tools I've mentioned.

In this article I want to show you how to run VFP reports dynamically on a Web server and return the result PDF content directly back to the browser for display. The idea is to do the following:

•   Set the printer to the PDFWriter
•   Print the report to a file we specify
•   Retrieve the file into a string
•   Send the file back to the Web server/browser as binary content

The wwPDF50 class

In order to automate the process of generating report output, I’ve created a bunch of classes that handle the process of generating PDF output automatically. The different classes use different PDF generation engines to generate their output including Acrobat, Distiller, AmyUni and ActivePDF. The different classes are contained in wwPDF.prg. I'll use the wwPDF50 class here for the examples:

SET PROCEDURE TO WWAPI ADDITIVE
SET PROCEDURE TO WWPDF ADDITIVE

oPDF=CREATE("wwPDF50")
lcCust = "A"

SELECT * from TT_CUST INTO CURSOR Tquery

llResult = oPDF.PrintReport("Report.frx",;
                            "c:\temp\MyPdf.pdf","FOR Customer = '"+lcCust+"'")

The parameters to PrintReport specify the name of the report, the output file and any optional report command clauses.

If you want to send output from a report directly back from a Web request use code like the following (this example uses West Wind Web Connection):

lcCompany = UPPER(Request.Form("Company"))

SELECT * ;
   FROM TT_Cust ;
   WHERE UPPER(Company) = lcCompany ;
   ORDER BY Company ;
   INTO CURSOR TQuery

oPDF=CREATE("wwPDF40")
lcPDF = oPDF.PrintReportToString("custlist")

Response.ContentTypeHeader("application/pdf")
Response.Write(lcPDF)

Figure 3 shows the result of this report in the browser.

Figure 3: Attractive reports with ease - Running a report in the browser only takes a few lines of code using the wwPDF40 class.

The code above generates the report, saving it to a string that contains the binary content of the report in PDF format. The content is then stuffed directly into the HTTP output stream, in this case using Web Connection. The same approach works with FoxISAPI, but you must be using an updated version of FoxISAPI.dll that supports output of binary data. The updated version is available on this month’s Professional Resource CD, or from the West Wind Web site’s article on FoxISAPI. See the sidebar on some limitations on using this technique with Active Server Pages.

Note that in order to display a PDF document in the browser you need to change the content type of response data. The content type lets the browser know what type of data it is receiving. Based on the application/pdf type the browser knows to launch the PDF viewer either as an ActiveX control in IE or a plug-in in Netscape.
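
If your Web tool expects you to assemble the complete HTTP response yourself, the content type travels in the header that precedes the binary PDF data. The sketch below shows one way this could look. BuildPdfResponse is a hypothetical function name used only for illustration; Web Connection's Response object handles this for you, and whether your tool wants the status line included is an assumption you should verify.

```foxpro
*** Sketch only: BuildPdfResponse is a hypothetical helper that
*** prefixes binary PDF content with a minimal HTTP header
FUNCTION BuildPdfResponse
LPARAMETERS lcPDF
LOCAL lcCRLF, lcHeader
lcCRLF = CHR(13) + CHR(10)

*** Sanity check: every PDF file starts with the signature %PDF
IF !(LEFT(lcPDF,4) == "%PDF")
   RETURN ""
ENDIF

*** The empty line (double CRLF) separates header and content
lcHeader = "HTTP/1.0 200 OK" + lcCRLF + ;
           "Content-Type: application/pdf" + lcCRLF + ;
           "Content-Length: " + LTRIM(STR(LEN(lcPDF))) + lcCRLF + ;
           lcCRLF

RETURN lcHeader + lcPDF
ENDFUNC
```

Checking the signature before sending is a cheap way to avoid handing the PDF viewer an empty or truncated result when the report failed to print.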

Running PDF reports with Active Server Pages

Returning PDF documents from ASP components is not quite as easy as it is with native VFP tools like Web Connection or FoxISAPI. The problem is that the VFP runtime does not allow the VFP report engine to run from within an in-process COM DLL. The report writer creates a status dialog while printing, which causes a COM exception when REPORT FORM is invoked in a DLL server. Although this seems like a superficial problem with the VFP runtime since the dialog is not required, the underlying problem is that the report engine is not thread safe to run under the Apartment Model Threading that is required for ASP components. If you want to use this functionality from within ASP you have to build an EXE server and enable the ASPAllowOutOfProcComponents IIS Metabase key (see IIS docs for details on how to change this).

Even once you have an EXE server, ASP makes returning binary output like PDF documents a little tricky. VFP cannot send binary data over a COM connection directly, because strings are converted from ANSI text to the Unicode strings COM uses, and that conversion cannot deal with binary data. To get around this you have to use the little known CreateBinary() function. The following code is an example of how you’d run a report to PDF and display it from an ASP page:

ASP Code (full .ASP document):

<% Set oServer = Server.CreateObject("Tserver.MyASPClass")
   oServer.AspPDFReport Request, Response
%>

VFP Method:

DEFINE CLASS MyASPClass as CUSTOM OLEPUBLIC

FUNCTION AspPDFReport
LPARAMETER Request, Response

lcCompany = UPPER(Request.Form("Company").Item())

SELECT * ;
   FROM TT_Cust ;
   WHERE UPPER(Company) = lcCompany ;
   ORDER BY Company INTO CURSOR TQuery

oPDF=CREATE("wwPDF50")
lcPDF = oPDF.PrintReportToString("custlist")

*** Clear all response output
Response.Clear

*** Change the content type if you need to display
*** this binary data in the browser
Response.ContentType = "application/pdf"

*** Write out binary data.
*** Both BinaryWrite (ASP) and CreateBinary (VFP) are required
Response.BinaryWrite( CreateBinary(lcPDF) )

ENDFUNC

ENDDEFINE

The key is that the server that runs this method must be an EXE. Once the output is generated the VFP CreateBinary() function must be used to create a byte array that can pass binary data back over a COM connection. On the ASP side binary data must be written out with the Response.BinaryWrite method, which unlike the Write method can deal with NULLs and other extended characters. If you use the ASP page to return the output from a VFP method, you still need to use CreateBinary() on whatever binary value the VFP method returns and Response.BinaryWrite() on the data in the ASP page.

A closer look at wwPDF

At first glance it would seem that creating PDF output should be as trivial as using the PDFWriter printer driver and dumping the output to a file. However, each of the different engines generates output files differently and hence there are different classes for each implementation. Acrobat's implementation is the clumsiest one: it involves generating temporary output files which aren't automatically cleaned up.

wwPDF basically manages how the output is sent to file and handles any cleanup depending on the mechanism used. For PDFWriter the temp files are cleaned up; for Distiller the PostScript print file is cleaned up. ActivePDF and AmyUni clean up after themselves so there's less code involved.

The key method of the class is PrintReport, which deals with setting up the printer driver and handling the output files generated by it. With PDFWriter two files are created. The first is a PostScript file with a .ps extension. The second is the PDF file, which has the same name with .pdf appended to the full filename. Yes, it gets appended after the .ps, so you get a filename with two extensions like outputfile.ps.pdf. Once the report is done, the code erases the PostScript file and then renames the .PDF file to the actual output filename specified.
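
The cleanup step described above can be sketched roughly as follows. This is not the actual wwPDF source: FinishPdfOutput and its parameter names are hypothetical, and the sketch only assumes the .ps/.ps.pdf naming described in the previous paragraph.

```foxpro
*** Sketch only: hypothetical helper, not the wwPDF source.
*** lcPSFile     - the PostScript file, e.g. c:\temp\outputfile.ps
*** lcOutputFile - the final name the caller asked for
FUNCTION FinishPdfOutput
LPARAMETERS lcPSFile, lcOutputFile
LOCAL lcTempPDF

*** The PDF is the PostScript file name with .pdf appended
lcTempPDF = lcPSFile + ".pdf"        && e.g. outputfile.ps.pdf

IF !FILE(lcTempPDF)
   RETURN .F.                        && no PDF was generated
ENDIF

*** Remove the intermediate PostScript file
IF FILE(lcPSFile)
   ERASE (lcPSFile)
ENDIF

*** RENAME fails if the target exists, so clear it first
IF FILE(lcOutputFile)
   ERASE (lcOutputFile)
ENDIF

RENAME (lcTempPDF) TO (lcOutputFile)

RETURN FILE(lcOutputFile)
ENDFUNC
```

A call such as FinishPdfOutput("c:\temp\outputfile.ps", "c:\temp\MyPdf.pdf") would then leave only the final PDF behind.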

The PrintReportToString method can output the PDF file directly into a string. This method simply calls PrintReport to create a PDF output file, reads that binary file into a string and returns it.
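
The same idea can be sketched outside of the class in a few lines. ReportToString is a hypothetical function written for illustration and is not the wwPDF implementation; it assumes oPDF is a wwPDF object created as in the earlier examples and that PrintReport returns a logical result as shown there.

```foxpro
*** Sketch only: hypothetical helper, not the wwPDF source.
*** Assumes oPDF is a wwPDF object as created in the examples above.
FUNCTION ReportToString
LPARAMETERS oPDF, lcReport
LOCAL lcTempFile, lcPDF
lcPDF = ""

*** Unique file name in the temp directory
lcTempFile = ADDBS(SYS(2023)) + SYS(2015) + ".pdf"

IF oPDF.PrintReport(lcReport, lcTempFile) AND FILE(lcTempFile)
   lcPDF = FILETOSTR(lcTempFile)     && reads the whole file, binary safe
   ERASE (lcTempFile)                && the string is all we need
ENDIF

RETURN lcPDF
ENDFUNC
```

An empty return value signals that no PDF was produced, which the caller can check before writing anything to the HTTP output.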

Browser PDF viewing problems

There are a number of quirks with the way the Acrobat control works both in Internet Explorer and Netscape when called from dynamic requests. Static .PDF files always work fine, but requests returning .PDF content from a dynamic (read: CGI/ISAPI/ASP) HREF link or Form POST operation are quirky if you are using the 3.x viewer. I'm aware of the following problems:

•   Internet Explorer displays an empty page
Occasionally Internet Explorer causes pages to display blank rather than displaying the .PDF document. Apparently there’s a bug in the Acrobat 3.x ActiveX control when dealing with multiple pages. This bug is inconsistent. In many cases reloading the page results in the output being displayed.

•   Netscape won’t display .PDF documents from HTTPS requests
Netscape works much more reliably with Acrobat 3.x, but it too has a problem: Running requests over HTTPS and displaying a dynamic document (that is, returned from a Web application rather than from a static .pdf file) fails by displaying a blank page.

•   Both browsers cause multiple hits on the back end server
This inconsistent bug causes the same request to fire more than once on the back end server. When you submit the Form or HREF link to a dynamic request you can see the request running multiple times in a row, performing exactly the same task. This seems to be tied to the Adobe plug-in/ActiveX control trying to reload the data multiple times due to a timing bug. This problem is more prevalent with Internet Explorer, but can occur in either browser. This problem occurs both with the 3.x and 4.x Acrobat viewers.

There is no workaround that works consistently in all browser/reader versions other than not displaying the result directly from the server. Instead, you can use an intermediate page as the following example demonstrates (using Web Connection):

lcCompany = UPPER(Request.Form("Company"))

SELECT * ;
   FROM TT_Cust ;
   WHERE UPPER(Company) = lcCompany ;
   ORDER BY Company ;
   INTO CURSOR TQuery
 
oPDF=CREATE("wwPDF50")
lcFile = SYS(2015) + ".pdf"

*** Print the report to a temporary Web Directory
oPDF.PrintReport("custlist.frx",Server.oConfig.owwDemo.cHTMLPagePath + "temp/"+lcFile)

*** This page briefly displays, then redirects to the PDF file
THIS.StandardPage("Report Complete",[<a href="/wconnect/temp/] + lcFile + [">] +;
                  [Click here to view the Customer Report</a>],,;
                  1,"/wconnect/temp/"+lcFile)

*** Delete files that have 'timed out' (300 secs)
DeleteFiles(Server.oConfig.owwDemo.cHTMLPagePath + "temp/*.pdf",300)

This approach uses an intermediary page to hold a link to the generated PDF file; the page then auto-redirects to the PDF. This works reliably in all tested situations, but you may have to change the redirect timeout to a slightly longer value than the 1 second specified above in the call to StandardPage. Clicking the link should be 100% safe.

For non-Web Connection users, you can use:

<META HTTP-EQUIV="Refresh" CONTENT="0; URL=/relpath/mypdf.pdf">

in the HTML header (inside of the <head></head> block) to force the page to automatically redirect to the specified PDF URL. Note, even Redirect() operations have been known to not work correctly, while the meta refresh or an explicit link on a result page are always reliable.
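
If you are not using Web Connection's StandardPage method, an intermediate page with both the meta refresh and an explicit link can be built as a plain string. The sketch below is only an illustration: BuildRedirectPage is a hypothetical function name, and the page title and link text are made up for the example.

```foxpro
*** Sketch only: hypothetical helper that builds an intermediate
*** page which redirects to lcUrl after lnSeconds
FUNCTION BuildRedirectPage
LPARAMETERS lcUrl, lnSeconds
LOCAL lcCRLF, lcHTML
lcCRLF = CHR(13) + CHR(10)

*** The meta tag goes into the <head> block; the link in the
*** body is the fallback if the automatic redirect does not fire
lcHTML = [<html><head>] + lcCRLF + ;
   [<meta http-equiv="Refresh" content="] + LTRIM(STR(lnSeconds)) + [; URL=] + lcUrl + [">] + lcCRLF + ;
   [<title>Report Complete</title>] + lcCRLF + ;
   [</head><body>] + lcCRLF + ;
   [<a href="] + lcUrl + [">Click here to view the report</a>] + lcCRLF + ;
   [</body></html>]

RETURN lcHTML
ENDFUNC
```

For example, lcHTML = BuildRedirectPage("/relpath/mypdf.pdf", 1) returns a page that can be sent back as regular text/html output.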

Note the call to DeleteFiles() to clean up the generated PDF files. The PDF file must be generated into a known Web directory, which should be a specially set up Web temp directory. Web Connection installs a TEMP directory off the /WCONNECT root by default and I tend to use such a temp directory for any temporary files. DeleteFiles deletes files selectively after a specified period to avoid piling up a large number of files. Here 5 minutes is used as the timeout before a file is deleted. Remember users can download the file and save it locally if needed, so 5 minutes is plenty.
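
DeleteFiles() is a Web Connection function. If you don't have it available, a timed cleanup along the same lines could look like the sketch below. DeleteOldFiles is a hypothetical stand-in written for illustration, not the Web Connection implementation, and it takes the directory and file mask as separate parameters to keep the path handling simple.

```foxpro
*** Sketch only: hypothetical stand-in for a timed file cleanup.
*** Deletes files matching lcMask in lcPath that were last
*** modified more than lnTimeoutSecs seconds ago.
FUNCTION DeleteOldFiles
LPARAMETERS lcPath, lcMask, lnTimeoutSecs
LOCAL lnCount, lnX, lnDeleted, ldDate, lcTime, ltModified
LOCAL ARRAY laFiles[1]

lcPath = ADDBS(lcPath)
lnDeleted = 0
lnCount = ADIR(laFiles, lcPath + lcMask)

FOR lnX = 1 TO lnCount
   *** ADIR() columns: 1 = file name, 3 = date, 4 = time as "hh:mm:ss"
   ldDate = laFiles[lnX,3]
   lcTime = laFiles[lnX,4]
   ltModified = DATETIME(YEAR(ldDate), MONTH(ldDate), DAY(ldDate), ;
                         VAL(LEFT(lcTime,2)), VAL(SUBSTR(lcTime,4,2)), ;
                         VAL(SUBSTR(lcTime,7,2)))

   *** Subtracting two DateTime values yields seconds
   IF DATETIME() - ltModified > lnTimeoutSecs
      ERASE (lcPath + laFiles[lnX,1])
      lnDeleted = lnDeleted + 1
   ENDIF
ENDFOR

RETURN lnDeleted
ENDFUNC
```

A call like DeleteOldFiles("c:\mywebroot\temp", "*.pdf", 300), with the directory being a made-up example, removes PDF files older than 5 minutes. Note that ERASE raises an error if a file is still in use, so a production version would want error handling around it.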

Another suggestion for the above example is to send the result to a separate window. The problem with the above is that if you click the BACK button in the browser you end up on the temporary page, so to get back to the query page that generated the PDF you have to click BACK twice. You can avoid this by using a TARGET on your HREF or FORM links to force the temp page and the PDF into a separate window which the user can close when done with the document.

At work

I use the .PDF generating features on my Web site for printing invoice receipts (see Figure 4) and PO forms for customers. When people order software online, they can click on a link to pre-print a PO form that has all the current order information on it. This can be mailed or faxed in by the Purchasing Department.

Figure 4 – Provide a printed invoice for the customer. Using PDF documents can provide features that would be difficult to implement with HTML.

Another place where .PDF reports come in very handy is with pre-printed forms or forms that must follow a very strict format. Reproducing reports such as these with HTML is impossible, and the .PDF format makes it possible to create remote output that looks the same as what you would expect from the server that created the form.

Using Acrobat in your Web applications can provide a rapid migration path for existing reports. Whether you need to generate static reports or run them dynamically over the Web the output generated in .PDF format is readily usable over the Web. Compared to recoding complex reports in HTML, this technique can save you tremendous time and possibly provide more usable documentation.

When using Acrobat on the Web server for dynamic requests, keep in mind that running a report tends to be fairly resource intensive and slow. It takes a few seconds for VFP to initialize its report engine and a few more to parse and run the report. Compared to HTML generation, this process is much slower.

Still, consider this technique for those occasions where quick development turnaround or fine graphic control over output format is required.

Resources

Code for this article:

http://www.west-wind.com/files/wwPDF.zip

AmyUni PDF driver:

http://www.amyuni.com/pdfpd.htm#pdfconvdesc

ActivePDF Server:
http://www.activepdf.com/en/Products/Server

For comments, questions etc. you can post a message at:

http://www.westwind.com/wwthreads/default.asp?forum=Code+Magazine.

Rick Strahl is president of West Wind Technologies on Maui, Hawaii. The company specializes in Web and distributed application development and tools with focus on IIS, .NET and Visual FoxPro. Rick is author of West Wind Web Connection, a powerful and widely used Web application framework for Visual FoxPro and West Wind HTML Help Builder. He's also a Microsoft Most Valuable Professional, and a frequent contributor to magazines and books. He is co-publisher and co-editor of CoDe magazine, and his book, "Internet Applications with Visual FoxPro 6.0", is published by Hentzenwerke Publishing. For more information please visit: http://www.west-wind.com/.
