PDF Power to the People

This article was first published in FoxTalk in June 2005. It is currently available on MSDN, but is somewhat poorly formatted there. I also wanted to add some source code for applying the principles discussed here in versions of FoxPro earlier than VFP 9.

The full source, including a PDFClass written suitably for use as far back as VFP 5 (!), is available in Spacefold downloads as PDFPowerSource.zip. While it is not discussed in the article, you will find more information here.
Update, September 2008:

If you are deploying applications to Vista, the approach described here still works. Please read this blog post for information about the adjustments you'll want to make for the "transparent install" aspect of the approach.
Update, June 2010:

Please read this blog post for information about improving the creation of your very own "install.dbf".

When your output requirements include delivering exact electronic versions of printed documents, the PDF (Portable Document Format) is today's gold standard. The Microsoft Document Imaging (MDI) format, based on TIFF (Tagged Image File Format), is a viable alternative only if the recipient acquires the Microsoft Office Document Imaging reader or another program that can read TIFF and MDI files. The Adobe® Reader® for PDF is free, in widespread use, and available for every platform.

VFP 9.0 ReportListeners can produce multi-page TIFF documents, and other page-image file formats, directly, but your users may still ask why your FoxPro applications can't produce PDF documents the way some of their other software tools do. In reality, FoxPro applications can produce PDFs — taking advantage of exactly the same resources that the other software tools use, as I'll show you in this article.

These resources are not delivered with Visual FoxPro or other Microsoft products. Their reliance on "Free Software Movement" and open source components indicates one potential explanation for the omission. The fact that PDF is a de facto standard developed by another major vendor suggests another possibility.

Your users don't care about vendor strategy, so our speculation about such matters is useless. From users' point of view, you are the vendor. So what are you waiting for?

What you want

Users just want PDFs. As a professional developer, you have additional requirements:

  • No fee associated with adding this feature to your applications.
  • Ability to work with any FoxPro reports.
  • No user awareness of external programs and no user intervention in the process, even during installation. I'll refer to these characteristics as application transparency.

Application transparency is a special kind of grace, and it's no joke. It might be defined as the set of characteristics that enable your component-based software solution to appear to users as if it were purpose-built by one mind (yours), just for them, from the ground up. When you build a FoxPro application, you add value to the components in countless ways, but sometimes application transparency can make the difference between users accepting your value-added proposition and disputing it.

I live in Las Vegas, so I'll place a bet that you can fulfill your requirements, and I'll raise the stakes, too. With little adaptation, the same techniques should work for any output that obeys the instructions in a SET PRINTER TO command, not just any FRX or LBX results. They should also work in any version of FoxPro, even FoxPro for DOS and UNIX character-based reports or other printer output.

See sidebar, Commercial alternatives, below.

Consider your options

You can generate PDF, as you can solve most programming problems, using either low-level or high-level techniques.

In this instance, by "low-level", I mean you can take on the task of understanding the PDF format and describing report output using this format, one object at a time. In VFP 9.0, you can take this approach if you want. Simply intercept the rendering event using a ReportListener-derived class and generate the appropriate PDF code for each object as the report runs.

If you are already an expert in the PDF format, you might come out with a reasonable facsimile of the original report this way, but you will have to solve various issues differently for each type of rendered report object. When you see a product or utility that purports to generate PDF but indicates limits with certain types of output, such as images or shapes, or that produces only a cut-down rendition of its source, this is the approach it uses.

By "high-level" I mean you can compose a PDF document from some readily-available parts: a PostScript file, and a PostScript interpreter. PostScript is the page description language underlying PDF format. A PostScript interpreter can tokenize the document, bundling all the necessary references into a single compressed PDF file, optionally embedding fonts as it does so.

Capable and sophisticated PostScript printer drivers easily generate PostScript document files. These drivers are delivered in every version of Windows. I use the free software components of GhostScript to serve as the required interpreter, and supply the conversion to PDF.

GhostScript is a stable, well-supported, and mature product. It scales well enough to be used in many production web sites that offer PDF conversion services and generate other on-the-fly PDF output. When you see a full-featured commercial utility offering capable and complete PDF production, GhostScript is very often the underlying technology.

I think you can see where I'm headed. No matter how much you enjoy getting your hands dirty in the internals of code, I recommend using high-level procedures: leverage PostScript drivers to get high-quality PDF output.

See sidebar, Why not use XSL:FO?, below.

If you've tried this technique previously and couldn't get it to work, chances are you couldn't resolve some issues of application transparency. This article shows you how, and provides the relevant code encapsulated in a ReportListener-derived class for easy reference. If, by chance, you had additional FoxPro-specific concerns, additional code in the PDFListener class should resolve them as well. If you've been unsure about how to include open source or Free Software components in an application, keep in mind that different component-providers set different arrangements, but consider this a fully-worked example.

How you get what you want

The following are the required steps for PDF generation in the "high-level" approach. These steps are all included in the ReportListener-derived class definition you'll find in PDFLISTENER.PRG, for convenience, although none of them require a ReportListener to work:

  1. Determine if GhostScript is available in the environment. If necessary, install it silently.
  2. Determine if your special printer setup is available. If necessary, install the driver in a hands-off manner, suitably branded to your application.
  3. Instruct VFP to use the driver.
  4. Generate your report, or reports, to a PostScript file or files, storing the resulting filenames.
  5. To maintain good form, restore VFP's printing environment.
  6. Create a command text file, listing the PS file or files to be converted.
  7. Invoke GhostScript with the command file, the desired output filename, and other suitable command-line arguments.

Steps 1 and 2 can be done at any time, and typically should be done as part of your application's normal setup procedures. PDFListener includes them as part of the startup procedures for a report run, in the following method:

   PROCEDURE LoadPrinterInfo()
      IF NOT THIS.VerifyGSLibrary()
         RETURN .F.
      ENDIF   
      IF NOT THIS.VerifyPrinterSetup()
         RETURN .F.
      ENDIF
      IF NOT THIS.AdjustVFPPrinterSetups()
         RETURN .F.
      ENDIF
   ENDPROC

Setting up straight

The first two calls in this method are also public methods, which you can call separately in any setup procedures, to ensure availability of GhostScript and the PostScript driver. Presumably your setup application has the necessary rights to create a directory and install a printer setup.

You can continue to call these methods at the beginning of every report processing run, as PDFListener does, on the off-chance that a user has uninstalled either of these external components. If the current user has the required rights, a missing component will re-install. If the user does not have rights and a component is missing, the PDFListener.RunReports() method returns .F. Your application should provide reasonable error feedback to the user in such a case, just as it would if any other required files for the application, such as an editable FRX or a table, had been deleted.

Some people like the idea of supplying code to impersonate a different user to resolve this issue without error feedback in the application. I think it's overkill, as well as a potential security risk to store administrative log-in credentials in an app just for this purpose.
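As an illustration of the setup-time option described above, here is a minimal sketch of a one-time check run from an installer or administrative tool. It assumes that PDFLISTENER.PRG is on the path, that it defines a class named PDFListener, and that the two public methods return logical values as described; the printer setup name and the message text are placeholders.

```foxpro
* One-time setup check (a sketch, not part of the delivered source).
* Assumption: PDFLISTENER.PRG defines a class named PDFListener.
LOCAL loPDF, llReady
loPDF = NEWOBJECT("PDFListener", "PDFLISTENER.PRG")

* Placeholder brand name for the printer setup
loPDF.PSDriverSetupName = "ABC Company SuperDriver"

* Both methods are described in the article as public and callable separately
llReady = loPDF.VerifyGSLibrary() AND loPDF.VerifyPrinterSetup()

IF NOT llReady
   * Placeholder feedback; a real application would log or localize this
   MESSAGEBOX("PDF output components could not be installed. " + ;
              "Please run setup with sufficient rights.", 48, "PDF setup")
ENDIF
loPDF = .NULL.
```

Running this under an account with installation rights means later report runs by ordinary users only need to find the components, not create them.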

Installing GhostScript on-the-fly

VerifyGSLibrary is the method that checks for GhostScript components, using an exposed member property of the class (GSLocation) to determine your proposed location. If you don't set the property explicitly, the class uses a method named GetDefaultGSLocation to derive a location. As written, the method attempts to find an appropriately-named child directory below the running application, or below the location of the class library if you don't build it into an application. The goal is to determine a potential location guaranteed to be available, whether it currently contains the GhostScript files or not.

Once it has determined the proposed location, and if the files are not available, VerifyGSLibrary creates the subdirectory if necessary. It places the files on disk, using the absurdly simple, and time-honored, FoxPro method of dumping files out of memo fields. In a later section I'll explain how the correct files get into the INSTALL.DBF table.
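The memo-field technique mentioned above can be sketched as follows. This is not the code in PDFListener, and the table layout is hypothetical: I assume fields named SubDir, FileName, and Contents (a binary memo), with SubDir holding at most one folder level such as "lib" or "fonts". The real INSTALL.DBF may be organized differently.

```foxpro
* Usage (hypothetical names):
* ? ExtractInstallFiles("install_demo.dbf", "gs")

* Hypothetical layout, an assumption for this sketch only:
*   SubDir   C(60)         one-level relative folder, empty for the root
*   FileName C(60)         file name with extension
*   Contents M NOCPTRANS   raw file bytes
FUNCTION ExtractInstallFiles(tcTable, tcTargetRoot)
   LOCAL lcRoot, lcDir, lcFile, lnFiles, lnOldArea
   lnOldArea = SELECT()
   lnFiles = 0
   lcRoot = ADDBS(FULLPATH(tcTargetRoot))
   IF NOT DIRECTORY(lcRoot)
      MD (lcRoot)
   ENDIF
   USE (tcTable) IN 0 ALIAS InstallFiles SHARED NOUPDATE
   SELECT InstallFiles
   SCAN
      lcDir = lcRoot
      IF NOT EMPTY(InstallFiles.SubDir)
         lcDir = ADDBS(lcRoot + ALLTRIM(InstallFiles.SubDir))
         IF NOT DIRECTORY(lcDir)
            MD (lcDir)
         ENDIF
      ENDIF
      lcFile = lcDir + ALLTRIM(InstallFiles.FileName)
      * Write only the files that are missing
      IF NOT FILE(lcFile)
         STRTOFILE(InstallFiles.Contents, lcFile)
         lnFiles = lnFiles + 1
      ENDIF
   ENDSCAN
   USE IN InstallFiles
   SELECT (lnOldArea)
   RETURN lnFiles   && number of files written
ENDFUNC
```

The NOCPTRANS attribute on the memo field matters in this design, because it keeps FoxPro from applying code page translation to binary content such as executables and fonts.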

You don't need to use any registry calls, and you don't need to run any external programs to install GhostScript. If your application knows the location of the files, you're all set. You may have thought otherwise, if you've used a typical GhostScript distribution for the Windows platform, but GhostScript is not "born" a Windows program. Neither its installation nor its use is dependent on fancy Windows tricks.

Proper use of GhostScript is, however, dependent on fulfilling your responsibility to its terms of license. Although they are not onerous, you should take the time to acquaint yourself with these terms and your various options.

See sidebar, Using Free Software responsibly.

Designating your driver

PDFListener.VerifyPrinterSetup() is the method that checks for your required printer setup and installs it on demand. The class uses a public member property, PSDriverSetupName, for the name you want the user to see as the installed printer. When you assign this value, it checks your value against the printer driver it expects to use, using a new argument to the APRINTERS() function to get the driver name. You can use a name such as "ABC Company SuperDriver", as many of the PDF utilities do, branding the setup to your installation, and with no obvious association to PostScript:

PROCEDURE PSDriverSetupName_Assign(tcVal)
* Accept the new name unless an installed printer setup
* already uses that name with a different driver.
IF VARTYPE(tcVal) = "C" AND NOT EMPTY(tcVal)
   LOCAL laSetups[1], liIndex, ;
         llFoundAndNotAppropriate
   * APRINTERS(...,1): column 1 = setup name, column 3 = driver
   FOR liIndex = 1 TO APRINTERS(laSetups,1)
     * Compare the proposed value (tcVal), not the current property
     IF ( UPPER(laSetups[liIndex,1]) == ;
          UPPER(ALLTRIM(tcVal)) ) ;
         AND ;
        ( NOT (UPPER(laSetups[liIndex,3]) == ;
           UPPER(ALLTRIM(DRIVER_TO_USE)) ) )
        llFoundAndNotAppropriate = .T.
        EXIT
     ENDIF
   ENDFOR
   IF NOT llFoundAndNotAppropriate
     THIS.PSDriverSetupName = ALLTRIM(tcVal)
   ENDIF
ENDIF
ENDPROC

As you can see, this assign method will allow you to assign any driver name you like, even if the printer setup does not exist, but it will not allow you to assign a name already in use for a printer setup that does exist and uses a different printer driver than you need. This condition handles the unlikely case that somebody has already used your exact brand for a non-PostScript driver.
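The same check can be tried on its own, outside the class. The sketch below is a hypothetical helper (the function name and return codes are mine) that relies only on the extended APRINTERS() array available in VFP 9, where column 1 holds the printer setup name and column 3 the driver name.

```foxpro
* Usage: 0 = name not installed, 1 = installed with the expected driver,
*        2 = installed with a different driver
? CheckPrinterSetup("ABC Company SuperDriver", ;
                    "Apple Color LaserWriter 12/600")

FUNCTION CheckPrinterSetup(tcSetupName, tcDriverName)
   LOCAL laSetups[1], lnI, lnResult
   lnResult = 0
   * The second argument (1) asks for the extended array, VFP 9 only
   FOR lnI = 1 TO APRINTERS(laSetups, 1)
      IF UPPER(ALLTRIM(laSetups[lnI, 1])) == UPPER(ALLTRIM(tcSetupName))
         IF UPPER(ALLTRIM(laSetups[lnI, 3])) == UPPER(ALLTRIM(tcDriverName))
            lnResult = 1
         ELSE
            lnResult = 2
         ENDIF
         EXIT
      ENDIF
   ENDFOR
   RETURN lnResult
ENDFUNC
```

A result of 2 corresponds to the case the assign method refuses: the brand name is already taken by a setup using some other driver.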

Using its internal default or your assigned printer setup name, PDFListener.VerifyPrinterSetup() executes a single call to PRINTUI.DLL, a Windows component located in the System32 directory. In one line of code, this call performs several tasks:

  1. installs the required base files;
  2. associates them with your named printer;
  3. makes this setup available with your brand;
  4. marks the printer setup shared, in case you're using this application on a dedicated print server box.

Here's what the call looks like:

   THIS.DoStatus(INSTALLING_DRIVER_LOC)
   lcCmd =  ;
    [%windir%\system32\rundll32.exe ] + ;
    [printui.dll,PrintUIEntry /if /b ] + ;
    ["] + THIS.PSDriverSetupName + ["] + ;
    [ /f %windir%\inf\ntprint.inf /r ] + ;
    [ "lpt1:" /m "] + ;
    DRIVER_TO_USE + [" /Z]
   llReturn = THIS.oWinAPI.ProgExecute(lcCmd)
   THIS.ClearStatus()

It looks scary, but you can run the following command at the DOS command line to find out what all the options mean:

%windir%\system32\rundll32.exe printui.dll,PrintUIEntry /?

You'll notice that I'm assigning the port as LPT1. The initial port assignment is irrelevant, because you're always going to tell VFP to send the PostScript output to a file.

The ReportListener.DoStatus() call above allows you to provide a custom message alerting the user to your activity, as the Windows call may take a moment to copy the driver files (see Figure 1). If you're not interested in user feedback, use the ReportListener.QuietMode switch to suppress the message. The result is a branded printer setup for your application (see Figure 2).

Figure 1. Standard Windows system installation dialogs...
Figure 2. ... and your own branded ReportListener status messages, are the maximum interface a user sees if you use PDFListener to install a printer setup on-the-fly. Afterwards, they only see your application's specialized printer setup.

What printer driver should you choose? PDFListener #defines the DRIVER_TO_USE as "Apple Color LaserWriter 12/600", but almost any PostScript printer will do, if it is recognized by the version of Windows on which you are installing your application. The system file NTPRINT.INF, used in the command string above, provides a wide variety of choices. Refer to http://www.cs.wisc.edu/~ghost/doc/printer.htm for a list of potential printer issues, but in most cases any PostScript printer that supports your desired resolution and page sizes, and color if you need it, is fine.

Using PRINTUI.DLL to install a printer driver programmatically may not work in all versions of Windows; the version above works in XP and will work, with some alteration, in Windows 2000. (Refer to Microsoft KB article 189105 for details.) It is also not the only way to install a printer driver programmatically, if you can use Windows shell scripting. Check your Windows directory for files with the name skeleton PRN*.VBS and you will find a host of scripting utilities for controlling printers.

Remember I said that this method would work across versions of FoxPro? Before FoxPro programmers used Windows printer drivers, we used _GENPD to provide printer instructions for FoxPro character-based output. The default _GENPD application came supplied with a perfectly respectable PostScript mechanism, including a base set of PostScript font descriptions. You can SET PDSETUP TO <your PostScript setup>, SET PRINTER TO (sFilename), et voilà, with no special installation, your DOS applications are rolling in PostScript files every bit as good as the ones you get from Windows.

See sidebar, Some GhostScript features we'll ignore, below.

Configuring at runtime

Assuming you're using Visual FoxPro rather than DOS or UNIX, you're now up to step 3, telling FoxPro to use your designated driver. You've already given PDFListener your printer setup's name, so it can easily issue a standard FoxPro command to tell VFP about it:

SET PRINTER TO NAME (THIS.PSDriverSetupName)

Because PDFListener works in VFP 9.0, it adds some special magic to save and restore the printer setup more completely than previous versions allowed. Before SET PRINTER TO NAME, it uses SYS(1037,2) to save the current VFP printer environment to a cursor in the special FRX data session, in the PROTECTED method PrepareFRXPrintInfo. After the print processing run, it uses another PROTECTED method, UnloadPrinterInfo, together with SYS(1037,3), to bring the user's full printer environment back. Using this technique preserves any non-default options, such as page size and orientation, that the user may have in his or her VFP environment.

PDFListener handles these tasks in the third method call you saw in the LoadPrinterInfo method: AdjustVFPPrinterSetups. I gave it that name to emphasize the fact that it does not touch the Windows environment, changing only your VFP printing behavior. Unlike the earlier setup method calls in LoadPrinterInfo, this one is PROTECTED, because you only want to call it as part of a defined output sequence. Its "push" mechanisms should always be matched with UnloadPrinterInfo() to "pop" the VFP printer environment back at the end of report processing. If you decide to add removal and restoration code for FRX and LBX printer environments, as I suggest in the next section, you can add it to these methods.
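For comparison, steps 3 through 5 can be sketched without the SYS(1037) push and pop. The version below is a simplified stand-in, not PDFListener's code: it remembers only the name of the current VFP printer through SET("PRINTER",3), so non-default options such as orientation are not preserved. It assumes VFP 8 or later for TRY...ENDTRY; the function name and parameters are hypothetical.

```foxpro
* Usage (hypothetical names):
* ? ReportToPostScript("invoice.frx", "c:\temp\invoice.ps", ;
*                      "ABC Company SuperDriver")

FUNCTION ReportToPostScript(tcFRX, tcPSFile, tcSetupName)
   LOCAL lcOldPrinter, llOK
   * Name of the printer VFP is currently using
   lcOldPrinter = SET("PRINTER", 3)
   llOK = .T.
   TRY
      * Step 3: point VFP (not Windows) at the PostScript setup
      SET PRINTER TO NAME (tcSetupName)
      * Step 4: the driver's output lands in the file
      REPORT FORM (tcFRX) TO FILE (tcPSFile) NOCONSOLE
   CATCH
      llOK = .F.
   FINALLY
      * Step 5: restore VFP's printer, whatever happened
      IF EMPTY(lcOldPrinter)
         SET PRINTER TO DEFAULT
      ELSE
         SET PRINTER TO NAME (lcOldPrinter)
      ENDIF
   ENDTRY
   RETURN llOK AND FILE(tcPSFile)
ENDFUNC
```

In versions before VFP 8 the same sequence works with an ON ERROR handler in place of TRY, which is why the approach is not tied to ReportListener.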

Ensuring use of your printing instructions in reports and labels

If you're using reports and labels for output, use this technique only with FRX and LBX files that do not have Printer Environments saved. Otherwise, the SET PRINTER TO NAME instructions will not affect the report system. (In VFP 9.0, new reports and labels do not store printer environment information, by default.) As delivered in your source code, PDFListener does not take care of this detail for you. Be sure to use the supplied TEST.PRG only with reports and labels that don't have their own printer setups. You can alter PDFListener to remove this data for later restoration, using a private cursor and a technique similar to the way it handles the outer VFP printing environment.

If you store a printer environment in a report and later remove it through VFP's native menu option, be aware that this option retains a COLOR switch in the header record Expr field. For PDF purposes, you should evaluate whether you want color or grayscale. If you want color, ensure that the Expr field contains COLOR=2 (indicating "on"). If you're working in VFP 9.0, you can also leave the Expr field untouched, and add a user override setting of COLOR=2 in the Picture field of the header record. To learn more, read "Understanding and Extending Report Structure" in the help file, and become friends with the FILESPEC\60FRX.DBF specification table.
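Because an FRX is an ordinary table, these header-record details are easy to inspect. The following sketch is a hypothetical helper that opens a report read-only, finds the header record (ObjType 1, ObjCode 53), and reports whether printer environment data appears to be stored and whether Expr contains COLOR=2. Treating non-empty Tag or Tag2 fields as the sign of a saved environment is my assumption for this sketch.

```foxpro
* Usage (hypothetical report name):
* ? FRXHasPrinterEnv("invoice.frx")

FUNCTION FRXHasPrinterEnv(tcFRX)
   LOCAL lnOldArea, llHasEnv
   lnOldArea = SELECT()
   llHasEnv = .F.
   * Open the report file as a table, without locking or changing it
   USE (FORCEEXT(tcFRX, "frx")) IN 0 AGAIN ALIAS FrxCheck SHARED NOUPDATE
   SELECT FrxCheck
   * The report header record
   LOCATE FOR FrxCheck.ObjType = 1 AND FrxCheck.ObjCode = 53
   IF FOUND()
      * Assumption: a saved environment leaves data in Tag and/or Tag2
      llHasEnv = NOT EMPTY(FrxCheck.Tag) OR NOT EMPTY(FrxCheck.Tag2)
      ? "Printer environment saved:", llHasEnv
      ? "COLOR=2 present in Expr:", "COLOR=2" $ UPPER(FrxCheck.Expr)
   ENDIF
   USE IN FrxCheck
   SELECT (lnOldArea)
   RETURN llHasEnv
ENDFUNC
```

A check like this could run in a loop over candidate FRX files before they are added to a report collection.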

Getting results

You're now at step 4, generating the files and invoking GhostScript to convert them. This is really the easy part. PDFListener derives from _ReportListener in the FFC, so it maintains a collection of reports. It augments the _ReportListener.AddReport method to create temporary PostScript output file names for each report in your collection, and adds some intelligent parsing to your proposed REPORT FORM command clauses for each report, to make sure the REPORT FORM command contains the information necessary to generate file output. It also augments the _ReportListener.RunReports method with two post-processing actions: restoring the VFP printer environment using UnloadPrinterInfo() as described above (step 5), and running its ProcessPDF method.

ProcessPDF first creates a GhostScript command text file, listing all the temporary PostScript files in your report run (step 6):

#DEFINE GS_COMMAND_STRING_1  " -q -dNOPAUSE " + ;
              " -I./lib;./fonts " + ;
              "-sFONTPATH=./fonts " + ;
              "-sFONTMAP=./lib/FONTMAP.GS " + ;
              "-sDEVICE=pdfwrite -sOUTPUTFILE="
#DEFINE GS_COMMAND_STRING_2 " -dBATCH  "

PROTECTED PROCEDURE MakeGSCommandFile()
LOCAL lcFile, lcContents, lcFileStr, liH
* Unique command file name in the GhostScript directory
lcFile = FORCEPATH("C"+SYS(2015)+".TXT", ;
   THIS.GSLocation)
lcFileStr = THIS.GetQuotedFileString()
* GetQuotedFileString uses PDFListener's
* collection of temporary file names
* associated with each report in the run
lcContents = GS_COMMAND_STRING_1 + ;
             ["]+THIS.TargetFileName+["]+ ;
             GS_COMMAND_STRING_2 + ;
             lcFileStr
* use forward slashes or doublebackslashes:
lcContents = STRTRAN(lcContents,"\","/")
IF FILE(lcFile)
   ERASE (lcFile)
ENDIF
* Write the options as a single line and close the file
liH = FCREATE(lcFile)
FPUTS(liH,lcContents)
FFLUSH(liH,.T.)
FCLOSE(liH)
RETURN lcFile
ENDPROC

This is another scary piece of code, because the line invoking GhostScript can have many options and is particularly exacting with respect to file paths. Most of the options I specify point the GhostScript executable to its libraries. If you change how you distribute the GhostScript libraries from the strategy I use in INSTALL.DBF, you'll change this section of the command.

When you read the code executing this command in the full PDFListener source, you'll notice that I temporarily change the current directory to the directory in which you've placed the GhostScript executable. My goal is to minimize any capitalization or directory-finding issues, which are standard hazards of working with code-not-born-in-Windows. FoxPro's somewhat cavalier treatment of filenames, with regard to exact case, tends to increase the potential hazards. I restore the original default directory immediately after running GhostScript.

Upon investigating the ProcessPDF method, you'll find that, after MakeGSCommandFile() returns the full command, it uses a different method to execute the command line (THIS.oWinAPI.ProgExecuteX) than PDFListener used earlier to install the printer driver. I find that shelling out to Windows is best handled with different error-handling strategies and timeout strategies for different types of programs. You probably have your own approach. You might also prefer to avoid the whole question, by using DECLARE DLL calls to invoke functionality supported by GhostScript's supporting DLL libraries. GhostScript has a well-documented API, so you're not really required to execute the GSWIN32C.EXE file. However, I prefer the command-line interface because its command-line switches are equally well-developed, and its use emphasizes the fact that you can perform these tasks with any application that can RUN a batch file or invoke a shell script.

No matter how you decide to do it, invoking GhostScript brings you to step 7. Upon completion, you now have a PDF incorporating all the reports you told PDFListener to run.
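As one possible way to perform step 7, the sketch below runs the GhostScript console executable against a command file by means of GhostScript's @file argument syntax. It is not the ProgExecuteX method in the delivered source: it swaps in the Windows Script Host WScript.Shell object, which must be available on the machine, and the function name is hypothetical. It assumes the command file sits in the GhostScript directory and has no spaces in its name, and it requires VFP 8 or later for TRY...ENDTRY.

```foxpro
* Usage (hypothetical paths):
* ? RunGhostScript("c:\myapp\gs", "c:\myapp\gs\C_0ABC123.TXT")

FUNCTION RunGhostScript(tcGSLocation, tcCommandFile)
   LOCAL lcOldDir, lcCmd, loShell, lnExitCode
   lcOldDir = FULLPATH(CURDIR())
   lnExitCode = -1
   TRY
      * Work from the GhostScript directory so that relative
      * references such as ./lib and ./fonts resolve
      CD (tcGSLocation)
      lcCmd = ["] + ADDBS(FULLPATH(tcGSLocation)) + ;
              [gswin32c.exe" @] + JUSTFNAME(tcCommandFile)
      loShell = CREATEOBJECT("WScript.Shell")
      * 0 = hidden window, .T. = wait for the process to finish
      lnExitCode = loShell.Run(lcCmd, 0, .T.)
   CATCH
      lnExitCode = -1
   FINALLY
      CD (lcOldDir)
   ENDTRY
   * GhostScript exits with 0 on success
   RETURN lnExitCode = 0
ENDFUNC
```

Waiting for the process to finish keeps the sketch simple; it offers none of the timeout handling that a production routine would want.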

Evaluating the use of VFP 9.0 features

Using _ReportListener's report collection, or a similar mechanism, gives you several advantages over NOPAGEEJECT. The most important advantage is the ability to concatenate multiple reports with different page orientations and sizes. By contrast, NOPAGEEJECT opens a single connection to a printer for multiple reports, with a single page layout definition determined by the first one in the set.

Using GhostScript to append the files also permits you to mix PostScript source files from different output processing techniques, which may not all be REPORT FORM command results. They can even be PostScript files created by different applications or different instances of FoxPro. Once you have the code provided in PDFListener to create the command file, you can adapt it to these needs.

Appending output from different applications, or different instances, can be especially useful on a web server. As you probably know, VFP 9.0 has been adjusted to make it easier to run report forms from a DLL (multi-threaded DLLs are still not allowed). PDFListener adds some awareness of OLEError-handling and _VFP.StartMode to help you here.

If you run this code in the context of a web application, you will probably want to run the setup procedures manually and make sure the appropriate user has rights to its features, including output directories and printer setup. In a server context, however, you're not going to be as worried about application transparency!

You'll notice that PDFListener exposes a member property, generateNewStyleOutput, which indicates whether you want a ReportListener object reference associated with your report run(s). Although _ReportListener runs the REPORT FORM commands, the commands themselves may be either old-style or new-style VFP output. PDFListener sets this property to .F. by default; the resulting PostScript files will be much smaller and the quality will be just as high. When you use the old report engine, the files contain text surrounded by PostScript instructions for your reporting data; in the new one, each full page is a graphical image.

Set generateNewStyleOutput to .T. if you're using a ReportListener to provide dynamic effects, such as charts you render directly to the report using GDI+. _ReportListener's AddReport method gives you a chance to nominate ReportListener object references separately for each report in the collection, if you like.

See sidebar, Extra use of VFP 9.0 features for extra credit, below.

Delivering and deploying the installation files

Your source includes a short routine, CREATEINSTALLDBF.PRG, which creates the installation table from whatever version of GhostScript I'm preparing to distribute. It asks for the source directories and creates records to store every file I'll distribute -- including the Free Software Foundation-required license text file.
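To show the general idea of loading files into memo fields, here is a minimal sketch. It is not CREATEINSTALLDBF.PRG: the table name, the field layout (SubDir, FileName, Contents), the helper name, and the source folders are all hypothetical, chosen only for illustration.

```foxpro
* Build a demonstration table; the name avoids clobbering a real INSTALL.DBF
CREATE TABLE install_demo FREE ;
   (SubDir C(60), FileName C(60), Contents M NOCPTRANS)

* Hypothetical source folders from an unpacked GhostScript distribution
AddFolderToInstallTable("install_demo", "c:\gs_source\bin", "")
AddFolderToInstallTable("install_demo", "c:\gs_source\lib", "lib")
AddFolderToInstallTable("install_demo", "c:\gs_source\fonts", "fonts")
USE IN install_demo

FUNCTION AddFolderToInstallTable(tcAlias, tcSourceDir, tcSubDir)
   LOCAL laFiles[1], lnCount, lnI, lcSource
   lcSource = ADDBS(tcSourceDir)
   * Files only; the final argument (1) preserves file name case
   lnCount = ADIR(laFiles, lcSource + "*.*", "", 1)
   SELECT (tcAlias)
   FOR lnI = 1 TO lnCount
      APPEND BLANK
      REPLACE SubDir   WITH tcSubDir, ;
              FileName WITH laFiles[lnI, 1], ;
              Contents WITH FILETOSTR(lcSource + laFiles[lnI, 1])
   ENDFOR
   RETURN lnCount   && number of files stored
ENDFUNC
```

Whatever layout you choose, remember to include the license text file among the stored files, as the article's routine does.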

Please note that the resulting fileset is not the full original GhostScript set, but rather the set that GhostScript needs at runtime. These include the executable and DLL libraries, GhostScript-specific configuration and command files, and font files.

As extruded from INSTALL.DBF by PDFListener, this fileset is not placed in the original directory structure used by the GhostScript distribution. I use a streamlined arrangement (see Figure 3) that matches the directions I plan to give GhostScript to find its files at runtime. All this may seem like extra trouble, but it guarantees that your program will use this version of GhostScript, and no other that happens to be registered on the machine, with all its proper components in place.

Figure 3. Simplified GhostScript runtime directory and fileset.

Even though it's not the full GhostScript distribution, INSTALL.DBF is a large file (about 8 megs). Most of this bulk is font definition files; you can choose to omit some of them, if you like. Checking the GhostScript site, you'll see that different distributions come with different sets of fonts. You can add PostScript fonts to the set if necessary, as well as removing some, but the base set you'll find here, supplied with default font mappings in GNU GhostScript, will provide appropriate matches for most PostScript output.

You may choose to re-purpose the GSLocation member property to indicate the location of INSTALL.DBF as well as the eventual location of the GhostScript files. In this scenario, PDFListener would always install the libraries in a specific relationship to the location of INSTALL.DBF.

You may prefer to use a completely different mechanism. For example, you could bind the contents of INSTALL.DBF, as individual files, into your SETUP.EXE, as support directories for your application. You could also create a separate VFPSTARTUP.EXE that will re-create these files on disk, as well as verifying your printer driver, on demand, storing the resulting names and locations to configuration files for your application to use for report runs. I like to create a CONFIG.XML file to handle this chore; it can be dragged and dropped as a single argument onto my EXE. Internally, my EXE knows that, if it was invoked with a command-line argument it can load as an XML file, it needs to read its configuration values from the file and do its setup work.

None of these decisions is critical to the main event, assuming you change the GhostScript command-line options to match the resulting locations of your files. That's why, as provided in your source code, PDFListener expects INSTALL.DBF to be available and makes no great effort to handle cases where it's not present.

It sounds like more work than it is

Your source code includes a TEST.PRG harness. This program contains some useful comments with advice on using PDFListener and its various non-essential features. It also allows you to test the process, end-to-end.

The test harness class asks you to provide the names of FRXs, adding them to PDFListener's report collection in a loop (don't forget to choose FRXs without printer environments saved!). It next executes PDFListener's RunReports method, during which PDFListener will check for its required external components, transparently install if necessary, run the reports to generate PostScript files, and finally post-process the PostScript report output into a PDF. TEST.PRG's test harness class then uses a call to ShellExecute to show you the resulting PDF document.
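The sequence the harness follows might look roughly like the sketch below. It is not the contents of TEST.PRG. The report names and output file are placeholders, and I am assuming that the class is named PDFListener, that TargetFileName can be assigned from outside the class, and that RunReports() returns a logical result as described earlier.

```foxpro
LOCAL loPDF, lcPDF
lcPDF = FULLPATH("demo_output.pdf")            && placeholder output name

loPDF = NEWOBJECT("PDFListener", "PDFLISTENER.PRG")
loPDF.TargetFileName = lcPDF                   && assumed to be assignable
loPDF.QuietMode = .T.                          && suppress status messages

* Placeholder reports; choose FRXs without saved printer environments
loPDF.AddReport("invoice_portrait.frx")
loPDF.AddReport("summary_landscape.frx")

IF loPDF.RunReports() AND FILE(lcPDF)
   * Open the result in the registered PDF viewer
   DECLARE INTEGER ShellExecute IN shell32.dll ;
      INTEGER hwnd, STRING lpOperation, STRING lpFile, ;
      STRING lpParameters, STRING lpDirectory, INTEGER nShowCmd
   ShellExecute(0, "open", lcPDF, "", "", 1)
ELSE
   MESSAGEBOX("The PDF could not be created.", 48, "PDF test")
ENDIF
loPDF = .NULL.
```

Mixing a portrait and a landscape report in the same run is a quick way to confirm the advantage over NOPAGEEJECT discussed earlier.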

Check your system's printer setups before and after the process, and you should see the setup that PDFListener adds. Check your print dialogs, and you should see your default Windows printer and your own VFP print settings remain undisturbed. Check the directory under your copy of TEST.PRG and PDFLISTENER.PRG, and you'll see the GhostScript files. Check your watch and, assuming you didn't pick 100 FRXs with 1000 pages each in the loop, you'll see that not much time has elapsed. Check the output, and you'll see it was what you asked for.

Now go and give your users what they're asking for.


Sidebars