Evaluating Document Formats

If we present our visitors with an ASCII file, however, we have very little control over the appearance of their certificate. We cannot control fonts or page breaks. We can include only text, with minimal formatting. We have no control over a recipient’s duplication or modification of the document. This is the method that makes it easiest for the recipient to fraudulently alter her certificate.

HTML

An obvious choice for delivering a document on the Web is HTML. Hypertext Markup Language is specifically designed for this purpose. As you are no doubt already aware, it includes formatting control and syntax to include objects such as images, and it is compatible (with some variation) with a variety of operating systems and software. It is fairly simple, so it will be both easy to design and quick for a script to generate and deliver.

Drawbacks to using HTML for this application include limited support for print-related formatting such as page breaks, little consistency in the output on different platforms and programs, and variable-quality printing. In addition, although HTML can include any type of external element, the capability of the browser to display or use these elements cannot be guaranteed for unusual types.

Word Processor Formats

Particularly for intranet projects, providing documents as word processor documents makes some sense. For an Internet project, using a proprietary word processor format will exclude some visitors; given its market dominance, though, Microsoft Word is the format that makes the most sense. Most users will have access either to Word or to a word processor that will try to read Word files. Windows users without Word can download the freeware Word Viewer from
http://www.microsoft.com/office/000/viewers.asp

Generating a document as a Microsoft Word document has some advantages.
As long as you have a copy of Word, designing a document is easy. We have very good control over the printed appearance of our documents and a lot of flexibility with their contents. You can also make a document relatively difficult for the recipient to modify by telling Word to ask for a password.

Unfortunately, Word files can be large, particularly if they contain images or other complex elements. There is also no easy way to generate them dynamically with PHP. The format is documented, but it is a binary format, and the format documentation comes with license conditions. It is possible to generate Word documents with a COM object, but it’s definitely not simple.

Another possibility you may now consider is OpenOffice Writer, which has the dual advantages of not being proprietary software and using an XML-based file format.

Chapter 30 Generating Personalized Documents in Portable Document Format (PDF)

Rich Text Format

Rich Text Format, or RTF, gives us most of the power of Word, but the files are easier to generate. We still have flexibility over layout and formatting of the printed page. We can still include elements such as vector or bitmap images. We can still be fairly sure that the user will see a result similar to ours when they view or print the document.

RTF is Microsoft Word’s text format. It is intended as an interchange format to transfer documents between different programs. In some ways, it is similar to HTML. It uses syntax and keywords rather than binary data to convey formatting information. It is therefore relatively human readable.

The format is well documented. The specification is freely available and can be found here:
http://msdn.microsoft.com/library/specs/rtfspec.htm

The easiest way to generate an RTF document is to choose a Save As RTF option in your word processor. As RTF files contain only text, it is possible to generate them directly, and existing ones can easily be modified.
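Because an RTF file is plain text, a template can be filled in with ordinary string replacement. The following is a minimal sketch in Python; the same idea carries over directly to PHP's string functions. The placeholder names <<NAME>> and <<SCORE>>, the tiny inline template, and the helper names rtf_escape and fill_template are all assumptions made for illustration. A real template would normally be saved from a word processor rather than typed by hand.

```python
# Hypothetical template: a real one would be exported via "Save As RTF".
TEMPLATE = (
    r"{\rtf1\ansi\deff0{\fonttbl{\f0 Times New Roman;}}"
    r"\f0\fs36 Certificate awarded to <<NAME>>\par "
    r"\fs24 Score: <<SCORE>>%\par}"
)


def rtf_escape(text: str) -> str:
    """Escape text so it is safe to place inside an RTF document."""
    out = []
    for ch in text:
        if ch in "\\{}":
            # Backslash and braces are RTF control characters.
            out.append("\\" + ch)
        elif ord(ch) < 128:
            out.append(ch)
        else:
            # Non-ASCII: emit \uN? per UTF-16 code unit (N is signed 16-bit).
            units = ch.encode("utf-16-le")
            for i in range(0, len(units), 2):
                code = int.from_bytes(units[i:i + 2], "little", signed=True)
                out.append(f"\\u{code}?")
    return "".join(out)


def fill_template(template: str, values: dict) -> str:
    """Replace each <<KEY>> placeholder with its escaped value."""
    result = template
    for key, value in values.items():
        result = result.replace(f"<<{key}>>", rtf_escape(str(value)))
    return result


if __name__ == "__main__":
    print(fill_template(TEMPLATE, {"NAME": "Ann Lee", "SCORE": 95}))
```

The escaping step matters because a backslash or brace in a visitor's name would otherwise be read as RTF control syntax and could corrupt the document.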
Because the format is documented and freely available, RTF is readable by more software than Word’s binary format. Be aware, though, that users opening a complex RTF file in older versions of Word or in different word processors will often see somewhat different results. Each new version of Word introduces new keywords to RTF, so older implementations will usually ignore controls they do not understand or have chosen not to implement.

From our original list, an RTF certificate would be easy to design using Word or another word processor; can contain a variety of different elements such as vector and bitmap images; gives a high-quality printout; can be generated easily and quickly; and can be delivered electronically at low cost. It will work with a variety of applications and operating systems, although with somewhat variable results.

On the downside, an RTF document can be easily and freely modified by anybody, which is a problem for a certificate and some other types of document. The file size might get moderately large for complex documents.

RTF is a good option for many document delivery applications, so we will use it as one option here.

PostScript

PostScript, from Adobe, is a page description language. It is a powerful and complex programming language intended to represent documents in a device-independent way; that is, as a description that will produce consistent results across different devices such as printers and screens. It is very well documented. At least three full-length books are available, as well as countless Web sites.

A PostScript document can contain very precise formatting, text, images, embedded fonts, and other elements. You can easily generate a PostScript document from an application by printing it to a PostScript printer driver. If you were interested, you could even learn to program in it directly.
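To give a feel for what programming in PostScript directly looks like, here is a minimal sketch, written in Python, that produces a one-page PostScript program as text. The function name postscript_page and its default position and font size are assumptions for illustration only; the operators it emits (findfont, scalefont, setfont, moveto, show, showpage) are standard PostScript. Coordinates are in points, with 72 points to the inch, measured from the bottom-left corner of the page.

```python
def postscript_page(text: str, x: int = 72, y: int = 720, size: int = 24) -> str:
    """Return a tiny PostScript program that prints one line of text."""
    # Backslashes and parentheses must be escaped inside a PostScript string.
    escaped = (
        text.replace("\\", "\\\\")
        .replace("(", "\\(")
        .replace(")", "\\)")
    )
    lines = [
        "%!PS",                        # marks the file as PostScript
        "/Times-Roman findfont",       # look up a built-in font
        f"{size} scalefont setfont",   # scale it and make it current
        f"{x} {y} moveto",             # set the starting point
        f"({escaped}) show",           # paint the string
        "showpage",                    # emit the finished page
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Write the program to a file that a PostScript viewer could open.
    with open("certificate.ps", "w", encoding="ascii") as handle:
        handle.write(postscript_page("Certificate of Completion"))
```

The resulting file is ordinary text, which is part of why PostScript files for complex pages can grow large.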
PostScript documents are quite portable. They will give consistent high-quality printouts from different devices and different operating systems.

There are a couple of significant downsides to using PostScript to distribute documents:

• The files can be huge.
• Many people will need to download additional software to use them.

Most Unix users will be able to deal with PostScript files, but Windows users will usually need to download a viewer such as GSview, which uses the Ghostscript PostScript interpreter. This software is available for a wide variety of platforms. Although it is available free, we do not really want to force people to download more software.

You can read more about Ghostscript at
http://www.ghostscript.com/
and download it from
http://www.cs.wisc.edu/~ghost/

For our current application, PostScript scores very well for consistent high-quality output, but falls short on most of our other needs.

Portable Document Format

Fortunately, there is a format with most of the power of PostScript, but with significant advantages. The Portable Document Format (also from Adobe) was designed as a way to distribute documents that would behave consistently on different platforms, and deliver predictable high-quality output on screen or on paper.

Adobe describes PDF as “the open de facto standard for electronic document distribution worldwide. Adobe PDF is a universal file format that preserves all of the fonts, formatting, colors, and graphics of any source document, regardless of the application and platform used to create it. PDF files are compact and can be shared, viewed, navigated, and printed exactly as intended by anyone with a free Adobe Acrobat Reader.”

PDF is an open format, and documentation is available from here:
http://partners.adobe.com/asn/developer/technotes/acrobatpdf.html
as well as many other Web sites and an official book.
Judged against our desired attributes, PDF looks very good.

PDF documents give consistent, high-quality output, are capable of containing elements such as bitmap and vector images, can use compression to create a small file, can be delivered electronically and cheaply, are usable on the major operating systems, and can include security controls.

Working against PDF is the fact that most of the software used to create PDF documents is commercial. A reader is required to view PDF files, but the Acrobat Reader is available free for Windows, Unix, and Macintosh from Adobe. Many visitors to your site will already be familiar with the .pdf extension and will most likely already have the reader installed.

PDF files are a good way to distribute attractive, printable documents, particularly ones that you do not want recipients to be able to easily modify. We will look at two different ways to generate a PDF certificate.

Solution Components

To get the system working, we will need to be able to examine users’ knowledge and (assuming that they pass the test) generate a certificate reporting their performance. We will experiment with generating this certificate in three different ways: two using PDF and one using RTF.

Let’s look at the requirements of each of these components in some detail.

Question and Answer System

Providing a flexible system for online assessment that allows a variety of different question types, various media types for supporting information, useful feedback on wrong answers, and clever statistic gathering and reporting would be a complex task on its own.

In this chapter, we are mainly interested in the challenge of generating customized documents for delivery over the Web, so we will build only a very simple quiz system. The quiz does not rely on any special software.
It uses an HTML form to ask questions and a PHP script to process the answers. We have been doing this since Chapter 1, “PHP Crash Course.”

Document Generation Software

No additional software is needed on the Web server to generate RTF or PDF documents from templates, but you will need software to create the templates. In order to use the PHP PDF creation functions, you will need to have compiled PDF support into PHP. (We’ll discuss more about this in a minute.)

Software to Create RTF Template

You can use the word processor of your choice to generate RTF files. We used Microsoft Word to create our certificate template. The certificate template is included on the CD-ROM in the Chapter 30 directory.

If you prefer another word processor, it would still be a good idea to test the output in Word, as this is the software that the majority of your visitors will be using.

Software to Create PDF Template

PDF documents are a little more difficult to generate. The easiest way is to purchase Adobe Acrobat. This software will let you create high-quality PDFs from various applications. We used Acrobat to create the template file for this project.

To create the file, we used Microsoft Word to design a document. One of the tools in the Acrobat package is Adobe Distiller. Within Distiller, we needed to select a few non-default options. The file must be stored in ASCII format, and compression needs to be turned off. After these are set, creating a PDF file is as easy as printing.

You can find out more about Acrobat here:
http://www.adobe.com/products/acrobat/
and buy it either online or from a regular software retailer.

Another option to create PDFs is the conversion program ps2pdf, which as the name suggests converts PostScript files into PDF files. This has the advantage of being free, but does not always produce good output for documents with images or non-standard fonts.
The ps2pdf converter comes with the Ghostscript package mentioned previously.

Obviously, if you are going to create a PDF file this way, you will need to create a PostScript file first. Unix users will typically use either the a2ps or dvips utilities for this purpose.

If you are working in a Windows environment, you can also create PostScript files without Adobe Distiller, albeit via a slightly more complicated process. You will need to install a PostScript printer driver. For example, you can use the Apple LaserWriter IINT driver. If you don’t have a PostScript driver installed, you can download one from Adobe at
http://www.adobe.com/support/downloads/product.jsp?product=44&platform=Windows

To create your PostScript file, you will need to select this printer and the Print to File option, typically found on the Print dialog box. Most Windows applications will then produce a file with a .prn extension. This should be a PostScript file. You should probably rename it to have a .ps extension. You should then be able to view it using GSview or another PostScript viewer, or create a PDF file using the ps2pdf utility.

Be aware that different printer drivers produce PostScript output of varying quality. You might find that some of the PostScript files you produce give errors when run through the ps2pdf utility. If so, we suggest trying a different printer driver.

If you only intend to create a small number of PDF files, Adobe’s online service might suit you. For $9.99 a month, you can upload files in a number of formats and download a PDF file. The service worked well for our certificate, but does not let you select options that are important for this project. The PDF created will be stored as a binary file and compressed. This makes it very difficult to modify.

This service can be found at
http://createpdf.adobe.com/

There is a free trial option for this service if you want to test it out.
There is also a free ftp-based interface to ps2pdf at the Net Distillery:
http://www.babinszki.com/distiller/
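Where Ghostscript is installed locally, the ps2pdf conversion described earlier can also be scripted. The following is a minimal sketch in Python, assuming that the ps2pdf command is available on the PATH. The helper names ps2pdf_command and convert_to_pdf are hypothetical. It uses only the basic form of the command, ps2pdf followed by the input and output file names, and passes no further options.

```python
import shutil
import subprocess
from pathlib import Path


def ps2pdf_command(ps_path, pdf_path=None):
    """Build the argument list for a basic ps2pdf call."""
    ps_file = Path(ps_path)
    # Default output name: same as the input, with a .pdf extension.
    pdf_file = Path(pdf_path) if pdf_path else ps_file.with_suffix(".pdf")
    return ["ps2pdf", str(ps_file), str(pdf_file)]


def convert_to_pdf(ps_path, pdf_path=None):
    """Run ps2pdf on a PostScript file and return the path of the PDF."""
    if shutil.which("ps2pdf") is None:
        raise RuntimeError("ps2pdf not found; is Ghostscript installed?")
    command = ps2pdf_command(ps_path, pdf_path)
    # check=True raises CalledProcessError if the conversion fails,
    # for example on PostScript produced by a poor printer driver.
    subprocess.run(command, check=True)
    return Path(command[-1])


if __name__ == "__main__":
    print(convert_to_pdf("certificate.ps"))
```

Checking the exit status is worthwhile here because, as noted earlier, some PostScript files give errors when run through ps2pdf.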