XML in 60 Minutes a Day, Part 2


prevent hackers, who would try to access the network through the Web server, from gaining access to the higher-security application servers or, especially, to the private segment of an organization’s network. DMZs are occasionally sacrificed to hackers, but at least the private networks remain safe.

Demilitarized Zone 2 (DMZ2). A group of servers on an intermediate-security network segment that provides applications and services intended for Space Gems’ employees and their most trusted clients, suppliers, and so on.

In this case, all of Space Gems’ DMZ1 and DMZ2 systems likely have Web server software installed on them. There may also be Web server software installed on some private network systems.

Now, if an end user somewhere on the Internet enters the www.spacegems.com URL in his or her browser’s location bar, a request will be sent to the server that has been configured with the domain name spacegems (that server is probably in DMZ1 here). After the server receives the request, it responds by transmitting a page document designated by Space Gems to the requester’s browser.

Several domain names may be mapped to the same physical computer. This concept is called virtual hosting, and the computer is called a virtual server. Virtual hosting allows you to provide several different Web sites, each with its own domain name and even IP address, using the same Web server system. Requests sent to these different sites will be routed by IP address, hostname, or browser language setting to the correct virtual host (that is, to its own respective Web site). Virtual hosting is a technique that will be illustrated in the lab exercises later in this chapter.

Individual virtual hosts have unique Web root directories (or folders), directory (or folder) hierarchies, default filenames, error files, and restricted-access files. On the other hand, the different virtual host Web sites will likely share system caching, plug-ins, security realms, and other features.
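The hostname-based routing described above can be sketched in a few lines. This is only an illustration of the idea, not how any particular Web server implements it: the host table, the partners.spacegems.com name, and every path below are hypothetical, and routing by IP address or browser language is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VirtualHost:
    """Settings that each virtual host keeps for itself."""
    web_root: str
    default_file: str
    error_file: str


# Hypothetical host table: names and paths are illustrative only.
VIRTUAL_HOSTS = {
    "www.spacegems.com": VirtualHost("/var/www/spacegems", "index.html", "404.html"),
    "partners.spacegems.com": VirtualHost("/var/www/partners", "home.html", "error.html"),
}

# Host used when a request names a site this server does not know.
DEFAULT_HOST = "www.spacegems.com"


def select_virtual_host(host_header: str) -> VirtualHost:
    """Pick a virtual host from the Host header of an HTTP request."""
    # Drop an optional ":port" suffix and normalise case before the lookup.
    hostname = host_header.split(":", 1)[0].strip().lower()
    return VIRTUAL_HOSTS.get(hostname, VIRTUAL_HOSTS[DEFAULT_HOST])


def resolve_path(host_header: str, url_path: str) -> str:
    """Map a requested URL path onto the selected host's own Web root."""
    host = select_virtual_host(host_header)
    if url_path.endswith("/"):
        # A directory request falls back to that host's default filename.
        url_path += host.default_file
    return host.web_root + url_path
```

Two requests for the same path "/" therefore end up in different root directories and may return different default files, depending only on the hostname the browser asked for.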
Many Web server software applications are available. The following are the most prominent:

Public domain software. HTTPd is public domain software that can be downloaded from the National Center for Supercomputing Applications (NCSA, located at the University of Illinois at Urbana-Champaign, Illinois). Their HTTPd Web site is http://hoohoo.ncsa.uiuc.edu/docs/Overview.html.

Apache Web Server. Developed by the Apache Software Foundation, a membership-based, not-for-profit corporation that provides various kinds of support for Apache open source software projects. Information and downloads are available from http://httpd.apache.org/.

Chapter 2, Setting Up Your XML Working Environment (422541 Ch02.qxd 6/19/03 10:09 AM Page 42)

Microsoft Internet Information Server (IIS). Usually included with Windows server software; IIS is integrated at the Windows operating system level. Check Microsoft’s IIS Web site at www.microsoft.com/windows2000/server/evaluation/features/web.asp for features, support, and downloads.

Sun ONE Web Server (formerly iPlanet Web Server, Enterprise Edition). Developed by the Sun Microsystems, Inc.-Netscape Alliance. Under the iPlanet brand name, the Sun-Netscape Alliance is producing new versions of Netscape products. Further information and a trial download can be found at Sun’s Web site at wwws.sun.com/software/products/web_srvr/home_web_srvr.html.

IBM HTTP Server. Part of IBM’s WebSphere line. Further information and downloads are available at IBM’s Web site at www-3.ibm.com/software/webservers/httpservers/.

Web Browsers

Web browsers (also called Internet browsers) are software applications that locate, request, and display Web pages and navigate from one Web site or page to another. They also contain email and chat clients. Almost all browsers are graphical browsers (they can display text and graphics), although some text-only browsers are still around.
Also, most browsers present multimedia information (sound and video are the most predominant), although they usually require plug-in utilities for some multimedia formats. Basically, browsers act as client applications to the server applications on remote Web server systems. They usually use the HTTP protocol but also use FTP and others.

To read XML, a browser application must contain another application called an XML parser (also called an XML processor), which conducts a preliminary check on XML documents. If the documents meet criteria for what are termed well-formedness and validity, the XML parser restructures the data in the documents and then passes the restructured data to the application proper (that is, to the browser). More explanations regarding parsers, well-formedness, and validity can be found in Chapter 3, “Anatomy of an XML Document.”

Browsers are generally judged according to how they measure up to the following questions:

■■ Is the browser free or at least inexpensive? Are updates or upgrades free or inexpensive?
■■ Is installation easy and trouble-free? How about configuration?
■■ Is the interface easy to look at and use?
■■ How does the browser perform? For example, does it load pages quickly? Is it stable, or does it crash occasionally (and why)? Can you see the same information on Web sites with one browser as you can with another?
■■ What about its other features? For example, can you customize its appearance? Can you customize its behavior? Does it have integrated email and chat client programs? Does it support XML?
■■ Are service and support available? Are they free?

Here are the most prominent Web browsers:

Internet Explorer. The browser against which other browsers are usually compared. IE 4.0 was the first Web browser to implement XML. Microsoft provides two parsers: one nonvalidating and one validating.
Supports DHTML, CSS1, DOM1, SMIL, Microsoft XML 3.0, and a .NET Web service behavior that allows XML/SOAP database queries. Further information and downloads are available from the Microsoft Web site at www.microsoft.com/windows/ie/default.asp.

Netscape. Supports XML, HTML 4, and Cascading Style Sheets. Available for Windows, Linux, and Mac OS. More information and downloads are available from the Netscape Web site at http://channels.netscape.com/ns/browsers/default.jsp.

Konqueror. An open source Web browser that is part of the KDE desktop environment (and thus available for Linux and other Unix variations). It complies with HTML 4 and supports Java applets, JavaScript, and the Cascading Style Sheets Recommendations 1 and (partially) 2. It is also compatible with Netscape plug-ins. It uses XML documents for configuration and other functions. More information and downloads are available from the Konqueror Web site at www.konqueror.org/.

Mozilla. Developed by the Mozilla Organization, a virtual organization that maintains the Mozilla browser as an open source project and product. Mozilla is fast and stable, and it allows you to disable many pop-up ads. Mozilla supports XML, but its parser is nonvalidating. More information and downloads are available from the Mozilla Web site at www.mozilla.org/.

Opera. Developed by Opera Software. Available for Windows, Linux, Macintosh, Symbian, QNX, and OS/2 operating systems. XML viewing capability became available with the Version 4.0 beta. Further information and downloads are available from the Opera Web site at www.opera.com/.

Other browsers are available. As time goes by, more will be developed, and more will support XML.

XML Authoring Tools

If you become an XML developer, your authoring or editing applications will probably become your most important XML software. We’ll refer to these applications as XML authoring tools or XML editors.
Because XML is an open standard, it doesn’t restrict you to one editor or another (or one classification or another), even after you get started. If you find an editor is too restrictive, or you find yourself occasionally in a situation or location where you can’t use your customary editor, you can often switch to another, and your documents will still function. However, your options may be limited by software costs, licensing, and other factors. Meanwhile, your choice of editor will probably influence the look, structure, and interoperability of your XML documents, at least during the initial creation stages. For example, some applications require the creation of other components (such as DTDs or style sheets) prior to document creation.

There are three basic XML authoring tool classifications, each with several authoring applications. In order of complexity, starting with the least complex, the three basic XML authoring tool classifications are as follows:

■■ Simple text editors
■■ Graphical editors
■■ Integrated development environments

We’ll discuss each classification in turn and then list a few representative editing tools from each. Note that these classification boundaries are becoming blurred as the tool developers add to or modify the features in their respective applications. They do so by adopting or adapting features that were previously available in applications in the higher categories or by becoming more interoperable with other types of applications (for example, graphics, audio, or video applications) or other document editors.

As mentioned in Chapter 1, XML is being adopted by more and more Web developers; therefore, we can expect other types of Web-based applications (especially HTML editors, database software, and e-commerce software) to incorporate XML support and, with it, some level of XML creation capability. In the near future, these other application types will likely form their own category of XML creation tools.
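Because an XML document is plain text whichever editor wrote it, the output of any authoring tool can be checked independently by handing it to a parser. Here is a minimal sketch, assuming Python's standard xml.etree.ElementTree module (a nonvalidating parser that the book itself does not use) and two hypothetical gem documents. It tests well-formedness only; validity against a DTD or schema is not checked.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if a nonvalidating parser accepts the document."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        # Raised for problems such as mismatched or unclosed tags.
        return False
    return True


# Hypothetical sample documents, typed as they might be in any plain text editor.
GOOD_DOCUMENT = "<gem><name>Sapphire</name><inclusions/></gem>"
BAD_DOCUMENT = "<gem><name>Sapphire</gem>"  # the name element is never closed

if __name__ == "__main__":
    print(is_well_formed(GOOD_DOCUMENT))
    print(is_well_formed(BAD_DOCUMENT))
```

The same check gives the same answer whether the text came from Notepad, vi, or a graphical editor, which is one practical consequence of XML being an open, text-based standard.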
Simple Text Editors

Simple text (also called plaintext) applications are small and uncomplicated, so they’re easy on computer system resources. Consequently, plaintext editors have shipped and installed with personal computer operating systems since the 1980s. With some Unix operating systems, they’ve been around since the 1970s. You can find one on virtually any computer you boot up.

Text editors have few features and are limited in their display capabilities. Some use only one font; some only let you use a few different colors. You can’t really change the look and feel of your text with these programs, but because they allow you to write ASCII (but not usually Unicode) text, they are still good enough to create modest XML documents, since XML tags generally use the symbols and characters found on a standard keyboard. They are not recommended for creating complex documents in larger structures, but if you know what you’re doing and you want to make only a few changes, they can still be used to modify any existing XML document. Following are some examples:

Microsoft Notepad. Notepad installs with the Windows operating system. It is not resource-intensive, typically using less than 1 MB of RAM and just a few CPU cycles when activated. A few menu-driven options are available in Notepad, just enough to accomplish simple text editing.

vi (found on virtually every Unix system, including Linux). Unix users likely recognize vi, although they may know it by its other names, like vim or other variations on the vi name. vi is the Unix equivalent to Notepad: It is the ubiquitous text editor in the Unix world. It, too, is a modest application, so it is likely to continue to be installed on almost every Unix system.
Several vi variants are customizable and can recognize XML tags, so they can highlight those tags in different colors, indent, and perform other functions to facilitate XML creation and editing. A Unix version of vi is available from SourceForge.net’s vimonline Web site at http://vim.sourceforge.net/. A version of vi called WinVi (vi with a Windows wrapper interface) is available from Raphael “Ramo” Molle at www.winvi.de/en/.

Microsoft WordPad. Another application that installs on almost every Windows system, WordPad provides more features than Notepad, such as different fonts and font sizes, toolbars, and more sophisticated margin and tab stop controls. WordPad provides a slightly better user interface and more appealing-looking documents without the necessity of Microsoft Word.

Emacs (found on more and more Unix systems). At one time, the equivalent of WordPad in the Unix environment, but now somewhat more sophisticated.

SimpleText. SimpleText ships with every Macintosh system. It limits the size of a document that you can create, but you can use a drag-and-drop feature, record sounds, and use QuickDraw (though with minimal support).

As limited as they are, simple text editors are far from extinct. Their advantages stem from their simplicity to learn and use, their capability to get the job done, the few system resources they use, the convenience of finding them on virtually every system, and the fact that you don’t have to install a separate and much larger WYSIWYG application or an office suite of applications to create simple text documents. Witness how easy it was to examine the sample XML files found by Windows Explorer in the lab exercises for Chapter 1. Consequently, simple text editors are still among the most popular text manipulation tools, especially if the document being created or modified is not large or complex.
Some developers are capable of, and comfortable with, creating whole documents with simple text editors. Throughout this book, you will see several examples of basic documents created with simple text editors.

Graphical Editors

Despite our glowing words for them, simple text editors can be slow when producing XML and XML-related documents, such as style sheets, DTDs, and schemas. Many dedicated XML editors, complete with graphical user interfaces (GUIs), are now available that behave similarly to the word processor applications with which we are familiar. In addition to simple text editing, the features of graphical XML editors include, but may not be limited to, the following:

■■ tags that are color-highlighted
■■ capability to hide tags, combined with immediate application of style sheets to provide a WYSIWYG document view
■■ menus of options
■■ drag-and-drop editing
■■ click-and-drag highlighting
■■ other special mechanisms for manipulating markup
■■ checking for well-formedness
■■ validity checking
■■ macro creation to save steps
■■ menus of only those elements that are declared and defined within DTDs or schemas

The last feature, also referred to as structure checking, is popular. The editor can resist the addition of any element that doesn’t belong. That way the editor can prevent the author from making syntactic or structural mistakes. Keep in mind, however, that structure checking can also hinder someone from experimenting with different element orderings by forcing the author to stop and figure out why one or another of those maneuvers was rejected.

Unlike SGML editors, which by nature are more complex and expensive, the editors being created for XML are simpler and more affordable. Here are some examples of graphical editors for XML.
Some provide the features described previously, while others are in transition from graphical text editing toward the integrated development environments discussed later in this chapter:

Microsoft XML Notepad. Its interface consists of a two-pane display: elements, attributes, comments, and text are added to the XML document via the tree structure in the left pane; values for those components are entered in the corresponding text boxes in the right pane. For additional information and to download a copy of XML Notepad, go to the Microsoft Developer Network (MSDN) Web site (http://msdn.microsoft.com/library/) and enter “xml notepad” in the search engine there.

XAE (XML Authoring Environment for Emacs). Developed by Paul Kinnucan, XAE is add-on software that enables you to use Emacs (or XEmacs) and your Unix system’s HTML browser to create, transform, and display XML documents. For further information and to download a copy, go to http://xae.sunsite.dk/.

Peter’s XML Editor. This is a modest, but effective, XML development tool. For further information and to download a copy, go to the Web site at www.iol.ie/~pxe/index.html.

Adobe FrameMaker. Enterprise-class authoring and publishing software, FrameMaker is a WYSIWYG application that is evolving into an IDE. For further information or for trial software, go to the Adobe Web site at www.adobe.com/products/framemaker/main.html.

Conglomerate. This is a hybrid word processor-style editor that is moving toward becoming an IDE. Conglomerate is free software licensed under the GNU General Public License. It consists of a GUI and a server-database combination that performs storage, searching, version control, transformation, and publishing. The code base is apparently still unfinished but reasonably stable, and it will be rewritten. Source code for Unix and Windows is available. Further information and a downloadable copy are available through the Web site at www.conglomerate.org/.

Emilé.
Developed by Media Design In-Progress for the Macintosh environment, Emilé is a customizable XML editor that supports DTDs and comes with a validating parser. Color highlighting allows you to see the hierarchical structure and the content. It can be extended with other plug-in components. For further information and to download a test copy, see the Media Design In-Progress Web site at http://in-progress.com/emile/.

Microsoft FrontPage 2002. FrontPage 2002 has an option called Apply XML Formatting Rules to automatically reformat the HTML tags on an HTML page to make them XML-compliant. For further information, go to the Microsoft Office Assistance Center Web site at http://office.microsoft.com/assistance/default.aspx and search for “frontpage xml”.

Microsoft Word. See the comments that follow in the next section.

Use Only the Latest Versions of Microsoft Word for HTML/XML Creation

No doubt about it, Microsoft Word is one of the most well-known and well-used word processing applications in modern publishing. If, however, you’re going to use Word to eventually generate XML (such as by creating a Word document, converting it to HTML, and converting that HTML document to XML), you should be aware of the drawbacks of using older versions of Word, in particular any version up to and including Word 97. Newer Word versions have better compatibility with Web page formats.

Earlier versions of Microsoft Word add many extraneous tags and other information into their documents. The extra information and tags risk confusion with the tags and data you might create in your XML documents. Here’s an example you can try:

1. If you have a system with, for example, Word 97, click Start, Programs, Microsoft Word.
2. Click File, New and Blank Document, and OK.
3. When the new document window appears, type in a simple yet unique word or phrase as shown in Figure 2.2.
Figure 2.2 A test document named sapphire_excerpt created with Word 97.

4. Click File, Save As, and in the Save As dialog box give the file an appropriate filename (in our example, you can see that the document has been named sapphire_excerpt_Word97). In the Save as Type field, click the down arrow to open the drop-down menu, click Rich Text Format (*.rtf), and click Save. The simple Word document is now in RTF format.
5. Click the File menu button again, and click Save as HTML Document. In the Save as HTML dialog box, give the file an appropriate filename. In the Save as Type field, accept the default HTML document and then click Save.
6. Open the Notepad application by clicking Start, Programs, Accessories, Notepad.
7. When Notepad has started, click File and Open. In the Open dialog box, browse through the Look In field’s directory structure until you find the RTF file you saved in Step 4. You may have to click the down arrow in the Files of Type field to open the drop-down menu and then select All Files.
8. When your file is displayed, you will see that your actual text (in the example, the sapphire description) begins near the end of the file. Meanwhile, look at all the tags Word 97 has inserted. Take a look at Figure 2.3 to see what happened with our sapphire excerpt example.

Figure 2.3 RTF results from the Word 97 version of sapphire_excerpt.

9. Open another Notepad instance. Again, use Start, Programs, Accessories, Notepad.
10. When Notepad has started this time, click File and then Open. In the Open dialog box, navigate the Look In field’s directory structure until you find the HTML format file you saved in Step 5. Again, you may have to click the down arrow to open the drop-down menu in the Files of Type field and then select All Files.
11.
When the HTML version of the file is displayed, you will see your text, but the HTML tags have been altered and several extra tags have been inserted by Word again. Figure 2.4 illustrates what happened with our sapphire excerpt example. For a small and simple file such as this, the conversion to HTML seems acceptable. For larger, complex documents, it could cause headaches.

It should be clear from the results displayed in Figure 2.3 why old versions of Microsoft Word, despite all their document production benefits in many other contexts, are not as good a tool for XML document creation as other HTML-specific applications.

Meanwhile, if you had used Notepad to view the file in DOC format, or even in TXT format, you would have seen that additional information had been added to the sapphire file, but the extra characters would have been unreadable. At least in the RTF and HTML formats you can see what Word 97 was trying to convey. Do you understand now why the size of the HTML version of the file is approximately 1 KB, while the RTF version is 3 KB? And Word 97’s DOC version is 19 KB!

Figure 2.4 The sapphire_excerpt document after being saved in HTML format looks like this figure.

[...]

... Linux. Again, all of the necessary instructions for configuring Apache on Linux are available on the XML in 60 Minutes a Day Web site. In Lab 2.2, you’ll install TIBCO Software, Inc.’s TurboXML as the XML editor. With little effort, this lab could also be performed with other XML editing tools, such as Altova Inc./Altova GmbH’s XML Spy; however, we recommend that...
... [...] tag is a start tag, the [...] tag is an end tag, and [...] is a declared-empty element tag. Notice that each tag is delimited by a left angle bracket (<) at the beginning and a right angle bracket (>) at the end. The end tag always has a slash immediately after the left angle bracket before its name. The empty element tag also has a slash, but it appears immediately before the ending angle bracket...

... Recommendation) Each XML document has both a logical and a physical structure.” Expanding that definition, each XML document contains a unique instance of logically structured data, plus additional instructions for the parser and the application. The data instance portion contains data components with unique values. All the components and their respective values must conform to definitions in the language’s...

... Document

The Data Instance

The data instance portion of an XML document follows the prolog and consists of one or more elements. Elements are an XML document’s data containers and are the basic building blocks of XML data instances.

Element Types, Tags, and Names

Each element begins and ends with its element type (also referred to as an element name), contained in a tag (some refer to tags as tag names, but...

... using Windows XP Professional and Linux. Instructions for using both Windows 2000 and XP are documented within this book. If you have installed (or will be installing) Linux as your operating system, you will find instructions for installing the Apache Web server and TurboXML at the XML in 60 Minutes a Day Web site as noted in the book’s introduction.

Creating Your XML Environment: Overview

Once a version...

... language’s conformance-checking mechanisms; in other words, a document type definition or schema. After being processed by an XML parser, the data in a document is structured and then passed to the application. But the W3C has drawn a bit of a boundary around XML documents when they refer to them as data objects. They are not quite the same as, say, Java objects, which can contain a combination of data and procedures...

... purists prefer tags). There are three kinds of tags. Start tags (also called opening tags) appear at the beginning of an element, and end tags (or closing tags) appear at the end of an element. Also, a sort of hybrid tag introduces declared-empty elements (elements that are not intended to contain any data). Here is an excerpt from Figure 3.2, which illustrates all three kinds of tags: 1 26 00 0...

... that xml must be lowercase. The XML declaration is actually a kind of processing instruction (discussed next); that is, it talks to the application, not to the parser. What it says, in a way, is “activate the XML parser; this is an XML document” and then provides additional information about the document for use by the application and the parser. The information appears in three pseudo-attributes: the XML...

... also usually include the functions listed in the previous paragraphs plus all the major aspects of XML design and editing, such as document authoring, editing, and validation; DTD or schema editing and validation; and Extensible Stylesheet Language editing and transformation (the latter topic is discussed in detail in Chapter 9, “XML Transformations”). A sophisticated IDE environment facilitates large...

... schema. This operator only appears in validating parsers.

70 Chapter 3

An entity resolver. Incorporates any data referenced within the XML document’s referential markup that is located outside the XML document entity itself or that is not intended to be parsed in a customary manner.

Several parsers are available, including expat (at www.jclark.com/xml/expat.html or http://sourceforge.net/projects/expat/), ...
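The three kinds of tags described in the Chapter 3 excerpts above (start tags, end tags, and empty-element tags) can be told apart purely by where the slash sits. The following is a rough sketch under stated assumptions: the diamond document and its element names are hypothetical (they are not taken from the book's Figure 3.2), and the regular expression is a deliberate simplification that ignores comments, CDATA sections, and processing instructions.

```python
import re

# Optional leading slash, a tag name, anything else inside the brackets,
# then an optional slash just before the closing bracket.
TAG_PATTERN = re.compile(r"<(/?)([A-Za-z_][\w.:\-]*)[^<>]*?(/?)>")


def classify_tags(xml_text: str) -> list[tuple[str, str]]:
    """Return (name, kind) pairs for every tag found in the text."""
    kinds = []
    for closing_slash, name, empty_slash in TAG_PATTERN.findall(xml_text):
        if closing_slash:
            # Slash immediately after the left angle bracket.
            kinds.append((name, "end"))
        elif empty_slash:
            # Slash immediately before the right angle bracket.
            kinds.append((name, "empty-element"))
        else:
            kinds.append((name, "start"))
    return kinds


# Hypothetical sample; element names are illustrative only.
SAMPLE = '<diamond><carat>1.26</carat><flaws count="0"/></diamond>'

if __name__ == "__main__":
    for name, kind in classify_tags(SAMPLE):
        print(f"{name}: {kind} tag")
```

A real parser does far more than this (it also checks nesting and character rules), so the sketch is only meant to make the slash positions concrete.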

Date posted: 14/08/2014, 12:20
