XML Step by Step - P7


Chapter 5: Creating Valid XML Documents Using Document Type Definitions

The Advantages of Making an XML Document Valid

Creating a valid XML document might seem to be a lot of unnecessary bother: You must first fully define the document’s content and structure in a DTD or XML schema and then create the document itself, following all the DTD or schema specifications. It might seem much easier to just immediately add whatever elements and attributes you need, as you did in the examples of well-formed documents in previous chapters.

If, however, you want to make sure that your document conforms to a specific structure or set of standards, providing a DTD or XML schema that describes the structure or standards allows an XML processor to check whether your document is in conformance. In other words, a DTD or XML schema provides a standard blueprint to the processor so that in checking the validity of the document, it can enforce the desired structure and guarantee that your document meets the required standards. If any part of the document doesn’t conform to the DTD or XML schema specification, the processor can display an error message so that you can edit the document and make it conform.

Making an XML document valid also fosters consistency within that document. For example, a DTD or XML schema can force you to always use the same element type for describing a given piece of information (for instance, to always enter a book title using a TITLE element rather than a NAME element); it can ensure that you always assign a designated value to an attribute (for instance, hardcover rather than hardback); and it can catch misspellings or typos in element or attribute names (for instance, typing PHILUM rather than PHYLUM for an element name).

Making XML documents valid is especially useful for ensuring uniformity among a group of similar documents.
In fact, the XML standard defines a DTD as “a grammar for a class of documents.” Consider, for example, a Web publishing company that needs all its editors to create XML documents that conform to a common structure. Creating a single DTD or XML schema and using it for all documents can ensure that these documents uniformly comply with the required structure, and that editors don’t add arbitrary new elements, place information in the wrong order, assign the wrong data types to attributes, and so on. Of course, the document must be run through a processor that checks its validity.

Including a DTD or XML schema and checking validity is especially important if the documents are going to be processed by custom software (such as a Web page script) that expects a particular document content and structure. If all users of the software use a common appropriate DTD or XML schema for their XML documents, and if the documents are checked for validity, the users can be sure that their documents will be recognized by the processing software. For example, if a group of mathematicians are creating mathematical documents that will be displayed using a particular program, they could all include in their documents a common DTD that defines the required structure, elements, attributes, and other features. In fact, most of the “real-world” XML applications listed at the end of Chapter 1, such as MathML, consist of a standard DTD or XML schema that all users of the application use with their XML documents, so that checking the documents for validity ensures that they conform to the application’s structure and will be recognized by any software designed for that application.
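
To make the consistency argument concrete, here is a minimal sketch of a document with an internal DTD. It is only an illustration under stated assumptions: the Binding attribute and its two allowed values are hypothetical names chosen for this sketch, and the attribute-list syntax it uses is covered later in the chapter.

```xml
<?xml version="1.0"?>
<!-- Hypothetical sketch: the Binding attribute and its values are illustrative only. -->
<!DOCTYPE BOOK [
  <!-- A BOOK must contain exactly one TITLE followed by exactly one AUTHOR,
       so entering the title in a NAME element would be a validity error. -->
  <!ELEMENT BOOK (TITLE, AUTHOR)>
  <!ELEMENT TITLE (#PCDATA)>
  <!ELEMENT AUTHOR (#PCDATA)>
  <!-- Binding accepts only the listed values, so "hardback" would be rejected. -->
  <!ATTLIST BOOK Binding (hardcover | paperback) #REQUIRED>
]>
<BOOK Binding="hardcover">
  <TITLE>The Scarlet Letter</TITLE>
  <AUTHOR>Nathaniel Hawthorne</AUTHOR>
</BOOK>
```

A validating processor should accept this document as written. Renaming TITLE to NAME, misspelling an element name, or assigning Binding the value hardback would each be reported as a validity error, which is the kind of uniformity check described above.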

note
The Microsoft Internet Explorer processor will check a document for validity only if the document contains a document type declaration and you open the document through an HTML Web page (using the techniques you’ll learn in Chapters 10 and 11), or if you use an XML schema as explained in Chapter 11. If you open an XML document (with or without a style sheet) directly in Internet Explorer (as you have done so far in this book and will do in Chapters 8, 9, and 12), the processor will check the entire document, including any document type declaration it contains, for well-formedness and will display a fatal error message for any infraction it encounters. However, the Internet Explorer processor will not check the document for validity, even if it contains a document type declaration.

To test a document with a DTD or XML schema for validity and to see messages for any well-formedness or validity errors the document contains, you can use one of the validity checking scripts (contained in HTML Web pages) that are given in “Checking an XML Document for Validity” on page 396. (These scripts are also provided on the companion CD.) You might want to read the instructions in that section now so that you can begin checking the validity of the XML documents you create.

Adding the Document Type Declaration

A document type declaration is a block of XML markup that you add to the prolog of a valid XML document. It can go anywhere within the prolog, outside of other markup, following the XML declaration. (Recall that if you include the XML declaration, it must be at the very beginning of the document.)

[Figure: a document divided into its prolog and its document element, marking the two positions within the prolog where the document type declaration can go]

A document type declaration defines the content and structure of the document.
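
The placement rule can be sketched as follows. This is a minimal illustration, not a listing from the book: the NOTE element type and its content are hypothetical, and the comments mark the positions in the prolog where the single document type declaration may appear.

```xml
<?xml version="1.0"?>
<!-- The document type declaration can go here, directly after the XML declaration,
     or after other prolog items such as this comment. -->
<!DOCTYPE NOTE [
  <!ELEMENT NOTE (#PCDATA)>
]>
<!-- More comments or processing instructions may follow it in the prolog,
     but the declaration must always come before the document element. -->
<NOTE>Remember to check this document for validity.</NOTE>
```

Only one document type declaration is allowed per document, and its name (NOTE here) must match the name of the document element.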
If you open a document without a document type declaration (or XML schema) in Internet Explorer, the Internet Explorer processor will merely check that the document is well-formed. If, however, you open a document with a document type declaration in Internet Explorer, the processor will, under certain circumstances, check the document for validity as well as for well-formedness, and your document must therefore conform to all declarations within the document type declaration. (See the note at the end of the previous section for a description of the circumstances under which Internet Explorer checks for validity.) You won’t, for example, be able to include any elements or attributes in the document that you haven’t declared in the document type declaration. And every element and attribute that you do include must match the specifications (such as the allowable content of an element or the permissible type of an attribute value) expressed in the corresponding declaration.

Well-Formedness and Validity Constraints

Well-formedness constraints are a set of rules given in the XML specification that you must follow, in addition to the rules specified in the formal XML grammar, to create a well-formed document. Because an XML document must be well-formed, any violation of a well-formedness constraint or any other failure to achieve well-formedness is considered a fatal error. When the XML processor encounters a fatal error, it must stop normal processing of the document and not attempt to recover.

Validity constraints are a further set of rules in the XML specification that you must follow if you’ve chosen to create a valid document by defining a DTD. (They don’t apply if you’ve chosen to create a valid document using an XML schema.) Because validity is optional for an XML document, a violation of a validity constraint is considered only an error, as opposed to a fatal error.
When a validating XML processor (that is, one that checks documents for validity) encounters an error, it can simply report the problem and attempt to recover from it. Validity constraints consist of specific rules for creating a proper document type declaration with its DTD, and for creating a document that conforms to the specifications within your DTD.

Declaring Element Types

In a valid XML document created using a DTD, you must explicitly declare the type of every element that you use in the document in an element type declaration within the DTD. An element type declaration indicates the name of the element type and the allowable content of the element (often specifying the order in which child elements can occur). Taken together, the element type declarations in the DTD map out the entire content and logical structure of the document. That is, the element type declarations indicate the element types that the document contains, the order of the elements, and the contents of these elements.

The Form of an Element Type Declaration

An element type declaration has the following general form:

    <!ELEMENT Name contentspec>

Here, Name is the name of the element type being declared. (To review the rules for legal element names, see “The Anatomy of an Element” on page 53.) And contentspec is the content specification, which defines what the element can contain. The next section describes the different types of content specifications you can use.

The following is a declaration of an element type named TITLE, which is permitted to contain only character data (no child elements would be allowed):

    <!ELEMENT TITLE (#PCDATA)>

And here’s a declaration for an element type named GENERAL, which can contain any type of content:

    <!ELEMENT GENERAL ANY>

As a final example, here’s a complete XML document with two element types.
The declaration of the COLLECTION element type indicates that it can contain one or more CD elements, and the declaration of the CD element type specifies that it can contain only character data. Notice that the document conforms to these declarations and is therefore valid:

    <?xml version="1.0"?>
    <!DOCTYPE COLLECTION
      [
      <!ELEMENT COLLECTION (CD)+>
      <!ELEMENT CD (#PCDATA)>
      <!-- You can also insert a comment in a DTD. -->
      ]
    >
    <COLLECTION>
      <CD>Mozart Violin Concertos 1, 2, and 3</CD>
      <CD>Telemann Trumpet Concertos</CD>
      <CD>Handel Concerti Grossi Op. 3</CD>
    </COLLECTION>

note
You can declare a particular element type only once in a given document. For general information on redeclaring items in the DTD, see the sidebar “Redeclarations in a DTD” on page 148.

The Element’s Content Specification

You can specify the content of an element (that is, fill in the contentspec part of the element type declaration) in four different ways:

■ EMPTY content. You use the EMPTY keyword to indicate that the element must be empty; that is, it cannot have content. Here’s an example:

    <!ELEMENT IMAGE EMPTY>

  The following would be valid IMAGE elements you could enter into your document:

    <IMAGE></IMAGE>
    <IMAGE />

■ ANY content. You use the ANY keyword to indicate that the element can have any legal content. That is, an element of this type can contain zero or more child elements of any declared type, in any order or number of repetitions, with or without interspersed character data. This is the most lax content specification, and creates an element type without content constraints. Here’s an example of a declaration:

    <!ELEMENT MISC ANY>

■ Element content (also known as children content). With this type of content specification, the element can contain child elements of the indicated types, but can’t directly contain character data. I’ll describe this option in the next section.

■ Mixed content.
  With this type of content specification, the element can contain any quantity of character data. Also, if one or more child element types are specified in the declaration, the character data can be interspersed with any number of these child elements, in any order. I’ll describe this option later in this chapter.

Specifying Element Content

If an element has element content, it can directly contain only the specified child elements. The element cannot contain character data, except for white space characters used to separate the child elements and enhance readability (for example, you can display each child element on a separate line and indent them using space or tab characters). As always, the processor must pass the white space characters on to the application, but the application will typically ignore them. (For more details, and to learn about an exception, see the sidebar “White Space in Elements” on page 56.)

Consider the following example XML document, which describes a single book:

    <?xml version="1.0"?>
    <!DOCTYPE BOOK
      [
      <!ELEMENT BOOK (TITLE, AUTHOR)>
      <!ELEMENT TITLE (#PCDATA)>
      <!ELEMENT AUTHOR (#PCDATA)>
      ]
    >
    <BOOK>
      <TITLE>The Scarlet Letter</TITLE>
      <AUTHOR>Nathaniel Hawthorne</AUTHOR>
    </BOOK>

In this document, the BOOK element type is declared to have element content. The (TITLE, AUTHOR) following the element name in the declaration is known as the content model. A content model indicates the allowed types of child elements and their order. In this example, the content model indicates that a BOOK element must have exactly one TITLE child element followed by exactly one AUTHOR child element.

A content model can have either of the following two basic forms:

■ Sequence. The sequence form of content model indicates that the element must contain a specific sequence of child element types. You separate the names of the child element types with commas.
  For example, the following DTD indicates that a MOUNTAIN document element must have one NAME child element, followed by one HEIGHT child element, followed by one STATE child element:

    <!DOCTYPE MOUNTAIN
      [
      <!ELEMENT MOUNTAIN (NAME, HEIGHT, STATE)>
      <!ELEMENT NAME (#PCDATA)>
      <!ELEMENT HEIGHT (#PCDATA)>
      <!ELEMENT STATE (#PCDATA)>
      ]
    >

  Hence, the following document element would be valid:

    <MOUNTAIN>
      <NAME>Wheeler</NAME>
      <HEIGHT>13161</HEIGHT>
      <STATE>New Mexico</STATE>
    </MOUNTAIN>

  The following document element, however, would be invalid because the order of the child element types isn’t as declared:

    <MOUNTAIN> <!-- Invalid element! -->
      <STATE>New Mexico</STATE>
      <NAME>Wheeler</NAME>
      <HEIGHT>13161</HEIGHT>
    </MOUNTAIN>

  Omitting a child element type or including the same child element type more than once would also be invalid. As you can see, this is a very rigid form of declaration.

■ Choice. The choice form of content model indicates that the element can have any one of a series of possible child element types, which are separated using | characters.
  For example, the following DTD specifies that a FILM element can contain one STAR child element, or one NARRATOR child element, or one INSTRUCTOR child element:

    <!DOCTYPE FILM
      [
      <!ELEMENT FILM (STAR | NARRATOR | INSTRUCTOR)>
      <!ELEMENT STAR (#PCDATA)>
      <!ELEMENT NARRATOR (#PCDATA)>
      <!ELEMENT INSTRUCTOR (#PCDATA)>
      ]
    >

  Hence, the following document element would be valid:

    <FILM>
      <STAR>Robert Redford</STAR>
    </FILM>

[...]

    <!ELEMENT TITLE (#PCDATA | SUBTITLE)*>
    <!ELEMENT SUBTITLE (#PCDATA)>

The following are valid TITLE elements, conforming to this declaration:

    <TITLE>Moby-Dick <SUBTITLE>Or, The Whale</SUBTITLE></TITLE>

    <TITLE><SUBTITLE>Or, The Whale</SUBTITLE> Moby-Dick</TITLE>

    <TITLE>Moby-Dick</TITLE>

    <TITLE>
      <SUBTITLE>Or, The Whale</SUBTITLE>
      <SUBTITLE>Another Subtitle</SUBTITLE>
    </TITLE>

    <TITLE></TITLE>

Declaring Attributes

In a valid XML document, you must also explicitly declare all attributes that you intend to use with the document’s elements. You define all the attributes associated with a particular element by using a type of DTD markup declaration known as an attribute-list declaration. This declaration does the following:

■ It defines the names of the attributes associated with the element. In a valid document, you can include in an element start-tag only those attributes defined for that element.

■ It specifies the data type of each attribute.

■ It specifies for each attribute whether that attribute is required. If the attribute isn’t required, the attribute-list declaration also indicates what the processor should do if the attribute is omitted. (The declaration might, for example, provide a default attribute value that the processor will pass to the application.)

note
You can declare elements and attributes in any order in a DTD.
For example, you can declare the attribute-list specification for a particular element before you declare that element.

The Form of an Attribute-List Declaration

An attribute-list declaration has the following general form:

    <!ATTLIST Name AttDefs>

Here, Name is the type name of the element associated with the attribute or attributes. AttDefs is a series of one or more attribute definitions, each of which defines one attribute. (The order of the attribute definitions in the attribute-list declaration isn’t significant. You can always include the attribute specifications in an element start-tag in any order.)

An attribute definition has the following form:

    Name AttType DefaultDecl

Here, Name is the name of the attribute. (To review the rules for legal attribute names, see “Rules for Creating Attributes” on page 63.) AttType is the attribute type, which is the kind of value that can be assigned to the attribute. (I’ll describe the attribute type in the next section.) And DefaultDecl is the default declaration, which indicates whether the attribute is required and provides other information. (I’ll describe the default declaration later in this chapter.)

Say, for example, that you’ve declared an element type named FILM like this:

    <!ELEMENT FILM (TITLE, (STAR | NARRATOR | INSTRUCTOR))>

Here’s an example of an attribute-list declaration that declares two attributes, named Class and Year, for FILM elements:

    <!ATTLIST FILM Class CDATA "fictional"
                   Year  CDATA #REQUIRED>

Here are the different parts of this declaration:

[Figure: An attribute-list declaration, labeling the name of the associated element and, for each of the first and second attribute definitions, the attribute name, the attribute type, and the default declaration]

You can assign to the Class attribute any legal quoted string (the CDATA keyword); if you omit the attribute from a particular element, it will automatically be assigned the default value fictional.
You can assign to the Year attribute any legal quoted string; this attribute, however, must be assigned a value in every FILM element (the #REQUIRED keyword), and it therefore doesn’t have a default value.

[...] attribute as an enumerated type, like this: [...]

Here’s a complete XML document that shows the use of the Class attribute:

    <?xml version="1.0"?> [...] Electric Coffee Grinder [...] ] >

Here’s a complete list of the keywords you can [...]

[...] typically one storing non-XML data. I’ll discuss these entities in Chapter 6. For example, in the DTD you might declare an element named IMAGE to represent a graphic image and an ENTITY type attribute named Source to indicate the source of the graphic data, like this: [...]

[...] underscore (_) followed by zero or more letters, digits, periods (.), hyphens (-), or underscores. However, the XML specification states that ID attribute values beginning with the letters xml (in any combination of uppercase or lowercase letters) are “reserved for standardization.” Although Internet Explorer doesn’t enforce this restriction, it’s better not to begin names with xml to avoid future problems. [...]

[Figure: An attribute-list declaration with a single attribute definition, labeling the name of the associated element, the attribute name, the attribute type, and the default declaration]

You can specify the attribute type in three different ways:

■ [...] followed by a list of name tokens separated with | characters, followed by a close parenthesis. Recall that a name token is a name that consists of one or more letters, digits, periods (.), hyphens (-), or underscores (_), and that can also contain a single colon (:) except in the first character position. For example, if you wanted to restrict the values of the Class attribute to fictional, documentary, [...]
[...] described in “Rules for Legal Attribute Values” on page 65. In addition, the value must conform to the particular constraint that you specify in the attribute definition by using an appropriate keyword. For example, in the following XML document, the StockCode attribute is defined as a tokenized type using the ID keyword. (ID is only one of the keywords you can use to declare a tokenized type.) This keyword [...] INVENTORY document given above.

note
#REQUIRED and #IMPLIED are the two forms of default declaration in which you don’t specify a default value for the attribute. It doesn’t make sense to give a default value to an ID type attribute, since the attribute must have a unique value in each attribute specification.

■ IDREF. The attribute value must match the value of some ID type attribute in an element within the document.
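
The attribute types touched on in this section (enumerated, ID, and IDREF) can be sketched together in one small document. This is a hypothetical illustration rather than the book's own listing: INVENTORY, StockCode, and the item name Electric Coffee Grinder appear in the text, while the ITEM element, the Category and GoesWith attributes, and the stock code values are assumptions made for this sketch.

```xml
<?xml version="1.0"?>
<!-- Hypothetical sketch: ITEM, Category, GoesWith, and the stock codes are illustrative names. -->
<!DOCTYPE INVENTORY [
  <!ELEMENT INVENTORY (ITEM+)>
  <!ELEMENT ITEM (#PCDATA)>
  <!ATTLIST ITEM
    StockCode ID                 #REQUIRED
    Category  (kitchen | garden) "kitchen"
    GoesWith  IDREF              #IMPLIED>
]>
<INVENTORY>
  <!-- StockCode is an ID: it must be a legal name and unique within the document. -->
  <ITEM StockCode="S021" Category="kitchen">Electric Coffee Grinder</ITEM>
  <!-- Category is omitted here, so the default value kitchen applies.
       GoesWith is an IDREF: its value must match some ID value in the document. -->
  <ITEM StockCode="S034" GoesWith="S021">Coffee Filters</ITEM>
</INVENTORY>
```

Note that the ID attribute uses #REQUIRED rather than a default value, in line with the note above. Giving two ITEM elements the same StockCode, or pointing GoesWith at a value that no ID attribute carries, would each be a validity error.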

Date posted: 03/07/2014, 07:20