Module 2: Parsing XML

Contents
• Overview
• Lesson: Overview of XML Parsing
• Lesson: Parsing XML Using XmlTextReader
• Lesson: Creating a Custom Reader
• Review
• Lab 2.1: Parsing XML

Information in this document, including URL and other Internet Web site references, is subject to change without notice. Unless otherwise noted, the example companies, organizations, products, domain names, e-mail addresses, logos, people, places, and events depicted herein are fictitious, and no association with any real company, organization, product, domain name, e-mail address, logo, person, place or event is intended or should be inferred. Complying with all applicable copyright laws is the responsibility of the user. Without limiting the rights under copyright, no part of this document may be reproduced, stored in or introduced into a retrieval system, or transmitted in any form or by any means (electronic, mechanical, photocopying, recording, or otherwise), or for any purpose, without the express written permission of Microsoft Corporation.

Microsoft may have patents, patent applications, trademarks, copyrights, or other intellectual property rights covering subject matter in this document. Except as expressly provided in any written license agreement from Microsoft, the furnishing of this document does not give you any license to these patents, trademarks, copyrights, or other intellectual property.

©2002 Microsoft Corporation. All rights reserved.

Microsoft, MS-DOS, Windows, Windows NT, Win32, Active Directory, ActiveX, BizTalk, IntelliSense, JScript, Microsoft Press, MSDN, PowerPoint, SQL Server, Visual Basic, Visual C#, and Visual Studio are either registered trademarks or trademarks of Microsoft Corporation in the United States and/or other countries. The names of actual companies and products mentioned herein may be the trademarks of their respective owners.
Instructor Notes

After completing this module, students will be able to:
• Create a Stream object from an XML file.
• Build a mutable string by using the StringBuilder object.
• Handle well-formedness errors in XML.
• Parse XML as text by using the XmlTextReader object.
• Create a custom XmlReader object.

Required materials
To teach this module, you need the following materials:
• Microsoft® PowerPoint® file 2663A_02.ppt
• 2663A_02_Code.htm

Preparation tasks
To prepare to effectively teach this module:
• Read the following Microsoft .NET Framework Class Library topics:
  • XmlReader Class
  • XmlTextReader
  • StringBuilder Class
• Read all of the materials for this module.
• Complete the practices and the lab.
• Practice delivering the demonstrations.

Hyperlinked Code Examples
In this module, some of the Microsoft PowerPoint® slides provide hyperlinks that open a code samples page in the Web browser. The code samples page provides a way to show and discuss code samples when there is not enough space for the code on the PowerPoint slide. It also allows students to copy code samples directly from the browser window and paste them into a development environment. All of the linked code samples for this module are in a single .htm file. To open a code sample, click the appropriate hyperlink on the slide. To navigate between code samples in a particular language, use the table of contents provided at the top of the code page. Each hyperlink opens a separate instance of the Web browser, so it is a good practice to click the Back button in Microsoft Internet Explorer after viewing a code sample. This will close the browser window and return you to the PowerPoint presentation.

How to Teach This Module
This section contains information that will help you to teach this module.

Lesson: Overview of XML Parsing
This section describes the instructional methods for teaching each topic in this lesson.
Introduction to XML Parsing
This topic introduces the module by defining the technical problem of parsing XML. Most students will already understand what parsing is and why they would do it.

XML Parsing Models
This topic introduces XmlReader by comparing it with the Simple API for XML (SAX), which many students are already familiar with. Many students should also already be aware of the two models of XML parsing: the push model and the pull model. This topic compares SAX, as an example of the push model, to the Microsoft .NET Framework XmlReader class, as an example of the pull model of XML processing. As the lesson progresses, if you identify students who have previous experience writing a SAX application, they might be able to help you point out the advantages of XmlReader.

Parsing XML with the XmlReader Class
Briefly cover the major features of the XmlReader class. Students might ask about the technique of using XmlValidatingReader with a ValidationEventHandler, which is covered in the next module.

How to Read Streams
We cover reading XML from streams early, because it is a basic skill. Be prepared to provide a definition of a stream.

How to Build Strings from Parsed XML
Another basic skill is creating and appending parsed XML by using a StringBuilder object. StringBuilder is preferred over the String object for repeated appends. A String is immutable, so each concatenation creates a new String object, whereas StringBuilder appends content to its existing buffer without creating a new StringBuilder object, which typically uses much less memory.

Lesson: Parsing XML Using XmlTextReader
This section describes the instructional methods for teaching each topic in this lesson.

Demonstration: Parsing XML
This demonstration consists of showing typical usage of three functions of a Microsoft Visual Studio® .NET add-in that was custom-built for this course. To prepare for this demonstration, you should perform the demonstration steps as they are written and prepare to explain what the add-in does. Do not walk through the code during the demonstrations. You will do that in the separate code examinations.

Note: For more information about the add-in, see Appendix A, “The XML Tools Add-In.”

How to Create an XmlTextReader Object
Show how to instantiate a new XmlTextReader.

How to Navigate Nodes
Discuss the Read() method.

How to Determine the Current Node Type
Discuss the NodeType property.

How to Read the Contents of a Node
Discuss how to use the Name, Value, and Attributes properties to read the contents of a node.

How to Handle White Space
Prepare to define the difference between significant and insignificant white space.

How to Handle XML Errors While Parsing
Discuss the use of XmlException to handle errors that result from XML that is not well-formed.

Code Examination: Parsing XML
When you perform code examinations, increase the font size used by the Visual Studio .NET development environment, especially the font size used by the Code Editor and the Output window.

To change the display options
1. On the Tools menu, click Options.
2. Click the Text Editor folder, and then click the HTML/XML folder.
3. Select the Word wrap and Line numbers options.
   Note: While in the Code window, pressing CTRL+R twice will toggle word wrap on and off.
4. Click the Environment folder, and then click the Fonts and Colors folder.
5. Change the font used for the Text Editor and the Text Output Tool Windows to Lucida Console 14 pt.
6. Click OK.
7. Close and restart Visual Studio .NET for the changes to take effect.

Lesson: Creating a Custom Reader
This section describes the instructional methods for teaching each topic in this lesson.

Why Create a Custom Reader Object?
Be prepared to provide one or two anecdotes that illustrate the need for a custom reader.

Inheriting from XmlReader
Discuss the types of XmlReader you can inherit from and the mechanics of overriding the Read() method.

Code Examination: Inheriting from XmlTextReader
Be prepared to explain how the Read() method exposes the attribute as an element node type by using the Name and Value properties.

Overview
• Overview of XML Parsing
• Parsing XML Using XmlTextReader
• Creating a Custom Reader

***************************** ILLEGAL FOR NON-TRAINER USE ******************************

Introduction
This module discusses how to parse Extensible Markup Language (XML) data from a file, string, or stream by using the XmlTextReader class. The XmlNodeReader object is not covered in this module, but it works in a similar way to the XmlTextReader object. Both the XmlTextReader and XmlNodeReader objects inherit from XmlReader. If these descendant objects do not provide the needed functionality, you can create a custom reader object that inherits from XmlReader.

Objectives
After completing this module, you will be able to use the Microsoft® .NET Framework to:
• Create a Stream object from an XML file.
• Build a mutable string by using the StringBuilder object.
• Handle well-formedness errors in XML.
• Parse XML as text by using the XmlTextReader object.
• Create a custom XmlReader object.

Lesson: Overview of XML Parsing
• Introduction to XML Parsing
• XML Parsing Models
• Parsing XML with the XmlReader Class
• How to Read Streams
• How to Build Strings from Parsed XML

Introduction
The XmlReader base class and the objects that inherit from it are a powerful set of tools for parsing XML. This lesson discusses how to use the XmlReader and supporting classes to parse XML in a variety of contexts.

Lesson objectives
After completing this lesson, you will be able to:
• Read XML from a File object.
• Read XML from a Stream object.
• Store XML in a StringBuilder object.
Introduction to XML Parsing
• Parsing and reading XML mean the same thing
• Parse XML to find content and to use node information
  • Create a list by node type
  • Sort nodes by namespace identifier
  • List all of the child elements in an XML source
  • Find a node by relative position
  • Find the last node to signal when to stop parsing

Introduction
What does it mean to parse XML? Parsing refers to the process of reading XML and then performing some action based on the information read. When you parse XML, you often filter the data in an attempt to locate a particular data value or range of values. At other times, you might be more interested in the node information that the parser finds. The term node, when used in this context, refers to a node as defined by the World Wide Web Consortium (W3C) XML Information Set Recommendation available at http://www.w3.org/TR/xml-infoset.

Find particular content
Parsing XML allows you to query an XML source to find a particular data value. For example, suppose that you must build an application that can query a local store of XML-based human resources data. Parsing the XML should allow you to find a particular value such as the record that is associated with an employee number that is equal to “12345.” Parsing also allows you to filter an XML source to find a set of related information. For example, you might want to filter a personnel listing to find those employees whose hire date falls within the current month.

Make use of node information
Parsing allows you to use the node information in an XML source, such as the node type or node value. The following are useful tasks that you can accomplish by using node information made available by parsing:
• Use node information to create a list by node type
• Sort nodes by namespace identifier
• List all of the child elements in an XML source

XML Parsing Models
[Slide diagram comparing the two models for the sample document <a> <b/> </a>.
Push model: SAX XML reader (push unfiltered XML to the calling application); Application (process nodes, handle errors, and monitor the state of the reader); Content Handler; Error Handler; Node Handler.
Pull model: Application (generate calls to XmlReader that pull specific XML); XmlReader class (pull specified XML and implement error handling); XmlTextReader; XmlNodeReader; XmlValidatingReader.]

Introduction
XML processors are based on the push model or the pull model of XML processing. The push model is typified by a processor that uses the Simple API for XML, referred to as SAX. The pull model is typified by how the .NET Framework XML reader classes process XML.

What is the push model of XML processing?
The push model of XML processing means that the parser “pushes” to the application an unfiltered, steady stream of parsed XML nodes. SAX is an example of a parser that does this. SAX pushes unfiltered XML nodes in response to a request by an application. You must write applications that consume unfiltered XML nodes to filter relevant node information and content. The push model assumes that the XML is well-formed. If the SAX processor finds an XML error, it immediately stops processing and then sends an exception to the calling application. You should write any application that uses the push model of XML processing to handle a variety of XML errors. SAX is not supported by the .NET Framework, but you can use existing SAX tools, such as the Microsoft XML Parser (MSXML), in your .NET-based programs.

What is the pull model of XML processing?
The pull model of XML processing means that the parser pulls from the XML source only those nodes that it is instructed to pull by a calling application. XmlReader, a .NET Framework class, is an example of a parser that pulls a filtered set of XML nodes in response to a request by an application. XmlReader objects read the XML one [...] processing only those items that are of interest to the application. This allows for extremely efficient applications.

Parsing XML with the XmlReader Class
What is XmlReader?
• An abstract base class
• Extends to these XML readers: XmlTextReader, XmlNodeReader, and XmlValidatingReader
• Can be used either to create customized readers [...]
• Non-cached, forward-only, read-only access
• Allows [...]

[...] i * 12; } MessageBox.Show(s);

Lesson: Parsing XML Using XmlTextReader
• Demonstration: Parsing XML
• How to Create an XmlTextReader Object
• How to Navigate Nodes
• How to Determine the Current Node Type
• How to Read the Contents of a Node
• How to Handle White Space
• How to Handle XML Errors While Parsing
• Code Examination: Parsing XML
• Practice: Reading XML Content and Nodes

[...] this lesson, you will be able to:
• Navigate through XML nodes by using the Read() methods
• Determine the current node type and extract information about the current node
• Read the attributes of an element type of node
• Handle white space in an XML document
• Implement XML error handling while parsing

Demonstration: Parsing XML
The XML Tools add-in:
• Parses
• Filters
• Converts

[...] current node
End While

// C#
while (BooksReader.Read())
{
    // process current node
}

How to Determine the Current Node Type
XmlNodeType properties:
• XmlNodeType.Element
• XmlNodeType.Text (Seattle)
• XmlNodeType.EndElement

The .NET Framework provides more node types than the W3C standard, for example [...]

[...] the XmlTextReader Value property.
3. The end tag is read as a node of type XmlNodeType.EndElement.

Other node types
The above are three of the most important node types. Other types include:
• XmlNodeType.CDATA
• XmlNodeType.Comment
• XmlNodeType.ProcessingInstruction
• XmlNodeType.Whitespace
• XmlNodeType.XmlDeclaration
For a complete list, search online help for the XmlNodeType enumeration.

[...] operation. Values can be All, None, or Significant.

How to Handle XML Errors While Parsing
• To catch XML errors, use XmlException
• XmlException inherits from SystemException
• Catches errors of well-formedness in XML form as defined by the W3C XML Recommendation
• Examples of errors in XML form: improper nesting, inconsistent casing
• XmlException provides two extended properties: LineNumber [...]

[...] of XML data during parsing, you use the XmlException class with the XmlTextReader class. XmlException reports violations of the rules of XML form that are defined in the W3C XML Recommendation. The XmlException class inherits from SystemException and returns detailed information about the last exception.

XmlException properties
Two properties are specific to XmlException: LineNumber and LinePosition [...]
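The read loop, the node-type check, and the two XmlException properties described above can be combined in one short routine. The following is a minimal sketch under stated assumptions, not the course's lab code: the class name ParseSketch, the method Describe, and the sample XML strings are hypothetical and chosen only for illustration. XmlTextReader, XmlNodeType, WhitespaceHandling, and XmlException are the .NET Framework types that this module names.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

// Hypothetical helper for illustration only; not part of the course materials.
public static class ParseSketch
{
    // Pulls nodes one at a time and records the type, name, or value of each.
    public static string Describe(string xml)
    {
        StringBuilder report = new StringBuilder();
        XmlTextReader reader = new XmlTextReader(new StringReader(xml));

        // Skip all white space nodes so that only markup and content are reported.
        reader.WhitespaceHandling = WhitespaceHandling.None;

        try
        {
            // Read() advances to the next node and returns false at the end of the input.
            while (reader.Read())
            {
                switch (reader.NodeType)
                {
                    case XmlNodeType.Element:
                        report.AppendLine("Element: " + reader.Name);
                        break;
                    case XmlNodeType.Text:
                        report.AppendLine("Text: " + reader.Value);
                        break;
                    case XmlNodeType.EndElement:
                        report.AppendLine("EndElement: " + reader.Name);
                        break;
                }
            }
        }
        catch (XmlException e)
        {
            // LineNumber and LinePosition locate the well-formedness error.
            report.AppendLine("XML error at line " + e.LineNumber +
                              ", position " + e.LinePosition);
        }
        finally
        {
            reader.Close();
        }

        return report.ToString();
    }

    public static void Main()
    {
        // Well-formed input: reports an element, a text node, and an end element.
        Console.Write(Describe("<city>Seattle</city>"));

        // Inconsistent casing in the end tag is not well-formed and raises XmlException.
        Console.Write(Describe("<city>Seattle</City>"));
    }
}
```

Because the reader is forward-only, the nodes read before the error are still reported in the second call. The exception ends the loop at the point where the mismatched end tag is found.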
[...] the XML Tools toolbar, click Parse.
4. Notice that the Output window opens, showing detailed information about the employee.xml file. Each node in the XML file appears as a row in the details table, and a count of the number of each type of node appears in the summary table.

- Parsing: C:\ \Democode\Mod02\employee.xml
[Output table with columns that include DEPTH, PREFIX, and NODETYPE; the first node type listed is XmlDeclaration ...]

[...] Create an XmlTextReader Object
• Use the XmlTextReader constructor
• Possible parameters:
  • Stream
  • String
  • TextReader
  • URL

XmlTextReader BooksReader =
    new XmlTextReader(@"c:\books.xml");

Introduction
The XmlTextReader class is an implementation of XmlReader [...]
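The constructor parameters listed above can be tried without a file on disk. The following is a minimal sketch under stated assumptions: the class name ReaderSourcesSketch, the method ElementNames, and the sample XML string are hypothetical and exist only for this illustration. It creates one XmlTextReader from a TextReader and one from a Stream, and it uses a StringBuilder to collect the element names that each reader finds.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

// Hypothetical helper for illustration only; not part of the course materials.
public static class ReaderSourcesSketch
{
    // Reads every node and returns the element names as a comma-separated string.
    public static string ElementNames(XmlTextReader reader)
    {
        StringBuilder names = new StringBuilder();
        while (reader.Read())
        {
            if (reader.NodeType == XmlNodeType.Element)
            {
                if (names.Length > 0)
                {
                    names.Append(",");
                }
                names.Append(reader.Name);
            }
        }
        reader.Close();
        return names.ToString();
    }

    public static void Main()
    {
        // Sample XML invented for this sketch.
        string xml = "<books><book><title>Sample</title></book></books>";

        // From a TextReader: StringReader wraps an in-memory string.
        XmlTextReader fromText = new XmlTextReader(new StringReader(xml));
        Console.WriteLine(ElementNames(fromText));   // prints books,book,title

        // From a Stream: MemoryStream stands in for a file or network stream.
        Stream stream = new MemoryStream(Encoding.UTF8.GetBytes(xml));
        XmlTextReader fromStream = new XmlTextReader(stream);
        Console.WriteLine(ElementNames(fromStream)); // prints books,book,title

        // From a file name or URL, as on the slide (requires the file to exist):
        // XmlTextReader BooksReader = new XmlTextReader(@"c:\books.xml");
    }
}
```

Passing the reader into a single method keeps the parsing logic independent of the source, so the same loop works regardless of which constructor created the reader.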