English-language database slides, Chapter 30: Semistructured Data and XML (transparencies)


Chapter 30
Semistructured Data and XML
Transparencies
© Pearson Education Limited 1995, 2005

Chapter 30 - Objectives
◆ What semistructured data is.
◆ Concepts of the Object Exchange Model (OEM), a model for semistructured data.
◆ Basics of Lore, a semistructured DBMS, and its query language, Lorel.
◆ Main language elements of XML.
◆ Difference between well-formed and valid XML documents.
◆ How Document Type Definitions (DTDs) can be used to define valid syntax of an XML document.
◆ How Document Object Model (DOM) compares with OEM.
◆ About other related XML technologies.
◆ Limitations of DTDs and how XML Schema overcomes these limitations.
◆ How RDF and RDF Schema provide a foundation for processing metadata.
◆ W3C XQuery Language.
◆ How to map XML to databases.
◆ SQL:2003 support for XML.

Introduction
◆ In 1998 XML 1.0 was formally ratified by W3C.
◆ Yet it has already impacted every aspect of programming, including graphical interfaces, embedded systems, distributed systems, and database management.
◆ Already becoming de facto standard for data communication within software industry, and is quickly replacing EDI systems as primary medium for data interchange among businesses.
◆ Some analysts believe it will become language in which most documents are created and stored, both on and off Internet.

Semistructured Data
Data that may be irregular or incomplete and have a structure that may change rapidly or unpredictably.
◆ Semistructured data is data that has some structure, but structure may not be rigid, regular, or complete.
◆ Generally, data does not conform to fixed schema (sometimes use terms schema-less or self-describing).
◆ Information normally associated with schema is contained within data itself.
◆ Some forms of semistructured data have no separate schema; in others it exists but only places loose constraints on data.
◆ Unfortunately, relational, object-oriented, and object-relational DBMSs do not handle data of this nature particularly well.
◆ Has gained importance recently for various reasons:
– may be desirable to treat Web sources like a database, but cannot constrain these sources with a schema;
– may be desirable to have a flexible format for data exchange between disparate databases;
– emergence of XML as standard for data representation and exchange on the Web, and similarity between XML documents and semistructured data.

Example 30.1
◆ Note, data is not regular:
– for John White, hold first and last names, but for Ann Beech store single name and also store a salary;
– for property at 2 Manor Rd, store a monthly rent whereas for property at 18 Dale Rd, store an annual rent;
– for property at 2 Manor Rd, store property type (flat) as a string, whereas for property at 18 Dale Rd, store type (house) as an integer value.

[...]

DataGuide
◆ A dynamically generated and maintained structural summary of database, which serves as a dynamic schema.
◆ Has three properties:
– conciseness: every label path in the database appears exactly once in the DataGuide;
– accuracy: every label path in DataGuide exists in original database;
– convenience: a DataGuide is an OEM (or XML) object, so can be stored and accessed using...
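The irregularities listed under Example 30.1 can be sketched in a few lines of Python. This is only an illustration: the field names (`fName`, `lName`, `monthlyRent`, `annualRent`) and all numeric values are assumptions, since the figure for Example 30.1 is not reproduced in this preview.

```python
# Hypothetical records loosely modelled on Example 30.1.
# Field names and every numeric value are made up for illustration.
staff = [
    {"name": {"fName": "John", "lName": "White"}},   # structured name, no salary
    {"name": "Ann Beech", "salary": 12000},          # single-string name plus salary
]
properties = [
    {"street": "2 Manor Rd", "type": "flat", "monthlyRent": 375},  # type as string
    {"street": "18 Dale Rd", "type": 1, "annualRent": 7200},       # type as integer
]

def describe(records):
    """Map each field name to the set of Python type names observed for it."""
    seen = {}
    for record in records:
        for field, value in record.items():
            seen.setdefault(field, set()).add(type(value).__name__)
    return seen

staff_fields = describe(staff)
property_fields = describe(properties)
```

Inspecting `staff_fields` and `property_fields` shows the same field carrying different types and some fields present in only one record, which is exactly what a fixed relational schema would reject.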
Lore and Lorel
◆ Lore (Lightweight Object REpository) is a multiuser DBMS, supporting crash recovery, materialized views, bulk loading of files in some standard format (XML is supported), and a declarative update language.
◆ Has an external data manager that enables data from external sources to be fetched dynamically and combined with local data during query processing.

...

DataGuides
◆ DataGuides can be classified as strong or weak:
– strong is where each set of label paths that share same target set in the DataGuide is exactly the set of label paths that share same target set in source database.
◆ Figure: (a) weak DataGuide; (b) strong DataGuide.

XML (eXtensible Markup...
... structure, and validation.
◆ Since XML is a restricted form of SGML, any fully compliant SGML system will be able to read XML documents (although the opposite is not true).
◆ XML is not intended as a replacement for SGML or HTML.

Advantages of XML
◆ Simplicity
◆ Open standard and platform/vendor-independent
◆ Extensibility
◆ Reuse
◆ Separation of content and presentation
...

... source database

DataGuides
◆ Can determine whether a given label path of length n exists in source database by considering at most n objects in the DataGuide.
◆ For example, to verify whether path Staff.Oversees.annualRent exists, need only examine outgoing edges of objects &19, &21, and &22 in our DataGuide...
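The label-path check described above can be sketched as follows. This is a minimal illustration, not Lore's actual implementation: the tiny database, its object identifiers, and the function names `build_dataguide` and `path_exists` are all assumptions. The guide is built with a subset construction, so each guide node stands for the set of objects reached by one label path.

```python
from collections import deque

# Hypothetical OEM-style database: object id -> list of (label, child id).
# Atomic objects simply have no outgoing edges. Identifiers are made up.
db = {
    "&1": [("Staff", "&2"), ("Staff", "&3"), ("PropertyForRent", "&4")],
    "&2": [("name", "&5"), ("Oversees", "&4")],
    "&3": [("name", "&6"), ("salary", "&7")],
    "&4": [("street", "&8"), ("annualRent", "&9")],
    "&5": [], "&6": [], "&7": [], "&8": [], "&9": [],
}

def build_dataguide(db, root):
    """Subset construction: every label path maps to exactly one guide node."""
    start = frozenset([root])
    guide = {}                      # target set -> {label: target set}
    queue = deque([start])
    while queue:
        targets = queue.popleft()
        if targets in guide:        # already expanded
            continue
        by_label = {}
        for oid in targets:
            for label, child in db[oid]:
                by_label.setdefault(label, set()).add(child)
        guide[targets] = {label: frozenset(kids) for label, kids in by_label.items()}
        queue.extend(guide[targets].values())
    return start, guide

def path_exists(start, guide, path):
    """Follow one guide edge per label, so a path of length n inspects at most n guide nodes."""
    node = start
    for label in path.split("."):
        node = guide[node].get(label)
        if node is None:
            return False
    return True

start, guide = build_dataguide(db, "&1")
found = path_exists(start, guide, "Staff.Oversees.annualRent")
missing = path_exists(start, guide, "Staff.monthlyRent")
```

Because each label has at most one outgoing edge per guide node, the lookup never branches, which is where the "at most n objects" bound comes from.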
(OEM)
◆ Data in OEM is schema-less and self-describing, and can be thought of as labeled directed graph where nodes are objects, consisting of:
– unique object identifier (for example, &7),
– descriptive textual label (street),
– type (string),
– a value (“22 Deer Rd”).
◆ Objects are decomposed into atomic and complex:
– atomic object contains value for base type (e.g., integer or string) and in diagram...

Lorel
◆ Lorel (the Lore language) is an extension to OQL. Lorel was intended to handle:
– queries that return meaningful results even when some data is absent;
– queries that operate uniformly over single-valued and set-valued data;
– queries that operate uniformly over data with different types;
– queries that return heterogeneous objects;
– queries where the object structure...

... available with HTML
◆ Most documents on Web currently stored and transmitted in HTML.
◆ One strength of HTML is its simplicity. Simplicity may also be one of its weaknesses, with users wanting tags to simplify some tasks and make HTML documents more attractive and dynamic.

XML
◆ To satisfy this demand, vendors introduced some browser-specific HTML tags, making...

... separately defined structure, and by giving authors ability to define custom structures, SGML provides extremely powerful document management system.
◆ However, SGML has not been widely adopted due to its inherent complexity.

XML
◆ XML attempts to provide a similar function to SGML, but is less complex and, at same time, network-aware.
◆ XML retains key SGML advantages...
Exchange Model (OEM)
◆ A label indicates what the object represents and is used to identify the object and to convey the meaning of the object, and so should be as informative as possible.
◆ Labels can change dynamically.
◆ A name is a special label that serves as an alias for a single object and acts as an entry point into the database (for example, DreamHome is a name that denotes object &1).
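The four parts of an OEM object and the role of a name as an entry point can be sketched in Python. This is a hedged illustration only: the `OEMObject` class, the `follow` helper, the intermediate object `&4` with label `PropertyForRent`, and the use of type `"set"` with a list of child identifiers for complex objects are assumptions; only `&1`, `&7`, `street`, `"22 Deer Rd"`, and the name DreamHome come from the slides.

```python
from dataclasses import dataclass
from typing import Union

@dataclass
class OEMObject:
    oid: str                       # unique object identifier, e.g. "&7"
    label: str                     # descriptive textual label
    type: str                      # base type name, or "set" for a complex object (assumed)
    value: Union[str, int, list]   # atomic value, or list of child oids

# Tiny hypothetical database; "&4" is an invented intermediate object.
objects = {
    "&1": OEMObject("&1", "DreamHome", "set", ["&4"]),
    "&4": OEMObject("&4", "PropertyForRent", "set", ["&7"]),
    "&7": OEMObject("&7", "street", "string", "22 Deer Rd"),
}

# A name is an alias for a single object and serves as an entry point.
names = {"DreamHome": "&1"}

def follow(name, path):
    """Start at the named object and follow a dot-separated label path."""
    current = [objects[names[name]]]
    for label in path.split("."):
        matches = []
        for obj in current:
            if obj.type == "set":  # only complex objects have children
                matches.extend(
                    objects[child] for child in obj.value
                    if objects[child].label == label
                )
        current = matches
    return [obj.value for obj in current]

streets = follow("DreamHome", "PropertyForRent.street")
```

Since no schema is stored separately, everything needed to interpret `&7` (its label and type) travels with the object itself, which is what the slides mean by self-describing.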

Date posted: 22/10/2014, 10:21


Table of contents

  • Chapter 30

  • Chapter 30 - Objectives

  • Slide 3

  • Introduction

  • Semistructured Data

  • Slide 6

  • Slide 7

  • Example 30.1

  • Slide 9

  • Slide 10

  • Object Exchange Model (OEM)

  • Slide 12

  • Slide 13

  • Lore and Lorel

  • Lorel

  • Slide 16

  • Slide 17

  • Example 30.2 – Example Lorel Queries

  • Slide 19

  • Slide 20
