XML Technical Specification for Higher Education

Purpose and Scope

This document serves as a guide for developing and maintaining a data dictionary and XML schemas. It covers the data exchanged between institutions and their partners to support essential business processes in Higher Education, including administrative applications related to student financial aid, admissions, and registrar functions.

Intended Audience

This document is intended for members of the XML Forum for Education and for technical members of the broader education community who are interested in using XML for data exchanges.

General Guidelines

General Naming Conventions

The following general naming conventions, recommended by the XML Forum’s Technology Group, are used whenever possible.

• Upper Camel Case (UCC) SHALL be used. UCC style capitalizes the first character of each word and compounds the name, following the conventions of the ebXML Technical Architecture v1.0.4, section 4.3:

• Acronyms SHOULD be avoided, but in cases where they are used, their capitalization SHALL remain.

• Underscores ( _ ), periods ( . ), and dashes ( - ) MUST NOT be used (examples: use HeaderManifest, not Header.Manifest; use StockQuote5, not Stock_Quote_5; use CommercialTransaction, not Commercial-Transaction).

• XML "type" names SHALL have "Type" appended to them.

• Schema names adhere to the following conventions:

1. Schema document names (the root element of a schema) SHALL be based on the business purpose of the document.

2. Schema names that support the data dictionary SHALL be based on the category of definitions in that schema.

3. Schema physical file names SHALL be the same as the schema name, with a ".xsd" extension.

4. Schema names SHALL remain constant across all versions.
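
The naming rules above lend themselves to a mechanical check. The following is a minimal sketch in Python, not part of the specification: the function names are hypothetical, and the regular expression only approximates Upper Camel Case (it cannot tell where one word ends and the next begins, and it lets acronyms through unchanged).

```python
import re

# Approximation of Upper Camel Case: a leading capital followed only by
# letters and digits, which rejects underscores, periods, and dashes.
UCC_PATTERN = re.compile(r"[A-Z][A-Za-z0-9]*")


def is_valid_name(name: str, is_type: bool = False) -> bool:
    """Return True if `name` passes the approximate naming checks."""
    if not UCC_PATTERN.fullmatch(name):
        return False
    # XML "type" names carry a "Type" suffix.
    if is_type and not name.endswith("Type"):
        return False
    return True


def schema_file_name(schema_name: str) -> str:
    """The physical file name is the schema name plus the ".xsd" extension."""
    return schema_name + ".xsd"


for candidate in ["HeaderManifest", "Header.Manifest", "Stock_Quote_5", "StockQuote5"]:
    print(candidate, is_valid_name(candidate))
```

A check like this could run over a data dictionary spreadsheet export before schemas are generated; it does not replace review by the Core Components team.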

NOTE: A list of acronyms used in this document can be found in section 5.2.

The approved XML Forum schemas will be designated as version 1.0. Subsequent updates for maintenance or the inclusion of new documents will be considered minor releases and incremented by 0.1. Major releases, in contrast, will be incremented by a full 1.0 and will be designated under the following circumstances:

• Several new documents are developed

• Major additions are made to the data dictionary

• Changes are made to file, URL, or namespace schemes

• The schema design approach changes

Versions must be identified by a four-character string consisting of two digits for the major version and two digits for the minor version, including leading zeroes. Each version should use distinct URLs, URIs, and directories. Additionally, every schema must include a version attribute in its root element.
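
As an illustration of the four-character version string, here is a small Python sketch. The function name is hypothetical, and the mapping of a minor release onto the last two digits is an assumption; the only example the document itself gives is Version 1.0, which is written as 0100.

```python
def version_string(major: int, minor: int) -> str:
    """Format a version as two digits of major plus two digits of minor."""
    if not (0 <= major <= 99 and 0 <= minor <= 99):
        raise ValueError("major and minor must each fit in two digits")
    # Leading zeroes are kept so the result is always four characters.
    return f"{major:02d}{minor:02d}"


print(version_string(1, 0))
```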

2.1.3 URI, URL, File, and Directory Structure

The base URI for namespaces in XML Forum schemas SHALL be http://schemas.PESCXML.org. This URI SHALL also be valid as the base URL for the network location of the XML Forum schemas and associated files. The version string MUST be appended to this base URI to form the URI relevant to the version (example: Version 1.0 has the URI http://schemas.PESCXML.org/0100).

The Forum plans to make several different types of files available on its web site.

The following is a concise overview of the categories along with their respective URL/URI specifications. Detailed information on each category can be found in the subsequent subsections. All path names mentioned are situated under the base version URL path.

• Schema files - This is the largest category of component and is further broken down in succeeding subsections. Schema files are located in the xsd path.

• XSLT stylesheets - The Forum aims to facilitate the sharing and distribution of commonly used XSLT stylesheets, even though they are not considered standard work products. These are located in the xsl path.

• Sample instance documents - As will be described later, there are several sample instance documents per business document schema. These are located in the xmlExamples path.

• Documentation - This is located in the docs path.

For each area, the path has sub-paths for core and sectors (to be described shortly).
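
The directory layout described above can be sketched as a small URL builder. This is an illustration only: the dictionary keys and the function name are hypothetical, while the base URI, the path names (xsd, xsl, xmlExamples, docs), and the core sub-path come from the text.

```python
BASE_URI = "http://schemas.PESCXML.org"

# Category label (hypothetical key) -> path name given in the text.
CATEGORY_PATHS = {
    "schemas": "xsd",
    "stylesheets": "xsl",
    "samples": "xmlExamples",
    "documentation": "docs",
}


def category_url(version: str, category: str, area: str = "core") -> str:
    """Build the URL of one category's sub-path for a core or sector area."""
    return f"{BASE_URI}/{version}/{CATEGORY_PATHS[category]}/{area}"


print(category_url("0100", "schemas"))
print(category_url("0100", "samples", area="RegAdmin"))
```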

Root files and URLs: xsd/core/coreMain.xsd and xsd/baseTypes/baseTypesMain.xsd

Core - The Core Components team SHALL define all aggregates and their maximum universe of member elements in a common “Core” data dictionary. This Core data dictionary SHALL be represented by one or several schemas, divided into groups of related items as will be described in a later section. The root URI for Version 1.0 of the Core schema, for example, has the root namespace of “http://schemas.PESCXML.org/0100/core” and is associated with the namespace identifier “core”. Names of these schemas are derived from reasonable names assigned by the Core Components team to the groups. Such files are xsd:included into coreMain.xsd.

Base types - Base types are fundamental, reusable data items that may be either simpleTypes or complexTypes but have no child elements. They are derived from standard W3C Schema data types through extension or restriction. Common examples include numerics with specific range or sign restrictions, strings with defined length limitations, and enumerations. They may be specified in the baseTypesMain.xsd schema file or organized into related groups that are included in baseTypesMain.xsd.

Root files and URL path: xsd/sectorName/sectorNameMain.xsd

The Core Components team may establish Sector data dictionaries to address specific needs within defined functional sectors of the postsecondary arena. These dictionaries cater to unique requirements related to aggregate membership, cardinality, patterns, and code values. Each Sector dictionary is represented by a single schema. The root URI for version 1.0 of a Sector schema (example: “SectorName”) has the root namespace of “http://schemas.PESCXML.org/0100/SectorName”. Sector content may be specified in the sectorNameMain.xsd schema file, or broken into groups of related items that are xsd:included in it.

Instance document definitions are established under the relevant Sector URI or the Core URI. Each document has its own targetNamespace and schema file. The schema for the instance document imports the Core namespace and, when necessary, the Sector namespaces. For instance, the root URI for version 1.0 of the Transcript schema in the Registrar and Administration Sector has the root namespace of “http://schemas.PESCXML.org/0100/RegAdmin/Transcript”.
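
The namespace scheme for Core, Sector, and instance document schemas can be summarized in a few lines of Python. This is a sketch under the assumption that a document namespace is simply the parent (Sector or Core) namespace plus the document name, as the Transcript example suggests; the function names are hypothetical.

```python
from typing import Optional

BASE_URI = "http://schemas.PESCXML.org"


def core_namespace(version: str) -> str:
    return f"{BASE_URI}/{version}/core"


def sector_namespace(version: str, sector: str) -> str:
    return f"{BASE_URI}/{version}/{sector}"


def document_namespace(version: str, document: str, sector: Optional[str] = None) -> str:
    """Place a document under its Sector namespace, or under Core if no sector is given."""
    parent = sector_namespace(version, sector) if sector else core_namespace(version)
    return f"{parent}/{document}"


print(document_namespace("0100", "Transcript", sector="RegAdmin"))
```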

The instance document schema must include a root element with an anonymously defined complexType, and all first-level children should be of types specified in the Core or Sector library. If modifications to these types are necessary, they must be declared as named types following the root element declaration, and these extended or restricted types must reside within the namespace of the instance document schema.

The primary objective is to minimize the number of locally declared types in instance document schemas and to transition shared content into sector and core libraries for greater efficiency.
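
To make the shape of an instance document schema more concrete, the sketch below embeds a skeleton schema in a Python string and inspects it with the standard library. Everything specific in the skeleton is an assumption made for illustration: the Student element, the core:PersonType type, the schemaLocation, and the value of the version attribute are hypothetical and are not taken from the XML Forum schemas.

```python
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical skeleton: root element with an anonymous complexType whose
# first-level child uses a type from the Core library.
SCHEMA_TEXT = """
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:core="http://schemas.PESCXML.org/0100/core"
            targetNamespace="http://schemas.PESCXML.org/0100/RegAdmin/Transcript"
            version="0100">
  <xsd:import namespace="http://schemas.PESCXML.org/0100/core"
              schemaLocation="../../core/coreMain.xsd"/>
  <xsd:element name="Transcript">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="Student" type="core:PersonType"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
"""

schema = ET.fromstring(SCHEMA_TEXT)
root_decl = schema.find(f"{{{XSD_NS}}}element")

# Anonymous type: no "type" attribute, but a nested complexType.
has_anonymous_type = (
    "type" not in root_decl.attrib
    and root_decl.find(f"{{{XSD_NS}}}complexType") is not None
)
child_types = [
    el.get("type")
    for el in root_decl.iter(f"{{{XSD_NS}}}element")
    if el is not root_decl
]

print(has_anonymous_type, child_types)
```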

Core Components

Metadata essential for XML syntax

To facilitate the creation of schemas, the metadata recorded in the data dictionary for each element SHALL include, but is not limited to, the following items.

• Cardinality rules (these are OPTIONAL for aggregates in the core dictionary)

• Element equivalence in other transaction(s)

• Data type (string, date, number, etc.)

• Minimum length (NOTE: May be specified in the Core and raised in the Sector.)

• Maximum length (NOTE: May be specified in the Core and lowered in the Sector.)

For core component analysis, a simplified list of datatypes must be used instead of the complete range provided by XML Schema. Each datatype has several optional attributes that can be specified as necessary for individual data items.

• Number - precision (number of decimal places), minimum value, maximum value

• String - minimum length, maximum length, and pattern (example: NNN-NN-NNNN for a Social Security Number). Patterns must be specified using the regular expression language defined by the W3C in XML Schema Part 2: Datatypes (Regular Expressions). Pattern facets may be specified in the Core and restricted in a Sector. If an element contains a member of a list, all potential list values MUST be specified (this resolves the issue with coded fields).
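
The NNN-NN-NNNN pattern mentioned above could be written as the regular expression \d{3}-\d{2}-\d{4}. The Python sketch below tests values against it; this is an approximation, since Python's re module is not the W3C regular expression language, but the two agree for this simple pattern. XML Schema patterns always match the whole value, which is why fullmatch is used. The function name is hypothetical.

```python
import re

# One possible rendering of NNN-NN-NNNN as a regular expression.
SSN_PATTERN = r"\d{3}-\d{2}-\d{4}"


def matches_pattern(value: str, pattern: str) -> bool:
    """Whole-value match, mirroring how a schema pattern facet is applied."""
    return re.fullmatch(pattern, value) is not None


for sample in ["123-45-6789", "123456789", "123-45-67890"]:
    print(sample, matches_pattern(sample, SSN_PATTERN))
```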

NOTE: If a string item is specified as mandatory in an aggregate item, it is RECOMMENDED to have a minimum length of 1.

When defining a data item, it is essential to assign it a specific type from the designated set. The attributes provided should be used to impose restrictions on the permissible values. If these attributes are absent from the data item's definition, only the general restrictions associated with the datatype will apply.
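
A short Python sketch can show how optional length attributes behave and how a Sector definition might tighten a Core one. The class and function names are hypothetical, the LastName item and its lengths are invented for illustration, and the rule that a Sector may only narrow the Core range is an assumption inferred from the notes on minimum and maximum length.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StringItem:
    """A string data item; absent attributes impose no restriction."""
    name: str
    min_length: Optional[int] = None
    max_length: Optional[int] = None

    def accepts(self, value: str) -> bool:
        if self.min_length is not None and len(value) < self.min_length:
            return False
        if self.max_length is not None and len(value) > self.max_length:
            return False
        return True


def is_valid_restriction(core: StringItem, sector: StringItem) -> bool:
    """Assumed rule: a Sector may raise the minimum and lower the maximum only."""
    core_min = core.min_length or 0
    sector_min = sector.min_length or 0
    if sector_min < core_min:
        return False
    if core.max_length is not None:
        if sector.max_length is None or sector.max_length > core.max_length:
            return False
    return True


core_item = StringItem("LastName", min_length=1, max_length=35)
print(is_valid_restriction(core_item, StringItem("LastName", 2, 30)))
print(is_valid_restriction(core_item, StringItem("LastName", 0, 40)))
```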

Aggregate data items are composed of two or more data items. For aggregates the following apply.

• The included elements MUST be specified in sequence.

• The core dictionary SHOULD specify the maximum universe of included elements.

• Sector dictionaries MAY restrict included elements, and MAY add additional elements.

In the core dictionary, cardinality should not be specified for the aggregates themselves. The cardinality of elements within these aggregates should reflect the broadest common range of usage, which reduces the need for modifications in sector or document schemas; typical defaults are 0..1 or 1..1. Cardinality can be specified for aggregates in sector dictionaries and business document schemas, but a business document schema must not expand the cardinality beyond what is defined in the sector dictionary.

Cardinality is defined as l..u, where l represents the minimum occurrences and u signifies the maximum occurrences. A wildcard "*" indicates no upper limit on occurrences. For instance, a cardinality of 1..1 indicates that the data item is mandatory and can occur only once, while 0..1 signifies that the data item is optional, with a maximum of one occurrence. Additionally, 0..* denotes that the data item is optional and can occur an unlimited number of times.
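
The cardinality notation maps naturally onto the minOccurs and maxOccurs attributes of XML Schema. The Python sketch below assumes the notation is written with two dots between the bounds (for example 0..1 or 0..*); the function names are hypothetical. It also includes a check for the rule that a business document schema must not expand a cardinality beyond what the sector dictionary defines.

```python
from typing import Dict, Optional, Tuple


def parse_cardinality(text: str) -> Tuple[int, Optional[int]]:
    """Parse "l..u" into (lower, upper); upper is None for the "*" wildcard."""
    lower_text, upper_text = text.split("..")
    lower = int(lower_text)
    upper = None if upper_text == "*" else int(upper_text)
    if lower < 0 or (upper is not None and upper < lower):
        raise ValueError(f"invalid cardinality: {text}")
    return lower, upper


def to_occurs(text: str) -> Dict[str, str]:
    """Translate a cardinality into XML Schema occurrence attributes."""
    lower, upper = parse_cardinality(text)
    return {
        "minOccurs": str(lower),
        "maxOccurs": "unbounded" if upper is None else str(upper),
    }


def fits_within(inner: str, outer: str) -> bool:
    """True if `inner` does not expand the range allowed by `outer`."""
    in_lo, in_hi = parse_cardinality(inner)
    out_lo, out_hi = parse_cardinality(outer)
    if in_lo < out_lo:
        return False
    return out_hi is None or (in_hi is not None and in_hi <= out_hi)


print(to_occurs("0..*"))
print(fits_within("1..1", "0..2"))
```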

NOTE: It is RECOMMENDED that judicious consideration be given before specifying an item in an aggregate as mandatory (minimum cardinality of 1).

The following recommendations are made for addressing issues regarding aggregates:

• Overriding the cardinality of an item in an aggregate on a per-document basis (example: a street address is mandatory in a reissue but is not mandatory in an adjustment).

It is advisable not to support this capability in Version 1, as it complicates the process of defining reusable aggregates. A suggested method is to define the street address with a cardinality of 0..2 within an "address" aggregate, while specifying the address with a cardinality of 1..1 in the reissue and 0..1 in the adjustment.

• Conditional use of items in an aggregate - As in the case of X12 EDI, these are the relational conditions often imposed on elements in segments (examples: use "a" or "b" but not both; if "a" then use "b", else use "c").

It is advised that conditionals be excluded in Version 1 to simplify the analysis and schema construction. Any conditional restrictions and edits that are not supported by the schemas will be the responsibility of the business applications utilizing the data.

Analysis spreadsheets SHOULD be organized as follows:

Aggregates refer to a collection of basic items or other aggregates arranged in a specific sequence. When a basic item is not reused, its complete specification may be included within the aggregate instead of being listed separately.

Columns SHOULD be organized as follows:

• Name of included item - If an aggregate is included within an aggregate, only the name of the aggregate SHOULD be listed, not the names of all of its children

• Cardinality - The number of times the included item can appear in the aggregate

• Minimum length - OPTIONAL (String Only)

• Maximum length - OPTIONAL (String Only)

• List of values - OPTIONAL (String Only)

• Minimum value - OPTIONAL (Number Only)

• Maximum value - OPTIONAL (Number Only)

• Comments - (example: Code sets or source)

A sector library spreadsheet categorizes items as new within the sector library, modified versions of core library items, or as items that are present in the core library but referenced in the sector library.

NOTE: Some reusable basic items MAY not have an aggregate name.

It is RECOMMENDED that the data dictionary use the core components as "abstract" items or types rather than the full set of all particular items (example: a general "party" is defined rather than specifying "student", "lender", or "guarantor" separately). This approach enhances reusability and simplifies maintenance.

 Short, two or three character codes SHALL be used where deemed appropriate by the Core Components team instead of longer, more fully described words or phrases.

Code lists maintained by the XML Forum must have permitted values documented in the data dictionary and in document schemas. However, schemas should not be designed for run-time validation of these codes against the permitted values. This approach is intended to prevent delays in implementation caused by administrative processes in updating schemas, as business applications typically handle their own code value validation, rendering schema checks unnecessary.
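
Since code values are expected to be checked by business applications rather than by the schemas, an application-side check might look like the Python sketch below. The code list shown is entirely hypothetical (the codes and their meanings are invented for illustration); the point is only that the permitted values live in application data that can be updated without a schema release.

```python
from typing import Dict

# Hypothetical code list; real values would come from the data dictionary.
ENROLLMENT_STATUS_CODES: Dict[str, str] = {
    "FT": "Full Time",
    "HT": "Half Time",
    "LT": "Less Than Half Time",
}


def is_permitted_code(code: str, permitted: Dict[str, str]) -> bool:
    """Application-level validation of a code against its permitted values."""
    return code in permitted


print(is_permitted_code("FT", ENROLLMENT_STATUS_CODES))
print(is_permitted_code("XX", ENROLLMENT_STATUS_CODES))
```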

The Core Components team will assess whether schema validation for externally maintained code lists is necessary, considering factors like stability, size, and copyright status. It is important to note that schemas must not incorporate or utilize schemas from other organizations for code list validation purposes.

Core Component Naming Conventions

XML Forum logical component names must adhere to ebXML core component naming conventions, as outlined in ISO 11179. Element names may be inspired by the IFX Forum's name fragment combinations for XML tags. When suitable matches are available, the IFX Forum's name fragments should be used for XML Forum element names. If no matching fragments exist, the XML Forum team tasked with the data dictionary will create the necessary fragments.

Best Practices

General Design Considerations

The XML Forum schemas are primarily designed for data interchange, although they can also accommodate schemas focused on presentation. While the main emphasis is on facilitating data exchange, these schemas may sometimes reflect traditional paper business documents. Consequently, the content model emphasizes semantics over presentation or structure, while incorporating aspects of both.

Schema vs DTD

The World Wide Web Consortium XML Schema Language recommendation SHALL be used to describe data instead of DTDs or BizTalk Schema (by Microsoft).

XML Schemas SHALL be used for the following reasons.

1. XML Schemas are supported by the W3C, ebXML, and other organizations.

2. XML Schemas support greater content and data type validation than DTDs.

3. XML Schemas are stable and reached W3C Recommendation status as of May 2, 2001.

4. XML Schemas support open-ended data models (allow vocabulary extensions and inheritance); DTDs do not.

5. XML Schemas provide a rich core of base data types; DTDs do not.

6. XML Schemas support data types and data type reuse via object-oriented-like mechanisms; DTDs provide only limited support.

7. XML Schemas are well-formed XML documents; DTDs require an understanding of the SGML syntax.

XML Schemas offer advanced content checking capabilities that are not available in DTDs, making them essential for software development. By utilizing XML Schemas, developers can significantly streamline their efforts in data validation and content management, ultimately enhancing the efficiency of their projects.

Tools like XML Spy (from Altova, http://www.xmlspy.com/) support XML Schemas. Users can create an initial XML Schema from a Document Type Definition (DTD) and maintain the content model. However, when converting an XML Schema back to a DTD, users cannot preserve the content model due to the more advanced type definitions offered by XML Schemas.

BizTalk Schema (framework) works only with the BizTalk Server product. It uses a proprietary schema syntax (XDR) that is incompatible with W3C XML Schemas. Microsoft has promised to eventually support W3C XML Schemas.

Use of Elements vs Attributes

In the majority of circumstances, elements SHALL be used in the design of XML Schemas that support data exchange in the PESC realm.

XML Forum Schemas facilitate data exchange for current and future transaction families and their associated data structures. By utilizing elements, these schemas define and express document structure through the inclusion of child elements, enabling effective validation of the document's architecture. Moreover, the element-based structure enhances extensibility, making it adaptable to future modifications and supportive of change.

Attributes serve to define intrinsic information about an element without being part of it, functioning similarly to metadata. They offer descriptive details such as ID numbers, URLs, types, and other references. However, attributes lack a hierarchical structure, meaning they cannot have child attributes or elements, their order is unregulated, and they do not contribute to overall structural organization.

In a schema document, all element and attribute forms are unqualified, meaning that imported element and attribute names from different namespaces must include a namespace prefix. However, local names within the schema's target namespace do not need a prefix. It's important to note that only the root element of an instance document is required to have a namespace prefix.

An XML document can effectively represent the structure of an office building with multiple floors and tenants, utilizing elements for the building (Building), floors (Floor), and tenants (Tenant). To distinguish each floor, an attribute called LevelNumber is employed, showcasing the appropriate use of elements and attributes in XML.

Example-1.xml - (Use of Elements vs Attributes)

Example-1.xsd - (Use of Elements vs Attributes)

While it is possible to represent the same structure using only elements, the document structure in Example-2.xml and Example-2.xsd is more intricate and slightly harder to interpret. For a clearer design, it is more effective to make LevelNumber an attribute of Floor instead of a child element of Floor.

Example-2.xml - (Use of Elements vs Attributes)

Example-2.xsd - (Use of Elements vs Attributes)
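
The original listings are not reproduced here. As a rough illustration of the two designs, the Python sketch below parses a building document written both ways, once with LevelNumber as an attribute of Floor and once with LevelNumber as a child element. The tenant names and the exact layout are hypothetical; only the element and attribute names (Building, Floor, Tenant, LevelNumber) come from the text.

```python
import xml.etree.ElementTree as ET

# LevelNumber as an attribute of Floor (the recommended design).
WITH_ATTRIBUTE = """
<Building>
  <Floor LevelNumber="1">
    <Tenant>Acme Books</Tenant>
  </Floor>
  <Floor LevelNumber="2">
    <Tenant>Example Travel</Tenant>
    <Tenant>Sample Dental</Tenant>
  </Floor>
</Building>
"""

# The same data using elements only.
ELEMENTS_ONLY = """
<Building>
  <Floor>
    <LevelNumber>1</LevelNumber>
    <Tenant>Acme Books</Tenant>
  </Floor>
  <Floor>
    <LevelNumber>2</LevelNumber>
    <Tenant>Example Travel</Tenant>
    <Tenant>Sample Dental</Tenant>
  </Floor>
</Building>
"""


def tenants_attribute_form(xml_text: str) -> dict:
    root = ET.fromstring(xml_text)
    return {
        floor.get("LevelNumber"): [t.text for t in floor.findall("Tenant")]
        for floor in root.findall("Floor")
    }


def tenants_element_form(xml_text: str) -> dict:
    root = ET.fromstring(xml_text)
    return {
        floor.findtext("LevelNumber"): [t.text for t in floor.findall("Tenant")]
        for floor in root.findall("Floor")
    }


print(tenants_attribute_form(WITH_ATTRIBUTE))
print(tenants_element_form(ELEMENTS_ONLY))
```

Both forms carry the same information; in the attribute form the children of Floor are all tenants, so the level number does not have to be separated from the tenant list when reading the document.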

Element vs Type

Core components are essential as they define types, from which elements are derived. This approach promotes the reuse of a single definition for an element or a group of elements, allowing for consistency across various documents. By utilizing type definitions, other element definitions can reference the same content, even if they have different names (refer to Example-3.xml and Example-3.xsd). This practice reduces ambiguity regarding the format and allowable contents of data items, effectively addressing the question of whether elements are identical or not.

Example-3.xml - (Element vs Type)

Example-3.xsd - (Element vs Type)
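
As a sketch of the type-based approach (not the original Example-3 listing), the Python snippet below embeds a small schema in which two differently named elements share one named type, then lists the elements that reference it. The names AddressType, MailingAddress, PermanentAddress, Street, and City are hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {"xsd": "http://www.w3.org/2001/XMLSchema"}

# One named type, reused by two element declarations.
SCHEMA_TEXT = """
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:complexType name="AddressType">
    <xsd:sequence>
      <xsd:element name="Street" type="xsd:string"/>
      <xsd:element name="City" type="xsd:string"/>
    </xsd:sequence>
  </xsd:complexType>
  <xsd:element name="MailingAddress" type="AddressType"/>
  <xsd:element name="PermanentAddress" type="AddressType"/>
</xsd:schema>
"""

schema = ET.fromstring(SCHEMA_TEXT)

# Top-level elements whose content is defined by AddressType.
address_elements = [
    el.get("name")
    for el in schema.findall("xsd:element", NS)
    if el.get("type") == "AddressType"
]

print(address_elements)
```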

New types can be created from existing types, allowing for the extension of element definitions within the original type, as demonstrated in Example-4.xml and Example-4.xsd. This capability of deriving types is particularly beneficial for organizations that have unique data item requirements that differ from those set by the PESC XML Forum.
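
Type derivation by extension can be sketched as follows. This is not the original Example-4 listing: the type and element names (AddressType, CampusAddressType, BuildingName) are hypothetical, and the helper function handles only the simple sequence-plus-extension shape used here.

```python
import xml.etree.ElementTree as ET
from typing import List

NS = {"xsd": "http://www.w3.org/2001/XMLSchema"}

# CampusAddressType extends AddressType with one additional element.
SCHEMA_TEXT = """
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:complexType name="AddressType">
    <xsd:sequence>
      <xsd:element name="Street" type="xsd:string"/>
      <xsd:element name="City" type="xsd:string"/>
    </xsd:sequence>
  </xsd:complexType>
  <xsd:complexType name="CampusAddressType">
    <xsd:complexContent>
      <xsd:extension base="AddressType">
        <xsd:sequence>
          <xsd:element name="BuildingName" type="xsd:string"/>
        </xsd:sequence>
      </xsd:extension>
    </xsd:complexContent>
  </xsd:complexType>
</xsd:schema>
"""


def element_names(schema: ET.Element, type_name: str) -> List[str]:
    """List the child elements of a type, including those inherited by extension."""
    ctype = schema.find(f"xsd:complexType[@name='{type_name}']", NS)
    extension = ctype.find("xsd:complexContent/xsd:extension", NS)
    if extension is not None:
        inherited = element_names(schema, extension.get("base"))
        added = [e.get("name") for e in extension.findall("xsd:sequence/xsd:element", NS)]
        return inherited + added
    return [e.get("name") for e in ctype.findall("xsd:sequence/xsd:element", NS)]


schema = ET.fromstring(SCHEMA_TEXT)
print(element_names(schema, "CampusAddressType"))
```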

When classified as a type, the requirements of an item can differ between Nillable and non-Nillable. The Nillable option allows for the indication that an element holds no value in a specific document instance, as illustrated in Example-5.xml and Example-5.xsd.
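
In an instance document, an element declared as nillable signals the absence of a value with the xsi:nil attribute from the XML Schema instance namespace. The Python sketch below detects such elements; it is not the original Example-5, and the Student, LastName, and MiddleName names are hypothetical. The corresponding schema declaration would carry nillable="true" on the element.

```python
import xml.etree.ElementTree as ET

XSI_NIL = "{http://www.w3.org/2001/XMLSchema-instance}nil"

# MiddleName is present in the document but explicitly carries no value.
INSTANCE_TEXT = """
<Student xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <LastName>Mack</LastName>
  <MiddleName xsi:nil="true"/>
</Student>
"""


def is_nil(element: ET.Element) -> bool:
    """xsi:nil is a boolean, so both "true" and "1" mean nil."""
    return element.get(XSI_NIL) in ("true", "1")


student = ET.fromstring(INSTANCE_TEXT)
nil_fields = [child.tag for child in student if is_nil(child)]
print(nil_fields)
```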

Hide vs Expose Namespaces

Schemas should be structured to conceal Namespaces, enhancing the readability and comprehension of XML instance documents, especially when definitions are imported from different namespaces. This approach simplifies the understanding of complex XML structures.

An XML document and its corresponding schema can be structured with hidden namespaces, as demonstrated in Example-7.xml, Example-7.xsd, Sector-7.xsd, and Core-7.xsd. By concealing namespaces, the intricacies of the document's framework are shifted to the schema level, simplifying the XML document's appearance while maintaining its underlying complexity.
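
The difference between exposed and hidden namespaces is easiest to see in the instance documents themselves. The Python sketch below is an illustration only and does not reproduce the original examples: the Enrollment document, its child elements, and the sec prefix are hypothetical. It counts how many elements in each instance are namespace-qualified.

```python
import xml.etree.ElementTree as ET
from typing import List

# Exposed: every element carries the prefix of the namespace it comes from.
EXPOSED = """
<doc:Enrollment xmlns:doc="http://schemas.PESCXML.org/0100/RegAdmin/Enrollment"
                xmlns:sec="http://schemas.PESCXML.org/0100/RegAdmin"
                xmlns:core="http://schemas.PESCXML.org/0100/core">
  <core:Student>
    <core:Name>John Mack</core:Name>
  </core:Student>
  <sec:Status>Full Time</sec:Status>
</doc:Enrollment>
"""

# Hidden: only the root element is qualified; local elements are unqualified.
HIDDEN = """
<doc:Enrollment xmlns:doc="http://schemas.PESCXML.org/0100/RegAdmin/Enrollment">
  <Student>
    <Name>John Mack</Name>
  </Student>
  <Status>Full Time</Status>
</doc:Enrollment>
"""


def qualified_tags(xml_text: str) -> List[str]:
    """ElementTree writes qualified names as "{namespace}local"."""
    root = ET.fromstring(xml_text)
    return [el.tag for el in root.iter() if el.tag.startswith("{")]


print(len(qualified_tags(EXPOSED)), len(qualified_tags(HIDDEN)))
```

In the hidden form a reader sees one prefix on the root element and plain names everywhere else, while the schema files keep track of which namespace each definition was imported from.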

Example-6.xml - (Hide vs Expose Namespaces)

Example-6.xsd - (Hide vs Expose Namespaces)

Sector-6.xsd - (Hide vs Expose Namespaces)

Core-6.xsd - (Hide vs Expose Namespaces)

Example-7.xml - (Hide vs Expose Namespaces)

Example-7.xsd - (Hide vs Expose Namespaces)

Title: XML Technical Specification for Higher Education
Advisor: Mike Rawlins
Institution: Postsecondary Electronic Standards Council
Category: Working Draft
Year of publication: 2002
City: Washington, DC
