Introducing the RSS 1.0 Structure

Excerpt from Pro PHP XML and Web Services, part 6 (pages 71-76)

An RSS 1.0 document should always include an XML declaration. Although the declaration is optional for XML documents in general, its use is recommended, and it is needed when maintaining backward compatibility with RSS 0.9. The document root of every RSS 1.0 document is an RDF element in the RDF syntax namespace, and that element also declares the RSS 1.0 namespace as the default namespace for the document. Although you can bind any prefix to the RDF syntax namespace, rdf is normally used; in fact, you must use rdf when maintaining backward compatibility with RSS 0.9. For example:

<?xml version="1.0"?>
<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns="http://purl.org/rss/1.0/">

Besides the required namespace declarations, the RDF element must contain the following structure:

• One and only one channel element

• An optional image element

• One or more item elements

• An optional textinput element

You can also declare additional namespaces on this element. These allow the document to be extended by using modules (explained later in the “Introducing Modules” section).
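
The structural rules above can be checked mechanically. The following is a minimal sketch, assuming Python's standard xml.etree.ElementTree module rather than the PHP tools used elsewhere in the book; the function name check_root_structure and the SAMPLE document are hypothetical and exist only for illustration. It counts the channel, image, and item children of the root and reports any count that breaks the rules.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RSS_NS = "http://purl.org/rss/1.0/"


def check_root_structure(xml_text):
    """Return a list of problems found in the top-level structure (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    problems = []
    # ElementTree writes namespaced names as {namespace-uri}local-name.
    if root.tag != f"{{{RDF_NS}}}RDF":
        problems.append("document root is not rdf:RDF")
    # Count only the direct children of the root.
    counts = {
        name: len(root.findall(f"{{{RSS_NS}}}{name}"))
        for name in ("channel", "image", "item")
    }
    if counts["channel"] != 1:
        problems.append("exactly one channel element is required")
    if counts["image"] > 1:
        problems.append("at most one image element is allowed")
    if counts["item"] < 1:
        problems.append("at least one item element is required")
    return problems


# Illustrative skeleton only; the required child elements are omitted for brevity.
SAMPLE = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://purl.org/rss/1.0/">
  <channel rdf:about="http://www.example.com/news.rss"/>
  <item rdf:about="http://www.example.com/pub/article1.html"/>
</rdf:RDF>"""

print(check_root_structure(SAMPLE))
```

The check looks only at direct children of the root, which mirrors the point made later that item elements belong to rdf:RDF and not to channel.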

channel Element

The channel element is the container that holds the information describing the channel. In my opinion, the structure of the channel element is more difficult to work with than its counterparts in the other RSS branch and in Atom because the channel is not self-contained. RSS 1.0 allows only a single channel in a feed, and the individual item elements are not children of the channel element but, rather, children of the RDF element.

An rdf:about attribute is required for a channel element. The value is a URI that identifies the channel and must be unique among all rdf:about attributes in the document. In this respect, it is similar to an XML ID type attribute. Typically, the URI is the URL of the home page for the site or the URL for the RSS feed, but it could be any URI you like as long as it differs from all other rdf:about values. For example:

<channel rdf:about="http://www.example.com/news.rss">
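
Because rdf:about values must be unique across the whole document, a duplicate check is a natural sanity test. This is a minimal sketch, assuming Python's standard library; duplicate_about_values is a hypothetical helper name, not part of any RSS toolkit.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Clark notation for the namespaced rdf:about attribute.
RDF_ABOUT = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}about"


def duplicate_about_values(xml_text):
    """Return the sorted rdf:about values that occur more than once in the document."""
    root = ET.fromstring(xml_text)
    # root.iter() walks every element, so the check spans channel, image,
    # item, and textinput alike.
    values = [el.get(RDF_ABOUT) for el in root.iter() if el.get(RDF_ABOUT) is not None]
    return sorted(value for value, count in Counter(values).items() if count > 1)
```

An empty list means every rdf:about value in the document is distinct.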

A channel element needs to contain one title element, one link element, one description element, and one items element. If an image and/or textinput element exists as a child of the rdf:RDF element, the corresponding image and/or textinput element is also required within the channel element. It is invalid for an image or textinput element to exist as a child of the rdf:RDF element without the corresponding child element within the channel element, and the reverse is also invalid.

title

The title element defines the title of the channel. It is a required element that contains #PCDATA with a suggested maximum length of 40 characters. The length can exceed 40 characters, but doing so breaks backward compatibility with RSS 0.9. For example:

<title>Example RSS News</title>

link

The link element, which is required, defines the URL for an HTML page to which the title element links. Normally this is the site’s home page or a news page on the site. The only valid protocols for the URL are HTTP, HTTPS, and FTP (specified by http, https, and ftp, respectively). For example:

<link>http://www.example.com/</link>
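
The protocol restriction is easy to test before a URL is written into a feed. The sketch below assumes Python's standard urllib.parse module; has_allowed_scheme and ALLOWED_SCHEMES are hypothetical names chosen for this illustration.

```python
from urllib.parse import urlsplit

# The three protocols the text names as valid for RSS 1.0 URLs.
ALLOWED_SCHEMES = {"http", "https", "ftp"}


def has_allowed_scheme(url):
    """Return True when the URL's scheme is http, https, or ftp."""
    # urlsplit returns an empty scheme for values such as "www.example.com".
    return urlsplit(url.strip()).scheme.lower() in ALLOWED_SCHEMES


print(has_allowed_scheme("http://www.example.com/"))      # prints True
print(has_allowed_scheme("mailto:someone@example.com"))   # prints False
```

The same check applies to every link and url element described in this section, not only to the channel's link.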

description

The description element describes the channel. It is a required element containing #PCDATA with a suggested maximum length of 500 characters. Again, although only a suggestion, lengths of more than 500 characters break compatibility with RSS 0.9 parsers. For example:

<description>This is an example RSS feed from www.example.com.</description>
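
If RSS 0.9 compatibility matters, the suggested lengths for the channel's title and description can be checked before the feed is generated. This is a minimal sketch in Python; the dictionary RSS09_CHANNEL_LIMITS and the function fields_over_limit are hypothetical names, and the limits are simply the ones quoted in the text above.

```python
# Suggested maximum lengths for RSS 0.9 compatibility, as quoted in the text.
RSS09_CHANNEL_LIMITS = {"title": 40, "description": 500}


def fields_over_limit(fields, limits=RSS09_CHANNEL_LIMITS):
    """Return {field name: actual length} for every field that exceeds its limit."""
    return {
        name: len(text.strip())
        for name, text in fields.items()
        if name in limits and len(text.strip()) > limits[name]
    }


channel = {
    "title": "Example RSS News",
    "description": "This is an example RSS feed from www.example.com.",
}
print(fields_over_limit(channel))  # prints {}
```

Passing a different limits dictionary lets the same helper cover other elements with their own suggested lengths.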

items

The items element, also required, acts as a table of contents for the child item elements of the rdf:RDF element. It defines the order in which the item elements should be processed when parsed. For example:

<items>
  <rdf:Seq>
    <rdf:li resource="http://www.example.com/pub/article1.html" />
    <rdf:li resource="http://www.example.com/pub/article2.html" />
  </rdf:Seq>
</items>

The child rdf:Seq element denotes that its child rdf:li elements are to be sequenced in the order in which they appear. Because RSS 1.0 requires at least one item element, the items element is required, and it must contain at least one rdf:li element.

The rdf:li elements are associations to the document’s item elements. The resource attribute corresponds to the item element’s rdf:about attribute and must contain the same value as the corresponding rdf:about attribute. If you consider the rdf:about attributes to be XML ID type attributes, then the resource attribute would be an IDREF type attribute. It points to and locates the item element in the document.
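
The IDREF-like relationship between rdf:li and item can be verified by comparing the two sets of URIs. The following is a minimal sketch, assuming Python's standard xml.etree.ElementTree module; sequence_mismatches is a hypothetical helper name. As an assumption of this sketch, it accepts the attribute both unprefixed (resource), as in the example above, and namespaced (rdf:resource).

```python
import xml.etree.ElementTree as ET

RDF = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}"
RSS = "{http://purl.org/rss/1.0/}"


def sequence_mismatches(xml_text):
    """Return (dangling, unlisted): rdf:li values with no matching item,
    and item rdf:about values that no rdf:li refers to."""
    root = ET.fromstring(xml_text)
    listed = set()
    for li in root.findall(f"{RSS}channel/{RSS}items/{RDF}Seq/{RDF}li"):
        # Accept both the unprefixed and the namespaced attribute form.
        value = li.get("resource") or li.get(f"{RDF}resource")
        if value:
            listed.add(value)
    present = {
        item.get(f"{RDF}about")
        for item in root.findall(f"{RSS}item")
        if item.get(f"{RDF}about")
    }
    return sorted(listed - present), sorted(present - listed)
```

Two empty lists mean every rdf:li points at an existing item and every item is listed in the sequence.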

image

The image element is optional and associates the image element that is a child of the rdf:RDF element with the channel. It is required only if an image element exists as a child of the rdf:RDF element. Similar to how rdf:li elements work, the image element contains an rdf:resource attribute. For some reason, the resource attribute of rdf:li was not namespaced in the RSS 1.0 specification, but it must be namespaced on the image element. The value must be identical to the rdf:about value of the master image element so that the image can be located within the document. For example:

<image rdf:resource="http://www.example.com/images/rss_channel.gif" />

This element must always be an empty element.

textinput

The textinput element associates an optional child textinput element of the rdf:RDF element with the channel. It is required only when such a child element exists in the document. Being an associative element, it is an empty element with only an rdf:resource attribute. Again, the value of the rdf:resource attribute must be identical to the value of the rdf:about attribute of the master textinput element. For example:

<textinput rdf:resource="http://www.example.com" />

This element was not used in Listing 14-1 but is written as demonstrated here.
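
The pairing rule for image and textinput (present in both places or in neither, with matching URIs) can be expressed as a short check. This is a sketch under the assumption that Python's standard library is acceptable for illustration; association_problems is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

RDF = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}"
RSS = "{http://purl.org/rss/1.0/}"


def association_problems(xml_text):
    """Report image/textinput elements whose channel reference and master element disagree."""
    root = ET.fromstring(xml_text)
    channel = root.find(f"{RSS}channel")
    problems = []
    for name in ("image", "textinput"):
        master = root.find(f"{RSS}{name}")  # child of rdf:RDF
        ref = channel.find(f"{RSS}{name}") if channel is not None else None
        if (master is None) != (ref is None):
            problems.append(f"{name}: present in only one of rdf:RDF and channel")
        elif master is not None and master.get(f"{RDF}about") != ref.get(f"{RDF}resource"):
            problems.append(f"{name}: rdf:resource does not match rdf:about")
    return problems
```

The explicit comparisons with None matter here because an ElementTree element without children is falsy, and the empty associative elements inside channel are exactly that.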

image Element

An image element, which is a child of the rdf:RDF element, associates an image with an HTML rendering of the channel. It is not required that you use an image with a feed, but when supplied, the associated image element within the channel element must also exist. This element requires an rdf:about attribute whose value is a URL locating the physical image. Like all URLs in the RSS 1.0 specification, the protocol must be HTTP, HTTPS, or FTP (specified by http, https, or ftp, respectively).

The format of the physical image has no restrictions (though it should be a common format for the greatest Web browser support). The height and width depend upon the RSS version with which you are trying to remain compatible. The RSS 0.91 specification allows an image width from 1 to 144 pixels and a height from 1 to 400 pixels. RSS 0.9, however, dictates an image of exactly 88 × 31 pixels. For example:

<image rdf:about="http://www.example.com/images/rss_channel.gif">
  <title>Example RSS News Feed</title>
  <link>http://www.example.com</link>
  <url>http://www.example.com/images/rss_channel.gif</url>
</image>

The image element, when used, cannot be an empty element. It must contain a title element, a url element, and a link element.

title

The title element supplies the alternate text for the image when the channel is rendered as HTML. Its content becomes the value for the image’s alt attribute in the rendered HTML. Following the same format as the other title elements within the document, its content contains #PCDATA and has a suggested maximum length of 40 characters. Again, the suggested length is required only when maintaining backward compatibility with RSS 0.9. For example:

<title>Example RSS News Feed</title>

url

The url element specifies the URL to the physical location of the image. When the channel is rendered as HTML, the contents of this element become the value for the image’s src attribute in the rendered HTML. It is important to remember that only HTTP, HTTPS, and FTP (specified by http, https, and ftp, respectively) are valid protocols for the URL. When maintaining compatibility with RSS 0.9, the length of the content can be no greater than 500 characters. For example:

<url>http://www.example.com/images/rss_channel.gif</url>

link

The link element specifies the URL to which the image should link when the channel is rendered as HTML. The contents could become the value for the href attribute of an anchor tag surrounding the rendered image tag when displayed as HTML. The value is typically the site’s home page or a news page and, to maintain compatibility with RSS 0.9, must have a length no greater than 500 characters. The URL must also use only HTTP, HTTPS, or FTP (specified by http, https, or ftp, respectively). For example:

<link>http://www.example.com</link>
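
To make the mapping from the three child elements to HTML concrete, here is one possible rendering. It is a sketch only, assuming Python's standard html module; render_image is a hypothetical function, and a real reader may produce different markup.

```python
from html import escape


def render_image(title, url, link):
    """Render an RSS image as an HTML anchor wrapping an img tag.

    title -> alt attribute, url -> src attribute, link -> href attribute.
    """
    return '<a href="{href}"><img src="{src}" alt="{alt}"></a>'.format(
        href=escape(link, quote=True),
        src=escape(url, quote=True),
        alt=escape(title, quote=True),
    )


print(render_image(
    "Example RSS News Feed",
    "http://www.example.com/images/rss_channel.gif",
    "http://www.example.com",
))
```

Escaping each value keeps a stray quote or ampersand in the feed from breaking the generated attribute values.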

item Element

The master item elements, which are those that are children of the rdf:RDF element, contain the specific information for a block of content. This content could be anything identifiable by a URI, such as news information, a job listing, or a blog entry. A minimum of one item element is required, and when maintaining compatibility with RSS 0.9 or 0.91, a document must contain no more than 15 item elements.

Each item element must contain a unique rdf:about attribute. The attribute must be unique within the entire document and not just among the different item elements. The value for this attribute is a URL to the specific content. For example, if the particular item element were based on a blog entry, the attribute would contain the URL to the specific entry within the blog. This value must also be identical to the content of the child link element, as well as to the value of the resource attribute from the rdf:li element used within the channel element. For example:

<item rdf:about="http://www.example.com/pub/article1.html">
  <title>Article 1</title>
  <link>http://www.example.com/pub/article1.html</link>
  <description>
    This is the description for article 1.
  </description>
</item>

All item elements must contain a title element and a link element. The description element is optional, but it is common to see one within an item element.
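
The per-item rules (a title, a link, and a link that matches rdf:about) lend themselves to a simple report. This is a minimal sketch, assuming Python's standard xml.etree.ElementTree module; item_problems is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

RDF_ABOUT = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}about"
RSS = "{http://purl.org/rss/1.0/}"


def item_problems(xml_text):
    """Map each item's rdf:about value to a list of problems (empty when none)."""
    report = {}
    for item in ET.fromstring(xml_text).findall(f"{RSS}item"):
        about = item.get(RDF_ABOUT)
        title = item.findtext(f"{RSS}title")  # None when the element is absent
        link = item.findtext(f"{RSS}link")
        problems = []
        if not (title or "").strip():
            problems.append("missing title")
        if link is None:
            problems.append("missing link")
        elif link.strip() != about:
            problems.append("link differs from rdf:about")
        report[about] = problems
    return report
```

The description element is deliberately not checked because it is optional for an item.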

title

The title element contains the title of the item. Using a blog entry as an example, the content for this item would be the same as the title used for the entry within the blog. Its format is #PCDATA with a suggested maximum length, for RSS 0.9 compatibility, of 100 characters. For example:

<title>Article 1</title>

link

The link element contains the URL to a specific item. In the case of a blog entry, the content of this element is the direct URL to the specific blog entry to which the item refers. The rules for this element are the same as all other link and url elements from the RSS 1.0 specification. The URL protocol must be HTTP, HTTPS, or FTP (specified by http, https, or ftp, respectively), and the suggested maximum length is 500 characters. For example:

<link>http://www.example.com/pub/article1.html</link>

description

The description element provides a brief description or abstract of the content to which the item is referring. It consists of #PCDATA with a suggested maximum length of 500 characters. This element is optional, but it’s almost always used. For example:

<description>This is the description for article 1.</description>

Although not mentioned in the specification, it is generally acceptable to use HTML in a description. Early in RSS’s history, plain text was considered the only valid content. However, the original UserLand RSS reader never filtered out HTML, and developers began using it within content. This pretty much became the norm and is where RSS 1.0 stands today; all readers are generally expected to be able to handle HTML. You need to consider, though, that because RSS is XML, any markup characters in the content must be properly escaped as entities. It is also common to see developers wrap the content in CDATA sections instead.
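
The two ways of carrying HTML in a description, entity escaping and a CDATA section, yield the same text once parsed. The sketch below illustrates this, assuming Python's standard library; as_escaped and as_cdata are hypothetical helper names, and the description elements are shown without a namespace for brevity.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def as_escaped(html_text):
    """Wrap HTML in a description element, escaping &, <, and > as entities."""
    return "<description>" + escape(html_text) + "</description>"


def as_cdata(html_text):
    """Wrap HTML in a description element using a CDATA section."""
    if "]]>" in html_text:
        # A CDATA section cannot contain its own terminator.
        raise ValueError("content contains the CDATA terminator ]]>")
    return "<description><![CDATA[" + html_text + "]]></description>"


html_body = '<p>Read <a href="/a?x=1&y=2">more</a></p>'

# Both encodings parse back to the original HTML string.
print(ET.fromstring(as_escaped(html_body)).text == html_body)  # prints True
print(ET.fromstring(as_cdata(html_body)).text == html_body)    # prints True
```

Which form to emit is largely a matter of taste; escaping works for any content, while CDATA keeps the HTML readable in the raw feed.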

textinput Element

The textinput element generates a form, such as a search box or subscription form, that would use the GET method when the feed is rendered into HTML. This is an optional element, and I have yet to come across it in any RSS feeds. It most likely exists to maintain compatibility with RSS 0.9. Being a child of the rdf:RDF element, it contains an rdf:about attribute that has a unique value corresponding to the location where the form should be submitted. This value must not only be identical to the content of the child link element but also be identical to the value of the rdf:resource attribute for the textinput element contained within the channel element. For example:

<textinput rdf:about="http://www.example.com/search.php">
  <title>Channel Search</title>
  <name>str_search</name>
  <description>Search all information within channel</description>
  <link>http://www.example.com/search.php</link>
</textinput>

When this element is used, it needs to contain title, description, name, and link elements.

title

The title element provides a descriptive title for the textinput field. Its content is #PCDATA with a suggested maximum length of 40 characters. For example:

<title>Channel Search</title>

description

The description element briefly describes the purpose of the form. Its content is #PCDATA with a suggested maximum length of 100 characters. For example:

<description>Search all information within channel</description>

name

The name element defines the name attribute of the textinput field when rendered as HTML. This means that when the form is submitted, the value of the name element will be the name of the parameter passed to the site. The content of this element is #PCDATA with a maximum length of 500 characters. For example:

<name>str_search</name>

link

The link element defines the URL to which the form will be submitted using the GET method. Its content is #PCDATA with a suggested maximum length of 500 characters. Following the URL standards in the RSS 1.0 specification, the protocol can be HTTP, HTTPS, or FTP (specified by http, https, or ftp, respectively). For example:

<link>http://www.example.com/search.php</link>
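
Taken together, the name and link elements determine the request that a rendered form would send. The following sketch, assuming Python's standard urllib.parse module, builds that GET URL; search_url is a hypothetical function, and the search term is made-up sample input.

```python
from urllib.parse import urlencode


def search_url(link, name, value):
    """Build the GET URL a rendered textinput form would request.

    link  -> the form's action (content of the link element)
    name  -> the parameter name (content of the name element)
    value -> whatever the user typed into the field
    """
    # Append to an existing query string if the link already has one.
    separator = "&" if "?" in link else "?"
    return link + separator + urlencode({name: value})


print(search_url("http://www.example.com/search.php", "str_search", "rss feeds"))
# prints http://www.example.com/search.php?str_search=rss+feeds
```

The urlencode call takes care of encoding spaces and other reserved characters in the submitted value.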
