All RSS 2.0 documents start with a root rss element. The XML declaration is optional, but as always, it is good practice to include it in the document. The rss element encapsulates the information for the feed and uses a required version attribute to denote the RSS version being used. This element contains a single channel element that defines the channel and includes the feed content.
It may seem odd that the document contains a channel element even though the document can contain only a single channel, rather than having the channel’s contents live directly as children of the rss element. I assume this is because of the structure RSS had in versions 0.9 and 0.91: to keep compatibility, the element was kept within the structure. This is only a guess, but otherwise it would be just as intuitive for all children of an rss element to pertain to the channel.
The channel element contains all the information for the feed, other than the version of RSS being used. Its structure requires the elements in Table 14-3 and allows for additional information using the optional elements listed in Table 14-4.
Table 14-3. Required channel Elements
Element | Description | Example
title | The name of the channel. If the feed contains information from your Web site, the title should be the same as that of the Web site or specific page name from the Web site. | “Example RSS News”
link | The URL to the Web site or specific page to which the feed refers. | “http://www.example.com”
description | The description of the channel. | “This is an example RSS feed from www.example.com.”
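The document structure described so far (an rss root with a required version attribute, one channel, and the three required channel elements) can be sketched as follows. This is a minimal illustration that assumes Python's standard xml.etree.ElementTree module; the function name build_minimal_feed is hypothetical and not part of any RSS library.

```python
import xml.etree.ElementTree as ET


def build_minimal_feed(title: str, link: str, description: str) -> str:
    """Return an RSS 2.0 document holding only the required channel elements."""
    # The version attribute on the root rss element is required.
    rss = ET.Element("rss", {"version": "2.0"})
    # An rss element contains exactly one channel element.
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = description
    # encoding="unicode" returns a str and omits the optional XML declaration.
    return ET.tostring(rss, encoding="unicode")


feed_xml = build_minimal_feed(
    "Example RSS News",
    "http://www.example.com",
    "This is an example RSS feed from www.example.com.",
)
print(feed_xml)
```

The optional elements from Table 14-4 would be added to the channel in the same way, as further children alongside title, link, and description.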
Table 14-4. Optional channel Elements
Element | Description | Example
language | The language in which the channel is written. The value should be a language code as specified by Netscape (http://blogs.law.harvard.edu/tech/stories/storyReader$15) or a value allowed by the W3C, as defined in RFC 1766 (http://www.ietf.org/rfc/rfc1766.txt). | en-us
copyright | Copyright notice for content in the channel. | Copyright 2005, Example Holder
managingEditor | Email address for the party responsible for editing the content. | editor@example.com (Managing Editor)
webMaster | Email address for the party responsible for handling technical issues. | webmaster@example.com (Webmaster)
pubDate | The publication date for the content in the channel. | Mon, 03 Oct 2005 13:00:01 GMT
lastBuildDate | The last date and time the content changed. | Mon, 03 Oct 2005 13:15:26 GMT
category | The category (or categories) to which this channel belongs. This element follows the same rules as the category element for an item. | <category>PHP</category>
generator | A string indicating the program used to generate the channel. | My PHP RSS Generator v1.0
docs | A URL to the documentation for the RSS format used. Unless you have written your own documentation for RSS 2.0, you should use http://blogs.law.harvard.edu/tech/rss. | http://blogs.law.harvard.edu/tech/rss
cloud | Specifies a Web service implemented in HTTP-POST, XML-RPC, or SOAP 1.1, which supports the rssCloud interface, allowing a process to register with and be notified of updates. | <cloud domain="soap.example.com" port="80" path="/rsscloud.php" registerProcedure="rssNotify" protocol="soap"/>
ttl | The number of minutes a channel can be cached before refreshing from the source (time to live). | <ttl>60</ttl>
image | Specifies a GIF, JPEG, or PNG image to display with the channel. This element contains child elements that define the image. | See the “image Element” section for an example.
rating | The Platform for Internet Content Selection (PICS) rating for the channel. PICS is a W3C specification, found at http://www.w3.org/PICS/, for rating content so users can control the type of material they are allowed to access. | <rating>(PICS-1.1 "http://www.classify.org/safesurf/" l r (SS~~000 1))</rating>
textInput | Used to create a text input box to display with the channel. | See the “textInput Element” section for an example.
skipHours | Contains up to 24 <hour> child elements with values from 0 to 23, indicating when the channel should not be read. This element is rarely used within a feed but is still valid. From various statistics I could find, fewer than 2 percent of feeds utilize this element. | <skipHours><hour>0</hour><hour>12</hour></skipHours> This would ask an aggregator to not access the channel from noon to 1 p.m. GMT or from midnight to 1 a.m. GMT.
skipDays | Contains up to seven <day> child elements with values of Monday, Tuesday, Wednesday, Thursday, Friday, Saturday, or Sunday, indicating days the channel should not be read. This element seems to be used even less than skipHours. The best usage statistics I could find for this element came in at less than 0.2 percent. | <skipDays><day>Thursday</day></skipDays> Do not read channel on Thursdays.
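To make the skipHours and skipDays rows more concrete, here is a hedged sketch of how an aggregator might honor them before fetching a feed. It assumes Python's standard library; the function name should_skip and the sample channel fragment are illustrative only, and real aggregators may apply these hints differently or ignore them.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

# Sample channel fragment combining the two examples from Table 14-4.
CHANNEL_XML = """
<channel>
  <skipHours><hour>0</hour><hour>12</hour></skipHours>
  <skipDays><day>Thursday</day></skipDays>
</channel>
"""

# Spelled out explicitly so the result does not depend on the system locale.
DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def should_skip(channel: ET.Element, when: datetime) -> bool:
    """Return True if the channel asks not to be read at the given moment.

    `when` must be a timezone-aware datetime; it is converted to GMT (UTC)
    because skipHours values are hours in GMT.
    """
    when = when.astimezone(timezone.utc)
    skip_hours = {int(hour.text) for hour in channel.findall("skipHours/hour")}
    skip_days = {day.text.strip() for day in channel.findall("skipDays/day")}
    return when.hour in skip_hours or DAY_NAMES[when.weekday()] in skip_days


channel = ET.fromstring(CHANNEL_XML)
# Monday, 03 Oct 2005, 12:30 GMT falls inside the skipped noon hour.
print(should_skip(channel, datetime(2005, 10, 3, 12, 30, tzinfo=timezone.utc)))
```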
A few of the optional elements require some additional explanation because they are more than simple text content containers. These elements are image, cloud, and textInput.
I’ll explain the category element in more detail in the context of an item in the “item Element” section. The rules for using this element as a child of channel are the same as when used as a child of an item element.
image Element
The image element defines the image associated with the channel and allows the image to be rendered when the feed is rendered. You can use only GIF, JPEG, or PNG images for a channel.
When using an image element, you also need three child elements:
title: The title of the image. The value of this element is used as the value for the alt attribute on the img tag when rendered as HTML. The value is normally the same as the value of the channel’s title element.
url: The URL of the image. The value of this element is used as the value for the src attribute of the img tag when rendered as HTML.
link: The URL of the site or Web page to which the image should link. You would use this value to create an anchor tag with the value of the href attribute being the value of the link element. In practice, this value is typically the same as the channel’s child link element.
An image element can also define three optional child elements to provide more information for the image:
height: The height of the image in pixels. The value can be an integer from 1 to 400. When omitted, the default value of 31 is used for the image’s height.
width: The width of the image in pixels. The value can be an integer from 1 to 144. When omitted, the default value of 88 is used for the image’s width.
description: A description of the content to which the link element points. The value of this element is used as the value for the title attribute of the link that surrounds the rendered image.
The following structure uses the image element from the RSS document in Listing 14-2 and adds the optional elements to define a GIF with the dimensions 100×35 that will link to http://www.example.com/ when selected in the rendered HTML:
<image>
  <title>Example RSS News Feed</title>
  <link>http://www.example.com</link>
  <url>http://www.example.com/images/rss_channel.gif</url>
  <width>100</width>
  <height>35</height>
  <description>Example RSS News Feed</description>
</image>
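The mapping from image child elements to HTML attributes described in this section can be sketched as follows. This is one possible rendering, assuming Python's standard library; the function name render_image is hypothetical, and clamping oversized values to the 144 and 400 pixel limits is an assumption about how a renderer might react, not something the specification prescribes.

```python
import xml.etree.ElementTree as ET
from html import escape

IMAGE_XML = """
<image>
  <title>Example RSS News Feed</title>
  <link>http://www.example.com</link>
  <url>http://www.example.com/images/rss_channel.gif</url>
  <width>100</width>
  <height>35</height>
  <description>Example RSS News Feed</description>
</image>
"""


def render_image(image: ET.Element) -> str:
    """Render an RSS image element as an HTML link wrapping an img tag."""
    title = image.findtext("title", default="")
    url = image.findtext("url", default="")
    link = image.findtext("link", default="")
    # Defaults when omitted: 88 wide, 31 high. Clamping to the documented
    # maximums (144 and 400) is this sketch's own choice.
    width = min(int(image.findtext("width", default="88")), 144)
    height = min(int(image.findtext("height", default="31")), 400)
    description = image.findtext("description")
    # description, when present, becomes the title attribute of the link.
    title_attr = f' title="{escape(description)}"' if description else ""
    return (
        f'<a href="{escape(link)}"{title_attr}>'
        f'<img src="{escape(url)}" alt="{escape(title)}" '
        f'width="{width}" height="{height}"></a>'
    )


rendered = render_image(ET.fromstring(IMAGE_XML))
print(rendered)
```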
textInput Element
The textInput element works in the same manner as that from the RSS 1.0 specification (though note the difference in case for the element name). The textInput element generates a form, such as a search box or subscription form, that would use the GET method when the feed is rendered into HTML. When using it, you must also use the four required child elements:
title: The label of the Submit button in the text input area.
description: Explanation of the text input area.
name: The name of the text object in the text input area. The value of this element is used as the parameter name passed to the processing script.
link: The URL of the script that processes the request upon submission.
For example:
<textInput>
  <title>Channel Search</title>
  <name>str_search</name>
  <description>Search all information within channel</description>
  <link>http://www.example.com/search.php</link>
</textInput>
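As an illustration of how the four child elements map onto a GET form, the following sketch renders the example above as HTML. It assumes Python's standard library; render_text_input is a hypothetical helper, and the exact markup an aggregator produces is its own choice.

```python
import xml.etree.ElementTree as ET
from html import escape

TEXT_INPUT_XML = """
<textInput>
  <title>Channel Search</title>
  <name>str_search</name>
  <description>Search all information within channel</description>
  <link>http://www.example.com/search.php</link>
</textInput>
"""


def render_text_input(text_input: ET.Element) -> str:
    """Render an RSS textInput element as an HTML form that submits via GET."""
    title = text_input.findtext("title", default="")              # Submit button label
    description = text_input.findtext("description", default="")  # explanatory text
    name = text_input.findtext("name", default="")                # parameter name
    link = text_input.findtext("link", default="")                # processing script URL
    return (
        f'<form method="get" action="{escape(link)}">'
        f'<label>{escape(description)} '
        f'<input type="text" name="{escape(name)}"></label>'
        f'<input type="submit" value="{escape(title)}">'
        f'</form>'
    )


form_html = render_text_input(ET.fromstring(TEXT_INPUT_XML))
print(form_html)
```

Submitting this form with the text "php" would request http://www.example.com/search.php?str_search=php, which is how the name element ends up as the parameter passed to the processing script.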
item Element
The item elements contain the actual content for the feed. Unlike RSS 1.0, it is legal to have a feed without any items, though the feed would not serve much purpose in that case. Also, unlike RSS 1.0, these elements are children of the channel element rather than just pointers to items. Although the basic structure is similar to that used in RSS 1.0, additional optional elements, defined by the RSS 2.0 specification, can further describe the item, so you do not have to extend the structure as you must in RSS 1.0.
title
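The title element contains the title of the item. The RSS 2.0 specification makes every child of an item optional as long as either a title or a description is present, but in practice a title is almost always supplied. Using a blog entry as an example, the content for this element would be the same as the title used for the entry within the blog. Other than containing character data, this element has no further restrictions. For example: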
<title>Article 1</title>
link
The link element contains the URL to a specific item. The specification lists it as optional, although most feeds include it. In the case of a blog entry, the content of this element would be the direct URL to the specific blog entry to which the item refers. This element has no further restrictions for the content. RSS 2.0 does not restrict protocols the way RSS 1.0 does. For example:
<link>http://www.example.com/pub/article1.html</link>
description
The description element provides a brief description or abstract of the content to which the item is referring. Because the specification requires an item to carry at least a title or a description, this element is mandatory only when the title is omitted. For example:
<description>This is the description for article 1.</description>
author
The author element is optional and is used to identify the author of the current item. The content contains the email address of the author. This element is useful when the feed contains items from many different authors rather than from a single source. For example:
<author>rrichards@php.net (Rob Richards)</author>
category
The category element is an optional child element for both an item element and a channel element. It associates one or more categories with either an item or a channel, depending upon the context. It has one optional attribute, domain, whose value identifies a categorization taxonomy. The value of the element is a slash-separated string that identifies a hierarchical location in the indicated taxonomy. For example:
<category>PHP</category>
Here’s another example (which has been split into three lines for readability):
<category domain="http://www.dmoz.org">
Computers/Programming/Languages/PHP/
</category>
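A consumer of the feed might split each category value into its taxonomy path, as in the following sketch. It assumes Python's standard library; read_categories is a hypothetical helper, and dropping empty path segments (such as the one produced by a trailing slash) is a choice made here for convenience.

```python
import xml.etree.ElementTree as ET

ITEM_XML = """
<item>
  <category>PHP</category>
  <category domain="http://www.dmoz.org">
    Computers/Programming/Languages/PHP/
  </category>
</item>
"""


def read_categories(parent: ET.Element) -> list:
    """Return (domain, path) pairs for each category child of an item or channel.

    domain is None when the optional attribute is absent; path is the
    slash-separated value split into its hierarchy levels.
    """
    categories = []
    for category in parent.findall("category"):
        text = (category.text or "").strip()
        path = [part for part in text.split("/") if part]
        categories.append((category.get("domain"), path))
    return categories


for domain, path in read_categories(ET.fromstring(ITEM_XML)):
    print(domain, path)
```

Because the rules are the same for both parents, the same function works whether it is handed an item element or a channel element.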
comments
The comments element includes the URL to a comments page for the particular item. For example, most blog entries contain a section for user comments. The contents of the comments element for this item would be the URL pointing to the user comment page or section. For example:
<comments>http://www.example.com/2005/10/01/article1.html#comments</comments>
enclosure
An enclosure element is optionally used to locate and describe some type of content associated with the current item. For example, an item for a news entry could refer to a multimedia clip in MPEG format that shows the actual footage of the event. You could use an enclosure element so that the video could be retrieved along with the feed. This way, if feed retrieval were automated, you could retrieve the video clip with the feed, allowing it to be stored and viewed on a local machine rather than streaming it across the Internet.
This element was added to the RSS 2.0 specification largely to allow for the syndication of audio files, eventually termed podcasts. Its structure consists of an empty element with three required attributes:
url: An HTTP URL locating the enclosure
length: The size of the enclosure in bytes
type: The MIME type of the enclosure
For example:
<enclosure url="http://www.example.com/news/article1.mpg"
length="9312164" type="video/mpeg" />
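Before downloading an enclosure, an aggregator would typically read the three required attributes and check that none is missing. The following is a minimal sketch assuming Python's standard library; read_enclosure is a hypothetical helper, and raising ValueError for an incomplete enclosure is a design choice of this example rather than required behavior.

```python
import xml.etree.ElementTree as ET

ITEM_XML = """
<item>
  <title>Article 1</title>
  <enclosure url="http://www.example.com/news/article1.mpg"
             length="9312164" type="video/mpeg" />
</item>
"""

REQUIRED_ATTRIBUTES = ("url", "length", "type")


def read_enclosure(item: ET.Element):
    """Return the enclosure attributes of an item as a dict, or None if absent."""
    enclosure = item.find("enclosure")
    if enclosure is None:
        return None
    missing = [name for name in REQUIRED_ATTRIBUTES if enclosure.get(name) is None]
    if missing:
        raise ValueError("enclosure is missing attributes: " + ", ".join(missing))
    return {
        "url": enclosure.get("url"),
        "length": int(enclosure.get("length")),  # size in bytes
        "type": enclosure.get("type"),           # MIME type
    }


print(read_enclosure(ET.fromstring(ITEM_XML)))
```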
guid
The content of the optional guid element is a globally unique identifier for the item. It is a string that an aggregator can use to determine whether the item is new. You can use an optional isPermaLink attribute with either the value true, which is the default value, or the value false.
When the value is true, an aggregator assumes that the value of the element is a URL pointing to the item that could be opened in a Web browser. For example:
<!-- GUID is not a URL -->
<guid isPermaLink="false">1234567890</guid>
<!-- GUID is a URL that can be opened -->
<guid isPermaLink="true">http://www.example.com/pub/article1.html</guid>
<!-- GUID is a URL that can be opened using default value for isPermaLink -->
<guid>http://www.example.com/pub/article1.html</guid>
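The way an aggregator might use guid values to detect new items, while respecting the isPermaLink default, can be sketched as follows. This assumes Python's standard library; guid_info and unseen_items are hypothetical names, and keeping previously seen identifiers in an in-memory set stands in for whatever storage a real aggregator would use.

```python
import xml.etree.ElementTree as ET

CHANNEL_XML = """
<channel>
  <item><guid isPermaLink="false">1234567890</guid></item>
  <item><guid>http://www.example.com/pub/article1.html</guid></item>
</channel>
"""


def guid_info(item: ET.Element):
    """Return (guid, is_permalink) for an item, or None if it has no guid."""
    guid = item.find("guid")
    if guid is None or not (guid.text or "").strip():
        return None
    # isPermaLink defaults to true when the attribute is absent.
    is_permalink = guid.get("isPermaLink", "true").strip().lower() == "true"
    return guid.text.strip(), is_permalink


def unseen_items(channel: ET.Element, seen: set) -> list:
    """Return guid_info tuples for items whose guid is not in `seen`."""
    fresh = []
    for item in channel.findall("item"):
        info = guid_info(item)
        if info is not None and info[0] not in seen:
            fresh.append(info)
    return fresh


channel = ET.fromstring(CHANNEL_XML)
print(unseen_items(channel, seen={"1234567890"}))
```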
pubDate
The optional pubDate element contains the date the current item was published. The value of this element is a date in the format defined in RFC 822 (http://asg.web.cmu.edu/rfc/rfc822.html#sec-5). When a future date is used, an aggregator can choose to not display the current item until the specified date and time are reached. For example:
<pubDate>Sun, 02 Oct 2005 18:10:01 GMT</pubDate>
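Producing and reading dates in this format does not require hand-built strings. The sketch below assumes Python's standard email.utils module, whose date helpers emit and parse this style of date; the function name is_due is hypothetical and illustrates the future-date behavior described above.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime

# Format a timezone-aware UTC datetime as a pubDate value.
published = datetime(2005, 10, 2, 18, 10, 1, tzinfo=timezone.utc)
pub_date_text = format_datetime(published, usegmt=True)
print(pub_date_text)  # Sun, 02 Oct 2005 18:10:01 GMT


def is_due(pub_date: str, now: datetime) -> bool:
    """Return True if the item's pubDate is not in the future relative to `now`.

    `now` must be timezone-aware so it can be compared with the parsed date.
    """
    return parsedate_to_datetime(pub_date) <= now


# An aggregator could hold back items for which is_due() returns False.
print(is_due(pub_date_text, datetime(2005, 10, 3, tzinfo=timezone.utc)))
```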
source
An optional source element supplies the name of the RSS channel from which the item came. It has one required attribute, url, which links to the XML from the source. For example:
<source url="http://www.example.net/foreign.xml">Third Party Feed</source>
The url attribute, in this case, points to the www.example.net domain, which is the originator of the item. This allows the proper credit to be given to the originator when a feed incorporates items from other feeds.