CHAPTER 7: USING PHP TO MANAGE FILES

Figure 7-1. PHP makes light work of creating a drop-down menu of images in a specific folder.

When incorporated into an online form, the filename of the selected image appears in the $_POST array identified by the name attribute of the <select> element (in this case, $_POST['pix']). That's all there is to it!

You can compare your code with imagelist_02.php in the ch07 folder.

PHP Solution 7-4: Creating a generic file selector

The previous PHP solution relies on an understanding of regular expressions. Adapting it to work with other filename extensions isn't difficult, but you need to be careful that you don't accidentally delete a vital character. Unless regexes or Vogon poetry are your specialty, it's probably easier to wrap the code in a function that can be used to inspect a specific folder and create an array of filenames of specific types. For example, you might want to create an array of PDF document filenames or one that contains both PDFs and Word documents. Here's how you do it.

1. Create a new file called buildlist.php in the filesystem folder. The file will contain only PHP code, so delete any HTML inserted by your editing program.

2. Add the following code to the file:

    function buildFileList($dir, $extensions) {
      if (!is_dir($dir) || !is_readable($dir)) {
        return false;
      } else {
        if (is_array($extensions)) {
          $extensions = implode('|', $extensions);
        }
      }
    }

This defines a function called buildFileList(), which takes two arguments:

• $dir: The path to the folder from which you want to get the list of filenames.
• $extensions: This can be either a string containing a single filename extension or an array of filename extensions. To keep the code simple, the filename extensions should not include a leading period.

The function begins by checking whether $dir is a folder and is readable. If it isn't, the function returns false, and no more code is executed. If $dir is OK, the else block is executed.
It also begins with a conditional statement that checks whether $extensions is an array. If it is, it's passed to implode(), which joins the array elements with a vertical pipe (|) between each one. A vertical pipe is used in regexes to indicate alternative values. Let's say the following array is passed to the function as the second argument:

    array('jpg', 'png', 'gif')

The conditional statement converts it to jpg|png|gif. So, this looks for jpg, or png, or gif. However, if the argument is a string, it remains untouched.

3. You can now build the regex search pattern, pass $dir to a DirectoryIterator, and filter the result with a RegexIterator like this:

    function buildFileList($dir, $extensions) {
      if (!is_dir($dir) || !is_readable($dir)) {
        return false;
      } else {
        if (is_array($extensions)) {
          $extensions = implode('|', $extensions);
        }
        $pattern = "/\.(?:{$extensions})$/i";
        $folder = new DirectoryIterator($dir);
        $files = new RegexIterator($folder, $pattern);
      }
    }

The regex pattern is built using a string in double quotes and wrapping $extensions in curly braces to make sure it's interpreted correctly by the PHP engine. Take care when copying the code. It's not exactly easy to read.

4. The final section of the code extracts the filenames to build an array, which is sorted and then returned.
The finished function definition looks like this:

    function buildFileList($dir, $extensions) {
      if (!is_dir($dir) || !is_readable($dir)) {
        return false;
      } else {
        if (is_array($extensions)) {
          $extensions = implode('|', $extensions);
        }
        // build the regex and get the files
        $pattern = "/\.(?:{$extensions})$/i";
        $folder = new DirectoryIterator($dir);
        $files = new RegexIterator($folder, $pattern);
        // initialize an array and fill it with the filenames
        $filenames = array();
        foreach ($files as $file) {
          $filenames[] = $file->getFilename();
        }
        // sort the array and return it
        natcasesort($filenames);
        return $filenames;
      }
    }

This initializes an array and uses a foreach loop to assign the filenames to it with the getFilename() method. Finally, the array is passed to natcasesort(), which sorts it in a natural, case-insensitive order. "Natural" means that strings containing numbers are sorted the way a person would sort them. For example, a computer normally sorts img12.jpg before img2.jpg, because the 1 in 12 is lower than 2. Using natcasesort() results in img2.jpg preceding img12.jpg.

5. To use the function, pass it the path to the folder and the filename extensions of the files you want to find. For example, you could get all Word and PDF documents from a folder like this:

    $docs = buildFileList('folder_name', array('doc', 'docx', 'pdf'));

The code for the buildFileList() function is in buildlist.php in the ch07 folder.

Accessing remote files

Reading, writing, and inspecting files on your local computer or on your own website is useful. But allow_url_fopen also gives you access to publicly available documents anywhere on the Internet. You can't directly include files from other servers (not unless allow_url_include is on), but you can read the content, save it to a variable, and manipulate it with PHP functions before incorporating it in your own pages or saving the information to a database.
You can also write to documents on a remote server as long as the owner sets the appropriate permissions.

A word of caution is in order here. When extracting material from remote sources for inclusion in your own pages, there's a potential security risk. For example, a remote page might contain malicious scripts embedded in <script> tags or hyperlinks. Unless the remote page supplies data in a known format from a trusted source (such as product details from the Amazon.com database, weather information from a government meteorological office, or a newsfeed from a newspaper or broadcaster), sanitize the content by passing it to htmlentities() (see PHP Solution 5-3). As well as converting double quotes to &quot;, htmlentities() converts < to &lt; and > to &gt;. This displays tags in plain text, rather than treating them as HTML. If you want to permit some HTML tags, use the strip_tags() function instead.

If you pass a string to strip_tags(), it returns the string with all HTML tags and comments stripped out. It also removes PHP tags. A second, optional argument is a list of tags that you want preserved. For example, the following strips out all tags except paragraphs and first- and second-level headings:

    $stripped = strip_tags($original, '<p><h1><h2>');

For an in-depth discussion of security issues, see Pro PHP Security by Chris Snyder and Michael Southwell (Apress, 2005, ISBN: 978-1-59059-508-4).

Consuming news and other RSS feeds

Some of the most useful remote sources of information that you might want to incorporate in your sites come from RSS feeds. RSS stands for Really Simple Syndication, and it's a dialect of XML. XML is similar to HTML in that it uses tags to mark up content. Instead of defining paragraphs, headings, and images, XML tags are used to organize data in a predictable hierarchy. XML is written in plain text, so it's frequently used to share information between computers that might be running on different operating systems.
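Because feed content also comes from a remote source, the sanitizing advice given earlier applies here too. The following is a minimal sketch, not a complete security solution, that compares htmlentities() and strip_tags() on a single string. The $original string is invented for illustration.

```php
<?php
// Hypothetical remote content containing a script and a hyperlink.
$original = '<h1>News</h1><p>Hello <script>alert("x");</script><a href="#">link</a></p>';

// htmlentities() neutralizes every tag so it displays as plain text.
$escaped = htmlentities($original);

// strip_tags() removes all tags except those listed in the second argument.
// Note that the text between the removed tags is left in place.
$stripped = strip_tags($original, '<p><h1><h2>');

echo $escaped . "\n";
echo $stripped . "\n";
```

The second result shows that strip_tags() removes the <script> tags themselves but keeps the text that was between them, so it should not be treated as a substitute for checking where the content comes from.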
Figure 7-2 shows the typical structure of an RSS 2.0 feed. The whole document is wrapped in a pair of <rss> tags. This is the root element, similar to the <html> tags of a web page. The rest of the document is wrapped in a pair of <channel> tags, which always contain the following three elements that describe the RSS feed: <title>, <description>, and <link>.

[Figure 7-2 diagram: rss contains channel; channel contains title, link, description, and one or more item elements; each item contains title, link, pubDate, and description.]

Figure 7-2. The main contents of an RSS feed are in the item elements.

In addition to the three required elements, the <channel> can contain many other elements, but the interesting material is to be found in the <item> elements. In the case of a news feed, this is where the individual news items can be found. If you're looking at the RSS feed from a blog, the <item> elements normally contain summaries of the blog posts.

Each <item> element can contain several elements, but those shown in Figure 7-2 are the most common, and usually the most interesting:

• <title>: The title of the item
• <link>: The URL of the item
• <pubDate>: Date of publication
• <description>: Summary of the item

This predictable format makes it easy to extract the information from an RSS feed using SimpleXML.

You can find the full RSS Specification at www.rssboard.org/rss-specification. Unlike most technical specifications, it's written in plain language and is easy to read.

Using SimpleXML

As long as you know the structure of an XML document, SimpleXML does what it says on the tin: it makes extracting information from XML simple. The first step is to pass the URL of the XML document to simplexml_load_file(). You can also load a local XML file by passing the path as an argument. For example, this gets the world news feed from the BBC:

    $feed = simplexml_load_file('http://feeds.bbci.co.uk/news/world/rss.xml');

This creates an instance of the SimpleXMLElement class.
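To experiment with the structure in Figure 7-2 without depending on a live feed, the following minimal sketch parses a small inline document with simplexml_load_string(), which builds a SimpleXMLElement from a string instead of a URL. The feed content, titles, and example.com URLs are all invented for illustration.

```php
<?php
// Hypothetical two-item RSS 2.0 feed; every title and URL here is made up.
$xml = <<<'XML'
<rss version="2.0">
  <channel>
    <title>Example Feed</title>
    <link>http://www.example.com/</link>
    <description>A made-up feed for testing</description>
    <item>
      <title>First story</title>
      <link>http://www.example.com/1</link>
      <pubDate>Mon, 04 Jul 2011 12:00:00 GMT</pubDate>
      <description>Summary of the first story</description>
    </item>
    <item>
      <title>Second story</title>
      <link>http://www.example.com/2</link>
      <pubDate>Tue, 05 Jul 2011 09:30:00 GMT</pubDate>
      <description>Summary of the second story</description>
    </item>
  </channel>
</rss>
XML;

// Parse the string; the result is the same class that simplexml_load_file() returns.
$feed = simplexml_load_string($xml);
$isSimpleXml = $feed instanceof SimpleXMLElement;

// Element values are objects, so cast them to strings before storing them.
$channelTitle = (string) $feed->channel->title;

// Collect the title of every <item> in the channel.
$titles = array();
foreach ($feed->channel->item as $item) {
    $titles[] = (string) $item->title;
}

echo $channelTitle . ': ' . implode(', ', $titles) . "\n";
```

Casting to a string matters when you store a value in an array or compare it, because each property is itself a SimpleXMLElement object rather than plain text.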
All the elements in the feed can now be accessed as properties of the $feed object, using the names of the elements. With an RSS feed, the <item> elements can be accessed as $feed->channel->item.

To display the <title> of each <item>, create a foreach loop like this:

    foreach ($feed->channel->item as $item) {
      echo $item->title . '<br>';
    }

If you compare this with Figure 7-2, you can see that you access elements by chaining the element names with the -> operator until you reach the target. Since there are multiple <item> elements, you need to use a loop to tunnel further down. Alternatively, use array notation like this:

    $feed->channel->item[2]->title

This gets the <title> of the third <item> element. Unless you want only a specific value, it's simpler to use a loop.

With that background out of the way, let's use SimpleXML to display the contents of a news feed.

PHP Solution 7-5: Consuming an RSS news feed

This PHP solution shows how to extract the information from a live news feed using SimpleXML and display it in a web page. It also shows how to format the <pubDate> element to a more user-friendly format and how to limit the number of items displayed using the LimitIterator class.

1. Create a new page called newsfeed.php in the filesystem folder. This page will contain a mixture of PHP and HTML.

2. The news feed chosen for this PHP solution is the BBC World News. A condition of using most news feeds is that you acknowledge the source. So add "The Latest from BBC News" formatted as an <h1> heading at the top of the page.

See http://news.bbc.co.uk/1/hi/help/rss/4498287.stm for the full terms and conditions of using a BBC news feed on your own site.

3. Create a PHP block below the heading, and add the following code to load the feed:

    $url = 'http://feeds.bbci.co.uk/news/world/rss.xml';
    $feed = simplexml_load_file($url);

4.
Use a foreach loop to access the <item> elements and display the <title> of each one:

    foreach ($feed->channel->item as $item) {
      echo $item->title . '<br>';
    }

5. Save newsfeed.php, and load the page in a browser. You should see a long list of news items similar to Figure 7-3.

Figure 7-3. The news feed contains a large number of items.

6. The normal feed often contains 50 or more items. That's fine for a news site, but you probably want a shorter selection in your own site. Use another SPL iterator to select a specific range of items. Amend the code like this:

    $url = 'http://feeds.bbci.co.uk/news/world/rss.xml';
    $feed = simplexml_load_file($url, 'SimpleXMLIterator');
    $filtered = new LimitIterator($feed->channel->item, 0, 4);
    foreach ($filtered as $item) {
      echo $item->title . '<br>';
    }

To use SimpleXML with an SPL iterator, you need to supply the name of the SimpleXMLIterator class as the second argument to simplexml_load_file(). You can then pass the SimpleXML element you want to affect to an iterator constructor.

In this case, $feed->channel->item is passed to the LimitIterator constructor. The LimitIterator takes three arguments: the object you want to limit, the starting point (counting from 0), and the number of times you want the loop to run. This code starts at the first item and limits the number of items to four.

The foreach loop now loops over the $filtered result. If you test the page again, you'll see just four titles, as shown in Figure 7-4. Don't be surprised if the selection of headlines is different from before. The BBC News website is updated every minute.

Figure 7-4. The LimitIterator restricts the number of items displayed.

7. Now that you have limited the number of items, amend the foreach loop to wrap the <title> elements in a link to the original article, and display the <pubDate> and <description> items.
The loop looks like this:

    foreach ($filtered as $item) { ?>
      <h2><a href="<?php echo $item->link; ?>"><?php echo $item->title; ?></a></h2>
      <p class="datetime"><?php echo $item->pubDate; ?></p>
      <p><?php echo $item->description; ?></p>
    <?php } ?>

8. Save the page, and test it again. The links take you directly to the relevant news story on the BBC website. The news feed is now functional, but the <pubDate> format follows the format laid down in the RSS specification, as shown in the next screenshot:

[Screenshot: the <pubDate> displayed in its raw RSS format]

9. To format the date and time in a more user-friendly way, pass $item->pubDate to the DateTime class constructor, and then use the DateTime format() method to display it.

10. Change the code in the foreach loop like this:

    <p class="datetime"><?php $date = new DateTime($item->pubDate);
    echo $date->format('M j, Y, g:ia'); ?></p>

This reformats the date like this:

[Screenshot: the reformatted date]

The mysterious PHP formatting strings for dates are explained in Chapter 14.

11. That looks a lot better, but the time is still expressed in GMT (London time). If most of your site's visitors live on the East Coast of the United States, you probably want to show the local time. That's no problem with a DateTime object. Use the setTimezone() method to change to New York time. You can even automate the display of EDT (Eastern Daylight Time) or EST (Eastern Standard Time) depending on whether daylight saving time is in operation. Amend the code like this:

    <p class="datetime"><?php $date = new DateTime($item->pubDate);
    $date->setTimezone(new DateTimeZone('America/New_York'));
    $offset = $date->getOffset();
    $timezone = ($offset == -14400) ? ' EDT' : ' EST';
    echo $date->format('M j, Y, g:ia') . $timezone; ?></p>

To create a DateTimeZone object, you pass it as an argument one of the time zones listed at http://docs.php.net/manual/en/timezones.php. This is the only place that the DateTimeZone object is needed, so it has been created directly as the argument to the setTimezone() method.
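The effect of setTimezone() is easiest to check with fixed dates rather than a live feed. The sketch below is only an illustration. The two timestamps are arbitrary examples, one in summer and one in winter, and the helper name newYorkOffset() is hypothetical. It converts each timestamp to New York time and reads back the offset in seconds, together with the abbreviation produced by the T format character.

```php
<?php
// Hypothetical helper: convert an RSS-style date string to New York time and
// report the offset from UTC, the zone abbreviation, and the formatted time.
function newYorkOffset($pubDate)
{
    $date = new DateTime($pubDate);
    $date->setTimezone(new DateTimeZone('America/New_York'));
    return array(
        'offset' => $date->getOffset(),            // seconds from UTC; negative means behind
        'abbr'   => $date->format('T'),            // time zone abbreviation
        'local'  => $date->format('M j, Y, g:ia'), // same format string as the solution
    );
}

// Arbitrary example timestamps, both at noon GMT.
$summer = newYorkOffset('Mon, 04 Jul 2011 12:00:00 GMT');
$winter = newYorkOffset('Mon, 03 Jan 2011 12:00:00 GMT');

echo $summer['local'] . ' ' . $summer['abbr'] . ' (' . $summer['offset'] . ")\n";
echo $winter['local'] . ' ' . $winter['abbr'] . ' (' . $winter['offset'] . ")\n";
```

The summer date reports an offset of -14400 seconds (4 hours) and the winter date -18000 seconds (5 hours). If the T format character gives the expected abbreviation on your installation, it is an alternative to testing the offset by hand.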
There isn't a dedicated method that tells you whether daylight saving time is in operation, but the getOffset() method returns the number of seconds the time is offset from Coordinated Universal Time (UTC). The following line determines whether to display EDT or EST:

    $timezone = ($offset == -14400) ? ' EDT' : ' EST';

This uses the value of $offset with the conditional operator. In summer, New York is 4 hours behind UTC (-14400 seconds). So, if $offset is -14400, the condition evaluates to true, and EDT is assigned to $timezone. Otherwise, EST is used.

Finally, the value of $timezone is concatenated to the formatted time. The string used for $timezone has a leading space to separate the time zone from the time. When the page is loaded, the time is adjusted to the East Coast of the United States like this:

[Screenshot: the date and time shown in New York time]

12. All the page needs now is smartening up with CSS. Figure 7-5 shows the finished news feed styled with newsfeed.css in the styles folder.

You can learn more about SPL iterators and SimpleXML in my PHP Object-Oriented Solutions (friends of ED, 2008, ISBN: 978-1-4302-1011-5).

Figure 7-5. The live news feed requires only a dozen lines of PHP code.

Although I have used the BBC News feed for this PHP solution, it should work with any RSS 2.0 feed. For example, you can try it locally with http://rss.cnn.com/rss/edition.rss. Using a CNN news feed in a public website requires permission from CNN. Always check with the copyright holder for terms and conditions before incorporating a feed into a website.

Creating a download link

A question that crops up regularly in online forums is, "How do I create a link to an image (or PDF file) that prompts the user to download it?" The quick solution is to convert the file into a compressed format, such as ZIP.
This frequently results in a smaller download, but the downside is that inexperienced users may not know how to unzip the file, or they may be using an older operating system that doesn't include an extraction facility. With PHP file system functions, it's easy to create a link that automatically prompts the user to download a file in its original format.

PHP Solution 7-6: Prompting a user to download an image

The script in this PHP solution sends the necessary HTTP headers, opens the file, and outputs its contents as a binary stream.

1. Create a PHP file called download.php in the filesystem folder. The full listing is given in the next step. You can also find it in download.php in the ch07 folder.