Plug-in PHP: 100 Power Solutions

         case "gif":  imagegif($image1, $tofile); break;
         case "jpeg": imagejpeg($image1, $tofile, $quality); break;
         // Map a 0-100 quality value onto PNG compression levels 9-0
         case "png":  imagepng($image1, $tofile,
                         round(9 - $quality * .09)); break;
      }
   }

   // Allocate a GD color from a six-digit hex string such as "ff8800"
   function PIPHP_GD_FN1($image, $color)
   {
      return imagecolorallocate($image,
         hexdec(substr($color, 0, 2)),
         hexdec(substr($color, 2, 2)),
         hexdec(substr($color, 4, 2)));
   }

CHAPTER 5 Content Management

When developing web projects, there are certain content management processes that are so common it can save you a great deal of programming to have ready-made plug-ins available. Some examples include converting relative to absolute URLs, checking for broken links, tracking web visitors, and more.

This chapter explores ten of these types of functions that you can add to your toolbox, and explains how they work so you can further tailor them to your own requirements. Along the way, it covers parsing URLs, extracting information from web pages (even on other servers), reading the contents of local files and directories, accessing query strings that result from search engine referrals, embedding YouTube videos, counting raw and unique web visits, and tracking where users are coming from.

Relative to Absolute URL

Any project that needs to crawl web pages, whether its own or a third party's, needs a way to convert relative URLs into absolute URLs that can be called up on their own, without reference to the page in which they are located. For example, the URL /sport/index.html means nothing at all when looked at on its own, and there is no way of knowing that the URL was extracted from the web page http://server.com/news/. Using this plug-in, relative URLs can be combined with the referring page to create stand-alone absolute URLs, such as http://server.com/sport/index.html.
Figure 5-1 shows a variety of links being converted to absolute.

FIGURE 5-1 This plug-in provides the solution to a common problem encountered in web development: converting a relative URL to absolute.

About the Plug-in

This plug-in takes the URL of a web page, along with a link from within that page, and then returns the link in a form that can be accessed without reference to the calling page; in other words, an absolute URL. It takes these arguments:

• $page  A web page URL, including the http:// preface and domain name
• $url   A link extracted from $page

Variables, Arrays, and Functions

$parse   Associative array derived from parsing $page
$root    String comprising the first part of $page up to and including the host domain name
$p       Integer offset of the final / in $page, counted from just after the leading http:// (or FALSE if there is none)
$base    The current directory where $page is located

How It Works

In order to convert a URL from relative to absolute, it's necessary to know what the relative URL is relative to. This is why the main page URL is passed along with the relative URL. Not every link passed will be relative, and they could even all be absolute, depending on how the page at $page has been written. The plug-in therefore processes every URL it is given, and only those determined to be relative are turned into absolute URLs.

Before anything else, the plug-in checks that $page begins with http://. If it doesn't, $url is returned unchanged.

Otherwise it parses the original URL, passed in $page, and extracts the scheme (for example, http) and host (such as myserver.com), then combines just these two parts, separated by ://, into the string variable $root to create, for example, the string http://myserver.com.

Then, $page is examined to see if there are any / characters after the initial http://. If so, the final one is located and its position, counted from just after the http://, is placed in $p. If there isn't one, strrpos() returns FALSE, and that is the value $p holds.
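As a rough illustration of the parsing step just described, the following sketch shows what parse_url() and strrpos() yield for two sample pages. The helper name sketch_root_and_slash() and the page URLs are assumptions made up for this example; they are not part of the plug-in.

```php
<?php
// Hypothetical helper for illustration only; it is not part of the plug-in.
function sketch_root_and_slash($page)
{
    // parse_url() returns an associative array with keys such as
    // 'scheme' (e.g. "http") and 'host' (e.g. "myserver.com").
    $parse = parse_url($page);
    $root  = $parse['scheme'] . "://" . $parse['host'];

    // Skip the 7 characters of "http://" so that its two slashes are
    // ignored, then find the last / in the remainder. strrpos() returns
    // FALSE when there is no match.
    $p = strrpos(substr($page, 7), '/');

    return array('root' => $root, 'p' => $p);
}

// Made-up example pages.
foreach (array("http://myserver.com/news/index.html",
               "http://myserver.com") as $page) {
    $r = sketch_root_and_slash($page);
    echo "$page\n  root = {$r['root']}\n  p    = ";
    var_dump($r['p']);
}
```

For the first page $p is 17, the offset of the slash before index.html counted from just after http://. For the second page there is no slash to find, so $p is FALSE. Both pages give a $root of http://myserver.com.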
Using this value, $base is assigned either the substring of $page all the way up to and including the final /, or if there wasn't one, $base is assigned the value of $page itself, but with a final / appended to it. Either way, $base now represents the location of the directory containing $page.

Next $url is examined, and if it starts with a /, then it must be a relative URL, referring to an offset from the domain's document root. In this case $url is replaced with the concatenation of $root and $url. So, for example, http://myserver.com and /news/index.html would combine to become http://myserver.com/news/index.html.

If $url doesn't start with a /, then a test is made to see whether it begins with http://. If not, the URL must also be relative, but this time it is relative to the directory location of $page, so $url is replaced with the concatenation of $base and $url. So, for example, http://myserver.com/sport/ and results.html would combine to become http://myserver.com/sport/results.html.

If both these tests fail, then $url commences with http:// and is therefore already an absolute URL that needs no conversion. It is returned unchanged.

NOTE For the sake of speed and simplicity, a complete relative-to-absolute URL conversion is not made. For example, the URL ../news/index.html in the page http://myserver.com/sport/ is not converted to http://myserver.com/news/index.html. Instead it becomes http://myserver.com/sport/../news/index.html. This saves the code having to further parse a URL, locating examples of ../ and then removing the directory immediately previous to it. There's no need because this longer form of absolute URL is perfectly valid and works just fine.
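If you do want the shorter form that the NOTE above says the plug-in leaves alone, parent-directory references can be collapsed in a separate pass. The following is a minimal sketch under stated assumptions, not part of the plug-in: the function name sketch_collapse_dot_segments() is hypothetical, it also drops empty path segments (so a doubled slash becomes a single one), and it ignores any user name or password in the URL. It is a simplification and not a complete implementation of the dot-segment removal procedure in RFC 3986.

```php
<?php
// Hypothetical helper, not part of the plug-in: collapses "." and ".."
// segments in the path of an absolute URL.
function sketch_collapse_dot_segments($url)
{
    $parse = parse_url($url);

    // Leave anything that is not a full scheme://host URL untouched.
    if ($parse === false || !isset($parse['scheme'], $parse['host']))
        return $url;

    $path = isset($parse['path']) ? $parse['path'] : '/';
    $out  = array();

    foreach (explode('/', $path) as $segment) {
        if ($segment === '..')
            array_pop($out);          // step up one directory
        elseif ($segment !== '.' && $segment !== '')
            $out[] = $segment;        // keep ordinary segments
    }

    $clean = '/' . implode('/', $out);

    // Preserve a trailing slash such as the one in /sport/
    if (count($out) && substr($path, -1) === '/') $clean .= '/';

    $result = $parse['scheme'] . '://' . $parse['host'];
    if (isset($parse['port']))     $result .= ':' . $parse['port'];
    $result .= $clean;
    if (isset($parse['query']))    $result .= '?' . $parse['query'];
    if (isset($parse['fragment'])) $result .= '#' . $parse['fragment'];

    return $result;
}

echo sketch_collapse_dot_segments(
    "http://myserver.com/sport/../news/index.html"), "\n";
```

Run on the URL from the NOTE, this prints http://myserver.com/news/index.html. Keeping the step separate leaves the plug-in itself as short and fast as the NOTE intends, and you pay for the extra parsing only when you need tidier URLs.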
How to Use It

To use this plug-in, pass it the full URL of a page that contains a relative link, along with the relative link itself, like this:

   $page = "http://site.com/news/current/science/index.html";
   $link = "../../prev/tech/roundup.html";
   echo PIPHP_RelToAbsURL($page, $link);

The value returned will be an absolute URL that can be used to access the destination page without recourse to the original web page. In the preceding case, the following URL will be returned:

   http://site.com/news/current/science/../../prev/tech/roundup.html

The Plug-in

   function PIPHP_RelToAbsURL($page, $url)
   {
      // Only pages addressed with http:// are handled
      if (substr($page, 0, 7) != "http://") return $url;

      // Scheme and host, for example http://site.com
      $parse = parse_url($page);
      $root  = $parse['scheme'] . "://" . $parse['host'];

      // Position of the final / after the leading http://
      $p = strrpos(substr($page, 7), '/');

      // Directory containing $page, with a trailing /
      if ($p) $base = substr($page, 0, $p + 8);
      else    $base = "$page/";

      // Root-relative, page-relative, or already absolute
      if (substr($url, 0, 1) == '/')           $url = $root . $url;
      elseif (substr($url, 0, 7) != "http://") $url = $base . $url;

      return $url;
   }

Get Links from URL

The first time you need to extract HTML links from a web page (even your own), it can seem a daunting task, and it's true that parsing HTML is quite complex. But with this plug-in all you need to do is pass it the URL of a web page and all the links found within it will be returned. Figure 5-2 shows links being extracted from a web page.