Developing Large Web Applications - P25

CHAPTER 9
Performance

Ultimately, none of the techniques presented in this book would be practical if they didn’t provide a solid foundation on which to build large web applications that perform quickly and efficiently. This chapter shows how to use the foundation from the previous chapters to monitor and tweak the performance of your application.

You may well get a performance boost simply by following the practices already presented in this book. For example, the semantically meaningful HTML presented in Chapter 3 can speed up page display for several reasons. Likewise, modular techniques for large-scale PHP (see Chapter 7) generally create a faster site than jumping in and out of the PHP interpreter multiple times whenever needed. But every professional web developer devotes time to performance as an end in itself, so this chapter shows how performance optimization interacts with the techniques in this book.

To guide our discussion, we’ll explore some of the recommendations presented in High Performance Web Sites (O’Reilly). This book, based on research conducted by Steve Souders at Yahoo!, suggests that for most websites, backend performance accounts for only 10 to 20 percent of the overall time required for a page to load; the remaining 80 to 90 percent is spent downloading components for the user interface. By following a set of 14 rules, many web applications can be made 20 to 25 percent faster.

These statistics emphasize the importance of paying close attention to the performance of how your HTML, CSS, JavaScript, and PHP work together. By utilizing a set of techniques for developing large web applications like the ones in this book, you can manage performance with relative ease and in a centralized manner.

Tenet 9: Large-scale HTML, JavaScript, CSS, and PHP provide a good foundation on which to build large web applications that perform well. They also facilitate a good environment for capturing site metrics and testing.
We begin this chapter by looking at how the techniques for developing large web applications discussed in this book can help us manage opportunities for caching. Next, we’ll explore some performance improvements that apply specifically to JavaScript. We then cover performance improvements related to ways we can distribute the various assets for an application across multiple servers. Finally, we’ll look at techniques that facilitate capturing site metrics and performing testing.

Caching Opportunities

One of the biggest opportunities for improving performance is caching. Caching is the preservation and management of a collection of data that replicates original data computed earlier or stored in another location. The idea is to avoid retrieving the original data repeatedly and thus to avoid the high performance cost of retrieval. Some examples of resources that you can cache in a user interface are CSS files, JavaScript files, images, and even the entire contents of modules and pages. Whenever you encounter something that doesn’t change very often (as is the case with CSS and JavaScript files especially), there is probably a good opportunity for caching.

Caching CSS and JavaScript

Whenever you can, you should place CSS and JavaScript in separate files that you can link on the pages that require them, as shown in Example 9-1. Not only does this allow you to share the contents of those files across multiple pages, it allows a browser to retrieve the files once over the wire, and then use them many times from the local cache.

Certainly, a browser can cache an HTML file that contains embedded CSS or JavaScript. However, the HTML is likely to change much more often than the CSS or JavaScript, so the browser may only cache it for a few moments. In contrast, you might go for months or even years without changing your CSS or JavaScript for a page.
Separating the CSS and JavaScript into dedicated files therefore lets the browser store the CSS and JavaScript for repeated use, and just download the new HTML when needed.

In Chapter 7, you saw that modules and pages both define similar methods in their interfaces to specify the CSS and JavaScript files they require using get_css_linked and get_js_linked, respectively. Because each method results in links on the final page, as opposed to embedding CSS or JavaScript in the same page as the HTML, you get the benefits of caching.

Example 9-1. Linking JavaScript files for the benefits of caching

    class PictureSlider extends Module
    {
        public function get_js_linked()
        {
            // Specify the JavaScript files that must be included on the page.
            // This module needs YUI libraries for managing the DOM and doing
            // animation. The module's JavaScript is a part of sitewide.js.
            return array
            (
                "yahoo-dom-event.js",
                "animation.js",
                "sitewide.js"
            );
        }
    }

Versioning CSS and JavaScript files

Anytime a browser caches a CSS or JavaScript file, it’s important to ensure that the browser knows when a copy of the cached file is no longer up to date with changes you’ve made. Without this, your application is likely to be styled incorrectly or contain JavaScript errors as your HTML gets out of sync with your CSS and JavaScript.

A simple way to ensure the browser knows when to fetch a new version of a file is to give each file a version ID. Whenever you change the file, simply advance the version ID. As a result, the browser does not find the new version in its cache and subsequently fetches it. A good method for constructing version IDs is to append the date to the name of the file or use the version number from your source control system.
For example, you could have the following:

    sitewide_20090710.js

If you need to update the file multiple times on a single day, you can append a sequence number or letter after the date:

    sitewide_20090710a.js

Of course, you’ll need to update references to the files wherever you link to them. Example 9-2 illustrates how easy this is to control in a centralized way using the register_links method presented in Chapter 7. The example registers a JavaScript file with a version ID, and is based on the assumption that all pages in the web application have SitePage at some point in their class hierarchy. The get_js_linked method for pages and modules returns an array of keys. As files are linked for the page, these keys are used to look up the real path that was defined in register_links. Each time you need to update the version ID for a file, you adjust it in one place, such as the SitePage class shown here. The process for CSS files is similar.

Example 9-2. Registering a JavaScript file with a version ID

    class SitePage extends Page
    {
        public function register_links()
        {
            $this->js_linked_info = array
            (
                "sitewide.js" => array
                (
                    "aka_path" => $this->aka_path."/sitewide_20090710.js",
                    "loc_path" => $this->loc_path."/sitewide_20090710.js"
                ),
            );
        }
    }

Ideally, changes to a CSS or JavaScript file would apply wherever the file is accessed. But what if a dependency on one page prevents it from using the new version? Again, the register_links method provides an easy way to manage such fine-grained distinctions. The page class for the page containing the dependency defines a more specific version of register_links that first calls upon register_links in the parent to set up all the links as normal, then overwrites the name of the file for which the page requires the earlier version, as shown in Example 9-3.

Example 9-3. Overriding a version ID for just one page

    class NewCarSearchResultsPage extends SitePage
    {
        public function register_links()
        {
            // Call upon the parent class to set up all the links as normal.
            parent::register_links();

            // Alter the link for which this page needs a different version.
            $this->js_linked_info["sitewide.js"] = array
            (
                "aka_path" => $this->aka_path."/sitewide_20090709.js",
                "loc_path" => $this->loc_path."/sitewide_20090709.js"
            );
        }
    }

Combining CSS and JavaScript files

One of the issues when placing CSS and JavaScript in dedicated files is determining a good way to divide the CSS (or JavaScript). On the one hand, if you place all your CSS within a single, large file, your application will become monolithic, lack modularity, and end up more difficult to maintain. On the other hand, if you place the CSS for each module within its own individual file, you’ll end up with a large number of links on every page.

The section “Minimizing HTTP Requests” discusses a good middle ground for dividing your CSS and JavaScript across a set of files to minimize HTTP requests. Once you have a good division of files, you can minimize the number of requests made for CSS or JavaScript files even further by combining multiple requests into one. To do this, you need to implement a server that understands combined requests. Such a request for CSS files might look like the following using a link tag:

    <link href="http:// /?sitewide_20090710.css&newcars_20090630.css"
        type="text/css" rel="stylesheet" media="all" />

Such a request for JavaScript files looks similar, but occurs in a script tag. A request for JavaScript files might look like the following:

    <script src="http:// /ext/yahoo-dom-event_2.7.0.js&ext/yahoo-animation_2.7.0.js&sitewide_20090710.js"
        type="text/javascript">
    </script>

Once the server receives the request, it concatenates the files in the specified order and returns the concatenated file to the browser.
It also caches a copy of the concatenated file on the server to use the next time a request with the same combination of files is made (for example, the next time the same page is displayed to any visitor). The browser receives the single, concatenated file for all the CSS (or JavaScript) via a single HTTP request. Furthermore, the next time a request is made from the same browser for the same set of files, the browser will already have the concatenated version cached and can avoid the request altogether.

To combine CSS and JavaScript files, you need to write some scripts on a server to do the combining and some code to assemble the requests for combining files as you generate pages. In this book, we won’t examine the code to place on the server that does the combining, but the implementation is relatively straightforward. To build the requests for combining files, you need only make a few modifications to the Page class presented in Chapter 7. The modifications for combining JavaScript files are shown in Example 9-4.

Combining CSS files is similar. For CSS, just remember that you can only combine links that share the same media type (e.g., all, print), since all the concatenated files will form one file with one media type. Since media types other than all generally don’t require multiple CSS files, a simple but effective approach is to ignore requests to combine CSS files that have a media type other than all.

Example 9-4. The Page class with support for combining JavaScript files

    class Page
    {
        protected $js_is_combined;

        public function __construct()
        {
            // Page is declared here without a parent class, so there is no
            // parent constructor to call.

            // Default combining JavaScript to true; however, you can always
            // disable it in a derived page or by calling the setter method.
            $this->js_is_combined = true;
        }

        public function set_js_combined($flag)
        {
            // Offer a way to enable or disable handling combined JavaScript.
            $this->js_is_combined = $flag;
        }

        private function create_js_combined_part($k)
        {
            // Candidates for combining need to be from one server. Set that
            // here as a prefix to check. We'll log errors for other paths.
            $prefix = " ";

            // Look up the actual path for the file identified by the key k.
            $path = $this->js_linked_info[$k]["aka_path"];

            // Return a query part only if combining is supported for the path.
            $pos = strpos($path, $prefix);

            if ($pos === 0)
            {
                // Strip only the leading prefix, not later occurrences of it.
                return substr($path, strlen($prefix));
            }
            else
            {
                return "";
            }
        }

        private function create_js_combined_query()
        {
            $combined_query = "";

            // We're making the assumption that local files are never combined
            // since normally alternative servers are used for the combining.
            if ($this->js_is_combined && !$this->js_is_local)
            {
                // Build an array of all the JavaScript keys in the order that
                // they were added by the page or modules created for the page.
                $all = array_merge
                (
                    $this->js_common,
                    $this->js_page_linked,
                    $this->js_module_linked
                );

                $i = 0;

                // Build the combined query by appending each part one by one.
                foreach ($all as $k)
                {
                    $part = $this->create_js_combined_part($k);

                    if (empty($part))
                    {
                        // An empty part indicates that the path for the file is
                        // not a path that supports combining. Log this issue.
                        break;
                    }

                    $sep = ($i++ == 0) ? "?" : "&";
                    $combined_query .= $sep.$part;
                }
            }

            return $combined_query;
        }
    }

Caching Modules

Another opportunity for caching occurs each time you generate the CSS, JavaScript, and content for a module on the server. Caching for a module is especially useful when the module’s content, styles, and behaviors require a fair amount of CPU work to generate and you don’t expect them to change very often. A good approach to implementing cacheable modules is to provide the capabilities required by all cacheable modules within a base class called CacheableModule, derived from the Module class in Chapter 7. To make your own module cacheable, simply derive it from CacheableModule.
Example 9-5 illustrates an implementation for the CacheableModule class.

Example 9-5. The implementation of a base class for cacheable modules

    class CacheableModule extends Module
    {
        protected $cache_ttl;
        protected $cache_clr;

        public function __construct($page)
        {
            parent::__construct($page);

            // The default time-to-live for entries in the cache is one hour.
            $this->cache_ttl = 3600;

            // The default is to check the cache first, but you can clear it.
            $this->cache_clr = false;
        }

        public function create()
        {
            // Check whether data exists in the cache for the module at all.
            $cache_key = $this->get_cache_key();
            $cache_val = apc_fetch($cache_key);

            // Set the hash for the variables on which the new data is based.
            $hash = $this->get_cache_hash($this->get_cache_vars());

            // Compare hashes strictly so that two different md5 strings that
            // both look numeric are never treated as equal.
            if (!$this->cache_clr && $cache_val && $cache_val["hash"] === $hash)
            {
                // Whenever we can use the cached module, access the cache.
                $content = $this->fetch_from_cache($cache_val["data"]);
            }
            else
            {
                // Otherwise, generate the module as normal and cache a copy.
                $content = $this->store_into_cache($cache_key, $hash);
            }

            return $content;
        }

        public function set_cache_ttl($ttl)
        {
            // Set the time-to-live to the specified value, in seconds.
            $this->cache_ttl = $ttl;
        }

        public function set_cache_clr()
        {
            // Force the cacheable module to bust any cached copy immediately.
            $this->cache_clr = true;
        }

        protected function get_cache_vars()
        {
            // Modules derived from this class should implement this method
            // to return a string that changes whenever the cache should be
            // discarded (the current microtime busts the cache by default).
            return microtime();
        }

        protected function fetch_from_cache($data)
        {
            // Add cached CSS styles to the page on which the module resides.
            $this->page->add_to_css_linked($data["css_linked"]);
            $this->page->add_to_css($data["css"]);

            // Add cached JavaScript to the page on which the module resides.
            $this->page->add_to_js_linked($data["js_linked"]);
            $this->page->add_to_js($data["js"]);

            // Return the cached content for the module.
            return $data["content"];
        }

        protected function store_into_cache($cache_key, $hash)
        {
            $css_linked = $this->get_css_linked();
            $css = $this->get_css();
            $js_linked = $this->get_js_linked();
            $js = $this->get_js();
            $content = $this->get_content();

            // Set up the data structure for the data to place in the cache.
            $cache_val = array
            (
                "hash" => $hash,
                "data" => array
                (
                    "css_linked" => $css_linked,
                    "css" => $css,
                    "js_linked" => $js_linked,
                    "js" => $js,
                    "content" => $content
                )
            );

            // Store the new copy into the cache and apply the time-to-live.
            apc_store($cache_key, $cache_val, $this->cache_ttl);

            // Add module CSS styles to the page on which the module resides.
            $this->page->add_to_css_linked($css_linked);
            $this->page->add_to_css($css);

            // Add module JavaScript to the page on which the module resides.
            $this->page->add_to_js_linked($js_linked);
            $this->page->add_to_js($js);

            // Return the content that was just generated using get_content.
            return $content;
        }

        protected function get_cache_hash($var)
        {
            // Hash the string used to determine when to use the cached copy.
            return md5($var);
        }

        protected function get_cache_key()
        {
            // This must be unique per module, so use the derived class name.
            return get_class($this);
        }
    }

The CacheableModule class uses the APC (Alternative PHP Cache) cache of PHP to implement the caching between instantiations of the module. The class provides a good example of overriding create provided by Module (see Chapter 7). Instead of the default implementation of create, the implementation here inspects the APC cache before generating the module. If the module can use the cache, it fetches its CSS, JavaScript, and content instead of generating them from scratch.
If the module cannot use the cache, it generates itself as normal and caches its CSS, JavaScript, and content for the next time. To be clear, there are four conditions under which the module will be generated from scratch:

• There is no copy in the cache at all.
• The variables from which the cached copy is derived have changed.
• The time-to-live has expired.
• The $cache_clr member is set.

One of the nice things about the implementation in Example 9-5 is that using a cacheable module is very similar to using a module that is not cacheable. For example, suppose NewCarSearchResults were a module derived from CacheableModule. The code to instantiate and create this module looks like what was presented in Chapter 7. The call to set_cache_ttl is optional, just to set a different time-to-live than the default for the cache. You can also call the public method set_cache_clr whenever you want to ensure that a fresh copy of the module is generated.

    $mod = new NewCarSearchResults
    (
        $this,
        $this->data["new_car_listings"]
    );

    $mod->set_cache_ttl(1800);
    $results = $mod->create();

The main thing to remember when using a cacheable module is that your class derived from CacheableModule needs to implement get_cache_vars for how you want caching to occur. This method should return a string that changes whenever you no longer want to use the cached copy of the module. This string is typically a concatenation of the variables and values on which the cached module depends. Notice that the default implementation for get_cache_vars in the base class returns the current time in microseconds. This value ensures the default behavior is never to use the cached copy, since the time in microseconds is different whenever you generate the module. This will be the case until you provide more informed logic about when the cache should be considered valid by overriding get_cache_vars within your own implementation in the derived class.
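To make the role of get_cache_vars more concrete, here is a minimal sketch of what an override might look like. It is not taken from the book: the classes CacheVarsSketch and NewCarSearchResultsSketch, the current_hash method, and the search criteria ("make", "zip") are all hypothetical stand-ins. Only the md5 hashing step of CacheableModule is reproduced, so the example runs on its own without the Module class or the APC extension.

```php
<?php
// Hypothetical stand-in for the hashing part of CacheableModule. It is an
// assumption for illustration, not the book's class.
abstract class CacheVarsSketch
{
    // Derived classes return a string that changes whenever the cached
    // copy should no longer be used.
    abstract protected function get_cache_vars();

    // Mirrors get_cache_hash(get_cache_vars()) from Example 9-5.
    public function current_hash()
    {
        return md5($this->get_cache_vars());
    }
}

// Hypothetical module whose output depends only on its search criteria.
class NewCarSearchResultsSketch extends CacheVarsSketch
{
    protected $query;

    public function __construct(array $query)
    {
        $this->query = $query;
    }

    protected function get_cache_vars()
    {
        // Sort by key so the same criteria given in a different order
        // produce the same string, then join names and values together.
        $vars = $this->query;
        ksort($vars);
        return http_build_query($vars);
    }
}

$a = new NewCarSearchResultsSketch(array("make" => "Honda", "zip" => "94089"));
$b = new NewCarSearchResultsSketch(array("zip" => "94089", "make" => "Honda"));
$c = new NewCarSearchResultsSketch(array("make" => "Toyota", "zip" => "94089"));

// Same criteria: the hashes match, so a cached copy could be reused.
var_dump($a->current_hash() === $b->current_hash());

// Different criteria: the hashes differ, so the module would be regenerated.
var_dump($a->current_hash() === $c->current_hash());
```

Because get_cache_key in Example 9-5 returns only the class name, a design like this appears to keep a single cached copy per module class: a request with different criteria would regenerate the module and replace the stored entry rather than add a second one.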