Apache Server 2 Bible (Hungry Minds), part 9

Part VI ✦ Tuning for Performance and Scalability

... discards symbols from httpd under Linux. This makes the executable slightly smaller, which saves RAM for each httpd process.

If you think your Apache executable (httpd) is as lean and mean as it can be but you still suspect that the bottleneck is within Apache, take a close look at your Apache configuration files. Some Apache directives are expensive in terms of performance; they are usually the ones that require domain name resolution, system calls, drive I/O, and process manipulation. Apache configuration tuning is discussed in the “Tuning the Apache Configuration” section.

Tuning Your Network

After you have tuned your server hardware, operating system, and the Apache server itself for performance, the next typical bottleneck is the network. To tune your network, you must know its architecture. Most Web servers run on an Ethernet network, whether at the office or at a co-location facility at an ISP. How this network is configured makes a great deal of difference. For example, if you have an Apache server connected to an Ethernet hub that is, in turn, connected to other hubs that are connected to workstations or servers, you have a great breeding environment for major bottlenecks. Each machine sees all the traffic on the network, so packet collisions increase and network slowdowns become more common. However, such a bottleneck is easily remedied by using network switches in place of hubs.

No special hardware is needed on the devices that connect to an Ethernet switch. The same network interface used for shared-media 10Base-T hubs works with an Ethernet switch. From that device’s perspective, connecting to a switched port is just like being the only computer on the network segment.

One common use for an Ethernet switch is to break a large network into segments.
While it is possible to attach a single computer to each port on an Ethernet switch, it is also possible to connect other devices, such as a hub. If your network is large enough to require multiple hubs, you could connect each of those hubs to a switch port so that each hub is a separate segment. Remember that if you simply cascade the hubs directly, the combined network is a single logical Ethernet segment.

Chapter 22 ✦ Speeding Up Apache

Using fast Ethernet

Traditional Ethernet runs at 10Mbps, which simply is not enough in a modern business environment that includes e-mail-based communication, Internet access, video conferencing, and other bandwidth-intensive operations. 100Mbps Ethernet is the way to go. However, 100Mbps or “fast” Ethernet is still quite expensive if you decide to go with fast switches as well. I recommend that you move toward switched fast Ethernet from now on unless you have already done so. The migration path from 10Mbps to 100Mbps can be quite expensive if you have a lot of computers in your network. Each computer in your network must have a 100Mbps-capable NIC installed, which can be expensive in terms of cost, staff, and time. For a large LAN with several hundred users or more, you should do the upgrade one segment at a time. You can start by buying 10/100Mbps dual-speed NICs, which enable you to support your existing 10Mbps and your upcoming 100Mbps infrastructure seamlessly.

Using fast Ethernet hand-in-hand with switching hardware can bring a high degree of performance to your LAN. You should definitely consider this option if possible. If you have multiple departments to interconnect, consider an even faster solution between the departments: The emerging Gbit/sec Ethernet standard is very suitable for connecting local area networks together to form a wide area network (WAN).
Understanding and controlling network traffic flow

Understanding how your network traffic flows is the primary key in determining how you can tune it for better performance. Take a look at the network segment shown in Figure 22-3.

Knowing your hubs from your switches

The major difference between an Ethernet hub and a switch is that each port on a switch is its own logical segment. A computer connected to a port on an Ethernet switch has the full bandwidth of that port to itself and need not contend with other computers. A main reason for purchasing a switch over a hub is its address-handling capabilities. Whereas a hub does not look at the address of a data packet and just forwards data to all devices on the network, a switch is supposed to read the address of each data packet and forward the data only to the intended recipient(s). If the switch does not correctly read the packet address and correctly forward the data, it has no advantage over a hub. The following table lists the major differences between a hub and a switch.

Ethernet Hub | Ethernet Switch
Total network bandwidth is limited to the speed of the hub; that is, a 10Base-T hub provides 10Mbps of bandwidth, no matter how many ports exist. | Total network bandwidth is determined by the number of ports on the switch; that is, a 12-port 100Mbps switch can support up to 1,200Mbps of bandwidth. This is referred to as the switch’s maximum aggregate bandwidth.
Supports half-duplex communications, limiting the connection to the speed of the port; that is, a 10Mbps port provides a 10Mbps link. | Switches that support full-duplex communications offer the capability to double the speed of each link from 100Mbps to 200Mbps.
Hop count rules limit the number of hubs that can be interconnected between two computers. | Allows users to greatly expand networks; there are no limits to the number of switches that can be interconnected between two computers.
Cheaper than switches. | More expensive than hubs, but the price/performance is worth the higher price.

Figure 22-3: An inefficient Web network

Here, three Web servers are providing Web services to the Internet, and they share a network with an NFS server and a database server. What’s wrong with this picture? Well, several things are wrong. First, these machines are still using dumb hubs instead of a switch. Second, the NFS and database traffic is competing with the incoming and outgoing Web traffic. If a Web application needs database access in response to a Web request, it generates one or more database requests, which, in turn, take away from the bandwidth available for other incoming or outgoing Web requests, effectively making the network unnecessarily busy or less responsive.

How can you solve such a problem? By using a traffic-control mechanism, of course! First determine what traffic can be isolated in this network. Naturally, the database and NFS traffic is only needed to service the Web servers. In such a case, NFS and database traffic should be isolated so that they do not compete with Web traffic. Figure 22-4 shows a modified network diagram for the same network.

Here, the database and the NFS server are connected to a switch that is connected to the second NIC of each Web server. The other NIC of each Web server is connected to a switch that is, in turn, connected to the load-balancing hardware. Now, when a Web request comes to a Web server, it is serviced by the server without taking away from the bandwidth of other Web servers. The result is a tremendous increase in network efficiency, which trickles down to a more positive user experience.

After you have a good network design, your tuning focus should shift to the applications and services that you provide.
Depending on your network load, you might have to consider deploying multiple servers of the same kind to implement a more responsive service. This is certainly true for the Web. The next section discusses how you can employ a simple load-balancing scheme by using a DNS trick.

[Figure 22-3 labels: Internet, Router, Load Balancing Device (e.g. Cisco Local Director), Hub, Web Server 1, Web Server 2, Web Server 3, NFS Server, Database Server]

Figure 22-4: An improved Web network

Balancing load using the DNS server

The idea is to share the load among multiple servers of a kind. This typically is used for balancing the Web traffic over multiple Web servers. This trick is called round-robin Domain Name Service.

Suppose that you have two Web servers, www1.yourdomain.com (192.168.1.10) and www2.yourdomain.com (192.168.1.20), and you want to balance the load for www.yourdomain.com on these two servers using the round-robin DNS trick. Add the following lines to your yourdomain.com zone file:

    www1    IN    A        192.168.1.10
    www2    IN    A        192.168.1.20
    www     IN    CNAME    www1
    www     IN    CNAME    www2

Restart your name server and ping the www.yourdomain.com host. You will see the 192.168.1.10 address in the ping output. Stop and restart pinging the same host, and you’ll see the second IP address being pinged, because the preceding configuration tells the name server to cycle through the CNAME records for www. In other words, the www.yourdomain.com host is both www1.yourdomain.com and www2.yourdomain.com.

Now, when someone enters www.yourdomain.com, the name server gives out the first address once, then gives out the second address for the next request, and keeps cycling between these addresses.
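The cycling behavior described above can be pictured with a small simulation. This is only a sketch in Python, not how a name server is implemented: the RoundRobinResolver class and its resolve method are hypothetical names chosen for illustration, and the two addresses are the example addresses from the zone file.

```python
from collections import deque


class RoundRobinResolver:
    """Toy model of round-robin DNS: hand out addresses in rotation.

    Hypothetical helper for illustration; it does no real DNS work.
    """

    def __init__(self, addresses):
        # A deque makes moving the front entry to the back cheap.
        self._addresses = deque(addresses)

    def resolve(self):
        # Answer with the address at the front, then rotate it to the back
        # so the next caller gets the next address.
        address = self._addresses[0]
        self._addresses.rotate(-1)
        return address


if __name__ == "__main__":
    resolver = RoundRobinResolver(["192.168.1.10", "192.168.1.20"])
    for _ in range(4):
        print(resolver.resolve())
```

Four lookups in a row alternate between the two addresses. Nothing in the rotation looks at server load or availability.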
[Figure 22-4 labels: Internet, Router, Load Balancing Hardware, Switch, Switch, Web Server 1, Web Server 2, Web Server 3, NFS Server, Database Server]

A disadvantage of the round-robin DNS trick is that the name server has no way of knowing which system is heavily loaded and which is not; it just blindly cycles. If one of the servers crashes or becomes unavailable for some reason, the round-robin DNS trick still returns the broken server’s IP on a regular basis. This could be quite chaotic, because some people will be able to get to the site and some won’t.

Using load-balancing hardware

If your load demands smarter load distribution and checking your server’s health is important, your best choice is to get a hardware solution that uses the new director products, such as Web Director (www.radware.com), Ace Director (www.alteon.com), or Local Director (www.cisco.com).

Figure 22-5 shows a Web network, which consists of two Cisco Local Directors; a set of proxy servers; Apache Web servers; mod_perl, PHP, Java Servlet application servers; and database servers. All Web domains hosted in this network come to a virtual IP address that resolves to the Local Director. The Local Director uses its configuration and the server health information that it collects to determine where to get the contents from. The second Local Director simply works as a standby in case the primary fails.

Local Director enables you to do a stateful recovery when two Local Directors are connected via a special cable. If the primary fails, then the secondary can take over without anyone in the outside world knowing anything or receiving any errors.

If you are serious about the reliability of your Web network, ensure that you have no single point of failure. For example, if you use a database server, make sure you have another that is replicating the data as close to real-time as possible so that you can recover from a database crash.
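To make the contrast with blind round-robin cycling concrete, the following Python sketch rotates through a list of servers but skips any that are marked unhealthy. It is an illustration of the general idea only, not a description of how Local Director or any other product works: HealthAwareBalancer, mark_down, mark_up, and pick are hypothetical names, and health status is set by hand here rather than by real probes.

```python
class HealthAwareBalancer:
    """Rotate through servers, skipping those currently marked down.

    Hypothetical sketch; a real device gathers health data itself.
    """

    def __init__(self, servers):
        self._servers = list(servers)
        self._healthy = set(self._servers)  # all servers start healthy
        self._next = 0                      # index of the next candidate

    def mark_down(self, server):
        # Record that a server failed its (assumed) health check.
        self._healthy.discard(server)

    def mark_up(self, server):
        # Put a known server back into rotation.
        if server in self._servers:
            self._healthy.add(server)

    def pick(self):
        # Examine each server at most once, starting at the rotation point.
        for _ in range(len(self._servers)):
            server = self._servers[self._next]
            self._next = (self._next + 1) % len(self._servers)
            if server in self._healthy:
                return server
        raise RuntimeError("no healthy servers available")


if __name__ == "__main__":
    balancer = HealthAwareBalancer(["www1", "www2", "www3"])
    balancer.mark_down("www2")
    print([balancer.pick() for _ in range(4)])
```

With www2 marked down, requests alternate between www1 and www3, so no client is sent to the failed server.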
Tuning the Apache Configuration

After you have configured the hardware, operating system, network, and the Apache software itself (all of these processes are discussed earlier in this chapter), you are ready to tune the Apache configuration. The following sections discuss several tuning options that you can easily apply to increase server performance.

Minimizing DNS lookups

If the HostnameLookups directive is set to On, Apache performs a DNS lookup for each request to resolve the IP address to a host name. This can degrade your server performance greatly, so you should seriously consider not using host lookups for each request. Set HostnameLookups to Off in httpd.conf.

Figure 22-5: A Web network that uses Local Director for load balancing.

If you must resolve IP addresses to host names for log-processing purposes, use the logresolve tool instead. See Chapter 8 for details.

Speeding up static file serving

Although everyone is screaming about dynamic Web content that is database-driven or served by fancy application servers, static Web pages are still there. In fact, dynamic contents are unlikely to completely replace static Web pages in the near future, because serving a static page is usually faster than serving a dynamic page. Some dynamic-content systems even create dynamically and periodically generated static Web pages as cache contents for faster delivery. This section discusses how you can improve the speed of static page delivery by using Apache and the Linux kernel HTTP module.
[Figure 22-5 labels: Local Director (Primary), Local Director (Stand by Mode), Switch, Switch, Proxy Server 1, Proxy Server 2, Web Server 1, Web Server 2, Web Server N, App Server 1, App Server 2, App Server N, Database Server, Database Server (Replicated Data)]

Reducing drive I/O for faster static page delivery

When Apache gets a request for a static Web page, it performs a directory tree search for .htaccess files to ensure that the requested page can be delivered to the Web browser. For example, if an Apache server running on www.nitec.com receives a request such as http://www.nitec.com/training/linux/sysad/intro.html, Apache performs these checks:

    /.htaccess
    %DocRoot%/.htaccess
    %DocRoot%/training/.htaccess
    %DocRoot%/training/linux/.htaccess
    %DocRoot%/training/linux/sysad/.htaccess

%DocRoot% is the document root directory set by the DocumentRoot directive in the httpd.conf file. So, if this directory is /www/nitec/htdocs, then the following checks are made:

    /.htaccess
    /www/.htaccess
    /www/nitec/.htaccess
    /www/nitec/htdocs/.htaccess
    /www/nitec/htdocs/training/.htaccess
    /www/nitec/htdocs/training/linux/.htaccess
    /www/nitec/htdocs/training/linux/sysad/.htaccess

Apache looks for the .htaccess file in each directory of the translated (from the requested URL) path of the requested file (intro.html). As you can see, a URL that requests a single file can result in multiple drive I/O requests to read multiple files. This can be a performance drain for high-volume sites. In such cases, your best choice is to disable .htaccess file checks altogether. For example, when the following configuration directives are placed within the main server section (that is, not within a VirtualHost directive) of the httpd.conf file, they disable checking for .htaccess for every URL request:

    <Directory />
        AllowOverride None
    </Directory>

When the above configuration is used, Apache simply performs a single drive I/O to read the requested static file and therefore gains performance in high-volume access scenarios.

Reducing system calls and drive I/O for symbolic links

On Unix and Unix-like systems running Apache, symbolic links present a danger. By using an inappropriately placed symbolic link, a Web user can view files and directories that should not be available via the Web. This is why Apache offers a way for you to disable symbolic links, or to follow a symbolic link only if the owner of the link matches the owner of the file or directory it points to. For example, the following configuration in the main server section (that is, outside any virtual host configuration) of httpd.conf instructs Apache not to follow symbolic links, effectively disabling all symbolic link access via the Web.

    <Directory />
        Options -FollowSymLinks
    </Directory>

Unfortunately, this comes with a significant performance price. For each request, Apache performs an additional system call, lstat(), to ensure that it is not violating your don’t-follow-symbolic-link policy.

To increase performance while having symbolic links and good security, do the following:

1. Find a way to not use any symbolic links on your Web document tree. You can use the find your_top_web_directory -type l -print command to find all the existing symbolic links in your top Web directory; then you can figure out how to avoid them.

2. Use the following configuration in the main server section of httpd.conf to enable symbolic links:

    <Directory />
        Options FollowSymLinks
    </Directory>

3. If you must disable symbolic links, consider narrowing the directory scope with a specific directory name.
   For example, if you want to disallow symbolic links in a directory called my_dir but allow symbolic links everywhere else (for performance), you can use this configuration:

    <Directory />
        Options FollowSymLinks
    </Directory>

    <Directory /my_dir>
        Options -FollowSymLinks
    </Directory>

4. Similarly, you can use SymLinksIfOwnerMatch:

    <Directory />
        Options FollowSymLinks
    </Directory>

    <Directory /my_dir>
        Options -FollowSymLinks +SymLinksIfOwnerMatch
    </Directory>

   Here Apache follows a symbolic link in the /my_dir directory only if the owner of the link matches the owner of the file or directory it points to.

Tuning your configuration using ApacheBench

Apache server comes with a tool called ApacheBench (ab), which is installed by default in the bin directory of your Apache installation directory. By using this nifty tool, you can tune your server configuration.

Depending on your multiprocessing module (MPM) choice (prefork, threaded, perchild), you have to tune the values for the following default configuration:

    <IfModule prefork.c>
        StartServers 5
        MinSpareServers 5
        MaxSpareServers 10
        MaxClients 20
        MaxRequestsPerChild 0
    </IfModule>

    <IfModule threaded.c>
        StartServers 3
        MaxClients 8
        MinSpareThreads 5
        MaxSpareThreads 10
        ThreadsPerChild 25
        MaxRequestsPerChild 0
    </IfModule>

    <IfModule perchild.c>
        NumServers 5
        StartThreads 5
        MinSpareThreads 5
        MaxSpareThreads 10
        MaxThreadsPerChild 20
        MaxRequestsPerChild 0
    </IfModule>

Tuning these directives randomly is not a good idea. Because your Web site and its traffic pattern and applications are likely to be different from other sites, there is no one-size-fits-all formula to calculate appropriate values for these directives. I will show you a technique, however, that uses ApacheBench to determine the appropriate values.
You should use the ApacheBench tool on a system (or on multiple systems) different from the Web server itself, because trying to do benchmarking on the same server using a client/server model will give you false information. The benchmark tool, ab, itself takes away resources from the server and therefore tampers with your results. So, you must run ab on a different machine. I recommend you run ab on multiple machines to better simulate loads.

You will have to compile Apache on other machines to get the ab binary installed on a non-Web server system. You can install a binary RPM of Apache on such a system and uninstall it after your tuning is over. See Chapters 2 and 3 for details on how to install and configure Apache.

Determine a goal for your server. Make an estimate (or guess) of how many requests you want to be able to service from your Web server. Write it down in a goal statement such as, “I wish to service N requests per second.”

Restart your Web server and, from a system other than the Web server, run the ab command as follows:

    ./ab -n number_of_total_requests \
         -c number_of_simultaneous_requests \
         http://your_web_server/page

For example:

    ./ab -n 1000 -c 50 http://www.domain.com/

The ApacheBench tool will make 50 concurrent requests and a total of 1,000 requests. Sample output is shown below:

    Server Software:        Apache/2.0.16
    Server Hostname:        localhost
    Server Port:            80

    Document Path:          /
    Document Length:        1311 bytes

    Concurrency Level:      50
    Time taken for tests:   8.794 seconds
    Complete requests:      1000
    Failed requests:        0
    Total transferred:      1754000 bytes
    HTML transferred:       1311000 bytes
    Requests per second:    113.71
    Transfer rate:          199.45 kb/s received

    Connnection Times (ms)
                  min   avg   max
    Connect:        0     0     5
    Processing:   111   427   550
    Total:        111   427   555

Notice that Requests per second is 113.71 for accessing the home page of the http://www.domain.com site.
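The Requests per second figure can be checked by hand from two other lines of the same report: completed requests divided by the time taken. The Python sketch below pulls those fields out of the sample output and repeats the arithmetic. The function name parse_ab_report and the regular expressions are illustrative assumptions about this particular report layout; they are not part of ApacheBench.

```python
import re

# The three report lines needed for the arithmetic, copied from the sample.
SAMPLE_REPORT = """\
Time taken for tests: 8.794 seconds
Complete requests: 1000
Total transferred: 1754000 bytes
"""


def parse_ab_report(text):
    """Recompute throughput figures from an ab-style report (hypothetical helper)."""
    seconds = float(re.search(r"Time taken for tests:\s+([\d.]+)", text).group(1))
    requests = int(re.search(r"Complete requests:\s+(\d+)", text).group(1))
    total_bytes = int(re.search(r"Total transferred:\s+(\d+)", text).group(1))
    return {
        # 1000 requests / 8.794 seconds
        "requests_per_second": requests / seconds,
        # Dividing by 1000 (not 1024) reproduces the sample's transfer rate.
        "kilobytes_per_second": total_bytes / seconds / 1000,
    }


if __name__ == "__main__":
    stats = parse_ab_report(SAMPLE_REPORT)
    print(f"{stats['requests_per_second']:.2f} requests/sec")
    print(f"{stats['kilobytes_per_second']:.2f} kB/sec")
```

Both recomputed values, 113.71 and 199.45, agree with the sample report, which is a quick sanity check that a goal statement in requests per second is being compared against the right number.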
Change the concurrent request count to a higher number and see how the server handles additional concurrent load. Now change the values for MaxClients, ThreadsPerChild, MaxThreadsPerChild, and so on, based on your MPM; restart Apache; and apply the same benchmark tests by using ab as before. You should see your Requests per second go up and down based on the numbers you try. As you tweak the numbers by changing the directive values, make sure you record the values and the performance so that you can determine what is a good setting for you.

[...]

... your network to access the proxy-cache. For example, if your network address is 192.168.1.0 with subnet 255.255.255.0, then you can define the following line in squid.conf to create an ACL for your network:

    acl local_net src 192.168.1.0/255.255.255.0

6. Squid needs to know that you want to allow machines in local_net ACL to have...

... single Apache server system.

Chapter 23 ✦ Creating a High-Availability Network

In this chapter, you learn about design considerations for building a Web network. A Web network is a network of Web server nodes that create a Web service. For example, Yahoo! uses a large number of Web servers, application servers,...

... server node on the network. Figure 23-2 shows an example load-balanced Web network.

[Figure 23-1 labels: Request, Load Balancer, Switch, Web Server 1, Web Server 2, Web Server N. 1: Client request comes to the load balancer. 2: Load balancer selects an available Web server. 3: Selected Web server responds to client request.]

Figure 23-1:...
... server responds to client request.

Figure 23-1: A simple load-balancing solution

[Figure 23-2 labels: Load Balancer (207.183.233.17), Load Balancer (207.183.233.17), Switch, Web Server 1 (www1.domain.com, 207.183.233.18), Web Server 2 (www2.domain.com, 207.183.233.19), Web Server N (wwwN.domain.com, 207.183.233.N), 1: http://www.domain.com request, 2, 3]

Figure 23-2: A sample load-balancing solution for www.domain.com

A request for http://www.domain.com...

... restart) the Apache server (listening on port 80) as usual by using the apachectl command. However, you have to start the Apache on port 8080 by using the /usr/local/apache/bin/httpd -f /usr/local/apache/conf/httpd-8080.conf command. This assumes that you have installed Apache in the /usr/local/apache directory; if that is not so, make sure you change the path...

    ... $dbPassword, { AutoCommit => 1 }) || die;
    };
    if ($@) {
        # die "Can't connect to database $DBI::errstr";
        # connect failed do something
        print STDERR "Can not connect to $dataSource \n";
        print STDERR "DB2 Server Error: $DBI::errstr \n";
    }
    my $statement = "SELECT myfield from mytable where ID = $id";
    my $sth = $dbh->prepare($statement);
    ...

... handles once for the entire life cycle of the child-server process. This makes the script much more efficient than the previous version.

Running mod_perl applications on a partial set of Apache children

When you start using many mod_perl scripts, you will notice that your Apache child-server processes become larger in size. You can witness...

... IP multicast address to announce resource status from each server. The above example uses the Ethernet broadcast address; you can use the multicast address instead.

Now you need to decide which directory you want to load balance (that is, redirect) between all your Web servers. Typically, this is the CGI directory. For example, the...

[Figure 22-6 labels: ... Page Server, Dynamic Page Server, http://myapps.domain.com]

Figure 22-6: Separating static and dynamic (mod_perl script-generated) contents

When a user requests the home page of a site called www.domain.com, the Apache server responsible for static pages returns the index.html page to the client. The page contains embedded links for both static and dynamic contents. The figure...

... in /usr/local/squid directory.

After you have installed Squid, you need to configure it (see the next section).

Configuring Squid

To configure Squid, follow these steps:

1. Create a group called nogroup by using the groupadd nogroup command. This group will be used by Squid.

2. Run the chown -R nobody:nogroup ...
