Windows Server 2003 Clustering & Load Balancing, Part 9

Now that you’ve added your nodes to the cluster, let’s look at the NLB Manager and some of the problems you might encounter. Remember, if you want to add more nodes, you do the same thing again: right-click the cluster and add a node. You can also add another cluster, which gives you more than one cluster to manage in the same console.

In Figure 7-6, you can see your two nodes are configured and ready to go. I have a problem, though. You can see in the figure that, within my cluster, I have a node with an hourglass, which means it’s in the process of connecting to the cluster. Notice in the right-hand pane that NLB isn’t bound, and that’s the problem. A node’s status gives you a good hint about what that node is doing. You can also look at the log entries in the bottom pane of the NLB Manager for a detailed listing of the problems you might encounter, as well as of successful transitions.

Now look at Figure 7-7. I intentionally made this considerably worse to show you what this console will flag. Remember, we also enabled logging earlier in the chapter. In Figure 7-7, I changed the IP addresses and enabled the Cluster Service. You’re given explicit details on what the problem is and how to troubleshoot it. As mentioned before, the Cluster Service started and this threw everything off. All I had to do was look in the bottom pane of the NLB Manager, and then click the error I wanted to investigate. When I opened it, I could see one of my critical errors came from the cluster node that had the Cluster Service enabled, as shown in the following illustration.
Windows Server 2003 Clustering & Load Balancing / Shimonski / 222622-6 / Chapter 7: Building Advanced Highly Available Load-Balanced Configurations

In Figure 7-8, you’ll notice there’s a problem with one of my cluster nodes. This time, the status in the right-hand pane shows the host is unreachable. The cause is that I blocked ICMP, which is the protocol ping uses. This is a problem because NLBMGR uses ICMP to contact the nodes.

Figure 7-6. NLB Manager error listing
Figure 7-7. NLB Manager status
Figure 7-8. Blocking ICMP and getting an unreachable host

Finally, I set up everything correctly. Notice the cluster is in converged status and everything is working well, as shown in Figure 7-9. That’s it! You built an NLB cluster and tested it thoroughly.

CONCLUSION

In this chapter, you learned the advanced topics of creating Highly Available solutions with Windows Server 2003. You built on the concepts learned in Chapters 1 and 3 to build load-balanced solutions.
In this chapter, you took this a step further and learned the process of proper design and configuration, not only of the NLB cluster, but also regarding security and high availability. These are important concepts you need to master before you roll out a Windows Server 2003 clustered solution. You finalized the last cluster to be built within this book. Before moving on, I want to stress a few points.

• Design, Design, Design! It’s the most important part. You don’t want a solution that loses money for your company.
• Test! You need to do a great deal of research and planning to implement a Highly Available solution, especially if you take it out to the Internet where you need to consider security, routing, switching, and many other advanced infrastructure solutions. All this must be taken into account, so you can make the right decisions and not implement the wrong technology.
• Be selective about what you want to roll out, whether it’s a failover type of cluster or a load-balanced cluster. Although they share the same name, they are completely different in form (you can review this by rereading Chapters 1 through 3).

In the next, and final, chapter, you learn the details about all the testing and monitoring that goes into Highly Available solutions, including how to monitor your clusters, baseline them, and test them for proper use.

Figure 7-9. Viewing the NLBMGR with a complete, active NLB cluster
CHAPTER 8
High Availability, Baselining, Performance Monitoring, and Disaster Recovery Planning

Copyright 2003 by The McGraw-Hill Companies, Inc.

In this chapter, you learn what you need to do after the cluster is operational. In the first chapter, I explained the basic concepts of high availability, including definitions of each high-availability component. In the chapters following Chapter 1, we reviewed many solutions using Windows 2000 and Windows Server 2003, and how to integrate them successfully into your environment. This chapter covers advanced planning procedures, Disaster Recovery Planning, and monitoring the solution you now have available. It will open your eyes to the ongoing maintenance you need to do long after you finish this book. After you read this chapter, you’ll be able to do advanced planning for high availability, implement a Disaster Recovery Plan and performance monitoring, baseline your servers, and monitor your cluster nodes for problems.

PLANNING FOR HIGH AVAILABILITY

Taking the time to plan and design is the key to your success, and it’s not only the design, but also the study effort you put in. I always joke with my administrators and tell them they’re doctors of technology.
I say, “When you become a doctor, you’re expected to be a professional and maintain that professionalism by educational growth through constant learning and updating of your skills.” Many IT staff technicians think their job is 9 to 5, with no studying done after hours. I have one word for them: Wrong! You need to treat your profession as if you’re a highly trained surgeon except, instead of working on human life, you’re working on technology. And that’s how planning for High Availability solutions needs to be addressed. You can’t simply wing it, and you can’t guess at it. You must be precise; otherwise, your investment goes down the drain. This holds true for any profession but, given the rush of people into this field since the early ’90s, you’d be surprised at the lack of knowledge among people making decisions such as high-availability planning. Make no mistake, if you don’t plan it out, you could be adding more problems to your network! Let’s continue with what you need to achieve.

Planning Your Downtime

You need to achieve as close to 100 percent uptime as possible. You know 100 percent uptime isn’t realistic, though, and it can never be guaranteed. Breakdowns occur because of disk crashes, power or UPS failure, application problems resulting in system crashes, or any other hardware or software malfunction. So, the next best thing is 99.999 percent, which is reasonable with today’s technology. You can also define in a Service Level Agreement (SLA) what 99.999 percent means to both parties. If you promised 99.999 percent uptime to someone for a single year, that translates to only about five minutes of downtime (0.001 percent of the 8,760 hours in a year is roughly 5.3 minutes). I would strive for a more forgiving target, one that realistically allows for scheduled outages and possible disaster-recovery testing performed by your staff. Go for 99.9 percent uptime, which allows for just under nine hours of downtime per year (0.1 percent of 8,760 hours is 8.76 hours). This is more practical and feasible to obtain.
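To make the uptime targets above concrete, here is a minimal sketch that converts an uptime percentage into a yearly downtime budget. The function name `downtime_budget_hours` is hypothetical (it does not come from the book), and the calculation assumes a 365-day year of 8,760 hours with no allowance for leap years.

```python
# Illustrative sketch only: the names below are assumptions, not from the book.
HOURS_PER_YEAR = 24 * 365  # 8,760 hours, ignoring leap years


def downtime_budget_hours(uptime_percent: float) -> float:
    """Return the yearly downtime, in hours, that an uptime percentage allows."""
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime_percent must be between 0 and 100")
    return HOURS_PER_YEAR * (100 - uptime_percent) / 100


if __name__ == "__main__":
    # Compare a few common SLA targets side by side.
    for target in (99.999, 99.9, 99.0):
        hours = downtime_budget_hours(target)
        print(f"{target}% uptime allows {hours:.2f} hours "
              f"({hours * 60:.1f} minutes) of downtime per year")
```

Running a comparison like this before you sign an SLA shows how quickly the budget shrinks with each extra nine, which is why a target that leaves room for scheduled maintenance is usually the safer promise.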
Whether providing or receiving such a service, both sides should test planned outages to see if delivery schedules can be met.

You can work out your uptime by taking the number of hours in a day (24) and multiplying it by the number of days in the year (365). This equals 8,760 hours in a year. Use the following equation:

percent of uptime per year = (8,760 - number of total hours down per year) / 8,760 x 100

If you schedule eight hours of downtime per month for maintenance and outages (96 hours total), then the percentage of uptime per year is (8,760 - 96) / 8,760 x 100, that is, 8,664 / 8,760 x 100. You’d wind up with about 98.9 percent uptime for your systems. This is an easy way to provide an accurate accounting of your downtime. Remember, you must account for downtime accurately when you plan for high availability. Downtime can be planned or, worse, unexpected. Sources of unexpected downtime include the following:

• Disk crash or failure
• Power or UPS failure
• Application problems resulting in system crashes
• Any other hardware or software malfunction

Building the Highly Available Solutions’ Plan

Let’s look at the plan to use a Highly Available design in your organization and review the many questions you need to ask before implementing it live. Remember, if the server is down, people can’t work, and millions of dollars can be lost within hours. The following is a sequence of events that could happen:

1. A company uses a server to access an application that accepts orders and does transactions.
2. The application, when it runs, serves not only the sales staff, but also three other companies who do business-to-business (B2B) transactions. The estimate is that, at peak, the money made within one hour exceeds 2.5 million dollars.
3. The server crashes and you don’t have a Highly Available solution in place. This means no failover, redundancy, or load balancing exists at all. It simply fails.
4. It takes you (the systems engineer) 5 minutes to be paged, but about 15 minutes to get onsite. You then take 40 minutes to troubleshoot and resolve the problem.
5. The company’s server is brought back online and connections are reestablished. Everything appears functional again. The problem was simple this time: an application glitch caused a service to stop and, once it was restarted, everything was okay.

Now, the problem with this whole scenario is this: although it was a true disaster, it was also a simple one. The systems engineer happened to be nearby and was able to diagnose the problem quite quickly. Even better, the problem was a simple fix. This easy problem still took the companies’ shared application down for at least one hour and, if this had been a peak-time period, over 2 million dollars could have been lost.

Don’t believe me? Well, this does happen, and this is what prompts people to buy a book like this. They want to become aware, so the possibility of 2 million in sales evaporating never occurs again. Worse still, the companies you connect to, and your own clientele, start to lose faith in your ability to serve them. This could also cost you revenue and the possibility of acquiring new clients moving forward.
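The arithmetic in this scenario, together with the yearly uptime formula given earlier, can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, revenue is assumed to accrue evenly across the hour, and the 2.5 million dollar figure is simply the peak estimate quoted in the scenario.

```python
# Illustrative sketch only: function names and the even-revenue assumption
# are ours, not the book's.
HOURS_PER_YEAR = 24 * 365  # 8,760 hours, ignoring leap years


def uptime_percent(hours_down_per_year: float) -> float:
    """Percent of uptime per year = (8,760 - hours down) / 8,760 x 100."""
    return (HOURS_PER_YEAR - hours_down_per_year) / HOURS_PER_YEAR * 100


def outage_minutes(paged: float, travel: float, repair: float) -> float:
    """Total outage length: time to be paged, to get onsite, and to fix."""
    return paged + travel + repair


def outage_cost(minutes_down: float, revenue_per_hour: float) -> float:
    """Revenue at risk, assuming revenue accrues evenly over the hour."""
    return revenue_per_hour * minutes_down / 60


if __name__ == "__main__":
    # Eight hours of scheduled downtime per month is 96 hours per year.
    print(f"Uptime with 96 hours down: {uptime_percent(96):.1f}%")

    # The scenario: 5 minutes to be paged, 15 to arrive, 40 to fix.
    minutes = outage_minutes(paged=5, travel=15, repair=40)
    cost = outage_cost(minutes, revenue_per_hour=2_500_000)
    print(f"Outage of {minutes:.0f} minutes puts about ${cost:,.0f} at risk")
```

Plugging your own paging, travel, and repair times into a calculation like this is a quick way to compare the cost of a single outage against the cost of the redundancy that would have prevented it.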
People talk, and the uneducated could take this small glitch as a major problem with your company’s people, instead of the technology.

Let’s look at this scenario again, except with a Highly Available solution in place:

1. A company uses a server to access an application that accepts orders and does transactions.
2. The application, when it runs, serves not only the sales staff, but also three other companies who do business-to-business (B2B) transactions. The estimate is that, at peak, the money made within one hour exceeds 2.5 million dollars.
3. The server crashes, but you do have a Highly Available solution in place. (Note, at this point, it doesn’t matter what the solution is. What matters is that you added redundancy into the service.)
4. Server and application are redundant, so when a glitch takes place, the redundancy spares the application from failing.
5. Customers are unaffected. Business continues as normal. Nothing is lost and no downtime is accumulated.
6. The one hour of downtime you saved your business just paid for the entire Highly Available solution you implemented.

One aspect we haven’t touched on in this book is people. We discussed the technological details in previous chapters but, now, let’s look at how you can position human resources to help with Highly Available solutions.

Human Resources and Highly Available Solutions

Human Resources (people) need to be trained and work onsite to deal with a disaster. They also need to know how to work under fire. As a former United States Marine, I know about the “fog of war,” where you find yourself tired, disoriented, and probably unfocused on the job. These conditions don’t help your response time with management. In any organization, especially with a system as complex as one that’s highly available, you need the right people to run it.

Managing Your Services

In this section, you see all the factors to consider while designing a Highly Available solution.
The following is a list of the main services to remember:

• Service Management is the management of the true components of Highly Available solutions: the people, the process in place, and the technology needed to create the solution. Keeping these in balance is important to having a truly viable solution. Service Management includes the design and deployment phases.
• Change Management is crucial to the ongoing success of the solution during the production phase. This type of management is used to monitor and log changes on the system.
• Problem Management addresses the process for Help Desks and server monitoring.
• Security Management is tasked with preventing unauthorized penetrations of the system.
• Performance Management is discussed in greater detail in this chapter. This type of management addresses the overall performance of the service, availability, and reliability.

Other main services also exist, but the most important ones are highlighted here. Service management is crucial to the development of your Highly Available solution. You must cater to your customers’ demands for uptime. If you promise it, you’d better deliver it.

Highly Available System Assessment Ideas

The following is a list of items for you to use during the postproduction planning phase. Make sure you’ve covered all your bases with this list:

• Now that you have your solution configured, document it! A lack of documentation will surely spell disaster for you. Documentation isn’t difficult to do; it’s simply tedious, but all that work will pay off in the end if you need it.
• Train your staff. Make sure your staff has access to a test lab, books to read, and advanced training classes. Go to free seminars to learn more about high availability. If you can ignore the sales pitch, they’re quite informative.
• Test your staff with incident response drills and disaster scenarios. Written procedures are important, but live drills are even better for seeing how your staff responds. Remember, if you have a failure on a system, it could fail over to another system, but you must quickly resolve the problem on the first system that failed. You could have the same issue on the other nodes in your cluster and, if that’s the case, you’re living on borrowed time. Set up a scenario and test it.
• Assess your current business climate, so you know what’s expected of your systems at all times. Plan for future capacity, especially as you add new applications, and as hardware and traffic increase.
• Revisit your overall business goals and objectives. Make sure your high-availability solution is delivering what you intended. If you want faster access to the systems, is it, in fact, faster? When you have a problem, is the failover seamless? Are customers affected? You don’t want to implement a Highly Available solution and have performance that gets worse. This won’t look good for you!
