Chapter 29: Soaring in the Clouds

Pros and Cons of Cloud Computing

The importance of any of these or how much you should be concerned with them is determined by your particular company’s needs at a particular time.

We have covered what we see as the top drawbacks and benefits of cloud computing as they exist today. As we have mentioned throughout this section, how these affect your decision to implement a cloud computing infrastructure will vary depending on your business and your application. In the next section, we are going to cover some of the different ways in which you may consider utilizing a cloud environment as well as how you might consider the importance of some of the factors discussed here based on your business and systems.

UC Berkeley on Clouds

Researchers at UC Berkeley have outlined their take on cloud computing in a paper “Above the Clouds: A Berkeley View of Cloud Computing.” [1] They cover the top 10 obstacles that companies must overcome in order to utilize the cloud:

1. Availability of service
2. Data lock-in
3. Data confidentiality and auditability
4. Data transfer bottlenecks
5. Performance unpredictability
6. Scalable storage
7. Bugs in large distributed systems
8. Scaling quickly
9. Reputation fate sharing
10. Software licensing

Their article concludes by stating that they believe cloud providers will continue to improve and overcome these obstacles. They continue by stating that “. . . developers would be wise to design their next generation of systems to be deployed into Cloud Computing.”

[1] Armbrust, Michael, et al. “Above the Clouds: A Berkeley View of Cloud Computing.” http://www.eecs.berkeley.edu/Pubs/TechRpts/2009/EECS-2009-28.pdf.

Where Clouds Fit in Different Companies

The first item to cover is a few of the various implementations of clouds that we have either seen or recommended to our clients.
Of course, you can host your application’s production environment on a cloud, but there are many other environments in today’s software development organizations. There are also many ways to utilize different environments together, such as combining a managed hosting environment with a collocation facility. Obviously, hosting your production environment in a cloud offers you the ability to scale on demand from a virtual hardware perspective. Of course, this does not ensure that your application’s architecture can make use of this virtual hardware scaling; that you must ensure ahead of time. There are other ways that clouds can help your organization scale that we will cover here. If your engineering or quality assurance teams are waiting for environments, the entire product development cycle is slowed down, which means scalability initiatives such as splitting databases, removing synchronous calls, and so on get delayed and affect your application’s ability to scale.

Environments

For your production environment, you can host everything in one type of infrastructure, such as managed hosting, collocation, your own data center, a cloud computing environment, or any other. However, there are creative ways to utilize several of these together to take advantage of their benefits but minimize their drawbacks. Let’s look at an example of an ad serving application. The ad serving application consists of a pool of Web servers to accept the ad request, a pool of application servers to choose the right advertisement based on information conveyed in the original request, an administrative tool that allows publishers and advertisers to administer their accounts, and a database for persistent storage of information. The ad servers in our application do not need to access the database for each ad request. They make a request to the database once every 15 minutes to receive the newest advertisements.
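The 15-minute refresh is what makes it practical to place ad servers far from the database. As a rough illustration, the pattern can be sketched in Python as an in-memory cache that reloads only when it is stale. The class name `AdCache`, the `load_ads` callable (standing in for the database query), and the injectable `clock` are assumptions made for this sketch and do not come from the text.

```python
import time
from typing import Callable, List, Optional

# The 15-minute refresh interval described in the text, in seconds.
REFRESH_INTERVAL_SECONDS = 15 * 60


class AdCache:
    """Holds advertisements in memory and reloads them at most once per interval."""

    def __init__(self,
                 load_ads: Callable[[], List[str]],
                 clock: Callable[[], float] = time.monotonic,
                 interval: float = REFRESH_INTERVAL_SECONDS) -> None:
        self._load_ads = load_ads  # hypothetical stand-in for the database query
        self._clock = clock        # injectable so the cache can be tested without waiting
        self._interval = interval
        self._ads: List[str] = []
        self._loaded_at: Optional[float] = None

    def get_ads(self) -> List[str]:
        """Return cached ads, calling the loader only when the cache is stale."""
        now = self._clock()
        if self._loaded_at is None or now - self._loaded_at >= self._interval:
            self._ads = self._load_ads()
            self._loaded_at = now
        return self._ads
```

Because every ad request is served from memory, the latency between a cloud-hosted ad server and a database in another facility is paid only once per refresh rather than once per request.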
In this situation, we could of course purchase a bunch of servers to rack in a collocation space for each of the Web server pool, ad server pool, administrative server pool, and database servers. We could also just lease the use of these servers from a managed hosting provider and let them worry about the physical servers. Alternatively, we could host all of this in a cloud environment on virtual hosts. We think there is another alternative, as depicted in Figure 29.2. Perhaps we have the capital to purchase the pools of servers and we have the skill set in our team members to handle setting up and running our own physical environment, so we decide to rent space at a collocation facility and purchase our own servers. But we also like the speed and flexibility gained from a cloud environment. We decide that since the Web and app servers don’t talk to the database very often, we are going to host one pool of each in a collocation facility and another pool of each on a cloud. The database will stay at the collocation facility, but snapshots will be sent to the cloud to be used for disaster recovery. The Web and application servers in the cloud can be increased as traffic demands to help us cover unforeseen spikes.

Another use of cloud computing is in all the other environments that are required for a modern software development organization. These environments include but are not limited to production, staging, quality assurance, load and performance, development, build, and repositories. Many of these should be considered for implementation in a cloud environment because of the possible reduced cost, as well as the flexibility and speed of setting them up when needed and tearing them down when they are no longer needed. Even enterprise class SaaS companies or Fortune 500 corporations that may never consider hosting production instances of their applications on a cloud could benefit from utilizing the cloud for other environments.
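The surge arrangement described earlier in this section, a fixed collocation pool plus a cloud pool that grows with traffic, implies a simple sizing question: how many cloud instances are needed once traffic exceeds what the collocation pool can serve? The following is a minimal sketch in Python. The function name, its parameters, and the 25% default headroom are all hypothetical; real capacity figures would have to come from your own load and performance testing.

```python
import math


def cloud_instances_needed(expected_rps: float,
                           colo_capacity_rps: float,
                           per_instance_rps: float,
                           headroom: float = 0.25) -> int:
    """Return how many cloud instances cover traffic beyond the collocation pool.

    expected_rps      -- forecast or observed requests per second
    colo_capacity_rps -- requests per second the collocation pool can serve
    per_instance_rps  -- requests per second one cloud instance can serve
    headroom          -- extra fraction of traffic to provision for (assumed value)
    """
    if per_instance_rps <= 0:
        raise ValueError("per_instance_rps must be positive")
    target = expected_rps * (1.0 + headroom)
    overflow = target - colo_capacity_rps
    if overflow <= 0:
        return 0  # the collocation pool alone is enough
    return math.ceil(overflow / per_instance_rps)
```

A scheduled job could call a function like this with current traffic measurements and start or stop cloud instances to match, which also gives one owner the responsibility for shutting instances down.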
[Figure 29.2: Combined Collocation and Cloud Production Environment. Labels: End Users, Internet, Collocation Facility, Database, Cloud Environment.]

Skill Sets

What are some of the other factors when considering whether to utilize a cloud, and if you do utilize the cloud, then for which environments? One consideration is the skill set and number of personnel that you have available to manage your operations infrastructure. If you do not have both networking and system administration skill sets among your operations staff, you need to consider this when determining if you can implement and support a collocation environment. The most likely answer in that case is that you cannot. Without the necessary skill set, moving to a more sophisticated environment will actually cause more problems than it will solve. The cloud has similar issues; if someone isn’t responsible for deploying and shutting down instances and this is left to each individual developer or engineer, it is very possible that the bill at the end of the month will be much more than you expected. Instances that are left running are wasting money unless someone has made a purposeful decision that the instance is necessary.

Another type of skill set that may influence your decision is capacity planning. If your business has very unpredictable traffic, or if you do not have the necessary skill set on staff to accurately predict the traffic, this may heavily influence your decision to use a cloud. Certainly one of the key benefits of the cloud is the ability to handle spiky demand by quickly deploying more virtual hosts.

All in all, we believe that cloud computing likely has a fit in almost any company. This fit might not be for hosting your production environment, but may be rather for hosting your testing environments.
If your business’ growth is unpredictable, if speed is of utmost urgency, and if cutting costs is imperative to survival, the cloud might be a great solution. If you can’t afford to allocate headcount for operations management or predict what kind of capacity you may need down the line, cloud computing could be what you need. How you put all this together to make the decision is the subject of the next section in this chapter.

Decision Process

Now that we’ve looked at the pros and cons of cloud computing and we’ve discussed some of the various ways in which cloud environments can be integrated into a company’s infrastructure, the last step is to provide a process for making the final decision. The overall process that we are recommending is to first determine the goals or purpose of wanting to investigate cloud computing, then create alternative implementations that achieve those goals. Weigh the pros and cons based on your particular situation. Rank each alternative based on the pros and cons. Based on the final tally of pros and cons, select an alternative. Let’s walk through an example.

Let’s say that our company, AlwaysScale.com, is evaluating integrating a cloud infrastructure into its production environment. The first step is to determine what goals we hope to achieve by utilizing a cloud environment. For AlwaysScale.com, the goals are to lower the operating cost of infrastructure, decrease the time to procure and provision hardware, and maintain 99.99% availability for its application. Based on these three goals, the team has decided on three alternatives. The first is to do nothing, remain in a collocation facility, and forget about all this cloud computing talk. The second alternative is to use the cloud for only surge capacity but remain in the collocation facility for most of the application services. The third alternative is to move completely onto the cloud and out of the collocation space.
This has accomplished steps one and two of the decision process.

Step three is to apply weights to all of the pros and cons that we can come up with for our alternative environments. Here, we will use the five cons and three pros that we outlined earlier. We will use a 1, 3, or 9 scale to weight these so that the factors we care about are strongly differentiated. The first con is security, which we care somewhat about, but we don’t store PII or credit card info, so we weight it a 3. We continue with portability and determine that we don’t really feel the need to be able to move quickly between infrastructures, so we weight it a 1. Next is control, which we really care about, so we weight it a 9. Then, the limitations of such things as IP addresses, load balancers, and certification of third-party software are weighted a 3. We care about the load balancers but don’t need our own IP space and use all open source unsupported third-party software. Finally, the last of the cons is performance. Because our application is not very memory or disk intensive, we don’t feel that this is too big of a deal for us, so we weight it a 1. For the pros, we really care about cost, so we weight it a 9. The same goes for speed: It is one of the primary goals, so we also weight it a 9. Last is flexibility, which we don’t expect to make much use of, so we weight it a 1.

The fourth step is to rank each alternative on a scale from 0 to 5 according to how strongly it demonstrates each of the pros and cons. For example, with the “use the cloud for only surge capacity” alternative, the portability drawback should be ranked very low because it is not likely that we need to exercise that option. Likewise, with the “move completely to the cloud” alternative, the limitations are more heavily influential because there is no other environment, so it gets ranked a 5. The completed decision matrix can be seen in Table 29.1.
After the alternatives are all scored against the pros and cons, the numbers can be multiplied and summed. The weight of each pro and con is multiplied by the rank or score of each alternative; these products are summed for each alternative. For example, alternative #2, Cloud for Surge, has been ranked a 2 for security, which is weighted a –3. All cons are weighted with negative scores so the math is simpler. The product of the rank and the weight is –6, which is then summed with all the other products for alternative #2, equaling 9 for a total score:

(2 × –3) + (1 × –1) + (3 × –9) + (3 × –3) + (3 × –1) + (3 × 9) + (3 × 9) + (1 × 1) = 9

Table 29.1 Decision Matrix

                 Weight          No      Cloud for   Completely
                 (1, 3, or 9)    Cloud   Surge       Cloud
Cons
  Security       –3              0       2           5
  Portability    –1              0       1           4
  Control        –9              0       3           5
  Limitations    –3              0       3           4
  Performance    –1              0       3           3
Pros
  Cost            9              0       3           5
  Speed           9              0       3           3
  Flexibility     1              0       1           1
Total                            0       9           –6

The final step is to compare the total scores for each alternative and apply a level of common sense to it. Here, we have the alternatives with scores of 0, 9, and –6, which would clearly indicate that alternative #2 is the best choice for us. Before automatically assuming that this is our decision, we should verify that, based on our common sense and other factors that might not have been included, this is a sound decision. If something appears to be off or you want to add other factors such as operations skill sets, redo the matrix or have several people do the scoring independently to see how a group of different people score the matrix differently.

The decision process is meant to provide you with a formal method of evaluating alternatives. Using these types of matrices, it becomes easier to see what the data is telling you so that you make a well-informed and data-based decision. For times when a full decision matrix is not justified or you want to test an idea, consider using a rule of thumb.
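The tally behind Table 29.1 can be reproduced in a few lines. The following is a minimal sketch in Python; the names `WEIGHTS`, `SCORES`, `tally`, and `rank_alternatives` are illustrative and not from the text, while the numbers are copied from the table.

```python
from typing import Dict, List, Tuple

# Weights from Table 29.1: cons carry negative weights, pros positive ones.
WEIGHTS: Dict[str, int] = {
    "security": -3, "portability": -1, "control": -9,
    "limitations": -3, "performance": -1,
    "cost": 9, "speed": 9, "flexibility": 1,
}

# Scores (0 to 5) for each alternative, also from Table 29.1.
SCORES: Dict[str, Dict[str, int]] = {
    "No Cloud": dict.fromkeys(WEIGHTS, 0),
    "Cloud for Surge": {
        "security": 2, "portability": 1, "control": 3,
        "limitations": 3, "performance": 3,
        "cost": 3, "speed": 3, "flexibility": 1,
    },
    "Completely Cloud": {
        "security": 5, "portability": 4, "control": 5,
        "limitations": 4, "performance": 3,
        "cost": 5, "speed": 3, "flexibility": 1,
    },
}


def tally(weights: Dict[str, int], scores: Dict[str, int]) -> int:
    """Multiply each factor's score by its weight and sum the products."""
    return sum(weights[factor] * scores[factor] for factor in weights)


def rank_alternatives(weights: Dict[str, int],
                      matrix: Dict[str, Dict[str, int]]) -> List[Tuple[str, int]]:
    """Return (alternative, total) pairs ordered from highest total to lowest."""
    totals = {name: tally(weights, scores) for name, scores in matrix.items()}
    return sorted(totals.items(), key=lambda pair: pair[1], reverse=True)


print(rank_alternatives(WEIGHTS, SCORES))
# [('Cloud for Surge', 9), ('No Cloud', 0), ('Completely Cloud', -6)]
```

Keeping the weights and scores as data makes it easy to have several people score the matrix independently and compare the resulting rankings, as the text suggests.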
One that we often employ is a high-level comparison of risk. In the Web 2.0 and SaaS world, an outage has the potential to cost a lot of money. Considering this, a potential rule of thumb would be: If the cost of just one outage exceeds the benefits gained by whatever change you are considering, you’re better off not introducing the change.

Decision Steps

The following are steps to help make a decision about whether to introduce cloud computing into your infrastructure:

1. Determine the goals or purpose of the change.
2. Create alternative designs for how to use cloud computing.
3. Place weights on all the pros and cons that you can come up with for cloud computing.
4. Rank or score the alternatives using the pros and cons.
5. Tally scores for each alternative by multiplying the score by the weight and summing.

This decision matrix process will help you make data-driven decisions about which cloud computing alternative implementation is best for you.

The most likely question with regard to introducing cloud computing into your infrastructure is not whether to do it but rather when and how to do it. Cloud computing is not going away and in fact is likely to be the preferred, but not the only, infrastructure model of the future. We all need to keep an eye on how cloud computing evolves over the coming months and years. This technology has the potential to change the fundamental cost and organization structures of most SaaS companies.

Conclusion

In this chapter, we covered the benefits and drawbacks of cloud computing. We identified five categories of cons to cloud computing: security, portability, control, limitations, and performance. The security category is our concern over how our data is handled after it is in the cloud. The provider has no idea what type of data we store there and we have no idea who has access to that data. This discrepancy between the two causes some concern.
Portability addresses the fact that porting between clouds, or between clouds and physical hardware, is not necessarily easy depending on your application. The control issues come from integrating another third-party vendor into your infrastructure that has influence over not just one part of your system’s availability but probably the entirety of your site’s availability. The limitations that we identified were the inability to use your own IP space, having to use software load balancers, and certification of third-party software on the cloud infrastructure. Last of the cons was performance, which we noted as varying between cloud vendors as well as physical hardware. The degree to which you care about any of these cons should be dictated by your company and the applications that you are considering hosting on the cloud environment.

We also identified three pros: cost, speed, and flexibility. The pay per usage model is extremely attractive to companies and makes great sense. The speed is in reference to the unequaled speed of procurement and provisioning that can be done in a virtual environment. The flexibility is in how you can utilize a set of virtual servers today as a quality assurance environment, shut them down at night, and bring them back up the next day as a load and performance testing environment. This is a very attractive feature of the virtual host in cloud computing.

After covering the pros and cons, we discussed the various ways in which cloud computing could exist in different companies’ infrastructure. These alternatives included using the cloud not only as part or all of the production environment but also for other environments such as quality assurance or development. As part of the production environment, cloud computing could be used for surge capacity or disaster recovery or, of course, to host all of production.
There are many variations in the way that companies can implement and utilize cloud computing in their infrastructure. These examples are designed to show you how you can make use of the pros or benefits of cloud computing to aid your scaling efforts, whether directly for your production environment or more indirectly by aiding your product development cycle. This could take the form of making use of the speed of provisioning virtual hardware or the flexibility in using the environments differently each day.

Lastly, we talked about how to make the decision of whether to use cloud computing in your company. We provided a five-step process that included establishing goals, describing alternatives, weighting pros and cons, scoring the alternatives, and tallying the scores and weightings to determine the highest scoring alternative. The bottom line to all of this was that even if a cloud environment is not right for your organization today, you should continue looking at cloud environments because they will continue to improve, and it is very likely that one will be a good fit at some time.

Key Points

• Pros of cloud computing include cost, speed, and flexibility.
• Cons of cloud computing include security, control, portability, inherent limitations of the virtual environment, and performance differences.
• There are many ways to utilize cloud environments.
• Clouds can be used in conjunction with other infrastructure models by using them for surge capacity or disaster recovery.
• You can use cloud computing for development, quality assurance, load and performance testing, or just about any other environment including production.
• There is a five-step process for helping to decide where and how to use cloud computing in your environment.
• All technologists should be aware of cloud computing; almost all organizations can take advantage of cloud computing.
Chapter 30: Plugging in the Grid

And if we are able thus to attack an inferior force with a superior one, our opponents will be in dire straits.
- Sun Tzu

In Chapter 28, Clouds and Grids, we covered the basics of grid computing. In this chapter, we will cover in more detail the pros and cons of grid computing as well as where such computing infrastructure could fit in different companies. Whether you are a Web 2.0, Fortune 500, or Enterprise Software company, it is likely that you have a need for grid computing in your scalability toolset. This chapter will provide you with a framework for further understanding a grid computing infrastructure as well as some ideas of where in your organization to deploy it. Grid computing offers the scaling on demand of computing cycles for computationally intense applications or programs. With an understanding of the benefits and cons of grid computing and some ideas on how this type of technology might be used, you should be well armed to use this knowledge in your scalability efforts.

By way of a refresher, we defined grid computing in Chapter 28 as the term used to describe the use of two or more computers processing individual parts of an overall task. Tasks that are best structured for grid computing are ones that are computationally intensive and divisible, meaning able to be broken into smaller tasks. Software is used to orchestrate the separation of tasks, monitor the computation of these tasks, and then aggregate the completed tasks. This is parallel processing on a network distributed basis instead of inside a single machine. Before grid computing, mainframes were the only way to achieve this scale of parallel processing. Today’s grids are often composed of thousands of nodes spread across networks such as the Internet.

Why would we consider grid computing as a principle, architecture, or aid to an organization’s scalability?
The reason is that grid computing allows for the use of significant computational resources by an application in order to process more quickly or solve problems faster. Dividing processing is a core component of scaling; think of the x-, y-, and z-axes splits in the AKF Scale Cubes. Depending on how the separation of processing is done or viewed, the splitting of the application for grid computing might take the shape of one or more of the axes.

Pros and Cons of Grids

Grid environments are ideal for applications that need computationally intensive environments and for applications that can be divided into elements that can be simultaneously executed. With that as a basis, we are going to discuss the benefits and drawbacks of grid computing environments. The pros and cons are going to matter differently to different organizations. If your application can be divided easily, either by luck or design, you might not care that the only way to achieve great benefits is with applications that can be divided. However, if you have a monolithic application, this drawback may be so significant as to completely discount the use of a grid environment. As we discuss each of the pros and cons, keep in mind that each will matter more or less to your technology organization.

Pros of Grids

The pros of grid computing models include high computational rates, shared infrastructure, utilization of unused capacity, and cost. Each of these is explained in more detail in the following sections. The ability to scale computation cycles up quickly as necessary for processing is obviously directly applicable to scaling an application, service, or program. In terms of scalability, it is important to grow the computational capacity as needed, but it is equally important to do this efficiently and cost effectively.
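How much a grid can help depends on how much of the application can be divided, and Amdahl's law gives a quick upper bound on that benefit. In its usual form, if a fraction p of the run time can be parallelized across n hosts, the overall speedup is 1 / ((1 - p) + p / n). The following is a minimal sketch in Python; the function names are illustrative, and the formula ignores network and coordination overhead, so real grids will do somewhat worse.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Speedup predicted by Amdahl's law for the given number of workers.

    parallel_fraction -- share of run time that can be parallelized (0.0 to 1.0)
    workers           -- number of hosts working on the parallel share
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


def max_speedup(parallel_fraction: float) -> float:
    """Upper bound on speedup as the number of workers grows without limit."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if parallel_fraction == 1.0:
        return float("inf")
    return 1.0 / (1.0 - parallel_fraction)
```

For a program that is half serial, `max_speedup(0.5)` is 2.0: no number of additional hosts can make it more than twice as fast, which is why a monolithic application may gain little from a grid.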
High Computational Rates

The first benefit that we want to discuss is a basic premise of grid computing: high computational rates. The grid computing infrastructure is designed for applications that need computationally intensive environments. The combination of multiple hosts with software for dividing tasks and data allows for the simultaneous execution of multiple tasks. The amount of parallelization is limited by the hosts available, the amount of division possible within the application, and, in extreme cases, the network linking everything together. We covered Amdahl’s law in Chapter 28, but it is worth repeating as this defines the upper bound of this benefit from the limitation of the application. The law was developed by Gene Amdahl in 1967 and states that the portion of a program that cannot be parallelized will limit the total speed up from parallelization. [1] This means that nonsequential parts of a program will benefit from the parallelization, but the rest of the program will not.

[1] Amdahl, G.M. “Validity of the single-processor approach to achieving large scale computing capabilities.” In AFIPS Conference Proceedings, vol. 30 (Atlantic City, N.J., Apr. 18-20). AFIPS Press, Reston, Va., 1967, pp. 483-485.

Shared Infrastructure

The second benefit of grid computing is the use of shared infrastructure. Most applications that utilize grid computing do so either daily, weekly, or on some other periodic basis. Outside of the periods in which the computing infrastructure is used for ...

[...]

... have the space and resources to do so, we can plot the run times of each of our functions within our application over time. We can use the most recent 24 hours of data, compare it to the last week of data, and compare the last week of data to the last month of data. We don’t have to keep the granular by-transaction records for each of our calls, but rather aggregate them over time for the purposes of comparison ...

[...]

... infinite cost and a very, very low relative return.

Monitoring and Processes

Alas, we come to the point of how all of this monitoring fits into our operations and business processes. Our monitoring infrastructure is the lifeblood of many of our processes. The monitoring we perform to answer the questions of “Is there a problem?” to “What is the problem?” will likely create the data necessary to inform the decisions ...

[...]

... from the primary sources, transformed into a different form (usually a denormalized star schema form), and then loaded into the data warehouse. The transformation can be computationally intensive and therefore a primary candidate for the power of grid computing. The transformation process may be as simple as denormalizing data or it may be as extensive as rolling up many months’ worth of sales data for ...

[...]

... environment of some applications, the transformation part of the data warehousing ETL process, the building or compiling process for applications, and the back office processing of computationally intensive tasks. Each of these is a great example where you may have a need for fast and large amounts of computations. Not all similar applications can make use of the grid, ...

[...]

... our QA environment for the grid, so we ranked it high for both projects. We continued in this manner scoring each project until the entire matrix was filled in. Step four is to multiply the scores by the weights and then sum the products up for each project. For the ETL example, we multiply the weight –1 by the score 1, add it to the product of the second weight –1 by the score 1 again, and continue in this ...

[...]

... when a host dies in the middle of a job, what data the host needs to perform the task, gathering the processed results back afterward, deleting the data from the host, and aggregating the results together. This adds a lot of complexity and if you have ever debugged an application that has hundreds of instances of the same application on different servers, you can imagine the challenge of debugging one ...

[...]

... reimplement the current system. “If we only take some of the noise out of the system, my team can sleep better and address the real issues that we face,” she might say. We’ve heard the reasons for new and better monitoring systems time and again, and although they are sometimes valid, most often we believe they result in a destruction of shareholder value. The real issue isn’t typically that the monitoring ...

[...]

... thought regarding the design and deployment of your proprietary technology.

User Experience and Business Metrics

User experience and business metric monitors are meant to answer the question of “Is there a problem?” Often, you need to implement both of them to get a good view of the overall health of a system, but in many cases, you need only a handful to be able to answer the question of whether a problem ...

[...]

... but parts of many of them can be implemented on a grid. Perhaps the entire ETL process doesn’t make sense to run on a grid, but the transformation process might be the key part that needs the additional computations. The last section of this chapter was the decision matrix. We provided a framework for companies and organizations to use to think through logically which projects make the most sense for implementing ...
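The orchestration pattern from this chapter's definition of grid computing (separate the task, run the pieces simultaneously, then aggregate the results) can be illustrated on a single machine. The following is a minimal sketch in Python in which a thread pool stands in for grid nodes; `split` and `run_on_grid` are hypothetical names, and a real grid would also need the failure handling, data distribution, and cleanup that the excerpts above describe.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def split(items: Sequence[T], parts: int) -> List[Sequence[T]]:
    """Divide items into at most `parts` contiguous chunks of near-equal size."""
    if parts < 1:
        raise ValueError("parts must be at least 1")
    size, extra = divmod(len(items), parts)
    chunks: List[Sequence[T]] = []
    start = 0
    for index in range(parts):
        end = start + size + (1 if index < extra else 0)
        if end > start:  # skip empty chunks when there are fewer items than parts
            chunks.append(items[start:end])
        start = end
    return chunks


def run_on_grid(items: Sequence[T],
                work: Callable[[Sequence[T]], R],
                combine: Callable[[List[R]], R],
                nodes: int = 4) -> R:
    """Separate the task, process the chunks concurrently, then aggregate."""
    chunks = split(items, nodes)
    # Threads stand in for grid nodes here; this is an illustration only.
    with ThreadPoolExecutor(max_workers=nodes) as pool:
        partial_results = list(pool.map(work, chunks))
    return combine(partial_results)


# Usage: sum the numbers 1 to 100 in four chunks, then add the partial sums.
total = run_on_grid(list(range(1, 101)), work=sum, combine=sum, nodes=4)
print(total)  # 5050
```

The task here is trivially divisible because each chunk can be summed without knowing anything about the others. Work that needs shared state between chunks does not fit this shape as cleanly, which is the divisibility requirement the chapter keeps returning to.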