Hybrid and Multicloud Solutions

Steve Suehring

Beijing Boston Farnham Sebastopol Tokyo

Hybrid and Multicloud Solutions
by Steve Suehring

Copyright © 2019 O’Reilly Media Inc. All rights reserved.

Printed in the United States of America.

Published by O’Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472.

O’Reilly books may be purchased for educational, business, or sales promotional use. Online editions are also available for most titles (http://oreilly.com). For more information, contact our corporate/institutional sales department: 800-998-9938 or corporate@oreilly.com.

Editor: Jonathan Hassell
Production Editor: Deborah Baker
Copyeditor: Octal Publishing, LLC
Proofreader: Matthew Burgoyne
Interior Designer: David Futato
Cover Designer: Karen Montgomery
Illustrator: Rebecca Demarest

February 2019: First Edition

Revision History for the First Edition
2019-02-05: First Release

See http://oreilly.com/catalog/errata.csp?isbn=9781492047216 for release details.

The O’Reilly logo is a registered trademark of O’Reilly Media, Inc. Hybrid and Multicloud Solutions, the cover image, and related trade dress are trademarks of O’Reilly Media, Inc.

While the publisher and the author have used good faith efforts to ensure that the information and instructions contained in this work are accurate, the publisher and the author disclaim all responsibility for errors or omissions, including without limitation responsibility for damages resulting from the use of or reliance on this work. Use of the information and instructions contained in this work is at your own risk. If any code samples or other technology this work contains or describes is subject to open source licenses or the intellectual property rights of others, it is your responsibility to ensure that your use thereof complies with such licenses and/or rights.

This work is part of a collaboration between O’Reilly and Mesosphere. See our statement of editorial independence.

978-1-492-04719-3
[LSI]

Table of Contents

Understanding the Hybrid Cloud
    What Is a Hybrid Cloud?
    Hybrid Cloud Challenges
    How Can I Use a Hybrid Cloud?
    Conclusion

Hybrid Cloud Architecture
    Understanding the Components of the Hybrid Cloud
    Architecting the Hybrid Cloud
    Conclusion

Choosing a Hybrid Cloud Solution
    Examining Capabilities
    Examining Decision-Making Criteria
    Conclusion

CHAPTER 1
Understanding the Hybrid Cloud

The datacenter has undergone fundamental changes over the past 10 years. Virtualization technology enables organizations to create internal private clouds, speeding up delivery of software projects. In many cases, public cloud providers provide a viable substitute for local data, and in other cases, public cloud providers supplement the private cloud.

As the world moves to the cloud, organizations are increasingly looking for ways to take advantage of their investment in private clouds for a competitive advantage. This advantage can be achieved by combining public and private cloud technologies to create a hybrid cloud.

This chapter examines the hybrid cloud, beginning with some basic vocabulary and continuing with use cases for the hybrid cloud.

What Is a Hybrid Cloud?
A hybrid cloud is a combination of one or more public cloud providers (like Amazon Web Services [AWS], Microsoft Azure, and Google Cloud Platform [GCP]) and a private cloud platform (hosted on-premises for use by a single organization) or private IT infrastructure. The public cloud and private infrastructure are distinct and independent elements that communicate over an encrypted connection. This enables organizations to store secure data in their own datacenter while using computational resources from the public cloud to run applications that rely on this data.

Many organizations use a combination of resources from both external providers and internal IT-based hardware and services. The use of each type of resource has changed over time, and organizations manage both internal and external resources through existing management practices and by using existing infrastructure tools.

An important change in enterprise computing in the past decade, one that has accelerated in the last three to five years, is the use of automation and DevOps practices. The premise is to automate as much as possible, from testing to build to production migrations, because automation leads to repeatability. Processes to test and build a project can be repeated as many times as necessary and should bring the same results every time.

Automation, Continuous Integration/Continuous Delivery (CI/CD), and related practices enable a more efficient use of resources. As the organization grows and as service offerings from public cloud providers mature, processes that were previously handled internally are moving to external providers.

However, much of the work to move jobs and processes between the private and external cloud is manual, requiring not only engineering and IT, but also architectural assistance. The labor-intensive migration of workloads is also error prone and works against the efforts toward automation and repeatability.

Organizations are increasingly looking for ways to automate and
control the movement of workloads between private and public clouds. With automated, repeatable processes, IT can now seamlessly move between cloud services to take advantage of cost savings and resource availability.

Hybrid cloud enables organizations to reduce costs by utilizing the least expensive resource for the task while taking advantage of the best solution for that task. For example, organizations might keep bandwidth-intensive operations on-premises, within their own datacenters or private clouds, where the on-premises network can easily handle the traffic.

Hybrid cloud architectures also enable a more efficient use of resources. For workloads such as CPU-intensive operations, an organization might offload the process to a public cloud to save the investment in hardware. Applications themselves can be deployed without needing to focus on the underlying platform on which the application resides.

Hybrid Cloud Challenges

Although the hybrid cloud solves many problems, it also presents some challenges. Migration of workloads and integration between the private and public clouds are primary challenges. Ensuring that applications and data do not become fragmented between clouds is a key issue as well. This section expands on some of those challenges.

Workload Migration

Integrating clouds is a labor-intensive process, and simply choosing when to use a public cloud provider is a daunting task as well.

Applications previously deployed within the internal datacenter can be the most challenging to move. From an operational standpoint, the applications are working well, so there is a natural tendency to leave these applications alone.

From a development standpoint, shifting legacy applications presents challenges. If the application has not been changed recently, there might be a lack of organizational knowledge about how the application works internally. When there is an enhancement project for the application, the fact that it still
resides on physical hardware and does not use DevOps deployment practices becomes a hindrance to accurately completing the project on time.

There are three primary means to move legacy applications: big-bang, phased, and pilot. With a big-bang rollout, the application is turned off in the legacy environment and immediately begins processing transactions in the new environment. A phased rollout sees portions of the application moved to the new platform. Finally, a pilot-based rollout tests major functionality of the application in the new environment.

There are problems with each approach. With a big-bang rollout, there is no clear path to migrate back to the legacy environment if there are problems with the new system. This means fixing issues as they arise and hoping that none of the issues are critical. A phased rollout might be difficult because it requires decoupling portions of the legacy application, which can be difficult or impossible in some situations. A pilot approach can miss key functionality and therefore not be a valid test of the new system.

It’s important to note that these issues exist regardless of whether an organization is migrating within their own cloud or between clouds. However, a key element to success in all cases is visibility. Being able to see what is happening at all phases, from development to rollout, leads to the ability to better control all aspects of the project.

Control over workload migration is a fundamental problem. The sheer number of tools involved in management of systems presents its own set of problems. Tools that help to abstract the issues around management are quite helpful.

Integrating Between Clouds

After you have moved workloads to a public cloud, managing the integration is the next challenge. There can be portions of the workload in each cloud. For example, the data processing and analytics might be handled in the public cloud, whereas other business rules are applied in the application hosted
within the private cloud.

Public cloud providers manage resources such as compute, memory, and disk as pools of on-demand resources from which the organization can draw to meet its needs. For example, adding more RAM for a particularly memory-intensive task is straightforward with a public cloud provider. When working with a private cloud, abstracting the technology to a higher level and treating the underlying server and other hardware resources as a single entity enables management of the private cloud in much the same way.

It’s when you need to manage the public and private clouds together that administration tasks become more difficult. The vendors for both clouds are usually different, sometimes requiring different skillsets for troubleshooting. On-premises management starts at the bare metal and goes to higher levels from there, whereas public cloud management brings with it a web interface or an application programming interface (API), both of which present difficulties.

Figure 2-2. Managing virtualized resources into virtual servers

Importantly, even with virtualization, resources could still remain idle, with compute resources allocated to a virtual server and not being used efficiently. As virtualization technologies matured, dynamic allocation of many resources became available, but the underlying technology itself was not as self-service oriented as today’s environment demands.

Virtualization led to private clouds, in which resources are managed in aggregate at a higher level. Private cloud architectures accomplish this by adding another layer of abstraction, with memory and processing from multiple physical hardware servers combined into larger pools that can be managed and allocated on-demand based on business needs. The on-demand nature of the private cloud facilitates self-service and automation.

From the internal perspective, a cloud is an aggregation of storage and networking along with bare-metal components
such as RAM and processing (CPU and Graphics Processing Unit [GPU]). On top of these hardware components sits a control plane. The control plane is used to allocate and manage the hardware elements below it, as shown in Figure 2-3.

Figure 2-3. Components in a private cloud

Virtualization led to containerization with software like Docker, in which applications are managed as a single entity. Containerization enables portable deployment, without needing to worry about the configuration of the underlying operating system or infrastructure.

Creating a private cloud

Deployment of a private cloud requires a large outlay of funds to get started. The initial capital expenditure for hardware is one component and is sunk as soon as the datacenter is operational. Further, the monthly cost to maintain a datacenter does not vary much based on usage. In other words, whether the datacenter is being utilized at only 5% or is being utilized at 100%, the costs are roughly the same.

Beyond the hardware and infrastructure-related expenses, control-plane software needs to be deployed in order to manage the hardware resources. Because this software provides the basis for management of the private cloud, stability and maturity are important. Being able to obtain timely and expert support when necessary is important to organizations, as well.

Disaster recovery with private cloud

Disaster recovery and providing for multiple region-separated and logically separated datacenters are consistent challenges with private clouds. The cost to create, manage, and maintain physically geo-dispersed datacenters is prohibitive for many organizations. Not only is duplicating the hardware costly, but maintenance of hardware and network means hiring staff in those locations to perform the work.

However, business needs frequently dictate the ability to perform recovery and this translates into redundancy and replication
across regional datacenters or private clouds.

Cost recovery

Some organizations perform a certain level of chargeback to the business area based on usage for private cloud components. This mirrors the charges incurred when using a public cloud provider and helps to offset costs associated with various business units’ needs. For example, if a business unit requires a large amount of data storage and redundancy across datacenters, the costs charged for those services would be higher than those charged for a simple development environment deployed in the internal cloud.

Although some organizations do not perform direct chargeback to the business unit, the resource usage and utilization can still be tracked in a much more direct way when compared to virtualization technologies.

Self-service

Private cloud deployments vary with regard to the amount of self-service that project teams are able to perform with the cloud. Deployments might be accomplished by only IT staff such as system administrators. This ensures a level of control over the environment and the resources in a managed way.

A DevOps team might also control access to deployment within the cloud, helping to guide both developers and operations staff as to usage and resource needs. DevOps teams lead the way in automating processes including allocation of cloud resources during development, testing, and promotion to production.

Finally, developers can also deploy environments into the private cloud, thus providing true self-service for the cloud resources. Deployments can be template-based and limits are typically placed on the amount and types of resources that can be deployed by a developer without additional assistance from a member of the DevOps or IT team.

External Components

In many ways, the architecture of private clouds mirrors that of public clouds such as Amazon, Google, and Microsoft. For all three companies, compute, memory, and storage are managed as separate
entities, and network is handled externally.

Creating a public cloud

Getting started with a public cloud provider means manually allocating and managing resources on an as-needed basis, commissioning and decommissioning environments when required. This is usually done through the cloud provider’s web interface and is labor intensive. Day-to-day management is accomplished through a combination of web interface and application programming interfaces (APIs). Figure 2-4 depicts the typical view of a public cloud provider.

Figure 2-4. Resources in the public cloud

Of course, there are ways to control costs with public cloud providers, such as paying an up-front fee for guaranteed resources. But doing so defeats the purpose of paying on-demand, and it becomes clear that an investment into internally based resources makes financial sense in those cases.

Disaster recovery with public clouds

Deployment of resources and environments across multiple regions is also much easier with public cloud providers. There is no need to invest in hardware or geographically dispersed staff to maintain the regional locations. This significantly reduces initial costs to get started with public cloud providers and is one reason why many organizations deploy only into public clouds.

Cost recovery

As an organization matures and grows, the on-demand nature of public cloud resources means that costs begin to increase, as well. All resources are billed on an as-used basis, usually by the minute. Controlling costs for compute and memory resources is feasible because those resources can be added and removed easily. However, storage costs accrue and become larger and larger over time because of the nature of storage itself.

Storage grows over time because of direct data collection. As transactions are moved from online data collection sources like NoSQL databases and into relational data stores or other data storage, data can be filtered and de-duplicated. Even
with de-duplication and other cleansing efforts, storage requirements will still continue to grow. This adds up to increasing costs over time unless that data storage can be moved to less-costly means for long-term storage.

Self-service

As the organization’s use of the public cloud matures, more and more automation is added to the process. At first, templates might be used for specific applications or for environments. Over time, templates turn into additional automated use to add and remove environments when needed.

DevOps and scripting practices have further enhanced the automation of cloud environments, notably for external providers for which APIs are used to create the environments as part of a deployment. The automation usually also decommissions environments and resources when they are no longer needed. This is the case for testing environments through the software development life cycle. Not only do developers benefit from dynamically allocated testing environments, but automated testing and the quality assurance team benefit as well.

Application platforms

Many public cloud providers also feature application platforms that go well beyond simple compute-type or raw resources. These application platforms, like AWS Elastic Beanstalk, provide a managed environment from which an application can be launched without needing to manage the underlying operating system. Orchestration with these application platforms is more complex than managing the external cloud as a resource pool.

When application platforms are used, backward and cross-compatibility between the private and public cloud is more difficult. While there is typically transparency or at least visibility into the environment of the application platform, recreating that environment internally is again a manual process.

Architecting the Hybrid Cloud

Thus far, the components of both private clouds and public clouds have been discussed. Both clouds are similar
insofar as they both manage resources in aggregate and give some of the same benefits to organizations that deploy one or both.

A hybrid cloud is the connection between the private and public clouds. However, it’s not just a private network connection between the two that makes the difference. A true hybrid cloud has management of both private and public clouds as one single unified resource pool, as illustrated in Figure 2-5.

Figure 2-5. A hybrid cloud brings together private and public clouds

It is the management and control layer of a hybrid cloud that brings value to an organization. With a true hybrid cloud, an organization can move workloads between the private and public clouds based on spot pricing and resource utilization. For example, if it is less expensive to process a job internally and there is capacity available, the job is moved internally. If capacity doesn’t exist internally or if it is less expensive to run a workload externally, the job is moved to an external provider.

Management Layers

Work within the cloud is sometimes split between short-running batch-style jobs and long-running services. Containers are also used in order to better and more portably manage jobs and services. These elements are executed within the software layer of a hybrid cloud along with message queuing, monitoring, and the tools used for DevOps and Continuous Integration (CI). The software used for jobs and services, along with the other items in the software layer, is managed by a platform layer.

A platform layer provides abstraction and orchestration for the software layer. Things like networking and storage are found on the platform layer along with management for clusters and containers and security. The hybrid cloud is then managed as nodes with an agent running on each node for management and monitoring. Workloads are routed to the appropriate node, whether in the private or public cloud. Logging and metrics are also contained in the
platform layer to provide a centralized location for gathering and processing of this information.

A platform layer is dependent on its underlying infrastructure layer. At the infrastructure layer in a hybrid cloud are the cloud components themselves. This includes private and public cloud components and generally any suitable infrastructure components even if not part of a formal cloud.

The platform layer enables management of software as packages and management of containers within a repository or registry. The packages and containers can then be installed and deployed within the private or public clouds or a combination of both, depending on the needs of the application. Ideally, package management would be as seamless and automated as possible, and not require the administrator to manually repeat steps to install the package. This level of orchestration is enabled because of the abstraction of the infrastructure.

Automation is facilitated through command-line and API interactions, both of which enable programmatic and scripted access into the architecture.

Conclusion

The ability to create a private cloud has evolved from simple, single-server deployments, through virtualization, both static and dynamic, to management of resources in pools and dynamic allocation of those resources on an as-needed basis. Private clouds can be costly to deploy, but the investment is mitigated by a more efficient use of the deployed resources.

Disaster recovery and high availability are challenging for private cloud deployments because providing both typically means deployment of redundant resources in multiple locations. Those resources might be unused and therefore idle throughout an application’s life cycle.

As use of the private cloud matures within an organization, more and more automation and self-service processes are added. Automation comes from DevOps practices and Agile development methodologies, where environments are deployed as needed and only
when needed. This is the case for testing environments in which code builds can be automated and an environment deployed for testing without any manual intervention.

Getting started with a public cloud is usually much less expensive than creating a private cloud. The initial outlay does not exist for external cloud deployments beyond any automation or other orchestration deployed within an organization. Resources can also be used in a more efficient manner because disaster recovery is easier. Cloud providers enable deployment into multiple regions, thus alleviating the need for an investment into hardware at multiple locations.

CHAPTER 3
Choosing a Hybrid Cloud Solution

Connecting the private and public clouds is where many organizations stop. The benefits of going further are obvious, but the added complexity can make creating a true hybrid cloud seem too difficult. Although there are solutions available, it can be difficult to find a robust solution offering the necessary level of support and advanced capabilities.

This chapter helps with the decision with respect to choosing a hybrid cloud solution that can make deployment of a hybrid cloud much easier.

Examining Capabilities

Several capabilities are necessary in order to provide the level of service necessary in today’s modern organization. When looking at cloud-specific capabilities, these are key ingredients:

• Extensive workload support
• Advanced resource pooling
• Application-centric automation
• Connectivity across clouds
• Zero-downtime migrations

Extensive Workload Support

Cloud workloads consist of long-running services and short-lived jobs. Although both private and public clouds support both types, providing orchestration and integration for these workloads is challenging.

Best practice dictates running applications on their native platform, regardless of the underlying operating system. Doing so enables the greatest compatibility and performance for a given
application. IT staff frequently support various versions of Microsoft Windows and various distributions of Linux, not to mention other Unix variants.

It is no longer uncommon to find multiple operating systems running in support of an individual application. This cross-platform nature of line-of-business applications also facilitates efficiency within the datacenter.

Ideally, a platform with hybrid cloud capabilities would also support the various underlying operating systems and applications necessary for a given workload. Certain products abstract an application into a single-button or low-touch installation.

Advanced Resource Pooling

With a true hybrid cloud platform, resources are abstracted such that applications can execute in the private or public clouds. This is true whether the application runs on another virtualization platform such as vSphere or OpenStack, on a public cloud such as Amazon Web Services (AWS) or Google Cloud Platform (GCP), or within a Docker container.

Applications should also be isolated and partitioned in order to meet security and compliance rules. Ensuring that containers do not run with administrative or root privileges is a key factor in providing security.

Application-Centric Automation

A modern hybrid cloud platform should have capabilities to interact programmatically and automatically. For example, integration points offered through an API can be used to trigger automated processes. The platform should ideally also automatically sense failed nodes and replace them as needed.

Beyond failure handling, the hybrid cloud platform should also scale up and down as needed. Scaling is one of the key features of a public cloud platform and having dynamic scaling is useful in a hybrid cloud as well.

Connectivity Across Clouds

The hybrid cloud wouldn’t be much of a hybrid without being able to connect and orchestrate between private and public clouds. The control plane should work with the popular vendors for
both cloud types, but it is the orchestration that provides the next level of connectivity.

Zero-Downtime Migrations

A unified interface for connectivity is valuable but is not enough for the next level of automation and control needed by modern organizations. Seamless migration of workloads between private and public clouds is necessary for an organization to truly take advantage of a hybrid cloud.

The migration of workloads should result in zero downtime for the end user. This is achieved by abstracting the infrastructure and by providing that level of abstraction for jobs and services. Zero-downtime migration is the single most important element in choosing a hybrid cloud solution.

Examining Decision-Making Criteria

There are three criteria that are helpful when examining solutions to the problem of creating a hybrid cloud:

• Pricing
• Application control
• Data locality and regulatory compliance

This section looks at each of those criteria.

Pricing

Many open source packages are available to provide pieces of functionality discussed in this book. Not only do these packages offer access to the source code for customizations, but many times the packages are available at no cost. However, combining many such packages requires another level of care and expertise to achieve the same result as a reasonably priced, single-source solution.

Adding the vital enterprise support and configuration assistance is a differentiator between a self-compiled open source package and one obtained from a vendor. This is notably the case with several large open source projects, for which enterprise support is provided by a vendor who also is an expert in the open source package.

Application Control

Using a single interface and platform for control of an application or workload is another differentiator when looking at hybrid cloud solutions. There is tangible effort involved when working with multiple interfaces and vendors in order to configure a job
or service. Consider this example: to deploy a new service you need to work with database export software, configuration tools, deployment tools, and other elements in order to bring up the service within the private cloud. Making that same service work correctly in the public cloud is no small effort because the tools might be different or at least have a different front-end interface.

Contrast that example with a single control plane from which you can configure and launch a service in either or both clouds and move the service between the clouds without downtime. When looking for a hybrid cloud solution, choose one that has a single, unified interface for application control.

Data Locality and Regulatory Compliance

Regulatory compliance is more important than ever. For certain cases, part of that compliance is ensuring that data is homed or stored in the region or country in which it was collected or is to be used. This can be challenging and frequently leads to master data management issues wherein there might be more than one source of truth for a given data element.

By itself, the hybrid cloud helps with regulatory compliance by facilitating use of public cloud providers located in the desired regions. You then can store data where it needs to be stored in compliance with the necessary regulations and local laws.

However, like other cases, no configuration or built-in template will help create the architecture that is necessary to be compliant. Therefore, it’s up to the system administrators and operations staff to understand and ensure compliance through the use of disparate tools.

Ideally, though, a hybrid cloud solution would help to facilitate the compliant architecture. In much the same way that a single interface makes implementation of a true hybrid cloud better, a single interface can also give the administrator power to move services and jobs around seamlessly. This makes future compliance easier, as well. Not only
can the system be compliant with today’s regulations, but as those regulations change, workloads and data can be shifted around as needed to maintain compliance.

Auditing is also part of the compliance landscape. Tracking where and when actions happened with data and services is important. Like other aspects, a hybrid cloud by itself will have no facilities for integrated logging but rather will have logs and audit trails spread over all of the tools used. This makes piecing together what happened and when it happened very challenging.

A solution that uses a single control plane abstracts the jobs and services in such a way that unified logging and a unified audit trail become possible. Collection of that audit trail is much easier and parsing the information to find out what happened and when it happened becomes trivial.

Conclusion

Choosing a solution for a hybrid cloud implementation involves the examination of several criteria. The first of these is typically the workloads themselves: can the proposed solution run the jobs and services that the organization needs to run?
After you answer that question, you should evaluate the hybrid cloud solution on how well it does things like pooling of resources and automation. The depth at which a hybrid cloud solution connects between clouds is important. Superficial connections can give the appearance of hybrid behavior while not really providing much substantive assistance to architects and engineers.

The most important element in choosing a hybrid cloud solution is how it handles migrations. The goal is migrations that you can perform with no noticeable downtime for the end user. To achieve this goal, the solution needs to have an architecture designed with zero downtime in mind.

Final decision-making criteria typically include the cost of the solution along with its ability to control the applications needed by the organization. Many organizations are also affected by data locality and regulatory issues that are a factor in the final decision. The solution you choose must be able to support the regulatory environment in which the organization operates.

About the Author

Steve Suehring is an assistant professor of Computing and New Media Technologies at University of Wisconsin–Stevens Point. In addition to working as an editor for LinuxWorld Magazine, Steve has written several books on a variety of technologies, including JavaScript, Linux security, Windows Server certifications, Perl, and others. Steve has worked at a large Internet provider in both systems engineering and security roles, as well as at a Fortune 1000 company, helping provide architectural direction on numerous initiatives.
...components were still limited to single-server architectures, with the possible addition of external Storage Area Network (SAN) storage.

Figure 2-2. Managing...

...hardware servers, with each server containing its own preset amount of each resource, as illustrated in Figure 2-1.

Figure 2-1. A traditional server, with fixed resources allocated to it

As datacenter