Container Solutions

THE CLOUD NATIVE ATTITUDE
Move Fast Without Breaking Everything
Anne Currie

PART Cloud Native Case Studies

ABOUT THIS BOOK/BLURB

This is a small book with a single purpose: to tell you all about Cloud Native - what it is, what it’s for, who’s using it and why. Go to any software conference and you’ll hear endless discussion of containers, orchestrators and microservices. Why are they so fashionable? Are there good reasons for using them? What are the trade-offs, and do you have to take a big bang approach to adoption? We step back from the hype, summarize the key concepts, and interview some of the enterprises who’ve adopted Cloud Native in production.

Take copies of this book and pass them around, or just zoom in to increase the text size and ask your colleagues to read over your shoulder. Horizontal and vertical scaling are fully supported. The only hard thing about this book is you can’t assume anyone else has read it, and the narrator is notoriously unreliable.

What did you think of this book?
We’d love to hear from you with feedback, or if you need help with a Cloud Native project, email info@container-solutions.com.

This book is available in PDF form from the Container Solutions website at www.container-solutions.com.

First published in Great Britain in 2017 by Container Solutions Publishing, a division of Container Solutions Ltd.
Copyright © Anne Berger (nee Currie) and Container Solutions Ltd 2017.
Chapter “Distributed Systems Are Hard” first appeared in The New Stack on 25 Aug 2017.
Design by Remember to Play / www.remembertoplay.co

ABOUT THE AUTHORS

Anne Currie

Anne Currie has been in the software industry for over 20 years, working on everything from large scale servers and distributed systems in the ’90s to early ecommerce platforms in the ’00s to cutting edge operational tech in the ’10s. She has regularly written, spoken and consulted internationally. She firmly believes in the importance of the technology industry to society and fears that we often forget how powerful we are. She is currently working with Container Solutions.

Container Solutions

As experts in Cloud Native strategy and technology, Container Solutions support their clients with migrations to the cloud. Their unique approach starts with understanding the specific customer needs. Then, together with the client’s team, they design and implement custom solutions that last. Container Solutions’ diverse team of experts is equipped with a broad range of Cloud Native skills, with a focus on distributed system development. Container Solutions have a global perspective and their office locations include the Netherlands, United Kingdom, Switzerland, Germany and Canada.

CONTENT

ARE CASE STUDIES EVER USEFUL?
CASE STUDY / THE FINANCIAL TIMES
CASE STUDY / SKYSCANNER
CASE STUDY / ASOS
CASE STUDY / STARLING BANK
CASE STUDY / ITV
CASE STUDY / CONTAINER SOLUTIONS
DO THOSE CASE STUDIES TELL US ANYTHING?
APPENDIX / THE CONTAINER SOLUTIONS METHOD

ARE CASE STUDIES EVER USEFUL?
I hope so, because that’s what’s coming up next.

The following four case studies are based on multiple interviews I did in 2017 with experienced, real-world practitioners of the Cloud Native philosophy. These are all companies that are using the CN toolset in production and have been for some time. What I wanted from the interviews was to understand:

• What was their aim?
• What issues and roadblocks did they hit?
• Did they get what they wanted?

Early adopter case studies are usually only moderately useful. Successful businesses are unique, with their own goals and risk profiles. Early adopters of Cloud Native will usually have a different attitude to risk than folk starting out now. However, at least these folk are more realistic role models for the average enterprise than Google or Netflix. These case studies did give me a general idea of what industry pioneers have done, how difficult it was and whether the path has become any easier over time.

CASE STUDY / THE FINANCIAL TIMES

“Our goal of becoming a technologically agile company was a major success - the teams moved from deploys taking 120 days to only 15 minutes”
Sarah Wells, Technical Director for Operations and Reliability

Based in London, The Financial Times has an average worldwide daily readership of 2.2 million. Its paid circulation, including both print and digital, is 856K. Three quarters of its subscribers are digital.

The FT was a pioneer of content paywalls and was the first mainstream UK newspaper to report earning more from digital subscriptions than print sales. They are also unusual in earning more from content than from advertising.

The FT use off-the-shelf, cloud-based services like databases-as-a-service (including AWS Aurora) and queues-as-a-service wherever possible, because operating this functionality in house is “not a differentiator” for the company.

Within the FT as a whole there was a strong inclination to move to a
microservices-oriented architecture, but different parts of the company took different approaches. The FT have three big programmes of work where they implemented a new system as a set of microservices. One of those (subscription services) incrementally migrated their monolithic server to a microservice architecture by slowly carving off key components. However, the remaining two projects (the new content platform and the new website) each essentially built a duplicate of their respective monolith right from the start using microservices. Interestingly, both of those approaches worked successfully for the FT, suggesting that there is no one correct way to approach a monolith-to-microservice migration.

The FT have been gradually adopting microservices, continuous delivery, containers and orchestrators for three years. Like Skyscanner (who I’ll talk about next), their original motivation was to be able to move faster and respond more quickly to changes in the marketplace. As Sarah Wells, the high-profile tech lead of the content platform, points out, “our goal of becoming a technologically agile company was a major success - the teams moved from deploys taking 120 days to only 15 minutes”. In the process, according to senior project manager Victoria Morgan-Smith, “the teams were completely liberated”.

So how did they achieve all this?
Broadly speaking, they made incremental but constant improvements.

The FT have moved an increasing share of their infrastructure into the cloud (IaaS). Six years ago, the FT started with their own virtualized infrastructure but then adopted AWS as Amazon solved issues with funding, monitoring, networking and OS choice. As Sarah Wells described it, “custom infrastructure was not a business differentiator for us”. They now have a target of 100% cloud infrastructure.

After nearly three years the content platform has moved from a monolith to having around 150 microservices, each of which broadly “does one thing”. However, they have not followed the popular “Conway’s law” approach where one or more microservices represent the responsibilities of each team (many services to one team). Instead, multiple teams support each microservice (many to many). This helps maximize parallelism but is mostly because teams work end-to-end on the delivery of features (such as “publish videos”) and these features usually span multiple microservices. They then monitor for deploy conflicts between teams. If clashes regularly occur then the service in contention is split further.

They found that, in Wells’ words, “infrastructure-as-code was necessary for microservices”, and they evolved a strong culture of automation and CD. According to Wells, “There is a fair amount of diversity within the FT with some teams running a home-grown continuous delivery system based on Puppet while others wrap and deploy their services in Docker containers on the container-friendly Linux operating system CoreOS, with yet others deploying to Heroku. Basically, we have at least:

• A home-grown, Puppet-based platform, currently hosted on AWS without containers
• A Heroku-hosted PaaS
• A Docker container-based environment using CoreOS, hosted on AWS”

All of these environments work well, they are each evolving and were each chosen by the relevant tech team to meet their own needs at the time. Again,
the FT’s experience suggests there is more than one way to successfully implement an architectural vision that is microservice-oriented and runs in a cloud-based environment with continuous delivery.

Finally, the FT’s content platform team found that containers were the gateway to orchestration. The content folk have been orchestrating their Docker-containerized processes in production for several years, with the original motivation being server density, i.e. more efficient resource utilization. By using large AWS instances to host multiple containerized processes, controlled with an orchestrator, they reduced their hosting costs by around 75%. As very early users of orchestration they created their own orchestrator from several open source tools but are now evaluating the latest off-the-shelf products, in particular Kubernetes.

So what unexpected results came out of this Cloud Native evolution for the FT? They anticipated the shift to faster deployments would increase risk. In fact, they have moved from a 20% deployment rollback rate to ~0.1%, i.e. a reduction of two orders of magnitude in their error rate. They ascribe this to the ability to release small changes more often with microservices. They have invested heavily in monitoring and A/B testing, again building their own tools for the latter, and they replaced traditional pre-deployment acceptance tests with automated monitoring in production of key functionality.

How have they handled the complexity of distributed systems?
They chose to make heavy use of asynchronous queues-as-a-service, which simplified their distributed architecture by limiting the knock-on effects of a single microservice outage (although this does increase system latency, a tradeoff they accepted). They also limit the use of chained synchronous calls to avoid cascading failures, where one failed service holds up a whole chain of services waiting on outstanding synchronous requests.

They also struggled with issues around the order of microservice instantiation and are contemplating rules that microservices should exit if prerequisite services are not yet available, allowing the orchestrator to automatically restart them (by which point their prerequisite service should hopefully have appeared). Basically, it was difficult but they learned and improved as they went.

According to project manager Victoria Morgan-Smith, “our goal throughout was to de-risk experimentation”, but that involved “training, tools and trust”. The FT heavily invested in internal on-the-job training, with an explicit remit for their devops teams to disseminate the new operational knowledge to developers and operations. They learned that their teams could be trusted to make good judgments if they were informed, given responsibility and had the right tools. For example, initially, their IaaS bills were very high, but once developers were given training, access to billing tools and guidance on budgets, the bills reduced.

In common with many other early adopters, the FT experimented and built in-house and were prepared to accept a level of uncertainty and risk. Sometimes their tech teams needed to re-assess as the world changed, as with their move from private to public cloud, but they were persistent and trusted to make the occasional readjustment in a rapidly changing environment. Trust was a key factor in their progress.

CASE STUDY / ITV

ITV decided their next strategic step in increasing their feature velocity was to try using
devops for their online products. They experimented with allowing the development teams to provision their own machines for test and production on Amazon Web Services (AWS) through a series of proof of concept (PoC) deployments. These PoCs were a huge success.

So, they decided to step back and reassess their tech strategy once again. It was clear the combination of cloud and devops could have a significant, positive impact on the speed of development of their consumer-facing products, and it was obvious that ITV should embrace these for online. Great.

However, it was also becoming clear that ITV’s legacy internal systems like talent payment, content delivery and rights management were falling behind those of the rest of the industry and of new entrants. A waterfall approach was keeping those applications stable but that wasn’t enough. They decided the Agile, cloud and devops strategy they had trialled for consumer products needed to be extended to the legacy services their business had relied on for decades. They chose to apply what they’d learned from online to their back office systems.

At this point, ITV’s perspective as a company with a lengthy history behind them, and hopefully a similar future ahead of them, paid off. They took the long view. They had used ITV Hub (their streaming video service) to learn and build expertise in the cloud. Now they needed to extend this new infrastructure across their organization. They followed a three step process:

• Identify legacy migrations with potentially strong ROI
• Experiment using MVPs and assess ease and risk
• Move the applications (or parts of applications) to the cloud if there was good bang for their buck (payoff)

Eventually, they would migrate much of their legacy code but they needed to pick a place to start. This was a classic mix of technical, cultural and business strategic decision making.

Following this process caused ITV to rapidly realize they needed a “Common Platform” built on AWS for product-based dev, test and
deployment. Like many early adopters, back in 2014 they had to build their own.

Their Common Platform V1 comprised technology, but a common team structure also organically evolved for unifying agile development, infrastructure operations, and autonomous ownership:

• 2+ developers
• Scrum master
• Product owner
• Platform engineer (devops)

The platform engineer played a crucial role in every team, handling:

• Operations
• Initially on call (first responder)
• Team efficiency (automation and scripting)
• Quality (operationally, what’s going to go wrong?)

Tech-wise, ITV’s Common Platform V1 was based on application isolation via AWS instances, CentOS, and automation using Terraform, Puppet, Jenkins and home-grown scripts. Initially, there was no use of containers or off-the-shelf orchestrators.

As platform providers to the ITV development teams, they didn’t mandate architecture (although microservice architectures are common amongst the platform users). The Common Platform offers services to the development teams, which are recommended but not mandated. They strongly advise the dev teams to use Postgres (provided via RDS) and they have their own managed RabbitMQ clusters, for example.

They chose to self-manage Rabbit over AWS’ own managed queue service (SQS) because of one of their initial guiding principles: there must be a fallback. They used PaaS offerings only where the underlying tech was open source. They therefore always had a potential escape plan of operating it themselves. For that reason, ITV’s Common Platform does not expose AWS’ SQS, Aurora or DynamoDB managed services to the dev teams.

As well as services, the platform provides diagnostics: alerting, monitoring, logging and observability.

• Sensu for operational monitoring (alongside PagerDuty and Slack)
• Telegraf, Influx and Grafana for metrics
• The ELK stack (Elasticsearch, Logstash, Kibana) for logging aggregation

They have found it incredibly useful to maintain a strict separation of dev
alerts from prod ones. The teams never get alerts for stuff they cannot (or should not) fix.

The Common Platform V1 was a success and ITV are now thinking about V2. The aim of V2 is to adopt off-the-shelf technology wherever possible to replace the homegrown, i.e. the V2 strategy is to move as much as possible of the Common Platform to commodity tech. They intend to embrace containers and an open source orchestrator alongside carefully considered constraints on service behaviour.

There are many concepts from the Common Platform V1 that have been very successful and that ITV will maintain with the move to V2, including a concept of “blast radius reduction”, where every team’s stack is currently completely isolated from every other, so issues with one service cannot impact the running of another service. This is true even for monitoring, alerting and diagnostics. There are no common instances; instead there are duplicates.

There are pros and cons to this choice of duplication over commonality. The downsides are increased costs in hosting and management for these service duplicates. However, in Clark’s experience those downsides are outweighed by the benefits of decoupling on stability and on speed of development. A major “side channel” benefit of the duplication is a reduction in monitoring noise. In alerts, their teams only see events generated by their own systems; they never need to worry about issues generated by other teams.

Looking back, is there anything they would have done differently? With 20/20 hindsight, they realize their fear of vendor lock-in did hold them back. The overhead of remaining completely cloud-agnostic was high. In future, they may decide to just use a vendor service by default and pay to move if necessary later. Of course, a few years ago we had no idea that the vendor services were going to remain relatively inexpensive and develop at the rate they have.

Overall, ITV’s migration from extreme legacy (fifty years!)
to the cloud has been a fascinating story arc with a happy ending. I’ll be very interested to see what happens in season two.

CASE STUDY / CONTAINER SOLUTIONS

“Cloud Native is no longer just the domain of trendy startups and banks with deep pockets”
Jamie Dobson, CEO

Based in London, Amsterdam, Berlin and Zurich, Container Solutions (CS) was formed in 2014 to provide specialist analysis and engineering around the new technologies of microservices, CI/CD, containers and orchestrators. At around the same time as CS came into existence the term “Cloud Native” gained currency [13]. Since then one of Container Solutions’ key activities has been reviewing production Cloud Native systems and providing feedback on best practice and effective next steps.

According to CTO Pini Reznik, Cloud Native users have changed a great deal in the past few years. “Two or three years ago businesses mostly fell into one of two groups. On one side you had companies who had barely heard of containers. On the other, you had experts who were experimenting heavily or even building in production. These experts invested significantly, usually with board level buy-in, and created systems for themselves with little or no help. Those were the companies we all learned from.”

Now, however, things have changed. According to Pini and his team, it’s common for companies to start experimenting with Cloud Native technologies with low investment, i.e. cheaply, in a bottom-up fashion. This is often initiated by a keen internal technical champion, usually a developer who was inspired to try containers or microservices by a meetup or a conference. Reznik says, “Once this person starts playing with the technology internally other engineers see the attraction, particularly for faster software delivery. They run more experiments, get excited and often decide to try bigger projects”.

CS feel that getting from maverick developer to wider acceptance within a company is easier now. There’s better
industry awareness of Cloud Native. Tech leaders hear conversations about it and see market support; it’s no longer such a scary, radical approach.

Unfortunately, it’s at this point things can apparently go wrong. Ironically, a common issue is that there’s loads of great stuff you can do with Cloud Native (this book is packed with it). All have potentially big benefits, but many are tough to deliver, with lots to learn.

“Companies regularly approach us to say they’ve tried Cloud Native but it failed to deliver. The project became stuck,” says Reznik. “We saw a pattern emerge - as they became more aware of all the possibilities of Cloud Native, they found it hard to focus on any one thing. But Cloud Native is difficult, so if they didn’t focus they got bogged down on every front.”

CS usually helps by encouraging the tech teams to step back and prioritize. There are steps, like minimal automated testing or building a continuous delivery pipeline, that in CS’s experience make later tasks easier. “Companies usually have a sensible wish list for Cloud Native, but we find that the order you deliver features in has a huge impact on the success of the project. We advise teams to build so the project bootstraps itself, i.e. build a minimal base platform that immediately contributes to its own evolution.” In other words, he says, “use the platform to develop the platform.”

Container Solutions’ CEO Jamie Dobson is even more assertive on the subject. “With Cloud Native, if a team are not getting modest ROI quickly they’re probably doing it wrong. A successful CN implementation should immediately start making further development easier and build steadily from there. If that’s not the case, they need to stop and step back. In our experience, they’re probably doing too much, too soon without a firm enough foundation”.

In CS’s view, the driver for Cloud Native in most companies now is speed of delivery. Often companies start with microservices or containerization.
However, testing, diagnostics and CI/CD are also vitally important in a Cloud Native system - even more than in a monolithic one. If that “plumbing” is missed out the project will suffer.

The good news is that the market is no longer polarised between super-experts (like the FT or Skyscanner) and almost total unawareness. Now that tooling has improved, and toolchain and platform leaders have started to emerge, a whole new group of companies are trying out Cloud Native tentatively, but often with a clear goal in mind or problem they want to solve. According to Dobson, “Cloud Native is no longer just the domain of trendy startups and banks with deep pockets”.

DO THOSE CASE STUDIES TELL US ANYTHING?

OK, we’ve just looked at six case studies. Stepping back, is there anything we can learn from comparing and contrasting them?

Technically

Everyone I interviewed used the public cloud and gradually moved away from their own homegrown data centres. None of them regret that; in fact, they all seem to be moving further towards cloud and seeking out more managed services to take the load off their engineers. No one appeared unduly worried about lock-in.

Everyone cited increased development speed as their prime motivator, although for ASOS increased resilience was also a factor in getting started with the cloud. Everyone mentioned the importance of cost, but it was secondary to speed and resilience. Everyone had a CI/CD pipeline and automated tests to increase development speed. Everyone had adopted a microservice-like architecture at least in part of production, again to increase development speed. They were happy with that decision and would continue to build new stuff with the microservice model. Most folk still had a monolithic heart that had not gone away but was much less actively developed on.

Not everyone had adopted containers yet, but everyone who had was pleased with them and had subsequently adopted orchestrators to increase resilience and save hosting costs.

Culturally

For FT and Skyscanner in particular, a Cloud Native approach felt like a cultural shift as much as a technical one. They both had a business-wide, ground-up objective to be agile, creative, individually autonomous and comfortable with change. They both experienced considerable pain getting into Cloud Native technologies so early and they both had to retool several times. However, I suspect that the difficulties themselves may have helped them with their cultural goal of building a more resilient and confident workforce.

Later entrants should have an easier time. Our sector’s understanding of the challenges of Cloud Native (CN) has improved enormously in the past few years. The Container Solutions experience suggests that companies are now getting involved with CN successfully without needing such a big financial or cultural investment. However, I suspect that a cultural desire for flexibility and “radical autonomy” will always play a big part in being successful with Cloud Native.

APPENDIX / THE CONTAINER SOLUTIONS CLOUD MIGRATION METHOD

Throughout this book we’ve discussed how, for most businesses, there are multiple benefits to embracing Cloud Native techniques:

• Margin: Cutting hosting bills and reducing operational ones through architecture and automation
• Scale: Access to hardware on demand and the ability to effectively utilize it
• Speed: Parallelizing team efforts and allowing access to new SaaS tools that help deliver functionality more quickly

Container Solutions have helped many companies move to the Cloud to get these benefits and they have some interesting opinions on the subject of migration. Based on what they’ve learned from real users (much of which is reflected in this book) they have developed a method for delivering new Cloud projects successfully, and unblocking stalled migrations.

In this appendix we’ll walk through their method and consider whether it matches up to what we’ve seen to be successful in the case studies.
Four Steps to Migration

In my view, the interesting thing about the Container Solutions process is that it’s a mixture of both iteration (feeling your way) and careful up-front thought. We believe it is imperative to understand your goals before embarking on a potentially revolutionary project.

Container Solutions follow a four step process:

• Discover & Plan: Make the right up-front decisions on priorities, capabilities and costs
• Prepare: Build a stable runway, effective instrumentation and a strong support team on the ground using new processes, tools and training
• Propel: Get the whole company on board and rolling with the new system
• Perform: Move faster and increase agility in every dimension

Step 1: Discover and Plan

A Cloud Native migration is a major undertaking and there will be setbacks. Early and consistent wins are important for maintaining commitment and momentum. Enterprises must discover the right place to start. At this initial stage, Container Solutions:

• Assess their client’s current state and desired state using their Cloud Maturity Matrix. This covers eight areas including culture, product management, processes and infrastructure.
• Help clients work out which applications to migrate first for maximum ROI and what infrastructure tools are the best fit for those migrations. All products and vendors are considered at this stage.
• Identify the services where strong wins can be generated from migration without incurring negative costs from moving too early. This usually involves a series of exploratory “proof of concept” experiments (PoCs).

The desired output of Discovery is a set of cloud-optimization goals and a strategy for the business. These include technical targets but also organizational ones.

Once goals are outlined, Container Solutions’ next move is to create a high-level migration plan. This often includes:

• Infrastructure Plan (which tools and/or platforms to use)
• Architecture Plan (what is going to be
migrated and what approach will be used, including any new microservices that will be created)
• Security Plan (how will the systems be secured and to what level)
• Team & Culture Plan (what the new team structure and processes will look like)
• Key milestones with target dates and a backlog
• A broad execution plan

Even at this early stage, Container Solutions also start their client’s training on the tools identified in the infrastructure and architecture outlines. This is a mix of formal learning, mentoring and hands-on builds.

Step 2: Prepare

This first execution step is, in Container Solutions’ experience, vital to the success of a cloud migration, yet it is often under-valued or even missed. It is the least glamorous stage but, in their experience, the most important for a successful migration project.

The outputs from this prep step are a set of new development processes and an end-to-end infrastructure:

• A Continuous Integration and Delivery pipeline (tools, training, and processes built and rolled out to a subset of the business)
• Containerization and orchestration of existing applications (including tools, training, and processes)
• Creation of individual, reproducible development environments that usefully reflect production
• Creation of the run (production) platform
• Automation of minimum viable test suites
• A trained, starter devops team (trained using Container Solutions’ mentorship, hands-on, and formal education)

Step 3: Propel

This step is about getting the new platform, processes and tools in use company-wide. The purpose is global adoption and it’s achieved by a combination of promotion, integration into existing processes and training.

Container Solutions work with their clients to roll out the new system across the whole organization team-by-team. This is done iteratively by training new teams, deploying, observing the results and refactoring where necessary.

• Roll out the new
systems team by team
• Observe the new teams; are there legacy tools or processes that don’t fit into the new system?
• Decide: should they refactor the new systems to replace legacy processes? (Usually yes; if teams don’t use the new processes consistently for everything, they will often fall out of use.)
• Train the new teams throughout in the use of the infrastructure and processes

For training at scale, CS use a combination of formal classroom learning and hands-on exposure via workshops and hackathons.

Container Solutions consider this stage to be successful once the new processes and infrastructure are in active use in all the target areas of the business.

Step 4: Perform

By this point the enterprise should be reaping considerable rewards from the cloud migration in terms of cost, flexibility and development speed. They should have their new development and operational framework rolled out across the business, alongside upgraded teams that have the ability to use, support, and maintain their new tech. They can now start exploiting that platform to go bigger and go faster.

By now, enterprises are already independent and can go it alone. If, however, they want additional support to go even further, Container Solutions can provide additional infrastructure and architecture help, but their aim is always to embed any new knowledge and build expertise within the enterprise.

Where you start

Discovery
See the big picture
2-5 days, multidisciplinary team

It’s easy to get lost in the details of complex applications and modern IT infrastructure. Ultimately, you want to focus on your business, not just your IT. Discovery is a proven method to gain situational awareness, identify valuable opportunities and make the right decisions to create products that last.

Architecture Review
Build a better infrastructure
2-3 days, technical consultants

Are you utilising technology to its full potential?
Building for the resilience, scalability and maintenance of your architecture is the key to benefiting from cloud platforms. We assess the overall health of your technologies to uncover weak points and possible improvements. You emerge with a blueprint for a more robust and efficient architecture to help you reach your business goals.

Cloud Native Strategy
Plan your digital transformation
2-5 days, multidisciplinary team

Moving fast, being scalable and reducing costs are all achievable with Cloud Native. To get there you need a solid plan that is not just about tech, but rather about transforming your organisation. We work with you to reduce the complexity of change and devise a Cloud Native strategy around your goals, people and processes.

Security Audit
Secure your infrastructure
2-3 days, security consultants

With all the risks faced in a migration, security shouldn’t be one of them. Through careful planning, microservices can heavily improve your security. In our Security Audit we analyse potential points of weakness and address the challenges of securing a distributed system.

Strategy Consulting
Gain Cloud Native confidence
5-10 days, Cloud Native consultants

You brace yourself to make a move. But is your strategy the right one? Our Cloud Native experts review your migration strategy with the knowledge won through successful campaigns. You are about to marshal enormous organisational resources. Make sure you’re set up to succeed.

Innovation Loop One
Reduce innovation risk
5-10 days, multidisciplinary team

Innovation is a risky business. A promising new technology can entice us into building and scaling quickly, without a firm market or product base. We use strategic design to explore ideas and find your product market fit. You can then decisively scale up, knowing that you are production-ready.

[Figure: CS Migration Method. Stages shown: Discover and Plan (identify the right problems to solve and formulate strategy; mapping context and architecture; explore migration opportunities) and Prepare (create the building blocks for the infrastructure and organisation; prepare the infrastructure: containers, clusters, microservices; prepare the organisation: development / deployment processes and learning plan). Services: Discovery, Architecture Review, Cloud Native Strategy, Security Audit, Strategy Validation. Outcomes: strategy and roadmap, new architecture design, maturity assessment, functional PoC (optional), 1-3 applications in production, migration backlog.]

BUILD CLOUD NATIVE CONFIDENCE

The move to Cloud Native can bring huge rewards to your business, but only if risk is handled well. Container Solutions’ migration method is proven to reduce risk, take control and help you become independent. We’ve learned the patterns of complex migrations and developed a four-stage process adapted to each client’s needs. Our Jumpstart products set you up for a no-regret move […]

Does This Method Make Sense?

Container Solutions expect to migrate most companies inside 3-6 months for smaller organizations or 12+ months for larger enterprises. That sounds realistic. We’ve seen that early Cloud Native enterprises have spent 4-5 years on adoption but a great deal of that was writing their own tools or re-tooling. More recent large migrators took only 1-2 years because tooling improved.

[…] extremely difficult to re-architect a system as microservices, for example. Even with the best platform and stack in the world there are still multiple, steep learning curves. Container Solutions’ judgment - that migration is likely to take a year for a large enterprise - is refreshingly realistic.

A potentially controversial approach CS takes is to consider a range of infrastructure choices, tools and platforms. There is an argument that it would be quicker
to reduce risk and gain situational awareness are becoming even more robust every day and cheaper to select one strong, popular move to reduce risk and gain situational awareness platform (say Kubernetes on AWS) and use that However, tools aren’t everything It is for everyone container-solu container-so 34 es Container Solutions Container Solutions Propel Propel to Migrate applications Perform Perform Stabilise applications and support the newMigrate cloud environment applications to the new cloud environment Stabilise applications and support Application rollout and refactoring Application rollout and refactoring Growth Growth Migrate remaining applications Migrate remaining applications Support Support Stabilisation Stabilisation Capacity building Capacity building Training and coaching of Cloud Native teams Training and coaching of Cloud Native teams • Migrated applications • Migrated applications • Optimisation and growth • Optimisation and growth New organisational practices • New• organisational practices • Handover the organisation • Handover to thetoorganisation • Fixed issues • Fixed issues • Support • Support However, Container Solutions find that different easy to say a project has succeeded because clients need different environments Some one “skunkworks” team is using it In my may be Windows shops for whom Azure is experience, that’s the halfway point at best In ategy as a result, we ensure a smooth as a result, wemore ensure a smooth attractive Others will have low levels of fact, according to Container Solutions, it is the endent teams within 3-6 months t teams withinops 3-6 months expertise or the goal of zero ops, which company-wide rollout that requires the most (or 12+ months for enterprises) Ready to take the makes managed platforms attractive Finally, elapsed time, care and attention during a cloud + months for enterprises) journey to the cloud? 
we’ve seen from the case studies that everyone migration It is where projects frequently pendently running an integrated retools as they grow The whole point of Cloud stall or fail info@container-solutions.com ently running an integrated n, speed and scalability and can info@container-solutions.com Native is not to get too locked in (mentally ed and scalability and can ner to grow even further! or physically) to a single stack that may be So, overall the Container Solutions method grow even further! superseded next week sounds reasonable and apparently it works very well However, what I particularly like about the CS method is the global adoption stage It is all too Ready to take the journey to the cloud? iner-solutions.com olutions.com 35 REFERENCES within the IT sector http://www.greenpeace.de/sites/ www.greenpeace.de/files/publications/20170110_ - Cloud Native Computing Foundation charter greenpeace_clicking_clean.pdf January 2017 https://www.cncf.io/about/charter/ The Linux 15 - Sam Newman -The Principles of Microservices Foundation, November 2015 http://samnewman.io/talks/principles-of- - Wikipedia, https://en.wikipedia.org/wiki/ microservices/ 2015 Orchestration_(computing) September 2016 16 - Wikipedia - Test Automation https://en.wikipedia - The Register ‘EVERYTHING at Google runs in a org/wiki/Test_automation container’ https://www.theregister.co.uk/2014/05/23/ 17 - Upgard Overview of Configuration Tools https:// google_containerization_two_billion/ May 2014 www.upguard.com/articles/the-7-configuration- - Ross Fairbanks Microscaling Systems Use management-tools-you-need-to-know July 2017 Kubernetes in Production https://medium.com/ 18 - Bryon Root - The Difference Between CI and CD microscaling-systems/microscaling-microbadger- http://blog.nwcadence.com/continuousintegration- 8cba7083e2a February 2017 continuousdelivery/ August 2014 - Forbes David Williams, The OODA loop https:// 19 - Docker - What is a Container? 
https://www www.forbes.com/sites/davidkwilliams/2013/02/19/ docker.com/what-container 2017 what-a-fighter-pilot-knows-about-business-the- 20 - WeaveWorks Comparing Container Orchestrators ooda-loop/#30e3c4963eb6 February 2013 https://www.weave.works/blog/comparing-container- - The Skeptical Inquirer, Prof Richard Wiseman orchestration/ The Luck Factor http://www.richardwiseman.com/ November 2016 resources/The_Luck_Factor.pdf June 2003 21 - Provost, Foster, and Tom Fawcett Data Science - Skyscanner Stuart Davidson http://codevoyagers for Business: What you need to know about data com/2016/05/02/continuous-integration-where-we- mining and data-analytic thinking “ O’Reilly Media, were-where-we-are-now/ May 2016 Inc.”, 2013 - ASOS public revenue data from https://www 22 - Humble, Jez, and David Farley Continuous asosplc.com/ Delivery: Reliable Software Releases through Build, - About Cloud66 http://www.cloud66.com Test, and Deployment Automation (Adobe Reader) 10 - AWS SQS provides reliable message queuing Pearson Education, 2010 https://ndolgov.blogspot.co.uk/2016/03/aws-sqs-for- 23 - Friedman, Ellen, and Kostas Tzoumas reactive-services.html Introduction to Apache Flink: Stream Processing for 11 - Peter Norvig Moving data between machines Real Time and Beyond “ O’Reilly Media, http://norvig.com/21-days.html#answers Inc.”, 2016 24 - Goodfellow, Ian J., Jonathon Shlens, and Christian 12 - Husobee Restful vs RPC https://husobee.github Szegedy “Explaining and harnessing adversarial io/golang/rest/grpc/2016/05/28/golang-rest-v-grpc examples.” arXiv preprint arXiv:1412.6572 (2014) html) May 2016 25 - https://github.com/spotify/luigi 13 - Cloud Native early mentions InformationWeek 26 - http://jupyter.org/ http://www.informationweek.com/cloud/platformas-a-service/cloud-native-what-it-means-why-it- Photos by Artificial Photography, rawpixel.com, matters/d/d-id/1321539 July 2015 Brooke Lark, Daniel Jensen, Markus Spiske 14 - Greenpeace report on inefficient use of energy 
www.container-solutions.com

... MorganSmith: “our goal throughout was to de-risk experimentation but that involved training, tools and trust”. The FT heavily invested in internal on-the-job training with an explicit remit for... latter, and they replaced traditional pre-deployment acceptance tests with automated monitoring in production of key functionality. How have they handled the complexity of distributed systems? They... because they prioritize auditability and reliability over hyperperformance. Again, they have weighed the tradeoffs and made a decision based on a good understanding of their own current situation and