Cloud Adoption Blueprint
Last spring AWS, Intel and Cloud Technology Partners hosted the first-ever Cloud Adoption Symposium. The New York City event was oversubscribed, forcing us to double the room capacity to accommodate all those who wished to attend. To say we were excited would be an understatement: we could not have asked for a clearer signal that cloud adoption is top of mind for executives and that the urgency for enterprise-wide change is increasing every day.
While moving to cloud is a good thing, it is critical that an organization proceed with caution. Regardless of whether your company is looking at one workload, multiple workloads, or an entire portfolio, transforming from on-premise to cloud-based IT requires more than just understanding the technology. Successful cloud adoption dictates a pin-sharp focus and a detailed blueprint, as a single misstep can become costly and time-consuming. Following a prescriptive approach to implementing a cloud program streamlines your transformation, accelerating time to value and reducing risk.
We kicked off the NYC event exploring this prescriptive approach and we shared our learnings from helping hundreds of businesses plan, design and build their cloud programs. As you read the key takeaways below, think about how they apply to your organization. We are confident that no matter how far into your cloud journey you may be, they will remain relevant.
#1 – Kick-off Your Cloud Program with a Vaccination
It should come as no surprise that not everyone in your organization will support a cloud program. As a matter of fact, we consistently see blockers and conscious resisters in almost every enterprise we work with. A colleague, also speaking at the symposium, called these people “viruses.” This memorable description could not be more accurate – viruses kill cloud projects. Unless you discover and neutralize viruses, your likelihood of success will be low.
What’s the solution? As with most ailments, early intervention is key. We have found that bringing key stakeholders together (in the same room) early immunizes your organization against viruses and eliminates blockers. At CTP, we use a structured three-day Workshop that delivers intensive executive training as the first step in our Cloud Adoption Program (CAP). For participants, the workshop turns into a rollercoaster of denial, anger, surrender and acceptance. Your staff will be filled with fear, uncertainty and doubt (FUD) about the future of their jobs. As cloud leaders, it is your task to address these issues head-on and clear the FUD.
The structure of the CAP Workshop is straightforward – get all the decision makers, influencers, and stakeholders in the same room for three full days. We understand this is no small time commitment! However, your cloud journey will be no small investment. It makes sound business sense to make a small investment up front to avoid larger, more costly mistakes down the road. Most of our CAP Workshops include 25 to 30 people on any given day with participants floating in and out depending on the topic. Make sure the following roles attend:
- Executive Sponsors – These may come from the groups below or, whenever possible, from the C-suite (such as the CTO, CIO and CEO).
- Application Owners – Business units, development teams
- Security – CISO, SecOps people
- GRC – Governance, risk and compliance experts
- Finance – Procurement, risk and governance experts
- Lead Architects – Cloud and existing infrastructure leaders
- Database – Lead DBAs, data architects
- Central IT Operations – Leaders, key department heads, networking specialists
Getting everyone to commit to all or part of the CAP Workshop is definitely a challenge, but you’ll need all these stakeholders to be involved for a successful cloud program. Alignment is the most important first step in any major IT initiative, especially one as important as cloud adoption.
#2 – Make a “Cloud First” Commitment
Core to making a Cloud First commitment is asking the question: “Why are you moving to the public cloud?” The answer to this simple, but powerful question eludes many of our clients. We spend a good part of the CAP Workshops discussing, debating and arguing the merits and benefits of cloud adoption. Without knowing why you are moving to the cloud, a Cloud First strategy simply falls apart. Team members head off in different and conflicting directions with no idea how their behavior is affecting others.
Making a commitment to putting workloads in the cloud is another no-brainer, yet it is often overlooked.
Cloud First means that all of your applications and data will move to the cloud unless there is a compelling reason that they must remain on-premise. Without a Cloud First strategy, you are simply keeping your application and data teams with one foot on first base while asking them to try and steal second. When this happens, the results are marginal at best since there is no focused dedication to making the changes necessary to reap the full benefits of cloud.
On the surface, Cloud First appears to be an aggressive stance. However, without a Cloud First strategy you simply will not be able to dedicate the appropriate resources to fully establish the organizational change necessary to make a measurable difference. Consider all the attendees needed for a CAP Workshop. Cloud adoption will affect nearly every aspect of your organization. Therefore, it is more of a strategic direction and leadership initiative as opposed to a technology decision.
Cloud First also requires assigning dedicated teams and making a decision to properly fund your cloud program. This means team members will only work on cloud-related activities and their entire focus will be on getting the enterprise to the cloud securely – not just kicking the tires with a proof-of-concept or pilot. A cloud team whose members still have their day jobs is a sure indication that:
- There is not a full commitment to cloud.
- The effort required is misunderstood.
- There is a lack of executive sponsorship.
When an organization truly understands the benefits of cloud and their Cloud First strategy — and there are many such organizations (CapitalOne being a great example) — the sponsor can create a proposal so compelling that no CEO could ignore it.
#3 – Establish a Cloud Business Office
Cloud adoption will have an enormous impact on your company, evolving processes that have not been seriously touched in decades. For the first time, developers are able to create and modify their infrastructure requirements using software. The implications of such power are both dazzling and frightening.
Software development has lived in a static world of change management where the critical nature of the business impact has created tight control processes and long approval cycles. Thus, the need for a Cloud Business Office (CBO).
The CBO serves as the central point of decision-making and communication for your cloud program – both internal and external to your company. More than just a “cloud center of excellence,” the CBO is a permanent operational and governing body that directs and guides all aspects of your cloud program, from the first implementation through ongoing operations.
Members of the CBO fall into two categories: Full-Time and Part-Time. Full-Time CBO members are leaders who have a daily responsibility for the successful adoption, implementation and management of cloud in your organization. These include:
- Cloud Program Leadership
- Technical Operations Leadership
- Chief Architect(s)
- Security Operations Leadership
Part-Time CBO members are leaders who have a vested interest in the success of the cloud program and need visibility into the process. These include:
- Legal and Risk Leaders
- HR Leaders
- IT Finance
- Application Owners and Business Units (BUs may have a full-time role for a short duration during their onboarding process)
The cloud has completely changed how we consume and operate IT. The agile nature of cloud technology enables dramatic benefits for the enterprise and touches almost every department within an organization. In addition, compared to on-premise environments, the cloud requires far fewer people to manage and operate so a tighter, more cohesive team is needed to break down silos. Because we are combining operations, development, infrastructure, risk, and finance, we need a central set of processes. These include:
- Project management
- Technical decisions
- Application owner onboarding
- Technology training
- Risk / Security decisions
- Organizational change management & training
- Financial governance
- Operational services and governance
- Vendor management
#4 – Know Your Cloud Economics
Understanding the economics of cloud adoption seems like a no-brainer best practice. However, our experience shows that over 50% of enterprises do not take the time required to determine the business case for moving to the cloud, probably because they “already know” it is a good thing. Nevertheless, an organization gains many valuable insights by building a business case and improving their understanding of cloud economics.
Building a Business Case for Cloud
Cloud economics fall into two separate, and highly valuable, buckets. The first is a straight line Total Cost of Ownership (TCO) analysis along with hard cost savings. TCO is the like-for-like replacement of on-premise services with cloud services. When determining your current costs, we suggest you look at the whole package, not just server-for-server comparisons. Areas to consider include:
- Hardware and networking costs
- Downtime costs (planned and unplanned)
- Upgrade costs
- Disaster Recovery / Business Continuity costs
- Service Level Agreement penalties
- Deployment costs
- Operational support costs (day to day operations)
- Performance costs
- Costs of selecting vendor software
- Requirements analysis costs
- Developer, administration and end-user training costs
- Cost of integration with other systems
- Quality, user acceptance and other testing costs
- Application enhancement and ‘bug fixes’ costs
- Physical security costs
- Legal, MSA, and contracting costs
- Replacement and take-out costs
- Cost of other risks (including security breaches)
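A like-for-like comparison of this kind can be sketched in a few lines of Python. The sketch below is a minimal illustration, not our costing model: the `CostModel` class, the category names, the three-year horizon and every dollar figure are assumptions made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class CostModel:
    """Annual costs for one environment, keyed by cost category (USD)."""
    name: str
    annual_costs: dict[str, float] = field(default_factory=dict)

    def total(self) -> float:
        """Total annual cost across all categories."""
        return sum(self.annual_costs.values())


def tco_savings(on_prem: CostModel, cloud: CostModel, years: int = 3) -> dict[str, float]:
    """Compare like-for-like totals over a fixed horizon (assumed flat per year)."""
    on_prem_total = on_prem.total() * years
    cloud_total = cloud.total() * years
    savings = on_prem_total - cloud_total
    return {
        "on_prem_total": on_prem_total,
        "cloud_total": cloud_total,
        "savings": savings,
        "savings_pct": 100.0 * savings / on_prem_total if on_prem_total else 0.0,
    }


# Illustrative figures only; not taken from any real engagement.
on_prem = CostModel("on-premise", {
    "hardware_and_networking": 400_000,
    "downtime": 50_000,
    "disaster_recovery": 80_000,
    "operational_support": 300_000,
    "physical_security": 20_000,
})
cloud = CostModel("cloud", {
    "compute_and_storage": 350_000,
    "downtime": 10_000,
    "disaster_recovery": 30_000,
    "operational_support": 200_000,
})

result = tco_savings(on_prem, cloud)
print(f"3-year savings: ${result['savings']:,.0f} ({result['savings_pct']:.1f}%)")
```

Keeping the categories as named entries rather than a single total makes it easy to see which of the areas above are still missing from your model.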
The second bucket of cloud economics includes agility and other soft costs. What is the benefit of having highly flexible, agile infrastructure? What is the financial impact of decreasing provisioning times from months to hours? Quantifying these intangible cloud benefits for an enterprise can be difficult. Consider these questions:
- How do you measure the impact of productivity (in person days)?
- What is the total benefit of accelerated application development?
- How do you measure the impact of faster software lifecycles?
- How do you measure a “fail fast” model?
- How much do human error and outages cost your organization?
Getting to solid answers around these topics is challenging; however, many companies have been able to determine tangible benefits. For example, a financial services company saw a 10% productivity gain in their software development after moving to AWS. On a $700 million budget, that gain is significant and can help build the business case for a Cloud First commitment.
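To put a number on that example, here is a small worked calculation. It assumes the 10% gain applies uniformly to the whole $700 million budget, which is a simplification; the function name is ours, for illustration only.

```python
def productivity_gain_value(annual_budget: int, gain_pct: int) -> float:
    """Dollar value of a productivity gain.

    Assumes the gain applies uniformly to the entire budget.
    """
    return annual_budget * gain_pct / 100


value = productivity_gain_value(700_000_000, 10)
print(f"Annual value of the gain: ${value:,.0f}")
```

Under that assumption the gain is worth roughly $70 million a year.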
Finally, it is a best practice to track your financial KPIs as you build your cloud program. Your economic model gets better over time as you add more and more use cases.
#5 – Discover the Inner-Workings of Your Application Estate
Public cloud environments like AWS, Azure and Google are not fully backward compatible. That means some of your applications are not going to be able to move to the cloud. Depending on the importance of these applications, there will likely be a hybrid cloud network whereby the public cloud provider is connected with a private MPLS circuit. In this mode, cloud based applications can access legacy on-premise services while still gaining the benefits of a cost efficient and agile infrastructure.
The challenges with hybrid cloud networks include latency issues as well as the volume of data being transmitted through the network. Simply put, you could cripple your cloud program without an understanding of the application mapping and data volume between application dependencies.
The challenge is that it is uncommon for organizations to know the inner-workings of their application estate. Rarely do CMDBs have this level of detail and, more often than not, those team members who did have this information are no longer working in your organization. Companies have built data centers around application centers of gravity. Without a solid understanding of what the connections are and how much data travels between those applications, there is little hope for program success.
Automation, Tools & Heroic Efforts
Application discovery is not easy. The good news is tools and automation make the job far less painful.
Using automation to discover virtual machine profiles is nothing new. Most hypervisors will give you this information and there are numerous third-party tools that will sniff out virtual and physical server details (such as RAM, cores, etc.). However, few will tell you the connections between VMs, the frequency of service calls, and the volume of data moving between the VMs.
There are agentless software tools that discover all the standard VM profile information and build a dependency map based on service calls. Over time, the tools provide a profile of data flow between VMs. The dependency map is the critical first step in discovery and forms the foundation for the rest of the process.
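To make the idea of a dependency map concrete, the following Python sketch aggregates observed flows into edges and then lists the traffic that would cross the hybrid link if only some VMs moved. It does not reflect how any particular discovery tool works; the VM names, the flow records and both function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical observations: (source VM, destination VM, bytes transferred).
observed_flows = [
    ("web-01", "app-01", 5_000_000),
    ("app-01", "db-01", 120_000_000),
    ("app-01", "db-01", 80_000_000),
    ("batch-01", "db-01", 900_000_000),
    ("web-01", "app-01", 3_000_000),
]


def build_dependency_map(flows):
    """Aggregate call count and data volume per (source, destination) edge."""
    edges = defaultdict(lambda: {"calls": 0, "bytes": 0})
    for src, dst, nbytes in flows:
        edges[(src, dst)]["calls"] += 1
        edges[(src, dst)]["bytes"] += nbytes
    return dict(edges)


def cross_boundary_traffic(dep_map, in_cloud):
    """Return edges that would cross the hybrid link if only `in_cloud` VMs move."""
    return {
        edge: stats
        for edge, stats in dep_map.items()
        if (edge[0] in in_cloud) != (edge[1] in in_cloud)
    }


dep_map = build_dependency_map(observed_flows)
crossing = cross_boundary_traffic(dep_map, in_cloud={"web-01", "app-01"})
for (src, dst), stats in crossing.items():
    print(f"{src} -> {dst}: {stats['calls']} calls, {stats['bytes']:,} bytes")
```

Edges that cross the boundary with high call counts or large volumes are the ones most likely to cause the latency and bandwidth problems described above.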
At CTP, we develop custom IP that enables us to import discovery tool information. The information from the discovery tools provides the physical characteristics of the servers and then maps them to the client’s CMDB.
Within the software, we associate important metadata such as application entry points, SLAs, PII status, compliance and other risk related information in a way that enables the team to decide how best to migrate the selected applications. This team can then:
- Identify server and application dependencies
- Identify risks
- Determine the migration strategy
- Create a migration plan
- Determine trade-offs and opportunities
- Right-size resources in the cloud
- Estimate the run rate of your resources in the cloud
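The last two steps, right-sizing and estimating the run rate, lend themselves to a simple calculation. The sketch below is one possible approach under stated assumptions: the instance catalogue, its prices and the server profiles are placeholders rather than real provider offerings, and 730 hours per month is a common approximation.

```python
# Hypothetical instance catalogue: (name, vCPUs, RAM in GiB, hourly price in USD).
# Names and prices are placeholders, not real provider offerings.
CATALOGUE = [
    ("small", 2, 4, 0.05),
    ("medium", 4, 16, 0.20),
    ("large", 8, 32, 0.40),
]

HOURS_PER_MONTH = 730  # approximation: 8,760 hours per year / 12 months


def right_size(peak_vcpus, peak_ram_gib):
    """Pick the cheapest catalogue entry that covers observed peak usage."""
    candidates = [c for c in CATALOGUE if c[1] >= peak_vcpus and c[2] >= peak_ram_gib]
    if not candidates:
        raise ValueError("no catalogue entry is large enough")
    return min(candidates, key=lambda c: c[3])


def monthly_run_rate(servers):
    """Sum the monthly cost of the right-sized instance for each server."""
    hourly = sum(right_size(s["peak_vcpus"], s["peak_ram_gib"])[3] for s in servers)
    return hourly * HOURS_PER_MONTH


# Peak usage as it might come out of discovery data (made-up values).
servers = [
    {"name": "web-01", "peak_vcpus": 1.5, "peak_ram_gib": 3},
    {"name": "app-01", "peak_vcpus": 3, "peak_ram_gib": 10},
    {"name": "db-01", "peak_vcpus": 6, "peak_ram_gib": 28},
]

rate = monthly_run_rate(servers)
print(f"Estimated run rate: ${rate:,.2f} per month")
```

Sizing from observed peak usage rather than from the provisioned size of the on-premise server is the design choice that matters here; it is what keeps over-provisioned hardware from being carried into the cloud.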
Once assembled and analyzed, the migration team uses the data as their Bill of Materials for the migration factory. We will cover this topic in detail in Best Practice #10.
#6 – Build a Minimum Viable Cloud
The Minimum Viable Cloud (MVC) is one of the most important of the 10 best practices. Based on the concept of the Minimum Viable Product, the MVC is the starting point of your first production cloud and a platform that you will iterate and improve as you migrate to the cloud. Azure, AWS and Google all allow for automation programming as the primary means to build the new platform. Therefore, we now must think about our cloud as a piece of software. Hence, the new mantra: infrastructure as code.
There are two key components of the MVC – the hub and the spokes.
The MVC Hub
The MVC Hub is the portion of your production cloud that provides all the common services consumed by your cloud customers. We recommend a federated model of common services that are supported centrally by your IT Operations team. Standard services found in the MVC Hub are:
- Logging and Monitoring (centralized)
- Identity Access Management (IAM)
- Encryption Tools and Key Management
- Security Services such as IPS/IDS/WAF
- InfoSec (Security Operations Center)
- Image Management and Repositories
- Automation and Templates (e.g. Chef, CloudFormation, etc.)
- Networking Services to On-Prem Resources
- Financial Controls, Chargeback and Billing
The MVC Hub (depicted in Figure 1) must be created and established first, before you add your first application in the cloud. By establishing your core centralized services up front, you set the stage for rapid onboarding of new and existing applications.
Some may ask, “Why invest so much up front in the central services?” Good question. We have learned through experience that the core management services are a lot easier to implement prior to moving any applications. Retrofitting your cloud foundation is a real pain and those who have lived it will tell you to do the work for your core services first, then add the applications.
The MVC Spokes
The MVC Spokes are a set of applications that belong to a specific owner or business unit. This is the physical cloud account (such as an AWS customer account) and the supporting VPC(s) necessary to run the application(s). It is also a logical collection of applications and services that may belong to a logical business unit. You know and understand your business and should align the MVC Spokes with what makes the most sense for your company.
One of the key responsibilities of the CBO is onboarding new applications. By using this prescriptive approach, you can establish a highly scalable model to onboard hundreds of applications with one MVC Hub. This is how you scale your cloud program.
When the MVC Hub and Spokes are logically and physically connected, the common services from the Hub are consumed by the Spokes. Although the applications are running in their own network and account, the federated services are owned and operated by the MVC Hub. Put simply, the MVC Hub manages services on behalf of the Spokes.
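The relationship between the Hub and the Spokes can be expressed as a small data model. This is a minimal sketch for illustration; the `Hub`, `Spoke` and `onboard` names, the account identifier and the sample services are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Hub:
    """Central part of the cloud that owns and operates the shared services."""
    services: set[str]


@dataclass
class Spoke:
    """One owner's account and the applications that run in it."""
    owner: str
    account_id: str  # placeholder identifier, not a real account
    applications: list[str] = field(default_factory=list)
    consumed_services: set[str] = field(default_factory=set)


def onboard(hub: Hub, spoke: Spoke, requested: set[str]) -> None:
    """Attach a spoke to hub services; refuse anything the hub does not offer."""
    missing = requested - hub.services
    if missing:
        raise ValueError(f"hub does not provide: {sorted(missing)}")
    spoke.consumed_services |= requested


hub = Hub(services={"logging", "iam", "key-management", "image-repository"})
spoke = Spoke(owner="Retail BU", account_id="spoke-0001", applications=["storefront"])
onboard(hub, spoke, {"logging", "iam"})
print(sorted(spoke.consumed_services))
```

Because the Spoke only records which services it consumes, adding another Spoke never changes the Hub, which is what lets one Hub serve many application teams.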
Selecting the Pilot Application for Your MVC
Your MVC 1.0 contains the Hub and a single Spoke. Selecting the right pilot application for your MVC 1.0 is critical. The selection criteria for the pilot is primarily driven by organizational goals as opposed to technical goals.
We have found that most organizations fail to get all stakeholders on board, so when it comes time to roll into the cloud at scale, the non-IT stakeholders put the brakes on the migration.
Therefore, make sure your MVC pilot application addresses key stakeholder concerns. Your objective is to exercise your organization’s muscles so that when it comes time to migrate 50 or 100 applications, other groups within your company (Risk, Legal, Finance, etc.) know what to expect.
Look for these characteristics when selecting your MVC Pilot Application:
- Has sensitive data – You want sensitive data in the MVC 1.0. Why? Because the organization needs to care about what you are doing. Avoiding the topic of sensitive data only kicks the can down the road. Deal with this issue up front and resolve concerns early.
- Has fewer than 10 servers – Do not try to boil the ocean. Moving a large application is not the point of the pilot. Pick something that is manageable, but meaningful.
- Can “Lift and Shift” – Stay away from refactoring or rewriting an application. These are long and often drawn out processes that will delay your effort. Find an application that has OS and database services that are supported by your cloud provider.
- Has a Bastion host – A Bastion host is an internet facing portal that allows developers to access the application from outside. Although optional, this is an important organizational step, because the Risk and Compliance business units tend to get queasy when you open the cloud platform to the internet. If you are developing code, this will need to happen – if not in MVC 1.0 then definitely in MVC 1.1 or 1.2.
- Has cooperative application owners – Seems like a no-brainer, but let’s not overlook the importance of an application that is owned by a team who wants to go to the cloud. All app owners are not equal, so choose wisely.
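One way to apply these characteristics consistently is to score each candidate against them. The sketch below weights all five equally, which is our simplifying assumption; the `Candidate` fields and the sample applications are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Facts about an application that matter for pilot selection."""
    name: str
    has_sensitive_data: bool
    server_count: int
    lift_and_shift: bool
    has_bastion_host: bool
    owner_cooperative: bool


def pilot_score(app: Candidate) -> int:
    """Count how many of the five characteristics the application meets."""
    return sum([
        app.has_sensitive_data,
        app.server_count < 10,
        app.lift_and_shift,
        app.has_bastion_host,
        app.owner_cooperative,
    ])


# Hypothetical candidates.
candidates = [
    Candidate("hr-portal", True, 6, True, True, True),
    Candidate("billing-core", True, 45, False, False, True),
    Candidate("intranet-wiki", False, 2, True, False, True),
]

ranked = sorted(candidates, key=pilot_score, reverse=True)
for app in ranked:
    print(f"{app.name}: {pilot_score(app)}/5")
```

A score is only a starting point for the conversation. Since the pilot is driven by organizational goals, you may decide that sensitive data or a cooperative owner should outweigh the other criteria.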
#7 – Perform a Security & Governance Gap Assessment
CTP’s Cloud Adoption Program is very prescriptive. After hundreds of cloud engagements we discovered that the cloud security technologies used from client to client are nearly identical. There are repeatable patterns of reference architectures that form a baseline by which we can assess gaps in your program. We have built those repeatable patterns into the MVC model and the patterns are standard with every MVC we build.
What is often missed, however, is the assessment of the Security and Governance control objectives that map to the repeatable patterns in the MVC. The control objectives may vary greatly from client to client, with some required to comply with PCI and SOX, and others adhering to NIST, FISMA and many other industry standards. The challenge is understanding how these standards and regulations map to your cloud program.
The Cloud Security Alliance (CSA) is the world’s leading organization dedicated to defining and raising awareness of best practices to help ensure a secure cloud computing environment. The CSA has produced the Cloud Controls Matrix (CCM) as an accepted baseline of control objectives for cloud computing in the enterprise, which is at the center of our cloud security methodology.
We have mapped the CSA Cloud Controls Matrix to the repeatable architectures on AWS, Azure and Google. Performing a Security and Governance Gap Assessment means looking at your control objectives against a known standard such as CSA’s matrix, and documenting the gaps in your controls and technologies against accepted best practices.
The result is an MVC with a prescriptive security and governance platform mapped to the CSA. This is a huge time saver. Instead of building your security reference architecture from the ground up, you can accept a baseline and make minor changes to meet your specific needs.
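At its simplest, a gap assessment is a comparison between a baseline set of controls and the controls you have actually implemented. The sketch below illustrates that comparison only; the control identifiers and descriptions are invented for the example and are not taken from the CSA matrix.

```python
# Hypothetical baseline; a real assessment would use the identifiers and
# wording of the chosen standard (for example the CSA Cloud Controls Matrix).
baseline_controls = {
    "C-001": "Data is encrypted at rest",
    "C-002": "Encryption keys are rotated on a schedule",
    "C-003": "Administrative access requires multi-factor authentication",
    "C-004": "Audit logs are centralized and retained",
}

# Controls the organization has in place today (also hypothetical).
implemented_controls = {"C-001", "C-003", "X-900"}


def gap_assessment(baseline, implemented):
    """Return (gaps, unmapped).

    gaps:     baseline controls with no implementation.
    unmapped: implemented controls that do not map to the baseline.
    """
    gaps = {cid: desc for cid, desc in baseline.items() if cid not in implemented}
    unmapped = implemented - baseline.keys()
    return gaps, unmapped


gaps, unmapped = gap_assessment(baseline_controls, implemented_controls)
for cid, desc in sorted(gaps.items()):
    print(f"GAP {cid}: {desc}")
print(f"Not mapped to baseline: {sorted(unmapped)}")
```

The unmapped set is worth reviewing as well, since it often contains data center controls that have no direct equivalent in the cloud model.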
New Controls & Tools
Most enterprises start by thinking about their cloud program as if it were a data center and quickly find themselves not knowing how to map their existing control objectives to the new cloud model. Taking existing tool sets and applying them to the cloud does not work. Data center-centric tools are not architected for public cloud platforms. Our experience points to new tools and processes to solve these problems.
In addition, there is a big secondary bonus to leveraging new tools – they are a lot cheaper than your data center tools!
#8 – Plan for Continuous Compliance
Enterprises have many controls that govern the IT environment. Since most of the resources are hardware based, the controls take the form of change management and operational services. However, the new cloud model is software based and ungoverned by its very nature. Imagine going from a permissions based purchasing process (getting a purchase order signed for new hardware) to an open credit card account where you can order new services without approvals.
The new consumption based model requires a new level of governance. Using the standard change management and controls approach simply does not work. Legacy change controls will slow the process down and you will find yourself back in the same situation you were trying to escape.
What’s required is Continuous Compliance. In this context, Continuous Compliance is software that is constantly looking at your environment and controlling the consumption and usage of services in your cloud. The controls are implemented using “software signatures” that check for specific governance and compliance requirements.
For example, AWS has servers (EC2) and attached storage (EBS) that form the basic server / storage configuration. If you delete a server and do not specifically tell AWS to delete the storage, the storage is left orphaned. Over time, orphaned block storage becomes a risk to the company. Unless properly governed, unknown storage volumes cost money and can potentially contain sensitive data. As you can imagine, compliance teams do not like ungoverned storage disk(s) hanging around.
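A signature for this case could look like the Python sketch below. It works on plain dictionaries shaped like the entries in the `Volumes` list returned by EC2's DescribeVolumes API, so it runs without cloud credentials; the sample data and the function name are hypothetical, and this is an illustration rather than a production check.

```python
def find_orphaned_volumes(volumes):
    """Return the IDs of volumes that are not attached to any server.

    `volumes` is a list of dicts shaped like the entries in the "Volumes"
    list of an EC2 DescribeVolumes response. An unattached volume has
    State "available" and an empty "Attachments" list.
    """
    return [
        v["VolumeId"]
        for v in volumes
        if v.get("State") == "available" and not v.get("Attachments")
    ]


# Made-up sample data in the DescribeVolumes shape.
sample_volumes = [
    {"VolumeId": "vol-aaa", "State": "in-use",
     "Attachments": [{"InstanceId": "i-111", "State": "attached"}]},
    {"VolumeId": "vol-bbb", "State": "available", "Attachments": []},
    {"VolumeId": "vol-ccc", "State": "available", "Attachments": []},
]

orphans = find_orphaned_volumes(sample_volumes)
print(f"Orphaned volumes: {orphans}")
```

In a live environment the same function could be fed the output of a scheduled DescribeVolumes call, with the result routed to whoever owns the clean-up decision.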
To date, we believe there are over 200 signatures that need to be implemented in the MVC. These range from object storage controls, IAM checks and encryption validation to key rotation schedules and many more. There are vendors who provide governance frameworks to address certain operational domains such as security IPS/IDS or firewall rules. However, no one tool does them all. It takes a combination of tools and custom software to cover all the bases.
At scale, continuous governance is a combination of security, risk, compliance and finance controls that are implemented using software. And like any software controls, managing the profiles is where you gain your greatest benefits in the form of consistent, repeatable outcomes with fewer errors.
#9 – Implement Automation Frameworks
Throughout these best practices, we speak of automation as a core tenet of implementation. Infrastructure as code is the mantra. At the core of cloud adoption is the automation of infrastructure builds for every application. The goal is to have each application implemented and deployed through code. We want to take a DevOps mentality to the development of our new cloud environment.
At the heart of the automation mantra are the MVC automation templates. Your goal is to get to repeatable automation templates that carry the operational governance we spoke about in the prior section. For example, onboarding a new application team to your MVC should pull 90% or more of its code for the cloud platform from GitHub and the frameworks you are managing.
Building a Minimum Viable Cloud includes producing repeatable automation templates that are used to onboard new application teams. In the templates are the common services, governance rules, tagging scenarios, metadata, VPC, IAM roles, image repository and a host of other services delivered from your MVC Hub. The automation templates save a ton of time and reduce a huge amount of risk by eliminating much of the human error.
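As a small illustration of a template that carries governance with it, the sketch below generates a minimal CloudFormation template for a Spoke VPC and refuses to do so unless the mandatory tags are supplied. The tag policy, the function name and the sample values are assumptions for the example; a real template would contain far more than a single VPC.

```python
import json

# Hypothetical tagging policy enforced at template-generation time.
REQUIRED_TAGS = ("Owner", "CostCenter", "Application")


def spoke_vpc_template(cidr_block: str, tags: dict[str, str]) -> dict:
    """Build a minimal CloudFormation template for a spoke VPC with mandatory tags."""
    missing = [t for t in REQUIRED_TAGS if t not in tags]
    if missing:
        raise ValueError(f"missing required tags: {missing}")
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Spoke VPC generated from a shared template (illustrative)",
        "Resources": {
            "SpokeVpc": {
                "Type": "AWS::EC2::VPC",
                "Properties": {
                    "CidrBlock": cidr_block,
                    "Tags": [{"Key": k, "Value": v} for k, v in sorted(tags.items())],
                },
            }
        },
    }


template = spoke_vpc_template(
    "10.20.0.0/16",
    {"Owner": "retail-bu", "CostCenter": "cc-1234", "Application": "storefront"},
)
print(json.dumps(template, indent=2))
```

Enforcing the tag policy in the generator means an application team cannot produce an untagged VPC by accident, which is the kind of human error the templates are meant to remove.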
The new processes are focused on controlling the content of the automation templates, code repositories and server image libraries. Change management is now around code management within a group that has never done software development as a core discipline. Thus, it is essential to foster a DevOps model of management and tighten the relationship with the software team.
#10 – Prepare for Migration @ Scale
Migration @ Scale refers to the technology, processes and people who move application workloads to the cloud leveraging a factory model. There is a deep desire within many clients to get out of the data center business. Most executives understand the benefits of public cloud and have directed executive IT leadership to reduce data center costs by moving to cloud.
Across industries, our experience points to an average TCO savings of around 40% year-over-year, and the primary way to achieve this goal is through an application migration factory approach.
Accomplishing a solid reduction in TCO requires a significant migration of application workloads to the cloud.
In our prior nine best practices, we have been preparing your team to move many hundreds, if not thousands of applications. This requires a solid, factory approach to migration. Having determined which applications can move to the cloud, we set up the cloud environment, secured it, and prepared operations to receive the applications.
Since cloud platforms are not 100% backward compatible, we must decide if the application can be migrated without change, or if it requires modification before the move. Depending on the complexity, age and architecture of the application, the level of effort to migrate to the cloud can range greatly. Therefore, we recommend an application migration workbench approach.
Setting Up Migration Workbenches
A migration workbench is a team of engineers who perform a specific set of migration tasks. There are six migration workbench types. They are as follows:
- Rehost — Generally referred to as ‘Lift and Shift’, this workbench is a machine to machine migration of the application and data to the cloud platform. This is the easiest of the migration tracks and can be done using automation for most of the tasks. However, you don’t realize all of the cloud benefits, because the applications have not been developed or rewritten for the cloud.
- Replatform — The replatform workbench is a team of engineers who perform minor, but critical replatforming functions to enable the application servers and software to run on the new cloud platform. Typically, this involves environment changes, not code level changes. Replatforming includes:
  - OS and / or database version upgrades
  - Significant DNS and networking changes
  - INI and configuration file changes
- Refactor — Refactoring an application occurs when code level changes are required. This should include code scans for blockers that would prevent migration to the cloud. At CTP, we’ve developed software tools that scan .NET and Java code for problem areas that may restrict migration to the cloud. Refactoring is a complex function and requires the team to have domain skills in cloud services as well as security and infrastructure knowledge.
- Retire — Retiring applications appears to be straightforward, but many clients overlook a lot of the benefits. On average, 30% of your data center applications will be retired due to replicated services within the cloud platforms. In addition, there are often applications running in the data centers that are maintained for compliance requirements only. They could be retired early in the cloud by building the system in software, testing it and then turning it off, thus using the services only when needed. This creates significant cost savings.
- Replace — Replacing applications is more complex and very dependent on the business unit or application owners. However, the Replace Workbench is required in the event that the total data center footprint is eliminated.
- Retain — Unless you have a zero-footprint goal, you will likely have to retain some portion of your application estate. At that time, it is really about ‘end of life’ scenarios and how you plan to decommission the platform. The challenge is not how to retain the applications, but how will the cloud based applications engage with the retained applications? Often the retained applications are centers of gravity. They are huge databases and legacy systems that have hundreds of connections to other services. If you are consolidating data centers and need to move these applications, sorting out the complex spider web of connections is the job of the Retain Workbench.
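Routing applications to workbenches can be expressed as a simple set of rules. The sketch below is one possible ordering, not our actual decision logic: the `AppProfile` attributes, the order in which the rules are checked and the sample applications are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class AppProfile:
    """Hypothetical facts gathered about an application during discovery."""
    name: str
    still_needed: bool = True
    duplicated_by_cloud_service: bool = False
    replacement_planned: bool = False
    must_stay_on_premise: bool = False
    needs_code_changes: bool = False
    needs_environment_changes: bool = False


def assign_workbench(app: AppProfile) -> str:
    """Map an application to one of the six workbenches (assumed rule order)."""
    if not app.still_needed or app.duplicated_by_cloud_service:
        return "Retire"
    if app.replacement_planned:
        return "Replace"
    if app.must_stay_on_premise:
        return "Retain"
    if app.needs_code_changes:
        return "Refactor"
    if app.needs_environment_changes:
        return "Replatform"
    return "Rehost"


# Made-up portfolio.
portfolio = [
    AppProfile("intranet-wiki"),
    AppProfile("order-api", needs_environment_changes=True),
    AppProfile("pricing-engine", needs_code_changes=True),
    AppProfile("legacy-monitoring", duplicated_by_cloud_service=True),
    AppProfile("mainframe-ledger", must_stay_on_premise=True),
]

assignments = {app.name: assign_workbench(app) for app in portfolio}
for name, bench in assignments.items():
    print(f"{name}: {bench}")
```

Checking the cheapest outcomes first (Retire, then Replace) keeps applications from consuming refactoring effort they do not need.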