Cloud Native Apps for the Ops Guy – What the hell is CNA anyway?

Intro: DevOps, containers and every other buzzword bingo concept seem to be severely muddying the waters of understanding the practicality of web-scale application development and it’s relevance to the everyday Operations Team. I’m hoping this series explains the basics of Cloud Native Applications from the perspective of an ops guy by joining some of the dots across a commonly misunderstood topic. 

Are we “failing fast”, or just plain failing?..

Cloud Native Applications (CNA) has to be one of the most enigmatic concepts in operations today, but it is widely accepted as the future of software development. New school, software centric businesses like NetFlix brought microservices and open source software into sharp focus with their recent success, showing both can be heavily utilised to support a multi billion dollar business. Couple that with the fact traditional application scaling and resiliency is now considered to be slow, inefficient and expensive and it’s of little wonder that everything with a Cloud Native prefix is gaining such rampant popularity.

With that said, I’m always a little alarmed when I have a conversation with a fellow ops engineer that generates grumblings along the lines of…

Our devs just build stuff in the cloud, then dump it on us with little or no information on how to operationalise it…

I hear this quibble (or variation of) surprisingly often. You would almost think it was standard practice for all dev > ops hand overs, right? Question is; Do most devs actually hand over with poor context, or is it a lack of knowledge / unwillingness to learn the basics of modern web application architecture that holds us ops fellas back?

Screen Shot 2015-11-02 at 5.28.08 PM

If you believe the later, then clearly it is critical for us to understand how our development teams function (what with them being the future n’ all). The depth of our understanding however is something most of us will have struggled with.

DevOps today is just tomorrows operations…

Let’s go back to our recent past (like yesterday). Does weekly CAB meetings, architectural review boards, problem management, excessive documentation all sound familiar? This is the kind of rigid ITSM doctrine that has been pushed for many years to bring stability to complex environments. The problem is that it’s costly and moves at a near glacial pace. Pretty much the opposite of agile…

Don’t get me wrong, stability and resiliency are critical for enterprise, a net result of having to baby sit a whole raft of buggy software. This is why most ops guys view devs as cowboys. From our perspective it’s buggy code that stresses the supporting platform, causes outages and so on (the age old blame game). The impact of ‘bugginess’ depends on how far reaching the impacted code is within a service. Monolithic, layered applications tend to be the worst affected because they are comprised of only a few services which are tightly coupled (ie. close dependencies with other services).

Screen Shot 2015-11-08 at 9.32.56 AM

And therein lies the beauty of CNA: Applications comprised of numerous, loosely coupled, highly distributed services where the impact of faults (read: bad code, potentially 😉 ) is drastically reduced. Sound good? I think so…

The reality is most software developers have already had this epiphany and are working (or experimenting) with platforms that support CNA today. The adoption of this practice has forced infrastructure teams to be more forward thinking in accomodating what our devs are building. Make no mistake, they are leading us into the future. What’s more important right now is that we have a opportunity to control and shape the platform(s) used to host, deploy and secure cloud native applications.

Where to start?..

To paraphrase from some self-awareness reading I did recently; first seek to understand, then to be understood. The initial problem I had was that CNA is more of a broad conceptual practice than a prescriptive framework. So the first port of call was to gain a high level understanding of CNA basics so at the very least I could have a constructive conversation with software development teams. I’m not saying this will work for everyone, but it did for me…

First off, I split my research into two areas of understanding; Foundational Concepts and CNA Enablers. This is in no way official, it’s just my way of breaking down a broad topic into chunks that I could easily consume.

Disclaimer: I am pretty far from being any kind of authority on CNA, so rather than try to cover everything or take credit for others great work I have also included a bunch of links which gave me a high level understanding of CNA. These blogs, posts and articles are not intended to show how to implement or architect. This is purely an exercise in understanding…

Foundational Concepts – The following concepts / components are constituents of a CNA practice. This should help us to understand a modern software developers challenges.

Twelve-Factor Applications – The 12-factor app is a widely accepted framework of best practices for deploying scalable applications on any platform. There are litterally hundreds of blogs detailing this and it’s a lot of info to digest for the uninitiated, but the awesome 12-Factor Apps in Plain English helped me to understand this concept with relative ease.

Microservices – I like to use the terrible analogy of a legacy Windows OS as a starting point for understanding microservices (note: it’s just for common ground for explaining a construct and completely unrelated to CNA in general). Hundreds of single instance, single function, fragile services combined to run a basic operating system. I say fragile because historically any one of approx 50% of these services could cause a critical failure in the overall OS, and did, regularly. Imagine if all these services actually ran independently and in a fault tolerant, balanced state so any issue could be isolated and remediated with little or no service degradation!

Essentially microservices are just that. Loosely coupled, lightweight, easily scaleable components that have a single essential function within in a lager application or service. What are microservices explains this in basic detail, but Microservices – Not a free lunch brings a dose of reality by helping us to realise the level of technical maturity required for effective usage. If you do want to get deeper into microservices @martinfowler is a great chap to follow and his blog has a plethora of information to get you pretty far down the rabbit hole, so to speak.

API Collaboration – Microservices are of little power unless they can interact with other microservices. This collaboration is achieved almost exclusively via addressable APIs (typically a web service protocol such as REST). I don’t want to delve to deeply, so A Beginner’s Guide to Creating a REST API is great place to start for understanding what the hell RESTful API’s are in the first place. I also like Build a reusable REST API back end as it identifies some of the common pitfalls likely to be experienced in an enterprise .

Anti-Fragility – This is another concept pioneered by Netflix and popularised through their use of Chaos Monkey in addition to other anti-fragility tools. It’s ‘simply’ a practice wherein services are designed to be highly resilient (self-healing) to abnormal states or stressors (faults, load spikes, etc). In addition, application architects learn how to strengthen the composite service(s) through deliberately inflicting the aforementioned anomalies and then remediating and reinforcing through a constant learning cycle.

What’s interesting here is that anti-fragility requires a level of creativity and strategically defensive architecture that is well outside the normal realms of traditional operations resiliency. DevOps Anti-fragility and the Borg Collective summarises this in a blog that borrows from Star Trek to articulate the beauty in this principal.

CNA Enablers – This is where our focus lies in operations. These are the mechanisms, platforms and practices that will enable us to standardise and control the deployment of highly-distributed (web-scale) applications.

Containers – Containers are an ideal construct for CNA and CNA is arguably the best use case for containers. They provide a consistent runtime across multiple platforms, built with common toolsets, common API and they support extremely rapid lightweight provisioning. From an operational perspective, when building services utilising containers (or any other form of portable construct) it also means that we are not locked into a proprietary PaaS offering which can act as an anchor in the longterm. I believe this is one of the key challenges to successful cloud adoption and a potentially serious lock-in issue that our developers are not aware of.

VMware is focusing on bringing easily consumable, open source container technology to the enterprise through Photon (Machine & Controller) and vSphere Integrated Containers (VIC aka. Project Bonneville). You can be sure I’ll be covering this in great detail in future posts.

Self-Service Infrastructure – CNA’s effectiveness is reliant upon having rapid, on-demand access to platform resources. Containers can provide this but they are not commonplace in the enterprise which is why public cloud tends to be the landing ground for these CNA type apps. Not only do Dev’s need instantaneous access to PaaS resources, but also the ability to rapidly deploy and configure IaaS resources in a structured and repeatable manner (ie. infrastructure as code).

Immutable infrastructure – As an extension of self-service infrastructure, immutable infrastructure is simply the concept of deploying infrastructure once with zero alteration once deployed. If a change needs to be made (or a fault rectified), simple destroy the instance and roll out a new one in it’s place. This is a fantastic way to ensure service integrity through version control. The impact of this is significant and well explained in An Introduction to Immutable Infrastructure.

In summary, I think it’s important to understand that this is still a relatively new mode of operations in the enterprise realm. Unless you are working for a business with software at it’s core it is unlikely that it will be your entire focus for quite some time, however it is always good to be prepared.

 

References and credits:

 

Author:@Kev_McCloud

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s