Cloud Native Apps for the Ops Guy – 3 Containerised Tools for the VMware Engineer…

Over the last year (give or take a few months), VMware has been diligently tweaking a variety of its products to integrate container functionality as it becomes more prevalent in the enterprise. With this in mind, I thought I’d put together a quick post detailing three VMware tools which can be used in a simple containerised format.

Update: The below tools are intended to be run as single Docker commands rather than launching a terminal session on the container, as you would with William Lam's much more comprehensive vmware-utils Docker appliance. If @lamw's approach is more your bag, you can read about it here.

OVFTool v4.2

For me, OVFTool is a great CLI utility for migrating VM templates and ISOs to & from vCloud Air, although its functionality extends way beyond VCA. As a little side project I thought it would be a great idea to containerise the most recent release (v4.2), instead of installing it on my Mac and dealing with potential conflicts. To my delight this was a relatively easy task and took less than 10 mins to build, commit and push to my public repo.

Disclaimer: This image is hosted on my public Docker Hub registry; however, it is not officially endorsed or supported by VMware in any way. That said, please feel free to use (but at your own risk).

To use, simply enter the below Docker command, which will allow you to interactively (-i) run the skinnypin/ovftool image and execute an OVFTool command, in this case ovftool --help.

~> docker run -i skinnypin/ovftool ovftool --help
Example output
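Beyond --help, you will usually want ovftool to read or write files on your machine, which means mounting a host directory into the container. A minimal sketch (the /data mount point and the .vmx/.ova file names are placeholders I have chosen, not part of the image):

```shell
# Build the command first so it can be inspected before running; -v maps the
# current directory to /data inside the container so ovftool can see local files.
OVF_CMD=(docker run -i -v "$PWD:/data" skinnypin/ovftool
         ovftool /data/myvm.vmx /data/myvm.ova)
echo "${OVF_CMD[@]}"   # dry run; execute "${OVF_CMD[@]}" against a live Docker daemon
```

Run the echoed command on any host with Docker available and the converted OVA lands back in your working directory.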

PowerCLI Core

PowerCLI Core builds upon the open source Microsoft PowerShell Core and .NET Core, enabling the use of PowerCLI on non-Windows operating systems. As a Mac user, having to open up a Windows VM in VMware Fusion just to use PowerShell has been a little inconvenient. But no more…

In addition to availability on OS X and Linux, the awesomeness of PowerCLI Core can also be accessed via an official VMware Docker image. For more info on PowerCLI Core see here.

To use, enter the below Docker command, which will give you interactive (-it) access to the PowerCLI prompt.

~> docker run -it vmware/vmwarepowercli
Example output

Project Platypus

Project Platypus is a very nice tool built by my good friend Grant Orchard (and other VMware folks) which details supported VMware product APIs and their usage. If you’ve ever tried to utilise VMware APIs by referencing the official documentation, you’ll understand why this tool is absolutely necessary. To the best of my knowledge Platypus is only available in this containerised format, so if you want the goodness you’re going to have to get familiar with Docker…

Details available on Github here.

To use, enter the below Docker command, which will run a detached (-d) container accessible from your web browser on port 8080 (-p 8080:80) using the IP address of your container host.

~> docker run -d -p 8080:80 vmware/platypus
Example output
VMware Platypus Web UI
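Once the container is up, a quick way to confirm the UI is reachable is to hit the mapped port from the container host. A sketch, assuming the -p 8080:80 mapping from the run command above:

```shell
# -sSf makes curl quiet but fail loudly on HTTP errors; 8080 is the host side
# of the -p 8080:80 mapping.
CHECK_CMD=(curl -sSf http://localhost:8080/)
echo "${CHECK_CMD[@]}"   # dry run; execute it once the container is running
```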

So there you have it. Three easily accessible VMware tools that can be distributed without having to read any installation documentation (as long as you have access to a Docker environment). As always, feedback is appreciated, especially if this is useful and you want to see other tools available in this format.


Update 2 > BONUS TOOL: I also spent some time Dockerizing VIC Machine (v0.8.0-rc3), the container host provisioning utility used with vSphere Integrated Containers. Details on VIC here.

To use, simply enter the below Docker command, which will allow you to interactively (-i) run the skinnypin/vic image and execute a VIC Machine command, in this case vic-machine-linux --help.

~> docker run -i skinnypin/vic /vic-machine-linux --help
Example output



Author: @Kev_McCloud

VCA Dissected – Docker Machine Driver for vCloud Air

If you’ve followed my blog or seen me presenting in the last six months you may have noticed I have developed a keen interest in Cloud Native Apps and DevOps in general. I was lucky enough to present a combined CNA/vCloud Air session at VMworld this year which was a little different from the hybrid cloud talks I usually give.

In addition to the ‘what-why-how’, I also ran a live demo showing the provisioning and decommissioning of a remotely accessible VCA Docker host, complete with NAT and firewall configuration, using two simple commands. Since Las Vegas I have been meaning to post how I constructed the demo, so here it is.

Note: some prior knowledge of basic vCloud Air administration and Docker functionality is assumed…

Docker Machine Driver for vCloud Air

In my previous post I talked about VMs and containers living side by side, as decomposing (or building alongside) monolithic apps can take an extended period of time, or may not be possible at all. To support this notion, VMware has made great strides in the containers space to provide technology that allows organisations to run containers natively on vSphere (through VIC) or on Photon Platform, depending on operational requirements and overall maturity with cloud native apps.

However, there is one aspect of the VMware CNA vision that is often overlooked, namely vCloud Air. This may be because vCloud Air does not have a native container offering (at the time of writing), but it does have support for Docker Machine, which is an essential part of the Docker workflow if using Docker Toolbox for administration.

What do we need?

In order to use the Docker Machine Driver for vCloud Air, we will need a VCA subscription (either Virtual or Dedicated Private Cloud) and a user account with network & compute administrator permissions assigned. With this we can go ahead and create a private template which Docker Machine will use to create our container host. Note: if not specified in our docker-machine create command, Docker Machine will use Ubuntu Server 12.04 LTS from the VCA Public Catalogue by default.

Quick Tip: To create a quick template I used the Ubuntu Server 12.04 LTS image from the VCA Public Catalogue, as it already has VMware Tools installed. After running my usual VCA Linux template prep (root password change, network config, SSH config, apt-get update, apt-get upgrade, etc.), I renamed vchs.list to vchs.list.old in /etc/apt/sources.list.d/. I did this because when Docker Machine runs through the provisioner process it uses apt-get to retrieve packages from the VCA repo, which can sometimes be slow to respond. This occasionally results in the provisioner process timing out (as it did in my demo at VMworld… grrr). Note: once the initial template has been created, the repo is not needed for the Docker provisioning process.

Provided we have access to a VCA routable network and an available public IP address, we can go ahead and run a relatively simple shell script to execute the entire provisioning process. It should be noted that I created this script to be easily distributed to anyone needing quick access to a Docker environment, provided they had the correct VCA permissions. It also avoids storing your VCA password in clear text.


#!/bin/bash
# Simple docker-machine VCA Docker host creation script
# Prompting for credentials avoids storing the VCA password in clear text.

read -p "Enter VCA user name: " VCA_USER
read -s -p "Enter VCA password: " VCA_PASSWORD
echo

docker-machine create --driver vmwarevcloudair \
--vmwarevcloudair-username="$VCA_USER" \
--vmwarevcloudair-password="$VCA_PASSWORD" \
--vmwarevcloudair-vdcid="M123456789-12345" \
--vmwarevcloudair-catalog="KGLAB" \
--vmwarevcloudair-catalogitem="DMTemplate01" \
--vmwarevcloudair-orgvdcnetwork="KGRTN01" \
--vmwarevcloudair-edgegateway="M123456789-12345" \
--vmwarevcloudair-publicip="x.x.x.x" \
--vmwarevcloudair-cpu-count="1" \
--vmwarevcloudair-memory-size="2048"


The expected output is as follows…

Sample docker-machine output

Note: this is a minimal subset of options for basic VCA container host provisioning. I have also changed the VDC ID, Edge Gateway ID and public IP in the example script for obvious reasons. A full list of Docker Machine Driver for vCloud Air options can be found on the Docker website here.

Once the provisioner process is complete, we should have an internet-accessible container host configured with 1 vCPU and 2GB of memory, with Docker installed, running and listening for client commands on the public IP we specified earlier.

To natively connect to this environment from our Docker client we simply enter the following…


~> eval $(docker-machine env DockerHost01)
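What the eval actually consumes is a set of export statements that point the local Docker client at the new host. The exact output varies by version, but it looks roughly like this (the values here are illustrative):

```shell
# These variables redirect every subsequent `docker` command to the remote
# VCA host over TLS on port 2376.
export DOCKER_TLS_VERIFY="1"
export DOCKER_HOST="tcp://x.x.x.x:2376"
export DOCKER_CERT_PATH="$HOME/.docker/machine/machines/DockerHost01"
export DOCKER_MACHINE_NAME="DockerHost01"
```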


That was easy, right? Well… it’s not quite that simple.

The above will create a relatively insecure Docker environment as the edge firewall rules are not locked down at all (as shown below).

Default docker-machine VCA Firewall configuration
Default docker-machine VCA SNAT/DNAT configuration

This can be handy for testing internet-facing containers quickly, as we do not need to explicitly define and lock down the ports needed for external access. However, if this Docker host is intended to become even a little more permanent, we can use VCA-CLI or the VCA web/vCD user interface to alter the rules (at a minimum, port 2376 needs to be open from a trusted source address for client-server communications, plus whatever ports are needed to access containers directly from the internet).

Assuming our environment is temporary, we can also tear it down quickly using:


~> docker-machine rm DockerHost01


So there you have it. The entire provisioning process takes less than 5 mins (once you have set up a template) and decommissioning takes less than 2 mins! In addition to the simple tasks I’ve outlined here, we can also use a similar process to create a Docker Swarm cluster, which I will cover in my next post.

As always, if you have any questions or feedback feel free to leave a comment or hit me up on Twitter.


Author: @Kev_McCloud

Cloud Native Apps for the Ops Guy – VMs and Containers Living Together in Harmony

Disclaimer: This is not a technical tutorial on Docker or vSphere Integrated Containers, rather my views on the philosophy and gradual integration of containers into our existing VMware ecosystem.

Containers will replace VMs… Eventually… Maybe…

I recently presented a session at the Melbourne VMUG called “Can VM’s and Containers happily coexist?”. Though somewhat rhetorical, the title was born out of the protracted argument that containers will somehow surpass VMs in the near future. To condense our session into a single sentence: we tackled a brief history of containers, Docker’s rise to fame and the inherent issues with that rise. Despite the age of container technology, the fresh-faced vendors have yet to prove their worth as a wholesale replacement for virtualisation.

In my first post in this series I described the basic tenets behind Cloud Native Applications, one of which is the 12 Factor App. This framework has arguably become the unofficial guideline to creating applications suitable for a microservices architecture, but it also lends itself perfectly to illustrating why a vast majority of existing monolithic & layered applications are not suitable for decomposition.

It’s also worth bearing in mind that it may be more efficient to build new functionality & services around an existing monolith, a concept Martin Fowler refers to as the “Strangler Application” aka Strangling the Monolith. Simply, if it ain’t broke, don’t fix it… just gradually improve it!

Taking both these factors into consideration, it becomes clear that VMs will play their part in existing organisations for some time yet, albeit sharing the limelight with their slimmer, more popular counterparts.

Evolution of the workload…

We’ve all heard the pets vs livestock analogy many times, but a recent focus on microservices and the art of driving economy through mass autoscaling has introduced the ‘Organism’: a computing entity that is minuscule from both a footprint and lifespan perspective and has little impact as an individual, but when combined with other organisms the grouping becomes highly dynamic and resilient. Where can we find these organism-type workloads being used to great effect? Think Google, Facebook et al.


Allow me to digress to make a point. A large part of my role is identifying how to modernise datacenter practices to incorporate cloud technology in whatever format aligns best with a strategic outcome. I only mention this because I believe we are well beyond the point of debating whether container technology can be useful for a business that has come from traditional mode 1 operations.

In my opinion most organisations are still finding the balance between pets and livestock through evolved lifecycle practices, but that doesn’t mean they can’t incorporate the numerous benefits of containers beyond those more commonly found in environments with large-scale workload churn.

Note: As an aside, I recently watched “Containerization for the Virtualisation Admin” posted on the Docker blog where the misconception of containers only supporting microservices was dismissed, something I have long been arguing against. Nice one guys.

Sneaking containers to operations…

For ops, the most common first encounter with containers is likely to occur when devs request a large Linux/Windows VM that will eventually become a container host, unbeknownst to the ops team. In almost all cases this means that operations lose visibility of what’s running within the container host(s). Not ideal for monitoring, security, performance troubleshooting and so forth.

In this scenario our devs’ interaction with the rogue Docker hosts may look something like below:

[Diagram: devs interacting directly with rogue Docker hosts, invisible to ops]

This approach leaves a lot to be desired from an operational perspective and therefore rules out almost all production scenarios. A better approach is to try to evolve the compute construct to suit existing practices within a business. In VMware’s world, this means treating VMs and containers as one and the same (at least from an evolutionary perspective).

A box within a box!?

As an atomic unit, do developers care if a container runs in a VM, natively on a public cloud or on bare metal? Well… it depends on the developer, but generally the answer is no. As long as they have free range access to a build/hosting environment with the required characteristics to support the application, all is good. For operations this is an entirely different story. So like in any good relationship, we need to compromise…

Arguably, there is an easy way to get around this issue: create a 1:1 mapping of containers to VMs (I can already hear the container fanboys groaning). Yes, we lose some of the rapid provisioning benefits (from milliseconds to seconds) and some of the broader Docker ecosystem features, but we don’t have to forklift an environment that we have spent years (and lots of $$$) refining. Anecdotally, it seems we have spent so long trying to create monster VMs that we have forgotten the power of the hypervisor’s ability to balance and isolate numerous tiny VMs.

Admittedly, for some organisations the bottleneck of provisioning individual VMs is still a very real headache for the development team…

*Fanfare… Enter vSphere Integrated Containers!

vSphere Integrated Containers (aka VIC) provides our devs with a way to work transparently with containers in a vSphere environment, reducing a lot of the friction traditionally found in having operations create VMs.

The premise behind VIC is to overlay the container construct onto existing vSphere functionality, but with all the characteristics of a container (isolated, lightweight, portable, etc.). This has numerous benefits for operations around resource control/distribution, monitoring and security, using mechanisms that are already well established (and, more importantly, well understood) by network and security teams.

[Diagram: the container construct overlaid onto existing vSphere functionality]

So we can visualise the above using a familiar interface like vCenter. If I run a basic command like <$ docker run …> from my Docker client against the Docker daemon running on my Virtual Container Host, vCenter launches an Instant Clone forked VM with a single container running inside. From a vCenter perspective we can see the container running in the same vApp where the VCH and Instant Clone template exist.
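Under the hood this is the standard Docker client-server model: the client is simply pointed at the VCH endpoint rather than a local daemon. A sketch (the VCH address here is illustrative, not from my lab):

```shell
# -H targets a remote daemon; a deployed VCH listens like any other Docker
# endpoint, typically over TLS on 2376.
VCH_CMD=(docker -H tcp://vch01.lab.local:2376 --tls run -d nginx)
echo "${VCH_CMD[@]}"   # dry run; execute against a deployed VCH
```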

[Screenshot: vCenter view of the container VM alongside the VCH and Instant Clone template]

Note: The version of VIC I used for this screen shot is based on Project Bonneville (detailed here) to show the use of the command <docker ps -a> which displays both running and exited containers. At the time of writing (0.3.0), the VIC beta (available here) did not support certain docker commands, including ps. Based on user feedback there have been some changes to the overall architecture to better align with real world requirements. More to follow soon…

The result is that vSphere admins can enforce control directly from the parent resource pool. We can route, monitor, shape and secure network traffic on the port group assigned to the VCH as the Docker bridge network. We can set CPU/memory shares, reservations and limits to ensure we don’t compromise other workloads… and our devs get access to a Docker environment that operations fully comprehend, with existing operational policies and procedures that can be adapted.


Before the container/microservices fanboys get up in arms: this post was not intended to show the use of containers for isolated applications or greenfield projects, but rather the integration of a new construct into an existing VMware enterprise. IMO, traditional organisations value the portability, ubiquity and flexibility of Docker across disparate operational platforms more than rapid provisioning and scaling… and us Ops folk need to learn to walk before we can sprint…

In the next post of this series we will start to tackle the challenge of scaling using the same philosophies detailed in this post. See you next time.


Cloud Native Apps for the Ops Guy – Container Basics

Welcome back, all. In part 1 we covered a basic understanding of Cloud Native Applications (CNA) and, more importantly, its relevance to today’s IT operations teams. Let’s start with a quick recap of what we’ve already covered:

  • CNA is an evolution in software development focused on speed and agility, born out of removing the challenges of the traditional SDLC
  • Ops Teams are struggling to fully operationalise modern web-based applications, i.e. build and/or maintain operational practices for hosting cloud native applications
  • There is a vast array of CNA architectures, platforms and tools, all of which are in their relative infancy and require a degree of rationalisation to be useful in the enterprise

I also covered breaking down my understanding of CNA into two areas of research: Foundational Concepts and CNA Enablers, the latter of which we’ll cover in this post.

How can CNA fit into existing IT Operations?..

To see where CNA Enablers might fit, I took a look at the responsibilities of a modern IT Team for application delivery. At a high-level our priorities might cover:

Development Team: Application architecture / resiliency / stability / performance, deployment, version control, data management, UX/UI.

Operations Team: Platform automation / orchestration / availability, scaling, security, authentication, network access, monitoring.

Note: this is a very generic view of the average IT team dichotomy, but it does illustrate that there is virtually no crossover. More importantly, it shows that the core operational tasks are still aligned with keeping hosting platform(s) alive, secure and running efficiently. So with this in mind, how do we go about bringing operations and development closer together? Where will we start to see some overlap in responsibilities?

Introducing Containers…

There has been a lot of commotion around containers (and by association, microservices) as the genesis of everything cloud native; however, Linux containers have existed for a long time. If we filter the noise a little, it’s clear that containers have become essential because they address the lack of standardisation and consistency across development and operations environments, which has become more prevalent with the growing adoption of public clouds like AWS.

So what is all the fuss about? To begin to describe the simple beauty of containers, I like to think of them as a physical box where our developers take care of what’s inside the box, whilst operations ensure that the box is available wherever it needs to be. The box becomes the only component that both teams need to manipulate.


To overlay this onto the real world: our devs have to deal with multiple programming languages and frameworks, whilst we (as ops) have numerous platforms to maintain, often with drastically different performance and security characteristics. If we introduce a container-based architecture, the “box” reduces friction by providing a layer of consistency between both teams.
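In Docker terms, the “box” contract is typically a Dockerfile: devs declare what goes inside, and ops just run the resulting image anywhere. A minimal, hypothetical sketch (the base image, paths and entrypoint are illustrative, not a real app):

```shell
# Writing the Dockerfile via a heredoc just to keep this example self-contained.
cat > Dockerfile.example <<'EOF'
# Base image: the container OS layer (something ops can standardise on)
FROM photon:latest
# The dev's code plus its bundled dependencies
COPY app/ /opt/app/
# One well-known entrypoint that ops can run on any platform
CMD ["/opt/app/run.sh"]
EOF
```

Build it with `docker build`, and from that point on both teams only ever exchange the image.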

Note: There are plenty of awesome blogs and articles which describe the technical construct of a container in minute detail. If this is your area of interest, get Googling…

Architecture basics…

Now for me it was also important to understand that containers are not the only way to deploy a cloud native architecture (please refer to this excellent post from my VMware colleague @mreferre), but also to acknowledge that they are important for a number of reasons, namely:

  • They provide a portable, consistent runtime across multiple platforms (desktop, bare metal, private & public cloud)
  • They have a much smaller, more dynamic resource footprint
  • They can be manipulated entirely via API
  • They start in milliseconds, not seconds or minutes
  • They strip away some layers which could be considered to add significant ‘bloat’ to a traditional deployment
  • Taking into account all of the above they provide a better platform for stateless services

If we compare a “traditional” VM deployment to a containerised construct (diagram below), it’s evident that gen 2 (i.e. monolithic / single code base) apps often have a larger resource overhead because of their reliance on vertical scaling and the tight coupling of their constituent parts. If we have to move (or redeploy) a gen 2 app, we need to move (or redeploy) everything northbound of the VM layer, which can be considerable if we are moving app data as well.

[Diagram: a traditional gen 2 VM deployment compared with a gen 3 containerised construct]

Note: The above diagram is not intended to show a refactoring from gen 2 to gen 3, but instead how the same applications might look if architected differently from scratch.

From an operational perspective, gen 3 (i.e. cloud native) apps leverage containers and have a far greater focus on horizontal scaling, whilst greatly increasing consolidation of supporting platform resources.

As a comparison, when moving gen 3 apps between environments we only have to push the updated app code and supporting binaries/libraries not included in the base OS. This means we have a much smaller package to move (or redeploy) as the VM, guest OS and other supporting components already exist at the destination. Deployment therefore becomes far more rapid with far less dependency.
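Docker’s image layering is what makes this cheap in practice: pushing a rebuilt image to a registry only uploads the layers that changed, so a redeploy moves megabytes of app code rather than a whole VM. A dry-run sketch (the registry, image name and tag are hypothetical):

```shell
# Tag the locally built image for a private registry, then push; layers the
# registry already has (base OS, unchanged libs) are skipped automatically.
TAG_CMD=(docker tag myapp:1.1 registry.example.com/myapp:1.1)
PUSH_CMD=(docker push registry.example.com/myapp:1.1)
echo "${TAG_CMD[@]}"
echo "${PUSH_CMD[@]}"   # dry run; execute on a host with Docker and registry access
```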

Now this is all very exciting, but in reality gen 2 and gen 3 will have to coexist for some time yet, therefore it’s probably best to have a strategy that supports both worlds. For this reason, I am researching the synergies between the two constructs which is where I believe many IT shops will thrive in the near term.

Where do we begin?..

If we start with a minimal platform, all we really need to build a containerised application is: a host, an OS which supports a container runtime, and a client for access. It’s entirely possible to build containerised applications this way, but obviously we are severely limited in scalability. Once we go beyond a single-host platform, management becomes far more complex and therefore requires greater sophistication in our control plane. But I guess we should try to walk before we run…

Let’s take a closer look at some of the layers of abstraction we will be working with. Note: so as not to confuse myself with too many technologies, I’ve focused my research on VMware’s Photon (for obvious reasons) and Docker, which I believe has firmly established itself as the leader in container runtime and container management software.

Container Engine / Runtime – This is the software layer responsible for running multiple isolated systems (i.e. containers) within a single host, giving each its own view of CPU, memory, block I/O and network via cgroups and namespaces. It is also responsible for scheduling critical container functions (create, start, stop, destroy) in much the same way a hypervisor does.

In the case of Docker, it’s also the runtime that carries out tasks for the Docker daemon, which exposes the Docker API for client-server interaction (through a Unix socket or REST API).
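You can see this client-server split directly: the same call the CLI makes for `docker ps` can be issued against the daemon’s local socket by hand. A sketch (it needs a running Docker daemon, so it is guarded here):

```shell
# The Docker daemon listens on a Unix socket by default; curl can speak to
# its REST API directly, which is exactly what the docker CLI does.
SOCK=/var/run/docker.sock
if [ -S "$SOCK" ]; then
  # List running containers, as `docker ps` does.
  curl -sS --unix-socket "$SOCK" http://localhost/containers/json
else
  echo "no local Docker daemon socket at $SOCK"
fi
```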

Container OS – A container OS (as the name would suggest) is an operating system which provides all the binaries and libraries needed to run our code. It also enables the container engine to interact with the underlying host by providing the hardware interfacing operations and other critical OS services.

Photon is VMware’s open source Linux operating system, optimised for containers. In addition to Docker, Photon also supports rkt and Garden, meaning we are not limited to a single container engine. It’s also fully supported on vSphere (and therefore vCloud Air) and it has no problems running on AWS, Azure and Google Compute Engine (though it may not be fully supported by these service providers at the time of writing).

Note: If you feel like having a play around with Photon (ISO), it can be downloaded from here, deployed directly from the public catalogue in vCloud Air, or if you want to build your own Photon image you can also fork it directly from GitHub.

Host – Our operating system still needs somewhere to run. I believe that for most of us, virtual machines are still best here because of the sophistication of their security, management and monitoring capabilities. In the short term it means we can run our containers and VMs side by side, but it should be noted that we can also run our container OS on bare metal and schedule container operations through the control plane.

Platform – A platform in the context of operations is simply a hosting environment. This could be a laptop with AppCatalyst or Fusion, vSphere and / or private and public cloud, really any environment that is capable of hosting a container OS and the ecosystem of tools needed to manage our containers.

Basic Container Usage…

In order to make this an effective approach for our devs, they need self-service access to deploy code and consume resources as they see fit. The simplest approach for our devs is to deploy in an environment where they have full control over the resources, like their laptop.

Once we go beyond the dev laptop, our platforms might include on-premises virtual infrastructure, bare metal and public cloud. The platform itself is not really that important to our devs provided it has the capabilities needed to support the application. So ops really needs to concentrate on transparently supporting our devs’ ability to operate at scale. With that come operational changes, which might include:

  • Secure access to the container runtime (via a container scheduling interface, which we’ll cover in the next post)
  • Internal network communications to support containerised services functioning at scale, including virtual routing/switching, distributed firewall, load balancing, message queuing, etc
  • Secure internet and/or production network communications for application front end network traffic
  • Support for auto-scaling and infrastructure lifecycle management, including configuration management, asset management, service discovery, etc
  • Authentication across the entire stack defined through identity management and role based access controls (RBAC)
  • Monitoring throughout the entire infrastructure stack (including the containers!)
  • Patching container OS / runtime and all supporting platforms

Now I realise this is only scratching the surface, but if we listed all of the operational changes needed to incorporate this mode of delivery we would be here all day. For this reason I’m ignoring CI/CD and automation tools for the time being. Don’t get me wrong, they are absolutely critical to building a reliable self-service capability for our devs, but for now they just add a layer of complexity which is not going to aid our understanding. We’ll break it down in a later post.

So there you have it. In looking at the simple benefits that containers provide, we quickly begin to realise why so many organisations are developing cloud native capability. In the next post we’ll start to look at some of the realities of introducing a cloud native capability to our operations when working at scale.



Author: @Kev_McCloud



Cloud Native Apps for the Ops Guy – What the hell is CNA anyway?

Intro: DevOps, containers and every other buzzword-bingo concept seem to be severely muddying the waters when it comes to understanding the practicality of web-scale application development and its relevance to the everyday operations team. I’m hoping this series explains the basics of Cloud Native Applications from the perspective of an ops guy by joining some of the dots across a commonly misunderstood topic.

Are we “failing fast”, or just plain failing?..

Cloud Native Applications (CNA) has to be one of the most enigmatic concepts in operations today, yet it is widely accepted as the future of software development. New-school, software-centric businesses like Netflix brought microservices and open source software into sharp focus with their recent success, showing both can be heavily utilised to support a multi-billion-dollar business. Couple that with the fact that traditional application scaling and resiliency is now considered slow, inefficient and expensive, and it’s little wonder that everything with a Cloud Native prefix is gaining such rampant popularity.

With that said, I’m always a little alarmed when I have a conversation with a fellow ops engineer that generates grumblings along the lines of…

Our devs just build stuff in the cloud, then dump it on us with little or no information on how to operationalise it…

I hear this quibble (or a variation of it) surprisingly often. You would almost think it was standard practice for all dev > ops handovers, right? The question is: do most devs actually hand over with poor context, or is it a lack of knowledge / unwillingness to learn the basics of modern web application architecture that holds us ops fellas back?


If you believe the latter, then clearly it is critical for us to understand how our development teams function (what with them being the future n’ all). The depth of that understanding, however, is something most of us will have struggled with.

DevOps today is just tomorrow’s operations…

Let’s go back to our recent past (like yesterday). Do weekly CAB meetings, architectural review boards, problem management and excessive documentation all sound familiar? This is the kind of rigid ITSM doctrine that has been pushed for many years to bring stability to complex environments. The problem is that it’s costly and moves at a near-glacial pace. Pretty much the opposite of agile…

Don’t get me wrong, stability and resiliency are critical for the enterprise, partly as a net result of having to babysit a whole raft of buggy software. This is why most ops guys view devs as cowboys: from our perspective it’s buggy code that stresses the supporting platform, causes outages and so on (the age-old blame game). The impact of ‘bugginess’ depends on how far-reaching the affected code is within a service. Monolithic, layered applications tend to be the worst affected because they are comprised of only a few services which are tightly coupled (i.e. close dependencies between services).


And therein lies the beauty of CNA: Applications comprised of numerous, loosely coupled, highly distributed services where the impact of faults (read: bad code, potentially 😉 ) is drastically reduced. Sound good? I think so…

The reality is most software developers have already had this epiphany and are working (or experimenting) with platforms that support CNA today. The adoption of this practice has forced infrastructure teams to be more forward-thinking in accommodating what our devs are building. Make no mistake, they are leading us into the future. What's more important right now is that we have an opportunity to control and shape the platform(s) used to host, deploy and secure cloud native applications.

Where to start?..

To paraphrase from some self-awareness reading I did recently; first seek to understand, then to be understood. The initial problem I had was that CNA is more of a broad conceptual practice than a prescriptive framework. So the first port of call was to gain a high level understanding of CNA basics so at the very least I could have a constructive conversation with software development teams. I’m not saying this will work for everyone, but it did for me…

First off, I split my research into two areas of understanding: Foundational Concepts and CNA Enablers. This is in no way official, it's just my way of breaking down a broad topic into chunks that I could easily consume.

Disclaimer: I am pretty far from being any kind of authority on CNA, so rather than try to cover everything or take credit for others' great work, I have also included a bunch of links which gave me a high-level understanding of CNA. These blogs, posts and articles are not intended to show how to implement or architect. This is purely an exercise in understanding…

Foundational Concepts – The following concepts / components are constituents of a CNA practice. This should help us to understand a modern software developer's challenges.

Twelve-Factor Applications – The 12-factor app is a widely accepted framework of best practices for deploying scalable applications on any platform. There are literally hundreds of blogs detailing this, and it's a lot of info to digest for the uninitiated, but the awesome 12-Factor Apps in Plain English helped me to understand this concept with relative ease.
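One of the easier factors to see in code is factor III, storing config in the environment rather than in the build. Here's a minimal sketch in Python; the variable names (DATABASE_URL, DEBUG) are illustrative conventions, not taken from any particular app:

```python
import os

# Factor III (Config): read deploy-specific settings from the environment,
# so the exact same build artefact can run in dev, staging and production.
# These variable names are hypothetical examples.
DB_URL = os.environ.get("DATABASE_URL", "postgres://localhost:5432/dev")
DEBUG = os.environ.get("DEBUG", "false").lower() == "true"

def describe_config():
    """Return the effective configuration for this deploy."""
    return {"db_url": DB_URL, "debug": DEBUG}

print(describe_config())
```

The point is that nothing environment-specific is baked into the code, which is what lets ops swap credentials or endpoints per environment without a rebuild.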

Microservices – I like to use the terrible analogy of a legacy Windows OS as a starting point for understanding microservices (note: it's just common ground for explaining a construct and completely unrelated to CNA in general). Hundreds of single-instance, single-function, fragile services combined to run a basic operating system. I say fragile because historically roughly half of these services could each cause a critical failure in the overall OS (and did, regularly). Imagine if all these services actually ran independently and in a fault-tolerant, balanced state, so any issue could be isolated and remediated with little or no service degradation!

Essentially microservices are just that. Loosely coupled, lightweight, easily scalable components that each have a single essential function within a larger application or service. What are microservices explains this in basic detail, but Microservices – Not a free lunch brings a dose of reality by helping us to realise the level of technical maturity required for effective usage. If you do want to get deeper into microservices, @martinfowler is a great chap to follow and his blog has a plethora of information to get you pretty far down the rabbit hole, so to speak.
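The fault-isolation point is worth seeing concretely. Below is a deliberately toy Python sketch (no real framework, invented service names): in the tightly coupled version one failing dependency breaks the whole page, while in the loosely coupled version each single-purpose service is isolated and a fault only degrades its own feature:

```python
# Illustrative only: two tiny "services", one of which always fails.
def recommendations_service(user):
    raise RuntimeError("recommendation store unreachable")

def catalog_service(user):
    return ["item-1", "item-2"]

def render_page_monolith(user):
    # Tightly coupled: one failing dependency breaks the whole request.
    return {"catalog": catalog_service(user),
            "recs": recommendations_service(user)}

def render_page_microservices(user):
    # Loosely coupled: each call is isolated, so a fault degrades only
    # the feature it belongs to instead of the whole page.
    page = {}
    for name, svc in [("catalog", catalog_service),
                      ("recs", recommendations_service)]:
        try:
            page[name] = svc(user)
        except Exception:
            page[name] = None  # this feature degrades; the rest survive
    return page

print(render_page_microservices("alice"))
```

In real systems the isolation boundary is a network call plus patterns like circuit breakers, but the principle is the same: blast radius shrinks as coupling loosens.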

API Collaboration – Microservices are of little use unless they can interact with other microservices. This collaboration is achieved almost exclusively via addressable APIs (typically a web service protocol such as REST). I don't want to delve too deeply, so A Beginner's Guide to Creating a REST API is a great place to start for understanding what the hell RESTful APIs are in the first place. I also like Build a reusable REST API back end as it identifies some of the common pitfalls likely to be experienced in an enterprise.
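To make the idea tangible, here's a self-contained Python sketch using only the standard library: one "service" exposes a REST-style JSON endpoint over HTTP, and a consumer calls it with a plain GET. The /status resource and its payload are made up for illustration:

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# A minimal REST-style endpoint; the resource and payload are invented.
class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/status":
            body = json.dumps({"service": "demo", "healthy": True}).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_response(404)
            self.end_headers()

    def log_message(self, *args):  # silence per-request logging for the demo
        pass

server = HTTPServer(("127.0.0.1", 0), Handler)  # port 0 = any free port
threading.Thread(target=server.serve_forever, daemon=True).start()

# Another service consumes the API over plain HTTP + JSON.
url = f"http://127.0.0.1:{server.server_port}/status"
with urllib.request.urlopen(url) as resp:
    status = json.load(resp)
print(status)
server.shutdown()
```

That's the whole contract between two microservices: an addressable URL, a verb, and a JSON payload. Everything else (auth, versioning, discovery) layers on top of this.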

Anti-Fragility – This is another concept pioneered by Netflix and popularised through their use of Chaos Monkey in addition to other anti-fragility tools. It’s ‘simply’ a practice wherein services are designed to be highly resilient (self-healing) to abnormal states or stressors (faults, load spikes, etc). In addition, application architects learn how to strengthen the composite service(s) through deliberately inflicting the aforementioned anomalies and then remediating and reinforcing through a constant learning cycle.

What's interesting here is that anti-fragility requires a level of creativity and strategically defensive architecture that is well outside the normal realms of traditional operations resiliency. DevOps Anti-fragility and the Borg Collective summarises this in a blog that borrows from Star Trek to articulate the beauty in this principle.
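A toy sketch of the idea, in the spirit of Chaos Monkey (this is not Netflix's tooling, just an illustration): wrap a service call so it randomly throws injected faults, then check that the caller's retry/degrade logic absorbs them. All function names here are invented:

```python
import random

# Illustrative fault injection: wrap any callable so it fails at random.
def flaky(call, failure_rate, rng):
    def wrapped(*args, **kwargs):
        if rng.random() < failure_rate:
            raise ConnectionError("injected fault")
        return call(*args, **kwargs)
    return wrapped

def get_price(item):
    return {"widget": 9.99}.get(item)

def resilient_get_price(item, service, retries=5):
    # The caller is designed to survive injected faults: retry, then degrade.
    for _ in range(retries):
        try:
            return service(item)
        except ConnectionError:
            continue  # real code would back off and/or fall back to a cache
    return None  # degraded answer, but the caller stays up

rng = random.Random(42)  # seeded so the demo is repeatable
chaotic = flaky(get_price, failure_rate=0.5, rng=rng)
print(resilient_get_price("widget", chaotic))
```

The anti-fragility loop is exactly this, writ large: deliberately inflict the fault in production-like conditions, watch what breaks, and reinforce the caller until the fault stops mattering.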

CNA Enablers – This is where our focus lies in operations. These are the mechanisms, platforms and practices that will enable us to standardise and control the deployment of highly-distributed (web-scale) applications.

Containers – Containers are an ideal construct for CNA, and CNA is arguably the best use case for containers. They provide a consistent runtime across multiple platforms, built with common toolsets and a common API, and they support extremely rapid, lightweight provisioning. From an operational perspective, building services with containers (or any other form of portable construct) also means we are not locked into a proprietary PaaS offering which can act as an anchor in the long term. I believe this is one of the key challenges to successful cloud adoption and a potentially serious lock-in issue that our developers are often not aware of.

VMware is focusing on bringing easily consumable, open source container technology to the enterprise through Photon (Machine & Controller) and vSphere Integrated Containers (VIC aka. Project Bonneville). You can be sure I’ll be covering this in great detail in future posts.

Self-Service Infrastructure – CNA's effectiveness is reliant upon having rapid, on-demand access to platform resources. Containers can provide this, but they are not yet commonplace in the enterprise, which is why public cloud tends to be the landing ground for these CNA-type apps. Not only do devs need instantaneous access to PaaS resources, but also the ability to rapidly deploy and configure IaaS resources in a structured and repeatable manner (i.e. infrastructure as code).

Immutable infrastructure – As an extension of self-service infrastructure, immutable infrastructure is simply the concept of deploying infrastructure once and never altering it after deployment. If a change needs to be made (or a fault rectified), simply destroy the instance and roll out a new one in its place. This is a fantastic way to ensure service integrity through version control. The impact of this is significant and well explained in An Introduction to Immutable Infrastructure.
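The replace-rather-than-patch idea can be sketched in a few lines of Python. This is pure illustration (no real cloud API; all names invented): instances are stamped out from a versioned image, and a rollout builds replacements instead of mutating anything in place:

```python
import itertools

# Illustrative sketch of immutable infrastructure: instances are launched
# from a versioned image and never modified after launch.
_ids = itertools.count(1)

def launch(image_version):
    """Stamp out a fresh instance from a versioned image."""
    return {"id": next(_ids), "image": image_version}

def roll_out(fleet, new_version):
    """Upgrade by replacement: build new instances, discard the old ones."""
    new_fleet = [launch(new_version) for _ in fleet]
    # (a real rollout would health-check the new instances before
    # destroying the old ones, e.g. a blue/green or rolling swap)
    return new_fleet

fleet = [launch("v1.0") for _ in range(3)]
fleet = roll_out(fleet, "v1.1")
print([i["image"] for i in fleet])
```

Notice that no instance dict is ever edited after creation; every change in the environment is a new, versioned artefact, which is what makes rollback and drift detection trivial.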

In summary, I think it's important to understand that this is still a relatively new mode of operations in the enterprise realm. Unless you are working for a business with software at its core, it is unlikely that CNA will be your entire focus for quite some time, however it is always good to be prepared.


References and credits: