vCloud Air Disaster Recovery (aka Disaster Recovery to Cloud, aka DR2C) is by far the easiest entry point and the most in-demand offering from vCloud Air. However, it also seems to be the most misunderstood service from a compute scoping perspective. Let’s break it down!
I won’t be covering DR2C in its entirety, as there are numerous solution briefs, videos, whiteboards and so on, most of which can be found in the links I featured here. My colleague @ also has a great blog covering everything from DR2C basics to advanced architecture.
DR2C is a subscription service built around VPC compute (covered in part 1 of this series), so we know that the same characteristics apply. What’s important to note is that workloads replicated from our on-premises environment by vSphere Replication consume storage in vCA, but do not reserve any compute. So what does this mean for the consumer?
First off, let’s cover the two scenarios which would require access to active compute in DR2C. Unlike a VPC Virtual Data Centre (VDC), DR2C is not intended to run active VMs outside of Testing or Recovery, defined as follows:
Testing: Launch protected VMs in an isolated VDC for a period of 7 days. VM data continues to be asynchronously replicated for the entire duration of the test.
Recovery (or planned migration): Launch protected VMs, accessible in a predefined production topology, for 30 days (and beyond if required). Replication from on-premises to vCA stops for as long as the recovery workloads are active. Note: reverse replication and failback are available for DR2C 2.0 customers (vR & vCenter 6.0 and above).
Scoping DR2C Compute…
Scoping compute for DR2C is a little different from other cloud services as we really need to analyse our procedures for testing and recovery. Let me explain…
DR2C compute is procured in the same blocks as VPC (10GHz/20GB) and scales in exactly the same way. In order to keep the cost of the service low (for the consumer) it is recommended that we only procure as much compute as we need to guarantee. I’ll come back to this in a second.
The above example shows a VDC capable of supporting 10GHz CPU / 20GB memory / 4TB standard storage. Conceptually, let’s say that we need 40GHz CPU / 80GB memory to fully stand up our replicated VMs for a test. This allocation only allows us to test (or recover) a subset of our total footprint within the core subscription.
This is where temporary add-ons come in. We can procure additional compute resources on a temporary basis (1 week for testing, 1 month for recovery) from My VMware to support any additional compute we will need for both scenarios. This gives us the flexibility to test individual application stacks within the confines of the core subscription, or temporarily extend to full capacity to test (or recover) everything at once.
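To make the arithmetic concrete, here is a minimal Python sketch of the sizing above. The 10GHz/20GB block size comes from this post; the function names are mine, and I am assuming that temporary add-ons are sized in the same blocks as the core subscription, so check the current My VMware options before relying on it.

```python
import math

# Block size quoted in this post: compute is procured in 10GHz / 20GB blocks.
BLOCK_GHZ = 10
BLOCK_GB = 20


def blocks_needed(cpu_ghz: float, mem_gb: float) -> int:
    """Smallest number of blocks that covers both the CPU and memory demand."""
    return max(math.ceil(cpu_ghz / BLOCK_GHZ), math.ceil(mem_gb / BLOCK_GB))


def addon_blocks(required_ghz: float, required_gb: float, core_blocks: int) -> int:
    """Temporary blocks needed on top of the core subscription.

    Assumption (not confirmed by the post): add-ons use the same block size.
    """
    return max(0, blocks_needed(required_ghz, required_gb) - core_blocks)


# The example from the text: 40GHz / 80GB needed, one core block subscribed.
full_test_blocks = blocks_needed(40, 80)
extra_blocks = addon_blocks(40, 80, core_blocks=1)
print(f"Blocks for a full test: {full_test_blocks}, temporary add-ons: {extra_blocks}")
```

With a single core block, a full test of the 40GHz/80GB footprint needs three temporary blocks, while a test that fits inside 10GHz/20GB needs none.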
How much to guarantee?
The quantity of compute we choose to guarantee is completely dependent on our risk profile and overall Recovery Time Objective (RTO). The more compute we guarantee, the more our monthly cost goes up, much like an insurance policy.
Procuring add-ons will add a small amount of time to our overall recovery, but in reality this is a small inconvenience considering the potential cost savings. A simple rule of thumb would be to protect critical workloads (Tier 1) with guaranteed compute to reduce downtime, but scale up with add-ons for everything else (Tier 2 & 3).
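The rule of thumb can be sketched in a few lines of Python. Everything in the inventory below is hypothetical (workload names, tiers and sizes are made up for illustration), and I am again assuming add-ons come in the same 10GHz/20GB blocks as the core subscription.

```python
import math

BLOCK_GHZ, BLOCK_GB = 10, 20  # block size quoted in this post

# Hypothetical inventory: (name, tier, cpu_ghz, mem_gb)
workloads = [
    ("erp-db", 1, 6.0, 16),
    ("erp-app", 1, 4.0, 8),
    ("intranet", 2, 8.0, 24),
    ("file-archive", 3, 12.0, 32),
]


def blocks_for(items):
    """Blocks needed to cover the summed CPU and memory of the given workloads."""
    cpu = sum(w[2] for w in items)
    mem = sum(w[3] for w in items)
    return max(math.ceil(cpu / BLOCK_GHZ), math.ceil(mem / BLOCK_GB))


def split_by_tier(items, guaranteed_tiers=(1,)):
    """Return (guaranteed_blocks, addon_blocks) for a full recovery.

    Guaranteed blocks cover the chosen tiers; add-ons cover the remainder.
    """
    guaranteed = blocks_for([w for w in items if w[1] in guaranteed_tiers])
    total = blocks_for(items)
    return guaranteed, max(0, total - guaranteed)


guaranteed, addons = split_by_tier(workloads)
print(f"Guarantee {guaranteed} block(s), add {addons} temporary block(s) for a full recovery")
```

Widening `guaranteed_tiers` moves blocks from the temporary column into the monthly subscription, which is exactly the cost versus RTO trade-off described above.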
It’s also important to remember that we can utilise active capacity in VPC, VPC OnDemand and DPC to augment our DR strategy. The VMware solution brief on Advanced Architecture for DR2C really helps to understand how different vCA services hang together to form a complete solution.
…and there you have it. Pretty simple once we know the limitations. That said, most of the complexity in DR strategy lies in the network. I’ll save that for the next post.