This post is going to deep-dive into vSphere’s vStorage API for Data Protection (VADP) Changed Block Tracking on restore – something not yet widely used (though when you see the results, you’ll likely agree that over time, everyone is going to need to leverage it).
The context? Today – Avamar 6.0 was released. It’s a huge release. HUGE. The core value proposition is amped up – save more, back up and restore faster, flexible restore (in place or redirected), with the industry’s best VMware integration when it comes to backup – and still no agent or client-based cost.
The high notes (for me) of the release:
- Data Domain Boost and Data Domain management integration. This is very important for the larger database workloads, or for VM image-level backups of very, very active and very large VMs. This makes Avamar and Data Domain an “and”, not an “or” – and customers can get the best of both worlds and save even more $ while accelerating backup/recovery.
- Bigger, badder Avamar hardware = denser systems and lower cost. This means 65% lower power consumption, but also 230% higher max density per rack.
- Not only was the hardware refreshed, but the “under the covers” architecture went from RHEL to SLES. Good for VMware alignment to be sure (EMC is as a general statement moving to SLES in a ton of areas), but also more importantly good for customers in that remote patching, updates, and upgrades of Avamar nodes are much, much improved. The same goes for push-based agent deployment (we strive for agentless particularly with VMware, but there are some app-based use cases where app integration demands it).
- Broader and deeper app support. This always has been an Avamar strong suit relative to the other advanced VMware-focused backup folks, but it got hopped up big time in this release.
- Exchange – native mail-level recovery in a single step, and 78% higher throughput.
- SharePoint – integrated granular recovery via Kroll Ontrack, and a nice new VSS plugin that gives you a full farm-level view.
- SQL Server – 46% higher throughput.
- Oracle – with RAC, you just need to point to a single node, and really broad config support
- Huge improvements in end-user self restore. This is a big deal in VDI deployments. The users can do self-recovery against a nice, simple portal – and can do it by “search” or by “browse”. One item of note – this function still requires the in-guest agent, VADP/CBT doesn’t work here – working on it (but remember, dedupe means it doesn’t add material storage costs on top of VM-image level backup, and there is no agent cost). This is what it looks like to the user:
But... the BIGGEST new thing (at least in my VMware-centered universe) is Changed Block Tracking based restore.
This is really material as the most common backup/recovery request from customers in the VMware use case goes something like this:
- “I want it to be agentless (at least for the majority of my VMs – I accept and desire agents for app-centric scenarios).”
- “I want it to save me a lot of $ because I know commonality is really high – so my dedupe expectations are through the roof.”
- “I want it to offload the vSphere infrastructure (network, ESX hosts, storage)”
- “I want it to leverage the fact that they are VMs. I want vCenter integration. I want VM image-level backups with no agents – without giving up catalogs, and simple single-step restore.”
- “I want it to be BETTER than the physical world – faster backup, and more importantly, faster recovery.”
There are certain “intrinsic” architectural things that have always made Avamar very popular against this list, namely variable-length source-based dedupe: its ability to reach 99%+ rates of dedupe and to offload the ESX hosts due to its source-based nature. Avamar was also one of the earlier large backup vendor products with deep vCenter integration.
But – the emergence of VADP in vSphere 4, its maturing as of Update 2, and its expansion in vSphere 4.1 have enabled us to take this to a new level.
In Avamar 5, use of CBT along with SCSI hot-add and the VDDK (all parts of VADP) let us improve backup times (which were already good) by a factor of 10, and made single-step file-level restore easy.
In Avamar 6, the addition of CBT being used during restores brings that same “order of magnitude” improvement to common restore scenarios.
To understand how and why – read on…
Note that a vendor using pretty well anything in the vStorage API for Data Protection set (CBT, SCSI hot-add, VDDK) lets them say “vStorage support” – not all are created equal (of course, it’s up to the customer to decide).
The core use of CBT is that the changed blocks in a VMDK can be identified immediately. Even if you DON’T do a source-based dedupe, this is a big improvement, because you don’t need an agent to identify what has been changed when doing an incremental backup. Below is the effect on the VM proxy if the proxy doesn’t use CBT (80% CPU utilization for a long time) vs. using CBT (a blip of utilization for a blip of time).
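To make the idea concrete, here is a minimal sketch of changed block tracking in plain Python. It is not Avamar code and it does not call vSphere (where changed extents are reported through the `QueryChangedDiskAreas` API call); `TrackedDisk`, `incremental_backup` and the tiny block size are hypothetical names and values chosen for illustration only.

```python
BLOCK_SIZE = 4  # bytes per block; tiny on purpose so the example is easy to follow


class TrackedDisk:
    """Hypothetical in-memory disk that records which blocks were written."""

    def __init__(self, num_blocks):
        self.data = bytearray(num_blocks * BLOCK_SIZE)
        self.changed = set()  # indices of blocks written since the last backup

    def write_block(self, index, payload):
        assert len(payload) == BLOCK_SIZE
        start = index * BLOCK_SIZE
        self.data[start:start + BLOCK_SIZE] = payload
        self.changed.add(index)

    def read_block(self, index):
        start = index * BLOCK_SIZE
        return bytes(self.data[start:start + BLOCK_SIZE])


def incremental_backup(disk):
    """Copy only the blocks flagged as changed, then reset the tracking set."""
    delta = {i: disk.read_block(i) for i in sorted(disk.changed)}
    disk.changed.clear()
    return delta


disk = TrackedDisk(num_blocks=1000)
disk.write_block(7, b"AAAA")
disk.write_block(42, b"BBBB")
delta = incremental_backup(disk)
blocks_read = len(delta)  # only the 2 written blocks are read, not all 1000
```

The point of the sketch is that the backup never scans the disk: the set of changed blocks is handed to it, which is why the proxy's CPU barely registers.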
IF you additionally do a source-based dedupe during backup and restore (which Avamar does), it means that those are the only blocks that need to be compared against both the local cache of deduped blocks on the proxy, and also against the target – which has all the blocks for a global dedupe effect (in this case the Avamar grid).
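The two-level lookup described above (proxy cache first, then the target) can be sketched roughly as follows. This is an illustration under assumptions, not Avamar's implementation: `DedupeTarget`, `backup_changed_blocks`, the use of SHA-256 and the sample data are all hypothetical.

```python
import hashlib


class DedupeTarget:
    """Hypothetical stand-in for the backup target holding every unique block."""

    def __init__(self):
        self.store = {}

    def has(self, digest):
        return digest in self.store

    def put(self, digest, block):
        self.store[digest] = block


def backup_changed_blocks(changed_blocks, local_cache, target):
    """Send a block only if neither the proxy cache nor the target knows its hash.

    Returns (recipe, bytes_sent); recipe maps block index -> digest.
    """
    recipe = {}
    bytes_sent = 0
    for index, block in changed_blocks.items():
        digest = hashlib.sha256(block).hexdigest()
        recipe[index] = digest
        if digest in local_cache:
            continue  # cheap local hit: nothing to ask the target
        if not target.has(digest):
            target.put(digest, block)
            bytes_sent += len(block)
        local_cache.add(digest)
    return recipe, bytes_sent


target = DedupeTarget()
cache = set()
# Two VMs whose changed blocks are mostly identical (say, the same OS patch).
vm1 = {0: b"patch-A", 1: b"patch-B", 2: b"vm1-log"}
vm2 = {5: b"patch-A", 6: b"patch-B", 7: b"vm2-log"}
_, sent1 = backup_changed_blocks(vm1, cache, target)
_, sent2 = backup_changed_blocks(vm2, cache, target)
```

In this toy run the first VM sends all three of its changed blocks, while the second sends only the one block that is unique to it.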
Here’s the net of CBT use during backup – 50TB of data, 1000 VMs, fully backed up (functionally a “full” backup) in 43 minutes (after just 5 days of backup).
Notice that the time to do the backup is actually lower after 5 days at 1000 VMs with 50TB of data than it is on Day 1 with 250 VMs and 12.5TB of data.
What’s new is that this technique is now leveraged during restore as well as backup. AFAIK, this is (as of today – April 18th, 2011) unique to Avamar. While the VMware-centric Veeam, vRanger and the other mainstream folks like Commvault and Symantec use CBT for backup, I don’t believe any use it during restore (at least yet).
The widely used (including in Avamar 5) VM-image-level restore process is to restore the VMDK all at once. Their (and Avamar’s – including Avamar 6) file-level restore is to SCSI hot-add, browse the filesystem for the file and restore using the VDDK (corrections welcome from the community if I’m wrong).
Avamar does some neato stuff here for VM-image level restore…
Since Avamar always works on changed blocks (even absent CBT) and then applies a variable-length block-level dedupe approach, it’s always needed to be able to back up PARTS of files, and be able to restore PARTS of files. This has a huge effect on image-level backup for images with LOTS of files (but, as tends to be the case, not much changing from one backup to another) – there’s no need to scan and index the filesystem (like Symantec, for example).
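For readers who haven't seen variable-length dedupe before, here is a toy sketch of content-defined chunking, one common way to get variable-length blocks. The post does not describe Avamar's actual algorithm, so treat everything here (the `chunk` function, the `GEAR` table, the `MASK` value) as an assumption made purely for illustration.

```python
import random

MASK = 0x3F  # cut when the low 6 hash bits are zero: roughly 64-byte chunks

_rng = random.Random(0)
GEAR = [_rng.getrandbits(32) for _ in range(256)]  # fixed per-byte random values


def chunk(data):
    """Split data at content-defined boundaries using a rolling 'gear' hash."""
    chunks, start, h = [], 0, 0
    for i, byte in enumerate(data):
        # Old bytes shift out of the 32-bit hash, so a cut depends only on
        # the most recent bytes, not on the absolute offset in the data.
        h = ((h << 1) + GEAR[byte]) & 0xFFFFFFFF
        if h & MASK == 0:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])
    return chunks


original = bytes(_rng.getrandbits(8) for _ in range(4096))
edited = b"X" + original  # one byte inserted at the very front

chunks_a = chunk(original)
chunks_b = chunk(edited)
known = set(chunks_a)
reused = sum(1 for c in chunks_b if c in known)  # chunks already in the store
```

Because the cut points follow the content, a one-byte insertion only disturbs the chunk or two around it and the rest are found again in the store. With fixed-size blocks the same insertion would shift every block boundary after it.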
Now, to get an idea of how CBT applies on restore – it’s the analog of how CBT accelerates backup. Rather than needing to restore the entire VMDK to recover what is in effect a relatively small proportion of changed data, we can restore only the changed blocks. The animated picture below tells the story. On CBT backup, through the use of the vSphere 4 changed block tracking feature, only the changed blocks are sent to the virtual proxy. These are source-based deduped using Avamar (including Avamar 5), so what gets copied over the LAN/WAN is as close to zero as possible. This enables greater consolidation and much faster backup (often VMs that are 100GB+ in size back up in seconds), and that combo of vSphere and Avamar makes WAN-based backup much, much easier.
The second bit is a “traditional” vStorage API-based (so this approach can claim “VADP compliant”) VM image level restore. Regardless of how much has changed, the full VM is copied.
The last bit is the new CBT-based restore.
As you can see, unlike the normal image-level restore where reverting even a little bit of data requires moving a lot of data in the full VMDK, with CBT-based restore, only the changed blocks are copied back to the VMDK, reverting state.
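The difference between the two restore styles can be sketched in a few lines of Python. This is a simplified model, not the Avamar or VADP implementation: `full_restore`, `cbt_restore` and the list-of-blocks disk are hypothetical, and the set of changed blocks is simply assumed to be known.

```python
def full_restore(backup_image):
    """Traditional image-level restore: copy every block back."""
    return list(backup_image), len(backup_image)


def cbt_restore(current_disk, backup_image, changed_since_backup):
    """Revert only the blocks reported as changed since the backup was taken."""
    restored = list(current_disk)
    for index in changed_since_backup:
        restored[index] = backup_image[index]
    return restored, len(changed_since_backup)


backup_image = [f"block-{i}".encode() for i in range(1000)]
current_disk = list(backup_image)
changed = {3, 500, 999}
for i in changed:
    current_disk[i] = b"corrupted"  # the damage we want to roll back

full_disk, full_copied = full_restore(backup_image)
cbt_disk, cbt_copied = cbt_restore(current_disk, backup_image, changed)
```

Both paths end with a disk identical to the backup image, but the first copies 1000 blocks and the second copies 3.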
The net: CBT-based restore meant that we could restore a 100GB VM during an image-level restore in 50 seconds instead of 20 minutes (with 300MB of data changed) – and do it without an agent.
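A quick back-of-the-envelope check on those numbers, using only the figures quoted above (and assuming 1GB = 1024MB):

```python
full_restore_s = 20 * 60   # 20 minutes for the traditional image-level restore
cbt_restore_s = 50         # 50 seconds for the CBT-based restore
vm_size_mb = 100 * 1024    # 100GB VM
changed_mb = 300           # data changed since the backup

speedup = full_restore_s / cbt_restore_s          # 24.0
data_ratio = vm_size_mb / changed_mb              # about 341
changed_pct = 100 * changed_mb / vm_size_mb       # about 0.29
```

So the restore is 24x faster while moving roughly 341x less data (about 0.3% of the VM). The time saved is smaller than the data saved, which suggests the 50 seconds is not purely data movement, though the breakdown isn't something the numbers here can tell us.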
As always – there are some caveats, and based on how VADP works, one topic is guest OS support. CBT backup works for lots of Guest OSes: Windows, Linux, Solaris, but restore support varies (based on what can/can’t be done via the APIs), so remember the following:
- If you use GUEST level backup (remember there are reasons to do this if it’s an app-level use case, or you need something specific, and also remember that Avamar has no backup agent license cost), then file-level restore is available for Windows, Linux, and Solaris.
- If you use IMAGE level backup, then file-level restore is available only for Windows.
- CBT Restore for VMDK restore is available for Windows guests.
When you put that together with Avamar’s great vCenter integration that lets you see and manage backup state of VMs (as well as enabling CBT en masse) – Avamar continues its long tradition of leading the major backup players when it comes to VMware (and also supporting all sorts of non-VMware use cases).
This demonstration covers a lot of the new coolness (note that file-level restore is completely supported, and can be done with the VM powered on, and is very flexible – in place restore, or redirected):
You can download it in MOV format here.
In the voices of Avamar customers (which is always better than anything I can say):
BTW – for Avamar customers wondering about a couple of other things, here’s some of the “behind the scenes” + beyond the stuff announced today:
1) There’s been a lot of work between VMware and EMC on backup/recovery of vCloud Director (vCD) at scale using Avamar. Some great joint solutions work has been done to enable tenant-level backup and recovery, integrated across the parts of a vCD solution (thank you to several vCD customers who assisted in the proof of concept work). Add to that that Avamar itself has always been multi-tenant and offers end-user self-restore capabilities, and it’s a great solution for a service provider. More will be published on that shortly.
2) Avamar provides killer Vblock integration – not only backing up the VM data, but also able to back up and recover all the UIM state, as well as the UCSM info – enabling you to not only restore the “data on a Vblock”, but “restore a Vblock”.
Feedback and courteous comments welcome!