
White Paper

Expert Guide on Backing Up Windows Server in Hyper-V
by John Savill, Microsoft MVP

John Savill, Microsoft MVP

John Savill is a Windows technical specialist, an 11-time MVP, an MCITP: Enterprise Administrator for Windows Server 2008, and is ITIL certified. He is the author of the popular FAQ for Windows and a senior contributing editor to Windows IT Pro. John is the author of The Complete Guide to Windows Server 2008 (Addison-Wesley), and he’s currently writing his latest book, Microsoft Virtualization Secrets (Wiley).

Table of Contents

Virtualization Benefits and Dangers
The Need For Virtual Environment Protection
Intelligent Guest Level Backup Advantages
Backup Storage Considerations
Restoration Processes
Protecting Your Protection

Synopsis

Microsoft’s Hyper-V platform is quickly gaining market share over its competitors. In this paper we will examine what you need to know to protect your virtualized workloads successfully.

New infrastructural landscape

Look at the IT infrastructure of most organizations today, and at the initiatives at the top of their priority lists, and you will see an approach to virtualization and management that is completely different from the one in use just two years ago. Operating system environments are now created in minutes instead of many weeks. We see a new IT infrastructural landscape with the potential to fully leverage our IT assets and to provide amazing flexibility, availability and portability for our services, all contained in a smaller datacenter footprint. The enabler for this complete rethinking of our datacenters can be summed up in one word: virtualization.

Virtualization has driven a complete paradigm shift for every IT infrastructure. Some of the major improvements include:

1. Multiple operating systems run on a single physical host. This saves money on hardware, on licensing, on power, on datacenter space and more.

2. Hardware presented to the guest operating systems is virtualized and abstracted from the real physical hardware. This means virtual machines can easily be moved between completely different physical hosts. This flexibility is great for day-to-day running and even better for disaster recovery scenarios.

3. A higher quality of service is attainable. Operating system instances can be provisioned in real time as needed, allowing business units and end users to self-serve through portals provided by solutions such as System Center Virtual Machine Manager.

4. Increased standardization simplifies management and adherence to regulatory and compliance requirements. Templates simplify provisioning and maintaining operating system instances.

But with all its advantages, a challenge to data safety

For all of its advantages, virtualization introduces new challenges for IT departments, and it can highlight existing gaps in your data infrastructure that will need closing. One of the most significant gaps is likely to be how well you’re protecting your newly virtualized data and systems. Failing to protect them exposes organizations to huge potential data loss, financial loss, longer recovery times and even criminal prosecution if regulatory requirements are not met. Is the virtual machine you created actually being backed up and replicated? Is its data being archived in a manner that will meet your organization’s needs? If you don’t know, your business faces a huge risk.

To begin with, virtualization will likely increase the number of operating system instances you need to protect, especially if you add instances to achieve greater resiliency. Adding more instances also means more management overhead unless you automate the configuration process. Finally, you need to protect the virtualization hosts themselves.

Do I even need to back up?

Nearly every service has some kind of replication built into it today; maybe it’s database mirroring or log shipping, mail store replication, or multi-master replication of a directory service. Many systems have trashcans where deleted objects go before they are permanently removed. With this in mind, one could be forgiven for assuming that replication alone is plenty good enough.

It is not, and by a wide margin. While replication can deliver high availability and trashcans can help with a quick restore of objects, neither will save you if you have corrupted data and neither can rebuild a system. Backups are more important than ever to achieve complete protection.

Can’t I just back up the Hyper-V box?

This may seem like a reasonable approach, requiring the least amount of work. With Hyper-V we have a management partition which is used to manage the virtual machines, perform configuration of the virtual environment and enable communication with certain types of resources, like storage and network, using standard Windows drivers.

We can log in to this Windows Server management partition locally, remotely through protocols like RDP, or just manage it remotely through our management tools such as Server Manager. Since this management partition is running Windows Server, why not just perform the backup in the management partition and tell it to back up all the virtual machines?

A note on Cluster Shared Volumes

One quick consideration when you’re backing up Hyper-V machines is Windows Server 2008 R2 Cluster Shared Volumes (CSV). CSV is used for shared storage in a cluster and enables all nodes in the cluster to read and write to the same NTFS LUN simultaneously. There are a few special considerations when backing up virtual machines on a CSV-enabled volume, so be sure to check whether your backup software supports CSV. If it doesn’t, as CSV is still quite new, make sure the vendor plans to add support, or remove the product from consideration.

What happens when we back up?

Every modern Windows backup application today leverages the Volume Shadow Copy Service (VSS). This architecture allows application developers to ensure their applications and associated data are correctly backed up and can be restored. This is achieved by application vendors providing VSS writers, which perform the steps needed to ensure that (1) application data on disk is in a backup-ready state and (2) writes to the data are suspended during the backup. With this, we’ll be able to restore the application if needed, and there is no risk of inconsistent data rendering the restored environment unusable.

Most major applications provide VSS writers, in addition to those provided as part of the Windows operating system and key Microsoft operating system roles like Hyper-V.

1. To leverage these VSS writers, a backup application acts as a VSS requestor and asks the Volume Shadow Copy Service, which coordinates all the actions needed for the backup, to create a shadow copy of the requested volumes. A shadow copy is a point-in-time snapshot view of the volume which, once created, can then be backed up to another disk or tape.

2. The Volume Shadow Copy Service enumerates all the registered VSS writers on the system and then initiates a commit action that triggers the snapshot.

3. The VSS writers are notified of the commit, and each writer performs actions like flushing transactions to disk and quiescing changes to ensure its data on disk is in a backup-ready state.

4. The Volume Shadow Copy Service then tells a VSS provider, which can be software-based or hardware-based, to actually create the shadow copy of the data, which is currently frozen. The VSS provider is responsible for ensuring the shadow copy is maintained until it is deleted, which is typically after the backup software has finished copying the shadow copy to another location.

5. Once the shadow copy is taken (which can take a maximum of 10 seconds), the Volume Shadow Copy Service thaws the system, allowing the VSS writers to unfreeze writes to the data and resume normal operations.

6. The Volume Shadow Copy Service checks with each VSS writer to ensure that all writes were held during the shadow copy creation. If they were not held, the shadow copy is deemed inconsistent and deleted. If shadow copy creation is successful, its location is given to the VSS requestor, i.e. the backup application, which can now do whatever it wants with it.

This whole process can create a shadow copy in seconds, which can then be backed up to various media over a far longer period of time without affecting production availability and performance.

With Hyper-V, a VSS backup is actually extended beyond where we take the backup. Hyper-V has integration services installed on the guest virtual machines which allow rich communication between the Hyper-V management partition and the guest operating systems, resulting in a smooth mouse/keyboard experience, heartbeat, time synchronization, shutdown execution and snapshot integration.

That’s right, we can perform a VSS backup on the Hyper-V host. The Hyper-V host will notify each virtual machine that has the integration services installed that a VSS shadow copy is being taken. The VSS writers inside the guest operating systems will be called to ensure that data within the guest operating systems is in a backup-ready state and that writes are paused while the backup is taken. A backup of the virtual machines taken at the Hyper-V host level is integrity-assured, provided the integration services are installed in the guest and the guest operating system supports VSS.

This means it may be entirely possible to back up at the Hyper-V host level, because we have the VSS notifications to the guests. But ultimately we don’t care about the backup, we care about the restore, and that is where backing up at the Hyper-V host level may not be enough. When I back up the VM at the host, I know that I can restore that virtual machine and it will be functional, but that is what I’m restoring: the entire virtual machine.

You may be able to go a little further. When looking at your Hyper-V backup solution, it’s good to choose one that allows item-level restore from your Virtual Hard Disk (VHD) backups. Item-level restore means I’ve backed up the entire VHD, but when I perform a restore I can look inside the VHD and, rather than restore the entire VHD, restore only selected files from the contained file system. This has become a lot simpler with Windows Server 2008 R2, since mounting a VHD is now part of the operating system. The ability to perform item-level restore from a VHD would be very useful if you were backing up a file server, for example, and just wanted to restore a single file.

Now imagine the virtual machine was running SQL Server, SharePoint, Exchange or any workload that allows granular levels of restoration. Opening up the VHD and trying to perform an item-level restore will not work for SQL data or Exchange; the restore would not understand the data. Instead we need backup agents that understand the data, so it can be restored in the ways that make the most sense for the data and the service being protected.

Great examples are restoring a table from a database, restoring a mailbox or even a single mail item from a mail database, and restoring a document from a SharePoint library. None of these restore types would be possible if we only performed the backup at the Hyper-V host level; we need backup agents within the guest operating systems for these types of workloads to enable the greatest functionality and finest granularity of restore.
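
The freeze, snapshot and thaw sequence described in the numbered steps earlier in this section can be sketched in a few lines of Python. This is only an illustrative model under stated assumptions: the Writer class, the create_shadow_copy function and the holds_writes flag are hypothetical names invented for this sketch, and nothing here calls the real Windows VSS API.

```python
from dataclasses import dataclass, field


@dataclass
class Writer:
    """Hypothetical stand-in for a VSS writer; not the real Windows VSS API."""
    name: str
    holds_writes: bool = True  # whether this writer kept writes held while frozen
    log: list = field(default_factory=list)

    def freeze(self):
        # Flush pending transactions and quiesce changes (step 3).
        self.log.append("freeze")

    def thaw(self):
        # Resume normal write operations (step 5).
        self.log.append("thaw")


def create_shadow_copy(writers, volume):
    """Model the coordinating service: return a snapshot name, or None if inconsistent."""
    for writer in writers:
        writer.freeze()
    snapshot = f"shadow-copy-of-{volume}"  # the provider creates the copy (step 4)
    for writer in writers:
        writer.thaw()
    # Step 6: the copy is kept only if every writer held its writes.
    if all(writer.holds_writes for writer in writers):
        return snapshot
    return None
```

A requestor in this model would call create_shadow_copy([Writer("sql"), Writer("hyper-v")], "D:") and then copy the returned snapshot to disk or tape at its own pace; a None result corresponds to the inconsistent shadow copy that the service deletes.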

The need for agents in the guest and the power that brings

While we have seen that backups can be taken at the Hyper-V host level, we have also seen that we lose granularity in our ability to restore. Restoring is ultimately why we perform backups, so that when things go wrong (and they will) we don’t lose state and information.

This does not mean we need to install backup agents in all our guests. If we have guests running fairly basic workloads and we only need to restore the entire VM or (on rare occasions) a file, performing the backup at the Hyper-V host level is fine. By doing this, you give up the ability to do granular restores, full backups of I/O-intensive applications, and the ability to restore individual files without restoring an entire virtual machine.

Installing a backup agent inside the guest operating system gives the backup software insight into exactly what is running inside the virtual machine. This gives the backup administrator full flexibility over what needs to be protected, how to protect it and, if a disaster happens, how to restore only what is really needed rather than the entire virtual machine.

“What to protect” and “how to protect it” sound interesting, but what do those two phrases from the previous paragraph mean? Don’t we just create a VSS shadow copy and then back it up somewhere?

With an agent running inside the virtual machine, we can be very particular about what we want to back up rather than backing up everything. We may decide not to back up the operating system volumes, or we may want to back up only particular databases on a SQL server instead of the entire data disk.

Additionally our backup solution, through its knowledge of our services, may be able to go the extra mile. In addition to regularly backing up the data, it may also be able to harvest transaction logs at a more frequent interval, for services that use transaction logs such as databases and mail systems, to minimize the data lost. How often we want to back up the data may vary for each protected workload, and agents inside guests give us that flexibility.

This is an important consideration when picking your backup solution: what types of data does it support backing up, and what are its capabilities for restoration? The latter is often the most important aspect. If a user just deleted a mail message, I want to restore just that one mail item, not the entire 8GB mailbox. Is that possible?

Don’t jump to your backup solution for all types of restoration. Many applications have their own “trashcan” where deleted items sit for a period of time before they are removed. Items can be restored from this application trashcan very quickly, so always look at the capabilities of your systems before instantly reaching for your backup/recovery solution. Where possible use a hybrid approach, as these trashcan capabilities are no replacement for a backup solution that can give full protection from corruption and disasters.

Whenever we want the most flexibility, the most control and the most restore granularity, we should be thinking “agent inside the guest”.

Where do we back it up to and what are we really backing up each time?

I remember my first job when I left school 18 years ago. I was a VAX/VMS systems administrator and one of my tasks was performing the nightly backup.

Backing up a system was a multi-step process. I had to remember to put the square-shaped tape in the machine before I left the office and, when I got in the next morning, switch it for the second tape because the backup was larger than a single tape. Then, 2 hours later, I’d take both tapes, put them in an envelope and leave them at reception for the offsite backup service to collect for secure storage.

In the event we needed to perform a restoration, I would have to contact the archive company, get the tape back (which might not be until the next day), then hope the backup would actually work (which it often didn’t, due to “solar flares”).

Times have changed

The cost of disk drives has come down remarkably in recent years while their capacity has greatly increased, which makes disks a viable alternative to tape as the storage medium for our backups. Using disks as the backup target makes the backup faster than tape, keeps the data easily available for restorations, and removes many of the complexities commonly associated with tape drives and tape media.

Using disks for our backup storage also allows us to store more than one backup of a data set. We can have many backups at different points in time, which is great when we come to restoring: we can choose which point in time to restore from rather than just the last backup. If the last backup were corrupted, the additional restore points would let you step back in time increments to find the most recent corruption-free backup.

This brings us nicely to another advantage of disk as the backup medium. With traditional backup methods on tape, it was common to perform a full backup at the start of the week which contained all the data, then each day perform an incremental backup which only contained the files that had changed that day. The incremental backup was much faster to perform and required less storage than a full backup. The drawback came at restore time: if you had a loss on Friday, you would have to restore Monday’s tape, then Tuesday’s, then Wednesday’s and then Thursday’s, a huge waste of time and a huge headache.

Backing up only what has changed is still very important with today’s backup architectures; in fact more so, since the backup data is now sent over the network to the backup server, which connects to the storage. We actually go one better than an incremental backup, which backed up entire files that changed that day. That would not work well today when you consider that files can be gigabytes in size but may have had only a few kilobytes of change.

Look for a backup solution that uses block-level backups, which reduce the amount of disk space you need to store the backups and, more importantly, cut down on the network bandwidth used during the backup.

The way these block-level backups work does vary, but is essentially the following:

1. The first time a new backup source is protected, a full backup of all the data is performed and stored on the backup server. This is the only time the entire data set is copied over the network.

2. For subsequent backups, only the blocks on disk that relate to the protected data and have changed are copied to the backup server, meaning the data copied over the network directly relates to the amount of change. These backups should occur at whatever interval meets your targets for the maximum acceptable amount of data loss.

3. If your organization has a Recovery Point Objective (RPO) of 4 hours, we need to make sure we perform these backups at least every 4 hours. This frequency also determines what is available when you come to perform a restore: if you back up every 4 hours, you will be able to select the restore point in 4-hour increments.

4. Many backup solutions give you the option of merging point-in-time views that are older than x days or y weeks in order to clean up your views, as it is unlikely that you would need 4-hour granularity for a restore from 3 weeks ago.

5. The backup server looks at the blocks in the current view of the data that are being replaced, moves them to a previous point-in-time “slice” so they are maintained for performing restorations at that point in time, then writes the latest blocks so a current view is available.

6. Organizations may still want tape or Blu-ray disc copies for offsite storage or very long-term archiving, so support for tape may be a factor in your backup selection. Often, solutions will allow a point-in-time view of the data to be exported to tape.
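
To make the block-level approach in the list above more concrete, here is a minimal Python sketch of a backup store that copies only changed blocks and keeps replaced blocks in point-in-time slices. It is an illustration under assumptions, not how any particular product works: BackupStore, split_blocks and the 4-byte BLOCK_SIZE are hypothetical, the store compares whole blocks in memory rather than tracking changes at the source, and the protected data is assumed never to shrink between backups.

```python
BLOCK_SIZE = 4  # unrealistically small, chosen so the example is easy to read


def split_blocks(data: bytes) -> list:
    """Cut the protected data into fixed-size blocks."""
    return [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]


class BackupStore:
    """Hypothetical backup server that keeps a current view plus older slices."""

    def __init__(self):
        self.current = {}  # block index -> latest bytes for that block
        self.slices = []   # slices[k] holds the blocks replaced by backup k + 1
        self.backups = 0   # number of backups taken so far

    def backup(self, data: bytes) -> int:
        """Store only the blocks that changed; return how many were copied."""
        replaced = {}
        for index, block in enumerate(split_blocks(data)):
            old = self.current.get(index)
            if old != block:
                replaced[index] = old        # None: the block did not exist yet
                self.current[index] = block  # write the latest block
        if self.backups > 0:
            self.slices.append(replaced)     # keep old blocks for earlier restore points
        self.backups += 1
        return len(replaced)

    def restore(self, point: int) -> bytes:
        """Rebuild the data as it looked at restore point `point` (0 = first backup)."""
        view = dict(self.current)
        for old_blocks in reversed(self.slices[point:]):
            for index, old in old_blocks.items():
                if old is None:
                    view.pop(index, None)
                else:
                    view[index] = old
        return b"".join(view[i] for i in sorted(view))
```

In this model the first call to backup copies every block, later calls copy only what changed, and restore(0) still returns the original data because the replaced blocks were moved into a slice rather than discarded. Merging old restore points, as in item 4, would amount to combining adjacent entries of slices.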

For offsite backup needs such as disaster recovery, it’s very common to have an instance of the backup solution at the disaster recovery location, with its own disks, that protects the primary backup instance and data at the main datacenter location. The DR backup instance may protect all data from the primary backup instance or only the really important data and servers. In the event of a disaster, restoration at the backup site is then just as fast as at the primary site.

The final piece, restoration!

We’ve already discussed the granularity of restoration that is possible with a top-grade backup and recovery solution: being able to restore from multiple points in time, and to restore only the data we want instead of entire containers of data. There are other restoration considerations.

Disasters and total server losses bring us to another important capability: performing bare metal recoveries of our servers. If a server has to be completely restored from the backup or has to be replaced, we need to be able to restore the latest complete backup of the server, then apply the latest protected data from that server. We may want to perform this restoration by booting the physical box over the network using PXE boot, or by booting from a CD or USB key to initiate the restore. Consider what works best for your organization and make sure you choose a solution that supports it.

A complete server loss, or even worse a site loss, brings up another interesting capability some backup solutions offer. The ability to run in a “virtual standby” mode can be very useful when an OS instance hosted on a physical server (not a virtual machine) fails but you don’t have physical hardware available. Some solutions allow you to restore the backup of a physical server to a virtual machine and automatically take care of hardware differences, allowing your server to be back up and running even though you don’t have physical boxes available. This can be critical when you have aggressive Recovery Time Objectives (RTO) to meet.

Remember when I recounted my early days as an IT admin, when after getting back my backup tapes I hoped the restore would actually work? This is not acceptable.

The scenario of a failed restore can be avoided with 2 actions:

1. Have a solid restore process in place and test it routinely. With any change in hardware, software or personnel, run the restore test again to ensure it is still fully functional.

2. Make sure your backups’ integrity is assured. The best way to do so is for the backup solution to perform regular integrity checks when the backup data is received, either using integrity checking capabilities provided by the application, such as Exchange Eseutil, or, for file system protection, through actions like Chkdsk. By performing these integrity checks on the backup system, we are assured our protected data will be usable and we are not burdening the source systems with the workload of performing the integrity validation. We get the best of both worlds.
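
As a rough illustration of checking integrity on the backup system rather than on the source, the Python sketch below records a checksum when backup data is received and re-verifies it later. Note the substitution: the paper names application-level tools such as Eseutil and Chkdsk, whereas this sketch swaps in a generic SHA-256 checksum, which only detects that stored bytes have changed and does not validate application consistency. The catalog layout and the function names are hypothetical.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def receive_backup(catalog: dict, name: str, data: bytes) -> None:
    """Store incoming backup data together with a checksum computed on arrival."""
    catalog[name] = {"data": data, "sha256": checksum(data)}


def verify_backups(catalog: dict) -> list:
    """Return the names of stored backups whose data no longer matches its checksum."""
    return [
        name
        for name, entry in catalog.items()
        if checksum(entry["data"]) != entry["sha256"]
    ]
```

Running verify_backups on a schedule on the backup server keeps the verification workload off the protected systems, which is the design point made above; an empty list means every stored backup still matches what was received.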

Who watches the watchmen?

We know how important our data and systems are, and we have invested great resources and effort to get a comprehensive and functional backup solution in place. But what is protecting that backup system from corruption or loss?

The use of a second backup solution at the DR site protecting the primary backup solution at the datacenter has already been discussed. However, using the DR backup solution for normal day-to-day restorations in the main datacenter would not be desirable, as all the data would have to flow over the WAN between sites. Instead, understand the dependencies of your backup solution. If the backup solution uses SQL Server to store its configuration and metadata, and that SQL server failed, your backup solution would be useless. To mitigate this, use a SQL instance that is part of a highly available SQL cluster. Whatever the dependencies are, try to mitigate any single point of failure.

Final thoughts

Virtualization is a fantastic revolution in the way we look at IT; it enables completely new ways to manage, provision and look at our infrastructure. Virtualization does not simplify our backup approach, and as we’ve seen in this paper there are actually more considerations in protecting our virtual environments and ensuring no loss of capability or granularity. But by investigating the backup solutions and choosing the right one, we can make backup and, more importantly, restore an intuitive and functional part of our infrastructure.

About AppAssure Software

AppAssure is the #1 unified backup & replication software for virtual, physical and cloud environments. This multiple award-winning and customer-proven software recovers virtual and physical servers, applications and data in minutes instead of hours or days. AppAssure’s innovative and groundbreaking technologies assure 100% reliability of recovery and go beyond just protecting data to protecting entire applications. It also supports multi-hypervisor environments including VMware vSphere / ESXi, Microsoft Hyper-V and Citrix XenServer. AppAssure is an Elite VMware Technology Alliance Partner and Microsoft Gold Certified Partner. With more than 6,000 customers, partners and service providers in over 50 countries and over 3,000% growth in three years, AppAssure is the world’s fastest growing backup software company as ranked by Inc. Magazine.

AppAssure’s 3 Innovative and Groundbreaking Backup Technologies:

1. Live Recovery: Instant restore of VMs or servers, with near-zero recovery time (RTO) & 5-minute RPO
2. Recovery Assure: Assurance of 100% reliability of recoverability
3. Universal Recovery: Anywhere to anywhere restore, to any VM or dissimilar hardware, with granular object level recovery

5 Reasons to Try AppAssure

1. Ultra-Fast Backup & Recovery: near-zero Recovery Time & 5-minute RPO
2. Recovery Auto-Testing and Auto-Verification: 100% Recoverability
3. Unified Backup & Replication from One Single Pane of Glass
4. Recovery Anywhere to Anywhere (P2V, V2V, V2P, P2P)
5. True Global Deduplication

Free trial: www.appassure.com/Free-Trial

AppAssure Software, Inc.
1925 Isaac Newton Square East, Suite 440, Reston, VA 20190
Americas: 1-866-459-6653
EMEA: 44-1306-888864
Blog: www.appassure.com/blog

© 2012 AppAssure Software. All Rights Reserved.

THIS WHITE PAPER IS FOR INFORMATIONAL PURPOSES ONLY, AND MAY CONTAIN TYPOGRAPHICAL ERRORS AND TECHNICAL INACCURACIES. THE CONTENT IS PROVIDED AS IS, WITHOUT EXPRESS OR IMPLIED WARRANTIES OF ANY KIND.
