We run daily backups, 14-day retention, with an active full every 2 weeks to keep the retention pool consistent; we run no compression or dedupe and let the Data Domain handle it. Total VM backup count is just over 1008, 250 TB total space. The Veeam Backup & Replication server has 50 GB RAM and 16 processors, with SQL housed on a separate SQL 2012 server. 10 proxy servers each run 16 GB RAM and 1 socket with 16 cores. An EMC DD860 with two 10Gig uplinks is used for backup data only. VMware is 5.5, with 30 hosts and 40 datastores, clustered. I am interested to hear from companies that have around 2000 or more VMs in VMware and how they have their environments set up.
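For a rough sense of what a schedule like that implies for raw repository capacity, here is a minimal sketch in Python. Every input is an assumed, illustrative value, not a measurement from the environment above, and it ignores compression and dedupe entirely:

```python
# Rough repository sizing for forward incremental with periodic active fulls.
# All inputs are illustrative assumptions, not figures from the environment
# described above.

source_tb = 250.0          # protected source data (TB), assumed
daily_change_rate = 0.05   # 5% daily change, assumed
retention_days = 14        # restore points to keep
full_interval_days = 14    # active full every 2 weeks

# With 14-day retention and a full every 14 days, two full chains must
# coexist on disk while the oldest chain ages out.
fulls_on_disk = retention_days // full_interval_days + 1
incrementals_on_disk = retention_days  # roughly one per day

capacity_tb = (fulls_on_disk * source_tb
               + incrementals_on_disk * source_tb * daily_change_rate)

print(f"Fulls on disk:        {fulls_on_disk}")
print(f"Incrementals on disk: {incrementals_on_disk}")
print(f"Estimated capacity:   {capacity_tb:.0f} TB (before dedupe/compression)")
```

With these made-up inputs the estimate works out to two full chains plus two weeks of incrementals, somewhere between 2x and 3x the source size, which is the space cost the reply below weighs against reverse incremental.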
I'd say I don't typically see NFS targets, but they can be used via a Linux repository. I actually like that approach, but most clients are more comfortable with Windows, so it is more common to see CIFS, or NTFS volumes attached to a Windows server. It really boils down to what the customer wants to do and what infrastructure they have in place. If the client is using a dedupe appliance (probably the majority), then it's either CIFS directly from multiple proxies, or NFS via a Linux repository. If a client is simply going to disk (probably the next biggest group), then SAN-attached disk, or locally attached disk on a system used as a repository, is the next best method.

From a performance/cost perspective, I personally prefer using dedicated physical servers as proxies with locally attached disks for repositories. This provides a self-contained device with a single maintenance contract and a fixed performance ceiling that can be easily defined. When you need more storage you add another repository, so you get more processing horsepower as well. Potentially these can be proxies too, so you effectively get SAN offload and can scale forever simply by adding dedicated proxy nodes. The disadvantage of this approach is of course that it requires manual balancing of jobs across the available storage/proxies, somewhat negating the smart load balancing built into the V6 product. That being said, from a scale-out performance perspective it is a hard architecture to beat, since it guarantees that backup traffic does not cross the network (direct SAN to local disk).

Many large clients have instead attempted to build single massive repositories using SAN-attached disk. This simplifies job management, since there is simply one massive pool of target storage, but it has significant performance side effects: because all of the I/O is targeted at a single large pool of disks, it only takes a few reverse incremental jobs to build long request queues and cause tremendous I/O latency, significantly degrading backup performance. The lure of this "single repository" is strong, but it is much more difficult to build at scale with reasonable random I/O performance, especially because clients generally attempt to use low-end SAN hardware to do so (small caches that are easy to saturate).

Certainly forward incremental will reduce the workload, but the cost in storage space is significant and thus difficult to scale (assuming no dedupe appliance): you effectively need at least 2x the space, and in many cases (based on retention) 3x or more. So in other words, just as you said, it varies a lot from site to site, customer to customer, based on their goals and budget.

Reverse incremental can be designed to perform well for the vast majority of systems; only systems with a high change rate are usually an issue. Of course, the problem is that most clients don't really know which systems will have a high change rate and which will not. They want a "single best mode to rule them all", but unfortunately, for now it's about tradeoffs between the modes.

Getting the best performance out of reverse incremental, as well as synthetic fulls, is all about building target storage that is well optimized for the random I/O workload it will receive. Many times I find customers who simply take the defaults of their RAID controller, or attempt to stripe massive bundles of 12-16 disks together, creating stripe sizes that are far too small (thus requiring too many IOPS) or way too large (thus wasting I/O on every block read/write). Also, remember that RAID levels 5/6 have a significant per-write I/O penalty, which really comes into play with reverse incremental. Not only that, but having controllers with a decent-sized, battery-backed write cache is critical. RAID1 has a write penalty that is half that of RAID 5, and thus can generally deliver more IOPS for a mixed read/write workload. Because of this, it can many times be beneficial to use RAID1. Many customers look at this as "wasting" space, but the reality is that it may be the difference between being able to use reverse incremental or not, and reverse incremental will easily provide the most space-efficient and easiest-to-manage backup option, assuming the IOPS are available to support it.
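To put rough numbers on that write-penalty argument, here is a minimal sketch of the standard front-end IOPS formula. The spindle count, per-disk IOPS, and read/write mix are illustrative assumptions, not recommendations:

```python
# Usable front-end IOPS for a mixed read/write workload on different RAID
# levels, using the standard write-penalty formula:
#   frontend = (spindles * iops_per_disk) / (read_frac + write_frac * penalty)
# Spindle count, per-disk IOPS, and workload mix below are assumptions.

WRITE_PENALTY = {"RAID1": 2, "RAID5": 4, "RAID6": 6}

def frontend_iops(spindles, iops_per_disk, read_frac, write_frac, raid):
    """Usable IOPS after the RAID write penalty is applied."""
    raw = spindles * iops_per_disk
    return raw / (read_frac + write_frac * WRITE_PENALTY[raid])

# Reverse incremental does roughly 3 I/Os per changed block (one read plus
# two writes), hence the assumed 33% read / 67% write mix.
for raid in ("RAID1", "RAID5", "RAID6"):
    iops = frontend_iops(spindles=16, iops_per_disk=150,
                         read_frac=0.33, write_frac=0.67, raid=raid)
    print(f"{raid}: ~{iops:.0f} usable IOPS from 16 x 150-IOPS disks")
```

Under these assumptions the same 16 spindles deliver roughly 1,400 usable IOPS in RAID1 versus roughly 800 in RAID5 and 550 in RAID6, which is the performance side of the tradeoff against RAID1's 50% capacity "waste".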