Backup & Recovery

  • 1.  Opendedup - New deduplication product - Question

    Posted Apr 28, 2010 02:15 AM

    Good Evening Everyone,

    A new open source project named Opendedup has been released. Its Linux-based file system, called SDFS, performs automatic inline and batch deduplication when data is copied or written to it. The product claims it can successfully deduplicate VMDK files WITHOUT using the vStorage API.

    My question is: can a VMware virtual machine be successfully deduplicated without using the vStorage API?

    Thanks

    Steve



  • 2.  RE: Opendedup - New deduplication product - Question

    Posted Apr 28, 2010 06:27 AM

    My question is: can a VMware virtual machine be successfully deduplicated without using the vStorage API?

    Those APIs are for backup operations.

    One of the interesting functions is changed block tracking, which permits an "incremental" VM download.
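
    To illustrate the idea of an "incremental" download, here is a minimal Python sketch. It is NOT the vStorage API and does not call any VMware interface: it simply compares two full copies of a disk image block by block, whereas real changed block tracking avoids reading the unchanged data in the first place. The block size and the function names (block_hashes, changed_blocks) are assumptions made for this example.

```python
import hashlib

BLOCK_SIZE = 4096  # hypothetical block size, chosen only for illustration


def block_hashes(data: bytes, block_size: int = BLOCK_SIZE) -> list[str]:
    """Return one SHA-256 digest per fixed-size block of data."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def changed_blocks(old: bytes, new: bytes, block_size: int = BLOCK_SIZE) -> list[int]:
    """Return indices of blocks in `new` that differ from, or are missing in, `old`."""
    old_h = block_hashes(old, block_size)
    new_h = block_hashes(new, block_size)
    return [
        idx for idx, digest in enumerate(new_h)
        if idx >= len(old_h) or old_h[idx] != digest
    ]
```

    Only the blocks at the returned indices would need to be transferred to bring an older copy up to date.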

    I'm not sure that a native function exists to do block-level "de-duplication".

    You can also dedupe a VM backup on the backup server, without those APIs, but this means that you first have to "download" the entire VM (as in VCB mode).
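
    As a rough sketch of what deduplicating on the backup server means, the Python snippet below keeps each unique fixed-size block only once and records a per-backup "recipe" of block digests. This is a toy, in-memory illustration under assumed names (BlockStore, put, get) and an assumed fixed block size; it is not how Opendedup or any particular backup product is implemented.

```python
import hashlib


class BlockStore:
    """Toy content-addressed store: each unique block is kept only once."""

    def __init__(self, block_size: int = 4096):
        self.block_size = block_size
        self.blocks: dict[str, bytes] = {}  # digest -> block contents

    def put(self, data: bytes) -> list[str]:
        """Store data and return its recipe (ordered list of block digests)."""
        recipe = []
        for i in range(0, len(data), self.block_size):
            block = data[i:i + self.block_size]
            digest = hashlib.sha256(block).hexdigest()
            self.blocks.setdefault(digest, block)  # skip blocks already stored
            recipe.append(digest)
        return recipe

    def get(self, recipe: list[str]) -> bytes:
        """Rebuild the original data from its recipe."""
        return b"".join(self.blocks[d] for d in recipe)

    def stored_bytes(self) -> int:
        """Bytes actually kept after deduplication."""
        return sum(len(b) for b in self.blocks.values())
```

    Storing two backups that share most of their blocks grows stored_bytes() only by the blocks that are new, which is where the space saving comes from. Note that the whole image still has to be read once to compute the digests.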

    Note that for running VMs there can be a second de-duplication level.

    In this case, if your storage provides this function, the VMs will benefit from deduplication (even on old ESX versions).

    Andre



  • 3.  RE: Opendedup - New deduplication product - Question

    Posted Apr 29, 2010 09:51 PM

    So basically you can dedupe a vmdk file without using the vStorage API?

    Thanks

    Steve



  • 4.  RE: Opendedup - New deduplication product - Question

    Posted Apr 30, 2010 05:19 AM

    Yes, you can dedupe the blocks of the vmdk at the storage level.

    For example, by using NetApp.

    Andre



  • 5.  RE: Opendedup - New deduplication product - Question

    Posted Apr 30, 2010 12:35 PM

    Hi,

    I don't know anything about Opendedup, but going by what you described here, the answer is YES: your VM can use deduplication techniques, but only at the guest level, within that VM.

    That's what you are describing: it uses its own file system, which handles the deduplication.

    So basically the OS is unaware of the virtualisation layer and handles the deduplication within the file system.

    It would work exactly the same if it was directly running on hardware without a virtualisation layer.

    Since the deduplication works on physical hardware, it will also work on disks backed by VMDK files.

    How well it does that at VMDK level is the question and I have no idea without testing.
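
    One way to get a first impression without a full test setup is to measure how many fixed-size blocks in a disk image are duplicates. The Python sketch below does that for any file. The function name dedup_ratio and the 4096-byte block size are assumptions for this example, and a real deduplicating file system may chunk data differently, so the result is only a rough estimate.

```python
import hashlib


def dedup_ratio(path: str, block_size: int = 4096) -> float:
    """Return total blocks divided by unique blocks for the file at `path`."""
    seen = set()
    total = 0
    with open(path, "rb") as f:
        while True:
            block = f.read(block_size)
            if not block:
                break
            total += 1
            seen.add(hashlib.sha256(block).digest())
    # An empty file has nothing to deduplicate; report a neutral ratio of 1.0.
    return total / len(seen) if seen else 1.0
```

    For example, dedup_ratio("myvm-flat.vmdk") (a hypothetical file name) returning 2.0 would mean that about half of the blocks are duplicates of other blocks in the same file.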

    The vStorage API works at a higher level than this and is normally not invoked directly from a guest OS; it is more commonly used from a management VM or host, such as a backup VM.

    Hope this helps,



    --
    Wil
    _____________________________________________________
    VI-Toolkit & scripts wiki at http://www.vi-toolkit.com

    Contributing author at blog www.planetvm.net

    Twitter: @wilva



  • 6.  RE: Opendedup - New deduplication product - Question

    Posted Apr 30, 2010 11:32 PM

    That's just what I thought. I have suggested to the author of Opendedup that the vStorage API be used within SDFS (the deduplicated file system). So I guess that when backups are performed, one would expect multiple smaller VMDK files? Or maybe that would depend on how the Opendedup system works.

    Thanks

    Steve



  • 7.  RE: Opendedup - New deduplication product - Question

    Posted May 29, 2011 09:12 PM

    Has anyone considered this: use 2 ESX hosts with local storage and 2 Linux VMs with DRBD and SDFS, on a datastore exported via NFS? This would be an awesome, cheap cluster.