VMware Workstation


Why aren't large VMDK files split into smaller chunks?

  • 1.  Why aren't large VMDK files split into smaller chunks?

    Posted Oct 21, 2024 07:27 AM

    When you create (say) a 200GB VMDK OS drive, it's split by default into up to 32 smaller chunks. I didn't like this feature at first, until I came to defragment or compact a disk. One of these operations (I can't recall which) creates a TMP file, carries out the operation, deletes the original and replaces it with the temp file. With one single VMDK file (say 1TB), you need that much free space again for the temporary file. When the drive is split into 32 smaller chunks, the operation works on each chunk one by one, so with a 1TB drive split into 32GB chunks, all you need free at any one time is 32GB.
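
    As a rough sketch of that arithmetic, assuming the operation really does rewrite one chunk at a time and that all chunks are the same size (the function name is made up for illustration, it is not anything from VMware):

```python
def peak_temp_space_gb(capacity_gb: float, chunks: int) -> float:
    """Free space needed for the temporary file, in GB.

    Assumption: only one chunk is rewritten at a time and all chunks
    are equal in size, so the temp file is as large as one chunk.
    """
    if capacity_gb <= 0 or chunks < 1:
        raise ValueError("capacity must be positive and chunks >= 1")
    return capacity_gb / chunks


# 1TB (1024GB) disk stored as a single file vs. split into 32 chunks
print(peak_temp_space_gb(1024, 1))   # 1024.0
print(peak_temp_space_gb(1024, 32))  # 32.0
```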

    However, there seems to be a size limit to this chunk feature. One of my VMs has a 2TB data drive and that is only split into three chunks, as shown below. Most of the data is in one chunk, so to carry out a disk defrag/compact I need at least 1TB free on this drive.

    Why is there a size limit for chunking? Chunking is most useful precisely with large drives.



  • 2.  RE: Why aren't large VMDK files split into smaller chunks?

    Posted Oct 22, 2024 10:02 AM

    It does feel like a bug, behaving exactly as you describe. Bigger chunks are typically preferred for speed if you have something like a large Oracle database. That mattered with HDDs, which are very slow in VMware but faster if you don't let it create chunks in the first place.

    On the other hand, if you have an SSD, there is no reason to defragment - it is just not relevant. Also, when you use VMware, you should avoid HDDs: in my experience they run much slower than their nominal speed under VMware, and always have.




  • 3.  RE: Why aren't large VMDK files split into smaller chunks?

    Posted Oct 22, 2024 10:47 AM

    I tend to defragment before I do a compact - it's the compact that I'm more interested in. You sometimes end up copying large amounts of data into a thin VMDK file, which increases its size. The space isn't recovered when you delete files in the OS. I'll do a test now to remind myself which operation uses the TMP file.




  • 4.  RE: Why aren't large VMDK files split into smaller chunks?

    Posted Oct 22, 2024 11:25 AM

    Yes, it's the defragment operation that creates the temporary file, while the compact works within the VMDK. I know why Windows doesn't defragment SSDs anymore: moving blocks was really only beneficial on hard disks, where there was a performance benefit in making sure consecutive blocks of a large file were next to each other (fewer seeks). Seek time on an SSD isn't an issue, and moving blocks around on an SSD potentially shortens its life span due to the extra writes.

    But for VMDK files, I've never been quite so sure what defragmentation does. My guess is that it simply moves free blocks to the end of the VMDK. I don't think VMware Workstation is aware of the file structure within the VMDK (or maybe it is?), so it doesn't do anything at file level to ensure blocks are contiguous. But this is a guess.

    Compact... well, I thought it just trimmed the VMDK size down by the free blocks at the end. Any free blocks in the middle would still be there. But I'm not so sure - if it did this, compact would be near instant, as all it would be doing is adjusting the size of the file. But compact does sometimes take significant time, which suggests that it's also moving used blocks over free blocks so that all the blocks at the end are free.

    It's also interesting that defrag needs to create a TMP file while compact does its I/O within the existing file. I'm not sure why defrag has to produce a TMP file at all.





  • 5.  RE: Why aren't large VMDK files split into smaller chunks?

    Posted Oct 22, 2024 11:28 AM

    Here is the 2TB VMDK shown above. Note how it says the disk space is stored in multiple files, but clearly it's only using three. Does sound like a bug, doesn't it? Smaller chunks are also useful when moving large VMDKs around - if the copy operation gets interrupted, you simply carry on with the next chunk, but with a single 950GB chunk you have to start again.




  • 6.  RE: Why aren't large VMDK files split into smaller chunks?

    Posted Oct 22, 2024 05:31 PM

    Yes, you may have something there. I have also thought about defrag - while it's not useful on a physical computer with SSD disks - is it somehow connected to the performance and functionality of Compact in VMware? I don't know the answer to that.

    As for copying - well, there is a solution to that. Typically, the problems people have in VMware (in training courses, for example) come from using Windows copy. It is not reliable, especially when something can go wrong (such as an unknown network), so avoid it. Well, you CAN use it, but confirm with robocopy afterwards. The approach is simple (please look up a proper explanation elsewhere for the MANY options that are useful in other cases):
    (in Command Prompt - not the newer "shell" in Win11)

    cd /d target_directory
    robocopy "d:\VMs\MyVMware computer" . /E /V /Z

    The " signs are necessary because of the space in the directory name. The dot " . " means the directory you are currently in.

    The problem you described is addressed by the "/Z" option: with it, an interrupted copy does not restart from the beginning. However, that depends on what the source and target really are (a Linux file server, for example).

    Sorry for explaining from the start - this may not be necessary for you, but perhaps it is for somebody else. I just wanted to give an explicit but still very short explanation.

    Robocopy is included with Windows and has tons of options, but this basic syntax does the copying in a reliable manner, and the command can be repeated if something goes wrong. Files already copied will NOT be copied again. Or, as suggested above, you can run it after first copying by whatever means you have, to make sure the copy is complete (this workflow doesn't make much sense, but it shows what robocopy can do).




  • 7.  RE: Why aren't large VMDK files split into smaller chunks?

    Posted Oct 23, 2024 01:32 PM

    >  Seek time on an SSD isn't an issue plus moving blocks around on SSD is potentially damaging to the life span due to extra writes.

    While I don't doubt that moving blocks around causes additional wear, I've always been unsure about the claim that seek time and fragmentation don't matter on SSDs.

    All my drives are SSDs (NVMe and SATA attached) and I have seen:

    1) VMware Workstation sometimes complains that 'fragmentation' is slowing down my VM. (As best I can tell, this refers only to fragmentation inside the .vmdk.)

    and 

    2) The random read/write performance of the SSDs is usually notably lower than the 'large sequential' performance as shown on benchmarks, etc.

    That makes me think that while there is no 'rotational' delay on SSDs, it is still possible that the number of I/O operations per second (IOPS) limits how fast a large number of individual 'random' reads can be done.

    That is as opposed to possibly being able to do one large read instead (in a non-fragmented case).

    I never found a definitive answer but it still seemed possible to me that a 'contiguous' file might get better performance depending on what size reads/writes are needed for that file. 
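
    To put rough numbers on that reasoning, here is a small sketch of how an IOPS ceiling translates into throughput for small versus large requests. The drive figures are hypothetical, picked only for illustration, and are not measurements from any real SSD:

```python
def throughput_mb_s(iops: float, request_kb: float) -> float:
    """Throughput in MB/s if the drive completes `iops` requests of `request_kb` each."""
    return iops * request_kb / 1024


def time_to_read_s(total_mb: float, iops: float, request_kb: float) -> float:
    """Seconds needed to read `total_mb` at the given request rate and size."""
    return total_mb / throughput_mb_s(iops, request_kb)


# Hypothetical drive: 50,000 IOPS for 4KB random reads,
# 2,000 requests per second for 1MB (1024KB) sequential reads.
random_rate = throughput_mb_s(50_000, 4)         # 195.3125 MB/s
sequential_rate = throughput_mb_s(2_000, 1024)   # 2000.0 MB/s

print(random_rate, sequential_rate)
print(time_to_read_s(1024, 50_000, 4))      # 1GB read in 4KB pieces
print(time_to_read_s(1024, 2_000, 1024))    # 1GB read in 1MB pieces
```

    With these made-up figures the fragmented (small request) case is about ten times slower, even though no seek is involved. Whether a real drive and a real .vmdk behave like this is exactly the open question.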




  • 8.  RE: Why aren't large VMDK files split into smaller chunks?

    Posted Oct 22, 2024 05:32 PM

    Good afternoon.

    Does anyone know how long the account verification process takes? I've been waiting 4 days already and nothing.




  • 9.  RE: Why aren't large VMDK files split into smaller chunks?

    Broadcom Employee
    Posted Oct 24, 2024 03:20 AM

    Hi,

    The issue has been raised internally, and the appropriate team will investigate it.

    It would be helpful if you could provide the support bundle.




  • 10.  RE: Why aren't large VMDK files split into smaller chunks?

    Posted Oct 25, 2024 02:11 PM
    Edited by a_p_ Oct 25, 2024 02:13 PM

    I don't remember exactly which version it was (I think some 15.x release), but the number and size of extents changed at some point and now depend on the virtual disk's provisioned size.

    From: Add a New Virtual Hard Disk to a Virtual Machine

    ~snip~

    ... If the capacity is greater than or equal to 2032 GB, it utilizes 2032 GB extents to maximize efficiency and minimize the number of files.

    ~snip~

    So unless you need exactly 2TB (2048 GB), keep the size below 2032 GB, and you will get a virtual disk split into 32 extents.
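
    A minimal sketch of that rule, for anyone who wants to check a size before creating the disk. Only the 2032 GB threshold and extent size come from the quoted documentation. The "32 equal extents" branch is an assumption based on what is reported in this thread (the documented rule for smaller disks is snipped above), and the function name is made up:

```python
LARGE_EXTENT_GB = 2032  # threshold and extent size from the quoted documentation


def extent_sizes_gb(capacity_gb: int) -> list:
    """Return the assumed size of each extent, in GB, for a split virtual disk."""
    if capacity_gb <= 0:
        raise ValueError("capacity must be positive")
    if capacity_gb >= LARGE_EXTENT_GB:
        # Documented case: fill 2032 GB extents, remainder goes in a last one.
        full, rest = divmod(capacity_gb, LARGE_EXTENT_GB)
        sizes = [LARGE_EXTENT_GB] * full
        if rest:
            sizes.append(rest)
        return sizes
    # ASSUMPTION: below the threshold the disk is split into 32 equal extents.
    return [capacity_gb / 32] * 32


for capacity in (2048, 2031):
    sizes = extent_sizes_gb(capacity)
    print(capacity, "GB ->", len(sizes), "extents, largest", max(sizes), "GB")
```

    Under these assumptions a 2048 GB disk ends up with one 2032 GB extent plus a 16 GB one, while a 2031 GB disk gets 32 extents of roughly 63.5 GB each. The number of files actually seen on disk may differ from the extent count.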

    André