vSAN1

  • 1.  vSAN remote witness latency

    Posted Oct 17, 2025 01:35 PM

    What is the real impact of latency just above the 5ms magic number?   If the latency were below 10ms, how badly could/would that impact the system?

    We have a couple of customers with cloud latency between 5-10ms.   Could this still be a functional design with the witness hosted in a cloud in such an environment?



    -------------------------------------------


  • 2.  RE: vSAN remote witness latency

    Broadcom Employee
    Posted Oct 30, 2025 05:33 AM

    We tolerate much more latency for the witness components, mainly because there is no data transfer to the witness, only witness component updates, so that link isn't as sensitive. The sensitive link is the one between the two "data locations", where the requirement is 5ms RTT max. So no worries if the witness goes to 10ms RTT or more; even 50ms wouldn't be a problem. The documentation describes the current limits, by the way.

    -------------------------------------------



  • 3.  RE: vSAN remote witness latency

    Posted Oct 30, 2025 12:58 PM
    Just to add to what Duncan said - this documentation outlines the data-site to data-site and data-site to Witness network bandwidth and RTT (Round-Trip Time) latency requirements:
     
    "We have a couple of customers with cloud latency between 5-10ms.   Could this still be a functional design with the witness hosted in a cloud in such an environment?"
    If you mean 5-10ms RTT data-site to data-site, then the impact would be decreased performance and lower maximum achievable throughput. vSAN is a synchronous replication solution: if, for instance, data is stored as PFTT=1 (a data-replica on both data-sites), then the time for a write IO to be committed (i.e. the latency observed by the VM/Guest-OS) is the sum of:
    1. How long the IO takes to traverse the network to all nodes that store a data-replica.
    2. How long those nodes take to write their portion of the IO to the local Disk-Group/StoragePool where the data-replica resides.
    3. How long the Ack of that commit takes to traverse the network back to the originator of the IO (the DOM-Owner).
    So if the transmission of the data on the inter-node network and/or the Ack is significantly delayed by high network latency, the IO takes longer overall, i.e. the VM issuing it observes increased latency.
    If you mean 5-10ms between data-site and Witness, then that is completely fine and wouldn't impact performance whatsoever. I have seen clusters where this varies between 50 and 100ms and it is not a problem.
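
To put rough numbers on the synchronous write path described above, here is a back-of-the-envelope sketch in Python. It is only an illustration, not how vSAN measures or computes anything: the function name, the 1ms commit time and the 0.2ms intra-site RTT are made-up values, and the model assumes a PFTT=1 mirror where the write completes once the slower of the two replicas has acknowledged. The witness link is left out on purpose, since it does not sit in the data path.

```python
def mirrored_write_latency_ms(inter_site_rtt_ms, commit_ms, intra_site_rtt_ms=0.2):
    """Rough estimate of guest-visible write latency for a two-site mirror.

    All parameters are hypothetical inputs for illustration:
      inter_site_rtt_ms  - round trip between the two data-sites
      commit_ms          - time for a node to commit its part of the IO locally
      intra_site_rtt_ms  - round trip to the replica inside the owner's own site
    """
    # Each replica costs: send the IO + commit it + send the Ack back,
    # i.e. one network round trip plus the local commit time.
    local_replica = intra_site_rtt_ms + commit_ms
    remote_replica = inter_site_rtt_ms + commit_ms
    # Synchronous replication: the write is acknowledged to the VM only
    # after every replica has acknowledged, so the slowest one dominates.
    return max(local_replica, remote_replica)


if __name__ == "__main__":
    for rtt in (1.0, 5.0, 10.0):
        estimate = mirrored_write_latency_ms(inter_site_rtt_ms=rtt, commit_ms=1.0)
        print(f"inter-site RTT {rtt:>4} ms -> estimated write latency {estimate} ms")
```

Under these assumptions the inter-site RTT is added almost one-for-one to every write, which is why going from 5ms to 10ms between data-sites is felt directly by the VM, while the same change on the witness link is not.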
    -------------------------------------------