Hi All,
Issue:
We have a 3-node vSAN 7.0.3 hybrid cluster. Each 2U host has 18 disks fitted, arranged as 3 disk groups per host, each with 1 x 745 GB SAS SSD cache disk and 5 x 2.4 TB SAS HDD capacity disks.
The focus is to determine whether the cache is sufficient. Across the 3 nodes there are 9 disk groups in total, giving 9 x 745 GB SAS SSD cache drives (6.7 TB of cache) and 45 x 2.4 TB SAS HDD capacity drives. With 56.5 TB of capacity used, this equates to a cache-to-used-capacity ratio of approximately 12%, against the guideline of 10%.
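For anyone who wants to check my arithmetic, here is a rough Python sketch of the ratio calculation. The figures are the ones quoted above; decimal units (1 TB = 1000 GB) and the function name `cache_summary` are my own assumptions for illustration. The last call estimates what a fourth disk group per host would give, assuming the same 745 GB cache device and unchanged used capacity.

```python
# Rough cache sizing check for the hybrid cluster described in this post.
# Assumption: decimal units, 1 TB = 1000 GB.

HOSTS = 3
DISK_GROUPS_PER_HOST = 3
CACHE_SSD_GB = 745          # one cache SSD per disk group
USED_CAPACITY_TB = 56.5     # used capacity reported by the cluster
GUIDELINE_RATIO = 0.10      # the 10% cache-to-used-capacity guideline


def cache_summary(hosts, dgs_per_host, cache_gb, used_tb):
    """Return (total cache in TB, cache-to-used-capacity ratio)."""
    total_cache_tb = hosts * dgs_per_host * cache_gb / 1000
    return total_cache_tb, total_cache_tb / used_tb


# Current layout: 3 disk groups per host.
total_tb, ratio = cache_summary(HOSTS, DISK_GROUPS_PER_HOST,
                                CACHE_SSD_GB, USED_CAPACITY_TB)
print(f"Current: {total_tb:.1f} TB cache, ratio {ratio:.1%}, "
      f"meets guideline: {ratio >= GUIDELINE_RATIO}")

# Hypothetical layout: one extra disk group per host, same cache SSD size.
total4_tb, ratio4 = cache_summary(HOSTS, DISK_GROUPS_PER_HOST + 1,
                                  CACHE_SSD_GB, USED_CAPACITY_TB)
print(f"With 4 DGs per host: {total4_tb:.1f} TB cache, ratio {ratio4:.1%}")
```

This only reproduces the sizing ratio (roughly 12% today); it says nothing about whether the working set of the batch jobs actually fits in cache.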
The performance charts show a spike in read latency, as shown below.
Hybrid vSAN aims for a read cache hit rate of around 90%. A cache hit is when a read request is served from the read cache. Conversely, a cache miss is when the block has to be retrieved from the capacity tier. Since the capacity tier uses magnetic disks, a read that misses the cache incurs extra latency. Looking at the read cache hit rate for the 9 disk groups below, it does not look like it's reaching 90% very often, so are there a lot of cache misses?
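To illustrate why I am worried about the hit rate, here is a minimal sketch of how the average read latency moves as the hit rate drops. The two latency values are placeholders I picked purely for illustration, not measurements from this cluster, and `effective_read_latency_ms` is a made-up helper name.

```python
# Weighted-average read latency for a given read cache hit rate.
# Assumption: every read is either a cache hit or a capacity-tier miss.


def effective_read_latency_ms(hit_rate, cache_ms, capacity_ms):
    """Average read latency given hit rate and per-tier latencies."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * cache_ms + (1.0 - hit_rate) * capacity_ms


# Placeholder latencies (hypothetical, NOT measured on this cluster).
CACHE_MS = 0.5      # read served from the SSD cache
CAPACITY_MS = 10.0  # read served from a magnetic capacity disk

for hit_rate in (0.90, 0.80, 0.70):
    avg = effective_read_latency_ms(hit_rate, CACHE_MS, CAPACITY_MS)
    print(f"hit rate {hit_rate:.0%}: average read latency {avg:.2f} ms")
```

With these placeholder numbers, the miss term dominates the average even at a 90% hit rate, so each further drop in hit rate shows up directly as higher read latency.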
So would the cluster benefit from an additional disk group (DG) per host, and therefore more cache?
Or would an SPBM read cache reservation be advisable for the affected VMs? There are 2 VMs that run the main LOB applications and complete daily batch jobs. These jobs used to take 10 hours to complete and now take 12 hours, so this is a review to see whether the cache is struggling.
Thanks