We use Zerto to replicate our production VMs to our 3-node vSAN 6 cluster in DR. I can create new VMs, delete them, create folders, copy ISO files, and so on. However, when Zerto tries to do anything (test a failover, create a new protection group, etc.), I get "Cannot complete file creation operation", and I see the following in vpxa.log on the host attempting the operation:
--> Result:
--> (vim.fault.CannotCreateFile) {
--> faultCause = (vmodl.MethodFault) null,
--> faultMessage = (vmodl.LocalizableMessage) [
--> (vmodl.LocalizableMessage) {
--> key = "com.vmware.esx.hostctl.default",
--> arg = (vmodl.KeyAnyValue) [
--> (vmodl.KeyAnyValue) {
--> key = "reason",
--> value = "Hostsvc::osfs::CreateDirectory : Failed to create directory vm-27637 (Cannot Create File)"
--> }
--> ],
--> message = "Operation failed, diagnostics report: Hostsvc::osfs::CreateDirectory : Failed to create directory vm-27637 (Cannot Create File)"
--> }
--> ],
--> file = "Hostsvc::osfs::CreateDirectory : Failed to create directory vm-27637 (Cannot Create File)"
--> msg = "Received SOAP response fault from [<cs p:1f31ce78, TCP:localhost:8307>]: CreateDirectory
--> Cannot complete file creation operation."
--> }
--> Args:
-->
--> Arg spec:
--> (vpxapi.VmLayoutSpec) {
--> vmLocation = (vpxapi.VmLayoutSpec.Location) null,
--> multipleConfigs = <unset>,
--> basename = "Z-VRAH-gvvsan1.rpionline.com-586248",
--> baseStorageProfile = <unset>,
--> disk = (vpxapi.VmLayoutSpec.Location) [
--> (vpxapi.VmLayoutSpec.Location) {
--> url = "ds:///vmfs/volumes/vsan:fe0afcdab0b14afe-b39f91c71d1f1a76/1d00fa54-b4a3-9158-b876-ecf4bbcfd398/f79828c5-af94-4c14-8117-712276546bdd/vm-27637/Temo.vmdk",
--> key = 16020,
--> sourceUrl = <unset>,
--> urlType = "exactFilePath",
--> storageProfile = <unset>
--> }
--> ],
--> reserveDirOnly = <unset>
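For what it's worth, the same directory creation can be attempted by hand on the host to take Zerto out of the picture; osfs-mkdir drives the same osfs layer the log complains about. The vSAN container UUID below is the one from my log, and the directory name is just a test value:

[root@gvvsan1:~] /usr/lib/vmware/osfs/bin/osfs-mkdir /vmfs/volumes/vsan:fe0afcdab0b14afe-b39f91c71d1f1a76/mtu-test
# If this fails with the same "Cannot Create File" error, the problem
# sits below Zerto, in the host/vSAN layer.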
Everything I look into leads me back to an MTU issue and this KB: VMware KB: Creating new objects on a VMware Virtual SAN Datastore fails and reports the error: Failed to create dire…
I enabled the vDS health check and sure enough, it reports an MTU issue. There is also this:
[root@gvvsan1:~] vmkping -d -s 1472 10.7.7.121
PING 10.7.7.121 (10.7.7.121): 1472 data bytes
1480 bytes from 10.7.7.121: icmp_seq=0 ttl=64 time=0.476 ms
1480 bytes from 10.7.7.121: icmp_seq=1 ttl=64 time=0.401 ms
1480 bytes from 10.7.7.121: icmp_seq=2 ttl=64 time=0.298 ms
--- 10.7.7.121 ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 0.298/0.392/0.476 ms
[root@gvvsan1:~] vmkping -d -s 1473 10.7.7.121
PING 10.7.7.121 (10.7.7.121): 1473 data bytes
sendto() failed (Message too long)
sendto() failed (Message too long)
sendto() failed (Message too long)
--- 10.7.7.121 ping statistics ---
3 packets transmitted, 0 packets received, 100% packet loss
A payload size of 1472 succeeds and 1473 fails, and the replies come back at 1480 bytes. That math points at a 1500-byte MTU: 1472 bytes of data + 8-byte ICMP header + 20-byte IP header = exactly 1500 bytes on the wire with the don't-fragment flag set (the 1480 in the replies is the payload plus the ICMP header).
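To see exactly which interfaces that 1500 applies to, the vmkernel MTUs can be listed per host (the vmk numbering is just whatever your hosts happen to use):

[root@gvvsan1:~] esxcli network ip interface list    # each vmk with its MTU and the switch/portgroup behind it
[root@gvvsan1:~] esxcfg-vmknic -l                    # one line per vmknic, including IP and MTU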
My physical Dell 4032 switches have an MTU of 1518 and the vDS is set to an MTU of 1500. My question is: what should they be, and is there anywhere else I need to verify/set the MTU?
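For anyone comparing the layers: the 1518 on the Dell side is a frame size (1500-byte payload + 14-byte Ethernet header + 4-byte FCS), while the 1500 on the vDS is payload only, so on paper those two agree for standard frames. These should show what each layer is actually configured to; treat the Dell part as a sketch, since the exact syntax varies by model and firmware:

[root@gvvsan1:~] esxcli network vswitch dvs vmware list    # vDS config as the host sees it, including MTU
[root@gvvsan1:~] esxcli network nic list                   # physical uplinks (vmnics) with their current MTU
# On the Dell side, "show running-config interface <port>" should reveal
# any per-port MTU override; the exact command depends on the firmware.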
I have cases open with Zerto and VMware, and I've engaged my local networking consultant. As always... thank you, Zach.