I ran into something today that I think could be a bug.
Trying to mount an NFS 4.1 share with :
esxcli storage nfs41 add -H 10.10.10.1 -s /data/nfstest -v nfstest
results in:
Unable to complete Sysinfo operation. Please see the VMkernel log file for more details.: Timeout
Running tcpdump from the NFS server shows that ESXi issues an arp request, but never follows up with communication.
I can ping the NFS server from the host and furthermore:
[root@esx:~] nc -z 10.10.10.1 2049
Connection to 10.10.10.1 2049 port [tcp/nfs] succeeded!
vmkernel.log shows the following relevant lines:
2020-05-12T19:06:44.598Z cpu3:2099718 opID=997a8ef8)World: 12458: VC opID esxcli-0f-b668 maps to vmkernel opID 997a8ef8
2020-05-12T19:06:44.598Z cpu3:2099718 opID=997a8ef8)NFS41: NFS41_VSIMountSet:403: Mount server: 10.10.10.1, port: 2049, path: /data/nfstest, label: nfstest, security: 1 user: , options: <none>
2020-05-12T19:06:44.598Z cpu3:2099718 opID=997a8ef8)StorageApdHandler: 976: APD Handle Created with lock[StorageApd-0x431d75a03540]
2020-05-12T19:06:44.598Z cpu3:2099718 opID=997a8ef8)NFS41: NFS41_ConnectionLookup:785: Created new connection for address tcp 10.10.10.1.8.1
2020-05-12T19:06:59.601Z cpu12:2107491)WARNING: SunRPC: 3936: fail all pending calls for client 0x430944c01370 IP 10.10.10.1.8.1 (socket disconnected)
2020-05-12T19:07:14.600Z cpu3:2099718 opID=997a8ef8)WARNING: NFS41: NFS41FSWaitForCluster:3702: Failed to wait for the cluster to be located: Timeout
2020-05-12T19:07:14.600Z cpu3:2099718 opID=997a8ef8)WARNING: NFS41: NFS41_FSMount:4814: NFS41FSDoMount failed: Timeout
2020-05-12T19:07:14.600Z cpu6:2098147)NFS41: NFS41_ConnectionRemove:1071: Connection: 0x431d75a042f0 [55] addr: tcp 10.10.10.1.8.1
2020-05-12T19:07:14.600Z cpu3:2099718 opID=997a8ef8)StorageApdHandler: 1062: Freeing APD handle 0x431d75a03540 []
2020-05-12T19:07:14.600Z cpu3:2099718 opID=997a8ef8)StorageApdHandler: 1146: APD Handle freed!
2020-05-12T19:07:14.600Z cpu3:2099718 opID=997a8ef8)WARNING: NFS41: NFS41_VSIMountSet:411: NFS41_FSMount failed: Timeout
2020-05-12T19:07:14.601Z cpu6:2098147)SunRPC: 1104: Destroying world 0x202863
Note the IP 10.10.10.1.8.1. Where did it get .8.1 from?? Between seeing the additional numbers on the address and not seeing communication (besides a single arp request) via tcpdump on the nfs server, it looks to me like NFS might be buggy on ESXi 7. Unless I'm missing something, but I have had good luck with NFS on previous versions of ESXi.
Updated before posting
I attempted using esxcli storage nfs add -H 10.10.10.1 -s /data/nfstest -v nfstest with a result of:
2020-05-12T19:14:50.383Z cpu12:2099721 opID=58765395)World: 12458: VC opID esxcli-4a-b6a0 maps to vmkernel opID 58765395
2020-05-12T19:14:50.383Z cpu12:2099721 opID=58765395)NFS: 162: Command: (mount) Server: (10.10.10.1) IP: (10.10.10.1) Path: (/data/nfstest) Label: (nfstest) Options: (None)
2020-05-12T19:14:50.383Z cpu12:2099721 opID=58765395)StorageApdHandler: 976: APD Handle 05893e1b-b8b24cee Created with lock[StorageApd-0x431d23c028b0]
2020-05-12T19:15:00.551Z cpu15:2099721 opID=58765395)SunRPC: 3306: Synchronous RPC abort for client 0x430944c01370 IP 10.10.10.1.0.111 proc 3 xid 0x5066cf32 attempt 1 of 3
2020-05-12T19:15:11.551Z cpu15:2099721 opID=58765395)SunRPC: 3306: Synchronous RPC abort for client 0x430944c01370 IP 10.10.10.1.0.111 proc 3 xid 0x5066cf32 attempt 2 of 3
2020-05-12T19:15:22.551Z cpu15:2099721 opID=58765395)SunRPC: 3306: Synchronous RPC abort for client 0x430944c01370 IP 10.10.10.1.0.111 proc 3 xid 0x5066cf32 attempt 3 of 3
2020-05-12T19:15:22.551Z cpu15:2099721 opID=58765395)SunRPC: 1104: Destroying world 0x202877
2020-05-12T19:15:22.551Z cpu15:2099721 opID=58765395)StorageApdHandler: 1062: Freeing APD handle 0x431d23c028b0 [05893e1b-b8b24cee]
2020-05-12T19:15:22.551Z cpu15:2099721 opID=58765395)StorageApdHandler: 1146: APD Handle freed!
2020-05-12T19:15:22.551Z cpu15:2099721 opID=58765395)NFS: 173: NFS mount 10.10.10.1:/data/nfstest failed: Unable to connect to NFS server.
In this one it shows IP 10.10.10.1.0.111. NFSv3 is enabled on the server side. Running tcpdump on the NFS server shows a similar lack of layer 3 communication after the arp requests.
NFS seems to load successfully on boot and my network interfaces are on the HCL.
2020-05-12T04:44:12.366Z cpu11:2097688)Activating Jumpstart plugin nfs.
2020-05-12T04:44:12.366Z cpu12:2097716)Activating Jumpstart plugin nfs41.
2020-05-12T04:44:12.448Z cpu23:2097649)Loading module nfsclient ...
2020-05-12T04:44:12.471Z cpu23:2097649)Elf: 2048: module nfsclient has license VMware
2020-05-12T04:44:12.474Z cpu23:2097649)FSS: 1153: Registered fs nfs, module 4e, fsTypeNum 0xb00f
2020-05-12T04:44:12.474Z cpu23:2097649)VProbe: 802: Loaded 3 static probes from: nfsclient
2020-05-12T04:44:12.474Z cpu23:2097649)Mod: 4997: Initialization of nfsclient succeeded with module ID 78.
2020-05-12T04:44:12.474Z cpu23:2097649)nfsclient loaded successfully.
2020-05-12T04:44:12.497Z cpu11:2097688)Jumpstart plugin nfs activated.
2020-05-12T04:44:12.516Z cpu23:2097649)Loading module nfs41client ...
2020-05-12T04:44:12.541Z cpu23:2097649)Elf: 2048: module nfs41client has license VMware
2020-05-12T04:44:12.549Z cpu23:2097649)VProbe: 802: Loaded 5 static probes from: nfs41client
2020-05-12T04:44:12.549Z cpu23:2097649)Mod: 4997: Initialization of nfs41client succeeded with module ID 79.
2020-05-12T04:44:12.549Z cpu23:2097649)nfs41client loaded successfully.
2020-05-12T04:44:12.573Z cpu12:2097716)Jumpstart plugin nfs41 activated.
2020-05-12T04:44:25.332Z cpu3:2097682)Activating Jumpstart plugin restore-nfs-volumes.
2020-05-12T04:44:25.351Z cpu3:2097682)Jumpstart plugin restore-nfs-volumes activated.