I was expecting some improved performance compared to running without setting the NUMA affinity (The reason being that the memory allocation would have been on the local node)
By default, ESXi already tries to allocate memory from the local NUMA node on which the VM is scheduled. There is no need to forcibly pin a VM to a particular NUMA node.
You can verify this in the memory view of (r)esxtop (enable the NUMA stats fields with f->g). This shows the VM's NUMA home node (NHN) as well as the amounts of local and remote memory (NLMEM, NRMEM).
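If you want to turn those two numbers into a single locality figure, a minimal sketch like the following would do. The function name and the sample values are made up for illustration, and I'm assuming both values are read off the esxtop screen in the same unit (MB):

```python
def local_memory_percent(nlmem_mb, nrmem_mb):
    """Return the share of a VM's memory that is NUMA-local, in percent.

    nlmem_mb / nrmem_mb are the NLMEM and NRMEM values read from esxtop
    (assumed to be in MB). Returns None if the VM has no memory reported.
    """
    total = nlmem_mb + nrmem_mb
    if total == 0:
        return None
    return 100.0 * nlmem_mb / total


# Hypothetical reading: 3072 MB local, 1024 MB remote
print(local_memory_percent(3072, 1024))
```

A value close to 100 means almost all of the VM's memory sits on its home node, which is what you should see when the scheduler is doing its job without any manual affinity.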
Note that NUMA placement used to be a bit wonky in earlier ESXi 5.x builds, where some VMs often ended up with a lot of remote memory even though enough local memory was free. This has been fixed in all recent 5.x releases.
Also, even if memory is located on a remote node, I wouldn't expect really big performance differences for most applications. Sure, the difference will be noticeable in synthetic benchmarks like a memory throughput test, but benchmarks like these say little about actual application performance.