I just completed my VCP-NV so I'll have a crack at explaining it and any more experienced onlookers may care to check my logic and weed out any typos :-)
My first point is that the encapsulation would be VXLAN and not GRE if you're on NSX-v. NSX-mh can use GRE, STT and also VXLAN. Just saying, in case you're answering any test questions in the future. I'm not breaking any non-disclosure here as I honestly don't recall if that question came up, and in any case it's a pretty obvious fact you should learn if you follow the exam blueprint and study the differences between NSX-v and NSX-mh.
The first error in your thinking is that you are applying the theory as if the VMs were on the same subnet and L2 domain, but they are not.
In the example you quoted, the two VMs are on different subnets:
web-sv-01a 172.16.10.11 is on the 172.16.10.0/24 network
whereas
app-sv-01a 172.16.20.11 is on the 172.16.20.0/24 network
NOTE: This means there are two logical switches, since a DLR may not connect more than one LIF to the same logical switch. Each logical switch is uniquely identified by a VNI, so there will be two VNIs. Let's allocate 5001 to the 172.16.10.0/24 switch and 5002 to the 172.16.20.0/24 switch, in case we need them further on down.
Whether the network is physical or virtual, it still holds true that these VMs must communicate with one another via a router, albeit in this case a virtual router, the DLR. So when framing packets at L2, each VM will use its default gateway (an interface on the DLR instance) as the destination MAC address, and any frames it receives from the other subnet will be sourced from that same DLR interface MAC address. It's just Routing 101 as you already know it. It's virtually the same (oops, couldn't help the pun, sorry).
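If it helps, here's that Routing 101 decision as a minimal Python sketch, nothing NSX-specific at all; the .99 gateway address is a made-up one I'll formalise just below:

    from ipaddress import ip_address, ip_network

    # Routing 101: pick the address to ARP for when sending a frame.
    # A destination on my own subnet gets ARPed for directly; anything
    # else gets framed to the MAC of my default gateway (the DLR LIF).
    def next_hop_for(src_net, dst_ip, gateway_ip):
        if ip_address(dst_ip) in ip_network(src_net):
            return dst_ip      # same L2 domain: ARP for the VM itself
        return gateway_ip      # different subnet: ARP for the DFGW

    # web-sv-01a talking to app-sv-01a crosses subnets, so the frame is
    # addressed to the DLR LIF's vMAC, never to app-sv-01a's MAC.
    print(next_hop_for("172.16.10.0/24", "172.16.20.11", "172.16.10.99"))
    # -> 172.16.10.99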
Your second misunderstanding is a knock-on effect of the first. Since the VMs are on different subnets, they will be communicating with different DLR interfaces, and so a different LIF and vMAC each.
The MAC address on a DLR Interface connecting to a Logical Switch is called a vMAC.
This vMAC is the same for that DLR and LIF pair across all hypervisors in the transport zone.
So let's say we have one hypervisor with these two VMs on it, and let's call the router instance DLR1 and the logical interfaces LIF1 and LIF2, with .99 as the host IP address of the DLR LIFs on their respective subnets.
Using fictitious MAC addresses for easier illustration -
web-sv-01a has DLR1/LIF1, IP address 172.16.10.99/24, MAC 00:00:00:00:10:99, as its default gateway
app-sv-01a has DLR1/LIF2, IP address 172.16.20.99/24, MAC 00:00:00:00:20:99, as its default gateway
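To keep the whole setup in one place, here it is as a throwaway Python snippet (same fictitious MACs, plus the VNI-to-switch assignment from the note above):

    # The single-host example: one DLR, two LIFs, one logical switch each.
    DLR1 = {
        "LIF1": {"vni": 5001, "ip": "172.16.10.99/24", "vmac": "00:00:00:00:10:99"},
        "LIF2": {"vni": 5002, "ip": "172.16.20.99/24", "vmac": "00:00:00:00:20:99"},
    }
    VMS = {
        "web-sv-01a": {"ip": "172.16.10.11", "mac": "00:00:00:00:10:11", "gw": "172.16.10.99"},
        "app-sv-01a": {"ip": "172.16.20.11", "mac": "00:00:00:00:20:11", "gw": "172.16.20.99"},
    }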
When web-sv-01a attempts for the first time to send an IP packet to app-sv-01a,
web-sv-01a will ARP for the MAC address of its DFGW, which is DLR1/LIF1.
So on this network segment, 172.16.10.0/24, for all traffic between the VMs:
IP packets will have web-sv-01a's and app-sv-01a's IP addresses in the source and destination fields
but at L2 the frames will carry web-sv-01a's MAC and DLR1/LIF1's vMAC in the source and destination fields.
NOTE: which way around the source and destination addresses go depends on the direction of the traffic, i.e. to or from the VM via the router, but you can work that out.
Once the ARP process is done -
web-sv-01a's ARP table
IP Address       MAC Address          IF
172.16.10.99     00:00:00:00:10:99    1

DLR1's ARP table
IP Address       MAC Address          IF
172.16.10.11     00:00:00:00:10:11    1
So, as per usual, web-sv-01a has no clue about the actual MAC address of app-sv-01a.
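A trivial Python sketch of what those caches now hold (again, nothing NSX-specific, just what a completed ARP exchange leaves behind):

    # ARP caches after the first hop is resolved.
    arp_web = {"172.16.10.99": "00:00:00:00:10:99"}   # web learned its DFGW (LIF1)
    arp_dlr = {"172.16.10.11": "00:00:00:00:10:11"}   # DLR1 learned web-sv-01a

    # The key point: the app server's MAC is nowhere in web's cache.
    print("172.16.20.11" in arp_web)   # False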
Similarly, when DLR1 attempts for the first time to route a packet originating from web-sv-01a to app-sv-01a, it will first ARP for the MAC address of app-sv-01a via DLR1/LIF2.
So on this network segment, 172.16.20.0/24:
IP packets will have web-sv-01a's and app-sv-01a's IP addresses (NO CHANGE AT THE IP LAYER)
but at L2 the frames will carry DLR1/LIF2's vMAC and app-sv-01a's MAC address in the source and destination fields.
Once the ARP process is done -
app-sv-01a's ARP table
IP Address       MAC Address          IF
172.16.20.99     00:00:00:00:20:99    1

DLR1's ARP table
IP Address       MAC Address          IF
172.16.10.11     00:00:00:00:10:11    1
172.16.20.11     00:00:00:00:20:11    2   <- THE DIFFERENCE NOW IS THAT THE DLR HAS THE APP SERVER'S MAC TOO
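Putting the two segments side by side in one more throwaway Python sketch: the L3 addresses never change, only the L2 header gets rewritten as DLR1 routes the packet:

    # The same IP packet, framed on each segment (fictitious MACs as above).
    packet = {"src_ip": "172.16.10.11", "dst_ip": "172.16.20.11"}  # unchanged end to end

    frame_on_web_segment = {**packet,
        "src_mac": "00:00:00:00:10:11",   # web-sv-01a
        "dst_mac": "00:00:00:00:10:99"}   # DLR1/LIF1 vMAC (the DFGW)

    frame_on_app_segment = {**packet,
        "src_mac": "00:00:00:00:20:99",   # DLR1/LIF2 vMAC, post-routing
        "dst_mac": "00:00:00:00:20:11"}   # app-sv-01a, learned via ARP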
NOTE: No VXLAN encapsulation occurred since the VMs were on the same host and the Controllers were not involved in the ARP traffic (although the security module in each hypervisor would have updated the VTEP and ARP tables on both the hosts and the Controller cluster)
Now let's consider multiple ESXi hosts (hypervisors).
In the second example, let's say we still send traffic from the web server to the app server as above, but this time the VMs are on different ESXi hosts;
e.g. web-sv-01a is on Host1 and app-sv-01a is on Host2.
In this case the rule applies that routing is done by the DLR kernel instance on the host closest to the source of the traffic, so traffic originating from web-sv-01a would be routed by the DLR kernel module on Host1, and traffic originating from app-sv-01a would be routed by the DLR kernel module on Host2. It may be a bit hard to follow, but there are some good blogs and explanations out there, so take a look around if you want pictures, e.g. http://networkinferno.net/nsx-compendium#Logical_Distributed_Routing
NOTE: This principle of routing closest to the source is an important one to remember for your packet walk/trace theory and is also part of the exam blueprint.
The next step is for the logical switch kernel instance on Host1 to encapsulate the traffic in VXLAN, with VNI 5002 in the VXLAN header (remember, we allocated VNI 5002 to the app server's 172.16.20.0/24 logical switch earlier, and the routing has already happened, so the packet is now on that switch) and, in the outer IP header, Host1's VTEP IP address as the source and Host2's VTEP as the destination of the UDP packet, then send it out over the physical network to Host2's VTEP IP. On Host2 it is decapsulated by the kernel instance of logical switch VNI 5002 and forwarded to the app-sv-01a VM (notice there's no routing at this end, as it was already done on Host1).
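The VXLAN header itself is only 8 bytes. Here's a minimal Python sketch of it, straight from RFC 7348 rather than anything NSX-specific, just to show where the VNI rides; the outer Ethernet/IP/UDP headers with the VTEP addresses (UDP port 4789 per IANA, though I believe NSX-v has historically defaulted to 8472) wrap around this:

    import struct

    # RFC 7348 VXLAN header: 8 bits of flags (I-flag 0x08 = "VNI is valid"),
    # 24 reserved bits, then the 24-bit VNI and 8 more reserved bits.
    def vxlan_header(vni):
        return struct.pack("!II", 0x08 << 24, vni << 8)

    print(vxlan_header(5002).hex())   # 0800000000138a00  (0x138a == 5002)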
I'm assuming that return traffic in the reverse direction, app-sv-01a to web-sv-01a, behaves in much the same way: this time it is routed by Host2's DLR kernel instance onto the router LIF associated with the logical switch having VNI 5001, encapsulated by Host2's VXLAN kernel module with VNI 5001 in the VXLAN header and Host1's VTEP IP address as the destination in the outer IP header, then sent to the physical network to be forwarded to Host1's VTEP interface, decapsulated by the VXLAN kernel module on Host1, and switched to web-sv-01a by the logical switch instance to which its vNIC is connected.
Phew, that was a long time writing but I think I have it. I hope you can follow it.
As for your questions on the command line and troubleshooting, have a look at Rich Dowling's blog (VCP-NV | YAVB - Rich Dowling) and the last section (9) of the VCP-NV blueprint, where there's a goodly bunch of troubleshooting CLI commands, etc.
Have fun, and I hope your question was answered. This was my first reply to a VMware technical post ever, so please be kind; I welcome your constructive feedback, both positive and negative. It's all a learning process.