Since you mentioned VXLAN, I assume you are running NSXv? The recommended configuration is jumbo frames with an MTU of 9000, and 1600 at the very minimum, because VXLAN encapsulation adds roughly 50 bytes of headers to every frame.
At a minimum, this needs to be configured on all ToRs and switches that your encapsulated traffic will pass through. Exactly which devices those are depends on your design, i.e. where your host transport nodes are connected and where your edges sit.
Depending on what hardware you are running, MTU can be specified globally, per interface, or per VLAN.
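If it helps to see where the 1600 minimum comes from, here is a rough sketch of the arithmetic in Python. It assumes VXLAN over IPv4 with standard header sizes and treats MTU as the IP MTU (the outer Ethernet header is not counted). Some switch platforms count the L2 header in their MTU value, so check how yours defines it. The function names and the hop names (`tor-a`, `spine-1`, `edge-uplink`) are made up for illustration and are not from NSX or any vendor tool.

```python
# Standard header sizes in bytes, assuming VXLAN over an IPv4 underlay.
INNER_ETHERNET = 14   # guest frame's own Ethernet header, carried inside the tunnel
INNER_VLAN_TAG = 4    # only present if the guest frame is 802.1Q tagged
VXLAN_HEADER = 8
OUTER_UDP = 8
OUTER_IPV4 = 20


def required_underlay_mtu(guest_mtu=1500, inner_vlan_tag=False):
    """Smallest underlay IP MTU that carries a full guest frame unfragmented."""
    inner_frame = guest_mtu + INNER_ETHERNET + (INNER_VLAN_TAG if inner_vlan_tag else 0)
    return inner_frame + VXLAN_HEADER + OUTER_UDP + OUTER_IPV4


def undersized_hops(hop_mtus, guest_mtu=1500):
    """Return the hops (hypothetical name -> MTU) that are too small for the overlay."""
    needed = required_underlay_mtu(guest_mtu)
    return {name: mtu for name, mtu in hop_mtus.items() if mtu < needed}


if __name__ == "__main__":
    print(required_underlay_mtu())        # 1550
    print(required_underlay_mtu(8950))    # 9000

    # Hypothetical path between two host transport nodes.
    path = {"tor-a": 9000, "spine-1": 1500, "edge-uplink": 1600}
    print(undersized_hops(path))          # {'spine-1': 1500}
```

A standard 1500-byte guest MTU needs about 1550 bytes on the underlay, so 1600 leaves a little headroom, and a single hop left at 1500 is enough to break the path. With the underlay at 9000 you could in principle raise the guest MTU to around 8950.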