I don't understand why if I have 5TB why I'd need VM1 to be 2.5 and the other, say VM2, to be 2.5TB volumes.
No, I'm not advocating distributing a single VM across multiple datastores. That is generally not a recommended practice. If you have a 5 TB VM, you should try to put it on a single datastore unless you have specific requirements. That obviously means you need a single datastore with more than 5 TB of capacity.
why having one large volume on the SAN can cause problems.
It's not that it would necessarily cause problems, just that there are reasons for not doing this. A fourth reason is blast radius. If someone screws up that one LUN and it has everything on it, you're completely hosed. Even if you have backups, you now have to restore absolutely everything versus, say, 20% if the same data were spread across five LUNs.
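To put rough numbers on the blast-radius point, here's a minimal sketch. It assumes the data is spread evenly across LUNs and that exactly one LUN is lost. The function name and the 10 TB total are hypothetical, picked only for illustration.

```python
def restore_fraction(total_tb: float, lun_count: int) -> float:
    """Fraction of all data that must be restored if exactly one LUN is lost.

    Assumes the data is spread evenly across the LUNs (an illustrative
    simplification, real layouts are rarely this tidy).
    """
    per_lun_tb = total_tb / lun_count
    return per_lun_tb / total_tb


# Hypothetical 10 TB of VM data, as one big LUN versus five smaller ones.
one_big_lun = restore_fraction(10.0, 1)
five_luns = restore_fraction(10.0, 5)

print(f"one LUN:   restore {one_big_lun:.0%} of the data")
print(f"five LUNs: restore {five_luns:.0%} of the data")
```

The restore time scales roughly the same way, so the single-LUN layout turns one mistake into a full-environment restore.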
Supposedly, the SAN can do all the load balancing across the controllers.
I'm not talking about array-level balancing when it comes to queueing. I'm talking about the ESXi and VMFS queues, which are per LUN.
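As a rough illustration of why per-LUN queues matter, here's a small sketch. The depth of 32 is only an assumed example value, since the real per-LUN queue depth depends on your HBA driver and host settings, and the function name is made up for this example.

```python
def host_queue_slots(lun_count: int, per_lun_depth: int = 32) -> int:
    """Outstanding I/O slots one host has across its LUNs.

    per_lun_depth=32 is an assumed example value, not a statement about
    any particular HBA or ESXi configuration.
    """
    return lun_count * per_lun_depth


# Every VM on a single LUN shares one queue on the host.
single = host_queue_slots(1)

# The same VMs spread over five LUNs get five separate queues.
spread = host_queue_slots(5)

print(f"1 LUN:  {single} outstanding I/Os per host")
print(f"5 LUNs: {spread} outstanding I/Os per host")
```

This only describes the host side. The array still has to be able to service the extra outstanding I/O, so more LUNs is not automatically faster.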