Hi,
A quick question about link aggregation. How do we PXE boot and build an ESXi server that is connected to the network via an aggregate link?
We have ESXi 4.0 servers connected via a pair of 10GE NICs to Cisco Nexus switch infrastructure. We want to run link aggregation between the servers and the network, so we configure "Route based on IP hash" on the ESX host and a Cisco virtual Port-Channel (vPC) on the Nexus switches. The Cisco switches are configured for "static" link aggregation, as the ESX servers do not support the Link Aggregation Control Protocol (LACP), i.e., dynamic link aggregation. The use of static link aggregation means that as soon as "link" is seen on a Nexus switch port, that port is inserted into the port-channel.
This works fine and we can send and receive traffic for a single VM across both VMNICs.
Now the server team want to update the ESX server. They reboot the host and choose to PXE boot it, a process which uses only one of the two 10GE NICs, i.e., the first physical NIC on the host. The second interface is up, i.e., "link" is established between the host and the Nexus switch, but the PXE client on the host does nothing with that NIC.
All traffic from the server is received by the Nexus switch on the physical port that connects to the first physical NIC, but the MAC address associated with that NIC is learnt by the Nexus on its port-channel interface. When the switch sends traffic to the server, it is just as likely to use the physical port that connects to the server's second physical NIC, which the PXE client is not using during the PXE boot / build process. This traffic will be dropped and the build process will fail.
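To illustrate the failure mode, here is a rough Python sketch. It assumes a toy load-balancing hash (XOR of source and destination IP, modulo the number of member links); the real Nexus port-channel hash is configurable and will differ. The addresses and function names are made up for illustration only.

```python
import ipaddress

def member_link(src_ip: str, dst_ip: str, n_links: int = 2) -> int:
    """Toy hash (an assumption, not the actual NX-OS algorithm):
    XOR the two addresses and take the result modulo the link count."""
    src = int(ipaddress.ip_address(src_ip))
    dst = int(ipaddress.ip_address(dst_ip))
    return (src ^ dst) % n_links

def delivered(src_ip: str, dst_ip: str, listening_links: set) -> bool:
    """True if the switch picks a member link the host is listening on."""
    return member_link(src_ip, dst_ip) in listening_links

# Hypothetical addresses: the PXE client and two build servers.
pxe_host = "10.0.0.50"
build_servers = ["10.0.0.10", "10.0.0.11"]

# During PXE boot only the first NIC (link 0) is in use by the host.
for server in build_servers:
    link = member_link(server, pxe_host)
    ok = delivered(server, pxe_host, listening_links={0})
    print(f"{server} -> {pxe_host}: link {link}, delivered={ok}")
```

With these example addresses, replies from one server hash to link 0 and reach the PXE client, while replies from the other hash to link 1 and are lost, even though both member links show "link" up on the switch.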
Cisco have added functionality in IOS and NX-OS to suspend an individual link of a port-channel if LACP PDUs are not received on that link, such that the remaining link acts as if it were a single switch port. This is a useful feature, apart from the point above that ESX does not support LACP.
So this raises two questions for me:
1. Does this mean that, if we use link aggregation between the ESX host and the Nexus switch, every time the server team want to update the ESX build version they first have to contact the network team to have them disable vPC?
2. When will ESX support LACP, i.e., dynamic link aggregation?
Thanks in advance.
Steve