We recently took over an environment running NSX-T 3.1 on a 14-node Cisco HyperFlex cluster. It has never been upgraded since it was first deployed, and we are now planning its first upgrade, to NSX-T 3.2.
I've done my best to understand how NSX-T and HyperFlex each work, but I may be misinterpreting some aspects. Please correct me if so.
My specific concerns are:
Upgrading Nodes the Right Way: As I understand it, NSX-T upgrades nodes sequentially, restarting each one in turn. On HyperFlex, however, each node needs a preparatory step before it goes down (typically handled through HX Connect), followed by a storage rebalance afterwards. How can we make sure the NSX-T upgrade process respects these requirements?
Cluster Health Checks Between Node Upgrades: After NSX-T upgrades a single node, we need to confirm that the HyperFlex cluster is healthy before the next node is touched. Can NSX-T pause between node upgrades, or otherwise allow manual intervention at that point?
Maintaining Communication: After the upgrade, the nodes must still communicate with each other and stay connected to NSX-T's management and control plane. How does NSX-T verify or ensure this?
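To make the workflow we are hoping for more concrete, here is a rough Python sketch of the gating logic: upgrade one node, wait until the cluster reports healthy, and stop if it never does. It is only an illustration of the intent. `upgrade_node` and `cluster_healthy` are hypothetical placeholders that we would have to implement ourselves; they are not real NSX-T or HyperFlex API calls, and the node names are made up.

```python
import time
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence


@dataclass
class UpgradeResult:
    upgraded: List[str]          # nodes upgraded so far, in order
    halted_before: Optional[str] # node we refused to start, if any
    healthy_at_end: bool         # cluster health after the last step


def wait_until_healthy(cluster_healthy: Callable[[], bool],
                       max_polls: int,
                       poll_interval_s: float,
                       sleep: Callable[[float], None]) -> bool:
    """Poll the (hypothetical) health check until it passes or we give up."""
    for _ in range(max_polls):
        if cluster_healthy():
            return True
        sleep(poll_interval_s)
    return False


def rolling_upgrade(nodes: Sequence[str],
                    upgrade_node: Callable[[str], None],
                    cluster_healthy: Callable[[], bool],
                    max_polls: int = 30,
                    poll_interval_s: float = 60.0,
                    sleep: Callable[[float], None] = time.sleep) -> UpgradeResult:
    """Upgrade nodes one at a time, gating each step on cluster health."""
    upgraded: List[str] = []
    for node in nodes:
        # Never start a node while the cluster is degraded or rebalancing.
        if not wait_until_healthy(cluster_healthy, max_polls,
                                  poll_interval_s, sleep):
            return UpgradeResult(upgraded, halted_before=node,
                                 healthy_at_end=False)
        upgrade_node(node)  # placeholder for the real per-node upgrade step
        upgraded.append(node)
    # Final gate after the last node has been upgraded.
    ok = wait_until_healthy(cluster_healthy, max_polls, poll_interval_s, sleep)
    return UpgradeResult(upgraded, halted_before=None, healthy_at_end=ok)
```

The point of the sketch is the halt condition: if the health check does not pass within the polling window, no further node is touched and a human decides what happens next. What we need to learn is whether NSX-T can be made to behave this way on its own or whether we have to drive it node by node.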
Given the above, we'd appreciate guidance on:
Best practices for upgrading NSX-T in a HyperFlex environment, taking storage rebalancing into account.
Problems we are likely to run into because NSX-T and HyperFlex share the same nodes.
Tips or tools for monitoring the health and connectivity of both NSX-T and HyperFlex during the upgrade.
Options within NSX-T to intervene manually, or to synchronize the upgrade with HyperFlex health checks.
Since this is the first upgrade of this environment, any guidance that helps it go smoothly would be much appreciated.