VMware vSphere


After upgrade to ESXi 5 HA errors

  • 1.  After upgrade to ESXi 5 HA errors

    Posted Mar 02, 2012 08:42 AM

    Hello,

    I recently upgraded the servers to ESXi 5 and am now receiving HA errors. This seems to primarily happen when I put a host into maintenance mode. I get either:

    vSphere HA detected that this host is in a different network partition than the master to which vCenter Server is connected

    or

    The vSphere HA agent on the host is alive and has management network connectivity, but the management network has been partitioned. This state is reported by a vSphere HA master agent that is in a partition other than the one containing the host.

    All the Management Console IPs are on the same subnet. The only thing I could think of is that vMotion is on the same subnet. Could this be an issue? If not, what have I done wrong?

    Thank you.



  • 2.  RE: After upgrade to ESXi 5 HA errors

    Posted Mar 02, 2012 08:50 AM

    Hi,

    What error are you getting on the screen?

    Can you share a screenshot of the error or the VMware logs?



  • 3.  RE: After upgrade to ESXi 5 HA errors

    Posted Mar 02, 2012 08:56 AM

    Hello,

    One of the errors I get is attached. I will put the other one up when it comes up again. But it literally was just: vSphere HA detected that this host is in a different network partition than the master to which vCenter Server is connected.

    Any ideas?

    Thank you.



  • 4.  RE: After upgrade to ESXi 5 HA errors

    Posted Mar 02, 2012 09:18 AM

    Hi,

    I am not sure, but it might be this problem:

    vSphere HA Agent is in the Network Partitioned State

    The vSphere HA agent on a host is in the Network Partitioned state. User intervention might be required to resolve this situation.

    Problem

    While the virtual machines running on the host continue to be monitored by the master hosts that are responsible for them, vSphere HA's ability to restart the virtual machines after a failure is affected. First, each master host has access to a subset of the hosts, so less failover capacity is available to each host. Second, vSphere HA might be unable to restart a Secondary VM after a failure (see Primary VM Remains in the Need Secondary State).

    Cause

    A host is reported as partitioned if both of the following conditions are met:

    The vSphere HA master host to which vCenter Server is connected is unable to communicate with the host by using the management network, but is able to communicate with that host by using the heartbeat datastores that have been selected for it.

    The host is not isolated.

    A network partition can occur for a number of reasons, including incorrect VLAN tagging, the failure of a physical NIC or switch, configuring a cluster with some hosts that use only IPv4 and others that use only IPv6, or moving the management networks for some hosts to a different virtual switch without first putting those hosts into maintenance mode.
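
    The two conditions above can be sketched as a small decision function. This is only an illustration of the documented rule, not the actual vSphere HA agent logic; the class, field, and state names are made up for the example.

```python
from dataclasses import dataclass


@dataclass
class HostView:
    """What the master (the one vCenter Server talks to) observes about a host.

    All field names are hypothetical, chosen only for this illustration.
    """
    reachable_on_mgmt_network: bool     # master can talk to the host over the management network
    seen_via_heartbeat_datastore: bool  # master still sees the host's datastore heartbeats
    isolated: bool                      # the host has declared itself isolated


def classify(view: HostView) -> str:
    """Return a simplified state label for the host."""
    if view.reachable_on_mgmt_network:
        return "connected"
    if view.seen_via_heartbeat_datastore:
        # Alive according to the datastore heartbeat, but unreachable on the network.
        return "isolated" if view.isolated else "partitioned"
    # No network contact and no datastore heartbeat.
    return "unreachable"


if __name__ == "__main__":
    # Management network down, datastore heartbeat present, host not isolated.
    print(classify(HostView(False, True, False)))
```

    In this simplified model a host only counts as partitioned when the management network check fails while the heartbeat datastore check still succeeds, which is why the fix is always on the network side.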

    Solution

    Resolve the networking problem that prevents the hosts from communicating by using the management networks.



  • 5.  RE: After upgrade to ESXi 5 HA errors

    Posted Mar 02, 2012 09:20 AM


  • 6.  RE: After upgrade to ESXi 5 HA errors

    Posted Mar 02, 2012 10:06 AM

    Hello

    Thank you for looking at this for me. I have an error that appears on the migrate task also, in case it helps. It happens maybe 1 time in 10, and only when you put the server into maintenance mode.

    I understand the links you sent but would this explain why it is only an issue when the host is in maintenance mode? The rest of the time they are fine and can ping each other happily.



  • 7.  RE: After upgrade to ESXi 5 HA errors

    Broadcom Employee
    Posted Mar 05, 2012 08:50 AM

    When the host goes into maintenance mode this should not happen, as the HA agent should be disabled. Can I ask you to do a log dump on a host where this happens and file a Support Request?

    Thanks,



  • 8.  RE: After upgrade to ESXi 5 HA errors

    Posted May 16, 2012 11:55 AM

    Came across this issue this morning at a large client. They had recently upgraded to vCenter 5.0 U1 while the hosts remain at ESXi 4.1 U1. After reviewing the configurations to ensure nothing had changed, we decided to try right-clicking on the single host with the issue and selecting "Reconfigure for vSphere HA." It resolved the issue for us today.



  • 9.  RE: After upgrade to ESXi 5 HA errors

    Posted May 29, 2012 05:08 PM

    I have recently run into this error on a 5.0 cluster with two hosts. The symptom is a recurring red warning indicator with this exact error message, and occasionally the host will disconnect. The issue was resolved after fixing a trunk configuration where one of the uplinks in the vSwitch was not part of an etherchannel trunk. I removed the uplink and it started working. Most likely the reason for the up/down symptom is that I am using IP hash on the trunk. These are the sorts of errors that present when networking is not properly configured. When using IP hash, it is very important that all of the uplinks are on the same switch-side trunk and active in the vSwitch. It is also important never to add a link to the active uplink list that is not on the switch-side trunk. The best approach is to use a separate vSwitch (distributed or standard) for each trunk, or for a single uplink if that uplink is not in the trunk and is intended to be a backup link.

    Hope this helps someone with the same issue. In short: it's probably not HA, it's the network config.

    Stan
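
    To show why a stray uplink gives an intermittent rather than a constant failure, here is a simplified model of IP-hash uplink selection. It assumes the commonly described scheme (XOR of source and destination IPv4 addresses, modulo the number of active uplinks); the addresses and vmnic names are made up, and real behavior may differ in detail.

```python
import ipaddress


def ip_hash_uplink(src: str, dst: str, uplink_count: int) -> int:
    """Pick an uplink index from a source/destination IPv4 pair.

    Simplified model: XOR the two addresses as 32-bit integers and take
    the result modulo the number of active uplinks.
    """
    s = int(ipaddress.IPv4Address(src))
    d = int(ipaddress.IPv4Address(dst))
    return (s ^ d) % uplink_count


if __name__ == "__main__":
    # Hypothetical setup: vmnic2 is active in the vSwitch but is NOT a
    # member of the switch-side etherchannel.
    uplinks = ["vmnic0", "vmnic1", "vmnic2"]
    host = "10.0.0.11"
    for peer in ["10.0.0.12", "10.0.0.13", "10.0.0.14"]:
        nic = uplinks[ip_hash_uplink(host, peer, len(uplinks))]
        print(f"{host} -> {peer} uses {nic}")
```

    In this model only the conversations that hash to the uplink outside the trunk are affected, so some peers are reachable while others are not, which matches the kind of partial connectivity HA reports as a partition.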



  • 10.  RE: After upgrade to ESXi 5 HA errors

    Posted May 29, 2012 05:56 PM

    True. If you have everything in the same subnet, then I don't think there is any point in trunking. Can you change the port to access mode and check? This is now a known problem faced by many.



  • 11.  RE: After upgrade to ESXi 5 HA errors

    Posted Feb 09, 2013 12:26 AM

    After moving management to its own VLAN, I also had the message about "not being in the same network partition as the Master," even though the host actually was. All I did was right-click on the host with the alarm and click on "Reconfigure for vSphere HA," and that fixed it.

    I know this issue was probably settled a while ago, but I thought this might help someone else with a similar issue down the road.

    Dan M