Brocade Fibre Channel Networking Community


HAM-1004 Processor reboot - trying to determine root cause

  • 1.  HAM-1004 Processor reboot - trying to determine root cause

    Posted 04-18-2017 01:00 AM

    Hi,

        I've got a customer with a fairly old SAN setup and a couple of HP-branded Silkworm 300 switches. They are running FOS 6.4.2b and have been plodding along with no issues until recently. In the last couple of months one of them has begun to reboot spontaneously. I've been seeing the following in the errdump output:

     

    2017/04/13-22:36:56, [HAM-1004], 1653, CHASSIS, INFO, Brocade300, Processor rebooted - Reset

     

    This has occurred about 4 times in the last 8 weeks.
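One rough way to see what led up to each reboot is to pull the HAM-1004 records out of a saved errdump and list the message IDs logged just before them. The Python sketch below assumes every record uses the same comma-separated layout as the line quoted above. The function names and the five-record window are illustrative choices, not part of Fabric OS.

```python
import re
from datetime import datetime

# Layout assumed from the single errdump line quoted in the post:
# timestamp, [MSG-ID], sequence, flags, severity, switch name, message text.
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}/\d{2}/\d{2}-\d{2}:\d{2}:\d{2}), "
    r"\[(?P<msg_id>[A-Z0-9]+-\d+)\], "
    r"(?P<seq>\d+), (?P<flags>[^,]*), (?P<severity>[A-Z]+), "
    r"(?P<switch>[^,]+), (?P<text>.*)$"
)


def parse_errdump(lines):
    """Return a list of dicts, one per line that matches the assumed layout."""
    records = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            record = match.groupdict()
            record["ts"] = datetime.strptime(record["ts"], "%Y/%m/%d-%H:%M:%S")
            records.append(record)
    return records


def reboot_context(lines, msg_id="HAM-1004", window=5):
    """For each reboot record, return (timestamp, text, preceding message IDs)."""
    records = parse_errdump(lines)
    found = []
    for i, record in enumerate(records):
        if record["msg_id"] == msg_id:
            before = [r["msg_id"] for r in records[max(0, i - window):i]]
            found.append((record["ts"], record["text"], before))
    return found


sample = [
    "2017/04/13-22:36:56, [HAM-1004], 1653, CHASSIS, INFO, Brocade300, "
    "Processor rebooted - Reset",
]
for ts, text, before in reboot_context(sample):
    print(ts, text, before)
```

Run against a full errdump, the list of preceding message IDs gives a starting point for spotting anything that repeats before each reset. If the list is always empty or unrelated, the log is unlikely to explain the reboot on its own.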

     

    I've looked through a supportsave for any alerts, as I was wondering whether there were thermal issues, but all the switch components are reporting nominal temperatures. I'm looking for advice on how to work back and determine the root cause of the reboots if it's not a physical issue.


    I guess it could be a firmware issue; I am aware this is old firmware. If anyone has any pointers about where to look I'd be grateful. These switches are obviously out of warranty now, but we do have hardware support from a 3rd party supplier, so I could get the switch swapped out and the licence transferred. Or I could update the firmware.

     

    Advice welcome

     

    thanks

    Adam


    #software
    #BrocadeFibreChannelNetworkingCommunity
    #reboot


  • 2.  Re: HAM-1004 Processor reboot - trying to determine root cause

    Posted 04-18-2017 02:13 AM

    Hello,

     

    It is very difficult to determine the cause from this message alone. The general recommendation in this kind of case is to upgrade the switch to the latest available firmware level, to rule out firmware as a potential cause, since the current firmware is old.

     


    #BrocadeFibreChannelNetworkingCommunity


  • 3.  Re: HAM-1004 Processor reboot - trying to determine root cause

    Posted 04-18-2017 02:37 AM

    Make sure you have a serial console connection and that you log its output, and that a syslog server is configured and set up. Also check for power loss. Otherwise I concur with Thierry: update to the latest supported firmware, following the target path.
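If no syslog server is available, a throwaway collector can be put together in a few lines of Python. This is only a sketch: the file name `switch-syslog.log` is a placeholder, the handler appends whatever arrives over UDP without parsing it, and the switch still has to be pointed at the collector through its own Fabric OS syslog settings.

```python
import socketserver
from datetime import datetime

LOG_PATH = "switch-syslog.log"  # placeholder output file


def format_entry(payload: bytes, source_ip: str, received: datetime) -> str:
    """Turn one raw syslog datagram into a single line of text."""
    text = payload.decode("utf-8", errors="replace").rstrip("\r\n\x00")
    return f"{received.isoformat(timespec='seconds')} {source_ip} {text}"


class SyslogHandler(socketserver.BaseRequestHandler):
    def handle(self):
        # For UDP servers, self.request is a (data, socket) pair.
        payload, _sock = self.request
        line = format_entry(payload, self.client_address[0], datetime.now())
        with open(LOG_PATH, "a", encoding="utf-8") as handle:
            handle.write(line + "\n")


def serve(host: str = "0.0.0.0", port: int = 514) -> None:
    """Listen for syslog datagrams until interrupted."""
    with socketserver.UDPServer((host, port), SyslogHandler) as server:
        server.serve_forever()
```

Calling `serve()` starts the listener. Binding to port 514 normally needs elevated privileges. Because each line is stamped with the collector's own clock, the last messages before a reset survive even when the switch loses its in-memory state.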


    #BrocadeFibreChannelNetworkingCommunity


  • 4.  Re: HAM-1004 Processor reboot - trying to determine root cause

    Posted 04-18-2017 03:36 AM

    thanks guys, I'll see if I can work out the upgrade path to a more recent FOS 7.x release.

     

    Hopefully this will resolve the issues.

     

    thank you.

     

    Addendum: according to this (http://community.brocade.com/t5/Fibre-Channel-SAN/Brocade-Fabric-OS-Target-Path-Technical-Brief/ta-p/63946) and the latest target path selection guide, I should be able to go from 6.4.2b via the following route:

     

    FOS v6.4.2a → FOS v7.0.2c/d/e → FOS v7.1.1a/b/c* → FOS 7.2.1g → FOS 7.3.1d/e → FOS 7.4.1e

     

    I assume this would be a non-disruptive update?

     

    Are there any steps I can skip if I am able to tolerate a reboot?

     

    thanks

     

     

     


    #BrocadeFibreChannelNetworkingCommunity


  • 5.  Re: HAM-1004 Processor reboot - trying to determine root cause

    Posted 04-18-2017 04:27 AM

    Correct. The path below would be non-disruptive:

     

    FOS v6.4.2a → FOS v7.0.2c/d/e → FOS v7.1.1a/b/c* → FOS 7.2.1g → FOS 7.3.1d/e → FOS 7.4.1e

     

    The latest Target Path, which was released at the end of last week, is at

     

    https://www.brocade.com/content/dam/common/documents/content-types/target-path-selection-guide/brocade-fos-target-path.pdf

     

    Note the following from the FOS 7.4.1e release notes, under "Migrating from FOS v7.2":

    Any 8G or 16G platform operating at FOS v7.2.x must be upgraded to FOS v7.3.x before upgrading to FOS v7.4.1e.

    • Disruptive upgrade to FOS v7.4.1e from FOS v7.2 is not supported

     

    If you can tolerate disruptive upgrades, you can skip a release in a single step, for example the 7.2 upgrade:

     

    FOS v6.4.2a → FOS v7.0.2c/d/e → FOS v7.1.1a/b/c*  → FOS 7.3.1d/e → FOS 7.4.1e

     

    The 7.3.1e release notes say:

     

    Disruptive upgrade to FOS v7.3.1e from FOS v7.1 is supported.
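The rules quoted in this reply can be written down as a small lookup, which makes it easier to sanity-check a planned path before starting. The Python sketch below encodes only what this thread states: adjacent release families are treated as non-disruptive steps, 7.1 to 7.3 as a supported disruptive step, and 7.2 to 7.4 as unsupported. Everything else comes back as `unknown` and needs checking against the release notes. The function names are illustrative.

```python
# Release families in the order they appear in the paths quoted in this thread.
FAMILIES = ["6.4", "7.0", "7.1", "7.2", "7.3", "7.4"]

# Taken from the release-note excerpts quoted above; nothing else is assumed.
DISRUPTIVE_OK = {("7.1", "7.3")}
UNSUPPORTED = {("7.2", "7.4")}


def family(version: str) -> str:
    """Reduce 'v7.3.1e' or '7.1.1a/b/c*' to its release family, e.g. '7.3'."""
    major, minor = version.lower().lstrip("v").split(".")[:2]
    return f"{major}.{minor}"


def classify_step(src: str, dst: str) -> str:
    """Classify one upgrade step using only the rules stated in the thread."""
    src_f, dst_f = family(src), family(dst)
    if (src_f, dst_f) in UNSUPPORTED:
        return "unsupported"
    if (src_f, dst_f) in DISRUPTIVE_OK:
        return "disruptive"
    # Raises ValueError for a family that is not listed in FAMILIES.
    if FAMILIES.index(dst_f) - FAMILIES.index(src_f) == 1:
        return "non-disruptive"
    return "unknown"  # not covered here: check the release notes


def classify_path(versions):
    """Classify every consecutive pair in a planned upgrade path."""
    return [(a, b, classify_step(a, b)) for a, b in zip(versions, versions[1:])]


shortened = ["v6.4.2a", "v7.0.2c", "v7.1.1a", "7.3.1e", "7.4.1e"]
for src, dst, verdict in classify_path(shortened):
    print(f"{src} -> {dst}: {verdict}")
```

For the shortened path in this reply, the only step flagged as disruptive is the jump from 7.1 to 7.3, which matches the release-note wording quoted above.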

     


    #BrocadeFibreChannelNetworkingCommunity


  • 6.  Re: HAM-1004 Processor reboot - trying to determine root cause

    Posted 04-24-2017 01:00 AM

    thanks for all the info. I've got one more question. I'm struggling to locate FOS 7.1.1x anywhere on the hpe.com web sites...

     

    Would it be ok to go the following route?

     

    FOS v6.4.2b → FOS v7.0.2c/d/e → FOS v7.1.2b* → FOS 7.2.1g → FOS 7.3.1d/e → FOS 7.4.1e

     

     

    thanks

     

     


    #BrocadeFibreChannelNetworkingCommunity


  • 7.  Re: HAM-1004 Processor reboot - trying to determine root cause
    Best Answer

    Posted 04-24-2017 01:12 AM

    Hello,

     

    Yes, that will be fine.


    #BrocadeFibreChannelNetworkingCommunity


  • 8.  Re: HAM-1004 Processor reboot - trying to determine root cause

    Posted 05-05-2017 12:49 AM

    thanks for all the advice, guys. I was able to successfully uplift both switches to 7.4.1e. Sadly, though, this does not seem to have fixed my potential hardware issues on one of the switches.

     

    I now suddenly have multiple faults as of this morning.

     

    Index Port Address Media Speed State     Proto
    ==================================================
        0    0 0a1700  id    8G    Online    FC F-Port 50:06:0e:80:10:4d:84:e0
        1    1 0a1500  id    8G    No_Sync   FC
        2    2 0a1300  id    N8    Hard_Flt  FC
        3    3 0a1100  id    N8    Online    FC F-Port 10:00:00:05:1e:fb:42:19
        4    4 0a1600  id    N8    Hard_Flt  FC
        5    5 0a1400  id    N8    Online    FC F-Port 10:00:00:05:1e:fb:3f:8c
        6    6 0a1200  id    N8    Hard_Flt  FC
        7    7 0a1000  id    N8    Online    FC F-Port 50:01:43:80:03:30:34:c2
        8    8 0a0f00  id    N8    Online    FC F-Port 10:00:00:05:1e:fb:35:a4
        9    9 0a0d00  id    4G    Online    FC L-Port 1 public
       10   10 0a0b00  id    N4    In_Sync   FC
       11   11 0a0900  id    N8    In_Sync   FC
       12   12 0a0e00  id    N4    Online    FC F-Port 21:78:00:c0:ff:d7:21:22
       13   13 0a0c00  id    N8    Online    FC F-Port 50:01:43:80:02:51:93:7c
       14   14 0a0a00  id    N8    Online    FC F-Port 50:01:43:80:02:51:94:50
       15   15 0a0800  id    N8    In_Sync   FC
       16   16 0a0700  id    N8    Online    FC L-Port
       17   17 0a0500  id    N8    In_Sync   FC
       18   18 0a0300  id    N8    Online    FC F-Port 10:00:8c:7c:ff:21:07:fa
       19   19 0a0100  id    N8    Online    FC F-Port 10:00:8c:7c:ff:20:eb:6e
       20   20 0a0600  id    N8    Online    FC F-Port 10:00:8c:7c:ff:21:07:40
       21   21 0a0400  id    N8    No_Light  FC
       22   22 0a0200  --    N8    No_Module FC

     

    All ports 0-20 were fine post update but this switch had had a few spontaneous reboots prior to the firmware update. I am guessing these all now point toward hardware issues?
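To keep track of which ports are unhealthy across repeated captures, the port table can be grouped by state with a short script. The Python sketch below assumes the whitespace-separated column order shown in the listing above (index, port, address, media, speed, state, protocol, then optional detail). The function names are illustrative, and treating only `Online` as healthy is an assumption.

```python
from collections import defaultdict


def parse_port_table(text):
    """Parse port rows; header, separator and blank lines are skipped."""
    ports = []
    for line in text.splitlines():
        parts = line.split(None, 7)
        if len(parts) < 7 or not (parts[0].isdigit() and parts[1].isdigit()):
            continue
        ports.append({
            "index": int(parts[0]),
            "port": int(parts[1]),
            "address": parts[2],
            "media": parts[3],
            "speed": parts[4],
            "state": parts[5],
            "proto": parts[6],
            "detail": parts[7] if len(parts) > 7 else "",
        })
    return ports


def unhealthy(ports, ok_states=("Online",)):
    """Group port numbers by state, leaving out states considered healthy."""
    by_state = defaultdict(list)
    for entry in ports:
        if entry["state"] not in ok_states:
            by_state[entry["state"]].append(entry["port"])
    return dict(by_state)


sample = """\
Index Port Address Media Speed State Proto
==================================================
0 0 0a1700 id 8G Online FC F-Port 50:06:0e:80:10:4d:84:e0
1 1 0a1500 id 8G No_Sync FC
2 2 0a1300 id N8 Hard_Flt FC
4 4 0a1600 id N8 Hard_Flt FC
"""
print(unhealthy(parse_port_table(sample)))
```

Comparing the result from one capture to the next shows whether the set of faulted ports is growing, which would point more strongly at the switch than at any single attached device.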

     

    The fabriclog appears to show some port flapping, but these devices were fine previously, so I am wondering if it's actually the switch that is at fault.

     

    08:46:28.534534 *Removing all nodes from port D0,P0 D0,P0 13 NA
    08:46:28.709195 SCN LR_PORT(0);g=0x5df0 D0,P0 D0,P0 13 NA
    08:46:28.709253 SCN Port Online; g=0x5df0,isolated=0 D0,P0 D0,P1 13 NA
    08:46:28.709423 Port Elp engaged D0,P1 D0,P0 13 NA
    08:46:28.709500 *Removing all nodes from port D0,P0 D0,P0 13 NA
    08:46:28.709643 SCN Port F_PORT D0,P1 D0,P0 13 NA
    08:46:30.284077 SCN Port Offline;g=0x5df2 D0,P0 D0,P0 15 NA
    08:46:30.284096 *Removing all nodes from port D0,P0 D0,P0 15 NA
    08:46:30.898566 *Removing all nodes from port D0,P0 D0,P0 5 NA
    08:46:30.898708 SCN Port F_PORT D0,P0 D0,P0 5 NA
    08:46:31.333173 SCN Port Offline;g=0x5df4 D0,P0 D0,P0 11 NA
    08:46:31.333193 *Removing all nodes from port D0,P0 D0,P0 11 NA
    08:46:32.593991 SCN Port Offline;g=0x5df6 D0,P0 D0,P0 13 NA
    08:46:32.594009 *Removing all nodes from port D0,P0 D0,P0 13 NA
    08:46:32.845692 SCN LR_PORT(0);g=0x5df6 D0,P0 D0,P0 13 NA
    08:46:32.845741 SCN Port Online; g=0x5df6,isolated=0 D0,P0 D0,P1 13 NA
    08:46:32.845913 Port Elp engaged D0,P1 D0,P0 13 NA
    08:46:32.845993 *Removing all nodes from port D0,P0 D0,P0 13 NA
    08:46:32.846137 SCN Port F_PORT D0,P1 D0,P0 13 NA
    08:46:33.282719 SCN Port Offline;g=0x5df8 D0,P0 D0,P0 15 NA
    08:46:33.282737 *Removing all nodes from port D0,P0 D0,P0 15 NA
    08:46:34.334477 SCN Port Offline;g=0x5dfa D0,P0 D0,P0 11 NA
    08:46:34.334498 *Removing all nodes from port D0,P0 D0,P0 11 NA
    08:46:35.857793 SCN Port Offline;g=0x5dfc D0,P0 D0,P0 14 NA
    08:46:35.857812 *Removing all nodes from port D0,P0 D0,P0 14 NA
    08:46:36.111614 SCN LR_PORT(0);g=0x5dfc D0,P0 D0,P0 14 NA
    08:46:36.111661 SCN Port Online; g=0x5dfc,isolated=0 D0,P0 D0,P1 14 NA
    08:46:36.111835 Port Elp engaged D0,P1 D0,P0 14 NA
    08:46:36.111915 *Removing all nodes from port D0,P0 D0,P0 14 NA
    08:46:36.112060 SCN Port F_PORT D0,P1 D0,P0 14 NA
    08:46:36.290822 SCN Port Offline;g=0x5dfe D0,P0 D0,P0 15 NA
    08:46:36.290842 *Removing all nodes from port D0,P0 D0,P0 15 NA
    08:46:36.653851 SCN Port Offline;g=0x5e00 D0,P0 D0,P0 13 NA
    08:46:36.653872 *Removing all nodes from port D0,P0 D0,P0 13 NA
    08:46:36.868517 SCN LR_PORT(0);g=0x5e00 D0,P0 D0,P0 13 NA
    08:46:36.868572 SCN Port Online; g=0x5e00,isolated=0 D0,P0 D0,P1 13 NA
    08:46:36.868744 Port Elp engaged D0,P1 D0,P0 13 NA
    08:46:36.868823 *Removing all nodes from port D0,P0 D0,P0 13 NA
    08:46:36.868966 SCN Port F_PORT D0,P1 D0,P0 13 NA
    08:46:37.335304 SCN Port Offline;g=0x5e02 D0,P0 D0,P0 11 NA
    08:46:37.335323 *Removing all nodes from port D0,P0 D0,P0 11 NA
    08:46:39.290901 SCN Port Offline;g=0x5e04 D0,P0 D0,P0 15 NA
    08:46:39.290920 *Removing all nodes from port D0,P0 D0,P0 15 NA
    08:46:40.342782 SCN Port Offline;g=0x5e06 D0,P0 D0,P0 11 NA
    08:46:40.342801 *Removing all nodes from port D0,P0 D0,P0 11 NA
    08:46:40.712943 SCN Port Offline;g=0x5e08 D0,P0 D0,P0 13 NA
    08:46:40.712962 *Removing all nodes from port D0,P0 D0,P0 13 NA
    08:46:40.929195 SCN LR_PORT(0);g=0x5e08 D0,P0 D0,P0 13 NA
    08:46:40.929251 SCN Port Online; g=0x5e08,isolated=0 D0,P0 D0,P1 13 NA
    08:46:40.929420 Port Elp engaged D0,P1 D0,P0 13 NA
    08:46:40.929499 *Removing all nodes from port D0,P0 D0,P0 13 NA
    08:46:40.929643 SCN Port F_PORT D0,P1 D0,P0 13 NA
    08:46:42.293870 SCN Port Offline;g=0x5e0a D0,P0 D0,P0 15 NA
    08:46:42.293890 *Removing all nodes from port D0,P0 D0,P0 15 NA
    08:46:42.489625 *Removing all nodes from port D0,P0 D0,P0 3 NA
    08:46:42.489772 SCN Port F_PORT D0,P0 D0,P0 3 NA
    08:46:43.347249 SCN Port Offline;g=0x5e0c D0,P0 D0,P0 11 NA
    08:46:43.347268 *Removing all nodes from port D0,P0 D0,P0 11 NA
    08:46:43.986892 *Removing all nodes from port D0,P0 D0,P0 19 NA
    08:46:43.988040 SCN Port F_PORT D0,P0 D0,P0 19 NA
    08:46:44.211151 *Removing all nodes from port D0,P0 D0,P0 18 NA
    08:46:44.211298 SCN Port F_PORT D0,P0 D0,P0 18 NA
    08:46:44.771924 SCN Port Offline;g=0x5e0e D0,P0 D0,P0 13 NA
    08:46:44.771943 *Removing all nodes from port D0,P0 D0,P0 13 NA
    08:46:45.119880 SCN LR_PORT(0);g=0x5e0e D0,P0 D0,P0 13 NA
    08:46:45.119929 SCN Port Online; g=0x5e0e,isolated=0 D0,P0 D0,P1 13 NA
    08:46:45.120102 Port Elp engaged D0,P1 D0,P0 13 NA
    08:46:45.120179 *Removing all nodes from port D0,P0 D0,P0 13 NA
    08:46:45.120324 SCN Port F_PORT D0,P1 D0,P0 13 NA
    08:46:45.293326 SCN Port Offline;g=0x5e10 D0,P0 D0,P0 15 NA
    08:46:45.293346 *Removing all nodes from port D0,P0 D0,P0 15 NA
    08:46:46.347961 SCN Port Offline;g=0x5e12 D0,P0 D0,P0 11 NA
    08:46:46.347980 *Removing all nodes from port D0,P0 D0,P0 11 NA
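A quick tally of offline events per port makes the flapping pattern in a log like this easier to read. The Python sketch below assumes the second-to-last column of each fabriclog line is the port number, which is how the excerpt above appears to be laid out. The function name is illustrative.

```python
from collections import Counter


def offline_counts(text):
    """Count 'SCN Port Offline' events per port in fabriclog-style text."""
    counts = Counter()
    for line in text.splitlines():
        tokens = line.split()
        if len(tokens) < 6 or "SCN Port Offline" not in line:
            continue
        port = tokens[-2]  # assumed position of the port column
        if port.isdigit():
            counts[int(port)] += 1
    return counts


sample = """\
08:46:30.284077 SCN Port Offline;g=0x5df2 D0,P0 D0,P0 15 NA
08:46:30.284096 *Removing all nodes from port D0,P0 D0,P0 15 NA
08:46:31.333173 SCN Port Offline;g=0x5df4 D0,P0 D0,P0 11 NA
08:46:33.282719 SCN Port Offline;g=0x5df8 D0,P0 D0,P0 15 NA
"""
for port, count in offline_counts(sample).most_common():
    print(f"port {port}: {count} offline events")
```

Applied to the full excerpt above, this tally comes to six offline events each for ports 11 and 15, four for port 13 and one for port 14, all within roughly 18 seconds.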

     

     

    If I want to replace the chassis, I assume I'll need to arrange a transfer of the existing licences, which I am guessing will be a vendor-specific process (in my case I believe these are HP-branded switches).

     

    So I am guessing you'll say replace the switch but thought it was worth asking.

     

    thanks

    Adam.

     

     


    #BrocadeFibreChannelNetworkingCommunity