Brocade Fibre Channel Networking Community


DCX CPU usage 100%

  • 1.  DCX CPU usage 100%

    Posted 08-22-2013 11:45 PM

    Hello All,

    I have 2 fabrics (A and B), each consisting of one DCX as the core and two 12K units as edge switches. The switches are registered in DCFM. The DCX runs FOS v6.3.2d.

    Yesterday I ran sysmonitor --show cpu on the DCX in Fabric A to check the current CPU utilization (this was the first time I had run this command). The result was not what I expected: CPU usage is at 100%. This is the output I captured:

    admin> sysmonitor --show cpu

    Showing Cpu Usage:
        Cpu Usage            : 100%
        Cpu Usage limit      : 75%
        Number of Retries    : 3
        Polling Interval     : 120 seconds
        Actions              : none
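    For anyone who wants to compare this reading across switches or over time, here is a minimal Python sketch that pulls the two percentages out of saved sysmonitor --show cpu output. The function name parse_sysmonitor_cpu and the SAMPLE_SYSMONITOR text are illustrative assumptions, not part of FOS, and the patterns assume the output layout shown above.

```python
import re

# Assumption: saved output follows the layout captured in this post.
SAMPLE_SYSMONITOR = """Showing Cpu Usage:
    Cpu Usage            : 100%
    Cpu Usage limit      : 75%
    Number of Retries    : 3
    Polling Interval     : 120 seconds
    Actions              : none
"""


def parse_sysmonitor_cpu(text):
    """Return (usage_percent, limit_percent) parsed from the saved output."""
    usage = re.search(r"^[ \t]*Cpu Usage[ \t]*:[ \t]*(\d+)%", text, re.MULTILINE)
    limit = re.search(r"^[ \t]*Cpu Usage limit[ \t]*:[ \t]*(\d+)%", text, re.MULTILINE)
    if usage is None or limit is None:
        raise ValueError("unexpected sysmonitor output layout")
    return int(usage.group(1)), int(limit.group(1))


if __name__ == "__main__":
    usage, limit = parse_sysmonitor_cpu(SAMPLE_SYSMONITOR)
    print(f"usage={usage}% limit={limit}% over_limit={usage > limit}")
```

    Running the same function on output saved from the Fabric B DCX would give the second data point for comparison.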

    On the DCX in the other fabric, CPU utilization is only 6%.

    There are no error messages in errdump, so I am worried about the state of my DCX. Is this just a FOS defect, or is it the real CPU usage?

    DCX spec:

    20 disk ports, 9 host ports, 67 ports available (96 ports in total)

    Note: this fabric is located at the DRC site, so the I/O load is not large.

    I also captured the output of top as root:

    root> top
    top - 15:18:17 up 112 days, 23:38,  1 user,  load average: 2.82, 2.51, 2.30
    Tasks: 110 total,   3 running, 107 sleeping,   0 stopped,   0 zombie
    Cpu(s): 42.0%us, 57.3%sy,  0.3%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.3%si,  0.0%st
    Mem:   1863188k total,  1140520k used,   722668k free,    33344k buffers
    Swap:        0k total,        0k used,        0k free,   786368k cached

      PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
     3113 root      25   0 28108 4016 3376 R 94.8  0.2  11225:22 tracestore
     4364 root      16   0  139m  84m  80m S  2.3  4.7   2891:40 iswitchd
     2859 root      16   0 72688 4492 3300 S  1.7  0.2   2595:11 emd
     4358 root      33  18 64732 7532 3572 S  1.0  0.4   1165:59 fwd
        1 root      16   0  1696  592  524 S  0.0  0.0   0:31.96 init
        2 root      34  19     0    0    0 S  0.0  0.0   0:03.51 ksoftirqd/0
        3 root      10  -5     0    0    0 S  0.0  0.0   0:00.03 events/0
        4 root      19  -5     0    0    0 S  0.0  0.0   0:00.02 khelper
        5 root      11  -5     0    0    0 S  0.0  0.0   0:00.00 kthread
       29 root      10  -5     0    0    0 S  0.0  0.0   0:00.17 kblockd/0
       62 root      20   0     0    0    0 S  0.0  0.0   0:00.00 pdflush
       63 root      15   0     0    0    0 S  0.0  0.0   0:00.51 pdflush
       65 root      20  -5     0    0    0 S  0.0  0.0   0:00.00 aio/0
       64 root      25   0     0    0    0 S  0.0  0.0   0:00.00 kswapd0
      756 root      15   0     0    0    0 S  0.0  0.0   0:07.14 kjournald
      774 root      RT   0  1676  400  336 S  0.0  0.0   0:00.02 wdtd
      835 root      15   0     0    0    0 S  0.0  0.0   0:02.36 kjournald
      991 root      11  -5     0    0    0 S  0.0  0.0   0:00.00 eth2/0
     1002 root      10  -5     0    0    0 S  0.0  0.0   0:00.00 eth1/0
     1023 root      10  -5     0    0    0 S  0.0  0.0   0:00.01 eth0/0
     1025 root      10  -5     0    0    0 S  0.0  0.0   0:00.00 eth3/0
     1038 bin       16   0  1688  428  336 S  0.0  0.0   0:00.33 portmap
     1058 root      16   0  2116  652  508 S  0.0  0.0   0:00.02 inetd
     1063 root      15   0     0    0    0 S  0.0  0.0   0:00.00 nfsd
     1064 root      15   0     0    0    0 S  0.0  0.0   0:00.00 nfsd
     1065 root      15   0     0    0    0 S  0.0  0.0   0:00.00 nfsd
     1066 root      23   0     0    0    0 S  0.0  0.0   0:00.00 lockd
     1067 root      11  -5     0    0    0 S  0.0  0.0   0:00.00 rpciod/0
     1068 root      15   0     0    0    0 S  0.0  0.0   0:00.01 nfsd
     1070 root      16   0  2336  556  420 S  0.0  0.0   0:00.01 rpc.mountd
     1084 root      25   0  2552 1088  916 S  0.0  0.1   0:55.60 kmsghandler
     1098 root      16   0  1700  376  304 S  0.0  0.0   0:11.76 klogd
     1099 root      15   0  1808  620  528 S  0.0  0.0   0:04.04 crond
     1106 root      16   0  1944  680  532 S  0.0  0.0   0:07.81 syslogd
     1128 root      15   0     0    0    0 S  0.0  0.0   0:00.11 RASLOGK_TH
     1926 root      11  -5     0    0    0 S  0.0  0.0   0:00.00 kwt_nb_thread
     2200 root      19   0     0    0    0 S  0.0  0.0   0:00.00 module-182-th
     2208 root      15   0     0    0    0 S  0.0  0.0  10:53.57 module-99-th
     2230 root      19   0     0    0    0 S  0.0  0.0   0:00.01 module-107-th
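
    To pick the heaviest process out of a saved top capture without reading it by eye, something like the following Python sketch could be used. The function name busiest_process and the shortened SAMPLE_TOP text are illustrative assumptions; the column positions assume the 12-column process table shown above.

```python
# Assumption: a shortened copy of the capture above, saved as plain text.
SAMPLE_TOP = """top - 15:18:17 up 112 days, 23:38,  1 user,  load average: 2.82, 2.51, 2.30
Tasks: 110 total,   3 running, 107 sleeping,   0 stopped,   0 zombie

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 3113 root      25   0 28108 4016 3376 R 94.8  0.2  11225:22 tracestore
 4364 root      16   0  139m  84m  80m S  2.3  4.7   2891:40 iswitchd
 2859 root      16   0 72688 4492 3300 S  1.7  0.2   2595:11 emd
    1 root      16   0  1696  592  524 S  0.0  0.0   0:31.96 init
"""


def busiest_process(top_output):
    """Return (pid, command, cpu_percent) for the row with the highest %CPU."""
    rows = []
    in_table = False
    for line in top_output.splitlines():
        fields = line.split()
        if fields[:2] == ["PID", "USER"]:
            in_table = True  # process rows start after the column header
            continue
        if in_table and len(fields) >= 12 and fields[0].isdigit():
            # Column 9 is %CPU and column 12 is COMMAND in this layout.
            rows.append((float(fields[8]), int(fields[0]), fields[11]))
    if not rows:
        raise ValueError("no process rows found")
    cpu, pid, command = max(rows)
    return pid, command, cpu


if __name__ == "__main__":
    print(busiest_process(SAMPLE_TOP))  # (3113, 'tracestore', 94.8)
```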


    #BrocadeFibreChannelNetworkingCommunity


  • 2.  Re: DCX CPU usage 100%

    Posted 08-23-2013 01:51 AM

    Hi there,

    That looks like DEFECT000363516 to me. It appears to be fixed in FOS 7.0.1.

    Rgds




  • 3.  Re: DCX CPU usage 100%

    Posted 08-23-2013 02:17 AM

    Oh, I see. But is there any option other than upgrading to FOS v7.0.1? The DCX is connected to 12K SAN switches, which are not allowed to connect directly to a DCX running FOS v7.0.




  • 4.  Re: DCX CPU usage 100%

    Posted 08-23-2013 04:49 AM

    You could try failing over to the standby CP.

    Rgds




  • 5.  Re: DCX CPU usage 100%

    Posted 08-23-2013 09:00 AM

    I need to get permission from my customer first. I'll report back later.




  • 6.  Re: DCX CPU usage 100%

    Posted 08-23-2013 07:25 AM

    azakiyy,

    ->but is there any different action beside upgrading to FOS v7.0.1 because the DCX is connected to 12K SAN Switch which prohibited to direct connection with DCX FOS v7.0

    This is correct, but if you want to keep working with the 12K, you can implement FCR on the DCX through Integrated Routing and then upgrade to the latest FOS 7.1.x release.

    Keep in mind that Integrated Routing (IR) requires an optional license.




  • 7.  Re: DCX CPU usage 100%

    Posted 08-23-2013 09:05 AM

    I'm afraid I can't upgrade FOS on the DCX because I have to keep this core-edge topology.

    Is it safe to kill the process consuming the most CPU (PID 3113, tracestore)?

    Is it possible to use the command "kill -9 (PID)"?

    Does anyone know what tracestore stands for?




  • 8.  Re: DCX CPU usage 100%

    Posted 08-24-2013 01:30 AM

    If your active CP is at 100%, try the HA failover as suggested by felipon.

    If the now-passive CP (assuming the failover was successful) is still at 100% CPU, you can also reboot that CP.




  • 9.  Re: DCX CPU usage 100%

    Posted 08-28-2013 09:49 PM

    Thanks, all, for your replies.

    I escalated the problem to support, and they replied with the same recommendations:

    1. Run hafailover on the active CP, then check the CPU utilization.

    2. If the DCX is still at 100% CPU, run hafailover again (this step fixed my problem).

    3. Use kill -9 on PID 3113 (the tracestore process).

    Note: hafailover is disruptive, so do it during a period of low I/O.
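
    The order of the steps above can be written down as a small decision function. This is only a sketch of the sequence support gave me, not an official procedure: the function name next_action is hypothetical, and using the 75% "Cpu Usage limit" from sysmonitor as the threshold is my own assumption.

```python
def next_action(cpu_percent, failovers_done, limit_percent=75):
    """Suggest the next step in the escalation order listed above.

    limit_percent defaults to the 75% limit reported by sysmonitor
    (an assumption; support did not name a threshold).
    """
    if cpu_percent <= limit_percent:
        return "no action needed"
    if failovers_done == 0:
        return "run hafailover, then recheck CPU"
    if failovers_done == 1:
        return "run hafailover again, then recheck CPU"
    return "kill -9 the tracestore PID"


if __name__ == "__main__":
    # In my case the CPU dropped after the second failover.
    for cpu, done in [(100, 0), (100, 1), (6, 2)]:
        print(cpu, done, "->", next_action(cpu, done))
```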






  • 10.  Re: DCX CPU usage 100%

    Posted 08-29-2013 05:55 AM

    Great that support told you almost the same thing.

    However, I don't believe hafailover is disruptive to I/O traffic.




  • 11.  Re: DCX CPU usage 100%

    Posted 09-02-2013 02:13 AM

    Hi there,

    I agree with dion.v.d.c: hafailover is not disruptive (under normal conditions).

    Killing the process could be dangerous. Depending on what the process is doing, it may simply restart, or it may cause the switch to fail over or the CP to reboot.

    Please post the result here once you are done.

    Rgds




  • 12.  Re: DCX CPU usage 100%

    Posted 06-02-2015 07:56 AM

    Hi Everyone,

     

    In my case, the process using the most CPU is snmpconfig.

    Is it safe to kill it? Does anyone know what the consequences would be?

    admin> top
    top - 09:50:52 up 311 days, 18:49,  2 users,  load average: 4.60, 4.45, 4.24
    Tasks: 125 total,   3 running, 121 sleeping,   0 stopped,   1 zombie
    Cpu(s): 77.2%us, 21.9%sy,  0.3%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.7%si,  0.0%st
    Mem:   1863172k total,  1511384k used,   351788k free,    44100k buffers
    Swap:        0k total,        0k used,        0k free,   643696k cached

      PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
    16430 root      25   0 36568 8076 5308 R 43.4  0.4   9677:46 snmpconfig
     1704 root      16   0  140m 7404 4668 S 11.3  0.4  32555:39 emd
     1879 root      15   0  113m  23m 4848 S  2.3  1.3   9503:23 trafd
    19255 root      15   0  126m  18m  13m S  2.3  1.0   9232:59 0.weblinker.fcg
     1833 root      16   0  149m  19m 5472 S  1.3  1.1   4680:48 fspfd
     1881 root      15   0  319m  14m 6960 S  0.7  0.8 238:52.92 zoned
     6592 root      16   0  2440 1200  940 R  0.7  0.1   0:00.25 top
     1849 root      33  18  176m  38m 7140 S  0.3  2.1  10794:51 fwd
     1868 root      15   0  256m  44m  36m S  0.3  2.4 767:16.99 nsd
     1872 root      15   0  358m  39m  25m S  0.3  2.1   1077:54 psd
     4486 root      15   0 28364 7364 5016 S  0.3  0.4   0:01.76 sshd
     4490 root      17   0  5640 2920 1872 S  0.3  0.2   0:01.23 rbash
        1 root      16   0  1696  592  524 S  0.0  0.0   0:07.73 init
        2 root      34  19     0    0    0 S  0.0  0.0   4:32.06 ksoftirqd/0
        3 root      10  -5     0    0    0 S  0.0  0.0   0:07.32 events/0
        4 root      19  -5     0    0    0 S  0.0  0.0   0:00.04 khelper
        5 root      10  -5     0    0    0 S  0.0  0.0   0:00.00 kthread
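
    One rough check before deciding anything: compare a process's accumulated TIME+ with the uptime on the first line of top, to see whether it has been busy since boot or only recently. A minimal Python sketch follows. The function names are hypothetical, and it assumes TIME+ is printed as minutes:seconds (with optional hundredths), which is how procps top normally formats it.

```python
def cpu_minutes(time_plus):
    """Convert a TIME+ value such as '9677:46' or '238:52.92' to minutes.

    Assumption: the field is minutes:seconds[.hundredths].
    """
    minutes, seconds = time_plus.split(":")
    return int(minutes) + float(seconds) / 60


def busy_share(time_plus, uptime_days, uptime_hhmm):
    """Fraction of total uptime that the process has spent on the CPU."""
    hours, mins = uptime_hhmm.split(":")
    uptime_minutes = uptime_days * 24 * 60 + int(hours) * 60 + int(mins)
    return cpu_minutes(time_plus) / uptime_minutes


if __name__ == "__main__":
    # Values taken from the capture above: snmpconfig, up 311 days, 18:49.
    days_of_cpu = cpu_minutes("9677:46") / (24 * 60)
    share = busy_share("9677:46", 311, "18:49")
    print(f"snmpconfig: {days_of_cpu:.1f} days of CPU time, {share:.1%} of uptime")
```

    If the share of uptime is far below the current %CPU, the process has probably only been spinning for a limited period rather than since the last boot.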

     

     

     

