I thought a port goes into a marginal state only if errors are seen on it.
Now I have a set of switches which constantly report marginal ports without any obvious reason.
Port 008 is MARGINAL
But I don't see any errors on it:
stat_wtx 145874323 4-byte words transmitted
stat_wrx 254053827 4-byte words received
stat_ftx 437418 Frames transmitted
stat_frx 636166 Frames received
stat_c2_frx 0 Class 2 frames received
stat_c3_frx 636141 Class 3 frames received
stat_lc_rx 11 Link control frames received
stat_mc_rx 0 Multicast frames received
stat_mc_to 0 Multicast timeouts
stat_mc_tx 0 Multicast frames transmitted
tim_rdy_pri 46 Time R_RDY high priority
tim_txcrd_z 39862 Time TX Credit Zero (2.5Us ticks)
tim_txcrd_z_vc 0- 3: 0 0 39862 0
tim_txcrd_z_vc 4- 7: 0 0 0 0
tim_txcrd_z_vc 8-11: 0 0 0 0
tim_txcrd_z_vc 12-15: 0 0 0 0
er_enc_in 0 Encoding errors inside of frames
er_crc 0 Frames with CRC errors
er_trunc 0 Frames shorter than minimum
er_toolong 0 Frames longer than maximum
er_bad_eof 0 Frames with bad end-of-frame
er_enc_out 0 Encoding error outside of frames
er_bad_os 0 Invalid ordered set
er_rx_c3_timeout 0 Class 3 receive frames discarded due to timeout
er_tx_c3_timeout 0 Class 3 transmit frames discarded due to timeout
er_c3_dest_unreach 0 Class 3 frames discarded due to destination unreachable
er_other_discard 0 Other discards
er_type1_miss 0 frames with FTB type 1 miss
er_type2_miss 0 frames with FTB type 2 miss
er_type6_miss 0 frames with FTB type 6 miss
er_zone_miss 0 frames with hard zoning miss
er_lun_zone_miss 0 frames with LUN zoning miss
er_crc_good_eof 0 Crc error with good eof
er_inv_arb 0 Invalid ARB
open 0 loop_open
transfer 0 loop_transfer
opened 0 FL_Port opened
starve_stop 0 tenancies stopped due to starvation
fl_tenancy 0 number of times FL has the tenancy
nl_tenancy 0 number of times NL has the tenancy
zero_tenancy 0 zero tenancy
portFlags: 0x10004907 PRESENT ACTIVE E_PORT G_PORT U_PORT LOGICAL_ONLINE LOGIN LED
POD Port: Port is licensed
portState: 1 Online
portPhys: 6 In_Sync portScn: 16 E_Port
port generation number: 32
state transition count: 3
portWwn of device(s) connected:
Distance: standard <= 10km
LE domain: 0
FC Fastwrite: OFF
Interrupts: 0 Link_failure: 0 Frjt: 0
Unknown: 0 Loss_of_sync: 0 Fbsy: 0
Lli: 0 Loss_of_sig: 0
Proc_rqrd: 25 Protocol_err: 0
Timed_out: 0 Invalid_word: 0
Rx_flushed: 0 Invalid_crc: 0
Tx_unavail: 0 Delim_err: 0
Free_buffer: 0 Address_err: 0
Overrun: 0 Lr_in: 0
Suspended: 0 Lr_out: 0
Parity_err: 0 Ols_in: 0
2_parity_err: 0 Ols_out: 0
The only things that look suspicious are the SFP RX/TX power values, which are constantly reported as above or below threshold.
Is that taken into account for the marginal port status?
2013/01/16-12:44:09, , 4582, FID 128, WARNING, s2ngm580, Sfp TX power for port 20, is below low boundary(High=580, Low=110). Current value is 13 uWatts.
2013/01/16-12:44:09, , 4583, FID 128, WARNING, s2ngm580, Sfp RX power for port 21, is below low boundary(High=580, Low=110). Current value is 0 uWatts.
2013/01/16-12:44:09, , 4584, FID 128, WARNING, s2ngm580, Sfp TX power for port 21, is below low boundary(High=580, Low=110). Current value is 25 uWatts.
2013/01/16-12:44:09, , 4585, FID 128, WARNING, s2ngm580, Sfp TX power for port 22, is below low boundary(High=580, Low=110). Current value is 22 uWatts.
2013/01/16-12:44:09, , 4586, FID 128, WARNING, s2ngm580, Sfp TX power for port 23, is below low boundary(High=580, Low=110). Current value is 24 uWatts.
2013/01/16-12:48:36, , 4588, FID 128, WARNING, s2ngm580, Sfp TX power for port 8, is above high boundary(High=580, Low=110). Current value is 637 uWatts.
2013/01/16-13:03:34, , 4589, FID 128, WARNING, s2ngm580, Sfp RX power for port 8, is above high boundary(High=580, Low=110). Current value is 583 uWatts.
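To see at a glance which ports are affected and in which direction, warnings like the ones above can be grouped by port. The following is only a minimal Python sketch: the regular expression is written against the message format quoted in this post, and the function name `summarize` is made up for illustration. Other FOS releases may word the message differently.

```python
import re
from collections import defaultdict

# Pattern written against the Fabric Watch SFP power warnings quoted above.
LINE_RE = re.compile(
    r"Sfp (?P<dir>TX|RX) power for port (?P<port>\d+), "
    r"is (?P<side>above high|below low) boundary"
    r"\(High=(?P<high>\d+), Low=(?P<low>\d+)\)\. "
    r"Current value is (?P<value>\d+) uWatts"
)


def summarize(lines):
    """Group out-of-range SFP power readings by port number."""
    by_port = defaultdict(list)
    for line in lines:
        m = LINE_RE.search(line)
        if m:
            # Store (direction, "above"/"below", reading in uW).
            side = m["side"].split()[0]
            by_port[int(m["port"])].append((m["dir"], side, int(m["value"])))
    return dict(by_port)


SAMPLE = [
    "2013/01/16-12:44:09, , 4583, FID 128, WARNING, s2ngm580, Sfp RX power for port 21, is below low boundary(High=580, Low=110). Current value is 0 uWatts.",
    "2013/01/16-12:44:09, , 4584, FID 128, WARNING, s2ngm580, Sfp TX power for port 21, is below low boundary(High=580, Low=110). Current value is 25 uWatts.",
    "2013/01/16-12:48:36, , 4588, FID 128, WARNING, s2ngm580, Sfp TX power for port 8, is above high boundary(High=580, Low=110). Current value is 637 uWatts.",
    "2013/01/16-13:03:34, , 4589, FID 128, WARNING, s2ngm580, Sfp RX power for port 8, is above high boundary(High=580, Low=110). Current value is 583 uWatts.",
]

if __name__ == "__main__":
    for port, readings in sorted(summarize(SAMPLE).items()):
        print(port, readings)
```

Note that every message carries the same boundaries (High=580, Low=110), whatever the port or SFP.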
I was told that the SFPs have already been replaced a few times, and the problem is seen on most of the ports regardless of whether it is an ISL, FC-AL or F
Any advice appreciated.
At first I thought it might be only a reporting issue, but there are also connectivity issues seen on the end devices connected to the switches.
Current information from the SFP shows:
Identifier: 3 SFP
Connector: 7 LC
Transceiver: 5401001200000000 200,400,800_MB/s SM lw Long_dist
Encoding: 1 8B10B
Baud Rate: 85 (units 100 megabaud)
Length 9u: 10 (units km)
Length 9u: 100 (units 100 meters)
Length 50u: 0 (units 10 meters)
Length 62.5u:0 (units 10 meters)
Length Cu: 0 (units 1 meter)
Vendor Name: BROCADE
Vendor OUI: 00:05:1e
Vendor PN: 57-1000027-01
Vendor Rev: A
Wavelength: 1310 (units nm)
Options: 001a Loss_of_Sig,Tx_Fault,Tx_Disable
BR Max: 0
BR Min: 0
Serial No: UDA110104001931
Date Code: 100322
DD Type: 0x68
Enh Options: 0xf0
Alarm flags = 0x0, 0x0
Warn Flags = 0x0, 0x0
                                  Alarm                   Warn
                                  low        high         low        high
Temperature: 52 Centigrade        -40        90           -10        85
Current:     36.126 mAmps         2.000      95.000       2.000      80.000
Voltage:     3292.3 mVolts        2700.0     3800.0       2970.0     3630.0
RX Power:    -2.5 dBm (565.9 uW)  4.4 uW     2240.0 uW    11.2 uW    1122.0 uW
TX Power:    -2.0 dBm (635.3 uW)  27.0 uW    2240.0 uW    67.0 uW    1122.0 uW
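The SFP reports power in both dBm and uW, while the warnings quote boundaries in uW only, so it helps to convert between the two and check one reading against both sets of limits. This is a rough Python sketch, not anything taken from FOS. It assumes the four threshold columns are alarm low, alarm high, warn low, warn high, and that the High=580/Low=110 boundaries from the warnings apply to this same SFP, which the thread does not state explicitly.

```python
import math


def dbm_to_uw(dbm):
    """Convert optical power from dBm to microwatts (0 dBm = 1 mW = 1000 uW)."""
    return 1000.0 * 10 ** (dbm / 10.0)


def uw_to_dbm(uw):
    """Convert optical power from microwatts to dBm."""
    return 10.0 * math.log10(uw / 1000.0)


def in_range(value, low, high):
    """True when value lies within [low, high]."""
    return low <= value <= high


# TX reading and the SFP's own warn thresholds, from the output above.
TX_UW = 635.3
SFP_TX_WARN = (67.0, 1122.0)
# Boundaries quoted in the warning messages (assumed to apply to this SFP).
FW_BOUNDS = (110.0, 580.0)

if __name__ == "__main__":
    print(round(uw_to_dbm(TX_UW), 1), "dBm")
    print("within SFP warn range:", in_range(TX_UW, *SFP_TX_WARN))
    print("within Fabric Watch bounds:", in_range(TX_UW, *FW_BOUNDS))
```

Under those assumptions the TX reading sits inside the SFP's own warning range but above the High=580 boundary, which would explain a warning on a transceiver that itself raises no alarm or warn flags.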
A port is faulty when the port value for Link Loss, Synchronization Loss, Signal Loss, Invalid word,
Protocol error, cyclic redundancy check (CRC) error, Port state change or Buffer Limited Port is
above the high boundary.
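To check the counters named in that definition against a pasted portshow output, something like the Python sketch below could be used. The mapping from the conditions above to the portshow counter names in `WATCHED` is my assumption, and `HIGH_BOUNDARY` is a hypothetical placeholder: the real boundaries are whatever Fabric Watch is configured with, and they may be rates rather than the cumulative counts compared here.

```python
import re

# Matches "Name: value" pairs such as "Link_failure: 0" or "2_parity_err: 0".
PAIR_RE = re.compile(r"(\w+):\s+(\d+)")

# Assumed mapping from the conditions listed above to portshow counter names.
WATCHED = (
    "Link_failure", "Loss_of_sync", "Loss_of_sig",
    "Invalid_word", "Protocol_err", "Invalid_crc",
)

HIGH_BOUNDARY = 0  # hypothetical; the real boundary is set in Fabric Watch


def parse_counters(lines):
    """Collect every 'Name: value' pair from portshow-style lines."""
    counters = {}
    for line in lines:
        for name, value in PAIR_RE.findall(line):
            counters[name] = int(value)
    return counters


def over_boundary(counters, boundary=HIGH_BOUNDARY):
    """Return the watched counters whose value exceeds the boundary."""
    return {n: counters[n] for n in WATCHED if counters.get(n, 0) > boundary}


# Counter lines copied from the portshow output earlier in the thread.
SAMPLE = [
    "Interrupts: 0 Link_failure: 0 Frjt: 0",
    "Unknown: 0 Loss_of_sync: 0 Fbsy: 0",
    "Lli: 0 Loss_of_sig: 0",
    "Proc_rqrd: 25 Protocol_err: 0",
    "Timed_out: 0 Invalid_word: 0",
    "Rx_flushed: 0 Invalid_crc: 0",
]

if __name__ == "__main__":
    print(over_boundary(parse_counters(SAMPLE)))
```

For the port posted earlier, none of the watched counters is non-zero, which matches the observation that the port shows no errors.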
It seems from the messages that you have Fabric Watch configured to alert based on SFP TX/RX power. These messages are the reason for the marginal port flag.
So I am curious which SFP TX/RX power values you need to look at to tell whether an SFP is going bad.
When I look at my current ports, I see the following:
RX Power: -3.5 dBm (450.7 uW) 0.0 uW 0.0 uW 0.0 uW 0.0 uW
TX Power: -3.6 dBm (431.6 uW) 50.0 uW 1000.0 uW 63.1 uW 794.3 uW
RX Power: -4.1 dBm (388.5 uW) 0.0 uW 0.0 uW 0.0 uW 0.0 uW
TX Power: -3.7 dBm (427.8 uW) 50.0 uW 1000.0 uW 63.1 uW 794.3 uW
RX Power: -2.7 dBm (532.9 uW) 0.0 uW 0.0 uW 0.0 uW 0.0 uW
TX Power: -3.8 dBm (420.2 uW) 50.0 uW 1000.0 uW 63.1 uW 794.3 uW
RX Power: -26.8 dBm (2.1 uW) 0.0 uW 0.0 uW 0.0 uW 0.0 uW
TX Power: -3.6 dBm (436.1 uW) 50.0 uW 1000.0 uW 63.1 uW 794.3 uW
The TX/RX power readings are all different, so how would I determine what counts as a marginal range?
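One way to approach this is to compare each reading with the thresholds printed on the same line. The Python sketch below does that for lines in the format shown above. It assumes the four trailing values are alarm low, alarm high, warn low and warn high, in that order, and the function name `classify` is purely illustrative.

```python
import re

# Matches e.g. "TX Power: -3.6 dBm (431.6 uW) 50.0 uW 1000.0 uW 63.1 uW 794.3 uW"
POWER_RE = re.compile(
    r"(?P<dir>RX|TX) Power:\s+(?P<dbm>-?\d+(?:\.\d+)?)\s+dBm\s+"
    r"\((?P<uw>\d+(?:\.\d+)?)\s*uW\)(?P<rest>.*)"
)
VALUE_RE = re.compile(r"(\d+(?:\.\d+)?)\s*uW")


def classify(line):
    """Return (direction, reading in uW, status) for one power line."""
    m = POWER_RE.search(line)
    if not m:
        return None
    uw = float(m["uw"])
    limits = [float(v) for v in VALUE_RE.findall(m["rest"])]
    # All-zero thresholds mean the SFP did not report usable limits.
    if len(limits) != 4 or not any(limits):
        return (m["dir"], uw, "unknown")
    # Assumed column order: alarm low, alarm high, warn low, warn high.
    alarm_low, alarm_high, warn_low, warn_high = limits
    if uw < alarm_low or uw > alarm_high:
        status = "alarm"
    elif uw < warn_low or uw > warn_high:
        status = "warn"
    else:
        status = "ok"
    return (m["dir"], uw, status)


SAMPLE = [
    "RX Power: -26.8 dBm (2.1 uW) 0.0 uW 0.0 uW 0.0 uW 0.0 uW",
    "TX Power: -3.6 dBm (436.1 uW) 50.0 uW 1000.0 uW 63.1 uW 794.3 uW",
]

if __name__ == "__main__":
    for line in SAMPLE:
        print(classify(line))
```

The RX lines above show zeros for all four thresholds, so the sketch can only mark them as unknown. Judging the 2.1 uW reading would need a boundary from somewhere else, such as the Fabric Watch configuration or the SFP spec sheet.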
In addition to the previous post: if you have a Fabric Watch license installed, you can establish the threshold above which a port is set to MARGINAL status.
So if you currently have a MARGINAL port and you do not see any errors on it, check the Fabric Watch policies and thresholds that are active for that port.
I see this in configshow -all:
I'm not a Fabric Watch specialist. How should this setting be changed?
I suggest you review the Fabric Watch and/or Web Tools manual. Each of them covers the Fabric Watch functionality and how to change the settings. There is a manual available for each specific FOS release. High-level procedure:
- Access Fabric Watch from Web Tools or BNA
- Select SFP in the left pane (Fabric Watch Explorer)
- Select the Threshold Configuration tab
- Modify the TX / RX settings, then select Apply
I've attached the spec sheet for the SFP listed in your post.
You can also use 'fwconfigure' or 'thConfig' / 'portThConfig' commands.
As to ports flapping between healthy and marginal for no apparent reason, you may want to run the 'diagClearError -all' command.