Fibre Channel (SAN)

Anonymous
Posts: 0

Re: Rapid increasing er_bad_os at 8 Gbit speed

Hi,

If you are facing the issue on the ISL path between the two switches, connect two more cables to create another trunk, or, if you have adjacent ports available, add two more cables to the existing trunk. Once these two new cables form a trunk, observe the errors. If you do not get any errors on the new cables, remove the old cables and keep monitoring over the new trunk.
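
Once the new cables are up, you can watch the trunk and its error counters from the FOS CLI, roughly like this (a sketch from memory; the slot/port numbers are placeholders and the exact output fields vary by FOS release):

  trunkshow             (confirm the new ISLs actually formed one trunk group)
  portstatsclear 2/15   (zero the counters on each new ISL port; repeat for every trunk member)
  portstatsshow 2/15    (re-check after a while: er_bad_os, er_enc_out, er_crc and similar counters)
  porterrshow           (one-line error summary per port for the whole switch)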

One question: you say you observe errors on the ISL between the two switches, so are the servers showing intermittent path offline connected only to these two switches? Also try to localize the servers and storage, meaning both the HBA and the controller should be on the same switch.

Also, if these are Hitachi controllers, check the HDLM version on the hosts; you may have to upgrade HDLM with auto-failback enabled and the extended I/O settings.

If they are not Hitachi arrays, then you should log a call with your vendor, who will in turn log a call with Brocade in the back end.

One more thing: you have upgraded FOS from 6.1 to 6.2, but have you upgraded all the switches in that fabric as well? If not, do so; do not keep the fabric at mixed FOS levels.

Anonymous
Posts: 0

Re: Rapid increasing er_bad_os at 8 Gbit speed

Hi Hemant,

I think I will propose adding ISL links between these two director switches. The servers are connected as follows: servers ---> 32port_switch ---> dir1 ---> dir3 ---> USP. So all the servers connected via these switches are facing the same problem.

Regards,

Mahendran

Anonymous
Posts: 0

Re: Rapid increasing er_bad_os at 8 Gbit speed

Mahendran,

You are on the right track if you add ISLs. Try to create one bigger trunk rather than adding more trunks, to keep the routing table simple and to avoid ending up with a mesh or ring topology.

The FOS code will manage the load balancing well by itself.

Do you have old servers on the 32-port switch running at 1 or 2 Gbit with very old PCI bus infrastructure, and are they zoned to 8 Gbit storage ports? If so, this can cause back pressure on the ISL, which ends up as discards somewhere in the fabric; the storage port can overload the server.
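
A rough way to make that back pressure visible from the CLI (again a sketch; the port numbers are placeholders and the counter names are from memory, so verify against the FOS documentation for your release):

  porterrshow           (watch the disc c3 column on the ISL ports for growing Class 3 discards)
  portstatsshow 1/4     (on the slow 1/2 Gbit server ports, tim_txcrd_z counts how often the port
                         sat with zero transmit buffer credits - a typical sign of a slow-drain device)
  portbuffershow 1      (buffer credit allocation and usage for the ports on that slot)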

Did your problems come up after your FOS code update?

Do you have an error history from before the update, or did you only start error monitoring since you have had the issue?

If the errors came up after the update, then the FOS code may be causing the discards, which result in I/O errors on the servers and which you can also see as LUN resets on the storage ports.

I have seen the same in my own environment: nothing changed except the FOS code. After adding ISLs everything ran fine, as before.

Don't waste your time looking at HDLM or HBA firmware. Fix the DISCARDS on the ISL!

Andreas

Anonymous
Posts: 0

Re: Rapid increasing er_bad_os at 8 Gbit speed

Hi,

Otherwise, if you can, localize the servers and storage, i.e. connect the HBAs to the switch where the storage is connected. Eliminate the hop; that will solve the issue.
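
Before re-cabling, it may help to confirm what is actually attached to each switch (a sketch; command names are from memory and the output differs per platform):

  switchshow            (per-port view: F_Ports with the WWN of the attached HBA or storage port,
                         plus the E_Ports/trunks leading to the other switches)
  nsshow                (name server entries for devices logged in locally on this switch)
  nscamshow             (devices known via remote switches, i.e. everything still reached over an ISL)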

Anonymous
Posts: 0

Re: Rapid increasing er_bad_os at 8 Gbit speed

Mahendran,

Mahendran wrote:

I think I will propose adding ISL links between these two director switches. The servers are connected as follows: servers ---> 32port_switch ---> dir1 ---> dir3 ---> USP. So all the servers connected via these switches are facing the same problem.

I am confused about your setup. You say a few posts back that you have three dual-fabric SANs. In your second post you mention an MPR 7500. So are all three SANs in a meta-SAN? You later say the discards are between two 48000 directors. Are these on the same fabric in the same SAN, or across SANs in the meta-SAN via the routers?

In your first post you show details of a specific port, presumably an e-port. Is this connected to the 7500? What is the distance between sites? Are the inter-site links all in the backbone fabrics on the 7500s?

Thanks,

Alastair

Anonymous
Posts: 0

Re: Rapid increasing er_bad_os at 8 Gbit speed

This error can still be noticed in a non-ISL environment, regardless of GBIC speed.

Anonymous
Posts: 0

Re: Rapid increasing er_bad_os at 8 Gbit speed

In that case you can just ignore the error, because it will not harm your data.

Anonymous
Posts: 0

Re: Rapid increasing er_bad_os at 8 Gbit speed

We have the same ISL issue between a 48K and a DCX on v6.3.1a, with C3 discarded frames. We set portcfgfillword on the 8 Gbps ports, but no luck. It seems to me like some internal timing issue between 4 and 8 Gbps SAN ports; for the rest we have no clue, and neither does EMC.

Anonymous
Posts: 0

Re: Rapid increasing er_bad_os at 8 Gbit speed

I saw this one popping up. The text below might shed some light on fill words and "invalid ordered sets". I wrote this explanation in an HDS internal communiqué, but it seems most of you will benefit as well. Let me emphasize that when you see this phenomenon it has nothing to do with hardware trouble whatsoever.

-----------------------------

This goes back to the change in Fibre Channel protocol requirements for 8G and higher line speeds. At 4G and lower, a so-called IDLE fill word is used, which starts with a K28.5 and is followed by three data words (D21.4 D21.5 D21.5). This fill word is used to maintain bit and word synchronisation between two N_Ports. Due to the higher baud rate at 8G and the specific bit pattern of this IDLE fill word, it is known to increase electromagnetic emission, which might result in electrical interference with other equipment.

To circumvent that, another fill word was adopted which was already defined in the FC-AL protocol, called ARB(ff) (K28.5 D20.4 D31.7 D31.7). This is a similar fill word but has a better bit pattern to prevent this radiation emission. These fill words are called ordered sets (a K28.5 and three data words make up an ordered set). The standard defines that during word synchronisation (that is, after speed negotiation at 8G and bit/character synchronisation) the ports shall send 6 IDLEs upon entering the Active state to obtain word synchronisation, and then switch to ARB(ff) as the fill word. (I'll spare you the entire protocol definition of link state changes.)

If, however, the speed is negotiated at 8G during this init/transition sequence and only one port switches to ARB(ff) as fill word, you will see the er_bad_os counter increase very fast. Be aware that even though no actual frame is sent from an HBA, the HBA and switch port still send these fill words constantly at the negotiated line speed. Besides the word synchronisation, the ports also use actual frames to sync their clock rate by looking at SOF and EOF delimiters. If, however, an HBA is not sending any frames and the port is not able to determine a sync state within a certain period of time, it will do a link reset (LR) and go through the sync process again.
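
If you want to see this on a port, the relevant counters can be watched roughly like this (a sketch; the slot/port is a placeholder and counter names may differ slightly per FOS release):

  portstatsclear 4/12
  portstatsshow 4/12    (er_bad_os climbing fast on an otherwise idle link fits the fill-word
                         mismatch described above; Lr_in / Lr_out increasing as well points to the
                         link resets the port performs when it cannot reach a sync state)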

Brocade FOS pre-6.3.1 had only mode 0 and mode 1 (either IDLE or ARB(ff)). This meant that if one device was very strict about the standard but another was not, they would sync up if the switch port was configured as mode 0, but the link would never switch to ARB(ff) as required by the standard. On the other hand, if the switch port was configured as mode 1 and the HBA lived by the standard, it could never get into a sync state, because the switch would only transmit ARB(ff) as fill words and the HBA would only use IDLEs.

Fill words can be replaced by other ordered sets (primitive signals or sequences). One of those is very important for buffer credit organisation and is called R_RDY. If you lose R_RDY signals, the sending device has no knowledge of whether the buffers on the other side have been cleared. This may lead to performance problems, etc. I'll spare you the details.

As you can see, these fill words are used between frames. FLOGI and PLOGI are frames, so to answer your question: no, changing fill words has nothing to do with failed FLOGIs or PLOGIs. PLOGIs from initiator to target devices might sometimes get dropped, like any other frame in Class 3 service, for numerous reasons; physical errors or congestion on ISLs is one of the most likely causes. A FLOGI is one frame going from an N_Port to the F_Port controller on a switch, which registers it in the fabric controller. That is the only reason a FLOGI is needed: to obtain a 24-bit fabric address. After the PLOGI and name server registration, an RSCN is sent out to all devices in its zone, and the other end-to-end queries and registrations begin.

The fun becomes even more apparent next year with 16 and 32G speeds, where we switch from 8b/10b to 64b/66b encoding. This encoding mechanism is already used on 10G FC, hence the reason it is not interoperable with 1/2/4/8G speeds.

In short: if you have 8G ports on HBAs and storage and have Brocade 8G ports with FOS >= 6.3.1, use mode 2. All other line speeds (1/2/4) still use IDLE fill words and require mode 0. If you use FOS < 6.3.1, it depends a bit on the implementation of the HBA/storage vendors. The recommendation is to upgrade to the latest supported firmware levels so you can adhere to the standard.
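
Concretely, on FOS 6.3.1 or later the per-port setting looks roughly like this (a sketch; check the command reference for your release, and the slot/port is a placeholder):

  portcfgshow 4/12           (the Fill Word line shows the currently configured mode)
  portcfgfillword 4/12, 2    (mode 2: IDLE during link init, ARB(ff) as fill word, as described above)
  portcfgfillword 4/12, 0    (mode 0: IDLE/IDLE, for ports that still run at 1/2/4 Gbit)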

You may find, especially when using long-distance connections over DWDM where transponders or TDM multiplexers are used, that in some cases these devices have not yet adhered to the Fibre Channel standard. You should consult with the DWDM provider about upgrading the firmware in those devices in order to get a reliable connection.

I hope this explains these changes a bit.

------------------------------------

Cheers

E

Anonymous
Posts: 0

Re: Rapid increasing er_bad_os at 8 Gbit speed

Hello Erwin,

Many thanks for these details and the very good explanation of ARB(ff) and IDLE.

What happens if you fix the switch port and the storage port of a Hitachi array to 8 Gbit? I assume that both devices then have to start with ARB(ff) right from the beginning. Is this right?

Why does this cause problems where the Hitachi arrays have difficulty getting in sync with the switch port? I have seen switch ports change to a faulty state.

Is this related to misbehavior of the storage port firmware on the arrays?

From a user point of view it is very confusing if some devices work with IDLE and some with ARB(ff) at 8 Gbit speed.

Andreas
