07-13-2011 08:20 PM
First, make sure there are no physical issues on any link in the fabric. The enc_in, enc_out, CRC and sync columns should stay at 0 at all times. If they do not, you may have a cable or SFP problem. Fix this first.
Second, clear all stats with slotstatsclear to obtain a baseline of counters. This command will reset all values back to 0.
Then run porterrshow every 30 minutes to verify that these counters stay at 0.
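To make the check above less manual, here is a minimal Python sketch. It assumes you have already collected the porterrshow counters into a dictionary per port; the counter names (`enc_in`, `enc_out`, `crc`, `sync`), the port labels and the sample values are illustrative assumptions, not the exact column headers or output of the switch.

```python
# Counter names are illustrative labels for the columns mentioned above,
# not the exact porterrshow header text.
PHYSICAL_COUNTERS = ("enc_in", "enc_out", "crc", "sync")


def ports_with_errors(snapshot):
    """Return {port: {counter: value}} for every non-zero physical counter."""
    flagged = {}
    for port, counters in snapshot.items():
        bad = {
            name: counters.get(name, 0)
            for name in PHYSICAL_COUNTERS
            if counters.get(name, 0) != 0
        }
        if bad:
            flagged[port] = bad
    return flagged


# Hypothetical snapshot taken 30 minutes after clearing the stats.
snapshot = {
    "1/0": {"enc_in": 0, "enc_out": 0, "crc": 0, "sync": 0},
    "1/1": {"enc_in": 0, "enc_out": 12, "crc": 3, "sync": 0},
}

print(ports_with_errors(snapshot))
```

Any port that shows up in the result has incremented since the baseline and is worth a look at the cable or SFP.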
The sloterrshow and slotstatsshow commands are somewhat difficult to explain, since their output can differ per switch type and actual configuration. They also display internal back-end link statistics, but it would go too far to cover those in this forum.
Make sure no N-ports are marked as possible slow-drain devices. You can check with bottleneckmon. If one or more are mentioned in the event log, you'll most likely need to rearrange some traffic paths to other storage or server ports.
Hope this helps.
07-14-2011 12:35 AM
Thanks for your answer. We have since added ISLs and the situation is now stable. Before adding them, each fabric had 2 ISLs, as indicated by San Health (48000 --- 2 ISL --- dwdm -----dwdm----2 ISL----48000).
What do you think of our oversubscription ratio? How should I interpret it? I know the recommended value is 7:1, but that would mean we have to add a lot of ISLs!
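As a rough way to reason about the numbers, here is a small Python sketch. It assumes the ratio is simply device ports divided by ISL ports, with every port running at the same speed; the port counts are hypothetical, not taken from the report.

```python
import math


def isl_oversubscription(device_ports, isl_count):
    """Device ports per ISL, assuming all ports run at the same speed."""
    return device_ports / isl_count


def isls_needed(device_ports, target_ratio):
    """Smallest ISL count that keeps the ratio at or below target_ratio."""
    return math.ceil(device_ports / target_ratio)


# Hypothetical figures for one fabric.
device_ports = 96
isl_count = 2

ratio = isl_oversubscription(device_ports, isl_count)
print(f"current ratio {ratio:.0f}:1")
print(f"ISLs needed for 7:1: {isls_needed(device_ports, 7)}")
```

If the ISLs run faster than the device ports, or only part of the devices send traffic across the link, the effective ratio is lower than this simple port count suggests.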
|Port Counts||Attached Device Types||Inter Switch Links||Fan Out Ratios||Long Distance Modes|
|Total||Free||Unlcnsd||Disk||Tape||Host||Appliance||Gateway||ISL||IFL||Trnk Mstr||Trnk Slv||Host:Disk||Port:ISL||Device:ISL||10km||25km||50km||100k||300k||Auto|