08-07-2012 12:49 AM
Hello,
could anyone give me a hint on how to troubleshoot the error "Severe latency bottleneck detected" on an ISL/trunk port?
What can cause this problem, and how can the root cause be found?
We received this alert on a trunk created from two 8 Gbit ISL ports between two 5100 switches.
ASAN01:admin> trunkshow -perf
1: 1-> 7 10:00:00:05:1e:36:38:62 100 deskew 15 MASTER
0-> 6 10:00:00:05:1e:36:38:62 100 deskew 24
Tx: Bandwidth 8.00Gbps, Throughput 37.44Kbps (0.00%)
Rx: Bandwidth 8.00Gbps, Throughput 51.94Kbps (0.00%)
Tx+Rx: Bandwidth 16.00Gbps, Throughput 89.38Kbps (0.00%)
2: 5-> 71 10:00:00:05:1e:36:38:62 100 deskew 16 MASTER
4-> 70 10:00:00:05:1e:36:38:62 100 deskew 15
Tx: Bandwidth 8.00Gbps, Throughput 33.12Kbps (0.00%)
Rx: Bandwidth 8.00Gbps, Throughput 58.08Kbps (0.00%)
Tx+Rx: Bandwidth 16.00Gbps, Throughput 91.20Kbps (0.00%)
3: 35-> 35 10:00:00:05:33:ce:61:f5 203 deskew 15 MASTER => trunk with alerts
39-> 39 10:00:00:05:33:ce:61:f5 203 deskew 16
Tx: Bandwidth 16.00Gbps, Throughput 442.46Kbps (0.00%)
Rx: Bandwidth 16.00Gbps, Throughput 433.73Kbps (0.00%)
Tx+Rx: Bandwidth 32.00Gbps, Throughput 876.19Kbps (0.00%)
ASAN01:admin> porterrshow
frames enc crc crc too too bad enc disc link loss loss frjt fbsy
tx rx in err g_eof shrt long eof out c3 fail sync sig
=========================================================================================================
35: 374.0m 115.7m 0 0 0 0 0 0 0 70 0 1 2 0 0
39: 3.1g 3.8g 0 2 0 0 0 0 0 204 0 1 2 0 0
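When checking porterrshow output across many ports, a small script can flag the counters worth chasing. This is only a rough sketch assuming the column order in the header above; the short column names in the script are my own shorthand, not FOS identifiers.

```python
# Sketch: flag non-zero error counters in porterrshow-style rows.
# Column order is assumed from the header shown above; the first two
# fields (frames tx / frames rx) are traffic counts, not errors.

COLUMNS = ["enc_in", "crc_err", "crc_g_eof", "too_shrt", "too_long",
           "bad_eof", "enc_out", "disc_c3", "link_fail", "loss_sync",
           "loss_sig", "frjt", "fbsy"]

def flag_errors(line):
    """Return (port, {counter: raw value}) for every non-zero counter."""
    port, rest = line.split(":", 1)
    fields = rest.split()
    counters = fields[2:]  # skip frames tx / frames rx
    flagged = {name: v for name, v in zip(COLUMNS, counters) if v != "0"}
    return int(port), flagged
```

On the rows above this would report disc c3 = 70 plus loss sync/loss sig events on port 35, and disc c3 = 204 plus 2 CRC errors on port 39 - the class-3 discards being the usual symptom of a credit/latency problem.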
Thanks
Radek
08-07-2012 02:47 AM
Here is my alert message:
Time Level Message Service Number Count Message ID Switch
Mon Aug 06 2012 20:32:05 CEST Warning Severe latency bottleneck detected at slot 0 port 35. Switch 1241 1 AN-1010 ASAN01
Looks like buffer credit problem:
Data Center Fabric Resiliency Best Practices:
Bottleneck Detection can detect ports that are blocked due to lost credits and generate special “stuck VC” and “lost credit” alerts for the E_Port with the lost credits (available in FOS 6.3.1b and later).
Example of a “stuck VC” alert on an E_Port:
2010/03/16-03:40:48, , 21761, FID 128, WARNING, sw0, Severe latency bottleneck detected at slot 0 port 38.
Data Center Bottleneck Detection Best Practices Guide:
<timestamp>, , <sequence-number>,, WARNING, <system-name>, Severe latency bottleneck detected at Slot
<slot number> port <port number within slot number>.
This message identifies the date and time of a credit loss on a link, the platform and port affected, and the number of seconds that triggered the threshold.
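If these alerts are being collected from RASlog or syslog, the affected slot and port can be pulled out mechanically for further checks. A minimal sketch, assuming only the message wording quoted above (the surrounding log fields vary between samples):

```python
import re

# Matches the AN-1010 wording quoted above; "slot"/"Slot" casing
# differs between the guide text and the actual RASlog sample.
ALERT_RE = re.compile(
    r"Severe latency bottleneck detected at [Ss]lot (\d+) port (\d+)")

def parse_alert(msg):
    """Return (slot, port) from a bottleneck alert line, or None."""
    m = ALERT_RE.search(msg)
    return (int(m.group(1)), int(m.group(2))) if m else None
```

For the sample RASlog line above this yields slot 0, port 38.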
The ISL is about 500 m over a multimode shortwave connection...
Each ISL port has 26 buffer credits and is configured as L0 for 2 km distance...
The FOS version on all switches is 6.4.2a.
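As a sanity check on the distance side: a common rule of thumb (not an official FOS formula) is roughly one buffer credit per 2 km of fiber at 1 Gbps for full-size frames, scaled linearly with link speed. The headroom constant below is an arbitrary safety margin I added, not a FOS default:

```python
import math

def min_credits(distance_km, speed_gbps, headroom=6):
    """Rule-of-thumb minimum BB credits to keep a link of the given
    length busy with full-size FC frames: ~1 credit per 2 km at
    1 Gbps, scaled with speed, plus an arbitrary safety margin."""
    return math.ceil(distance_km * speed_gbps / 2) + headroom
```

For a 500 m link at 8 Gbps this gives about 8 credits, so the 26 credits configured here are ample for the distance - which points toward credits being lost or held (slow drain) rather than a plain distance shortfall.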
Any hint how to troubleshoot that?
08-07-2012 12:13 PM
Hi,
Step 1 -- Check which end device connected to an edge switch port is utilising the link the most.
Step 2 -- Add one more port to the trunk and try increasing the buffer credits as well. Monitor for some time.
Step 3 -- If this is happening only during certain time periods, check whether any backup is running.
The above hints may be helpful, I guess.
08-08-2012 01:10 AM
Hi,
Step 1 -- Check which end device connected to an edge switch port is utilising the link the most.
OK, but is that really relevant if I have enough bandwidth in the trunk for a high-utilising device?
Step 2 -- Add one more port to the trunk and try increasing the buffer credits as well. Monitor for some time.
The trunk is made from two 8 Gbit/s ISL ports and is never utilised up to 100%.
Step 3 -- If this is happening only during certain time periods, check whether any backup is running.
Yes, it's related to the times when backups are running, but no port is utilised that heavily and the overall trunk utilisation is low.
08-08-2012 02:17 AM
Hi,
Then it seems the problem is with the buffer credit values only. A device accepting frames more slowly than expected means buffer credits are not returned promptly while the load stays high. This is covered in the Fabric Resiliency best practices document. I am unable to attach it since the option is not available; share your mail id if you don't have it.
08-08-2012 05:30 AM
Check whether there are frame discards on any other F-port in these switches. If there is no stuck VC on the ISL, a slow-drain device could be causing the issue.
08-28-2012 01:47 AM
The problem is solved.
The root cause was one faulty port on the second switch (ID 203).
BSAN01:admin> porterrshow
frames enc crc crc too too bad enc disc link loss loss frjt fbsy
tx rx in err g_eof shrt long eof out c3 fail sync sig
=========================================================================================================
28: 0 0 0 0 0 0 0 0 13.9k 0 0 0 0 0 0
The SFP was identified as the failing item in the fabric. After replacing it, the problem was gone.
Radek's IT Blog: Severe latency bottleneck detected on ISL / Trunk port
Radek
10-04-2012 12:31 PM
We are having the same issue - the port is a trunk port (8 Gbit 5100).
, 372, FID 128, WARNING, switch , Severe latency bottleneck detected at slot 0 port 23.
The answer above is not clear to me. I see traffic flowing fine on both sides of the trunk - upstream and downstream - but I am getting all these bottleneck errors, and our application keeps complaining about timeouts and slow storage. Could it be a bad cable?
The cable is a long-distance run going from port 23 to a patch panel, then to a fiber termination box, and it connects to another downstream switch about 1/4 mile away via dark fiber.