How to Troubleshoot EAPS Segment Timer Fail Flag Messages

Objective
Explain the meaning of EAPS segment timer fail flag messages and provide ways to troubleshoot why they occur.
Environment
  • Summit
  • BlackDiamond
Procedure
Messages similar to the following may appear in a switch log:
 
13:57:47.95 <Info:EAPS.SharedPort.PortInfo> Slot-1: EAPS Shared-port 1:29 - Segment 1:9 - Seg-timer-failed-flag is Cleared domain:eaps-ring1
13:57:46.95 <Info:EAPS.SharedPort.PortInfo> Slot-1: EAPS Shared-port 1:29 - Segment 1:9 - Seg-timer-failed-flag is Set domain:eaps-ring1
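To gauge how long the flag stays set each time, log lines in the format shown above can be paired up with a short script. The following is a minimal sketch in Python, not an Extreme Networks tool; the function name `flag_durations` is hypothetical, and the pattern only assumes the exact layout of the two sample lines.

```python
import re
from datetime import datetime

# Pattern for the sample log lines shown above. Other log layouts
# (for example, lines with a date prefix) are not handled by this sketch.
LINE_RE = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}\.\d{2}) .*"
    r"EAPS Shared-port (?P<shared>\S+) - Segment (?P<segment>\S+) - "
    r"Seg-timer-failed-flag is (?P<state>Set|Cleared) domain:(?P<domain>\S+)"
)

def flag_durations(lines):
    """Return (segment, domain, seconds) for each Set -> Cleared pair."""
    events = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m:
            t = datetime.strptime(m["time"], "%H:%M:%S.%f")
            events.append((t, m["segment"], m["domain"], m["state"]))
    # The sample log lists the newest message first, so sort by time.
    events.sort(key=lambda e: e[0])

    pending = {}   # (segment, domain) -> time the flag was set
    results = []
    for t, segment, domain, state in events:
        key = (segment, domain)
        if state == "Set":
            pending[key] = t
        elif key in pending:
            seconds = (t - pending.pop(key)).total_seconds()
            results.append((segment, domain, seconds))
    return results
```

Because the timestamps carry no date, this sketch assumes all lines come from the same day.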
 

In EAPS configurations with shared ports, the controller and partner nodes detect link issues on the shared ports by sending segment health check packets to each other through the EAPS shared segment once per second.  The dead interval is 3 seconds.  When either the controller or the partner does not receive a health check packet within 1 second, the segment-timer-failed flag is set, and the switch that missed the packet logs the message.
 
If the next health check packet is received within 1 second, the segment-timer-failed flag is cleared and the switch logs a message indicating this.
 
If no health check packets are received for 3 consecutive seconds, the receiving node considers the other end of the shared link to be down (segment down).
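The timing described above can be illustrated with a small model. This is a simplified sketch in Python, not the switch's actual implementation: it assumes a packet was received at time 0, treats any gap longer than 1 second as setting the flag, and clears the flag on the next packet received. The names `segment_events`, `HELLO_INTERVAL`, and `DEAD_INTERVAL` are hypothetical.

```python
HELLO_INTERVAL = 1.0   # seconds between health check packets (from the text)
DEAD_INTERVAL = 3.0    # seconds without packets before segment down

def _gap_events(last, now):
    """Events caused by receiving nothing between `last` and `now`."""
    out = []
    if now - last > HELLO_INTERVAL:
        out.append((last + HELLO_INTERVAL, "flag set"))
    if now - last >= DEAD_INTERVAL:
        out.append((last + DEAD_INTERVAL, "segment down"))
    return out

def segment_events(arrivals, end_time):
    """Return (time, event) tuples for sorted packet arrival times in seconds."""
    events = []
    last = 0.0  # assumption: a health check packet arrived at t = 0
    for t in arrivals:
        missed = _gap_events(last, t)
        events.extend(missed)
        if missed:
            # Simplification: any received packet clears the flag.
            events.append((t, "flag cleared"))
        last = t
    # Account for silence between the last packet and the end of observation.
    events.extend(_gap_events(last, end_time))
    return events

for when, what in segment_events([1.0, 2.0, 3.5, 4.5], end_time=8.0):
    print(f"t={when:.1f}s  {what}")
```

In this example the packet due at 3.0 seconds arrives late, so the flag is set and then cleared; after 4.5 seconds no more packets arrive, so the flag is set again and the segment is declared down 3 seconds after the last packet.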
 
The segment timer can expire for various reasons, such as:
 
  1. The EAPS master switch’s CPU does not send the health check packets every second, possibly because the CPU is busy performing other tasks, or because of network congestion or a hardware problem.
  2. The EAPS master switch’s CPU does not receive the health check packets every second, possibly because of network congestion or a hardware problem.
  3. The EAPS master switch receives the health check packet, but drops it before it reaches the CPU.
  4. The EAPS master switch receives the health check packet and sends it to the CPU for processing, but the packet is not processed before the segment-timer-failed flag is set.  This may be due to the packet being queued and dropped before it is processed (tx queue congestion).
 
The following commands can be useful in troubleshooting this issue:
 
  • show port rxerror (look for increasing errors on the port)
  • show port txerror (look for increasing errors on the port)
  • top (look for high CPU utilization by any process)
  • show l2stats (run 3 times at 10-second intervals.  Look at the total number of packets to CPU field and calculate whether the rate is over 3000 packets/sec for a particular VLAN.  If it is, many packets are being processed by the CPU.  However, this may be normal depending on the specific network.)

Note: Rate of packets processed by the CPU for a particular VLAN = (total number of packets to CPU for that VLAN in run 1 + run 2 + run 3) divided by 30.
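The calculation in the note can be sketched as follows. This Python example applies the formula as written, which assumes each run reports the packets for its own 10-second window; the function name `cpu_packet_rate` and the sample totals are hypothetical, not taken from a real switch.

```python
THRESHOLD_PPS = 3000  # packets/sec threshold mentioned in the text

def cpu_packet_rate(run_totals, interval_seconds=10):
    """Sum of per-run 'packets to CPU' totals divided by total elapsed seconds."""
    elapsed = len(run_totals) * interval_seconds
    return sum(run_totals) / elapsed

# Hypothetical totals for one VLAN from three runs of show l2stats.
runs = [31000, 29500, 33500]
rate = cpu_packet_rate(runs)
status = "above threshold" if rate > THRESHOLD_PPS else "below threshold"
print(f"{rate:.0f} packets/sec ({status})")
```

If the counter turns out to be cumulative rather than per-run, the comparable calculation would be the difference between the last and first run divided by the time between them.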
 
  • show port congestion (note any increasing port congestion)
  • debug hal show congestion (run the command 3 times to see if congestion is present.  See Extreme knowledge base article 000001686 for ways to reduce congestion.)
  • show eaps counters global (note any increasing values in the output below)
 
Global counters for EAPS:
  Rx-Failed              : 0
  Rx-Invalid-Vlan-Intf   : 0
  Rx-Undersize-Pkt       : 0
  Rx-Invalid-8021Q-Tag   : 0
  Rx-Invalid-SNAP-Type   : 0
  Rx-Invalid-OUI         : 0
  Rx-EEP-Unsupported-Ver : 0
  Rx-EEP-Invalid-Length  : 0
  Rx-EEP-Invalid-Checksum: 0
  Rx-Domain-Invalid      : 0
  Rx-Lif-Invalid         : 0
  Rx-Lif-Down            : 0
  Tx-Failed              : 0
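Spotting increasing values is easier when two captures of the output are compared programmatically. The following Python sketch assumes the "Name : value" layout shown above and that the output has been saved as text; `parse_counters` and `increasing` are hypothetical helper names, and the sample values are made up for illustration.

```python
def parse_counters(text):
    """Parse 'Name : value' lines into a dict of integer counters."""
    counters = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        value = value.strip()
        # Skip headings such as 'Global counters for EAPS:' (no numeric value).
        if sep and value.isdigit():
            counters[name.strip()] = int(value)
    return counters

def increasing(before, after):
    """Return {counter: growth} for counters that grew between two snapshots."""
    return {
        name: after[name] - before[name]
        for name in after
        if name in before and after[name] > before[name]
    }

# Hypothetical captures taken some time apart.
first = """Global counters for EAPS:
  Rx-Failed              : 0
  Rx-Lif-Down            : 2
  Tx-Failed              : 0
"""
second = """Global counters for EAPS:
  Rx-Failed              : 5
  Rx-Lif-Down            : 2
  Tx-Failed              : 0
"""
print(increasing(parse_counters(first), parse_counters(second)))
```

The same helpers should work on the per-domain and shared-port counter output, since those use the same line layout.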
 
  • show eaps counters <eaps domain> (On each switch in the ring, note any increasing errors or drops.  The Rx-Health and Tx-Health counters should be close in value.  These packets are sent from the EAPS master's primary port, travel through the transit switches, and are expected to be received by the master on its secondary port.)
A sample output is below:
 
Counters for EAPS domain: ring2
Rx Stats
  Rx-Health                 :  102078
  Rx-Ringup-Flushfdb        :  8
  Rx-Ringdown-Flushfdb      :  16
  Rx-Link-Down              :  14
  Rx-Flush-Fdb              :  0
  Rx-Suspend-Prefwd-Timer   :  0
  Rx-Query-Link-Status      :  146
  Rx-Link-Up                :  0
Rx Dropped
  Rx-Unknown                :  0
  Rx-Another-Master         :  0
  Rx-Unconfigured-Port      :  0
  Rx-Health-Pdu-Pri-Port    :  0
 
Tx Stats
  Tx-Health                 :  0
  Tx-Ringup-Flushfdb        :  0
  Tx-Ringdown-Flushfdb      :  0
  Tx-Link-Down              :  3
  Tx-Flush-Fdb              :  0
  Tx-Suspend-Prefwd-Timer   :  0
  Tx-Query-Link-Status      :  72
  Tx-Link-Up                :  9
Tx Dropped
  Tx-Unknown                :  0
  Tx-Transmit-Err           :  0
 

Fw Stats
  Fw-Link-Down              :  0
  Fw-Flush-Fdb              :  0
  Fw-Query-Link-Status      :  0
Fw Dropped
  Fw-Unknown                :  0
  Fw-Transmit-Err           :  0
 
  • show eaps counters shared-port <shared port> (The Tx-Seg-Health and Rx-Seg-Health counters should be similar in value.  Also note any increasing errors or drops.)
 
A sample output is below:
 
p.8 # sh eaps counters shared-port 2
 
Counters for EAPS Shared-Port 2:
Common Link Port Stats
Rx Stats
  Rx-Seg-Health            :  280177
  Rx-Path-Detect           :  0
  Rx-Flush-Notify          :  0
Rx Dropped
  Rx-Seg-Health-Dropped    :  106801
  Rx-Path-Detect-Dropped   :  0
  Rx-Flush-Notify-Dropped  :  0
  Rx-Dropped-Invalid-Port  :  0
  Rx-Segment-State-Notify  :  0
  Rx-Segment-State-Query   :  0
 
Tx Stats
  Tx-Seg-Health            :  200002
  Tx-Path-Detect           :  0
  Tx-Flush-Notify          :  0
  Tx-Flush-Fdb             :  0
Tx Dropped
  Tx-Unknown               :  0
  Tx-Transmit-Err          :  0
  Tx-Segment-State-Notify  :  0
  Tx-Segment-State-Query   :  0
 
To monitor the EAPS Rx and Tx health counters, clear the counters by entering clear eaps counters, then check whether the Tx and Rx counters increase at about the same rate.  Counters that increase at about the same rate indicate normal behavior.  If one counter increases much faster than the other, there may be a problem with EAPS control packets being forwarded through the switches in the ring.
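The comparison described above can be sketched in Python as follows. The helper names `health_growth` and `rates_diverge` are hypothetical, the snapshot values are made up, and the 10% tolerance is an assumption chosen for illustration; the article does not define how large a difference counts as "much faster".

```python
def health_growth(before, after, tx_key="Tx-Seg-Health", rx_key="Rx-Seg-Health"):
    """Return (tx_delta, rx_delta) between two counter snapshots (dicts)."""
    return after[tx_key] - before[tx_key], after[rx_key] - before[rx_key]

def rates_diverge(tx_delta, rx_delta, tolerance=0.10):
    """True if the deltas differ by more than `tolerance` of the larger one.

    The 10% default is an illustrative assumption, not a documented threshold.
    """
    larger = max(tx_delta, rx_delta)
    if larger == 0:
        return False
    return abs(tx_delta - rx_delta) / larger > tolerance

# Hypothetical snapshots: right after clearing the counters, and 60 seconds later.
before = {"Tx-Seg-Health": 0, "Rx-Seg-Health": 0}
after = {"Tx-Seg-Health": 60, "Rx-Seg-Health": 59}

tx, rx = health_growth(before, after)
print("investigate" if rates_diverge(tx, rx) else "counters track each other")
```

For the per-domain counters, the same functions can be called with tx_key="Tx-Health" and rx_key="Rx-Health".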
 