


ERS8600/8800: KHI Error Messages "F2E egress queue drops" and "Fabric Drops"




No negative network impacts are noticed.

KHI INFO Slot(s) are experiencing F2X Errors - F2E Egress Queue Drops
In logs we are noticing KHI log messages, "F2E Egress Queue Drops" and "Fabric Drops".

Drops are mostly seen in Qid 4.
Qid 4 is the Standard (Default). All data traffic with default qos (qos 0) goes into Qid 4.
Dropped pages on Qid 4 indicate that the port is oversubscribed (i.e., it is being given more traffic to transmit than its bandwidth can handle).
The min/max rates for each Qid are listed below.

Qid  Q-name              Q-style   min-rate  max-rate  max-q-length
0    Platinum            Bal       10        100       68
1    Gold                Bal       10        100       68
2    Silver              Bal       5         100       136
3    Bronze              Bal       15        100       136
4    Standard (Default)  Bal       5         100       409
5    Custom              low-pri   0         100       409
6    Premium             high-pri  0         50        68
7    Critical/Network    high-pri  0         5         150

Small numbers of Qid 4 drops can be ignored. If the dropped-page counts are higher, you can address them by adding more bandwidth to the links in question. For TCP traffic, any packet loss caused by dropped pages is recovered by TCP retransmission from the end devices.

F2E egress queue drops and fabric drops noticed. The interface on which these errors are noticed is marked yellow.
There is no traffic loss or issue reported by the customer.

"F2E Egress Queue Drops" indicate that a particular egress queue of a port has been oversubscribed and that frames have tail dropped from that queue.

"Fabric Drops" indicate that the entire 10G "lane" has been oversubscribed from a processing standpoint.

Offending port(s) are easily identified via the CLI from the output of "show qos stats egress-queue-set", found in a "show fulltech".
This command shows "Dropped Pages", which indicate the culprits; it also lists the port utilization and Qid, as shown below:

#show ports stats egress-queues 
                 R/RS-Module QOS Egress Rate-Limit Stats Table
Port       Qid   Total pages     Dropped pages   Utilization 
2/7        4     538769818       27606           99
2/13       4     635190          3939            95
2/14       4     336791          2096            91
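
When this output is collected from several switches, it can help to rank the ports by dropped pages programmatically. The following is a minimal Python sketch, assuming the column layout shown above (Port, Qid, Total pages, Dropped pages, Utilization); the function names and the regular expression are illustrative and not part of any Avaya/Extreme tooling.

```python
import re

# Sample rows copied from the CLI output shown in this article.
SAMPLE = """\
Port       Qid   Total pages     Dropped pages   Utilization
2/7        4     538769818       27606           99
2/13       4     635190          3939            95
2/14       4     336791          2096            91
"""

# Assumed row layout: slot/port, Qid, total pages, dropped pages, utilization.
ROW = re.compile(r"^(\d+/\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)$")


def parse_egress_rows(text):
    """Return one dict per data row; header and unrelated lines are skipped."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:
            port, qid, total, dropped, util = match.groups()
            rows.append({
                "port": port,
                "qid": int(qid),
                "total_pages": int(total),
                "dropped_pages": int(dropped),
                "utilization": int(util),
            })
    return rows


def ports_with_drops(rows):
    """Rows that have at least one dropped page, worst offender first."""
    dropping = [r for r in rows if r["dropped_pages"] > 0]
    return sorted(dropping, key=lambda r: r["dropped_pages"], reverse=True)


rows = parse_egress_rows(SAMPLE)
for row in ports_with_drops(rows):
    print(row["port"], "Qid", row["qid"], "dropped", row["dropped_pages"])
```

With the sample above, port 2/7 is listed first because it has the highest dropped-page count.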

NOTE: The KHI (Key Health Indicator) subsystem samples the system state every 30 seconds.
If any egress queue drops occurred in the sampling window, no matter how few, the KHI reports them.

The issue stems from the port(s) or "lanes" being oversubscribed with traffic: either traffic is dropped in the port queue, or the processing for a lane becomes congested and causes drops.

1. For "F2E Egress Queue Drops" - Create a LAG (MLT / LACP) to enable more bandwidth as needed.
2. For "Fabric Drops" - Move port configuration over to a port in a new "lane", THEN swap the physical port over to ease up on the traffic load being processed in said

In addition, the collected CLI output can be used to estimate (approximately) how oversubscribed the link is.
If a particular queue on a port indicates “total pages 1000, dropped pages 200”, then we know the offered load was 20% (200/1000) greater than the link speed.
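
The estimate above is simply the ratio of dropped pages to total pages. A minimal Python sketch of that arithmetic follows; the function name `drop_ratio` is hypothetical, and the result is only as approximate as the estimate described in this article.

```python
def drop_ratio(total_pages, dropped_pages):
    """Approximate oversubscription as dropped pages / total pages."""
    if total_pages <= 0:
        raise ValueError("total_pages must be positive")
    return dropped_pages / total_pages


# The example from the text: total pages 1000, dropped pages 200.
print(f"{drop_ratio(1000, 200):.0%}")  # prints 20%
```

The same function can be applied to each row of the egress queue statistics to compare ports against each other.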

That said, there are two exceptions: queues 62/63 (or 6/7, depending on the card type).
These queues are the “high priority” queues and both are rate-limited to prevent denial of service attacks.

Queue 6 (or 62) is limited to 50% of link bandwidth, while queue 7 (or 63) is limited to 5%. Queues are selected, one-to-one, by priority. Priorities 1-5 correspond to queues 4-0 (note the inversion), priority 0 to queue 55 (or 5), priority 6 to queue 62 (or 6), and priority 7 to queue 63 (or 7).
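
The priority-to-queue selection described above can be written out as a lookup table, which makes the inversion for priorities 1-5 easier to see. This is a sketch in Python built only from the numbers stated in this article; the names are hypothetical, and which numbering applies depends on the card type.

```python
# Priority -> queue, using the 5/6/7 numbering given in the text.
PRIORITY_TO_QUEUE = {0: 5, 1: 4, 2: 3, 3: 2, 4: 1, 5: 0, 6: 6, 7: 7}

# Same mapping with the alternative numbering given in the text (55/62/63).
PRIORITY_TO_QUEUE_ALT = {**PRIORITY_TO_QUEUE, 0: 55, 6: 62, 7: 63}


def queue_for_priority(priority, alt_numbering=False):
    """Return the egress queue selected for a priority value (0-7)."""
    table = PRIORITY_TO_QUEUE_ALT if alt_numbering else PRIORITY_TO_QUEUE
    if priority not in table:
        raise ValueError(f"priority must be 0-7, got {priority}")
    return table[priority]


for priority in range(8):
    print(priority,
          queue_for_priority(priority),
          queue_for_priority(priority, alt_numbering=True))
```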

So, if one attempts to send more than 500 Mbps at priority 6 to a 1 Gbps port (50% of the link bandwidth), egress queue drops will occur.
If either of the high-priority queues is showing egress queue drops, the user can increase the limits via the CLI's QoS configuration commands.
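
To check whether an offered load would exceed one of these high-priority limits, the percentages can be applied to the link speed. The sketch below is in Python and assumes the default limits stated in this article (50% for queue 6 or 62, 5% for queue 7 or 63, 100% otherwise); the function names are hypothetical, and the limits on a given switch may have been changed through QoS configuration.

```python
# Default rate limits (percent of link bandwidth) as stated in the text.
# Queues not listed here are assumed to be allowed up to 100%.
HIGH_PRI_LIMIT_PCT = {6: 50, 62: 50, 7: 5, 63: 5}


def queue_limit_mbps(queue_id, link_mbps):
    """Maximum rate (Mbps) the queue may send under the default limits."""
    pct = HIGH_PRI_LIMIT_PCT.get(queue_id, 100)
    return link_mbps * pct / 100


def exceeds_limit(queue_id, offered_mbps, link_mbps):
    """True if the offered load is above the queue's rate limit."""
    return offered_mbps > queue_limit_mbps(queue_id, link_mbps)


print(queue_limit_mbps(6, 1000))    # prints 500.0
print(exceeds_limit(6, 600, 1000))  # prints True
```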

Additional notes
These drops are usually connected to some type of server sending / requesting heavy amounts of burst traffic; a backup server for example.


