How to perform MLAG troubleshooting

  • EXOS All
  • MLAG
Two ‘show’ commands report the current MLAG port and peer status: ‘show mlag port’ and ‘show mlag peer’.
When the ISC link between the peers goes down, these commands report the MLAG port and peer status as shown below:

Slot-1 mlag_peer1.8 # show mlag port 1:47
                Local                                             Local   Remote
MLAG    Local   Link     Remote                           Peer    Fail    Fail
Id      Port    State    Link    Peer                     Status  Count   Count
40      1:47    A       N/A      mlag_peer2               Down         0       0
Local Link State: A - Active, D - Disabled, R - Ready, NP - Port not present
Remote Link     : Up - One or more links are active on the remote switch,
                  Down - No links are active on the remote switch,
                  N/A - The peer has not communicated link state for this MLAG
Number of Multi-switch Link Aggregation Groups  : 12
Convergence control                             : Conserve Access Lists
Slot-1 mlag_peer1.9 # sh mlag peer
Multi-switch Link Aggregation Peers:
MLAG Peer          : mlag_peer2
VLAN               : isc                    Virtual Router     : VR-Default
Local IP Address   :                Peer IP Address    :
MLAG ports         : 12                     Tx-Interval        : 1000 ms
Checkpoint Status  : Down                   Peer Tx-Interval   : 1000 ms
Rx-Hellos          : 7882543                Tx-Hellos          : 7901911
Rx-Checkpoint Msgs : 3892866                Tx-Checkpoint Msgs : 4960040
Rx-Hello Errors    : 0                      Tx-Hello Errors    : 0
Hello Timeouts     : 0                      Checkpoint Errors  : 0
Up Time            : N/A                    Peer Conn.Failures : 0
Local MAC          : 02:04:96:6d:17:f7      Peer MAC           : 02:04:96:6d:17:e8
Config'd LACP MAC  : None                   Current LACP MAC   : 02:04:96:6d:17:e8
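The peer output above can also be checked by script when many captures need reviewing. The following is a minimal Python sketch, not an Extreme-provided tool: it assumes the capture has the "Label : value" layout shown above, and the names parse_fields, peer_warnings, and COUNTERS are hypothetical helpers introduced only for illustration. It flags a Checkpoint Status of Down and any non-zero error counter.

```python
import re

# Output copied from the 'show mlag peer' capture above.
SAMPLE = """\
Multi-switch Link Aggregation Peers:
MLAG Peer          : mlag_peer2
VLAN               : isc                    Virtual Router     : VR-Default
Local IP Address   :                Peer IP Address    :
MLAG ports         : 12                     Tx-Interval        : 1000 ms
Checkpoint Status  : Down                   Peer Tx-Interval   : 1000 ms
Rx-Hellos          : 7882543                Tx-Hellos          : 7901911
Rx-Checkpoint Msgs : 3892866                Tx-Checkpoint Msgs : 4960040
Rx-Hello Errors    : 0                      Tx-Hello Errors    : 0
Hello Timeouts     : 0                      Checkpoint Errors  : 0
Up Time            : N/A                    Peer Conn.Failures : 0
Local MAC          : 02:04:96:6d:17:f7      Peer MAC           : 02:04:96:6d:17:e8
Config'd LACP MAC  : None                   Current LACP MAC   : 02:04:96:6d:17:e8
"""

# Assumption: a field is "Label : value" with exactly one space after the
# colon, and two fields on one line are separated by two or more spaces.
# Requiring the space after the colon keeps MAC addresses in one piece and
# skips fields whose value is blank.
FIELD = re.compile(
    r"([A-Za-z][A-Za-z'. -]*?)[ \t]*:[ ](\S.*?)(?=\s{2,}|$)", re.MULTILINE
)

# Counters from the output above that are treated as suspicious when non-zero.
COUNTERS = (
    "Rx-Hello Errors",
    "Tx-Hello Errors",
    "Hello Timeouts",
    "Checkpoint Errors",
    "Peer Conn.Failures",
)


def parse_fields(text):
    """Return a dict of label -> value for every populated field."""
    return {m.group(1): m.group(2) for m in FIELD.finditer(text)}


def peer_warnings(fields):
    """List the conditions worth a closer look in one peer capture."""
    warnings = []
    if fields.get("Checkpoint Status") == "Down":
        warnings.append("Checkpoint Status is Down")
    for name in COUNTERS:
        value = fields.get(name, "0")
        if value.isdigit() and int(value) > 0:
            warnings.append(f"{name} = {value}")
    return warnings


if __name__ == "__main__":
    for warning in peer_warnings(parse_fields(SAMPLE)):
        print(warning)
```

Run against the sample above, the only condition reported is the Down checkpoint status, which matches the ISC-down scenario described here; all error counters in that capture are zero.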


When troubleshooting traffic loss across MLAG peers, run the following commands on each switch involved (the two MLAG peers and the remote switch) and capture the output so that it can be shared for further analysis.
  1. Identify the affected location and traffic in the network, and verify the ARP, FDB, and routing entries for that traffic on the MLAG peers and the remote switch.
    1. show fdb
    2. show iparp
    3. show iproute
    4. debug hal show fdb
    5. Verify whether a rate limit is configured on the QP8 profile.
  2. Check for port or CPU congestion, and review the IP and L2 statistics for the VLANs on all three switches. Confirm whether packets are being dropped on the switch.
    1. debug hal show congestion
    2. show port <port#> congestion
    3. show ipstats
    4. show l2stats
  3. Verify through the “show fdb” output that all MAC address entries are synced between the core switches over the ISC link. Additionally, collect the output of the following commands from both MLAG peers:
    1. debug fdb show mlag <mlag port>
    2. debug fdb show isc <isc port>
    3. debug fdb show vsm isc peer <peer name>
    4. debug vsm show peer <peer name>
    5. debug vsm show ports id <mlag port id>
    6. debug vsm show ports peer <peer name>
    7. debug vsm show ports ports <portlist>
    8. debug hal show vsm
    9. debug fdb show globals
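To make the capture repeatable, the command list from the three steps above can be expanded into ready-to-paste commands for a given site. The following is a minimal Python sketch under stated assumptions: the command strings are copied from the steps above, while STEPS, build_commands, and the placeholder names (port, mlag_port, isc_port, peer, mlag_id, portlist) are hypothetical and exist only for this illustration. The ISC port value in the usage example is made up; the other values come from the sample output earlier in this article.

```python
# Command templates copied from steps 1 to 3 above.
# The {placeholders} stand in for the <...> arguments and are filled per site.
STEPS = {
    1: [
        "show fdb",
        "show iparp",
        "show iproute",
        "debug hal show fdb",
    ],
    2: [
        "debug hal show congestion",
        "show port {port} congestion",
        "show ipstats",
        "show l2stats",
    ],
    3: [
        "debug fdb show mlag {mlag_port}",
        "debug fdb show isc {isc_port}",
        "debug fdb show vsm isc peer {peer}",
        "debug vsm show peer {peer}",
        "debug vsm show ports id {mlag_id}",
        "debug vsm show ports peer {peer}",
        "debug vsm show ports ports {portlist}",
        "debug hal show vsm",
        "debug fdb show globals",
    ],
}


def build_commands(**values):
    """Return every command in step order with the placeholders filled in.

    Raises KeyError if a placeholder value is missing, so an incomplete
    capture plan is caught before it reaches a switch.
    """
    return [
        template.format(**values)
        for step in sorted(STEPS)
        for template in STEPS[step]
    ]


if __name__ == "__main__":
    commands = build_commands(
        port="1:47",
        mlag_port="1:47",
        isc_port="1:1",      # hypothetical ISC port, not taken from the article
        peer="mlag_peer2",
        mlag_id=40,
        portlist="1:47",
    )
    print("\n".join(commands))
```

The printed list can be pasted into a session on each MLAG peer, with the per-switch values adjusted as needed. The rate-limit check in step 1 is left out because the article does not name a command for it.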
Additional notes
An MLAG configuration checker exists for switches running EXOS 15.6 and higher. Please see MLAG Config Check on GitHub for further information.

MLAG Limitations and Requirements:



