Troubleshooting the Control Paths between I/O modules and MSM modules in BD8K

Objective
This article explains how the Control Path Health Check between I/O modules and MSM modules works on the BlackDiamond 8K (BD8K) and how to interpret its results.
Environment
  • EXOS All
  • BlackDiamond 8K
Procedure
EXOS continuously monitors the health of the Control Paths between I/O and MSM modules. Each MSM sends ten health-check broadcasts every few seconds to every module installed in the system.
On receipt of these broadcasts, each module sends a unicast reply back to the corresponding MSM.
The MSM keeps track of the replies from each module. If a module does not reply to any of the ten broadcasts, the MSM flags an error in the sys-health-check output and logs a warning message.
  • The Control Path Health Check is always enabled and cannot be turned off.
  • Both MSMs run the health check completely independently of each other.
  • Each I/O module has two control interfaces, except the MSM I/O daughter card, which has only one. Control interface #1 connects to MSM-A and control interface #2 connects to MSM-B.
The diagram below depicts how I/O modules in the system communicate with MSM-A and B over the Control Paths.

[Diagram: Control Paths between the I/O modules and MSM-A and MSM-B]
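The polling behavior described above can be modeled in a few lines of Python. This is only an illustrative sketch, not EXOS code: the class name `ControlPathMonitor`, the `run_round` method, and the `missed_rounds` counter are hypothetical, and the sketch assumes that a module is flagged only when none of the ten broadcasts in a round is answered. How EXOS actually increments the MissedPolls counter is not documented here.

```python
from dataclasses import dataclass, field

BROADCASTS_PER_ROUND = 10  # from the article: ten broadcasts per round


@dataclass
class ControlPathMonitor:
    """Hypothetical model of one MSM's view of its control links."""

    msm: str  # "MSM-A" or "MSM-B"
    # slot -> number of rounds with no reply at all (assumed bookkeeping)
    missed_rounds: dict = field(default_factory=dict)

    def run_round(self, replies_by_slot):
        """Process one round of health-check broadcasts.

        replies_by_slot maps a slot number to the count of unicast
        replies received from that slot (0..BROADCASTS_PER_ROUND).
        Returns the list of slots flagged in this round.
        """
        flagged = []
        for slot, replies in sorted(replies_by_slot.items()):
            if replies == 0:  # no reply to any of the ten broadcasts
                self.missed_rounds[slot] = self.missed_rounds.get(slot, 0) + 1
                flagged.append(slot)
        return flagged


# Each MSM runs its own monitor, independently of the other.
msm_a = ControlPathMonitor("MSM-A")
msm_b = ControlPathMonitor("MSM-B")

# Slot 9 answers nothing on interface 1 (MSM-A) but still answers MSM-B.
flagged_a = msm_a.run_round({1: 10, 9: 0})
flagged_b = msm_b.run_round({1: 10, 9: 7})
print(msm_a.msm, "flagged slots:", flagged_a)
print(msm_b.msm, "flagged slots:", flagged_b)
```

Because the two monitors share no state, a fault on one control interface of a module is reported only by the MSM attached to that interface.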

To view the results of the Control Path Health Check, run the "debug hal show sys-health-check" command:

 
# debug hal show sys-health-check 

If interface #1 (MSM-A) of the I/O module in slot 9 has a problem, the sys-health-check output and the warning log message look like the following example:
 
# debug hal show sys-health-check 
[Control Links]
Slot Link  MissedPolls   Timestamp of last miss
-------------------------------------------------
 9    1             12   Mon Jun 20 18:23:14 2015

# show log
06/20/2015 18:23:49.97 <Warn:HAL.Sys.Warning> MSM-A: Sys-Health-Check Card 9 not responding for 100 ticks over interface 1
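When the output has to be checked on many systems, the [Control Links] table can be parsed by a script. The following is a minimal sketch that assumes the output has exactly the column layout shown in the example above; the function name `parse_control_links` and the `LINK_TO_MSM` table are hypothetical helpers, and the link-to-MSM mapping is taken from the interface numbering described earlier in this article.

```python
import re

# Control interface number -> MSM, as described in this article.
LINK_TO_MSM = {1: "MSM-A", 2: "MSM-B"}

# One data row: slot, link, missed polls, then the timestamp text.
ROW = re.compile(r"^\s*(\d+)\s+(\d+)\s+(\d+)\s+(.+?)\s*$")


def parse_control_links(text):
    """Return one dict per row of the [Control Links] table."""
    rows = []
    in_section = False
    for line in text.splitlines():
        if line.strip() == "[Control Links]":
            in_section = True
            continue
        if not in_section:
            continue
        match = ROW.match(line)
        if match:  # header and separator lines do not match
            link = int(match.group(2))
            rows.append({
                "slot": int(match.group(1)),
                "link": link,
                "msm": LINK_TO_MSM.get(link, "unknown"),
                "missed_polls": int(match.group(3)),
                "last_miss": match.group(4),
            })
    return rows


SAMPLE = """\
[Control Links]
Slot Link  MissedPolls   Timestamp of last miss
-------------------------------------------------
 9    1             12   Mon Jun 20 18:23:14 2015
"""

rows = parse_control_links(SAMPLE)
for row in rows:
    print(row)
```

The timestamp is kept as plain text on purpose, so the sketch makes no assumption about the switch's date format beyond what the sample shows.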
 
If interface #2 (MSM-B) of the I/O module in slot 9 has a problem, the sys-health-check output and the warning log message look like the following example:

 
# debug hal show sys-health-check 
[Control Links]
Slot Link  MissedPolls   Timestamp of last miss
-------------------------------------------------
 9    2             16   Mon Jun 20 21:15:02 2015

# show log
06/20/2015 21:15:36.72 <Warn:HAL.Sys.Warning> MSM-B: Sys-Health-Check Card 9 not responding for 100 ticks over interface 2
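The warning log messages can be summarized in the same way. The sketch below assumes the message format is exactly as in the two examples above; the names `affected_paths` and `LOG` are hypothetical. It extracts the reporting MSM, the slot, and the interface from each warning so that the affected Control Paths can be listed at a glance.

```python
import re

# Matches the Sys-Health-Check warning as it appears in the examples.
LOG = re.compile(
    r"<Warn:HAL\.Sys\.Warning>\s+(MSM-[AB]):\s+Sys-Health-Check Card (\d+) "
    r"not responding for (\d+) ticks over interface (\d+)"
)


def affected_paths(log_text):
    """Return sorted, de-duplicated (msm, slot, interface) tuples."""
    paths = set()
    for match in LOG.finditer(log_text):
        paths.add((match.group(1), int(match.group(2)), int(match.group(4))))
    return sorted(paths)


SAMPLE_LOG = """\
06/20/2015 18:23:49.97 <Warn:HAL.Sys.Warning> MSM-A: Sys-Health-Check Card 9 not responding for 100 ticks over interface 1
06/20/2015 21:15:36.72 <Warn:HAL.Sys.Warning> MSM-B: Sys-Health-Check Card 9 not responding for 100 ticks over interface 2
"""

paths = affected_paths(SAMPLE_LOG)
for msm, slot, interface in paths:
    print(f"{msm} <-> slot {slot} (interface {interface})")
```

Listing the paths this way shows which module is common to the failing paths, which may help decide which module to reseat first. It does not replace the reseat and replace steps recommended in this article.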

These outputs indicate that a specific Control Path between an I/O module and an MSM module is experiencing issues, most likely because of improper physical seating or a hardware fault in the I/O module or the MSM module.
As a first step, reseat the relevant I/O and MSM modules, and replace them if reseating does not help.
Because the chassis is also part of the physical Control Path, replacing the chassis may also need to be considered.
Additional notes
If the problem persists, please open a case with GTAC.
