VDX6740T interfaces connected to Intel I350 NICs at 1G flap randomly

Symptoms
VDX6740T interfaces connected to Intel I350 NICs at 1G flap randomly.
Environment
  • VDX6740T or VDX6740T-1G
  • Intel I350 NIC
Cause
On all gigabit Ethernet interfaces that follow IEEE specifications, autonegotiation is mandatory. When two gigabit copper interfaces are connected to each other, one of the items decided during autonegotiation is which side acts as master and which as slave for clocking purposes. The Intel I350 NIC can only operate in slave mode. Any interface on any device connected to an Intel I350 therefore needs to be configured to act only as master and to offer only master mode during autonegotiation. Otherwise the link may be unstable and may flap.
Resolution
The VDX Network Operating System CLI does not have a native setting for forcing a gigabit copper interface to perform in master mode. It is possible to force a VDX copper interface to be master by manipulating ASIC registers. Software versions NOS 7.0.2a, 7.0.2b, 7.0.2c, and any later 7.0.2x patches include a Python script named link-Configure-MASTER.py that performs this manipulation.

The addition of this Python script is documented in release notes as DEFECT000647282. Only the script was added and CLI help for the "execute-script" command was updated to list the script. Other than those two points, software was not changed. It is possible to upload the script to any currently supported version of NOS that supports the execute-script command and achieve the same functionality.

Release notes for NOS 7.2.0b, 7.3.0, and 7.4.0 list DEFECT000647282 under closed defects. In certain versions CLI help for "execute-script" was updated to list the script, but the script itself was erroneously not included in software packages. On a NOS 7.2.0x or later system that is missing the script, it will be listed in CLI help for "execute-script ?", but attempting to run it will return an error message.
VDX1# execute-script link-Configure-MASTER.py
Currently user created shell scripts are not supported using this command.
VDX1#

Examination of the location where the script is supposed to be will show that it is missing.
VDX1# unhide foscmd
Password: ********
VDX1# foscmd ls /scripts
bindACLonISL.py           ipfabric-config-automation.py    notify
clear_system_counters.py  ipfabric-egressNode-debugv2.py   restrict_ssh
crontab-input             ipfabric-ingressNode-debugv2.py  rte_cap_acl
db_script                 ipfabric-spineNode-debugv2.py    wlv_db
int-range                 mcastTimeout.py
VDX1# hide foscmd
VDX1#
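
If the directory listing is captured to a text file or a terminal log, the check above can be automated. The following is a minimal sketch, not part of NOS: the function name `script_is_listed` is hypothetical, and it assumes the listing is whitespace-separated columns of file names with no spaces in them, as in the sample output above.

```python
SCRIPT_NAME = "link-Configure-MASTER.py"


def script_is_listed(ls_output, name=SCRIPT_NAME):
    """Return True if `name` appears as a whole entry in column-formatted ls output."""
    # ls prints entries separated by whitespace across several columns,
    # so splitting on whitespace yields one token per file name.
    return name in ls_output.split()


# Listing copied from the example above (a system that is missing the script).
sample = """\
bindACLonISL.py           ipfabric-config-automation.py    notify
clear_system_counters.py  ipfabric-egressNode-debugv2.py   restrict_ssh
crontab-input             ipfabric-ingressNode-debugv2.py  rte_cap_acl
db_script                 ipfabric-spineNode-debugv2.py    wlv_db
int-range                 mcastTimeout.py
"""

print(script_is_listed(sample))  # False: the script is not in the listing
```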

The command "foscmd" in the example above is an extremely powerful tool that allows access to many undocumented functions. Free use of it without specific instructions from Extreme Networks is not supported. Activating it with "unhide foscmd" requires the root password. The default root password as documented in each version of the Troubleshooting Guide is "fibranne". The command is automatically hidden again after logout or reload, but for safety it should be hidden after use with "hide foscmd".

If a system is coldboot upgraded from a 7.0.2x release that includes the script to a 7.1.0x release, and then to a 7.2.0x or later release that is missing link-Configure-MASTER.py, the script that was included in the 7.0.2x release may remain. The script can also be copied from any 7.0.2x system that has it and uploaded to any 7.2.0x or later system that is missing it.

On a VDX running NOS 7.0.2b or a later 7.0.2x release that includes the script, the script can be copied off the VDX by using "foscmd" and "scp" to upload it to a separate SSH/SCP server.
  1. Execute "unhide foscmd".
  2. Verify management interface reachability to an SSH/SCP server. Either NOS CLI ping or foscmd ping can be used. For example:
    VDX2# ping 10.170.104.18 count 3 vrf mgmt-vrf
    
    VDX2# foscmd "ping -c 3 10.170.104.18"
    
  3. Use "foscmd" to make the VDX scp client upload the link-Configure-MASTER.py script to a server reachable from the management interface. It is important to use "| nomore" here. Otherwise you won't see the prompts for adding the server's key and for entering the password. You may see a prompt to add the server's key if this is the first time you're accessing it from this VDX. No asterisks are displayed when you enter the password.
    VDX2# foscmd "scp /scripts/link-Configure-MASTER.py extreme@10.170.104.18:." | nomore
    The authenticity of host '10.170.104.18 (10.170.104.18)' can't be established.
    ECDSA key fingerprint is SHA256:IoUeQmaRACaifyKJ05XJkk5T0IhgaoLgYRofuIJw4j0.
    Are you sure you want to continue connecting (yes/no)? yes
    Warning: Permanently added '10.170.104.18' (ECDSA) to the list of known hosts.
    extreme@10.170.104.18's password:
    link-Configure-MASTER.py                      100% 7075     3.9MB/s   00:00
    VDX2#
  4. Execute "hide foscmd".
The "execute-script" command uses Python scripts located in the "/scripts/" directory. On NOS 7.2.0x or any other systems that are missing link-Configure-MASTER.py, it can be downloaded with "foscmd" and "scp".
  1. Execute "unhide foscmd".
  2. Use foscmd and scp to download the script from an SCP/SSH server.
    VDX1# foscmd "scp extreme@10.170.104.18:./link-Configure-MASTER.py /scripts/." | nomore
    
  3. Execute "hide foscmd".
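
When the same transfer has to be repeated on several switches, it can help to generate the command lines rather than retype them. The sketch below only builds the strings shown in the two procedures above; it does not connect to anything. The function names `upload_cmd` and `download_cmd` are hypothetical, and the user name and server address are whatever applies in your environment.

```python
SCRIPT = "link-Configure-MASTER.py"


def _check(value):
    """Reject empty values and characters that would break the quoted foscmd argument."""
    if not value or any(c.isspace() or c in "\"'" for c in value):
        raise ValueError("unsafe or empty value: %r" % (value,))
    return value


def upload_cmd(user, server, script=SCRIPT):
    """Command to copy the script from /scripts/ on the VDX to the server's home directory."""
    return 'foscmd "scp /scripts/%s %s@%s:." | nomore' % (
        _check(script), _check(user), _check(server))


def download_cmd(user, server, script=SCRIPT):
    """Command to copy the script from the server's home directory to /scripts/ on the VDX."""
    return 'foscmd "scp %s@%s:./%s /scripts/." | nomore' % (
        _check(user), _check(server), _check(script))


print(upload_cmd("extreme", "10.170.104.18"))
print(download_cmd("extreme", "10.170.104.18"))
```

The generated lines still have to be entered on the VDX after "unhide foscmd", and the password prompt is interactive.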
The above method downloads the script to the "/scripts/" directory on the active partition which is usually SW/0. Copying it to the standby partition so it is available in case of ISSU or HA failover requires additional steps.

Each partition acts as a virtual management module and has an internal interface eth1 that allows it to communicate with the other partition. SW/0 eth1 has address 127.2.1.0, and SW/1 eth1 has address 127.2.2.0. These addresses can be used to copy files between the two partitions.
  1. Under "rbridge-id <RBID#>" settings for the VDX in question, add "root enable" and "ssh server standby enable".
  2. Execute "unhide foscmd".
  3. Use "show version" to check which partition is active.
  4. If the active partition is SW/0, use foscmd to scp the script to SW/1.
    VDX1# foscmd "scp /scripts/link-Configure-MASTER.py root@127.2.2.0:/scripts/" | nomore
    
    If the active partition is SW/1, you can scp the script to SW/0.
    VDX1# foscmd "scp /scripts/link-Configure-MASTER.py root@127.2.1.0:/scripts/" | nomore
  5. After the script is copied to the standby partition, execute "hide foscmd".
  6. Under "rbridge-id <RBID#>" settings for the VDX in question, remove "root enable" and "ssh server standby enable" by executing "no root enable" and "no ssh server standby enable".
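
The choice of destination address in step 4 depends only on which partition is active. A minimal sketch of that mapping follows, using the eth1 addresses stated above. The function name `standby_copy_cmd` is hypothetical, and the active partition is passed in by hand after reading "show version" rather than parsed from its output.

```python
# Internal eth1 addresses of the two partitions, as stated in the article.
ETH1_ADDRESS = {"SW/0": "127.2.1.0", "SW/1": "127.2.2.0"}


def standby_copy_cmd(active, path):
    """Build the foscmd line that copies `path` to the same directory on the standby partition."""
    if active not in ETH1_ADDRESS:
        raise ValueError("active partition must be 'SW/0' or 'SW/1'")
    standby = "SW/1" if active == "SW/0" else "SW/0"
    # Keep everything up to and including the last slash as the target directory.
    directory = path.rsplit("/", 1)[0] + "/"
    return 'foscmd "scp %s root@%s:%s" | nomore' % (
        path, ETH1_ADDRESS[standby], directory)


print(standby_copy_cmd("SW/0", "/scripts/link-Configure-MASTER.py"))
print(standby_copy_cmd("SW/1", "/scripts/link-Configure-MASTER.py"))
```

The same helper applies to any other file that must exist on both partitions, as long as the directory is the same on each side.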
On a system that has link-Configure-MASTER.py in the scripts directory where it is supposed to be, the first time you execute it you need to specify the option "1". When you do this you will be prompted to enter a space-separated list of interface numbers where you want to force master mode.
VDX1# execute-script link-Configure-MASTER.py 1

===========================================
PEM SCRIPT TRIGGERRED TO SET PORT AUTO .....
===========================================

Enter the interested port number to set as MASTER, Range 1 to 48

Enter the interested port number to set as MASTER

Enter a space-separated list of interface numbers and press Enter.

1 2 3 4 5 6

The script will force master mode on each interface that doesn't already have master mode forced. The script will also administratively disable and enable each interface where it makes changes.

List of valid 10GE ports ['1', '2', '3', '4', '5', '6']
List of Invalid 10GE ports []
Processing for interface : 1/0/1 ...
  Interface 1/0/1 is already set to Mastership value...skipping to next port..
Processing for interface : 1/0/2 ...
  Interface 1/0/2 is already set to Mastership value...skipping to next port..
Processing for interface : 1/0/3 ...
  Interface 1/0/3 is already set to Mastership value...skipping to next port..
Processing for interface : 1/0/4 ...
  Interface 1/0/4 is already set to Mastership value...skipping to next port..
Processing for interface : 1/0/5 ...
  Interface 1/0/5 is already set to Mastership value...skipping to next port..
Processing for interface : 1/0/6 ...
2018/08/30-11:19:46, [NSM-1020], 1831, SW/0 | Active | DCE, INFO, VDX1,  Interface TenGigabitEthernet 1/0/6 is administratively down.
2018/08/30-11:19:46, [NSM-1019], 1832, SW/0 | Active | DCE, INFO, VDX1,  Interface TenGigabitEthernet 1/0/6 is administratively up.
Configuration of MASTER value is completed successfully
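
The "valid" and "Invalid" lists in the output above suggest that the script sorts the entered numbers by the stated range of 1 to 48. The sketch below illustrates that kind of check so that a port list can be sanity-checked before it is typed in. It is an illustration only, not the script's actual code: the function name `split_ports` is hypothetical, and how the real script treats duplicates or non-numeric input is not documented here.

```python
def split_ports(line, low=1, high=48):
    """Split a space-separated port list into (valid, invalid) lists of strings."""
    valid, invalid = [], []
    for token in line.split():
        # Accept only plain ASCII digits that fall inside the allowed range.
        if token.isascii() and token.isdigit() and low <= int(token) <= high:
            valid.append(token)
        else:
            invalid.append(token)
    return valid, invalid


print(split_ports("1 2 3 4 5 6"))
# (['1', '2', '3', '4', '5', '6'], [])
print(split_ports("1 49 abc"))
# (['1'], ['49', 'abc'])
```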

One other task the script performs with option 1 is to write a list of interfaces to the file "/scripts/portNumFile.txt". Whenever you execute the script with option "2", it reads from that file instead of requiring a user to re-enter the list of interfaces.
VDX1# execute-script link-Configure-MASTER.py 2

===========================================
PEM SCRIPT TRIGGERRED TO SET PORT AUTO .....
===========================================

Below ports will set as Master

List of valid 10GE ports ['1', '2', '3', '4', '5', '6']
Processing for interface : 1/0/1 ...
  Interface 1/0/1 is already set to Mastership value...skipping to next port..
Processing for interface : 1/0/2 ...
  Interface 1/0/2 is already set to Mastership value...skipping to next port..
Processing for interface : 1/0/3 ...
  Interface 1/0/3 is already set to Mastership value...skipping to next port..
Processing for interface : 1/0/4 ...
  Interface 1/0/4 is already set to Mastership value...skipping to next port..
Processing for interface : 1/0/5 ...
  Interface 1/0/5 is already set to Mastership value...skipping to next port..
Processing for interface : 1/0/6 ...
  Interface 1/0/6 is already set to Mastership value...skipping to next port..
Configuration of MASTER value is completed successfully
VDX1#

Each time the script is run, it first checks the status of each interface before modifying it in any way. If the interface is already linked up in forced master mode, the script does not change or affect the interface in any way. There is no risk of unnecessary link flaps.

Each time the script tries to modify an interface, there is a small probability the attempt will fail. The best way to verify and retry as needed is to rerun the script with option 2. The script can safely be repeatedly run with option 2 until "is already set" is displayed for all interfaces.
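
If the script output is saved from the terminal session, deciding whether another run with option 2 is needed can be automated. The following is a minimal sketch under the assumption that the output has the format shown in the examples above. The function name `unconfirmed_interfaces` is hypothetical. It reports every interface that was processed without an "is already set" line, which is exactly the set that the next run should confirm.

```python
import re

PROCESSING = re.compile(r"^Processing for interface : (\S+) \.\.\.")
ALREADY = re.compile(r"^\s*Interface (\S+) is already set to Mastership value")


def unconfirmed_interfaces(output):
    """Return processed interfaces that did not report 'is already set'."""
    processed, confirmed = [], set()
    for line in output.splitlines():
        match = PROCESSING.match(line)
        if match:
            processed.append(match.group(1))
            continue
        match = ALREADY.match(line)
        if match:
            confirmed.add(match.group(1))
    return [name for name in processed if name not in confirmed]


# Excerpt of the option 1 output shown earlier in this article.
sample = """\
Processing for interface : 1/0/5 ...
  Interface 1/0/5 is already set to Mastership value...skipping to next port..
Processing for interface : 1/0/6 ...
2018/08/30-11:19:46, [NSM-1020], 1831, SW/0 | Active | DCE, INFO, VDX1,  Interface TenGigabitEthernet 1/0/6 is administratively down.
2018/08/30-11:19:46, [NSM-1019], 1832, SW/0 | Active | DCE, INFO, VDX1,  Interface TenGigabitEthernet 1/0/6 is administratively up.
Configuration of MASTER value is completed successfully
"""

print(unconfirmed_interfaces(sample))  # ['1/0/6']
```

An empty list means every interface reported "is already set" and no further run is needed.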

Running the script with option "1" on the active partition creates "/scripts/portNumFile.txt" on only the active partition. It will no longer be accessible after HA failover or ISSU upgrade. If you want to copy the file from the active partition to the standby partition in case of ISSU or HA failover, the steps are almost identical to those for copying the script.
  1. Under "rbridge-id <RBID#>" settings for the VDX in question, add "root enable" and "ssh server standby enable".
  2. Execute "unhide foscmd" and enter the root password.
  3. Using your normal admin login account, use "show version" to check which partition is active, SW/0 or SW/1.
  4. If the active partition is SW/0, use foscmd to scp the file to SW/1. For example:
    VDX1# foscmd "scp /scripts/portNumFile.txt root@127.2.2.0:/scripts/" | nomore
    If the active partition is SW/1, you can scp the file to SW/0 with the following command.
    VDX1# foscmd "scp /scripts/portNumFile.txt root@127.2.1.0:/scripts/" | nomore
  5. After the file is copied to the standby partition, execute "hide foscmd".
  6. Under "rbridge-id <RBID#>" settings for the VDX in question, remove "root enable" and "ssh server standby enable".
The procedure above for copying portNumFile.txt from the active partition to the standby partition would need to be repeated each time "execute-script link-Configure-MASTER.py 1" is executed to update portNumFile.txt.

With link-Configure-MASTER.py as it is distributed, it needs to be manually rerun with option "2" after every reload. NOS does include event handler functionality that can be used to execute Python scripts after events such as reloads. However, the NOS event handler can only use scripts in the "/var/config/vcs/scripts/" directory. It cannot use scripts in the "/scripts/" directory. Furthermore the NOS event handler does not allow execution of scripts with parameters like the "1" or "2" that link-Configure-MASTER.py needs.
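
Because the event handler cannot pass parameters, a script it starts has to pick a sensible option by itself. The sketch below shows one way to express that fallback. It is not taken from link-Configure-MASTER.py: the function name `choose_option` and the constant `VALID_OPTIONS` are hypothetical, and the only behavior assumed from this article is that options "1" and "2" exist and that "2" runs without prompting.

```python
import sys

VALID_OPTIONS = ("1", "2")


def choose_option(argv):
    """Return (ok, option) for a command line given as a list like sys.argv."""
    if len(argv) == 1:
        # Only the script name was supplied, as happens when an event
        # handler starts the script: fall back to the non-interactive option.
        return True, "2"
    if len(argv) == 2 and argv[1] in VALID_OPTIONS:
        return True, argv[1]
    return False, None


if __name__ == "__main__":
    ok, option = choose_option(sys.argv)
    print("option:", option if ok else "invalid arguments")
```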

Using link-Configure-MASTER.py with the NOS event handler requires downloading the script from a VDX that has it, modifying it so it runs with the "2" option by default when no option is specified, uploading it to the directory that the NOS event handler uses, and configuring an event handler.
  1. Under "rbridge-id <RBID#>" settings for the VDX in question, add "root enable" and "ssh server standby enable".
  2. From a remote client, use scp to download a copy of the script from the "/scripts/" directory.
    $ scp root@10.170.107.16:/scripts/link-Configure-MASTER.py ./link-Configure-MASTER_orig.py
    
  3. Create a copy of the script and preserve the original.
    $ cp link-Configure-MASTER_orig.py link-Configure-MASTER_mod.py
    
  4. Edit the script copy so that option "2" is used by default if none is specified. No knowledge of Python is required. The three lines to add go in after about line 81 of the original script. After you make those changes, a "diff" of the two files will show the following.
    $ diff link-Configure-MASTER_orig.py link-Configure-MASTER_mod.py
    81a82,84
    >     if (len(sys.argv) == 1):
    >        cmdType = "2"
    >        return (True, cmdType)
  5. Upload the edited script to "/var/config/vcs/scripts/" on the active partition of the VDX.
    $ scp link-Configure-MASTER_mod.py root@10.170.107.16:/var/config/vcs/scripts/.
    
  6. On the VDX, if you haven't already executed "execute-script link-Configure-MASTER.py 1" to create "/scripts/portNumFile.txt", do so now. The script you edited will use that same file at that same location.
  7. Execute "unhide foscmd" and enter the root password.
  8. Use "show version" to check the active partition.
  9. Use "foscmd" and "scp" to copy the modified script and the "/scripts/portNumFile.txt" file to the standby partition.
    If SW/0 is the active partition, copy them to SW/1.
    VDX1# foscmd "scp /scripts/portNumFile.txt root@127.2.2.0:/scripts/" | nomore
    
    VDX1# foscmd "scp /var/config/vcs/scripts/link-Configure-MASTER_mod.py root@127.2.2.0:/var/config/vcs/scripts/" | nomore
    If SW/1 is the active partition, copy them to SW/0.
    VDX1# foscmd "scp /scripts/portNumFile.txt root@127.2.1.0:/scripts/" | nomore
    
    VDX1# foscmd "scp /var/config/vcs/scripts/link-Configure-MASTER_mod.py root@127.2.1.0:/var/config/vcs/scripts/" | nomore
  10. After copying all files needed, execute "hide foscmd". Under "rbridge-id <RBID#>" settings for the VDX in question, remove "root enable" and "ssh server standby enable".
  11. Configure an event handler to be triggered by an event such as raslog VCS-1005 (VCS node rejoin), which occurs on each VCS member after reload and signals complete readiness to accept all commands.
    VDX1# show running-config event-handler
    event-handler Attempt5
     trigger 5 raslog VCS-1005
     action python-script link-Configure-MASTER_mod.py
    !
    VDX1#
  12. Apply that event handler to the rbridges where the script needs to be run at boot.
    VDX1# show running-config rbridge-id 1 event-handler
    rbridge-id 1
     event-handler activate Attempt5
      delay 10
      iterations 3
      interval 10
     !
    !
    VDX1#
    
  13. To confirm the effectiveness of the script and event handler combination, you should try reloading the rbridge once.
    VDX1# reload system rbridge-id 1
    
  14. After reload, verify that the script has been run by checking "show logging raslog" for instances of interfaces brought administratively down and up. Be aware that even if the triggering event happens multiple times, running the script multiple times is harmless. If master mode is already set on an interface, the interface won't be shut down again.
  15. At any time if you want to confirm that master mode has taken effect on the intended interfaces, you can run the original script "execute-script link-Configure-MASTER.py 2" or run the edited script with no options, "python link-Configure-MASTER_mod.py". Furthermore to verify that event handlers have been activated you can use "show event-handler activations".
  16. At any time you need to update the "portNumFile.txt" file you can use either "execute-script link-Configure-MASTER.py 1" or "python link-Configure-MASTER_mod.py 1". However after you do so, you need to configure "root enable" and "ssh server standby enable", execute "unhide foscmd", and copy "portNumFile.txt" to the standby partition again. For example, if SW/0 is active:
    VDX1# foscmd "scp /scripts/portNumFile.txt root@127.2.2.0:/scripts/" | nomore
    If SW/1 is active:
    VDX1# foscmd "scp /scripts/portNumFile.txt root@127.2.1.0:/scripts/" | nomore
    Finally, disable access that is not continuously necessary: Remove "root enable" and "ssh server standby enable".
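
The raslog check in step 14 can be scripted once the log is saved to a file. The sketch below is a minimal example that assumes "show logging raslog" prints NSM-1019 and NSM-1020 entries in the same format as the console messages shown earlier in this article. The function names `admin_toggles` and `toggled_interfaces` are hypothetical. Keep in mind that administrative down/up events can have other causes, so a match is an indication that the script ran, not proof.

```python
import re

ADMIN_EVENT = re.compile(
    r"\[(NSM-1019|NSM-1020)\].*Interface (\S+ \S+) is administratively (up|down)\."
)


def admin_toggles(raslog_text):
    """Map each interface name to its list of 'up'/'down' events in log order."""
    events = {}
    for line in raslog_text.splitlines():
        match = ADMIN_EVENT.search(line)
        if match:
            events.setdefault(match.group(2), []).append(match.group(3))
    return events


def toggled_interfaces(raslog_text):
    """Return interfaces that were taken administratively down and brought up again."""
    return sorted(
        name for name, seen in admin_toggles(raslog_text).items()
        if "down" in seen and "up" in seen
    )


# Log lines copied from the script output shown earlier in this article.
sample = """\
2018/08/30-11:19:46, [NSM-1020], 1831, SW/0 | Active | DCE, INFO, VDX1,  Interface TenGigabitEthernet 1/0/6 is administratively down.
2018/08/30-11:19:46, [NSM-1019], 1832, SW/0 | Active | DCE, INFO, VDX1,  Interface TenGigabitEthernet 1/0/6 is administratively up.
"""

print(toggled_interfaces(sample))  # ['TenGigabitEthernet 1/0/6']
```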
Additional notes
Depending on software version, DEFECT000647282 may also be documented as any one of NOS-53076, NOS-63089, NOS-63088, or NOS-52909.
