We’re coming up on three years now that [John, Warthog9] and I have been running the MicroMirror project, and I decided that it was about time to start working on the next generation of the MicroMirror appliance hardware design. Originally, our main appliance consisted of an HP T620 thin client with a 2TB M.2 SATA SSD in it, but after a lot of feedback from networks that they didn’t have any way to plug a 1000BASE-T server into their network, we pivoted to using the HP T620+ thin client with a ConnectX-3 MCX311A-XCAT NIC in the PCIe slot. So we now ship these thicker thin clients with a 10G-LR optic installed in the SFP+ cage on the server, but hosts are free to replace that optic with an SR optic or a DAC if their specific needs dictate something other than the 10G-LR default.

So as part of designing the next generation of our super tiny web servers, I wanted to try and qualify a solution to another piece of feedback we got from a few networks:

Can we have two SFP+ cages? We run MLAG + VARP on our routers and want to be able to reboot one of them without impacting the server.

Dual cage NICs have gotten significantly cheaper than they were three years ago when we locked in the bill of materials for the T620+ platform, so this is within the realm of possibility for us. The T620+ has a x4 gen2 PCIe slot, which only affords us 16Gbps, so we wouldn’t be able to run a 2x10G port channel at full speed. But 16Gbps is still larger than 10Gbps, and it’s extremely rare for the existing 10G MicroMirrors to even kiss 100% NIC utilization for a few seconds. We are also looking at new server options with better IO capabilities than the T620+, but we’ll limit this post to talking about the NIC.
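For reference, that bandwidth figure falls out of the per-lane signaling rate and line coding: 16 Gbps is what a x4 link delivers at gen2 signaling rates (at gen3’s 8 GT/s with 128b/130b coding, the same slot would carry roughly 31.5 Gbps):

```shell
# PCIe 2.0 runs each lane at 5 GT/s with 8b/10b line coding,
# so only 8/10 of the raw bit rate is usable payload
echo "x4 gen2: $((4 * 5 * 8 / 10)) Gbps usable"   # prints "x4 gen2: 16 Gbps usable"
```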

Not only have dual port NICs gotten cheaper in the last 3 years, but the ConnectX-4 based NICs have started flowing onto the secondhand market in volume and become affordable as well, so we might be interested in moving beyond the ConnectX-3 generation of PCIe NICs! Where the ConnectX-3 is based around a 10G serdes (so you can get Nx10G or Nx40G variants of the ConnectX-3 NICs), the ConnectX-4 is able to operate at 25G per lane. The ConnectX-4 Lx variant of the ASIC is a cost-reduced NIC that only has a 50Gbps packet pipeline inside of it, which means it can support 2x25G SFP28 ports (as well as a single 40G QSFP+ port, or a “depopulated” 50G QSFP28 port where only two of the four lanes of a 100G port are electrically active).

So where a 2x10G CX3 NIC can be had on eBay for around $24, the incremental cost to snagging a 2x25G CX4-Lx for around $26 makes it an interesting idea. Unfortunately, 25G is an extremely awkward Ethernet speed between 10G and 100G. The IEEE originally didn’t want it, so a separate industry consortium defined it first, and when the IEEE finally caved and ratified 802.3by, they standardized on a different forward error correcting code than the consortium did. So you have some Ethernet silicon out there in the wild that only supports the earlier “firecode” (BASE-R) FEC, where the IEEE expects you to be using Reed-Solomon RS(528,514) FEC. Ideally, the switch and the NIC should be able to agree on which of the two FEC options to use based on their clause 73 autonegotiation, but in reality many platforms don’t correctly implement clause 73 autoneg, and you’re left manually hardcoding the FEC mode on one or both sides if you don’t get lucky with the default being the same on both. Namely, the default on Arista EOS gear is reed-solomon, where the default for ConnectX-4 seems to be firecode, so knobs need to be turned to make 25G work. HUMPH!
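On the Linux side, the FEC knob lives in ethtool (the interface name here is just an example from my bringup box; on the switch side, EOS exposes the equivalent setting under the interface config as something like error-correction encoding reed-solomon, so treat that as a sketch from memory rather than gospel):

```shell
# Check which FEC mode the NIC is currently configured for / negotiated
ethtool --show-fec enp1s0f0

# Force IEEE Reed-Solomon FEC to match the Arista EOS default...
sudo ethtool --set-fec enp1s0f0 encoding rs

# ...or fall back to firecode/BASE-R FEC if the far side only speaks that
sudo ethtool --set-fec enp1s0f0 encoding baser
```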

What Does Any of This Have to Do with Firmware?

Right! So back to my point. So I bought one of these Dell branded CX4121C (ConnectX-4 Lx) 2x25G NICs on eBay for $26, and it arrived and I started playing with it. In the process of playing with it, I have fallen down the rabbit hole of ConnectX firmware, and figured I should write it down so I don’t forget it.

Dell CX4121C (ConnectX-4 Lx) NIC

One lesson I’ve learned from the MicroMirror project is to always update the firmware on secondhand hardware which we’re trying to deploy as critical load-bearing Internet infrastructure. Many of the CX3 NICs have arrived with very old firmware, and Mellanox/Nvidia have been kind enough to fix a whole bunch of performance / stability issues present in older firmwares, so we have seen tangible performance improvements in the past just from updating the firmware on our NICs. On AlmaLinux 9, we can simply install the open source Mellanox/Nvidia firmware tools with sudo dnf install mstflint pciutils, use lspci to figure out the PCIe bus location of the NIC, and query it:

[kenneth@nicbringup ~]$ lspci | grep Mellanox
01:00.0 Ethernet controller: Mellanox Technologies MT27710 Family [ConnectX-4 Lx]
01:00.1 Ethernet controller: Mellanox Technologies MT27710 Family [ConnectX-4 Lx]
[kenneth@nicbringup ~]$ sudo mstflint -d 01:00.0 query
Image type:            FS3
FW Version:            14.32.2004
FW Release Date:       13.1.2022
Product Version:       14.32.2004
Rom Info:              type=UEFI version=14.25.18 cpu=AMD64
                       type=PXE version=3.6.502 cpu=AMD64
Description:           UID                GuidsNumber
Base GUID:             b8599f03003e5016        8
Base MAC:              b8599f3e5016            8
Image VSD:             N/A
Device VSD:            N/A
PSID:                  DEL2420110034
Security Attributes:   N/A

The important facts to pull out of that output are that this NIC has a PSID of DEL2420110034, which identifies the exact board design (PCBA) that a firmware image is meant for, and the firmware version / release date of 14.32.2004 / 2022-1-13. Unfortunately, the DEL in the PSID means that this is a Dell branded NIC, so it isn’t an identical copy of the MCX4121A-ACAT product with the PSID of MT_2420110034, but it is really close. Many people report success manually forcing mstflint to burn these NICs with the MCX4121A-ACAT firmware, where the NICs seemingly work fine, but they do report that the link LEDs are then permanently stuck on. Unfortunately, there are also several weird things about the firmware version and release date that this NIC came with:

  1. 14.32.2004 is a higher version and build number than I can find any reference to from Dell
  2. 14.32.2004 is a higher version number than Nvidia claims is the latest and greatest
  3. 2022-1-13 is a release date 2 years earlier than that of the latest 14.32.1900 firmware from Nvidia

So it seems like Dell has stopped releasing firmware updates for their silo of ConnectX-4 Lx NICs, and exactly where this current firmware lands in the progression of Nvidia’s bug fixes is murky. Being the “I like to understand the underlying mechanisms” type of guy I am, I started digging through everything I could find on Nvidia’s website, reading WAY too many ServeTheHome forum threads, a dash of homelabbers making fools of themselves on Reddit, and I think I’ve got it. So, as is often the case, BUCKLE UP.

What Makes a Firmware?

To meaningfully talk about what makes up a firmware image, we need to be completely clear on what a firmware image is and what it’s doing. Firmware is the software and configuration burned to a flash chip on the NIC card; usually stored in a SPI flash chip soldered onto the PCBA of the NIC next to the ConnectX ASIC itself. The ASIC loads the software off the flash chip to configure the ASIC, and a combination of that software running on the ASIC and the driver running up in the host operating system query the configuration stored in the flash chip for specifics like “what kind of NIC am I?” and “what is my MAC address?”, etc.

So while the ASIC software is going to be the same across all of the products using the ConnectX-4 Lx ASIC, the configuration is going to vary based on how each NIC product has wired up the ~50 GPIO pins on the ASIC (signals to/from the SFPs, LEDs, etc) and how the data lanes are exposed to the user (a single SFP port, multiple SFP ports, a QSFP port, etc). Furthermore, the MAC address is going to vary across every individual part.

When you go download the BIN from Nvidia’s website for a specific SKU and burn it to the NIC, the update process is smart enough to preserve the globally unique MAC address and GUID section of the flash, but overwrites the rest of the flash with the provided BIN.
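In mstflint terms, that update looks something like this (the firmware filenames here are illustrative; use whatever BIN you downloaded for your PSID):

```shell
# Burn a stock image; mstflint preserves the existing GUID/MAC
# section of the flash by default
sudo mstflint -d 01:00.0 -i fw-ConnectX4Lx-rel.bin burn

# Cross-flashing an image built for a *different* PSID (e.g. the MT_
# firmware onto a DEL board) requires explicitly overriding the check
sudo mstflint -d 01:00.0 -i fw-MCX4121A-ACAT.bin -allow_psid_change burn
```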

To recap, the firmware stored in the flash memory on the NIC consists of the following:

  1. The ASIC software that runs on the NIC silicon
  2. The NIC configuration that describes how the rest of the NIC was built around the ASIC
  3. The globally unique MAC address / GUID section that you don’t want to change with a firmware update
  4. One or more boot ROMs, supporting BIOS vs UEFI and AMD64 vs ARM depending on what host you care about being able to boot over the network using this NIC

Note that none of these are the drivers being used by the OS, so we’re not talking about the mlx4 or mlx5 kernel modules in Linux at all here. The ASIC software is the other side of the conversation that the kernel driver is having with the attached hardware, so when you update the driver in your OS to fix something, it’s one rung higher than what we’re dealing with here and something that’s running on the host CPU itself, not the microcontroller inside the NIC.
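A quick way to see the two layers side by side on a running system is ethtool’s driver info (interface name is just an example):

```shell
# "driver" / "version" describe the mlx5 code running on the host CPU;
# "firmware-version" is the ASIC software we're talking about here
ethtool -i enp1s0f0
```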

Thanks mostly to a helpful thread on STH, it looks like the technically correct answer for how to update the firmware on these Dell CX4121C NICs beyond what Dell has bothered to release themselves is to use the mlxburn tool from Nvidia to compile the latest ASIC software (which gets released as an MLX file) together with the hardware-specific configuration file for the Dell board design (which you can read off the NIC via sudo mstflint -d [PCIe-Address] dc), so the ASIC can have the latest software fixes while also being correctly aware of what is connected to the ASIC where.
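Mechanically, per Nvidia’s mlxburn documentation, that recipe would look roughly like this (the .mlx filename and the mst device name are illustrative, not files I actually have):

```shell
# Dump this specific board's configuration section out of the live flash
sudo mstflint -d 01:00.0 dc > dell_cx4121c.ini

# Compile the ASIC software (.mlx) together with the board config
# and burn the result onto the NIC
sudo mlxburn -d /dev/mst/mt4117_pci_cr0 -fw fw-ConnectX4Lx-rel.mlx -conf dell_cx4121c.ini
```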

Using Nvidia’s MCX4121A-ACAT / MT_2420110034 firmware on this NIC only mostly works because, by happy accident, Dell happened to keep the critical function-to-pin mappings the same, like which serdes lanes go to each SFP28 port. Dell apparently used a different GPIO for their link up indicator than Nvidia did, which is why that LED is stuck on after the update.

The obvious solution here is to just grab the MLX file for the latest 14.32.1900 firmware, compile it with Dell’s firmware configuration, which spells out the GPIO mappings, how many virtual functions to allocate to the NIC, and the high speed signal pathway tuning parameters specific to this PCB design, and we’re all set!

The Fatal Flaw in the Plan

Very sound and good plan: compile a new firmware using the latest MLX and Dell’s specific firmware config. The only problem is that Nvidia has stopped publishing the MLX files for their ConnectX ASICs. Presumably Dell gets access to the MLX files since they’re an OEM shipping a metric buttload of Nvidia’s silicon in their devices here, but lowly mortals poking around at a few random NICs we bought off eBay are left out in the cold…

Hope you didn’t get too excited reading all of this. I just thought it was a good nugget of understanding, and saves you a lot of effort trying to also figure out how to run the mlxburn tool.