Building a switch OS from scratch taught me how wrong that was. The SFP is barely half of it, and every type of optic needs code on the switch side to work.
We’re building EdgeNOS — a network OS written from scratch — on an Edgecore AS5610-52X with a Broadcom Trident+ ASIC. No vendor SDK black box: just the chip, an open toolkit covering a tenth of what you need, and lots of reverse engineering.
The part nobody warns you about: getting light onto fiber. Before a packet crosses an optical port, three independent things must all be right — and each fails the same way: “no link.”
- The chip’s SerDes has to lock its PCS.
- The retimer (a DS100DF410 between chip and cage) has to lock its clock recovery and tune its equalization. It keeps its output muted until the EQ is right, so the chip, optic, and fiber can all be perfect while the part in the middle silently holds the line dark. We only found the settings by capturing what a known-good NOS does on the same chip.
- The optic has to be powered, taken out of reset, mod-selected on the I²C mux, and have its control bytes set. On 40G the GPIO map was reversed relative to the schematic: the “obvious” pin order was backwards. That cost a day.
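Because all three layers fail as “no link,” it helps to ask them in a fixed order and report the first one that is not ready. Here is a minimal sketch of that idea. The `PortSnapshot` fields and the `first_blocker` function are hypothetical names for illustration, not EdgeNOS code or any SDK's API, and how each flag gets read from real hardware is left out entirely. The check order (optic, then retimer, then SerDes) is an assumption that follows the receive path, since a downstream lock failure is often just a symptom of something upstream.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PortSnapshot:
    """Hypothetical readings for one optical port (illustrative field names)."""
    optic_present: bool
    optic_in_reset: bool
    optic_tx_disabled: bool
    retimer_cdr_locked: bool
    retimer_output_muted: bool
    serdes_pcs_locked: bool


def first_blocker(p: PortSnapshot) -> Optional[str]:
    """Return the first layer that would keep the link dark, or None if all look ready."""
    # Start at the cage and walk inward toward the ASIC.
    if not p.optic_present:
        return "optic: not detected"
    if p.optic_in_reset:
        return "optic: held in reset"
    if p.optic_tx_disabled:
        return "optic: TX disabled"
    if not p.retimer_cdr_locked:
        return "retimer: CDR not locked"
    if p.retimer_output_muted:
        return "retimer: output muted"
    if not p.serdes_pcs_locked:
        return "serdes: PCS not locked"
    return None


# Everything healthy except the retimer, which is still muting its output.
dark = PortSnapshot(
    optic_present=True,
    optic_in_reset=False,
    optic_tx_disabled=False,
    retimer_cdr_locked=True,
    retimer_output_muted=True,
    serdes_pcs_locked=True,
)
print(first_blocker(dark))  # retimer: output muted
```

The point is less the checks themselves than the output: one line naming a layer instead of a bare “no link.”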
And the alphabet soup is real. Those “lock” states are specific IEEE 802.3 clauses, and the chip behaves differently for each:
- CL49 is 10GBASE-R: one lane, success = block-lock.
- CL82 is 40GBASE-R: four lanes that each alignment-marker-lock then deskew, all four or nothing (exactly where we’re stuck: 2 of 4).
- CL72/CL73 are link training and auto-negotiation. Turning training off made our marginal lanes worse by killing the receiver’s auto-adaptation.
- CL45 is the MDIO model you drive it all through.
Knowing which clause a port runs is half of telling a real L1 failure from a software one.
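The difference between the CL49 and CL82 success criteria is easy to state in code. This is a rough sketch, not how the chip or EdgeNOS reports status: `LaneStatus`, `pcs_link_up`, and `unlocked_lanes` are made-up names, and the per-lane flags stand in for whatever status bits the hardware actually exposes. The lane numbers in the example are invented too; only the "2 of 4" count comes from our bring-up.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LaneStatus:
    """Hypothetical per-lane PCS status flags."""
    block_lock: bool
    am_lock: bool = False  # alignment-marker lock, only meaningful for CL82


def pcs_link_up(clause: int, lanes: List[LaneStatus], deskewed: bool = False) -> bool:
    """Apply the clause-specific rule for calling the PCS 'up'."""
    if clause == 49:
        # 10GBASE-R: a single lane, block lock is the whole story.
        return len(lanes) == 1 and lanes[0].block_lock
    if clause == 82:
        # 40GBASE-R: every lane must lock, then the lanes must deskew together.
        return (
            len(lanes) == 4
            and all(lane.block_lock and lane.am_lock for lane in lanes)
            and deskewed
        )
    raise ValueError(f"no PCS lock rule modelled for clause {clause}")


def unlocked_lanes(lanes: List[LaneStatus]) -> List[int]:
    """Indices of lanes that have not reached alignment-marker lock."""
    return [i for i, lane in enumerate(lanes) if not (lane.block_lock and lane.am_lock)]


# A 40G port with two good lanes and two that never lock (lane numbers invented).
lanes_40g = [
    LaneStatus(block_lock=True, am_lock=True),
    LaneStatus(block_lock=True, am_lock=True),
    LaneStatus(block_lock=False),
    LaneStatus(block_lock=False),
]
print(pcs_link_up(82, lanes_40g))   # False
print(unlocked_lanes(lanes_40g))    # [2, 3]
print(pcs_link_up(49, [LaneStatus(block_lock=True)]))  # True
```

Two locked lanes out of four is still a down link under CL82, which is why a per-lane view matters more than the single up/down bit.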
The detail I’d missed: how 40G BiDi uses the LC pair. I knew duplex-LC and four lanes, but BiDi runs each fiber bidirectionally, with TX and RX sharing one strand on different wavelengths, so the pair carries all four lanes with no MPO ribbon. The optics aren’t even “handed”: it is the same part on both ends.
The nastiest “L1” bug wasn’t L1 at all: a status register our init never populated made the software think the link was down, so the chip dropped every frame from a link that was up. A stale status register masquerades as dead hardware.
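One cheap guard against this class of bug is to compare what software believes with what the hardware reports, and to flag any disagreement loudly. A minimal sketch, assuming both views can be reduced to a boolean; `Verdict` and `cross_check` are hypothetical names, and which register feeds each side is deliberately not modelled here.

```python
from enum import Enum


class Verdict(Enum):
    CONSISTENT = "software and hardware agree"
    STALE_DOWN = "hardware link is up but software believes it is down"
    STALE_UP = "hardware link is down but software believes it is up"


def cross_check(hw_link_up: bool, sw_link_up: bool) -> Verdict:
    """Compare the hardware link state with the software's cached view of it."""
    if hw_link_up == sw_link_up:
        return Verdict.CONSISTENT
    return Verdict.STALE_DOWN if hw_link_up else Verdict.STALE_UP


# Our case: the link was physically up, but software never saw it.
print(cross_check(hw_link_up=True, sw_link_up=False).value)
# hardware link is up but software believes it is down
```

A `STALE_DOWN` result points at software state, not at the optic, the retimer, or the fiber.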
Where we are: 10G SFP+ forwards real traffic — bidirectional ping to a Cisco Nexus, 0% loss, full MTU. 40G is the honest unfinished part — all four optics detect, only 2 of 4 lanes lock.
Bringing up silicon from scratch is 10% heroics and 90% building the instruments that let you see. When every failure looks identical, visibility is everything.
More soon. 🔦