Category Archives: Networking

Meraki MS120 hardware overview

I received an innocent sounding question via GitHub, would the custom firmware I have been developing for the MS220 work on an MS120?

I am an eternal sucker for good mysteries involving hardware, so I found a seller on eBay offering an MS120-8-HW for $95 USD (plus shipping and customs to the EU). A few weeks of buyer’s remorse and waiting, and I had the MS120 in my hands.

One thing that immediately struck me about the MS120 is the material change from aluminum to steel. While I thought Meraki’s use of anodized aluminum in the MS220 series was a silly choice for the larger rack mounted models, it did make me think they were attempting to position themselves as the Apple Computer of networking products (“You pay the premium because we’re different”). Regardless of their intentions with the aluminum MS220 series, it was a precedent and it cheapens the experience to see them swap out aluminum for steel.

Let us continue, because whinging about metal choices is not bringing us closer to answering the original question.

Inside the MS120-8-HW

The MS120 is based on the Marvell Alleycat3 platform, referred to as kelpie-8 in the u-boot source and otherwise known by its marketing name “Prestera.” It is an ARMv7 core running at 400MHz with 512MB of DDR3 and 256MB of NAND flash.

The UART header is J16 at 115200n8 with the pinout:

Vcc (Do not connect)
Rx
Tx
GND

Pin 1 is closest to the SFP cage.

J17 is a mystery jumper. I have not identified its purpose yet.

There is 32Mbit (4MB) of SPI flash present, however as far as I can tell, this is connected directly to the Microsemi SmartFusion 2 and not to the Marvell ASIC. Using a hardware reader and a chip clip, I dumped the contents to examine it. Running binwalk yielded no results.

The entropy graph of the dump suggests that there are multiple copies of the same data stored, which follows Meraki’s design with the MS220 switches where there are primary and backup copies of the bootloader.

Entropy graph of the 32Mbit flash in the MS120

Further inspection confirms that there are two identical copies of what I think is u-boot stored in the flash, starting at 0x301000. Each copy is approximately 420KB, which would correspond to the size of u-boot for this platform. However, the entropy is much higher than the entropy of u-boot.bin built using the Meraki GPL source, and contains only one readable string: kelpie_top

Perhaps this is the output of u-boot after running doimage to enable Secure Boot and AES-128 encryption?

The PCB traces from the winbond flash appear to go directly to the SmartFusion 2, but the u-boot UART output shows that the BootROM is booting from SPI:

BootROM 1.41
Booting from SPI flash

Is the SmartFusion 2 emulating an SPI device to the Alleycat3 after verifying the integrity of the u-boot binary in ROM?

u-boot then executes an application at memory address 0x0C100000 that prints the value of multiple “SecureBoot Registers” the strings of which do not appear anywhere in the u-boot code provided in the GPL archive:

## Starting application at 0x0C100000 ...

----Security Versions----
SecureBoot: R03.11b39af022017-07-25
SB Core: F01114R18.1680555472017-07-12
Microloader: MG0008R01.0103302017

----SecureBoot Registers----
system_invalid: 0
boot_check_count_error: 0
boot_done: 1
boot_ok: 1
boot_check_count_golden: 0
boot_check_count_upgrade: 2
boot_status_golden: 0
boot_status_upgrade: 1
first_bootloader: 1

----Upgrade----
boot_error: 0
boot_check_count_error_vc: 0
boot_check_count_error: 0
boot_timeout_vc: 0
boot_timeout: 0
boot_cs_good: 1
boot_config_error: 0
boot_version_error: 0
boot_config_error_code: 0
boot_error_code: 0
boot_cs_good: 1
boot_version_error: 0
boot1_cs_key_type: 1
boot1_cs_return_code: 0
boot1_cs_key_index: 5
boot2_cs_return_code: 0
boot2_cs_key_index: 5
boot2_cs_key_type: 1

----Other Registers----
fpga_version: 001b

Meraki do not include the build toolchain in their GPL archive. Luckily, I remembered that I have encountered this Marvell fork of u-boot before, for the Western Digital EX2100 which also uses a Marvell Armada 385 from the same family of ARMv7 CPU cores. Western Digital does include the Marvell toolchain used to compile u-boot in their GPL archive, good guy Western Digital!

To save anyone else the effort of setting up a development environment with an ancient version of GCC, I have created a Dockerfile that will handle building u-boot using the Marvell toolchain. You can find this work on GitHub.

I reached out to members of the Doozan forum who have been building Linux images for Marvell based NAS devices for many years to see if they had any more information about Secure Boot. Apparently Marvell CPUs will always kwboot before loading from other sources such as SPI or NAND:

Even if a box has secure boot in stock FW (u-boot, kernel,..), you should be able to kwboot it with a non-secure u-boot/spl binary.

I tried to kwboot the MS120 (with and without the Armada patches), but was unable to get it working. For some reason, the BootROM output is printed twice when kwboot is running, which I have not witnessed during any normal boot sequence:

$ ./kwboot -b u-boot.uart -f -t -B 115200 /dev/ttyUSB0
Sending boot message. Please reboot the target.../
BootROM 1.41
Pattern detected on UART-
BootROM 1.41
Pattern detected on UART/

uart.bin was created using tools/marvell/doimage:

$ ./doimage -T uart -D 0x0 -E 0x0 u-boot.bin u-boot.uart

Unfortunately, this investigation leaves us with more questions than answers.

What is the contents of the 32Mbit SPI flash?
- Does the SmartFusion 2 only provide glue logic, or does it also protect/verify the contents of SPI flash?
Why won’t the Alleycat3 kwboot?
- Is the duplicate output from BootROM when kwboot is invoked a clue?
What is the purpose of the header J17?
Why did Meraki switch from aluminum to steel?

The MS120 series is a completely different platform from the previous MS220 series, which used Vitesse ASICs with a MIPS core, 128MB of DDR2, 16MB of SPI, and 128MB of NAND flash.

The use of Secure Boot will complicate efforts to create a third-party firmware for the MS120 series. However, the more immediate issue is that kwboot does not work and there is no obvious copy of u-boot in SPI flash we can modify to alter the boot process.

Meraki MS220: PoE support

11 Replies

The last several posts in this series have focused primarily on getting a custom firmware running on the Meraki MS220-series switches, without much regard for preserving existing features. Since I am now at a point where my custom firmware is functional as a Layer 2-ish switch, my attention has turned to PoE, since many switches in the series have PoE support and that is feature I think switch owners (especially MS220-8P) are interested in.

From my investigation into libpoecore included in the Meraki firmware, PoE on the MS220-8P appears to be managed by Microsemi’s PD690xx series of Power over Ethernet Management chips (datasheet). The PD690xx series communicates over I2C with the management CPU to manage PoE on the switch ports (enable/disable PoE, set 802.3af/at modes, query power consumed by a PoE device).

We can confirm that the PD690xx communicates via I2C by running poe_server from Meraki’s firmware and enabling I2C tracing in the kernel:

# cat /sys/kernel/debug/tracing/trace
# tracer: nop
#
# entries-in-buffer/entries-written: 1358/1358   #P:1
#
#                              _-----=> irqs-off
#                             / _----=> need-resched
#                            | / _---=> hardirq/softirq
#                            || / _--=> preempt-depth
#                            ||| /     delay
#           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
#              | |       |   ||||       |         |
      poe_server-682   [000] ....   560.356000: i2c_write: i2c-1 #0 a=030 f=0000 l=4 [13-32-0f-ff]
      poe_server-682   [000] ....   560.358000: i2c_result: i2c-1 n=1 ret=1
      poe_server-682   [000] ....   560.358000: i2c_write: i2c-1 #0 a=030 f=0000 l=2 [13-32]
      poe_server-682   [000] ....   560.358000: i2c_read: i2c-1 #1 a=030 f=0001 l=2
      poe_server-682   [000] ....   560.359000: i2c_reply: i2c-1 #1 a=030 f=0001 l=2 [0f-ff]
      poe_server-682   [000] ....   560.359000: i2c_result: i2c-1 n=2 ret=2
      poe_server-682   [000] ....   560.359000: i2c_write: i2c-1 #0 a=030 f=0000 l=4 [13-32-0f-ff]
      poe_server-682   [000] ....   560.360000: i2c_result: i2c-1 n=1 ret=1
      poe_server-682   [000] ....   560.360000: i2c_write: i2c-1 #0 a=030 f=0000 l=4 [13-9e-dc-03]
      poe_server-682   [000] ....   560.362000: i2c_result: i2c-1 n=1 ret=1

I2C tracing is extremely helpful, as running strace against poe_server directly will not yield useful output as to what operations it is performing to configure PoE.

While it is good news that we are able to recover the I2C commands via kernel tracing, it’s bad news in the sense that writing a new daemon to duplicate the features of poe_cli is non-trivial.

Thankfully, with the libpoecore from the Meraki firmware dump and free disassembly tools like Ghidra (sorry Hex-Rays, support MIPS in IDA Free ¯\_(ツ)_/¯), understanding some of the logic behind functionality provided by poe_server and poe_cli becomes much easier.

If you disassemble libpoecore, you can find the function hard_init which contains code to set up GPIO outputs. Interesting to note is that while the GPIO pins change depending on which switch ASIC is present, the sequence of GPIO outputs to configure the PD690xx remains constant.

Disassembler view of the function hard_init from the library libpoecore.so

The same GPIO configuration is executed when switch_brain is started (full strace output):

writev(1, [{iov_base="", iov_len=0}, {iov_base="echo 7 > /sys/class/gpio/export\n", iov_len=32}], 2echo 7 > /sys/class/gpio/export) = 32
writev(1, [{iov_base="", iov_len=0}, {iov_base="echo 12 > /sys/class/gpio/export\n", iov_len=33}], 2echo 12 > /sys/class/gpio/export) = 33
writev(1, [{iov_base="", iov_len=0}, {iov_base="echo out > /sys/class/gpio/gpio7/direction\n", iov_len=43}], 2echo out > /sys/class/gpio/gpio7/direction) = 43
writev(1, [{iov_base="", iov_len=0}, {iov_base="echo out > /sys/class/gpio/gpio12/direction\n", iov_len=44}], 2echo out > /sys/class/gpio/gpio12/direction) = 44
writev(1, [{iov_base="", iov_len=0}, {iov_base="echo 1 > /sys/class/gpio/gpio7/value\n", iov_len=37}], 2echo 1 > /sys/class/gpio/gpio7/value) = 37
writev(1, [{iov_base="", iov_len=0}, {iov_base="echo 0 > /sys/class/gpio/gpio12/value\n", iov_len=38}], 2echo 0 > /sys/class/gpio/gpio12/value) = 38
writev(1, [{iov_base="", iov_len=0}, {iov_base="echo 1 > /sys/class/gpio/gpio12/value\n", iov_len=38}], 2echo 1 > /sys/class/gpio/gpio12/value) = 38
writev(1, [{iov_base="", iov_len=0}, {iov_base="echo 0 > /sys/class/gpio/gpio7/value\n", iov_len=37}], 2echo 0 > /sys/class/gpio/gpio7/value) = 37
writev(1, [{iov_base="", iov_len=0}, {iov_base="echo 1 > /sys/class/gpio/gpio12/value\n", iov_len=38}], 2echo 1 > /sys/class/gpio/gpio12/value) = 38

The datasheet for the luton26 ASIC used in the MS220-8P, MS220-24P, and MS22 (VDMS-10393) doesn’t list anything connected to GPIO 7, and GPIO 12 is used for either SFP17_SD or PHY7_LED1 depending on the overlay function chosen. The functionality of these GPIO pins is undefined in the ASIC datasheet, however libpoecore is setting them and manipulating their outputs.

We can implement the logic of hard_init in an init script to set up the GPIO pins in the same way, and the result is that the PD690xx is configured for auto mode. I am not sure how, there is nothing in the PD690xx datasheet which suggests GPIO pins can be used to configure the operating mode, but the switch will automatically negotiate and power a PoE device.

Writing a new daemon to communicate with the PD690xx will ultimately be necessary if fine control over PoE functionality is to be achieved. Without I2C communication to the PD690xx, it is not possible to query the power budget, or limit port power delivery. In the mean time, for those who do not mind unmanaged “plug-and-play” style, PoE can be considered functional.

MS220-8P: Custom firmware from scratch

2 Replies

To follow up on my previous posts about modifying the firmware for the Cisco Meraki MS220-8P, I have more progress to report.

Since I could not figure out how to fix serial output from userspace when booting from u-boot, I went back to booting with RedBoot and tried to focus on improving userspace from my first sloppy attempt to coerce the Meraki firmware into booting from NOR.

Here is the current situation:

RedBoot without CRC or kernel size checks
Kernel 3.18.123 with xz compression
Busybox userspace based on buildroot 2020.02.1
Meraki kernel modules: vtss_core, proclikefs, merakiclick, elts_meraki, and vc_click are successfully loaded during boot
The click router is successfully initialized, creating the interfaces arping, linux_mgmt, sw0_pcap, and wired0

However, networking is non-functional. The switch broadcasts DHCP requests, but no packets are ever passed from the switching ASIC to the management CPU:

arping    Link encap:Ethernet  HWaddr 88:15:44:73:22:31
          inet addr:169.254.55.143  Bcast:169.254.255.255  Mask:255.255.0.0
          inet6 addr: fe80::8a15:44ff:fe73:2231/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:33 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 B)  TX bytes:5546 (5.4 KiB)

linux_mgmt Link encap:UNSPEC  HWaddr 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00
          UP POINTOPOINT RUNNING NOARP MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.255.255.255
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

sw0_pcap  Link encap:Ethernet  HWaddr 00:01:02:03:04:05
          inet addr:169.254.83.119  Bcast:169.254.255.255  Mask:255.255.0.0
          inet6 addr: fe80::201:2ff:fe03:405/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:32 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 B)  TX bytes:5476 (5.3 KiB)

wired0    Link encap:UNSPEC  HWaddr 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00
          UP POINTOPOINT RUNNING NOARP MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

On the other side:

tcpdump: listening on eth1, link-type EN10MB (Ethernet), capture size 262144 bytes
06:02:32.423798 IP6 (hlim 1, next-header Options (0) payload length: 36) fe80::201:c0ff:fe1f:9fb2 > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 1 group record(s) [gaddr ff02::1:ff1f:9fb2 to_ex { }]
06:02:33.097129 IP6 (hlim 1, next-header Options (0) payload length: 36) fe80::201:c0ff:fe1f:9fb2 > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 1 group record(s) [gaddr ff02::1:ff1f:9fb2 to_ex { }]
06:03:20.969517 IP (tos 0x0, ttl 64, id 20247, offset 0, flags [none], proto UDP (17), length 332, bad cksum 2a8b (->1677)!)
    0.0.10.10.68 > 10.10.255.255.67: [bad udp cksum 0x17ca -> 0x03b6!] BOOTP/DHCP, Request from 88:15:44:21:65:10 (oui Unknown), length 304, xid 0xd3650d76, secs 123, Flags [none] (0x0000)
          Client-Ethernet-Address 88:15:44:21:65:10 (oui Unknown)
          Vendor-rfc1048 Extensions
            Magic Cookie 0x63825363
            DHCP-Message Option 53, length 1: Discover
            Client-ID Option 61, length 15: hardware-type 255, 44:21:65:10:00:03:00:01:88:15:44:21:65:10
            SLP-NA Option 80, length 0""
            NOAUTO Option 116, length 1: Y
            MSZ Option 57, length 2: 1472
            Hostname Option 12, length 13: "m881544216510"
            T145 Option 145, length 1: 1
            Parameter-Request Option 55, length 14:
              Subnet-Mask, Classless-Static-Route, Static-Route, Default-Gateway
              Domain-Name-Server, Hostname, Domain-Name, MTU
              BR, Lease-Time, Server-ID, RN
              RB, Option 119
            END Option 255, length 0
06:04:25.620540 IP (tos 0x0, ttl 64, id 13784, offset 0, flags [none], proto UDP (17), length 332, bad cksum 43ca (->2fb6)!)
    0.0.10.10.68 > 10.10.255.255.67: [bad udp cksum 0x178a -> 0x0376!] BOOTP/DHCP, Request from 88:15:44:21:65:10 (oui Unknown), length 304, xid 0xd3650d76, secs 187, Flags [none] (0x0000)
          Client-Ethernet-Address 88:15:44:21:65:10 (oui Unknown)
          Vendor-rfc1048 Extensions
            Magic Cookie 0x63825363
            DHCP-Message Option 53, length 1: Discover
            Client-ID Option 61, length 15: hardware-type 255, 44:21:65:10:00:03:00:01:88:15:44:21:65:10
            SLP-NA Option 80, length 0""
            NOAUTO Option 116, length 1: Y
            MSZ Option 57, length 2: 1472
            Hostname Option 12, length 13: "m881544216510"
            T145 Option 145, length 1: 1
            Parameter-Request Option 55, length 14:
              Subnet-Mask, Classless-Static-Route, Static-Route, Default-Gateway
              Domain-Name-Server, Hostname, Domain-Name, MTU
              BR, Lease-Time, Server-ID, RN
              RB, Option 119
            END Option 255, length 0

The primary reason for this appears to be because Meraki is using the click modular router. Click seems to be an attempt to lobotomize the Linux kernel networking stack for the sake of writing academic research papers. The kernel side of click was never upstreamed to mainline and is no longer maintained. Let us pour one out for the poor soul at Cisco who has to maintain this.

Since Meraki was spun out of MIT and the researchers who wrote Click were also at MIT, the relationship between Meraki and Click is clear. Finally Click found a practical use, in a now-obsolete product line from Cisco.

More reverse engineering is necessary to make Click functional enough to have basic connectivity to the management CPU.

The strace output of switch_brain is available in this gist. By first running switch_brain, and then overwriting the default traffic rules with “allow all” I was able to SSH to the management CPU:

/ # echo "allow all" > /click/nat/common_switch_nat/from_smc_filter/config
/ # echo "allow all" > /click/nat/common_switch_nat/from_mgmt_filter/config
/ # tail /tmp/messages | grep dropbear
Jan  1 00:15:49 buildroot authpriv.info dropbear[680]: Child connection from 10.10.10.1:39248
Jan  1 00:15:50 buildroot authpriv.notice dropbear[680]: Password auth succeeded for 'root' from 10.10.10.1:39248
/ # w
USER            TTY             IDLE    TIME             HOST
root            pts/0           00:00   Jan  1 00:15:50  10.10.10.1

I think it is now very clear that the “only” thing blocking full access to the management CPU is the Click configuration. My hope is to build the configuration from the switch_brain strace output and package that into an initscript.

I have updated the meraki-builder repository with the buildroot config file and overlay directory I use to inject init scripts (among other things).

If you would like to contribute, I have written some installation instructions for the current development firmware. ~~If you find the correct voodoo of Click commands to access the management CPU, please open a PR~~ 🥰 The voodoo has been found.

«WatchMySys» Blog

Guaranteed to be pseudorandom

Category Archives: Networking

Meraki MS120 hardware overview

Meraki MS220: PoE support

MS220-8P: Custom firmware from scratch