To follow up on my previous posts about modifying the firmware for the Cisco Meraki MS220-8P, I have more progress to report.
Since I could not figure out how to fix serial output from userspace when booting from u-boot, I went back to booting with RedBoot and tried to focus on improving userspace from my first sloppy attempt to coerce the Meraki firmware into booting from NOR.
Here is the current situation:
- RedBoot without CRC or kernel size checks
- Kernel 3.18.123 with xz compression
- Busybox userspace based on buildroot 2020.02.1
- Meraki kernel modules: vtss_core, proclikefs, merakiclick, elts_meraki, and vc_click are successfully loaded during boot
- The click router is successfully initialized, creating the interfaces arping, linux_mgmt, sw0_pcap, and wired0
However, networking is non-functional. The switch broadcasts DHCP requests, but no packets are ever passed from the switching ASIC to the management CPU:
arping Link encap:Ethernet HWaddr 88:15:44:73:22:31
inet addr:169.254.55.143 Bcast:169.254.255.255 Mask:255.255.0.0
inet6 addr: fe80::8a15:44ff:fe73:2231/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:33 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:0 (0.0 B) TX bytes:5546 (5.4 KiB)
linux_mgmt Link encap:UNSPEC HWaddr 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00
UP POINTOPOINT RUNNING NOARP MULTICAST MTU:1500 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.255.255.255
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:65536 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
sw0_pcap Link encap:Ethernet HWaddr 00:01:02:03:04:05
inet addr:169.254.83.119 Bcast:169.254.255.255 Mask:255.255.0.0
inet6 addr: fe80::201:2ff:fe03:405/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:32 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:0 (0.0 B) TX bytes:5476 (5.3 KiB)
wired0 Link encap:UNSPEC HWaddr 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00
UP POINTOPOINT RUNNING NOARP MULTICAST MTU:1500 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
On the other side:
tcpdump: listening on eth1, link-type EN10MB (Ethernet), capture size 262144 bytes
06:02:32.423798 IP6 (hlim 1, next-header Options (0) payload length: 36) fe80::201:c0ff:fe1f:9fb2 > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 1 group record(s) [gaddr ff02::1:ff1f:9fb2 to_ex { }]
06:02:33.097129 IP6 (hlim 1, next-header Options (0) payload length: 36) fe80::201:c0ff:fe1f:9fb2 > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 1 group record(s) [gaddr ff02::1:ff1f:9fb2 to_ex { }]
06:03:20.969517 IP (tos 0x0, ttl 64, id 20247, offset 0, flags [none], proto UDP (17), length 332, bad cksum 2a8b (->1677)!)
0.0.10.10.68 > 10.10.255.255.67: [bad udp cksum 0x17ca -> 0x03b6!] BOOTP/DHCP, Request from 88:15:44:21:65:10 (oui Unknown), length 304, xid 0xd3650d76, secs 123, Flags [none] (0x0000)
Client-Ethernet-Address 88:15:44:21:65:10 (oui Unknown)
Vendor-rfc1048 Extensions
Magic Cookie 0x63825363
DHCP-Message Option 53, length 1: Discover
Client-ID Option 61, length 15: hardware-type 255, 44:21:65:10:00:03:00:01:88:15:44:21:65:10
SLP-NA Option 80, length 0""
NOAUTO Option 116, length 1: Y
MSZ Option 57, length 2: 1472
Hostname Option 12, length 13: "m881544216510"
T145 Option 145, length 1: 1
Parameter-Request Option 55, length 14:
Subnet-Mask, Classless-Static-Route, Static-Route, Default-Gateway
Domain-Name-Server, Hostname, Domain-Name, MTU
BR, Lease-Time, Server-ID, RN
RB, Option 119
END Option 255, length 0
06:04:25.620540 IP (tos 0x0, ttl 64, id 13784, offset 0, flags [none], proto UDP (17), length 332, bad cksum 43ca (->2fb6)!)
0.0.10.10.68 > 10.10.255.255.67: [bad udp cksum 0x178a -> 0x0376!] BOOTP/DHCP, Request from 88:15:44:21:65:10 (oui Unknown), length 304, xid 0xd3650d76, secs 187, Flags [none] (0x0000)
Client-Ethernet-Address 88:15:44:21:65:10 (oui Unknown)
Vendor-rfc1048 Extensions
Magic Cookie 0x63825363
DHCP-Message Option 53, length 1: Discover
Client-ID Option 61, length 15: hardware-type 255, 44:21:65:10:00:03:00:01:88:15:44:21:65:10
SLP-NA Option 80, length 0""
NOAUTO Option 116, length 1: Y
MSZ Option 57, length 2: 1472
Hostname Option 12, length 13: "m881544216510"
T145 Option 145, length 1: 1
Parameter-Request Option 55, length 14:
Subnet-Mask, Classless-Static-Route, Static-Route, Default-Gateway
Domain-Name-Server, Hostname, Domain-Name, MTU
BR, Lease-Time, Server-ID, RN
RB, Option 119
END Option 255, length 0
The primary reason for this appears to be because Meraki is using the click modular router. Click seems to be an attempt to lobotomize the Linux kernel networking stack for the sake of writing academic research papers. The kernel side of click was never upstreamed to mainline and is no longer maintained. Let us pour one out for the poor soul at Cisco who has to maintain this.
Since Meraki was spun out of MIT and the researchers who wrote Click were also at MIT, the relationship between Meraki and Click is clear. Finally Click found a practical use, in a now-obsolete product line from Cisco.
More reverse engineering is necessary to make Click functional enough to have basic connectivity to the management CPU.
The strace output of switch_brain is available in this gist. By first running switch_brain, and then overwriting the default traffic rules with “allow all” I was able to SSH to the management CPU:
/ # echo "allow all" > /click/nat/common_switch_nat/from_smc_filter/config
/ # echo "allow all" > /click/nat/common_switch_nat/from_mgmt_filter/config
/ # tail /tmp/messages | grep dropbear
Jan 1 00:15:49 buildroot authpriv.info dropbear[680]: Child connection from 10.10.10.1:39248
Jan 1 00:15:50 buildroot authpriv.notice dropbear[680]: Password auth succeeded for 'root' from 10.10.10.1:39248
/ # w
USER TTY IDLE TIME HOST
root pts/0 00:00 Jan 1 00:15:50 10.10.10.1
I think it is now very clear that the “only” thing blocking full access to the management CPU is the Click configuration. My hope is to build the configuration from the switch_brain strace output and package that into an initscript.
I have updated the meraki-builder repository with the buildroot config file and overlay directory I use to inject init scripts (among other things).
If you would like to contribute, I have written some installation instructions for the current development firmware. If you find the correct voodoo of Click commands to access the management CPU, please open a PR 🥰 The voodoo has been found.
Greetings! My MS220-8p is about to turn into a paper-weight in the next week or so. I was wondering if you could perhaps assist in helping me get your beta firmware running on it? I’ve done a few Meraki APs, but never touched the switches.
The installation instructions describe in detail how to install the firmware.