Can't boot Pi 5 via NVMe behind PCIe switch / bridge #1833
Comments
It's not supported right now |
What "PCIe switch" do you have? |
@peterharperuk - I'm testing with this I/O Crest switch:
|
I've just come across a HAT+ (Geekworm X1004) with an ASM1182e PCIe switch for two SSDs and the manufacturer claims it can't be used for booting at this time. It looks related. |
@cnxsoft - It is most definitely related! And thanks for your post on that board. I really do hope Raspberry Pi can support at least a few switches (the ASMedia ones seem extremely popular with smaller companies; I don't see Pericom that often). |
I heard the guys from Pineberry are also already experimenting with a hat that supports 2 m.2 devices... (no promises though.. ) 2 NVME drives or 1 NVME drive and room for a pcie coral TPU would be nice but of course we need to be able to boot from it too if a pcie switch is involved :) |
Removed off topic comment about a CM4. This issue is specifically about Pi5 / BCM2712 |
That's covered in the documentation. From https://www.raspberrypi.com/documentation/computers/raspberry-pi-5.html#pcie-gen-3-0:
If Gen 3 was guaranteed to be stable then it would be enabled by default. |
@Ronald1817 - The title of this GitHub issue is "Can't boot Pi 5 via NVMe behind PCIe switch / bridge" and is meant to discuss that topic, not general NVMe SSD issues you may be having. The Raspberry Pi forums, or the manufacturer of the HAT / Bottom you're using, would be a better avenue for questions about Pi boot issues with NVMe. That is all that is implied by @timg236—this GitHub repository is for issues pertaining to the Pi firmware, and this issue is about a specific feature request. These comments don't have anything to do with that, so typically the maintainers will clean them out at some point. |
A new dual NVMe rpi5 product has appeared with a PCIe switch in it https://pimoroni.com/nvmeduo |
@greyltc - Looks like @timg236 / @pelwell — it seems like by some stroke of luck, all the main vendors of PCIe addon boards have settled on the asmedia |
https://pineberrypi.com/products/hatdrive-ai-coral-edge-tpu-bundle-nvme-2230-2242-gen-2-for-raspberry-pi-5 |
Recently I bought a PCIe HAT from aliexpress.com. The link is https://www.aliexpress.us/item/3256806347359812.html. But I found two interesting things:
Let me explain below what I did. I didn't go in depth to find out how it worked, but at a high level,
I don't know exactly what I did, other than expanding the root partition on the NVMe to use its full size (512G; the SD card was 128G). While preparing to answer this, I found the line below in the lsblk output:
mmcblk0 179:0 0 116.2G 0 disk
Both boot and root partitions are used from NVMe only. This is now running as my home router / gateway, so I can't do many tests immediately. But I will try rebooting multiple times, and booting without the SD card, when I get a chance and post what happens here. |
Can I please add the ASM1184e PCIe packet switch chip to this issue? It shows up on the ZS ZHISHANG PCI-E X1 to 4 PCI-E X16 Expansion Riser as documented by @geerlingguy. The ASM1182e is mentioned above. It's a 2 by PCIe chip vs the ASM1184e which is 4 by PCIe chip; so similar(ish). Boards available on Amazon via:
I will note that this PCIe bridge works fine once the system is booted. |
In principle any switch that follows the spec should work with generic firmware (once written). Supporting / debugging individual switch/PCIe devices would be up to the HAT designer i.e. debugging low level electronic issues. |
To be clear, by "obeys the spec" I mean that anything that requires a ton of quirks and interop fixes just won't be supported. No immediate plans to do this though |
Agreed. The intention would be to support booting from an NVMe drive behind a switch; probably only simple hardware topologies would be supported |
@timg236 - The other use case I have is to have multiple boot drives in a compact solution, so a Pi could boot into different OSes more easily than with a bunch of USB stuff hanging off it (or PXE boot). That... would be interesting to do from the Pi OS default firmware experience, though a UEFI bootloader could have boot order priority. I don't know the default way the bootloader picks which device to boot from first, if there are multiple on one bus like USB. Is it by device ID? |
The bootloader enumerates MSD in parallel and you get the first one that looks like valid OS. It’s mentioned in the msd docs. |
It is drive related, some of them need a PCIe compat flag for ASPM behind a switch. I don't know anything about Shenzhen Longsys Electronics controllers, but adding one of those overlays should make it work:
@HonzaJaros generally you should contact our support before raising an issue here, it is slightly offtopic (as @pelwell highlighted) |
I see. The TPU issue is a distraction here. It is worth trying the TPU with combinations of the compatibility flags @mikegapinski mentions above, but that won't affect booting. |
@pelwell Yes and no, drives with Phison E12 controllers won't boot past bootloader without the no-l0s flag. Coral needs no-mip. But that's a different story |
Yes, that's possible - but then it would be a kernel issue, not a firmware issue. |
Hi Mike, I did, 2x (email and form). Sorry, I didn't realise that it's not related. |
Hello, is there already a solution for this? I would like to use the HatDrive! AI board from Pineboards for my Pi 5 with NVMe and use the TPU chip for my local AI environment. Thank you |
The latest bootloader firmware has some support for PCIe switches (apt update / apt upgrade). In a future update (next couple of weeks) we will be switching the bootloader to use mmio-hi and make that the default. |
Tim, does that mean we will be able to use Commander and NVMe and dual Coral? Which means a switch in the Commander, and another switch in the dual TPU.
Thanks,
Honza Jaros
|
Hi @timg236, is the latest bootloader I haven't tried it yet because I'm waiting for confirmation that a dual NVME M.2 can boot before changing my current system from the single NVME M2. Thanks!! |
No |
Hi, is there already a PR for the new bootloader to follow? I would be highly interested as I am using a pineboards hat that bundles an AI accelerator and a NVMe drive adapter. The Pi5 used to boot from it but stopped doing so after the latest bootloader update from version September 23rd to October 9 (the devices are recognized though after booting from a USB) . Also, later I would like to use it behind a pcie bridge as the one that @geerlingguy described. Cheers, |
@konsim83 Need more details. Where is it failing? Do you see any errors? It should have worked since the 2024-06-05 release of the bootloader. |
Hi @peterharperuk, here is some output:
$ sudo lspci -v
0000:00:00.0 PCI bridge: Broadcom Inc. and subsidiaries BCM2712 PCIe Bridge (rev 21) (prog-if 00 [Normal decode])
Flags: bus master, fast devsel, latency 0, IRQ 38
Bus: primary=00, secondary=01, subordinate=04, sec-latency=0
Memory behind bridge: 80000000-800fffff [size=1M] [32-bit]
Prefetchable memory behind bridge: 1800000000-18000fffff [size=1M] [32-bit]
Capabilities: [48] Power Management version 3
Capabilities: [ac] Express Root Port (Slot-), MSI 00
Capabilities: [100] Advanced Error Reporting
Capabilities: [160] Virtual Channel
Capabilities: [180] Vendor Specific Information: ID=0000 Rev=0 Len=028 <?>
Capabilities: [240] L1 PM Substates
Capabilities: [300] Secondary PCI Express
Kernel driver in use: pcieport
0000:01:00.0 PCI bridge: ASMedia Technology Inc. ASM1182e 2-Port PCIe x1 Gen2 Packet Switch (prog-if 00 [Normal decode])
Subsystem: ASMedia Technology Inc. ASM1182e 2-Port PCIe x1 Gen2 Packet Switch
Flags: bus master, fast devsel, latency 0, IRQ 38
Bus: primary=01, secondary=02, subordinate=04, sec-latency=0
I/O behind bridge: [disabled] [32-bit]
Memory behind bridge: 80000000-800fffff [size=1M] [32-bit]
Prefetchable memory behind bridge: 1800000000-18000fffff [size=1M] [32-bit]
Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit+
Capabilities: [78] Power Management version 3
Capabilities: [80] Express Upstream Port, MSI 00
Capabilities: [c0] Subsystem: ASMedia Technology Inc. ASM1182e 2-Port PCIe x1 Gen2 Packet Switch
Capabilities: [100] Virtual Channel
Capabilities: [200] Advanced Error Reporting
Capabilities: [300] Vendor Specific Information: ID=0000 Rev=0 Len=c00 <?>
Kernel driver in use: pcieport
0000:02:03.0 PCI bridge: ASMedia Technology Inc. ASM1182e 2-Port PCIe x1 Gen2 Packet Switch (prog-if 00 [Normal decode])
Subsystem: ASMedia Technology Inc. ASM1182e 2-Port PCIe x1 Gen2 Packet Switch
Flags: bus master, fast devsel, latency 0, IRQ 40
Bus: primary=02, secondary=03, subordinate=03, sec-latency=0
I/O behind bridge: [disabled] [32-bit]
Memory behind bridge: 80000000-800fffff [size=1M] [32-bit]
Prefetchable memory behind bridge: [disabled] [64-bit]
Capabilities: [50] MSI: Enable+ Count=1/1 Maskable- 64bit+
Capabilities: [78] Power Management version 3
Capabilities: [80] Express Downstream Port (Slot+), MSI 00
Capabilities: [c0] Subsystem: ASMedia Technology Inc. ASM1182e 2-Port PCIe x1 Gen2 Packet Switch
Capabilities: [100] Virtual Channel
Capabilities: [200] Advanced Error Reporting
Kernel driver in use: pcieport
0000:02:07.0 PCI bridge: ASMedia Technology Inc. ASM1182e 2-Port PCIe x1 Gen2 Packet Switch (prog-if 00 [Normal decode])
Subsystem: ASMedia Technology Inc. ASM1182e 2-Port PCIe x1 Gen2 Packet Switch
Flags: bus master, fast devsel, latency 0, IRQ 41
Bus: primary=02, secondary=04, subordinate=04, sec-latency=0
I/O behind bridge: [disabled] [32-bit]
Memory behind bridge: [disabled] [32-bit]
Prefetchable memory behind bridge: 1800000000-18000fffff [size=1M] [32-bit]
Capabilities: [50] MSI: Enable+ Count=1/1 Maskable- 64bit+
Capabilities: [78] Power Management version 3
Capabilities: [80] Express Downstream Port (Slot+), MSI 00
Capabilities: [c0] Subsystem: ASMedia Technology Inc. ASM1182e 2-Port PCIe x1 Gen2 Packet Switch
Capabilities: [100] Virtual Channel
Capabilities: [200] Advanced Error Reporting
Kernel driver in use: pcieport
0000:03:00.0 Non-Volatile memory controller: MAXIO Technology (Hangzhou) Ltd. NVMe SSD Controller MAP1202 (rev 01) (prog-if 02 [NVM Express])
Subsystem: MAXIO Technology (Hangzhou) Ltd. NVMe SSD Controller MAP1202 (DRAM-less)
Flags: bus master, fast devsel, latency 0, IRQ 39
Memory at 1b80000000 (64-bit, non-prefetchable) [size=16K]
Capabilities: [40] Power Management version 3
Capabilities: [50] MSI: Enable- Count=1/32 Maskable+ 64bit+
Capabilities: [70] Express Endpoint, MSI 1f
Capabilities: [b0] MSI-X: Enable+ Count=9 Masked-
Capabilities: [100] Advanced Error Reporting
Capabilities: [148] Device Serial Number 00-00-00-00-00-00-00-00
Capabilities: [158] Alternative Routing-ID Interpretation (ARI)
Capabilities: [168] Secondary PCI Express
Capabilities: [1d4] Latency Tolerance Reporting
Capabilities: [1dc] L1 PM Substates
Capabilities: [1ec] Vendor Specific Information: ID=0002 Rev=4 Len=100 <?>
Capabilities: [2ec] Vendor Specific Information: ID=0001 Rev=1 Len=038 <?>
Kernel driver in use: nvme
0000:04:00.0 Co-processor: Hailo Technologies Ltd. Hailo-8 AI Processor (rev 01)
Subsystem: Hailo Technologies Ltd. Hailo-8 AI Processor
Flags: bus master, fast devsel, latency 0, IRQ 39
Memory at 1800000000 (64-bit, prefetchable) [size=16K]
Memory at 1800008000 (64-bit, prefetchable) [size=4K]
Memory at 1800004000 (64-bit, prefetchable) [size=16K]
Capabilities: [80] Express Endpoint, MSI 00
Capabilities: [e0] MSI: Enable- Count=1/1 Maskable- 64bit+
Capabilities: [f8] Power Management version 3
Capabilities: [100] Vendor Specific Information: ID=1556 Rev=1 Len=008 <?>
Capabilities: [108] Latency Tolerance Reporting
Capabilities: [110] L1 PM Substates
Capabilities: [128] Alternative Routing-ID Interpretation (ARI)
Capabilities: [200] Advanced Error Reporting
Capabilities: [300] Secondary PCI Express
Kernel driver in use: hailo
Kernel modules: hailo_pci
0001:00:00.0 PCI bridge: Broadcom Inc. and subsidiaries BCM2712 PCIe Bridge (rev 21) (prog-if 00 [Normal decode])
Flags: bus master, fast devsel, latency 0, IRQ 47
Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
Memory behind bridge: 00000000-005fffff [size=6M] [32-bit]
Prefetchable memory behind bridge: [disabled] [64-bit]
Capabilities: [48] Power Management version 3
Capabilities: [ac] Express Root Port (Slot-), MSI 00
Capabilities: [100] Advanced Error Reporting
Capabilities: [160] Virtual Channel
Capabilities: [180] Vendor Specific Information: ID=0000 Rev=0 Len=028 <?>
Capabilities: [240] L1 PM Substates
Capabilities: [300] Secondary PCI Express
Kernel driver in use: pcieport
0001:01:00.0 Ethernet controller: Raspberry Pi Ltd RP1 PCIe 2.0 South Bridge
Flags: bus master, fast devsel, latency 0, IRQ 47
Memory at 1f00410000 (32-bit, non-prefetchable) [size=16K]
Memory at 1f00000000 (32-bit, non-prefetchable) [virtual] [size=4M]
Memory at 1f00400000 (32-bit, non-prefetchable) [size=64K]
Capabilities: [40] Power Management version 3
Capabilities: [70] Express Endpoint, MSI 00
Capabilities: [b0] MSI-X: Enable+ Count=61 Masked-
Capabilities: [100] Advanced Error Reporting
Kernel driver in use: rp1
Could it be an incompatibility with the NVMe controller?
$ lsblk -f
NAME FSTYPE FSVER LABEL UUID FSAVAIL FSUSE% MOUNTPOINTS
sda
├─sda1 vfat FAT32 bootfs 9BE2-1346 434.4M 15% /boot/firmware
└─sda2 ext4 1.0 rootfs 12974fe2-889e-4060-b497-1d6ac3fbbb4b 20G 24% /
nvme0n1
├─nvme0n1p1 vfat FAT32 bootfs 91FE-7499
└─nvme0n1p2 ext4 1.0 rootfs 56f80fa2-e005-4cca-86e6-19da1069914d
bootloader
$ sudo rpi-eeprom-config
[all]
BOOT_UART=1
POWER_OFF_ON_HALT=0
BOOT_ORDER=0xf416
#NET_INSTALL_AT_POWER_ON=1
and bootloader version
CURRENT: Wed Oct 9 11:36:47 PM UTC 2024 (1728517007)
UPDATE: Wed Oct 9 11:36:47 PM UTC 2024 (1728517007)
BOOTFS: /boot/firmware
$ sudo cat /boot/firmware/config.txt
# For more options and information see
# http://rptl.io/configtxt
# Some settings may impact device functionality. See link above for details
# Uncomment some or all of these to enable the optional hardware interfaces
dtparam=i2c_arm=on
dtparam=i2s=on
dtparam=spi=on
# Enable audio (loads snd_bcm2835)
dtparam=audio=on
# Additional overlays and parameters are documented
# /boot/firmware/overlays/README
# Automatically load overlays for detected cameras
camera_auto_detect=1
# Automatically load overlays for detected DSI displays
display_auto_detect=1
# Automatically load initramfs files, if found
auto_initramfs=1
# Enable DRM VC4 V3D driver
dtoverlay=vc4-kms-v3d
max_framebuffers=2
# Don't have the firmware create an initial video= setting in cmdline.txt.
# Use the kernel's default instead.
disable_fw_kms_setup=1
# Run in 64-bit mode
arm_64bit=1
# Disable compensation for displays with overscan
disable_overscan=1
# Run as fast as firmware / board allows
arm_boost=1
[cm4]
# Enable host mode on the 2711 built-in XHCI USB controller.
# This line should be removed if the legacy DWC2 controller is required
# (e.g. for USB device mode) or if USB support is not required.
otg_mode=1
[cm5]
dtoverlay=dwc2,dr_mode=host
[all]
dtparam=pciex1
dtparam=pciex1_gen=2 # may be redundant, but let's be explicit
usb_max_current_enable=1
This is a picture that I see upon boot from NVMe: |
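The `lspci -v` dump above shows the topology the bootloader has to walk: root port, switch upstream port, two downstream ports, then the NVMe drive and the Hailo device. As a rough aid for reading such dumps, the sketch below pulls out only the bridge lines and their bus ranges. The function name `bridge_ranges` is made up for this illustration, and it assumes the `Bus: primary=.., secondary=.., subordinate=..` line format shown in the output above.

```shell
#!/bin/sh
# Hypothetical helper: read `lspci -v` text on stdin and print, for every
# bridge, the range of bus numbers that sits behind it.
bridge_ranges() {
    awk '
        # Device header line, e.g. "0000:01:00.0 PCI bridge: ..."
        /^([0-9a-f]+:)?[0-9a-f]+:[0-9a-f]+\.[0-9a-f]+ / { dev = $1 }
        # Bridge bus line, e.g. "Bus: primary=01, secondary=02, subordinate=04, ..."
        /Bus: primary=/ {
            gsub(/,/, "")
            for (i = 1; i <= NF; i++) {
                split($i, kv, "=")
                if (kv[1] == "secondary")   first = kv[2]
                if (kv[1] == "subordinate") last  = kv[2]
            }
            print dev, "-> buses", first "-" last
        }
    '
}

# Usage on a running system (not executed here):
#   sudo lspci -v | bridge_ranges
```

A device such as the NVMe controller at `0000:03:00.0` is reachable only if every bridge above it has a bus range that includes bus 03, which is why firmware that stops at the root port cannot see it.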
@peterharperuk that is concerning... the ASM1182e is the most popular switch and the board mentioned (the Hailo bundle) has the latest revision of the IC as well. Maybe it is a regression from the recent PCI BAR allocation that fixed the Gen 3 ASMedia switch? I think you have our boards and NVMe's but it should be easy to reproduce with the Pimoroni NVMe Base Duo as well |
As mentioned, it used to work before the update. Behind an additional pcie switch from pineboards (brick commander) the devices cannot even be recognized |
We're looking at this now.
Which change are you referring to? |
This one b154632 It was also done in the firmware recently as far as I remember. Thank you for taking a look - I can already see a small spike in the support tickets on our end after the recent kernel/firmware update. |
Thanks - just wanted to check we were looking at the same change. Yes, reverting that fixes things. |
FYI, @P33M. |
It'd be nice if Asmedia could make products that didn't have overlapping interop bugs. The Gen3 switch refuses to use low addresses, the Gen2 switch has some nonstandard default that makes downstream address decode extremely narrow. |
that's not all... The Gen 2 ones made before 2024 need ASPM disabled in order to function |
Attached is the latest development version of the bootloader, which has this problem fixed. You can flash this to a spare BLANK SD card with the RPi Imager app and update the bootloader, or wait a few days for this to reach rpi-update. It should show up as version 7749803e.
The code had restricted the memory space size to 0x4000000 rather than 0x40000000, and we've improved the bridge handling code generally. I'll try to make sure we catch regressions like this in future.
This bootloader also includes changes to the "net install" UI. It might appear on boot when it didn't previously; this is done on purpose to make the feature more visible. You can now press the space bar to interrupt boot and change the boot order via a new boot order menu. |
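To put the two constants in that comment side by side: they differ by a single hex digit, which is a factor of 16 in size. A small sketch, assuming both values are byte counts (the thread does not state the unit) and using a made-up helper name `to_mib`:

```shell
#!/bin/sh
# Hypothetical helper: convert a byte count (hex or decimal) to MiB
# using POSIX shell arithmetic.
to_mib() {
    echo $(( $1 / 1024 / 1024 ))
}

echo "0x4000000  = $(to_mib 0x4000000) MiB"    # the restricted size
echo "0x40000000 = $(to_mib 0x40000000) MiB"   # the intended size
```

Under that assumption the restricted window is 64 MiB instead of 1024 MiB (1 GiB).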
Really nice, thank you @peterharperuk ! |
…s (latest) * Fix PCIe BAR setup issue which prevented NVMe boot from working with some PCIe switches See: raspberrypi/firmware#1833 * Boot-menu improvements Remain in the forced boot mode until the menu is used to select a different boot-mode or reset to the original boot-order.
You should be able to get the fix from rpi-update now - the usual warnings about using pre-release software apply...
|
The fix is now in apt, so the following should update your bootloader to the 21 October 2024 release.
I am closing this issue as I think it has served its purpose. If you have further issues with this feature please raise a new issue. |
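One way to check whether an installed bootloader is at least as new as the release named in the closing comment is to compare the epoch value on the `CURRENT:` line, which appears in parentheses in the bootloader version output quoted earlier in the thread. This is only a sketch: the function names are invented for illustration, and the threshold `1729468800` (2024-10-21 00:00:00 UTC) is an assumed stand-in, since the exact release timestamp is not given in the thread.

```shell
#!/bin/sh
# Hypothetical helper: read bootloader version text on stdin and print the
# epoch from the first "CURRENT: <date> (<epoch>)" line.
bootloader_epoch() {
    sed -n 's/^[[:space:]]*CURRENT:.*(\([0-9][0-9]*\)).*$/\1/p' | head -n 1
}

# Assumption: midnight UTC on 21 October 2024, not the real build timestamp.
FIX_EPOCH=1729468800

# Succeeds (exit status 0) if the CURRENT epoch is at or after FIX_EPOCH.
has_switch_fix() {
    epoch=$(bootloader_epoch)
    [ -n "$epoch" ] && [ "$epoch" -ge "$FIX_EPOCH" ]
}

# Usage on a Pi 5 (not executed here):
#   rpi-eeprom-update | has_switch_fix && echo "bootloader is new enough"
```

The 9 October 2024 bootloader reported earlier in the thread (epoch 1728517007) falls below this threshold, consistent with it being the build that showed the regression.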
Describe the bug
I am unable to boot a Raspberry Pi 5 from an external NVMe SSD if used behind a PCIe switch (i.e. not as the root device on the external connection).
To reproduce
BOOT_ORDER=0xf25416
Expected behaviour
I would expect the NVMe SSD to be selected for boot, wherever it is enumerated on the PCIe bus.
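For readers unfamiliar with the `BOOT_ORDER` value in the reproduction steps: the bootloader reads it one hex digit at a time, starting from the rightmost digit. The sketch below decodes a value into that sequence. The function name `decode_boot_order` is made up, and the mode names are paraphrased from the Raspberry Pi bootloader documentation (hyphenated here for readability), so treat the labels as approximate.

```shell
#!/bin/sh
# Hypothetical helper: print the boot modes encoded in a BOOT_ORDER value,
# in the order they are tried (rightmost hex digit first).
decode_boot_order() {
    value=${1#0x}
    out=""
    while [ -n "$value" ]; do
        digit=${value#"${value%?}"}   # take the last hex digit
        value=${value%?}              # and drop it from the remainder
        case "$digit" in
            0) name="SD-CARD-DETECT" ;;
            1) name="SD-CARD" ;;
            2) name="NETWORK" ;;
            3) name="RPIBOOT" ;;
            4) name="USB-MSD" ;;
            5) name="BCM-USB-MSD" ;;
            6) name="NVME" ;;
            7) name="HTTP" ;;
            e|E) name="STOP" ;;
            f|F) name="RESTART" ;;
            *) name="UNKNOWN($digit)" ;;
        esac
        out="${out:+$out }$name"
    done
    echo "$out"
}

decode_boot_order 0xf25416
```

With `0xf25416`, NVMe is the first mode tried, followed by SD card, USB mass storage, BCM USB mass storage and network, after which the sequence restarts.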
Actual behaviour
The Raspberry Pi 5 bootloader attempts to load nvme but fails, likely due to it only enumerating devices directly attached to the external port, and not walking down the tree of any other connected PCIe bridges...
System
Copy and paste the results of the raspinfo command in to this section. Alternatively, copy and paste a pastebin link, or add answers to the following questions:
cat /etc/rpi-issue
vcgencmd version
uname -a
Logs
Click to expand full bootloader log (captured via UART)
Additional context
When behind the bridge, here is the hierarchy according to lspci:
I know for the Compute Module 4, the concern was a lack of space in the bootloader to successfully enumerate all PCIe devices, no matter where they are on the bus. Does the Pi 5's bootloader overcome that limitation? I don't expect this to work on launch day, but it is something I think a lot of people would like to do (e.g. stack an 'NVMe + 2.5G Ethernet' HAT, or 'NVMe + WiFi 7' HAT, etc. on top).