Performance
There are several factors that affect performance:
- Display resolution
  - 1920x1080 = 2025k pixels, 800x600 = 468k pixels
- Pixel format
  - XRGB8888 has a framebuffer twice the size of RGB565
- USB speed
  - 3.0 = 5Gbps, 2.0 = 480Mbps, 1.1 = 12Mbps
- Compression
  - lz4 works very well for desktop application use, less well for showing fullscreen movies.
- CPU speed
  - Mainly affects decompression
- RAM speed
  - The received buffer is memcpy’ed into the framebuffer
- Buffer size
  - If the work/decompression buffer on the host or device is smaller than the framebuffer, the transfer is split up, possibly slowing it down.
- Partial update
  - If the graphics application is smart, it will tell the kernel which part of the framebuffer has been updated, often resulting in a (much) smaller USB transfer.
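
To make the resolution and pixel format factors concrete, here is a minimal sketch that computes pixel counts and uncompressed framebuffer sizes. The function names are hypothetical and only used for illustration; the bytes-per-pixel values follow from the format names (16 bits for RGB565, 32 bits for XRGB8888).

```python
# Bytes per pixel for the two formats mentioned in the list above.
BYTES_PER_PIXEL = {"RGB565": 2, "XRGB8888": 4}


def framebuffer_bytes(width, height, pixel_format):
    """Size in bytes of one full, uncompressed frame (or update rectangle)."""
    return width * height * BYTES_PER_PIXEL[pixel_format]


def transfer_bytes(width, height, pixel_format, ratio=1.0):
    """Approximate bytes on the wire for a rectangle at a given compression ratio."""
    return framebuffer_bytes(width, height, pixel_format) / ratio


if __name__ == "__main__":
    for w, h in [(1920, 1080), (800, 600)]:
        for fmt in BYTES_PER_PIXEL:
            size = framebuffer_bytes(w, h, fmt)
            print(f"{w}x{h} {fmt}: {w * h // 1024}k pixels, {size / 2**20:.2f} MiB")
```

The same helper applies to partial updates: pass the size of the damaged rectangle instead of the full resolution to see how much smaller the transfer becomes.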

Board | USB | CPU speed | RAM speed |
---|---|---|---|
Rock Pi 4 | 3.0 | 1.8/1.4 GHz | LPDDR4-3200 |
Raspberry Pi 4 | 2.0 | 1.5 GHz | LPDDR4-3200 |
Raspberry Pi Zero | 2.0 | 1.0 GHz | LPDDR2-450 |

Average compression ratio is available in debugfs (this is after showing Big Buck Bunny in HD):

    pi@pi4:~ $ sudo cat /sys/kernel/debug/dri/0/stats
    Max buffer size: 8.00 MiB
    Number of errors: 0
    Compression: lz4
    Compression ratio: 2.5

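
If you want to log these numbers over time, the stats text is easy to parse. The sketch below assumes the file contains one `Key: value` field per line, as in the output above; `parse_stats` is a hypothetical helper, not part of the driver or its tools. It works on a sample string so it can be run without root access.

```python
def parse_stats(text):
    """Parse 'Key: value' lines into a dict, converting plain numbers."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # skip lines without a 'Key: value' shape
        key, value = key.strip(), value.strip()
        try:
            stats[key] = float(value) if "." in value else int(value)
        except ValueError:
            stats[key] = value  # keep non-numeric values such as 'lz4' as text
    return stats


# Sample taken from the output shown above.
SAMPLE = """\
Max buffer size: 8.00 MiB
Number of errors: 0
Compression: lz4
Compression ratio: 2.5
"""

if __name__ == "__main__":
    stats = parse_stats(SAMPLE)
    print(stats["Compression"], stats["Compression ratio"])
```

On a real host you would read `/sys/kernel/debug/dri/0/stats` as root and pass its contents to the function.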
There are 2 scripts that can be used to measure performance, one drives the device directly using libusb, the other goes through the host driver.
- tests/perf-kms.py (requires the async_flush parameter added in Linux v5.15)
- No compression
- x0: Fully random image that fails to compress into the max buffer size and therefore falls back to no compression, so it takes the hit of first attempting the compression.
- x1: Random image with enough zeroes to compress into the same size as an uncompressed image.
- x2,3,4,8,16: Fill image with zeroes until it compresses to the desired ratio.
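
The idea behind the x1 and x2..x16 test images can be sketched as follows. This is not the code from the test scripts: it uses `zlib` from the Python standard library as a stand-in for lz4, and a binary search over the amount of random data instead of filling in zeroes step by step. `make_image` and `compression_ratio` are hypothetical names.

```python
import random
import zlib


def compression_ratio(buf):
    """Uncompressed size divided by zlib-compressed size (stand-in for lz4)."""
    return len(buf) / len(zlib.compress(buf))


def make_image(size, ratio, seed=0):
    """Return random bytes followed by zeroes that compress to at least `ratio`."""
    rng = random.Random(seed)
    noise = bytes(rng.getrandbits(8) for _ in range(size))
    target = size / ratio  # largest compressed size we accept
    lo, hi = 0, size       # lo: longest noise prefix known to fit the target
    while lo < hi:
        mid = (lo + hi + 1) // 2
        candidate = noise[:mid] + bytes(size - mid)
        if len(zlib.compress(candidate)) <= target:
            lo = mid       # still compresses well enough, try more noise
        else:
            hi = mid - 1   # too much noise, back off
    return noise[:lo] + bytes(size - lo)


if __name__ == "__main__":
    for ratio in (1, 2, 4, 8, 16):
        image = make_image(64 * 1024, ratio)
        print(f"x{ratio}: achieved ratio {compression_ratio(image):.2f}")
```

The returned buffer always compresses to the target size or smaller, so the achieved ratio is at least the requested one. With real lz4 the exact amount of random data needed would differ.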

Resolution | Board | No* | x2 | x4 |
---|---|---|---|---|
1920x1080 | Rock Pi 4 | 19 fps | 25 fps | 41 fps |
1920x1080 | Raspberry Pi 4 | 7 fps | 11 fps | 22 fps |
1920x1080 | Raspberry Pi Zero | 6 fps | 9 fps | 12 fps |
1024x768 | Rock Pi 4 | 49 fps | 61 fps | 60 fps |
1024x768 | Raspberry Pi 4 | 18 fps | 29 fps | 63 fps |
1024x768 | Raspberry Pi Zero | 16 fps | 24 fps | 34 fps |
800x600 | Rock Pi 4 | 60 fps | 61 fps | 61 fps |
800x600 | Raspberry Pi 4 | 29 fps | 48 fps | 99 fps |
800x600 | Raspberry Pi Zero | 27 fps | 39 fps | 55 fps |
640x480 | Rock Pi 4 | n/a | n/a | n/a |
640x480 | Raspberry Pi 4 | 50 fps | 79 fps | 148 fps |
640x480 | Raspberry Pi Zero | 45 fps | 63 fps | 85 fps |
320x240 | Raspberry Pi Pico | 6 fps | 9 fps | 15 fps |
240x135 | Raspberry Pi Pico | 14 fps | 23 fps | 37 fps |

(No*: no compression)
If the SPI display ends up as a DRM minor other than zero, override which one GUD uses in /boot/cmdline.txt: drm_dev=1

Resolution | Board | SPI speed | No* | x2 | x4 | Max* |
---|---|---|---|---|---|---|
320x240 | Raspberry Pi 4 | 62.5 MHz | 43 fps | 43 fps | 43 fps | 50 fps |
320x240 | Raspberry Pi Zero | 66.6 MHz | 25 fps | 24 fps | 24 fps | 54 fps |
320x480 | Raspberry Pi 4 | 62.5 MHz | 20 fps | 21 fps | 20 fps | 25 fps |
320x480 | Raspberry Pi Zero | 66.6 MHz | 12 fps | 12 fps | 12 fps | 27 fps |

(No*: no compression)
(Max*: theoretical maximum if we could continuously push only the pixel data from a static buffer and SPI was the only limiting factor)
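
The Max* column can be reproduced with a one-line formula: SPI clock divided by the number of bits in one frame. The sketch below assumes 16 bits per pixel (RGB565) and one bit per SPI clock cycle with no command or chip-select overhead, which matches the table's values when truncated to whole frames. `spi_max_fps` is a hypothetical helper name.

```python
def spi_max_fps(width, height, spi_hz, bits_per_pixel=16):
    """Upper bound on frame rate if the SPI bus only ever carried pixel data."""
    bits_per_frame = width * height * bits_per_pixel
    return spi_hz / bits_per_frame


if __name__ == "__main__":
    cases = [
        (320, 240, 62.5e6),  # Raspberry Pi 4
        (320, 240, 66.6e6),  # Raspberry Pi Zero
        (320, 480, 62.5e6),  # Raspberry Pi 4
        (320, 480, 66.6e6),  # Raspberry Pi Zero
    ]
    for width, height, hz in cases:
        fps = spi_max_fps(width, height, hz)
        print(f"{width}x{height} @ {hz / 1e6:.1f} MHz: {fps:.1f} fps")
```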
Why does the Zero only get half the speed? This doesn’t make sense: almost all of the time should be taken by the SPI bus transfer.
Running modetest on the device itself, and thus driving the display directly, shows that GUD is not to blame:

Resolution | Board | freq |
---|---|---|
320x240 | Raspberry Pi 4 | 42 Hz |
320x240 | Raspberry Pi Zero | 25 Hz |
320x480 | Raspberry Pi 4 | 21 Hz |
320x480 | Raspberry Pi Zero | 13 Hz |

I have tried to track down what’s going on here, but gave up (details). I haven’t got an SPI analyzer, so I can’t see what actually happens on the bus.
It turns out that the problem is the VPU clock changing: See https://github.com/raspberrypi/linux/issues/3381
- USB 3.0: 500 MB/s (TODO: find formula)
- USB 2.0: 13 packets of 512 bytes per microframe (1/8ms): 13*512*8*1000/1024/1024 = 50 MB/s
- USB 1.1: 19 packets of 64 bytes per frame (1ms): 19*64*1000/1024/1024 = 1.2 MB/s
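
The arithmetic above can be checked with a short script. The USB 2.0 and 1.1 numbers use the packet counts from the list. For USB 3.0 the sketch assumes the 500 MB/s figure comes from the 5 Gbps line rate with 8b/10b encoding (8 payload bits per 10 line bits); that is one plausible origin of the number, not something this page states, and it is in decimal MB while the other two are divided by 1024*1024.

```python
MIB = 1024 * 1024


def bulk_mib_per_s(packets_per_frame, packet_bytes, frames_per_s):
    """Bulk payload per second in MiB for a given packet budget per (micro)frame."""
    return packets_per_frame * packet_bytes * frames_per_s / MIB


# USB 2.0 high speed: 13 packets of 512 bytes per 125 us microframe.
usb2 = bulk_mib_per_s(13, 512, 8 * 1000)

# USB 1.1 full speed: 19 packets of 64 bytes per 1 ms frame.
usb1 = bulk_mib_per_s(19, 64, 1000)

# USB 3.0 (assumption): 5 Gbps line rate, 8b/10b encoding, 8 bits per byte.
usb3_mb = 5_000_000_000 * 8 // 10 // 8 / 1e6

if __name__ == "__main__":
    print(f"USB 3.0: {usb3_mb:.0f} MB/s")
    print(f"USB 2.0: {usb2:.2f} MiB/s")
    print(f"USB 1.1: {usb1:.2f} MiB/s")
```

These are bus-level upper bounds; the measured throughput on real boards is lower.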
- testusb: Userspace tool to control usbtest
- usbtest: Kernel module that runs the tests
- f_sourcesink: A sink USB function to pour USB bulk OUT requests into
Device:

    ## Stop the display gadget
    # /etc/init.d/S70gud stop
    ## Start the source/sink USB gadget function configured like the g_zero legacy gadget (but only source/sink)
    # g_zero start

Host:

    # Match g_zero setup: write a 4MB buffer (~= 1920*1080*2), queue up 1 request, do it 73 times
    $ sudo ~/testusb -a -t 27 -s 4194304 -g 1 -c 73
    unknown speed /dev/bus/usb/001/026 0
    /dev/bus/usb/001/040 test 27, 10.054837 secs
    $ dmesg
    [357713.208084] usbtest 1-1.4:1.0: TEST 27: bulk write 292Mbytes
    # raw USB throughput: 292/10.05 = 29.0MB/s, 73/10.05 = 7.2 fps

Board | USB | MB/s | fps* |
---|---|---|---|
Rock Pi 4 | 3.0 | 74.6 | 18.6 |
Raspberry Pi 4 | 2.0 | 29.0 | 7.2 |
Raspberry Pi Zero | 2.0 | 20.9 | 5.2 |
Raspberry Pi Pico | 1.1 | n/a | |

(fps*: 1920x1080-RGB565 no compression framerate)
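
The MB/s and fps columns are derived from the testusb elapsed time and the byte count reported in dmesg. A minimal sketch of that calculation, using the two Raspberry Pi 4 output lines from the Host example; the regular expressions are assumptions that match only the line shapes shown there, and `parse_result` is a hypothetical helper.

```python
import re


def parse_result(testusb_line, dmesg_line):
    """Extract (megabytes written, elapsed seconds) from the two output lines."""
    secs = float(re.search(r"([\d.]+) secs", testusb_line).group(1))
    mbytes = int(re.search(r"(\d+)Mbytes", dmesg_line).group(1))
    return mbytes, secs


TESTUSB_LINE = "/dev/bus/usb/001/040 test 27, 10.054837 secs"
DMESG_LINE = "[357713.208084] usbtest 1-1.4:1.0: TEST 27: bulk write 292Mbytes"
ITERATIONS = 73  # the -c argument: one 4 MB buffer per iteration

mbytes, secs = parse_result(TESTUSB_LINE, DMESG_LINE)
mb_per_s = mbytes / secs
fps = ITERATIONS / secs  # each buffer is roughly one 1920x1080 RGB565 frame

if __name__ == "__main__":
    print(f"{mb_per_s:.2f} MB/s, {fps:.2f} fps")
```

Note that the table values appear to be truncated rather than rounded, so a result slightly above 7.25 fps is listed as 7.2.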