Memory Performance - Info Report #25

Open
TheBajaGuy opened this issue Jun 3, 2024 · 1 comment

@TheBajaGuy

I wanted to report a few things now that my N2630 (128M, v4.0.1a, PLCC FPU) arrived today. Memory is built/set for 2 banks of 64M. This is not a bug per se, but maybe just a quirk of some memory access logic necessary to the board's design. Things have been stable so far.

  1. With some memory-priority tweaking (see IDE PIO Mode Support #2), I clocked the IDE using a 4GB Sandisk Ultra card with the related jumper settings. RSCP v1.15 showed a read speed of 9481K/sec with 0% CPU availability (8904 vs 2 stones); that percentage is expected because IDE transfers are CPU-driven. SysInfo pulled 9,039,448.
    1a) Prior to tweaking, I pulled 8192K/sec RSCP reads on the same setup with no FastROM, standard cache settings, and only the base MuLibs 68030.library settings. The OS was booting mainly into the higher-priority Z3 memory (essentially the default for a from-scratch OS 3.2.2.1 install).

  2. I noticed an interesting memory performance quirk: Fast RAM in the Z2 address range performs slightly faster than Z3 memory. See the BusSpeedTest results below.


For readability, I rearranged the tool's output into Chip/Z2/Z3 order for each of the 3 runs (use the addr column to identify the region). I also added MuLibs' MuScan output to the last two runs as a reference for the FastROM/MMU settings.

Base System: Fresh OS 3.2.2.1 Workbench, OS 3.2.2 Kickstart (47.111), basic MuLibs, No FastROM, Booted up to Workbench. USA NTSC machine but a PAL 2MB MegAChip, no other cards or add-ons. Results sent >SER: and collected on my web browser over my serial-WiFi adapter. There are 2x ~2GB partitions, and a DF0:

Memory Priority (native/default):
Z3 Fast = +20
Z2 Fast = 0
Chip = -10

BusSpeedTest 0.19 (mlelstv) Buffer: 262144 Bytes, Alignment: 32768

memtype addr op cycle calib bandwidth

chip $00038000 readw 1053.1 ns normal 1.9 * 10^6 byte/s
chip $00038000 readl 1649.9 ns normal 2.4 * 10^6 byte/s
chip $00038000 readm 1228.1 ns normal 3.3 * 10^6 byte/s
chip $00038000 writew 587.5 ns normal 3.4 * 10^6 byte/s
chip $00038000 writel 1177.5 ns normal 3.4 * 10^6 byte/s
chip $00038000 writem 1175.0 ns normal 3.4 * 10^6 byte/s
user $00300000 readw 149.6 ns normal 13.4 * 10^6 byte/s
user $00300000 readl 211.0 ns normal 19.0 * 10^6 byte/s
user $00300000 readm 202.0 ns normal 19.8 * 10^6 byte/s
user $00300000 writew 107.9 ns normal 18.5 * 10^6 byte/s
user $00300000 writel 108.6 ns normal 36.8 * 10^6 byte/s
user $00300000 writem 98.4 ns normal 40.6 * 10^6 byte/s
fast $401C0000 readw 159.5 ns normal 12.5 * 10^6 byte/s
fast $401C0000 readl 231.2 ns normal 17.3 * 10^6 byte/s
fast $401C0000 readm 221.0 ns normal 18.1 * 10^6 byte/s
fast $401C0000 writew 148.8 ns normal 13.4 * 10^6 byte/s
fast $401C0000 writel 150.8 ns normal 26.5 * 10^6 byte/s
fast $401C0000 writem 132.1 ns normal 30.3 * 10^6 byte/s
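
As a rough cross-check of the table above, the bandwidth column appears to be the transfer size divided by the cycle time. The sketch below assumes 2 bytes for word operations and 4 bytes for long operations, and that the readm/writem cycles are reported per longword; those sizes are inferred from the figures, not taken from the BusSpeedTest documentation.

```python
# Assumed bytes moved per reported cycle (inferred from the table, not documented).
BYTES_PER_OP = {
    "readw": 2, "readl": 4, "readm": 4,
    "writew": 2, "writel": 4, "writem": 4,
}

def bandwidth_mb_s(op, cycle_ns):
    """Bandwidth in 10^6 byte/s for one operation taking cycle_ns nanoseconds."""
    return BYTES_PER_OP[op] / (cycle_ns * 1e-9) / 1e6

# Z2 rows ("user" at $00300000) from the baseline run.
z2_cycles_ns = {
    "readw": 149.6, "readl": 211.0, "readm": 202.0,
    "writew": 107.9, "writel": 108.6, "writem": 98.4,
}

for op, ns in z2_cycles_ns.items():
    print(f"{op:7s} {ns:6.1f} ns -> {bandwidth_mb_s(op, ns):5.2f} * 10^6 byte/s")
```

The results agree with the reported bandwidth column to within rounding.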


MuLibs full Install with FastROM added during startup. No other install options as all memory is AutoConfig.

Memory Priority:
Z3 Fast = +20
Z2 Fast = 0
Chip = -10

BusSpeedTest 0.19 (mlelstv) Buffer: 262144 Bytes, Alignment: 32768

memtype addr op cycle calib bandwidth
chip $00018000 readw 1024.1 ns normal 2.0 * 10^6 byte/s
chip $00018000 readl 1606.0 ns normal 2.5 * 10^6 byte/s
chip $00018000 readm 1199.6 ns normal 3.3 * 10^6 byte/s
chip $00018000 writew 574.3 ns normal 3.5 * 10^6 byte/s
chip $00018000 writel 1148.9 ns normal 3.5 * 10^6 byte/s
chip $00018000 writem 1148.5 ns normal 3.5 * 10^6 byte/s
user $00300000 readw 145.4 ns normal 13.8 * 10^6 byte/s
user $00300000 readl 207.8 ns normal 19.3 * 10^6 byte/s
user $00300000 readm 196.6 ns normal 20.4 * 10^6 byte/s
user $00300000 writew 104.4 ns normal 19.2 * 10^6 byte/s
user $00300000 writel 107.6 ns normal 37.2 * 10^6 byte/s
user $00300000 writem 96.8 ns normal 41.3 * 10^6 byte/s
fast $40178000 readw 154.5 ns normal 12.9 * 10^6 byte/s
fast $40178000 readl 227.5 ns normal 17.6 * 10^6 byte/s
fast $40178000 readm 214.6 ns normal 18.6 * 10^6 byte/s
fast $40178000 writew 145.3 ns normal 13.8 * 10^6 byte/s
fast $40178000 writel 146.8 ns normal 27.3 * 10^6 byte/s
fast $40178000 writem 132.0 ns normal 30.3 * 10^6 byte/s

MuScan 46.1 (02.07.2016) © THOR
68030 MMU detected.
MMU page size is 0x400 bytes.

Memory map:
0x00000000 - 0x001FFFFF CacheInhibit Imprecise NonSerial
0x00200000 - 0x009FFFFF CopyBack
0x00A00000 - 0x00BBFFFF Blank
0x00BC0000 - 0x00BFFFFF CacheInhibit I/O space
0x00C00000 - 0x00D7FFFF Blank
0x00D80000 - 0x00DFFFFF CacheInhibit I/O space
0x00E00000 - 0x00E9FFFF Blank
0x00EA0000 - 0x00EBFFFF CacheInhibit I/O space
0x00EC0000 - 0x00EFFFFF Blank
0x00F00000 - 0x00F7FFFF CacheInhibit
0x00F80000 - 0x00FFFFFF ROM CopyBack Remapped to 0x47F78000
0x01000000 - 0x3FFFFFFF Blank
0x40000000 - 0x47F77FFF CopyBack
0x47F78000 - 0x47FF7FFF ROM CopyBack
0x47FF8000 - 0x47FFFFFF CopyBack
0x48000000 - 0xFFFFFFFF Blank
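
The FastROM remap in the map above can be sanity-checked with a little address arithmetic. This is only a sketch over the ranges MuScan printed; the constant names are mine.

```python
# Ranges copied from the MuScan memory map above.
ROM_START, ROM_END = 0x00F80000, 0x00FFFFFF        # Kickstart ROM window
REMAP_START, REMAP_END = 0x47F78000, 0x47FF7FFF    # "ROM CopyBack" region in Z3 RAM
PAGE_SIZE = 0x400                                  # MMU page size reported by MuScan

rom_size = ROM_END - ROM_START + 1
remap_size = REMAP_END - REMAP_START + 1

print(f"ROM size:   {rom_size:#x} ({rom_size // 1024} KiB)")
print(f"Remap size: {remap_size:#x} ({remap_size // 1024} KiB)")
print("Sizes match:", rom_size == remap_size)
print("Remap target page-aligned:", REMAP_START % PAGE_SIZE == 0)
print("MMU pages used:", rom_size // PAGE_SIZE)
```

Both ranges come out at 512 KiB, so the ROM image is mapped near the top of the Z3 region in this run.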


Adjusted Memory Priority:
Z2 Fast = +21 (pre-SetPatch adjustment in Startup-Sequence)
Z3 Fast = +20
Chip = -10
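
To illustrate why a one-point bump is enough, here is a simplified model that assumes Exec tries memory regions in descending priority order when any region would satisfy the request. The function and list names are hypothetical, and real allocations also depend on the requested memory type and on free space.

```python
# Simplified model (assumption): the region with the highest priority is tried first.
def first_choice(regions):
    """Return the name of the highest-priority region from (name, priority) pairs."""
    return max(regions, key=lambda region: region[1])[0]

default_priorities = [("Z3 Fast", 20), ("Z2 Fast", 0), ("Chip", -10)]
adjusted_priorities = [("Z3 Fast", 20), ("Z2 Fast", 21), ("Chip", -10)]

print("Default: ", first_choice(default_priorities))
print("Adjusted:", first_choice(adjusted_priorities))
```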

BusSpeedTest 0.19 (mlelstv) Buffer: 262144 Bytes, Alignment: 32768

memtype addr op cycle calib bandwidth
chip $00018000 readw 1024.7 ns normal 2.0 * 10^6 byte/s
chip $00018000 readl 1606.9 ns normal 2.5 * 10^6 byte/s
chip $00018000 readm 1200.6 ns normal 3.3 * 10^6 byte/s
chip $00018000 writew 575.1 ns normal 3.5 * 10^6 byte/s
chip $00018000 writel 1150.7 ns normal 3.5 * 10^6 byte/s
chip $00018000 writem 1149.5 ns normal 3.5 * 10^6 byte/s
fast $002C0000 readw 144.8 ns normal 13.8 * 10^6 byte/s
fast $002C0000 readl 207.7 ns normal 19.3 * 10^6 byte/s
fast $002C0000 readm 195.6 ns normal 20.4 * 10^6 byte/s
fast $002C0000 writew 104.8 ns normal 19.1 * 10^6 byte/s
fast $002C0000 writel 107.6 ns normal 37.2 * 10^6 byte/s
fast $002C0000 writem 96.1 ns normal 41.6 * 10^6 byte/s
user $04100000 readw 145.7 ns normal 13.7 * 10^6 byte/s
user $04100000 readl 208.7 ns normal 19.2 * 10^6 byte/s
user $04100000 readm 197.2 ns normal 20.3 * 10^6 byte/s
user $04100000 writew 105.0 ns normal 19.0 * 10^6 byte/s
user $04100000 writel 107.3 ns normal 37.3 * 10^6 byte/s
user $04100000 writem 97.9 ns normal 40.9 * 10^6 byte/s

MuScan 46.1 (02.07.2016) © THOR
68030 MMU detected.
MMU page size is 0x400 bytes.
Memory map:
0x00000000 - 0x001FFFFF CacheInhibit Imprecise NonSerial
0x00200000 - 0x00977FFF CopyBack
0x00978000 - 0x009F7FFF ROM CopyBack
0x009F8000 - 0x009FFFFF CopyBack
0x00A00000 - 0x00BBFFFF Blank
0x00BC0000 - 0x00BFFFFF CacheInhibit I/O space
0x00C00000 - 0x00D7FFFF Blank
0x00D80000 - 0x00DFFFFF CacheInhibit I/O space
0x00E00000 - 0x00E9FFFF Blank
0x00EA0000 - 0x00EBFFFF CacheInhibit I/O space
0x00EC0000 - 0x00EFFFFF Blank
0x00F00000 - 0x00F7FFFF CacheInhibit
0x00F80000 - 0x00FFFFFF ROM CopyBack Remapped to 0x00978000
0x01000000 - 0x3FFFFFFF Blank
0x40000000 - 0x47FFFFFF CopyBack
0x48000000 - 0xFFFFFFFF Blank
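
To put a number on the Z2/Z3 gap, the sketch below compares the cycle times from the FastROM run (Z2 at $00300000, Z3 at $40178000). The values are copied from that run; the helper name is mine.

```python
# Cycle times in ns from the FastROM run (default memory priorities).
Z2_NS = {"readw": 145.4, "readl": 207.8, "readm": 196.6,
         "writew": 104.4, "writel": 107.6, "writem": 96.8}
Z3_NS = {"readw": 154.5, "readl": 227.5, "readm": 214.6,
         "writew": 145.3, "writel": 146.8, "writem": 132.0}

def extra_time_pct(z2_ns, z3_ns):
    """Percent by which the Z3 cycle is longer than the Z2 cycle."""
    return (z3_ns / z2_ns - 1.0) * 100.0

for op in Z2_NS:
    pct = extra_time_pct(Z2_NS[op], Z3_NS[op])
    print(f"{op:7s} Z3 cycle is {pct:4.1f}% longer than Z2")
```

By this measure the read cycles differ by under 10%, while the write cycles differ by more than a third.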

I am simply curious whether this is an expected performance difference between the Z2 and Z3 address ranges, or whether something might have been missed. I realize this slight difference is nearly splitting hairs. Yet this is the Amiga community, and if I didn't spot it, someone else eventually would. Overall: nice work on the card.

@TheBajaGuy

I located the cause of the memory quirk I described.

MuLibs does not, by default, enable the Burst setting on the Data Cache. Once enabled, better Read speeds are obtained:

System: 68030 68882 68030-MMU FastROM (INST: Cache Burst) (DATA: Cache Burst)
Z3 Memory
BusSpeedTest 0.19 (mlelstv) Buffer: 262144 Bytes, Alignment: 32768

memtype addr op cycle calib bandwidth
fast $401A8000 readw 111.0 ns normal 18.0 * 10^6 byte/s
fast $401A8000 readl 141.1 ns normal 28.3 * 10^6 byte/s
fast $401A8000 readm 171.6 ns normal 23.3 * 10^6 byte/s

I did not detect any change in write performance, so I assume no burst write cycles are being performed.
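
For a rough sense of the gain, the sketch below compares the Z3 read cycles with Data Burst enabled against the Z3 rows from the earlier FastROM run. That pairing is an assumption: the buffers sit at slightly different addresses ($40178000 vs $401A8000), and the runs were not taken back to back.

```python
# Z3 read cycle times in ns: earlier FastROM run vs. the Data Burst run.
BEFORE_NS = {"readw": 154.5, "readl": 227.5, "readm": 214.6}
AFTER_NS = {"readw": 111.0, "readl": 141.1, "readm": 171.6}

def speedup(before_ns, after_ns):
    """Ratio of old to new cycle time; values above 1.0 mean faster reads."""
    return before_ns / after_ns

for op in BEFORE_NS:
    print(f"{op:6s} {BEFORE_NS[op]:6.1f} ns -> {AFTER_NS[op]:6.1f} ns "
          f"({speedup(BEFORE_NS[op], AFTER_NS[op]):.2f}x)")
```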

An interesting side effect of the above solution was disk performance, which dropped to 8605K/sec with Data Burst enabled.
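
The three RSCP read figures quoted in this issue can be put side by side as follows. This is a sketch only: I did not record whether the other settings were identical for the 8605K/sec run, so the comparison against both earlier figures is shown, and the labels are mine.

```python
# RSCP read rates in K/sec as quoted in this issue.
RATES = {
    "baseline": 8192,     # before memory-priority tweaking
    "tweaked": 9481,      # after memory-priority tweaking
    "data_burst": 8605,   # with Data Burst enabled
}

def pct_change(old, new):
    """Percent change going from old to new."""
    return (new - old) / old * 100.0

print(f"tweaked vs baseline:    {pct_change(RATES['baseline'], RATES['tweaked']):+.1f}%")
print(f"data_burst vs tweaked:  {pct_change(RATES['tweaked'], RATES['data_burst']):+.1f}%")
print(f"data_burst vs baseline: {pct_change(RATES['baseline'], RATES['data_burst']):+.1f}%")
```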

In any event, a note in the docs should mention adding:

CPU >nil: DataBurst

somewhere after the SetPatch command in S:Startup-Sequence, or somewhere in S:User-Startup.
