SDCC Notes
This will identify lots of the cases below very very easily
The Z80 has only IX and IY that make useful pointers for many operations, and HL for some others. SDCC uses IX for the frame pointer in most cases which means you've got about 1.5 effective pointer registers.
This means that it's a definite win to:
- gather multiple arrays indexed off the same thing (e.g. the minor number) into one struct. SDCC will then reference it off IY.
- order operations so that only one pointer is live at a time as seen in C. SDCC doesn't do a bad job of re-ordering, but the language rules tie its hands in some cases (e.g. across function calls). That is, do everything with foo, then do everything with bar; don't interleave too much.
- put a few non-re-entrant pieces of data at static addresses, and in some cases stash things in per-process areas (hence the udata trick UZI pulls).
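A minimal sketch of the first point, assuming a hypothetical driver with per-minor state. The names `dev_state`, `dev_touch` and `NUM_DEV` are invented for illustration and are not taken from the Fuzix tree. Keeping the fields in one struct means a single pointer, computed once, reaches all of them:

```c
#include <stdint.h>

#define NUM_DEV 4   /* hypothetical number of minor devices */

/* Instead of parallel arrays such as
 *     uint8_t dev_flags[NUM_DEV]; uint16_t dev_count[NUM_DEV];
 * keep everything indexed by the minor number in one struct. */
struct dev_state {
    uint8_t  flags;
    uint8_t  unit;
    uint16_t count;
};

static struct dev_state dev_state[NUM_DEV];

/* Every field access goes through one pointer computed once. */
uint16_t dev_touch(uint8_t minor)
{
    struct dev_state *d = &dev_state[minor];

    d->flags |= 1;
    d->unit = minor;
    d->count++;
    return d->count;
}
```

With parallel arrays each access needs its own base-plus-index calculation. Whether SDCC actually keeps `d` in IY depends on the surrounding code, so it is worth checking the generated assembly.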
SDCC has a very poor ABI for Z80 argument passing as it always uses the stack. Therefore simple short helper routines are normally a loss. This is a balancing act as gcc 6809 uses register arguments so helpers work well on that.
Pathological cases include things like
char *foo(int n) {
return "Hello"+n;
}
which generates a 16bit load, an add and a ret on some other compilers but a small essay on SDCC. SDCC is getting some experimental register based calling support. Testing suggests it's going to help on size in some situations. However it's not yet reliable enough to enable.
Arguments being passed down long chains may want to be put in the udata.
Combined with the register pressure caused by SDCC not making effective use of the alternate registers, stack-based argument passing can get quite ugly with longs.
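As a rough illustration of parking arguments in a static area rather than passing them down a call chain, the sketch below uses a hypothetical `xfer` block as a stand-in for a per-process area such as udata. All the names are assumptions made for the example, and the approach is not re-entrant:

```c
#include <stdint.h>

/* Hypothetical stand-in for a per-process area such as udata.
 * Not re-entrant: only one transfer may be in flight at a time. */
struct xfer_args {
    uint16_t block;
    uint16_t count;
    uint8_t  is_read;
};
static struct xfer_args xfer;

/* Innermost routine reads the request from the static block. */
static uint16_t do_transfer(void)
{
    return xfer.is_read ? xfer.count : 0;
}

/* Middle layer has no arguments to push onto the stack and pop off again. */
static uint16_t queue_transfer(void)
{
    return do_transfer();
}

/* Top of the chain fills in the block once. */
uint16_t read_blocks(uint16_t block, uint16_t count)
{
    xfer.block = block;
    xfer.count = count;
    xfer.is_read = 1;
    return queue_transfer();
}
```

The trade-off is the usual one for static state: the intermediate calls get cheaper, but nothing in the chain may be re-entered while the block is in use.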
SDCC generates very good int code for the most part, as good as can be expected long long code, and absolutely dire long code. In particular rather than using the alternate register pairs for 32bit (eg HL and HL') it only tries to use DEHL meaning any long using code gives the compiler register constipation.
The kernel code therefore tries to use uints when it can and to move things into uint as soon as possible. It's also not hot on some optimisations so replace
uint32_t n;
if (n & 0x1ff) { ...
with
uint32_t n;
if (((uint16_t)n) & 0x1ff) { ...
The Z80 is better at unsigned arithmetic, so the kernel tries to use unsigned types where it can. The 6502 is even worse at signed maths. Note that the next SDCC release plans to make plain char unsigned by default.
It's common in C to write "while (x < some-expression)" and call functions in the loop. In most cases this forces the compiler to recompute the expression each time around the loop, as it can't prove it has not changed. On modern processors who cares, but something like shifting a long right by 9 bits produces horrible results on SDCC. If the expression can be computed once, assign it to a variable and compare against that.
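A small sketch of hoisting the loop bound, assuming a hypothetical global `file_size` and a placeholder `process_block()`; neither name comes from the Fuzix source. It also assumes the block count fits in 16 bits, so the comparison inside the loop is done on a uint16_t:

```c
#include <stdint.h>

uint32_t file_size;             /* global: any called function might change it */
static uint16_t blocks_done;    /* hypothetical bookkeeping for the example */

/* Stand-in for whatever work the loop body calls out to. */
static void process_block(uint16_t blk)
{
    (void)blk;
    blocks_done++;
}

/* Slow form, for comparison:
 *     while (i < (file_size >> 9)) { process_block(i); i++; }
 * repeats the 32-bit shift on every pass. */
uint16_t process_file(void)
{
    /* Shift once; assumes the block count fits in 16 bits. */
    uint16_t nblocks = (uint16_t)(file_size >> 9);
    uint16_t i;

    for (i = 0; i < nblocks; i++)
        process_block(i);
    return nblocks;
}
```

This is only equivalent to the original loop if nothing called from the body changes `file_size`, which is exactly the fact the compiler cannot prove for itself.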
SDCC can't easily go round packing structures to nice sizes. The Z80 has no multiplier so try and keep power of two sizes, or easy to compute ones (eg 24 bytes). The lower the number of 1 bits in the multiplier the better.
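To make the size point concrete, here is a sketch with a hypothetical `struct entry` padded by hand to 8 bytes, plus the shift-and-add arithmetic that indexing needs for an 8-byte and a 24-byte element. The struct layout and function names are illustrative assumptions only:

```c
#include <stdint.h>

/* Hypothetical table entry padded out to 8 bytes (a power of two). */
struct entry {
    uint16_t key;
    uint16_t value;
    uint8_t  flags;
    uint8_t  type;
    uint8_t  pad[2];    /* explicit padding to reach 8 bytes */
};

/* Compile-time check that also works on pre-C11 compilers:
 * the array size becomes -1 (an error) if the struct size drifts. */
typedef char entry_size_check[(sizeof(struct entry) == 8) ? 1 : -1];

/* index * 8: one 1 bit in the multiplier, so shifts only. */
uint16_t offset8(uint16_t i)
{
    return (uint16_t)(i << 3);
}

/* index * 24: two 1 bits (16 + 8), so two shifted terms and one add. */
uint16_t offset24(uint16_t i)
{
    return (uint16_t)((i << 4) + (i << 3));
}
```

Each extra 1 bit in the element size costs another shifted term and another add, which is why a size like 24 is still tolerable and a size like 23 is not.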
SDCC often generates better code if you compute a static pointer to the end of an array and compare the pointer to the end, rather than keeping an additional counter. The same seems to work well with gcc 6809.
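A sketch of the end-pointer idiom, assuming a hypothetical `buf_busy` table; the names and the table size are made up for the example. The end address is a link-time constant, so the loop carries only the one moving pointer and no separate counter:

```c
#include <stdint.h>

#define NBUF 8                      /* hypothetical table size */

static uint8_t buf_busy[NBUF];

/* One past the last element: a constant address known at link time. */
#define buf_end (&buf_busy[NBUF])

uint8_t count_busy(void)
{
    uint8_t *p = buf_busy;
    uint8_t n = 0;

    /* Compare the pointer against the end instead of counting iterations. */
    while (p != buf_end) {
        if (*p)
            n++;
        p++;
    }
    return n;
}
```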
- Conditions on a variable declared as an I/O port sometimes access it twice. This is supposed to be fixed in 3.4, but the code keeps the workarounds, in part because it is all too easy to write code that forgets it is accessing an I/O port.
- Ignoring the return value of a long long function causes the compiler to explode. Fixed in SDCC 3.4. The utilities always use t = time(NULL) rather than time(&t) to avoid this.
- Signed mod of a long long reports a missing helper function. Fixed in the SDCC 3.4 tree, and you can pull the function out of it and use it in 3.4. Ideally we'd clean up the time code so it doesn't do an expensive 64-bit modulo.
- Assigning 1 << n to an I/O port causes the compiler to fail. Workaround: assign it to a temporary register, call a dummy function, then assign the dummy (or use out explicitly). Fixed in SDCC 3.5.
- SDCC __interrupt generates IRQ entry/exit logic that is only safe on the CMOS Z80 processor. On an NMOS box it may corrupt the IRQ status. FUZIX uses its own IRQ on/off/restore functionality, which avoids this.
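For the ignored long long return value in the list above, the time() idiom looks roughly like this. It is a sketch that assumes time_t is a 64-bit type on the target, as that item implies, and `current_time` is a hypothetical wrapper name:

```c
#include <time.h>

/* Always consume the return value of time(), rather than calling
 * time(&t) and discarding the result. */
time_t current_time(void)
{
    time_t t;

    t = time(NULL);
    return t;
}
```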
Just in case anyone fancies hacking SDCC/SDLDZ80 rather than Fuzix
- Now and then SDCC likes to generate crap of the form
LD HL, #foo
LD (HL), #0
LD HL, #foo+1
LD (HL), #0
There are enough of them in the Fuzix binary that it would be really nice to get that specific ugly fixed
- Finish banking support. Right now it works, but it's a real hack at the linker end of things. The compiler side seems fine, other than perhaps needing a better way to describe bank relations.
- Arguments by register in DE, BC (as Hitech does) (or DE'DE BC'BC) with returns in HL, HL'HL. SDCC 3.5.1 has some experimental fastcall in HL stuff but it's not yet stable enough.
- Use the alternate register set for the high half of 32-bit values.
- A common sequence eliminator at the link level: identify common blocks of code that are not jumped into and which have the stack balanced, and then replace them with call/ret sequences.
- Relocatable binary support (needs linker changes).
Fuzix: because small is beautiful