Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rsxInit() change brings issue with SDL libraries #106

Closed
bucanero opened this issue Jul 10, 2020 · 62 comments
Closed

rsxInit() change brings issue with SDL libraries #106

bucanero opened this issue Jul 10, 2020 · 62 comments

Comments

@bucanero
Copy link

Just noted that recent changes in rsxInit() from d0eea6e , brings some issues when building the SDL_PSL1GHT library from ps3libraries (013-sdl_psl1ght.sh)

e.g., you'll get a too few arguments to function 'rsxInit' error in SDL_PSL1GHTvideo.c

maybe other libs or samples are affected.

@zeldin
Copy link
Member

zeldin commented Jul 10, 2020

Ugh. That commit mixes a lot of unrelated changes which should have been separate commits...

@shagkur Changing the prototype of the API function rsxInit seems a bit dubious since you'll break all existing code using said API. How about making a new API function rsxInitExt with the extra parameters instead and leaving the old one as it is for compat?

@miigotu
Copy link
Member

miigotu commented Jul 10, 2020

Too bad it isn't cpp, overloading would be perfect.

@shagkur
Copy link
Member

shagkur commented Jul 10, 2020 via email

@shagkur
Copy link
Member

shagkur commented Jul 10, 2020 via email

@zeldin
Copy link
Member

zeldin commented Jul 10, 2020

Well, if you call the function with the new prototype rsxInitExt, you can simply make a compatibility wrapper

gcmContextData* rsxInit(const u32 cmdSize,const u32 ioSize,const void *ioAddress)
{
    gcmContextData *context = NULL;
    rsxInitExt(&context, cmdSize, ioSize, ioAddress);
    return context;
}

and then you don't have to change any libraries (or applications) at all...

@shagkur
Copy link
Member

shagkur commented Jul 11, 2020

I get your point. As well as i know how it could be solved. But this one is a bit more complicated since i also changed defines by their names. The question here is: What do we maintain? The ps3dev, hence ps3toolchain, psl1ght and the ps3librariies? If so, those are our only responsibilities. And then we still should fix the libraries.
Other applications, using librsx directly, not maintained by us we shouldn't bother to stay API backward compatible.
And like i wrote before, i broke those lib, i'll fix them.

@ghost
Copy link

ghost commented Jul 12, 2020

Thanks @shagkur for fix, @zeldin for merge and @bucanero for reporting. Issue seems to be fixed, I did a successful build of the ps3toolkit without this issue showing up.

@bucanero
Copy link
Author

thanks @shagkur 👍
I'm closing this issue; fix merged in zeldin/SDL_PSL1GHT#12

@zeldin
Copy link
Member

zeldin commented Jul 17, 2020

@shagkur Could you please fix SDL2_PSL1GHT as well? I'd like to merge the develop branch of ps3libraries into master, but right now it does not build due to this breakage...

@shagkur
Copy link
Member

shagkur commented Jul 17, 2020

@zeldin whats now broken? i already made a PR for rsxInit change and it already got merged!?

@zeldin
Copy link
Member

zeldin commented Jul 17, 2020

SDL2_PSL1GHT is. Note the "2". (Only the develop branch of ps3libraries builds SDL2_PSL1GHT.)

@shagkur
Copy link
Member

shagkur commented Jul 17, 2020

Ahh got it. Where is the branch? git url?

@zeldin
Copy link
Member

zeldin commented Jul 17, 2020

@shagkur
Copy link
Member

shagkur commented Jul 17, 2020

cant fork it (seems to be the same as yours, somehow). github says i've already forked it (the ones from you, for which i already fixed it). Perhaps you can apply this one-liner yourself? the diff from my PR should be still around.

@zeldin
Copy link
Member

zeldin commented Jul 17, 2020

It looks like he forked the SDL_PSL1GHT repo and then updated it to be SDL2. Just add https://github.com/sergiou87/SDL2_PSL1GHT/ as a remote and you should be able to pull the master branch from there. After you fix the issue, push the fix to a branch in your own fork and then create a PR against sergiou87/SDL2_PSL1GHT.

@shagkur
Copy link
Member

shagkur commented Jul 18, 2020

Does not work. If i pull/merge the upstream into mine i get a ton of merge conflicts. Not sure how i should solve this.

@zeldin
Copy link
Member

zeldin commented Jul 18, 2020

Dont pull/merge. Create a new branch based on sergio87/master.

git remote add sergiou87 https://github.com/sergiou87/SDL2_PSL1GHT
git fetch sergiou87
git checkout -b myfixes sergiou87/master

@zeldin
Copy link
Member

zeldin commented Jul 18, 2020

Then, when you have committed the fix to myfixes and pushed that branch to your own fork, you go to https://github.com/sergiou87/SDL2_PSL1GHT/pulls and press "New pull request", and choose myfixes as the branch to merge into sergiou87/master.

@shagkur
Copy link
Member

shagkur commented Jul 18, 2020

Done (i guess)

@zeldin
Copy link
Member

zeldin commented Jul 18, 2020

@shagkur Thanks!

@shagkur
Copy link
Member

shagkur commented Jul 18, 2020

@zeldin
btw i figured a long standing bug with the SPRX call stub, in psl1ght. Its only capable of calling methods where the passed
arguments don't exceed GPR10 (above this one, as by specification, arguments passed on the stack).

@zeldin
Copy link
Member

zeldin commented Jul 18, 2020

Cool. I guess nobody needed to call a function with that many arguments before. 😄

@shagkur
Copy link
Member

shagkur commented Jul 18, 2020

probably not. I wanted to use "gcmSetZcull" which takes 12 arguments and always got an assertion with one argument past gpr10. After analyzing the stub asm method (and my little understanding of PPC asm) i figured that we dont pass those args and the SPRX lib just gets undefined data.
Trying to fix this by a special EXPORT macro where one can pass the arg count and then i'll copy from the incoming to the outgoing stack.
At least thats my idea.

@shagkur
Copy link
Member

shagkur commented Jul 19, 2020

@zeldin Your change of the RS600_ABI_NAME to linux seems to have a huge impact on resulting binaries. In my case (it's a quite big c++ project) the resulting binary crashes immediately. reverting to the commit before (along with changing back sbrk.c in psl1ght) results in working binaries. Although the rsxtest samples did work with your commit too. Anyways. I strongly believe we need to stick to "aixdesc", as PS3 is AIX ABI compliant and therefor have to figure why the .text duplicate error happens. Perhaps we need to adapt/adjust cell64lv2.h a bit.

small side note: i got the "more than 8 function args to PRX calls" fixed :)

@zeldin
Copy link
Member

zeldin commented Jul 19, 2020

@shagkur You will need to recompile all files in order to avoid problems.

I don't think the PS3 is "AIX ABI compiliant" any more than it is "Linux ABI compliant", both AIX and (big endian) Linux use ELF ABIv1. The difference here is just that all the non-function-descriptor symbols (starting with .) for functions are dropped, since they are not needed anymore with GNU binutils 2.16 and newer (you can use the function-descriptor symbol as the target of a branch instruction, like in the updated psl1ght code) and the linker will sort things out for you.

@zeldin
Copy link
Member

zeldin commented Jul 19, 2020

And I know exactly why the .text duplicate error happens. It's because in the old mode for every function foo you get two exported symbols, foo for the function-descriptor, and .foo for the non-function-descriptor entry point. Since there are a number of pre-defined symbols starting with . (such symbols are reserved for internal use by the toolchain), such as .text, .data, .bss and so on, you run a risk of collision for every global function you define. And since the non-fuction-descriptor symbol is not actually needed, the sane thing to do seems to be to remove it.

@zeldin
Copy link
Member

zeldin commented Jul 19, 2020

Thanks!

@shagkur
Copy link
Member

shagkur commented Jul 20, 2020

@zeldin Here we go.... clone/fork this one: https://github.com/shagkur/irrlichtPS3.git and build example/2 (as well as the whole library)

@zeldin
Copy link
Member

zeldin commented Jul 20, 2020

@shakgur Thanks. I get this while compiling though:

Unable to load Cg, aborting.
Please install Cg toolkit and/or set the path (i.e. LD_LIBRARY_PATH) to the shared library accordingly.

"Cg toolkit" appears to be unmaintained, unsupported, and closed source X86 code. Is there an opensource alternative, or could you perhaps provide the relevant build products somewhere?

@shagkur
Copy link
Member

shagkur commented Jul 20, 2020

@zeldin What Os do you use? I'm on ubuntu and i just did sudo apt-get install -y nvidia-cg-toolkit.
But this would also mean the rsxtest sample of psl1ght does not compile for you either?

@zeldin
Copy link
Member

zeldin commented Jul 20, 2020

@shagkur Gentoo/ppc64.
Correct, the rsxtest sample does not build, but that is not needed to build and install PSL1GHT.

@shagkur
Copy link
Member

shagkur commented Jul 20, 2020

@zeldin okay. Sure it's not needed to get psl1ght built, but to know whether the stuff works or not it does matter ;)
So if i get it right there's no way for you to build irrlichtPS3 due to the missing cg toolkit on your platform :(

@zeldin
Copy link
Member

zeldin commented Jul 20, 2020

Well, if you give me the build products generated with Cg toolkit, I can build the rest. We are looking for an issue with gcc, not Cg toolkit, so I don't expect I shall need to rebuild those parts.
Going forward it would be nice with an open source solution to build the rsxtest sample of course, but I guess that is a separate issue. 😸

@shagkur
Copy link
Member

shagkur commented Jul 20, 2020

That's true. I'll get you the fragments needed to build the library.
Yes, would be nice, but there's no decent Opensource cg compiler out there as far as know.

@shagkur
Copy link
Member

shagkur commented Jul 20, 2020

irrlichtPS3shaders.zip

@shagkur
Copy link
Member

shagkur commented Jul 20, 2020

Put the files into the build folder of the library.

@zeldin
Copy link
Member

zeldin commented Jul 20, 2020

Thanks. Now I can build examples/2. Will investigate.

@shagkur
Copy link
Member

shagkur commented Jul 20, 2020

Thank your very much!

@zeldin
Copy link
Member

zeldin commented Jul 20, 2020

Hm, interresting. With the new gcc the .text section is smaller, but instead the .got section is larger. The total size seems to be the same...

@zeldin
Copy link
Member

zeldin commented Jul 20, 2020

The difference appears to be that in one case globals are found by loading their address from the .got (ld X(r2), with a R_PPC64_TOC16_DS relocation), an in the other by adding an offset to r2 (addis r2,X and addi X with R_PPC64_TOC16_HA and R_PPC64_TOC16_LO). Shouldn't matter as the resulting pointer should be the same either way, but I'll try to negate this change and see if anything else differs...

@shagkur
Copy link
Member

shagkur commented Jul 20, 2020

Well to me it seems by changing the abi name and in gcc configure addin -melf64ppc results in different build results. Keep in mind that the original PS3 SDK still is on gcc 4.1.1 and probably the underlaying FW OS is kinda relies on some sort of ABI compliancy.
But that's just wild guessing by me 😃

@zeldin
Copy link
Member

zeldin commented Jul 20, 2020

We've always been incompatible with the FW OS ABI (which uses 32 bit pointers, for examples). That why we need wrappers.

@zeldin
Copy link
Member

zeldin commented Jul 20, 2020

The TOC thing seems to be because large TOC support is now default (you've always had the options to turn it on), but I don't see why it should break anything. As I said, I'll try turning it off and see if anything else differs.

@shagkur
Copy link
Member

shagkur commented Jul 20, 2020

Yeah sure go on. I guess, nowadays, you know more and better about gcc toolchain build tweaking. I did that things at that time myself. But this is ages ago hence i'm quite happy you take a look at it.

@shagkur
Copy link
Member

shagkur commented Jul 20, 2020

Btw. i need your opinion on an other topic.....
As you know i figured the issue with the SPRX stub macro and methods with more than 8 function arguments, for which i've been able to fix it (see psl1ght commit). However, currently this is done by 2 different macros (since as of now, only 2 PRX methods are affected), but i could collapse it to macro. The advantage: Only one macro used (code maintainability) but the downside would be: all export.h files need to be touched and the arg count needs to be set too (lots of boilerplate work).
What do you think?

@shagkur
Copy link
Member

shagkur commented Jul 20, 2020

We've always been incompatible with the FW OS ABI (which uses 32 bit pointers, for examples). That why we need wrappers.

Well this is some strange awkwardness within FW OS ABI anyways. Having a 64bit system and then fiddling around with 32bit pointers.
And due to that i just think that a slight, not even harmful , change can affect the whole thing.

@zeldin
Copy link
Member

zeldin commented Jul 20, 2020

The 32-bit pointers do kind of make sense considering the PS3 has <1GB of RAM...
I think having two macros is fine. If you want to share some implementation I guess you could factor that out into a utility macro used by the other two?

@shagkur
Copy link
Member

shagkur commented Jul 20, 2020

well atm they don't really share much. Wouldn't make sense to create a utility macro therefore. I guess, for now i keep it with 2 and if i'm really in the mood to spend hours of stupid copy over work, i'll collapse the 2 into 1.

@zeldin
Copy link
Member

zeldin commented Jul 20, 2020

@shagkur I changed the default code model to "small" (max 64K TOC), and now the binary built in examples/2 is 100% identical to the one built with the old toolchain.

Note that there is no obvious reason why building with the "medium" model should make the code crash, this has nothing to do with ABI and is purely a code generation style option (the code uses a few more instructions for TOC lookups, but in return can handle TOC entries beyond the initial 64K). So it might be the case that there is a logical bomb type bug somewhere (in PSL1GHT, in irrlichtPS3, elsewhere) which just happened to become triggered by the code moving around a bit...

Its also a bit puzzling that you would get the "small" model before my changes (but I confirm that it does happen); both AIX and Linux use the "medium" code model by default in gcc 7 (in gcc 4 only "small" existed), and cell64lv2.h actually sets the model to "medium"...

@shagkur
Copy link
Member

shagkur commented Jul 20, 2020

@zeldin Have you been able to run the example on HW? or do you mind to send me your build version of the example, so i can test here?
This sounds really a bit strange. It's forcefully set to medium but previous gcc 7 toollchain built produced athe small model?
The only location where i could imagine an issue with TOC handling is the SPRX stub macro along with the SPRX linker?
Especially if the resulting binary gets "big" enough, what it does imo with the irrlichtPS3 example.
Well, i was kinda surprised too that it all of a sudden crashed with irrlichtPS3. Because the rsxtest samples all worked.
Currently i'm again on the commit before your change, but if you want (and if it's of any help for you) i can built with the recent version and run the example and then send you the backtrace?

@zeldin
Copy link
Member

zeldin commented Jul 20, 2020

I haven't tested it on HW yet, I'm making a full rebuild of everything now, I can send you the resulting binary afterwards.

Well, it wasn't forceful, it was just the default, but it was this default setting that I could change from "medium" to "small" to get the small model back afterward. There's probably a logical explanation for this, I just didn't see it without instrumentation of the code, and doing that seemed a little overkill. 😄

Regarding "big" binaries, the "medium" model actually makes the TOC smaller, and also can handle larger TOCs, so if anything it should work better with big binaries than the "small" model does... It's all very weird.

Well, if you have a backtrace maybe it could shed some light (l1ght?) on things. Feel free to collect one and pastebin it.

@zeldin
Copy link
Member

zeldin commented Jul 20, 2020

@shagkur Here are the binaries I build with everything from master: https://mc.pp.se/irrlicht_example_2.zip
I also tested it on PS3 hardware and it worked fine (showing some stone steps and a moving light source).

@shagkur
Copy link
Member

shagkur commented Jul 20, 2020

That sounds promising. Yeah, it's not that fancy example, but if it wouldn't work it'd crash immediatly.

@shagkur
Copy link
Member

shagkur commented Jul 20, 2020

@zeldin Here's the built from the "nonworking" toolchain with the stacktrace:
2_crash.zip
Oh, and yes, your build works fine :)

@zeldin
Copy link
Member

zeldin commented Jul 20, 2020

Looks like it's crashing in pngDecCreate, but the actual crash is down inside libpngdec.sprx somewhere. So I guess it got bad arguments from the caller (irr::video::CImageLoaderPNG::CImageLoaderPNG()) or something?

@shagkur
Copy link
Member

shagkur commented Jul 20, 2020

Okay. hmm...need to take a look then. Kinda strange it's not triggering the crash with the small model?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants