Interaction with other GC based languages #19

Open
encukou opened this issue May 10, 2023 · 7 comments

@encukou
Contributor

encukou commented May 10, 2023

From JPype, a “nearly intractable” issue:

Interfacing with another garbage-collected language can introduce “cross language reference loops”. Apparently, Java has mechanisms to deal with them, but would need support on the Python side as well.
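To make the problem concrete, here is a minimal pure-Python sketch of such a loop. Everything in it is an assumption for illustration: `FOREIGN_HEAP`, `ForeignProxy`, and `Owner` are hypothetical names, and the dictionary only stands in for strong references held by a foreign runtime (comparable to JNI global references). It is not how JPype or Python.NET are implemented.

```python
import gc
import weakref

# Hypothetical stand-in for a foreign runtime: handle -> strong references
# held on the foreign side. Python's collector treats this as a plain root.
FOREIGN_HEAP = {}


class ForeignProxy:
    """Python-side wrapper that knows only an opaque handle."""

    def __init__(self, handle):
        self.handle = handle
        FOREIGN_HEAP[handle] = []

    def foreign_set_field(self, py_obj):
        # The foreign object stores a strong reference back into Python.
        FOREIGN_HEAP[self.handle].append(py_obj)


class Owner:
    """Ordinary Python object that keeps a foreign object alive."""


def make_cross_language_loop():
    owner = Owner()
    owner.proxy = ForeignProxy(handle=1)   # Python -> foreign
    owner.proxy.foreign_set_field(owner)   # foreign -> Python
    return weakref.ref(owner)


owner_ref = make_cross_language_loop()
gc.collect()
leaked = owner_ref() is not None
print("owner still alive after gc.collect():", leaked)
```

Neither side can free its half: Python sees `owner` as referenced from a root it cannot look inside, and the foreign side would only drop that reference once the proxy dies, which requires `owner` to die first.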

@encukou encukou changed the title Interaction with other GC system Interaction with other GC systems May 10, 2023
@filmor

filmor commented May 10, 2023

Same for Python.NET, this creates a lot of headaches :)

@encukou
Contributor Author

encukou commented May 11, 2023

Does .NET have some tools that could help solve this?

@Thrameos

It isn't clear whether Java has any tools either. Java uses a block sweep garbage collection algorithm internally, and the JNI provides only three types of references: global (a strong reference), local (a strong reference that is automatically released when the scope is left), and weak. Given only those tools, one can't handle reference loops.

They do have another API, which is used when communicating over the network between two JVMs. It introduces additional resources to track references, as described in:

https://docs.oracle.com/javase/8/docs/platform/rmi/spec/rmi-arch4.html

I was hoping that I could tap into that API to handle the interactions between languages, as it is part of the Java GC. But it isn't clear to me whether it actually resolves reference loops. And even if it does, I would need the same cooperative hooks in Python so that it could participate.

I have thought a lot about the topic and searched for literature to help and have mostly come up short. The literature that I do find usually just deals with one side and does not even consider a reference loop.

http://www.schemeworkshop.org/2008/paper6.pdf

The only way that I can think to do this is to have each language mark objects that are referenced from foreign sources. When performing the sweep algorithm, the foreign list is traversed last, after all other objects have been marked. During that pass, if an object of foreign origin marks an object that is itself foreign in the other direction, the collector records that in a table which is available to the other language. When the other language later performs its own sweep, it can do the same and consult the table, which would allow it to see the reference loop.

The problems with this approach are that both languages have to maintain the list of foreign interactions, and that they will be marking with different cadences. Thus the relationship may not be discovered if one side has yet to perform GC, meaning that something is only really dead if both sides have performed GC and both agree that there was a loop. At that point one side declares the object dead and its strong reference goes away, which would kill the loop.
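The scheme described above can be sketched as a toy simulation. This is only one possible reading of the idea, under loud assumptions: the `Heap` class, the report layout (`live_out`, `conditional`), and `resolve` are all hypothetical, both heaps are frozen while the reports are produced, and nothing here corresponds to a real JVM or CPython API.

```python
class Heap:
    """Toy heap: named objects, local roots, and outgoing foreign references."""

    def __init__(self, roots, edges, foreign_out):
        self.roots = set(roots)          # locally rooted objects
        self.edges = edges               # obj -> objects in this heap
        self.foreign_out = foreign_out   # obj -> objects in the other heap

    def _reach(self, starts):
        seen, stack = set(), list(starts)
        while stack:
            obj = stack.pop()
            if obj not in seen:
                seen.add(obj)
                stack.extend(self.edges.get(obj, ()))
        return seen

    def _targets(self, objs):
        return {t for o in objs for t in self.foreign_out.get(o, ())}

    def inbound_for_other(self):
        # Objects of the other heap that this heap references.
        return {t for targets in self.foreign_out.values() for t in targets}

    def collect(self, inbound):
        """Mark local roots first, then the foreign list; publish a report."""
        locally_live = self._reach(self.roots)
        report = {"live_out": self._targets(locally_live), "conditional": {}}
        for obj in inbound:              # the foreign list is traversed last
            if obj not in locally_live:
                # Alive only because of a foreign reference: record what it
                # keeps alive on the other side.
                report["conditional"][obj] = self._targets(self._reach([obj]))
        return report


def resolve(report_a, report_b):
    """Inbound objects that no locally rooted object on either side reaches."""
    conditional = {**report_a["conditional"], **report_b["conditional"]}
    live = set()
    stack = list(report_a["live_out"] | report_b["live_out"])
    while stack:
        obj = stack.pop()
        if obj not in live:
            live.add(obj)
            stack.extend(conditional.get(obj, ()))
    return set(conditional) - live


py = Heap(roots={"py:main"},
          edges={"py:main": ["py:keep"]},
          foreign_out={"py:keep": ["java:kept"], "py:loop": ["java:loop"]})
java = Heap(roots=set(), edges={},
            foreign_out={"java:loop": ["py:loop"]})

py_report = py.collect(inbound=java.inbound_for_other())
java_report = java.collect(inbound=py.inbound_for_other())
dead = resolve(py_report, java_report)
print(sorted(dead))
```

Here `java:kept` survives because a locally rooted Python object holds it, while `py:loop` and `java:loop` keep only each other alive and are reported dead. The sketch needs a report from both sides before it can decide anything, which mirrors the cadence problem: if either heap changes between the two collections, the reports are stale and the result cannot be trusted.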

I am hoping that we can find some other literature or examples where this has been done successfully with .NET or Java. Until then it will remain a hard topic.

The reason that this worries me is that in Python a lot of objects can become containers (as almost everything has a dict). Thus it is all too easy to construct a reference loop in user code.

@steve-s
Contributor

steve-s commented May 17, 2023

> I am hoping that we can find some other literature or examples where this has been done successfully with .NET or Java. Until then it will remain a hard topic.

TBH an API that would allow this level of customization of the Java GC does not seem very likely. There are multiple GCs, each working differently; most of them partition the heap into regions and don't always collect all the regions during every GC cycle. On top of that, most of the GCs also need some read, write, or return barrier. And on top of all that, GCs usually need to be able to stop the application code, so this would include Python, I guess. I can imagine some research project that hacks one specific GC in OpenJDK, but it would be a very hard sell to upstream that. This project could also be useful in such a case: https://www.mmtk.io/about.

Maybe you can have some luck with a Java-based or .NET-based Python implementation like GraalPy, where Python objects are Java objects managed by the Java GC together with other Java objects.

@encukou
Contributor Author

encukou commented May 17, 2023

On the CPython side, at first glance it looks like a foreign-object wrapper could implement tp_traverse & tp_clear to traverse the foreign graph and report/clear any references that cross the boundary back to Python. But it would need a traverse API on the foreign side.
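As a rough illustration of why exposing those references matters, here is a pure-Python contrast case. It is only a simulation: a real wrapper would implement `tp_traverse` and `tp_clear` in C and would still need the foreign-side traverse API mentioned above. `ForeignRecord`, `TraversableProxy`, and `Owner` are hypothetical names, and the sketch assumes the foreign object is reachable only through its wrapper.

```python
import gc
import weakref


class ForeignRecord:
    """Hypothetical stand-in for a foreign object owned by one wrapper."""

    def __init__(self):
        self.fields = []   # references the foreign object holds


class TraversableProxy:
    def __init__(self):
        # Holding the record directly lets CPython's collector follow its
        # outgoing references, the role tp_traverse would play in C.
        self.record = ForeignRecord()


class Owner:
    """Ordinary Python object that keeps a foreign object alive."""


owner = Owner()
owner.proxy = TraversableProxy()          # Python -> foreign
owner.proxy.record.fields.append(owner)   # foreign -> Python back-reference

owner_ref = weakref.ref(owner)
del owner
gc.collect()
collected = owner_ref() is None
print("loop collected:", collected)
```

Because every edge of the loop is visible to the cyclic collector, it can find and break the cycle. The simulation says nothing about foreign objects that are also reachable from foreign roots, which is the harder part.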

@Thrameos

@steve-s Thanks for the info. That is exactly why I fear this is an intractable problem. I don't suppose that GraalPy would ever be the main branch? Having one GC is way more elegant than fighting with multiple, though I suppose .NET will just have the same issues.

The reason I was hoping to hack the remote protocol is that it is supported on multiple JVMs with different GCs, so it would at least be portable beyond a research project. But then I haven't found anything on how it manages reference loops.

@steve-s
Contributor

steve-s commented May 18, 2023

Meta: I propose renaming this to "Interaction with other GC based languages". To make it clear this is not about using some alternative GC system within Python.

@iritkatriel iritkatriel changed the title Interaction with other GC systems Interaction with other GC based languages Jun 5, 2023
@iritkatriel iritkatriel added the v label Jul 21, 2023
@iritkatriel iritkatriel removed the v label Oct 23, 2023