Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Foundation collection instances can be cached without their helper wrappers #539

Open
freakboy3742 opened this issue Nov 12, 2024 · 11 comments · May be fixed by #540 or #543
Open

Foundation collection instances can be cached without their helper wrappers #539

freakboy3742 opened this issue Nov 12, 2024 · 11 comments · May be fixed by #540 or #543
Labels
bug A crash or error in behavior.

Comments

@freakboy3742
Copy link
Member

Describe the bug

When an instance of a Foundation collection class is created (e.g., NSMutableDictionary), Rubicon uses a collection class (e.g., ObjCMutableDictInstance) to ensure that Pythonic helpers such as __setitem__ are available.

It is also possible to create an ObjCInstance directly from an objc_id. If this happens, it may be cached as a raw ObjCInstance, and not have the helper methods.

Rubicon caches object instances based on the object pointer, performing cache eviction only when the class name of the underlying object class doesn't match. However, an NSMutableDictionary created directly, and one created from an objc_id will have the same pointer, and the same underlying class name. As a result, it's possible for a recycled collection class instance to not have the helper methods as a result of a cache hit.

Steps to reproduce

I haven't worked out a simple way to reproduce this programmatically.

beeware/toga-chart#191 contains an example that can reproduce the issue with reasonable reliability. Run the example app, and resize the window continuously. I can usually trigger the exception reported by that ticket within a couple of seconds.

Expected behavior

Cached ObjC collection instances should always be returned with the appropriate wrapper, even if the memory address has previously been used by an object of the same type.

Screenshots

No response

Environment

  • Operating System: macOS
  • Python version: Any
  • Software versions:
    • Rubicon 0.4.9+

Logs


Additional context

The workaround fix is to modify the conditions for a cache eviction so that we don't just look at the class name, but at the class instance as well.

Even better would be to ensure that a non-wrapped instance isn't ever cached in the first place... but I'm having difficulty working out how the non-wrapped instance is being created.

Even better still would be to fix the issue with memory-based cache hits (#256).

@freakboy3742 freakboy3742 added the bug A crash or error in behavior. label Nov 12, 2024
@freakboy3742 freakboy3742 linked a pull request Nov 12, 2024 that will close this issue
4 tasks
@mhsmith
Copy link
Member

mhsmith commented Nov 12, 2024

It is also possible to create an ObjCInstance directly from an objc_id. If this happens, it may be cached as a raw ObjCInstance, and not have the helper methods.

How could this happen, when ObjCInstance.__new__ already uses type_for_objcclass to decide which Python wrapper class to use?

@freakboy3742
Copy link
Member Author

It is also possible to create an ObjCInstance directly from an objc_id. If this happens, it may be cached as a raw ObjCInstance, and not have the helper methods.

How could this happen, when ObjCInstance.__new__ already uses type_for_objcclass to decide which Python wrapper class to use?

After some more investigation, I've found the culprit... it's a niche side effect of #70.

The branch that is being triggered is this one. It's being triggered on the very first use of the subscript notation, which triggers load_methods handling; this causes a retrieval of self.objc_class, which causes the creation of an ObjCClass instance. ObjCClass is a subclass of ObjCInstance, so it is put in the instance cache.

As far as I can make out, caching the ObjCClass in this case is correct (and quick test of not caching instances created with this branch cause other problems). The bug being addressed here is caused if the memory address used for the ObjCClass instance is re-used with an actual dictionary. There might be some deeper cleanups if #70 or #256 are addressed, but at least in the short term, I'm increasingly convinced the cache type check is actually the right fix in this instance.

@mhsmith
Copy link
Member

mhsmith commented Nov 14, 2024

Summary of in-person discussion: it seems like this ought to be caught by the existing check of the cached object's class, since:

  • class of an NSDictionary instance is the NSDictionary class
  • class of the NSDIctionary class is its meta-class

It looks like the existing class comparison is making both of these look the same, but there should be a way of distinguishing them, perhaps by comparing pointers rather than names. This would avoid adding more complexity to what is already a pretty confusing area of the code.

@mhsmith
Copy link
Member

mhsmith commented Nov 15, 2024

Edited my previous comment to link to information about meta-classes. If the class and its meta-class both return the same value for class_getName, that would explain what we're seeing.

@samschott
Copy link
Member

Under which circumstances can the memory address of the ObjCClass instance be freed and reused? IIUC, you mean the memory address of NSMutableDictionary implementation (__NSDictionaryM). Shouldn't this is just always stay as is?

@freakboy3742
Copy link
Member Author

@samschott That's a really good question - your analysis was my initial "This can't possibly be happening" analysis, as well. However, if you reproduce the bug and set a breakpoint, an ObjCClass instance that is created to satisfy the request for self.objc_class when the method cache is populated. As far as I can make out, that's the memory address that is being re-used; and the issue is that because it's an ObjCClass that has the same class name, it's a cache hit.

What I can't work out is (a) why an ObjCClass instance would be disposed of; and (b) how to structure the cache check so that it catches this edge case without spoiling other cases.

@samschott
Copy link
Member

As far as I can make out, that's the memory address that is being re-used; and the issue is that because it's an ObjCClass that has the same class name, it's a cache hit.

I wasn't able to reproduce myself on a M1 MacBook Air with macOS 15.1, so can't actually see for myself what is going on. If indeed the memory address of the NSMutableDictionary class is reused, can you check if NSMutableDictionary lives at a different memory address later? This should be theoretically easy to validate. Recording the address of NSMutableDictionary before the exception is thrown should additionally also allow us to add debug statements to track if we ever see a different object at this address.

@samschott
Copy link
Member

samschott commented Nov 19, 2024

I did actually manage to reproduce this now, turns out a bit more dragging and resizing was needed than I thought :)

As expected, the memory address of the __NSDictionaryM class appears to remain the same throughout, even after the incorrect cache hit causes an exception. So unless there is another class with name __NSDictionaryM (which would at least go against ObjC conventions of having unique class names) I am skeptical about its memory being reused.

Assuming that this issue is indeed caused by the instance's class name and the class's class name being equal (which they are) and the class object being deallocated and its memory address reused (which I am skeptical about), I still see one more issue with the explanation that you give in #539 (comment):

The branch that is being triggered is this one. It's being triggered on the very first use of the subscript notation, which triggers load_methods handling; this causes a retrieval of self.objc_class, which causes the creation of an ObjCClass instance. ObjCClass is a subclass of ObjCInstance, so it is put in the instance cache.

If the ObjCClass instance is created by a retrieval of self.objc_class, survives in the cache and causes a false hit, its string representation should still be <ObjCClass: __NSDictionaryM>. However, the stack trace from #256 as well as my own print statements show that its <ObjCInstance: __NSDictionaryM>.

@samschott
Copy link
Member

samschott commented Nov 19, 2024

Could it be possible that some code in Rubicon-ObjC creates a NSMutableDictionary instance before the code that actually registers the wrapper gets to run?

@for_objcclass(NSMutableDictionary)

This is just wild speculation, I'm not even sure if that cache entry would have a chance to survive sufficiently log for this bug manifest. Edit: Maybe it could, given the unpredictability of Python garbage collection.

@freakboy3742
Copy link
Member Author

Could it be possible that some code in Rubicon-ObjC creates a NSMutableDictionary instance before the code that actually registers the wrapper gets to run?

I can't rule that out - and a "cached instance of an NSMutableDictionary created with ObjCInstance rather than ObjCMutableDictInstance would be consistent with the failure mode. However, I can't for the life of me work out how that "short lived" ObjCInstance would be created.

@samschott
Copy link
Member

However, I can't for the life of me work out how that "short lived" ObjCInstance would be created.

Neither can I. Maybe your initial theory was right after all, even though I do not fully understand it 🤷

@mhsmith mhsmith linked a pull request Nov 24, 2024 that will close this issue
7 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug A crash or error in behavior.
Projects
None yet
3 participants