Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Integrate foreign objects better with Ruby objects #2149

Closed
eregon opened this issue Nov 10, 2020 · 16 comments
Closed

Integrate foreign objects better with Ruby objects #2149

eregon opened this issue Nov 10, 2020 · 16 comments
Assignees
Labels
Milestone

Comments

@eregon
Copy link
Member

eregon commented Nov 10, 2020

Currently, foreign objects are handled specially in OutgoingForeignCallNode.
They are given a Ruby class, Truffle::Interop::Foreign but actually methods are not looked there, all method calls are directly dispatched to OutgoingForeignCallNode instead.

I think a better design would be to have a set of modules and classes for foreign objects, and make foreign objects behave more like Ruby objects by using the normal Ruby method lookup.
When a Ruby method would not be found, we would check members with isMemberInvocable and either invokeMember or readMember, instead of calling method_missing.

So we could have something like Polyglot::ForeignArray < Polyglot::ForeignObject < Object and Polyglot::ForeignArray would include Enumerable.
Then we could easily define the logic in Ruby.
All Object , Kernel, and BasicObject methods should then also handle foreign objects.

Given that a foreign object might have multiple traits:
https://github.com/oracle/graal/blob/68ce20c37071cfa91146f437eb6ab6ccffbd0759/truffle/src/com.oracle.truffle.api.interop/src/com/oracle/truffle/api/interop/InteropLibrary.java#L129-L145
we should probably use modules to represent those traits that map to Ruby concepts.

So we'd have module Polyglot::Trait::ForeignArray, module Polyglot::Trait::ForeignProc, etc.
And we'd need classes to compose them like (an object must be the instance of a class, not of a module)

module Polyglot::Trait::ForeignArray
  include Enumerable
  def each
    ...
  end
end

class Polyglot::ForeignArray < Polyglot::ForeignObject
  include Polyglot::Trait::ForeignArray
end

class Polyglot::ForeignArrayProc < Polyglot::ForeignObject
  include Polyglot::Trait::ForeignArray
  include Polyglot::Trait::ForeignProc
end

(just a sketch, we might want to think more about names)

foreign_object.class would actually return one of these classes, and we'll need an efficient way to compute that as it would be used for every method call on a foreign object.
Internally, to get the class of a foreign object, we'd probably want to add RubyLibrary#getMetaClass() and RubyLibrary#getLogicalClass() and implement them for foreign objects.

cc @rbotafogo @fniephaus @chumer

@eregon
Copy link
Member Author

eregon commented Nov 10, 2020

It might be better to inherit from the core Ruby types actually, like class Polyglot::ForeignArray < Array, and Polyglot::ForeignObject could just be a module instead (or more simply a trait). That means things like foreign_array.is_a?(Array) would be true. But we can only inherit from one class, so we'd have to find the "primary trait" or so.

@fniephaus
Copy link
Member

In theory, you could generate all sorts of Ruby classes each with a different combination of traits. On class lookup, TruffleRuby could return the best possible fit depending on the interop properties of the foreign object (e.g. hasArrayElements and such). That still wouldn't get you past the inheritance problem, so is_a? calls might need to be special-cased for foreign objects as well (e.g. returns true if Array or Exception depending on its interop props).

@eregon
Copy link
Member Author

eregon commented Nov 12, 2020

In theory, you could generate all sorts of Ruby classes each with a different combination of traits

Right, it would be nice if they have a name though, for inspection/debugging. But we could generate the classes dynamically from Java or from Ruby I suppose. I think initially I'd keep it static while we have few traits handled in Ruby.

That still wouldn't get you past the inheritance problem, so is_a? calls might need to be special-cased for foreign objects as well (e.g. returns true if Array or Exception depending on its interop props).

That's a good point, handling foreign object specially in is_a?/kind_of?/Module#=== would be one possibility.
However it kind of breaks the consistency of "foreign objects are just like Ruby objects with a Polyglot::Foreign* class + extra methods (which we should probably report in Kernel#methods).

@rbotafogo
Copy link
Contributor

I tried to implement some of the ideas, but had the following two problems:

  1. class ForeignArray was implemented inside of array.rb file. This is clearly wrong, but I didn't find yet a way of requiring a new file when lauching.
  2. Where in the directory hierarchy should ForeignObject, ForeignArray, and all the traits be added?

@eregon eregon self-assigned this Aug 16, 2021
eregon added a commit to eregon/truffleruby that referenced this issue Aug 16, 2021
* By giving foreign objects a proper Ruby class.
* Move the logic for all special methods on foreign objects (Implicit polyglot API)
  to regular Ruby methods of Polyglot::ForeignObject and Polyglot trait modules.
* Automatically generate Polyglot::Foreign* classes composing the InteropLibrary traits.
* Simplify Polyglot::ForeignObject#respond_to? now that most are defined as Ruby methods.
* Fixes oracle#2149
* From oracle#2153

Co-authored-by: Benoit Daloze <benoit.daloze@oracle.com>
eregon added a commit to eregon/truffleruby that referenced this issue Aug 16, 2021
* By giving foreign objects a proper Ruby class.
* Move the logic for all special methods on foreign objects (Implicit polyglot API)
  to regular Ruby methods of Polyglot::ForeignObject and Polyglot trait modules.
* Automatically generate Polyglot::Foreign* classes composing the InteropLibrary traits.
* #[] on foreign arrays now return nil if out of bounds like Array#[].
* Foreign arrays now have all methods of Enumerable and many methods of Array.
* Simplify Polyglot::ForeignObject#respond_to? now that most are defined as Ruby methods.
* Fixes oracle#2149
* From oracle#2153

Co-authored-by: Benoit Daloze <benoit.daloze@oracle.com>
eregon added a commit to rbotafogo/truffleruby that referenced this issue Aug 16, 2021
* By giving foreign objects a proper Ruby class.
* Move the logic for all special methods on foreign objects (Implicit polyglot API)
  to regular Ruby methods of Polyglot::ForeignObject and Polyglot trait modules.
* Automatically generate Polyglot::Foreign* classes composing the InteropLibrary traits.
* #[] on foreign arrays now return nil if out of bounds like Array#[].
* Foreign arrays now have all methods of Enumerable and many methods of Array.
* Simplify Polyglot::ForeignObject#respond_to? now that most are defined as Ruby methods.
* Fixes oracle#2149
* From oracle#2153

Co-authored-by: Benoit Daloze <benoit.daloze@oracle.com>
eregon added a commit to rbotafogo/truffleruby that referenced this issue Aug 16, 2021
* By giving foreign objects a proper Ruby class.
* Move the logic for all special methods on foreign objects (Implicit polyglot API)
  to regular Ruby methods of Polyglot::ForeignObject and Polyglot trait modules.
* Automatically generate Polyglot::Foreign* classes composing the InteropLibrary traits.
* #[] on foreign arrays now return nil if out of bounds like Array#[].
* Foreign arrays now have all methods of Enumerable and many methods of Array.
* Simplify Polyglot::ForeignObject#respond_to? now that most are defined as Ruby methods.
* Fixes oracle#2149
* From oracle#2153

Co-authored-by: Benoit Daloze <benoit.daloze@oracle.com>
@eregon eregon added this to the 21.3.0 milestone Aug 16, 2021
@eregon
Copy link
Member Author

eregon commented Aug 16, 2021

This is now implemented, see #2153 (comment) and 4b6964f for more details.
As an example:

p Truffle::Interop.to_java_array([1, 2, 3]) # => #<Polyglot::ForeignArray[Java] int[]:0x23eee4b8 [1, 2, 3]>

obj = Truffle::Debug.foreign_pointer_array_from_java(Truffle::Interop.to_java_array([1, 2, 3]))
p obj # => #<Polyglot::ForeignArrayPointer 0x0 [1, 2, 3]>
p obj.class.ancestors # => [Polyglot::ForeignArrayPointer, Polyglot::ArrayTrait, Enumerable, Polyglot::PointerTrait, Polyglot::ForeignObject, Object, Kernel, BasicObject]
p obj.map { _1 * 2 } # => [2, 4, 6]

Also foreign objects nicely display in irb now, while it used to be a problem with OutgoingForeignCallNode (due to irb using the Kernel#respond_to? to check if inspect was callable IIRC).

@fniephaus
Copy link
Member

Nice! What happens if traits have overlapping methods or aren't there any?
In Smalltalk, for example, Dictionary>>at: and Array>>at: accept different types of arguments. In a ForeignDictionaryArray, these methods would need to be composed somehow.

@eregon
Copy link
Member Author

eregon commented Aug 17, 2021

There is no Hash entries trait yet, but it should be added soon.
Right now there is an overlap for [], []= and delete for the Array trait and ForeignObject as these methods can access both indices and members:

def [](index)
if Primitive.object_kind_of?(index, Numeric)
at(index)
else
super(index)
end
end
def []=(member, value)
if Primitive.object_kind_of?(member, Numeric)
Truffle::Interop.write_array_element(self, member, value)
else
super(member, value)
end
end
def delete(member)
if Primitive.object_kind_of?(member, Numeric)
Truffle::Interop.remove_array_element(self, member)
else
super(member)
end
end

That one is easy to decide as arguments have different types (integer array indices vs String/Symbol member names).

I think we'll have to choose which takes precedence between Hash entries and Array elements.
I think it makes more sense to prefer the Hash entries as that's general, rather than use array elements for integer and hash entries for non-integers which sounds confusing / error-prone (Hash can have integer keys of course).
Array elements can still be accessed with more specific methods like #at, and of course using Truffle::Interop methods or even using a Ruby UnboundMethod by taking it from the trait
(Polyglot::ArrayTrait.instance_method(:[]).bind_call(foreign, 0)).

Hash entries will also hide members from [], []= and delete.
But I think that should be fine, because they can already be accessed with foreign.foo (if it's not also invocable) and foreign.foo = value, and called with foreign.bar(1).

@fniephaus
Copy link
Member

Introducing a notion of precedence is, of course, one possible way forward. However, it's always going to cause some incompatibilities. What if, for example, the users passes an object into a Ruby method and the wrong interface is exposed? Similar to target type mappings on the level of the host language, maybe there's a way to give developers some control over such precedences used in TruffleRuby.

Nonetheless, I do believe that some user intervention is always going to be needed. In TruffleSqueak, users can control what kind of interface their objects expose: myForeignObject asCollection, for example, creates a wrapper that exposes the full collection protocol if the foreign object implements the array interop trait.

@eregon
Copy link
Member Author

eregon commented Aug 17, 2021

I think we want to maximize portability (as in existing Ruby code should just work with foreign objects) as @chumer would say (and obj[] is used mostly for Array & Hash in Ruby), and a foreign hash seems more specific than just "an object with members". Members are typically accessed with obj.foo/obj.foo= anyway (it looks nicer, and is more concise), so there is probably very little overlap in practice.

Note there is already potential overlap betwen e.g. Enumerable methods and JS array methods (e.g., map).
But for portability clearly it's better to always prefer Ruby methods.

Yes, one way could be to have some explicit wrapper that only exposes one given trait.
I think we won't need that (we have Polyglot.as_enumerable for legacy reasons but I plan to deprecate it), IMHO the implicit APIs and Truffle::Interop should be enough.
But if there is a compelling realistic example I'm happy to reconsider.

@fniephaus
Copy link
Member

Hang on, I thought users are not supposed to use Truffle::Interop?
Also, I wasn't trying to convince you to do anything now... of course, we need more and realistic data points to act upon :)

@eregon
Copy link
Member Author

eregon commented Aug 17, 2021

Hang on, I thought users are not supposed to use Truffle::Interop?

Still changing my mind about that, because sometimes there seems to be no other way, or sometimes being explicit feels better/safer.

Notably in ExecJS, e.g., there is currently no way to express isMemberReadable, but maybe we could via instance_variable_defined? but that's more like isMemberExisting, and there is nothing e.g. for isMemberModifiable (easier to add now with Ruby classes).
Part of the issue there is that plain JS objects are not interop hashes, but only have members, and IIRC some members are not readable (will double check).

Maybe we should simply rename Truffle::Interop to TruffleRuby::Interop or so (Truffle is indeed private).

@fniephaus
Copy link
Member

I guess the main reason why it's not exposed in all guest languages is that it's still being evolved a lot and it's hard to change things if users depend on them. In terms of naming, why not call it Interop if users are allowed to use it? I mean TruffleRuby already provides Polyglot. Does a TruffleRuby module already exist?

@eregon
Copy link
Member Author

eregon commented Aug 17, 2021

I guess the main reason why it's not exposed in all guest languages is that it's still being evolved a lot and it's hard to change things if users depend on them.

I guess InteropLibrary won't change in incompatible ways, it would likely break all languages.

why not call it Interop if users are allowed to use it

Maybe, the main reason is to avoid potential name clashes by limiting the number of top-level constants added by TruffleRuby (no choice for Polyglot to be API-compatible with other languages).
But it sounds tempting, true.

Does a TruffleRuby module already exist?

Yes it does.

I managed locally to avoid using Truffle::Interop for ExecJS, isMemberReadable seems not needed (luckily).
The remaining issue is how to getMembers.
Right now there is foreign.keys but that's the old naming and should be deprecated.
Maybe expose it with instance_variables, which is otherwise always empty for foreign objects?
One detail there is Kernel#instance_variables returns an Array of Symbols, but here it would return an Array of String most likely (interop members are Strings). Could also convert the member names to Symbols for consistency, but it does feel a little bit awkward (it also implies all member names would be interned as Symbols, but maybe that's probably good for inline caches / conversion to j.l.String).

@fniephaus
Copy link
Member

I guess InteropLibrary won't change in incompatible ways, it would likely break all languages.

True :)

Maybe, the main reason is to avoid potential name clashes by limiting the number of top-level constants added by TruffleRuby (no choice for Polyglot to be API-compatible with other languages).
But it sounds tempting, true.

Since there already is a TruffleRuby module, TruffleRuby::Interop seems reasonable too.

I managed locally to avoid using Truffle::Interop for ExecJS, isMemberReadable seems not needed (luckily).
The remaining issue is how to getMembers.
Right now there is foreign.keys but that's the old naming and should be deprecated.
Maybe expose it with instance_variables, which is otherwise always empty for foreign objects?
One detail there is Kernel#instance_variables returns an Array of Symbols, but here it would return an Array of String most likely (interop members are Strings). Could also convert the member names to Symbols for consistency, but it does feel a little bit awkward (it also implies all member names would be interned as Symbols, but maybe that's probably good for inline caches / conversion to j.l.String).

Mh, well I don't know what specific ExecJS problem you're currently working on, but how do you identify instance variables from getMembers? Finding all readable members is additional work. And how do you plan to deal with readable members that are also invocable?

@eregon
Copy link
Member Author

eregon commented Aug 18, 2021

Mh, well I don't know what specific ExecJS problem you're currently working on, but how do you identify instance variables from getMembers? Finding all readable members is additional work. And how do you plan to deal with readable members that are also invocable?

A good question. Since in Ruby there is Kernel#methods and Kernel#instance_variables I was thinking to split the members in invocable and non-invocable for those methods. Kernel#methods also additionally reports methods defined on the Ruby class and ancestors.
(an upside of having Kernel#instance_variables and related working would be the default Kernel#inspect would already give useful output, and also make it easy to not show invocable members in inspect which is typically not helpful)
For ExecJS, JS objects with even executable members are filtered, because ExecJS does not attempt to serialize/pass JS functions to Ruby, similar to what JSON.stringify in JS does:

> JSON.stringify({"a": 1, "b": function(){} })
{"a":1}
> JSON.stringify(["a", function(){}])
["a",null]

So there I just filter those explicitly with respond_to?(:call).

@eregon
Copy link
Member Author

eregon commented Aug 18, 2021

Hash, Iterable and Iterator traits added in 21d0dc2 (with HashTrait#[] winning over ArrayTrait#[]).
Also foreign.class now always returns the Ruby class for better compatibility with existing Ruby code.

wildmaples pushed a commit to Shopify/truffleruby that referenced this issue Sep 1, 2021
* By giving foreign objects a proper Ruby class.
* Move the logic for all special methods on foreign objects (Implicit polyglot API)
  to regular Ruby methods of Polyglot::ForeignObject and Polyglot trait modules.
* Automatically generate Polyglot::Foreign* classes composing the InteropLibrary traits.
* #[] on foreign arrays now return nil if out of bounds like Array#[].
* Foreign arrays now have all methods of Enumerable and many methods of Array.
* Simplify Polyglot::ForeignObject#respond_to? now that most are defined as Ruby methods.
* Fixes oracle#2149
* From oracle#2153

Co-authored-by: Benoit Daloze <benoit.daloze@oracle.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants