Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove class wrap for constructors in Rust exports #3562

Merged
merged 6 commits into from
Aug 25, 2023

Conversation

snOm3ad
Copy link
Contributor

@snOm3ad snOm3ad commented Aug 18, 2023

After #1594 constructors of Rust exported structs started using class wrapping when generating JS shims. Wrapping erases prototype information from the object instance in JS and as a result it is not possible to override methods (via inheritance) of the generated class.

This PR is rebased from #3561, it passes the constructor information from the Builder into the instruction translation function which ensures that constructors returning Self no longer rely on __wrap.

Fixes #3213

After rustwasm#1594 constructors of Rust exported structs started using class
wrapping when generating JS shims. Wrapping erases prototype information
from the object instance in JS and as a result it is not possible to
override methods (via inheritance) of the generated class.

Additionally, some checks to ensure constructors always return an
instance of `Self` were lost.

This PR is rebased from rustwasm#3561, it passes the constructor information
from the `Builder` into the instruction translation function which
is then used to modify the generated bindings.

The `return` statement is also removed instead the value is directly
attached to the instance.

Signed-off-by: Oliver T <geronimooliver00@gmail.com>
Copy link
Collaborator

@Liamolucko Liamolucko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I phrased myself poorly in the last PR - I wasn't trying to say that the constructor for Foo should accept either Foo or Option<Foo>, and disallow Bar or Option<Bar>, I was trying to say that Option<Foo> shouldn't be allowed either. Foo should be the only valid return type - Option<Foo>, Bar, and non-struct types such as String should all be disallowed.

As for choosing to pass constructor to instruction: sure, that's fine, I think it's about as clean as my suggestion. You'll need to insert a check that the function has the right return type somewhere else though, since you can't tell whether it's the last instruction or not.

crates/cli-support/src/js/binding.rs Outdated Show resolved Hide resolved
@Liamolucko
Copy link
Collaborator

Oops, I just noticed the CI failure. That's because of this test:

#[wasm_bindgen]
pub struct AsyncStruct;

#[wasm_bindgen]
impl AsyncStruct {
    #[wasm_bindgen(constructor)]
    pub async fn new() -> AsyncStruct {
        AsyncStruct
    }

    pub async fn method(&self) -> u32 {
        42
    }
}

It expects returning a Future<Output = AsyncStruct> to work. Apparently I added it in #3043, without thinking about it much.

Thinking back on it, I vaguely remember trying to fix this issue before, noticing that failing test and then bailing on it. (found it: 052281f)

While that's probably the most sensible use of a constructor returning something other than the type it's a constructor for, I still don't think it should be allowed - just use a builder instead. I just added that test on a whim without really thinking about it. However, I'm now a bit worried that someone might already be relying on that, and this could actually cause breakage.

Ugh, I don't know - the breakage could be avoided by falling back on the old behaviour if a constructor returns the wrong type (which is basically what the last PR did, now that I think about it), but it feels kind of crappy to be explicitly supporting this weirdness. I guess we could print some kind of deprecation warning, but that'd only be able to show up when you ran the CLI, not when using the macro... or we could just risk the breakage. Do you have any opinion?

@snOm3ad
Copy link
Contributor Author

snOm3ad commented Aug 20, 2023

I think I phrased myself poorly in the last PR - I wasn't trying to say that the constructor for Foo should accept either Foo or Option<Foo>, and disallow Bar or Option<Bar>, I was trying to say that Option<Foo> shouldn't be allowed either. Foo should be the only valid return type - Option<Foo>, Bar, and non-struct types such as String should all be disallowed.

Initially, that's what I thought as well. Hence why in #3561 I inserted the check before generating the adapter. Because you can check the return type from the signature that is passed to register_export_adapter.

You'll need to insert a check that the function has the right return type somewhere else though, since you can't tell whether it's the last instruction or not.

This is why inserting the check at this point is a bit more cumbersome because you now need to peek into the instructions that are being passed to generate this adapter. But this should not be that big of a deal, so long as we allow things like Result<T = Self, E>, Option<T = Self>, etc.

If we instead decide to remove support for these, then it becomes trickier to do the check here since you need to know the sequence of instructions these types generate. E.g. Result<T, E> is Instruction::UnwrapResult followed by Instruction::RustFromI32. Here, if you only check the last instruction you'd think that the constructor is returning T.

@snOm3ad
Copy link
Contributor Author

snOm3ad commented Aug 20, 2023

Oops, I just noticed the CI failure. That's because of this test:

#[wasm_bindgen]
pub struct AsyncStruct;

#[wasm_bindgen]
impl AsyncStruct {
    #[wasm_bindgen(constructor)]
    pub async fn new() -> AsyncStruct {
        AsyncStruct
    }

    pub async fn method(&self) -> u32 {
        42
    }
}

It expects returning a Future<Output = AsyncStruct> to work. Apparently I added it in #3043, without thinking about it much.

Interestingly enough the test passes if you return this from the constructor. I added this back from the original PR:

fn instruction(
    js: &mut JsBuilder,
    instr: &Instruction,
    log_error: &mut bool,
    constructor: &Option<String>,
) -> Result<(), Error> {
    match instr {
        Instruction::RustFromI32 { class } => {
            let val = js.pop();
            if let Some(name) = constructor {
                if name != class {
                    bail!("constructor for `{}` cannot return `{}`", name, class);
                }
                js.prelude(&format!("this.__wbg_ptr = {} >>> 0;", val));
                js.push(String::from("this"));
            } else {
                js.cx.require_class_wrap(class);
                js.push(format!("{}.__wrap({})", class, val));
            }
        }
    }
}

While that's probably the most sensible use of a constructor returning something other than the type it's a constructor for, I still don't think it should be allowed - just use a builder instead. I just added that test on a whim without really thinking about it. However, I'm now a bit worried that someone might already be relying on that, and this could actually cause breakage.

I think there are already a lot of users relying on this behaviour! The ray-tracing example in the repository returns Result<T = Self, JsValue> and a search of #[wasm_bindgen(constructor)] in sourcegraph shows multiple libraries using this pattern:

I think it might be possible to query crates.io to see how many crates are relying on a specific pattern. I remember seeing an article about this somewhere. If I can find it I'll post the data here just for future reference. But from the looks of it, if we sneak this in breakage is not a risk, it's a guarantee!

@snOm3ad
Copy link
Contributor Author

snOm3ad commented Aug 20, 2023

I guess we could print some kind of deprecation warning, but that'd only be able to show up when you ran the CLI, not when using the macro

We could emit the warning from the macro by using something like proc-macro-warning.

@Liamolucko
Copy link
Collaborator

I think there are already a lot of users relying on this behaviour! The ray-tracing example in the repository returns Result<T = Self, JsValue> and a search of #[wasm_bindgen(constructor)] in sourcegraph shows multiple libraries using this pattern:

Ah, I didn't think of Result - that should definitely be supported, having a constructor that can throw is a perfectly normal thing to do.

I also found an example of async constructors being used in the wild, so we probably need to support those too: https://github.com/RingsNetwork/rings-node/blob/master/node/src/browser/client.rs#L143-L153

So yeah, people are definitely already relying on this; let's not break it. I hadn't heard of sourcegraph before, thank you for pointing me to it!

We might still want to disallow some types though, because JavaScript handles constructor returns kind of weirdly:

class Foo {
    constructor(x) {
        this.a = 2;
        return x;
    }
}

console.log(new Foo(null));      // Foo { a: 2 }
console.log(new Foo(undefined)); // Foo { a: 2 }
console.log(new Foo("hi"));      // Foo { a: 2 }
console.log(new Foo(12));        // Foo { a: 2 }
console.log(new Foo(() => {}));  // [Function]
console.log(new Foo({}));        // {}

Anything other than an object gets completely ignored, so I think we should disallow returning JS primitives (String, i32, etc.) as they won't work properly.

That also includes Option<T>, since it should be returning null or undefined in the None case, both of which get ignored. That's what's leading to the current implementation just returning an object with no __wbg_ptr set like you mentioned.

You'll still be able to trigger this by accident using JsValue, but that needs to be supported in order to not break async constructors.

The last thing to consider is whether to allow returning different Rust structs. I'm basically indifferent now, it's not any weirder than returning a JsValue, so probably just allow it as well.

You'll still need to check for it though - we can't just set this.__wbg_ptr to a pointer to the wrong type, so if it's a different class you have to go back to __wrap.

Interestingly enough the test passes if you return this from the constructor. I added this back from the original PR:

That's working because async fns get desugared down to regular synchronous functions which return Promises. This means that they don't go through RustFromI32 in the first place and have their code generated the same as a normal function.

Going back to returning this is probably also the way forwards here, since we need to generate a return statement whenever the constructor doesn't return Self (or Result<Self, E>), and it's easier to just do it unconditionally.

As for where to insert the type checks, you should be able to do it in the same place as I did in 052281f: inside Context::export, since all of the information about the export is available there while it's being decoded.

@daxpedda daxpedda added the waiting for author Waiting for author to respond label Aug 22, 2023
snOm3ad and others added 3 commits August 23, 2023 20:54
Co-authored-by: Liam Murphy <liampm32@gmail.com>
Signed-off-by: Oliver T <geronimooliver00@gmail.com>
@snOm3ad snOm3ad requested a review from Liamolucko August 24, 2023 13:03
Signed-off-by: Oliver T <geronimooliver00@gmail.com>
Copy link
Collaborator

@Liamolucko Liamolucko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, I just have a couple small comments.

Comment on lines 490 to 506
Descriptor::I8
| Descriptor::U8
| Descriptor::ClampedU8
| Descriptor::I16
| Descriptor::U16
| Descriptor::I32
| Descriptor::U32
| Descriptor::F32
| Descriptor::F64
| Descriptor::I64
| Descriptor::U64
| Descriptor::Boolean
| Descriptor::Char
| Descriptor::CachedString
| Descriptor::String
| Descriptor::Option(_)
| Descriptor::Unit => bail!(msg),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should also include Enum, since (exported) enums are represented as integers in JS.

Refs and RefMuts to any of these should probably also be included, since they'll still map to primitives while behind a reference, although it doesn't matter a whole lot since you can't return references right now anyway.

(Imported / string enums are handled purely at the macro level, so we don't need to worry about them here.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved this check to it's own function so that it is cleaner to check nested descriptors. Like you mentioned Ref and RefMut are not relevant for now, but it occurred to me that Result still is.

crates/cli-support/src/wit/mod.rs Outdated Show resolved Hide resolved
Signed-off-by: Oliver T <geronimooliver00@gmail.com>
Copy link
Collaborator

@Liamolucko Liamolucko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
waiting for author Waiting for author to respond
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Cannot inherit from structs created via #[wasm_bindgen] attribute as expected
3 participants