Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make experimental Kotlin features optional for consumers. #279

Merged
merged 1 commit into from
Sep 18, 2020

Conversation

rfk
Copy link
Collaborator

@rfk rfk commented Sep 12, 2020

(This builds on #262, but only because I didn't want to deal with the merge conflicts in RustBufferHelpers.kt).

Previously, we were using two experimental features of Kotlin, in different ways:

  • Unsigned types, both as API-level types and for some internal helpers.
  • The buildMap function, for easily building a Map.

Both of these were emitted as part of the generated Kotlin code, regardless of whether the component actually needed them.

In the first case, consumers would get a warning that they need to opt-in to using experimental unsigned types, even if the API they were consuming did not use any unsigned types.

In the second case, we were silently opting in to an experimental API, meaning that consumers might find the generated code broken by a future Kotlin release without any warning.

This commit removes default-on experimental Kotlin APIs. Now, the only time Kotlin consumers will hear about experimental APIs is if they're using a component with unsigned integers in its public API, in which case they will need to explicitly opt in (and hence will be aware of the potential for bustage if the feature gets removed or changed in future Kotlin releases).

The key here was to refactor the ways the ComponentInterface deals with types, so that we can inspect the set of types that
are actually used by the interface and emit only the code required for those types. This is encapsulated in a new TypeUniverse
struct whose job is to maintain that whole-interface view of the set of types in use.

Fixes #259, in the sense that I tried compiling nimbus-sdk with the branch and it didn't emit any warnings.

@rfk rfk force-pushed the optional-kotlin-unsigned-types branch from a7a80d5 to 1f338f4 Compare September 12, 2020 06:04
let len = u32::try_from(self.len()).unwrap();
buf.put_u32(len); // We limit strings to u32::MAX bytes
let len = i32::try_from(self.len()).unwrap();
buf.put_i32(len); // We limit strings to u32::MAX bytes
Copy link
Collaborator Author

@rfk rfk Sep 12, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Kotlin code was already using signed types for string/list/map lengths, so I've explicitly made the other parts of the system use signed types as well. I don't really like it but it seems worthwhile to be consistent.


{% for typ in ci.iter_types() %}
{% let type_name = typ.canonical_name()|class_name_kt %}
{%- match typ -%}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the fun bit, where we iterate over all the known types and emit the helper code for each of them, rather than just always including all the helpers.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oooh, I like this! This will come in handy for the in-progress Desktop bindings, too—no need to emit complicated C++ serialization code for DOM types if they aren’t actually used!


fun UShort.lower(): Short {
return this.toShort()
}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This have just moved further up in the file, so things are grouped by datatype rather than grouping all the readers and all the writers.

val len = Int.read(buf)
return List<T>(len) {
readItem(buf)
fun lift{{ type_name }}(rbuf: RustBuffer.ByValue): {{ inner_type_name }}? {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this may be slightly controversial, and I welcome pushback here if you don't like it.

I've opted to "monomorphize" the helper functions for Optional, Sequence and Map types, emitting a separate helper for each concrete instance of such a type rather than a single generic helper that takes a lambda. I find this form more readable as I don't have to mentally untangle to lambda, and it happens to fit nicely with the "iterate over all the types" approach.

The downside may be slightly increased code size, although I have a hard time imagining it'll make a big difference in practice. I also think reasonable people could disagree with me about which is the more readable approach :-)

Anyway, if y'all hate this, an alternative would be to:

  • Add methods on TypeUniverse like has_map_types(), has_sequence_types() etc.
  • Change the template here to conditionally emit the Map/Sequence/Optional helpers only when those types are actually used in the interface.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mmmm, I could see that! Oddly, I find the generic form more readable, even with the lambda, to looking at a mostly-identical block of implementations for each collection type (and especially nested collections, like a sequence of maps)—even though that’s exactly what they do under the hood; you’ve just monomorphized them instead of the compiler.

I guess my brain is thinking about all these structures as generic types already, and calling a lambda for each map member or sequence element makes sense to me. Swift and C++ also keep them as generic type parameters, but those languages have richer generics than Kotlin, so it’s harder to replicate how they do that here. For those languages, the alternative you mentioned would be lovely.

But I don’t feel very strongly about my preference, and “it’s hard to read” is very subjective—especially for generated code that, though we try to keep readable, isn’t really meant to be read 😊 If you’d like to go ahead with monomorphizing them, I won’t object!

pub use types::{FFIType, Type};

/// The main public interface for this module, representing the complete details of an interface exposed
/// by a rust component and the details of consuming it via an extern-C FFI layer.
///
#[derive(Debug, Default, Serialize, Deserialize)]
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We no longer attempt to serialize/deserialize ComponentInterface structs, so I've removed the derivations here rather than adding them to my new structs.

@@ -336,13 +323,21 @@ impl APIBuilder for weedle::Definition<'_> {
false
};
if is_error {
ci.add_error_definition(d.convert(ci)?)
let err = d.convert(ci)?;
ci.add_error_definition(err)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This and similar changes, is because convert() and friends now take an &mut ci, and Rust's order of evaluation tries to take the mutable reference for ci.add_error_definition before evaluating any of its arguments. So it thinks I'm trying to do a double-mutable-borrow unless I pull the conversion into a separate line.

}
}

impl TypeFinder for weedle::TypedefDefinition<'_> {
fn find_type_definitions(&self, ci: &mut ComponentInterface) -> Result<()> {
fn add_type_definitions_to(&self, types: &mut TypeUniverse) -> Result<()> {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most of the changes in the TypeFinder and TypeResolve trait impls are simple renamings, I don't think I've made any substantial changes to the logic. (Apart from moving the "dont override builtin names" check into TypeUniverse.add_type_definition so that it's done in a single consistent spot).

@rfk rfk requested a review from jhugman September 12, 2020 06:10
@rfk rfk force-pushed the strings-as-buffers branch 2 times, most recently from 7202f38 to 155fd6c Compare September 12, 2020 06:50
@rfk rfk force-pushed the optional-kotlin-unsigned-types branch 2 times, most recently from ba50b87 to 5fc016c Compare September 12, 2020 06:55
///
/// (And to be honest, a big part of its job is to error out if we encounter any of the
/// many, many WebIDL type definitions that are not supported by uniffi.)
pub(crate) trait TypeResolver {
fn resolve_type_definition(&self, ci: &ComponentInterface) -> Result<Type>;
fn resolve_type_expression(&self, types: &mut TypeUniverse) -> Result<Type>;
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The thing this method operates on is not a "definition" in any meaningful sense - it's an expression for a type, built up from the named types we discovered in the type-finding pass. So, I've renamed the method.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still not wild about the name, (either the resolve or the expression bits).

This might be that the difference in naming between the TypeResolver and TypeFinder is so small.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's fair; I think the new name is better, but that's not the same thing as good.

"Resolve" is probably not what we're really mean here. Not for this PR, but I'd be happy to try to brainstorm something better. Maybe something using "builder" could be helpful, since this builds more complex structured types out of the named primitives we found in the previous pass?

Good naming is hard :-/

Base automatically changed from strings-as-buffers to main September 12, 2020 06:59
@rfk
Copy link
Collaborator Author

rfk commented Sep 12, 2020

Also, cross-linking for completeness, the TypeUniverse approach used here subsumes all the iter_types machinery I was messing around with over in #214; I think this version is easier to understand and also likely to scale better.

Copy link
Contributor

@linabutler linabutler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the one hand, this does feel a little hacky...one of the points you raised a while back is that our bindings will have to target the lowest common denominator, by design. It is a little unfortunate that Kotlin’s lack of support for unsigned integers mean we have to make string, sequence, and map sizes signed, even though our other typed languages—Swift and C++—can support them without a hitch. We also have to do a bit more work to make sure they’re in range, and, for languages that do support unsigned types, we need to first cast them to signed.

But I think that’s a philosophical objection, and it would be a shame if our Kotlin consumers had a worse experience with spurious warnings about unsigned types just because all the other languages support them.

Practically, I think this is the right way to go. It does restrict data structure sizes even more, but if you’re trying to pass a 2+ GB map over the FFI, you’re going to have a bad time for many reasons other than integer width.

Overall, this looks wonderful!


{% for typ in ci.iter_types() %}
{% let type_name = typ.canonical_name()|class_name_kt %}
{%- match typ -%}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oooh, I like this! This will come in handy for the in-progress Desktop bindings, too—no need to emit complicated C++ serialization code for DOM types if they aren’t actually used!

val len = Int.read(buf)
return List<T>(len) {
readItem(buf)
fun lift{{ type_name }}(rbuf: RustBuffer.ByValue): {{ inner_type_name }}? {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mmmm, I could see that! Oddly, I find the generic form more readable, even with the lambda, to looking at a mostly-identical block of implementations for each collection type (and especially nested collections, like a sequence of maps)—even though that’s exactly what they do under the hood; you’ve just monomorphized them instead of the compiler.

I guess my brain is thinking about all these structures as generic types already, and calling a lambda for each map member or sequence element makes sense to me. Swift and C++ also keep them as generic type parameters, but those languages have richer generics than Kotlin, so it’s harder to replicate how they do that here. For those languages, the alternative you mentioned would be lovely.

But I don’t feel very strongly about my preference, and “it’s hard to read” is very subjective—especially for generated code that, though we try to keep readable, isn’t really meant to be read 😊 If you’d like to go ahead with monomorphizing them, I won’t object!

// Named type definitions (including aliases).
type_definitions: HashMap<String, Type>,
// All the types in the universe, by canonical type name.
// We should use a `HashSet`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was there something else you wanted to say about the HashSet here? 😄

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Heh, I was originally going to use a HashMap and perform some extra cross-checks, and this comment was going to be to explain why that was...but in the end I didn't go that route.

}

impl<T: TypeFinder> TypeFinder for Vec<T> {
fn find_type_definitions(&self, ci: &mut ComponentInterface) -> Result<()> {
impl<T: TypeFinder> TypeFinder for &Vec<T> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we make this &[T] so you can resolve types for any kind of slice?

Copy link
Contributor

@jhugman jhugman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very much like this PR. I've left a few comments, and some nits, but I like the type universe/oracle, divvying up by types and the monomorphising of types. 🚚 🚢 :shipit: 🚀

pub struct ComponentInterface {
/// Every ComponentInterface gets tagged with the version of uniffi used to create it.
/// This helps us avoid using a lib compiled with one version together with bindings created
/// using a different version, which might introduce unsafety.
uniffi_version: String,
/// A map of all the nameable types used in the interface (including type aliases)
types: HashMap<String, Type>,
types: TypeUniverse,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like this change a lot.

// XXX TODO: reject duplicates; the type-finding pass won't help us here.
if self.functions.iter().any(|f| f.name == defn.name) {
bail!("duplicate function definition: {}", defn.name);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

// values of that type in a buffer.

{% for typ in ci.iter_types() %}
{% let type_name = typ.canonical_name()|class_name_kt %}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: naming. I've scrolled up to here to find out what this was.

Would prefer something more descriptive: canonical_type_name or type_identifier.

fun UShort.Companion.read(buf: ByteBuffer): UShort {
return UShort.lift(buf.getShort())
}

@ExperimentalUnsignedTypes
fun UShort.lower(): Short {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might be an ideal time to change the visibility of these methods to private or internal.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oof, yes, it had somehow slipped my mind that Kotlin defaults to public visibility. I'll update these to internal.

fun<T> liftSequence(rbuf: RustBuffer.ByValue, readItem: (ByteBuffer) -> T): List<T> {
return liftFromRustBuffer(rbuf) { buf -> readSequence(buf, readItem) }
}
{% when Type::Optional with (inner_type) -%}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: we're getting to library specific code now.

Suggest adding a comment in the generate code to make this easier to find?

Comment on lines +161 to +172
Type::Object(nm) => format!("Object{}", nm),
Type::Error(nm) => format!("Error{}", nm),
Type::Enum(nm) => format!("Enum{}", nm),
Type::Record(nm) => format!("Record{}", nm),
// Recursive types.
// These add a prefix to the name of the underlying type.
// The component API definition cannot give names to recursive types, so as long as the
// prefixes we add here are all unique amongst themselves, then we have no chance of
// acccidentally generating name collisions.
Type::Optional(t) => format!("Optional{}", t.canonical_name()),
Type::Sequence(t) => format!("Sequence{}", t.canonical_name()),
Type::Map(t) => format!("Map{}", t.canonical_name()),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's lots of places in the code where you're immediately calling name.to_camel_case().

class_name_kt(&type_.canonical_name())?

Does heck respect the existing case?

Just found: lowerSequenceEnumEnumeration, so yes it appears it does.

///
/// (And to be honest, a big part of its job is to error out if we encounter any of the
/// many, many WebIDL type definitions that are not supported by uniffi.)
pub(crate) trait TypeResolver {
fn resolve_type_definition(&self, ci: &ComponentInterface) -> Result<Type>;
fn resolve_type_expression(&self, types: &mut TypeUniverse) -> Result<Type>;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still not wild about the name, (either the resolve or the expression bits).

This might be that the difference in naming between the TypeResolver and TypeFinder is so small.

buf.putShort(this.toShort())
}

{% when Type::UInt32 -%}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW I love this grouping by type rather than reader and writers.

}
}

/// The set of all possible types used in a particular component interface.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might this be a good time to split up types.rs in to a few smaller files?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think it is time, but I don't want to do it as part of this PR; filed #286 for followup

Previously, we were using two experimental features of Kotlin,
in different ways:

* Unsigned types, both as API-level types and for some internal
  helpers.
* The `buildMap` function, for easily building a Map.

Both of these were emitted as part of the generated Kotlin code,
regardless of whether the component actually needed them.

In the first case, consumers would get a warning that they need
to opt-in to using experimental unsigned types, even if the API
they were consuming did not use any unsigned types.

In the second case, we were silently opting in to an experimental
API, meaning that consumers might find the generated code broken
by a future Kotlin release without any warning.

This commit removes default-on experimental Kotlin APIs. Now,
the only time Kotlin consumers will hear about experimental APIs
is if they're using a component with unsigned integers in its
public API, in which case they will need to explicitly opt in
(and hence will be aware of the potential for bustage if the
feature gets removed or changed in future Kotlin releases).

The key here was to refactor the ways the `ComponentInterface`
deals with types, so that we can inspect the set of types that
are actually used by the interface and emit only the code required
for those types. This is encapsulated in a new `TypeUniverse`
struct whose job is to maintain that whole-interface view of the
set of types in use.
@rfk rfk force-pushed the optional-kotlin-unsigned-types branch from 8878871 to 970b515 Compare September 18, 2020 07:46
@rfk rfk merged commit 576ec1a into main Sep 18, 2020
@rfk rfk deleted the optional-kotlin-unsigned-types branch September 18, 2020 07:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Generated Kotlin uses experimental unsigned types
3 participants