Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rustc: Start the deprecation of libserialize #19755

Merged
merged 1 commit into from
Dec 17, 2014

Conversation

alexcrichton
Copy link
Member

The primary focus of Rust's stability story at 1.0 is the standard library.
All other libraries distributed with the Rust compiler are planned to
be #[unstable] and therfore only accessible on the nightly channel of Rust. One
of the more widely used libraries today is libserialize, Rust's current solution
for encoding and decoding types.

The current libserialize library, however, has a number of drawbacks:

  • The API is not ready to be stabilize as-is and we will likely not have enough
    resources to stabilize the API for 1.0.
  • The library is not necessarily the speediest implementations with alternatives
    being developed out-of-tree (e.g. serde from erickt).
  • It is not clear how the API of Encodable/Decodable can evolve over time while
    maintaining backwards compatibility.

One of the major pros to the current libserialize, however, is
deriving(Encodable, Decodable) as short-hands for enabling serializing and
deserializing a type. This is unambiguously useful functionality, so we cannot
simply deprecate the in-tree libserialize in favor of an external crates.io
implementation.

For these reasons, this commit starts off a stability story for libserialize by
following these steps:

  1. The deriving(Encodable, Decodable) modes will be deprecated in favor of a
    renamed deriving(RustcEncodable, RustcDecodable).
  2. The in-tree libserialize will be deprecated in favor of an external
    rustc-serialize crate shipped on crates.io. The contents of the crate will be
    the same for now (but they can evolve separately).
  3. At 1.0 serialization will be performed through
    deriving(RustcEncodable, RustcDecodable) and the rustc-serialize crate. The
    expansions for each deriving mode will change from ::serialize::foo to
    ::rustc_serialize::foo.

This story will require that the compiler freezes its implementation of
RustcEncodable deriving for all of time, but this should be a fairly minimal
maintenance burden. Otherwise the crate in crates.io must always maintain the
exact definition of its traits, but the implementation of json, for example, can
continue to evolve in the semver-sense.

The major goal for this stabilization effort is to pave the road for a new
official serialization crate which can replace the current one, solving many of
its downsides in the process. We are not assuming that this will exist for 1.0,
hence the above measures. Some possibilities for replacing libserialize include:

  • If plugins have a stable API, then any crate can provide a custom deriving
    mode (will require some compiler work). This means that any new serialization
    crate can provide its own deriving with its own backing
    implementation, entirely obsoleting the current libserialize and fully
    replacing it.
  • Erick is exploring the possibility of code generation via preprocessing Rust
    source files in the near term until plugins are stable. This strategy would
    provide the same ergonomic benefit that deriving does today in theory.

So, in summary, the current libserialize crate is being deprecated in favor of
the crates.io-based rustc-serialize crate where the deriving modes are
appropriately renamed. This opens up space for a later implementation of
serialization in a more official capacity while allowing alternative
implementations to be explored in the meantime.

Concretely speaking, this change adds support for the RustcEncodable and
RustcDecodable deriving modes. After a snapshot is made warnings will be
turned on for usage of Encodable and Decodable as well as deprecating the
in-tree libserialize crate to encurage users to use rustc-serialize instead.

@alexcrichton
Copy link
Member Author

cc @aturon, @erickt, can you verify that this reflects the outcome of what we were discussing in the work week about libserialize?

@emk
Copy link
Contributor

emk commented Dec 11, 2014

@alexcrichton I'm working on bindings to a JavaScript interpreter which support Encode, and I've discovered that the deriving(Encodable) support in the compiler seems to have a couple of weird minor issues, as described here: #19756.

If RustcEncodable is going to be stablized, it's probably worth deciding whether the current behavior with struct enum variants and tuple structs is actually intentional. (It feels like the current behavior might be an oversight or a bug.) I'd be interested in contributing a patch, if that seems like the right decision.

Aside from this, the current deriving(Encodable) appears to be correct and sensible, at least based on the unit tests I've been writing.

@TyOverby
Copy link
Contributor

Is it possible for e.g. serde to contain all of the functionality of our current serialization functionality?

@alexcrichton
Copy link
Member Author

@emk I think we'd definitely be willing to accept updates in the pre-1.0 timeframe but we would not be able to afterwards. The library was designed in a time where struct variants weren't as much of a thing as they are now, so I wouldn't be too surprised if you ran into bugs :).

The good part is that in terms of functionality we ship we'd only have to update the deriving code (nothing in-tree), and we could catch up rustc-serialize afterwards.

@TyOverby I'm sure @erickt could answer that question better than I, but I don't see why it couldn't!

The primary focus of Rust's stability story at 1.0 is the standard library.
All other libraries distributed with the Rust compiler are planned to
be #[unstable] and therfore only accessible on the nightly channel of Rust. One
of the more widely used libraries today is libserialize, Rust's current solution
for encoding and decoding types.

The current libserialize library, however, has a number of drawbacks:

* The API is not ready to be stabilize as-is and we will likely not have enough
  resources to stabilize the API for 1.0.
* The library is not necessarily the speediest implementations with alternatives
  being developed out-of-tree (e.g. serde from erickt).
* It is not clear how the API of Encodable/Decodable can evolve over time while
  maintaining backwards compatibility.

One of the major pros to the current libserialize, however, is
`deriving(Encodable, Decodable)` as short-hands for enabling serializing and
deserializing a type. This is unambiguously useful functionality, so we cannot
simply deprecate the in-tree libserialize in favor of an external crates.io
implementation.

For these reasons, this commit starts off a stability story for libserialize by
following these steps:

1. The deriving(Encodable, Decodable) modes will be deprecated in favor of a
   renamed deriving(RustcEncodable, RustcDecodable).
2. The in-tree libserialize will be deprecated in favor of an external
   rustc-serialize crate shipped on crates.io. The contents of the crate will be
   the same for now (but they can evolve separately).
3. At 1.0 serialization will be performed through
   deriving(RustcEncodable, RustcDecodable) and the rustc-serialize crate. The
   expansions for each deriving mode will change from `::serialize::foo` to
   `::rustc_serialize::foo`.

This story will require that the compiler freezes its implementation of
`RustcEncodable` deriving for all of time, but this should be a fairly minimal
maintenance burden. Otherwise the crate in crates.io must always maintain the
exact definition of its traits, but the implementation of json, for example, can
continue to evolve in the semver-sense.

The major goal for this stabilization effort is to pave the road for a new
official serialization crate which can replace the current one, solving many of
its downsides in the process. We are not assuming that this will exist for 1.0,
hence the above measures. Some possibilities for replacing libserialize include:

* If plugins have a stable API, then any crate can provide a custom `deriving`
  mode (will require some compiler work). This means that any new serialization
  crate can provide its own `deriving` with its own backing
  implementation, entirely obsoleting the current libserialize and fully
  replacing it.

* Erick is exploring the possibility of code generation via preprocessing Rust
  source files in the near term until plugins are stable. This strategy would
  provide the same ergonomic benefit that `deriving` does today in theory.

So, in summary, the current libserialize crate is being deprecated in favor of
the crates.io-based rustc-serialize crate where the `deriving` modes are
appropriately renamed. This opens up space for a later implementation of
serialization in a more official capacity while allowing alternative
implementations to be explored in the meantime.

Concretely speaking, this change adds support for the `RustcEncodable` and
`RustcDecodable` deriving modes. After a snapshot is made warnings will be
turned on for usage of `Encodable` and `Decodable` as well as deprecating the
in-tree libserialize crate to encurage users to use rustc-serialize instead.
@emk
Copy link
Contributor

emk commented Dec 14, 2014

Thank you, @alexcrichton. Further investigation reveals that Decoder and Decodable are even more broken than Encoder and Encodable. Specifically, read_enum_struct_variant is never used. What's worse, however, is that read_enum doesn't have any way to indicate to indicate to its caller whether to use read_enum_variant or read_enum_struct_variant. This is no surprise, because deriving(Decodable) has never even tried to invoke read_enum_struct_variant.

In short, it's not clear that the Decoder API even makes sense.

I can see a couple ways forward:

  1. I could write up a detailed explanation of how to change Decoder, and then fix deriving(Encodable) and deriving(Decodable) to do the right thing.
  2. We could stop pretending that Encoder and Decoder have special support for enum struct variants and tuple structs, and we could rip the related functions out of the traits, since they've never been used and Decoder's support for them doesn't seem to make sense.
  3. I could run away quickly and port my code to serde. :-)

I guess the key question here is how RustcEncodable and RustcDecodable will actually work once serialize is moved into a standalone crate. How will they access Encodable and Decodable? Will libraries like serde just ignore RustcEncodable and RustcDecodable completely?

Thank you for your feedback!

@alexcrichton
Copy link
Member Author

Further investigation reveals that Decoder and Decodable are even more broken than Encoder and Encodable

Oh dear, this does sound bad! We should be able to make tweaks like this before 1.0, though.

I guess the key question here is how RustcEncodable and RustcDecodable will actually work once serialize is moved into a standalone crate

I plan on modifying their expansion to references ::rustc_serialize::{Decodable, Encodable} traits. This means that the traits in rustc-serialize on crates.io are forced to never change and the compiler will just blindly assume that you have extern crate "rustc-serialize" as rustc_serialize; at the top (as best as we can do, sadly).

Will libraries like serde just ignore RustcEncodable and RustcDecodable completely?

Sadly yes, the deriving expansion is fundamentally incompatible with the strategy that serde is taking. I'd love to just-as-nice deriving with serde, and there are two route to doing so:

  1. Support plugins in the stable channel of Rust. This means that serde can ship with its own set of deriving-mode expanders, making it even nicer than RustcEncodable!
  2. Leverage codegen to use the syntax-of-tomorrow today! @erickt could fill you in more on this :)

@emk
Copy link
Contributor

emk commented Dec 14, 2014

Further investigation reveals that Decoder and Decodable are even more broken than Encoder and Encodable

Oh dear, this does sound bad! We should be able to make tweaks like this before 1.0, though.

OK, in that case, here's a short summary of what needs to be done:

  1. RustcEncodable and RustcDecodable need to be modified to know about enum struct variants and and tuple structs, as described here: deriving(Encodable) does not use emit_enum_struct_variant or emit_tuple_struct #19756.
  2. The relationship between read_enum, read_enum_variant and read_enum_struct_variant needs to be rethought—perhaps these three functions should be collapsed into 1 or 2 functions, or perhaps read_enum should indicate whether to call read_enum_variant or read_enum_struct_variant next.

Anyway, here's the most most serious problem, illustrated by the JSON deserializer:

    fn read_enum<T, F>(&mut self, name: &str, f: F) -> DecodeResult<T> where
        F: FnOnce(&mut Decoder) -> DecodeResult<T>,
    {
        debug!("read_enum({})", name);
        f(self)
    }

    fn read_enum_variant<T, F>(&mut self, names: &[&str], f: F) -> DecodeResult<T> where
        F: FnOnce(&mut Decoder, uint) -> DecodeResult<T>,
    {
        debug!("read_enum_variant(names={})", names);
        let name = match self.pop() {
            Json::String(s) => s,
            Json::Object(mut o) => {
                // ...
            }
        };
        let idx = match names.iter()
                             .position(|n| str::eq_slice(*n, name.as_slice())) {
            Some(idx) => idx,
            None => return Err(UnknownVariantError(name))
        };
        f(self, idx)
    }

   // ...

    // (NOT ACTUALLY CALLED?)
    fn read_enum_struct_variant<T, F>(&mut self, names: &[&str], f: F) -> DecodeResult<T> where
        F: FnOnce(&mut Decoder, uint) -> DecodeResult<T>,
    {
        debug!("read_enum_struct_variant(names={})", names);
        self.read_enum_variant(names, f)
    }

How do we get from read_enum to read_enum_struct_variant?

Once we know what we want to have happen here, the rest is easier.

bors added a commit that referenced this pull request Dec 17, 2014
rustc: Start the deprecation of libserialize

Reviewed-by: aturon
bors referenced this pull request Dec 17, 2014
Using a type alias for iterator implementations is fragile since this
exposes the implementation to users of the iterator, and any changes
could break existing code.

This commit changes the iterators of `VecMap` to use
proper new types, rather than type aliases.  However, since it is
fair-game to treat a type-alias as the aliased type, this is a:

[breaking-change].

expand!(encodable::expand_deriving_encodable)
}
"Decodable" =>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alexcrichton Today's auto-rollup build failed with:

/Users/rustbuild/src/rust-buildbot/slave/auto-mac-64-opt/build/src/libsyntax/ext/deriving/mod.rs:126:1: 126:2 error: incorrect close delimiter: `}`
/Users/rustbuild/src/rust-buildbot/slave/auto-mac-64-opt/build/src/libsyntax/ext/deriving/mod.rs:126 }
                                                                                                     ^

Possibly because of a missing { on this line after "Decodable" =>?

alexcrichton added a commit to alexcrichton/rust that referenced this pull request Dec 17, 2014
The primary focus of Rust's stability story at 1.0 is the standard library.
All other libraries distributed with the Rust compiler are planned to
be #[unstable] and therfore only accessible on the nightly channel of Rust. One
of the more widely used libraries today is libserialize, Rust's current solution
for encoding and decoding types.

The current libserialize library, however, has a number of drawbacks:

* The API is not ready to be stabilize as-is and we will likely not have enough
  resources to stabilize the API for 1.0.
* The library is not necessarily the speediest implementations with alternatives
  being developed out-of-tree (e.g. serde from erickt).
* It is not clear how the API of Encodable/Decodable can evolve over time while
  maintaining backwards compatibility.

One of the major pros to the current libserialize, however, is
`deriving(Encodable, Decodable)` as short-hands for enabling serializing and
deserializing a type. This is unambiguously useful functionality, so we cannot
simply deprecate the in-tree libserialize in favor of an external crates.io
implementation.

For these reasons, this commit starts off a stability story for libserialize by
following these steps:

1. The deriving(Encodable, Decodable) modes will be deprecated in favor of a
   renamed deriving(RustcEncodable, RustcDecodable).
2. The in-tree libserialize will be deprecated in favor of an external
   rustc-serialize crate shipped on crates.io. The contents of the crate will be
   the same for now (but they can evolve separately).
3. At 1.0 serialization will be performed through
   deriving(RustcEncodable, RustcDecodable) and the rustc-serialize crate. The
   expansions for each deriving mode will change from `::serialize::foo` to
   `::rustc_serialize::foo`.

This story will require that the compiler freezes its implementation of
`RustcEncodable` deriving for all of time, but this should be a fairly minimal
maintenance burden. Otherwise the crate in crates.io must always maintain the
exact definition of its traits, but the implementation of json, for example, can
continue to evolve in the semver-sense.

The major goal for this stabilization effort is to pave the road for a new
official serialization crate which can replace the current one, solving many of
its downsides in the process. We are not assuming that this will exist for 1.0,
hence the above measures. Some possibilities for replacing libserialize include:

* If plugins have a stable API, then any crate can provide a custom `deriving`
  mode (will require some compiler work). This means that any new serialization
  crate can provide its own `deriving` with its own backing
  implementation, entirely obsoleting the current libserialize and fully
  replacing it.

* Erick is exploring the possibility of code generation via preprocessing Rust
  source files in the near term until plugins are stable. This strategy would
  provide the same ergonomic benefit that `deriving` does today in theory.

So, in summary, the current libserialize crate is being deprecated in favor of
the crates.io-based rustc-serialize crate where the `deriving` modes are
appropriately renamed. This opens up space for a later implementation of
serialization in a more official capacity while allowing alternative
implementations to be explored in the meantime.

Concretely speaking, this change adds support for the `RustcEncodable` and
`RustcDecodable` deriving modes. After a snapshot is made warnings will be
turned on for usage of `Encodable` and `Decodable` as well as deprecating the
in-tree libserialize crate to encurage users to use rustc-serialize instead.
@bors bors merged commit 7d1fa4e into rust-lang:master Dec 17, 2014
@alexcrichton alexcrichton deleted the rust-serialize branch December 17, 2014 23:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants