Skip to content

Commit

Permalink
docs: add migration guide
Browse files Browse the repository at this point in the history
Adds a document that should help adjust users to the new deserialization
API.

Co-authored-by: Wojciech Przytuła <wojciech.przytula@scylladb.com>
  • Loading branch information
piodul and wprzytula committed Nov 13, 2024
1 parent bc804e7 commit 7b3838e
Show file tree
Hide file tree
Showing 3 changed files with 300 additions and 1 deletion.
1 change: 1 addition & 0 deletions docs/source/SUMMARY.md
Original file line number Diff line number Diff line change
Expand Up @@ -9,6 +9,7 @@

- [Migration guides](migration-guides/migration-guides.md)
- [Adjusting code to changes in serialization API introduced in 0.11](migration-guides/0.11-serialization.md)
- [Adjusting code to changes in deserialization API introduced in 0.15](migration-guides/0.15-deserialization.md)

- [Connecting to the cluster](connecting/connecting.md)
- [Compression](connecting/compression.md)
Expand Down
296 changes: 296 additions & 0 deletions docs/source/migration-guides/0.15-deserialization.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,296 @@
# Adjusting code to changes in deserialization API introduced in 0.15

In 0.15, a new deserialization API has been introduced. The new API improves type safety and performance of the old one, so it is highly recommended to switch to it. However, deserialization is an area of the API that users frequently interact with: deserialization traits appear in generic code and custom implementations have been written. In order to make migration easier, the driver still offers the old API, which - while opt-in - can be very easily switched to after version upgrade. Furthermore, a number of facilities have been introduced which help migrate the user code to the new API piece-by-piece.

The old API and migration facilities will be removed in a future major release.

## Introduction

### Old traits

The legacy API works by deserializing rows in the query response to a sequence of `Row`s. The `Row` is just a `Vec<Option<CqlValue>>`, where `CqlValue` is an enum that is able to represent any CQL value.

The user can request this type-erased representation to be converted into something useful. There are two traits that power this:

__`FromRow`__

```rust
# extern crate scylla;
# use scylla::frame::response::cql_to_rust::FromRowError;
# use scylla::frame::response::result::Row;
pub trait FromRow: Sized {
fn from_row(row: Row) -> Result<Self, FromRowError>;
}
```

__`FromCqlVal`__

```rust
# extern crate scylla;
# use scylla::frame::response::cql_to_rust::FromCqlValError;
// The `T` parameter is supposed to be either `CqlValue` or `Option<CqlValue>`
pub trait FromCqlVal<T>: Sized {
fn from_cql(cql_val: T) -> Result<Self, FromCqlValError>;
}
```

These traits are implemented for some common types:

- `FromRow` is implemented for tuples up to 16 elements,
- `FromCqlVal` is implemented for a bunch of types, and each CQL type can be converted to one of them.

While it's possible to implement those manually, the driver provides procedural macros for automatic derivation in some cases:

- `FromRow` - implements `FromRow` for a struct.
- `FromUserType` - generated an implementation of `FromCqlVal` for the struct, trying to parse the CQL value as a UDT.

Note: the macros above have a default behavior that is different than what `FromRow` and `FromUserType` do.

### New traits

The new API introduce two analogous traits that, instead of consuming pre-parsed `Vec<Option<CqlValue>>`, are given raw, serialized data with full information about its type. This leads to better performance and allows for better type safety.

The new traits are:

__`DeserializeRow<'frame, 'metadata>`__

```rust
# extern crate scylla;
# use scylla::deserialize::row::ColumnIterator;
# use scylla::deserialize::{DeserializationError, TypeCheckError};
# use scylla::frame::response::result::ColumnSpec;
pub trait DeserializeRow<'frame, 'metadata>
where
Self: Sized,
{
fn type_check(specs: &[ColumnSpec]) -> Result<(), TypeCheckError>;
fn deserialize(row: ColumnIterator<'frame, 'metadata>) -> Result<Self, DeserializationError>;
}
```

__`DeserializeValue<'frame, 'metadata>`__

```rust
# extern crate scylla;
# use scylla::deserialize::row::ColumnIterator;
# use scylla::deserialize::FrameSlice;
# use scylla::deserialize::{DeserializationError, TypeCheckError};
# use scylla::frame::response::result::ColumnType;
pub trait DeserializeValue<'frame, 'metadata>
where
Self: Sized,
{
fn type_check(typ: &ColumnType) -> Result<(), TypeCheckError>;
fn deserialize(
typ: &'metadata ColumnType<'metadata>,
v: Option<FrameSlice<'frame>>,
) -> Result<Self, DeserializationError>;
}
```

The above traits have been implemented for the same set of types as `FromRow` and `FromCqlVal`, respectively. Notably, `DeserializeRow` is implemented for `Row`, and `DeserializeValue` is implemented for `CqlValue`.

There are also `DeserializeRow` and `DeserializeValue` derive macros, analogous to `FromRow` and `FromUserType`, respectively - but with slightly different defaults (explained later in this doc page).

## Updating the code to use the new API

Some of the core types have been updated to use the new traits. Updating the code to use the new API should be straightforward.

### Basic queries

Sending queries with the single page API should work similarly as before. The `Session::query_{unpaged,single_page}`, `Session::execute_{unpaged,single_page}` and `Session::batch` functions have the same interface as before, the only exception being that they return a new, updated `QueryResult`.

Consuming rows from a result will require only minimal changes if you are using helper methods of the `QueryResult`. Now, there is no distinction between "typed" and "non-typed" methods; all methods that return rows need to have the type specified. For example, previously there used to be both `rows(self)` and `rows_typed<RowT: FromRow>(self)`, now there is only a single `rows<R: DeserializeRow<'frame, 'metadata>>(&self)`. Another thing worth mentioning is that the returned iterator now _borrows_ from the `QueryResult` instead of consuming it.

Note that the `QueryResult::rows` field is not available anymore. If you used to access it directly, you need to change your code to use the helper methods instead.

Before:

```rust
# extern crate scylla;
# use scylla::LegacySession;
# use std::error::Error;
# async fn check_only_compiles(session: &LegacySession) -> Result<(), Box<dyn Error>> {
let iter = session
.query_unpaged("SELECT name, age FROM my_keyspace.people", &[])
.await?
.rows_typed::<(String, i32)>()?;
for row in iter {
let (name, age) = row?;
println!("{} has age {}", name, age);
}
# Ok(())
# }
```

After:

```rust
# extern crate scylla;
# use scylla::Session;
# use std::error::Error;
# async fn check_only_compiles(session: &Session) -> Result<(), Box<dyn Error>> {
// 1. Note that the result must be converted to a rows result, and only then
// an iterator created.
let result = session
.query_unpaged("SELECT name, age FROM my_keyspace.people", &[])
.await?
.into_rows_result()?
.unwrap();

// 2. Note that `rows` is used here, not `rows_typed`.
// 3. Note that the new deserialization framework support deserializing types
// that borrow directly from the result frame; let's use them to avoid
// needless allocations.
for row in result.rows::<(&str, i32)>()? {
let (name, age) = row?;
println!("{} has age {}", name, age);
}
# Ok(())
# }
```

### Iterator queries

The `Session::query_iter` and `Session::execute_iter` have been adjusted, too. They now return a `QueryPager` - an intermediate object which needs to be converted into `TypedRowStream` first before being actually iterated over.

Before:

```rust
# extern crate scylla;
# extern crate futures;
# use scylla::LegacySession;
# use std::error::Error;
# use scylla::IntoTypedRows;
# use futures::stream::StreamExt;
# async fn check_only_compiles(session: &LegacySession) -> Result<(), Box<dyn Error>> {
let mut rows_stream = session
.query_iter("SELECT name, age FROM my_keyspace.people", &[])
.await?
.into_typed::<(String, i32)>();

while let Some(next_row_res) = rows_stream.next().await {
let (a, b): (String, i32) = next_row_res?;
println!("a, b: {}, {}", a, b);
}
# Ok(())
# }
```

After:

```rust
# extern crate scylla;
# extern crate futures;
# use scylla::Session;
# use std::error::Error;
# use futures::stream::StreamExt;
# async fn check_only_compiles(session: &Session) -> Result<(), Box<dyn Error>> {
let mut rows_stream = session
.query_iter("SELECT name, age FROM my_keyspace.people", &[])
.await?
// The type of the TypedRowStream is inferred from further use of it.
// Alternatively, it can be specified using turbofish syntax:
// .rows_stream::<(String, i32)>()?;
.rows_stream()?;

while let Some(next_row_res) = rows_stream.next().await {
let (a, b): (String, i32) = next_row_res?;
println!("a, b: {}, {}", a, b);
}
# Ok(())
# }
```

Currently, `QueryPager`/`TypedRowStream` do not support deserialization of borrowed types due to limitations of Rust with regard to lending streams. If you want to deserialize borrowed types not to incur additional allocations, use manual paging (`{query/execute}_single_page`) API.

### Procedural macros

As mentioned in the Introduction section, the driver provides new procedural macros for the `DeserializeRow` and `DeserializeValue` traits that are meant to replace `FromRow` and `FromUserType`, respectively. The new macros are designed to be slightly more type-safe by matching column/UDT field names to rust field names dynamically. This is a different behavior to what the old macros used to do, but the new macros can be configured with `#[attributes]` to simulate the old behavior.

__`FromRow` vs. `DeserializeRow`__

The impl generated by `FromRow` expects columns to be in the same order as the struct fields. The `FromRow` trait does not have information about column names, so it cannot match them with the struct field names. You can use `enforce_order` and `skip_name_checks` attributes to achieve such behavior via `DeserializeRow` trait.

__`FromUserType` vs. `DeserializeValue`__

The impl generated by `FromUserType` expects UDT fields to be in the same order as the struct fields. Field names should be the same both in the UDT and in the struct. You can use the `enforce_order` attribute to achieve such behavior via the `DeserializeValue` trait.

### Adjusting custom impls of deserialization traits

If you have a custom type with a hand-written `impl FromRow` or `impl FromCqlVal`, the best thing to do is to just write a new impl for `DeserializeRow` or `DeserializeValue` manually. Although it's technically possible to implement the new traits by using the existing implementation of the old ones, rolling out a new implementation will avoid performance problems related to the inefficient `CqlValue` representation.

## Accessing the old API

Most important types related to deserialization of the old API have been renamed and contain a `Legacy` prefix in their names:

- `Session` -> `LegacySession`
- `CachingSession` -> `LegacyCachingSession`
- `RowIterator` -> `LegacyRowIterator`
- `TypedRowIterator` -> `LegacyTypedRowIterator`
- `QueryResult` -> `LegacyQueryResult`

If you intend to quickly migrate your application by using the old API, you can just import the legacy stuff and alias it as the new one, e.g.:

```rust
# extern crate scylla;
use scylla::LegacySession as Session;
```

In order to create the `LegacySession` instead of the new `Session`, you need to use `SessionBuilder`'s `build_legacy()` method instead of `build()`:

```rust
# extern crate scylla;
# use scylla::{LegacySession, SessionBuilder};
# use std::error::Error;
# async fn check_only_compiles() -> Result<(), Box<dyn Error>> {
let session: LegacySession = SessionBuilder::new()
.known_node("127.0.0.1")
.build_legacy()
.await?;
# Ok(())
# }
```

## Mixing the old and the new API

It is possible to use different APIs in different parts of the program. The `Session` allows to create a `LegacySession` object that has the old API but shares all resources with the session that has the new API (and vice versa - you can create a new API session from the old API session).

```rust
# extern crate scylla;
# use scylla::{LegacySession, Session};
# use std::error::Error;
# async fn check_only_compiles(new_api_session: &Session) -> Result<(), Box<dyn Error>> {
// All of the session objects below will use the same resources: connections,
// metadata, current keyspace, etc.
let old_api_session: LegacySession = new_api_session.make_shared_session_with_legacy_api();
let another_new_api_session: Session = old_api_session.make_shared_session_with_new_api();
# Ok(())
# }
```

In addition to that, it is possible to convert a `QueryResult` to `LegacyQueryResult`:

```rust
# extern crate scylla;
# use scylla::{QueryResult, LegacyQueryResult};
# use std::error::Error;
# async fn check_only_compiles(result: QueryResult) -> Result<(), Box<dyn Error>> {
let result: QueryResult = result;
let legacy_result: LegacyQueryResult = result.into_legacy_result()?;
# Ok(())
# }
```

... and `QueryPager` into `LegacyRowIterator`:

```rust
# extern crate scylla;
# use scylla::transport::iterator::{QueryPager, LegacyRowIterator};
# use std::error::Error;
# async fn check_only_compiles(pager: QueryPager) -> Result<(), Box<dyn Error>> {
let pager: QueryPager = pager;
let legacy_result: LegacyRowIterator = pager.into_legacy();
# Ok(())
# }
```
4 changes: 3 additions & 1 deletion docs/source/migration-guides/migration-guides.md
Original file line number Diff line number Diff line change
@@ -1,11 +1,13 @@
# Migration guides

- [Serialization changes in version 0.11](0.11-serialization.md)
- [Deserialization changes in version 0.15](0.15-deserialization.md)

```{eval-rst}
.. toctree::
:hidden:
:glob:
0.11-serialization
```
0.15-deserialization
```

0 comments on commit 7b3838e

Please sign in to comment.