Skip to content

Commit

Permalink
Merge pull request #1121 from wprzytula/deserialization-api-migration…
Browse files Browse the repository at this point in the history
…-aid

Aid in migration to the new deserialization API
  • Loading branch information
wprzytula authored Nov 13, 2024
2 parents 8253681 + b7d1dfe commit 8ccc597
Show file tree
Hide file tree
Showing 17 changed files with 526 additions and 9 deletions.
1 change: 1 addition & 0 deletions docs/source/SUMMARY.md
Original file line number Diff line number Diff line change
Expand Up @@ -9,6 +9,7 @@

- [Migration guides](migration-guides/migration-guides.md)
- [Adjusting code to changes in serialization API introduced in 0.11](migration-guides/0.11-serialization.md)
- [Adjusting code to changes in deserialization API introduced in 0.15](migration-guides/0.15-deserialization.md)

- [Connecting to the cluster](connecting/connecting.md)
- [Compression](connecting/compression.md)
Expand Down
295 changes: 295 additions & 0 deletions docs/source/migration-guides/0.15-deserialization.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,295 @@
# Adjusting code to changes in deserialization API introduced in 0.15

In 0.15, a new deserialization API has been introduced. The new API improves type safety and performance of the old one, so it is highly recommended to switch to it. However, deserialization is an area of the API that users frequently interact with: deserialization traits appear in generic code and custom implementations have been written. In order to make migration easier, the driver still offers the old API, which - while opt-in - can be very easily switched to after version upgrade. Furthermore, a number of facilities have been introduced which help migrate the user code to the new API piece-by-piece.

The old API and migration facilities will be removed in a future major release.

## Introduction

### Old traits

The legacy API works by deserializing rows in the query response to a sequence of `Row`s. The `Row` is just a `Vec<Option<CqlValue>>`, where `CqlValue` is an enum that is able to represent any CQL value.

The user can request this type-erased representation to be converted into something useful. There are two traits that power this:

__`FromRow`__

```rust
# extern crate scylla;
# use scylla::frame::response::cql_to_rust::FromRowError;
# use scylla::frame::response::result::Row;
pub trait FromRow: Sized {
fn from_row(row: Row) -> Result<Self, FromRowError>;
}
```

__`FromCqlVal`__

```rust
# extern crate scylla;
# use scylla::frame::response::cql_to_rust::FromCqlValError;
// The `T` parameter is supposed to be either `CqlValue` or `Option<CqlValue>`
pub trait FromCqlVal<T>: Sized {
fn from_cql(cql_val: T) -> Result<Self, FromCqlValError>;
}
```

These traits are implemented for some common types:

- `FromRow` is implemented for tuples up to 16 elements,
- `FromCqlVal` is implemented for a bunch of types, and each CQL type can be converted to one of them.

While it's possible to implement those manually, the driver provides procedural macros for automatic derivation in some cases:

- `FromRow` - implements `FromRow` for a struct.
- `FromUserType` - generated an implementation of `FromCqlVal` for the struct, trying to parse the CQL value as a UDT.

Note: the macros above have a default behavior that is different than what `FromRow` and `FromUserType` do.

### New traits

The new API introduce two analogous traits that, instead of consuming pre-parsed `Vec<Option<CqlValue>>`, are given raw, serialized data with full information about its type. This leads to better performance and allows for better type safety.

The new traits are:

__`DeserializeRow<'frame, 'metadata>`__

```rust
# extern crate scylla;
# use scylla::deserialize::row::ColumnIterator;
# use scylla::deserialize::{DeserializationError, TypeCheckError};
# use scylla::frame::response::result::ColumnSpec;
pub trait DeserializeRow<'frame, 'metadata>
where
Self: Sized,
{
fn type_check(specs: &[ColumnSpec]) -> Result<(), TypeCheckError>;
fn deserialize(row: ColumnIterator<'frame, 'metadata>) -> Result<Self, DeserializationError>;
}
```

__`DeserializeValue<'frame, 'metadata>`__

```rust
# extern crate scylla;
# use scylla::deserialize::row::ColumnIterator;
# use scylla::deserialize::FrameSlice;
# use scylla::deserialize::{DeserializationError, TypeCheckError};
# use scylla::frame::response::result::ColumnType;
pub trait DeserializeValue<'frame, 'metadata>
where
Self: Sized,
{
fn type_check(typ: &ColumnType) -> Result<(), TypeCheckError>;
fn deserialize(
typ: &'metadata ColumnType<'metadata>,
v: Option<FrameSlice<'frame>>,
) -> Result<Self, DeserializationError>;
}
```

The above traits have been implemented for the same set of types as `FromRow` and `FromCqlVal`, respectively. Notably, `DeserializeRow` is implemented for `Row`, and `DeserializeValue` is implemented for `CqlValue`.

There are also `DeserializeRow` and `DeserializeValue` derive macros, analogous to `FromRow` and `FromUserType`, respectively - but with slightly different defaults (explained later in this doc page).

## Updating the code to use the new API

Some of the core types have been updated to use the new traits. Updating the code to use the new API should be straightforward.

### Basic queries

Sending queries with the single page API should work similarly as before. The `Session::query_{unpaged,single_page}`, `Session::execute_{unpaged,single_page}` and `Session::batch` functions have the same interface as before, the only exception being that they return a new, updated `QueryResult`.

Consuming rows from a result will require only minimal changes if you are using helper methods of the `QueryResult`. Now, there is no distinction between "typed" and "non-typed" methods; all methods that return rows need to have the type specified. For example, previously there used to be both `rows(self)` and `rows_typed<RowT: FromRow>(self)`, now there is only a single `rows<R: DeserializeRow<'frame, 'metadata>>(&self)`. Another thing worth mentioning is that the returned iterator now _borrows_ from the `QueryResult` instead of consuming it.

Note that the `QueryResult::rows` field is not available anymore. If you used to access it directly, you need to change your code to use the helper methods instead.

Before:

```rust
# extern crate scylla;
# use scylla::LegacySession;
# use std::error::Error;
# async fn check_only_compiles(session: &LegacySession) -> Result<(), Box<dyn Error>> {
let iter = session
.query_unpaged("SELECT name, age FROM my_keyspace.people", &[])
.await?
.rows_typed::<(String, i32)>()?;
for row in iter {
let (name, age) = row?;
println!("{} has age {}", name, age);
}
# Ok(())
# }
```

After:

```rust
# extern crate scylla;
# use scylla::Session;
# use std::error::Error;
# async fn check_only_compiles(session: &Session) -> Result<(), Box<dyn Error>> {
// 1. Note that the result must be converted to a rows result, and only then
// an iterator created.
let result = session
.query_unpaged("SELECT name, age FROM my_keyspace.people", &[])
.await?
.into_rows_result()?;

// 2. Note that `rows` is used here, not `rows_typed`.
// 3. Note that the new deserialization framework support deserializing types
// that borrow directly from the result frame; let's use them to avoid
// needless allocations.
for row in result.rows::<(&str, i32)>()? {
let (name, age) = row?;
println!("{} has age {}", name, age);
}
# Ok(())
# }
```

### Iterator queries

The `Session::query_iter` and `Session::execute_iter` have been adjusted, too. They now return a `QueryPager` - an intermediate object which needs to be converted into `TypedRowStream` first before being actually iterated over.

Before:

```rust
# extern crate scylla;
# extern crate futures;
# use scylla::LegacySession;
# use std::error::Error;
# use scylla::IntoTypedRows;
# use futures::stream::StreamExt;
# async fn check_only_compiles(session: &LegacySession) -> Result<(), Box<dyn Error>> {
let mut rows_stream = session
.query_iter("SELECT name, age FROM my_keyspace.people", &[])
.await?
.into_typed::<(String, i32)>();

while let Some(next_row_res) = rows_stream.next().await {
let (a, b): (String, i32) = next_row_res?;
println!("a, b: {}, {}", a, b);
}
# Ok(())
# }
```

After:

```rust
# extern crate scylla;
# extern crate futures;
# use scylla::Session;
# use std::error::Error;
# use futures::stream::StreamExt;
# async fn check_only_compiles(session: &Session) -> Result<(), Box<dyn Error>> {
let mut rows_stream = session
.query_iter("SELECT name, age FROM my_keyspace.people", &[])
.await?
// The type of the TypedRowStream is inferred from further use of it.
// Alternatively, it can be specified using turbofish syntax:
// .rows_stream::<(String, i32)>()?;
.rows_stream()?;

while let Some(next_row_res) = rows_stream.next().await {
let (a, b): (String, i32) = next_row_res?;
println!("a, b: {}, {}", a, b);
}
# Ok(())
# }
```

Currently, `QueryPager`/`TypedRowStream` do not support deserialization of borrowed types due to limitations of Rust with regard to lending streams. If you want to deserialize borrowed types not to incur additional allocations, use manual paging (`{query/execute}_single_page`) API.

### Procedural macros

As mentioned in the Introduction section, the driver provides new procedural macros for the `DeserializeRow` and `DeserializeValue` traits that are meant to replace `FromRow` and `FromUserType`, respectively. The new macros are designed to be slightly more type-safe by matching column/UDT field names to rust field names dynamically. This is a different behavior to what the old macros used to do, but the new macros can be configured with `#[attributes]` to simulate the old behavior.

__`FromRow` vs. `DeserializeRow`__

The impl generated by `FromRow` expects columns to be in the same order as the struct fields. The `FromRow` trait does not have information about column names, so it cannot match them with the struct field names. You can use `enforce_order` and `skip_name_checks` attributes to achieve such behavior via `DeserializeRow` trait.

__`FromUserType` vs. `DeserializeValue`__

The impl generated by `FromUserType` expects UDT fields to be in the same order as the struct fields. Field names should be the same both in the UDT and in the struct. You can use the `enforce_order` attribute to achieve such behavior via the `DeserializeValue` trait.

### Adjusting custom impls of deserialization traits

If you have a custom type with a hand-written `impl FromRow` or `impl FromCqlVal`, the best thing to do is to just write a new impl for `DeserializeRow` or `DeserializeValue` manually. Although it's technically possible to implement the new traits by using the existing implementation of the old ones, rolling out a new implementation will avoid performance problems related to the inefficient `CqlValue` representation.

## Accessing the old API

Most important types related to deserialization of the old API have been renamed and contain a `Legacy` prefix in their names:

- `Session` -> `LegacySession`
- `CachingSession` -> `LegacyCachingSession`
- `RowIterator` -> `LegacyRowIterator`
- `TypedRowIterator` -> `LegacyTypedRowIterator`
- `QueryResult` -> `LegacyQueryResult`

If you intend to quickly migrate your application by using the old API, you can just import the legacy stuff and alias it as the new one, e.g.:

```rust
# extern crate scylla;
use scylla::LegacySession as Session;
```

In order to create the `LegacySession` instead of the new `Session`, you need to use `SessionBuilder`'s `build_legacy()` method instead of `build()`:

```rust
# extern crate scylla;
# use scylla::{LegacySession, SessionBuilder};
# use std::error::Error;
# async fn check_only_compiles() -> Result<(), Box<dyn Error>> {
let session: LegacySession = SessionBuilder::new()
.known_node("127.0.0.1")
.build_legacy()
.await?;
# Ok(())
# }
```

## Mixing the old and the new API

It is possible to use different APIs in different parts of the program. The `Session` allows to create a `LegacySession` object that has the old API but shares all resources with the session that has the new API (and vice versa - you can create a new API session from the old API session).

```rust
# extern crate scylla;
# use scylla::{LegacySession, Session};
# use std::error::Error;
# async fn check_only_compiles(new_api_session: &Session) -> Result<(), Box<dyn Error>> {
// All of the session objects below will use the same resources: connections,
// metadata, current keyspace, etc.
let old_api_session: LegacySession = new_api_session.make_shared_session_with_legacy_api();
let another_new_api_session: Session = old_api_session.make_shared_session_with_new_api();
# Ok(())
# }
```

In addition to that, it is possible to convert a `QueryResult` to `LegacyQueryResult`:

```rust
# extern crate scylla;
# use scylla::{QueryResult, LegacyQueryResult};
# use std::error::Error;
# async fn check_only_compiles(result: QueryResult) -> Result<(), Box<dyn Error>> {
let result: QueryResult = result;
let legacy_result: LegacyQueryResult = result.into_legacy_result()?;
# Ok(())
# }
```

... and `QueryPager` into `LegacyRowIterator`:

```rust
# extern crate scylla;
# use scylla::transport::iterator::{QueryPager, LegacyRowIterator};
# use std::error::Error;
# async fn check_only_compiles(pager: QueryPager) -> Result<(), Box<dyn Error>> {
let pager: QueryPager = pager;
let legacy_result: LegacyRowIterator = pager.into_legacy();
# Ok(())
# }
```
4 changes: 3 additions & 1 deletion docs/source/migration-guides/migration-guides.md
Original file line number Diff line number Diff line change
@@ -1,11 +1,13 @@
# Migration guides

- [Serialization changes in version 0.11](0.11-serialization.md)
- [Deserialization changes in version 0.15](0.15-deserialization.md)

```{eval-rst}
.. toctree::
:hidden:
:glob:
0.11-serialization
```
0.15-deserialization
```
10 changes: 10 additions & 0 deletions scylla-cql/src/frame/response/cql_to_rust.rs
Original file line number Diff line number Diff line change
@@ -1,3 +1,5 @@
#![allow(deprecated)]

use super::result::{CqlValue, Row};
use crate::frame::value::{
Counter, CqlDate, CqlDecimal, CqlDuration, CqlTime, CqlTimestamp, CqlTimeuuid, CqlVarint,
Expand All @@ -19,6 +21,10 @@ pub enum FromRowError {
/// This trait defines a way to convert CqlValue or `Option<CqlValue>` into some rust type
// We can't use From trait because impl From<Option<CqlValue>> for String {...}
// is forbidden since neither From nor String are defined in this crate
#[deprecated(
since = "0.15.0",
note = "Legacy deserialization API is inefficient and is going to be removed soon"
)]
pub trait FromCqlVal<T>: Sized {
fn from_cql(cql_val: T) -> Result<Self, FromCqlValError>;
}
Expand All @@ -34,6 +40,10 @@ pub enum FromCqlValError {
}

/// This trait defines a way to convert CQL Row into some rust type
#[deprecated(
since = "0.15.0",
note = "Legacy deserialization API is inefficient and is going to be removed soon"
)]
pub trait FromRow: Sized {
fn from_row(row: Row) -> Result<Self, FromRowError>;
}
Expand Down
6 changes: 6 additions & 0 deletions scylla-cql/src/frame/response/result.rs
Original file line number Diff line number Diff line change
@@ -1,3 +1,4 @@
#[allow(deprecated)]
use crate::cql_to_rust::{FromRow, FromRowError};
use crate::frame::frame_errors::{
ColumnSpecParseError, ColumnSpecParseErrorKind, CqlResultParseError, CqlTypeParseError,
Expand Down Expand Up @@ -607,6 +608,11 @@ pub struct Row {

impl Row {
/// Allows converting Row into tuple of rust types or custom struct deriving FromRow
#[deprecated(
since = "0.15.0",
note = "Legacy deserialization API is inefficient and is going to be removed soon"
)]
#[allow(deprecated)]
pub fn into_typed<RowT: FromRow>(self) -> StdResult<RowT, FromRowError> {
RowT::from_row(self)
}
Expand Down
3 changes: 3 additions & 0 deletions scylla-cql/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -13,6 +13,7 @@ pub mod macros {
// Reexports for derive(IntoUserType)
pub use bytes::{BufMut, Bytes, BytesMut};

#[allow(deprecated)]
pub use crate::impl_from_cql_value_from_method;

pub use crate::impl_serialize_row_via_value_list;
Expand All @@ -22,12 +23,14 @@ pub mod macros {
pub mod types;

pub use crate::frame::response::cql_to_rust;
#[allow(deprecated)]
pub use crate::frame::response::cql_to_rust::FromRow;

pub use crate::frame::types::Consistency;

#[doc(hidden)]
pub mod _macro_internal {
#[allow(deprecated)]
pub use crate::frame::response::cql_to_rust::{
FromCqlVal, FromCqlValError, FromRow, FromRowError,
};
Expand Down
Loading

0 comments on commit 8ccc597

Please sign in to comment.