Merge pull request #880 from Lorak-mmk/serialization-documentation

Serialization documentation
scylladb · Dec 15, 2023 · 83d78b0
2 parents 8f360c5 + bc97ffd

Showing 5 changed files with 89 additions and 17 deletions.
diff --git a/docs/source/data-types/udt.md b/docs/source/data-types/udt.md
@@ -8,17 +8,25 @@ For example let's say `my_type` was created using this query:
 CREATE TYPE ks.my_type (int_val int, text_val text)
 ```
 
-To use this type in the driver, create a matching struct and derive `IntoUserType` and `FromUserType`:
+To use this type in the driver, create a matching struct and derive:
+- `SerializeCql`: to be able to use this struct in query parameters. \
+    This macro requires the fields of the UDT and the struct to have matching names, but the order
+    of the fields is not required to be the same. \
+    Note: you can use a different name with the `rename` attribute; see the `SerializeCql` macro documentation.
+- `FromUserType`: to be able to use this struct in query results. \
+    This macro requires the fields of the UDT and the struct to be in the same *ORDER*. \
+    This mismatch between the `SerializeCql` and `FromUserType` requirements is temporary: in the future `FromUserType` (or the macro that replaces it) will also require matching names.
 
 ```rust
 # extern crate scylla;
 # async fn check_only_compiles() {
-use scylla::macros::{FromUserType, IntoUserType};
+use scylla::macros::{FromUserType, SerializeCql};
 
 // Define a custom struct that matches the User Defined Type created earlier.
-// Fields must be in the same order as they are in the database.
+// Fields must be in the same order as they are in the database and also
+// have the same names.
 // Wrapping a field in Option will gracefully handle null field values.
-#[derive(Debug, IntoUserType, FromUserType)]
+#[derive(Debug, FromUserType, SerializeCql)]
 struct MyType {
     int_val: i32,
     text_val: Option<String>,
@@ -27,8 +35,13 @@ struct MyType {
 ```
 
 > ***Important***\
-> Fields in the Rust struct must be defined in the same order as they are in the database.
-> When sending and receiving values, the driver will (de)serialize fields one after another, without looking at field names.
+> For deserialization, fields in the Rust struct must be defined in the same order as they are in the database.
+> When receiving values, the driver will deserialize fields one after another, without looking at field names.
+
+> ***Important***\
+> For serialization, by default, fields in the Rust struct must have the same names as they have in the database.
+> The driver will serialize the fields in the order defined by the UDT, matching Rust fields by name.
+> You can change this behaviour using macro attributes; see the `SerializeCql` macro documentation for more information.
 
 Now it can be sent and received just like any other CQL value:
 ```rust
@@ -37,10 +50,10 @@ Now it can be sent and received just like any other CQL value:
 # use std::error::Error;
 # async fn check_only_compiles(session: &Session) -> Result<(), Box<dyn Error>> {
 use scylla::IntoTypedRows;
-use scylla::macros::{FromUserType, IntoUserType, SerializeCql};
+use scylla::macros::{FromUserType, SerializeCql};
 use scylla::cql_to_rust::FromCqlVal;
 
-#[derive(Debug, IntoUserType, FromUserType, SerializeCql)]
+#[derive(Debug, FromUserType, SerializeCql)]
 struct MyType {
     int_val: i32,
     text_val: Option<String>,
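
Note on the `udt.md` change above: the difference between name-based matching (required by `SerializeCql`) and order-based matching (required by `FromUserType`) can be illustrated with a small standalone model. This is only a sketch under stated assumptions: `place_by_name` and `place_by_order` are hypothetical helpers written for illustration, they are not part of the driver, and the driver's actual (de)serialization code is not shown in this diff.

```rust
// Standalone model of the two matching rules described in udt.md.
// All names here are hypothetical; nothing below is scylla driver API.

/// Name-based rule: walk the UDT fields in UDT order and pick the struct
/// field with the same name. Returns None if a UDT field has no match.
fn place_by_name(udt_fields: &[&str], struct_fields: &[(&str, &str)]) -> Option<Vec<String>> {
    udt_fields
        .iter()
        .map(|udt_name| {
            struct_fields
                .iter()
                .find(|(name, _)| name == udt_name)
                .map(|(_, value)| value.to_string())
        })
        .collect()
}

/// Order-based rule: the i-th struct field fills the i-th UDT field and
/// names are never compared. Returns None only if the field counts differ.
fn place_by_order(udt_fields: &[&str], struct_fields: &[(&str, &str)]) -> Option<Vec<String>> {
    if udt_fields.len() != struct_fields.len() {
        return None;
    }
    Some(struct_fields.iter().map(|(_, value)| value.to_string()).collect())
}

fn main() {
    // UDT declared as (int_val, text_val); struct fields declared in reverse.
    let udt = ["int_val", "text_val"];
    let fields = [("text_val", "hello"), ("int_val", "42")];

    // Name-based matching restores the UDT order.
    println!("{:?}", place_by_name(&udt, &fields));
    // Order-based matching keeps the struct order, so values land in the wrong slots.
    println!("{:?}", place_by_order(&udt, &fields));
}
```

In this model a struct whose fields are declared in a different order than the UDT is still placed correctly by name, while the order-based rule silently pairs each value with the wrong UDT field. That is why the documentation asks for the same order on the deserialization side.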

diff --git a/docs/source/queries/paged.md b/docs/source/queries/paged.md
@@ -5,6 +5,14 @@ allow to receive the whole result page by page.
 `Session::query_iter` and `Session::execute_iter` take a [simple query](simple.md) or a [prepared query](prepared.md)
 and return an `async` iterator over result `Rows`.
 
+> ***Warning***\
+> With the unprepared variant (`Session::query_iter`), if the values are not empty,
+> the driver will first fully prepare the query (which means issuing an additional request to each
+> node in the cluster). This incurs a performance penalty; how big it is depends on
+> the size of your cluster (more nodes mean more requests) and the size of the returned
+> result (the more pages are returned, the better the penalty is amortized). In any case, it is preferable to
+> use `Session::execute_iter`.
+
 ### Examples
 Use `query_iter` to perform a [simple query](simple.md) with paging:
 ```rust
@@ -119,6 +127,11 @@ let res2 = session
 # }
 ```
 
+> ***Warning***\
+> If the values are not empty, the driver first needs to send a `PREPARE` request
+> to fetch the information required to serialize the values. This will affect
+> performance because 2 round trips will be required instead of 1.
+
 On a `PreparedStatement`:
 ```rust
 # extern crate scylla;

diff --git a/docs/source/queries/simple.md b/docs/source/queries/simple.md
@@ -22,6 +22,11 @@ session
 > 
 > When page size is set, `query` will return only the first page of results.
 
+> ***Warning***\
+> If the values are not empty, the driver first needs to send a `PREPARE` request
+> to fetch the information required to serialize the values. This will affect
+> performance because 2 round trips will be required instead of 1.
+
 ### First argument - the query
 As the first argument `Session::query` takes anything implementing `Into<Query>`.\
 You can create a query manually to set custom options. For example to change query consistency:

diff --git a/docs/source/queries/values.md b/docs/source/queries/values.md
@@ -5,14 +5,14 @@ Each `?` in query text will be filled with the matching value.
 
 > **Never** pass values by adding strings, this could lead to [SQL Injection](https://en.wikipedia.org/wiki/SQL_injection)
 
-Each list of values to send in a query must implement the trait `ValueList`.\
+Each list of values to send in a query must implement the trait `SerializeRow`.\
 By default this can be a slice `&[]`, a tuple `()` (max 16 elements) of values to send,
-or a custom struct which derives from `ValueList`.
+or a custom struct which derives `SerializeRow`.
 
 A few examples:
 ```rust
 # extern crate scylla;
-# use scylla::{Session, ValueList, SerializeRow, frame::response::result::CqlValue};
+# use scylla::{Session, SerializeRow, frame::response::result::CqlValue};
 # use std::error::Error;
 # use std::collections::HashMap;
 # async fn check_only_compiles(session: &Session) -> Result<(), Box<dyn Error>> {
@@ -33,22 +33,45 @@ session
     .await?;
 
 // Sending an integer and a string using a named struct.
-// The values will be passed in the order from the struct definition
-#[derive(ValueList, SerializeRow)]
+// Names of fields must match names of columns in request,
+// but having them in the same order is not required.
+// If the fields are in the same order, you can use the attribute:
+// `#[scylla(flavor = "enforce_order")]`
+// in order to skip sorting the fields and just check if they
+// are in the same order. See documentation of this macro
+// for more information.
+#[derive(SerializeRow)]
 struct IntString {
-    first_col: i32,
-    second_col: String,
+    a: i32,
+    b: String,
 }
 
 let int_string = IntString {
-    first_col: 42_i32,
-    second_col: "hello".to_owned(),
+    a: 42_i32,
+    b: "hello".to_owned(),
 };
 
 session
     .query("INSERT INTO ks.tab (a, b) VALUES(?, ?)", int_string)
     .await?;
 
+// You can use named bind markers in the query if you want
+// the field names in the struct to differ from the column names.
+#[derive(SerializeRow)]
+struct IntStringCustom {
+    first_value: i32,
+    second_value: String,
+}
+
+let int_string_custom = IntStringCustom {
+    first_value: 42_i32,
+    second_value: "hello".to_owned(),
+};
+
+session
+    .query("INSERT INTO ks.tab (a, b) VALUES(:first_value, :second_value)", int_string_custom)
+    .await?;
+
 // Sending a single value as a tuple requires a trailing comma (Rust syntax):
 session.query("INSERT INTO ks.tab (a) VALUES(?)", (2_i32,)).await?;
 

diff --git a/scylla/src/transport/session.rs b/scylla/src/transport/session.rs
@@ -558,6 +558,10 @@ impl Session {
     ///
     /// This is the easiest way to make a query, but performance is worse than that of prepared queries.
     ///
+    /// Using this method with a non-empty values argument (the `is_empty()` method of the `SerializeRow`
+    /// trait returns false) is discouraged. In that case, the query first needs to be prepared (on a single connection), so
+    /// the driver will perform 2 round trips instead of 1. Please use [`Session::execute()`] instead.
+    ///
     /// See [the book](https://rust-driver.docs.scylladb.com/stable/queries/simple.html) for more information
     /// # Arguments
     /// * `query` - query to perform, can be just a `&str` or the [Query] struct.
@@ -608,6 +612,11 @@ impl Session {
     }
 
     /// Queries the database with a custom paging state.
+    ///
+    /// Using this method with a non-empty values argument (the `is_empty()` method of the `SerializeRow`
+    /// trait returns false) is discouraged. In that case, the query first needs to be prepared (on a single connection), so
+    /// the driver will perform 2 round trips instead of 1. Please use [`Session::execute_paged()`] instead.
+    ///
     /// # Arguments
     ///
     /// * `query` - query to be performed
@@ -749,6 +758,10 @@ impl Session {
     /// Returns an async iterator (stream) over all received rows\
     /// Page size can be specified in the [Query] passed to the function
     ///
+    /// Using this method with a non-empty values argument (the `is_empty()` method of the `SerializeRow`
+    /// trait returns false) is discouraged. In that case, the query first needs to be prepared (on a single connection), so
+    /// the driver will initially perform 2 round trips instead of 1. Please use [`Session::execute_iter()`] instead.
+    ///
     /// See [the book](https://rust-driver.docs.scylladb.com/stable/queries/paged.html) for more information
     ///
     /// # Arguments
@@ -1128,6 +1141,11 @@ impl Session {
     ///
     /// Batch values must contain values for each of the queries
     ///
+    /// Avoid using non-empty values (`SerializeRow::is_empty()` returns false) for simple queries
+    /// inside the batch. Such queries will first need to be prepared, so the driver will need to
+    /// send (number_of_unprepared_queries_with_values + 1) requests instead of 1 request, severely
+    /// affecting performance.
+    ///
     /// See [the book](https://rust-driver.docs.scylladb.com/stable/queries/batch.html) for more information
     ///
     /// # Arguments
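
Note on the batch doc comment above: the request count it describes, (number of unprepared queries with values + 1), can be worked through with a small standalone model. This is a sketch only: `BatchStatement` and `batch_request_count` are hypothetical names, not driver API, and the model assumes exactly one `PREPARE` request per unprepared statement that carries values, which is how the doc comment's formula reads.

```rust
// Back-of-the-envelope model of the request count from the batch doc comment.
// `BatchStatement` and `batch_request_count` are hypothetical, not scylla API.

/// The two properties of a batch statement that matter for this estimate.
struct BatchStatement {
    /// true if the caller already prepared the statement
    prepared: bool,
    /// true if non-empty values are bound to the statement
    has_values: bool,
}

/// One PREPARE per unprepared statement with values, plus the batch request itself.
fn batch_request_count(statements: &[BatchStatement]) -> usize {
    let to_prepare = statements
        .iter()
        .filter(|s| !s.prepared && s.has_values)
        .count();
    to_prepare + 1
}

fn main() {
    let batch = [
        BatchStatement { prepared: true, has_values: true },   // no extra request
        BatchStatement { prepared: false, has_values: true },  // needs PREPARE
        BatchStatement { prepared: false, has_values: false }, // no values, no PREPARE
        BatchStatement { prepared: false, has_values: true },  // needs PREPARE
    ];
    println!("requests: {}", batch_request_count(&batch));
}
```

Under this model, preparing every statement in advance (or binding no values to simple queries) brings the count back to a single request, which is the case the doc comment recommends.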