Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor(torii-server): server proxy handlers #2708

Merged
merged 3 commits into from
Nov 22, 2024

Conversation

Larkooo
Copy link
Collaborator

@Larkooo Larkooo commented Nov 21, 2024

Summary by CodeRabbit

Release Notes

  • New Features

    • Introduced handlers for GraphQL, gRPC, SQL, and static file requests, enhancing the server's ability to manage various types of requests.
    • Added support for executing SQL queries and returning results in JSON format.
    • Implemented a modular request handling system for improved maintainability.
    • Introduced caching for external column names in SQL queries, optimizing performance.
  • Bug Fixes

    • Streamlined error handling for proxy calls, improving response accuracy.
  • Chores

    • Updated dependencies to include async-trait and sqlparser for enhanced asynchronous capabilities and SQL parsing.

Copy link

coderabbitai bot commented Nov 21, 2024

Walkthrough

Ohayo, sensei! This pull request introduces several changes to the torii-server package, including the addition of two new dependencies, async-trait and sqlparser. It implements multiple request handlers for GraphQL, gRPC, SQL, and static file requests, all adhering to a common Handler trait for modular request processing. Additionally, the visibility of the handlers module is restricted to the crate, and the request handling logic in proxy.rs is refactored to utilize these new handlers.

Changes

File Path Change Summary
crates/torii/server/Cargo.toml Added dependencies: async-trait = "0.1.83", sqlparser = "0.52.0"
crates/torii/server/src/handlers/graphql.rs Introduced GraphQLHandler struct with methods new, should_handle, and handle for GraphQL requests.
crates/torii/server/src/handlers/grpc.rs Introduced GrpcHandler struct with methods new, should_handle, and handle for gRPC requests.
crates/torii/server/src/handlers/mod.rs Added public modules: graphql, grpc, sql, static_files. Introduced Handler trait with methods.
crates/torii/server/src/handlers/sql.rs Introduced SqlHandler struct with methods for SQL query handling and implementing the Handler trait.
crates/torii/server/src/handlers/static_files.rs Introduced StaticHandler struct with methods for handling static file requests and implementing the Handler trait.
crates/torii/server/src/lib.rs Changed visibility of handlers module to pub(crate).
crates/torii/server/src/proxy.rs Refactored handle function to use a vector of handler objects. Changed visibility of GRAPHQL_PROXY_CLIENT and GRPC_PROXY_CLIENT to pub(crate).

Possibly related PRs

Suggested labels

sozo

Suggested reviewers

  • kariy
  • glihm

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 10

🧹 Outside diff range and nitpick comments (11)
crates/torii/server/src/handlers/mod.rs (1)

1-4: Ohayo! Clean module organization, sensei!

The separation into distinct modules for each handler type (GraphQL, gRPC, SQL, and static files) follows good architectural practices and maintains clear separation of concerns.

Consider documenting the responsibility of each handler type in module-level documentation to help future maintainers understand the system's architecture.

crates/torii/server/src/handlers/graphql.rs (3)

11-14: Add documentation for better maintainability, sensei!

Consider adding documentation for the struct and its fields to improve code maintainability.

+/// Handles GraphQL requests by proxying them to a configured GraphQL service
 pub struct GraphQLHandler {
+    /// Client's IP address for request tracking
     client_ip: IpAddr,
+    /// Optional socket address of the GraphQL service
     graphql_addr: Option<SocketAddr>,
 }

16-20: Consider validating the GraphQL service port, sensei!

The constructor could benefit from validating the port number when graphql_addr is provided.

 impl GraphQLHandler {
     pub fn new(client_ip: IpAddr, graphql_addr: Option<SocketAddr>) -> Self {
+        if let Some(addr) = graphql_addr {
+            assert!(addr.port() > 0, "Invalid port number for GraphQL service");
+        }
         Self { client_ip, graphql_addr }
     }
 }

28-45: Consider adding timeout for proxy requests, sensei!

The proxy requests currently have no timeout, which could lead to resource exhaustion if the GraphQL service is unresponsive.

Consider wrapping the proxy call with a timeout:

use tokio::time::timeout;
use std::time::Duration;

const PROXY_TIMEOUT: Duration = Duration::from_secs(30);

// In handle method:
match timeout(
    PROXY_TIMEOUT,
    crate::proxy::GRAPHQL_PROXY_CLIENT.call(self.client_ip, &graphql_addr, req)
).await {
    Ok(result) => match result {
        Ok(response) => response,
        Err(error) => // ... error handling
    },
    Err(_) => {
        error!(target: LOG_TARGET, "GraphQL proxy request timed out after {:?}", PROXY_TIMEOUT);
        Response::builder()
            .status(StatusCode::GATEWAY_TIMEOUT)
            .body(Body::empty())
            .unwrap_or_else(|_| Response::new(Body::empty()))
    }
}
crates/torii/server/src/handlers/static_files.rs (2)

10-13: Add documentation for the StaticHandler struct, sensei!

Consider adding documentation to explain:

  • The purpose of the StaticHandler
  • The significance of client_ip
  • When artifacts_addr might be None vs Some

Example addition:

+/// Handles requests for static files, optionally proxying them to an artifacts server
+/// when configured with an artifacts_addr.
 pub struct StaticHandler {
+    /// The IP address of the client making the request
     client_ip: IpAddr,
+    /// Optional address of the artifacts server to proxy requests to
     artifacts_addr: Option<SocketAddr>,
 }

15-19: Document the constructor method, sensei!

The implementation is correct, but would benefit from documentation explaining the parameters and their usage.

 impl StaticHandler {
+    /// Creates a new StaticHandler instance
+    ///
+    /// # Arguments
+    /// * `client_ip` - The IP address of the requesting client
+    /// * `artifacts_addr` - Optional address of the artifacts server
     pub fn new(client_ip: IpAddr, artifacts_addr: Option<SocketAddr>) -> Self {
         Self { client_ip, artifacts_addr }
     }
 }
crates/torii/server/src/handlers/grpc.rs (4)

11-14: Ohayo, sensei! Consider restricting the visibility of GrpcHandler

If GrpcHandler is intended for internal use within the crate, making it pub(crate) enhances encapsulation and limits its exposure to other crates.


24-30: Ohayo, sensei! Ensure case-insensitive Content-Type header comparison

HTTP header values can vary in case. To make the can_handle method more robust, consider comparing the Content-Type header in a case-insensitive manner.


38-38: Ohayo, sensei! Prevent potential sensitive data exposure in logs

Using "{:?}" in error! may log sensitive information. Consider using "{}" or sanitizing the error message to avoid leaking sensitive data.


39-42: Ohayo, sensei! Avoid using unwrap() to prevent potential panics

Using unwrap() may cause a panic if an error occurs. It's safer to handle the potential error or use expect() with a meaningful message.

crates/torii/server/src/handlers/sql.rs (1)

63-67: Avoid using .unwrap() when building responses

Ohayo, sensei! Calling .unwrap() on Response::builder().body() may panic if there is an error constructing the response. It's safer to handle potential errors.

Consider handling errors from Response::builder():

         let response = Response::builder()
             .status(StatusCode::OK)
             .header(CONTENT_TYPE, "application/json")
             .body(Body::from(json));
-        response.unwrap()
+        match response {
+            Ok(res) => res,
+            Err(e) => {
+                // Log the error here if necessary
+                Response::builder()
+                    .status(StatusCode::INTERNAL_SERVER_ERROR)
+                    .body(Body::from("Internal Server Error"))
+                    .unwrap()
+            }
+        }
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 043554c and aa3c5ac.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (8)
  • crates/torii/server/Cargo.toml (1 hunks)
  • crates/torii/server/src/handlers/graphql.rs (1 hunks)
  • crates/torii/server/src/handlers/grpc.rs (1 hunks)
  • crates/torii/server/src/handlers/mod.rs (1 hunks)
  • crates/torii/server/src/handlers/sql.rs (1 hunks)
  • crates/torii/server/src/handlers/static_files.rs (1 hunks)
  • crates/torii/server/src/lib.rs (1 hunks)
  • crates/torii/server/src/proxy.rs (3 hunks)
🔇 Additional comments (9)
crates/torii/server/src/lib.rs (1)

2-2: Ohayo! Clean module visibility restriction, sensei! ✨

The pub(crate) visibility modifier for the handlers module is a good practice as it properly encapsulates the implementation details while keeping them accessible within the crate.

Let's verify that the handlers are only used within this crate:

✅ Verification successful

Ohayo! The handlers module visibility is perfectly confined, sensei! ✨

The verification confirms that all imports of the handlers module are properly scoped with crate::handlers:: and only occur within the same crate (crates/torii/server/). This aligns perfectly with the pub(crate) visibility restriction.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify handlers module usage is confined to the crate
# Expected: Only matches within crates/torii/server/

# Search for handlers module usage
rg --type rust "use.*handlers::" crates/torii/server/

Length of output: 424

crates/torii/server/src/handlers/mod.rs (1)

6-6: LGTM! Minimal and focused imports.

The import statement brings in only the necessary HTTP types from hyper.

crates/torii/server/Cargo.toml (1)

33-33: ⚠️ Potential issue

Ohayo sensei! Please update to the latest stable version of async-trait.

The specified version 0.1.83 appears to be ahead of the latest stable release. This could cause build issues.

Let's verify the latest stable version:

Please update the dependency to use the latest stable version:

-async-trait = "0.1.83"
+async-trait = "0.1.74"
crates/torii/server/src/handlers/graphql.rs (1)

1-9: Ohayo! Clean and well-organized imports, sensei!

The imports are minimal and necessary, and the log target constant is properly scoped.

crates/torii/server/src/handlers/static_files.rs (2)

1-9: Ohayo! Clean and focused imports, sensei!

The imports are well-organized and the LOG_TARGET constant is appropriately scoped as crate-private.


23-25: Verify security implications of static file handling

The current path validation only checks for "/static" prefix. Consider:

  1. Path traversal attacks using "../" sequences
  2. Open proxy security implications when artifacts_addr is set
✅ Verification successful

Let me search for more context about the static file handling implementation.


Let me check the actual implementation of the static file handling and serving logic.


Ohayo sensei! After reviewing the implementation, I can now provide a complete analysis of the security concerns.

Path traversal and proxy security are properly handled

The static file handling implementation has multiple security layers:

  1. Path validation in serve_static_file:
    • Enforces strict path format with 3 components
    • Validates contract address and token ID formats
    • Uses warp's safe path handling
  2. File access is restricted to specific directories through:
    • Controlled directory creation
    • Strict file naming conventions
  3. Proxy implementation:
    • Only forwards to configured artifacts server
    • Uses internal network address (127.0.0.1)
    • Proper error handling and logging

The concerns raised in the review comment are already addressed by the implementation's security controls.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for path normalization or security checks in the codebase
rg -A 5 "normalize|canonicalize|clean|sanitize.*path" 

# Look for other static file handlers that might have security measures
ast-grep --pattern 'fn $_(req: &Request<Body>) -> bool {
  $$$
  path().starts_with("/static")
  $$$
}'

Length of output: 151188


Script:

#!/bin/bash
# Look for the static file handler implementation
rg -A 10 "fn can_handle.*static" 

# Search for any path handling or validation logic
rg -A 5 "path.*static|sanitize.*path|clean.*path"

# Look for the artifacts server configuration
rg -A 5 "artifacts::|new.*artifacts"

Length of output: 4499


Script:

#!/bin/bash
# Look for the static file serving implementation
rg -A 15 "serve_static_file" 

# Check the artifacts module implementation
cat crates/torii/server/src/artifacts.rs

# Look for any proxy or request forwarding logic
rg -A 10 "handle.*Request.*Body"

Length of output: 20437

crates/torii/server/src/handlers/grpc.rs (1)

34-34: Ohayo, sensei! Verify the URL scheme for gRPC server address

Check whether using "http://" is appropriate when formatting grpc_addr. If the gRPC server expects a different scheme like "https://", updating it could prevent potential connection issues.

crates/torii/server/src/proxy.rs (2)

46-53: Consider the visibility of static references

Ohayo, sensei! The static references GRAPHQL_PROXY_CLIENT and GRPC_PROXY_CLIENT have been made pub(crate). Please ensure that this increased visibility is necessary. Exposing these clients might lead to unintended usage outside the module.

Would it be better to keep them private or provide accessor methods if external access is required?


174-179: Verify the order of handler initialization

Ohayo, sensei! The order in which handlers are added to the handlers vector determines their priority when handling requests. Please confirm that this order is intentional and aligns with the desired request processing logic.

If the order is significant, consider adding comments to clarify the handler precedence.

crates/torii/server/src/handlers/mod.rs Outdated Show resolved Hide resolved
crates/torii/server/src/handlers/graphql.rs Show resolved Hide resolved
Comment on lines 24 to 26
fn can_handle(&self, req: &Request<Body>) -> bool {
req.uri().path().starts_with("/graphql")
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Strengthen the path matching logic, sensei!

The current path check might match unintended paths like "/graphql-wrong". Consider using exact matching or a more specific pattern.

     fn can_handle(&self, req: &Request<Body>) -> bool {
-        req.uri().path().starts_with("/graphql")
+        req.uri().path() == "/graphql" || req.uri().path().starts_with("/graphql/")
     }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
fn can_handle(&self, req: &Request<Body>) -> bool {
req.uri().path().starts_with("/graphql")
}
fn can_handle(&self, req: &Request<Body>) -> bool {
req.uri().path() == "/graphql" || req.uri().path().starts_with("/graphql/")
}

crates/torii/server/src/handlers/static_files.rs Outdated Show resolved Hide resolved
crates/torii/server/src/handlers/sql.rs Outdated Show resolved Hide resolved
crates/torii/server/src/handlers/sql.rs Outdated Show resolved Hide resolved
crates/torii/server/src/handlers/sql.rs Outdated Show resolved Hide resolved
crates/torii/server/src/handlers/sql.rs Show resolved Hide resolved
crates/torii/server/src/proxy.rs Outdated Show resolved Hide resolved
crates/torii/server/src/proxy.rs Show resolved Hide resolved
@Larkooo Larkooo changed the title refactor(torii-server): cleanup & handlers refactor(torii-server): server proxy handlers Nov 21, 2024
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🧹 Outside diff range and nitpick comments (11)
crates/torii/server/src/handlers/mod.rs (1)

1-4: Ohayo sensei! Consider adding module-level documentation.

The module structure is clean and well-organized, but it would benefit from documentation explaining the purpose and responsibility of each handler module.

Add documentation like this:

+//! Handler implementations for different request types.
+
+/// Handles GraphQL-specific requests and operations
 pub mod graphql;
+/// Handles gRPC protocol requests
 pub mod grpc;
+/// Handles SQL query requests
 pub mod sql;
+/// Serves static files and assets
 pub mod static_files;
crates/torii/server/src/handlers/graphql.rs (2)

11-20: Add documentation for better maintainability, sensei!

Consider adding doc comments to explain the purpose of the struct and its fields.

+/// Handles GraphQL requests by proxying them to a configured GraphQL server
 pub struct GraphQLHandler {
+    /// Client's IP address for request tracking
     client_ip: IpAddr,
+    /// Optional socket address of the GraphQL server to proxy requests to
     graphql_addr: Option<SocketAddr>,
 }

1-46: Consider implementing a custom error type, sensei!

To improve error handling throughout the handler, consider implementing a dedicated error type that can capture and properly handle different error scenarios (proxy errors, network issues, etc.).

Example structure:

#[derive(Debug, thiserror::Error)]
pub enum GraphQLHandlerError {
    #[error("Proxy error: {0}")]
    ProxyError(#[from] hyper::Error),
    #[error("Response building error: {0}")]
    ResponseError(#[from] http::Error),
}
crates/torii/server/src/handlers/static_files.rs (2)

10-19: Add documentation for better maintainability, sensei!

Consider adding documentation comments to explain:

  • The purpose of the StaticHandler struct
  • What the artifacts_addr represents and when it might be None
  • Any assumptions or requirements for the client_ip

Example:

+/// Handles requests for static files, optionally proxying them to an artifacts server
+/// when configured with an artifacts_addr
 pub struct StaticHandler {
+    /// The IP address of the client making the request
     client_ip: IpAddr,
+    /// Optional address of the artifacts server to proxy requests to
     artifacts_addr: Option<SocketAddr>,
 }

23-25: Enhance path matching robustness, sensei!

The current path matching could be improved to handle edge cases:

     fn should_handle(&self, req: &Request<Body>) -> bool {
-        req.uri().path().starts_with("/static")
+        req.uri()
+            .path()
+            .split('/')
+            .nth(1)
+            .map(|segment| segment == "static")
+            .unwrap_or(false)
     }

This change will:

  • Properly handle malformed paths
  • Only match "/static" and not "/static-other"
crates/torii/server/src/handlers/grpc.rs (2)

11-14: Add documentation for the struct and its fields

Consider adding documentation to explain the purpose of the struct and its fields, especially the significance of the optional grpc_addr.

+/// Handles incoming gRPC requests by proxying them to a configured gRPC server
 pub struct GrpcHandler {
+    /// Client's IP address for request tracking
     client_ip: IpAddr,
+    /// Optional gRPC server address. If None, requests will receive a 404 response
     grpc_addr: Option<SocketAddr>,
 }

16-20: Consider adding address validation in the constructor

The constructor could benefit from validating the socket address format when provided.

 impl GrpcHandler {
     pub fn new(client_ip: IpAddr, grpc_addr: Option<SocketAddr>) -> Self {
+        if let Some(addr) = grpc_addr {
+            // Validate that the address is suitable for gRPC (e.g., not using a privileged port)
+            debug_assert!(addr.port() > 1024, "gRPC server should use non-privileged ports");
+        }
         Self { client_ip, grpc_addr }
     }
 }
crates/torii/server/src/proxy.rs (2)

174-179: Consider caching handler instances

Ohayo, sensei! Currently, handler instances are created for each request. Consider moving the handlers vector to a static or struct field to avoid recreation.

- let handlers: Vec<Box<dyn Handler>> = vec![
+ lazy_static::lazy_static! {
+     static ref HANDLERS: Vec<Box<dyn Handler>> = vec![
+         Box::new(SqlHandler::new(pool.clone())),
+         Box::new(GraphQLHandler::new(client_ip, graphql_addr)),
+         Box::new(GrpcHandler::new(client_ip, grpc_addr)),
+         Box::new(StaticHandler::new(client_ip, artifacts_addr)),
+     ];
+ }

174-184: Consider handler precedence and performance

Ohayo, sensei! The current sequential handler checking might benefit from ordering handlers by specificity (most specific first) or by frequency of use (most common first) to optimize the average case performance.

crates/torii/server/src/handlers/sql.rs (2)

28-55: Refactor value extraction logic for improved readability

Ohayo, sensei! The match block for converting column values to serde_json::Value is quite extensive and affects readability. Consider refactoring this logic into a helper function to simplify the code and enhance maintainability.

Here's how you might refactor the value extraction:

fn extract_column_value(row: &Row, column: &Column, index: usize) -> serde_json::Value {
    match column.type_info().name() {
        "TEXT" => row
            .get::<Option<String>, _>(index)
            .map_or(serde_json::Value::Null, serde_json::Value::String),
        "INTEGER" | "NULL" => row
            .get::<Option<i64>, _>(index)
            .map_or_else(
                || serde_json::Value::Null,
                |n| serde_json::Value::Number(n.into()),
            ),
        "REAL" => row
            .get::<Option<f64>, _>(index)
            .map_or(serde_json::Value::Null, |f| {
                serde_json::Number::from_f64(f)
                    .map(serde_json::Value::Number)
                    .unwrap_or(serde_json::Value::Null)
            }),
        "BLOB" => row
            .get::<Option<Vec<u8>>, _>(index)
            .map_or(serde_json::Value::Null, |bytes| {
                serde_json::Value::String(STANDARD.encode(bytes))
            }),
        _ => row
            .get::<Option<String>, _>(index)
            .map_or(serde_json::Value::Null, serde_json::Value::String),
    }
}

Then, update your mapping function:

for (i, column) in row.columns().iter().enumerate() {
    let value = extract_column_value(row, column, i);
    obj.insert(column.name().to_string(), value);
}

11-116: Add unit tests for SqlHandler

Ohayo, sensei! To ensure the robustness of the new SqlHandler, consider adding unit tests that cover various scenarios, including successful query execution, error handling, and edge cases. This will help catch issues early and maintain code quality.

Would you like assistance in creating unit tests for the SqlHandler?

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between aa3c5ac and ff49502.

📒 Files selected for processing (6)
  • crates/torii/server/src/handlers/graphql.rs (1 hunks)
  • crates/torii/server/src/handlers/grpc.rs (1 hunks)
  • crates/torii/server/src/handlers/mod.rs (1 hunks)
  • crates/torii/server/src/handlers/sql.rs (1 hunks)
  • crates/torii/server/src/handlers/static_files.rs (1 hunks)
  • crates/torii/server/src/proxy.rs (3 hunks)
🔇 Additional comments (12)
crates/torii/server/src/handlers/graphql.rs (3)

1-9: Ohayo! Clean and focused imports, sensei!

The imports are minimal and well-organized, and the logging target is appropriately scoped.


24-26: Path matching needs strengthening, sensei!

The previous review comment about improving path matching is still valid. The current check might match unintended paths.


34-41: Error handling needs improvement, sensei!

The previous review comment about improving error handling and avoiding unwrap is still valid.

crates/torii/server/src/handlers/static_files.rs (2)

1-8: Ohayo sensei! The imports and constants look good!

The necessary dependencies are properly imported and the log target is appropriately scoped to the module.


27-46: ⚠️ Potential issue

Ohayo! The handler implementation needs robustness improvements!

The previous review comments about error handling, unwrap() usage, timeouts, and artifacts_addr validation are still applicable. These improvements are essential for production-ready code.

Additionally, consider adding metrics for:

  • Number of requests handled
  • Proxy success/failure rates
  • Response times

Example metrics implementation:

use metrics::{counter, histogram};

// In handle method:
let timer = histogram!("static_handler_duration_seconds").start();
counter!("static_handler_requests_total").increment(1);
// ... handle request ...
if success {
    counter!("static_handler_success_total").increment(1);
} else {
    counter!("static_handler_errors_total").increment(1);
}
timer.stop();
crates/torii/server/src/handlers/grpc.rs (3)

1-9: Ohayo! Clean and well-organized imports, sensei!

The imports are minimal and focused, and the logging target follows good practices with hierarchical naming.


24-30: LGTM! Robust content-type checking

The should_handle implementation properly checks for gRPC content type with good null safety.


35-35: Verify GRPC_PROXY_CLIENT implementation

Let's ensure the proxy client is properly configured and thread-safe.

✅ Verification successful

GRPC_PROXY_CLIENT implementation is properly configured and thread-safe

The implementation is using lazy_static with a properly configured HTTP/2 client. The ReverseProxy is initialized as a static reference with thread-safe properties:

  • Uses lazy_static for safe one-time initialization
  • Configured with http2_only(true) specifically for gRPC
  • Utilizes immutable static references, making it safe for concurrent access
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for GRPC_PROXY_CLIENT definition and usage
rg -p "GRPC_PROXY_CLIENT" --type rust

# Look for potential race conditions in proxy implementation
ast-grep --pattern 'static $name: $type = $expr;' 

Length of output: 356


Script:

#!/bin/bash
# Let's check the ReverseProxy implementation for thread safety
rg -p "ReverseProxy" -A 10 --type rust

# Check if lazy_static is used for initialization
rg -p "lazy_static" --type rust

# Look for the actual implementation of the call method
ast-grep --pattern 'impl$_ ReverseProxy$_ {
  $$$
  async fn call($_, $_, $_, $_) $$ {
    $$$
  }
  $$$
}'

Length of output: 1959

crates/torii/server/src/proxy.rs (4)

15-15: LGTM! Well-organized imports

Ohayo, sensei! The modular organization of handlers and their imports is clean and follows good Rust practices.

Also applies to: 20-24


181-184: Add error handling for handler execution

Ohayo, sensei! The previous review comment about error handling is still valid. The handle method should handle potential errors from handlers.


187-197: Return 404 for unmatched requests

Ohayo, sensei! The previous review comment about returning a 404 status code when no handler matches is still valid. Additionally, consider adding a debug log to help identify unhandled request patterns.

     Ok(Response::builder()
-        .status(StatusCode::OK)
+        .status(StatusCode::NOT_FOUND)
         .header(CONTENT_TYPE, "application/json")
-        .body(Body::from(json.to_string()))
+        .body(Body::from(json!({
+            "service": "torii",
+            "success": false,
+            "error": "No handler found for request"
+        }).to_string()))
         .unwrap())

Line range hint 46-60: Verify thread-safety of proxy clients

Ohayo, sensei! While making the proxy clients pub(crate) aligns with the modular design, we should verify that all handlers accessing these static references handle concurrent access safely.

✅ Verification successful

Thread-safety of proxy clients is properly handled

Ohayo, sensei! After reviewing the implementation, I can confirm that the thread-safety concerns are properly addressed:

  • The ReverseProxy clients are wrapped in lazy_static with static ref, which provides safe initialization and thread-safe access
  • The proxy implementation uses Arc<RwLock> for shared state, ensuring thread-safe concurrent access
  • The clients are used in async handlers that properly await the responses, maintaining safety across concurrent requests
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for all usages of the proxy clients to verify thread-safety
ast-grep --pattern 'GRAPHQL_PROXY_CLIENT' 
ast-grep --pattern 'GRPC_PROXY_CLIENT'

Length of output: 727


Script:

#!/bin/bash
# Let's check the implementation of ReverseProxy to verify thread-safety
ast-grep --pattern 'struct ReverseProxy'
ast-grep --pattern 'impl ReverseProxy'

# Also check the call method implementation
ast-grep --pattern 'fn call'

# And let's see if there are any unsafe blocks or interior mutability
rg -A 5 "unsafe|RefCell|Cell|Mutex|RwLock" crates/torii/server/src/

Length of output: 1207

crates/torii/server/src/handlers/mod.rs Show resolved Hide resolved
}
}
} else {
Response::builder().status(StatusCode::NOT_FOUND).body(Body::empty()).unwrap()
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Handle Response builder error case, sensei!

The unwrap() call on the Response builder could panic. Consider using unwrap_or_else for safer error handling.

-            Response::builder().status(StatusCode::NOT_FOUND).body(Body::empty()).unwrap()
+            Response::builder()
+                .status(StatusCode::NOT_FOUND)
+                .body(Body::empty())
+                .unwrap_or_else(|_| Response::new(Body::empty()))
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
Response::builder().status(StatusCode::NOT_FOUND).body(Body::empty()).unwrap()
Response::builder()
.status(StatusCode::NOT_FOUND)
.body(Body::empty())
.unwrap_or_else(|_| Response::new(Body::empty()))

crates/torii/server/src/handlers/grpc.rs Show resolved Hide resolved
crates/torii/server/src/handlers/sql.rs Outdated Show resolved Hide resolved
crates/torii/server/src/handlers/sql.rs Outdated Show resolved Hide resolved
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Outside diff range and nitpick comments (2)
crates/torii/server/src/handlers/sql.rs (2)

1-13: Ohayo sensei! Consider adding rate limiting and authentication

The SqlHandler structure currently lacks mechanisms for rate limiting and authentication. Exposing a SQL endpoint without these protections could lead to abuse or unauthorized access.

Consider:

  1. Adding a rate limiter to prevent DoS attacks
  2. Implementing authentication middleware
  3. Adding request logging for audit trails

Would you like assistance implementing these security measures?


124-126: Ohayo sensei! Consider more specific path matching

The current path matching is very broad, potentially catching unintended requests.

Consider using a more specific path pattern:

     fn should_handle(&self, req: &Request<Body>) -> bool {
-        req.uri().path().starts_with("/sql")
+        req.uri().path() == "/sql" || req.uri().path() == "/sql/"
     }
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between ff49502 and 7d12dba.

📒 Files selected for processing (1)
  • crates/torii/server/src/handlers/sql.rs (1 hunks)

Comment on lines +20 to +82
pub async fn execute_query(&self, query: String) -> Response<Body> {
match sqlx::query(&query).fetch_all(&*self.pool).await {
Ok(rows) => {
let result: Vec<_> = rows
.iter()
.map(|row| {
let mut obj = serde_json::Map::new();
for (i, column) in row.columns().iter().enumerate() {
let value: serde_json::Value = match column.type_info().name() {
"TEXT" => row
.get::<Option<String>, _>(i)
.map_or(serde_json::Value::Null, serde_json::Value::String),
"INTEGER" | "NULL" => row
.get::<Option<i64>, _>(i)
.map_or(serde_json::Value::Null, |n| {
serde_json::Value::Number(n.into())
}),
"REAL" => row.get::<Option<f64>, _>(i).map_or(
serde_json::Value::Null,
|f| {
serde_json::Number::from_f64(f).map_or(
serde_json::Value::Null,
serde_json::Value::Number,
)
},
),
"BLOB" => row
.get::<Option<Vec<u8>>, _>(i)
.map_or(serde_json::Value::Null, |bytes| {
serde_json::Value::String(STANDARD.encode(bytes))
}),
_ => row
.get::<Option<String>, _>(i)
.map_or(serde_json::Value::Null, serde_json::Value::String),
};
obj.insert(column.name().to_string(), value);
}
serde_json::Value::Object(obj)
})
.collect();

let json = match serde_json::to_string(&result) {
Ok(json) => json,
Err(e) => {
return Response::builder()
.status(StatusCode::INTERNAL_SERVER_ERROR)
.body(Body::from(format!("Failed to serialize result: {:?}", e)))
.unwrap();
}
};

Response::builder()
.status(StatusCode::OK)
.header(CONTENT_TYPE, "application/json")
.body(Body::from(json))
.unwrap()
}
Err(e) => Response::builder()
.status(StatusCode::BAD_REQUEST)
.body(Body::from(format!("Query error: {:?}", e)))
.unwrap(),
}
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Ohayo sensei! Add query validation and sanitization

The execute_query method executes queries without validation or sanitization, which could lead to SQL injection or harmful operations.

Consider:

  1. Implementing a query allowlist
  2. Using parameterized queries
  3. Adding query complexity limits
  4. Restricting operations (e.g., no DROP, DELETE, etc.)

Would you like assistance implementing these security measures?

Comment on lines +84 to +119
async fn extract_query(&self, req: Request<Body>) -> Result<String, Response<Body>> {
match *req.method() {
Method::GET => {
// Get the query from the query params
let params = req.uri().query().unwrap_or_default();
form_urlencoded::parse(params.as_bytes())
.find(|(key, _)| key == "q" || key == "query")
.map(|(_, value)| value.to_string())
.ok_or(
Response::builder()
.status(StatusCode::BAD_REQUEST)
.body(Body::from("Missing 'q' or 'query' parameter."))
.unwrap(),
)
}
Method::POST => {
// Get the query from request body
let body_bytes = hyper::body::to_bytes(req.into_body()).await.map_err(|_| {
Response::builder()
.status(StatusCode::BAD_REQUEST)
.body(Body::from("Failed to read query from request body"))
.unwrap()
})?;
String::from_utf8(body_bytes.to_vec()).map_err(|_| {
Response::builder()
.status(StatusCode::BAD_REQUEST)
.body(Body::from("Invalid query"))
.unwrap()
})
}
_ => Err(Response::builder()
.status(StatusCode::METHOD_NOT_ALLOWED)
.body(Body::from("Only GET and POST methods are allowed"))
.unwrap()),
}
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Ohayo sensei! Add request size limits and improve error handling

The query extraction lacks request size limits and uses unwrap() in error paths.

Consider these improvements:

  1. Add request size limits:
const MAX_BODY_SIZE: usize = 1024 * 1024; // 1MB limit

// In POST handler:
if body_bytes.len() > MAX_BODY_SIZE {
    return Err(Response::builder()
        .status(StatusCode::PAYLOAD_TOO_LARGE)
        .body(Body::from("Request body too large"))
        .unwrap_or_else(|_| Response::new(Body::from("Request body too large"))));
}
  1. Replace unwrap() with unwrap_or_else:
-                            .unwrap(),
+                            .unwrap_or_else(|_| Response::new(Body::from("Bad request"))),

Comment on lines +61 to +69
let json = match serde_json::to_string(&result) {
Ok(json) => json,
Err(e) => {
return Response::builder()
.status(StatusCode::INTERNAL_SERVER_ERROR)
.body(Body::from(format!("Failed to serialize result: {:?}", e)))
.unwrap();
}
};
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Ohayo sensei! Handle serialization errors more gracefully

The current error handling exposes internal error details and uses unwrap(). This could leak sensitive information and potentially crash the server.

Consider this safer approach:

                 let json = match serde_json::to_string(&result) {
                     Ok(json) => json,
                     Err(e) => {
+                        // Log the error internally
+                        tracing::error!("Serialization error: {:?}", e);
                         return Response::builder()
                             .status(StatusCode::INTERNAL_SERVER_ERROR)
-                            .body(Body::from(format!("Failed to serialize result: {:?}", e)))
+                            .body(Body::from("Internal server error"))
                             .unwrap_or_else(|_| Response::new(Body::from("Internal server error")));
                     }
                 };
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
let json = match serde_json::to_string(&result) {
Ok(json) => json,
Err(e) => {
return Response::builder()
.status(StatusCode::INTERNAL_SERVER_ERROR)
.body(Body::from(format!("Failed to serialize result: {:?}", e)))
.unwrap();
}
};
let json = match serde_json::to_string(&result) {
Ok(json) => json,
Err(e) => {
// Log the error internally
tracing::error!("Serialization error: {:?}", e);
return Response::builder()
.status(StatusCode::INTERNAL_SERVER_ERROR)
.body(Body::from("Internal server error"))
.unwrap_or_else(|_| Response::new(Body::from("Internal server error")));
}
};

Copy link

codecov bot commented Nov 21, 2024

Codecov Report

Attention: Patch coverage is 0% with 302 lines in your changes missing coverage. Please review.

Project coverage is 56.00%. Comparing base (043554c) to head (f109bbf).

Files with missing lines Patch % Lines
crates/torii/server/src/handlers/sql.rs 0.00% 212 Missing ⚠️
crates/torii/server/src/handlers/grpc.rs 0.00% 27 Missing ⚠️
crates/torii/server/src/handlers/static_files.rs 0.00% 26 Missing ⚠️
crates/torii/server/src/handlers/graphql.rs 0.00% 24 Missing ⚠️
crates/torii/server/src/proxy.rs 0.00% 13 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2708      +/-   ##
==========================================
- Coverage   56.17%   56.00%   -0.18%     
==========================================
  Files         410      414       +4     
  Lines       52929    53110     +181     
==========================================
+ Hits        29732    29742      +10     
- Misses      23197    23368     +171     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.


🚨 Try these New Features:

@Larkooo Larkooo changed the title refactor(torii-server): server proxy handlers refactor(torii-server): server proxy handlers & sql column infer Nov 21, 2024
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Outside diff range and nitpick comments (2)
crates/torii/server/src/handlers/sql.rs (2)

16-19: Ohayo sensei! Consider implementing cache size limits

The external_columns_cache could potentially grow unbounded. Consider implementing a maximum cache size with an LRU eviction policy.

use lru::LruCache;

pub struct SqlHandler {
    pool: Arc<SqlitePool>,
    external_columns_cache: Arc<RwLock<LruCache<String, Vec<String>>>>,
}

impl SqlHandler {
    pub fn new(pool: Arc<SqlitePool>) -> Self {
        Self {
            pool,
            external_columns_cache: Arc::new(RwLock::new(LruCache::new(100))), // Cache up to 100 tables
        }
    }
}

30-41: Ohayo sensei! Optimize query transformation performance

The current implementation parses and transforms the entire query AST. For better performance, consider:

  1. Pre-compiling common query patterns
  2. Caching transformed queries
use std::time::Duration;
use moka::sync::Cache;

struct ColumnMapper {
    external_columns: Vec<String>,
    query_cache: Cache<String, String>,
}

impl ColumnMapper {
    fn new(external_columns: Vec<String>) -> Self {
        Self {
            external_columns,
            query_cache: Cache::builder()
                .time_to_live(Duration::from_secs(300))
                .max_capacity(1000)
                .build(),
        }
    }
}
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 7d12dba and f109bbf.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (2)
  • crates/torii/server/Cargo.toml (1 hunks)
  • crates/torii/server/src/handlers/sql.rs (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • crates/torii/server/Cargo.toml

Comment on lines 190 to 218
let value: serde_json::Value = match column.type_info().name() {
"TEXT" => row
.get::<Option<String>, _>(i)
.map_or(serde_json::Value::Null, serde_json::Value::String),
"INTEGER" | "NULL" => row
.get::<Option<i64>, _>(i)
.map_or(serde_json::Value::Null, |n| {
serde_json::Value::Number(n.into())
}),
"REAL" => row.get::<Option<f64>, _>(i).map_or(
serde_json::Value::Null,
|f| {
serde_json::Number::from_f64(f).map_or(
serde_json::Value::Null,
serde_json::Value::Number,
)
},
),
"BLOB" => row
.get::<Option<Vec<u8>>, _>(i)
.map_or(serde_json::Value::Null, |bytes| {
serde_json::Value::String(STANDARD.encode(bytes))
}),
_ => row
.get::<Option<String>, _>(i)
.map_or(serde_json::Value::Null, serde_json::Value::String),
};
obj.insert(column_name, value);
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Ohayo sensei! Add validation for SQL data type conversions

The type conversion logic could fail silently for edge cases. Consider adding validation and logging for type conversion failures.

                 let value: serde_json::Value = match column.type_info().name() {
                     "TEXT" => row
                         .get::<Option<String>, _>(i)
+                        .map_err(|e| {
+                            tracing::warn!("Failed to convert TEXT column: {}", e);
+                            None
+                        })
+                        .ok()
                         .map_or(serde_json::Value::Null, serde_json::Value::String),

Committable suggestion skipped: line range outside the PR's diff.

Comment on lines 158 to 167
pub async fn execute_query(&self, query: String) -> Response<Body> {
let transformed_query = match self.transform_query(query).await {
Ok(q) => q,
Err(e) => {
return Response::builder()
.status(StatusCode::BAD_REQUEST)
.body(Body::from(format!("Query transformation error: {}", e)))
.unwrap();
}
};
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Ohayo sensei! Replace unwrap() with proper error handling

The error handling in execute_query uses unwrap() which could panic. Consider using map_err for more graceful error handling.

-                    .unwrap();
+                    .unwrap_or_else(|_| {
+                        Response::new(Body::from(format!(
+                            "Query transformation error: {}",
+                            e
+                        )))
+                    });
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
pub async fn execute_query(&self, query: String) -> Response<Body> {
let transformed_query = match self.transform_query(query).await {
Ok(q) => q,
Err(e) => {
return Response::builder()
.status(StatusCode::BAD_REQUEST)
.body(Body::from(format!("Query transformation error: {}", e)))
.unwrap();
}
};
pub async fn execute_query(&self, query: String) -> Response<Body> {
let transformed_query = match self.transform_query(query).await {
Ok(q) => q,
Err(e) => {
return Response::builder()
.status(StatusCode::BAD_REQUEST)
.body(Body::from(format!("Query transformation error: {}", e)))
.unwrap_or_else(|_| {
Response::new(Body::from(format!(
"Query transformation error: {}",
e
)))
});
}
};

Comment on lines +288 to +126
fn should_handle(&self, req: &Request<Body>) -> bool {
req.uri().path().starts_with("/sql")
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Ohayo sensei! Enhance request validation

The should_handle method only checks the path prefix. Consider adding more validation:

  1. Content-Type validation for POST requests
  2. Query parameter validation for GET requests
     fn should_handle(&self, req: &Request<Body>) -> bool {
-        req.uri().path().starts_with("/sql")
+        if !req.uri().path().starts_with("/sql") {
+            return false;
+        }
+        match *req.method() {
+            Method::POST => req
+                .headers()
+                .get(CONTENT_TYPE)
+                .map_or(false, |ct| ct == "application/json"),
+            Method::GET => true,
+            _ => false,
+        }
     }

Committable suggestion skipped: line range outside the PR's diff.

@glihm glihm changed the title refactor(torii-server): server proxy handlers & sql column infer refactor(torii-server): server proxy handlers Nov 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants