feat: spatial temporal navigation #355

maciejmajek · 2025-01-07T10:41:29Z

Purpose

One approach to maintaining a coherent long-term history is to use spatiotemporal vector stores or databases. These databases contain information such as images, descriptions, observed objects, timestamps, and 3D world positions.

Proposed Changes

This PR introduces a spatiotemporal navigation system based on:

MongoDB
- Used for storing complete observations.
Weaviate VectorDB
- Used for storing embeddings and enabling efficient search.

Simplified idea:

Feature description:
An API for data collection and processing that builds a spatiotemporal database while the robot is running.

MongoDB stores observations, each consisting of an image, timestamps, a position, a description, and related fields.
Weaviate stores embeddings of each observation's textual information.
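The split between the two stores can be sketched roughly as follows. This is a minimal illustration, not the PR's implementation: it uses standard-library dataclasses instead of the Pydantic models in the PR, `ObservationSketch` and `split_observation` are hypothetical names, and the field set is a simplified assumption. Only the `uuid` metadata key and the `text` field mirror names that appear elsewhere in this thread.

```python
import uuid
from dataclasses import asdict, dataclass


@dataclass
class Pose:
    x: float
    y: float
    z: float


@dataclass
class ObservationSketch:
    """Hypothetical, simplified stand-in for the PR's Observation model."""

    uuid: str
    timestamp: float  # seconds
    position: Pose
    image_b64: str  # base64-encoded image
    description: str  # text generated for the scene


def split_observation(obs: ObservationSketch):
    """Return (full document for MongoDB, text entry for the vector store)."""
    document = asdict(obs)  # nested dataclasses become nested dicts
    vector_entry = {"text": obs.description, "metadata": {"uuid": obs.uuid}}
    return document, vector_entry


obs = ObservationSketch(
    uuid=str(uuid.uuid4()),
    timestamp=1736246489.0,
    position=Pose(x=1.0, y=2.0, z=0.0),
    image_b64="",
    description="A red chair next to a window",
)
document, vector_entry = split_observation(obs)
```

The shared `uuid` is what lets a similarity hit in the vector store be resolved back to the complete observation in the document store.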

Tools for querying the databases (names will most likely be changed later):

GetMemoriesNearPositionTool
Queries MongoDB for any memories near the position
GetMemoriesNearTimestampTool
Queries MongoDB for any memories close to the timestamp
GetMemoriesNearTextTool
Queries Weaviate (using similarity search) for memories related to text information.
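As a rough illustration of what "near" could mean for the first two tools, the sketch below filters an in-memory list of observation dictionaries. It is an assumption-laden stand-in: the real tools query MongoDB, the flat dictionary layout is hypothetical, and `radius` is an invented parameter (`time_range` follows the naming used in a later commit on this PR).

```python
import math
from typing import Dict, List


def near_position(
    observations: List[Dict], x: float, y: float, z: float, radius: float = 1.0
) -> List[Dict]:
    """Keep observations whose Euclidean distance to (x, y, z) is within radius."""
    target = (x, y, z)
    return [
        o
        for o in observations
        if math.dist((o["x"], o["y"], o["z"]), target) <= radius
    ]


def near_timestamp(
    observations: List[Dict], timestamp: float, time_range: float = 10.0
) -> List[Dict]:
    """Keep observations recorded within time_range seconds of timestamp."""
    return [o for o in observations if abs(o["timestamp"] - timestamp) <= time_range]


memories = [
    {"uuid": "a", "x": 0.0, "y": 0.0, "z": 0.0, "timestamp": 100.0},
    {"uuid": "b", "x": 5.0, "y": 0.0, "z": 0.0, "timestamp": 105.0},
    {"uuid": "c", "x": 0.5, "y": 0.5, "z": 0.0, "timestamp": 300.0},
]
close_in_space = near_position(memories, 0.0, 0.0, 0.0, radius=1.0)
close_in_time = near_timestamp(memories, 102.0, time_range=5.0)
```

With a real database the same predicates would be expressed as range conditions in the query rather than evaluated in Python.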

Issues

Testing

import rclpy
import weaviate
from langchain_openai.embeddings import OpenAIEmbeddings
from langchain_weaviate.vectorstores import WeaviateVectorStore
from pymongo import MongoClient
from pymongo.collection import Collection
from pymongo.database import Database

from rai.apps import run_spatial_temporal_data_collection

# MongoDB holds the complete observations.
client = MongoClient("mongodb://localhost:27017/")
db: Database = client["memories"]
observations_collection: Collection = db["observations"]

# Weaviate holds the embeddings used for similarity search.
weaviate_client = weaviate.connect_to_local()
vectorstore = WeaviateVectorStore(
    client=weaviate_client,
    embedding=OpenAIEmbeddings(),
    text_key="text",
    index_name="rai",
)

rclpy.init()

run_spatial_temporal_data_collection(
    target_frame="base/odom",
    source_frame="base/",
    image_topic="/base/camera_image_color",
    vectorstore=vectorstore,
    observations_collection=observations_collection,
)

Summary by CodeRabbit

New Features
- Introduced spatial-temporal navigation module with data collection and memory retrieval capabilities
- Added tools for querying memories by position, timestamp, and text similarity
- Implemented ROS2 integration for image and transformation data processing
Improvements
- Created structured data models for observations and scene descriptions
- Developed functions for converting ROS2 messages and generating scene descriptions

maciejmajek · 2025-01-07T10:42:05Z

@coderabbitai full review

coderabbitai · 2025-01-07T10:44:55Z

Walkthrough

The pull request introduces a new module for spatial-temporal navigation in the RAI (Robotics AI) application. The changes include creating a comprehensive data collection and memory retrieval system that integrates ROS2 message handling, image processing, vector database storage, and memory querying tools. The module enables collecting, describing, and retrieving spatial-temporal observations with support for position, timestamp, and text-based memory searches.

Changes

File	Change Summary
`src/rai/rai/apps/__init__.py`	Added import and exported `run_spatial_temporal_data_collection` function
`src/rai/rai/apps/spatial_temporal_navigation/__init__.py`	Introduced module with public interfaces for spatial-temporal navigation classes and functions
`src/rai/rai/apps/spatial_temporal_navigation/spatial_temporal_navigation.py`	Implemented core functionality for data collection, including Pydantic models, ROS2 message conversion, description generation, and observation building
`src/rai/rai/apps/spatial_temporal_navigation/tools.py`	Added memory retrieval tools for querying observations by position, timestamp, and text similarity

Sequence Diagram

sequenceDiagram
    participant ROS2 as ROS2 System
    participant DataCollection as Spatial Temporal Data Collection
    participant VectorStore as Vector Store
    participant MongoDB as MongoDB Collection
    participant LLM as Language Model

    ROS2->>DataCollection: Provide Transform and Image
    DataCollection->>DataCollection: Build Observation
    DataCollection->>LLM: Generate Description
    LLM-->>DataCollection: Return Description
    DataCollection->>VectorStore: Store Observation
    DataCollection->>MongoDB: Store Observation Details

The sequence diagram illustrates the high-level workflow of the spatial-temporal data collection process, showing how data is collected from ROS2, processed, described, and stored in both a vector store and a MongoDB collection.
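The flow in the diagram can be approximated with in-memory stand-ins. In this sketch the `describe` callable plays the role of the language model and two plain lists play the roles of the vector store and the MongoDB collection. The function name `collect_once` and all parameter names are hypothetical, chosen only to mirror the steps in the diagram.

```python
import uuid
from typing import Callable, Dict, List


def collect_once(
    image_b64: str,
    position: Dict[str, float],
    timestamp: float,
    describe: Callable[[str], str],
    vector_entries: List[Dict],
    documents: List[Dict],
) -> str:
    """Build one observation, describe it, and store it in both sinks."""
    obs_id = str(uuid.uuid4())
    # "Generate Description": ask the (stubbed) language model for text.
    description = describe(image_b64)
    # "Store Observation": only the text and a back-reference go to the vector store.
    vector_entries.append({"text": description, "metadata": {"uuid": obs_id}})
    # "Store Observation Details": the full record goes to the document store.
    documents.append(
        {
            "uuid": obs_id,
            "timestamp": timestamp,
            "position": position,
            "image": image_b64,
            "description": description,
        }
    )
    return obs_id


vector_entries: List[Dict] = []
documents: List[Dict] = []
obs_id = collect_once(
    image_b64="",
    position={"x": 0.0, "y": 1.0, "z": 0.0},
    timestamp=42.0,
    describe=lambda _image: "An empty corridor",
    vector_entries=vector_entries,
    documents=documents,
)
```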

coderabbitai

Actionable comments posted: 7

🧹 Nitpick comments (8)

src/rai/rai/apps/spatial_temporal_navigation/spatial_temporal_navigation.py (6)
16-23: Import Optimization: Duplicate Imports Detected

The imports for cv2 and CvBridge are redundant if they are not used extensively throughout the module. Ensure that all imported modules are necessary. Also, consider grouping related imports together and removing any unused ones to improve code readability and maintainability.

88-105: Simplify ros2_transform_stamped_to_position Function

The function can be simplified by removing unnecessary comments and redundant code. Additionally, the use of # type: ignore comments suggests type-checking issues that should be addressed.

Apply this diff to simplify the function:
-def ros2_transform_stamped_to_position(
-    transform_stamped: TransformStamped,
-) -> PositionStamped:
-    return PositionStamped(
-        timestamp=transform_stamped.header.stamp.sec
-        + transform_stamped.header.stamp.nanosec / 1e9,
-        position=Pose(
-            x=transform_stamped.transform.translation.x,
-            y=transform_stamped.transform.translation.y,
-            z=transform_stamped.transform.translation.z,
-        ),
-        orientation=Orientation(
-            x=transform_stamped.transform.rotation.x,
-            y=transform_stamped.transform.rotation.y,
-            z=transform_stamped.transform.rotation.z,
-            w=transform_stamped.transform.rotation.w,
-        ),
-    )
+def ros2_transform_stamped_to_position(transform_stamped: TransformStamped) -> PositionStamped:
+    timestamp = transform_stamped.header.stamp.sec + transform_stamped.header.stamp.nanosec / 1e9
+    position = Pose(
+        x=transform_stamped.transform.translation.x,
+        y=transform_stamped.transform.translation.y,
+        z=transform_stamped.transform.translation.z,
+    )
+    orientation = Orientation(
+        x=transform_stamped.transform.rotation.x,
+        y=transform_stamped.transform.rotation.y,
+        z=transform_stamped.transform.rotation.z,
+        w=transform_stamped.transform.rotation.w,
+    )
+    return PositionStamped(timestamp=timestamp, position=position, orientation=orientation)
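Both versions of the function rely on the same timestamp conversion: a ROS 2 stamp carries whole seconds and nanoseconds separately, and the code folds them into a single float. A small standalone sketch of that arithmetic, with the hypothetical helper name `stamp_to_seconds`:

```python
def stamp_to_seconds(sec: int, nanosec: int) -> float:
    """Combine the integer seconds and nanoseconds of a stamp into float seconds."""
    return sec + nanosec / 1e9


half_past_ten = stamp_to_seconds(10, 500_000_000)
```

Note that a 64-bit float cannot represent nanosecond resolution at present-day epoch values, so this form suits nearness queries better than exact equality checks.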
108-119: Avoid Reinitializing CvBridge in Each Call

The CvBridge instance is created every time ros2_image_to_image is called. It would be more efficient to initialize CvBridge once and reuse it.

[performance]

Apply this diff to initialize CvBridge globally:
+bridge = CvBridge()

def ros2_image_to_image(ros2_image: Image) -> ImageStamped:
    logger.info("Converting ROS2 image to base64 image")
-    bridge = CvBridge()
    cv2_image = bridge.imgmsg_to_cv2(ros2_image)
    # Rest of the code...
152-157: Metadata Type Mismatch in VectorDatabaseEntry

The metadata field in VectorDatabaseEntry is typed as Dict[str, str], but you might want to include other data types in the metadata, like timestamps or numerical IDs.

Consider changing the metadata type:
-class VectorDatabaseEntry(BaseModel):
-    text: str
-    metadata: Dict[str, str]
+class VectorDatabaseEntry(BaseModel):
+    text: str
+    metadata: Dict[str, Any]
160-177: Asynchronous Execution: Handle Database and Vector Store Operations Properly

Database and vector store operations in data_collection_pipeline are executed sequentially but might benefit from asynchronous execution to improve performance, especially when dealing with I/O-bound tasks.

[performance]

Consider using asynchronous programming constructs like asyncio or threading.
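One way to follow this suggestion with threads is to submit both writes to a small pool and wait for both to finish. The sketch below uses sleeping stand-ins for the two stores; the function names are hypothetical and nothing here is taken from the PR's `data_collection_pipeline`.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List


def store_in_vector_store(entry: Dict, sink: List[Dict]) -> None:
    time.sleep(0.05)  # stand-in for network latency
    sink.append(entry)


def store_in_document_store(document: Dict, sink: List[Dict]) -> None:
    time.sleep(0.05)  # stand-in for network latency
    sink.append(document)


def store_concurrently(
    entry: Dict, document: Dict, vector_sink: List[Dict], document_sink: List[Dict]
) -> None:
    """Run both writes in parallel and surface any exception from either one."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = [
            pool.submit(store_in_vector_store, entry, vector_sink),
            pool.submit(store_in_document_store, document, document_sink),
        ]
        for future in futures:
            future.result()  # re-raises if the write failed


vector_sink: List[Dict] = []
document_sink: List[Dict] = []
store_concurrently({"text": "a"}, {"uuid": "1"}, vector_sink, document_sink)
```

Running the writes in parallel means one store can succeed while the other fails, so a real pipeline would also need to decide how to handle partial writes.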

191-201: Resource Management: Node Shutdown in ImageGrabber

The ImageGrabber node does not have a mechanism to gracefully shut down, which could lead to resource leaks when the application is terminated.

Implement a shutdown method to properly clean up the node:
 class ImageGrabber(Node):
     def __init__(self, image_topic: str):
         super().__init__("image_grabber")
         self.subscription = self.create_subscription(
             Image, image_topic, self.image_callback, 10
         )
         self.image: Image | None = None

+    def shutdown(self):
+        self.destroy_node()
src/rai/rai/apps/__init__.py (1)
15-17: Maintain Consistency in Import Statements

It's good practice to use absolute imports for clarity and to avoid potential issues with relative imports in complex packages.

Apply this diff to use absolute imports:
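-from .spatial_temporal_navigation.spatial_temporal_navigation import (
-    run_spatial_temporal_data_collection,
-)
+from rai.apps.spatial_temporal_navigation.spatial_temporal_navigation import (
+    run_spatial_temporal_data_collection,
+)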
src/rai/rai/apps/spatial_temporal_navigation/tools.py (1)
104-121: Consistent Use of Input Schemas

The GetMemoriesNearTimestampToolInput and GetMemoriesNearTextToolInput classes are defined for input validation, but GetMemoriesNearPositionTool directly uses Pose as its args_schema. For consistency and clarity, consider defining a dedicated input schema for each tool.

Define an input schema for GetMemoriesNearPositionTool:
 class GetMemoriesNearPositionToolInput(BaseModel):
     x: float
     y: float
     z: float

 class GetMemoriesNearPositionTool(BaseTool):
     # ...
     args_schema: Type[GetMemoriesNearPositionToolInput] = GetMemoriesNearPositionToolInput

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2cc2176 and 9cf0bd5.

📒 Files selected for processing (4)

src/rai/rai/apps/__init__.py (1 hunks)
src/rai/rai/apps/spatial_temporal_navigation/__init__.py (1 hunks)
src/rai/rai/apps/spatial_temporal_navigation/spatial_temporal_navigation.py (1 hunks)
src/rai/rai/apps/spatial_temporal_navigation/tools.py (1 hunks)

⏰ Context from checks skipped due to timeout of 90000ms (2)

GitHub Check: build-and-test-ros2 (humble)
GitHub Check: build-and-test-ros2 (jazzy)

🔇 Additional comments (4)

src/rai/rai/apps/spatial_temporal_navigation/tools.py (1)

75-89: Handle Empty Results in get_memories_near_text

The function assumes that results are found in the vector store. If no results are found, this could lead to errors when querying MongoDB with an empty list of UUIDs.

[error_handling]

Add a check for empty results:
 uuids = [result[0].metadata["uuid"] for result in results]
+if not uuids:
+    return []
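The guard can be exercised in isolation with injected stand-ins for the vector search and the MongoDB lookup. In this sketch `memories_near_text`, `search`, and `find_by_uuids` are hypothetical names, and the `(document, score)` tuple shape is an assumption inferred from the `result[0].metadata` access in the snippet above.

```python
from types import SimpleNamespace
from typing import Callable, Dict, List, Tuple


def memories_near_text(
    search: Callable[[str], List[Tuple[SimpleNamespace, float]]],
    find_by_uuids: Callable[[List[str]], List[Dict]],
    query: str,
) -> List[Dict]:
    """Resolve similarity hits to full records, skipping the lookup when empty."""
    results = search(query)
    uuids = [doc.metadata["uuid"] for doc, _score in results]
    if not uuids:
        return []  # nothing matched, so do not query the document store
    return find_by_uuids(uuids)


lookups: List[List[str]] = []


def fake_find(uuids: List[str]) -> List[Dict]:
    lookups.append(uuids)
    return [{"uuid": u} for u in uuids]


no_hits = memories_near_text(lambda _q: [], fake_find, "kitchen")
one_hit = memories_near_text(
    lambda _q: [(SimpleNamespace(metadata={"uuid": "abc"}), 0.9)], fake_find, "chair"
)
```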
src/rai/rai/apps/spatial_temporal_navigation/__init__.py (3)

1-13: LGTM! License header is properly formatted.

The Apache 2.0 license header is correctly included with up-to-date copyright information.

27-36: LGTM! Well-structured public interface.

The __all__ list properly defines the module's public interface, exposing all the necessary components for the spatiotemporal navigation system:

Spatial components (Pose, Orientation, PositionStamped)

Observation components (ImageStamped, Scene, Description, Observation)

Main functionality (run_spatial_temporal_data_collection)

16-25: Verify the imported module implementation.

Let's ensure the imported module exists with all the referenced classes and function.

✅ Verification successful

All imported entities are properly implemented

The module contains all the required Pydantic model classes and the data collection function with correct signatures.
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Description: Verify the existence of the spatial_temporal_navigation module and its components

# Check if the module file exists
fd -t f "spatial_temporal_navigation.py$" src/rai/rai/apps/spatial_temporal_navigation/

# Verify all imported classes and function are defined
ast-grep --pattern 'class $CLASS:' src/rai/rai/apps/spatial_temporal_navigation/spatial_temporal_navigation.py
ast-grep --pattern 'def run_spatial_temporal_data_collection' src/rai/rai/apps/spatial_temporal_navigation/spatial_temporal_navigation.py
Length of output: 418

Script:
#!/bin/bash
# Check the actual content of the file
cat src/rai/rai/apps/spatial_temporal_navigation/spatial_temporal_navigation.py

# Try alternative patterns for class definitions
ast-grep --pattern '@dataclass
class $CLASS'

# Look for any function definition containing our target name
rg "def.*run_spatial_temporal_data_collection" -A 1
Length of output: 7978

coderabbitai · 2025-01-07T10:44:59Z

src/rai/rai/apps/spatial_temporal_navigation/tools.py

+class GetMemoriesNearPositionTool(BaseTool):
+    name: str = "get_memories_near_position"
+    description: str = (
+        "Get memories near a specific position using regular coordinate comparison"
+    )
+    args_schema: Type[Pose] = Pose
+
+    observations_collection: Collection
+
+    def _run(self, pose: Pose) -> List[Observation]:
+        return get_memories_near_position(self.observations_collection, pose)
+


⚠️ Potential issue

Correct Return Type Annotation in GetMemoriesNearPositionTool

The _run method should return a string as per the BaseTool class specification in LangChain tools, but it's currently returning a list of Observation objects.

Modify the _run method to return serialized data:

def _run(self, pose: Pose) -> str:
    observations = get_memories_near_position(self.observations_collection, pose)
    return json.dumps([obs.dict() for obs in observations])

Ensure that you import json and adjust the return types accordingly.
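The serialization step can be tried without LangChain or Pydantic. The sketch below uses a standard-library dataclass as a stand-in for the PR's `Observation` model; `MemorySketch` and `memories_to_tool_output` are hypothetical names, and `asdict` plays the role that `.dict()` plays in the suggestion.

```python
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class MemorySketch:
    """Hypothetical, reduced stand-in for an observation record."""

    uuid: str
    timestamp: float
    description: str


def memories_to_tool_output(memories: List[MemorySketch]) -> str:
    """Serialize records to a JSON string, the shape a tool's _run would return."""
    return json.dumps([asdict(memory) for memory in memories])


output = memories_to_tool_output([MemorySketch("abc", 12.5, "A red chair")])
```

Fields that are not JSON-serializable, such as raw bytes, would need converting before this call.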

Committable suggestion skipped: line range outside the PR's diff.

coderabbitai bot reviewed Jan 7, 2025

maciejmajek added 8 commits January 7, 2025 13:22

feat: spatial-temporal navigation toolset

f07ed11

style: type ignores and typing

484151b

fix: bad export

2d518ac

style: logging + typing

a9cc6e6

fix: rename range to time_range

914c059

feat: use threading lock in TransformGrabber

28909c5

chore: remove Description type check

f62a48e

feat: add shutdown method to ImageGrabber

77fcc61

maciejmajek force-pushed the feat/spatial-temporal-navigation branch from 7917be6 to 77fcc61 Compare January 7, 2025 12:22

maciejmajek added 2 commits January 7, 2025 13:24

chore: add pymongo + weaviate

080dea8

fix(tools): return string instead of list of observations

613397d
