Atoma Node infrastructure


Discord | Twitter | Documentation | License

Introduction

Atoma is a decentralized cloud compute network for AI that enables:

  • Verifiable Compute: Transparent and trustworthy AI model execution, spanning inference, text embeddings, multi-modality, and more, through Atoma's novel Sampling Consensus algorithm (see Atoma's whitepaper)
  • Private Inference: Secure processing with strong privacy guarantees, through the use of secure hardware enclaves (see Atoma's confidential compute paper)
  • Decentralized Infrastructure: A permissionless network of compute nodes, orchestrated by Atoma's smart contract on the Sui blockchain (see repo)
  • LLM Focus: Specialized in serving compute for Large Language Models.

This repository contains the node software that enables node operators to participate in the Atoma Network. By running an Atoma node, you can:

  1. Contribute your hardware to provide computing power to the network;
  2. Earn rewards for processing AI workloads;
  3. Help build a more accessible and democratic AI infrastructure.

Community Links

Spawn an Atoma Node

Install the Sui client locally

The first step in setting up an Atoma node is installing the Sui client locally. Please refer to the Sui installation guide for more information.

Once you have the Sui client installed locally, you need to connect to a Sui RPC node in order to interact with the Sui blockchain and, therefore, the Atoma smart contract. Please refer to the Connect to a Sui Network guide for more information.
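For example, to point the Sui client at a testnet fullnode (the alias is arbitrary; the RPC URL below is the public testnet endpoint and can be swapped for any RPC provider you prefer):

sui client new-env --alias testnet --rpc https://fullnode.testnet.sui.io:443
sui client switch --env testnet
sui client envs   # verify the active environment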

You then need to create a wallet and fund it with some testnet SUI. Please refer to the Sui wallet guide for more information. If you plan to run the Atoma node on Sui's testnet, you can request testnet SUI tokens by following the docs.
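A typical wallet setup looks roughly like the following (the exact commands depend on your Sui CLI version; on older versions you may need to request testnet tokens through the web faucet or Discord instead of `sui client faucet`):

sui client new-address ed25519   # create a new keypair/address
sui client active-address        # confirm which address is active
sui client faucet                # request testnet SUI for the active address
sui client gas                   # verify the address now holds gas coins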

Docker Deployment

Prerequisites

  • Docker and Docker Compose (>= v2.22) installed
  • NVIDIA Container Toolkit installed (for GPU support)
  • Access to HuggingFace models (and token if using gated models)
  • Sui wallet configuration

Quickstart

  1. Clone the repository
git clone https://github.com/atoma-network/atoma-node.git
cd atoma-node
  2. Configure environment variables by creating a .env file, using .env.example as a reference:
# Hugging Face Configuration
HF_CACHE_PATH=~/.cache/huggingface
HF_TOKEN=   # Required for gated models

# Inference Server Configuration
INFERENCE_SERVER_PORT=50000    # External port for vLLM service
MODEL=meta-llama/Llama-3.1-70B-Instruct
MAX_MODEL_LEN=4096            # Context length
GPU_COUNT=1                   # Number of GPUs to use
TENSOR_PARALLEL_SIZE=1        # Should be equal to GPU_COUNT

# Sui Configuration
SUI_CONFIG_PATH=~/.sui/sui_config

# Atoma Node Service Configuration
ATOMA_SERVICE_PORT=3000       # External port for Atoma service
  3. Configure config.toml, using config.example.toml as a template:
[atoma-service]
chat_completions_service_url = "http://chat-completions:80"    # Internal Docker network URL
embeddings_service_url = "http://embeddings:80"
image_generations_service_url = "http://image-generations:80"
image_generations_service_url = ""
models = ["meta-llama/Llama-3.1-70B-Instruct"]
revisions = [""]
service_bind_address = "0.0.0.0:3000"         # Bind to all interfaces

[atoma-sui]
http_rpc_node_addr = ""
atoma_db = ""
atoma_package_id = ""
usdc_package_id = ""
request_timeout = { secs = 300, nanos = 0 }
max_concurrent_requests = 10
limit = 100
node_small_ids = [0, 1, 2]  # List of node IDs under control
task_small_ids = []         # List of task IDs under control
sui_config_path = "/root/.sui/sui_config/client.yaml"
sui_keystore_path = "/root/.sui/sui_config/sui.keystore"

[atoma-state]
database_url = "postgres://<POSTGRES_USER>:<POSTGRES_PASSWORD>@localhost:5432/<POSTGRES_DB>"
  4. Create the required directories
mkdir -p data logs
  5. Start the containers with the desired inference services

We currently support the following inference services:

Chat Completions

| Backend    | Architecture/Platform | Docker Compose Profile         |
|------------|-----------------------|--------------------------------|
| vLLM       | CUDA                  | chat_completions_vllm          |
| vLLM       | x86_64                | chat_completions_vllm_cpu      |
| vLLM       | ROCm                  | chat_completions_vllm_rocm     |
| mistral.rs | x86_64, aarch64       | chat_completions_mistralrs_cpu |

Embeddings

| Backend                   | Architecture/Platform | Docker Compose Profile |
|---------------------------|-----------------------|------------------------|
| Text Embeddings Inference | CUDA                  | embeddings_tei         |

Image Generations

| Backend    | Architecture/Platform | Docker Compose Profile       |
|------------|-----------------------|------------------------------|
| mistral.rs | CUDA                  | image_generations_mistralrs  |
# Build and start all services
COMPOSE_PROFILES=chat_completions_vllm,embeddings_tei,image_generations_mistralrs docker compose up --build

# Only start one service
COMPOSE_PROFILES=chat_completions_vllm docker compose up --build

# Run in detached mode
COMPOSE_PROFILES=chat_completions_vllm,embeddings_tei,image_generations_mistralrs docker compose up -d --build

Container Architecture

The deployment consists of two main services:

  • vLLM Service: Handles the AI model inference
  • Atoma Node: Manages the node operations and connects to the Atoma Network

Service URLs

  • vLLM Service: http://localhost:50000 (configured via INFERENCE_SERVER_PORT)
  • Atoma Node: http://localhost:3000 (configured via ATOMA_SERVICE_PORT)
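Once both services are up, a quick way to exercise the inference path is to send a request directly to the vLLM service (a sketch; it assumes the vLLM container exposes the standard OpenAI-compatible API on INFERENCE_SERVER_PORT and that the model name matches your MODEL setting):

curl http://localhost:50000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "meta-llama/Llama-3.1-70B-Instruct",
        "messages": [{"role": "user", "content": "Hello!"}],
        "max_tokens": 32
      }'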

Volume Mounts

  • HuggingFace cache: ~/.cache/huggingface:/root/.cache/huggingface
  • Sui configuration: ~/.sui/sui_config:/root/.sui/sui_config
  • Logs: ./logs:/app/logs
  • PostgreSQL database: ./data:/app/data

Managing the Deployment

Check service status:

docker compose ps

View logs:

# All services
docker compose logs

# Specific service
docker compose logs atoma-node
docker compose logs vllm

# Follow logs
docker compose logs -f

Stop services:

docker compose down

Troubleshooting

  1. Check if services are running:
docker compose ps
  2. Test the vLLM service:
curl http://localhost:50000/health
  3. Test the Atoma Node service:
curl http://localhost:3000/health
  4. Check GPU availability:
docker compose exec vllm nvidia-smi
  5. View container networks:
docker network ls
docker network inspect atoma-network

Security Considerations

  1. Firewall Configuration
# Allow Atoma service port
sudo ufw allow 3000/tcp

# Allow vLLM service port
sudo ufw allow 50000/tcp
  2. HuggingFace Token
  • Store HF_TOKEN in the .env file
  • Never commit the .env file to version control
  • Consider using Docker secrets for production deployments
  3. Sui Configuration
  • Ensure Sui configuration files have appropriate permissions
  • Keep the keystore file secure and never commit it to version control (see the example below)
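A minimal sketch of the file-permission hardening described above (it assumes the default file locations):

# Restrict access to the Sui configuration and keystore
chmod 700 ~/.sui/sui_config
chmod 600 ~/.sui/sui_config/sui.keystore

# Keep the .env file readable only by your user and out of version control
chmod 600 .env
grep -qx ".env" .gitignore || echo ".env" >> .gitignore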

Testing

Since the AtomaStateManager instance relies on a PostgreSQL database, you need a local instance running in order to run the tests. You can spawn one using the docker-compose.test.yaml file:

docker compose -f docker-compose.test.yaml up --build -d

You may need to clean up the database before or after running the tests. You can do so by running:

docker compose -f docker-compose.test.yaml down

and then removing unused Docker data, including the Postgres volumes:

docker system prune -af --volumes

Note that running the above commands will delete all data stored in the database, along with any other unused Docker volumes.
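A typical test workflow might look like the following (a sketch; it assumes the test suite is run with cargo and that docker-compose.test.yaml provides the database the tests expect):

# Start the test database, run the test suite, then tear everything down
docker compose -f docker-compose.test.yaml up --build -d
cargo test --workspace
docker compose -f docker-compose.test.yaml down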

Manual deployment

1. Installing Rust

Install Rust using rustup:

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

Follow the prompts and restart your terminal. Verify the installation:

rustc --version
cargo --version

2. Cloning the Repository

git clone https://github.com/atoma-network/atoma-node.git
cd atoma-node

3. Configuring the Node

The application uses a TOML configuration file with the following sections:

[atoma-service]
  • chat_completions_service_url (optional): Endpoint URL for the inference service. At least one of the service URLs must be provided.
  • embeddings_service_url (optional): Endpoint URL for the embeddings service. At least one of the service URLs must be provided.
  • image_generations_service_url (optional): Endpoint URL for the image generations service. At least one of the service URLs must be provided.
  • models: List of model names deployed by the Atoma Service
  • revisions: List of model revisions supported by the service
  • service_bind_address: Address and port for the Atoma Service to bind to
[atoma-sui]
  • http_rpc_node_addr: HTTP URL of a Sui RPC node that Atoma's Sui event subscriber uses to listen for events on the Sui network.
  • atoma_db: ObjectID for Atoma's DB on the Sui network
  • atoma_package_id: ObjectID for Atoma's package on the Sui network
  • usdc_package_id: ObjectID for USDC token package
  • request_timeout (optional): Duration for request timeouts
  • max_concurrent_requests (optional): Maximum number of concurrent Sui client requests
  • limit (optional): Limit for dynamic fields retrieval per event subscriber loop
  • node_small_ids: List of node small IDs controlled by the current Sui wallet. Node small IDs are assigned to each node upon registration with Atoma's smart contract.
  • task_small_ids: List of task small IDs controlled by the current Sui wallet. It is recommended to leave this list empty.
  • sui_config_path: Path to the Sui configuration file
  • sui_keystore_path: Path to the Sui keystore file; it should be located in the same directory as the Sui configuration file.
[atoma-state]
  • database_url: PostgreSQL database connection URL
Example Configuration
[atoma-service]
chat_completions_service_url = "<chat_completions_service_url>"
embeddings_service_url = "<EMBEDDINGS_SERVICE_URL>"
image_generations_service_url = "<image_generations_service_url>"
models = ["<MODEL_1>", "<MODEL_2>"]
revisions = ["<REVISION_1>", "<REVISION_2>"]
service_bind_address = "<HOST>:<PORT>"

[atoma-sui]
http_rpc_node_addr = "<SUI_RPC_NODE_URL>"
atoma_db = "<ATOMA_DB_OBJECT_ID>"
atoma_package_id = "<ATOMA_PACKAGE_OBJECT_ID>"
usdc_package_id = "<USDC_PACKAGE_OBJECT_ID>"
request_timeout = { secs = 300, nanos = 0 }
max_concurrent_requests = 10
limit = 100
node_small_ids = [0, 1, 2]  # List of node IDs under control
task_small_ids = []  # List of task IDs under control
sui_config_path = "<PATH_TO_SUI_CONFIG>" # Example: "~/.sui/sui_config/client.yaml" (default)
sui_keystore_path = "<PATH_TO_SUI_KEYSTORE>" # Example: "~/.sui/sui_config/sui.keystore" (default)

[atoma-state]
# Replace the placeholder values with the ones for your local environment (defined in the .env file)
database_url = "postgres://<POSTGRES_USER>:<POSTGRES_PASSWORD>@localhost:5432/<POSTGRES_DB>"

4. Running the Atoma Node

After configuring your node, you can run it using the following command:

RUST_LOG=debug cargo run --release --bin atoma-node -- \
  --config-path /path/to/config.toml

Or if you've built the binary:

./target/release/atoma-node \
  --config-path /path/to/config.toml

Command line arguments:

  • --config-path (-c): Path to your TOML configuration file
  • --address-index (-a): Index of the address to use from the keystore (defaults to 0)
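For example, to run the node with a non-default address from the keystore (the paths and index below are illustrative):

./target/release/atoma-node \
  --config-path ./config.toml \
  --address-index 1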

5. Spawn the background inference service

We currently support the inference services listed in the Docker Deployment section above (vLLM, Text Embeddings Inference, and mistral.rs).

Please refer to the documentation of the inference service you want to use for instructions on spawning it, and make sure to set the corresponding inference service URL in the Atoma Node configuration above.

6. Managing Logs

The Atoma node uses a comprehensive logging system that writes to both console and files:

Log Location
  • Logs are stored in the ./logs directory
  • The main log file is named atoma-node-service.log
  • Logs rotate daily to prevent excessive file sizes
Log Formats
  • Console Output: Human-readable format with pretty printing, ideal for development
  • File Output: JSON format with detailed metadata, perfect for log aggregation systems
Log Levels

The default logging level is info, but you can adjust it using the RUST_LOG environment variable:

# Set specific log levels
export RUST_LOG=debug,atoma_node_service=trace

# Run with custom log level
RUST_LOG=debug cargo run --release --bin atoma-node -- [args]

Common log levels (from most to least verbose):

  • trace: Very detailed debugging information
  • debug: Useful debugging information
  • info: General information about operation
  • warn: Warning messages
  • error: Error messages
Viewing Logs

You can use standard Unix tools to view and analyze logs:

# View latest logs
tail -f ./logs/atoma-node-service.log

# Search for specific events
grep "event_name" ./logs/atoma-node-service.log

# View JSON logs in a more readable format (requires jq)
cat ./logs/atoma-node-service.log | jq '.'
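You can also filter the JSON logs by level (this assumes each JSON record carries a level field; the exact field names depend on the logging setup):

# Show only error-level entries
jq 'select(.level == "ERROR")' ./logs/atoma-node-service.log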
Log Rotation
  • Logs automatically rotate daily
  • Old logs are preserved with the date appended to the filename
  • You may want to set up log cleanup periodically to manage disk space:
# Example cleanup script for logs older than 30 days
find ./logs -name "atoma-node-service.log.*" -mtime +30 -delete
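To run this cleanup automatically, you could schedule it with cron (the schedule and repository path below are illustrative):

# Run the cleanup every day at 03:00; adjust the path to your atoma-node checkout
0 3 * * * find /path/to/atoma-node/logs -name "atoma-node-service.log.*" -mtime +30 -delete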