Merge pull request #2 from ritual-net/dev
Major release 1.0.0
arshan-ritual authored Jun 6, 2024
2 parents 6f69975 + fbf0b72 commit dc2d6fb
Showing 29 changed files with 546 additions and 272 deletions.
6 changes: 6 additions & 0 deletions .github/workflows/pr.yaml
@@ -18,6 +18,12 @@ jobs:
with:
extra_args: --all-files --show-diff-on-failure

- name: Run Format (AWS)
run: cd procure/aws && terraform fmt -check

- name: Run Format (GCP)
run: cd procure/gcp && terraform fmt -check

- name: Setup TFLint
uses: terraform-linters/setup-tflint@v3
with:
30 changes: 30 additions & 0 deletions CHANGELOG.md
@@ -0,0 +1,30 @@
# Changelog

All notable changes to this project will be documented in this file.

- ##### The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/).
- ##### This project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [1.0.0] - UNRELEASED

### Added
- Support for multi-region, multi-zone deployments in GCP.
- Support for multi-zone deployments in AWS. Since multi-region deployments require
separate provider blocks, we don't allow multiple regions to avoid increased repo complexity.
- Support for GPUs on GCP (via the terraform `accelerator` block) and AWS. Includes driver installation script
and a gpu-specific `docker-compose.yaml` file to expose GPUs to the node container for diagnostics.
- Terraform formatter in pipeline and README.

### Changed
- Format of node specification in `.tfvars`. Nodes are now specified via a map (see `variables.tf`) where keys correspond to node IDs.
- Format of router specification in `.tfvars`. Router is now specified via a map (see `variables.tf`).
- Naming conventions for configuration `.json` files. One file per deployed node, names (without `.json` postfix) matching the node IDs (keys of `nodes` from `variables.tf`), are now the only requirements.

### Fixed
- All created resources are now parametrized by cluster name, so no conflicts arise from successive deployments within the same project.
- Omissions in Makefile.

## [0.1.0] - 2024-01-18

### Added
- Initial release of Infernet Deploy.
22 changes: 17 additions & 5 deletions README.md
@@ -7,13 +7,14 @@ Deploy a cluster of heterogenous [Infernet](https://github.com/ritual-net/infern
1. [Install Terraform](https://developer.hashicorp.com/terraform/install)
2. **Configure nodes**: A node configuration file **for each** node being deployed.
- See [example configuration](configs/0.json.example).
- They must be named `0.json`, `1.json`, etc...
- Misnamed files are ignored.
- They must have **unique** names
- A straightforward approach would be `0.json`, `1.json`, etc...
- They must be placed under the top-level `configs/` directory.
- Number and name of `.json` files must match the number and name of *keys* in the `nodes` variable in `terraform.tfvars`.
- See [terraform.tfvars.example](./procure/aws/terraform.tfvars.example).
- Each key should correspond to the name of a `.json` file, *excluding* the `.json` postfix.
- Each node *strictly* requires its own configuration `.json` file, even if those are identical.
- Number of `.json` files must match the `node_count` variable in `terraform.tfvars`.
- Extra files are ignored.
- For instructions on configuring nodes, refer to the [Infernet Node](https://github.com/ritual-net/infernet-node).
- For instructions on configuring individual nodes, refer to the [Infernet Node](https://github.com/ritual-net/infernet-node).
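The naming rule in the updated list (one `.json` file per key of `nodes`, file name minus `.json` equal to the key) can be checked before running Terraform. The following is a minimal sketch, not part of the repository: the function name `check_configs` and the idea of passing the node IDs in by hand are assumptions made for illustration.

```python
from pathlib import Path


def check_configs(config_dir, node_ids):
    """Compare config file names against the expected node IDs.

    Hypothetical helper: returns (missing, extra), where `missing` lists node
    IDs that have no `<id>.json` file and `extra` lists `.json` files whose
    name matches no node ID. Files such as `0.json.example` are not matched
    by the `*.json` pattern and are therefore ignored.
    """
    found = {path.stem for path in Path(config_dir).glob("*.json")}
    expected = set(node_ids)
    return sorted(expected - found), sorted(found - expected)


if __name__ == "__main__":
    # The node IDs here are placeholders; in practice they would be the keys
    # of the `nodes` map in terraform.tfvars.
    missing, extra = check_configs("configs", ["0", "1"])
    print("missing configs:", missing)
    print("unmatched configs:", extra)
```

Both returned lists being empty corresponds to the "number and name must match" requirement stated above.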

#### Infernet Router:
The Infernet Router REST server is configured automatically by Terraform. However, if you plan to use it, you need to understand its implications:
@@ -106,6 +107,17 @@ tflint --init
tflint --recursive
```

### Using Terraform Format
```bash
# Format AWS files
cd procure/aws
terraform fmt
# Format GCP files
cd procure/gcp
terraform fmt
```

## License

[BSD 3-clause Clear](./LICENSE)
26 changes: 21 additions & 5 deletions configs/0.json.example
@@ -6,19 +6,24 @@
"chain": {
"enabled": true,
"rpc_url": "http://127.0.0.1:8545",
"coordinator_address": "0x...",
"trail_head_blocks": 4,
"wallet": {
"max_gas_limit": 100000,
"private_key": "12345s"
"private_key": "12345s",
"payment_address": "0x...",
"allowed_sim_errors": []
},
"snapshot_sync": {
"sleep": 1.5,
"batch_size": 200
}
},
"docker": {
"username": "username",
"password": "password"
},
"redis": {
"host": "localhost",
"host": "redis",
"port": 6379
},
"forward_stats": true,
@@ -40,7 +45,12 @@
"KEY1": "VALUE1",
"KEY2": "VALUE2"
},
"gpu": true
"gpu": false,
"accepted_payments": {
"0x0000000000000000000000000000000000000000": 1000000000000000000,
"0x59F2f1fCfE2474fD5F0b9BA1E73ca90b143Eb8d0": 1000000000000000000
},
"generates_proofs": true
},
{
"id": "container-2",
@@ -58,7 +68,13 @@
"env": {
"KEY3": "VALUE3",
"KEY4": "VALUE4"
}
},
"gpu": true,
"accepted_payments": {
"0x0000000000000000000000000000000000000000": 1000000000000000000,
"0x59F2f1fCfE2474fD5F0b9BA1E73ca90b143Eb8d0": 1000000000000000000
},
"generates_proofs": false
}
]
}
62 changes: 62 additions & 0 deletions deploy/docker-compose-gpu.yaml
@@ -0,0 +1,62 @@
version: '3'

services:
  node:
    image: ritualnetwork/infernet-node:latest-gpu
    ports:
      - "0.0.0.0:4000:4000"
    volumes:
      - ./config.json:/app/config.json
      - node-logs:/logs
      - /var/run/docker.sock:/var/run/docker.sock
    tty: true
    networks:
      - network
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
    depends_on:
      - redis
    restart:
      on-failure
    extra_hosts:
      - "host.docker.internal:host-gateway"
    stop_grace_period: 1m

  redis:
    image: redis:latest
    expose:
      - "6379"
    networks:
      - network
    volumes:
      - ./redis.conf:/usr/local/etc/redis/redis.conf
      - redis-data:/data
    restart:
      on-failure

  fluentbit:
    image: fluent/fluent-bit:latest
    expose:
      - "24224"
    environment:
      - FLUENTBIT_CONFIG_PATH=/fluent-bit/etc/fluent-bit.conf
    volumes:
      - ./fluent-bit.conf:/fluent-bit/etc/fluent-bit.conf
      - /var/log:/var/log:ro
    networks:
      - network
    restart:
      on-failure

networks:
  network:


volumes:
  node-logs:
  redis-data:
19 changes: 9 additions & 10 deletions deploy/docker-compose.yaml
@@ -2,29 +2,28 @@ version: '3'

services:
node:
image: ritualnetwork/infernet-node:0.1.0
image: ritualnetwork/infernet-node:latest
ports:
- "0.0.0.0:4000:4000"
volumes:
- type: bind
source: ./config.json
target: /app/config.json
- ./config.json:/app/config.json
- node-logs:/logs
- /var/run/docker.sock:/var/run/docker.sock
tty: true
networks:
- network
restart:
on-failure
depends_on:
- redis
restart:
on-failure
extra_hosts:
- "host.docker.internal:host-gateway"
stop_grace_period: 1m

redis:
image: redis:latest
ports:
- "6379:6379"
expose:
- "6379"
networks:
- network
volumes:
@@ -35,8 +34,8 @@ services:

fluentbit:
image: fluent/fluent-bit:latest
ports:
- "24224:24224"
expose:
- "24224"
environment:
- FLUENTBIT_CONFIG_PATH=/fluent-bit/etc/fluent-bit.conf
volumes:
8 changes: 6 additions & 2 deletions procure/Makefile
@@ -1,3 +1,5 @@
SHELL := /bin/bash

# Define the Terraform command
TERRAFORM_CMD := terraform

@@ -21,6 +23,9 @@ init: check-provider

# Define the plan target
plan: check-provider
@echo "Preparing deployment files..."
@chmod +x prepare_files.sh
@./prepare_files.sh
@echo "Generating Terraform plan..."
$(TERRAFORM_CMD) -chdir=$(provider) plan

@@ -33,7 +38,6 @@ apply: check-provider
@$(TERRAFORM_CMD) -chdir=$(provider) apply

# Define the destroy target
destroy:
@[[ "$(provider)" == "aws" || "$(provider)" == "gcp" ]] || (echo "Usage: 'make destroy provider={aws, gcp}'" && exit 1)
destroy: check-provider
@echo "Destroying Terraform resources..."
@$(TERRAFORM_CMD) -chdir=$(provider) destroy
2 changes: 1 addition & 1 deletion procure/aws/main.tf
@@ -16,5 +16,5 @@ terraform {
provider "aws" {
access_key = var.access_key_id
secret_key = var.secret_access_key
region = var.region
region = var.region
}
16 changes: 9 additions & 7 deletions procure/aws/metadata.tf
@@ -1,22 +1,24 @@
# Config files as secrets
resource "aws_ssm_parameter" "config_file" {
count = var.node_count
for_each = var.nodes

name = "config_${count.index}"
name = "${each.key}.json"
type = "SecureString"
value = filebase64("${path.module}/../../configs/${count.index}.json")
value = filebase64("${path.module}/../../configs/${each.key}.json")
}

# Deployment files
resource "aws_ssm_parameter" "deploy_tar" {
name = "deploy_tar"
for_each = var.nodes

name = "deploy-tar-${each.key}"
type = "SecureString"
value = filebase64("${path.module}/../deploy.tar.gz")
value = each.value.has_gpu ? filebase64("${path.module}/../deploy-gpu.tar.gz") : filebase64("${path.module}/../deploy.tar.gz")
}

# Node IPs
resource "aws_ssm_parameter" "node_ips" {
name = "node_ips"
name = "node-ips-${var.name}"
type = "String"
value = join("\n", [for ip in aws_eip.static_ip[*].public_ip : "${ip}:4000"])
value = join("\n", [for key, _ in aws_instance.nodes : "${aws_eip.static_ip[key].public_ip}:4000"])
}
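To make the new-side expressions in `metadata.tf` easier to follow, here is a rough Python mirror of what they appear to compute: one config parameter and one deploy archive per node key, with the archive chosen by `has_gpu`, plus a single newline-separated `ip:4000` list named after the cluster. This is only an illustrative sketch; `deploy_archive`, `parameter_plan`, and the dictionary shapes are hypothetical names introduced here, not anything in the repository.

```python
def deploy_archive(has_gpu):
    """Pick the archive the way the `each.value.has_gpu ? ... : ...` conditional does."""
    return "deploy-gpu.tar.gz" if has_gpu else "deploy.tar.gz"


def parameter_plan(nodes, public_ips, cluster_name):
    """Return a {parameter name: source} mapping for a set of nodes.

    Assumptions for illustration: `nodes` maps node ID -> {"has_gpu": bool},
    and `public_ips` maps node ID -> public IP string. Keys are visited in
    sorted order, which matches how Terraform iterates over maps.
    """
    plan = {}
    for node_id in sorted(nodes):
        plan[f"{node_id}.json"] = f"configs/{node_id}.json"
        plan[f"deploy-tar-{node_id}"] = deploy_archive(nodes[node_id]["has_gpu"])
    plan[f"node-ips-{cluster_name}"] = "\n".join(
        f"{public_ips[node_id]}:4000" for node_id in sorted(nodes)
    )
    return plan


if __name__ == "__main__":
    nodes = {"gpu-node": {"has_gpu": True}, "cpu-node": {"has_gpu": False}}
    ips = {"gpu-node": "203.0.113.2", "cpu-node": "203.0.113.1"}
    for name, source in parameter_plan(nodes, ips, "demo").items():
        print(name, "<-", source.replace("\n", ", "))
```

Because parameter names are derived from the node keys and the cluster name, two clusters with different names would not collide on the `node-ips-*` parameter.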