Commit

doc: update readme
dirvine committed Nov 24, 2024
1 parent 02a8bdd commit 4be08fb
Showing 1 changed file with 120 additions and 175 deletions.
295 changes: 120 additions & 175 deletions bootstrap_cache/README.md
@@ -1,216 +1,161 @@
# Bootstrap Cache

A decentralized peer discovery and caching system for the Safe Network.
A robust peer caching system for the Safe Network that provides persistent storage and management of network peer addresses. This crate handles peer discovery, caching, and reliability tracking with support for concurrent access across multiple processes.

## Features

- **Decentralized Design**: No dedicated bootstrap nodes required
- **Cross-Platform Support**: Works on Linux, macOS, and Windows
- **Shared Cache**: System-wide cache file accessible by both nodes and clients
- **Concurrent Access**: File locking for safe multi-process access
- **Atomic Operations**: Safe cache updates using atomic file operations
- **Initial Peer Discovery**: Fallback web endpoints for new/stale cache scenarios
- **Comprehensive Error Handling**: Detailed error types and logging
- **Circuit Breaker Pattern**: Intelligent failure handling with:
- Configurable failure thresholds and reset timeouts
- Exponential backoff for failed requests
- Automatic state transitions (closed → open → half-open)
- Protection against cascading failures
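
The state transitions listed above can be sketched as a small state machine. This is a minimal illustration only, not the crate's implementation: the `CircuitBreaker` type, its methods, and the `State` enum are hypothetical names, and only `max_failures` and `reset_timeout` are taken from the configuration fields shown in this README. Time is passed in explicitly so the behaviour is easy to reason about.

```rust
use std::time::{Duration, Instant};

/// Hypothetical breaker states: closed -> open -> half-open.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum State {
    Closed,
    Open,
    HalfOpen,
}

/// Illustrative circuit breaker (assumed design, not the crate's type).
struct CircuitBreaker {
    max_failures: u32,
    reset_timeout: Duration,
    failures: u32,
    opened_at: Option<Instant>,
}

impl CircuitBreaker {
    fn new(max_failures: u32, reset_timeout: Duration) -> Self {
        Self { max_failures, reset_timeout, failures: 0, opened_at: None }
    }

    /// Derive the state from the failure history and the current time.
    fn state(&self, now: Instant) -> State {
        match self.opened_at {
            None => State::Closed,
            Some(t) if now.saturating_duration_since(t) >= self.reset_timeout => State::HalfOpen,
            Some(_) => State::Open,
        }
    }

    /// Requests are rejected only while the breaker is open.
    fn allows_request(&self, now: Instant) -> bool {
        self.state(now) != State::Open
    }

    /// A success closes the breaker and clears the failure count.
    fn record_success(&mut self) {
        self.failures = 0;
        self.opened_at = None;
    }

    /// A failure at or past the threshold (re)opens the breaker.
    fn record_failure(&mut self, now: Instant) {
        self.failures = self.failures.saturating_add(1);
        if self.failures >= self.max_failures {
            self.opened_at = Some(now);
        }
    }
}
```

In this sketch a failed probe in the half-open state restarts the reset timer, because the failure count is still at or above the threshold.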
### Storage and Accessibility
- System-wide accessible cache location
- Configurable primary cache location
- Automatic fallback to user's home directory (`~/.safe/bootstrap_cache.json`)
- Cross-process safe with file locking
- Atomic write operations to prevent cache corruption
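
One common way to get atomic writes is to write to a temporary file and then rename it over the target. The sketch below assumes that approach; the README does not say how the crate implements it, and `write_atomically` and the `.tmp` suffix are hypothetical.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::Path;

/// Write `contents` to a sibling temporary file, then rename it over `path`.
/// Readers see either the old file or the new one, never a partial write.
fn write_atomically(path: &Path, contents: &[u8]) -> io::Result<()> {
    // Assumed naming scheme: same path with a `.tmp` extension.
    let tmp = path.with_extension("tmp");
    {
        let mut file = File::create(&tmp)?;
        file.write_all(contents)?;
        // Flush data to disk before the rename makes it visible.
        file.sync_all()?;
    }
    fs::rename(&tmp, path)
}
```

The rename is only atomic when the temporary file and the target are on the same file system, which is why the sketch places the temporary file next to the target.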

### Peer Management
### Concurrent Access
- Thread-safe in-memory cache with `RwLock`
- File system level locking for cross-process synchronization
- Shared (read) and exclusive (write) lock support
- Exponential backoff retry mechanism for lock acquisition
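
The retry mechanism for lock acquisition could look roughly like the following. This is a sketch under assumptions: `backoff_delay` and `retry_with_backoff` are hypothetical helpers, the doubling factor and the delay cap are illustrative choices, and the operation being retried stands in for whatever call actually takes the file lock.

```rust
use std::thread;
use std::time::Duration;

/// Delay before retry number `attempt` (0-based): `min * 2^attempt`, capped at `max`.
fn backoff_delay(attempt: u32, min: Duration, max: Duration) -> Duration {
    let factor = 2u32.saturating_pow(attempt);
    min.saturating_mul(factor).min(max)
}

/// Run `op` until it succeeds or `max_attempts` tries have been made,
/// sleeping with exponential backoff between tries.
fn retry_with_backoff<T, E>(
    max_attempts: u32,
    min: Duration,
    max: Duration,
    mut op: impl FnMut() -> Result<T, E>,
) -> Result<T, E> {
    let mut attempt = 0;
    loop {
        match op() {
            Ok(value) => return Ok(value),
            // Out of attempts: surface the last error to the caller.
            Err(err) if attempt + 1 >= max_attempts => return Err(err),
            Err(_) => {
                thread::sleep(backoff_delay(attempt, min, max));
                attempt += 1;
            }
        }
    }
}
```

A caller would pass a closure that tries to take the shared or exclusive lock and returns `Err` while another process holds it.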

The bootstrap cache implements a robust peer management system:
### Data Management
- Peer expiry after 24 hours of inactivity
- Automatic cleanup of stale and unreliable peers
- Configurable maximum peer limit
- Peer reliability tracking (success/failure counts)
- Atomic file operations for data integrity
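
As an illustration of the expiry and size-limit rules, the sketch below drops peers not seen for 24 hours and then keeps only the most recently seen entries up to a limit. The `Peer` struct and `prune_peers` function are hypothetical, and reliability-based cleanup is left out because its exact rule is not specified here.

```rust
use std::time::{Duration, SystemTime};

/// Minimal stand-in for a cached peer entry (assumed shape).
#[derive(Debug, Clone, PartialEq)]
struct Peer {
    addr: String,
    last_seen: SystemTime,
}

/// Expiry window taken from the README: 24 hours of inactivity.
const PEER_EXPIRY: Duration = Duration::from_secs(24 * 60 * 60);

/// Remove expired peers, then keep at most `max_peers`, dropping the oldest first.
fn prune_peers(peers: &mut Vec<Peer>, now: SystemTime, max_peers: usize) {
    peers.retain(|p| match now.duration_since(p.last_seen) {
        Ok(age) => age < PEER_EXPIRY,
        // `last_seen` is later than `now` (clock skew): treat the peer as fresh.
        Err(_) => true,
    });
    // Newest first, so truncation discards the oldest entries.
    peers.sort_by(|a, b| b.last_seen.cmp(&a.last_seen));
    peers.truncate(max_peers);
}
```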

- **Peer Status Tracking**: Each peer's connection history is tracked, including:
- Success count: Number of successful connections
- Failure count: Number of failed connection attempts
- Last seen timestamp: When the peer was last successfully contacted
## Configuration Options

- **Automatic Cleanup**: The system automatically removes unreliable peers:
- Peers that fail 3 consecutive connection attempts are marked for removal
- Removal only occurs if there are at least 2 working peers available
- This ensures network connectivity is maintained even during temporary connection issues
The `BootstrapConfig` struct provides the following configuration options:

- **Duplicate Prevention**: The cache automatically prevents duplicate peer entries:
- Same IP and port combinations are only stored once
- Different ports on the same IP are treated as separate peers
```rust
pub struct BootstrapConfig {
    /// List of endpoints to fetch initial peers from
    pub endpoints: Vec<String>,

    /// Maximum number of peers to maintain in the cache
    pub max_peers: usize,

    /// Path where the cache file will be stored
    pub cache_file_path: PathBuf,

    /// How long to wait for peer responses
    pub peer_response_timeout: Duration,

    /// Interval between connection attempts
    pub connection_interval: Duration,

    /// Maximum number of connection retries
    pub max_retries: u32,
}
```

## Installation
### Option Details

Add this to your `Cargo.toml`:
#### `endpoints`
- List of URLs to fetch initial peers from when the cache is empty
- Example: `["https://sn-node1.s3.amazonaws.com/peers", "https://sn-node2.s3.amazonaws.com/peers"]`
- Default: Empty vector (no endpoints)

```toml
[dependencies]
bootstrap_cache = { version = "0.1.0" }
```
#### `max_peers`
- Maximum number of peers to store in cache
- When exceeded, oldest peers are removed first
- Default: 1500 peers

## Usage
#### `cache_file_path`
- Location where the cache file will be stored
- Falls back to `~/.safe/bootstrap_cache.json` if the primary location is not writable
- Example: `/var/lib/safe/bootstrap_cache.json`

### Basic Example
#### `peer_response_timeout`
- Maximum time to wait for a peer to respond
- Affects peer reliability scoring
- Default: 60 seconds

```rust
use bootstrap_cache::{BootstrapCache, CacheManager, InitialPeerDiscovery};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Initialize the cache manager
    let cache_manager = CacheManager::new()?;

    // Try to read from the cache
    let mut cache = match cache_manager.read_cache() {
        Ok(cache) if !cache.is_stale() => cache,
        _ => {
            // Cache is stale or unavailable, fetch initial peers
            let discovery = InitialPeerDiscovery::new();
            let peers = discovery.fetch_peers().await?;
            let cache = BootstrapCache {
                last_updated: chrono::Utc::now(),
                peers,
            };
            cache_manager.write_cache(&cache)?;
            cache
        }
    };

    println!("Found {} peers in cache", cache.peers.len());
    Ok(())
}
```
#### `connection_interval`
- Time to wait between connection attempts
- Helps prevent network flooding
- Default: 10 seconds

### Custom Endpoints
#### `max_retries`
- Maximum number of times to retry connecting to a peer
- Affects peer reliability scoring
- Default: 3 attempts

```rust
use bootstrap_cache::InitialPeerDiscovery;
## Usage Modes

let discovery = InitialPeerDiscovery::with_endpoints(vec![
"http://custom1.example.com/peers.json".to_string(),
"http://custom2.example.com/peers.json".to_string(),
]);
### Default Mode
```rust
let config = BootstrapConfig::default();
let store = CacheStore::new(config).await?;
```
- Uses default configuration
- Loads peers from cache if available
- Falls back to configured endpoints if cache is empty

### Circuit Breaker Configuration

### Test Network Mode
```rust
use bootstrap_cache::{InitialPeerDiscovery, CircuitBreakerConfig};
use std::time::Duration;

// Create a custom circuit breaker configuration
let config = CircuitBreakerConfig {
    max_failures: 5,                         // Open after 5 failures
    reset_timeout: Duration::from_secs(300), // Wait 5 minutes before recovery
    min_backoff: Duration::from_secs(1),     // Start with 1 second backoff
    max_backoff: Duration::from_secs(60),    // Max backoff of 60 seconds
let args = PeersArgs {
    test_network: true,
    peers: vec![/* test peers */],
    ..Default::default()
};

// Initialize discovery with custom circuit breaker config
let discovery = InitialPeerDiscovery::with_config(config);
let store = CacheStore::from_args(args, config).await?;
```
- Isolates from main network cache
- Only uses explicitly provided peers
- No cache persistence

### Peer Management Example

### Local Mode
```rust
use bootstrap_cache::BootstrapCache;

let mut cache = BootstrapCache::new();

// Add a new peer
cache.add_peer("192.168.1.1".to_string(), 8080);

// Update peer status after connection attempts
cache.update_peer_status("192.168.1.1", 8080, true); // successful connection
cache.update_peer_status("192.168.1.1", 8080, false); // failed connection

// Clean up failed peers (only if we have at least 2 working peers)
cache.cleanup_failed_peers();
let args = PeersArgs {
    local: true,
    ..Default::default()
};
let store = CacheStore::from_args(args, config).await?;
```
- Returns an empty store
- Suitable for local network testing
- Uses mDNS for peer discovery

## Cache File Location

The cache file is stored in a system-wide location accessible to all processes:

- **Linux**: `/var/safe/bootstrap_cache.json`
- **macOS**: `/Library/Application Support/Safe/bootstrap_cache.json`
- **Windows**: `C:\ProgramData\Safe\bootstrap_cache.json`

## Cache File Format

```json
{
"last_updated": "2024-02-20T15:30:00Z",
"peers": [
{
"ip": "192.168.1.1",
"port": 8080,
"last_seen": "2024-02-20T15:30:00Z",
"success_count": 10,
"failure_count": 0
}
]
}
### First Node Mode
```rust
let args = PeersArgs {
    first: true,
    ..Default::default()
};
let store = CacheStore::from_args(args, config).await?;
```
- Returns an empty store
- No fallback to endpoints
- Used for network initialization

## Error Handling

The crate provides detailed error types through the `Error` enum:
The crate provides comprehensive error handling for:
- File system operations
- Network requests
- Concurrent access
- Data serialization/deserialization
- Lock acquisition

```rust
use bootstrap_cache::Error;

match cache_manager.read_cache() {
    Ok(cache) => println!("Cache loaded successfully"),
    Err(Error::CacheStale) => println!("Cache is stale"),
    Err(Error::CacheCorrupted) => println!("Cache file is corrupted"),
    Err(Error::Io(e)) => println!("IO error: {}", e),
    Err(e) => println!("Other error: {}", e),
}
```
All errors are propagated through the `Result<T, Error>` type with detailed error variants.
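
To make the idea of propagating errors through one `Result` type concrete, here is a minimal sketch. The `CacheError` enum, its variants, and `read_cache_file` are hypothetical stand-ins; the crate's actual `Error` enum may have different variants and conversions.

```rust
use std::fmt;
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical error type covering two of the categories listed above.
#[derive(Debug)]
enum CacheError {
    /// File system failure while reading or writing the cache.
    Io(io::Error),
    /// The cache file exists but its contents are unusable.
    Corrupted(String),
}

impl fmt::Display for CacheError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            CacheError::Io(e) => write!(f, "IO error: {}", e),
            CacheError::Corrupted(why) => write!(f, "cache corrupted: {}", why),
        }
    }
}

impl std::error::Error for CacheError {}

/// Lets `?` convert `io::Error` into `CacheError` automatically.
impl From<io::Error> for CacheError {
    fn from(e: io::Error) -> Self {
        CacheError::Io(e)
    }
}

/// Read the raw cache file, rejecting an empty file as corrupted.
fn read_cache_file(path: &Path) -> Result<String, CacheError> {
    let contents = fs::read_to_string(path)?;
    if contents.trim().is_empty() {
        return Err(CacheError::Corrupted("file is empty".to_string()));
    }
    Ok(contents)
}
```

Callers can then match on the variant they care about and fall through to a generic arm for the rest.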

## Thread Safety

The cache system uses file locking to ensure safe concurrent access:
The cache store is thread-safe and can be safely shared between threads:
- `Clone` implementation for `CacheStore`
- Internal `Arc<RwLock>` for thread-safe data access
- File system locks for cross-process synchronization
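
The `Clone` plus `Arc<RwLock>` pattern can be sketched as follows. `SharedPeers` is a hypothetical stand-in for `CacheStore`, and storing addresses as strings in a `HashSet` is an assumption made to keep the example short. Cloning the handle shares the same underlying data rather than copying it.

```rust
use std::collections::HashSet;
use std::sync::{Arc, RwLock};
use std::thread;

/// Cheaply cloneable handle; every clone points at the same peer set.
#[derive(Clone, Default)]
struct SharedPeers {
    inner: Arc<RwLock<HashSet<String>>>,
}

impl SharedPeers {
    /// Insert a peer address under the write lock; returns false for a duplicate.
    fn add(&self, addr: &str) -> bool {
        self.inner
            .write()
            .expect("peer lock poisoned")
            .insert(addr.to_string())
    }

    /// Count peers under the read lock; many readers may hold it at once.
    fn len(&self) -> usize {
        self.inner.read().expect("peer lock poisoned").len()
    }
}

/// Spawn `workers` threads that each add the same `count` addresses.
fn add_from_threads(store: &SharedPeers, workers: usize, count: u16) {
    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let store = store.clone();
            thread::spawn(move || {
                for offset in 0..count {
                    store.add(&format!("192.168.1.1:{}", 8080 + offset));
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().expect("worker thread panicked");
    }
}
```

This only covers threads within one process; coordination between processes still depends on the file system locks.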

- Shared locks for reading
- Exclusive locks for writing
- Atomic file updates using temporary files
## Logging

## Development

### Building

```bash
cargo build
```

### Running Tests

```bash
cargo test
```

### Running with Logging

```rust
use tracing_subscriber::FmtSubscriber;

// Initialize logging
let subscriber = FmtSubscriber::builder()
    .with_max_level(tracing::Level::DEBUG)
    .init();
```

## Contributing

1. Fork the repository
2. Create your feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -am 'Add amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request
The crate logs through the `tracing` crate at four levels:
- Info level for normal operations
- Warn level for recoverable issues
- Error level for critical failures
- Debug level for detailed diagnostics

## License

This project is licensed under the GPL-3.0 License - see the LICENSE file for details.

## Related Documentation

- [Bootstrap Cache PRD](docs/bootstrap_cache_prd.md)
- [Implementation Guide](docs/bootstrap_cache_implementation.md)
This SAFE Network Software is licensed under the General Public License (GPL), version 3 ([LICENSE](LICENSE) http://www.gnu.org/licenses/gpl-3.0.en.html).
