Showing 1 changed file with 120 additions and 175 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,216 +1,161 @@ | ||
# Bootstrap Cache | ||
|
||
A decentralized peer discovery and caching system for the Safe Network. | ||
A robust peer caching system for the Safe Network that provides persistent storage and management of network peer addresses. This crate handles peer discovery, caching, and reliability tracking with support for concurrent access across multiple processes. | ||
|
||
## Features | ||
|
||
- **Decentralized Design**: No dedicated bootstrap nodes required | ||
- **Cross-Platform Support**: Works on Linux, macOS, and Windows | ||
- **Shared Cache**: System-wide cache file accessible by both nodes and clients | ||
- **Concurrent Access**: File locking for safe multi-process access | ||
- **Atomic Operations**: Safe cache updates using atomic file operations | ||
- **Initial Peer Discovery**: Fallback web endpoints for new/stale cache scenarios | ||
- **Comprehensive Error Handling**: Detailed error types and logging | ||
- **Circuit Breaker Pattern**: Intelligent failure handling with: | ||
- Configurable failure thresholds and reset timeouts | ||
- Exponential backoff for failed requests | ||
- Automatic state transitions (closed → open → half-open) | ||
- Protection against cascading failures | ||
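
The closed → open → half-open transitions listed above can be sketched as a small state machine. This is only an illustration under assumptions: `CircuitBreaker`, `State`, and every method name below are hypothetical and are not taken from the crate, and the clock is passed in explicitly so the behaviour is easy to follow.

```rust
use std::time::{Duration, Instant};

/// Hypothetical breaker states; the names are illustrative only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum State {
    Closed,
    Open,
    HalfOpen,
}

/// Hypothetical circuit breaker, not the crate's actual type.
pub struct CircuitBreaker {
    max_failures: u32,
    reset_timeout: Duration,
    failures: u32,
    opened_at: Option<Instant>,
}

impl CircuitBreaker {
    pub fn new(max_failures: u32, reset_timeout: Duration) -> Self {
        Self { max_failures, reset_timeout, failures: 0, opened_at: None }
    }

    /// State as observed at `now`: an open breaker becomes half-open
    /// once `reset_timeout` has elapsed since it opened.
    pub fn state(&self, now: Instant) -> State {
        match self.opened_at {
            None => State::Closed,
            Some(t) if now.saturating_duration_since(t) >= self.reset_timeout => State::HalfOpen,
            Some(_) => State::Open,
        }
    }

    /// Requests pass while closed or half-open and are rejected while open.
    pub fn allow_request(&self, now: Instant) -> bool {
        self.state(now) != State::Open
    }

    /// A success closes the breaker and clears the failure count.
    pub fn record_success(&mut self) {
        self.failures = 0;
        self.opened_at = None;
    }

    /// A failure at or past the threshold opens the breaker; a failed
    /// half-open trial therefore reopens it with a fresh timer.
    pub fn record_failure(&mut self, now: Instant) {
        self.failures = self.failures.saturating_add(1);
        if self.failures >= self.max_failures {
            self.opened_at = Some(now);
        }
    }
}
```

A caller would check `allow_request` before contacting an endpoint and report the outcome with `record_success` or `record_failure`.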
### Storage and Accessibility | ||
- System-wide accessible cache location | ||
- Configurable primary cache location | ||
- Automatic fallback to user's home directory (`~/.safe/bootstrap_cache.json`) | ||
- Cross-process safe with file locking | ||
- Atomic write operations to prevent cache corruption | ||
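
One common way to get atomic writes is to write a temporary file next to the target and then rename it over the target. The sketch below assumes that approach; `write_atomically` is a hypothetical helper, and the README does not say the crate does it exactly this way.

```rust
use std::fs;
use std::io::{self, Write};
use std::path::Path;

/// Hypothetical helper: writes `contents` to a sibling `.tmp` file,
/// syncs it to disk, then renames it over `path`. Readers see either
/// the old file or the new one, never a partially written file
/// (assuming both paths are on the same filesystem).
pub fn write_atomically(path: &Path, contents: &[u8]) -> io::Result<()> {
    let tmp_path = path.with_extension("tmp");
    {
        let mut tmp = fs::File::create(&tmp_path)?;
        tmp.write_all(contents)?;
        // Flush data to disk before the rename makes it visible.
        tmp.sync_all()?;
    }
    fs::rename(&tmp_path, path)
}
```

The rename is the only step that touches the real cache file, so a crash before it leaves the previous cache intact.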
|
||
### Peer Management | ||
### Concurrent Access | ||
- Thread-safe in-memory cache with `RwLock` | ||
- File system level locking for cross-process synchronization | ||
- Shared (read) and exclusive (write) lock support | ||
- Exponential backoff retry mechanism for lock acquisition | ||
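
The retry mechanism for lock acquisition could look roughly like the following. It is a sketch under assumptions: `backoff_delay` and `retry_with_backoff` are hypothetical names, the doubling factor and the cap are illustrative, and the closure stands in for whatever call actually tries to take the file lock.

```rust
use std::time::Duration;

/// Delay before retry number `attempt` (0-based): `base` doubled
/// `attempt` times, capped at `max`.
pub fn backoff_delay(attempt: u32, base: Duration, max: Duration) -> Duration {
    let factor = 2u32.checked_pow(attempt).unwrap_or(u32::MAX);
    base.checked_mul(factor).unwrap_or(max).min(max)
}

/// Calls `try_lock` until it succeeds or `max_attempts` calls have
/// failed, sleeping with exponential backoff between attempts.
pub fn retry_with_backoff<T, E>(
    max_attempts: u32,
    base: Duration,
    max: Duration,
    mut try_lock: impl FnMut() -> Result<T, E>,
) -> Result<T, E> {
    let mut attempt = 0;
    loop {
        match try_lock() {
            Ok(value) => return Ok(value),
            // Out of attempts: surface the last error to the caller.
            Err(e) if attempt + 1 >= max_attempts => return Err(e),
            Err(_) => {
                std::thread::sleep(backoff_delay(attempt, base, max));
                attempt += 1;
            }
        }
    }
}
```

Capping the delay keeps a long-held lock from pushing waiters into very long sleeps.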
|
||
The bootstrap cache implements a robust peer management system: | ||
### Data Management | ||
- Peer expiry after 24 hours of inactivity | ||
- Automatic cleanup of stale and unreliable peers | ||
- Configurable maximum peer limit | ||
- Peer reliability tracking (success/failure counts) | ||
- Atomic file operations for data integrity | ||
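
The expiry and size-limit rules above can be illustrated with a short pruning function. This is a sketch, not the crate's code: the `Peer` struct, the `PEER_EXPIRY` constant, and `prune_peers` are hypothetical, and the sketch covers only the 24-hour expiry and the maximum peer limit, not reliability-based cleanup.

```rust
use std::time::{Duration, SystemTime};

/// Hypothetical peer record; field names are illustrative only.
#[derive(Debug, Clone, PartialEq)]
pub struct Peer {
    pub addr: String,
    pub last_seen: SystemTime,
}

/// Peers not seen for longer than this are treated as expired.
pub const PEER_EXPIRY: Duration = Duration::from_secs(24 * 60 * 60);

/// Drops peers older than `PEER_EXPIRY`, then keeps at most
/// `max_peers` of the most recently seen ones.
pub fn prune_peers(mut peers: Vec<Peer>, now: SystemTime, max_peers: usize) -> Vec<Peer> {
    peers.retain(|p| match now.duration_since(p.last_seen) {
        Ok(age) => age <= PEER_EXPIRY,
        // `last_seen` is in the future (clock skew): keep the peer.
        Err(_) => true,
    });
    // Newest first, so truncation removes the oldest entries.
    peers.sort_by(|a, b| b.last_seen.cmp(&a.last_seen));
    peers.truncate(max_peers);
    peers
}
```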
|
||
- **Peer Status Tracking**: Each peer's connection history is tracked, including: | ||
- Success count: Number of successful connections | ||
- Failure count: Number of failed connection attempts | ||
- Last seen timestamp: When the peer was last successfully contacted | ||
## Configuration Options | ||
|
||
- **Automatic Cleanup**: The system automatically removes unreliable peers: | ||
- Peers that fail 3 consecutive connection attempts are marked for removal | ||
- Removal only occurs if there are at least 2 working peers available | ||
- This ensures network connectivity is maintained even during temporary connection issues | ||
The `BootstrapConfig` struct provides the following configuration options: | ||
|
||
- **Duplicate Prevention**: The cache automatically prevents duplicate peer entries: | ||
- Same IP and port combinations are only stored once | ||
- Different ports on the same IP are treated as separate peers | ||
```rust | ||
pub struct BootstrapConfig { | ||
/// List of endpoints to fetch initial peers from | ||
pub endpoints: Vec<String>, | ||
|
||
/// Maximum number of peers to maintain in the cache | ||
pub max_peers: usize, | ||
|
||
/// Path where the cache file will be stored | ||
pub cache_file_path: PathBuf, | ||
|
||
/// How long to wait for peer responses | ||
pub peer_response_timeout: Duration, | ||
|
||
/// Interval between connection attempts | ||
pub connection_interval: Duration, | ||
|
||
/// Maximum number of connection retries | ||
pub max_retries: u32, | ||
} | ||
``` | ||
|
||
## Installation | ||
### Option Details | ||
|
||
Add this to your `Cargo.toml`: | ||
#### `endpoints` | ||
- List of URLs to fetch initial peers from when cache is empty | ||
- Example: `["https://sn-node1.s3.amazonaws.com/peers", "https://sn-node2.s3.amazonaws.com/peers"]` | ||
- Default: Empty vector (no endpoints) | ||
|
||
```toml | ||
[dependencies] | ||
bootstrap_cache = { version = "0.1.0" } | ||
``` | ||
#### `max_peers` | ||
- Maximum number of peers to store in cache | ||
- When exceeded, oldest peers are removed first | ||
- Default: 1500 peers | ||
|
||
## Usage | ||
#### `cache_file_path` | ||
- Location where the cache file will be stored | ||
- Falls back to `~/.safe/bootstrap_cache.json` if primary location is not writable | ||
- Example: `/var/lib/safe/bootstrap_cache.json` | ||
|
||
### Basic Example | ||
#### `peer_response_timeout` | ||
- Maximum time to wait for a peer to respond | ||
- Affects peer reliability scoring | ||
- Default: 60 seconds | ||
|
||
```rust | ||
use bootstrap_cache::{BootstrapCache, CacheManager, InitialPeerDiscovery}; | ||
|
||
#[tokio::main] | ||
async fn main() -> Result<(), Box<dyn std::error::Error>> { | ||
// Initialize the cache manager | ||
let cache_manager = CacheManager::new()?; | ||
|
||
// Try to read from the cache | ||
let mut cache = match cache_manager.read_cache() { | ||
Ok(cache) if !cache.is_stale() => cache, | ||
_ => { | ||
// Cache is stale or unavailable, fetch initial peers | ||
let discovery = InitialPeerDiscovery::new(); | ||
let peers = discovery.fetch_peers().await?; | ||
let cache = BootstrapCache { | ||
last_updated: chrono::Utc::now(), | ||
peers, | ||
}; | ||
cache_manager.write_cache(&cache)?; | ||
cache | ||
} | ||
}; | ||
|
||
println!("Found {} peers in cache", cache.peers.len()); | ||
Ok(()) | ||
} | ||
``` | ||
#### `connection_interval` | ||
- Time to wait between connection attempts | ||
- Helps prevent network flooding | ||
- Default: 10 seconds | ||
|
||
### Custom Endpoints | ||
#### `max_retries` | ||
- Maximum number of times to retry connecting to a peer | ||
- Affects peer reliability scoring | ||
- Default: 3 attempts | ||
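
Taken together, the documented defaults suggest a `Default` implementation along these lines. The struct is repeated here only so the sketch compiles on its own. The default `cache_file_path` is an assumption: the README gives `/var/lib/safe/bootstrap_cache.json` as an example, not as the default.

```rust
use std::path::PathBuf;
use std::time::Duration;

/// Copy of the struct shown earlier, repeated for a self-contained sketch.
#[derive(Debug, Clone)]
pub struct BootstrapConfig {
    pub endpoints: Vec<String>,
    pub max_peers: usize,
    pub cache_file_path: PathBuf,
    pub peer_response_timeout: Duration,
    pub connection_interval: Duration,
    pub max_retries: u32,
}

impl Default for BootstrapConfig {
    fn default() -> Self {
        Self {
            // No endpoints by default.
            endpoints: Vec::new(),
            max_peers: 1500,
            // Assumption: the example path from the README, not a confirmed default.
            cache_file_path: PathBuf::from("/var/lib/safe/bootstrap_cache.json"),
            peer_response_timeout: Duration::from_secs(60),
            connection_interval: Duration::from_secs(10),
            max_retries: 3,
        }
    }
}
```

With a `Default` in place, callers can override a single field with struct update syntax, for example `BootstrapConfig { max_peers: 500, ..Default::default() }`.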
|
||
```rust | ||
use bootstrap_cache::InitialPeerDiscovery; | ||
## Usage Modes | ||
|
||
let discovery = InitialPeerDiscovery::with_endpoints(vec![ | ||
"http://custom1.example.com/peers.json".to_string(), | ||
"http://custom2.example.com/peers.json".to_string(), | ||
]); | ||
### Default Mode | ||
```rust | ||
let config = BootstrapConfig::default(); | ||
let store = CacheStore::new(config).await?; | ||
``` | ||
- Uses default configuration | ||
- Loads peers from cache if available | ||
- Falls back to configured endpoints if cache is empty | ||
|
||
### Circuit Breaker Configuration | ||
|
||
### Test Network Mode | ||
```rust | ||
use bootstrap_cache::{InitialPeerDiscovery, CircuitBreakerConfig}; | ||
use std::time::Duration; | ||
|
||
// Create a custom circuit breaker configuration | ||
let config = CircuitBreakerConfig { | ||
max_failures: 5, // Open after 5 failures | ||
reset_timeout: Duration::from_secs(300), // Wait 5 minutes before recovery | ||
min_backoff: Duration::from_secs(1), // Start with 1 second backoff | ||
max_backoff: Duration::from_secs(60), // Max backoff of 60 seconds | ||
let args = PeersArgs { | ||
test_network: true, | ||
peers: vec![/* test peers */], | ||
..Default::default() | ||
}; | ||
|
||
// Initialize discovery with custom circuit breaker config | ||
let discovery = InitialPeerDiscovery::with_config(config); | ||
let store = CacheStore::from_args(args, config).await?; | ||
``` | ||
- Isolates from main network cache | ||
- Only uses explicitly provided peers | ||
- No cache persistence | ||
|
||
### Peer Management Example | ||
|
||
### Local Mode | ||
```rust | ||
use bootstrap_cache::BootstrapCache; | ||
|
||
let mut cache = BootstrapCache::new(); | ||
|
||
// Add a new peer | ||
cache.add_peer("192.168.1.1".to_string(), 8080); | ||
|
||
// Update peer status after connection attempts | ||
cache.update_peer_status("192.168.1.1", 8080, true); // successful connection | ||
cache.update_peer_status("192.168.1.1", 8080, false); // failed connection | ||
|
||
// Clean up failed peers (only if we have at least 2 working peers) | ||
cache.cleanup_failed_peers(); | ||
let args = PeersArgs { | ||
local: true, | ||
..Default::default() | ||
}; | ||
let store = CacheStore::from_args(args, config).await?; | ||
``` | ||
- Returns empty store | ||
- Suitable for local network testing | ||
- Uses mDNS for peer discovery | ||
|
||
## Cache File Location | ||
|
||
The cache file is stored in a system-wide location accessible to all processes: | ||
|
||
- **Linux**: `/var/safe/bootstrap_cache.json` | ||
- **macOS**: `/Library/Application Support/Safe/bootstrap_cache.json` | ||
- **Windows**: `C:\ProgramData\Safe\bootstrap_cache.json` | ||
|
||
## Cache File Format | ||
|
||
```json | ||
{ | ||
"last_updated": "2024-02-20T15:30:00Z", | ||
"peers": [ | ||
{ | ||
"ip": "192.168.1.1", | ||
"port": 8080, | ||
"last_seen": "2024-02-20T15:30:00Z", | ||
"success_count": 10, | ||
"failure_count": 0 | ||
} | ||
] | ||
} | ||
### First Node Mode | ||
```rust | ||
let args = PeersArgs { | ||
first: true, | ||
..Default::default() | ||
}; | ||
let store = CacheStore::from_args(args, config).await?; | ||
``` | ||
- Returns empty store | ||
- No fallback to endpoints | ||
- Used for network initialization | ||
|
||
## Error Handling | ||
|
||
The crate provides detailed error types through the `Error` enum: | ||
The crate provides comprehensive error handling for: | ||
- File system operations | ||
- Network requests | ||
- Concurrent access | ||
- Data serialization/deserialization | ||
- Lock acquisition | ||
|
||
```rust | ||
use bootstrap_cache::Error; | ||
|
||
match cache_manager.read_cache() { | ||
Ok(cache) => println!("Cache loaded successfully"), | ||
Err(Error::CacheStale) => println!("Cache is stale"), | ||
Err(Error::CacheCorrupted) => println!("Cache file is corrupted"), | ||
Err(Error::Io(e)) => println!("IO error: {}", e), | ||
Err(e) => println!("Other error: {}", e), | ||
} | ||
``` | ||
All errors are propagated through the `Result<T, Error>` type with detailed error variants. | ||
|
||
## Thread Safety | ||
|
||
The cache system uses file locking to ensure safe concurrent access: | ||
The cache store is thread-safe and can be safely shared between threads: | ||
- `Clone` implementation for `CacheStore` | ||
- Internal `Arc<RwLock>` for thread-safe data access | ||
- File system locks for cross-process synchronization | ||
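
The `Clone` plus `Arc<RwLock>` arrangement can be illustrated with a minimal stand-in. `SharedPeers` and its methods are hypothetical and much simpler than `CacheStore`; the point is only that every clone shares the same underlying data.

```rust
use std::collections::HashSet;
use std::sync::{Arc, RwLock};
use std::thread;

/// Hypothetical stand-in for a cloneable store: cloning copies the
/// `Arc`, so all clones read and write the same peer set.
#[derive(Clone, Default)]
pub struct SharedPeers {
    inner: Arc<RwLock<HashSet<String>>>,
}

impl SharedPeers {
    /// Takes the exclusive (write) lock to insert a peer address.
    pub fn add_peer(&self, addr: &str) {
        self.inner.write().expect("lock poisoned").insert(addr.to_string());
    }

    /// Takes the shared (read) lock to count stored peers.
    pub fn peer_count(&self) -> usize {
        self.inner.read().expect("lock poisoned").len()
    }
}

/// Adds each address from its own thread, each using its own clone.
pub fn add_from_threads(store: &SharedPeers, addrs: Vec<String>) {
    let handles: Vec<_> = addrs
        .into_iter()
        .map(|addr| {
            let store = store.clone();
            thread::spawn(move || store.add_peer(&addr))
        })
        .collect();
    for handle in handles {
        handle.join().expect("writer thread panicked");
    }
}
```

The in-memory lock only protects threads within one process; cross-process safety still depends on the file system locks listed above.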
|
||
- Shared locks for reading | ||
- Exclusive locks for writing | ||
- Atomic file updates using temporary files | ||
## Logging | ||
|
||
## Development | ||
|
||
### Building | ||
|
||
```bash | ||
cargo build | ||
``` | ||
|
||
### Running Tests | ||
|
||
```bash | ||
cargo test | ||
``` | ||
|
||
### Running with Logging | ||
|
||
```rust | ||
use tracing_subscriber::FmtSubscriber; | ||
|
||
// Initialize logging | ||
let subscriber = FmtSubscriber::builder() | ||
.with_max_level(tracing::Level::DEBUG) | ||
.init(); | ||
``` | ||
|
||
## Contributing | ||
|
||
1. Fork the repository | ||
2. Create your feature branch (`git checkout -b feature/amazing-feature`) | ||
3. Commit your changes (`git commit -am 'Add amazing feature'`) | ||
4. Push to the branch (`git push origin feature/amazing-feature`) | ||
5. Open a Pull Request | ||
Comprehensive logging using the `tracing` crate: | ||
- Info level for normal operations | ||
- Warn level for recoverable issues | ||
- Error level for critical failures | ||
- Debug level for detailed diagnostics | ||
|
||
## License | ||
|
||
This project is licensed under the GPL-3.0 License - see the LICENSE file for details. | ||
|
||
## Related Documentation | ||
|
||
- [Bootstrap Cache PRD](docs/bootstrap_cache_prd.md) | ||
- [Implementation Guide](docs/bootstrap_cache_implementation.md) | ||
This SAFE Network Software is licensed under the General Public License (GPL), version 3 ([LICENSE](LICENSE) http://www.gnu.org/licenses/gpl-3.0.en.html). |