
fix: don't GC manifests #4959

Merged: 28 commits, Nov 27, 2024
2 changes: 2 additions & 0 deletions CHANGELOG.md
@@ -38,6 +38,8 @@

### Fixed

- [#4959](https://github.com/ChainSafe/forest/pull/4959) Re-enable garbage
collection after implementing a "persistent" storage for manifests.
- [#4988](https://github.com/ChainSafe/forest/pull/4988) Fix the `logs` member
in `EthTxReceipt` that was initialized with a default value.

2 changes: 1 addition & 1 deletion Cargo.lock

(Cargo.lock is a generated file; its diff is not rendered.)

2 changes: 1 addition & 1 deletion Cargo.toml
@@ -1,6 +1,6 @@
[package]
name = "forest-filecoin"
version = "0.22.0"
version = "0.22.1"
authors = ["ChainSafe Systems <[email protected]>"]
repository = "https://github.com/ChainSafe/forest"
edition = "2021"
23 changes: 13 additions & 10 deletions src/daemon/bundle.rs
@@ -1,29 +1,27 @@
// Copyright 2019-2024 ChainSafe Systems
// SPDX-License-Identifier: Apache-2.0, MIT

use crate::db::PersistentStore;
use crate::{
networks::{ActorBundleInfo, NetworkChain, ACTOR_BUNDLES},
utils::{
db::{
car_stream::{CarBlock, CarStream},
car_util::load_car,
},
db::car_stream::{CarBlock, CarStream},
net::http_get,
},
};
use ahash::HashSet;
use anyhow::ensure;
use cid::Cid;
use futures::{stream::FuturesUnordered, TryStreamExt};
use fvm_ipld_blockstore::Blockstore;
use std::mem::discriminant;
use std::{io::Cursor, path::Path};
use tokio::io::BufReader;
use tracing::{info, warn};

/// Tries to load the missing actor bundles to the blockstore. If the bundle is
/// not present, it will be downloaded.
pub async fn load_actor_bundles(
db: &impl Blockstore,
db: &impl PersistentStore,
network: &NetworkChain,
) -> anyhow::Result<()> {
if let Some(bundle_path) = match std::env::var("FOREST_ACTOR_BUNDLE_PATH") {
@@ -40,7 +38,7 @@ pub async fn load_actor_bundles(
}

pub async fn load_actor_bundles_from_path(
db: &impl Blockstore,
db: &impl PersistentStore,
network: &NetworkChain,
bundle_path: impl AsRef<Path>,
) -> anyhow::Result<()> {
@@ -70,15 +68,15 @@ pub async fn load_actor_bundles_from_path(

// Load into DB
while let Some(CarBlock { cid, data }) = car_stream.try_next().await? {
db.put_keyed(&cid, &data)?;
db.put_keyed_persistent(&cid, &data)?;
}

Ok(())
}

/// Loads the missing actor bundle, returns the CIDs of the loaded bundles.
pub async fn load_actor_bundles_from_server(
db: &impl Blockstore,
db: &impl PersistentStore,
network: &NetworkChain,
bundles: &[ActorBundleInfo],
) -> anyhow::Result<Vec<Cid>> {
@@ -106,7 +104,12 @@ pub async fn load_actor_bundles_from_server(
http_get(alt_url).await?
};
let bytes = response.bytes().await?;
let header = load_car(db, Cursor::new(bytes)).await?;

let mut stream = CarStream::new(BufReader::new(Cursor::new(bytes))).await?;
while let Some(block) = stream.try_next().await? {
db.put_keyed_persistent(&block.cid, &block.data)?;
}
let header = stream.header;
ensure!(header.roots.len() == 1);
ensure!(header.roots.first() == root);
Ok(*header.roots.first())
6 changes: 1 addition & 5 deletions src/daemon/mod.rs
@@ -246,11 +246,7 @@ pub(super) async fn start(
genesis_header.clone(),
)?);

// Network Upgrade manifests are stored in the blockstore but may not be
// garbage collected. Until this is fixed, the GC has to be disabled.
// Tracking issue: https://github.com/ChainSafe/forest/issues/4926
// if !opts.no_gc {
if false {
> **Contributor:** Would love to see GC being manually triggered in at least one of the CI tests. Could you create a tracking issue if it cannot land in this PR?
>
> **Contributor Author:** There is an issue for manual GC.
if !opts.no_gc {
let mut db_garbage_collector = {
let chain_store = chain_store.clone();
let depth = cmp::max(
14 changes: 14 additions & 0 deletions src/db/car/any.rs
@@ -10,6 +10,7 @@

use super::{CacheKey, RandomAccessFileReader, ZstdFrameCache};
use crate::blocks::Tipset;
use crate::db::PersistentStore;
use crate::utils::io::EitherMmapOrRandomAccessFile;
use cid::Cid;
use fvm_ipld_blockstore::Blockstore;
@@ -124,6 +125,19 @@ where
}
}

> **Contributor (@hanabi1224, Nov 26, 2024):** I feel it would be less error-prone to make the Blockstore trait persistent by default and have a GarbageCollectableStore trait for storing blockchain data. Currently, people have to interpret Blockstore as GarbageCollectableStore to avoid mistakes, which is counter-intuitive. What do you think?
>
> **Contributor Author:** We can create an issue for that and we'll address it as prioritized. It's out of scope for this PR; I have already created an issue to generalize the different stores a bit more, so we can rely better on tests for different database types.
>
> **Contributor Author:** I'm also not sure that making Blockstore persistent by default is a good idea, as the use cases for persistent storage are pretty limited at the moment. Our default use case is garbage-collected, so having persistent storage as an add-on makes more sense. I do agree that in the general case this would be counter-intuitive.

impl<ReaderT> PersistentStore for AnyCar<ReaderT>
where
ReaderT: ReadAt,
{
fn put_keyed_persistent(&self, k: &Cid, block: &[u8]) -> anyhow::Result<()> {
match self {
AnyCar::Forest(forest) => forest.put_keyed_persistent(k, block),
AnyCar::Plain(plain) => plain.put_keyed_persistent(k, block),
AnyCar::Memory(mem) => mem.put_keyed_persistent(k, block),
}
}
}
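The review discussion above (persistent-by-default Blockstore vs. a `PersistentStore` add-on) can be illustrated with a toy sketch of the split this PR introduces. This is a simplified, hypothetical model: it uses plain byte-slice keys, `&mut self`, and a trivial "sweep everything" pass in place of Forest's real `Cid`-keyed stores, interior mutability, and mark-and-sweep GC. Only the method name `put_keyed_persistent` and the two-column layout come from the diff.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for the PR's PersistentStore trait
// (the real one takes &self and returns anyhow::Result).
trait PersistentStore {
    fn put_keyed_persistent(&mut self, key: &[u8], block: &[u8]);
}

// Toy analogue of MemoryDB's two columns: one swept by GC, one never swept.
#[derive(Default)]
struct ToyDb {
    blockchain_db: HashMap<Vec<u8>, Vec<u8>>,            // garbage-collectable
    blockchain_persistent_db: HashMap<Vec<u8>, Vec<u8>>, // survives GC
}

impl ToyDb {
    fn put_keyed(&mut self, key: &[u8], block: &[u8]) {
        self.blockchain_db.insert(key.to_vec(), block.to_vec());
    }

    // Reads fall back to the persistent column, mirroring the MemoryDB diff.
    fn get(&self, key: &[u8]) -> Option<&Vec<u8>> {
        self.blockchain_db
            .get(key)
            .or_else(|| self.blockchain_persistent_db.get(key))
    }

    // Trivial stand-in for the sweep phase: only the GC-able column is cleared.
    fn sweep_all(&mut self) {
        self.blockchain_db.clear();
    }
}

impl PersistentStore for ToyDb {
    fn put_keyed_persistent(&mut self, key: &[u8], block: &[u8]) {
        self.blockchain_persistent_db
            .insert(key.to_vec(), block.to_vec());
    }
}

fn main() {
    let mut db = ToyDb::default();
    db.put_keyed(b"chain-block", b"data");
    db.put_keyed_persistent(b"manifest", b"bundle");
    db.sweep_all();
    assert!(db.get(b"chain-block").is_none()); // swept
    assert!(db.get(b"manifest").is_some());    // survived GC
    println!("manifest survived GC");
}
```

This is the design the author defends in the thread: garbage collection stays the default write path (`put_keyed`), and persistence is an explicit opt-in for the few callers (such as manifest loading) that need it.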

impl<ReaderT> From<super::ForestCar<ReaderT>> for AnyCar<ReaderT> {
fn from(car: super::ForestCar<ReaderT>) -> Self {
Self::Forest(car)
11 changes: 11 additions & 0 deletions src/db/car/forest.rs
@@ -50,6 +50,7 @@ use super::{CacheKey, ZstdFrameCache};
use crate::blocks::{Tipset, TipsetKey};
use crate::db::car::plain::write_skip_frame_header_async;
use crate::db::car::RandomAccessFileReader;
use crate::db::PersistentStore;
use crate::utils::db::car_stream::{CarBlock, CarHeader};
use crate::utils::encoding::from_slice_with_fallback;
use crate::utils::io::EitherMmapOrRandomAccessFile;
@@ -73,6 +74,7 @@ use std::{
use tokio::io::{AsyncWrite, AsyncWriteExt};
use tokio_util::codec::{Decoder, Encoder as _};
use unsigned_varint::codec::UviBytes;

#[cfg(feature = "benchmark-private")]
pub mod index;
#[cfg(not(feature = "benchmark-private"))]
@@ -237,6 +239,15 @@ where
}
}

impl<ReaderT> PersistentStore for ForestCar<ReaderT>
where
ReaderT: ReadAt,
{
fn put_keyed_persistent(&self, k: &Cid, block: &[u8]) -> anyhow::Result<()> {
self.put_keyed(k, block)
}
}

fn decode_zstd_single_frame<ReaderT: Read>(reader: ReaderT) -> io::Result<BytesMut> {
let mut zstd_frame = vec![];
zstd::Decoder::new(reader)?
8 changes: 7 additions & 1 deletion src/db/car/many.rs
@@ -9,7 +9,7 @@
//! A single z-frame cache is shared between all read-only stores.

use super::{AnyCar, ZstdFrameCache};
use crate::db::{EthMappingsStore, MemoryDB, SettingsStore};
use crate::db::{EthMappingsStore, MemoryDB, PersistentStore, SettingsStore};
use crate::libp2p_bitswap::BitswapStoreReadWrite;
use crate::rpc::eth::types::EthHash;
use crate::shim::clock::ChainEpoch;
@@ -169,6 +169,12 @@ impl<WriterT: Blockstore> Blockstore for ManyCar<WriterT> {
}
}

impl<WriterT: PersistentStore> PersistentStore for ManyCar<WriterT> {
fn put_keyed_persistent(&self, k: &Cid, block: &[u8]) -> anyhow::Result<()> {
self.writer.put_keyed_persistent(k, block)
}
}

impl<WriterT: BitswapStoreRead + Blockstore> BitswapStoreRead for ManyCar<WriterT> {
fn contains(&self, cid: &Cid) -> anyhow::Result<bool> {
Blockstore::has(self, cid)
10 changes: 10 additions & 0 deletions src/db/car/plain.rs
@@ -71,6 +71,7 @@ use cid::Cid;
use fvm_ipld_blockstore::Blockstore;
use integer_encoding::VarIntReader;

use crate::db::PersistentStore;
use nunny::Vec as NonEmpty;
use parking_lot::RwLock;
use positioned_io::ReadAt;
@@ -233,6 +234,15 @@ where
}
}

impl<ReaderT> PersistentStore for PlainCar<ReaderT>
where
ReaderT: ReadAt,
{
fn put_keyed_persistent(&self, k: &Cid, block: &[u8]) -> anyhow::Result<()> {
self.put_keyed(k, block)
}
}

pub async fn write_skip_frame_header_async(
mut writer: impl AsyncWrite + Unpin,
data_len: u32,
45 changes: 44 additions & 1 deletion src/db/gc/mod.rs
@@ -238,7 +238,7 @@ mod test {
use crate::blocks::{CachingBlockHeader, Tipset};
use crate::chain::{ChainEpochDelta, ChainStore};

use crate::db::{GarbageCollectable, MarkAndSweep, MemoryDB};
use crate::db::{GarbageCollectable, MarkAndSweep, MemoryDB, PersistentStore};
use crate::message_pool::test_provider::{mock_block, mock_block_with_parents};
use crate::networks::ChainConfig;

@@ -247,7 +247,11 @@ mod test {
use core::time::Duration;

use crate::shim::clock::ChainEpoch;
use cid::multihash::Code::Identity;
use cid::multihash::MultihashDigest;
use cid::Cid;
use fvm_ipld_blockstore::Blockstore;
use fvm_ipld_encoding::DAG_CBOR;
use std::sync::Arc;

const ZERO_DURATION: Duration = Duration::from_secs(0);
@@ -473,4 +477,43 @@
current_epoch + 1 + depth * 2
);
}

#[tokio::test]
async fn persistent_data_resilient_to_gc() {
let depth = 5 as ChainEpochDelta;
let current_epoch = 0 as ChainEpochDelta;

let tester = GCTester::new();
> **Contributor:** Shall we test both MemoryDB and ParityDB?
>
> **Contributor Author:** ParityDB is tricky to test; there is an issue for that. Therefore it has been tested manually.
>
> **Contributor Author:** The main issue there is that deletes are not propagated immediately, so unit testing it reliably is not currently possible.
>
> **Contributor Author:** Actually, no issue — creating one.
let mut gc = MarkAndSweep::new(
tester.db.clone(),
tester.get_heaviest_tipset_fn(),
depth,
ZERO_DURATION,
);

let depth = depth as ChainEpochDelta;
let current_epoch = current_epoch as ChainEpochDelta;

let persistent_data = [1, 55];
let persistent_cid = Cid::new_v1(DAG_CBOR, Identity.digest(&persistent_data));

// Make sure we run enough epochs to initiate GC.
tester.run_epochs(current_epoch);
tester.run_epochs(depth);
tester
.db
.put_keyed_persistent(&persistent_cid, &persistent_data)
.unwrap();
// Mark.
gc.gc_workflow(ZERO_DURATION).await.unwrap();
tester.run_epochs(depth);
// Sweep.
gc.gc_workflow(ZERO_DURATION).await.unwrap();

// Make sure persistent data stays.
assert_eq!(
tester.db.get(&persistent_cid).unwrap(),
Some(persistent_data.to_vec())
);
}
}
23 changes: 21 additions & 2 deletions src/db/memory.rs
@@ -2,7 +2,7 @@
// SPDX-License-Identifier: Apache-2.0, MIT

use crate::cid_collections::CidHashSet;
use crate::db::GarbageCollectable;
use crate::db::{GarbageCollectable, PersistentStore};
use crate::libp2p_bitswap::{BitswapStoreRead, BitswapStoreReadWrite};
use crate::rpc::eth::types::EthHash;
use ahash::HashMap;
@@ -16,6 +16,7 @@ use super::{EthMappingsStore, SettingsStore};
#[derive(Debug, Default)]
pub struct MemoryDB {
blockchain_db: RwLock<HashMap<Vec<u8>, Vec<u8>>>,
blockchain_persistent_db: RwLock<HashMap<Vec<u8>, Vec<u8>>>,
settings_db: RwLock<HashMap<String, Vec<u8>>>,
eth_mappings_db: RwLock<HashMap<EthHash, Vec<u8>>>,
}
@@ -109,7 +110,16 @@ impl EthMappingsStore for MemoryDB {

impl Blockstore for MemoryDB {
fn get(&self, k: &Cid) -> anyhow::Result<Option<Vec<u8>>> {
Ok(self.blockchain_db.read().get(&k.to_bytes()).cloned())
Ok(self
.blockchain_db
.read()
.get(&k.to_bytes())
.cloned()
.or(self
.blockchain_persistent_db
.read()
.get(&k.to_bytes())
.cloned()))
}

fn put_keyed(&self, k: &Cid, block: &[u8]) -> anyhow::Result<()> {
@@ -120,6 +130,15 @@
}
}

impl PersistentStore for MemoryDB {
fn put_keyed_persistent(&self, k: &Cid, block: &[u8]) -> anyhow::Result<()> {
self.blockchain_persistent_db
.write()
.insert(k.to_bytes(), block.to_vec());
Ok(())
}
}

impl BitswapStoreRead for MemoryDB {
fn contains(&self, cid: &Cid) -> anyhow::Result<bool> {
Ok(self.blockchain_db.read().contains_key(&cid.to_bytes()))
2 changes: 2 additions & 0 deletions src/db/migration/migration_map.rs
@@ -9,6 +9,7 @@ use std::{

use crate::db::migration::v0_16_0::Migration0_15_2_0_16_0;
use crate::db::migration::v0_19_0::Migration0_18_0_0_19_0;
use crate::db::migration::v0_22_1::Migration0_22_0_0_22_1;
use crate::Config;
use anyhow::bail;
use anyhow::Context as _;
@@ -80,6 +81,7 @@ create_migrations!(
"0.12.1" -> "0.13.0" @ Migration0_12_1_0_13_0,
"0.15.2" -> "0.16.0" @ Migration0_15_2_0_16_0,
"0.18.0" -> "0.19.0" @ Migration0_18_0_0_19_0,
"0.22.0" -> "0.22.1" @ Migration0_22_0_0_22_1,
);

pub struct Migration {
1 change: 1 addition & 0 deletions src/db/migration/mod.rs
@@ -6,6 +6,7 @@ mod migration_map;
mod v0_12_1;
mod v0_16_0;
mod v0_19_0;
mod v0_22_1;
mod void_migration;

pub use db_migration::DbMigration;
2 changes: 1 addition & 1 deletion src/db/migration/v0_16_0.rs
@@ -210,7 +210,7 @@ mod paritydb_0_15_1 {
// if it changes and then this migration should either be maintained or removed.
pub(super) fn open(path: impl Into<PathBuf>) -> anyhow::Result<db::parity_db::ParityDb> {
let opts = Self::to_options(path.into());
let db = db::parity_db::ParityDb::wrap(Db::open_or_create(&opts)?, false);
let db = db::parity_db::ParityDb::wrap(Db::open_or_create(&opts)?, false, false);
Ok(db)
}
}