Add layer 3-log #11

lucassong-mh · 2023-09-26T11:14:29Z

No description provided.

tatetian

This is my first round of review, in which I have focused on TxLog and ChunkAlloc.

src/error.rs

src/tx/current.rs

src/layers/3-log/tx_log.rs

tatetian · 2023-11-21T05:03:30Z

src/layers/3-log/tx_log.rs

+use crate::tx::{CurrentTx, Tx, TxData, TxId, TxProvider};
+use crate::util::{LazyDelete, RandomInit};
+
+use alloc::collections::{BTreeMap, BTreeSet};


As a recommended convention, the standard and third-party imports should be put before the imports of the current crate.

src/layers/3-log/tx_log.rs

tatetian · 2023-11-22T05:08:24Z

src/layers/3-log/tx_log.rs

+                        let log_cache = log_caches.get_mut(log_id).unwrap();
+                        let mut cache_inner = log_cache.inner.write();
+                        open_cache.lru_cache.iter().for_each(|(&pos, node)| {
+                            cache_inner.lru_cache.put(pos, node.clone());


Migrating items from the per-TX cache to the global cache looks like an expensive operation (O(N log N), where N is the number of blocks in a TxLog). I am certain that this can be improved a lot. Try running a microbenchmark to decide whether this is a performance bottleneck.
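A minimal sketch of such a microbenchmark follows. It is only an illustration under stated assumptions: BTreeMap stands in for both LRU caches (the PR's actual cache type may have a different cost profile), and Node, build_tx_cache, and migrate are hypothetical names that do not exist in the PR.

```rust
use std::collections::BTreeMap;
use std::sync::Arc;
use std::time::{Duration, Instant};

// Hypothetical stand-in for a cached block buffer; not the PR's node type.
type Node = Arc<Vec<u8>>;

/// Builds a per-TX cache with `n` entries (stand-in for `open_cache.lru_cache`).
fn build_tx_cache(n: usize) -> BTreeMap<usize, Node> {
    (0..n).map(|pos| (pos, Arc::new(vec![0u8; 16]))).collect()
}

/// Copies every entry into the global cache one by one, mirroring the shape
/// of the reviewed loop, and returns the elapsed wall-clock time.
fn migrate(tx_cache: &BTreeMap<usize, Node>, global: &mut BTreeMap<usize, Node>) -> Duration {
    let start = Instant::now();
    for (&pos, node) in tx_cache.iter() {
        // Only the Arc is cloned; the underlying buffer is shared.
        global.insert(pos, Arc::clone(node));
    }
    start.elapsed()
}
```

Calling migrate for a few values of n (for example 1,000, 10,000, and 100,000) and comparing the returned durations against the cost of the rest of the commit path would indicate whether the migration matters in practice.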

tatetian · 2023-11-22T05:30:26Z

src/layers/3-log/mod.rs

@@ -1,9 +1,7 @@
 //! The layer of transactional logging.


Module-level Rust docs should be provided. Document how the individual APIs provided by this module, namely TxLogStore, TxLog, and Tx, can be used together. Also, TxLogStore and TxLog (as well as Tx) are inadequately documented. Sadly, their current type-level Rust docs are basically what I wrote in the draft design. But this is now the official implementation. By this point, we should have much more to say and write.

I cannot put my trust in any code that is poorly documented. If one cannot write decent docs that give the spec of the code, how on earth can anyone, including the author, know whether the implementation is correct? Our abstractions are not trivial. They deserve to be well documented.

With ChatGPT, the job of writing docs in decent English is much easier.

src/layers/3-log/tx_log.rs

src/layers/3-log/raw_log.rs

src/layers/3-log/chunk.rs

cqs21 · 2023-11-22T07:34:20Z

src/layers/3-log/chunk.rs

    // A bitmap where each bit indicates whether a corresponding chunk has been
    // allocated.
-    alloc_map: BitVec<usize, Lsb0>,
+    alloc_map: BitMap,


Maybe free_map is better, since all the other fields are named after free chunks (free_count, min_free).

cqs21 · 2023-11-22T07:39:42Z

src/layers/3-log/chunk.rs

    // The number of free chunks.
    free_count: usize,
-    // The minimum free chunk Id. Useful to narrow the scope of searching for 
+    // The minimum free chunk Id. Useful to narrow the scope of searching for
    // free chunk IDs.
    min_free: usize,


I guess this is the position at which the search in alloc_map starts, so min_free is a little vague. How about search_cursor?

cqs21 · 2023-11-22T08:00:24Z

src/layers/3-log/chunk.rs

-        self.alloc_map[chunk_id] = false;
+        assert_eq!(self.alloc_map[chunk_id], true);
+        self.alloc_map.set(chunk_id, false);
+        self.free_count += 1;

        // Keep the invariance that all free chunk IDs are no less than min_free
        if chunk_id < self.min_free {


The logic of min_free seems to encourage reallocating the chunk that was deallocated just now. I would prefer to allocate a chunk that has never been used or that was deallocated a long time ago.
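One way to get that behavior is a next-fit search whose cursor only moves forward and is never pulled back on deallocation. The sketch below is a simplified illustration, not the PR's ChunkAlloc: NextFitAlloc is a hypothetical type, a Vec<bool> replaces the bitmap, and search_cursor is the name proposed earlier in this review.

```rust
/// Hypothetical, simplified chunk allocator; not the PR's ChunkAlloc.
pub struct NextFitAlloc {
    // true = allocated, false = free (a Vec<bool> stands in for the bitmap).
    alloc_map: Vec<bool>,
    // The number of free chunks.
    free_count: usize,
    // Where the next search starts; it only advances, wrapping at the end.
    search_cursor: usize,
}

impl NextFitAlloc {
    pub fn new(capacity: usize) -> Self {
        Self {
            alloc_map: vec![false; capacity],
            free_count: capacity,
            search_cursor: 0,
        }
    }

    /// Allocates the first free chunk at or after the cursor, wrapping around.
    pub fn alloc(&mut self) -> Option<usize> {
        if self.free_count == 0 {
            return None;
        }
        let cap = self.alloc_map.len();
        for offset in 0..cap {
            let id = (self.search_cursor + offset) % cap;
            if !self.alloc_map[id] {
                self.alloc_map[id] = true;
                self.free_count -= 1;
                self.search_cursor = (id + 1) % cap;
                return Some(id);
            }
        }
        None
    }

    /// Frees a chunk. The cursor is deliberately left untouched, so the
    /// chunk is not handed out again until the search wraps around to it.
    pub fn dealloc(&mut self, chunk_id: usize) {
        assert!(self.alloc_map[chunk_id], "double free of chunk {}", chunk_id);
        self.alloc_map[chunk_id] = false;
        self.free_count += 1;
    }
}
```

The trade-off is that allocations spread over the whole ID range instead of staying packed at the low end, which may or may not suit the on-disk layout.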

cqs21 · 2023-11-24T08:15:21Z

src/layers/3-log/tx_log.rs

+use pod::Pod;
+use serde::{Deserialize, Serialize};
+
+pub type TxLogId = RawLogId;


If TxLogId and RawLogId are the same, a single LogId definition at the module level could be better, since it establishes a connection between TxLog and RawLog, like a master key.

cqs21 · 2023-11-24T08:22:26Z

src/layers/3-log/tx_log.rs

@@ -463,8 +540,11 @@ impl<D: BlockSet> TxLogStore<D> {
    /// This method must be called within a TX. Otherwise, this method panics.
    pub fn open_log_in(&self, bucket: &str) -> Result<Arc<TxLog<D>>> {


Could we rename the function for better readability, e.g. open_max_log_in or open_latest_log_in?

cqs21 · 2023-11-24T08:25:16Z

src/layers/3-log/tx_log.rs

@@ -463,8 +540,11 @@ impl<D: BlockSet> TxLogStore<D> {
    /// This method must be called within a TX. Otherwise, this method panics.
    pub fn open_log_in(&self, bucket: &str) -> Result<Arc<TxLog<D>>> {
        let log_ids = self.list_logs(bucket)?;


Could we maintain max_log_id, so that we avoid calling list_logs and iterating over the IDs every time?
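A rough sketch of the idea, assuming the store can keep an ordered set of log IDs per bucket. BucketIndex and its methods are hypothetical names, TxLogId is assumed to be an integer alias, and the transactional aspects (uncommitted creates and deletes) are ignored here.

```rust
use std::collections::{BTreeSet, HashMap};

// Assumption: the real TxLogId is an integer alias (via RawLogId).
type TxLogId = u64;

/// Hypothetical per-bucket index that keeps log IDs ordered, so the
/// largest ID is available without listing and scanning the bucket.
#[derive(Default)]
pub struct BucketIndex {
    buckets: HashMap<String, BTreeSet<TxLogId>>,
}

impl BucketIndex {
    /// Records that `log_id` belongs to `bucket`.
    pub fn insert(&mut self, bucket: &str, log_id: TxLogId) {
        self.buckets
            .entry(bucket.to_string())
            .or_default()
            .insert(log_id);
    }

    /// Forgets `log_id`; drops the bucket entry once it becomes empty.
    pub fn remove(&mut self, bucket: &str, log_id: TxLogId) {
        let now_empty = match self.buckets.get_mut(bucket) {
            Some(ids) => {
                ids.remove(&log_id);
                ids.is_empty()
            }
            None => false,
        };
        if now_empty {
            self.buckets.remove(bucket);
        }
    }

    /// Returns the largest log ID in `bucket`, if the bucket has any logs.
    pub fn max_log_id(&self, bucket: &str) -> Option<TxLogId> {
        self.buckets.get(bucket)?.iter().next_back().copied()
    }
}
```

With an ordered set, the maximum stays correct after deletions, which a single cached max_log_id field would not guarantee without a rescan.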

cqs21 · 2023-11-24T08:30:51Z

src/layers/3-log/tx_log.rs

        self.do_contain_log(log_id, &state, &current_tx)
    }

    fn do_contain_log(&self, log_id: TxLogId, state: &State, current_tx: &CurrentTx<'_>) -> bool {
-        if state.contains_log(log_id) {
+        if state.persistent.contains_log(log_id) {
            let not_deleted = current_tx
                .data_with(|store_edit: &TxLogStoreEdit| !store_edit.is_log_deleted(log_id));
            not_deleted


Avoid negated names. Also, not_deleted and is_created seem redundant; returning the result of the function call directly is enough.
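For illustration, the function could take roughly the following shape. This is a sketch with stand-in types: PersistentState and StoreEdit are hypothetical simplifications, is_log_created is assumed to exist on the edit type, and the CurrentTx plumbing (data_with) is left out.

```rust
use std::collections::BTreeSet;

// Assumption: the real TxLogId is an integer alias.
type TxLogId = u64;

// Hypothetical stand-in for the persistent part of the store state.
#[derive(Default)]
struct PersistentState {
    logs: BTreeSet<TxLogId>,
}

// Hypothetical stand-in for the per-TX edit (TxLogStoreEdit in the PR).
#[derive(Default)]
struct StoreEdit {
    created: BTreeSet<TxLogId>,
    deleted: BTreeSet<TxLogId>,
}

impl PersistentState {
    fn contains_log(&self, log_id: TxLogId) -> bool {
        self.logs.contains(&log_id)
    }
}

impl StoreEdit {
    fn is_log_deleted(&self, log_id: TxLogId) -> bool {
        self.deleted.contains(&log_id)
    }

    fn is_log_created(&self, log_id: TxLogId) -> bool {
        self.created.contains(&log_id)
    }
}

/// Each branch returns the call result directly, with no intermediate
/// `not_deleted` / `is_created` bindings.
fn contains_log(persistent: &PersistentState, edit: &StoreEdit, log_id: TxLogId) -> bool {
    if persistent.contains_log(log_id) {
        !edit.is_log_deleted(log_id)
    } else {
        edit.is_log_created(log_id)
    }
}
```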

cqs21 · 2023-11-24T08:35:40Z

src/layers/3-log/tx_log.rs

 }

 /// A transactional log.
 #[derive(Clone)]
 pub struct TxLog<D> {
    inner_log: Arc<TxLogInner<D>>,
+    tx_provider: Arc<TxProvider>,
    can_append: bool,


RawLog also has a field can_append. Do they have the same meaning? Can we just keep one of them?

tatetian

LGTM. Thanks for the contribution!

lucassong-mh force-pushed the dev-songsw-3_log branch 2 times, most recently from 9ecf5ff to 00ac9ee Compare September 26, 2023 11:44

lucassong-mh force-pushed the dev-songsw-3_log branch 3 times, most recently from 4b91ed8 to 8c9dd5e Compare October 13, 2023 05:17

lucassong-mh force-pushed the dev-songsw-3_log branch 2 times, most recently from 17dc4f8 to fe3e702 Compare November 16, 2023 09:31

lucassong-mh marked this pull request as ready for review November 16, 2023 09:31

tatetian reviewed Nov 22, 2023

cqs21 reviewed Nov 22, 2023

lucassong-mh force-pushed the dev-songsw-3_log branch from fe3e702 to c0e0c8d Compare November 22, 2023 15:44

Add layer 3-log: chunk, raw_log and tx_log

6f19034

lucassong-mh force-pushed the dev-songsw-3_log branch from c0e0c8d to 6f19034 Compare November 24, 2023 04:23

cqs21 reviewed Nov 24, 2023

lucassong-mh mentioned this pull request Nov 24, 2023

Add compilable dm_sworndisk module (dummy) #13

Merged

tatetian approved these changes Dec 14, 2023

tatetian merged commit 64f76c2 into asterinas:main Dec 14, 2023
1 check passed
