Feature/recover whole path #8120

asdacap · 2025-01-29T07:54:17Z

The current trie recovery recovers one node, then retries the operation without recovery.
- This means it does not recover when more than one node in a path is missing.
This PR changes that by trying to recover the whole path.
- When a snap peer is available, it requests the account range or storage range for the full path and attempts to assemble the needed nodes from the range proof.
- This seems to be about 4 times faster than assembling the path one node at a time.
When combined with additional code recovery (not included here), this allows a block to be executed with nodes fetched entirely from the network. Obviously it's going to be terrible, but how terrible? The first block takes a few minutes: it can be 3 minutes, it can be 15 minutes, depending on how close the server is. After that, it can take 10 to 30 seconds per block. The lowest I've seen is about 5 seconds per block. It does not reliably speed up enough to escape the snap 128-block state range.

Types of changes

What types of changes does your code introduce?

New feature (a non-breaking change that adds functionality)
Optimization

Testing

Requires testing

Yes
No

If yes, did you write tests?

Yes
No

Notes on testing

Tested with a hacked state sync that only saves the root.

LukaszRozmej

Love the idea!

LukaszRozmej · 2025-01-30T09:10:46Z

src/Nethermind/Nethermind.Core/Utils/AutoCancelTokenSource.cs

+/// Automatically cancel and dispose underlying cancellation token source.
+/// Make it easy to have golang style defer cancel pattern.
+/// </summary>
+public readonly struct AutoCancelTokenSource(CancellationTokenSource cancellationTokenSource) : IDisposable


why not derive from CancellationTokenSource?

I prefer not to use inheritance.

I somewhat agree, but it IS a CancellationTokenSource, so it makes some sense
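
For context, a minimal sketch of how such a wrapper could work. Only the struct declaration comes from the diff above; the `Token` property, the body of `Dispose`, and the `AutoCancelExample` helper are assumptions for illustration, not the PR's actual code.

```csharp
using System;
using System.Threading;

// Hypothetical sketch: only the declaration line is taken from the diff.
public readonly struct AutoCancelTokenSource(CancellationTokenSource cancellationTokenSource) : IDisposable
{
    // Assumed accessor so callers can pass the token on.
    public CancellationToken Token => cancellationTokenSource.Token;

    // Cancel first so anything still observing the token stops,
    // then release the underlying source.
    public void Dispose()
    {
        cancellationTokenSource.Cancel();
        cancellationTokenSource.Dispose();
    }
}

public static class AutoCancelExample
{
    // Reports whether the token was cancelled once the using scope ended.
    public static bool CancelledAfterScope()
    {
        CancellationToken token;
        using (AutoCancelTokenSource cts = new(new CancellationTokenSource()))
        {
            token = cts.Token;
            // ... start work that observes token ...
        }
        return token.IsCancellationRequested;
    }
}
```

Wrapping the source by composition keeps `using` as the single exit point, which is the Go-style `defer cancel()` behaviour the doc comment describes. Deriving from `CancellationTokenSource` would instead mean overriding `Dispose(bool)` to add the cancel step.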

LukaszRozmej · 2025-01-30T09:22:50Z

src/Nethermind/Nethermind.Core/Tasks/WaitForPassingTasks.cs

+    /// <param name="tasks"></param>
+    /// <typeparam name="T"></typeparam>
+    /// <returns></returns>
+    public static async Task<T> ForPassingTask<T>(Func<T, bool> cond, params IEnumerable<Task<T>> tasks)


Can we make it span?
LINQ-style naming? AnyWhere or just Any. In LINQ it has an optional parameter: https://learn.microsoft.com/en-us/dotnet/api/system.linq.enumerable.any?view=net-8.0#system-linq-enumerable-any-1(system-collections-generic-ienumerable((-0))-system-func((-0-system-boolean)))

Suggested change

public static async Task<T> ForPassingTask<T>(Func<T, bool> cond, params IEnumerable<Task<T>> tasks)

public static async Task<T> AnyWhere<T>(Func<T, bool> predicate, params ReadOnlySpan<Task<T>> tasks)

Because it's async code (a span can't be a parameter of an async method).

OK, the naming comment still stands.

LukaszRozmej · 2025-01-30T09:25:24Z

src/Nethermind/Nethermind.Core/Tasks/WaitForPassingTasks.cs

+            if (!taskSet.Any())
+            {
+                // No more tasks, just return the last one.
+                return result;


Shouldn't we return null, or just a completed task with the default value?

Hmmm... don't know.

Well, it does make nullability simple.
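
To make the behaviour under discussion concrete, here is a hedged sketch of a helper with the same shape. The `WaitSketch` class name and the loop body are assumptions; only the method name, the `cond` parameter, and the "return the last one" fallback come from the diff. The `params` modifier is dropped to keep the sketch independent of the compiler version.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class WaitSketch
{
    // Assumed behaviour: return the first completed result that satisfies cond;
    // if no result does, return the last result that completed.
    public static async Task<T> ForPassingTask<T>(Func<T, bool> cond, IEnumerable<Task<T>> tasks)
    {
        HashSet<Task<T>> taskSet = tasks.ToHashSet();
        T result = default!;

        while (taskSet.Count > 0)
        {
            Task<T> finished = await Task.WhenAny(taskSet);
            taskSet.Remove(finished);

            result = await finished; // rethrows if the task faulted
            if (cond(result)) return result;
        }

        // No more tasks: fall back to the last result seen.
        return result;
    }
}
```

Returning the last result rather than null means the caller always gets a `T` and makes a single check on it, which is the nullability point made in the thread. Note that in this sketch an empty input yields `default`.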

LukaszRozmej · 2025-01-30T09:26:41Z

src/Nethermind/Nethermind.Network.Stats/NodeStatsLight.cs

@@ -204,7 +204,7 @@ private void UpdateValue(ref decimal? currentValue, decimal newValue)
    {
        return (long?)(transferSpeedType switch
        {
-            TransferSpeedType.Latency => _averageLatency,
+            TransferSpeedType.Latency => _averageLatency ?? 10000,


const from some timeout? int.MaxValue?

LukaszRozmej · 2025-01-30T09:30:29Z

src/Nethermind/Nethermind.Synchronization/Peers/ISyncPeerPool.cs

+                cancellationToken);
+        }
+
+        public static async Task<T> AllocateAndRun2<T>(


confused about the naming, can we just name it AllocateAndRun?

AllocateAndRun is already used.

but has different parameters?

LukaszRozmej · 2025-01-30T09:32:42Z

src/Nethermind/Nethermind.Synchronization/Trie/NodeDataRecovery.cs

+                nodeRlp = await FetchRlp(rootHash, address, currentPath, currentHash, cts.Token);
+            }
+
+            if (nodeRlp == null)


LukaszRozmej · 2025-01-30T09:33:34Z

src/Nethermind/Nethermind.Synchronization/Trie/NodeDataRecovery.cs

+            queryPath = newQueryPath;
+        }
+
+        Dictionary<TreePath, byte[]> recoveredNodes = new();


Does it make sense to be Dictionary? Wouldn't list suffice? (could use pooledList)

LukaszRozmej · 2025-01-30T09:34:09Z

src/Nethermind/Nethermind.Synchronization/Trie/NodeDataRecovery.cs

+                    return null;
+                }, NodePeerStrategy, AllocationContexts.State, cancellationToken);
+            })
+            .ToArray();


ToPooledList instead of array

LukaszRozmej · 2025-01-30T09:36:09Z

src/Nethermind/Nethermind.Synchronization/Trie/SnapRangeRecovery.cs

+                            return null;
+                        }, SnapPeerStrategy, AllocationContexts.Snap, cts.Token);
+                })
+                .ToArray();


ToPooledList

LukaszRozmej · 2025-01-30T09:36:29Z

src/Nethermind/Nethermind.Synchronization/Trie/SnapRangeRecovery.cs

+        // Sometimes the start path for the missing node and the actual full path that the trie is working on is not the same.
+        // So we change the query to match the missing node path.
+        if (new TreePath(queryPath, 64).Truncate(startingPath.Length) != startingPath)
+        {
+            int remainingLength = 64 - startingPath.Length;
+            TreePath newQueryPath = startingPath;
+            for (int i = 0; i < remainingLength; i++)
+            {
+                newQueryPath.Append(0);
+            }
+            queryPath = newQueryPath.Path;
+        }


Same code as the other one?
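
If the two recovery classes do share this logic, it could live in one helper. The sketch below is illustrative only: it models paths as arrays of nibbles (values 0 to 15) as a stand-in for `TreePath`, and `QueryPathSketch` and `AlignQuery` are hypothetical names that do not appear in the PR.

```csharp
using System;

public static class QueryPathSketch
{
    public const int FullPathNibbles = 64;

    // queryPath is assumed to hold 64 nibbles and startingPath at most 64.
    // If the query is not under the missing node's path, replace it with
    // startingPath padded with zero nibbles up to the full length.
    public static byte[] AlignQuery(byte[] queryPath, byte[] startingPath)
    {
        bool sharesPrefix = queryPath.AsSpan(0, startingPath.Length)
            .SequenceEqual(startingPath.AsSpan());
        if (sharesPrefix) return queryPath; // already matches the missing node path

        byte[] aligned = new byte[FullPathNibbles]; // zero-filled by default
        startingPath.CopyTo(aligned, 0);
        return aligned;
    }
}
```

Both recovery paths could then call the same method instead of repeating the padding loop.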

asdacap added 15 commits January 28, 2025 12:31

Fix snap serve is on in test

5a6d98d

Fix hive

41067ed

Remove healing trie store and keep path in memory.

7ad5fa4

WIP

3e64e9f

Snap path recovery

819b3d2

Fix some test

c2e15f3

Fix test

a963b3f

Basic snap node recovery

4e238a9

Node data recovery

30dbd68

Fix snap server

aa543c2

More log

5c5471c

Fix tests

ba014ff

Slight cleanup

fbe9db0

Merge remote-tracking branch 'origin/master' into feature/recover-who…

b1d0505

…le-path

Whitespace

3aac29d

LukaszRozmej approved these changes Jan 30, 2025


asdacap added 2 commits January 30, 2025 19:37

Address comment

1767de1

To pooled list

fbea83a
