Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add sideloading functionality to neutrino #285

Draft
wants to merge 6 commits into
base: master
Choose a base branch
from

Conversation

Chinwendu20
Copy link
Contributor

@Chinwendu20 Chinwendu20 commented Aug 19, 2023

Related issue #70

This issue adds the sideloading functionality in neutrino. We want to be able to preload the block header and filter header store with headers before neutrino starts up. This optimizes speed.

Change summary:

  • Introduce a sideload package responsible for sideloading. It fetches headers from the source, validates them, and writes to the store.
  • Write an implementation for fetching block headers from a binary encoded source.
  • Decouple block header validation and storage from the block manager.
  • Decouple the process of finding the next and previous block header checkpoints from the block manager.
  • Include functionality in the neutrino chain service.

@Chinwendu20 Chinwendu20 force-pushed the side branch 2 times, most recently from 24a87f6 to 916eb57 Compare August 20, 2023 04:51
@positiveblue positiveblue self-requested a review August 22, 2023 01:35
Copy link

@positiveblue positiveblue left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks pretty nice! I think we are in the right track here 👍

I left some comments, many are about naming, styling and formatting but there are others about the feature logic that we need to address 🙏

sideload/binary.go Show resolved Hide resolved
sideload/binary.go Outdated Show resolved Hide resolved
sideload/binary.go Outdated Show resolved Hide resolved
sideload/binary.go Outdated Show resolved Hide resolved
sideload/binary.go Show resolved Hide resolved
blockmanager.go Outdated Show resolved Hide resolved
blockmanager.go Outdated Show resolved Hide resolved
blockmanager.go Outdated Show resolved Hide resolved
blockmanager.go Outdated Show resolved Hide resolved
blockmanager.go Outdated Show resolved Hide resolved
@Chinwendu20
Copy link
Contributor Author

Chinwendu20 commented Aug 22, 2023

Thanks @positiveblue , please I left comments on your review and pushed some changes in response to your feedback (the ones that I am clear on).

@Chinwendu20 Chinwendu20 force-pushed the side branch 3 times, most recently from 214a0e6 to 26ffe03 Compare August 24, 2023 15:34
@Chinwendu20
Copy link
Contributor Author

Hello @positiveblue, have you had the chance to look at this again. If there are no objections to my comment. I will just make fmt on this and include a description of the encoding and push.

@Roasbeef Roasbeef requested a review from ellemouton September 14, 2023 02:06
Copy link

@positiveblue positiveblue left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Second round done. I think it's taking shape 🥇 but there is still some things to address.

Currently unit test are not passing and the linter is also failing. You can check the github actions to get more details 🙏

Should we include the side loading for cfilters in this PR or you prefer to leave it for a next one?

chainDataLoader/binary.go Outdated Show resolved Hide resolved
chainDataLoader/binary.go Outdated Show resolved Hide resolved
chainDataLoader/binary.go Outdated Show resolved Hide resolved
chainDataLoader/binary.go Outdated Show resolved Hide resolved
chainDataLoader/binary.go Outdated Show resolved Hide resolved
blockmanager.go Outdated Show resolved Hide resolved
blockmanager.go Outdated Show resolved Hide resolved
blockmanager.go Outdated Show resolved Hide resolved
blockmanager.go Outdated Show resolved Hide resolved
blockmanager.go Outdated Show resolved Hide resolved
@Chinwendu20
Copy link
Contributor Author

Hello can anyone please run the workflow?

@Chinwendu20
Copy link
Contributor Author

Please can anyone help run the workflow?

@Chinwendu20
Copy link
Contributor Author

hmm, this works fine locally. I think I would just write a simple unit test instead of the integration test.

@Roasbeef
Copy link
Member

@Chinwendu20 have you run it with the race condition detector locally?

@Chinwendu20
Copy link
Contributor Author

Oh no I did not @Roasbeef but I have made some changes now and ran it with the race detector, hopefully, this works on the CI as well. Can anyone please help run the workflow?

@Chinwendu20
Copy link
Contributor Author

Oh okay I will just mock the reader interface and write a unit test now.

@Chinwendu20
Copy link
Contributor Author

Please can anyone help run the workflow

@Roasbeef Roasbeef requested review from Crypt-iQ and removed request for positiveblue October 26, 2023 17:23
Copy link
Member

@Roasbeef Roasbeef left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work so far on this feature!

Completed an initial review, with the following jumping out:

  • Style guide not closely adhered to (godoc string, 80 char columns, etc)
  • The main set of interfaces can be simplified.
  • The control flow can be simplified: if we have more headers than in the file, we can terminate. We should check the file before starting up any other sub-system. The file can just have the header be removed, then appended in one step to the end of headers.bin.
  • We can also use the side loader for the filter headers as well. I think this is where that generic param can actually be used.
  • No need for custom re-org logic, just load the file if it's relevant, then let the normal p2p logic handle reorgs if needed.


First 17 bytes holds the following information:

- First 8 bytes holds the height of the first block header in the binary file (start height).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we want to compress things a bit more, then we can use a varint here, so: https://github.com/lightningnetwork/lnd/blob/3b7cda9e8d3a493fd548077a6cd6d5b8fa4b76bb/tlv/varint.go#L15

The comment should also be wrapped to 80 character columns as dictated by our style guide. I think it can be compressed to just:

Each file has a header consisting of: firstHeight || lastHeight || chainType.

Also we can just use an integer for the chain type, so assign values 1-4 to: mainnet, testnet, simnet, regtest, etc.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I would look into this.

chaindataloader/binary.go Outdated Show resolved Hide resolved
chaindataloader/binary.go Outdated Show resolved Hide resolved
chaindataloader/binary.go Outdated Show resolved Hide resolved

if _, err := b.file.ReadAt(rawBlkHdr, b.offset+(headerfs.BlockHeaderSize*b.tracker)); err != nil {
if err == io.EOF {
return nil, ErrEndOfFile
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it can just pass thru io.EOF as normal?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

blockmanager.go Outdated Show resolved Hide resolved
blockmanager.go Outdated
return
default:
// Request header
header, headerErr := b.sideLoadBlkHdrReader.NextHeader()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should read batches of headers at a time. We can also verify batches of headers at a time.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just like it is done in handleheaders ? From what I understand even though we read in batch we would have to verify the headers one after the other.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes we can read them in as a batch, verify one by one (validity of header N depends on the validity of header N-1), then write as a batch.

blockmanager.go Outdated Show resolved Hide resolved
blockmanager.go Outdated

// Verify checkpoint only if verification is enabled.
if b.nextCheckpoint != nil &&
node.Height == b.nextCheckpoint.Height {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should feed all this through our normal header processing logic. Some refactoring might be needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean refactoring the handleheaders function?

Copy link
Contributor

@Crypt-iQ Crypt-iQ Oct 31, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For regular block headers, I think the sideloading should be in blockManager.Start() after the blockHandler goroutine has started. Then it can call handleHeadersMsg with the set of block headers. The headersMsg struct might need to be modified since there's no peer -- maybe a sideloading bool so we can error if things go wrong. This also gets rid of the code duplication

For filter headers, something similar can be done where the sideloading is in Start().

Copy link
Contributor Author

@Chinwendu20 Chinwendu20 Nov 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't place it after the blockhandler goroutine because we want to sideload before connecting to the network to fetch block headers.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO it's better to have this be distinct, as then we don't need to increase the set of responsibilities of the blockManager.

Logic similar to the following can validate all teh headers at oce:

	for height := uint32(1); height <= bestBlockHeight; height++ {
		if currHeader.PrevBlock != prevHeader.BlockHash() {
			return fmt.Errorf("block header at height %d does "+
				"not refrence previous hash %v", height,
				prevHeader.BlockHash().String())
		}

		parentHeaderCtx := newLightHeaderCtx(
			int32(height-1), &prevHeader, headerStore, nil,
		)

		skipDifficulty := blockchain.BFFastAdd
		err = blockchain.CheckBlockHeaderContext(
			&currHeader, parentHeaderCtx, skipDifficulty, chainCtx,
			true,
		)
		if err != nil {
			return fmt.Errorf("error checking block %d header "+
				"context: %w", height, err)
		}

		err = blockchain.CheckBlockHeaderSanity(
			&currHeader, chainCtx.params.PowLimit, timeSource,
			skipDifficulty,
		)
		if err != nil {
			return fmt.Errorf("error checking block %d header "+
				"sanity: %w", height, err)
		}

		// TODO: Validate checkpoint and filter headers.
		prevHeader = currHeader

		if height > 0 && height%100_000 == 0 {
			log.Debugf("Validated %d headers", height)
		}
	}

We'd then:

  1. Read a chunk of N headers (say 2k or so) from the side loader source.
  2. Validate them all at once.
  3. Write them all to disk directly.

Copy link
Contributor Author

@Chinwendu20 Chinwendu20 Dec 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, I thought we did not want to duplicate the functionality of the handleheaders function #285 (comment)

blockmanager.go Outdated Show resolved Hide resolved
@linden
Copy link

linden commented Oct 27, 2023

Hi @Chinwendu20, I'd love to implement this into a wallet I'm working on for Joltz Rewards (https://joltzrewards.com).

Let me know if you'd like any support in addressing this PR review. Happy to dedicate time towards it.

@Chinwendu20
Copy link
Contributor Author

Nice work so far on this feature!

Completed an initial review, with the following jumping out:

  • Style guide not closely adhered to (godoc string, 80 char columns, etc)
  • The main set of interfaces can be simplified.
  • The control flow can be simplified: if we have more headers than in the file, we can terminate. We should check the file before starting up any other sub-system. The file can just have the header be removed, then appended in one step to the end of headers.bin.
  • We can also use the side loader for the filter headers as well. I think this is where that generic param can actually be used.
  • No need for custom re-org logic, just load the file if it's relevant, then let the normal p2p logic handle reorgs if needed.

Thanks for the review left some comments and would implement changes that I am clear on.

@Chinwendu20 Chinwendu20 requested a review from Roasbeef February 15, 2024 22:32
@lightninglabs-deploy
Copy link

@Roasbeef: review reminder
@Chinwendu20, remember to re-request review from reviewers when ready

@Chinwendu20 Chinwendu20 marked this pull request as draft March 13, 2024 06:47
@Chinwendu20 Chinwendu20 force-pushed the side branch 3 times, most recently from ee234b0 to 1f87365 Compare March 13, 2024 11:01
@Chinwendu20
Copy link
Contributor Author

@ellemouton, you can review now, thank you so much. I have not yet included a commit to remove code that is now redundant in blockmanger. As well as updated the blockmanager to use the validator and writer that was created in this PR.
Please skip formatting comments for now.

I just want to know if it is in line with Laolu's comments concerning encapsulation and using interfaces in the chaindataloader package and if the design is okay in general.

Copy link
Member

@ellemouton ellemouton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Chinwendu20 - I think the overall idea of the various interfaces & abstractions along with the decoupling from the block manager look good 👍 would be good to get an opinion from @Roasbeef again though.

I think a good next step would be to spend some time getting the PR in a working & readable state. As is, it is hard to see what fits where just by going through the commits. So it is hard to give detailed feedback. As a reviewer, we want to be able to play around with the code. And then it is also easier to give feedback about whether or not an interface looks good.

Adding working tests would be awesome too cause that is often used by reviewers to et familiar with the changes & the reasons for various decisions.

Kudos for tackling this! This is quite a big project 🥇

@Chinwendu20 Chinwendu20 force-pushed the side branch 3 times, most recently from 4ce5ab9 to d5bed2f Compare March 25, 2024 05:21
@@ -39,7 +42,7 @@ var (
// btclog.LevelOff turns on log messages from the tests themselves as
// well. Keep in mind some log messages may not appear in order due to
// use of multiple query goroutines in the tests.
logLevel = btclog.LevelOff
logLevel = btclog.LevelInfo
syncTimeout = 30 * time.Second
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will revert this

SkipVerify: cfg.BlkHdrSideload.SkipVerify,
Chkpt: blockHdrChkptMgr,
SideloadRange: SideloadRange,
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please note that I have not modified the blockmanager yet, if my approach is okay, the PR that I will submit would not have these.

return curHeight + s.SideloadRange, nil
}
}

Copy link
Contributor Author

@Chinwendu20 Chinwendu20 Mar 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am thinking there should be nothing like sideload range and we just fetch from checkpoint to checkpoint just as the blockmanager currently does whether we are verifying or not.

Also this just returns the last height and header that should be fetched in the next fetch. I would fix the comment and name of function.


require.NoError(
t, tlv.WriteVarInt(encodedOsFile, c.StartHeight, &[8]byte{}),
)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would jut use one scratch buffer for these

// headerBufPool is a pool of bytes.Buffer that will be re-used by the various
// headerStore implementations to batch their header writes to disk. By
// utilizing this variable we can minimize the total number of allocations when
// writing headers to disk.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

copy pasta - ignore comment for now.

Ononiwu Maureen added 6 commits April 10, 2024 15:25
This commit introduces the `sideload` package,
designed to facilitate the sideloading of Bitcoin
blockchain headers from external sources.

Key components and changes:

- **Interfaces and Core Types**: Introduction of several
   interfaces and types such as `SourceType`, `dataType`,
   `dataSize`, `HeaderValidator`, `HeaderWriter`,
  `Checkpoints`, and `LoaderSource` to abstract the
   concepts of blockchain header validation, storage,
   and source management.

- **Loader Implementation**: The core of the sideload
    functionality is encapsulated in the `SideLoader`
    struct, which includes logic for header fetching,
    validation, and writing.

- **Binary Loader for Headers**: An implementation of the
LoaderSource interface for binary encoded headers is
included in this commit

Signed-off-by: Ononiwu Maureen <maureen.ononiwu@outlook.com>
This commit introduces a new `Checkpoints` structure
for managing block header checkpoints.

Motivation:
Decoupling the logic for finding next and previous
header checkpoints from the `blockmanager`,
facilitating sharing this functionality between the
`sideload` package and `blockmanager`, promoting
code reuse and consistency across the components.

Signed-off-by: Ononiwu Maureen <maureen.ononiwu@outlook.com>
This commit introduces a new structure to decouple
the process of validating `wire.BlockHeaders`
from the blockmanager.

Signed-off-by: Ononiwu Maureen <maureen.ononiwu@outlook.com>
This commit introduces a new structure to decouple
the process of writing `wire.BlockHeaders` to the
block header store from the blockmanager.

Signed-off-by: Ononiwu Maureen <maureen.ononiwu@outlook.com>
This commit adds the sideoading functionality to
neutrino's chainservice.

Signed-off-by: Ononiwu Maureen <maureen.ononiwu@outlook.com>
Signed-off-by: Ononiwu Maureen <maureen.ononiwu@outlook.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants