Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement v2 client GET functionality #972

Draft
wants to merge 11 commits into
base: master
Choose a base branch
from

Conversation

litt3
Copy link
Contributor

@litt3 litt3 commented Dec 9, 2024

Why are these changes needed?

This PR creates a new EigenDAClientV2, and implements GET functionality for the new client. Additional client functionality will be implemented in upcoming PRs.

Checks

  • I've made sure the tests are passing. Note that there might be a few flaky tests, in that case, please comment that they are not relevant.
  • I've checked the new test coverage and the coverage percentage didn't drop.
  • Testing Strategy
    • Unit tests
    • Integration tests
    • This PR is not tested :(

Signed-off-by: litt3 <[email protected]>
@litt3 litt3 self-assigned this Dec 9, 2024
Copy link
Contributor

@samlaf samlaf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First pass on the client code. Logic LGTM. Added a bunch of nit comments to make code cleaner.
Have not looked at tests yet. Can you try to clean up the linter errors, makes it hard to read on github.

api/clients/eigenda_client_v2.go Outdated Show resolved Hide resolved
// EigenDAClientV2 provides the ability to get blobs from the relay subsystem, and to send new blobs to the disperser.
type EigenDAClientV2 interface {
// GetBlob iteratively attempts to retrieve a given blob with key blobKey from the relays listed in the blobCertificate.
GetBlob(ctx context.Context, blobKey corev2.BlobKey, blobCertificate corev2.BlobCertificate) ([]byte, error)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we only need the BlobCertificate for the relay keys, I think we should just take the relayKeys slice directly?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we might need more data from the certificate for verification (though I'm not sure yet what is required). If it turns out we don't need the cert for anything else, I'll make the suggested change

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

makes sense! I'll keep this convo unresolved as a reminder for the next review when you commit the POST code

api/clients/eigenda_client_v2.go Outdated Show resolved Hide resolved
api/clients/eigenda_client_v2.go Outdated Show resolved Hide resolved
api/clients/eigenda_client_v2.go Outdated Show resolved Hide resolved
api/clients/eigenda_client_v2.go Outdated Show resolved Hide resolved
api/clients/eigenda_client_v2.go Outdated Show resolved Hide resolved
api/clients/eigenda_client_v2.go Outdated Show resolved Hide resolved
api/clients/eigenda_client_v2.go Outdated Show resolved Hide resolved
api/clients/eigenda_client_v2.go Outdated Show resolved Hide resolved
@litt3 litt3 mentioned this pull request Dec 10, 2024
5 tasks
Comment on lines 1 to 8
package clients

import (
"github.com/Layr-Labs/eigenda/api/clients/codecs"
)

// EigenDAClientConfigV2 contains configuration values for EigenDAClientV2
type EigenDAClientConfigV2 struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Starting this thread to discuss package structure. I personally don't like that all our clients are in the same package. I think I would prefer if each client was in its own separate package, because that allows to have cleaner names, and make sure to prevent clashes. For eg:

import "...clients/v2/disperser"
import edaclient ".../v2/eigenda"
func main() {
   dispClient := disperser.New(disperser.Config{...})
   edaclient.New(edaclient.Config{...}, dispClient)
}

is way cleaner than having long names all under clients package.

I'm fine if people prefer to keep all the clients in same namespace, but at least I think we should separate the v1 and v2 namespace. Thoughts? @ian-shim @epociask @bxue-l2 @jianoaix

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense to me. We are getting a lot "v2" in the names around.

indices[i], indices[j] = indices[j], indices[i]
})

// TODO (litt3): consider creating a utility which can deprioritize relays that fail to respond (or respond maliciously)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good to think about but probably pretty low priority until we have relays that we don't run ourselves

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this sort of TODO comment acceptable to leave in production code?

// An honest relay should never send a blob which cannot be decoded
decodedData, err := c.codec.DecodeBlob(data)
if err != nil {
c.log.Warn("error decoding blob", "blobKey", blobKey, "relayKey", relayKey, "error", err)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

knit - only warning in iteration logic that doesn't reference relay in core msg

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// EigenDAClientV2 provides the ability to get blobs from the relay subsystem, and to send new blobs to the disperser.
type EigenDAClientV2 struct {
log logging.Logger
// doesn't need to be cryptographically secure, as it's only used to distribute load across relays

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

out of scope for this PR but curious if we'd ever wanna let users define their own retrieval policies when communicating with relays

return decodedData, nil
}

return nil, fmt.Errorf("unable to retrieve blob from any relay")

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

knit - including the relay count in the error msg could be useful context when debugging

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

func createCodec(config *EigenDAClientConfigV2) (codecs.BlobCodec, error) {
lowLevelCodec, err := codecs.BlobEncodingVersionToCodec(config.PutBlobEncodingVersion)
if err != nil {
return nil, fmt.Errorf("create low level codec: %w", err)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

knit:

Suggested change
return nil, fmt.Errorf("create low level codec: %w", err)
return nil, fmt.Errorf("could not create low level codec: %w", err)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

knit^2: disagree here, been trying to enforce https://github.com/uber-go/guide/blob/master/style.md#error-wrapping

Comment on lines 90 to 91
// TODO: does this need a timeout?
data, err := c.relayClient.GetBlob(ctx, relayKey, blobKey)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes - grpc clients by default don't have a default timeout iiuc:

https://medium.com/geekculture/timeout-context-in-go-e88af0abd08d

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a timeout 88df865

Comment on lines 93 to 97
// if GetBlob returned an error, try calling a different relay
if err != nil {
c.log.Warn("blob couldn't be retrieved from relay", "blobKey", blobKey, "relayKey", relayKey, "error", err)
continue
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what about the circumstance where the error is transient and the # of relay keys == 1?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you suggesting that we have an additional timeout, during which the client repeatedly retries all relays?

I could implement this if it's the way we want to go- but I don't see how the case relay keys == 1 is unique?

continue
}

// An honest relay should never send a blob which cannot be decoded

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To expand these invariants: an honest relay should never send a blob which doesn't respect its polynomial commitments. The thing is though this check would get caught upstream (i.e, within proxy directly) and probably cause the request to fail. The proxy client would trigger a retry which would probably route to another relay.

this isn't a big problem rn and we can just document it somewhere for circle back sometime in the future.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any reason not to check this invariant here, included in this PR? Seems like it wouldn't be hard to add

@@ -0,0 +1,68 @@
package mock
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: filenames should be snakecase i.e. blob_codec.go

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed a48afb1


// GetBlob iteratively attempts to retrieve a given blob with key blobKey from the relays listed in the blobCertificate.
//
// The relays are attempted in random order.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The disperser implementation itself shuffles the relay keys, so attempting each relay in order is actually fine. But it's good that we're not assuming that relay keys aren't ordered in any particular way.

Aka do we need to hit all the relayKeys to get all the blobs, or any ONE should do?

Any one should do!

Comment on lines 75 to 79
var indices []int
// create a randomized array of indices, so that it isn't always the first relay in the list which gets hit
for i := 0; i < relayKeyCount; i++ {
indices = append(indices, i)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit:

indices := make([]int, relayKeyCount)
for i := 0; i < relayKeyCount; i++ {
  indices[i] = i
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I simplified even further, as suggested in this comment

a48afb1

for _, val := range indices {
relayKey := blobCertificate.RelayKeys[val]

// TODO: does this need a timeout?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a timeout 88df865

}

// GetCodec returns the codec the client uses for encoding and decoding blobs
func (c *EigenDAClientV2) GetCodec() codecs.BlobCodec {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we need this getter?

Copy link
Contributor Author

@litt3 litt3 Dec 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I carried it forward from v1 client - will remove as it may not be necessary for v2 (nevermind, see comment from Sam below)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we'll still need it. We use it in proxy to get the codec and IFFT blobs when computing commitments.

func (_m *BlobCodec) DecodeBlob(encodedData []byte) ([]byte, error) {
ret := _m.Called(encodedData)

if len(ret) == 0 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should it expect leng(ret) == 2 since you also access Get(1)?

Copy link
Contributor Author

@litt3 litt3 Dec 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe the reason to only check ==0 here is to yield a more informative panic in the specific case where no return value is specified. The call to Get(1) later contains a length check under the hood, and would just yield a different message.

(for context, this mock was autogenerated)

if rf, ok := ret.Get(0).(func([]byte) []byte); ok {
r0 = rf(encodedData)
} else {
if ret.Get(0) != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: control flow may be simplified as else if ret.Get(0) != nil { ... }

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The file was auto generated by mockery. Are you ok with leaving the auto generated code as is?

Let me know if you'd prefer I make these changes, and I will apply your suggestions.

r0 = rf(encodedData)
} else {
if ret.Get(0) != nil {
r0 = ret.Get(0).([]byte)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: may check the casting is done correctly r0, ok = ret.Get(0).([]byte)

// EigenDAClientConfigV2 contains configuration values for EigenDAClientV2
type EigenDAClientConfigV2 struct {
// The blob encoding version to use when writing blobs from the high level interface.
PutBlobEncodingVersion codecs.BlobEncodingVersion
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does "Put" in this naming mean?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was referring to PUT, as contrasted with GET.

But I think you're right, and it's not necessary. I will simplify the name to BlobEncodingVersion

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed in a48afb1

// This makes it possible to open points on the KZG commitment to prove that the field elements correspond to
// the commitment. With this mode disabled, you will need to supply the entire blob to perform a verification
// that any part of the data matches the KZG commitment.
DisablePointVerificationMode bool
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it better to define a enum for point and coeff modes for this param? That may be more readable, since these two modes are important information.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Enum defined a48afb1

relayKeyCount := len(blobCertificate.RelayKeys)

if relayKeyCount == 0 {
return nil, fmt.Errorf("relay key count is zero")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: may just errors.New("...") since there is not formatting in this error

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done a48afb1

return nil, fmt.Errorf("relay key count is zero")
}

var indices []int
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can just simplify it: indices := rand.Perm(relayKeyCount)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done a48afb1


// An honest relay should never send an empty blob
if len(data) == 0 {
c.log.Warn("blob received from relay had length 0", "blobKey", blobKey, "relayKey", relayKey, "error", err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The "err" is already made sure nil, no need to log it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed a48afb1

)

// EigenDAClientV2 provides the ability to get blobs from the relay subsystem, and to send new blobs to the disperser.
type EigenDAClientV2 struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you comment thread safety of this struct?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added comment stating the struct is not threadsafe a48afb1

type VerificationMode uint

const (
// TODO: write good docs here for IFFT and NoIFFT (I need to update my understanding to be able to write this)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks the "PointVerificationMode" field in config already documents what these two mean.

Actually, I am not entirely sure if we want to call out "how" (i.e. IFFT and then FFT) at the interface level. What about BlobFormat or something (point v.s. coeffs) that describes the blobs, not how they are processed underneath? cc @samlaf

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks the "PointVerificationMode" field in config already documents what these two mean.

There's a description there, but I'm personally a bit confused by what it says, so I want to clarify to be able to write a good description here.

What about BlobFormat or something

Makes sense, I'll defer to you and Sam for the proper name here

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants