Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replication with Patchwork fails #87

Closed
black-puppydog opened this issue Jan 20, 2024 · 6 comments
Closed

Replication with Patchwork fails #87

black-puppydog opened this issue Jan 20, 2024 · 6 comments

Comments

@black-puppydog
Copy link

On current main (79514e9) I am getting the following error when trying to replicate with a local patchwork instance:

[2024-01-20T09:55:26Z ERROR solar::actors::replication::ebt::replicator] EBT replicate handler failed: Validation(Json(Error("missing field key", line: 1, column: 117)))

In solar/src/actors/muxrpc/ebt.rs in the handler

// Attempt to deserialize bytes into vector clock hashmap.
// If the deserialization is successful, emit a 'received clock'
// event.
if let Ok(clock) = serde_json::from_slice(req) {

I added a bit of println magic and this is what came up:

[2024-01-20T10:25:16Z DEBUG solar::actors::replication::ebt::manager] Received EBT event message from broker
[2024-01-20T10:25:16Z DEBUG solar::actors::replication::ebt::replicator] Received RPC response: {"name":["tunnel","isRoom"],"args":[]}
[2024-01-20T10:25:16Z DEBUG solar::actors::replication::ebt::replicator] Received RPC response: Network(2, RpcRequest(Body { name: ["blobs", "createWants"], rpc_type: Source, args: Array [] }))
[2024-01-20T10:25:16Z DEBUG solar::actors::replication::ebt::replicator] Received RPC response: {"@Ac5Zx3J7UeAYWoyVX3RZtlADWwXETAW7WT0gcDtWDcQ=.ed25519":4,"@i5rWm+biEH39dffvNHVtr1ccaj3EjxJG538cl7BKMwM=.ed25519":2}
[2024-01-20T10:25:16Z ERROR solar::actors::replication::ebt::replicator] EBT replicate handler failed: Validation(Json(Error("missing field `key`", line: 1, column: 117)))
[2024-01-20T10:25:16Z DEBUG solar::actors::replication::ebt::manager] Received EBT event message from broker
@black-puppydog
Copy link
Author

Reverting to 4e3a06e (without ebt, but having kuska as a github dependency) solves this, but of course I'd love to use EBT 😁

@mycognosist
Copy link
Owner

@black-puppydog

Thanks for opening an issue! This is exactly the bug I was looking at this week 😅

Outbound replication from solar to Manyverse is working (haven't checked using Patchwork) but inbound replication is failing. As you see in the debug output, the error happens because solar is trying to deserialize the inbound request as a message but it is in fact a vector clock.

I believe this commit is at least part of the problem: ed447ee

Serves me right for trying to make things more efficient when they're not yet working perfectly! I'll work on this on Wednesday and keep you updated. I'm very much committed to getting EBT working reliably.

@black-puppydog
Copy link
Author

Ah great, then you already know much more. Saves me the work to bisect it 😅

@mycognosist
Copy link
Owner

mycognosist commented Jan 29, 2024

@black-puppydog

I started looking into this today and I am making some progress. Figured I'd share with you and anyone else who may read this at some point.

First I returned the vector clock deserialization code to the recv_rpc_response() method in the MUXRPC handler for EBT (I mentioned this in a previous comment in this issue). This did not fix anything.

Then I remembered that I could run Manyverse with EBT logging enabled. I then saw an error when the vector clock was received from solar: "no stream for incoming msg" (https://github.com/ssbc/packet-stream/blob/bb0493dd0eb6454ac675df78dfed20e994a4881f/index.js#L188). That allowed me to realise that the RpcType for the ebt_clock_res_send() method in kuska was incorrect (https://github.com/Kuska-ssb/ssb/blob/master/src/api/helper.rs#L220); should be Duplex and not Source. I made the change locally and on the first attempt Manyverse received the vector clock correctly and sent messages in response. Success! However, solar seems not to have received the messages.

Now Manyverse is not receiving the vector clock from solar, even though I haven't changed anything further 😅 So yeah, I'm making progress but I'm not quite there yet.

@mycognosist
Copy link
Owner

Haha ignore my previous comment...I was on the wrong track.

But I'm happy to report that I've got bidirectional replication working for the first time 😺

The short version: I was using the positive version of the request number when responding to Manyverse (ie. sending a message or a vector clock). The request numbers of MUXRPC responses need to be negative.

@mycognosist
Copy link
Owner

Fixed in: #88

Thanks again for opening the issue, @black-puppydog . You gave me some additional motivation to solve this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants