Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introducing an additional parameter for launcher run_client closures #2684

Closed
wants to merge 7 commits into from

Conversation

riesentoaster
Copy link
Contributor

Follow-up to #2676.

With Launcher::overcommit, CoreId is no longer necessarily unique. ClientId is generated in LLMP, therefore unique, and an immutable borrow of it is passed to the closure.

@tokatoka
Copy link
Member

tokatoka commented Nov 12, 2024

I don't think this is good.
ClientId will not stay the same if the fuzzer restarts. that will make this number useless
(for example, if it is just a varying number you can't do if (id == xxx) {// do this} else {// do that} inside the run_in_client harness

@tokatoka
Copy link
Member

tokatoka commented Nov 12, 2024

Also, just in general, we should not assume that every fuzzer has client id. it's a specific identifier to llmp communication, not something to distinguish fuzzer instance

@riesentoaster
Copy link
Contributor Author

That's fair, I haven't thought of that. I still feel like having an id there that uniquely identifies the client is a good idea. I guess in that case we're back to creating a second ClientId based on CoreId and the overcommit index.

@domenukk Opinions?

@tokatoka
Copy link
Member

tokatoka commented Nov 12, 2024

we're back to creating a second ClientId based on CoreId and the overcommit index.

yeah whatever number is fine. such as timestamp + pid or anything. just that this is not appropriate case where ClientId should be used

@riesentoaster
Copy link
Contributor Author

I think having something predictable is better, again because of restarts. If we have CoreId * overcommit_i + overcommit_i index, we're guaranteed to have the same numbers if the fuzzer is started with the same config and thus can resume based on files it created before shutdown.

@tokatoka
Copy link
Member

yeah it's good

@riesentoaster
Copy link
Contributor Author

And another downside is that this id is not going to match whatever the monitors display. That might lead to confusion.

@tokatoka
Copy link
Member

And another downside is that this id is not going to match whatever the monitors display. That might lead to confusion.

IIRC that monitor isn't even displaying the clientid

@riesentoaster
Copy link
Contributor Author

riesentoaster commented Nov 12, 2024

The basic implementation does (the number after the # is the sender id of the llmp message):

[UserStats   #1]  (GLOBAL) run time: 0h-0m-3s, clients: 1, corpus: 1, objectives: 0, executions: 0, exec/sec: 0.000, stability: 100.000%, edges_objective: 0.006%, edges: 0.011%
                 (CLIENT) corpus: 1, objectives: 0, executions: 0, exec/sec: 0.000, stability: 7/7 (100%), edges_objective: 4/65536 (0%), edges: 7/65536 (0%)
[Objective   #1]  (GLOBAL) run time: 0h-0m-3s, clients: 1, corpus: 1, objectives: 1, executions: 0, exec/sec: 0.000, stability: 100.000%, edges_objective: 0.006%, edges: 0.011%        

@tokatoka
Copy link
Member

tokatoka commented Nov 12, 2024

uh ok.
but you should not trust that ClientID number. it has LOOOOOOTs of problems. and i don't think that number is unique to each fuzzer process #1850 (comment)

@tokatoka
Copy link
Member

tokatoka commented Nov 12, 2024

I think the problem of varying client id after restart is a unique problem to centralized broker.
but anyway client id is unique to connection. so it should not be used for distinguishing fuzzers

@riesentoaster
Copy link
Contributor Author

How about we generate a new ClientId based on the core/overcommit but leave the getter functions for the LLMP ClientId on the managers, this way they are still accessible if anyone ever needs it, but the more obvious option is the "correct" one.

@tokatoka
Copy link
Member

tokatoka commented Nov 12, 2024

no... things are much more complex than that.

First,... what you call as Client ID is not just one.
There are 4 Client IDs per one broker-to-client connection.
One connection has one Sender and Receiver, and then broker -> client connection and client -> broker connetion are on differenct channel.
thus 2 x 2 = 4
and each of this sender and receiver has its own client id. now you should see that clientid is really the id of llmp_{sender, receiver}. it's not something unique to a fuzzer client.

If we do like you said there're alot of problems.
For example, you cannot set the client-id of the broker, as broker is not bound to any core, thus you cannot set any number to it.

So client id is the id that should only be used inside llmp. not for anything else.

@riesentoaster
Copy link
Contributor Author

riesentoaster commented Nov 12, 2024

The only interesting one of those 4 is the sender id in the client->broker direction, since this is what is printed in the monitors. I don't really need this information for my current projects, but I figured it'd be worth a suggestion since it's already built and it may be helpful for someone down the line.

I'm still curious about @domenukk's opinion though, since he's the one who initially suggested using the LLMP ClientId.

@tokatoka
Copy link
Member

tokatoka commented Nov 12, 2024

The only interesting one of those 4 is the sender id in the client->broker direction

Yeah, but i'm saying even if it is not interesting you have to put something for the llmp to work. and you can't put core/overcommimt there.

Also llmp is independent of Launcher. that means it'd be a problem if you don't use Launcher / core pinning

@domenukk
Copy link
Member

ClientId was supposed to be unique and survive reboots. That's literally what it's for :D
TIL about issues with centralized.
Maybe we should fix these?

@domenukk
Copy link
Member

IMHO
Launcher -> CoreId + Overcommit -> pick what features to use (provided on the commandline)
Anythign else should use ClientId and we should make sure that's unique.
Initially I also had a BrokerId for broker2broker, but maybe that didn't survive the C->Rust migration

@domenukk
Copy link
Member

That being said, we shouldn't add more parameters to the closure in any case. I suggested either adding it into state or extending CoreId to CoreDescription

@tokatoka
Copy link
Member

ClientId was supposed to be unique and survive reboots.

It's not unique for fuzzers. it's only unique for communication. As I said, each fuzzer-to-broker communication have 4 client ids assigned to each Sender and Receiver.

@domenukk
Copy link
Member

Yes and we shoudl pick one as fuzzer node id IMHO

@riesentoaster
Copy link
Contributor Author

So I guess we're basically back to #2676? Or do we want to integrate a LLMP id somewhere right now?

Personally, I think that in the long run, there should be one identifier per client/node that is globally unique and used throughout any user-facing interface, including launchers, monitors, and everything else. It may make sense to couple those with LLMP ids, since LLMP also needs unique ids per client, but if there are good reasons to do something else, that's fine too — in that case however it would likely need to be synced along with the messages. I don't know LLMP and the entire architecture nearly well enough to make a judgement call about this.

@tokatoka
Copy link
Member

tokatoka commented Nov 15, 2024

there should be one identifier per client/node that is globally unique and used throughout any user-facing interface, including launchers, monitors, and everything else.

Yeah, if you really mean "globally unique" then here you can't even use coreid as the identifier. you can run fuzzers on multiple machines and aggregate logs from others, and in that case, the core id is not a valid globally unique identifier

It may make sense to couple those with LLMP ids, since LLMP also needs unique ids per client

No. imo you shouldn't. "LLMP also needs unique ids per client", yeah, but it's not globally unique. it's only unique within it's sequence of connections. And this is what makes the logs from monitors "looks" unique. because the monitor is simply printing Broker's receiver's client id (which is unique) but if this assumption breaks, for example, if you have two brokers emitting the logs to stdout, nothing is unique anymore.

@tokatoka
Copy link
Member

tokatoka commented Nov 15, 2024

if you "just" want a globally unique id. then you should use uuid
https://docs.rs/uuid/latest/uuid/index.html
(and imo it is good to have this for the client)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants