Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feat: Ecall/Halt #258

Merged
merged 31 commits into from
Oct 7, 2024
Merged

Feat: Ecall/Halt #258

merged 31 commits into from
Oct 7, 2024

Conversation

kunxian-xia
Copy link
Collaborator

@kunxian-xia kunxian-xia commented Sep 21, 2024

Fixes #125.

  • Ecall instruction config
  • Ecall/Halt circuit.
  • Added public input in the prover framework and allows the circuit builder to access public inputs.

@kunxian-xia kunxian-xia marked this pull request as draft September 21, 2024 00:36
@kunxian-xia kunxian-xia changed the title [WIP] Feat: Ecall/ Halt Feat: Ecall/Halt Sep 24, 2024
@kunxian-xia kunxian-xia linked an issue Sep 24, 2024 that may be closed by this pull request
@kunxian-xia kunxian-xia marked this pull request as ready for review September 25, 2024 10:00
@kunxian-xia kunxian-xia requested a review from naure September 25, 2024 10:06
Copy link
Collaborator

@naure naure left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does not check the exit_code.

The VM looks for the exit code in register x10 / a0 and requires it to be 0 for success.

VMState.ecall()

While non-zero values are an error, and they cannot be proven. E.g. panic is halt(1).

ceno_rt panic

There are two options:

  1. Read the register CENO_PLATFORM.reg_arg0() with success value 0.
    … OR …
  2. redefine halt to halt_success. In ceno_rt, call it only for success, not for panic:
pub fn halt(exit_code: u32) -> ! {
    if exit_code == 0 {
        // SUCCESS
        unsafe {
            asm!(
                // Set the ecall code HALT_SUCCESS.
                "li t0, 0x0",
            );
            riscv::asm::ecall();
        }
    }
    #[allow(clippy::empty_loop)]
    loop {}
}

ceno_zkvm/src/instructions/riscv/ecall.rs Show resolved Hide resolved
ceno_zkvm/src/instructions/riscv/ecall/halt.rs Outdated Show resolved Hide resolved
ceno_zkvm/src/instructions/riscv/ecall_insn.rs Outdated Show resolved Hide resolved
cb,
[ECALL_HALT[0].into(), ECALL_HALT[1].into()],
None,
Some(0.into()),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We define PC=0 as invalid, and no other instruction can execute after that. I would document this requirement right here as follows:

assert!(!CENO_PLATFORM.can_execute(0), "PC=0 means the program is terminated.");

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick: we can define 0 as constant EXIT_PC to improve the readibility

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you want PC=0 to be invalid, I would suggest also adding 0x0000_0000 in the first 32 bits of memory. Just for consistency. (32 bits of all zeroes is Risc-V's way of specifying a guaranteed invalid instruction in every version of the instruction set.)

@kunxian-xia
Copy link
Collaborator Author

kunxian-xia commented Sep 25, 2024

As for exit_code, I plan to add it in the 2-tuple of global state (pc, ts) . That is,

  • the global state becomes a three-tuple (pc, ts, exit_code).
  • the global state is initialized to be (pc=PC_START, ts=0, exit_code=0); and only ecall/halt instruction will change exit_code to some non-zero value read from X10 register.

@naure
Copy link
Collaborator

naure commented Sep 25, 2024

That incurs a cost on every instruction then. There should be a better way.

For instance, initialize by writing state (pc_start, ts_start), and require a write of (pc=0, ts) to exist.

Or require the HaltCircuit to exist with exactly one instance.

@kunxian-xia
Copy link
Collaborator Author

That incurs a cost on every instruction then. There should be a better way.

The cost is negligible as we stores the rlc not the 3-tuple (or old 2-tuple).

@kunxian-xia
Copy link
Collaborator Author

For instance, initialize by writing state (pc_start, ts_start), and require a write of (pc=0, ts) to exist.

The problem with this approach is that we need to add a lookup record for program table. But we don't know that ecall/halt instruction's index/pc in the program.

@kunxian-xia
Copy link
Collaborator Author

kunxian-xia commented Sep 25, 2024

As a desired feature of zkvm prover/verifier: we should skip generating opcode proof if the number of instances = 1. #288.

@naure
Copy link
Collaborator

naure commented Sep 25, 2024

How do you plan on checking the presence of the code in the state? However that works, it could check that (pc=0, ts=0) or whatever special value. Same as checking (pc, ts, code=1).

@naure
Copy link
Collaborator

naure commented Sep 25, 2024

That is not part of this PR though, right? That would be issue #127

@naure
Copy link
Collaborator

naure commented Sep 25, 2024

Regarding non-zero exit codes: this does not need to be supported by the zkVM circuits. It is just impossible to generate a proof for an execution that did not terminate correctly.

Copy link
Collaborator

@naure naure left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That works now with the exit code!

@kunxian-xia
Copy link
Collaborator Author

Regarding non-zero exit codes: this does not need to be supported by the zkVM circuits. It is just impossible to generate a proof for an execution that did not terminate correctly.

No. For computation that ends with error we can detect it by exit_code. But for program that failed by misaligned memory address, that might be hard to prove.

@naure
Copy link
Collaborator

naure commented Sep 25, 2024

The cost is negligible as we stores the rlc not the 3-tuple (or old 2-tuple).

That is a lot of extended multiplications in the prover and verifier. I’d be interested to learn how much those weight though.

@kunxian-xia
Copy link
Collaborator Author

We assert that the value read from register x10 is equal to exit_code in the public inputs.

@kunxian-xia kunxian-xia requested a review from naure September 30, 2024 14:14
Copy link
Collaborator

@hero78119 hero78119 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

awesome job!! 🔥 just left reviews with few questions

ceno_zkvm/src/expression.rs Show resolved Hide resolved
ceno_zkvm/src/expression.rs Show resolved Hide resolved
ceno_zkvm/src/expression/monomial.rs Show resolved Hide resolved
cb,
[ECALL_HALT[0].into(), ECALL_HALT[1].into()],
None,
Some(0.into()),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick: we can define 0 as constant EXIT_PC to improve the readibility

@@ -66,18 +66,39 @@ pub struct ZKVMTableProof<E: ExtensionField, PCS: PolynomialCommitmentScheme<E>>
pub wits_opening_proof: PCS::Proof,
}

#[derive(Default, Clone, Debug)]
pub struct PublicValues<T: Default + Clone + Debug> {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A side topic: would we plan to separate public input into 2 kinds, one are hinted from prover, and the others are set from verifier directly? For those hinted from prover, we can have mark #[serde(skip)] to make proof size smaller

@matthiasgoergens
Copy link
Collaborator

matthiasgoergens commented Oct 3, 2024

A few notes on design:

Do we have use for exit codes that go beyond success / failure?

If not, we might want to consider using unimpl to abort execution. unimpl is the mnemonic for the instruction that consists of all 32 bits of 0. Fortunately, we don't actually have to add anything to the constraints: we already need constraints that forbid jumping to an illegal instruction.

For successful exit, ecall might be fine, but you can also consider using ebreak. That way, if you want to run your guest program hooked up to a debugger like gdb via (a slightly hacked up) qemu, for simple programs is will automatically do the right thing: fail for unimpl and halt for ebreak.

Making debugging simpler is one of the big lessons we learned from previously working on Mozak's zkVM.

Overloading ecall less, and using alternative means, might make our guest programs slightly more robust, as an attacker would have to work harder to change eg an IO system call into a successful exit. Or to change a program abort into an IO call or a successful exit.

Copy link
Collaborator

@hero78119 hero78119 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM and nice job! +1 for resolving other pending issues in later PR

@kunxian-xia kunxian-xia merged commit d3ea040 into master Oct 7, 2024
6 checks passed
@kunxian-xia kunxian-xia deleted the feat/ecall_halt branch October 7, 2024 06:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

ecall/halt
5 participants