Redesign named variables in FuzzIL #486

Open: saelo wants to merge 1 commit into main from named_variables

saelo (Collaborator) commented Dec 25, 2024

This change replaces the three operations LoadNamedVariable, StoreNamedVariable, and DefineNamedVariable with a single new operation: CreateNamedVariable. This operation now simply creates a new FuzzIL variable that will be assigned a specific identifier during lifting. Optionally, the named variable can be declared using any of the available variable declaration modes: global, var, let, or const.

Below is a small example of how CreateNamedVariable can be used:

// Make an existing named variable (e.g. a builtin) available
v0 <- CreateNamedVariable 'foo', declarationMode: .none
v1 <- LoadString 'bar'
Reassign v0, v1

// Declare a new named variable
v2 <- LoadInt 1337
v3 <- CreateNamedVariable 'baz', declarationMode: .var, v2
Update v3, .PostInc
v4 <- CreateNamedVariable 'print', declarationMode: .none
v5 <- CallFunction v4, v3

This will now lift to the following JavaScript code:

foo = "bar";
var baz = 1337;
baz++;
print(baz);
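
To make the mapping from the FuzzIL listing to the JavaScript above concrete, here is a toy lifter for just the operations used in that example. It is a sketch under loud assumptions: Fuzzilli's real lifter is written in Swift and works differently, and the instruction object shape (`op`, `output`, `inputs`, `name`, `declarationMode`) is invented for this illustration. The `global` declaration mode is left out because the example does not show how it lifts.

```javascript
// Toy model only, not Fuzzilli's lifter. The instruction shape is hypothetical.
function lift(program) {
    const names = new Map(); // FuzzIL variable number -> JavaScript expression text
    const lines = [];
    for (const instr of program) {
        const inputs = (instr.inputs || []).map((v) => names.get(v));
        switch (instr.op) {
        case "LoadInt":
            names.set(instr.output, String(instr.value));
            break;
        case "LoadString":
            names.set(instr.output, JSON.stringify(instr.value));
            break;
        case "CreateNamedVariable":
            // From here on, the output variable is referred to by its name.
            names.set(instr.output, instr.name);
            if (instr.declarationMode !== "none") {
                lines.push(`${instr.declarationMode} ${instr.name} = ${inputs[0]};`);
            }
            break;
        case "Reassign":
            lines.push(`${inputs[0]} = ${inputs[1]};`);
            break;
        case "Update":
            // Only the post-increment form from the example is modelled.
            lines.push(`${inputs[0]}++;`);
            break;
        case "CallFunction":
            // Emitted as a plain statement; the call's output variable is ignored.
            lines.push(`${inputs[0]}(${inputs.slice(1).join(", ")});`);
            break;
        default:
            throw new Error(`unsupported operation: ${instr.op}`);
        }
    }
    return lines.join("\n");
}

// The FuzzIL example from above, in the made-up object form.
const program = [
    { op: "CreateNamedVariable", output: 0, name: "foo", declarationMode: "none" },
    { op: "LoadString", output: 1, value: "bar" },
    { op: "Reassign", inputs: [0, 1] },
    { op: "LoadInt", output: 2, value: 1337 },
    { op: "CreateNamedVariable", output: 3, name: "baz", declarationMode: "var", inputs: [2] },
    { op: "Update", inputs: [3], operator: "PostInc" },
    { op: "CreateNamedVariable", output: 4, name: "print", declarationMode: "none" },
    { op: "CallFunction", output: 5, inputs: [4, 3] },
];

console.log(lift(program));
```

The point of the sketch is that a CreateNamedVariable with declarationMode `.none` emits no code of its own; it only binds a FuzzIL variable to an identifier that later instructions use.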

With this, we now have a single, flexible way of creating variables that have a specific name. We now use this:

  • For builtins, which are effectively just existing named variables in the global scope. This also makes it easy to overwrite builtins. As such, the LoadBuiltin operation was removed.
  • During code generation, to sometimes create variables with specific names in generated code (we use random property names).
  • In the compiler. This is the main user of named variables and this is where this change has the most impact: we now compile every variable declaration to a CreateNamedVariable operation. This makes it possible to correctly compile any code that relies on variable names, for example due to using eval, with statements, or similar constructs. See some of the added/modified tests for examples. The consequence is that compiled code will now often contain a lot of CreateNamedVariable operations. However, as these are now just regular FuzzIL variables, this change should not significantly affect the mutability of the programs. In the future, we could consider implementing a specific minimizer (that we could also run during corpus import) to remove unneeded CreateNamedVariable operations. However, it will likely be somewhat difficult to determine when such an operation is not needed.
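
As a minimal illustration of why code using eval depends on variable names, the JavaScript sketch below compares a function with a hand-renamed copy. The `renamed` function is a hypothetical stand-in for a compile-and-lift round trip that replaces the local's name with a generic identifier such as `v0`; it is not actual Fuzzilli output.

```javascript
// Original program: the string passed to eval refers to the local by name.
function original() {
    let counter = 41;
    return eval("counter + 1");
}

// Hypothetical round trip that renames the local to a generic identifier.
// The string handed to eval is opaque data and cannot be renamed with it.
function renamed() {
    let v0 = 41;
    try {
        return eval("counter + 1");
    } catch (e) {
        return e.name;
    }
}

console.log(original()); // 42
console.log(renamed());  // ReferenceError
```

Keeping the original identifier, as CreateNamedVariable does, avoids this class of breakage. The same reasoning applies to with statements, where identifier lookup also depends on the exact name.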

saelo (Collaborator, Author) commented Dec 25, 2024

Had some time on the train today, so I gave the idea from #474 (comment) a try. @carl-smith and @TobiasWienand, could you both take a look at this and see what you think? This should make named variables much more flexible and powerful, and I think it should make it easy to handle issues such as the one we're encountering in PR #474. But it's also a somewhat deep change: for example, it also removes LoadBuiltin.

saelo force-pushed the named_variables branch 4 times, most recently from eda09b2 to 3e40cbe (December 25, 2024 13:46)

This change replaces the three operations LoadNamedVariable,
StoreNamedVariable, and DefineNamedVariable with a single new operation:
CreateNamedVariable. This operation now simply creates a new FuzzIL
variable that will be assigned a specific identifier during lifting.
Optionally, the named variable can be declared using any of the
available variable declaration modes: global, var, let, or const.

Below is a small example of how CreateNamedVariable can be used:

   // Make an existing named variable (e.g. a builtin) available
   v0 <- CreateNamedVariable 'print', declarationMode: .none

   // Overwrite an existing named variable
   v1 <- CreateNamedVariable 'foo', declarationMode: .none
   v2 <- CallFunction v0, v1
   v3 <- LoadString 'bar'
   Reassign v1, v3

   // Declare a new named variable
   v4 <- CreateNamedVariable 'baz', declarationMode: .var, v1
   v5 <- LoadString 'bla'
   Update v4 '+' v5
   v6 <- CallFunction v0, v4

This will lift to JavaScript code similar to the following:

   print(foo);
   foo = "bar";
   var baz = foo;
   baz += "bla";
   print(baz);

With this, we now have a single, flexible way of creating variables that
have a specific name. We now use this:

* For builtins, which are effectively just existing named variables in
  the global scope. This also now makes it (easily) possible to
  overwrite builtins. As such, the LoadBuiltin operation was removed.
* During code generation, to sometimes create variables with specific
  names in generated code (we use random property names).
* In the compiler. This is the main user of named variables and this is
  where this change has the most impact: we now compile _every_
  variable declaration to a CreateNamedVariable operation. This now
  makes it possible to correctly compile any code that relies on
  variable names, for example due to using `eval`, with statements, or
  similar constructs. See some of the added/modified tests for examples.
  The consequence of this is that compiled code will now often have a
  lot of CreateNamedVariable operations. However, as these are now just
  regular FuzzIL variables, this change should not significantly affect
  mutability of the programs. In the future, we could consider
  implementing a specific minimizer (that we could also run during
  corpus import) to remove unneeded CreateNamedVariable operations.
  However, it will likely be somewhat difficult to determine when such
  an operation is not needed.

TobiasWienand (Contributor) commented Dec 28, 2024

Overall it looks really awesome :)
I noticed one thing, though: for-of loop variables lose their names and declaration kinds.

Before:

for (const a of ["a"]) {}
for (let b of ["b"]) {}
for (var c of ["c"]) {}

After:

for (const v2 of ["a"]) {}
for (const v5 of ["b"]) {}
for (const v8 of ["c"]) {}

Apart from that, my small test bench for variable scoping passed. I am currently working on a more comprehensive test bench.
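
For context on why the declaration kind of a for-of loop variable matters, the JavaScript sketch below shows how var, let, and const behave differently in that position. It illustrates plain language semantics only; the function names are made up for the example, and nothing here is Fuzzilli output.

```javascript
// var is function-scoped: the loop variable is still visible after the loop.
function varLeaks() {
    for (var c of ["c"]) {}
    return c;
}

// let is block-scoped but may be reassigned inside the loop body.
function letAllowsReassignment() {
    let result;
    for (let b of ["b"]) {
        b += "!";
        result = b;
    }
    return result;
}

// const rejects the same reassignment at runtime.
function constRejectsReassignment() {
    try {
        for (const b of ["b"]) {
            b += "!";
        }
    } catch (e) {
        return e.name;
    }
    return "no error";
}

console.log(varLeaks());                 // c
console.log(letAllowsReassignment());    // b!
console.log(constRejectsReassignment()); // TypeError
```

A program that reads a var loop variable after the loop, or reassigns a let loop variable in the body, would therefore change behavior if every loop variable were turned into const.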
