Explain why Copy is required for a type specified %parse-param #404

FranklinChen · 2023-05-20T00:56:15Z

It might be useful for the book to explain briefly why Copy is required for the type specified in %parse-param. This restriction has led me to have to pass in mutable state using a layer of indirection with &RefCell<State> and repeatedly using borrow_mut() in actions, so I wonder if this is really necessary.

The text was updated successfully, but these errors were encountered:

ratmice · 2023-05-20T01:29:25Z

It is theoretically possible that it could be relaxed such that %parse-param could allow of an &mut reference, but it is exceedingly difficult. With the main reason being the unique ownership of an &mut, the type for action functions needs to be changed to a type of function which cannot capture ownership of the &mut (so that when the reference passed to the action falls out of scope it can be passed to the next action). Rust can type such functions as an HRTB/Higher ranked lifetimes using for <'a>.

One of the things that makes this difficult may be the pattern matching involved in generated code, and the difference between FunctionParamPattern and Pattenrs.

Copy patterns are fairly easy to implement without actually parsing the %parse-param using the argument in the generated code verbatim. There were a lot of attempts to get mutable references working in #214
But I kind of ran out of steam on that, and the RefCell workaround.

ltratt · 2023-05-20T07:36:36Z

It would be nice if it supported &mut but as @ratmice said, I'm not sure how practical that is. One of the challenges of very-clever type systems is that it can be hard to experiment on existing codebases, and we've definitely encountered that challenge!

Another possibility for your case might be to use interior mutation: so you could pass a type T: Copy around which internally has a RefCell or whatnot that hides some of the horrors of borrow_mut. It's still not completely ideal, of course.

FranklinChen · 2023-05-20T22:13:36Z

I guess that lalrpop (which I was using before) doesn't have this problem for its parameter passing mechanics https://lalrpop.github.io/lalrpop/tutorial/009_state_parameter.html because it generates a parser struct for each exported rule, whereas lrpar generates static functions. In lalrpop, I passed mutable state using the directive

grammar<'extra>(state: &'extra mut Vec<u8>);

whereas for lrpar I am using

%parse-param state: &RefCell<Vec<u8>>

ltratt · 2023-05-20T22:16:44Z

To some extent, the current design reflects the fact that I always write parsers that bubble state up rather than mutate state: honestly, it didn't really occur to me to deal with mutable state! Could lrpar be adapted to deal with mutable state? I guess it probably could. I must admit that I don't think I'll be the person who does that though :/ Sorry, that's not a very satisfactory answer on my part!

FranklinChen · 2023-05-20T22:26:19Z

I personally prefer to program purely functionally, but for performance I've been collecting some things during parsing mutably, which is very cheap when using a Vec and simply pushing to it. I've considered passing everything upward instead, at the expense of basically manually threading state through everything, and using Rust doubly LinkedList to collect things recursively without paying a quadratic concatenation penalty. I haven't done that yet for comparative benchmarking purposes.

The other thing I want to do, where mutable state seems particularly sensible, is to catch semantic errors during parsing and log error objects into a Vec before doing recovery and going on, since I don't want to fail fast in parsing but want to generate as many errors as possible for the end user.

ratmice · 2023-05-20T23:51:22Z

I don't mind having another look at it, though I can't promise I'll be any more successful than last time I attempted to do so.
I think that the %parse-param work that made it's way in tree may have simplified things.

The primary difference being in these 2 commits, where we went from accepting multiple named arguments to a single named argument.
3298c50
#216

So, if we can in fact go with the simpler single named argument approach it may well avoid a lot of the difficulty I encountered in my previous attempt. As such I think there is reason to hope that much of the difficulty I'd previously encountered can be avoided.

ltratt · 2023-05-21T10:15:41Z

If it is doable, that would be great!

Cypher1 · 2024-09-28T20:20:17Z

This would be handy. It wasn't clear to me if this meant that the state data might need to deal with backtracking, though I had previously been sure it wasn't necessary.

My solution was the following

%start Expr
%parse-param ctx: *mut crate::parser::Ast
%%

Then in the code area:

fn with(ctx: *mut crate::parser::Ast) -> &'static mut Ast {
    // TODO: Replace with this something safer... but... for now
    // the context is guaranteed to live long enough and not be moved.
    unsafe {&mut (*ctx)}
 }

This let's actions use the ctx without a copy or a check, and makes the use of slab allocation simple enough.
I also found it convenient to implement Into for spans to shrink them to a smaller type (as I don't expect my language to need to handle any files > 2^16 bytes long).

     let op = with(ctx).add_op(Op{
          op: Symbol::Try,
          args: smallvec![$1?]
        },
        $span.into()
      );
     Ok(op)

ltratt · 2024-09-28T20:29:17Z

In general I'm leary about passing *mut around because you can easily violate Rust's semantics. Well, in a sense, I just lied: Rust doesn't yet have a proper semantics around derefing pointers! Stacked borrows and its successors are the current best guide but they're not yet formally blessed.

I do agree, though, that Copy is a tough restriction to deal with. Relaxing that restriction would be nice.

ratmice · 2024-09-28T21:49:18Z

The feeling I get is that we just can't have both &mut, and T where T: Copy using the same API, this is just a part of rusts Mutable XOR Shared semantics. So I think we would need to add additional, or different entry points for runtime parsers and compile-time parsers, and have separate entry points for params with copy bounds (which should also work for references which are Copy) and those with mut references.

So I'm not sure that we can really relax it in any way that could actually support both Copy and mutable. So much as either switch from Copy to mutable, or alternately provide separate entry points %parse_param_mut/parse_mut which would then generate either a parse function, or a parse_mut function (which would not be available when using RTParserBuilder probably.

So I think there are at least some decisions we can think about even before we try to relax that restriction which affect how we would want to approach it.

ltratt · 2024-09-29T16:42:41Z

I wonder if Clone as a bound would be possible? That would make it possible to e.g. pass things like Rc<RefCell<..>> in.

ratmice · 2024-09-29T20:05:01Z

I think it should be possible, we just need to add the explicit calls to clone, and change the bounds,
I think this would even be backwards compatible because the Copy trait inherits Clone here?

ltratt · 2024-09-29T21:31:31Z

I think this would even be backwards compatible because the Copy trait inherits Clone here?

I think you're right!

Maybe this isn't too difficult a change. I'm not sure I'll get to it very soon, though.

ratmice mentioned this issue Sep 30, 2024

Relax %parse-param bounds from Copy to Clone #471

Merged

3 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Explain why Copy is required for a type specified %parse-param #404

Explain why Copy is required for a type specified %parse-param #404

FranklinChen commented May 20, 2023

ratmice commented May 20, 2023

ltratt commented May 20, 2023

FranklinChen commented May 20, 2023

ltratt commented May 20, 2023

FranklinChen commented May 20, 2023

ratmice commented May 20, 2023

ltratt commented May 21, 2023

Cypher1 commented Sep 28, 2024 •

edited

Loading

ltratt commented Sep 28, 2024

ratmice commented Sep 28, 2024

ltratt commented Sep 29, 2024

ratmice commented Sep 29, 2024

ltratt commented Sep 29, 2024

Explain why Copy is required for a type specified %parse-param #404

Explain why Copy is required for a type specified %parse-param #404

Comments

FranklinChen commented May 20, 2023

ratmice commented May 20, 2023

ltratt commented May 20, 2023

FranklinChen commented May 20, 2023

ltratt commented May 20, 2023

FranklinChen commented May 20, 2023

ratmice commented May 20, 2023

ltratt commented May 21, 2023

Cypher1 commented Sep 28, 2024 • edited Loading

ltratt commented Sep 28, 2024

ratmice commented Sep 28, 2024

ltratt commented Sep 29, 2024

ratmice commented Sep 29, 2024

ltratt commented Sep 29, 2024

Cypher1 commented Sep 28, 2024 •

edited

Loading