More informative diagnostic messages for syntax errors in user input #35

Open
agentzh opened this issue Mar 29, 2015 · 4 comments

@agentzh
Contributor

agentzh commented Mar 29, 2015

Pegex has a wonderful automatic error reporting feature for syntax errors in user input. One further improvement would be an automatically generated hint like "expecting foo, bar, and baz instead", where foo, bar, and baz are the relevant rule names derived automatically from the grammar.

I know we already have the back-tick custom error message feature, but it quickly becomes a burden to add all the back-tick strings to large grammars.

Parse::RecDescent has limited support for such contextual hints. Hopefully Pegex can do even better than that :)
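As a rough illustration of the hint wording proposed in this comment, the sketch below joins a list of rule names into an "expecting ... instead" message. The helper name `expecting_hint` is hypothetical and is not part of Pegex or Parse::RecDescent; it only shows the string formatting, not how the rule names would be found.

```perl
use strict;
use warnings;

# Hypothetical helper (not a Pegex API): turn a list of rule names into
# a hint such as "expecting foo, bar, and baz instead".
sub expecting_hint {
    my @names = @_;
    return '' unless @names;
    my $list =
        @names == 1 ? $names[0]
      : @names == 2 ? "$names[0] and $names[1]"
      :               join(', ', @names[0 .. $#names - 1]) . ", and $names[-1]";
    return "expecting $list instead";
}

print expecting_hint(qw(foo bar baz)), "\n";
```

With the three names from the comment, this prints `expecting foo, bar, and baz instead`.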

@ingydotnet
Collaborator

@agentzh can you leave an example here? Maybe show a current error message you get, and then show what you would want it to be.

@agentzh
Contributor Author

agentzh commented Apr 1, 2015

@ingydotnet I'll use real examples to demonstrate the idea :)

First, I'll use Parse::RecDescent for comparison, because its error messages include contextual information based on the rule names. Consider the following minimal, standalone Perl script:

use Parse::RecDescent;

my $grammar = <<'_EOC_';
foo:
    first
    second
  | <error>

first: 'a'

second:
    bee
  | cee
  | <error>

bee: 'b'
cee: 'c'
_EOC_

Parse::RecDescent->new($grammar)->foo("ad");

Running the script gives:

       ERROR (line 1): Invalid second: Was expecting bee, or cee

       ERROR (line 1): Invalid foo: Was expecting second but found "d" instead

And consider the following Pegex version:

use Pegex;

my $grammar = <<'_EOC_';
foo:
  first
  second

first: 'a'

second:
  | bee
  | cee

bee: 'b'
cee: 'c'
_EOC_

pegex($grammar)->parse("ad")

It gives:

Error parsing Pegex document:
  msg:      Parse document failed for some reason
  line:     1
  column:   2
  context:  ad
             ^
  position: 1 (0 pre-lookahead)
 at a.pl line 18.

The Pegex version does give more detailed context about the position in the input string, but no context about the grammar being applied (such as the related rule names "second", "bee", and "cee"). Maybe such additional hints could go in the "msg:" part?
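One common way to collect such rule names is to remember the farthest input position at which any rule failed, together with the names of the rules that failed there. The sketch below applies that idea to a toy hand-written matcher for the same grammar. It is only an assumption about how a hint could be derived: the `%grammar` layout and the names `note_failure`, `match_rule`, and `parse_hint` are invented for illustration and are not Pegex internals.

```perl
use strict;
use warnings;

# Toy grammar mirroring the example above. This is a hypothetical data
# layout for illustration, not Pegex's compiled grammar format.
my %grammar = (
    foo    => [ seq => 'first', 'second' ],
    first  => [ lit => 'a' ],
    second => [ alt => 'bee', 'cee' ],
    bee    => [ lit => 'b' ],
    cee    => [ lit => 'c' ],
);

my ($farthest, @expected);

# Record that $rule failed at $pos, keeping only the rules that failed
# at the farthest position reached so far.
sub note_failure {
    my ($rule, $pos) = @_;
    return if $pos < $farthest;
    if ($pos > $farthest) { $farthest = $pos; @expected = () }
    push @expected, $rule unless grep { $_ eq $rule } @expected;
    return;
}

# Try to match $rule at $pos; return the new position, or undef on failure.
sub match_rule {
    my ($rule, $input, $pos) = @_;
    my ($kind, @args) = @{ $grammar{$rule} };
    if ($kind eq 'lit') {
        my $lit = $args[0];
        return $pos + length($lit)
            if substr($input, $pos, length($lit)) eq $lit;
        note_failure($rule, $pos);
        return;
    }
    if ($kind eq 'seq') {
        for my $sub (@args) {
            $pos = match_rule($sub, $input, $pos);
            return unless defined $pos;
        }
        return $pos;
    }
    # 'alt': the first alternative that matches wins.
    for my $sub (@args) {
        my $end = match_rule($sub, $input, $pos);
        return $end if defined $end;
    }
    return;
}

# Parse $input starting at rule $start. Returns undef on success,
# or a hint string describing the failure.
sub parse_hint {
    my ($start, $input) = @_;
    ($farthest, @expected) = (0);
    my $end = match_rule($start, $input, 0);
    return if defined($end) && $end == length($input);
    return sprintf('column %d: unexpected trailing input', $end + 1)
        if defined($end);
    return sprintf('column %d: expecting %s',
        $farthest + 1, join(' or ', @expected));
}

print parse_hint('foo', 'ad'), "\n";
```

For the input "ad" this prints `column 2: expecting bee or cee`, which lines up with the column already reported by Pegex. Reporting only the leaf rules that failed at the farthest position is a design choice; a real implementation might also want to name the enclosing rule ("second"), as Parse::RecDescent does.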

@ingydotnet
Collaborator

Wonderful example @agentzh !!

I should be able to do this fairly easily. :)

I might even be able to print the line number or GitHub URL of the grammar rule involved!

@agentzh
Contributor Author

agentzh commented Apr 4, 2015

@ingydotnet Woot!
