Scanner trims the last newline #58

Open
alexschneider opened this issue Mar 3, 2015 · 8 comments

Comments

@alexschneider
Owner

The last newline the scanner emits doesn't really convey any information to the parser or program, and we have to handle it specially in multiple locations in the parser to ensure that programs with and without a trailing newline both get parsed properly. Is there any issue with trimming the last token from the scanned tokens if it's a newline?

@rachelriv, asking you specifically because you worked on the scanner.

@rachelriv
Collaborator

According to our grammar, a block is a sequence of statements, each followed by a newline (with an optional return statement, also followed by a newline, at the end).

Program ::= Block
Block   ::= (Stmt newline)* (ReturnStmt newline)?

If we remove the final newline token, then the final statement doesn't fit our grammar.
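To make the point concrete, here is a minimal sketch of a recognizer for the Block rule above. The token names ("stmt", "return", "newline") and the function `matches_block` are hypothetical and only for illustration; they are not taken from this project's scanner or parser.

```python
# Hypothetical token kinds: "stmt", "return", "newline".
def matches_block(tokens):
    """Return True if tokens fit Block ::= (Stmt newline)* (ReturnStmt newline)?"""
    i = 0
    # (Stmt newline)*
    while i + 1 < len(tokens) and tokens[i] == "stmt" and tokens[i + 1] == "newline":
        i += 2
    # (ReturnStmt newline)?
    if i + 1 < len(tokens) and tokens[i] == "return" and tokens[i + 1] == "newline":
        i += 2
    # Every token must have been consumed for the block to match.
    return i == len(tokens)


with_newline = ["stmt", "newline", "stmt", "newline"]
trimmed = with_newline[:-1]  # final newline removed

print(matches_block(with_newline))  # True
print(matches_block(trimmed))       # False
```

With the final newline trimmed, the last statement is left without its terminator, so the rule no longer matches.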

@alexschneider
Owner Author

So what about adding a newline prior to the end of the file if it doesn't exist? That way we can assume it exists.
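One way this could look at the source-text level, as a rough sketch: normalize the input before it reaches the scanner. The function name `ensure_trailing_newline` is an assumption for illustration, not something in this repository.

```python
def ensure_trailing_newline(source):
    """Return source with a newline appended unless it already ends with one.

    Empty input is left alone so that an empty program stays empty.
    """
    if source == "" or source.endswith("\n"):
        return source
    return source + "\n"


program = "if exp:\n  xyz\nend"  # no newline before end of file
print(repr(ensure_trailing_newline(program)))  # 'if exp:\n  xyz\nend\n'
```

After this step the scanner and parser can assume every non-empty file ends with a newline.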

@rachelriv
Collaborator

Why would we do that?

@alexschneider
Owner Author

The alternative is to not parse files like this at all:

if exp:
  xyz
end<EOF token>

@rachelriv
Collaborator

Well, we are the ones adding in the EOF token. I'm really not sure what you are getting at.

@alexschneider
Owner Author

Some files don't end with a newline; the end of the file is signaled implicitly (by the operating system), which is how we know where the file ends. Though ending with a newline is best practice, not everyone puts one before the end of the file.

@rachelriv
Collaborator

I understand what you are saying now! Thanks for the explanation.

If you can think of an elegant way to fix this, go ahead and implement it and submit a PR. However, I think this issue should be low on our priority list. I'd really like to get some more tests and a fully working parser first!

@rtoal
Collaborator

rtoal commented Mar 4, 2015

Because your scripts are just lines of code, the need for the classic EOF token isn't really there. For files that are bracketed with, say, "program" and "end" (like Pascal), or that are allowed only one class, say, the EOF is important to ensure there is no additional source after the single syntactic structure allowed in the compilation unit. I believe in your case that emitting a newline when you hit the end of your stream will suffice. It would be a shame to put newline | eof everywhere in your grammar.

This is a great issue. Good find, Alex. Agree with Rachel that it can be postponed a bit. If emitting a newline at the end of file works for you, though, you can do it sooner.
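A minimal sketch of the "emit a newline at the end of the stream" idea, done at the token level. It assumes tokens are plain strings and that the newline token is the string "newline"; both are illustrative choices, and `with_final_newline` is a hypothetical helper, not this project's scanner.

```python
NEWLINE = "newline"  # assumed token kind for a line terminator


def with_final_newline(tokens):
    """Pass tokens through, then emit one NEWLINE if the stream did not end with one."""
    last = None
    for token in tokens:
        last = token
        yield token
    # An empty stream gets nothing; a stream already ending in NEWLINE is unchanged.
    if last is not None and last != NEWLINE:
        yield NEWLINE


# Tokens for a file that ends right after "end", with no trailing newline.
scanned = ["if", "id", ":", NEWLINE, "id", NEWLINE, "end"]
print(list(with_final_newline(scanned)))
# ['if', 'id', ':', 'newline', 'id', 'newline', 'end', 'newline']
```

With this wrapper the grammar can keep requiring a newline after every statement, without a newline | eof alternative in each rule.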


3 participants