Fix error messages for unterminated ( and { #3271

tenderlove · 2024-12-02T21:06:45Z

If we hit an EOF token, and the character before the EOF is a newline, we need to rewind the error to the newline so that error messages contain the correct newline.

Before this commit, we would report an error on the "line of the EOF" which would come right after the newline

https://bugs.ruby-lang.org/issues/20918

kddnewton · 2024-12-04T14:48:02Z

It looks like this works for this particular case, but there are many other ways to add errors to the list without going through this macro so I think I would like to solve it in another place. I think https://github.com/ruby/ruby/blob/f43585b02c3634ab9a4e54049b08e04ab1a640fd/prism_compile.c#L10242 is the place to fix it, which handles formatting the errors for stdout. That way it's all handled in one place regardless of how the errors are added to the parser struct so we don't have to worry about other entrypoints. I'll see if I can take a look at it this morning.

tenderlove · 2024-12-04T16:22:40Z

@kddnewton I don't think that's the right place to fix it. The way I understand that function is that it formats the errors for output. But in this case we're reporting the wrong line for the error. Consumers of the Ruby API for example, will get the wrong line number in their objects (if I understand this function correctly)

tenderlove · 2024-12-04T16:31:52Z

I tried adding a Ruby level test for this bug, but it seems like my fix doesn't do the right thing. 😭

tenderlove · 2024-12-04T19:44:27Z

Ok, I fixed it. I'm just bad at C.

tenderlove · 2024-12-04T19:56:02Z

It looks like this works for this particular case, but there are many other ways to add errors to the list without going through this macro so I think I would like to solve it in another place.

If there isn't one entry point when adding an error via token, then it feels like we should make that the case.

I can't imagine knowing to "rewind" without the additional information of "this is the eof of the file!". The token knows that we're at the end of the file (based on the token type), so it seems natural to only do this when there is an EOF token.

tenderlove · 2024-12-05T17:53:59Z

After thinking about this for a while, I've updated it to a fix I like better. I changed the tokenizer such that when it encounters an EOF, it checks whether or not the previous character was a newline and then changes that token's start position accordingly. This way we don't have to specifically check for an EOF token every time we add an error to the error list.

If we hit an EOF token, and the character before the EOF is a newline, we should make EOF token start at the previous newline. That way any errors reported will occur on that line. For example "foo(\n" should report an error on line 1 even though the EOF technically occurs on line 2. [Bug #20918] https://bugs.ruby-lang.org/issues/20918

kddnewton

That's a great solution!

paracycle approved these changes Dec 3, 2024

View reviewed changes

tenderlove force-pushed the eof-tok branch from 796220d to 0073b30 Compare December 4, 2024 19:43

tenderlove force-pushed the eof-tok branch from fa8bae4 to d6fd6d0 Compare December 5, 2024 17:52

tenderlove force-pushed the eof-tok branch from d6fd6d0 to 60bc43d Compare December 5, 2024 17:55

kddnewton approved these changes Dec 5, 2024

View reviewed changes

tenderlove merged commit f2a0fe6 into main Dec 5, 2024
55 of 56 checks passed

tenderlove deleted the eof-tok branch December 5, 2024 18:41

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Fix error messages for unterminated ( and { #3271

Fix error messages for unterminated ( and { #3271

Uh oh!

tenderlove commented Dec 2, 2024

Uh oh!

kddnewton commented Dec 4, 2024

Uh oh!

tenderlove commented Dec 4, 2024

Uh oh!

tenderlove commented Dec 4, 2024

Uh oh!

tenderlove commented Dec 4, 2024

Uh oh!

tenderlove commented Dec 4, 2024

Uh oh!

tenderlove commented Dec 5, 2024

Uh oh!

kddnewton left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Fix error messages for unterminated ( and { #3271

Fix error messages for unterminated ( and { #3271

Uh oh!

Conversation

tenderlove commented Dec 2, 2024

Uh oh!

kddnewton commented Dec 4, 2024

Uh oh!

tenderlove commented Dec 4, 2024

Uh oh!

tenderlove commented Dec 4, 2024

Uh oh!

tenderlove commented Dec 4, 2024

Uh oh!

tenderlove commented Dec 4, 2024

Uh oh!

tenderlove commented Dec 5, 2024

Uh oh!

kddnewton left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants