Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

unexpected token kDO #447

Closed
GBH opened this issue Feb 18, 2018 · 13 comments
Closed

unexpected token kDO #447

GBH opened this issue Feb 18, 2018 · 13 comments
Assignees
Labels

Comments

@GBH
Copy link

GBH commented Feb 18, 2018

With 2.5 release there seems to be a bug:

# test.rb
assert_difference [file_count, attachment_count], -1 do
end
$ ruby -vc test.rb
ruby 2.5.0p0 (2017-12-25 revision 61468) [x86_64-linux]
Syntax OK

$ ruby-parse _2.5.0.0_ test.rb
test.rb:1:54: error: unexpected token kDO
test.rb:1: assert_difference [file_count, attachment_count], -1 do
test.rb:1:                                                      ^~

$ ruby-parse _2.4.0.1_ test.rb
(block
  (send nil :assert_difference
    (array
      (send nil :file_count)
      (send nil :attachment_count))
    (int -1))
  (args) nil)
@whitequark
Copy link
Owner

@iliabylich I think you broke it, can you take a look?

@iliabylich
Copy link
Collaborator

@whitequark Sure, I'll check it today

@iliabylich
Copy link
Collaborator

iliabylich commented Feb 19, 2018

@whitequark The code m [] do end does a wrong cmdarg manipulation.

m [] do end
^ tIDENTIFIER "m"                               expr_cmdarg  [0 <= cond] [0 <= cmdarg]
cmdarg.beep("[")
cmdarg.push(0)
m [] do end
  ^ tLBRACK "["                                 expr_beg     [0 <= cond] [0 <= cmdarg]
cmdarg.beep("before command_args")
cmdarg.push(1)
cmdarg.beep("} | ) | ] rule")
cmdarg.pop
m [] do end
   ^ tRBRACK "]"                                expr_end     [0 <= cond] [0 <= cmdarg]
m [] do end
     ^~ kDO "do"                                expr_value   [0 <= cond] [0 <= cmdarg]
cmdarg.pop
cmdarg.beep("after command_args")
(fragment:0):1:6: error: unexpected token kDO
(fragment:0):1: m [] do end
(fragment:0):1:      ^~

It does push(0) (rbrack) -> push(1) (command args). So the next pop drops a command bit marker. Instead it should do push(1) -> push(0). (I guess now I understand the idea of lexpop).

MRI handles it by doing a lookahead here. It shuffles 1 and 0 in this case.

So the question is: how can we make the same lookahead from the parser? I see two solutions (and I don't like both of them):

  1. Get a token using lexer.advance, analyze it and push back to the lexer.token_queue. Ugly and also requires manual state (and probably other ivars) restoring.
  2. dup-ing a lexer and running advance on this copy.

What do you think?

EDIT: ignore it, next tokens are already processed by lookahead.

@iliabylich
Copy link
Collaborator

Is there a way to get a lookahead char like yychar in bison?

@whitequark
Copy link
Owner

Is there a way to get a lookahead char like yychar in bison?

Sure, use @p.

@iliabylich
Copy link
Collaborator

@whitequark In the parser, not in the lexer. racc does lookahead, is it possible to get that token?

@whitequark
Copy link
Owner

Not sure, maybe @racc_t?

@iliabylich
Copy link
Collaborator

There's no such ivar (and no ivars with tokens at all). I've implemented a naive approach in #448, but I expected racc to have something similar to yychar.

@whitequark
Copy link
Owner

Ah, that's only present in the pure-Ruby racc runtime, then.

@GBH
Copy link
Author

GBH commented Feb 24, 2018

I pulled in newest version and this is still an issue. Looking at the commit it seems like it only dealt with ruby 2.4 and 2.5. 2.2 and 2.3 wasn't fixed, I think.

@iliabylich
Copy link
Collaborator

@whitequark old parsers still use lexpop and all of them have a bug with wrong cmdarg value on command args handling (cmdarg stack is not active when lexer expects it to be active). I think we should completely drop it and use push/pop like we do in ruby24/25.y because branching this logic for 18..23 and 24..25 is painful. WDYT?

@iliabylich iliabylich reopened this Feb 24, 2018
@iliabylich
Copy link
Collaborator

Sorry, I was a bit wrong in my previous comment, cmdarg is (sometimes) invalid for 18..23, but it's fine. Lexer always does pop (instead of lexpop) on ], }, ), so the following patch makes it working:

       e_rbrace | e_rparen | ']'
       => {
         emit_table(PUNCTUATION)
-        @cond.lexpop; @cmdarg.pop
+        @cond.lexpop;
+        if @version >= 24
+          @cmdarg.pop
+        else
+          @cmdarg.lexpop
+        end

But I guess backporting proper cmdarg handling to parsers < 24 and removing lexpop is better, isn't it? (less branching and the logic becomes simpler)

@whitequark
Copy link
Owner

But I guess backporting proper cmdarg handling to parsers < 24 and removing lexpop is better, isn't it? (less branching and the logic becomes simpler)

You've already introduced 2 (or more, I didn't count) bugs in the latest parser related to this, what's to guarantee that this simplification wouldn't introduce more? Non-latest-version parsers don't get as much user testing and clearly this isn't covered by our test suite so I am not convinced this should be backported.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants