Fix memory leak while parsing improperly terminated `inherit`-expressions
#31
First of all, big thanks to @fufexan who helped me to reliably reproduce this.

Originally discovered in `rnix-lsp`[1], but I confirmed that `nixpkgs-fmt` is also affected.

Basically, when given an expression such as `let inherit`, the parser would wait for a `TOKEN_SEMICOLON` indefinitely. The actual problem, however, is that `self.parse_val()` always detects the SAME syntax error, i.e. "unexpected EOF". This error is written into `self.errors` indefinitely. `errors` is of type `Vec<ParseError>`, and a vector in Rust grows in an amortized fashion[2]: if an entry is pushed and the vector exceeds its currently allocated capacity, the capacity is roughly doubled (though the exact growth factor isn't constant).

This essentially means that the buffer keeps growing very quickly and, according to KDE heaptrack, my system allocated ~9.5GB after 20s while running some tests.

I added an exit condition to the loop traversing `inherit`-subexpressions to avoid that. Checking for an "unexpected EOF" is actually sufficient here:

* Either there is a `;` later in the expression, which terminates the loop and causes an actual "unexpected token" error.
* Otherwise, `parse_val` will go through the tokens looking for a matching semicolon (which does not exist) and then reach the end of the file. In that case, `unexpected EOF` is returned by `parse_val`.

[1] nix-community/rnix-lsp#33
[2] https://www.cs.cornell.edu/courses/cs3110/2011sp/Lectures/lec20-amortized/amortized.htm
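The exit condition described above can be sketched roughly as follows. This is a simplified, hypothetical model and not the actual rnix-parser code: the `Token` and `ParseError` enums, the `Parser` struct, and the `peek`, `parse_inherit` and `errors_for` helpers are invented for illustration, and a "value" is assumed to be a single token. Only the names `parse_val`, `errors` and the "unexpected EOF" error come from the description.

```rust
// Simplified, hypothetical model of the loop; these are NOT rnix-parser's real types.
#[derive(Debug, Clone, PartialEq)]
enum Token {
    Inherit,
    Ident(String),
    Semicolon,
}

#[derive(Debug, Clone, PartialEq)]
enum ParseError {
    UnexpectedEof,
}

struct Parser {
    tokens: Vec<Token>,
    pos: usize,
    errors: Vec<ParseError>,
}

impl Parser {
    fn new(tokens: Vec<Token>) -> Self {
        Parser { tokens, pos: 0, errors: Vec::new() }
    }

    fn peek(&self) -> Option<&Token> {
        self.tokens.get(self.pos)
    }

    // Consumes one token as a "value". At the end of the input nothing can be
    // consumed, so the same error is recorded on every call.
    fn parse_val(&mut self) -> Result<(), ParseError> {
        if self.peek().is_some() {
            self.pos += 1;
            Ok(())
        } else {
            self.errors.push(ParseError::UnexpectedEof);
            Err(ParseError::UnexpectedEof)
        }
    }

    // Parses everything between `inherit` and the terminating `;`.
    // Assumes the current token is the `inherit` keyword.
    fn parse_inherit(&mut self) {
        self.pos += 1; // skip the `inherit` keyword
        while self.peek() != Some(&Token::Semicolon) {
            // Exit condition: without this `break`, a missing `;` would make
            // the loop push `UnexpectedEof` into `errors` forever.
            if let Err(ParseError::UnexpectedEof) = self.parse_val() {
                break;
            }
        }
        // Consume the terminating `;` if there is one.
        if self.peek() == Some(&Token::Semicolon) {
            self.pos += 1;
        }
    }
}

// Hypothetical helper: run the parser and return the errors it recorded.
fn errors_for(tokens: Vec<Token>) -> Vec<ParseError> {
    let mut parser = Parser::new(tokens);
    parser.parse_inherit();
    parser.errors
}
```

In this model, an unterminated input such as `vec![Token::Inherit, Token::Ident("a".to_string())]` records exactly one `UnexpectedEof` and returns, whereas removing the `break` would make the loop spin and grow `errors` without bound.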
@fufexan btw would you mind building your rnix-lsp with this branch of
@Ma27 I would, I'm just not sure how to build it, as I haven't worked with rust/cargo before, so I've got no clue how to replace the default
Oh right, sorry! The thing is,
This should build rnix-parser, using this PR:
@fufexan did you have a chance to test this? :)
@Ma27 yes, all works fine. Haven't had a memleak since switching :)
This is great, and the code looks good to me. Thank you @Ma27!