Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

nestlevel: Fix user data alignment #3883

Merged
merged 1 commit into from
Dec 18, 2023

Conversation

b4n
Copy link
Member

@b4n b4n commented Dec 16, 2023

We need to align the user data properly not to trigger undefined behavior, which even apparently crashes on SPARC.

As NestingLevels::levels is actually a single allocation for all levels and their user data mapped as [NL0|UD0|NL1|UD1|...] (where NL is a NestingLevel, and UD a user data), we need to align twice, as we need every NL* and every UD* to align properly.

Here we align everything to 2*sizeof(size_t), which is a logic borrowed from GLib, which seems to have borrowed the value from glibc. This is pretty conservative in our case, because actually NL*s only need aligning to int's requirements currently, which on some architectures is 4, not 16; but it's trickier to implement (and actually impossible with the current API) as we'd need to compute the actual alignment for each level taking into account it's position in the overall memory region to still align UD*s to a conservative value.
Also, having all NL+UD group at the same size makes things a bit simpler for debugging, I guess.

We make sure to only add alignment padding manually for cases where there's actually some user data, not to waste memory needlessly for the common case where sizeof(UD) is 0, and thus where we can merely align to sizeof(NL) -- which C does for us already.

Note that currently only the Ruby parser is affected, as it's the only current consumer of nesting level user data.

Fixes #3881.

We need to align the user data properly not to trigger undefined
behavior, which even apparently crashes on SPARC.

As `NestingLevels::levels` is actually a single allocation for all
levels and their user data mapped as `[NL0|UD0|NL1|UD1|...]` (where NL
is a NestingLevel, and UD a user data), we need to align twice, as we
need every `NL*` and every `UD*` to align properly.

Here we align everything to `2*sizeof(size_t)`, which is a logic
borrowed from GLib, which seems to have borrowed the value from glibc.
This is pretty conservative in our case, because actually `NL*`s only
need aligning to `int`'s requirements currently, which on some
architectures is 4, not 16; but it's trickier to implement (and
actually impossible with the current API) as we'd need to compute the
actual alignment for each level taking into account it's position in
the overall memory region to still align `UD*`s to a conservative
value.
Also, having all NL+UD group at the same size makes things a bit
simpler for debugging, I guess.

We make sure to only add alignment padding manually for cases where
there's actually some user data, not to waste memory needlessly for the
common case where `sizeof(UD)` is 0, and thus where we can merely
align to `sizeof(NL)` -- which C does for us already.

Note that currently only the Ruby parser is affected, as it's the only
current consumer of nesting level user data.

Fixes universal-ctags#3881.
Copy link

codecov bot commented Dec 16, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (aa9e264) 85.25% compared to head (e6bc697) 85.25%.
Report is 11 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #3883   +/-   ##
=======================================
  Coverage   85.25%   85.25%           
=======================================
  Files         230      230           
  Lines       55502    55502           
=======================================
  Hits        47320    47320           
  Misses       8182     8182           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@masatake masatake added this to the 6.1 milestone Dec 16, 2023
@masatake masatake merged commit 39e3105 into universal-ctags:master Dec 18, 2023
42 checks passed
@masatake
Copy link
Member

Thank you.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Unaligned access UB in Ruby parser
2 participants