Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bootstrap: add quoting support to avoid splitting #132635

Merged
merged 1 commit into from
Nov 7, 2024

Conversation

kiike
Copy link
Contributor

@kiike kiike commented Nov 5, 2024

With this change, it is now possible to pass quotes to the configure script, such as

./configure.py --set=target.\"thumbv8m.main-none-eabi\".linker=/linker
or
./configure.py '--set=target."thumbv8m.main-none-eabi".linker=/linker'

, which will treat thumbv8.main-none-eabi as a whole part. Currently, the string would be split into two elements: thumbv8, and main-none-eabi.

The approach taken is to perform custom splitting instead of using str.split() and then repairing the split. Also, There are numerous corner cases not handled: the custom split doesn't differentiate between single quotes or double quotes, so it is perfectly possible to pass ./configure.py --set=target.\"thumbv8m.main-none-eabi\'.linker=/linker and the behaviour would be the same as with all double quotes or single quotes.

As for the code, i'm unsure on whether to delimit strings with double or single quotes. I've seen both single quotes and double quotes used to delimit strings, like in

err("Option '{}' provided more than once".format(key))

and this a handful of lines down:

if option.name == 'sccache':
    set('llvm.ccache', 'sccache', config)

Please advise on the wanted one.

Fixes #130602

r? @onur-ozkan

Thanks in advance for the feedback!

@rustbot
Copy link
Collaborator

rustbot commented Nov 5, 2024

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @onur-ozkan (or someone else) some time within the next two weeks.

Please see the contribution instructions for more information. Namely, in order to ensure the minimum review times lag, PR authors and assigned reviewers should ensure that the review label (S-waiting-on-review and S-waiting-on-author) stays updated, invoking these commands when appropriate:

  • @rustbot author: the review is finished, PR author should check the comments and take action accordingly
  • @rustbot review: the author is ready for a review, this PR will be queued again in the reviewer's queue

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-bootstrap Relevant to the bootstrap subteam: Rust's build system (x.py and src/bootstrap) labels Nov 5, 2024
@jieyouxu
Copy link
Member

jieyouxu commented Nov 5, 2024

Can this use something like https://docs.python.org/3.10/library/shlex.html#shlex.split?

@onur-ozkan
Copy link
Member

Can this use something like https://docs.python.org/3.10/library/shlex.html#shlex.split?

Does that work for python2 aswell?

@jieyouxu
Copy link
Member

jieyouxu commented Nov 5, 2024

Does that work for python2 aswell?

AFAICT that exists for python2 https://docs.python.org/2.7/library/shlex.html as well

(Unrelated: do we really need to keep supporting python 2?)

@kiike
Copy link
Contributor Author

kiike commented Nov 5, 2024

Can this use something like https://docs.python.org/3.10/library/shlex.html#shlex.split?

Not as is. That splits on whitespace, right? We want to split on periods. If you want me to use shlex.split, I need to research how to modify its behaviour to split on periods, because right now it splits on whitespace with shell semantics.

@kiike kiike force-pushed the fix/dots_in_target branch 2 times, most recently from 9eae8f9 to 435e53e Compare November 5, 2024 18:38
@kiike
Copy link
Contributor Author

kiike commented Nov 5, 2024

AFAIUI, the splitting in shlex is done on whitespace, so I commited a change to split on periods. It was a lot easier than expected and now the splitting is more robust. I'm using posix=True so that quotes are stripped from the resulting split elements, and punctuation_chars=True to allow, for instance, hyphens inside keys, like in configure-args. I checked compatibility with Python2 and the punctuation_chars argument is not expected. This flag is also not compatible with Python versions older than 3.6. If bootstrapping is meant to be compatible with older versions of Python, I will need to investigate alternative ways. Let me know if so.

@onur-ozkan
Copy link
Member

Bootstrap should be compatible with python versions older than 3.6. Some distros (and possibly even some of our build runners) might still use very old python versions.

@kiike kiike force-pushed the fix/dots_in_target branch from 435e53e to fe04396 Compare November 6, 2024 08:55
@kiike
Copy link
Contributor Author

kiike commented Nov 6, 2024

Bootstrap should be compatible with python versions older than 3.6. Some distros (and possibly even some of our build runners) might still use very old python versions.

Got it. I've added the relevant word character to the last commit.

Copy link
Member

@onur-ozkan onur-ozkan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

Could you please squash your commits?

@onur-ozkan
Copy link
Member

@rustbot author

@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Nov 6, 2024
With this change, it is now possible to pass quotes to the configure
script, such as

`./configure.py --set=target.\"thumbv8m.main-none-eabi\".linker=/linker`

, which will treat `thumbv8.main-none-eabi` as a whole part. Currently,
the string would be split into two elements: `thumbv8`, and
`main-none-eabi`.
@kiike kiike force-pushed the fix/dots_in_target branch from fe04396 to 8471c6b Compare November 6, 2024 20:03
@kiike
Copy link
Contributor Author

kiike commented Nov 6, 2024

@rustbot review

@onur-ozkan done!

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Nov 6, 2024
@onur-ozkan
Copy link
Member

@bors r+ rollup=iffy

@bors
Copy link
Contributor

bors commented Nov 7, 2024

📌 Commit 8471c6b has been approved by onur-ozkan

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Nov 7, 2024
@bors
Copy link
Contributor

bors commented Nov 7, 2024

⌛ Testing commit 8471c6b with merge fe43131...

@bors
Copy link
Contributor

bors commented Nov 7, 2024

☀️ Test successful - checks-actions
Approved by: onur-ozkan
Pushing fe43131 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Nov 7, 2024
@bors bors merged commit fe43131 into rust-lang:master Nov 7, 2024
7 checks passed
@rustbot rustbot added this to the 1.84.0 milestone Nov 7, 2024
@rust-timer
Copy link
Collaborator

Finished benchmarking commit (fe43131): comparison URL.

Overall result: no relevant changes - no action needed

@rustbot label: -perf-regression

Instruction count

This benchmark run did not return any relevant results for this metric.

Max RSS (memory usage)

Results (primary 1.0%, secondary 4.1%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
1.0% [0.8%, 1.3%] 2
Regressions ❌
(secondary)
4.1% [4.1%, 4.1%] 1
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) 1.0% [0.8%, 1.3%] 2

Cycles

Results (secondary 1.9%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
1.9% [1.9%, 1.9%] 1
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) - - 0

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 780.818s -> 781.131s (0.04%)
Artifact size: 335.29 MiB -> 335.23 MiB (-0.02%)

@kiike kiike deleted the fix/dots_in_target branch November 7, 2024 20:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-bootstrap Relevant to the bootstrap subteam: Rust's build system (x.py and src/bootstrap)
Projects
None yet
6 participants