Add slice_as_bytes lint #10984

Closed

A-Walrus

Detects patterns like

s[1..5].as_bytes();

and suggests replacing them with

&s.as_bytes()[1..5];

fixes: #10981

changelog: [`slice_as_bytes`]: add new lint
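
For readers who want to see the two forms side by side, here is a minimal sketch, assuming an ASCII-only input so both forms succeed. The function names `slice_then_bytes` and `bytes_then_slice` are hypothetical and are not part of the PR.

```rust
/// The form the lint would flag: slice the `str` first, then view it as bytes.
/// Slicing a `str` checks that both ends fall on UTF-8 char boundaries.
fn slice_then_bytes(s: &str) -> &[u8] {
    s[1..5].as_bytes()
}

/// The suggested form: view the whole string as bytes, then slice.
/// Only the ordinary bounds check applies here.
fn bytes_then_slice(s: &str) -> &[u8] {
    &s.as_bytes()[1..5]
}

fn main() {
    // ASCII only, so every index is a char boundary and both forms agree.
    let s = "hello world";
    assert_eq!(slice_then_bytes(s), bytes_then_slice(s));
    assert_eq!(bytes_then_slice(s), b"ello");
}
```

The two functions return the same bytes whenever the range lies on char boundaries; they differ only when it does not, in which case the first panics and the second does not.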

@rustbot
Collaborator

rustbot commented Jun 18, 2023

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @Jarcho (or someone else) soon.

Please see the contribution instructions for more information. Namely, in order to ensure the minimum review times lag, PR authors and assigned reviewers should ensure that the review label (S-waiting-on-review and S-waiting-on-author) stays updated, invoking these commands when appropriate:

  • @rustbot author: the review is finished, PR author should check the comments and take action accordingly
  • @rustbot review: the author is ready for a review, this PR will be queued again in the reviewer's queue

@rustbot rustbot added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties label Jun 18, 2023
Member

@blyxyas blyxyas left a comment

Very good for a first-time contributor, thanks ❤️! Could you consider these changes?
Also, as a side note, the lint should go in the methods type. Try putting it there (maybe creating a new branch and copy-pasting your changes to the new files?) or, if it's too complex, just leave it as is (We'll manually re-type all lints in a few weeks).

(cargo dev new_lint --name=slice_as_bytes --type=methods --category=pedantic)

@bors
Contributor

bors commented Jun 19, 2023

☔ The latest upstream changes (presumably #10951) made this pull request unmergeable. Please resolve the merge conflicts.

@A-Walrus
Author

I believe I've addressed all the comments, please let me know if I missed something :)

@ChrisJefferson

I'm going to be honest, I don't like this lint without clearer warnings that it could break working code.

The comment says this removes code which can "perform a safety check which can panic". If someone wrote their code depending on that safety check, this Clippy lint silently removes it.

I realise this would be a much bigger change (sorry!), but could clippy have some way of annotating changes which change behaviour in this way?

@blyxyas
Member

blyxyas commented Jun 24, 2023

Checking every use of that same vector would be too expensive performance-wise, so I don't think it's worth it. We can document it in the lint though. Could you provide an example where this check is needed? I feel like using this check would be just moving the panic from one place to another. But thanks for the warning! ❤️

@A-Walrus
Author

> Checking every use of that same vector would be too expensive performance-wise, so I don't think it's worth it. We can document it in the lint though.

I'm not sure what you're referring to. Could you please explain?

> Could you provide an example where this check is needed? I feel like using this check would be just moving the panic from one place to another.

Sure, helix-editor/helix#7417. We were providing chunks of bytes from a string to be processed by some other code. The chunks themselves are not necessarily valid strings, i.e. they can be cut in the middle of a char.
Something similar might happen with the std::io::Write trait's write function, which returns how many bytes were written. If someone was writing from a string, they might try taking the remaining bytes of the string like s[bytes_written..].as_bytes(), which would be incorrect.
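
The partial-write scenario can be sketched as follows. This is an illustration under stated assumptions, not code from the PR or from Helix: the helper names `remaining_bytes` and `remaining_bytes_via_str` are hypothetical, and the partial write is simulated with a fixed offset rather than a real writer.

```rust
use std::panic;

/// Bytes still to be written after `bytes_written` bytes went out.
/// Works for any offset up to `s.len()`, including mid-character ones.
fn remaining_bytes(s: &str, bytes_written: usize) -> &[u8] {
    &s.as_bytes()[bytes_written..]
}

/// The variant described in the comment: slices the `str` first, so it
/// panics when `bytes_written` is not on a char boundary.
fn remaining_bytes_via_str(s: &str, bytes_written: usize) -> &[u8] {
    s[bytes_written..].as_bytes()
}

fn main() {
    let s = "héllo"; // 'é' occupies bytes 1 and 2 (0xC3 0xA9)
    // Pretend a writer accepted only 2 bytes, stopping inside 'é'.
    let bytes_written = 2;

    // Byte-level slicing hands back the tail, starting mid-character.
    assert_eq!(remaining_bytes(s, bytes_written), &[0xA9, b'l', b'l', b'o']);

    // String-level slicing at the same offset panics.
    let result = panic::catch_unwind(|| remaining_bytes_via_str(s, bytes_written).len());
    assert!(result.is_err());
}
```

When the offset happens to land on a char boundary, both helpers return the same bytes, which is why a bug of this kind can go unnoticed with ASCII-only test data.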

@bors
Contributor

bors commented Jul 1, 2023

☔ The latest upstream changes (presumably #11012) made this pull request unmergeable. Please resolve the merge conflicts.

@blyxyas
Member

blyxyas commented Jul 2, 2023

> I'm not sure what you're referring to. Could you please explain?

Referring to: "If someone wrote their code depending on that safety check, this clippy [lint] silently removes it." -- @ChrisJefferson

I was saying that, if we needed to check all uses of the vector s when linting because the suggestion might break code, doing so would be too expensive performance-wise and a bad idea.

@blyxyas
Member

blyxyas commented Jul 2, 2023

I think we should do a big lintcheck and see the repercussions of this change. We can always leave it at MaybeIncorrect.

@bors
Contributor

bors commented Jul 5, 2023

☔ The latest upstream changes (presumably #10970) made this pull request unmergeable. Please resolve the merge conflicts.

Contributor

@Jarcho Jarcho left a comment

Apologies for the long delay. Just a category change and some minor cleanup are needed.

/// ```
#[clippy::version = "1.72.0"]
pub SLICE_AS_BYTES,
perf,
Contributor

Given the behaviour change and the small perf gain, this should not be linting by default. pedantic would be a better category.

let ty = cx.typeck_results().expr_ty(indexed).peel_refs();
let is_str = ty.is_str();
let is_string = is_type_lang_item(cx, ty, LangItem::String);
if is_str || is_string {
Contributor

Should just be `ty.is_str() || is_type_lang_item(..)`. They can both just be referred to as a string in the message.


pub(super) fn check(cx: &LateContext<'_>, expr: &Expr<'_>, recv: &Expr<'_>) {
if let ExprKind::Index(indexed, index) = recv.kind {
if is_range_literal(index) {
Contributor

Let-chains can be used here. No need for nested if statements.

SLICE_AS_BYTES,
expr.span,
&(format!(
"slicing a {type_name} before calling `as_bytes` results in needless UTF-8 alignment checks, and has the possibility of panicking"
Contributor

The justification for the lint should not be in the message, only a short description of what is being linted. The rest of it should just be in the documentation, or if needed, as an additional note.

let is_str = ty.is_str();
let is_string = is_type_lang_item(cx, ty, LangItem::String);
if is_str || is_string {
let mut applicability = Applicability::MachineApplicable;
Contributor

Should be MaybeIncorrect. Same justification as the category change.

/// Checks for string slices immediately followed by `as_bytes`.
/// ### Why is this bad?
/// It involves doing an unnecessary UTF-8 alignment check which is less efficient, and can cause a panic.
/// ### Example
Contributor

A ### Known problems section is needed with the note that the panic may be required for the code's correctness.
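
As an illustration of the concern behind such a note: code that relies on the boundary check could, as one option, make it explicit instead of depending on the panic. The sketch below assumes the standard `str::get` method, which returns `None` for ranges that are out of bounds or not on char boundaries. The helper name `checked_slice_bytes` is hypothetical and does not come from the PR.

```rust
/// Returns the bytes of `s[start..end]`, or `None` when the range is out of
/// bounds or does not start and end on a UTF-8 char boundary.
fn checked_slice_bytes(s: &str, start: usize, end: usize) -> Option<&[u8]> {
    s.get(start..end).map(str::as_bytes)
}

fn main() {
    let s = "héllo"; // 'é' occupies bytes 1 and 2

    // Range ends on a char boundary: the bytes of "hé" come back.
    assert_eq!(checked_slice_bytes(s, 0, 3), Some("hé".as_bytes()));

    // Range ends inside 'é': the check fails without panicking.
    assert_eq!(checked_slice_bytes(s, 0, 2), None);

    // Range past the end of the string: also rejected.
    assert_eq!(checked_slice_bytes(s, 0, 99), None);
}
```

With the check spelled out like this, the caller decides how to handle a bad range, and the behaviour no longer depends on which of the two slicing orders is used.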

@Jarcho
Contributor

Jarcho commented Nov 11, 2023

Ping @A-Walrus

@rustbot author

@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action from the author. (Use `@rustbot ready` to update this status) and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties labels Nov 11, 2023
@xFrednet
Member

Hey, this is triage. I'm closing this due to inactivity. If you want to continue this implementation, you're welcome to create a new PR. @A-Walrus, thank you for the time you already put into this!

Interested parties are welcome to pick this implementation up as well :)

@rustbot label +S-inactive-closed -S-waiting-on-author -S-waiting-on-review

@xFrednet xFrednet closed this Mar 29, 2024
@rustbot rustbot added S-inactive-closed Status: Closed due to inactivity and removed S-waiting-on-author Status: This is awaiting some action from the author. (Use `@rustbot ready` to update this status) labels Mar 29, 2024
Labels
S-inactive-closed Status: Closed due to inactivity
Development

Successfully merging this pull request may close these issues.

Flip order of s[a..b].as_bytes()
9 participants