Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize strip_prefix and strip_suffix with str patterns #69784

Merged
merged 1 commit into from
Mar 31, 2020

Conversation

benesch
Copy link
Contributor

@benesch benesch commented Mar 6, 2020

As mentioned in #67302 (comment).
I'm not sure whether adding these methods to Pattern is desirable—but they have default implementations so the change is backwards compatible. Plus it seems like they're slated for wholesale replacement soon anyway? #56345


Constructing a Searcher in strip_prefix and strip_suffix is
unnecessarily slow when the pattern is a fixed-length string. Add
strip_prefix and strip_suffix methods to the Pattern trait, and add
optimized implementations of these methods in the str implementation.
The old implementation is retained as the default for these methods.

@rust-highfive
Copy link
Collaborator

r? @kennytm

(rust_highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Mar 6, 2020
Copy link
Contributor

@mlodato517 mlodato517 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just asking some questions because I'm new to rust but want to learn and get more involved! :-)

@@ -3984,26 +3984,15 @@ impl str {
/// ```
/// #![feature(str_strip)]
///
/// assert_eq!("foobar".strip_prefix("foo"), Some("bar"));
/// assert_eq!("foobar".strip_prefix("bar"), None);
/// assert_eq!("foo:bar".strip_prefix("foo:"), Some("bar"));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just wondering - do these changes improve the test cases or are they simply to make the docs more readable? Or some other thing? I assume it's not because the previous tests failed with the new implementation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They improve the test cases to account for a bug that I introduced while rewriting these methods that was not properly caught by the old test cases. The old test cases had the defect that they split the string exactly in half. My buggy implementation for strip_suffix sliced from s[pat.len()..] when it needed to slice from s[s.len() - pat.len()..], which was not caught by any of the previous test cases.

/// assert_eq!("foobar".strip_prefix("foo"), Some("bar"));
/// assert_eq!("foobar".strip_prefix("bar"), None);
/// assert_eq!("foo:bar".strip_prefix("foo:"), Some("bar"));
/// assert_eq!("foo:bar".strip_prefix("bar"), None);
/// assert_eq!("foofoo".strip_prefix("foo"), Some("foo"));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be updated to foo:foo? I guess this partly depends on the comment above :-)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The intent of this test case, as best as I can tell, is to check that strip_prefix(P) only strips one prefix from a string like PPPP, so adding the colon would change what this test is intended to do.

@@ -3041,7 +3041,7 @@ impl str {
/// ```
#[stable(feature = "rust1", since = "1.0.0")]
pub fn starts_with<'a, P: Pattern<'a>>(&'a self, pat: P) -> bool {
pat.is_prefix_of(self)
pat.strip_prefix_from(self).is_some()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'm missing something :-( - what is this change for?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, it's errant. Thanks. It's from an earlier revision of the PR where I was experimenting with removing the is_prefix_of and is_suffix_of methods and rewriting them in terms of the new strip_prefix_from and strip_suffix_from methods.


#[inline]
fn strip_prefix_from(self, haystack: &'a str) -> Option<&'a str> {
self[..].strip_prefix_from(haystack)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I'm having a little trouble following the flow of these methods. I'm going to botch some made up syntax and pretend Module.method is the method method on an instance of the Module struct but can you tell me if this what's going on?

We have

  1. String.strip_prefix_from which calls str.strip_prefix_from which is an implementation of Pattern.strip_prefix_from optimized because we know the length of the String instance
  2. str.strip_prefix which calls prefix.strip_prefix_from(self) which will call the optimized implementation if prefix is a str or char
  3. char.strip_prefix_from which is an optimized implementation of Pattern.strip_prefix_from
  4. a default implementation of Pattern.strip_prefix_from which is for ... I don't know 🤔 The PR description says "they have default implementations so the change is backwards compatible" - what does that mean? That sounds like, "if we no longer override strip_prefix_from for str or char they'll get the default implementation"? Or am I reading that wrong?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, exactly right! (4) is for anyone who wants to implement Pattern for their own type (say, a function) but doesn't want to go to the trouble of implementing a specialized strip_prefix_from or strip_suffix_from method—or perhaps their pattern is not of a fixed length and therefore there isn't a way to optimize the method beyond constructing and using a full search.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ohh okay I understand now! I was originally confused because I didn't notice that, if there wasn't a default implementation, implementors would have to supply their own implementation. Forgot that optional methods without defaults aren't things :-) Thanks!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You bet!

Copy link
Contributor

@mlodato517 mlodato517 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see any "Resolve Conversation" button, but you resolved all my comments, thank you :-)
Just one more question and then this is ✅ from my perspective but I have no idea what I'm talking about so I'd rather someone else review this

@@ -47,6 +47,22 @@ pub trait Pattern<'a>: Sized {
matches!(self.into_searcher(haystack).next(), SearchStep::Match(0, _))
}

/// Removes the pattern from the front of haystack, if it matches.
#[inline]
fn strip_prefix_from(self, haystack: &'a str) -> Option<&'a str> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this method no longer have tests? Do we want it to still have tests?

@bors
Copy link
Contributor

bors commented Mar 13, 2020

☔ The latest upstream changes (presumably #69986) made this pull request unmergeable. Please resolve the merge conflicts.

@tesuji
Copy link
Contributor

tesuji commented Mar 23, 2020

@benesch You need to rebase this PR.

@benesch benesch force-pushed the fast-strip-prefix-suffix branch 2 times, most recently from 0b86e69 to dbbea25 Compare March 28, 2020 18:39
@benesch
Copy link
Contributor Author

benesch commented Mar 28, 2020

Thanks, done.

src/libcore/str/pattern.rs Outdated Show resolved Hide resolved
src/libcore/str/pattern.rs Outdated Show resolved Hide resolved
Constructing a Searcher in strip_prefix and strip_suffix is
unnecessarily slow when the pattern is a fixed-length string. Add
strip_prefix and strip_suffix methods to the Pattern trait, and add
optimized implementations of these methods in the str implementation.
The old implementation is retained as the default for these methods.
@kennytm
Copy link
Member

kennytm commented Mar 30, 2020

@bors r+

@bors
Copy link
Contributor

bors commented Mar 30, 2020

📌 Commit ac478f2 has been approved by kennytm

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Mar 30, 2020
@bors
Copy link
Contributor

bors commented Mar 30, 2020

⌛ Testing commit ac478f2 with merge 23e4949bb9f13174569c3f369434398db395ee16...

@bors
Copy link
Contributor

bors commented Mar 30, 2020

💔 Test failed - checks-azure

@bors bors added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Mar 30, 2020
@benesch
Copy link
Contributor Author

benesch commented Mar 31, 2020

This seems to have been part of a Bors batch that was canceled? https://dev.azure.com/rust-lang/rust/_build/results?buildId=24890&view=results

Will it be requeued automatically or does it need another r+?

@kennytm
Copy link
Member

kennytm commented Mar 31, 2020

@bors retry

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Mar 31, 2020
Centril added a commit to Centril/rust that referenced this pull request Mar 31, 2020
…=kennytm

Optimize strip_prefix and strip_suffix with str patterns

As mentioned in rust-lang#67302 (comment).
I'm not sure whether adding these methods to `Pattern` is desirable—but they have default implementations so the change is backwards compatible. Plus it seems like they're slated for wholesale replacement soon anyway? rust-lang#56345

----

Constructing a Searcher in strip_prefix and strip_suffix is
unnecessarily slow when the pattern is a fixed-length string. Add
strip_prefix and strip_suffix methods to the Pattern trait, and add
optimized implementations of these methods in the str implementation.
The old implementation is retained as the default for these methods.
Centril added a commit to Centril/rust that referenced this pull request Mar 31, 2020
…=kennytm

Optimize strip_prefix and strip_suffix with str patterns

As mentioned in rust-lang#67302 (comment).
I'm not sure whether adding these methods to `Pattern` is desirable—but they have default implementations so the change is backwards compatible. Plus it seems like they're slated for wholesale replacement soon anyway? rust-lang#56345

----

Constructing a Searcher in strip_prefix and strip_suffix is
unnecessarily slow when the pattern is a fixed-length string. Add
strip_prefix and strip_suffix methods to the Pattern trait, and add
optimized implementations of these methods in the str implementation.
The old implementation is retained as the default for these methods.
bors added a commit to rust-lang-ci/rust that referenced this pull request Mar 31, 2020
Rollup of 9 pull requests

Successful merges:

 - rust-lang#69784 (Optimize strip_prefix and strip_suffix with str patterns)
 - rust-lang#70548 (Add long error code for error E0226)
 - rust-lang#70555 (resolve, `try_resolve_as_non_binding`: use `delay_span_bug` due to parser recovery)
 - rust-lang#70561 (remove obsolete comment)
 - rust-lang#70562 (infer array len from pattern)
 - rust-lang#70585 (std: Fix over-aligned allocations on wasm32-wasi)
 - rust-lang#70587 (Add `Rust` to the code snippet)
 - rust-lang#70588 (Fix incorrect documentation for `str::{split_at, split_at_mut}`)
 - rust-lang#70613 (more clippy fixes)

Failed merges:

r? @ghost
@bors bors merged commit 9ee373f into rust-lang:master Mar 31, 2020
@benesch benesch deleted the fast-strip-prefix-suffix branch March 31, 2020 22:03
@benesch
Copy link
Contributor Author

benesch commented Mar 31, 2020

Thanks very much, everyone!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants