Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add trim_start, trim_end etc.; deprecate trim_left, trim_right, etc. in future #52994

Merged
merged 4 commits into from
Sep 6, 2018
Merged
Show file tree
Hide file tree
Changes from 3 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
58 changes: 58 additions & 0 deletions src/liballoc/tests/str.rs
Original file line number Diff line number Diff line change
Expand Up @@ -726,6 +726,36 @@ fn test_is_char_boundary() {
}
}

#[test]
fn test_trim_start_matches() {
let v: &[char] = &[];
assert_eq!(" *** foo *** ".trim_start_matches(v), " *** foo *** ");
let chars: &[char] = &['*', ' '];
assert_eq!(" *** foo *** ".trim_start_matches(chars), "foo *** ");
assert_eq!(" *** *** ".trim_start_matches(chars), "");
assert_eq!("foo *** ".trim_start_matches(chars), "foo *** ");

assert_eq!("11foo1bar11".trim_start_matches('1'), "foo1bar11");
let chars: &[char] = &['1', '2'];
assert_eq!("12foo1bar12".trim_start_matches(chars), "foo1bar12");
assert_eq!("123foo1bar123".trim_start_matches(|c: char| c.is_numeric()), "foo1bar123");
}

#[test]
fn test_trim_end_matches() {
let v: &[char] = &[];
assert_eq!(" *** foo *** ".trim_end_matches(v), " *** foo *** ");
let chars: &[char] = &['*', ' '];
assert_eq!(" *** foo *** ".trim_end_matches(chars), " *** foo");
assert_eq!(" *** *** ".trim_end_matches(chars), "");
assert_eq!(" *** foo".trim_end_matches(chars), " *** foo");

assert_eq!("11foo1bar11".trim_end_matches('1'), "11foo1bar");
let chars: &[char] = &['1', '2'];
assert_eq!("12foo1bar12".trim_end_matches(chars), "12foo1bar");
assert_eq!("123foo1bar123".trim_end_matches(|c: char| c.is_numeric()), "123foo1bar");
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am a bit surprised that we are copy-pasting tests for methods which are exact aliases of others. Is that common practice? Why is it a good idea? There are some downsides to this, like increased code size and testing time.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point; it should be safe to remove the tests for the to-be-deprecated methods, which now just call the new ones.


#[test]
fn test_trim_left_matches() {
let v: &[char] = &[];
Expand Down Expand Up @@ -771,6 +801,26 @@ fn test_trim_matches() {
assert_eq!("123foo1bar123".trim_matches(|c: char| c.is_numeric()), "foo1bar");
}

#[test]
fn test_trim_start() {
assert_eq!("".trim_start(), "");
assert_eq!("a".trim_start(), "a");
assert_eq!(" ".trim_start(), "");
assert_eq!(" blah".trim_start(), "blah");
assert_eq!(" \u{3000} wut".trim_start(), "wut");
assert_eq!("hey ".trim_start(), "hey ");
}

#[test]
fn test_trim_end() {
assert_eq!("".trim_end(), "");
assert_eq!("a".trim_end(), "a");
assert_eq!(" ".trim_end(), "");
assert_eq!("blah ".trim_end(), "blah");
assert_eq!("wut \u{3000} ".trim_end(), "wut");
assert_eq!(" hey".trim_end(), " hey");
}

#[test]
fn test_trim_left() {
assert_eq!("".trim_left(), "");
Expand Down Expand Up @@ -1518,12 +1568,20 @@ fn trim_ws() {
"a \t ");
assert_eq!(" \t a \t ".trim_right_matches(|c: char| c.is_whitespace()),
" \t a");
assert_eq!(" \t a \t ".trim_start_matches(|c: char| c.is_whitespace()),
"a \t ");
assert_eq!(" \t a \t ".trim_end_matches(|c: char| c.is_whitespace()),
" \t a");
assert_eq!(" \t a \t ".trim_matches(|c: char| c.is_whitespace()),
"a");
assert_eq!(" \t \t ".trim_left_matches(|c: char| c.is_whitespace()),
"");
assert_eq!(" \t \t ".trim_right_matches(|c: char| c.is_whitespace()),
"");
assert_eq!(" \t \t ".trim_start_matches(|c: char| c.is_whitespace()),
"");
assert_eq!(" \t \t ".trim_end_matches(|c: char| c.is_whitespace()),
"");
assert_eq!(" \t \t ".trim_matches(|c: char| c.is_whitespace()),
"");
}
Expand Down
170 changes: 157 additions & 13 deletions src/libcore/str/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -3587,6 +3587,76 @@ impl str {
self.trim_matches(|c: char| c.is_whitespace())
}

/// Returns a string slice with leading whitespace removed.
///
/// 'Whitespace' is defined according to the terms of the Unicode Derived
/// Core Property `White_Space`.
///
/// # Text directionality
///
/// A string is a sequence of bytes. `start` in this context means the first
/// position of that byte string; for a left-to-right language like English or
/// Russian, this will be left side; and for right-to-left languages like
/// like Arabic or Hebrew, this will be the right side.
///
/// # Examples
///
/// Basic usage:
///
/// ```
/// let s = " Hello\tworld\t";
/// assert_eq!("Hello\tworld\t", s.trim_start());
/// ```
///
/// Directionality:
///
/// ```
/// let s = " English ";
/// assert!(Some('E') == s.trim_start().chars().next());
///
/// let s = " עברית ";
/// assert!(Some('ע') == s.trim_start().chars().next());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On my computer this example shows up like this:

selection_110

The whitespace at the low byte positions of the string appear on the left, and the right-to-left characters in the high byte positions appear on the right. The documentation explains that trim_start trims the right side of right-to-left text, and in fact trimming is taking place, but as rendered it appears that no trimming has happened. Is this something that would be clear to people who regularly deal with right-to-left? Is there a way to make an example that is less ambiguous for people not experienced with right-to-left?

Copy link
Member Author

@varkor varkor Aug 5, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point. Examples are tricky with mixed text directions. I'll try to come up with a better example.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added whitespace to both sides. It should be clear to anyone that the leading whitespace is being trimmed, even if they don't realise which spaces are actually being removed. Anything more subtle requires actually explaining how mixed directional text is actually rendered, which we probably don't want to do here.

/// ```
#[stable(feature = "trim_direction", since = "1.30.0")]
pub fn trim_start(&self) -> &str {
self.trim_start_matches(|c: char| c.is_whitespace())
}

/// Returns a string slice with trailing whitespace removed.
///
/// 'Whitespace' is defined according to the terms of the Unicode Derived
/// Core Property `White_Space`.
///
/// # Text directionality
///
/// A string is a sequence of bytes. `end` in this context means the last
/// position of that byte string; for a left-to-right language like English or
/// Russian, this will be right side; and for right-to-left languages like
/// like Arabic or Hebrew, this will be the left side.
///
/// # Examples
///
/// Basic usage:
///
/// ```
/// let s = " Hello\tworld\t";
/// assert_eq!(" Hello\tworld", s.trim_end());
/// ```
///
/// Directionality:
///
/// ```
/// let s = " English ";
/// assert!(Some('h') == s.trim_end().chars().rev().next());
///
/// let s = " עברית ";
/// assert!(Some('ת') == s.trim_end().chars().rev().next());
/// ```
#[stable(feature = "trim_direction", since = "1.30.0")]
pub fn trim_end(&self) -> &str {
self.trim_end_matches(|c: char| c.is_whitespace())
}

/// Returns a string slice with leading whitespace removed.
///
/// 'Whitespace' is defined according to the terms of the Unicode Derived
Expand Down Expand Up @@ -3619,8 +3689,9 @@ impl str {
/// assert!(Some('ע') == s.trim_left().chars().next());
/// ```
#[stable(feature = "rust1", since = "1.0.0")]
#[rustc_deprecated(reason = "superseded by `trim_start`", since = "1.33.0")]
pub fn trim_left(&self) -> &str {
self.trim_left_matches(|c: char| c.is_whitespace())
self.trim_start()
}

/// Returns a string slice with trailing whitespace removed.
Expand Down Expand Up @@ -3655,8 +3726,9 @@ impl str {
/// assert!(Some('ת') == s.trim_right().chars().rev().next());
/// ```
#[stable(feature = "rust1", since = "1.0.0")]
#[rustc_deprecated(reason = "superseded by `trim_end`", since = "1.33.0")]
pub fn trim_right(&self) -> &str {
self.trim_right_matches(|c: char| c.is_whitespace())
self.trim_end()
}

/// Returns a string slice with all prefixes and suffixes that match a
Expand Down Expand Up @@ -3725,14 +3797,14 @@ impl str {
/// Basic usage:
///
/// ```
/// assert_eq!("11foo1bar11".trim_left_matches('1'), "foo1bar11");
/// assert_eq!("123foo1bar123".trim_left_matches(char::is_numeric), "foo1bar123");
/// assert_eq!("11foo1bar11".trim_start_matches('1'), "foo1bar11");
/// assert_eq!("123foo1bar123".trim_start_matches(char::is_numeric), "foo1bar123");
///
/// let x: &[_] = &['1', '2'];
/// assert_eq!("12foo1bar12".trim_left_matches(x), "foo1bar12");
/// assert_eq!("12foo1bar12".trim_start_matches(x), "foo1bar12");
/// ```
#[stable(feature = "rust1", since = "1.0.0")]
pub fn trim_left_matches<'a, P: Pattern<'a>>(&'a self, pat: P) -> &'a str {
#[stable(feature = "trim_direction", since = "1.30.0")]
pub fn trim_start_matches<'a, P: Pattern<'a>>(&'a self, pat: P) -> &'a str {
let mut i = self.len();
let mut matcher = pat.into_searcher(self);
if let Some((a, _)) = matcher.next_reject() {
Expand Down Expand Up @@ -3764,20 +3836,20 @@ impl str {
/// Simple patterns:
///
/// ```
/// assert_eq!("11foo1bar11".trim_right_matches('1'), "11foo1bar");
/// assert_eq!("123foo1bar123".trim_right_matches(char::is_numeric), "123foo1bar");
/// assert_eq!("11foo1bar11".trim_end_matches('1'), "11foo1bar");
/// assert_eq!("123foo1bar123".trim_end_matches(char::is_numeric), "123foo1bar");
///
/// let x: &[_] = &['1', '2'];
/// assert_eq!("12foo1bar12".trim_right_matches(x), "12foo1bar");
/// assert_eq!("12foo1bar12".trim_end_matches(x), "12foo1bar");
/// ```
///
/// A more complex pattern, using a closure:
///
/// ```
/// assert_eq!("1fooX".trim_right_matches(|c| c == '1' || c == 'X'), "1foo");
/// assert_eq!("1fooX".trim_end_matches(|c| c == '1' || c == 'X'), "1foo");
/// ```
#[stable(feature = "rust1", since = "1.0.0")]
pub fn trim_right_matches<'a, P: Pattern<'a>>(&'a self, pat: P) -> &'a str
#[stable(feature = "trim_direction", since = "1.30.0")]
pub fn trim_end_matches<'a, P: Pattern<'a>>(&'a self, pat: P) -> &'a str
where P::Searcher: ReverseSearcher<'a>
{
let mut j = 0;
Expand All @@ -3791,6 +3863,78 @@ impl str {
}
}

/// Returns a string slice with all prefixes that match a pattern
/// repeatedly removed.
///
/// The pattern can be a `&str`, [`char`], or a closure that determines if
/// a character matches.
///
/// [`char`]: primitive.char.html
///
/// # Text directionality
///
/// A string is a sequence of bytes. 'Left' in this context means the first
/// position of that byte string; for a language like Arabic or Hebrew
/// which are 'right to left' rather than 'left to right', this will be
/// the _right_ side, not the left.
///
/// # Examples
///
/// Basic usage:
///
/// ```
/// assert_eq!("11foo1bar11".trim_left_matches('1'), "foo1bar11");
/// assert_eq!("123foo1bar123".trim_left_matches(char::is_numeric), "foo1bar123");
///
/// let x: &[_] = &['1', '2'];
/// assert_eq!("12foo1bar12".trim_left_matches(x), "foo1bar12");
/// ```
#[stable(feature = "rust1", since = "1.0.0")]
#[rustc_deprecated(reason = "superseded by `trim_start_matches`", since = "1.33.0")]
pub fn trim_left_matches<'a, P: Pattern<'a>>(&'a self, pat: P) -> &'a str {
self.trim_start_matches(pat)
}

/// Returns a string slice with all suffixes that match a pattern
/// repeatedly removed.
///
/// The pattern can be a `&str`, [`char`], or a closure that
/// determines if a character matches.
///
/// [`char`]: primitive.char.html
///
/// # Text directionality
///
/// A string is a sequence of bytes. 'Right' in this context means the last
/// position of that byte string; for a language like Arabic or Hebrew
/// which are 'right to left' rather than 'left to right', this will be
/// the _left_ side, not the right.
///
/// # Examples
///
/// Simple patterns:
///
/// ```
/// assert_eq!("11foo1bar11".trim_right_matches('1'), "11foo1bar");
/// assert_eq!("123foo1bar123".trim_right_matches(char::is_numeric), "123foo1bar");
///
/// let x: &[_] = &['1', '2'];
/// assert_eq!("12foo1bar12".trim_right_matches(x), "12foo1bar");
/// ```
///
/// A more complex pattern, using a closure:
///
/// ```
/// assert_eq!("1fooX".trim_right_matches(|c| c == '1' || c == 'X'), "1foo");
/// ```
#[stable(feature = "rust1", since = "1.0.0")]
#[rustc_deprecated(reason = "superseded by `trim_end_matches`", since = "1.33.0")]
pub fn trim_right_matches<'a, P: Pattern<'a>>(&'a self, pat: P) -> &'a str
where P::Searcher: ReverseSearcher<'a>
{
self.trim_end_matches(pat)
}

/// Parses this string slice into another type.
///
/// Because `parse` is so general, it can cause problems with type
Expand Down