-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[refurb] Parse more exotic decimal strings in verbose-decimal-constructor (FURB157)
#14098
Changes from all commits
14c4842
fe01159
550f09b
d003550
82d0a51
d4f6a81
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -3,6 +3,7 @@ use ruff_macros::{derive_message_formats, violation}; | |
use ruff_python_ast::{self as ast, Expr}; | ||
use ruff_python_trivia::PythonWhitespace; | ||
use ruff_text_size::Ranged; | ||
use std::borrow::Cow; | ||
|
||
use crate::checkers::ast::Checker; | ||
|
||
|
@@ -75,13 +76,20 @@ pub(crate) fn verbose_decimal_constructor(checker: &mut Checker, call: &ast::Exp | |
value: str_literal, .. | ||
}) => { | ||
// Parse the inner string as an integer. | ||
let trimmed = str_literal.to_str().trim_whitespace(); | ||
|
||
// | ||
// For reference, a string argument to `Decimal` is parsed in CPython | ||
// using this regex: | ||
// https://github.com/python/cpython/blob/ac556a2ad1213b8bb81372fe6fb762f5fcb076de/Lib/_pydecimal.py#L6060-L6077 | ||
// _after_ trimming whitespace from the string and removing all occurrences of "_". | ||
let mut trimmed = Cow::from(str_literal.to_str().trim_whitespace()); | ||
if memchr::memchr(b'_', trimmed.as_bytes()).is_some() { | ||
trimmed = Cow::from(trimmed.replace('_', "")); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Nit: I don't think There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm fine not using memchr, but I do think we need to use Edit: whoops, didn't see that was already addressed below |
||
} | ||
// Extract the unary sign, if any. | ||
let (unary, rest) = if let Some(trimmed) = trimmed.strip_prefix('+') { | ||
("+", trimmed) | ||
("+", Cow::from(trimmed)) | ||
} else if let Some(trimmed) = trimmed.strip_prefix('-') { | ||
("-", trimmed) | ||
("-", Cow::from(trimmed)) | ||
} else { | ||
("", trimmed) | ||
}; | ||
|
@@ -90,7 +98,7 @@ pub(crate) fn verbose_decimal_constructor(checker: &mut Checker, call: &ast::Exp | |
let rest = rest.trim_start_matches('0'); | ||
|
||
// Verify that the rest of the string is a valid integer. | ||
if !rest.chars().all(|c| c.is_ascii_digit()) { | ||
if !rest.bytes().all(|c| c.is_ascii_digit()) { | ||
return; | ||
}; | ||
|
||
|
@@ -159,3 +167,26 @@ pub(crate) fn verbose_decimal_constructor(checker: &mut Checker, call: &ast::Exp | |
|
||
checker.diagnostics.push(diagnostic); | ||
} | ||
|
||
// // Slightly modified from [CPython regex] to ignore https://github.com/python/cpython/blob/ac556a2ad1213b8bb81372fe6fb762f5fcb076de/Lib/_pydecimal.py#L6060-L6077 | ||
// static DECIMAL_PARSER_REGEX: LazyLock<Regex> = LazyLock::new(|| { | ||
// Regex::new( | ||
// r"(?x) # Verbose mode for comments | ||
// ^ # Start of string | ||
// (?P<sign>[-+])? # Optional sign | ||
// (?: | ||
// (?P<int>\d*) # Integer part (can be empty) | ||
// (\.(?P<frac>\d+))? # Optional fractional part | ||
// (E(?P<exp>[-+]?\d+))? # Optional exponent | ||
// | | ||
// Inf(inity)? # Infinity | ||
// | | ||
// (?P<signal>s)? # Optional signal | ||
// NaN # NaN | ||
// (?P<diag>\d*) # Optional diagnostic info | ||
// ) | ||
// $ # End of string | ||
// ", | ||
// ) | ||
// .unwrap() | ||
// }); |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -166,3 +166,43 @@ FURB157.py:12:17: FURB157 [*] Verbose expression in `Decimal` constructor | |
13 13 | | ||
14 14 | # OK | ||
15 15 | Decimal(0) | ||
|
||
FURB157.py:23:9: FURB157 [*] Verbose expression in `Decimal` constructor | ||
| | ||
22 | # Errors | ||
23 | Decimal("1_000") | ||
| ^^^^^^^ FURB157 | ||
24 | Decimal("__1____000") | ||
| | ||
= help: Replace with `1000` | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Maybe it would be nice to preserve the thousand separator in that case and suggest There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That's a good suggestion, and I think it would be a nice followup! I think one would want to trim leading/trailing underscores, and then replace every internal, contiguous sequence of underscores with a single underscore. At that point it may be worth a little refactoring to just walk through the string once and abort early as appropriate... |
||
|
||
ℹ Safe fix | ||
20 20 | # See https://github.com/astral-sh/ruff/issues/13807 | ||
21 21 | | ||
22 22 | # Errors | ||
23 |-Decimal("1_000") | ||
23 |+Decimal(1000) | ||
24 24 | Decimal("__1____000") | ||
25 25 | | ||
26 26 | # Ok | ||
|
||
FURB157.py:24:9: FURB157 [*] Verbose expression in `Decimal` constructor | ||
| | ||
22 | # Errors | ||
23 | Decimal("1_000") | ||
24 | Decimal("__1____000") | ||
| ^^^^^^^^^^^^ FURB157 | ||
25 | | ||
26 | # Ok | ||
| | ||
= help: Replace with `1000` | ||
|
||
ℹ Safe fix | ||
21 21 | | ||
22 22 | # Errors | ||
23 23 | Decimal("1_000") | ||
24 |-Decimal("__1____000") | ||
24 |+Decimal(1000) | ||
25 25 | | ||
26 26 | # Ok | ||
27 27 | Decimal("2e-4") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we please also include cases for
Decimal("-inf")
andDecimal("inf")
(both should error).