-
-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
W1202 - logging-fstring-interpolation is not useful #2395
Comments
Have a read of #1788, fstrings are gone for good |
How about an option to exempt f-strings? By default, f-strings would be flagged; with the option, no. |
Do we have consensus on this? We too feel this should be deprecated. As of Python 3.6 the recommended way to do this is with a formatted string literal. https://docs.python.org/3/reference/lexical_analysis.html#f-strings Note that there are TWO warnings that could come ...
This is from: MY_VAR_BAR = 'ywah'
_LOG.warning('foo %s', MY_VAR_BAR)
_LOG.warning('foo {}'.format(MY_VAR_BAR))
_LOG.warning('foo {MY_VAR_BAR}')
_LOG.warning("foo {MY_VAR_BAR}")
_LOG.warning(f"foo {MY_VAR_BAR}") go
Of note, in our [LOGGING]
# Format style used to check logging format string. `old` means using %
# formatting, while `new` is for `{}` formatting.
logging-format-style=new |
This is a per-project issue. We're writing a filesystem. There are certain points in the code where we have TRACE level logging. These are hit millions of times and interpolation can kill performance. Even DEBUG level f-strings can slaughter the functioning of low level code. On the other hand, the readability of f-strings should generally be preferred when the number of interpolations isn't significant. In this case, providing developer with linter tools and options that help them make smart decisions is the right choice... not arbitrarily deprecating or enabling an option that has, at least for me personally, rescued seriously broken code. Many pylint settings fall into this nuanced trade-off range, where considerations of readability, code complexity, etc. need to be balanced against each other and there are not easy choices. These generally cause controversy, and repeated issue tickets (this one is the 3rd!): This could be an indicator that pylint needs more nuanced options to allow configuration of the linter depending on the nature of the project. See: #3000 for an example. |
In addition to @earonesty 's comment above, another consideration MAY be that if we add clarity and documentation to these such nuanced warnings, the projects and users can make better educated decisions on their own. (In this example, the information available at the moment for the fstring-interpolation warning leads to confusion and at worse developers NOT using this for non-performant required code which hurts readability unnecessarily) |
@earonesty, I think the point that's missed here is that the purpose of pylint is not to make every project's algorithmic decisions for them, but to make code generally consistent for readability purposes. The cases where the performance of interpolation matters seem particularly niche, whereas the cases where f-strings enhance readability are broad and varied. In that sense, -in general- f-strings should be favored by default by style-checkers. The few projects where f-strings cause performance problems can set non-default configurations. |
Here’s a copy of a comment I made on #1788. #1788 (comment) It seems like everyone here is talking about performance. There are other benefits to letting
|
I've come around to the idea that f-strings are almost always a bad idea, certainly in logging. The run-time implications and side-effects of the f-string eval() simply don't belong in logging statements. |
@earonesty Can you elaborate on those side-effects? |
As @chdsbd said, string interpolation is happening at run-time in a way that can lead to exceptions. And while it's abundantly clear what the potential side effects of |
@earonesty I really don't agree with this as a general rule of thumb. It completely depends on the criticality of your application and frequency of logging. It is definitely possible that some applications may wish to avoid this for robustness and performance reasons, but MOST ~80% of Python programs need not worry. |
Sure, they don't have to worry. But linters are all about best practices. And in this case, you have a choice of logging lines that either can or cannot cause exceptions... and either can or cannot be aggregated. Worse... if you're writing a library... you're making this decision on behalf of your users. f-strings just don't improve readability that much in log lines to justify the possibility of all 3 issues: run-time failures, aggregation incompatibility, and performance issues. |
@earonesty it's generally best practice and more pythonic to explicitly catch your exceptions. Trusting the logger to implicitly catch exceptions, but only when the code is contained within the logging statement, seems like a rather odd and unreliable approach. I disagree about the extent of readability. f-strings put an expression exactly where it should go in a string statement. That's a monumental step up from positional %s replacements, improving readability and reducing the potential for bugs: Suppose you have a logging string requiring 5 different variable replacements. It's already difficult to see which element of the tuple maps to which %s instance. Now you want to rearrange the message a little bit, or add a variable to the message, or remove one. Now are you sure you've rearranged all of the variables in the tuple correctly? The potential for error here is quite high, especially if you're working with someone else's code. Most people should be using f-strings where possible. I don't deny that there are edge cases where %s or even .format can work better, but f-strings should be the default. |
|
I do see that this is an old, closed issue, but it's the first hit when I Googled this pylint warning, but it's still being referenced here and on SO. When I test it the way it's tested in the original it seems to impose overhead with getting the logger that muddles the results, more than 40% of the measured runtime being getting the logger over and over. I don't quite see why, logging with My whole test then becomes: import timeit
def timeitex2(num_times):
fstring = timeit.timeit(
'logging.info(f"""this is a {thing}, and thing: {thing}""")',
setup="import logging ; thing='1234'",
number=num_times,
)
formattim = timeit.timeit(
'logging.info("this is a {}, and thing: {}".format(thing, thing))',
setup="import logging ; thing='1234'",
number=num_times,
)
percentop = timeit.timeit(
'logging.info("this is a %s, and thing: %s", thing, thing)',
setup="import logging ; thing='1234'",
number=num_times,
)
formatpercentop = timeit.timeit(
'logging.info("this is a %s, and thing: %s" % (thing, thing))',
setup="import logging ; thing='1234'",
number=num_times,
)
fstring2 = timeit.timeit(
'logger.info(f"""this is a {thing}, and thing: {thing}""")',
setup="import logging ; thing='1234' ; logger = logging.getLogger(__name__)",
number=num_times,
)
formattim2 = timeit.timeit(
'logger.info("this is a {}, and thing: {}".format(thing, thing))',
setup="import logging ; thing='1234' ; logger = logging.getLogger(__name__)",
number=num_times,
)
percentop2 = timeit.timeit(
'logger.info("this is a %s, and thing: %s", thing, thing)',
setup="import logging ; thing='1234' ; logger = logging.getLogger(__name__)",
number=num_times,
)
formatpercentop2 = timeit.timeit(
'logger.info("this is a %s, and thing: %s" % (thing, thing))',
setup="import logging ; thing='1234' ; logger = logging.getLogger(__name__)",
number=num_times,
)
return (fstring, formattim, percentop, formatpercentop, fstring2, formattim2, percentop2, formatpercentop2) So the four first results are using I ran this several times until I got a run where every result was lower than for all previous runs: >>> print(timeitex2(10000000))
(4.543375627021305, 5.602089578984305, 4.252049928996712, 5.226050494937226, 2.43524838401936, 3.362805937998928, 1.9745253099827096, 3.079829377005808) So while f-strings are only 6% slower in this test when using >>> print( 4.543375627021305 / 4.252049928996712 )
1.0685141761948531 They're 23% slower when testing using a logger instance: >>> print( 2.43524838401936 / 1.9745253099827096 )
1.2333335874234423 This is still not much if there isn't much logging per second, of course, so I'd say the discussion about correctness vs. readability trumps this in the general case. To clarify, this is of course when the logs aren't being emitted, when they are this is moot anyhow, and f-strings seem a slight bit more performant, as originally mentioned. |
For the record, conversations like this, where people get bogged down in what others should do in their projects instead of giving them the tools to do what they determine they should do and project maintainers appear to not have any interest in supporting said tool (in this case, an option), are why I have stopped using so many tools, pylint being no exception. |
@thormick that's fine for integers. what about complex structures where str() is extremely expensive. i've had to debug code where str() was hitting an endpoint. i'm surprised the effect is so large just for ints. @FrankDMartinez it's easy enough to disable. i disable a lot of pylint things. my pylintrc:
|
@earonesty Thanks. How long have you known about the disabling aspect? |
i just thought everyone knew about it :(. bad to assume. didn't notice your first comment was about how to disable it. |
I'm surprised people spend so long arguing the performance case for this and not mention the code-injection attack this check prevents. Personally I'd have thought the issue of causing errors by using f-strings was of greater importance. If you pass unsanitized input to logging you can get errors if that input happens to contain value = get_user_input()
logger.info(f"User wrote {value}") Now I know this linter check doesn't catch the trivial |
The configuration for fstring is avaialble since pylint 2.4 and was introduced in eeca6cf. See, http://pylint.pycqa.org/en/latest/whatsnew/2.4.html?highlight=fstring%20interpolation
I'm going to prevent further discussion here. |
Is your feature request related to a problem? Please describe
In my experiments W1202 is misleading as f-strings performance is virtually equivalent to %-operator formatting in practice.
Describe the solution you'd like
W1202 be deprecated, as it can lead people to not use f-strings, which (if available) are the preferred mechanism for string interpolation.
Additional context
This is related to my inquiry around this warning: #2354
To verify my hypothesis that f-strings are virtually as performant as %-operator formatting I wrote the following snippet to explore timing related to log statements with the 3 approaches (f-strings, .format(), and %-operator). This was on a Django app I'm working on.
With logging output disabled (ie in my Django settings I had:
LOGGING["loggers"] = {}
), %-formatting and f-strings are virtually identical.Returned times are tuples of "time for f-strings", "time for the format() call", and "time for %-formatting".
That's a difference of ~1% (1.641325417906046 vs 1.6249939629342407 seconds) between f-strings & the %-operator on the largest run (.format calls being clearly slower).
With logging enabled, it's a total wash (technically f-strings was the winner in the larger runs but by such a small margin it's inconsequential). This is with log output going to standard out (so I only went up to 100,000 entries as I didn't want to dump 3 million log statements to my console):
num_times = 10,000:
(1.6245153648778796, 2.0137458329554647, 1.6029843359719962)
num_times = 100,000:
(4.39409149507992, 4.51585376705043, 4.532764581032097)
So basically unless you are emitting a ton of logging statements for a level that are never actually logged (say because the LOG_LEVEL is set high), then it doesn't matter. And if you're in that scenario, then really there's no noticeable difference between %-operator formatting or f-strings (just don't do
.format()
calls).OTOH, if you are emitting the logging message, then it's likely that the f-string approach will be faster, so following W1202 will actually make your code (extremely slightly) slower.
The text was updated successfully, but these errors were encountered: