Feature request: Buffer debug logs and only emit on exception #4432
Thanks for opening your first issue here! We'll come back to you as soon as we can.
(Haven't forgotten; I'm adjusting to jet lag from ANZ as I got back today.) This is a great idea, and we were waiting for customer demand to consider it. I'll share thoughts and questions tomorrow.
This is the tricky part. As of now, I can think of these scenarios:
Did I miss anything? I'd love to hear your thoughts on both.
I imagine the new behaviour would be `Logger(debug_on_exception=True)`: explicit and backward compatible. Existing code would therefore not buffer, and every debug line is published as expected; no surprises for anyone already using Powertools. If I set this new option, I expect `logger.exception` to flush the buffer like a stack trace; that is why I set the option. If I didn't set the option, then I don't want this buffering feature and there will be no stack trace or buffering. But your point remains: if I want to selectively stop publishing the "stack trace", I can't.
Yeah, that's the direction we'd go given our tenets. On Monday (17th), I'll summarize actionable points and the tests we've made while we wait for customer +1s.
First, I love the idea of this feature, it's definitely a "have your observability cake and eat it too" moment.
+1, that seems like the best way to implement this to start. An env var which also sets it would additionally be welcome, as users could then turn on debug tracing via click-ops in response to an unforeseen issue, without a code deploy.
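To make the env-var idea concrete, here is a minimal sketch of how such a toggle could be read. The variable name `POWERTOOLS_LOG_BUFFER_ENABLED` and the helper `buffering_enabled` are illustrative assumptions, not confirmed Powertools settings:

```python
import os


def buffering_enabled(default: bool = False) -> bool:
    """Read a truthy flag from the environment so log buffering can be
    toggled without a code deploy.

    Note: POWERTOOLS_LOG_BUFFER_ENABLED is a hypothetical variable name
    used for illustration only.
    """
    value = os.getenv("POWERTOOLS_LOG_BUFFER_ENABLED", str(default))
    return value.strip().lower() in ("1", "true", "yes", "on")
```

The env var would take precedence over (or complement) the constructor argument, matching how other Powertools settings can be driven by environment variables.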
This seems to me like the least surprising behavior for a function wrapped with the decorator.

As for in-line vs. a separate thread: whew, that's a challenging one. I definitely see the appeal of the Lambda returning as fast as possible on error, but you really don't want your error-handling code to itself get swallowed by the void, so you'd want to make especially sure the thread kicked off has a very high degree of success at emitting the error. Unless extension threads are able to keep the Lambda alive long enough to complete the handler, I'd be concerned about the Lambda being scaled in mid-process. But I would think the error packaging would be fast enough that it'd be unlikely to drop (I'm just not knowledgeable enough about Lambda innards to know what keeps one alive).

Perhaps you could define a handler interface with a default stderr implementation and let users inject their own function, so that they could override the error roll-up behavior, use threads or not, and customize the debug sink (maybe they want to dump to S3 or Datadog or whatever).
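The injectable-sink idea above could look roughly like the sketch below. All names here (`FlushSink`, `stderr_sink`, `flush_buffer_to`) are hypothetical, not Powertools APIs; the point is the fallback so error context is never silently lost:

```python
import sys
from typing import Callable

# Hypothetical type alias: a sink receives the flushed buffer entries.
FlushSink = Callable[[list], None]


def stderr_sink(entries: list) -> None:
    # Default sink: write each buffered record to stderr, the lowest-risk target.
    for line in entries:
        print(line, file=sys.stderr)


def flush_buffer_to(entries: list, sink: FlushSink = stderr_sink) -> None:
    # If a user-supplied sink (S3, Datadog, ...) fails, fall back to stderr
    # so the error-handling path itself never swallows the logs.
    try:
        sink(entries)
    except Exception:
        stderr_sink(entries)
```

A user could then pass any callable as the sink, while the default keeps the current stderr behavior.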
@sthulb when you can, would you mind dropping the memory test you did for verbose log records kept in a circular buffer for this feature? We're handling some critical operational bits this week, hence my extended delay in responding to it. We're still committed to doing this feature and plan to deprecate log sampling, as this is a much better alternative.
Hey everyone! We started an RFC draft in this discussion: aws-powertools/powertools-lambda-typescript#3410. Even though that discussion belongs to the Powertools TypeScript repository, the feature will be implemented in all runtimes if the RFC is accepted. Anyone who wants to participate in the discussion is very welcome; we intend to continue the discussion until mid-January and then decide how the implementation will look. Thank you!
Hi everyone, I'm here with some updates on this issue. We discussed the implementation and API design in that discussion and came to some decisions. I'll copy @dreamorosi's original message and change the examples to Python instead of TypeScript.
```python
from aws_lambda_powertools import Logger
from aws_lambda_powertools.logging.buffer import BufferConfig
from aws_lambda_powertools.utilities.typing import LambdaContext

# Initialize the logger with a buffer
buffer_config = BufferConfig(max_bytes=10240)
logger = Logger(service="payment", buffer_config=buffer_config, level="DEBUG")

logger.debug("init complete")  # Not buffered


def lambda_handler(event: dict, context: LambdaContext) -> dict:
    logger.debug("start request")  # Buffered
```

which translates, in your logs, to this:

Request 1

```
{ "message": "init complete", "level": "DEBUG", ...other keys }
```

Request 2

```
// no logs
```
```python
from aws_lambda_powertools import Logger
from aws_lambda_powertools.logging.buffer import BufferConfig
from aws_lambda_powertools.utilities.typing import LambdaContext

# Initialize the logger
buffer_config = BufferConfig(max_bytes=10240)
logger = Logger(service="payment", buffer_config=buffer_config)  # INFO is the default log level


def lambda_handler(event: dict, context: LambdaContext) -> dict:
    logger.debug("I am a debug")      # buffered
    logger.info("I am an info")       # NOT buffered
    logger.warning("I am a warning")  # NOT buffered
    logger.error("I am an error")     # NOT buffered
```
```python
from aws_lambda_powertools import Logger
from aws_lambda_powertools.logging.buffer import BufferConfig
from aws_lambda_powertools.utilities.typing import LambdaContext

# Initialize the logger
buffer_config = BufferConfig(max_bytes=10240, buffer_at_verbosity="WARN")
logger = Logger(service="payment", buffer_config=buffer_config)  # INFO is the default log level


def lambda_handler(event: dict, context: LambdaContext) -> dict:
    logger.debug("I am a debug")      # buffered
    logger.info("I am an info")       # buffered
    logger.warning("I am a warning")  # buffered
    logger.error("I am an error")     # NOT buffered
```
```python
from aws_lambda_powertools import Logger
from aws_lambda_powertools.logging.buffer import BufferConfig
from aws_lambda_powertools.utilities.typing import LambdaContext

# Initialize the logger
buffer_config = BufferConfig(max_bytes=10240)
logger = Logger(service="payment", buffer_config=buffer_config)  # INFO is the default log level


def lambda_handler(event: dict, context: LambdaContext) -> dict:
    logger.debug("I am a debug")  # buffered
    logger.flush_buffer()
```
```python
from aws_lambda_powertools import Logger
from aws_lambda_powertools.logging.buffer import BufferConfig
from aws_lambda_powertools.utilities.typing import LambdaContext

# Initialize the logger
buffer_config = BufferConfig(max_bytes=10240)
logger = Logger(service="payment", buffer_config=buffer_config)  # INFO is the default log level


def lambda_handler(event: dict, context: LambdaContext) -> dict:
    logger.debug("I am a debug")  # buffered
    try:
        function_doesnt_exist()
    except Exception:
        logger.error("unable to perform operation")  # buffer is flushed
```
```python
from aws_lambda_powertools import Logger
from aws_lambda_powertools.logging.buffer import BufferConfig
from aws_lambda_powertools.utilities.typing import LambdaContext

# Initialize the logger
buffer_config = BufferConfig(max_bytes=10240, flush_on_error_log=False)  # True by default
logger = Logger(service="payment", buffer_config=buffer_config)  # INFO is the default log level


def lambda_handler(event: dict, context: LambdaContext) -> dict:
    logger.debug("I am a debug")  # buffered
    try:
        function_doesnt_exist()
    except Exception:
        logger.error("unable to perform operation")  # not buffered, written as a normal log
        logger.flush_buffer()
```
When buffering is enabled, and customers are using our decorator or middleware to wrap the handler, the buffer is also flushed automatically whenever an uncaught exception is raised.
```python
from aws_lambda_powertools import Logger
from aws_lambda_powertools.logging.buffer import BufferConfig
from aws_lambda_powertools.utilities.typing import LambdaContext

# Initialize the logger
buffer_config = BufferConfig(max_bytes=10240)
logger = Logger(service="payment", buffer_config=buffer_config)  # INFO is the default log level


def lambda_handler(event: dict, context: LambdaContext) -> dict:
    logger.debug("I am a debug 1")  # buffered
    logger.debug("I am a debug 2")  # buffered
    logger.debug("I am a debug 3")  # buffer was full, so "I am a debug 1" was evicted
    logger.flush_buffer()
```

which results in:

```
{ "message": "I am a debug 2", "level": "DEBUG", ...other keys }
{ "message": "I am a debug 3", "level": "DEBUG", ...other keys }
{ "message": "One or more log entries were evicted from the buffer because it got full. Consider increasing the \"max_bytes\" setting if you want to retain more logs", "level": "WARN", ...other keys }
```
```python
from aws_lambda_powertools import Logger
from aws_lambda_powertools.logging.buffer import BufferConfig

# Initialize the logger
buffer_config = BufferConfig(max_bytes=10240, compress=True)  # False by default
logger = Logger(service="payment", buffer_config=buffer_config)  # INFO is the default log level
```
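The `compress` option trades a little CPU for buffer headroom; structured JSON log lines compress well because the keys repeat on every record. The rough sketch below uses stdlib `gzip` to illustrate the effect (the actual compression scheme the library would use is not specified here):

```python
import gzip
import json

# Build 100 structured debug records similar to the examples above.
lines = "\n".join(
    json.dumps({"message": f"I am a debug {i}", "level": "DEBUG", "service": "payment"})
    for i in range(100)
).encode("utf-8")

# Compressing the newline-joined records shrinks them substantially,
# so the same max_bytes budget retains many more entries.
packed = gzip.compress(lines)
print(f"{len(lines)} -> {len(packed)} bytes")
```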
Use case
CloudWatch Logs are vital, but expensive. To control costs you switch the log level from DEBUG to INFO. When an exception happens you then have very little context to debug what happened, so you switch back to DEBUG and hope(!) the error happens again.
Solution/User Experience
By configuring a buffer, debug logs would not be written to CloudWatch Logs unless an exception is encountered. In this way, CWL stays low-cost and minimalist. But, like a stack trace, when an exception is encountered the buffer is flushed to CWL and all the context of the preceding events is captured.
The buffer should not create memory pressure and itself become a cause of exceptions. Using something like a ring buffer, the oldest entries are discarded in favour of the freshest. There is also likely a threshold at which a flush to CWL could take too long as an execution approaches the 900-second execution limit. The memory allocation or number of buffered lines should therefore be configurable.
Alternative solutions
No response
Acknowledgment