Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve SubSegment Naming #32

Merged

Conversation

therealryanbonham
Copy link
Contributor

Remove Invalid Chars from URL used in name to prevent errors being logged.

@therealryanbonham
Copy link
Contributor Author

Addresses #31

@haotianw465
Copy link
Contributor

Thank you for providing the fix. The log warning is at entity level to indicate the actual name is not as same as the original intent whether it is user specified or patcher code. Having the log warning usually means a bug in user code where unsupported characters are used. But in this case it reflects an issue when the SQL Alchemy patcher is assembling the name.

IMO selectively copy the logic from entity class can be a workaround. But we cannot merge this as a real fix. The subsegment name should not contain "?" or anything after "?" at the first place, so the fix should correct the information extraction from the connection string.

@thehesiod
Copy link
Contributor

btw I added this for httplib: https://github.com/aws/aws-xray-sdk-python/blob/master/aws_xray_sdk/ext/util.py#L97 may prove useful

@therealryanbonham
Copy link
Contributor Author

@haotianw465 This uses strip_url now as well as doing the logic check from entity. I also added the segment/subsegment name to the log message about invalid chars to make debuging easier for people.

# Strip URL of ? and following text
sub_name = strip_url(sql['url'])
# Ensure url has a valid characters.
sub_name = ''.join([c for c in sub_name if c in _valid_name_characters])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a duplicate to https://github.com/aws/aws-xray-sdk-python/blob/master/aws_xray_sdk/core/models/entity.py#L33. IMO when we see a warning on invalid characters we fix from its source, like the strip_url here. But we should not deliberately skip the log warning by placing sub_name = ''.join([c for c in sub_name if c in _valid_name_characters]) everywhere.

Otherwise it looks fine.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without that code, we would still end up logging warnings for certain SQL URLs. For example 'échéanciers' is a valid name or a database in postgres, but would be stripped to 'chanciers' for x-ray... I left this code to prevent the warnings as the sqlalchemy plugin is setting this, so it is not in users control.. If you want it removed I will, let me know.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with special casing the sql subsegment name. I will create a follow up issue for further write an utility module that strips unicode for patchers that might capture unicode as subsegment name.

@haotianw465 haotianw465 added the bug label Mar 7, 2018
@haotianw465
Copy link
Contributor

Oh one thing I forgot is that could you add one line on CHANGELOG "unreleased" section?

@thehesiod
Copy link
Contributor

would be nice if it changed échéanciers to echeanciers if that makes sense

@haotianw465
Copy link
Contributor

@thehesiod it's nice to have for a utility class to convert some certain unicode characters to ASCII letters. I will make sure to include that on the follow up ISSUE.

@haotianw465 haotianw465 merged commit 29997ec into aws:master Mar 8, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants