-
Notifications
You must be signed in to change notification settings - Fork 124
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Sanitize NSVCA fields on YAML parser #3346
Conversation
@dralley The test is failing with the same permission issue I was having locally, so I guess I won't open an issue about that on I'm not sure how to approach this exactly, but I can do some exploration work next week. usr/local/lib/python3.8/site-packages/pulp_rpm/app/modulemd.py:192: in parse_modular
for module in split_modulemd_file(file):
usr/local/lib/python3.8/site-packages/pulp_rpm/app/modulemd.py:71: in split_modulemd_file
with tempfile.TemporaryDirectory(dir=".") as tf:
usr/lib64/python3.8/tempfile.py:780: in __init__
self.name = mkdtemp(suffix, prefix, dir)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
suffix = '', prefix = 'tmp', dir = '.'
def mkdtemp(suffix=None, prefix=None, dir=None):
"""User-callable function to create and return a unique temporary
directory. The return value is the pathname of the directory.
Arguments are as for mkstemp, except that the 'text' argument is
not accepted.
The directory is readable, writable, and searchable only by the
creating user.
Caller is responsible for deleting the directory when done with it.
"""
prefix, suffix, dir, output_type = _sanitize_params(prefix, suffix, dir)
names = _get_candidate_names()
if output_type is bytes:
names = map(_os.fsencode, names)
for seq in range(TMP_MAX):
name = next(names)
file = _os.path.join(dir, prefix + name + suffix)
_sys.audit("tempfile.mkdtemp", file)
try:
> _os.mkdir(file, 0o700)
E PermissionError: [Errno 13] Permission denied: './tmpu6rk65ns' |
Have you verified that the tests fail without this change? |
Yes. The failure was the one we expected |
5f0c091
to
d34ad7e
Compare
Ok, so there is an exception: Almost everything is text fields, but we do have a boolean or two, and some JSON that need to be stored as such. The JSON might be OK currently, but it looks like the booleans are not. The CI errors are referring to this field: https://github.com/pulp/pulp_rpm/blob/main/pulp_rpm/app/models/modulemd.py#L30 So we need to find a way to make sure floats are parsed as strings, but not booleans and such. |
I was thinking about doing a simple bool casting, but still, other fields inside those Json could end up messed up. I'll go back to the more conservative approach of the custom Loader, which provides per-field control over not casting it. Then I'll add only the NSVCA fields as fields which should "bypass YAML" (thus, always be Wdyt? |
"""Custom Loader that preserve unquoted float in specific fields""" | ||
|
||
# Field to preserve (will bypass yaml casting) | ||
PRESERVED_FIELDS = ("name", "stream", "version", "context", "arch") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this ought to handle the stream
field on ModulemdDefaults and ModulemdObsoletes as well, but we ought to make sure it's tested
https://github.com/pulp/pulp_rpm/blob/main/pulp_rpm/app/models/modulemd.py#L101
We'll talk about this a bit more on a call
69cfd7b
to
d1f7fe6
Compare
pulp_rpm/tests/unit/test_modulemd.py
Outdated
@@ -90,8 +89,7 @@ def test_parse_modular_preserves_literal_unquoted_values_3285(tmp_path): | |||
with open(file_name, "w") as file: | |||
file.write(sample_file_data) | |||
|
|||
mock_obj = SimpleNamespace(path=str(file_name)) | |||
all, defaults, obsoletes = parse_modular(mock_obj) | |||
all, defaults, obsoletes = parse_modular(str(file_name)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should be able to get rid of the str()
around file_name
, I'm not sure why that was here before?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The str
is there because I was using a Path
object before. Will remove it
pulp_rpm/app/modulemd.py
Outdated
|
||
|
||
class ModularYamlLoader(yaml.SafeLoader): | ||
"""Custom Loader that preserve unquoted float in specific fields""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we leave a bit more detail, including the issue number where this was originally encountered, and perhaps point to the libmodulemd equivalent code
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, where this code in particular came from
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure
pulp_rpm/tests/unit/test_modulemd.py
Outdated
# parse_modular changes the structure and key names for obsoletes | ||
assert modulemd_obsoletes["module_name"] == "perl" | ||
assert modulemd_obsoletes["module_stream"] == "5.30" | ||
assert modulemd_obsoletes["obsoleted_by_module_name"] == "5.40" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this be obsoleted_by_module_stream
?
edit: oh look, a bug
the actual metadata is unaffected since we use raw yaml snippets. Our other tests are primarily interested in this, and so it wasn't caught. Only the metadata Pulp itself keeps for reference is wrong
This is a pretty minor bug, all things considered, because module obsoletes are basically unused and even less often would this metadata in question be referenced. However we should probably fix it anyways.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be
assert modulemd_obsoletes["obsoleted_by_module_name"] == "perl"
assert modulemd_obsoletes["obsoleted_by_module_stream"] == "5.40"
Let's also fill these tests out with a few more fields
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found "obsoleted_by_module_name"
odd, but just thought, "it is what it is".
Will fix the constant name and test more fields
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
Nice job. You can trust your instincts, though! Just because it works in a certain way currently, doesn't mean it's right :)
3caeb02
to
a9d5d95
Compare
When adding more asserts, I've found that |
The failure is due to the fix in obsoletes constant name, I'm taking a look into the test (will add those changes to another commit) |
Most likely, the constant in the test is wrong. What I did was take the output from the initial implementation, check it by eye to ensure it was correct, and then commit that. But it's possible I missed something, in which case the constant would be incorrect. |
85b925b
to
3b40687
Compare
- Replace yaml.safe_load by YAML loader that prevents unquoted numbers in NSVCA fields to loose trailing/heading zeros. - Add unit test to `parse_modular` for the specific issue closes pulp#3285
- Pass only necesary data to parse_modular (the path string) - Improve some docstrings and simple types [noissue]
3b40687
to
4739428
Compare
modulemd_obsoletes["message"] | ||
== "Module stream perl:5.30 is no longer supported. Please switch to perl:5.32" | ||
) | ||
assert modulemd_obsoletes["obsoleted_by_module_stream"] == "5.40" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add obsoleted_by_module_name
also
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work! This is good to go with one last small addition to the test
OBSOLETE_BY_STREAM attr was named as "obsoleted_by_module_name" instead of "obsoleted_by_name_stream". Fix tests accordingly. [noissue]
4739428
to
f3d5e8e
Compare
This PR adds support for reading the literal value (non-casted value) of an unquoted number from a YAML file in the
parse_modular
function.This is a customization of the YAML spec, but is the expected behavior for this context (see #3285)
parse_modular
for the specific issue