-
Notifications
You must be signed in to change notification settings - Fork 205
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use Psych.safe_load by default #487
Conversation
For context,
|
I support this change, but what do you plan to do with the version number? Feels like a major change. |
Yes, definitely major version change. |
f3cc8b1
to
f351e51
Compare
I am extremely supportive of switching to safe by default, even going through a full rename dance here. deprecate lol ... I just read the PR looks like you are very close already, only change I recommend adding is deprecating |
I've made |
Yeah sounds like at worst this is allowing some cascade of components to cause memory bloat. Enormously better place to be than RCE friendly by default. I guess people like this kind of stuff and tend to use it?
One concern is that the "load" then is less cross compatible with other frameworks. |
That's true, but it's already the current behavior. For example: YAML.load(<<-eostr)
---
:a: 1
eostr The above code will result in the same data structure before and after this PR is merged. Allowing symbols is a compromise for compatibility / non-surprising behavior (as it's what people expect today). Also, interlanguage compatibility hasn't really been a priority for the library. I'm leaning towards allowing symbols for |
+1 for the interface change. How about releasing |
Yes, that is a great idea. I will do it! |
Psych.load is not safe for use with untrusted data. Too many applications make the mistake of using `Psych.load` with untrusted data and that ends up with some kind of security vulnerability. This commit changes the default `Psych.load` to use `safe_load`. Users that want to parse trusted data can use Psych.unsafe_load.
818f830
to
0767227
Compare
I shipped this in v4.0.0. Thanks! |
Friday, our CI is red. Only if there was a better week day to push big releases out, hmmm 🤔 Well, a mistake in our |
Follow ruby/psych#487. This PR fixes the following Ruby 3.1.0-dev build matrix error. ```console % ruby -ryaml -ve 'p Psych::VERSION' ruby 3.1.0dev (2021-05-17T10:51:51Z master ee611341c9) [x86_64-darwin19] "4.0.0" % bundle exec rspec ./spec/rubocop/cli/options_spec.rb (snip) Failures: 1) RuboCop::CLI options --show-cops with no args prints the current configuration Failure/Error: printed_config = YAML.load(out.join) # rubocop:disable Security/YAMLLoad Psych::DisallowedClass: Tried to load unspecified class: Regexp Shared Example Group: "prints config" called from ./spec/rubocop/cli/options_spec.rb:1104 # (eval):2:in `regexp' # ./spec/rubocop/cli/options_spec.rb:1027:in `block (4 levels) in <top (required)>' # ./spec/support/cli_spec_behavior.rb:26:in `block (2 levels) in <top (required)>' # ./lib/rubocop/rspec/shared_contexts.rb:30:in `block (4 levels) in <top (required)>' # ./lib/rubocop/rspec/shared_contexts.rb:30:in `chdir' # ./lib/rubocop/rspec/shared_contexts.rb:30:in `block (3 levels) in <top (required)>' # ./lib/rubocop/rspec/shared_contexts.rb:7:in `block (2 levels) in <top (required)>' Finished in 25.97 seconds (files took 1.02 seconds to load) 117 examples, 1 failure Failed examples: rspec ./spec/rubocop/cli/options_spec.rb[1:13:1:4] # RuboCop::CLI options --show-cops with no args prints the current configuration ``` https://app.circleci.com/pipelines/github/rubocop/rubocop/4637/workflows/d6722067-28c1-4740-a1b8-abea06edab0d/jobs/185620
Psych 4 load in safe mode by default: ruby/psych#487 So aliases need to be explicitly allowed
Psych 4 load in safe mode by default: ruby/psych#487 So aliases need to be explicitly allowed
Ruby 3.1 contains Psych version 4 as the default YAML parsing library. This has a backwards compatibility break where YAML.load_file behaviour has changed from an unsafe load to a safe load [1]. This causes an exception to occur when using GOV.UK Puppet Hieradata YAML files are parsed, as these make use of aliases - which is a feature that needs to be enabled for YAML in safe load. To resolve this issue I've set a conditional to check if the safe_load method exists on the YAML object. I did consider whether it could be resoled by pinning the psych dependency, but this seemed a bit delicate as this tool has been authored to have no external dependencies and is installed explicitly without them [2]. I couldn't identify an approach to apply a test for this that was consistent with this repo's code and test conventions. Due to the fluctuating standard library it'd have been a little contrived. Before: ``` ➜ ~ govuk-connect app-console -e integration asset-manager Connecting to the app console for asset-manager, in the integration environment The relevant hosting provider is aws /usr/local/Cellar/ruby/3.1.0/lib/ruby/3.1.0/psych/visitors/to_ruby.rb:430:in `visit_Psych_Nodes_Alias': Unknown alias: node_class (Psych::BadAlias) from /usr/local/Cellar/ruby/3.1.0/lib/ruby/3.1.0/psych/visitors/visitor.rb:30:in `visit' from /usr/local/Cellar/ruby/3.1.0/lib/ruby/3.1.0/psych/visitors/visitor.rb:6:in `accept' from /usr/local/Cellar/ruby/3.1.0/lib/ruby/3.1.0/psych/visitors/to_ruby.rb:35:in `accept' from /usr/local/Cellar/ruby/3.1.0/lib/ruby/3.1.0/psych/visitors/to_ruby.rb:345:in `block in revive_hash' from /usr/local/Cellar/ruby/3.1.0/lib/ruby/3.1.0/psych/visitors/to_ruby.rb:343:in `each' from /usr/local/Cellar/ruby/3.1.0/lib/ruby/3.1.0/psych/visitors/to_ruby.rb:343:in `each_slice' from /usr/local/Cellar/ruby/3.1.0/lib/ruby/3.1.0/psych/visitors/to_ruby.rb:343:in `revive_hash' from /usr/local/Cellar/ruby/3.1.0/lib/ruby/3.1.0/psych/visitors/to_ruby.rb:167:in `visit_Psych_Nodes_Mapping' from /usr/local/Cellar/ruby/3.1.0/lib/ruby/3.1.0/psych/visitors/visitor.rb:30:in `visit' from /usr/local/Cellar/ruby/3.1.0/lib/ruby/3.1.0/psych/visitors/visitor.rb:6:in `accept' from /usr/local/Cellar/ruby/3.1.0/lib/ruby/3.1.0/psych/visitors/to_ruby.rb:35:in `accept' from /usr/local/Cellar/ruby/3.1.0/lib/ruby/3.1.0/psych/visitors/to_ruby.rb:345:in `block in revive_hash' from /usr/local/Cellar/ruby/3.1.0/lib/ruby/3.1.0/psych/visitors/to_ruby.rb:343:in `each' from /usr/local/Cellar/ruby/3.1.0/lib/ruby/3.1.0/psych/visitors/to_ruby.rb:343:in `each_slice' from /usr/local/Cellar/ruby/3.1.0/lib/ruby/3.1.0/psych/visitors/to_ruby.rb:343:in `revive_hash' from /usr/local/Cellar/ruby/3.1.0/lib/ruby/3.1.0/psych/visitors/to_ruby.rb:167:in `visit_Psych_Nodes_Mapping' from /usr/local/Cellar/ruby/3.1.0/lib/ruby/3.1.0/psych/visitors/visitor.rb:30:in `visit' from /usr/local/Cellar/ruby/3.1.0/lib/ruby/3.1.0/psych/visitors/visitor.rb:6:in `accept' from /usr/local/Cellar/ruby/3.1.0/lib/ruby/3.1.0/psych/visitors/to_ruby.rb:35:in `accept' from /usr/local/Cellar/ruby/3.1.0/lib/ruby/3.1.0/psych/visitors/to_ruby.rb:318:in `visit_Psych_Nodes_Document' from /usr/local/Cellar/ruby/3.1.0/lib/ruby/3.1.0/psych/visitors/visitor.rb:30:in `visit' from /usr/local/Cellar/ruby/3.1.0/lib/ruby/3.1.0/psych/visitors/visitor.rb:6:in `accept' from /usr/local/Cellar/ruby/3.1.0/lib/ruby/3.1.0/psych/visitors/to_ruby.rb:35:in `accept' from /usr/local/Cellar/ruby/3.1.0/lib/ruby/3.1.0/psych.rb:335:in `safe_load' from /usr/local/Cellar/ruby/3.1.0/lib/ruby/3.1.0/psych.rb:370:in `load' from /usr/local/Cellar/ruby/3.1.0/lib/ruby/3.1.0/psych.rb:671:in `block in load_file' from /usr/local/Cellar/ruby/3.1.0/lib/ruby/3.1.0/psych.rb:670:in `open' from /usr/local/Cellar/ruby/3.1.0/lib/ruby/3.1.0/psych.rb:670:in `load_file' ``` After: ``` ➜ ~ govuk-connect app-console -e integration asset-manager Connecting to the app console for asset-manager, in the integration environment The relevant hosting provider is aws The relevant node class is backend ``` [1]: ruby/psych#487 [2]: https://github.com/alphagov/homebrew-gds/blob/de1fb95fbda6219ad7158ab0c71253dcb55822d9/Formula/govuk-connect.rb#L12
Ruby master ships with Psych 4.0.0 which makes YAML.load defaults to safe mode (ruby/psych#487). Keep compatibility by using unsafe_load.
defaults to safe mode (ruby/psych#487). Keep compatibility by using unsafe_load. From: settingslogic#21 Also: https://stackoverflow.com/a/71192990
This isn't used and no longer works after upgrading to Ruby 3.1, since that involves an upgrade to Psych, which no longer allows deserializing from arbitrary classes: ruby/psych#487
Psych 4 load in safe mode by default: ruby/psych#487 So aliases need to be explicitly allowed
Passing in the permitted classes as the second argument was deprecated in Psych 3 (ruby/psych#378) and removed in Psych 4 (ruby/psych#487). Ruby 3.1 ships with Psych 4, so to make the code work for Ruby 2.7 and up, use the `permitted_classes` keyword argument.
With ruby 3.1.0, psych version 4.0.0 got introduced. Starting psych version 4.0.0 the `load` method defaults to `safe_load`. See ruby/psych#487 See https://stdgems.org/psych/ This got introduced to prevent users from accidentally load yaml content from untrusted user input. But now breaks certain patterns, when using the gem. See https://bugs.ruby-lang.org/issues/17866 To still allow users to load trusted documents, the `unsafe_load` method got introduced. See ruby/psych#488 I added a ruby version check for backwards compatibility.
Remove private methods unused since #487
Psych 4 load in safe mode by default: ruby/psych#487 So aliases need to be explicitly allowed
See ruby/psych#487 for more information.
Ruby master ships with Psych 4.0.0 which makes `YAML.load_file` defaults to safe mode (ruby/psych#487). Keep compatibility by using `unsafe_load_file`.
Psych.load is not safe for use with untrusted data. Too many
applications make the mistake of using
Psych.load
with untrusted dataand that ends up with some kind of security vulnerability.
This commit changes the default
Psych.load
to usesafe_load
. Usersthat want to parse trusted data can use Psych.unsafe_load.
Psych.load
orYAML.load
are susceptible to a Remote Code Execution (or RCE) flaw. The problem isn't with the load functions themselves, but they can be used to abuse other objects that are in the system to escalate to an RCE. This blog post goes in to details about how YAML can be used in conjunction with RubyGems to execute arbitrary code. If someone tries to load YAML from an untrusted source, bad actors could use this issue to execute arbitrary code in the target system.Psych provides a method called "safe_load" which can't be used to launch such an attack. The proposal here is to change the default such that "load" is now "safe_load". Of course the drawback is that "safe_load" is much more limited than "load", and making this change could break existing code.
Also it's probably worth mentioning that the YAML load functions could be used with other objects in the system besides RubyGems to escalate to an RCE. Also worth mentioning that anything that can be used to load arbitrary objects (like
Marshal.load
) can be abused in the same way. So neither changing RubyGems nor changing to a different serialization scheme (like Marshal) would plug this hole.