Add large document benchmarks, tune alias heuristic, add max depth limits #515

liggitt · 2019-10-02T16:53:18Z

This PR addresses the following items:

Parse time of excessively deep nested or indented documents

Parsing these documents is non-linear; limiting stack depth to 10,000 keeps parse times of pathological documents sub-second (~.25 seconds in benchmarks)

Alias node expansion limits

The current limit allows 10,000% expansion, which is too permissive for large documents.

Limiting to 10% expansion for larger documents allows callers to use input size as an effective way to limit resource usage. Continuing to allow larger expansion rates (up to the current 10,000% limit) for smaller documents does not unduly affect memory use.

This PR bounds decode operations from alias expansion to ~400,000 operations for small documents (worst-case ~100-150MB) or 10% of the input document for large documents, whichever is greater.

Using the same benchmark methodology as is included in this PR, memory growth of documents at a given size was measured with no alias use, and with alias use up to the point where the document was rejected:

Benchmark10KBMaps-8                    	     300	   5170631 ns/op	 2178170 B/op	   30780 allocs/op
Benchmark10KB100Aliases-8              	      10	 157137076 ns/op	63367120 B/op	 1015200 allocs/op
58MB / 2800% memory growth

Benchmark100KBMaps-8                   	      30	  49184677 ns/op	22002812 B/op	  307269 allocs/op
Benchmark100KB100Aliases-8             	       5	 324828977 ns/op	121346915 B/op	 1911494 allocs/op
94MB / 450% memory growth

Benchmark500KBMaps-8         	       5	 242092357 ns/op	110320752 B/op	 1536076 allocs/op
Benchmark500KB100Aliases-8   	       2	 646305778 ns/op	262996448 B/op	 3996330 allocs/op
145MB / 140% memory growth

Benchmark1000KBMaps-8                  	       2	 515058247 ns/op	220560496 B/op	 3072079 allocs/op
Benchmark1000KB100Aliases-8            	       1	1007482569 ns/op	393118304 B/op	 5832425 allocs/op
164MB / 78% memory growth

Benchmark2000KBMaps-8         	       2	 988554128 ns/op	440802576 B/op	 6144084 allocs/op
Benchmark2000KB100Aliases-8   	       1	1500724350 ns/op	607745104 B/op	 8809907 allocs/op
159MB / 38% memory growth

Benchmark3000KBMaps-8         	       1	1558944296 ns/op	662789424 B/op	 9216086 allocs/op
Benchmark3000KB100Aliases-8   	       1	2130929511 ns/op	804852592 B/op	11394373 allocs/op
135MB / 20% memory growth

Benchmark10000KBMaps-8                 	       1	4861632275 ns/op	2199723248 B/op	30720091 allocs/op
Benchmark10000KB100Aliases-8           	       1	6217891286 ns/op	2359273744 B/op	32710176 allocs/op
150MB / 7% memory growth

Benchmarks

Benchmarks for decoding large documents with lots of allocations
Benchmarks for runtime of decoding very deep documents

$ go test . -bench . -benchmem -run None
goos: darwin
goarch: amd64
pkg: gopkg.in/yaml.v2
Benchmark1000KB100Aliases-8            	       1	1101005272 ns/op	393203632 B/op	 5833714 allocs/op
Benchmark1000KBDeeplyNestedSlices-8    	       5	 212081876 ns/op	 4690417 B/op	    9070 allocs/op
Benchmark1000KBDeeplyNestedMaps-8      	       5	 207762964 ns/op	 4690702 B/op	    9074 allocs/op
Benchmark1000KBDeeplyNestedIndents-8   	     300	   4000775 ns/op	 2970468 B/op	   10085 allocs/op
Benchmark1000KB1000IndentLines-8       	       3	 400336822 ns/op	143718512 B/op	 4093516 allocs/op
Benchmark1KBMaps-8                     	    3000	    451697 ns/op	  218984 B/op	    3126 allocs/op
Benchmark10KBMaps-8                    	     300	   4930581 ns/op	 2178172 B/op	   30780 allocs/op
Benchmark100KBMaps-8                   	      30	  48219421 ns/op	22002820 B/op	  307269 allocs/op
Benchmark1000KBMaps-8                  	       3	 477795963 ns/op	220560496 B/op	 3072079 allocs/op
PASS
ok  	gopkg.in/yaml.v2	15.168s

liggitt · 2019-10-02T18:13:59Z

cc @niemeyer

…mits

niemeyer · 2019-10-02T18:28:30Z

This looks simple and reasonable. There are minor non-functional details, but I won't bother you with those. Thanks for doing the proper analysis backing it.

sfowl · 2019-10-09T05:54:35Z

@liggitt @niemeyer This patch partly comprises the k8s fix for CVE-2019-11253, which updates gopkg.in/yaml.v2 to version 2.2.4. Presumably any repository that uses an earlier version of gopkg.in/yaml.v2 and accepts untrusted YAML is similarly vulnerable. Rather than assigning individual CVEs for every repo that uses gopkg.in/yaml.v2 in a vulnerable way, I am wondering if it makes more sense to assign a CVE directly to gopkg.in/yaml.v2.

There is certainly an argument of "caveat emptor": that gopkg.in/yaml.v2 should not bear the responsibility of other projects to accept untrusted YAML. However, assigning a new CVE direct to gopkg.in/yaml.v2 allows for those projects to be alerted of this patch and potential vulnerability, which is essentially the same (i.e. allows for excessive resource usage via crafted YAML).

I can assist with CVE assignment for gopkg.in/yaml, if there are no objections.

frioux · 2019-10-09T23:03:24Z

Shouldn't this also be merged into v3? (And I guess v1?)

niemeyer · 2019-10-10T08:42:22Z

This patch is an improvement on a fix that was backported from v3. Now we need to forward port it again.

About assigning a CVE to go-yaml, the fix for this is to not accept certain valid documents, so this is a vulnerability in the specification more than it is in the code. We are using heuristics to attempt to draw a line between what's something reasonable from something that could be abuse or someone being naive. Even with that in place, if your system accepts a large enough YAML document you may still be in trouble, because that's the nature of variable expansion: it allows the document to grow non-linearly with its input size.

That's a class of problem that I haven't seen CVEs being assigned to before, and it still feels somewhat wrong in this case.

This is a forward port of v2 commit f221b84 by Jordan Liggitt.

go-yaml/yaml#515

full diff: go-yaml/yaml@v2.2.2...v2.2.7 includes: - go-yaml/yaml@caeefd8 addresses CVE-2019-11253 JSON/YAML parsing vulnerable to resource exhaustion attack - go-yaml/yaml#171 Tighten restrictions on float decoding - go-yaml/yaml#515 Add large document benchmarks, tune alias heuristic, add max depth limits - go-yaml/yaml@f90ceb4 fixes go-yaml/yaml#529 yaml.Unmarshal crashes on "assignment to entry in nil map" - go-yaml/yaml#543 Port stale simple_keys fix to v2 - go-yaml/yaml@1f64d61 fixes go-yaml/yaml#548 Invalid simple_keys now cause panics later in decode Signed-off-by: Sebastiaan van Stijn <github@gone.nl>

Add large document benchmarks, tune alias heuristic, add max depth li…

8ff528c

…mits

liggitt mentioned this pull request Oct 2, 2019

limit yaml/json decode size kubernetes/kubernetes#83261

Merged

niemeyer merged commit f221b84 into go-yaml:v2 Oct 2, 2019

niemeyer added a commit that referenced this pull request Oct 10, 2019

Improve heuristics preventing CPU/memory abuse (#515)

05262c8

This is a forward port of v2 commit f221b84 by Jordan Liggitt.

dolmen added a commit to dolmen-go/openapi-preprocessor that referenced this pull request Oct 16, 2019

Upgrade gopkg.in/yaml.v3 for security

f3807a8

go-yaml/yaml#515

dolmen mentioned this pull request Oct 16, 2019

Upgrade gopkg.in/yaml.v3 for security dolmen-go/openapi-preprocessor#4

Merged

simonpasquier mentioned this pull request Oct 17, 2019

Bump gopkg.in/yaml.v2 prometheus/common#213

Merged

kayrus mentioned this pull request Oct 17, 2019

Bump gopkg.in/yaml.v2 to 2.2.4 gardener/gardener#1517

Closed

thaJeztah mentioned this pull request Nov 28, 2019

Fix vendored packages and update gopkg.in/yaml.v2 to v2.2.7 (CVE-2019-11253) kubernetes-sigs/yaml#29

Merged

nsitbon mentioned this pull request Dec 4, 2019

breaking changes #558

Closed

bartle-stripe mentioned this pull request Jun 25, 2020

Add option to disable limits added by PR #515 #632

Open

NathanZook mentioned this pull request Apr 22, 2021

Node aliases result in duplicate objects #732

Open

pivotaljohn mentioned this pull request Sep 20, 2021

[lang] anchors are not supported carvel-dev/ytt#186

Closed

neild mentioned this pull request Aug 22, 2022

x/vulndb: potential Go vuln in gopkg.in/yaml.v2: CVE-2022-3064 golang/vulndb#956

Closed

nclv mentioned this pull request Jul 11, 2024

Security issue: Parsing malicious or large YAML documents can consume excessive amounts of CPU or memory. goccy/go-yaml#461

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add large document benchmarks, tune alias heuristic, add max depth limits #515

Add large document benchmarks, tune alias heuristic, add max depth limits #515

liggitt commented Oct 2, 2019 •

edited

Loading

liggitt commented Oct 2, 2019

niemeyer commented Oct 2, 2019

sfowl commented Oct 9, 2019

frioux commented Oct 9, 2019

niemeyer commented Oct 10, 2019

Add large document benchmarks, tune alias heuristic, add max depth limits #515

Add large document benchmarks, tune alias heuristic, add max depth limits #515

Conversation

liggitt commented Oct 2, 2019 • edited Loading

Parse time of excessively deep nested or indented documents

Alias node expansion limits

Benchmarks

liggitt commented Oct 2, 2019

niemeyer commented Oct 2, 2019

sfowl commented Oct 9, 2019

frioux commented Oct 9, 2019

niemeyer commented Oct 10, 2019

liggitt commented Oct 2, 2019 •

edited

Loading