Improvement the int conversion overflow logic to handle bound checks #1194

czechbol · 2024-08-27T21:14:30Z

As per discussion in #1187, the implementation was lacking. This PR extends the bounds check logic to validate that the integer size is checked within the bounds of the destination type.

I'd like to hear @rittneje's opinion on the changes as they reported the issues with it and @ben-krieger's too since he is already looking into it as well.

fixes #1187
fixes #1202

codecov-commenter · 2024-08-27T21:21:14Z

⚠️ Please install the to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

Attention: Patch coverage is 81.65138% with 40 lines in your changes missing coverage. Please review.

Project coverage is 68.48%. Comparing base (ea5b276) to head (9f0427a).
Report is 1 commits behind head on master.

Files with missing lines	Patch %	Lines
analyzers/conversion_overflow.go	81.65%	22 Missing and 18 partials ⚠️

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #1194      +/-   ##
==========================================
+ Coverage   67.84%   68.48%   +0.64%     
==========================================
  Files          75       75              
  Lines        4186     4360     +174     
==========================================
+ Hits         2840     2986     +146     
- Misses       1210     1226      +16     
- Partials      136      148      +12

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

ben-krieger

I left some comments on how I might make the code a bit more clear. Feel free to completely ignore.

I did test this out on my codebase locally and it removed the same false negatives as my PR #1193. Therefore, I will be closing my PR in favor of yours.

analyzers/conversion_overflow.go

rittneje · 2024-08-27T23:42:37Z

@czechbol Unfortunately, I've never worked with the Go parser/AST before, so I won't be much help.

analyzers/conversion_overflow.go

czechbol · 2024-08-28T09:40:39Z

I apologise for the email spam. Got my github accounts mixed up.

Everything should now be accounted for apart from this comment - #1194 (comment).

I'm still thinking through how to tackle it.

ccojocar · 2024-08-28T11:44:28Z

Everything should now be accounted for apart from this comment - #1194 (comment).

I'm still thinking through how to tackle it.

@czechbol Thanks for addressing the review comments. Please just let me know when this is ready for final review.

ccojocar · 2024-08-28T11:45:57Z

I think this requires a rebase, since there are some conflicts in testutils/g115_samples.go which need to be resolved.

ben-krieger · 2024-08-28T13:24:24Z

Unfortunately with the latest changes, I've seen two regressions running gosec on my private codebase.

First, is a case of:

	xbLen := len(xb)
	if xbLen > math.MaxUint16 {
		return nil, fmt.Errorf("x contains too many bytes")
	}

	b = binary.BigEndian.AppendUint16(b, uint16(xbLen))

Second is a case of:

	numEntries := len(ov.Entries)
	if numEntries > math.MaxUint8 {
		return nil, fmt.Errorf(...)
	}

	_ = uint8(numEntries)

I'm guessing this is because int -> uintX doesn't guarantee that the value is > 0. However, len semantically does... This may be a little hard to check for, but I think it's worthwhile.

czechbol · 2024-08-28T13:47:44Z

I think this requires a rebase, since there are some conflicts in testutils/g115_samples.go which need to be resolved.

sure, that is not an issue

I'm guessing this is because int -> uintX doesn't guarantee that the value is > 0. However, len semantically does... This may be a little hard to check for, but I think it's worthwhile.

I'll look into it too after I figure out the conditional stuff.

analyzers/conversion_overflow.go

czechbol · 2024-08-28T20:25:00Z

This code is definitely not final. It will most likely need some refactoring after the following cases are handled as well.

I currently have some way to handle AND and OR conditions in the if statements. Since the SSA does not give me a nice way to check the entire if statement and breaks it up into component instructions, I need to look through them to determine their relationship. If you have suggestions how to do it differently, please let me know.

Currently unhandled cases that I'm aware of:

Convert instruction inside the if block. Even if I'm not a fan of this code style, the false positives should be resolved.
NOT operation within the conditions.

I hope to be able to solve these in the coming days. Some help is also welcome if anyone wants to get this done sooner.

analyzers/conversion_overflow.go

rittneje · 2024-08-28T22:01:35Z

@czechbol I am currently getting a panic on the following input:

package main

import "math"

func foo(x []string) int16 {
	if len(x) < math.MaxInt16 {
		return int16(len(x))
	}
	return 0
}

func main() {

}

panic: runtime error: index out of range [1] with length 1

goroutine 1 [running]:
github.com/securego/gosec/v2/analyzers.checkIfForRangeCheck(0xc0004aa048, 0xc0008107d0, {0xf8?, 0x7f61505e3178?}, 0xc000cab570)
/home/rittneje/gosec/analyzers/conversion_overflow.go:211 +0x568
github.com/securego/gosec/v2/analyzers.hasExplicitRangeCheck(0xc0008107d0, {0xc000289cc0?, 0x5?})
/home/rittneje/gosec/analyzers/conversion_overflow.go:176 +0x2b9
github.com/securego/gosec/v2/analyzers.isSafeConversion(0xc0008107d0)
/home/rittneje/gosec/analyzers/conversion_overflow.go:93 +0xaa
github.com/securego/gosec/v2/analyzers.runConversionOverflow(0xc000190a80)
/home/rittneje/gosec/analyzers/conversion_overflow.go:54 +0x27e
github.com/securego/gosec/v2.(*Analyzer).CheckAnalyzers(0xc0007c6080, 0xc00032cd80)
/home/rittneje/gosec/analyzer.go:446 +0x4a2
github.com/securego/gosec/v2.(*Analyzer).Process(0xc0007c6080, {0x0, 0x0, 0x0}, {0xc0007d6c50, 0x1, 0x3d?})
/home/rittneje/gosec/analyzer.go:317 +0x488
main.main()
/home/rittneje/gosec/cmd/gosec/main.go:476 +0xde5

rittneje · 2024-08-28T22:17:20Z

@czechbol I haven't thought through it much, but I'm wondering if it would be better to keep track of the known minimum and maximum bounds themselves, instead of just flags. That way we have a good foundation for adding support for more interesting cases in the future. As a rough idea:

func foo(x int64) int16 {
    // Initially our "context" is that x is in the range [MinInt64, MaxInt64].

    if x > math.MaxInt16 {
        return math.MaxInt16
    }

    // Now because of the return, we can negate the if check and "apply" it to the context.
    // Thus x is in the range [MinInt64, MaxInt16].

    if x < math.MinInt16 {
        return math.MinInt16
    }

    // Again because of the return, we can negate the if check and "apply" it to the context.
    // Thus x is in the range [MinInt16, MaxInt16].

    // We know that x must be in range, so this conversion is safe.
    return int16(x)
}

In the future, I think this bookkeeping could help with situations involving arithmetic:

func foo(x uint64) uint16 {
   if x > math.MaxUint16 - 1 {
       return math.MaxUint16
   }
    return uint16(x + 1)
}

ben-krieger · 2024-08-28T22:18:31Z

@czechbol I am currently getting a panic on the following input:
package main

import "math"

func foo(x []string) int16 {
	if len(x) < math.MaxInt16 {
		return int16(len(x))
	}
	return 0
}

func main() {

}
panic: runtime error: index out of range [1] with length 1
goroutine 1 [running]:
github.com/securego/gosec/v2/analyzers.checkIfForRangeCheck(0xc0004aa048, 0xc0008107d0, {0xf8?, 0x7f61505e3178?}, 0xc000cab570)
/home/rittneje/gosec/analyzers/conversion_overflow.go:211 +0x568
github.com/securego/gosec/v2/analyzers.hasExplicitRangeCheck(0xc0008107d0, {0xc000289cc0?, 0x5?})
/home/rittneje/gosec/analyzers/conversion_overflow.go:176 +0x2b9
github.com/securego/gosec/v2/analyzers.isSafeConversion(0xc0008107d0)
/home/rittneje/gosec/analyzers/conversion_overflow.go:93 +0xaa
github.com/securego/gosec/v2/analyzers.runConversionOverflow(0xc000190a80)
/home/rittneje/gosec/analyzers/conversion_overflow.go:54 +0x27e
github.com/securego/gosec/v2.(*Analyzer).CheckAnalyzers(0xc0007c6080, 0xc00032cd80)
/home/rittneje/gosec/analyzer.go:446 +0x4a2
github.com/securego/gosec/v2.(*Analyzer).Process(0xc0007c6080, {0x0, 0x0, 0x0}, {0xc0007d6c50, 0x1, 0x3d?})
/home/rittneje/gosec/analyzer.go:317 +0x488
main.main()
/home/rittneje/gosec/cmd/gosec/main.go:476 +0xde5

Putting the panic aside, we actually won't be able to allow this conversion. SSA IR will have each len(x) as a separate value. Therefore, the latter value which is converted won't be considered upper-bound-checked, since the bound check was actually on a different value.

I imagine that this case will frustrate a lot of people with gosec, but there's nothing we can do. Realizing that both len calls will have the same value is a compiler optimization at best, since I'm pretty sure you can mutate a slice's length using reflection.

czechbol · 2024-08-28T22:27:22Z

I am currently getting a panic on the following input

@rittneje Yes, I know there's panics, that's why I said I "have some way" of handling conditions, that Converts inside if blocks don't work and that it's not ready to be merged yet. There still is work to do, but I'm slowly covering more test cases and want to share the changes with you. If you find any other curious test cases, feel free to suggest them or open a PR to my branch.

I'll be honest, I definitely didn't fully understand what I was getting myself into when I opened #1189 but I'm in too deep now and I still want to fix all the false positives I can.

If what @ben-krieger says is true then it's a shame but what can we do.
Perhaps we could push this PR into a good enough middle ground solution and iterate from there.

In the meantime I'm switching this PR to a Draft.

ben-krieger · 2024-08-29T01:01:01Z

I am currently getting a panic on the following input

@rittneje Yes, I know there's panics, that's why I said I "have some way" of handling conditions, that Converts inside if blocks don't work and that it's not ready to be merged yet. There still is work to do, but I'm slowly covering more test cases and want to share the changes with you. If you find any other curious test cases, feel free to suggest them or open a PR to my branch.

I'll be honest, I definitely didn't fully understand what I was getting myself into when I opened #1189 but I'm in too deep now and I still want to fix all the false positives I can.

If what @ben-krieger says is true then it's a shame but what can we do. Perhaps we could push this PR into a good enough middle ground solution and iterate from there.

In the meantime I'm switching this PR to a Draft.

@czechbol I also feel like I jumped into something out of my depth. I was absolutely nerd-sniped.

But you've done a great job! In fact, I'm quite impressed with what 3 unprepared devs could do to significantly tackle these false positives.

In my opinion, the main thing holding us back is not even our experience with SSA or language analysis, but the fact that this probably really requires complex inspection of the abstract syntax tree to understand things like "does a failed bounds check result in conversion code being unreachable" and "does a possibly complex boolean expression check either bound, none, or both?" This is no doubt difficult even for developers of compilers!

So I think we should just take a few more good looks at what we can do here, document the most common false positives (and the couple false negatives we will introduce), and ship it.

I know this PR has merit, because it saves me from 22 obviously false positives in just one library I maintain. This is good work!

ccojocar · 2024-08-29T11:02:51Z

I currently have some way to handle AND and OR conditions in the if statements. Since the SSA does not give me a nice way to check the entire if statement and breaks it up into component instructions, I need to look through them to determine their relationship. If you have suggestions how to do it differently, please let me know.

@czechbol I think from If instruction you can get the Block to which it belongs to, and within that Block you have all outer scope instructions, which you can reference from the Cond instruction. Also you can get all function instructions block via the Parent(). It is more complicated if you want to track back through multiple function calls since the current SSA implementation doesn't provide the graph outside of the package.

I would focus to cover the most common uses cases at the function level. Validation outside of the function block can be addressed later.

To simplify, I would track the bounds and keep their values through different if checks similar with @rittneje suggested in #1194 (comment). The check can be then performed only once at the conversion time.

I would approach something like:

detect a conversion extract the variable reference
run a bounds analysis into the function instructions and track the values of the bounds through various if checks, or parse function calls
verify the overflow based on the type size and current bounds value at that point of conversion

czechbol · 2024-08-30T22:16:24Z

Now this is an interesting corner I've gotten myself into. I've involuntarily created a situation where the range check is an insufficient solution.

If someone wants to explicitly check a value with an equals operator, I currently have no way of handling that.

@ccojocar What do you think? Should I consider this out of scope of this PR?

analyzers/conversion_overflow.go

ccojocar · 2024-09-01T07:39:51Z

If someone wants to explicitly check a value with an equals operator, I currently have no way of handling that.
What do you think? Should I consider this out of scope of this PR?

It sounds good to me to handle it later since the current implementation in converting quite a good chunk of use cases. A good solution is better than a perfect solution which is not implemented.

Signed-off-by: czechbol <adamludes@gmail.com>

Co-authored-by: Ben Krieger <ben.krieger@intel.com>

Signed-off-by: czechbol <adamludes@gmail.com>

Change-Id: I8da6495eaaf25b1739389aa98492bd7df338085b Signed-off-by: Cosmin Cojocar <ccojocar@google.com>

Signed-off-by: czechbol <adamludes@gmail.com>

Change-Id: I0db56cb0a5f9ab6e815e2480ec0b66d7061b23d3 Signed-off-by: Cosmin Cojocar <ccojocar@google.com>

ccojocar · 2024-09-04T14:01:40Z

golangci-lint keeps complaining about a potential overflow but if I build gosec locally it doesn't flag anything. I'd disregard that action for now as they're still using the old version of gosec.

I ignored the warning for now.

stephenc · 2024-09-04T14:20:52Z

Does this also allow us to signal intentional truncation without a bounds check, e.g. uint16(x &0xffff)

ben-krieger · 2024-09-04T14:26:30Z

Does this also allow us to signal intentional truncation without a bounds check, e.g. uint16(x &0xffff)

No. Someone would need to take up a significant effort in order to support bounds checks that involve 1) arithmetic binary operations, 2) boolean logic operations, or 3) the min and max builtins from Go 1.21.

Perhaps we can create a ticket to track this? But I'm not sure who will sign up for that level of effort, so maybe it should be implemented across multiple PRs.

czechbol · 2024-09-04T14:37:04Z

@stephenc I'm sorry I claimed that it has been resolved I only noticed the sentence about truncation after you commented here.

That being said I currently don't have the time capacity to continue working on this feature, I've already sank too much time into it. I hope you understand that.

I've pushed this PR to a point where most people can at least work around the issue somehow and the most common examples that we could come up with should be covered.

stephenc · 2024-09-04T14:39:34Z

@czechbol fixing the ranges is great. As you closed my issue perhaps you could just create a placeholder about adding support for bitmasks to signal intentional truncation so that it does not get lost. No rush to implement, I can live with // # nosec G115 for now

stephenc · 2024-09-04T14:49:23Z

Does this also allow us to signal intentional truncation without a bounds check, e.g. uint16(x &0xffff)

No. Someone would need to take up a significant effort in order to support bounds checks that involve 1) arithmetic binary operations, 2) boolean logic operations, or 3) the min and max builtins from Go 1.21.

Perhaps we can create a ticket to track this? But I'm not sure who will sign up for that level of effort, so maybe it should be implemented across multiple PRs.

I see those as all different cases. I think intentional truncation is always going to be of a fixed pattern... a mask that fits into the target type and a bitwise and... and for it to be intentional I would pair that with both being inside the cast... such a cast, for the all 1's case, should be optimized as simple truncation by the compiler and thus it is clearly signalling intent to truncate. as things currently stand I don't see how I can signal intentional truncation other than by disabling the rule on the line which could mask an unintended truncation

ben-krieger · 2024-09-04T15:09:48Z

Does this also allow us to signal intentional truncation without a bounds check, e.g. uint16(x &0xffff)

No. Someone would need to take up a significant effort in order to support bounds checks that involve 1) arithmetic binary operations, 2) boolean logic operations, or 3) the min and max builtins from Go 1.21.
Perhaps we can create a ticket to track this? But I'm not sure who will sign up for that level of effort, so maybe it should be implemented across multiple PRs.

I see those as all different cases. I think intentional truncation is always going to be of a fixed pattern... a mask that fits into the target type and a bitwise and... and for it to be intentional I would pair that with both being inside the cast... such a cast, for the all 1's case, should be optimized as simple truncation by the compiler and thus it is clearly signalling intent to truncate. as things currently stand I don't see how I can signal intentional truncation other than by disabling the rule on the line which could mask an unintended truncation

The (ugly) way to get gosec to accept the conversion would be something like:

x = x&0xffff
if x < 0 || x > math.MaxUint16 {
    panic("unreachable")
}
_ = uint16(x)

I'm no compiler dev, but I expect that the entire if-block with panic would be pruned during compiling and not even have a runtime effect.

czechbol · 2024-09-04T15:19:12Z

@ben-krieger, @stephenc, @rittneje, @ccojocar FYI I opened #1212 to compile all the false positives we can find in a single issue.

ben-krieger reviewed Aug 27, 2024

View reviewed changes

ben-krieger mentioned this pull request Aug 27, 2024

Fix conversion overflow false positive when value is explicitly checked #1193

Closed

3 tasks

rittneje reviewed Aug 27, 2024

View reviewed changes