-
Notifications
You must be signed in to change notification settings - Fork 723
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove unnecessary memory allocation limit checks and rename max-allocation related functions #1137
Merged
val-ms
merged 12 commits into
Cisco-Talos:main
from
val-ms:remove-unnecessary-cli_-allocations
Mar 15, 2024
Merged
Remove unnecessary memory allocation limit checks and rename max-allocation related functions #1137
val-ms
merged 12 commits into
Cisco-Talos:main
from
val-ms:remove-unnecessary-cli_-allocations
Mar 15, 2024
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
c4eac44
to
81449c5
Compare
81449c5
to
1d09996
Compare
ragusaa
suggested changes
Mar 7, 2024
ragusaa
approved these changes
Mar 12, 2024
There are a large number of allocations for fix sized buffers using the `cli_malloc` and `cli_calloc` calls that check if the requested size is larger than our allocation threshold for allocations based on untrusted input. These allocations will *always* be higher than the threshold, so the extra stack frame and check for these calls is a waste of CPU. This commit replaces needless calls with A -> B: - cli_malloc -> malloc - cli_calloc -> calloc - CLI_MALLOC -> MALLOC - CLI_CALLOC -> CALLOC I also noticed that our MPOOL_MALLOC / MPOOL_CALLOC are not limited by the max-allocation threshold, when MMAP is found/enabled. But the alternative was set to cli_malloc / cli_calloc when disabled. I changed those as well. I didn't change the cli_realloc/2 calls because our version of realloc not only implements a threshold but also stabilizes the undefined behavior in realloc to protect against accidental double-free's. It may be worth implementing a cli_realloc that doesn't have the threshold built-in, however, so as to allow reallocaitons for things like buffers for loading signatures, which aren't subject to the same concern as allocations for scanning possible malware. There was one case in mbox.c where I changed MALLOC -> CLI_MALLOC, because it appears to be allocating based on untrusted input.
We have some special functions to wrap malloc, calloc, and realloc to make sure we don't allocate more than some limit, similar to the max-filesize and max-scansize limits. Our wrappers are really only needed when allocating memory for scans based on untrusted user input, where a scan file could have bytes that claim you need to allocate some ridiculous amount of memory. Right now they're named: - cli_malloc - cli_calloc - cli_realloc - cli_realloc2 ... and these names do not convey their purpose This commit renames them to: - cli_max_malloc - cli_max_calloc - cli_max_realloc - cli_max_realloc2 The realloc ones also have an additional feature in that they will not free your pointer if you try to realloc to 0 bytes. Freeing the memory is undefined by the C spec, and only done with some realloc implementations, so this stabilizes on the behavior of not doing that, which should prevent accidental double-free's. So for the case where you may want to realloc and do not need to have a maximum, this commit adds the following functions: - cli_safer_realloc - cli_safer_realloc2 These are used for the MPOOL_REALLOC and MPOOL_REALLOC2 macros when MPOOL is disabled (e.g. because mmap-support is not found), so as to match the behavior in the mpool_realloc/2 functions that do not make use of the allocation-limit.
Should be safe to allocate a smaller string than one we already have.
The cli_max_malloc, cli_max_calloc, and cli_max_realloc functions provide a way to protect against allocating too much memory when the size of the allocation is derived from the untrusted input. Specifically, we worry about values in the file being scanned being manipulated to exhaust the RAM and crash the application. There is no need to check the limits if the size of the allocation is fixed, or if the size of the allocation is necessary for signature loading, or the general operation of the applications. E.g. checking the max-allocation limit for the size of a hash, or for the size of the scan recursion stack, is a complete waste of time. Although we significantly increased the max-allocation limit in a recent release, it is best not to check an allocation if the allocation will be safe. It would be a waste of time. I am also hopeful that if we can reduce the number allocations that require a limit-check to those that require it for the safe scan of a file, then eventually we can store the limit in the scan- context, and make it configurable.
Some sort of code merge way-back-when resulted in two identical max-allocation checks. I removed the noisy ones.
A code merge resulted in a duplicate copy of the CLI_STRDUP macro. Also fixed formatting.
Bytecode signature's are able to allocate buffers, but should probably adhere to clamav's max allocation limit. This adds a check to make sure they don't accidentally alloc too much based on untrusted user input.
Allocations for bytecode signatures to work need not check against the memory allocation limit, as bytecode signatures are considered trusted user input. You may note that I did not remove allocation limits from the bytecode API functions that may be called by the signatures such as adding json objects, hashsets, lzma and bz2 decompressors, etc. This is because it is likely that a bytecode signature may call them more times based on the structure of the file being scanned - particularly for the json objects.
Variables like the number of signature parts are considered trusted user input and so allocations based on those values need not check the memory allocation limit. Specifically for the allocation of the normalized buffer in cli_scanscript, I determined that the size of SCANBUFF is fixed and so safe, and the maxpatlen comes from the signature load and is therefore also trusted, so we do not need to check the allocation limit.
We add the _OR_GOTO_DONE suffix to the macros that go to done if the allocation fails. This makes it obvious what is different about the macro versus the equivalent function, and that error handling is built-in. Renamed the cli_strdup to safer_strdup to make it obvious that it exists because it is safer than regular strdup. Regular strdup doesn't have the NULL check before trying to dup, and so may result in a NULL-deref crash. Also remove unused STRDUP (_OR_GOTO_DONE) macro, since the one with the NULL-check is preferred.
1d09996
to
b3c9a56
Compare
mtremer
pushed a commit
to ipfire/ipfire-2.x
that referenced
this pull request
Jan 9, 2025
- Update from version 1.3.2 to 1.4.1 - Update of rootfile - Changelog 1.4.1 ClamAV 1.4.1 is a critical patch release with the following fixes: - [CVE-2024-20506](https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2024-20506): Changed the logging module to disable following symlinks on Linux and Unix systems so as to prevent an attacker with existing access to the 'clamd' or 'freshclam' services from using a symlink to corrupt system files. This issue affects all currently supported versions. It will be fixed in: - 1.4.1 - 1.3.2 - 1.0.7 - 0.103.12 Thank you to Detlef for identifying this issue. - [CVE-2024-20505](https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2024-20505): Fixed a possible out-of-bounds read bug in the PDF file parser that could cause a denial-of-service (DoS) condition. This issue affects all currently supported versions. It will be fixed in: - 1.4.1 - 1.3.2 - 1.0.7 - 0.103.12 Thank you to OSS-Fuzz for identifying this issue. - Removed unused Python modules from freshclam tests including deprecated 'cgi' module that is expected to cause test failures in Python 3.13. 1.4.0 Major changes - Added support for extracting ALZ archives. The new ClamAV file type for ALZ archives is `CL_TYPE_ALZ`. Added a [DCONF](https://docs.clamav.net/manual/Signatures/DynamicConfig.html) option to enable or disable ALZ archive support. > _Tip_: DCONF (Dynamic CONFiguration) is a feature that allows for some > configuration changes to be made via ClamAV `.cfg` "signatures". - [GitHub pull request](Cisco-Talos/clamav#1183) - Added support for extracting LHA/LZH archives. The new ClamAV file type for LHA/LZH archives is `CL_TYPE_LHA_LZH`. Added a [DCONF](https://docs.clamav.net/manual/Signatures/DynamicConfig.html) option to enable or disable LHA/LZH archive support. - [GitHub pull request](Cisco-Talos/clamav#1192) - Added the ability to disable image fuzzy hashing, if needed. For context, image fuzzy hashing is a detection mechanism useful for identifying malware by matching images included with the malware or phishing email/document. New ClamScan options: ``` --scan-image[=yes(*)/no] --scan-image-fuzzy-hash[=yes(*)/no] ``` New ClamD config options: ``` ScanImage yes(*)/no ScanImageFuzzyHash yes(*)/no ``` New libclamav scan options: ```c options.parse &= ~CL_SCAN_PARSE_IMAGE; options.parse &= ~CL_SCAN_PARSE_IMAGE_FUZZY_HASH; ``` Added a [DCONF](https://docs.clamav.net/manual/Signatures/DynamicConfig.html) option to enable or disable image fuzzy hashing support. - [GitHub pull request](Cisco-Talos/clamav#1186) Other improvements - Added cross-compiling instructions for targeting ARM64/aarch64 processors for [Windows](https://github.com/Cisco-Talos/clamav/blob/main/INSTALL-cross-windows-arm64.md) and [Linux](https://github.com/Cisco-Talos/clamav/blob/main/INSTALL-cross-linux-arm64.md). - [GitHub pull request](Cisco-Talos/clamav#1116) - Improved the Freshclam warning messages when being blocked or rate limited so as to include the Cloudflare Ray ID, which helps with issue triage. - [GitHub pull request](Cisco-Talos/clamav#1195) - Removed unnecessary memory allocation checks when the size to be allocated is fixed or comes from a trusted source. We also renamed internal memory allocation functions and macros, so it is more obvious what each function does. - [GitHub pull request](Cisco-Talos/clamav#1137) - Improved the Freshclam documentation to make it clear that the `--datadir` option must be an absolute path to a directory that already exists, is writable by Freshclam, and is readable by ClamScan and ClamD. - [GitHub pull request](Cisco-Talos/clamav#1199) - Added an optimization to avoid calculating the file hash if the clean file cache has been disabled. The file hash may still be calculated as needed to perform hash-based signature matching if any hash-based signatures exist that target a file of the same size, or if any hash-based signatures exist that target "any" file size. - [GitHub pull request](Cisco-Talos/clamav#1167) - Added an improvement to the SystemD service file for ClamOnAcc so that the service will shut down faster on some systems. - [GitHub pull request](Cisco-Talos/clamav#1164) - Added a CMake build dependency on the version map files so that the build will re-run if changes are made to the version map files. Work courtesy of Sebastian Andrzej Siewior. - [GitHub pull request](Cisco-Talos/clamav#1294) - Added an improvement to the CMake build so that the RUSTFLAGS settings are inherited from the environment. Work courtesy of liushuyu. - [GitHub pull request](Cisco-Talos/clamav#1301) Bug fixes - Silenced confusing warning message when scanning some HTML files. - [GitHub pull request](Cisco-Talos/clamav#1252) - Fixed minor compiler warnings. - [GitHub pull request](Cisco-Talos/clamav#1197) - Since the build system changed from Autotools to CMake, ClamAV no longer supports building with configurations where bzip2, libxml2, libz, libjson-c, or libpcre2 are not available. Libpcre is no longer supported in favor of libpcre2. In this release, we removed all the dead code associated with those unsupported build configurations. - [GitHub pull request](Cisco-Talos/clamav#1217) - Fixed assorted typos. Patch courtesy of RainRat. - [GitHub pull request](Cisco-Talos/clamav#1228) - Added missing documentation for the ClamScan `--force-to-disk` option. - [GitHub pull request](Cisco-Talos/clamav#1186) - Fixed an issue where ClamAV unit tests would prefer an older libclamunrar_iface library from the install path, if present, rather than the recently compiled library in the build path. - [GitHub pull request](Cisco-Talos/clamav#1258) - Fixed a build issue on Windows with newer versions of Rust. Also upgraded GitHub Actions imports to fix CI failures. Fixes courtesy of liushuyu. - [GitHub pull request](Cisco-Talos/clamav#1307) - Fixed an unaligned pointer dereference issue on select architectures. Fix courtesy of Sebastian Andrzej Siewior. - [GitHub pull request](Cisco-Talos/clamav#1293) - Fixed a bug that prevented loading plaintext (non-CVD) signature files when using the `--fail-if-cvd-older-than=DAYS` / `FailIfCvdOlderThan` option. Fix courtesy of Bark. - [GitHub pull request](Cisco-Talos/clamav#1309) Signed-off-by: Adolf Belka <adolf.belka@ipfire.org> Signed-off-by: Arne Fitzenreiter <arne_f@ipfire.org>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
The goal of the ClamAV's functions like "cli_malloc" and "cli_realloc" are to check that the size of the allocation is not unreasonable. But we only need to do that check when the size is derived from untrusted user input (aka may vary based on values in the possibly malicious file that we're scanning).
There are many calls to allocate memory where the size of the allocation is either known at compile time or else is derived from trusted user input. Examples of variable but trusted allocation sizes include:
This PR seeks to eliminate those unnecessary max-allocation checks. This is mostly because checking the limits in these cases is a waste of CPU. But it also sets groundwork for a future improvement where we may pass in the memory allocation limit as a parameter, and make it generally configurable for the scanning service. This would be meaningful for anyone wishing to crank up the max-filesize to scan large archives that decompress to very large amounts of data.
Further, this PR renames the max-allocation functions and related functions and macros so it's obvious what they do. I suspect that the reason the likes of "cli_calloc" and "cli_realloc" were so widely used for fixed size allocations is that the developers knew we had a special wrapper for allocating memory and didn't know when they should be used.
What's also drawing attention to is the functions that wrap allocations but do NOT check the allocation-limit. These are
cli_strdup
,cli_realloc
, andcli_realloc2
. These functions are safer than the originalstrdup
andrealloc
functions from the C "standard library".strdup
, our wrapper adds a NULL check to prevent accidental NULL-deref crashes.realloc
, the "standard library" one is not standard. Some variations may free the pointer if size == 0. Some do not. Ours wrapper does not. In this way, we know the behavior will be consistent. Our version also has two variants. The first,cli_realloc
does not free the pointer on failure. The second,cli_realloc2
, does free the pointer on failure. That second behavior is generally less desirable because you must remember to set the pointer to NULL beforegoto done;
or else you risk a double-free. I have not removed the latter variant simply because it's a lot of work, so there will still be two variants. In this PR, I improve the documentation and I rename those two so you can easily tell the difference.Also, for both the
strdup
andrealloc
wrappers, we now have a "cli_safer_" prefix on them when they do not check the the allocation limit. When we do check the allocation limit, we instead use a "cli_max_" prefix.Finally, for the macros which wrap the allocation-related functions, I've added a "_OR_GOTO_DONE" suffix to the macro names. This makes it clear what is special about the macros, and that error handling is built int. If the allocation fails, then we
goto done;
.In summary, the functions and macros are now: