Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds file inspection callback and example code #170

Merged

Conversation

val-ms
Copy link
Contributor

@val-ms val-ms commented Jun 17, 2021

libclamav callbacks can be used to access embedded file content at each
layer of extraction during the course of a scan. The existing callbacks
only provide access to the file descriptor and a guess at the file type.

This patch adds a new callback for the purposes of file/archive
inspection that provides additional insight into the embedded file.
This includes:

  • ancestors: an array of parent file names
  • parent file size: the size of the direct parent layer
  • file name: current layer's filename, if any.
  • file buffer (pointer)
  • file size: size of file buffer
  • file type: just a guess at the current file's type
  • file descriptor: may be -1 if the layer is in-memory only.
  • layer attributes: a flag field. see LAYER_ATTRIBUTE_* defines in clamav.h

Two new example apps are added that are automatically built when
compiling under CMake:

  • ex2 demonstrates the prescan callback.
  • ex3 demonstrates the new file inspection callback.

The examples are now installed if enabled, so you can test them in the
Docker image, and so that they'll be colocated with the DLLs so you can
test them on Windows. The installed examples should also be able to find
the UnRAR library at run time, without having to set LD_LIBRARY_PATH.

This commit also sets the fmap->name in an fmap-scan using the basname
of the provided filename if the caller provided the filename and the
provided fmap does not have the name set.

@val-ms
Copy link
Contributor Author

val-ms commented Jun 17, 2021

This PR was replicated from an internal backlogged PR. It is pending additional research & testing. I'm not yet certain this new API offers the best parameters. Will have to work with other teams to demonstrate its value.

@val-ms
Copy link
Contributor Author

val-ms commented Jun 17, 2021

I forgot to note that to test the example program ex3, you need to tell it where to load libclamunrar_iface for RAR support to work. Eg:

LD_LIBRARY_PATH=/home/micasnyd/workspace/clamav/build/libclamunrar_iface ./examples/ex3_file_inspection ~/Downloads/test-files.zip

@val-ms val-ms added the 🚧experiment just an experiemnt, do not merge label Jul 19, 2021
@val-ms val-ms force-pushed the CLAM-1402-firepower-archive-inspection branch from a283af5 to 8ad5dfc Compare September 3, 2021 18:07
@val-ms
Copy link
Contributor Author

val-ms commented Oct 14, 2021

Some issues identified in review so far:

  • need parent filename with cl_scanmap() / cl_scanmap_callback().
    • right now the original filename appears to be lost with the scanmap API.
  • need recursion depth parameter in callback.
  • give information if the parent file is encrypted which will be useful if any decryption / password guessing capabilities are added.

@val-ms val-ms force-pushed the CLAM-1402-firepower-archive-inspection branch from 1dfacb6 to 97ef21f Compare December 7, 2021 02:56
@val-ms val-ms marked this pull request as draft January 14, 2022 19:04
@val-ms val-ms added ⚒️Work in Progress (WIP) and removed 🚧experiment just an experiemnt, do not merge labels Feb 9, 2022
@val-ms val-ms force-pushed the CLAM-1402-firepower-archive-inspection branch from 13395b8 to b44d803 Compare March 6, 2022 00:01
@val-ms val-ms marked this pull request as ready for review March 6, 2022 00:31
@val-ms val-ms requested a review from shutton March 6, 2022 00:31
@val-ms
Copy link
Contributor Author

val-ms commented Mar 6, 2022

I've updated this PR once more, making it so the "is normalized" and "was encrypted" booleans are provided in an attributes flag field that may be extended without breaking the API.
I've also changed the "parents" parameter to an "ancestors" parameter that is now an array of filenames. the filenames may be NULL for layers that don't have names.

I'm content with how this works now. Ready for review.

Copy link
Contributor

@shutton shutton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not familiar enough to follow the logic, so mainly commenting on style.

Also need to consider the risk level of this feature for 0.105. Presumably regression testing should vet it, but...

libclamav/matcher.c Outdated Show resolved Hide resolved
libclamav/scanners.c Outdated Show resolved Hide resolved
libclamav/scanners.c Outdated Show resolved Hide resolved
@val-ms val-ms requested a review from shutton March 10, 2022 06:34
@val-ms val-ms force-pushed the CLAM-1402-firepower-archive-inspection branch from ad19210 to d0b39c2 Compare March 10, 2022 22:21
@val-ms val-ms force-pushed the CLAM-1402-firepower-archive-inspection branch from d0b39c2 to 359a459 Compare May 12, 2022 22:45
@val-ms val-ms force-pushed the CLAM-1402-firepower-archive-inspection branch from 359a459 to 0cff2a8 Compare June 26, 2022 18:57
@val-ms
Copy link
Contributor Author

val-ms commented Jun 26, 2022

Just rebased to update it.

Copy link
Contributor

@shutton shutton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ship it!

libclamav callbacks can be used to access embedded file content at each
layer of extraction during the course of a scan. The existing callbacks
only provide access to the file descriptor and a guess at the file type.

This patch adds a new callback for the purposes of file/archive
inspection that provides additional insight into the embedded file.
This includes:
- ancestors: an array of parent file names
- parent file size: the size of the direct parent layer
- file name: current layer's filename, if any.
- file buffer (pointer)
- file size: size of file buffer
- file type: just a guess at the current file's type
- file descriptor: may be -1 if the layer is in-memory only.
- layer attributes: a flag field. see LAYER_ATTRIBUTE_* defines in clamav.h

Two new example apps are added that are automatically built when
compiling under CMake:
- ex2 demonstrates the prescan callback.
- ex3 demonstrates the new file inspection callback.

The examples are now installed if enabled, so you can test them in the
Docker image, and so that they'll be colocated with the DLLs so you can
test them on Windows. The installed examples should also be able to find
the UnRAR library at run time, without having to set LD_LIBRARY_PATH.

This commit also sets the fmap->name in an fmap-scan using the basname
of the provided filename if the caller provided the filename and the
provided fmap does not have the name set.
Also demo the post-scan callback with alert-encrypted enabled.
The current implementation sets a "next layer attributes" flag field
in the scan context. This may introduce bugs if accidentally not cleared
during error handling, causing that attribute to be applied to a
different layer than intended.

This commit resolves that by adding an attribute flag to the major
internal scan functions and removing the "next layer attributes" from
the scan context. This attributes flag shares the same flag fields as
the attributes flag in the new file inspection callback and the flags
are defined in `clamav.h`.
@val-ms val-ms force-pushed the CLAM-1402-firepower-archive-inspection branch from 0cff2a8 to 1b13b99 Compare October 13, 2022 05:24
@val-ms val-ms merged commit cf81299 into Cisco-Talos:main Oct 13, 2022
@val-ms
Copy link
Contributor Author

val-ms commented Oct 13, 2022

Test results look good. Merging.

This new callback API will likely be annotated as "unstable" in the upcoming feature release in case we are required to adjust the parameters in the next feature release.

@val-ms val-ms deleted the CLAM-1402-firepower-archive-inspection branch October 13, 2022 15:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants