Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

new(userspace/libsinsp): allow to not retrieve detailed user info #1765

Merged
merged 1 commit into from
Apr 19, 2024

Conversation

erthalion
Copy link
Contributor

What type of PR is this?

Uncomment one (or more) /kind <> lines:

/kind bug

/kind cleanup

/kind design

/kind documentation

/kind failing-test

/kind feature

Any specific area of the project related to this PR?

Uncomment one (or more) /area <> lines:

/area API-version

/area build

/area CI

/area driver-kmod

/area driver-bpf

/area driver-modern-bpf

/area libscap-engine-bpf

/area libscap-engine-gvisor

/area libscap-engine-kmod

/area libscap-engine-modern-bpf

/area libscap-engine-nodriver

/area libscap-engine-noop

/area libscap-engine-source-plugin

/area libscap-engine-savefile

/area libscap

/area libpman

/area libsinsp

/area tests

/area proposals

Does this PR require a change in the driver versions?

/version driver-API-version-major

/version driver-API-version-minor

/version driver-API-version-patch

/version driver-SCHEMA-version-major

/version driver-SCHEMA-version-minor

/version driver-SCHEMA-version-patch

sinsp_threadinfo contains two fields with user and login_user
information. Since those fields are of scap_userinfo type and statically
allocated, they take a lot of space:

	scap_userinfo              m_user;               /*   368  2312 */
	scap_userinfo              m_loginuser;          /*  2680  2312 */

which is 4624 bytes out of 5728 for the whole sinsp_threadinfo:

	/* size: 5728, cachelines: 90, members: 64 */

Most of this memory is coming from the fields name
(MAX_CREDENTIALS_STR_LEN), homedir and shell (both SCAP_MAX_PATH_SIZE).
For a process-heavy workload this can mean a lot of memory taken for
these purposes, as in the example below (max possible threadinfo cache size):

mem

The memory usage breakdown shows that most of it goes to manage
sinsp_threadinfo (where _M_allocate_node is a part of threadinfo allocation
process, e.g. in get_thread_ref):

73.4  43.0%  43.0%     73.5  43.1% std::__detail::_Hashtable_alloc::_M_allocate_node
37.2  21.8%  64.8%     37.2  21.8% sinsp_parser::reserve_event_buffer
30.7  18.0%  82.8%    109.9  64.4% sinsp_thread_manager::new_threadinfo

To make memory management more flexible, split m_user/m_loginuser into
two set of fields:

  • one containing uid/gid, which are ubiquitous and generally used
    everywhere
  • one for the rest of heavy details, which are needed less often

The new user_details structure is not supposed to use separately from
sinsp_threadinfo, thus it's defined inside the class. Allow to configure
whether to use the full or slim version of threadinfo via compile time
configuration.

Which issue(s) this PR fixes:

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

NONE

Copy link
Contributor

@FedeDP FedeDP left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand the need, but i feel like a runtime configuration would be much better, like:

  • change API to void set_import_users(bool import_users, bool details=true);
  • modify scap_{userinfo,groupinfo} structures so that all chars array become just char pointers
  • by default, all sinsp_userinfo->{name,homedir,shell} will default to point to the <NA> static string if details==false

This should need the same amount of changes and is runtime configurable. Of course we need to pass the details bool down to scap, but since that information bit is already managed by sinsp_user class, it should be pretty easy.
WDYT?

case TYPE_SHELL:
RETURN_EXTRACT_CSTR(tinfo->m_user.shell);
RETURN_EXTRACT_CSTR(tinfo->m_user->shell);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't this crash when dereferencing m_user that is nullptr (ie: when SINSP_USER_DETAILS is undefined)?

@erthalion
Copy link
Contributor Author

erthalion commented Mar 26, 2024

@FedeDP thanks for looking into this!

This should need the same amount of changes and is runtime configurable.

I was thinking about this, but haven't found any use-case for switching it at runtime. Which one do you have in mind?

Of course we need to pass the details bool down to scap, but since that information bit is already managed by sinsp_user class, it should be pretty easy. WDYT?

Not sure what do you mean by "information bit is already managed by sinsp_user"? My initial impression was that modifying scap_userinfo would be quite an invasive change, because it would touch at least two libraries (libscap and libsinsp) instead of one.

@FedeDP
Copy link
Contributor

FedeDP commented Mar 26, 2024

Not sure what do you mean by "information bit is already managed by sinsp_user"?

I mean that sinsp_usergroup_manager already manages all the user/group related info and the "talk with scap" bit.

My initial impression was that modifying scap_userinfo would be quite an invasive change

On a second thought, we also dump scap_user and scap_group to scap files in headers blocks; even if they should be unused by now (since sinsp also emits useradded and groupadded events and those are used to rebuild machine state during scap files replay). Therefore modifying those structs would also mean a scap files break and we cannot do that.

We might just create a sinsp_user and sinsp_group layer that differs from the scap one because it uses pointers (and not arrays) for strings.

I was thinking about this, but haven't found any use-case for switching it at runtime

It's just that i'd avoid the build time complexity given it is not needed in this specific case. I can see Falco gaining an option to disable the gathering of user and group details in some specific scenarios (lots of threadinfos and thus lots of ram used by Falco itself).

@erthalion erthalion force-pushed the feature/split-user-info branch from 824448e to 2df0a34 Compare March 27, 2024 15:22
@erthalion
Copy link
Contributor Author

We might just create a sinsp_user and sinsp_group layer that differs from the scap one because it uses pointers (and not arrays) for strings.

[...]

It's just that i'd avoid the build time complexity given it is not needed in this specific case. I can see Falco gaining an option to disable the gathering of user and group details in some specific scenarios (lots of threadinfos and thus lots of ram used by Falco itself).

Ok, @FedeDP I've updated PR based on your feedback, does it look closer to what you had in mind? Probably the only difference is that if not requested, name/homedir/shell are still going to be empty, not "N/A", but it's on purpose -- I still can't force myself to waste even these few bytes :)

Copy link
Contributor

@FedeDP FedeDP left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you very much! This looks fine, left some comments and questions!

userspace/libsinsp/threadinfo.cpp Outdated Show resolved Hide resolved
@@ -1125,6 +1142,7 @@ class SINSP_PUBLIC sinsp : public capture_stats_source
bool m_isfatfile_enabled;
bool m_isinternal_events_enabled;
bool m_hostname_and_port_resolution_enabled;
bool m_user_details_enabled;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd move this bool inside m_usergroup_manager.

\param enable If set to false, no extended user information will be
stored in sinsp_threadinfo, only user id/group id will be available.
*/
void set_user_details(bool enable);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer to have a single method set_import_users(bool import_users, bool with_details=true), instead of having 2 different APIs.

{
uint32_t uid; ///< User ID
uint32_t gid; ///< Group ID
std::string name; ///< Username
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Am not sure whether using string is what we want here; but perhaps it works fine (ie: have you already tested memory consumption with this change?)
Anyway, storing <NA> when details are not enabled should not cost that much, right? Returning an empty string breaks UX since we always used NA for that.

Copy link
Contributor Author

@erthalion erthalion Mar 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I've re-run memory test with this change, getting the same results. From what I understand, an emptystring doesn't introduce as much overhead as statically allocated array anyway (although it might be larger than some alternatives).

Let me do couple more experiments to estimate how much exactly an overhead from using string and initializing it with NA would be. Another option to this could be to turn this structure into a class with a method getName, that will return NA if the underlying string is empty. This would seem to satisfy both memory and UX sides, but will involve more changes. What do you think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I've re-run memory test with this change, getting the same results. From what I understand, and emptystring doesn't introduce as much overhead as statically allocated array anyway (although it might be larger than some alternatives).

+1 thanks!

Another option to this could be to turn this structure into a class with a method getName, that will return NA if the underlying string is empty.

Let's first try if storing <NA> in the string causes overhead, otherwise yep this would be an ok solution for me!
Thank you very much for the deep dive btw!

@erthalion
Copy link
Contributor Author

@FedeDP here is what I came up with to address the latest feedback. Essentially everything as discussed, except few unexpected changes in filtercheck*, since the current implementation needs to get a pointer to all the values. I've run the benchmark over it, the result still holds, about ~1.1 Gi with user details enabled, and ~0.85 Gi without them.

@FedeDP
Copy link
Contributor

FedeDP commented Apr 2, 2024

Sorry for the noise!
Actually, i love the new design too, i think it's a nice new addition to be able to just retrieve uids and gids without further info.

@FedeDP
Copy link
Contributor

FedeDP commented Apr 2, 2024

/milestone 0.16.0

@poiana poiana added this to the 0.16.0 milestone Apr 2, 2024
@FedeDP
Copy link
Contributor

FedeDP commented Apr 2, 2024

I think you need to fix chisels build :) Shoudl be pretty straightforward!

@FedeDP
Copy link
Contributor

FedeDP commented Apr 15, 2024

scap_event_encode_params_v

The failure comes from the wasm build...as far as i see, you haven't touched anything wasm specific though :/

@erthalion erthalion force-pushed the feature/split-user-info branch from 875b652 to 4513e73 Compare April 15, 2024 08:37
@erthalion
Copy link
Contributor Author

scap_event_encode_params_v

The failure comes from the wasm build...as far as i see, you haven't touched anything wasm specific though :/

Yeah, I'm going to try to simulate exactly the same build procedure as in this test, maybe that would help.

@FedeDP
Copy link
Contributor

FedeDP commented Apr 16, 2024

Perhaps @therealbobo can help us here? He was the first working on the emiscripten build integration if i recall correctly! 🙏

@erthalion
Copy link
Contributor Author

erthalion commented Apr 16, 2024

Perhaps @therealbobo can help us here? He was the first working on the emiscripten build integration if i recall correctly! 🙏

@FedeDP, @therealbobo Yeah, that would be great. I can reproduce the issue, but it's extremely weird and I can't come up with any explanation how it could be connected to the PR itself. What I observe is that scap_event_encode_params_v fails trying to access a parameter via va_arg(args, struct scap_const_sized_buffer). My understanding is that the buffer in question is the empty_buffer from the test, which is passed to add_event_advance_ts. Adding more logging indeed shows that the offending argument has field buf = 0xffffffff, but why and how is still not clear to me.

It seems that to trigger the issue it's enough to add only the subclasses sinsp_userinfo/sinsp_groupinfo and add extra members of those classes.

@FedeDP
Copy link
Contributor

FedeDP commented Apr 17, 2024

Am not able to test on my machine because my nodejs version is throwing some errors :/

TypeError: Failed to parse URL from /home/federico/Work/libs/build-user/libsinsp/test/unit-test-libsinsp.wasm
    at Object.fetch (node:internal/deps/undici/undici:11929:11) {
  [cause]: TypeError [ERR_INVALID_URL]: Invalid URL
      at new NodeError (node:internal/errors:400:5)
      at URL.onParseError (node:internal/url:565:9)
      at new URL (node:internal/url:645:5)
      at new Request (node:internal/deps/undici/undici:7315:25)
      at fetch2 (node:internal/deps/undici/undici:11013:25)
      at Object.fetch (node:internal/deps/undici/undici:11927:18)
      at fetch (node:internal/process/pre_execution:237:25)
      at instantiateAsync (/home/federico/Work/libs/build-user/libsinsp/test/unit-test-libsinsp.js:1393:14)
      at createWasm (/home/federico/Work/libs/build-user/libsinsp/test/unit-test-libsinsp.js:1429:3)
      at Object.<anonymous> (/home/federico/Work/libs/build-user/libsinsp/test/unit-test-libsinsp.js:6200:11) {
    input: '/home/federico/Work/libs/build-user/libsinsp/test/unit-test-libsinsp.wasm',
    code: 'ERR_INVALID_URL'
  }
}

Node.js v18.13.0

@therealbobo
Copy link
Contributor

therealbobo commented Apr 17, 2024

Sorry for the late response! I gave a quick look and after some search I tried with success the following fix:

diff --git a/userspace/libsinsp/test/CMakeLists.txt b/userspace/libsinsp/test/CMakeLists.txt
index 85ef11a3a..98c2a3c13 100644
--- a/userspace/libsinsp/test/CMakeLists.txt
+++ b/userspace/libsinsp/test/CMakeLists.txt
@@ -169,6 +169,7 @@ add_executable(unit-test-libsinsp ${LIBSINSP_UNIT_TESTS_SOURCES})

 if (EMSCRIPTEN)
        target_compile_options(unit-test-libsinsp PRIVATE "-sDISABLE_EXCEPTION_CATCHING=0")
+       target_link_options(unit-test-libsinsp PRIVATE "-sSTACK_SIZE=5MB")
        target_link_options(unit-test-libsinsp PRIVATE "-sDISABLE_EXCEPTION_CATCHING=0")
        target_link_options(unit-test-libsinsp PRIVATE "-sALLOW_MEMORY_GROWTH=1")
        target_link_options(unit-test-libsinsp PRIVATE "-sEXPORTED_FUNCTIONS=['_main','_htons','_ntohs']")

The problem is that the linker option `-sSTACK_SIZE' is not present in the emscripten version of the CI AFAIK but I could be wrong. Probably 5MB is overkill and we can settle with less.

@FedeDP
Copy link
Contributor

FedeDP commented Apr 17, 2024

Thanks bobo! What i don't get is why that CI is only failing on this PR though.

@FedeDP
Copy link
Contributor

FedeDP commented Apr 17, 2024

@erthalion i also noticed that some proc related tests were missing explicit cast for variadic arguments: #1798.
Hopefully this will fix the CI. Still, i don't get why the failure is only triggering on your PR.

@FedeDP
Copy link
Contributor

FedeDP commented Apr 17, 2024

My pr was merged! Care to rebase? Thanks!

sinsp_threadinfo contains two fields with user and login_user
information. Since those fields are of scap_userinfo type and statically
allocated, they take a lot of space:

    scap_userinfo              m_user;               /*   368  2312 */
    scap_userinfo              m_loginuser;          /*  2680  2312 */

which is 4624 bytes out of 5728 for the whole sinsp_threadinfo:

    /* size: 5728, cachelines: 90, members: 64 */

Most of this memory is coming from the fields name
(MAX_CREDENTIALS_STR_LEN), homedir and shell (both SCAP_MAX_PATH_SIZE).
For a process-heavy workload this can mean a lot of memory taken for
these purposes.

To make memory management more flexible, split m_user/m_loginuser into
two set of fields:
* one containing uid/gid, which are ubiquitous and generally used
everywhere
* one for the rest of heavy details, which are needed less often

The new sinsp_userinfo class is not supposed to use separately from
sinsp_threadinfo, thus it's defined inside the class.

Co-authored-by: Mauro Ezequiel Moltrasio <mmoltras@redhat.com>
Signed-off-by: Dmitrii Dolgov <9erthalion6@gmail.com>
@erthalion erthalion force-pushed the feature/split-user-info branch from 4513e73 to 8f2e7d6 Compare April 17, 2024 12:15
@erthalion
Copy link
Contributor Author

Thanks @therealbobo @FedeDP for investigating, this failure was really driving me crazy :)

@FedeDP
Copy link
Contributor

FedeDP commented Apr 17, 2024

CI Build / build-libs-emscripten 🧐 (pull_request) Successful in 4m

Yay!!!

Copy link
Contributor

@FedeDP FedeDP left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/approve

@poiana poiana added the lgtm label Apr 17, 2024
@poiana
Copy link
Contributor

poiana commented Apr 17, 2024

LGTM label has been added.

Git tree hash: 33363019642517d7835bc29861129eef6f092a44

@erthalion
Copy link
Contributor Author

@leogr @jasondellaluce I would appreciate some final review and an approval if everything is fine :)

@poiana
Copy link
Contributor

poiana commented Apr 19, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: erthalion, FedeDP, leogr

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@poiana poiana merged commit 90246b8 into falcosecurity:master Apr 19, 2024
40 checks passed
Andreagit97 added a commit to Andreagit97/testing that referenced this pull request Apr 26, 2024
After this PR falcosecurity/libs#1765 we are now
able to correctly retrieve again user name info. For this reason we need
to update the triggered rules

Signed-off-by: Andrea Terzolo <andreaterzolo3@gmail.com>
@FedeDP FedeDP changed the title Split user info nwe(userspace/libsinsp): allow to not retrieve detailed user info Apr 29, 2024
@FedeDP FedeDP changed the title nwe(userspace/libsinsp): allow to not retrieve detailed user info new(userspace/libsinsp): allow to not retrieve detailed user info Apr 29, 2024
Andreagit97 added a commit to Andreagit97/testing that referenced this pull request Apr 29, 2024
After this PR falcosecurity/libs#1765 we are now
able to correctly retrieve again user name info. For this reason we need
to update the triggered rules

Signed-off-by: Andrea Terzolo <andreaterzolo3@gmail.com>
poiana pushed a commit to falcosecurity/testing that referenced this pull request Apr 29, 2024
After this PR falcosecurity/libs#1765 we are now
able to correctly retrieve again user name info. For this reason we need
to update the triggered rules

Signed-off-by: Andrea Terzolo <andreaterzolo3@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants