Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(secrets): Add unprotected secret implementation #13998

Merged
merged 8 commits into from
Dec 4, 2023

Conversation

srebhan
Copy link
Member

@srebhan srebhan commented Sep 26, 2023

resolves #13804
resolves #13806
resolves #13807
superseeds #13812

This PR introduces an unprotected secret implementation that allows to run Telegraf in environments where the locked-page-limit is low and cannot be changed or where locked pages are not available. To switch to the not recommended unprotected more, you need to start Telegraf with the new --unprotected command-line flag. We issue a warning when running unprotected.

@telegraf-tiger telegraf-tiger bot added the feat Improvement on an existing feature such as adding a new setting/mode to an existing plugin label Sep 26, 2023
cmd/telegraf/telegraf.go Outdated Show resolved Hide resolved
@srebhan
Copy link
Member Author

srebhan commented Sep 28, 2023

@redbaron can you please test the binary and also give this a review!?

@srebhan srebhan added area/agent security raise security concerns or improve the security of Telegraf ready for final review This pull request has been reviewed and/or tested by multiple users and is ready for a final review. labels Sep 28, 2023
@redbaron
Copy link
Contributor

Will do. I'll need to rebase out patches on top of this one to be able to run it on our systems. I'll come back with results sometime next week.

@srebhan
Copy link
Member Author

srebhan commented Oct 17, 2023

@redbaron any news for the testing?

@powersj
Copy link
Contributor

powersj commented Oct 25, 2023

@redbaron wanted to check in and see if you have had a chance to give this a try?

@FlashSystems
Copy link
Contributor

I've applied this PR as a patch to the current master branch and tried to use the resulting binary within a systemd-nspawn container. The container does not have the CAP_IPC_LOCK capability. My first test was generating config via telegraf -sample-config -unprotected. It crashed right away with the following error message:

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x555e96e0d0f4]

goroutine 1 [running]:
github.com/awnumar/memguard/core.Purge.func1(0xc0001d1c70)
        github.com/awnumar/memguard@v0.22.3/core/exit.go:23 +0x34
github.com/awnumar/memguard/core.Purge()
        github.com/awnumar/memguard@v0.22.3/core/exit.go:51 +0x1e
github.com/awnumar/memguard/core.Panic(...)
        github.com/awnumar/memguard@v0.22.3/core/exit.go:85
github.com/awnumar/memguard/core.NewBuffer(0x20)
        github.com/awnumar/memguard@v0.22.3/core/buffer.go:73 +0x51c
github.com/awnumar/memguard/core.NewCoffer()
        github.com/awnumar/memguard@v0.22.3/core/coffer.go:30 +0x2d
github.com/awnumar/memguard/core.init.0()
        github.com/awnumar/memguard@v0.22.3/core/enclave.go:15 +0x29

@srebhan
Copy link
Member Author

srebhan commented Nov 14, 2023

@FlashSystems thanks for the test. Will try to fix it and let you know. Would you be willing to give this another try?

@FlashSystems
Copy link
Contributor

@FlashSystems thanks for the test. Will try to fix it and let you know. Would you be willing to give this another try?

Sure. Just let me know when I should give it another try.

@srebhan
Copy link
Member Author

srebhan commented Nov 14, 2023

Following the code, I think there is currently nothing we can do about this. The minimum requirement even without locked secrets is a 3-page locked memory region and the ability to lock and protect memory regions. :-(
This is due to the fact that the used memguard library creates its key during import (in the init() function) and Golang not supporting conditional imports...

Would that be a no-go for you @FlashSystems? The only solution would be a build-time flag but given the amount of variants we support this would probably mean a custom-build on your side...

@Hipska
Copy link
Contributor

Hipska commented Nov 14, 2023

To me it looks like a good alternative to be able to build a telegraf with unprotected secrets by yourself..

@FlashSystems
Copy link
Contributor

Would that be a no-go for you @FlashSystems? The only solution would be a build-time flag but given the amount of variants we support this would probably mean a custom-build on your side...

OK, I understand the problem. But looking at the Bug-Reports at the top of this PR, I doubt that having a build time parameter is the right solution for all of these use cases. I think that distributions might have a hard time distinguishing at build time if telegraf is run on native hardware or within a container that has CAP_IPC_LOCK removed.

My specific use case is building the Arch Linux telegraf package within a clean chroot. This build process uses a systemd-nspawn container without the CAP_IPC_LOCK privilege. The build process calls the created telegraf binary to create the default configuration included within the package. This triggers the problem. For this use case having a build time option is no solution, because the final package should be built with protected secrets support but when just calling telegraf -sample-config -unprotected secrets support is unnecessary.

@jtroy
Copy link

jtroy commented Nov 14, 2023

Hi all,

I've been following this with interest as I'm trying to get the latest telegraf built for AIX. AIX only permits mlock() for processes running with root capabilities. After patching memguard's memcall dependency to support AIX and applying these patches, I'm seeing the same panic as @FlashSystems (except when running as root, of course). My use-case is to build, package and run telegraf for AIX in a way that mirrors Linux reasonably closely, so running as a non-root user is important to me. I don't mean to hijack this, just add another voice and volunteer to test.

@srebhan
Copy link
Member Author

srebhan commented Nov 16, 2023

Upstream issue created to propose the removal of init() initialization: awnumar/memguard#155

@srebhan
Copy link
Member Author

srebhan commented Nov 21, 2023

@redbaron, @FlashSystems and @jtroy can you please test this PR again?!? This includes awnumar/memguard#156 which might fix our issue with memguard initializing on import...

@srebhan srebhan force-pushed the secrets_alternatives branch from 6b30628 to 53b2859 Compare November 21, 2023 18:11
@FlashSystems
Copy link
Contributor

I've tested the new revision of this PR and it works. I can now build the package within a systemd-nspawn container. It even works without using the -unprotected switch, because generating the default config is not accessing any secrets.
Thank you @srebhan, for taking this issue upstream to the memguard project.

@srebhan
Copy link
Member Author

srebhan commented Nov 22, 2023

@redbaron any feedback from your side?

@powersj
Copy link
Contributor

powersj commented Nov 28, 2023

@srebhan, can you update this PR with the latest from upstream (and resolve the conflict)? Then we can request one more round of testing? Thanks!

@powersj powersj assigned srebhan and unassigned powersj Nov 28, 2023
@powersj powersj removed the ready for final review This pull request has been reviewed and/or tested by multiple users and is ready for a final review. label Nov 28, 2023
@srebhan srebhan force-pushed the secrets_alternatives branch from e9b437a to 3e8ef53 Compare November 30, 2023 09:47
@srebhan
Copy link
Member Author

srebhan commented Nov 30, 2023

@FlashSystems, @jtroy, @redbaron can you please test the latest version of this PR once the binary is available!? The required memguard PR #156 changed a bit and I want to make sure we are still fine...

@FlashSystems
Copy link
Contributor

@srebhan: I did a quick test and rebuild the package in the systemd-nspawn container. Generating the configuration file still works fine👍

@srebhan srebhan added the ready for final review This pull request has been reviewed and/or tested by multiple users and is ready for a final review. label Dec 4, 2023
@srebhan srebhan assigned powersj and unassigned srebhan Dec 4, 2023
@srebhan srebhan requested a review from redbaron December 4, 2023 16:17
@telegraf-tiger
Copy link
Contributor

telegraf-tiger bot commented Dec 4, 2023

@powersj powersj merged commit d570f01 into influxdata:master Dec 4, 2023
23 checks passed
@github-actions github-actions bot added this to the v1.29.0 milestone Dec 4, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/agent feat Improvement on an existing feature such as adding a new setting/mode to an existing plugin ready for final review This pull request has been reviewed and/or tested by multiple users and is ready for a final review. security raise security concerns or improve the security of Telegraf
Projects
None yet
6 participants