[service/proctelemetry] when host_proc env variable is set, metric server uses and reports on the wrong process #7435

Closed
dloucasfx opened this issue Mar 27, 2023 · 0 comments · Fixed by #7998
Labels
bug Something isn't working

Comments

@dloucasfx
Contributor

Describe the bug
While debugging the error below in a k8s environment:

Error: failed to register process metrics: process does not exist
2023/03/23 03:44:47 main.go:115: application run finished with error: failed to register process metrics: process does not exist

I noticed that the metric server registers its own process regardless of whether the HOST_PROC environment variable is set.
This becomes an issue when HOST_PROC points to the host's proc directory on the host filesystem (e.g. in a k8s environment), because gopsutil then looks at the host's processes to identify the OTEL process instead of looking in the container's /proc.

In a container, OTEL runs as PID 1, so, as explained above, the metric server reports metrics for the host's PID 1 instead.
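
To make the path mix-up concrete, here is a minimal sketch of how a proc lookup can end up on the host. `resolveProcDir` is a hypothetical helper written for this illustration, not a gopsutil function, and `/hostfs/proc` is an assumed mount path; it only mimics the behavior described above, where the proc root comes from HOST_PROC when that variable is set.

```go
package main

import (
	"fmt"
	"os"
	"path"
	"strconv"
)

// resolveProcDir is a hypothetical helper (not part of gopsutil). It returns
// the directory that would be inspected for a PID: the HOST_PROC value is used
// as the proc root when it is non-empty, otherwise the default /proc is used.
func resolveProcDir(hostProc string, pid int) string {
	root := "/proc"
	if hostProc != "" {
		root = hostProc
	}
	return path.Join(root, strconv.Itoa(pid))
}

func main() {
	pid := os.Getpid() // PID 1 when the collector is the container entrypoint

	// Without HOST_PROC the lookup stays inside the container's own /proc.
	fmt.Println(resolveProcDir("", pid))

	// With HOST_PROC pointing at an assumed host mount, the same PID is
	// resolved against the host's process table instead.
	fmt.Println(resolveProcDir("/hostfs/proc", pid))

	// What the current environment would select.
	fmt.Println(resolveProcDir(os.Getenv("HOST_PROC"), pid))
}
```

The PID number is the same in both cases, but it names a different process in each namespace, which would explain both the "process does not exist" error and the wrong-process metrics.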

Steps to reproduce
Deploy OTEL in a k8s cluster with the HOST_PROC env variable set to the path where the host's proc folder is mounted inside the container, and make sure the telemetry->metric server is enabled.
(HOST_PROC is needed to monitor the host itself.)

What did you expect to see?
The metric server identifies the OTEL process correctly and sends metrics (e.g. CPU usage) for the OTEL process.

What did you see instead?
You will either get an error:

Error: failed to register process metrics: process does not exist
2023/03/23 03:44:47 main.go:115: application run finished with error: failed to register process metrics: process does not exist

or you will get metrics from the wrong process.

What version did you use?
ALL

Environment
OS: Linux

Additional context
I took a stab at this in #7434, but that solution does not work well because every stats call in gopsutil re-reads the HOST_PROC env variable.

We have a PR open in gopsutil, shirou/gopsutil#1439, which would be the most robust way to address this issue, but it is not clear how long it will take to get merged.

@dloucasfx dloucasfx added the bug Something isn't working label Mar 27, 2023
codeboten pushed a commit that referenced this issue Aug 15, 2023
…ariable with a programmatic value (#7998)

Reprising #7434

**Description:** 

While debugging the error below in a k8s environment:
````
Error: failed to register process metrics: process does not exist
2023/03/23 03:44:47 main.go:115: application run finished with error: failed to register process metrics: process does not exist
````
I noticed that the metric server calls gopsutil while the
HOST_PROC variable is set; this causes gopsutil's `PidExistsWithContext`
to look up the process on the host instead of in the container:

````
func PidExistsWithContext(ctx context.Context, pid int32) (bool, error) {
	if pid <= 0 {
		return false, fmt.Errorf("invalid pid %v", pid)
	}
	proc, err := os.FindProcess(int(pid))
	if err != nil {
		return false, err
	}

	if isMount(common.HostProc()) { // if /<HOST_PROC>/proc exists and is mounted, check if /<HOST_PROC>/proc/<PID> folder exists
		_, err := os.Stat(common.HostProc(strconv.Itoa(int(pid))))
		if os.IsNotExist(err) {
			return false, nil
		}
		return err == nil, err
	}
	// ... (the rest of the function is omitted from this excerpt)
````
This PR unsets and then resets the HOST_PROC variable around the gopsutil
call, and introduces an option that allows HOST_PROC to be used by anyone
who, for whatever reason, needs it.
**Link to tracking Issue:**
Fixes #7435

**Testing:**
unit tests

---------

Signed-off-by: Dani Louca <dlouca@splunk.com>
Co-authored-by: Dani Louca <dlouca@splunk.com>
Co-authored-by: Alex Boten <aboten@lightstep.com>