Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: adding k8 support for container id detection #1962

Open
wants to merge 11 commits into
base: main
Choose a base branch
from
Original file line number Diff line number Diff line change
Expand Up @@ -51,6 +51,7 @@
"@opentelemetry/api": "^1.0.0"
},
"dependencies": {
"@kubernetes/client-node": "^0.20.0",
Copy link
Contributor

@trentm trentm Apr 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A concern here is that this is a large dependency, but mostly that it still uses the deprecated request HTTP client library. This would bring in some CVEs (kubernetes-client/javascript#754 (comment)).

There is an effort (https://github.com/kubernetes-client/javascript/blob/master/FETCH_MIGRATION.md kubernetes-client/javascript#754) to move away from request to fetch(). That looks active. There was a recent release-candidate release with that work: '1.0.0-rc4': '2023-12-23T15:06:31.075Z'. However, I haven't looked deeply and I'm not sure how soon that might come.

Even if that came, an issue with @kubernetes/client-node@1.x might be that I believe it dropped compat with Node.js v14 in kubernetes-client/javascript@3a995fb
That commit unfortunately doesn't mention the Node.js version, nor is it tied to a PR with any discussion.
That same commit is dropping an AbortController shim for older Node.js version support. AbortController is avaialble starting in Node.js v14.17.0, so it is possible that @kubernetes/client-node@1.x would work with node >=14.17.0. They don't test with Node v14 though, so that is tenuous support.

I haven't read through this PR yet. Would it be possible to do whatever k8s client calls without using @kubernetes/client-node? I don't know if it is a very complex API to call manually.

Another option would possibly be to wait for @kubernetes/client-node@1.x and the OTel SDK 2.0 milestone (https://github.com/open-telemetry/opentelemetry-js/milestone/17), because that will drop support for Node.js v14.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe - This api takes care of adding filepath required for certifcates etc to be present.
I think we can certainly manually do all the operations that this does. It may be bit more code . I can look into it.

I see the migration to fetch and partial compatibility for node 14 as well. I think officially node 14 is not supported now.

Let me take a look at the effort.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was looking at the library itself.

It is officially maintained by Kubernetes SIG API Machinery.

I suggest we use this and create an issue there regarding node 14 and request --> fetch ( which is I believe is already in progress )

@trentm let me know what you think.

"@opentelemetry/resources": "^1.0.0",
"@opentelemetry/semantic-conventions": "^1.0.0"
},
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -25,6 +25,9 @@ import * as fs from 'fs';
import * as util from 'util';
import { diag } from '@opentelemetry/api';

const { KubeConfig, CoreV1Api } = require('@kubernetes/client-node');
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason to use require instead of import?

require('dotenv').config();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this use of dotenv your debugging code that should be removed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes - makes sense -


export class ContainerDetector implements Detector {
readonly CONTAINER_ID_LENGTH = 64;
readonly DEFAULT_CGROUP_V1_PATH = '/proc/self/cgroup';
Expand All @@ -36,7 +39,12 @@ export class ContainerDetector implements Detector {

async detect(_config?: ResourceDetectionConfig): Promise<Resource> {
try {
const containerId = await this._getContainerId();
let containerId = '';
if (this._isInKubernetesEnvironment()) {
containerId = await this._getContainerIdK8();
} else {
containerId = (await this._getContainerId()) || '';
}
return !containerId
? Resource.empty()
: new Resource({
Expand Down Expand Up @@ -100,6 +108,43 @@ export class ContainerDetector implements Detector {
return containerIdStr || '';
}

private _isInKubernetesEnvironment(): boolean {
return process.env.KUBERNETES_SERVICE_HOST !== undefined;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know k8s well enough to have confidence in its standard envvars and its client API usage. I'd be happier if others with k8s experience could weigh in.

It might help to add some comments linking to official k8s docs justifying the envvars and APIs being used.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is purely based out of my research - let me find other PRs or articles supporting this.

Thanks for the review

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

}

private async _getContainerIdK8(): Promise<string> {
const kubeconfig = new KubeConfig();
kubeconfig.loadFromDefault();

const api = kubeconfig.makeApiClient(CoreV1Api);
const namespace: string = process.env.NAMESPACE || 'default';
const containerName: string | undefined = process.env.CONTAINER_NAME;
if (!containerName) {
throw new Error('Container name not specified in environment');
}

const response = await api.listNamespacePod(namespace);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this listing all pods in the namespace? What if there are multiple pods in the namespace with the same container name (as would happen with a k8s depoyment)? Will each detect a random container ID?

Listing all pods in a namespace is also much more expensive than getting a single pod.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello @dashpole
I believe this statement is true where it will list all pods.

May be I can provide a condition where it will look for a specific pod defined as environment variable.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a typo, should be api.listNamespacedPod. I'm not sure why typescript isn't complaining


const podWithContainer = response.body.items.find(
(pod: { spec: { containers: any[] } }) => {
return pod.spec.containers.some(
container => container.name === containerName
);
}
);

if (!podWithContainer) {
throw new Error(`No pods found with container name '${containerName}'.`);
}

const container = podWithContainer.spec.containers.find(
(container: { name: string }) => container.name === containerName
);
const containerId = container ? container.containerID || '' : '';

return containerId;
}

/*
cgroupv1 path would still exist in case of container running on v2
but the cgroupv1 path would no longer have the container id and would
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -22,6 +22,7 @@ import {
assertContainerResource,
assertEmptyResource,
} from '@opentelemetry/contrib-test-utils';
import { KubeConfig } from '@kubernetes/client-node';

import { ContainerDetector } from '../src';

Expand Down Expand Up @@ -137,4 +138,146 @@ describe('ContainerDetector', () => {
assertEmptyResource(resource);
});
});

describe('Detect containerId in k8 environment', () => {
it('should return an empty Resource when not in a K8 environment', async () => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could please change all uses of "k8" to "k8s" (upper and lower case). AFAIK "k8s" is the more common abbreviation -- at least "k8s" is used as a prefix in some OpenTelemetry specs / semantic conventions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Absolutely

const containerDetector = new ContainerDetector();
sinon
.stub(containerDetector, '_isInKubernetesEnvironment' as any)
.returns(false);

const resource: Resource = await containerDetector.detect();
assertEmptyResource(resource);
sinon.restore();
});

it('should return a Resource with container ID when running in a Kubernetes environment', async () => {
// Stub _isInKubernetesEnvironment to return true
const containerDetector = new ContainerDetector();
sinon
.stub(containerDetector, '_isInKubernetesEnvironment' as any)
.returns(true);
// Stub _getContainerIdK8 to return a mock container ID
sinon
.stub(containerDetector, '_getContainerIdK8' as any)
.resolves(correctCgroupV1Data);

const resource: Resource = await containerDetector.detect();
assert.ok(resource);
assertContainerResource(resource, {
id: 'bcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklm',
});
sinon.restore();
});

it('should return an error with no CONTAINER_NAME environment variable defined', async () => {
const containerDetector = new ContainerDetector();
sinon
.stub(containerDetector, '_isInKubernetesEnvironment' as any)
.returns(true);
const k8Stub = sinon.spy(
ContainerDetector.prototype,
'_getContainerIdK8' as any
);
try {
await k8Stub();
} catch (error: any) {
assert.strictEqual(
error.message,
'Container name not specified in environment'
);
} finally {
k8Stub.restore();
}
});

it('should throw "no pod found" for wrong container name', async () => {
process.env.CONTAINER_NAME = 'my-container';
const containerDetector = new ContainerDetector();
sinon
.stub(containerDetector, '_isInKubernetesEnvironment' as any)
.returns(true);
const k8Stub = sinon.spy(
ContainerDetector.prototype,
'_getContainerIdK8' as any
);
const listNamespacePodStub = sinon.stub().resolves({
body: {
items: [
{
spec: {
containers: [{ name: 'container1' }, { name: 'container2' }],
},
},
],
},
});

const apiStub = {
listNamespacePod: listNamespacePodStub,
};
const kubeconfig = new KubeConfig();
kubeconfig.loadFromDefault();

const makeApiClientStub = sinon
.stub(KubeConfig.prototype, 'makeApiClient')
.returns(apiStub as any);
try {
await k8Stub();
assert(listNamespacePodStub.calledOnceWithExactly('default'));
assert(makeApiClientStub.calledOnce);
} catch (error: any) {
assert.strictEqual(
error.message,
"No pods found with container name 'my-container'."
);
} finally {
k8Stub.restore();
}
});

it('should return a containerId for right container name', async () => {
process.env.CONTAINER_NAME = 'container1';
const containerDetector = new ContainerDetector();
sinon
.stub(containerDetector, '_isInKubernetesEnvironment' as any)
.returns(true);
const k8Stub = sinon.spy(
ContainerDetector.prototype,
'_getContainerIdK8' as any
);
const listNamespacePodStub = sinon.stub().resolves({
body: {
items: [
{
spec: {
containers: [
{ name: 'container1', containerID: correctCgroupV1Data },
{ name: 'container2' },
],
},
},
],
},
});

const apiStub = {
listNamespacePod: listNamespacePodStub,
};
const kubeconfig = new KubeConfig();
kubeconfig.loadFromDefault();

const makeApiClientStub = sinon
.stub(KubeConfig.prototype, 'makeApiClient')
.returns(apiStub as any);
try {
const cid = await k8Stub();
assert(listNamespacePodStub.calledOnceWithExactly('default'));
assert(makeApiClientStub.calledOnce);
assert.strictEqual(cid, correctCgroupV1Data);
} finally {
k8Stub.restore();
}
});
});
});
Loading
Loading