Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[extension][ecsobserver] Init ECS Observer README #2377

Merged
merged 3 commits into from
Mar 10, 2021

Conversation

pingleig
Copy link
Contributor

@pingleig pingleig commented Feb 18, 2021

Description:

Follow up on #1920, this is the (slightly updated) doc for implementing service discovery (of scrape targets) for Prometheus receiver on Amazon Elastic Container (ECS). It's mainly based on the existing configuration & implementation we have in
amazon-cloudwatch-agent

There are three types of filter, we fetch all the tasks periodically and there is no server side filtering or list watch API.

  • service i.e. deployment name
  • task definition i.e. manifest
  • docker label

The discovered targets are consumed in two ways, write as a static file for file sd, or use the receiver creator framework to create new receiver at runtime, based on #1395 the latter seems to have performance problem (especially when a single collector instance is used for scraping a large cluster w/o sharding)

There are some open items

  • emit custom metric, discoverd_targets, api_error_counts etc. Should be able to use obsreport
  • In cloudwatch agent, we were attaching ecs labels directly i.e. InstanceId instead of generating meta label like __meta__ecs_task_arn__, __meta_ec2_instance_id__ and have user configuring relabel config. The __meta__ approaches is the one used by all the in tree discovery implementations (k8s, consul etc.) (Chose to use __meta_ecs_ prefix)

Link to tracking Issue: #1395

Testing:

N/A just the doc

Documentation:

A new README under extensions/observer/ecsobserver

@pingleig pingleig requested a review from jrcamp as a code owner February 18, 2021 20:27
@pingleig pingleig requested a review from a team February 18, 2021 20:27
@linux-foundation-easycla
Copy link

linux-foundation-easycla bot commented Feb 18, 2021

CLA Signed

The committers are authorized under a signed CLA.

@codecov
Copy link

codecov bot commented Feb 18, 2021

Codecov Report

Merging #2377 (11a9e60) into main (2382835) will decrease coverage by 0.01%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #2377      +/-   ##
==========================================
- Coverage   90.63%   90.61%   -0.02%     
==========================================
  Files         404      404              
  Lines       20012    20012              
==========================================
- Hits        18137    18133       -4     
- Misses       1416     1418       +2     
- Partials      459      461       +2     
Flag Coverage Δ
integration 69.26% <ø> (ø)
unit 89.43% <ø> (-0.03%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
receiver/stanzareceiver/receiver.go 93.93% <0.00%> (-6.07%) ⬇️
receiver/k8sclusterreceiver/watcher.go 95.29% <0.00%> (-2.36%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2382835...11a9e60. Read the comment docs.

@jrcamp jrcamp self-assigned this Feb 18, 2021
Copy link
Contributor

@jrcamp jrcamp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pingleig are you still working on changes in response to original PR? Just wanted to make sure not blocked on me.

extension/observer/ecsobserver/README.md Outdated Show resolved Hide resolved
extension/observer/ecsobserver/README.md Outdated Show resolved Hide resolved
| cluster_name | Mandatory | target ECS cluster name for service discovery |
| cluster_region | Mandatory | target ECS cluster's AWS region name |
| refresh_interval | Optional | how often to look for changes in endpoints (default: 10s) |
| result_file | Optional | path of YAML file for the scrape target results. When enabled, the observer always returns empty |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can result_file field be mandatory? Otherwise, what's purpose of observer?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to mandatory for now, it's not mandatory in the original PR because receiver creator is supported and the extension just returns the endpoint to observer listener. To reduce the size of PRs I plan to only implement the result file logic.

@pingleig
Copy link
Contributor Author

@jrcamp Sorry for the delay, I wasn't blocked on the comments in original PR. I was has having some internal discussion on the output format. The conclusion is we will be using __meta__ecs_ prefix like other in tree service discovery implementations.

@tigrannajaryan
Copy link
Member

./extension/observer/ecsobserver/README.md:21:4: "sevice" is a misspelling of "service"
./extension/observer/ecsobserver/README.md:399:3: "discoverd" is a misspelling of "discovered"
./extension/observer/ecsobserver/README.md:400:3: "discoverd" is a misspelling of "discovered"
./extension/observer/ecsobserver/README.md:401:74: "discoverd" is a misspelling of "discovered"
make: *** [Makefile.Common:119: misspell] Error 2
make: *** Waiting for unfinished jobs....

@bogdandrutu
Copy link
Member

ping @jrcamp

- `make misspell`
- Fix order of filter types, should be service, task defintion, docker
  label
@jrcamp
Copy link
Contributor

jrcamp commented Mar 4, 2021

Looks good, thanks for the thorough README!

@pingleig
Copy link
Contributor Author

pingleig commented Mar 5, 2021

Can anyone with merge permission merge this PR? I think I don't need to do rebase on my fork because it's a README in a new folder and should not have any conflict.

@pingleig
Copy link
Contributor Author

pingleig commented Mar 9, 2021

ping @jrcamp @bogdandrutu

@bogdandrutu bogdandrutu merged commit 90b3fb1 into open-telemetry:main Mar 10, 2021
kisieland referenced this pull request in kisieland/opentelemetry-collector-contrib Mar 16, 2021
Bumps [github.com/google/uuid](https://github.com/google/uuid) from 1.1.4 to 1.1.5.
- [Release notes](https://github.com/google/uuid/releases)
- [Commits](google/uuid@v1.1.4...v1.1.5)

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
pmatyjasek-sumo referenced this pull request in pmatyjasek-sumo/opentelemetry-collector-contrib Apr 28, 2021
* ecsobserver: Init README

* ecsobserver: Update output format to use __meta_ecs_ prefix

* ecsobserver: Fix typo

- `make misspell`
- Fix order of filter types, should be service, task defintion, docker
  label
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants