
add ability to export/collect scrape_jobs #304

Merged: 6 commits merged into voxpupuli:master from anarcat:export_resources on Jun 13, 2019

Conversation

@anarcat commented Mar 12, 2019

Pull Request (PR) description

This is a revised version of #141, adapted to apply cleanly on master, but also simplified in that the data structures for the scrape collector are cleaner: it's an array of strings (job names) instead of an array of hashes containing the job names and an optional scheme.
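For illustration, a minimal sketch of the two shapes (variable names and values are made up; the 'job_name' key appears in the specs later in this thread, and 'scheme' is an assumption based on the description above):

```puppet
# Simplified structure in this revision: a flat list of job names.
$jobs_flat = ['node', 'mysqld']

# Original #141 structure: an array of hashes carrying the job name
# and an optional scheme.
$jobs_hashes = [
  { 'job_name' => 'node', 'scheme' => 'http' },
]
```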

I haven't tested this yet, but figured I would share my work for peer review anyway. I believe I have taken the comments in #141 into account.

This Pull Request (PR) fixes the following issues

Fixes #126

@costela left a comment:

As commented on #141: thanks for picking up the work!
I'm a bit unsure how the "flat list" will work, though. See my comment on the code.

Review comments (outdated, resolved): manifests/scrape_job.pp, manifests/daemon.pp (×2)
@bastelfreak added the enhancement and needs-work labels on Mar 16, 2019
@anarcat changed the title from "WIP: add ability to export/collect scrape_jobs" to "add ability to export/collect scrape_jobs" on Mar 18, 2019
@anarcat (Author) commented Mar 18, 2019

Alright, I've redone this patch to use the original data structure from #141, even though I don't quite like it. My main problem with it is that it requires an explicit list of job names somewhere in the configuration: by default, nothing is collected unless the operator guesses those job names, and they are not exactly obvious.

For example, here the node exporter job is called prometheus-node-exporter (note the dashes), while by default it would be prometheus-node_exporter (note the dash and underscore). That's because I've done some work to sync with the Debian package, which uses a dash (see #303). It won't be obvious to new users how to configure those new jobs; I would prefer it if all jobs were collected by default, but I understand that isn't very practical as things stand.
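To make the problem concrete, here is a rough sketch of the intended usage (the parameter names export_scrape_job, scrape_port and collect_scrape_jobs are the ones exercised in the specs later in this thread; both class interfaces are abbreviated and the values are illustrative):

```puppet
# On a monitored node: the exporter announces itself by exporting
# a scrape job.
class { 'prometheus::node_exporter':
  export_scrape_job => true,
  scrape_port       => 9100,
}

# On the Prometheus server: nothing is collected unless the operator
# lists each job name explicitly, and the names are not obvious
# (for example prometheus-node-exporter, not prometheus-node_exporter).
class { 'prometheus':
  collect_scrape_jobs => [
    { 'job_name' => 'prometheus-node-exporter' },
  ],
}
```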

Besides, this works for me in production now. I suspect tests might fail (again), and there is no test coverage for the new code (there wasn't any in #141 either). I must admit I have never worked with tests for Puppet manifests (let alone for exported resources), so I would really appreciate help and guidance completing that, if it's a requirement for getting this merged.

That said, the thing works for me as is, and I believe it would already be a huge improvement over the current "no internal discovery" situation. :)

@anarcat (Author) commented Mar 18, 2019

The CI build failed, but I believe it's actually a transient error. It would be great if some admin could reschedule it...

@costela commented Mar 19, 2019

LGTM.
The tests are unfortunately also where I got stuck, for lack of time :/

@anarcat (Author) commented Mar 19, 2019

> the tests are unfortunately also where I got stuck for lack of time :/

Do you think there's even the slightest chance we might get help with that? Do you have a secret repository with half-finished test code I could start with? ;)

Review comment (outdated, resolved): manifests/server.pp
```diff
@@ -7,7 +7,7 @@
 <% full_config = {
   'global'=>global_config,
   'rule_files'=>rule_files,
-  'scrape_configs'=>scrape_configs,
+  'scrape_configs'=>scrape_configs + @collected_scrape_jobs,
```
@bastelfreak (Member) commented:

Please write a test to verify this line works.

@anarcat (Author) replied:
Do you have any suggestions on where I should start with this? I'm not very familiar with the Puppet module test environment, and even less so where it concerns exported resources...

@costela replied:

I'm not sure how much you know, so maybe this sounds a bit condescending, but basically you can just run:

```console
$ bundle install --with test
$ bundle exec rake test
```

Then just go through the stuff in the spec folder to get a feel for how things are currently being tested.
As for exported resources, here are some basic examples:
https://github.com/rodjek/rspec-puppet#testing-exported-resources

And yes: I know this is all easier said than done 😉

@anarcat (Author) replied:

It's not condescending; I truly had no idea where to start, so your feedback was definitely useful. That said:

> bundle install --with test

This is the kind of horror that makes me hesitant to actually go down that rabbit hole. I modestly imagined this would do a third-party setup of gem things in /usr/local, but I actually have zero idea where all the junk it downloaded was installed, or why. It looks like it shoved a bunch of things into /var/lib, but I'm going to pretend all this is perfectly normal and that I won't need to worry about arbitrary code being installed as root on my machine. But that's par for the course these days...

ANYWAY. :)

I tried to do the thing, I really did. Here's the patch I came up with:

```diff
modified   spec/defines/daemon_spec.rb
@@ -24,7 +24,10 @@
           group:             'smurf_group',
           env_vars:          { SOMEVAR: 42 },
           bin_dir:           '/usr/local/bin',
-          install_method:    'url'
+          install_method:    'url',
+          export_scrape_job: true,
+          scrape_host:       'localhost',
+          scrape_port:       1234,
         }
       ].each do |parameters|
         context "with parameters #{parameters}" do
@@ -229,6 +232,11 @@
               'enable' => true
             )
           }
+          context 'exported resources' do
+            subject { exported_resources }
+
+            it { is_expected.to contain_file('/etc/prometheus/file_sd_config.d/smurf_exporter_localhost:1234.yaml') }
+          end
         end
       end
     end
```

This fails with:

```
Failure/Error: it { is_expected.to contain_file('/etc/prometheus/file_sd_config.d/smurf_exporter_localhost:1234.yaml') }
  expected that the catalogue would contain File[/etc/prometheus/file_sd_config.d/smurf_exporter_localhost:1234.yaml]
# ./spec/defines/daemon_spec.rb:238:in `block (7 levels) in <top (required)>'
```

So the file isn't created? Did I make a typo, or guess the name wrong? How do I tell?

... I would love to write tests, I really would. But this is just incomprehensible to me. :)

@costela replied:

The bundle stuff should all go inside the vendor folder inside the project; AFAIK it shouldn't install anything system-wide. You mentioned "installed as root", and maybe that's part of the misunderstanding: bundle should be called as your normal non-root user. That's what the $ at the beginning of the commands is meant to indicate. It's a very old convention: $ indicates a normal user prompt, # indicates a root prompt. Just in case you didn't know (again: apologies if this comes across as condescending).

As for the actual test: the contain_* part is used to test for the existence of some resource in the generated catalogue, in this case the exported resource. The file you are testing for is not exported; it is realized by the exported resource, namely scrape_job. So here you should be testing whether scrape_job is contained.
Does this help?
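In plain Puppet terms, the pattern under test looks roughly like this (a generic sketch; the actual interface of prometheus::scrape_job lives in manifests/scrape_job.pp, and the title and parameters here follow the spec values used above):

```puppet
# On the exporting node: declare the resource as exported (@@).
# It is not realized here; it is only stored for other catalogs.
@@prometheus::scrape_job { 'localhost:1234':
  job_name => 'smurf_exporter',
}

# On the collecting node (the Prometheus server): realize all
# exported scrape jobs whose job_name matches.
Prometheus::Scrape_job <<| job_name == 'smurf_exporter' |>>
```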

@bastelfreak (Member) replied:
I recommend our CONTRIBUTING.md: https://github.com/voxpupuli/puppet-prometheus/blob/master/.github/CONTRIBUTING.md

bundle install doesn't install anything as root unless you prefix it with sudo. But our recommendation is the --path .vendor/ option, which installs all the gems into a subdirectory of the current module and won't pollute your global gem path.

@anarcat (Author) replied on Mar 20, 2019:

I know about # and $. :) I did run bundle install as a regular user, but it said it couldn't write to /var/lib and asked me for sudo... Anyway, it turns out it installed everything into /var/lib/gems-2.5.0, which I believe was just empty before (and if it wasn't, it should be). So I did the --path trick (thanks for that one; it looks like I hadn't done my homework!)... and I got this error:

```
Downloading listen-3.1.5 revealed dependencies not in the API or the lockfile
(rb-fsevent (>= 0.9.4, ~> 0.9), ruby_dep (~> 1.2)).
Either installing with `--full-index` or running `bundle update listen` should
fix the problem.
```

which I didn't get without --path. Running with --full-index doesn't fix the problem, but bundle update listen does, so I plowed ahead.

My tests are still failing. I tried this instead:

```ruby
it { is_expected.to contain_scrape_job('localhost:1234') }
```

And also:

```ruby
it { is_expected.to contain_prometheus_scrape_job('localhost:1234') }
```

They both fail with an error similar to the original:

```
Failure/Error: it { is_expected.to contain_prometheus_scrape_job('localhost:1234') }
  expected that the catalogue would contain Prometheus_scrape_job[localhost:1234]
# ./spec/defines/daemon_spec.rb:238:in `block (7 levels) in <top (required)>'
```

Is there a way to list the resources that are actually exported? Where are those tests running, anyway? How can I debug this?

Thanks for the hand-holding! :)

Update: it does look like the scrape_job resource gets exported, for what it's worth. At the end of the test run, I see:

```
[...]
Untouched resources:
[...]

  Prometheus::Scrape_job[localhost:1234]
```

So it looks like I'm messing up the quoting or something on that front...

... and worse, it doesn't address the original review comment here, which was about this line:

> 'scrape_configs'=>scrape_configs + @collected_scrape_jobs,
>
> please write a test to verify this line works

:)

@costela replied:

For contain_* examples, see here. In this specific case you should probably be using contain_prometheus__scrape_job (note the "dunder"/double underscore to denote "::").

As for @bastelfreak's original comment: yes, you'd also have to test the contents of the generated file to be sure the concatenation makes sense. This should be done in a separate test case, though.

@anarcat (Author) replied:

Alright, I pushed what I think is a working test case for scrape_job.pp; let me know if that looks alright to you:

```ruby
context 'exported resources' do
  subject { exported_resources }
  it {
    is_expected.to contain_prometheus__scrape_job('localhost:1234')
  }
end
```

Also pushed here...

I did try to test the original line commented on here, but I couldn't figure out how to create the scrape_job resources (the daemon, really) from the other file. I tried adding this to classes/prometheus_rspec.rb:

```ruby
context 'exported resources' do
  [
    {
      collect_scrape_jobs: [
        { job_name: 'smurf_exporter' },
      ]
    }
  ].each do |parameters|
    context "with prometheus version #{parameters[:version]}" do
      let(:params) do
        parameters
      end
      # copied over from defines/daemon_spec.rb
      # (note: this pre_condition string is missing the closing '}' of the
      # prometheus::daemon declaration, which would explain the parse error
      # reported below)
      let(:pre_condition) { 'prometheus::daemon { "smurf_exporter": export_scrape_job => true, scrape_host => "localhost", scrape_port => 1234, real_download_url => "https://github.com/prometheus/smurf_exporter/releases/v1.2.3/smurf_exporter-1.2.3.any.tar.gz"' }
      subject { exported_resources }
      it {
        is_expected.to contain_prometheus__scrape_job('localhost:1234')
      }
    end
  end
end
```

But that crashes with:

```
6) prometheus on ubuntu-18.04-x86_64 exported resources with prometheus version  should contain Prometheus::Scrape_job[localhost:1234]
   Failure/Error: is_expected.to contain_prometheus__scrape_job('localhost:1234')

   Puppet::ParseErrorWithIssue:
     Could not parse for environment rp_env: Syntax error at 'class' (line: 3, column: 1) on node curie.anarc.at
```

... since I'm just cargo-culting this, I have no idea why it fails again. ;)

I guess rspec is creating a Puppet manifest and applying it with that stuff. Is there a place where I can actually see the manifest it's creating, so I can debug what I'm doing wrong?

Thanks again for all the hand-holding! We're getting closer!

Review comment (outdated, resolved): manifests/config.pp

@costela left a comment:

Sorry for the delayed answer.
I've started using your branch to check it out and try to work on the specs, and I've found a few other things to clear up.

If you agree with my comments, you might want to cherry-pick the top two commits from our RegioHelden/puppet-prometheus fork.

Review comments (outdated, resolved): manifests/node_exporter.pp, manifests/init.pp, manifests/server.pp
@anarcat (Author) commented Apr 8, 2019

> If you agree with my comments, you might want to cherry pick the two top commits from our RegioHelden/puppet-prometheus

I've reviewed your comments individually, but I couldn't figure out which two commits you were referring to here. Maybe I just couldn't find them, but I suspect you forgot to push. Mind pasting the actual commit hashes to give me a hand here?

@costela commented Apr 8, 2019

@anarcat I mean the top two commits here: https://github.com/RegioHelden/puppet-prometheus/commits/1c9220b85708670aa80eba5943b3fd15fde75a97
The main addition in these commits is exporter settings for all the available *_exporter.pp files.

@anarcat (Author) commented Apr 8, 2019

Okay, in there I think I have b71f222 covered. 1c9220b includes a bunch of stuff, including the disagreement we have about removing scrape_job_name, so obviously I tend to disregard that part. But the chunks about the port numbers do seem quite useful.

And about that, I have been thinking about DRY here: I had to copy-paste those port numbers into my firewall rules elsewhere and wondered how that duplication could be avoided, but I couldn't find a clean solution... Just something to keep in mind, I guess...

@anarcat (Author) commented Apr 25, 2019

So I think I have resolved all the review comments here, and I finally fixed the test suite so it's all green.

I understand there might be future improvements to be made on top of this PR, but that's just it: they can be done on top! I think we should go ahead and merge this massive commit before it goes stale; then the extra goodies can be slapped on top in other PRs.

How does that sound?

@anarcat (Author) commented Jun 5, 2019

I have rebased this, and I think it's ready to be merged. Can someone approve and merge it?

Review comments (outdated, resolved): manifests/scrape_job.pp (×2)
@costela commented Jun 9, 2019

@bastelfreak PTAL

@bastelfreak (Member) commented:

Okay, I have a few questions:

  • Why did you add the $scrape_job_name parameter to some exporters but not all?
  • Most exporters don't have the $scrape_port parameter, and on some you hardcoded a port, so it isn't possible to change it. Is there a reason for this? It seems very inconsistent.
  • The same applies to the $export_scrape_job parameter.
  • Also, I think $export_scrape_job should default to false everywhere, to keep this backwards compatible; otherwise it would be a breaking change.

```diff
@@ -133,5 +135,8 @@
   service_ensure    => $service_ensure,
   service_enable    => $service_enable,
   manage_service    => $manage_service,
+  export_scrape_job => false,
```
@alexjfisher (Member) commented:

Should this be export_scrape_job => $export_scrape_job, with Boolean $export_scrape_job = false added to the class parameters?
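(For illustration, a minimal sketch of the suggested fix; the exporter class name is hypothetical and the other required prometheus::daemon parameters are omitted:)

```puppet
class prometheus::example_exporter (
  # Expose the flag as a class parameter instead of hardcoding it,
  # defaulting to false to stay backwards compatible.
  Boolean $export_scrape_job = false,
) {
  prometheus::daemon { 'example_exporter':
    # ... other required daemon parameters omitted ...
    export_scrape_job => $export_scrape_job,
  }
}
```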

@anarcat (Author) replied:

Oh, jeez, yes! That's embarrassing... 🙈

@alexjfisher (Member) left a review:

Need to fix the inability to set export_scrape_job to true.

@alexjfisher added the needs-work label on Jun 13, 2019
@costela commented Jun 13, 2019

@alexjfisher I'll get to it in about one hour, unless @anarcat beats me to it ;)

@costela commented Jun 13, 2019

@anarcat can you please accept anarcat#2? It should fix the remaining open point.

@anarcat (Author) commented Jun 13, 2019

Done. What's the policy on PRs again? Should I squash all of this into one commit, or rebase into a cleaner set? It has gotten a bit messy...

@costela commented Jun 13, 2019

@anarcat it would be nice if we could keep at least two commits, one for each of us 😉
But if that's against Vox Pupuli policy, then feel free to squash it down to one; after all, you carried the torch to the finish line! 👍

@anarcat (Author) commented Jun 13, 2019

@costela are you okay with how the first commit shows up? You could be the author and I the committer, or something... see 0712150.

@costela commented Jun 13, 2019

@anarcat I don't think that would work, because whoever approves and merges the PR will appear as the "committer" (see the current master commit log for a few such cases).

But hey, this is some world-class bike-shedding we're doing! 😅
Let's just get it merged. I'm okay with not getting attribution; after all, I basically left my original PR to rot...

@anarcat (Author) commented Jun 13, 2019

I see. Okay, I'll craft beautiful snowflakes then :)

@anarcat (Author) commented Jun 13, 2019

Alright, I merged it down to six commits, two of them (and the largest ones!) credited to you, @costela. I hope this is good for everyone!

@alexjfisher I think the latest changes address your latest review.

@alexjfisher removed the needs-squash and needs-work labels on Jun 13, 2019
@alexjfisher (Member) commented:
@anarcat @costela Thanks for your efforts and great teamwork!

@alexjfisher merged commit 6c557f7 into voxpupuli:master on Jun 13, 2019
@anarcat deleted the export_resources branch on Jun 13, 2019
@benedikt-haug commented:
I would absolutely love to use this feature. Do you already know when a new release will happen?

Thank you!

anarcat added a commit to anarcat/puppet-prometheus that referenced this pull request on Sep 20, 2019:
> The previous work to collect scrape jobs (voxpupuli#304) was a good start, but it really becomes powerful when you can select certain labels to apply to the exporters.
>
> For example, I use this to export the nodes' roles in the Prometheus metrics, which allows me to group results by role instead of by node.
Rovanion pushed a commit to Rovanion/puppet-prometheus that referenced this pull request on May 5, 2021:

> add ability to export/collect scrape_jobs
Labels: enhancement (New feature or request)

Successfully merging this pull request may close these issues:

  • Add support for exporting/collecting *_exporter configs

5 participants