
prometheus alerts for openshift build subsystem #16495

Merged

merged 1 commit into openshift:master from gabemontero:build-alert on Sep 28, 2017

Conversation

gabemontero
Contributor

https://trello.com/c/RskNHpfh/1334-5-prometheus-alerts-for-build-metrics

A WIP initial pass at alerts for the openshift build subsystem

@openshift/devex @smarterclayton @zgalor @moolitayer @mfojtik ptal, defer if bandwidth dictates, and/or pull in others as you each deem fit

Disclaimers:

  1. I'm still debating the pros/cons of these alerts with https://docs.google.com/document/d/199PqyG3UsyXlwieHaqbGiWVa8eMWi8zzAn0YfcApr8Q/edit#heading=h.2efurbugauf in mind

  2. still debating the template parameters / defaults for the various thresholds ... I still have a to-do to revisit potential default values with our ops contacts, based on their existing zabbix monitoring

  3. still debating the severity as well

  4. based on the activity in Update test alert metadata #16026 I did not include the miqTarget annotation

I also removed the space in the existing alert name based on how I interpreted various naming conventions.

And other than the query on the alerts URI, the extended test changes stemmed from flakiness experienced during testing that was unrelated to the addition of the alerts.

thanks

@openshift-ci-robot openshift-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Sep 21, 2017
@gabemontero
Contributor Author

@gabemontero gabemontero added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Sep 21, 2017
@@ -30,6 +30,15 @@ parameters:
name: SESSION_SECRET
generate: expression
from: "[a-zA-Z0-9]{43}"
- description: The threshold for the active build alert
Contributor

Do not add parameters here. If you can't pick good defaults we shouldn't add the alert.

Contributor

Basically parameterization is the devil. No parameters. It's either a good rule, or it shouldn't be here (add it commented out).

Contributor

it seems to me the duration and volume of running builds that determine whether you have a problem are going to be driven in part by the size of your cluster (and possibly also by steady-state observations). The fact that there aren't absolute values for them doesn't make it a bad rule.

if your suggestion is that such rules should be commented out rather than enabled out of the box, ok, but I would expect that's going to be the case for almost all our rules. It's hard to imagine a metric whose healthy/sick condition isn't going to be determined in part by cluster characteristics.

Maybe what I need is a better understanding of the intent of this file?

Contributor

(and is there some other file where we should be putting in rules that we want for our actual online prod cluster?)


for _, sample := range metrics {
// first see if a metric has all the label names and label values we are looking for
foundCorrectLabels := true
Contributor

If we don't have helpers to do this you should create them. This is pretty ugly code.
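A minimal sketch of one such helper, assuming the extended test iterates model.Sample values from github.com/prometheus/common/model (the sampleHasLabels name is hypothetical, not code in this PR):

package helpers

import "github.com/prometheus/common/model"

// sampleHasLabels is an illustrative helper: it reports whether a scraped
// sample carries every expected label name with the expected value.
func sampleHasLabels(sample *model.Sample, expected map[string]string) bool {
	for name, value := range expected {
		if string(sample.Metric[model.LabelName(name)]) != value {
			return false
		}
	}
	return true
}

The test loop above could then filter the scraped samples down to the ones of interest before asserting on their values.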

Contributor

Isn't there a prometheus client library?

Contributor Author

re: the prometheus client library, at least for my interpretation of that, there is an alpha one for remote access ... see https://github.com/prometheus/client_golang#client-for-the-prometheus-http-api and https://github.com/prometheus/client_golang/blob/master/api/prometheus/v1/api.go

but the version of prometheus vendored into origin does not have the above

there is "client_golang" code in the vendored prometheus, but it is non-HTTP / non-remote

unless I missed something

otherwise, sure, I can work up some helpers ... might do that in a separate non-trello-card PR depending on what happens with this work
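For reference, a hedged sketch of what a query through that HTTP API client looks like once a suitable client_golang version is available; the address and query string are placeholders, and older (alpha) releases return only (model.Value, error) from Query rather than also returning warnings:

package main

import (
	"context"
	"fmt"
	"time"

	"github.com/prometheus/client_golang/api"
	v1 "github.com/prometheus/client_golang/api/prometheus/v1"
)

func main() {
	// Placeholder address; a test would point this at the deployed Prometheus route/service.
	client, err := api.NewClient(api.Config{Address: "https://prometheus.example.com"})
	if err != nil {
		panic(err)
	}
	promAPI := v1.NewAPI(client)

	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
	defer cancel()

	// The metric name matches the build subsystem metrics discussed in this PR.
	result, warnings, err := promAPI.Query(ctx, `count(openshift_build_running_phase_start_time_seconds)`, time.Now())
	if err != nil {
		panic(err)
	}
	if len(warnings) > 0 {
		fmt.Println("warnings:", warnings)
	}
	fmt.Println(result)
}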

severity: "MEDIUM"
message: "{{$labels.instance}} indicates at least ${PROMETHEUS_ACTIVE_BUILD_ALERT_THRESHOLD} OpenShift builds are taking more than ${PROMETHEUS_ACTIVE_BUILD_ALERT_DURATION} seconds to complete"
- alert: BuildFailureRate
expr: (count(openshift_build_terminal_phase_total{phase="failed"}) / count(openshift_build_terminal_phase_total{phase="complete"})) > ${PROMETHEUS_FAILED_BUILD_RATIO}
Contributor

How would this alert get cleared? If openshift online has 9000 failed builds, do I have to wait for 9000 successful builds for the alert to go away?

Contributor Author

@gabemontero gabemontero Sep 21, 2017

pending all the other discussion points in the PR, I almost did not include this alert to begin with (it was suggested to me) .... I'm inclined to simply remove it, but will wait a bit to see what other discussion arises with it before doing so.

Contributor

As the suggester of the alert: I don't know what the prometheus workflow is like, or how the alerting behaves ... can you have an alert that's ignored once tripped until it resets below the alert level?

Pruning old builds would be one way for it to be reset. Or we can punt it and do an alert based on "failed builds within the last X minutes" instead, but that's going to be another alert that needs to be tuned for your cluster/expectations.

expr: up{job="kubernetes-nodes"} == 0
annotations:
miqTarget: "ContainerNode"
severity: "HIGH"
message: "{{$labels.instance}} is down"
message: "{{$labels.instance}} is down"
- alert: HangingBuild
Contributor

It's not hanging, it's slow.

expr: count(openshift_build_running_phase_start_time_seconds{} < time() - ${PROMETHEUS_ACTIVE_BUILD_ALERT_DURATION}) > ${PROMETHEUS_ACTIVE_BUILD_ALERT_THRESHOLD}
annotations:
severity: "MEDIUM"
message: "{{$labels.instance}} indicates at least ${PROMETHEUS_ACTIVE_BUILD_ALERT_THRESHOLD} OpenShift builds are taking more than ${PROMETHEUS_ACTIVE_BUILD_ALERT_DURATION} seconds to complete"
Contributor

@smarterclayton smarterclayton Sep 21, 2017

This is a clearer alarm. However, I'm not sure I want to check the other two in until you prove that you want these alerts on. I.e., when we upgrade to 3.7.0 I'll turn your proposed alerts on and send them to you. Then you can decide whether they work or not.

I actually probably would say we're not ready to merge any of these, and instead you should give me the alerts to turn on in production which i will hook up to your email.

Contributor Author

Based on what I've learned as part of taking on this card (see the referenced google doc), I'm inclined not to do any of these alerts, either in this example template or by manually submitting them to you (if not submitting them to you is even an option), and to wait until we get more practical experience with the build subsystem in production. But the card was assigned, and initiating a PR to start discussions seemed the best thing to do.

On the values of the conditions / evil template parameters: for the current zabbix equivalent that ops is employing today, https://github.com/openshift/openshift-tools/blob/prod/scripts/monitoring/cron-send-stuck-builds.py#L60-L93, the "age" is not hard coded, but a variable that is passed in.

That said, I've re-initiated the email exchange I started with them during the build metrics work, and am seeing if default values emerge (perhaps the caller of the referenced function hardcodes the value).

Contributor

Let's actually base it on real data. Suggest your top 3 alerts based on recent bugs, and I'll turn them on after the 3.7.0 upgrade.

Contributor Author

10-4 re: alerts <=> recent bugs and #16495 (comment) .... @bparees and I will start conferring on all that

for completeness, an update on what ops is currently doing with their zabbix stuff:

  • they are currently concerned with builds stuck in NEW|PENDING ... not slow RUNNING builds
  • I elaborated a bit that this is really a pod scheduling issue, and might be captured by any new k8s or platform mgmt team metrics that get injected/rebased into openshift/origin
  • but we could add it to our build-specific stuff easily enough if needed
  • their build alerts are at the lowest level: a warning on the zabbix board, but nobody gets paged

- description: The allowable active build duration for the active build alert
name: PROMETHEUS_ACTIVE_BUILD_ALERT_DURATION
value: "360"
- description: The allowable ration of failed to complete builds for the failed build alert
Contributor

ratio

@smarterclayton
Contributor

So I think you guys got the hard one (being the first responders). Let's turn this around a bit.

The point of metrics is to enable you to debug and assess the health of your component in the product.

The point of alerts is to unambiguously indicate a serious problem is occurring that an engineer should react to.

The process for this is best laid out as

  1. add a metric that you think represents your health, based on past bugs / problems / challenges in production
  2. observe the metric in production and verify it matches your expectation
  3. repeat by adding more metrics or refining the ones in place
  4. define an unambiguous alert based on production data that would indicate that production is in trouble and we should react - observe that (you, that is) until you're confident it represents a real problem
  5. take that alert and refine its parameters further
  6. ship that alert to end users

I'd expect you guys to be in the gathering phase (between 1-4). I'd recommend you create a new section in the readme and place your recommended snippets in there with some guesses and text that says what to do with them. I'll accumulate them into one of the prod clusters and then you can observe them.

We won't ship the alert until it's something you're confident we could page you on at 1am on a Saturday night and you'd be happy because it means you just saved the world.

@smarterclayton
Contributor

smarterclayton commented Sep 21, 2017 via email

@bparees
Contributor

bparees commented Sep 22, 2017

add a metric that you think represents your health, based on past bugs / problems / challenges in production

Well, that was the metric that included the failure reason + start time date, which was rejected as not conforming to the prometheus philosophy. Failed builds alone is not a very interesting metric for us since there are lots of perfectly normal reasons for builds to fail.

@gabemontero is going to circle back w/ a hybrid proposal between prometheus metrics and live query-based state analysis.

Alternatively, rather than adding failure reason + timestamp labels to our "completed builds" metric, what if we introduced additional metrics like:

FailedFetchSourceCount
FailedPushImageCount
FailedLast5MinutesCount

?

(the collector would do the work of summing those things when iterating the build cache).
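A hedged sketch of that shape: a custom prometheus.Collector that walks a cached build list once and emits one constant gauge per failure reason. The buildLister / cachedBuilds names here are illustrative stand-ins, not the PR's actual code:

package prometheus

import "github.com/prometheus/client_golang/prometheus"

// Illustrative descriptor for a per-reason failed-build gauge; the real
// descriptor and label names live in the PR's metrics.go.
var failedBuildCountDesc = prometheus.NewDesc(
	"openshift_build_failed_phase_total",
	"Counts failed builds by reason",
	[]string{"reason"},
	nil,
)

// buildLister is a stand-in for however the collector reads the build cache.
type buildLister interface {
	cachedBuilds() []cachedBuild
}

type cachedBuild struct {
	phase  string
	reason string
}

type buildCollector struct {
	lister buildLister
}

func (bc *buildCollector) Describe(ch chan<- *prometheus.Desc) {
	ch <- failedBuildCountDesc
}

func (bc *buildCollector) Collect(ch chan<- prometheus.Metric) {
	// Sum failures by reason in a single pass over the cache, then emit
	// one constant gauge per reason observed.
	reasons := map[string]int{}
	for _, b := range bc.lister.cachedBuilds() {
		if b.phase == "Failed" {
			reasons[b.reason]++
		}
	}
	for reason, count := range reasons {
		ch <- prometheus.MustNewConstMetric(failedBuildCountDesc, prometheus.GaugeValue, float64(count), reason)
	}
}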

@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 24, 2017
@openshift-merge-robot openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 24, 2017
@gabemontero
Contributor Author

OK, I've just pushed updates in response to the various comments.

Specifics:

  • saved some kittens, template parameters gone

  • alerts removed from template, per the metrics analysis phase we are in

  • net, template no longer changed

  • updated README with some sample queries, where those queries are of the form I initially could see us executing online as part of assessing metrics / forming alerts later on

  • extended test massaging / simplification; comments on where various prometheus apis either are not vendored into origin yet, or not sufficient

  • on "top bug review" and stats that would have assisted diagnosis:

    1. all builds failed because of registry issues; so no more successful builds after the registry failure, bunch of failed builds for a given reason, the rate of failed builds increasing after the registry failure
    2. build pods were not getting created and scheduled properly; so a bunch of builds in new/pending
    3. overwhelmed docker registry delay docker push; builds slower than expected
  • affect on metrics: added/modified metrics; may delete a metric, but have not yet (want to discuss); specifics:

    1. broke out failed builds from the complete/cancelled/error; those three don't have failed reasons, so was able to stay at 1 label for both count/gauges, where the failed count just has the reason label, and the terminal(complete/cancelled/error) just has the phase label

    2. adding a new/pending constant metric with unix time, a la the running metric; since start time is not set on those, using the api obj create time; in theory this could be a generic pod metric, but don't see those yet, and I don't want to wait; also, this lines up with ops current monitoring with pending builds via zabbix

    3. for the running constant time metric, which would identify slow builds, overwhelmed docker registries, etc.; I'm now not convinced this will be useful ... what will be the too slow threshold??? I'm inclined to delete, but decided to wait for feedback before doing so.
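For item 2, a rough sketch of the kind of descriptor and helper being described; the metric name follows the newPendingBuildCount constant visible later in this diff, while the namespace/name labels and the addTimeGauge helper are hypothetical illustrations:

package prometheus

import (
	"time"

	"github.com/prometheus/client_golang/prometheus"
)

// Illustrative descriptor for the new/pending constant metric described above.
var newPendingBuildDesc = prometheus.NewDesc(
	"openshift_build_new_pending_phase_creation_time_seconds",
	"Unix creation time of builds still in the New or Pending phase",
	[]string{"namespace", "name"},
	nil,
)

// addTimeGauge emits one constant gauge per new/pending build, valued at the
// build object's creation time (builds in these phases have no start time yet).
func addTimeGauge(ch chan<- prometheus.Metric, desc *prometheus.Desc, namespace, name string, created time.Time) {
	ch <- prometheus.MustNewConstMetric(desc, prometheus.GaugeValue, float64(created.Unix()), namespace, name)
}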

@bparees
Contributor

bparees commented Sep 26, 2017

for the running constant time metric, which would identify slow builds, overwhelmed docker registries, etc.; I'm now not convinced this will be useful ... what will be the too slow threshold??? I'm inclined to delete, but decided to wait for feedback before doing so.

let's hang onto it for now and see how it performs in the real world.

Contributor

@bparees bparees left a comment

some nits but i'm in agreement w/ the general changes.


> openshift_build_failed_phase_total{}

Returns the latest totals for failed builds.
Contributor

how is this different from the count() query above?

Contributor

if the count() query above is doing a sum of all the collected data points ("yesterday there were a total of 100 failed builds in the system", "today there were a total of 101", "count=201") that doesn't seem too useful since it's going to double (or triple or quadruple..) count a lot of builds.

Contributor Author

this query will return a multi-row table which includes the count by reason; I'll push an update that clarifies that, as well as the aggregation that occurs with count (though that might need to be sum ... I'll sort that out as well).
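For the aggregation point, a small hedged illustration of the difference between the two operators over this metric (query strings only; not part of the PR):

package main

import "fmt"

// count() tallies the number of series (one per reason label value), while
// sum() ... by (reason) adds up the gauge values themselves; the second form
// answers "how many builds failed, per reason".
const (
	failedReasonSeries   = `count(openshift_build_failed_phase_total)`
	failedBuildsByReason = `sum(openshift_build_failed_phase_total) by (reason)`
)

func main() {
	fmt.Println(failedReasonSeries)
	fmt.Println(failedBuildsByReason)
}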

Contributor

got it, thanks. (I think you explained that to me before in my cube, sorry i needed the refresher)

Contributor Author

np / yw

activeBuildCount = "running_phase_start_time_seconds"
activeBuildCountQuery = buildSubsystem + separator + activeBuildCount
newPendingBuildCount = "new_pending_phase_creation_time_seconds"
newPendingBuildCountQuery = buildSubsystem + separator + newPendingBuildCount
)

var (
// decided not to have a separate counter for failed builds, which have reasons,
Contributor

comment is no longer accurate

Contributor Author

sure isn't :-) ... will remove

}
for reason, count := range reasons {
addCountGauge(ch, failedBuildCountDesc, reason, float64(count))
}
addCountGauge(ch, terminalBuildCountDesc, errorPhase, float64(error))
Contributor

these should probably be counted under failedBuilds with a reason of "Build pod encountered an error" or something.

Contributor

(errored builds are closer to "failed" builds than they are to "completed+canceled" builds).

Contributor Author

yeah, I went back and forth on this, considering each point you just mentioned vs. the fact that at runtime, following a guru search, the StatusReason field is never set for a build in the Error state

bottom line, I'm not convinced ... unless there is more than one reason for builds in the error state, it is only wasteful to add a reason label for it ... we shouldn't use prometheus labels to clarify like this

Contributor

I'm saying we treat "Error" as the reason. Treat Failed and Error builds the same, when applying the reason label for the Error builds, just manufacture a reason.

In other words i want the "failedBuilds" metrics to include the count of Error builds, because otherwise they may be ignored under "terminated" builds as not interesting.

Contributor Author

I see ... ok, we should make the reason string consistent in look/feel with the others ... I'll go with BuildPodError ... there are already BuildPodDeleted and BuildPodExists

Contributor

sounds good.
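A hedged sketch of the agreement here: fold Error-phase builds into the failed counts under a manufactured reason. Phase values are plain strings purely for illustration (the real code switches on buildapi.BuildPhase constants, as in the existing snippet below), and tallyBuild is a hypothetical name:

package prometheus

// buildPodError is the manufactured reason discussed above, consistent in
// style with BuildPodDeleted and BuildPodExists.
const buildPodError = "BuildPodError"

func tallyBuild(phase, reason string, reasons map[string]int) {
	switch phase {
	case "Failed":
		// only failed builds carry a real StatusReason
		reasons[reason]++
	case "Error":
		// error builds never have StatusReason set, so manufacture one
		reasons[buildPodError]++
	}
}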

case buildapi.BuildPhaseFailed:
failed++
// currently only failed builds have reasons
reasons[string(b.Status.Reason)] = 1
case buildapi.BuildPhaseError:
error++
Contributor

you may live to regret this variable name choice.

Contributor Author

ah ... a golang thing perhaps ... good catch / I'll change

e, cc, cp, r := bc.collectBuild(ch, b)
for key, value := range r {
count := reasons[key]
count = count + value
Contributor

count:=reasons[value]+value ?

Contributor

or better reasons[key]=reasons[key]+value?

Contributor Author

yep ... iterated a bit on this piece and didn't finish clean up .. thx
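For completeness, the idiomatic form of that accumulation, as a trivial sketch (mergeReasons is a hypothetical name):

package prometheus

// mergeReasons folds per-build reason counts into the running totals;
// reasons[key] += value is the one-liner being suggested above.
func mergeReasons(reasons, perBuild map[string]int) {
	for key, value := range perBuild {
		reasons[key] += value
	}
}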

@gabemontero
Contributor Author

gabemontero commented Sep 26, 2017

cmd test run looks to have hit flake #16468 though that was presumably fixed / closed by @bparees

then, another cmd test did an oc delete all -l app=helloworld, which resulted in buildconfig "ruby-sample-build" deleted and then Error from server (NotFound): builds "ruby-sample-build-1" not found ... seems like maybe a race condition in deleting bc's and builds with owner refs to the bc's

both of course unrelated to this change

@gabemontero
Contributor Author

responses to sept 26 comments from @bparees pushed

@bparees
Contributor

bparees commented Sep 27, 2017

/lgtm

but will
/hold

to give @smarterclayton a last opportunity to raise concerns before we merge this tomorrow. ( @gabemontero you can remove the WIP label)

@openshift-ci-robot openshift-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Sep 27, 2017
@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Sep 27, 2017
@openshift-merge-robot openshift-merge-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 27, 2017
@gabemontero gabemontero removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Sep 27, 2017
@gabemontero
Contributor Author

Thought it just compiled for me locally, so not obvious what is up:

gmontero ~/go/src/github.com/openshift/origin  (build-alert)$ hack/build-go.sh test/extended/extended.test
++ Building go targets for linux/amd64: test/extended/extended.test
[INFO] hack/build-go.sh exited with code 0 after 00h 00m 19s

looking ...

@gabemontero
Contributor Author

a rebase, make clean, and recompile now reproduces the compile issue

@gabemontero
Contributor Author

yep ... 31fe387#diff-5e0dc194b78051f414a42e52f4caf7dd merged yesterday, changed things up

@openshift-merge-robot openshift-merge-robot added needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. and removed lgtm Indicates that a PR is ready to be merged. labels Sep 27, 2017
@openshift-merge-robot openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 27, 2017
@gabemontero
Contributor Author

rebase pushed after the @jim-minter PR merge

at least for the cmd test failures whose logs were available ... they again appear to be random flakes ... I was able to execute the same tests locally

@bparees
Contributor

bparees commented Sep 27, 2017

/lgtm
/unhold

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Sep 27, 2017
@bparees bparees removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Sep 27, 2017
@gabemontero
Contributor Author

cmd test failures same as last time:

=== BEGIN TEST CASE ===
test/cmd/authentication.sh:101: executing 'oc get --raw /metrics --as=user3' expecting success and text 'apiserver_request_latencies'
FAILURE after 0.213s: test/cmd/authentication.sh:101: executing 'oc get --raw /metrics --as=user3' expecting success and text 'apiserver_request_latencies': the command returned the wrong error code; the output content test failed
There was no output from the command.
Standard error from the command:
Unable to connect to the server: unexpected EOF
=== END TEST CASE ===

and

=== BEGIN TEST CASE ===
hack/test-cmd.sh:126: executing 'oc delete project 'cmd-authentication'' expecting success
FAILURE after 0.177s: hack/test-cmd.sh:126: executing 'oc delete project 'cmd-authentication'' expecting success: the command returned the wrong error code
There was no output from the command.
Standard error from the command:
The connection to the server 172.17.0.2:28443 was refused - did you specify the right host or port?
=== END TEST CASE ===

But I can run make test-cmd, hack/test-cmd.sh test/cmd/authentication.sh locally ... not sure how to debug, but will wait to see what is up with other tests when logs are available.

@gabemontero
Contributor Author

Found it:

panic: assignment to entry in nil map
goroutine 186758 [running]:
github.com/openshift/origin/pkg/build/metrics/prometheus.(*buildCollector).collectBuild(0x10ea7470, 0xc429102000, 0xc43682d200, 0x0, 0x1, 0x0)
	/go/src/github.com/openshift/origin/_output/local/go/src/github.com/openshift/origin/pkg/build/metrics/prometheus/metrics.go:146 +0x1ab
github.com/openshift/origin/pkg/build/metrics/prometheus.(*buildCollector).Collect(0x10ea7470, 0xc429102000)
	/go/src/github.com/openshift/origin/_output/local/go/src/github.com/openshift/origin/pkg/build/metrics/prometheus/metrics.go:99 +0x17c
github.com/openshift/origin/vendor/github.com/prometheus/client_golang/prometheus.(*Registry).Gather.func2(0xc435ab4920, 0xc429102000, 0xefc10a0, 0x10ea7470)
	/go/src/github.com/openshift/origin/_output/local/go/src/github.com/openshift/origin/vendor/github.com/prometheus/client_golang/prometheus/registry.go:382 +0x61
created by github.com/openshift/origin/vendor/github.com/prometheus/client_golang/prometheus.(*Registry).Gather
	/go/src/github.com/openshift/origin/_output/local/go/src/github.com/openshift/origin/vendor/github.com/prometheus/client_golang/prometheus/registry.go:383 +0x2ec

will need to push another update.

Not sure yet why it happens on ci and not locally, but given the nil ref / panic, does not matter.
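For reference, the failure in that trace is Go's "assignment to entry in nil map": a map that is declared but never initialized panics on the first write. A minimal reproduction and fix (illustrative only; the actual fix goes into metrics.go):

package main

func main() {
	var uninitialized map[string]int
	_ = uninitialized
	// uninitialized["BuildPodError"]++ // would panic: assignment to entry in nil map

	// Initializing with make (or a map literal) before writing avoids the panic.
	reasons := make(map[string]int)
	reasons["BuildPodError"]++
}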

@gabemontero
Contributor Author

must be some sort of golang / version / compile diff thingy

@openshift-merge-robot openshift-merge-robot removed the lgtm Indicates that a PR is ready to be merged. label Sep 27, 2017
@bparees
Contributor

bparees commented Sep 27, 2017

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Sep 27, 2017
@openshift-merge-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bparees, gabemontero

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@gabemontero
Contributor Author

the cmd tests are still running, but the authentication.sh tests that failed before have passed

@bparees bparees added the kind/bug Categorizes issue or PR as related to a bug. label Sep 28, 2017
@gabemontero
Contributor Author

unrelated cmd /test cmd

@gabemontero
Contributor Author

/test cmd

@gabemontero
Contributor Author

/test extended_networking_minimal

@openshift-bot
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-merge-robot
Contributor

Automatic merge from submit-queue.

@openshift-merge-robot openshift-merge-robot merged commit f86e504 into openshift:master Sep 28, 2017
@gabemontero gabemontero deleted the build-alert branch September 28, 2017 13:56