feat(moniker): Use moniker in TrafficGuard. #1727
Conversation
@@ -58,4 +59,20 @@ static public Moniker monikerFromStage(Stage stage) {
return null;
}
}

static public Moniker monikerFromStage(Stage stage, String fallbackFriggaName) {
Moniker moniker = monikerFromStage(stage);
Q: what happens when you have a stage that's dealing with potentially > 1 server group or cluster? `stage.context.moniker` seems to imply there is only ever one, and nothing in the context key indicates its intent. The rollback stages are a current example with > 1 server group specified.
That's a great point. I just tried a rollback and noticed this is what is sent from Deck to Orca:
{
"application" : "nginx",
"name" : "Rollback Server Group: nginx-trafficguards-v011",
"appConfig" : null,
"stages" : [ {
"rollbackType" : "EXPLICIT",
"rollbackContext" : {
"rollbackServerGroupName" : "nginx-trafficguards-v011",
"targetHealthyRollbackPercentage" : 100,
"restoreServerGroupName" : "nginx-trafficguards-v010"
},
"platformHealthOnlyShowOverride" : false,
"type" : "rollbackServerGroup",
"moniker" : {
"app" : "nginx",
"cluster" : "nginx-trafficguards",
"detail" : null,
"sequence" : 11,
"stack" : "trafficguards"
},
"region" : "us-west-2",
"credentials" : "aws-dev",
"cloudProvider" : "aws",
"user" : "anonymous",
"refId" : "0",
"requisiteStageRefIds" : [ ]
} ],
"origin" : "deck"
}
So in the case of TrafficGuard, the moniker:
"moniker" : {
"app" : "nginx",
"cluster" : "nginx-trafficguards",
"detail" : null,
"sequence" : 11,
"stack" : "trafficguards"
}
would be passed through. In TrafficGuard itself, only `moniker.cluster` is used. In ClusterMatcher, only `moniker.stack` and `moniker.detail` are used. `moniker.sequence` is not used. So the moniker is more of a cluster moniker than a server group moniker.
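To illustrate why only those fields matter, here is a minimal sketch (a hypothetical helper, not Spinnaker code) of how a Frigga-convention cluster name is composed from the moniker fields that TrafficGuard and ClusterMatcher actually consume:

```java
public class ClusterNameSketch {
    // Hypothetical helper, not Spinnaker code: composes the Frigga-style
    // cluster name from the only moniker fields consulted here
    // (app, stack, detail). moniker.sequence never participates.
    public static String clusterName(String app, String stack, String detail) {
        StringBuilder name = new StringBuilder(app);
        boolean hasStack = stack != null && !stack.isEmpty();
        boolean hasDetail = detail != null && !detail.isEmpty();
        if (hasStack) name.append('-').append(stack);
        if (hasDetail) {
            if (!hasStack) name.append('-'); // Frigga keeps the empty stack slot
            name.append('-').append(detail);
        }
        return name.toString();
    }
}
```

For the payload above, `clusterName("nginx", "trafficguards", null)` yields `nginx-trafficguards`, matching `moniker.cluster`.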
@ajordens Do you know an example of a stage that uses multiple clusters?
LGTM - curious about @ajordens comment on the multiple clusters before merging
@@ -37,6 +38,7 @@ String getClouddriverOperation() {
@Override
void validateClusterStatus(Map<String, Object> operation) {
trafficGuard.verifyTrafficRemoval((String) operation.get("serverGroupName"),
(Moniker) operation.get("moniker"),
Does this require an `ObjectMapper`? I remember that biting us in a prior PR.
(there are a few other places in this PR that do the same)
*/
Moniker getMoniker() {
// serverGroup.moniker is a Map type, but Groovy is able to convert it to a Moniker type.
return serverGroup.moniker
if it is indeed a Map ... it may be safer to use `objectMapper.convertValue(serverGroup.moniker, Moniker)` than to rely on Groovy "magic".
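To illustrate the concern, a sketch (with a placeholder `Moniker` class, not the real one): in plain Java, a map pulled out of a deserialized operation cannot simply be cast to the target type, which is why an explicit `ObjectMapper.convertValue`-style conversion is the safer route.

```java
import java.util.HashMap;
import java.util.Map;

public class CastSketch {
    // Placeholder stand-in for the real Moniker class.
    public static class Moniker { public String app; }

    // Returns true when the raw object cannot be cast to Moniker.
    public static boolean castFails(Object raw) {
        try {
            @SuppressWarnings("unused")
            Moniker ignored = (Moniker) raw; // Groovy would coerce a Map here; Java throws
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // A deserialized operation carries the moniker as a plain Map:
        Map<String, Object> raw = new HashMap<>();
        raw.put("app", "nginx");
        System.out.println(castFails(raw)); // prints "true"
    }
}
```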
@@ -37,6 +39,7 @@ String getClouddriverOperation() {
@Override
void validateClusterStatus(Map<String, Object> operation) {
Could pass a `Moniker` into this method rather than extracting it each time, since every implementation seems to need it.
return;
}

verifyOtherServerGroupsAreTakingTraffic(serverGroupName, location, account, cloudProvider, operationDescriptor);
verifyOtherServerGroupsAreTakingTraffic(serverGroupName, serverGroupMoniker,location, account, cloudProvider, operationDescriptor);
spacing needed: `serverGroupMoniker,location` should be `serverGroupMoniker, location`
}

static public Moniker monikerOrFrigga(Moniker moniker, String friggaName) {
if (moniker == null) {
If interested, a one-liner equivalent: `Optional.ofNullable(moniker).orElse(friggaToMoniker(friggaName))`
Or moniker == null ? friggaToMoniker(friggaName) : moniker
That’s even better.
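Side by side, the three equivalent forms (a sketch; `friggaToMoniker` here is a simplified stand-in for the real MonikerHelper, which delegates to Frigga's `Names` parser). One note on the `Optional` form: `orElseGet` is the lazier choice, since plain `orElse` evaluates the frigga parse even when the moniker is already present.

```java
import java.util.Optional;

public class FallbackSketch {
    // Minimal stand-in for the real Moniker class.
    public static class Moniker {
        public final String app, cluster;
        public Moniker(String app, String cluster) { this.app = app; this.cluster = cluster; }
    }

    // Simplified stand-in for MonikerHelper.friggaToMoniker: strips a
    // trailing "-vNNN" push sequence to get the cluster, and takes the
    // first dash-separated token as the app.
    public static Moniker friggaToMoniker(String friggaName) {
        String cluster = friggaName.replaceAll("-v\\d+$", "");
        return new Moniker(friggaName.split("-")[0], cluster);
    }

    // Form under review.
    public static Moniker withIf(Moniker moniker, String friggaName) {
        if (moniker == null) {
            return friggaToMoniker(friggaName);
        }
        return moniker;
    }

    // Ternary form suggested above.
    public static Moniker withTernary(Moniker moniker, String friggaName) {
        return moniker == null ? friggaToMoniker(friggaName) : moniker;
    }

    // Optional form suggested above (orElseGet defers the parse).
    public static Moniker withOptional(Moniker moniker, String friggaName) {
        return Optional.ofNullable(moniker).orElseGet(() -> friggaToMoniker(friggaName));
    }
}
```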
private void verifyOtherServerGroupsAreTakingTraffic(String serverGroupName, Location location, String account, String cloudProvider, String operationDescriptor) {
Names names = Names.parseName(serverGroupName);
Optional<Map> cluster = oortHelper.getCluster(names.getApp(), account, names.getCluster(), cloudProvider);
private void verifyOtherServerGroupsAreTakingTraffic(String serverGroupName, Moniker serverGroupMoniker, Location location, String account, String cloudProvider, String operationDescriptor) {
Do we still need to pass around `serverGroupName` if we've got a moniker? Maybe we do now as a transition, but couldn't you compare monikers instead of `!serverGroupName.equals(tsg.getName()) &&`?
The server group name isn't contained in the moniker object (only app, stack, detail, cluster, sequence). In the case of the Frigga naming convention, you can reconstruct the name using the moniker object. But that reconstruction won't work for the manifest based k8s provider.
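A sketch of that point (hypothetical helper, not Spinnaker code): under the Frigga convention the name round-trips through the moniker fields, but an arbitrary manifest-based name carries no such structure, so the name still has to travel alongside the moniker.

```java
public class NameSketch {
    // Hypothetical reconstruction of a Frigga-convention server group
    // name (app[-stack][-detail]-vNNN) from moniker fields.
    public static String friggaName(String app, String stack, String detail, int sequence) {
        StringBuilder name = new StringBuilder(app);
        if (stack != null) name.append('-').append(stack);
        if (detail != null) name.append('-').append(detail);
        name.append(String.format("-v%03d", sequence));
        return name.toString();
    }
}
```

For the rollback payload above this recovers `nginx-trafficguards-v011`, but a Kubernetes manifest name (say, a hypothetical `my-nginx-deployment`) embeds no app/stack/detail/sequence, so no such reconstruction exists.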
Is it silly to ask why we couldn’t have it also contain the serverGroupName?
@lwander can probably give you a more thorough explanation. From my POV, adding the name isn't necessary to facilitate the migration away from frigga, and it would also require a significant amount of effort: if you search for `serverGroupName` in Orca there are 362 occurrences.
FYI, this is related to another point: there is now a lot of redundancy with application/stack/detail in the stage data payload. In many places those fields are already included at the root level of the stage and are now in the moniker as well. A follow-up task will probably be to consolidate those.
No, that's a good question. We don't want the moniker to be the identifier for the object, since the point of the moniker is to decouple the name from the relationships (app, cluster) that frigga presented us with. The separation of concerns is: `name` is a way to identify an object, and `moniker` decides which cluster and application an object belongs to.
When you assign or derive a moniker from an object, that object doesn't even have to provide us with a name (e.g. EC2 tags), so the idea is really to keep these concepts separate.
My point about including the name was somewhat predicated on the fact that when I hear `Moniker`, I think of a name.
The relationships are really part of the identity of an object, which is why we need to pass application/cluster/region/name/etc. around everywhere.
My biggest grief is that we're slowly developing more and more ways to specify the coordinates for an object: serverGroupName, asgName, regions, region, zone, zones, moniker, etc.
In the case of moniker it's only dealing with names, so we can ignore regions/region/zone/zones and think about it only as a partial coordinate.
I hold out hope that there will be some consolidation, as it's difficult and error prone to keep everything straight.
Just my 2c.
I think it would be cleaner and possible to pass a Moniker into validateClusterStatus but otherwise this PR fits the existing patterns and can be merged.
My hope is this kind of clean up/refactor will point out more places where we can consolidate the many names/coordinates you point out. We'll keep this in mind going forward.
LGTM otherwise. @lwander my original comment about moniker being singular was rooted in the fact that there is no constraint that a stage only operates on a single cluster / server group. I would not be surprised if this causes some grief with some future stage impl, or hinders being able to use monikers everywhere that we would otherwise use Frigga.
@ajordens I believe I have addressed all of your prior comments. If you wouldn't mind giving it another look to make sure my changes fall in line with what you had in mind, I'd appreciate it.
Minor feedback but agree with the direction.
* Used in TrafficGuard, which is Java, which doesn't play nice with @Delegate
*/
Moniker getMoniker() {
ObjectMapper objectMapper = new ObjectMapper();
Should inject an `ObjectMapper` into the constructor (it'll be autowired) vs. explicitly creating one here.
Since TargetServerGroup is not a bean, I'm not quite sure how to do what you are asking. I pulled it out and made it `final static` for now.
@@ -104,8 +104,9 @@ public TaskResult execute(Stage stage) {
targetServerGroups.forEach( targetServerGroup -> {
Map<String , Map> tmp = new HashMap<>();
Map operation = targetServerGroup.toClouddriverOperationPayload(request.getCredentials());

validateClusterStatus(operation);
Moniker moniker = targetServerGroup.getMoniker() == null ?
Slightly cleaner alternative to a somewhat ugly multiline ternary.
Moniker moniker = targetServerGroup.getMoniker();
if (moniker == null) {
moniker = MonikerHelper.friggaToMoniker(targetServerGroup.getName())
}
if (TargetServerGroup.isDynamicallyBound(stage)) {
TargetServerGroup tsg = TargetServerGroupResolver.fromPreviousStage(stage);
return tsg.getMoniker() == null ? MonikerHelper.friggaToMoniker(tsg.getName()) : tsg.getMoniker();
} else {
don't need the `else {`, and since this is Groovy, we can do:
return MonikerHelper.monikerFromStage(stage, serverGroupName ?: asgName)
@ajordens I took care of the style changes. I'm not really sure what to do about autowiring
odd ... I don't know wth I was looking at to think it was a
I've updated the branch and will merge when the build passes.
* feat(moniker): Use moniker in TrafficGuard. (spinnaker#1727)
  * feat(moniker): Use monker in TrafficGuard.
  * Check for null monikers and fall back on frigga
  * Formatting
  * Refactor
  * Pass moniker to validateClusterStatus()
  * Style cleanup
* chore(dependencies): update to latest spinnaker-dependencies version * refactor(expressions): Remove v1 SPEL code (spinnaker#1817) - Making v2 as the default engine for expressions - More improvements on the way * chore(dependencies): bump Mockito and Hamkrest * fix(fastproperties): prevent FP stuff getting written to global context * chore(dependencies): bump Kotlin to 1.2 * fix(templates): Tolerate all thrown failures on execution lookup. (spinnaker#1822) * feat(provider/kubernetes): insert artifacts during deploy (spinnaker#1823) * Make WaitForClusterDisableTask configurable in yml (spinnaker#1824) * fix(core): Missing closing brace (spinnaker#1826) * chore(systemd_logs): Remove unneeded log redirection. (spinnaker#1825) * feat(artifacts): support 'use prior execution' (spinnaker#1827) * fix(fastproperties): correct separation of context and output values in FP stage * chore(mahe): remove mahe (spinnaker#1830) * fix(expressions): expressions can reference prior stage outputs (spinnaker#1828) * fix(expressions): expressions can reference prior stage outputs * feat(provider/kubernetes): deploy from artifact (spinnaker#1831) * feat(pipeline_template) Add strategyId tag to render ids by application and strategy name (spinnaker#1833) * fix(moniker): fix cluster if detail is set to empty via SpEL (spinnaker#1832) * fix(job): retry on call to clouddriver for job status (spinnaker#1834) * feat(pipeline_template) Allow partials to be injected from template configuration. (spinnaker#1798) * Removed a method that was added twice during the merge. * Implemented getRegion for EcsImageDetails.
This PR removes the use of frigga from `TrafficGuard`.

Flow:

First, a `Map<String, Object> operation` is generated from a `Stage` object in `TargetServerGroup.toClouddriverOperationPayload()` and `AbstractServerGroupTask.convert()`. A `Stage` object may or may not have a moniker; if it comes from an older pipeline config, it will not have one.

The `operation` is used by `validateClusterStatus()` in:

- `TerminateInstanceAndDecrementServerGroupTask.groovy`
- `TerminateInstancesTask.groovy`
- `BulkDestroyServerGroupTask.java`
- `BulkDisableServerGroupTask.java`
- `DestroyServerGroupTask.groovy`
- `DisableServerGroupTask.groovy`
- `ResizeServerGroupTask.groovy`
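For illustration, here is a minimal sketch (hypothetical class and method names, not actual orca code) of the two `operation` shapes these tasks can receive: a newer pipeline config carries an explicit `moniker` sub-map, while an older config only carries the frigga-style `serverGroupName`.

```java
import java.util.Map;

public class OperationShapes {
  // Newer pipeline configs: the operation payload carries an explicit moniker.
  static Map<String, Object> newStyleOperation() {
    return Map.of(
        "serverGroupName", "nginx-trafficguards-v011",
        "moniker", Map.of(
            "app", "nginx",
            "stack", "trafficguards",
            "cluster", "nginx-trafficguards",
            "sequence", 11));
  }

  // Older pipeline configs: no moniker, only the frigga-style name to parse.
  static Map<String, Object> oldStyleOperation() {
    return Map.of("serverGroupName", "nginx-trafficguards-v011");
  }

  public static void main(String[] args) {
    System.out.println(newStyleOperation().containsKey("moniker"));  // true
    System.out.println(oldStyleOperation().containsKey("moniker"));  // false
  }
}
```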
When calling `TrafficGuard.verifyTrafficRemoval()`, a moniker is generated (via `MonikerHelper` and frigga) if one can't be found in the `operation`.
Once we transition all old pipeline configs to use monikers, we can remove the moniker generation. In the case of the new Kubernetes provider, a moniker is always expected to be present.

### Testing
To test, I made an app in Spinnaker and enabled Traffic Guards. I then created a pipeline that tries to disable, resize, and destroy the guarded server group, and made a copy of that pipeline with the monikers removed.
You can find the pipeline configurations here:
After running the pipelines, I got the expected results:
Note: These pipelines will be added to our internal test suite that runs against our installation of Spinnaker.
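As a rough sketch of the fallback behavior described above — using simplified stand-ins rather than the actual `MonikerHelper`/frigga code, and reducing frigga's name parsing to a plain `app-stack-detail-vNNN` split — resolving a moniker from an `operation` might look like:

```java
import java.util.Map;

public class MonikerFallbackSketch {
  // Simplified stand-in for com.netflix.spinnaker.moniker.Moniker.
  record Moniker(String app, String stack, String detail, String cluster) {}

  // Prefer the moniker embedded in the operation; otherwise fall back to
  // frigga-style parsing of the server group name.
  static Moniker monikerFromOperation(Map<String, ?> operation) {
    Object moniker = operation.get("moniker");
    if (moniker instanceof Map<?, ?> m) {
      return new Moniker(
          (String) m.get("app"),
          (String) m.get("stack"),
          (String) m.get("detail"),
          (String) m.get("cluster"));
    }
    return fromFriggaName((String) operation.get("serverGroupName"));
  }

  // Reduced frigga-style parse: strip the -vNNN push sequence, then split
  // the remaining cluster name into app / stack / detail.
  static Moniker fromFriggaName(String name) {
    String cluster = name.replaceFirst("-v\\d+$", "");
    String[] parts = cluster.split("-", 3);
    String app = parts[0];
    String stack = parts.length > 1 ? parts[1] : null;
    String detail = parts.length > 2 ? parts[2] : null;
    return new Moniker(app, stack, detail, cluster);
  }

  public static void main(String[] args) {
    Moniker m = monikerFromOperation(Map.of("serverGroupName", "nginx-trafficguards-v011"));
    System.out.println(m.cluster()); // nginx-trafficguards
  }
}
```

For example, an old-style operation with only `serverGroupName: nginx-trafficguards-v011` resolves to app `nginx`, stack `trafficguards`, cluster `nginx-trafficguards` — the same values Deck now sends explicitly in the `moniker` field.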