fix(moniker): Use the correct moniker when applying source server group #1805
Conversation
@@ -80,6 +82,26 @@ class ApplySourceServerGroupCapacityTask extends AbstractServerGroupTask {
     }
   }

+  @Override
+  Moniker convertMoniker(Stage stage) {
Confused as to how this actually works ... you've got two `return` statements in the same scope?

Anyhow ... this stage should be applying the previously captured capacity to the newly created server group. Generally speaking, I believe it should be applicable across cloud providers and certainly must be used if `CaptureSourceServerGroupCapacityTask` is used (as this task pins min=desired capacity).
Ah, thank you for pointing that out. The first `return` was left over from debugging. It would still work, but is not the intended behavior.

The intended behavior is to grab the target server group that will be used in the resize operation, then return its moniker. Since we are getting the target server group from Clouddriver, it is guaranteed to have a moniker.
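For context, a rough sketch of the override being discussed follows; only the signature and the return/catch lines are verbatim from the diff, and the lookup helper in the middle is an assumption:

```groovy
@Override
Moniker convertMoniker(Stage stage) {
  try {
    // Assumed lookup: fetch the target server group of the resize operation
    // from clouddriver. A server group returned by clouddriver always carries
    // a moniker, so there is no need to guess one from the name.
    TargetServerGroup targetServerGroup = fetchTargetServerGroup(stage) // hypothetical helper
    return targetServerGroup.getMoniker()
  } catch (Exception e) {
    log.error("Unable to apply source server group capacity (executionId: ${stage.execution.id})", e)
    return null
  }
}
```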
Force-pushed 1d97528 to ced874d
+      return targetServerGroup.getMoniker()
+    } catch (Exception e) {
+      log.error("Unable to apply source server group capacity (executionId: ${stage.execution.id})", e)
Please add the offending server group name / region / account to this error message.

Rather than returning null on an exception, could a reasonable fallback be to construct a moniker via frigga?

The duplicate call to clouddriver is definitely sub-optimal given that (a) it fetches far more data than just a moniker and (b) it literally happened in the immediately preceding call (and would have also happened previously, since we do have the serverGroupName in the context).

In the short-term this is probably alright, but I would like an alternative to retrieving full server groups just to get the moniker. This may end up being a simple `/moniker` endpoint in clouddriver taking a few query parameters?
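For reference, a frigga-based fallback along the lines suggested might look roughly like this (a sketch assuming frigga's `Names` parser and the standard `Moniker` builder; the method name is illustrative):

```groovy
import com.netflix.frigga.Names
import com.netflix.spinnaker.moniker.Moniker

// Best-effort moniker derived from a frigga-style server group name such as
// "myapp-stack-detail-v003". Only meaningful for naming schemes frigga
// understands, which is why it is a fallback rather than the primary path.
static Moniker monikerFromServerGroupName(String serverGroupName) {
  Names names = Names.parseName(serverGroupName)
  return Moniker.builder()
    .app(names.app)
    .cluster(names.cluster)
    .stack(names.stack)
    .detail(names.detail)
    .sequence(names.sequence)
    .build()
}
```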
I broke up the exception handling into more specific cases. I think you are right: in the case that the moniker is in the wrong place in the payload, malformed, etc., we should fall back on frigga. In the case that there is a clouddriver problem or something else unexpected, falling back on frigga might obfuscate the true error downstream (this is more of an issue for non-frigga providers).
I agree that the double call to clouddriver is undesirable. I'd also love to avoid it. There are a few options here:

- Store the response as a class member (sketched below).
- Rework `execute()` in the base class, either by:
  a. combining `convert()` and `convertMoniker()`, or
  b. returning different types from `convert()` and using it in `convertMoniker()`.
- Put a lighter weight call into clouddriver (as you pointed out).

I figured duplicating what was in `convert()` introduced the least amount of risk, albeit at the cost of a double call.
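A minimal sketch of the first option, with assumed helper and field names (note that orca tasks are typically singleton beans, so per-execution state needs to be keyed by stage rather than held in a plain field):

```groovy
import java.util.concurrent.ConcurrentHashMap

// Illustrative only: memoize the clouddriver response per stage so that
// convert() and convertMoniker() share a single lookup instead of two.
// A real implementation would also need to evict entries.
private final Map<String, TargetServerGroup> serverGroupsByStageId = new ConcurrentHashMap<>()

private TargetServerGroup cachedTargetServerGroup(Stage stage) {
  return serverGroupsByStageId.computeIfAbsent(stage.id) { fetchTargetServerGroup(stage) } // hypothetical fetch
}
```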
As an aside, there is a certain amount of urgency to this PR since k8s rollbacks currently do not work.
There are no retries around the call to clouddriver, and the impact of ever having a null `Moniker` is likely to be an NPE. In this particular task, nothing actually appears to use the moniker (`validateClusterStatus()` isn't overridden).

I'm calling these sorts of issues out because we've run into a number of issues lately around network failures between orca <-> clouddriver (or other services) without retries. Similarly, at scale, we've noticed unnecessary overhead from inefficient calls to fetch data from clouddriver. A lot of these calls have been optimized for aws, but we still try and avoid them where not strictly necessary.
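A sketch of the kind of retry wrapper being asked for here; it is not an existing orca utility, and the attempt count and backoff are arbitrary:

```groovy
// Retry a transient-failure-prone call (e.g. to clouddriver) a few times
// with linear backoff before rethrowing the last error.
static <T> T withRetries(int maxAttempts, long backoffMillis, Closure<T> call) {
  Exception lastError = null
  for (int attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return call.call()
    } catch (Exception e) {
      lastError = e
      if (attempt < maxAttempts) {
        Thread.sleep(backoffMillis * attempt)
      }
    }
  }
  throw lastError
}

// usage (hypothetical): def sg = withRetries(3, 500) { oortHelper.getTargetServerGroup(stage) }
```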
Force-pushed d59291d to f5a8a00
@@ -74,7 +74,7 @@ abstract class AbstractServerGroupTask extends AbstractCloudProviderAwareTask im
       return new TaskResult(ExecutionStatus.SUCCEEDED)
     }

-    def taskId = kato.requestOperations(cloudProvider, [[(serverGroupAction): operation]])
+    def taskId = kato.requestOperations(operation.cloudProvider, [[(serverGroupAction): operation]])
@ajordens This fixes the problem in the k8s rollback stage. `getCloudProvider(stage)` was defaulting to `aws`. Changing the clouddriver request to use the operation's `cloudProvider` makes a lot of sense to me conceptually, but it will have an impact on a lot of other stages and I'd love your eyes on it.
Worth noting this breaks all non-aws platforms too.
We have a slightly cleaner fix Andrew is patching up in a minute.
Hah, yup ... that's an oldie right there!

I wonder if we shouldn't at least visibly log when a `cloudProvider` hasn't been specified and we're falling back to `aws`.
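Something along these lines would make the fallback visible; a sketch, since the actual helper on `AbstractCloudProviderAwareTask` may be shaped differently:

```groovy
String getCloudProvider(Stage stage) {
  def cloudProvider = stage.context.cloudProvider
  if (!cloudProvider) {
    // Surface the implicit default instead of silently treating every
    // unspecified stage as an aws one.
    log.warn("No cloudProvider in stage context, falling back to 'aws' " +
      "(executionId: ${stage.execution.id}, stageId: ${stage.id})")
    return "aws"
  }
  return cloudProvider as String
}
```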
Any concerns with merging @ajordens? This fixes rollback for all non-aws platforms.
The … It would seem like for …

Split out the …

My earlier point is that this … Right?
Force-pushed 79f6a8a to a099975
Oh man, that is a great observation. I didn't realize there wasn't a trafficguard check here. You're right, I tested it for k8s and aws and returning null is sufficient.
👍
Force-pushed a099975 to f049bae
Will merge after Travis.
Force-pushed f049bae to 4606b3d
(Rebase: the force-push merged upstream master into this branch, pulling in the commits spinnaker#1606 through spinnaker#1833 listed by the compare view, plus two merge fixups: removing a method that was added twice during the merge, and implementing getRegion for EcsImageDetails.)
`ApplySourceServerGroupCapacityTask` overrides `convert()`; `convertMoniker()` also needs to be overridden. The moniker used when verifying the cluster should be the same as the server group in the operation.

Everything seems to be working just fine for AWS, but the k8s rollback stage is currently failing. There is a bigger problem that @lwander is working on as to why k8s is using this stage that is in the aws provider.

@ajordens - Would you mind taking a look at this for me? This task seems to work a lot differently than the others that extend `AbstractServerGroupTask`, and based on the blame you have the most experience with it.