Releases: logicmonitor/k8s-argus
v6.0.0-beta3
v6.0.0-beta
Beta version - not recommended in production
v6.0.0-beta9
v5.1.3
This release includes Docker image fixes for release v5.1.2, along with the following fixes.
Fixes:
- Argus can now evaluate discovery filtering expressions during the periodic update of devices.
- Argus now performs graceful, periodic reconciliation of pod devices when their IP addresses do not match.
To enable graceful reconciliation, add the property "resync.pods = true" at the Kubernetes cluster level in the LM portal and restart the Argus pod.
Upgrade Steps
- The Helm chart version for the new Argus release is 0.18.0.
- Run helm repo update, then upgrade Argus using the command:
    helm upgrade --reuse-values -f argus-config.yaml argus logicmonitor/argus
- Recreate the Argus pod if it is not recreated automatically. Helm does not recreate pods when there is no change in their definitions.
v5.1.2
Fixes:
- Argus can now evaluate discovery filtering expressions during the periodic update of devices.
- Argus now performs graceful, periodic reconciliation of pod devices when their IP addresses do not match.
To enable graceful reconciliation, add the property "resync.pods = true" at the Kubernetes cluster level in the LM portal and restart the Argus pod.
Upgrade Steps
- The Helm chart version for the new Argus release is 0.18.0.
- Run helm repo update, then upgrade Argus using the command:
    helm upgrade --reuse-values --set imageTag="v5.1.2" -f argus-config.yaml argus logicmonitor/argus
- Recreate the Argus pod if it is not recreated automatically. Helm does not recreate pods when there is no change in their definitions.
v5.1.1
Release 5.1.1
Fixes
Improved retrieval of device groups by using the parentId, so that even when multiple device groups share the same name, Argus picks the correct group.
Argus continues adding device groups even if one of the device group add API calls fails.
Argus checks for an empty device name before sending an update call to Santaba.
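The parentId-based lookup can be pictured with a minimal sketch. It is an illustration only, not Argus's code: the list-of-dicts shape and the "id" and "name" keys are assumptions, and only parentId comes from the note above.

```python
def find_group(groups, name, parent_id):
    """Return the group matching both name and parent, or None.

    `groups` is a hypothetical list of dicts standing in for an API response.
    """
    for group in groups:
        if group["name"] == name and group["parentId"] == parent_id:
            return group
    return None


# Two groups share the name "Pods" but live under different parents.
groups = [
    {"id": 11, "name": "Pods", "parentId": 1},
    {"id": 22, "name": "Pods", "parentId": 2},
]
print(find_group(groups, "Pods", 2)["id"])  # prints 22
```

Matching on the name alone would return whichever "Pods" group comes first, which is the ambiguity the fix removes.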
Upgrade Steps
- Upgrade helm charts to the new release (helm repo update): argus chart version 0.18.0.
- Run helm repo update, then upgrade Argus and collectorset-controller using the commands:
    helm upgrade -f argus-config.yaml argus logicmonitor/argus
- Recreate the Argus pods if they are not recreated automatically. Helm does not recreate pods when there is no change in their definitions.
v5.1.0
Release Documents
Release 5.1.0
What's New
- With this release, users can set custom properties at the cluster and device group levels to enable events and log collection.
- Argus can now automatically delete Kubernetes devices from the "_deleted" device group at a configurable interval.
- Added support for Kubernetes version 1.20 for generating self-links for resources.
Improvements
- Users can configure time intervals for Periodic discovery, Periodic delete and Device cache sync in the argus-config.yaml helm configuration.
- Added the installed Argus version and the Argus version installation history as properties at the cluster level for troubleshooting purposes.
- Improved the device cache to keep device naming consistent.
- Enhanced Argus to discover headless services and HPA resources.
Fixes:
- Fixed a bug where Argus was not adding deleted devices to the "_deleted" group even though the "deleteDevices" configuration was set to false.
Custom properties for Kubernetes logs and event collection:
- To collect Kubernetes logs and events, Argus now accepts custom properties in the helm configuration. Users can add custom properties in the helm charts, and Argus adds those properties to the respective device groups.
- For events collection, lmlogs.k8sevents.enable has been added at the cluster level and applies to all device groups. The default is false.
- For logs collection, lmlogs.k8spodlog.enable has been added at the pods level. The default is false.
To enable events or logs collection, set those properties to True.
    device_group_props:
      cluster:
        # To enable events collection
        - name: "lmlogs.k8sevents.enable"
          value: "false"
      pods:
        # To enable log collection for all pods
        - name: "lmlogs.k8spodlog.enable"
          value: "false"
      services: []
      deployments: []
      nodes: []
      etcd: []
      hpas: []
Users can modify these properties from the LM portal or through the helm configuration. When Argus restarts, each property value is reset to the value set in the helm configuration.
Auto delete Kubernetes devices from the LM portal based on configured duration:
- Argus moves terminated resources to the _deleted dynamic device group if the DeleteDevices parameter is set to false.
- Users can now specify the retention period for these devices using the property kubernetes.resourcedeleteafterduration = P1DT0H0M0S.
- The default value for this property is 1 day, meaning a device is permanently deleted from the LM portal after 1 day. The maximum possible value is 10 days, which is configured in GCC as kubernetes.autoDeleteTask.maxDeleteAfterDuration = P10DT0H0M0S.
- Currently this property is set on the cluster device group, though users can override it for other device groups.
    device_group_props:
      cluster:
        # To delete resources from the portal after specified time
        - name: "kubernetes.resourcedeleteafterduration"
          value: "P1DT0H0M0S"
          override: false
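The retention values above are ISO 8601 durations. The sketch below shows how such a value could be read and compared against the 10-day maximum. It is an illustration under assumptions: the function names are hypothetical, only the day/hour/minute/second form used in these notes is handled, and the notes do not say what the portal does with a value above the maximum.

```python
import re
from datetime import timedelta

# Handles only the PnDTnHnMnS form used in these notes (no weeks/months/years).
_ISO_DURATION = re.compile(r"^P(?:(\d+)D)?(?:T(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?)?$")


def parse_retention(value):
    """Convert a value such as "P1DT0H0M0S" into a timedelta."""
    match = _ISO_DURATION.match(value)
    if match is None or value in ("P", "PT"):
        raise ValueError(f"unsupported duration: {value!r}")
    days, hours, minutes, seconds = (int(part or 0) for part in match.groups())
    return timedelta(days=days, hours=hours, minutes=minutes, seconds=seconds)


# Maximum stated in the release notes.
MAX_RETENTION = parse_retention("P10DT0H0M0S")


def within_limit(value):
    """Report whether a retention period does not exceed the maximum."""
    return parse_retention(value) <= MAX_RETENTION


print(parse_retention("P1DT0H0M0S"))  # prints "1 day, 0:00:00"
print(within_limit("P11DT0H0M0S"))    # prints False
```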
Externalised Time intervals for periodic sync, periodic delete and device cache:
- Users can configure time intervals for Periodic discovery, Periodic delete and Device cache sync in the argus-config.yaml helm configuration.
- Default values are 30m, 30m, 5m and minimum possible values are 10m, 10m, 5m for periodic sync, periodic delete and device cache sync respectively.
    app_intervals:
      periodic_sync_interval: "30m"
      periodic_delete_interval: "30m"
      cache_sync_interval: "5m"
If a configured time interval is below the minimum possible value, Argus uses the default value instead and logs a message such as:
    Please provide valid value for periodic_sync_interval. Since invalid value is configured, forcefully setting it to default 30m.
If the user enters an invalid duration format (anything other than what is described at https://golang.org/pkg/time/#ParseDuration) in the argus-config.yaml helm configuration, the application is forcefully stopped and the error is logged. Argus won't start until the format is corrected.
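The two rules above (fall back to the default below the minimum, refuse a malformed value) can be sketched as follows. This is an illustration, not Argus's Go code: the parser is a simplified stand-in for Go's time.ParseDuration covering only unsigned values with the units ns, us, ms, s, m and h, and the function names are hypothetical.

```python
import re

_UNIT_SECONDS = {"ns": 1e-9, "us": 1e-6, "ms": 1e-3, "s": 1.0, "m": 60.0, "h": 3600.0}
_TOKEN = re.compile(r"(\d+(?:\.\d+)?)(ns|us|ms|s|m|h)")

# Defaults and minimums as listed in the release notes.
DEFAULTS = {"periodic_sync_interval": "30m",
            "periodic_delete_interval": "30m",
            "cache_sync_interval": "5m"}
MINIMUMS = {"periodic_sync_interval": "10m",
            "periodic_delete_interval": "10m",
            "cache_sync_interval": "5m"}


def parse_duration(text):
    """Parse a value such as "30m" or "1h30m" into seconds."""
    position, total = 0, 0.0
    while position < len(text):
        match = _TOKEN.match(text, position)
        if match is None:
            raise ValueError(f"invalid duration: {text!r}")
        total += float(match.group(1)) * _UNIT_SECONDS[match.group(2)]
        position = match.end()
    if position == 0:
        raise ValueError("empty duration")
    return total


def effective_interval(name, configured):
    """Return the interval to use; a malformed value raises ValueError."""
    seconds = parse_duration(configured)
    if seconds < parse_duration(MINIMUMS[name]):
        print(f"{name}: {configured} is below the minimum, using default {DEFAULTS[name]}")
        return DEFAULTS[name]
    return configured


print(effective_interval("periodic_sync_interval", "15m"))  # prints 15m
```

Raising on a malformed value mirrors the documented behaviour that Argus stops rather than guessing at an interval.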
Upgrade Steps
- Remove the old LogicMonitor helm chart repo:
    helm repo remove logicmonitor
- Add the new LogicMonitor helm chart repo:
    helm repo add logicmonitor https://logicmonitor.github.io/k8s-helm-charts
- Upgrade helm charts to the new release (helm repo update): argus chart version 0.17.0 and collectorset-controller chart version 0.11.0.
- Run helm repo update, then upgrade Argus and collectorset-controller using the commands:
    helm upgrade -f argus-config.yaml argus logicmonitor/argus
    helm upgrade -f collectorset-controller-configuration.yaml collectorset-controller logicmonitor/collectorset-controller
- Recreate the Argus and Collectorset-controller pods if they are not recreated automatically. Helm does not recreate pods when there is no change in their definitions.
v5.0.1
Release 5.0.1
Fixes
- Devices that conflict across multiple clusters added to the same LogicMonitor portal can now be added for monitoring.
Upgrade Steps
- Recreate the Argus pod
v5.0.0
Release 5.0.0
What's New
- With this release, Argus discovers HorizontalPodAutoscaler resources and adds them to LogicMonitor for monitoring.
Improvements
- Argus now generates consistent and unique device names. See Device Naming Strategy for details.
- Updated the GitHub-hosted Argus documentation.
- Added support for monitoring pods deployed on Fargate worker nodes.
- Added permission to the /metrics endpoint to monitor the API server.
- Users can add custom labels and annotations to the Kubernetes resources generated by Argus and Collectorset Controller.
HorizontalPodAutoscaler Monitoring
With this release, Argus can monitor HorizontalPodAutoscalers, hereafter referred to as HPA.
Monitoring
Argus adds each HPA as a device in LogicMonitor. The LogicMonitor-provided Kubernetes_HPA datasource monitors the following aspects of it:
- HPA Health: Ensures that the HPA can address the target resource under scale, can retrieve the desired metrics from metric adapters, and can scale replicas up and down to the desired count suggested by the HPA's autoscale algorithm.
- HPA autoscale fluctuations: The datapoints current_replicas and desired_replicas are plotted as a line graph, which helps analyse the trend of the application's autoscale behaviour.
- Resource metrics: Datapoints for resource metrics are collected out of the box.
- The other three categories of HPA metrics are Pods, Custom, and External metrics. As an example, the datasource Groovy script provides datapoints for each category. Users have to modify the Groovy script to collect metrics in any of these categories.
Device Naming Strategy
Adding devices to LogicMonitor under the plain Kubernetes resource name leads to conflicts because of the following Kubernetes characteristics:
- The same Kubernetes resource name can exist in more than one namespace.
- The same Kubernetes resource name can exist for different resource types in the same namespace.
- The same Kubernetes resource name can exist in two different clusters. This scenario is rare and depends on the existence of identical mirrored clusters, e.g. a production cluster and a production mirror environment for staging.
Given these possible device name conflicts, Argus now generates unique device names.
Device name generation
The helm chart configuration parameters fullDisplayNameIncludeNamespace and fullDisplayNameIncludeClusterName take part in device name generation.
The following are the ways to define the device name generation strategy:
- fullDisplayNameIncludeNamespace and fullDisplayNameIncludeClusterName both set to false (default): [Resource Name]-[Resource Type]
  Ex: application1-12345-pod, application1-12345-svc
- fullDisplayNameIncludeNamespace set to true and fullDisplayNameIncludeClusterName to false: [Resource Name]-[Resource Type]-[Namespace]
  Ex: kube-apiserver-123-pod-kube-system
- fullDisplayNameIncludeNamespace set to true and fullDisplayNameIncludeClusterName to true: [Resource Name]-[Resource Type]-[Namespace]-[Cluster Name]
  Ex: kube-apiserver-123-pod-kube-system-prod-cluster. Cluster Name refers to the friendly cluster name given when the cluster was added to LogicMonitor.
- fullDisplayNameIncludeNamespace set to false and fullDisplayNameIncludeClusterName to true: Here fullDisplayNameIncludeClusterName supersedes fullDisplayNameIncludeNamespace, so the result is the same as when both parameters are true.
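The four combinations above reduce to a small rule, sketched here for illustration. The function and argument names are hypothetical and this is not Argus's implementation; only the two flag meanings and the name formats come from the list above.

```python
def device_name(resource, resource_type, namespace, cluster,
                include_namespace=False, include_cluster=False):
    """Build a device name from the two naming flags."""
    parts = [resource, resource_type]
    # The cluster-name flag supersedes the namespace flag, so enabling it
    # always produces the full four-part name.
    if include_namespace or include_cluster:
        parts.append(namespace)
    if include_cluster:
        parts.append(cluster)
    return "-".join(parts)


args = ("kube-apiserver-123", "pod", "kube-system", "prod-cluster")
print(device_name(*args))                          # kube-apiserver-123-pod
print(device_name(*args, include_namespace=True))  # kube-apiserver-123-pod-kube-system
print(device_name(*args, include_cluster=True))    # kube-apiserver-123-pod-kube-system-prod-cluster
```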
How Argus behaves with conflicts when fullDisplayNameIncludeNamespace or fullDisplayNameIncludeClusterName is set to false
When a conflict with another device occurs, Argus adds the new conflicting device with the full unique name [Resource Name]-[Resource Type]-[Namespace]-[Cluster Name], so that the device remains monitored until the user notices the device name conflict and resolves it by opting in to unique names.
Argus also adds the device to the _conflicts device group while handling conflicts, to alert the user to the conflict. When the user opts in to unique names to resolve the conflicts, all resolved devices are automatically removed from the _conflicts group.
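The fallback described above can be pictured with in-memory sets standing in for the portal. This is a sketch under assumptions, not Argus's code: the function name and the two sets are hypothetical, and it models only the default naming strategy with both flags set to false.

```python
def add_device(existing_names, conflicts_group, resource, resource_type,
               namespace, cluster):
    """Add a device under its short name, or its full name on conflict."""
    short_name = f"{resource}-{resource_type}"
    if short_name not in existing_names:
        existing_names.add(short_name)
        return short_name
    # Conflict: fall back to the full unique name and flag the device
    # so the user can see it in the _conflicts group.
    full_name = f"{resource}-{resource_type}-{namespace}-{cluster}"
    existing_names.add(full_name)
    conflicts_group.add(full_name)
    return full_name


names, conflicts = set(), set()
print(add_device(names, conflicts, "web", "pod", "team-a", "prod-cluster"))  # web-pod
print(add_device(names, conflicts, "web", "pod", "team-b", "prod-cluster"))  # web-pod-team-b-prod-cluster
```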
Custom labels and annotations to Argus and Collectorset Controller generated Kubernetes resources
Users can provide custom labels and annotations in the respective helm chart configuration file. The helm chart configuration parameters labels and annotations hold key:value maps. The specified labels and annotations are added to all generated resources, viz. deployment, pod, service, configmap, serviceaccount, clusterrole, etc.
Pre Upgrade steps
- Add the Kubernetes_HPA datasource from the LogicMonitor Exchange beforehand to start monitoring HorizontalPodAutoscaler devices.
Upgrade Steps
- Upgrade helm charts to the new release (helm repo update): argus chart version 0.16.0 and collectorset-controller chart version 0.10.0.
- Configure the helm configuration parameters fullDisplayNameIncludeNamespace and fullDisplayNameIncludeClusterName according to your needs. LogicMonitor recommends setting both to true, which ensures that no device conflicts occur.
- Run helm repo update, then upgrade:
    helm upgrade -f argus-config.yaml argus logicmonitor/argus
- Recreate the Argus and Collectorset-controller pods if helm upgrade does not recreate them automatically. Helm sometimes does not recreate pods when there is no change in their definitions.
Post Upgrade
- With this release, Argus by default names devices after the resource name, so ensure that no devices are present in the _conflicts groups. If the conflict groups contain devices, choose an appropriate naming strategy as described under Device Naming Strategy.
Known issues
- Bug reference DEVTS-9846: If some devices do not adhere to the chosen naming strategy, Argus may not be able to retrieve some devices during the first resynchronisation, when it renames existing devices to match the chosen naming strategy.
  Workaround: Restart Argus a few times until the devices are renamed correctly. We will update you once we resolve the device retrieval bug.
v4.2.0
Release 4.2.0
Improvements
- Enhanced Argus to discover and add devices with the same name across different namespaces.
- Argus collects the Kubernetes version and the Argus chart version as telemetry in device group properties; the cluster device group stores this information.
- Argus periodically synchronises devices on the LogicMonitor platform with their corresponding Kubernetes resources. If a Kubernetes resource no longer exists, Argus removes the device from the LogicMonitor platform.
- Users can install the LM Container solution using a Spinnaker pipeline.
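The periodic synchronisation mentioned in the list above amounts to a set difference between what the portal holds and what the cluster still runs. The sketch below is illustrative only: the function name and the plain lists of names are assumptions, not Argus's data model.

```python
def stale_devices(portal_devices, cluster_resources):
    """Return portal devices whose Kubernetes resource no longer exists."""
    return sorted(set(portal_devices) - set(cluster_resources))


portal = ["api-pod", "worker-pod", "web-svc"]
cluster = ["api-pod", "web-svc"]
print(stale_devices(portal, cluster))  # prints ['worker-pod']
```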
Fixes
- Fixed a bug so that a pod device is no longer added without the system.ips property. This property cannot be updated later, so Argus does not add the device until an IP is available for the pod.
Upgrade Steps
- Important: When upgrading the Argus chart from older versions to 0.14.0 or higher, you need to pass the Argus configuration file in the helm upgrade command. Follow these steps to create the configuration file:
  - Download the configuration file template from here
  - Get the existing values using the command helm get values argus and save them in a backup configuration file.
  - Put all existing values in the downloaded configuration file at their appropriate places, and leave the remaining values at their defaults.
- Upgrade helm charts to the new release (helm repo update): argus chart version 0.15.0 and collectorset-controller chart version 0.9.0.
- Run helm repo update, then upgrade:
    helm upgrade --reuse-values -f argus-config.yaml argus logicmonitor/argus
- Recreate the Argus and Collectorset-controller pods if they are not recreated automatically. Helm does not recreate pods when there is no change in their definitions.
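The manual step of copying existing values into the downloaded template behaves like a recursive merge in which existing values win and untouched template keys keep their defaults. The sketch below illustrates that idea on plain dictionaries. The function name and the sample keys are hypothetical, and it does not read or write real helm files.

```python
def merge_values(template, existing):
    """Overlay existing values on template defaults, recursing into maps."""
    merged = dict(template)
    for key, value in existing.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_values(merged[key], value)
        else:
            merged[key] = value
    return merged


# Hypothetical keys, for illustration only.
template = {"clusterName": "", "debug": False, "collector": {"replicas": 1, "size": "small"}}
existing = {"clusterName": "prod-cluster", "collector": {"replicas": 2}}
print(merge_values(template, existing))
```

Neither input is modified, so the backup of the existing values stays intact.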