Releases: NetApp/harvest
Harvest Nightly Release
Nightly builds may include bugs and other issues. You might want to use the stable releases instead.
24.11.1
24.11.1 / 2024-11-25 Release
Highlights of this major release include:
Performance Improvements
- Significant memory footprint improvements for the REST collector. More details here. Thanks to Ryan for reporting it.
- Reduced memory footprint by using streaming in the REST collector.
New Features
- Harvest supports Top files metrics collection. More details here.
- Volume and Cluster tags are supported via Volume and Cluster dashboards.
- Field Replaceable Unit (FRU) details have been added to the power dashboard.
- Track ONTAP image update progress for a cluster via the Cluster dashboard. Thanks to @knappmi for reporting it.
- prom_port is now supported within the poller. More details here.
- We've fixed an intermittent latency/operations spike issue in the plugin-generated Harvest performance metrics. Thanks to @wooyoungAhn for reporting it.
Announcements
read how to migrate your Prometheus volume
IMPORTANT After upgrade, don't forget to re-import your dashboards, so you get all the new enhancements and fixes. You can import them via the bin/harvest grafana import CLI, from the Grafana UI, or from the Maintenance > Reset Harvest Dashboards button in NAbox3. For NAbox4, this step is not needed.
Thanks to all the awesome contributors
Thanks to all the people who've opened issues, asked questions on Discord, and contributed code or dashboards this release:
@70tas, @BrendonA667, @Falcon667, @mark Jordan, @paqui, Ryan, @CashnMoney, @ceojinhak, @ekolove, @knappmi, @wooyoungAhn
This release includes 14 features, 8 bug fixes, 2 documentation, 3 performance, 1 testing, 1 styling, 7 refactoring, 2 miscellaneous, and 3 ci pull requests.
Features
- Add Tags To The Volume And Cluster Dashboards (#3273)
- Harvest Should Request Cluster Version Once (#3274)
- Top Files Collection (#3279)
- Enable Iface And Recvcheck Linters (#3280)
- Harvest Should Support Per-Poller Prom_ports (#3281)
- Harvest Should Log Number Of Renderedbytes For Each Collector (#3282)
- Asa R2 Should Use Keyperf Instead Of Restperf (#3289)
- Add Top Files Panels In Volume Dashboard (#3292)
- Adding The Ems Doc Link In The Health Dashboard Table (#3295)
- Add Dimm Panels In Power Dashboard (#3296)
- Adding Is_space_enforcement_logical, Is_space_reporting_logical… (#3301)
- Harvest Should Monitor Wafl.dir.size.warning (#3304)
- Add Flexcache Keyperf Template (#3309)
- Add Top Metrics Plugin To Keyperf (#3315)
Bug Fixes
- Set Dashboard Variable To Refresh To Time Range Change. (#3269)
- Correct The Mtu Unit In Network Dashboard (#3278)
- Zapi Collection (#3285)
- Metroclustercheck Collector Should Report Standby When Metroclus… (#3287)
- Missing Volumes After Vol Move (#3312)
- Metroclustercheck Collector Should Report "No Instances" (#3314)
- Panic If No Volumes Have Analytics Enabled (#3323)
- Partial Aggregation Handling In Plugins (#3324)
Documentation
Performance
- Reduce The Memory Footprint Of Rest Collector (#3303)
- Add Streaming To Rest Collector (#3305)
- Improve Memory And Cpu Performance Of Rest Collector (#3310)
Testing
- Sort Exporters For Deterministic Tests (#3290)
Styling
- Fix Logs (#3307)
Refactoring
- Remove Extra Log (#3257)
- Remove Env Logging (#3277)
- Simplify Negotiateontapapi (#3288)
- Keyperf Node Template Should Match Restperf Object Name (#3298)
- Remove Uses Of Nolint:gocritic (#3299)
- Remove Unused Method In Rest Collector (#3308)
- Sync Template Names For Keyperf (#3316)
Miscellaneous
CI
24.11.0
24.11.0 / 2024-11-06 Release
Highlights of this major release include:
- New dashboards:
  - SnapMirror Destinations Dashboard which displays relationship details from the destination perspective.
  - Vscan Dashboard which shows SVM-level and connection scanner details.
- Several of the existing dashboards include new panels in this release:
  - SnapMirror dashboard now includes relationship details from the source perspective and has been renamed to "ONTAP: SnapMirror Sources".
  - Health Dashboard's emergency events panel now includes all emergency EMS events from the last 24 hours.
  - Network Dashboard
    - Includes Link Aggregation Group (LAG) metrics
    - Adds Ethernet port details
  - s3 Object Storage dashboard includes panels for s3 metrics for SVM.
  - Tenant Dashboard
    - Adds Tenant/Bucket Capacity Growth Chart
    - Includes average size per object details for each bucket
  - Metadata Dashboard includes a panel displaying the number of instances collected.
  - Power Dashboard includes a new "Average Power Consumption (kWh) Over Last Hour" panel.
  - SVM Dashboard now features panels for logical space and physical space at the SVM level.
  - Volume Deep Dive dashboard includes "Other IOPs" panel.
- Performance Improvements:
  - Reduced memory footprint by optimizing memory allocations when serving metrics.
  - Reduced API calls when using the RestPerf collector.
- Harvest supports Top clients metrics collection. More details.
- Harvest supports recording and replaying HTTP requests.
- Harvest now provides a FIPS-compliant container image, available as a separate image (ghcr.io/netapp/harvest:24.08.0-1-fips).
- Grafana import allows rewriting the cluster label during import.
Announcements
read how to migrate your Prometheus volume
IMPORTANT After upgrade, don't forget to re-import your dashboards, so you get all the new enhancements and fixes. You can import them via the 'bin/harvest grafana import' CLI, from the Grafana UI, or from the 'Maintenance > Reset Harvest Dashboards' button in NAbox.
Thanks to all the awesome contributors
Thanks to all the people who've opened issues, asked questions on Discord, and contributed code or dashboards this release:
- @ofu48167
- @WayneShen2
- @T1r0l
- @Daniel-Vaz
- @razaahmed
- @gaow1423
- @BrendonA667
- @70tas
- @annapook-netapp
- @buller7929
- @florent4155
- @heinowalther
- @db-wally007
- @embusalacchi
This release includes 36 features, 24 bug fixes, 7 documentation, 7 performance, 1 testing, 3 styling, 5 refactoring, 9 miscellaneous, and 15 ci pull requests.
Features
- Tenant Dashboard Buckets Panel Should Include (#3101)
- Use Docker Buildx Secret For Token (#3108)
- Enable Pprof Endpoints On Localhost (#3110)
- Generate Fips Compliant Container Image For Harvest (#3113)
- Support Ifgroup Level Throughput Metrics (#3117)
- Harvest Should Include A Vscan Dashboard (#3121)
- Vscan Dashboard Should Include Topk (#3127)
- Top Clients Metrics Collection (#3132)
- Adding Panels For Ontaps3svm Object (#3134)
- Grafana Import Should Allow Rewriting Cluster Label (#3135)
- Replace Zerolog With Slog (#3146)
- Harvest Should Include Time-Series Panels For Tenants And Buckets (#3147)
- Send The Harvest Version To Ontap (#3152)
- Replace Zerolog With Slog (#3164)
- Add Documentation For Plugin-Generated Metrics And Enable Ci (#3169)
- Add Instances Collected Panel To Metadata Dashboard (#3178)
- Harvest Should Use Slogs Text Format By Default (#3179)
- Add "Average Power Consumption (Kwh) Over Last Hour" Panel To Power Dashboard (#3180)
- Replacing connector webhook with MS workflow (#3183)
- Handle Url Limit In Rest (#3186)
- Keyperf Collector Templates (#3194)
- Harvest Rest And Restperf Collectors Should Support Batching (#3195)
- Add Top Svm By Space In Svm Dashboard (#3200)
- All Harvest Dashboards Should Include Tags (#3202)
- Support Destination/Source Level View - Parity With Sm (#3204)
- Add Other Ops Panel In Volume Deep Dive Dashboard (#3209)
- Add Nfs Templates For Keyperf Collector (#3215)
- Adding Snapmirror Sources dashboard - 1 (#3216)
- Keyperf Collector Templates (#3219)
- Adding Ethernet Port Table From Netport Template (#3221)
- Fail Ci When There Are Errors In Prometheus Or Grafana (#3232)
- Log Cluster Name And Version With Poller Metadata (#3234)
- Harvest Should Support Recording And Replaying Http Requests (#3235)
- Add Emergency Events To Health Dashboard (#3238)
- Add Keyperf Metric Docs (#3240)
- Improve Harvest Memory Logging (#3244)
- Doctor should handle embedded exporters (#3258)
Bug Fixes
- Handled Non Exported Qtrees In Template (#3105)
- Handled Nameservices In Svm Zapi Plugin (#3124)
- Fix Disk Count In Disk Dashboard (#3126)
- Handled Quota Index Key In Rest Template With Tests (#3128)
- Vscan Panels Throws 422 Error (#3133)
- Correcting The Alert Rule Expression For Required Labels (#3143)
- Svm Dashboard - Volume Capacity Row Ordering (#3158)
- Fsa History Data Should Work When Multi Select (#3159)
- Do Not Log Stdout When A Credential Script Fails (#3163)
- Remove '*' As 'All' Option In Workload Dropdown On Workload Dashboard (#3165)
- Bin/Harvest Rest Should Read Credentials Before Fetching Data (#3166)
- Remove Embedded Shelf Power From Total Power In Series Panel To Match Stats Panel (#3167)
- Volume_aggr_labels Should Not Include Uuid Label (#3171)
- Add Embedded Shelf Type For Power Calculation (#3174)
- Using Instancename Instead Of Volname In Fabricpool Perf (#3175)
- Correct Failed State In Workflow (#3190)
- Handled Flexgroup Based On Volume Config Call (#3199)
- Filter By Svm, Volume In Sm Destination Dashboard (#3220)
- Remove _Labels From Metric Docs (#3222)
- Update Datacenter And Cluster Variables In Dashboards (#3227)
- Don't Double Export Aggregate Efficiency Metrics (#3230)
- Update Keyperf Collector Static Counter File Path (#3241)
- Fix Numbering In Quickstart (#3249)
- Fix Value Mapping In Tenant Dashboard (#3253)
- Rename volume latency in keyperf (#3261)
Documentation
24.08.0
24.08.0 / 2024-08-12 Release
Highlights of this major release include:
- Harvest dashboards now include links to other relevant dashboards. This makes it easier to navigate relationships between cluster objects.
- Several of the existing dashboards include new panels in this release:
  - The Security dashboard shows SSL certificate expiration dates and warns if certificates are expiring soon. Prometheus alerts are created for expired certificates and certificates that will expire within the next month. Thanks to @timstiller for the suggestion.
  - The Volume and Aggregate dashboards include new panels showing inactive data trends. Thanks to @razaahmed for the suggestion.
  - The Workload dashboard includes panels showing the QoS percentage utilization at the policy level for shared QoS policies. Thanks to Rusty Brown for the suggestion.
  - The Datacenter dashboard includes the number of Qtrees, Quotas, and Workloads in the Object Count panel.
  - The Aggregate dashboard now includes topk timeseries.
  - The Metadata dashboard now includes a stats panel showing the number of failed collectors. Thanks to @mamoep for the suggestion.
  - The Metadata dashboard Pollers table includes the resident set size of each poller process.
  - The StorageGRID Tenant dashboard now includes an "average size per object" column in the Tenant Quota panel. Thanks to @ofu48167 for the contribution.
- Quotas and Qtrees templates are separated into individual templates instead of being combined as in earlier versions of Harvest.
- The ChangeLog plugin monitors metric value changes in addition to label changes. Thanks to @pilot7777 for the suggestion.
- Harvest collects quotas even when there are no qtrees. Thanks to @qrm1982 for reporting.
- The StorageGRID collector supports single sign-on via a credential script auth token. Thanks to @santosh725 for suggesting.
- Harvest supports OAuth 2.0 ONTAP collectors via a credential script auth token.
- Harvest handles lun and namespace metrics with simple names.
- Harvest collects virtual_used and virtual_used_percent metrics from volumes via REST on ONTAP versions 9.14.1+.
- Prometheus metrics retention has been increased to one year in the Docker compose workflow.
- Harvest creates resolution metrics for health alerts. Thanks to @faguayot for suggesting.
- Pollers report their status as the poller_status in native and container environments.
- Grafana import allows you to specify a custom all value when importing. Thanks to ChrisGautcher for the suggestion.
- Harvest includes remediation steps for EMS active sync events in the EMS alert runbook. Thanks to @Nikhita-13 for the contribution.
- bin/harvest doctor reports when exporters are missing.
- Harvest allows exporting metrics without a prefix. This can be handy when collecting from a StorageGRID Prometheus instance. See the storagegrid_metrics.yaml template for an example. Thanks to @Bhagyasri-Dolly for suggesting.
- Documentation Additions:
  - Harvest includes a new "Getting Started" tutorial. Thanks to MichelePardini for the suggestion.
Announcements
qos_detail_service_time_latency metrics. The metrics can be re-enabled by setting with_service_latency: true in the WorkloadDetailVolume template file. See #3015 for details.
read how to migrate your Prometheus volume
IMPORTANT After upgrade, don't forget to re-import your dashboards, so you get all the new enhancements and fixes. You can import them via the 'bin/harvest grafana import' CLI, from the Grafana UI, or from the 'Maintenance > Reset Harvest Dashboards' button in NAbox3.
Thanks to all the awesome contributors
Thanks to all the people who've opened issues, asked questions on Discord, and contributed code or dashboards this release:
- @timstiller
- @razaahmed
- @mamoep
- @ofu48167
- @pilot7777
- @qrm1982
- @santosh725
- @faguayot
- @nikhita
- @bhagyasri
- @Falcon667
- RustyBrown
- ChrisGautcher
- MichelePardini
This release includes 40 features, 28 bug fixes, 13 documentation, 1 performance, 2 testing, 5 refactoring, 12 miscellaneous, and 11 ci pull requests.
Features
- Prometheus Should Retain Data For Up To One Year (#2919)
- Log Jitter During Best-Fit Template Loading (#2920)
- Add Failed Collectors Stats In Metadata Dashboard (#2929)
- Linking Dashboard Part-1 (#2931)
- Poller's Should Collect And Export Their Status And Memory (#2944)
- Include Rss In Poller Table Of Metadata Dashboard (#2948)
- Grafana Import Should Allow You To Specify A Custom All Value (#2953)
- Harvest Should Include Remediation Steps For Ems Active Sync Ev… (#2963)
- Linking Dashboards Part-2 (#2968)
- Support For Qos Percentage Utilization At Policy Level For Shared Qos Policies (#2972)
- Linking Dashboards Part-3 (#2976)
- Create Resolution Metrics For Health Alerts (#2977)
- Add Qtree,Quota,Workload Counts To Datacenter Dashboard (#2978)
- Harvest Should Track Poller Maxrss In Auto-Support (#2982)
- Add Topk To Aggregate Dashboard Timeseries Panels (#2987)
- Harvest Should Handle Lun And Namespace Metrics With Simple Names (#2998)
- Harvest Should Log Rss And Maxrss Every Hour (#2999)
- Implementing Certificate Expiry Detail In Security Dashboard (#3000)
- Remove Topk Vars From Storagegrid Dashboards (#3002)
- Add Inactive Data Metrics For Aggregate And Volume (#3003)
- Harvest Should Remove Service Center Metrics (#3019)
- Adding Quotas Detail In Asup (#3020)
- Harvest Should Allow Exporting Metrics Without A Prefix (#3022)
- Remove Service_time_latency Counter From Tests (#3027)
- Harvest Should Collect Virtual_used And Virtual_used_percent (#3031)
- Harvest Should Log Template Loading Errors (#3036)
- Enable Changelog Plugin To Monitor Metric Value Change (#3041)
- --Debug Cli Argument Should Enable Debug Logging (#3043)
- Harvest Should Support Storagegrid Credentials Script With Auth… (#3048)
- Harvest Doctor Should Report When Exporters Are Missing (#3049)
- Update Qtree Template Doc - Collect Quotas When No Qtrees (#3056)
- Handled User/Group Quota In Historicallabels (#3060)
- Support Oauth2.0 Via Credential Script - Phase1 (#3066)
- Harvest Should Not Simultaneously Publish Quota Metrics From Qt… (#3067)
- Split Qtree/Quota Rest Templates (#3068)
- Adding Generated Instances/Metrics Count In Health Plugin Log (#3074)
- Health Dashboard Should Indicate When There Are No Events (#3077)
- Keyperfmetrics Collector Infrastructure (#3078)
- Adding Ut For Qtree Non Exported Case (#3085)
- Tenant Dashboard Should Include An Average Size Per Object Co… (#3091)
Bug Fixes
- Zapi Rest Parity (#2934)
- Rest Templates Should Not Have Hyphon (#2943)
- Restore The Svm, Qtree, User, And Group Columns To The Quota Das… (#2950)
- Harvest Should Log Errors When Grafana Import Fails (#2962)
- Correct Details Folder Name While Import (#2966)
- Handling Min-Max In Gradient (#2969)
- Use Read/Write Data Due To Missing Historical Data In Dashboards (#2979)
- Fixing Non-Exported Flexgroup Instances Error ([#2980](https://...
v24.05.2
This release is identical to 24.05.0, with the addition of two fixes:
- A fix that makes the NFS Troubleshooting dashboards load in NAbox and via bin/harvest grafana import.
- A fix for a regression introduced in 24.05.1 that causes FlexGroup volume performance metrics to be skipped.
Upgrade Recommendation:
You should upgrade to 24.05.2 if any of the following apply to you:
- You want to use the NFS troubleshooting dashboards.
- You are on version 24.05.1 and your cluster includes FlexGroup volumes.
Full Changelog: v24.05.1...v24.05.2
24.05.1
This release is the same as 24.05.0 with a fix that makes the NFS Troubleshooting dashboards load in NAbox. If you are not using NAbox or you do not use the NFS Troubleshooting dashboards, you can ignore this release.
Full Changelog: v24.05.0...v24.05.1
24.05.0
24.05.0 / 2024-05-20 Release
Highlights of this major release include:
- Harvest supports consistency groups (CG) in the SnapMirror dashboard. Thanks to @Nikhita-13 for reporting this.
- We've fixed an intermittent latency/ops spike problem caused by Harvest incorrectly handling ONTAP partial aggregation. This impacted all perf objects. A big thank you to @summertony15 for reporting this critical issue.
- Harvest dashboards are compatible with Grafana 10.x.x versions.
- The LUN, Flexgroup, and cDot dashboards were updated to work with FSx. Some panels are blank because FSx does not have that data.
- The credentials script supports providing both username and password. Thanks to @kbhalaki for reporting.
- Harvest configuration file supports reading parameters from environment variables. Kudos to @wally007 for the suggestion.
- Harvest includes remediation steps for EMS alerts.
- New Dashboards:
  - NFS Troubleshooting which provides links to detailed dashboards. Thanks to RustyBrown for contributing these.
  - Detailed Dashboards: Volume by SVM and Volume Deep Dive.
- Performance Improvements:
  - Rest/RestPerf Collector only requests metrics defined in templates, reducing API time, payload size, and collection load.
  - TopK queries in dashboards are now faster. Thanks to AlessandroN for reporting.
- Several of the existing dashboards include new panels in this release:
  - Workload dashboard includes adaptive QoS used percentage tracking. Thanks to @faguayot for reporting.
  - Network dashboard includes ethernet errors. Thanks to Rusty Brown for contributing.
  - Node dashboard includes the BMC firmware version. Thanks to @summertony15 for reporting.
  - SVM dashboard now includes NFS4.2 panels. Thanks to Didlier for reporting.
  - The Volume dashboard includes several new panels:
- Harvest includes a new template to collect lock counts at the node, SVM, LIF, and volume levels. Thanks to @troysmullerna for reporting.
- Documentation Additions:
  - How to customize Prometheus's retention period in a Docker deployment. Thanks to @WayneShen2 for the suggestion.
  - How to use endpoints in a REST collector template. Thanks to Hubert for reporting.
  - Harvest includes remediation steps for EMS alerts.
  - How to use confpath to extend templates.
- Harvest supports embedded exporters in Harvest configuration. This means you can define your exporters in one place instead of multiple. Thanks to @wagneradrian92 for reporting.
- Harvest supports exporting to multiple InfluxDB instances. Thanks to @figeac888 for reporting.
- Node label metrics include HA partner details. Thanks to @johnwarlick for reporting.
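The REST collector improvement listed above (requesting only the metrics defined in a template) can be pictured with the fields query parameter of the ONTAP REST API. The sketch below is a hypothetical Python illustration, not Harvest's own Go implementation: the helper name build_request_url, the cluster address, and the counter list are all assumptions made for the example.

```python
from urllib.parse import urlencode, urlunsplit


def build_request_url(host: str, api_path: str, counters: list[str]) -> str:
    """Build a REST URL that asks only for the listed fields.

    Hypothetical helper: it mimics the idea of requesting just the
    counters named in a template instead of every available field.
    """
    # De-duplicate while keeping the template's order.
    fields = list(dict.fromkeys(counters))
    # Keep commas readable in the query string instead of percent-encoding them.
    query = urlencode({"fields": ",".join(fields)}, safe=",")
    return urlunsplit(("https", host, api_path, query, ""))


# Assumed template counters for a volume object (illustrative only).
template_counters = ["name", "svm.name", "space.used", "space.used", "state"]
url = build_request_url("cluster.example.com", "/api/storage/volumes", template_counters)
print(url)
```

A shorter field list means a smaller response payload, which is the effect the release note describes.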
Announcements
24.05 removes duplicate quota metrics. If you wish to enable them, refer here.
IMPORTANT After upgrading, don't forget to re-import your dashboards to get all the new enhancements and fixes. You can import them via the 'bin/harvest grafana import' CLI, from the Grafana UI, or from the 'Maintenance > Reset Harvest Dashboards' button in NAbox.
Known Issues
- Harvest does not calculate power metrics for AFF A250 systems. This data is not available from ONTAP via ZAPI or REST. See ONTAP bug 1511476 for more details.
- ONTAP does not include REST metrics for offbox_vscan_server and offbox_vscan until ONTAP 9.13.1. See ONTAP bug 1473892 for more details.
IMPORTANT 7-mode filers that are not on the latest release of ONTAP may experience TLS connection issues with errors like tls: server selected unsupported protocol version 301. This is caused by a change in Go 1.18. The default for TLS client connections was changed to TLS 1.2 in Go 1.18. Please upgrade your 7-mode filers (recommended) or set tls_min_version: tls10 in your harvest.yml poller section. See #1007 for more details.
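The 301 in the error message above most likely is the hexadecimal TLS wire version 0x0301, which corresponds to TLS 1.0. That reading assumes the Go TLS library prints the version in hex. A minimal Python sketch of the mapping follows; the helper name describe_tls_version is hypothetical, and the table lists only the standard SSL/TLS wire versions.

```python
# Standard TLS/SSL protocol version numbers as they appear on the wire.
WIRE_VERSIONS = {
    0x0300: "SSL 3.0",
    0x0301: "TLS 1.0",
    0x0302: "TLS 1.1",
    0x0303: "TLS 1.2",
    0x0304: "TLS 1.3",
}


def describe_tls_version(hex_text: str) -> str:
    """Translate a hex version string such as '301' into a protocol name."""
    code = int(hex_text, 16)
    return WIRE_VERSIONS.get(code, f"unknown (0x{code:04x})")


print(describe_tls_version("301"))  # TLS 1.0
```

Read this way, the filer is offering TLS 1.0 while the client insists on at least TLS 1.2, which is why lowering tls_min_version to tls10 lets the connection succeed.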
Thanks to all the awesome contributors
A big thanks to all the people who've opened issues, asked questions on Discord, and contributed code or dashboards this release:
@BrendonA667, @Nikhita-13, @WayneShen2, @derDaywalker, @faguayot, @figeac888, @jgasher, @johnwarlick, @kbhalaki, @rdecaneva, @s-kuchi, @summertony15, @troysmullerna, @wagneradrian92, @wally007, @ybizeul, AlessandroN, Didlier, Hubert, Rusty Brown, Tamas Zsolt
This release includes 42 features, 38 bug fixes, 10 documentation, 1 performance, 6 styling, 9 refactoring, 16 miscellaneous, and 17 ci pull requests.
Features
- Adding Zapi/Rest Templates For Lock-Get-Iter & Protocols/Locks (#2706)
- Dashboards Would Work With Grafana 10.X.x (#2713)
- Add Harvest.yml Environment Variable Expansion (#2714)
- Metadata Dashboard Should Include Poller Rss Panels And Time Se… (#2716)
- Harvest Should Export To Multiple Influxdb Exporters (#2722)
- Adding Ha_partner Info In Node (#2723)
- Improve Rest Collector (#2740)
- Harvest Should Track Network Bytes Received And Number Of Ontap… (#2745)
- Harvest Should Handle Ontap Counter Manager Rejection Errors (#2747)
- Harvest Network Dashboard Should Show Ethernet Errors (#2748)
- Changed Plugin Generated Metric Naming For Lock Object (#2750)
- Usage Of Predict_linear Function In Volume Dashboard (#2763)
- Improve Restperf Collector (#2765)
- Harvest Should Include Nfs Troubleshooting Dashboards (#2766)
- Adding Volume Growth Rate Panels In Volume Dashboard (#2768)
- Harvest Should Reduce Batch Size And Retry When Ontap Times Out (#2770)
- Ignore Performance Counters With Partial Aggregation (#2775)
- Harvest Should Reduce Batch Size And Retry When Ontap Times Out (#2776)
- Harvest Should Log When The Template Is Missing (#2779)
- Add Instance Log For Latency Calculation (#2794)
- Harvest Should Collect The Bmc Firmware Version (#2800)
- Add I/O Density Panels To Volume Dashboard (#2805)
- Reduce Dependencies (#2812)
- Use Constrained Topk To Improve Dashboard Performance (#2825)
- Supporting Consistency Group Drilldowns In Snapmirror Dashboard (#2830)
- Harvest Should Include Remediation Steps For Ems Alerts (#2836)
- Harvest Svm Dashboard Should Include Nfsv4.2 Panels (#2846)
- Adding Description To Svm Panels (#2861)
- Harvest Should Support Embedded Exporters (#2864)
- Adaptive Qos Used% Tracking (#2865)
- Credentials Script Should Support Both Username And Password (#2870)
- Adding Panel Descriptions In All Dashboards (#2878)
- Remove Hidden Topk Variables From Dashboards (#2881)
- Remove Duplicate Quota Metrics (#2886)
- Remove Hidden Topk Variables From Dashboards (#2889)
- Adding Description To Panels (#2891)
- Add Test Case For Join Queries In A Table (#2892)
- Adding Details Folder In Docker (#2896)
- Enable Request/Response Logging For Rest And Restperf Plugins (#2898)
- Flexgroup And Lun Dashboards Work With Fsx (#2899)
- Remove Hidden Topk From Aggregation Dashboard (#2900)
- Cdot Dashboards Work With Fsx (#2903)
Bug Fixes
- Handle Inter-Cluster Snapmirrors When Different Datacenter (#2688)
- Display Poller Status With Harvest_docker Env ([#2705](https://github.com/NetApp/harvest/pull...
24.02.0
24.02.0 / 2024-02-21 Release
Highlights of this major release include:
- New Datacenter dashboard which contains node health, capacity, performance, storage efficiency, issues, snapshot, power, and temperature details.
- Harvest includes SnapMirror active sync EMS events with alert rules. Thanks to @Nikhita-13 for reporting.
- Harvest monitors FlexCache performance metrics and includes a new FlexCache dashboard to visualize them. Thanks to @ewilts for raising.
- Harvest detects HA pair down and sensor failures. These are shown in the Health dashboard. Thanks to @johnwarlick for raising.
- Harvest monitors MetroCluster diagnostics and shows them in the MetroCluster dashboard. Thanks to @wagneradrian92 for reporting.
- We improved the performance of all dashboards that include topk queries. Thanks to @mamoep for reporting!
- We added filter support for the ZapiPerf collector. See filter for more detail. Thanks to @debbrata-netapp for reporting.
- A new bin/harvest grafana customize command writes the dashboards to the filesystem so other programs can manage them. Thanks to @nicolai-hornung-bl for reporting!
- We fixed an intermittent latency spike problem that impacted all perf objects. Thanks to @summertony15 and @rodenj1 for reporting this critical issue.
- Several of the existing dashboards include new panels in this release:
  - Node and Aggregate dashboards include volume stats panels. Thanks to @BrendonA667 for raising.
  - SVM dashboard includes volume capacity panels. Thanks to @BrendonA667 for raising.
  - SnapMirror dashboard includes automated_failover and automated_failover_duplex policies.
- More Harvest dashboard dropdown variables include the All option, making it easier to get an overview of your environment.
- All EMS alerts include an impact annotation. Thanks to @divya for raising.
- Harvest includes new templates to collect:
  - Network filesystem (NFS) rewinds performance metrics (rw_ctx). Thanks to @shawnahall71 for raising.
  - Network data management protocol (NDMP) session metrics. Thanks to @schumijo for raising.
- Documentation additions
  - Harvest describes why and how to configure Docker's logging drivers: Docker logging configuration. Thanks to @madaan for raising.
  - How to create templates that use ONTAP's private CLI: details
  - How to create custom Grafana dashboards: Steps
  - How to validate your harvest.yml file and share a redacted copy with the Harvest team. Details
  - Harvest describes high-level concepts here. Thanks to @norespers for raising.
- All constituents are disabled by default for workload detail performance templates.
- The bin/harvest zapi CLI now supports a timeout argument.
- Harvest performance collectors (ZapiPerf and RestPerf) ask ONTAP for performance counter metadata every 24 hours instead of every 20 minutes. Thanks to BrianMa for raising.
- The Harvest REST collector's api_time metric now includes the API time for all template endpoints. Thanks to ChristopherWilcox for raising.
Announcements
24.02 disables four templates that collected metrics not used in dashboards. These four templates are disabled by default: ObjectStoreClient, TokenManager, OntapS3SVM, and Vscan. This change was made to reduce the number of collected metrics. If you require these templates, you can enable them by uncommenting them in their corresponding default.yaml or by extending the existing object template.
IMPORTANT The minimum version of Prometheus required to run Harvest is now 2.33. Version 2.33 is required to take advantage of Prometheus's @ modifier. Please upgrade your Prometheus server to at least 2.33 before upgrading Harvest.
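Before upgrading Harvest, it can help to confirm that the Prometheus server meets the 2.33 minimum stated above. The following Python sketch compares version strings; the helper names parse_version and meets_harvest_minimum are hypothetical, and the version string would come from your own Prometheus server.

```python
# Minimum Prometheus version stated in the 24.02 announcement.
MIN_PROMETHEUS = (2, 33)


def parse_version(text: str) -> tuple[int, ...]:
    """Turn '2.33.0' or 'v2.45.1' into a tuple of ints for comparison."""
    # Drop a leading 'v' and any pre-release or build suffix.
    core = text.strip().lstrip("v").split("-")[0].split("+")[0]
    return tuple(int(part) for part in core.split("."))


def meets_harvest_minimum(version: str) -> bool:
    """True when major.minor is at least the required 2.33."""
    return parse_version(version)[:2] >= MIN_PROMETHEUS


for candidate in ("2.32.1", "2.33.0", "v2.45.0"):
    print(candidate, meets_harvest_minimum(candidate))
```

Comparing tuples of integers avoids the pitfall of string comparison, where "2.9" would sort after "2.33".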
IMPORTANT After upgrade, don't forget to re-import your dashboards, so you get all the new enhancements and fixes. You can import them via the 'bin/harvest grafana import' CLI, from the Grafana UI, or from the 'Maintenance > Reset Harvest Dashboards' button in NAbox.
Known Issues
- Harvest does not calculate power metrics for AFF A250 systems. This data is not available from ONTAP via ZAPI or REST. See ONTAP bug 1511476 for more details.
- ONTAP does not include REST metrics for offbox_vscan_server and offbox_vscan until ONTAP 9.13.1. See ONTAP bug 1473892 for more details.
IMPORTANT 7-mode filers that are not on the latest release of ONTAP may experience TLS connection issues with errors like tls: server selected unsupported protocol version 301. This is caused by a change in Go 1.18. The default for TLS client connections was changed to TLS 1.2 in Go 1.18. Please upgrade your 7-mode filers (recommended) or set tls_min_version: tls10 in your harvest.yml poller section. See #1007 for more details.
Thanks to all the awesome contributors
Thanks to all the people who've opened issues, asked questions on Discord, and contributed code or dashboards this release:
@shawnahall71, @pilot7777, @ben, @madaan, @johnwarlick, @jfong5040, @santosh725, @summertony15, @jmg011, @cheese1, @mamoep, @Falcon667, @dess, @debbrata-netapp, @ewilts,
@Nikhita-13, @norespers, @nicolai-hornung-bl, @BrendonA667, @schumijo, @divya, @joshuacook-tamu, @wagneradrian92, @george-strother
This release includes 26 features, 24 bug fixes, 20 documentation, 3 styling, 5 refactoring, 11 miscellaneous, and 12 ci pull requests.
Features
- Include Start Time, Exported Metrics, And Poll Duration In Collector logs (#2493)
- Adding Rw_ctx Zapiperf Object Template (#2494)
- Change Pollcounter Schedule To 24H (#2499)
- Add Ha Down And Sensor Issues In Health Dashboard (#2519)
- Adding Ndmp Session Rest Template (#2531)
- Use Modifier For Topk To Improve Svm Dashboard Performance (#2553)
- Add Timeout For Zapi Cli (#2566)
- Restperf Disk Plugin Should Support Metric Customization (#2573)
- Add Filter Support For Zapiperf Collector (#2575)
- FlexCache Monitoring (#2583)
- Supporting Automated_failover, Automated_failover_duplex Policy In Sm (#2584)
- Disabled The Templates Whose All Metrics Are Not Consumed In Dashboards (#2587)
- Harvest Should Include Snapmirror Active Sync Ems Events (#2588)
- Use Modifier For Topk To Improve Dashboard Performance (#2590)
- Harvest Should Include A Snapmirror Active Sync Template (#2596)
- Disable Constituents By Default For Workload Detail Performance Templates (#2598)
- Adding Template For Metrocluster Diagnostics Check (#2601)
- Adding Per Volume Panels In Svm Dashboard (#2602)
- Add Grafana Customize Command (#2619)
- Add Volume Stats To Node And Aggregate Dashboard (#2627)
- Ems Alerts Should Include An Impact Annotation (#2631)
- Improving Debug Log Clarity And Reducing Noise (#2637)
- Datacenter Dashboard (#2650)
- Harvest Dashboards Should Include An All Option (#2661)
- Percent Unit Panels Should Use Decimal Points (#2663)
- Change Stat Panel For Uptime,Power Status,Fan Status To Table In Node Dashboard (#2668)
Bug Fixes
- Handled Missing Uuid In Volume For Change_log (#2478)
- Remove Docs From Deb Binary (#2489)
- Parsed Logger Changes (#2490)
- Array Metrics Should Have Correct Base Label In Zapiperf (#2496)
- Harvest Should Collect Luns In Qtress (#2502)
- Grafana Export Should Set Correct Permissions (#2505)
- Begin Log For Pollcounter And Pollinstance Should Be In Ms ([#2509](https://githu...
23.11.0
23.11.0 / 2023-11-13 Release
π Highlights of this major release include:
- New FlexGroup dashboard that includes FlexGroup constituents. Thanks to @Sandromuc and @ewilts for raising.
- Harvest ChangeLog plugin to detect and monitor changes related to object creation, modification, and deletion.
- We improved how Harvest calculates power. As a result, you may notice a decrease in the reported power metrics compared to previous versions. Details here. Thanks to Evan Lee for reporting!
- Added `conf_path` variable for specifying the search path of Harvest templates.
- Streamlined the Harvest container installation process by eliminating the need to download a tar file. Running Harvest in a container is now simpler and more convenient.
- Several of the existing dashboards include new panels in this release:
  - Aggregate and Volume dashboard includes performance and capacity tier data. Thanks to @ewilts for raising.
  - Workload dashboard includes QoS fixed Utilization % panels. Thanks to @faguayot for raising.
  - Disk Dashboard features performance panels at the disk raid-group level. Thanks to @kinderr95 for raising.
- Harvest includes new templates to collect:
- Documentation additions
  - Enhanced Quickstart guide for Harvest
  - NABox logs collection guide
  - Document poller `ca_cert` property. Thanks to Marvin Montanus for reporting!
  - Describe how Harvest calculates power. Thanks to Evan Lee for reporting!
  - Details about hidden_fields and filter for the Rest Collector. Thanks to Johnathan Warlick for raising!
- Enhanced the Volume dashboard to include clone information.
- Optimized the Harvest binaries, significantly reducing their size.
- The Metadata dashboard works inside container deployments.
- The FabricPool panels in the Volume dashboard now support FlexGroup volumes. Thanks to @sriniji for reporting.
- Large `harvest.yml` files can be refactored into smaller ones. Thanks to @llelik and @Pengng88 for raising.
- Added help text about metrics to more Harvest dashboard panels.
Announcements
23.11 disables the `CIFSSession` templates by default. This change was made to prevent the generation of a large number of metrics. If you require these templates, you can enable them. Please be aware that enabling them may result in a significant increase in metric collection time, Harvest memory footprint, and Prometheus used disk space. These metrics are utilized in the SMB2 dashboard.
23.11 has updated its power metric calculation algorithm. As a result, you may notice a decrease in the reported power metrics compared to previous versions. To collect these metrics, REST API permissions are required. For detailed information on the power algorithm, please refer to the power algorithm documentation.
IMPORTANT After upgrade, don't forget to re-import your dashboards, so you get all the new enhancements and fixes. You can import them via the `bin/harvest grafana import` CLI, from the Grafana UI, or from the `Maintenance > Reset Harvest Dashboards` button in NAbox.
Known Issues
- Some AFF A250 systems do not report power metrics. See ONTAP bug 1511476 for more details.
- ONTAP does not include REST metrics for `offbox_vscan_server` and `offbox_vscan` until ONTAP 9.13.1. See ONTAP bug 1473892 for more details.
IMPORTANT 7-mode filers that are not on the latest release of ONTAP may experience TLS connection issues with errors like `tls: server selected unsupported protocol version 301`. This is caused by a change in Go 1.18, which raised the default minimum version for TLS client connections to TLS 1.2. Please upgrade your 7-mode filers (recommended) or set `tls_min_version: tls10` in your `harvest.yml` poller section. See #1007 for more details.
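The workaround described above could look like the following `harvest.yml` fragment. This is a minimal sketch, not a complete configuration: the poller name `filer-7mode` and the address are hypothetical placeholders, and only the `tls_min_version: tls10` setting comes from the note above.

```yaml
Pollers:
  filer-7mode:               # hypothetical poller name
    addr: 10.0.0.10          # hypothetical address of the 7-mode filer
    tls_min_version: tls10   # allow TLS 1.0 for this poller
```

Because the setting sits under one poller, it should only relax the TLS minimum for that filer, which is why upgrading the filer remains the recommended fix.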
Thanks to all the awesome contributors
Thanks to all the people who've opened issues, asked questions on Discord, and contributed code or dashboards this release:
@Garydep, @MrObvious, @Pengng88, @Sandromuc, @ewilts, @faguayot, @jmg011, @kinderr95, @llelik, @mamoep, @rodenj1, @s-kuchi, @shawnahall71, @slater0013, @sriniji, @statdigger, @wyahn1219, AlessandroN, Dave, Diane, Evan Lee, Francesco, Heaven7, Johnathan Warlick, Madaan, Martijn Moret, Marvin Montanus, NicoSeiberth, RBrown, TonyHsieh, Watson9121, dbakerletn, imthenightbird, roller, twodot0h, tymercer
This release includes 38 features, 26 bug fixes, 24 documentation, 5 performance, 2 refactoring, 12 miscellaneous, and 7 ci pull requests.
Features
- Change Log Detection In Harvest (#2178)
- Remove Daemon Dependency (#2195)
- Enable More Golanglint Linters (#2313)
- Gcc Is Not Required To Build Harvest (#2322)
- Ontap Permission Errors Should Be Logged As Errors (#2326)
- Harvest Should Load Templates From A Set Of Conf Directories (#2329)
- Ontap Power Calculation For Embedded Shelf (#2333)
- Enable More Golanglint Linters (#2334)
- Harvest Auto-Support Should Include Instance Count In Collector Section (#2337)
- Set Allvalue To Null When Svm Regex Is Applied (#2340)
- Add Parity For String Types Between Restperf And Zapiperf (#2342)
- Tiering Data Changes For Volume - Template Change (#2343)
- Docker Workflow Doesn't Need Tar Download (#2354)
- Enable Ports By Default In Docker Generate (#2360)
- Support Comma Separated Aggrs In Perf Metrics (#2376)
- Harvest Should Support Multiple Poller Files To Allow Refactori… (#2388)
- Adding Iwarp Restperf Template (#2390)
- Adding New Panels In Disk Dashboard (#2391)
- Harvest Should Load Templates From A Set Of Conf Directories (#2394)
- Add Api To Rest Error Log (#2401)
- Add Clone Info To Volume Dashboard (#2402)
- Cifs Share Templates (#2405)
- Support Flexgroup Constituents In Template (#2410)
- Add Flexgroup To Fabricpool Panels (#2419)
- Smb2 Restperf Counters (#2420)
- Adding Fc Rest Template For Fibre Channel Switch (#2424)
- Metric Doc Needs To Handle Templates With Same Object Names (#2426)
- Antiransomwarestate Label Should Be Exported (#2432)
- Metadata Dashboard Should Work With Containers And Remove System Resources Panel (#2433)
- Adding Restperf Object_store_server Template (#2435)
- Update Ci To Use Docker Run And Update Permissions (#2436)
- Enable More Golanglint Linters (#2439)
- Qos Fixed Utilization % Panels (#2445)
- Description Fetched From Ontap Docs Via Cli (#2454)
- Disable Cifssession Template (#2455)
- Add Labels Defined In Harvest Config To Metadata Metrics (#2456)
- Add Link_up Counter For Fcp (#2464)
- Implementing Support For Randomized Start Times In Tasks (#2465)
Bug Fixes
23.08.0
23.08.0 / 2023-08-21 Release
Highlights of this major release include:
- Harvest Security dashboard highlights compliance using NetApp's Security hardening guide for ONTAP
- Harvest's credential script supports ONTAP daily credential rotation. Thanks to @mamoep for raising.
- Harvest makes it easy to run with both the ZAPI and REST collectors at the same time. Overlapping resources are deduplicated and only exported once. Harvest will automatically upgrade ZAPI conversations to REST when ZAPIs are suspended or disabled.
- Updated workload dashboard now includes Service Center, Latency Breakdown, and 50 panels
- Cluster dashboard updated to work with FSx. Some panels are blank because FSx does not have that data.
- The Harvest team published a couple of screencasts about:
- Several of the existing dashboards include new panels in this release:
  - Aggregate dashboard includes busy volume panels
  - SVM dashboard includes per NFS latency heatmaps. Thanks to @rbrownATnetapp for raising.
  - Volume dashboard includes top resources by other IOPs panel and junction paths. Thanks to @tsohst for raising.
- All Harvest dashboard tables include column filters
- Harvest dashboards use color to highlight latency and busy threshold breaches
- Harvest's Prometheus exporter supports TLS
- Harvest includes new templates to collect:
  - Iwarp metrics
  - FCVI metrics
  - Per volume NFS metrics
  - Volume clone metrics
  - QoS workload policy metrics
  - NVME/TCP and NVME/RoCE metrics
  - Flashpool metrics are included in RestPerf. Thanks to @lobster1860 for raising
- Documentation additions
  - Move more documentation from GitHub to Harvest documentation site
  - Clarify how to tell Harvest to continue using the ZAPI protocol
  - Clarify generic vs custom plugins. Thanks to GregS for raising
  - Clarify which version of Go is required to build Harvest. Thanks to MikeK for raising
  - Clarify how to prepare ONTAP cDOT clusters for Harvest data collection
  - EMS documentation should point to Harvest documentation site. Thanks to @cwaltham for raising
  - Clarify how to gather log files on all platforms
  - Explain how to use the `--labels` option of `bin/harvest grafana`. Thanks to @slater0013 for raising
  - Describe how to run docker compose generate command without required Harvest binaries
- The Harvest `doctor` command validates collector names listed in your `harvest.yml` file
- An earlier version of Harvest collected cloud store information via REST. This release adds the same for ZAPI
- When ONTAP resources are missing, Harvest tries to collect them every hour. Earlier versions of Harvest waited 24 hours before retrying, which often caused metrics to be missing after a cluster upgrade. Thanks to @Falcon667 for raising
- Earlier versions of Harvest created world writable auto-support files. These files are now only read/writeable by the current user. Thanks to Bunnygirl for raising
- `bin/harvest import` should work with Grafana 10. Thanks to @wooyoungAhn for raising
Announcements
23.08 fixes a REST collector bug that caused partial data collection when ONTAP paginated results. See #2109 for details.
23.08 disables the `NetConnections` and `NFSClients` templates by default. You can enable them if needed. These templates were disabled because several customers reported that these templates created millions of metrics. None of these metrics are used in Harvest dashboards.
23.08 changes how Harvest monitors workloads. For detailed information, please refer to the discussion #2265.
The Compliance dashboard was removed after its panels were moved to the Security dashboard.
The ambient temperature metric may show an increase due to issue #2259
read how to migrate your Prometheus volume
IMPORTANT After upgrade, don't forget to re-import your dashboards, so you get all the new enhancements and fixes. You can import them via the `bin/harvest grafana import` CLI, from the Grafana UI, or from the `Maintenance > Reset Harvest Dashboards` button in NAbox.
Known Issues
- Some AFF A250 systems do not report power metrics. See ONTAP bug 1511476 for more details.
- ONTAP does not include REST metrics for `offbox_vscan_server` and `offbox_vscan` until ONTAP 9.13.1. See ONTAP bug 1473892 for more details.
IMPORTANT 7-mode filers that are not on the latest release of ONTAP may experience TLS connection issues with errors like `tls: server selected unsupported protocol version 301`. This is caused by a change in Go 1.18, which raised the default minimum version for TLS client connections to TLS 1.2. Please upgrade your 7-mode filers (recommended) or set `tls_min_version: tls10` in your `harvest.yml` poller section. See #1007 for more details.
Thanks to all the awesome contributors
Thanks to all the people who've opened issues, asked questions on Discord, and contributed code or dashboards this release:
@7840vz, @DAx-cGn, @Falcon667, @Hedius, @LukaszWasko, @MrObvious, @ReneMeier, @Sawall10, @T1r0l, @XDavidT, @amd-eulee, @aticatac, @chadpruden, @cwaltham, @cygio, @ddhti, @debert-ntap, @demalik, @electrocreative, @elsgaard, @ev1963, @faguayot, @iStep2Step, @jgasher, @jmg011, @lobster1860, @mamoep, @matejzero, @matthieu-sudo, @merdos, @pilot7777, @rbrownATnetapp, @rodenj1, @slater0013, @swordfish291, @tsohst, @wooyoungAhn, Alessandro.Nuzzo, Ed Wilts, GregS, Imthenightbird, KlausHub, MeghanaD, MikeK, Paul P2, Rusty Brown, Shubham Mer, Tudor Pascu, Watson9121, jf38800, jfong, lorenzoc, rcl23, roller, scrhobbs, troysmuller, twodot0h
This release includes 42 features, 40 bug fixes, 20 documentation, 2 performance, 4 testing, 1 styling, 9 refactoring, 20 miscellaneous, and 12 ci pull requests.
Features
- Harvest Should Collect Iwarp Counters (#2071)
- Update Visitpanels To Be Recursive (#2085)
- Add Table Column Filter For Dashboards (#2088)
- Update Lagtime Based On Lasttransfersize (#2091)
- Harvest Should Add Grafana Import Rewrite Svm Filtering For Multi-Tenant Support (#2092)
- Fetch Cloud_store Info In Zapi Via Plugin (#2094)
- Collection Of Other Counters For Fcvi Perf Object (#2096)
- Add Nfs Io Types At The Volume Level (#2098)
- Add System Defined Workload Collection (#2099)
- Add Workload Panels In Workload Dashboard (#2100)
- Add Volume Clone Info In Rest (#2102)
- Added Volume Panels In Aggr Dashboard (#2104)
- Workload Policy Iops Metrics (#2111)
- Autoresolve Ems Would Export Metric Value As 0 And Autoresolve=True Label (#2120)
- Support Type Label For Volume For Backward Compatibility (#2132)
- Volume Clone Info For Zapi (#2140)
- Harvest Should Include Numpollers And Rss In Autosupport (#2143)
- Colors In Grafana Dashboards To Highlight Warning, Critical Severity (#2147)
- Security Hardening Guide (#2150)
- Harvest Prometheus Exporter Should Support Tls (#2153)
- Latency Units Should Be In Microseconds In Harvest Dashboard (#2156)
- Simplify Rest Auto-Upgrade (#2167)
- When Using A Credential Script, Re-Auth On 401S (#2180)
- Upgrade Zapi Conversations To Rest When Zapis Are Suspended Or … (#2200)
- When Using A Credential Script, Re-Auth On 401S (#2203)
- Merge Compliance And Security Dashboard + Added Arw Fields (#2207)
- Supporting Topk In S3 Dashboard (#2208)
- Aff250 Power Calculation (#2211)
- Use Single `Go Build` Command To Build Harvest And Poller Binaries ([#2221](https://github.com/NetApp/harvest...