Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix discrepancies between YAML property names and CLI argument names #901

Open
replay opened this issue Jan 26, 2022 · 11 comments
Open

Fix discrepancies between YAML property names and CLI argument names #901

replay opened this issue Jan 26, 2022 · 11 comments

Comments

@replay
Copy link
Contributor

replay commented Jan 26, 2022

Context

Describe the bug
There are cases where cli flags are very similar but slightly different to the properties in the yaml config.
This can confuse users, so we should fix these discrepancies.
Before the Mimir launch it is easier to make such breaking changes than it will be after the launch, so we should do that now.

One example:

# cli
-blocks-storage.bucket-store.ignore-deletion-marks-delay

# yaml
blocks_storage:
  bucket_store:
    ignore_deletion_mark_delay # mark vs marks

We should come up with a way to automatically detect these discrepancies and then fix them.
Then we should add that tool which detects the discrepancies to the CI to prevent that more get introduced.

@pracucci
Copy link
Collaborator

I did a quick hack into the doc-generator to find our discrepancies in the last part of the CLI flags vs YAML field name. I've used this code:

func analyzeConfig(blocks []*configBlock) {
	for _, block := range blocks {
		for _, entry := range block.entries {
			if entry.kind != "field" {
				continue
			}

			flagParts := strings.Split(entry.fieldFlag, ".")
			flagLastPart := flagParts[len(flagParts)-1]
			flagLastPart = strings.ReplaceAll(flagLastPart, "-", "_")

			// Limits YAML are fine if they don't perfectly match the last part, but they should match the while CLI flag for clarity.
			fullFlag := strings.ReplaceAll(strings.ReplaceAll(entry.fieldFlag, "-", "_"), ".", "_")

			if (entry.fieldFlag != "") && (flagLastPart != entry.name) && (fullFlag != entry.name) {
				fmt.Println(fmt.Sprintf("- `-%s` (YAML: `%s`)", entry.fieldFlag, entry.name))
			}
		}
	}
}

And I've run it like this:

go run ./tools/doc-generator ./blabla | sort | uniq

The following is a list of discrepancies found:

  • -<prefix>.consul.client-timeout (YAML: http_client_timeout)
  • -<prefix>.consul.hostname (YAML: host)
  • -alertmanager.configs.fallback (YAML: fallback_config_file)
  • -alertmanager.storage.path (YAML: data_dir)
  • -consul.client-timeout (YAML: http_client_timeout)
  • -consul.hostname (YAML: host)
  • -distributor.drop-label (YAML: drop_labels)
  • -distributor.extra-query-delay (YAML: extra_queue_delay)
  • -distributor.ha-tracker.cluster (YAML: ha_cluster_label)
  • -distributor.ha-tracker.enable-for-all-users (YAML: accept_ha_samples)
  • -distributor.ha-tracker.max-clusters (YAML: ha_max_clusters)
  • -distributor.ha-tracker.replica (YAML: ha_replica_label)
  • -distributor.ingestion-rate-limit (YAML: ingestion_rate)
  • -frontend.align-querier-with-step (YAML: align_queries_with_step)
  • -frontend.instance-addr (YAML: address)
  • -frontend.instance-port (YAML: port)
  • -frontend.max-retries-per-request (YAML: max_retries)
  • -memberlist.abort-if-join-fails (YAML: abort_if_cluster_join_fails)
  • -memberlist.join (YAML: join_members)
  • -memberlist.nodename (YAML: node_name)
  • -memberlist.pullpush-interval (YAML: pull_push_interval)
  • -querier.dns-lookup-period (YAML: dns_lookup_duration)
  • -querier.max-outstanding-requests-per-tenant (YAML: max_outstanding_per_tenant)
  • -ruler.alertmanager-discovery (YAML: enable_alertmanager_discovery)
  • -ruler.alertmanager-use-v2 (YAML: enable_alertmanager_v2)
  • -ruler.external.url (YAML: external_url)
  • -server.grpc-conn-limit (YAML: grpc_listen_conn_limit)
  • -server.grpc-max-concurrent-streams (YAML: grpc_server_max_concurrent_streams)
  • -server.grpc-max-recv-msg-size-bytes (YAML: grpc_server_max_recv_msg_size)
  • -server.grpc-max-send-msg-size-bytes (YAML: grpc_server_max_send_msg_size)
  • -server.grpc.keepalive.max-connection-age-grace (YAML: grpc_server_max_connection_age_grace)
  • -server.grpc.keepalive.max-connection-age (YAML: grpc_server_max_connection_age)
  • -server.grpc.keepalive.max-connection-idle (YAML: grpc_server_max_connection_idle)
  • -server.grpc.keepalive.min-time-between-pings (YAML: grpc_server_min_time_between_pings)
  • -server.grpc.keepalive.ping-without-stream-allowed (YAML: grpc_server_ping_without_stream_allowed)
  • -server.grpc.keepalive.time (YAML: grpc_server_keepalive_time)
  • -server.grpc.keepalive.timeout (YAML: grpc_server_keepalive_timeout)
  • -server.http-conn-limit (YAML: http_listen_conn_limit)
  • -server.http-idle-timeout (YAML: http_server_idle_timeout)
  • -server.http-read-timeout (YAML: http_server_read_timeout)
  • -server.http-write-timeout (YAML: http_server_write_timeout)
  • -server.path-prefix (YAML: http_path_prefix)
  • -validation.create-grace-period (YAML: creation_grace_period)
  • -validation.max-length-label-name (YAML: max_label_name_length)
  • -validation.max-length-label-value (YAML: max_label_value_length)

@replay
Copy link
Contributor Author

replay commented Jan 31, 2022

Thanks Marco, that's very useful.

If you only compared the last node of the names and you already found so many discrepancies, I suspect that if we'd compare the full paths there would be way more discrepancies.

In order to fix this I think we should decided whether we want to:

  1. update all YAML properties to match the CLI flags
  2. update all the CLI flags to match the YAML properties
  3. decide on a case-by-case basis which makes more sense

If we'd have to consider the upgrade path from previous versions to the next version of Mimir, then 1) or 2) would be better because each would only affect a subset of the users. But since we don't need to worry about the upgrade path I think 3) would be the best in this case.

@pracucci
Copy link
Collaborator

pracucci commented Feb 1, 2022

I suspect that if we'd compare the full paths there would be way more discrepancies

I confirm there are more if we compare the full path.

@pracucci
Copy link
Collaborator

pracucci commented Feb 1, 2022

decide on a case-by-case basis which makes more sense

If we'll address this and we aim for cleanup + clarify, I would decide on a case-by-case basis.

@pracucci
Copy link
Collaborator

pracucci commented Feb 1, 2022

In order to fix

How valuable is fixing this? I think it's a nice to have but I've the feeling the additional burden on the upgrade math may not be worth it. I'm just dubious, not opposed to it.

@replay
Copy link
Contributor Author

replay commented Feb 1, 2022

How valuable is fixing this?

I think in a perfect world a user could look at a cli flag, for example when running mimir -h, and then directly know how to specify the equivalent setting in the yaml.

Whether this is worth it compared to the pain which it introduces when upgrading from an older GEM version to a new one is hard to judge for me... Assuming that in the future we will have many new GEM and Mimir users which will benefit from the easier configurability I'd say its worth it, as long as this isn't fixed the life of existing users is a bit easier because their upgrade path is easier while the life of new users is a bit harder because the configuration is less intuitive, assuming that in the future we will reach a point where the majority of users have started using GEM/Mimir at a time which is later than now it makes sense to do this now because by then the larger number of users will have benefited.

What do you think @09jvilla

@09jvilla
Copy link
Contributor

09jvilla commented Feb 1, 2022

I'm in favor of doing it for the reasons @replay outlined. I'm ok with some one time pain to people migrating over to serve the future happiness/ease-of-use of all the new folks coming on, especially as that happiness will be paid forward in perpetuity.

That being said, we're getting down to crunch time on what we can realistically get done before launch, and I would put this after all the docs work we need to help out on, as well as all the other existing default updates and code changes we are working on. So my vote is that we add it to the 'would be great to have it before launch', but if we don't get to it with the existing folks we have on this work, we can live without it. If we do have some cycles to get to it, can we maybe just focus on the flags from the list above that we would consider 'basic' and not 'advanced'? Presumably those are the flags that more people would interact with, and therefore its higher value to align them.

As an aside:
I know that we love to mix and match yaml and cli flags -- is that a good idea for a more 'basic' user? Should we try to push users to explicitly use just the yaml configuration? if the user is mostly just using one or the other, then ideally they won't be as frustrated that the two naming conventions do not match.

@pracucci
Copy link
Collaborator

pracucci commented Feb 2, 2022

Should we try to push users to explicitly use just the yaml configuration?

Not your question, but just to clarify: I would keep support for both (not remove CLI flags from code). In my experience, CLI flags are more flexible than YAML config (you can easily specify different config to different services when deploying in microservices mode) but they're also targetting an advances usage.

For a basic usage, I agree YAML config is better. It's also easier to specify the whole config in a single place (a single file) and just share it with others (eg. customers, colleagues, ...).

@osg-grafana
Copy link
Contributor

cc @chri2547 and @KMiller-Grafana

@09jvilla
Copy link
Contributor

09jvilla commented Mar 2, 2022

@osg-grafana -- Just to be clear, this ticket is out of scope for the Mimir launch

@osg-grafana osg-grafana added the type/docs Improvements or additions to documentation label Mar 3, 2022
@osg-grafana osg-grafana changed the title Fix discrepancies between yaml property names and cli arg names Out-of-scope-for-launch: Fix discrepancies between yaml property names and cli arg names Mar 3, 2022
@osg-grafana osg-grafana changed the title Out-of-scope-for-launch: Fix discrepancies between yaml property names and cli arg names Out-of-scope-for-launch: Fix discrepancies between YAML property names and CLI argument names Mar 3, 2022
@pracucci pracucci removed the type/docs Improvements or additions to documentation label Mar 8, 2022
@pracucci pracucci changed the title Out-of-scope-for-launch: Fix discrepancies between YAML property names and CLI argument names Fix discrepancies between YAML property names and CLI argument names Mar 8, 2022
@pracucci
Copy link
Collaborator

pracucci commented Mar 8, 2022

Removed the docs label cause it's not about docs, but code.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants