This repository has been archived by the owner on Sep 30, 2024. It is now read-only.

Old clusters don't error out on web interface (/web/cluster/alias/<cluster>) #1245

Closed
tomkrouper opened this issue Sep 28, 2020 · 2 comments · Fixed by #1246

Comments

@tomkrouper
Collaborator

If I go to the webui of the cluster foo that never existed, I'll get the error message.

{"Code":"ERROR","Message":"No cluster found for alias foo","Details":null}

If I go to the webui of the cluster bar that existed in the past but no longer has any hosts, I'll get an empty page.

The logs do show:

[martini] Started GET /web/cluster/alias/bar for <ipaddress>
[martini] Completed 200 OK in 236.58µs

Then, a couple of lines later, I see the following error. The <fqdn> matches the information in the cluster_alias table.

2020-09-28 11:50:32 ERROR Unable to determine cluster name. clusterHint=<fqdn>:3306

I was able to test this on several "old" clusters, by finding them via:

select * from cluster_alias where last_registered < DATE(DATE_SUB(NOW(), INTERVAL 1 DAY)) order by 3

My guess is that either the forget should also remove the cluster_alias information or the webui should behave differently; I'm not sure which.
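To make the second option concrete, here is a minimal Go sketch of a lookup that treats an alias with no remaining hosts as an error instead of rendering an empty page. It is not orchestrator's actual code: `resolveAlias`, `errorBody`, the two maps, and the "No instances found" message are all hypothetical names chosen for illustration. Only the JSON shape and the "No cluster found for alias" message come from the output quoted above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// apiResponse mirrors the JSON shape quoted in this issue.
type apiResponse struct {
	Code    string      `json:"Code"`
	Message string      `json:"Message"`
	Details interface{} `json:"Details"`
}

// resolveAlias is a hypothetical lookup. aliases maps alias -> cluster name
// (standing in for the cluster_alias table); instanceCounts maps cluster
// name -> number of currently known hosts.
func resolveAlias(alias string, aliases map[string]string, instanceCounts map[string]int) (string, error) {
	clusterName, ok := aliases[alias]
	if !ok {
		// Same message the issue reports for an alias that never existed.
		return "", fmt.Errorf("No cluster found for alias %s", alias)
	}
	if instanceCounts[clusterName] == 0 {
		// Assumed behavior: a stale alias is reported rather than shown as an empty page.
		return "", fmt.Errorf("No instances found for alias %s", alias)
	}
	return clusterName, nil
}

// errorBody renders an error in the JSON shape above.
func errorBody(err error) string {
	b, _ := json.Marshal(apiResponse{Code: "ERROR", Message: err.Error()})
	return string(b)
}
```

With this shape, both foo (never existed) and bar (alias present, no hosts) produce an explicit error body, so the two cases are at least both visible to the user.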

@shlomi-noach
Collaborator

Agreed. cluster_alias is only ever populated from aggregated database_instance rows. But lack of rows from database_instance doesn't erase data from cluster_alias. We should expire cluster_alias based on last_registered and based on UnseenInstanceForgetHours.

@shlomi-noach
Collaborator

#1246 fixes this issue
