[Logs UI] Shorten the logs ML job ID prefixes #47477
Comments
Pinging @elastic/infra-logs-ui (Team:infra-logs-ui)
This will no longer be a big deal when we can drop space ID from the Job ID, right? cc @weltenwort
True: if there is a migration in place to move already existing jobs into the respective spaces, we should remove the id as part of that migration.
@elastic/machine-learning What's the ETA on ML space awareness?
ML job space awareness is planned for 7.11
Meta ticket for ref #64172
Refinement update: we no longer need to put the space ID in the index name, but we need to make sure we understand how to query with backwards compatibility, if we remove that.
After digging into the code for this I have some questions.
Gentle re-ping, I suspect my edit didn't fire the notification, @elastic/machine-learning :)
This is a question for the logs team. ML jobs can be shared between spaces, and I don't know what source_id is. It's a good time to consider whether Logs UI should stick with only ever being able to link to one job (with a hard-coded fixed ID). It is reasonable to think that customers would want the flexibility to see results from multiple jobs. Alternatively, and imho probably more flexibly, if sticking to one job, Logs UI could allow the user to override the job_id in advanced settings. You might want to add a version number to the job_id. See the next answer.
The job_id cannot be changed. You cannot rename jobs; you would need to clone them and start the new jobs. Other solutions have done post-upgrade checks and offered users the option to upgrade their ML jobs via app banners (where the upgrade creates a new job, stops the old one, and points to the new one).
The "group" field mainly allows filtering in the ML UI: this allows users to view results from multiple jobs together and to manage multiple jobs together, e.g. bulk stop. As a general guideline, define fewer groups; otherwise the filtering produces a lot of groups of one, which isn't the best user experience as job_id is already unique. Also, in the job config, include "custom_settings.managed: true". This means the job will have a badge in the ML UI and there are warnings if you try to delete or edit it. This is already used by Metrics.
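As a rough illustration of the job config advice above, here is a minimal TypeScript sketch of the relevant fragment. The interface only models the three fields mentioned in this thread, and the group name `logs-ui` and the helper name `buildManagedJobFragment` are hypothetical, not taken from the Kibana code base.

```typescript
// Only the fields discussed in this thread; a real job config has many more.
interface JobConfigFragment {
  job_id: string;
  groups: string[];
  custom_settings: { managed: boolean };
}

// Hypothetical helper: a single shared group keeps ML UI filtering useful,
// and managed: true marks the job as managed (badge and warnings in the ML UI).
function buildManagedJobFragment(jobId: string): JobConfigFragment {
  return {
    job_id: jobId,
    groups: ["logs-ui"], // assumed group name, for illustration only
    custom_settings: { managed: true },
  };
}

const fragment = buildManagedJobFragment("logs-default-default-rate");
console.log(JSON.stringify(fragment, null, 2));
```

Using one group for all Logs UI jobs follows the "define fewer groups" guideline, since the job_id already distinguishes individual jobs.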
@smith I think this will need more thought before we proceed.
Thanks for looking @miltonhultgren. I'll put this back in the backlog for now.
Sorry, did not mean to put you off .. imho changing to
Experiencing the same issue on 8.4.3. Just a question: why don't we allow job names of 255 chars?
I recently ran into this issue creating a ML job and seeing "The job id cannot contain more than 64 characters." |
While working on #47477, I found that attempting to re-create a ML job faces a 404 because it uses an endpoint that has been removed / changed. This PR updates to use the newer endpoint to find which tasks are blocking in the ML system (like job deletion) and changes the types to match the new API.
Backport: this will backport the following commits from `main` to `8.11`: [infra] Use correct ML API to query blocking tasks (#167779) (cherry picked from commit 48b66d7). Co-authored-by: Milton Hultgren <milton.hultgren@elastic.co>
Pinging @elastic/obs-ux-logs-team (Team:obs-ux-logs)
Closes elastic#47477

### Summary

ML job IDs have a limit of 64 characters. For the log ML jobs we add the string `kibana-logs-ui` plus the space and log view IDs as a prefix to the job names (`log-entry-rate` and `log-entry-categories-count`) which can quickly eat up the 64 character limit (even our own Stack Monitoring log view hits the limit). This prevents users from being able to create ML jobs and it's hard to rename a space or log view, and the limit is not hinted at during space creation (because they are unrelated in some sense).

In order to achieve a more stable length to the ID, this PR introduces a new format for the prefix which creates a UUID v5 which uses the space and log view ID as seed information (it then removes the dashes to still be within the size limit for the categorization job).

Since there is no technical difference between the new and old format, this PR makes an effort to continue to support the old format and allow migration of old jobs as needed. The old jobs work and may contain important data so the user should not feel forced to migrate.

The main addition is a new small API that checks if any ML jobs are available and which format they use for the ID so that the app can request data accordingly and the APIs have been modified to take the ID format into account (except during creation which should always use the new format).

The solution applied is not ideal. It simply passes the ID format along with the space and log view ID to each point where the ID is re-created (which is multiple). The ideal solution would be to store the job data in the store and pass that around instead but that seemed like a considerably larger effort. This PR does introduce some functional tests around the ML job creation process, so such a future refactor should be a bit safer than previously.

### How to test

* Start from `main`
* Start Elasticsearch
* Start Kibana
* Load the Sample web logs (Kibana home -> Try sample data -> Other sample data sets)
* Visit the Anomalies page in the Logs UI
* Set up any of the two ML jobs or both, wait for some results to show up
* Checkout the PR branch
* Visit the anomalies page and verify that it still works (requests go to resolve the ID format, should return 'legacy' which should then load the data for the legacy job)
* Recreate the ML job and verify that the new job works and results still show up (new requests should go out with the new format being used, which may be a mixed mode if you have two jobs and only migrate one of them)

---------

Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>
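The prefix scheme described in the Summary above can be sketched roughly as follows. This is not the PR's actual code: the namespace UUID (the RFC 4122 DNS namespace is used as a stand-in), the way the space and log view IDs are joined into a seed, the `logs-` static part, and the format name `hashed` are all assumptions made for illustration. Only the legacy format, the two job names, and the "UUID v5 with dashes removed" idea come from the text.

```typescript
import { createHash } from "node:crypto";

// Assumption: stand-in namespace (RFC 4122 DNS namespace); the PR may use another.
const NAMESPACE = "6ba7b810-9dad-11d1-80b4-00c04fd430c8";

// Name-based UUID (version 5): SHA-1 over the namespace bytes followed by the name.
function uuidV5(name: string, namespace: string): string {
  const namespaceBytes = Buffer.from(namespace.replace(/-/g, ""), "hex");
  const digest = createHash("sha1").update(namespaceBytes).update(name, "utf8").digest();
  const bytes = Buffer.from(digest.subarray(0, 16));
  bytes.writeUInt8((bytes.readUInt8(6) & 0x0f) | 0x50, 6); // set version 5
  bytes.writeUInt8((bytes.readUInt8(8) & 0x3f) | 0x80, 8); // set RFC 4122 variant
  const hex = bytes.toString("hex");
  return [hex.slice(0, 8), hex.slice(8, 12), hex.slice(12, 16), hex.slice(16, 20), hex.slice(20)].join("-");
}

type JobIdFormat = "legacy" | "hashed"; // "hashed" is a hypothetical name
type JobType = "log-entry-rate" | "log-entry-categories-count";

function getJobId(spaceId: string, logViewId: string, jobType: JobType, format: JobIdFormat): string {
  if (format === "legacy") {
    // Old format: length grows with the user-defined IDs.
    return `kibana-logs-ui-${spaceId}-${logViewId}-${jobType}`;
  }
  // New format: fixed-length hash of the IDs, dashes removed to save characters.
  const hash = uuidV5(`${spaceId}-${logViewId}`, NAMESPACE).replace(/-/g, "");
  return `logs-${hash}-${jobType}`; // "logs-" static part is an assumption
}

console.log(getJobId("default", "default", "log-entry-categories-count", "hashed").length);
```

Under these assumptions the hashed ID has the same length regardless of how long the space or log view ID is, which is the "more stable length" property the PR is after.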
Summary
The static parts of the log rate job IDs should be as short as possible.
Rationale
The log rate jobs are assigned human-readable IDs that contain the static parts as well as the Kibana space and logs source IDs:
kibana-logs-ui-${spaceId}-${sourceId}-log-entry-rate
Since the Kibana space ID is set by the user, there is a risk of exceeding the 64-character limit on the length of the ML job ID. Reducing the lengths of the static parts can reduce that risk by leaving more room for the user-defined space ID. In the long term, the space-awareness of ML jobs will remove the necessity for including the space ID in the job ID.
Because the ID is used to find the jobs belonging to a source config, this will be a breaking change.
Acceptance criteria
logs-${spaceId}-${sourceId}-rate
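To make the trade-off concrete, the following TypeScript sketch compares how many characters each format leaves for the user-defined space and source IDs. The two templates and the 64-character limit come from this issue; the helper names (`legacyJobId`, `shortJobId`, `budget`, `fitsLimit`) are made up for illustration.

```typescript
// ML job IDs cannot contain more than 64 characters.
const ML_JOB_ID_MAX_LENGTH = 64;

type JobIdTemplate = (spaceId: string, sourceId: string) => string;

// Current format from the Rationale section.
const legacyJobId: JobIdTemplate = (spaceId, sourceId) =>
  `kibana-logs-ui-${spaceId}-${sourceId}-log-entry-rate`;

// Shortened format from the Acceptance criteria.
const shortJobId: JobIdTemplate = (spaceId, sourceId) =>
  `logs-${spaceId}-${sourceId}-rate`;

// Characters left for the space ID and source ID combined.
const budget = (template: JobIdTemplate): number =>
  ML_JOB_ID_MAX_LENGTH - template("", "").length;

// Whether a concrete pair of IDs produces a valid job ID length.
const fitsLimit = (template: JobIdTemplate, spaceId: string, sourceId: string): boolean =>
  template(spaceId, sourceId).length <= ML_JOB_ID_MAX_LENGTH;

console.log("legacy budget:", budget(legacyJobId));
console.log("short budget:", budget(shortJobId));
```

A check like `fitsLimit` could also be run before job creation to surface the limit earlier, though where such a check would live is not something this issue specifies.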