
[Logs UI] Shorten the logs ML job ID prefixes #47477

Closed
weltenwort opened this issue Oct 7, 2019 · 17 comments · Fixed by #168234
Labels: Feature:Logs UI (Logs UI feature) · impact:high (Addressing this issue will have a high level of impact on the quality/strength of our product) · needs-refinement (A reason and acceptance criteria need to be defined for this issue) · Team:obs-ux-logs (Observability Logs User Experience Team)

Comments

@weltenwort
Member

weltenwort commented Oct 7, 2019

Summary

The static parts of the log rate job IDs should be as short as possible.

Rationale

The log rate jobs are assigned human-readable IDs that contain the static parts as well as the Kibana space and logs source IDs: `kibana-logs-ui-${spaceId}-${sourceId}-log-entry-rate`. Since the Kibana space ID is set by the user, there is a risk of exceeding the 64-character limit on the length of the ML job ID. Shortening the static parts reduces that risk by leaving more room for the user-defined space ID.

In the long term, the space awareness of ML jobs will remove the need to include the space ID in the job ID.

Because the ID is used to find the jobs belonging to a source configuration, this will be a breaking change.
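
To make the failure mode concrete, here is a minimal TypeScript sketch of the current ID construction; the space and source IDs are made-up examples, not values from the issue:

```typescript
// Simplified reconstruction of how the job ID is assembled today.
const ML_JOB_ID_MAX_LENGTH = 64; // ML rejects job IDs longer than this

const buildLogEntryRateJobId = (spaceId: string, sourceId: string) =>
  `kibana-logs-ui-${spaceId}-${sourceId}-log-entry-rate`;

// A perfectly reasonable space name already exceeds the budget:
const jobId = buildLogEntryRateJobId('observability-production-eu-west', 'default');
console.log(jobId.length); // 70 > 64, so the ML job cannot be created
```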

Acceptance criteria

  • The ID assigned to jobs is shortened to `logs-${spaceId}-${sourceId}-rate`.
  • The ID assigned to a datafeed matches the job ID.
  • Jobs that were created before this change are still handled correctly.
weltenwort added the Feature:Logs UI, Team:Infra Monitoring UI - DEPRECATED, and v7.5.0 labels Oct 7, 2019
@elasticmachine
Contributor

Pinging @elastic/infra-logs-ui (Team:infra-logs-ui)

@jasonrhodes
Member

This will no longer be a big deal when we can drop space ID from the Job ID, right? cc @weltenwort

@weltenwort
Member Author

weltenwort commented Jun 12, 2020

True: if there is a migration in place to move already existing jobs into their respective spaces, we should remove the space ID as part of that migration.

@jasonrhodes
Member

@elastic/machine-learning What's the ETA on ML space awareness?

@jgowdyelastic
Member

ML job space awareness is planned for 7.11

@sophiec20
Contributor

Meta ticket for ref #64172

jasonrhodes added the impact:low label Jul 29, 2021
@jasonrhodes
Member

Refinement update: we no longer need to put the space ID in the index name, but we need to make sure we understand how to query with backwards compatibility if we remove it.

jasonrhodes added the needs-refinement label Jul 29, 2021
smith removed the needs-refinement label Jul 26, 2022
miltonhultgren self-assigned this Sep 6, 2022
@miltonhultgren
Contributor

miltonhultgren commented Sep 7, 2022

After digging into the code for this I have some questions.
Maybe @elastic/machine-learning are best suited to answer.

  1. What needs to be in the job ID to make it unique?
    Today we include a prefix, the space ID, the logs/metrics source configuration ID, and the job name.

  2. Can I somehow migrate old jobs so that their old IDs are renamed to use the new pattern?
    That would make the UI code a lot simpler, since otherwise I need to check for both the new and the old format in every place where we refer to a job by its ID.

  3. Are there any general guidelines for what to put into the group field when registering a job?
    For the log rate and categorisation jobs we set the group to "logs-ui" (meaning the app that created them), but for our metric jobs we grouped them by "metrics" (the type of data they use) and "host"/"k8s" (the dataset they work on).

@miltonhultgren
Contributor

Gentle re-ping; I suspect my edit didn't fire the notification, @elastic/machine-learning :)

@sophiec20
Contributor

sophiec20 commented Sep 12, 2022

What needs to be in the job ID to make it unique?

This is a question for the logs team. ML jobs can be shared between spaces, and I don't know what source_id is.

It's a good time to consider whether the Logs UI should stick with only ever being able to link to one job (with a hard-coded, fixed ID). It is reasonable to think that customers would want the flexibility to see results from multiple jobs.

Alternatively (and imho probably more flexibly), if sticking to one job, the Logs UI could allow the user to override the job_id in advanced settings.

You might want to add a version number in the job_id. See the next question.

Can I somehow migrate old jobs so that their old ids are renamed to use the new pattern?

The job_id cannot be changed; jobs cannot be renamed. You would need to clone them and start new jobs. Other solutions have done post-upgrade checks and offered users the option to upgrade their ML jobs via app banners (where the upgrade creates a new job, stops the old one, and points to the new one).
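
A rough sketch of that clone-and-recreate flow, assuming the @elastic/elasticsearch v8 JavaScript client; a real implementation would copy an explicit whitelist of fields and clone the datafeed as well:

```typescript
import { Client } from '@elastic/elasticsearch';

const client = new Client({ node: 'http://localhost:9200' });

// "Rename" an ML job by re-creating its config under a new ID.
// The old job and its results are left in place.
async function cloneJob(oldJobId: string, newJobId: string) {
  const { jobs } = await client.ml.getJobs({ job_id: oldJobId });
  // Drop server-assigned fields before re-creating; a production version
  // would whitelist the cloneable fields instead of spreading the rest.
  const { job_id, create_time, job_version, model_snapshot_id, ...config } =
    jobs[0] as Record<string, any>;
  await client.ml.putJob({ job_id: newJobId, ...config } as any);
}
```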

Are there any general guidelines for what to put into the group field when registering a job?

The "group" field mainly allows filtering in the ML UI -- this allows users to view results from multiple jobs together and to manage multiple jobs together .. e.g. bulk stop. As a general guideline, define fewer groups ... otherwise the filtering causes a lot of groups of 1 which isn't the best user experience as job_id is already unique.

Also, in the job config, include "custom_settings.managed": true. This gives the job a badge in the ML UI, and there are warnings if you try to delete or edit it. This is already used by Metrics.
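
Putting both suggestions together, a sketch of the relevant slice of a job config; the ID and group values here are illustrative, not decided in this issue:

```typescript
// Only the fields discussed above; analysis_config, data_description, etc. omitted.
const jobConfig = {
  job_id: 'logs-default-default-rate', // hypothetical shortened ID
  groups: ['logs'], // few, broad groups keep ML UI filtering useful
  custom_settings: {
    managed: true, // "managed" badge in the ML UI plus delete/edit warnings
  },
};
```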

@miltonhultgren miltonhultgren removed their assignment Sep 13, 2022
@miltonhultgren
Contributor

@smith I think this will need more thought before we proceed.

@smith smith added the needs-refinement A reason and acceptance criteria need to be defined for this issue label Sep 13, 2022
@smith
Contributor

smith commented Sep 13, 2022

Thanks for looking @miltonhultgren. I'll put this back in the backlog for now.

@sophiec20
Contributor

Sorry, I did not mean to put you off... imho changing to `logs-${spaceId}-${sourceId}-rate` would be a useful incremental change.

@smith
Contributor

smith commented Sep 13, 2022

Sorry, I did not mean to put you off... imho changing to `logs-${spaceId}-${sourceId}-rate` would be a useful incremental change.

`sourceId` is usually `default`, but `spaceId` could be anything, so we would still face the problem of easily exceeding the 64-character limit. IMO this is only worth picking up right now if we can prevent exceeding the limit without too much effort.

Danouchka added the impact:high label and removed the impact:low label Oct 6, 2022
@Danouchka

Experiencing the same issue on 8.4.3. Just a question: why don't we allow job names of 255 chars?

@adjenks

adjenks commented Apr 14, 2023

I recently ran into this issue creating an ML job and seeing "The job id cannot contain more than 64 characters."
That led me here: https://discuss.elastic.co/t/kibana-ml-the-job-id-cannot-contain-more-than-64-characters/303047
Which led me here: #112938
Which led me here.
I look forward to a fix.
Thank you and good luck.

miltonhultgren self-assigned this Sep 5, 2023
miltonhultgren added a commit that referenced this issue Oct 5, 2023
While working on #47477, I found that attempting to re-create an ML job faces a 404 because it uses an endpoint that has been removed or changed.

This PR updates the code to use the newer endpoint to find which tasks are blocking in the ML system (like job deletion) and changes the types to match the new API.
kibanamachine pushed a commit to kibanamachine/kibana that referenced this issue Oct 5, 2023
(cherry picked from commit 48b66d7)
kibanamachine added a commit that referenced this issue Oct 5, 2023
…168075)

# Backport

This will backport the following commits from `main` to `8.11`:
- [[infra] Use correct ML API to query blocking tasks (#167779)](#167779)


Co-authored-by: Milton Hultgren <milton.hultgren@elastic.co>
miltonhultgren added a commit to miltonhultgren/kibana that referenced this issue Oct 6, 2023
miltonhultgren changed the title from "[Logs UI] Shorten the log rate job ID prefixes" to "[Logs UI] Shorten the logs ML job ID prefixes" Oct 13, 2023
dej611 pushed a commit to dej611/kibana that referenced this issue Oct 17, 2023
gbamparop added the Team:obs-ux-logs label and removed the Team:Infra Monitoring UI - DEPRECATED label Nov 9, 2023
@elasticmachine
Contributor

Pinging @elastic/obs-ux-logs-team (Team:obs-ux-logs)

botelastic bot added and removed the needs-team label Nov 9, 2023
pull bot pushed a commit to MrLiukang/kibana that referenced this issue Nov 19, 2023
Closes elastic#47477

### Summary

ML job IDs have a limit of 64 characters. For the log ML jobs we add the string `kibana-logs-ui` plus the space and log view IDs as a prefix to the job names (`log-entry-rate` and `log-entry-categories-count`), which can quickly eat up the 64-character limit (even our own Stack Monitoring log view hits it). This prevents users from creating ML jobs; renaming a space or log view is hard, and the limit is not hinted at during space creation (because the two are unrelated in some sense).

To achieve a more stable ID length, this PR introduces a new prefix format: a UUID v5 seeded with the space and log view IDs (with the dashes removed so the categorization job ID still fits within the size limit).
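
A minimal sketch of that hashing idea using the `uuid` npm package; the namespace UUID below is made up for illustration, not the one the PR uses:

```typescript
import { v5 as uuidv5 } from 'uuid';

// Hypothetical namespace; v5 UUIDs are deterministic for a given name + namespace.
const LOGS_ML_NAMESPACE = '2f6c94f7-7b4d-4f55-9c3b-3a64e7b1a001';

const jobPrefix = (spaceId: string, logViewId: string) =>
  uuidv5(`${spaceId}-${logViewId}`, LOGS_ML_NAMESPACE).replace(/-/g, '');

// Always 32 hex characters, no matter how long the space or log view IDs are:
const jobId = `logs-${jobPrefix('my-very-long-space-name', 'default')}-log-entry-rate`;
```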

Since there is no technical difference between the new and old format, this PR makes an effort to keep supporting the old format and to allow migration of old jobs as needed. The old jobs work and may contain important data, so the user should not feel forced to migrate.

The main addition is a small new API that checks whether any ML jobs are available and which ID format they use, so that the app can request data accordingly; the existing APIs have been modified to take the ID format into account (except during creation, which always uses the new format).
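
The shape of that resolution step might look roughly like this (the function and type names are hypothetical, not taken from the PR):

```typescript
type IdFormat = 'legacy' | 'hashed';

// Decide which naming scheme to use when requesting job data.
async function resolveIdFormat(
  jobExists: (jobId: string) => Promise<boolean>,
  legacyJobId: string,
  hashedJobId: string
): Promise<IdFormat> {
  // Prefer the new format if such a job exists (e.g. after re-creation).
  if (await jobExists(hashedJobId)) return 'hashed';
  if (await jobExists(legacyJobId)) return 'legacy';
  return 'hashed'; // no job yet; creation always uses the new format
}
```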

The solution applied is not ideal. It simply passes the ID format, along with the space and log view IDs, to each point where the ID is re-created (of which there are several). The ideal solution would be to store the job data in the store and pass that around instead, but that seemed like a considerably larger effort. This PR does introduce some functional tests around the ML job creation process, so such a future refactor should be a bit safer than before.

### How to test

* Start from `main`
* Start Elasticsearch
* Start Kibana
* Load the Sample web logs (Kibana home -> Try sample data -> Other sample data sets)
* Visit the Anomalies page in the Logs UI
* Set up either or both of the two ML jobs and wait for some results to show up
* Check out the PR branch
* Visit the Anomalies page and verify that it still works (requests go out to resolve the ID format, which should return 'legacy' and then load the data for the legacy job)
* Recreate the ML job and verify that the new job works and results still show up (new requests should go out with the new format, which may be a mixed mode if you have two jobs and only migrate one of them)

---------

Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>