
[Monitoring] telemetry fetchers use broken pagination logic #91654

Closed · afharo opened this issue Feb 17, 2021 · 12 comments
Labels: chore, Feature:Stack Monitoring, Feature:Telemetry, impact:low, loe:small, Team:Infra Monitoring UI - DEPRECATED

Comments

@afharo (Member) commented Feb 17, 2021

As @robbavey pointed out, this is following the same approach as the get_beats_stats piece of logic.

@chrisronline, do you know if .monitoring-* indices have a custom index.max_result_window setting that allows pagination for more than 10k docs? Or if it's even possible that we reach 10k docs on these types of queries?

Originally posted by @afharo in #90850 (comment)

Let's revisit the pagination logic for Beats and Logstash to make sure it works for 10k+ docs. We can use either the Scroll or the Search After approach.
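
For reference, a minimal sketch of the Scroll option, assuming the 8.x @elastic/elasticsearch client; the index, query, and page size are illustrative, not the actual fetcher code:

```typescript
import { Client } from '@elastic/elasticsearch';

const client = new Client({ node: 'http://localhost:9200' });

// Hypothetical helper: scrolls through every beats_stats document in pages of 1000.
async function fetchAllBeatsStats(): Promise<unknown[]> {
  const docs: unknown[] = [];

  // Open a scroll context; each page keeps the context alive for another 30s.
  let response = await client.search({
    index: '.monitoring-beats-*',
    scroll: '30s',
    size: 1000,
    query: { term: { type: 'beats_stats' } },
  });

  while (response.hits.hits.length > 0) {
    docs.push(...response.hits.hits);
    // Fetch the next page using the scroll cursor returned by the previous call.
    response = await client.scroll({ scroll_id: response._scroll_id!, scroll: '30s' });
  }

  // Release the scroll context once we've read everything.
  await client.clearScroll({ scroll_id: response._scroll_id! });
  return docs;
}
```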

@elasticmachine (Contributor)

Pinging @elastic/kibana-telemetry (Team:KibanaTelemetry)

@afharo (Member, Author) commented Feb 17, 2021

Ping @elastic/stack-monitoring-ui

@chrisronline (Contributor)

We don't have anything like that, AFAIK. It seems like it would come up more often, but I honestly can't recall the last time I encountered someone with an issue around this.

@afharo (Member, Author) commented Feb 18, 2021

This is about collecting telemetry, so any errors fail silently and telemetry simply isn't sent. I don't think anyone would actually notice.

I also think it's an edge case. It would mean we are dealing with massive clusters with more than 10k Beats instances, Logstash instances, or Logstash pipeline ephemeral IDs.

@chrisronline (Contributor)

Makes sense. I meant that we don't hear from customers that they aren't able to see their 10k+ Beats in the Stack Monitoring UI, so I'm not sure how often anyone exceeds 10k instances of anything. It seems like an isolated use case, but maybe something worth noting in the docs somewhere.

@sgrodzicki added the bug label Jun 7, 2021
@simianhacker added the chore label and removed the bug label Jun 23, 2021
@simianhacker (Member)

I'm not sure what the issue is here. Are we trying to figure out if the UI can handle 10K+ logstash instances?

@Bamieh (Member) commented Jun 25, 2021

@simianhacker telemetry runs queries against the .monitoring-* indices and reports the results back to our cluster.

The current logic to paginate over the indices does not work because the max window (from + size) can't exceed 10k docs by default in ES (index.max_result_window). This means that, with the current implementation, no pagination happens even if Beats/Logstash report more than 10k documents.
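
To make the failure mode concrete, here is a minimal sketch of that from + size loop, assuming the 8.x @elastic/elasticsearch client; the index and page size are illustrative, not the actual collector code:

```typescript
import { Client } from '@elastic/elasticsearch';

const client = new Client({ node: 'http://localhost:9200' });
const PAGE_SIZE = 1000;

// Hypothetical from + size pagination over a monitoring index.
async function fetchWithFromSize(index: string): Promise<unknown[]> {
  const docs: unknown[] = [];
  for (let page = 0; ; page++) {
    // Once from + size exceeds index.max_result_window (10,000 by default),
    // Elasticsearch rejects the request; the telemetry collectors swallow the
    // error, so everything past the first 10k documents is never collected.
    const response = await client.search({
      index,
      from: page * PAGE_SIZE,
      size: PAGE_SIZE,
    });
    const hits = response.hits.hits;
    docs.push(...hits);
    if (hits.length < PAGE_SIZE) break;
  }
  return docs;
}
```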

@lukeelmers added the Team:Core label Oct 1, 2021
@elasticmachine (Contributor)

Pinging @elastic/kibana-core (Team:Core)

@exalate-issue-sync bot added the impact:low and loe:small labels Nov 4, 2021
@lukeelmers removed the Team:Core label Jan 11, 2022
@afharo added the Team:Infra Monitoring UI - DEPRECATED label and removed the Team:Monitoring label Mar 4, 2022
@elasticmachine (Contributor)

Pinging @elastic/infra-monitoring-ui (Team:Infra Monitoring UI)

@jasonrhodes (Member)

@Bamieh @afharo @robbavey we just re-discovered this ticket and we aren't sure whether this is a "bug" or not. We are trying to determine if we need to pull this in...

@afharo (Member, Author) commented Mar 15, 2022

@jasonrhodes in #127273 I removed the "broken" pagination logic: it basically doesn't paginate anymore.

However, ideally, the requests in there should use PIT pagination to retrieve all the available data in a less expensive way. For more context on why we prefer PIT requests over size: 1000 searches, please refer to #93770.
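
For illustration, a minimal sketch of PIT + search_after pagination, assuming the 8.x @elastic/elasticsearch client; this is not the code from #127273, and the index and page size are placeholders:

```typescript
import { Client } from '@elastic/elasticsearch';

const client = new Client({ node: 'http://localhost:9200' });

// Hypothetical helper: pages through an index with a point-in-time + search_after cursor.
async function fetchAllWithPit(index: string): Promise<unknown[]> {
  const pit = await client.openPointInTime({ index, keep_alive: '1m' });
  const docs: unknown[] = [];
  let searchAfter: any[] | undefined;

  try {
    while (true) {
      const response = await client.search({
        size: 1000,
        pit: { id: pit.id, keep_alive: '1m' },
        // _shard_doc is a cheap, unique tiebreaker available inside a PIT search.
        sort: ['_shard_doc'],
        search_after: searchAfter,
      });
      const hits = response.hits.hits;
      if (hits.length === 0) break;
      docs.push(...hits);
      // The sort values of the last hit become the cursor for the next page.
      searchAfter = hits[hits.length - 1].sort;
    }
  } finally {
    // Always close the PIT so Elasticsearch can release its resources.
    await client.closePointInTime({ id: pit.id });
  }
  return docs;
}
```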

@smith (Contributor) commented Apr 12, 2022

Closing. If it's something we want to do, we can start with:

> However, ideally, the requests in there should use PIT pagination to retrieve all the available data in a less expensive way. For more context on why we prefer PIT requests over size: 1000 searches, please refer to #93770.

@smith closed this as completed Apr 12, 2022