Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feat graph investigate query time explosion #306

Merged
merged 9 commits into from
Dec 11, 2024

Conversation

Zenithar
Copy link
Contributor

@Zenithar Zenithar commented Dec 11, 2024

Context

  • Update JanusGraph to 1.1.0

  • Various performance-related fixes

    • Register commonly used composite indices used for querying the graph
    • Improve Jupyter Notebook queries to use newly registered indices
  • Enable the force-index flag to prevent g.V().count() like queries that could take down the graph by doing complete graph traversal. An error is raised when doing queries without index usage.

  • Enforce a hard limit to 10000 results returned to prevent server overload when the request is not limited.

  • Use indices to destroy the graph on the ingest local command call

@Zenithar Zenithar self-assigned this Dec 11, 2024
@Zenithar Zenithar marked this pull request as ready for review December 11, 2024 10:33
@Zenithar Zenithar requested a review from a team as a code owner December 11, 2024 10:33
Copy link
Contributor

@edznux-dd edznux-dd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!
Do you have some numbers to show a bit the improvements of these?

deployments/kubehound/graph/Dockerfile Outdated Show resolved Hide resolved
Comment on lines +182 to +191
// Create composite indices for the properties we want to search on
mgmt.buildIndex('byClusterAndRunIDComposite', Vertex.class).addKey(cluster).addKey(runID).buildCompositeIndex();
mgmt.buildIndex('byClassAndRunIDComposite', Vertex.class).addKey(cls).addKey(runID).buildCompositeIndex();
mgmt.buildIndex('byClassAndClusterComposite', Vertex.class).addKey(cls).addKey(cluster).buildCompositeIndex();
mgmt.buildIndex('byClassAndTypeComposite', Vertex.class).addKey(cls).addKey(type).buildCompositeIndex();
mgmt.buildIndex('byClassAndExposureComposite', Vertex.class).addKey(cls).addKey(exposure).buildCompositeIndex();
mgmt.buildIndex('byTypeAndNameComposite', Vertex.class).addKey(type).addKey(name).buildCompositeIndex();
mgmt.buildIndex('byImageAndRunIDComposite', Vertex.class).addKey(image).addKey(runID).buildCompositeIndex();
mgmt.buildIndex('byAppAndRunIDComposite', Vertex.class).addKey(app).addKey(runID).buildCompositeIndex();
mgmt.buildIndex('byNamespaceAndRunIDComposite', Vertex.class).addKey(namespace).addKey(runID).buildCompositeIndex();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, those will definitely be super useful.
Just by curiosity, do you know if it would have a significant impact on insert time? Or does it stay negligeable?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For in-memory, I don't see any relevant/noticeable write performance improvements, except for the graph nuking, which is using the index now. It is more visible with concrete datastore and search index backends.

@Zenithar Zenithar merged commit 06bf54c into main Dec 11, 2024
9 checks passed
@Zenithar Zenithar deleted the feat_graph_investigate_query_time_explosion branch December 11, 2024 14:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants