Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Can we improve query performance? #1902

Open
sanket-uptycs opened this issue May 31, 2024 · 4 comments
Open

Can we improve query performance? #1902

sanket-uptycs opened this issue May 31, 2024 · 4 comments
Labels
question Further information is requested Stale Stale issues/PRs

Comments

@sanket-uptycs
Copy link

I am trying following query. Can we improve query performance?

SELECT * FROM cypher('graphdb', $$
     MATCH (n1:node1 {SubscriptionId: "8da31d20-daf9-42ad-bf7f-2cdcf6290001"})
     MATCH (n2:node2 {SubscriptionId: "8da31d20-daf9-42ad-bf7f-2cdcf6290001"})
     UNWIND n1.ArrayOfStrings as props
     WITH props, n1, n2
     WHERE toLower(props) = toLower(n2.ResourceId)
     MERGE (n1{SubscriptionId: "8da31d20-daf9-42ad-bf7f-2cdcf6290001"})-[r:ATTACHED_TO{SubscriptionId: "8da31d20-daf9-42ad-bf7f-2cdcf6290001"}]->(n2{SubscriptionId: "8da31d20-daf9-42ad-bf7f-2
 cdcf6290001"})
     SET r.batchId = 1717068002, r.SubscriptionId = "8da31d20-daf9-42ad-bf7f-2cdcf6290001"
 $$) as (a agtype);

I have gin index on node1 node2 and ATTACHED_TO edge.
But it is taking
Planning Time: 0.309 ms Execution Time: 44548.615 ms
Here is the query plan.

 Custom Scan (Cypher Set)  (cost=0.00..0.00 rows=0 width=32) (actual time=44537.778..44537.782 rows=0 loops=1)
   ->  Subquery Scan on cypher  (cost=0.00..0.00 rows=1 width=32) (actual time=8.018..43411.094 rows=3000 loops=1)
         ->  Custom Scan (Cypher Merge)  (cost=0.00..0.00 rows=0 width=256) (actual time=8.017..43410.376 rows=3000 loops=1)
               ->  Subquery Scan on _age_default_alias_previous_cypher_clause  (cost=180.03..4370.67 rows=810 width=64) (actual time=8.014..43408.357 rows=3000 loops=1)
                     ->  Hash Left Join  (cost=180.03..4362.57 rows=810 width=160) (actual time=8.014..43407.518 rows=3000 loops=1)
                           Join Filter: ((age_properties(_age_default_alias_previous_cypher_clause_1.n1) @> '{"SubscriptionId": "8da31d20-daf9-42ad-bf7f-2cdcf6290001"}'::agtype) AND (age_properties(_age_default_alias_previous_cypher_clause_1.n2) @> '{"SubscriptionId": "8da31d20-daf9-42ad-bf7f-2cdcf6290001"}'::agtype))
                           ->  Subquery Scan on _age_default_alias_previous_cypher_clause_1  (cost=48.21..4206.43 rows=810 width=96) (actual time=0.900..43278.062 rows=3000 loops=1)
                                 Filter: (age_tolower(_age_default_alias_previous_cypher_clause_1.props) = age_tolower(agtype_access_operator(VARIADIC ARRAY[_age_default_alias_previous_cypher_clause_1.n2, '"ResourceId"'::agtype])))
                                 Rows Removed by Filter: 8997000
                                 ->  ProjectSet  (cost=48.21..966.43 rows=162000 width=96) (actual time=0.556..17782.350 rows=9000000 loops=1)
                                       ->  Nested Loop  (cost=48.21..152.38 rows=162 width=2796) (actual time=0.536..454.720 rows=4500000 loops=1)
                                             ->  Bitmap Heap Scan on "node2" n2  (cost=28.14..95.85 rows=18 width=1060) (actual time=0.349..4.000 rows=3000 loops=1)
                                                   Recheck Cond: (properties @> '{"SubscriptionId": "8da31d20-daf9-42ad-bf7f-2cdcf6290001"}'::agtype)
                                                   Heap Blocks: exact=436
                                                   ->  Bitmap Index Scan on index_name2  (cost=0.00..28.14 rows=18 width=0) (actual time=0.305..0.305 rows=3000 loops=1)
                                                         Index Cond: (properties @> '{"SubscriptionId": "8da31d20-daf9-42ad-bf7f-2cdcf6290001"}'::agtype)
                                             ->  Materialize  (cost=20.07..54.52 rows=9 width=1736) (actual time=0.000..0.057 rows=1500 loops=3000)
                                                   ->  Bitmap Heap Scan on "node1" n1  (cost=20.07..54.48 rows=9 width=1736) (actual time=0.183..1.288 rows=1500 loops=1)
                                                         Recheck Cond: (properties @> '{"SubscriptionId": "8da31d20-daf9-42ad-bf7f-2cdcf6290001"}'::agtype)
                                                         Heap Blocks: exact=383
                                                         ->  Bitmap Index Scan on index_name  (cost=0.00..20.07 rows=9 width=0) (actual time=0.149..0.149 rows=1500 loops=1)
                                                               Index Cond: (properties @> '{"SubscriptionId": "8da31d20-daf9-42ad-bf7f-2cdcf6290001"}'::agtype)
                           ->  Hash  (cost=131.40..131.40 rows=28 width=48) (actual time=7.053..7.054 rows=4600 loops=1)
                                 Buckets: 8192 (originally 1024)  Batches: 1 (originally 1)  Memory Usage: 1358kB
                                 ->  Bitmap Heap Scan on "ATTACHED_TO" r  (cost=36.21..131.40 rows=28 width=48) (actual time=0.617..6.157 rows=4600 loops=1)
                                       Recheck Cond: (properties @> '{"SubscriptionId": "8da31d20-daf9-42ad-bf7f-2cdcf6290001"}'::agtype)
                                       Heap Blocks: exact=442
                                       ->  Bitmap Index Scan on index_name3  (cost=0.00..36.21 rows=28 width=0) (actual time=0.553..0.553 rows=4665 loops=1)
                                             Index Cond: (properties @> '{"SubscriptionId": "8da31d20-daf9-42ad-bf7f-2cdcf6290001"}'::agtype)

select count(*) from graphdb._ag_label_edge;
| 324030 |

select count(*) from graphdb._ag_label_vertex;
| 220385 |
Number of Node and Edge types : 50+

@sanket-uptycs sanket-uptycs added the question Further information is requested label May 31, 2024
Copy link

This issue is stale because it has been open 60 days with no activity. Remove "Abondoned" label or comment or this will be closed in 14 days.

@github-actions github-actions bot added the Stale Stale issues/PRs label Jul 31, 2024
Copy link

This issue was closed because it has been stalled for further 14 days with no activity.

@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Aug 14, 2024
@jrgemignani jrgemignani removed the Stale Stale issues/PRs label Aug 14, 2024
@jrgemignani jrgemignani reopened this Aug 14, 2024
@jrgemignani
Copy link
Contributor

@sanket-uptycs We are looking into performance improvements in general.

I will point out that the below query is a bit redundant by using MERGE and SET. MERGE is basically MATCH X and if X doesn't exist CREATE X. Since you have already MATCHed them, you don't need to do it again. Additionally, since you are creating a new edge, you don't need to SET it.

SELECT * FROM cypher('graphdb', $$
     MATCH (n1:node1 {SubscriptionId: "8da31d20-daf9-42ad-bf7f-2cdcf6290001"})
     MATCH (n2:node2 {SubscriptionId: "8da31d20-daf9-42ad-bf7f-2cdcf6290001"})
     UNWIND n1.ArrayOfStrings as props
     WITH props, n1, n2
     WHERE toLower(props) = toLower(n2.ResourceId)
     MERGE (n1{SubscriptionId: "8da31d20-daf9-42ad-bf7f-2cdcf6290001"})-[r:ATTACHED_TO{SubscriptionId: "8da31d20-daf9-42ad-bf7f-2cdcf6290001"}]->(n2{SubscriptionId: "8da31d20-daf9-42ad-bf7f-2
 cdcf6290001"})
     SET r.batchId = 1717068002, r.SubscriptionId = "8da31d20-daf9-42ad-bf7f-2cdcf6290001"
 $$) as (a agtype);

I believe this is equivalent, and likely faster -

SELECT * FROM cypher('graphdb', $$
     MATCH (n1:node1 {SubscriptionId: "8da31d20-daf9-42ad-bf7f-2cdcf6290001"})
     MATCH (n2:node2 {SubscriptionId: "8da31d20-daf9-42ad-bf7f-2cdcf6290001"})
     UNWIND n1.ArrayOfStrings as props
     WITH props, n1, n2
     WHERE toLower(props) = toLower(n2.ResourceId)
     CREATE (n1)-[r:ATTACHED_TO{SubscriptionId: "8da31d20-daf9-42ad-bf7f-2cdcf6290001", batchId: 1717068002}]->(n2)
 $$) as (a agtype);

Copy link

This issue is stale because it has been open 60 days with no activity. Remove "Abondoned" label or comment or this will be closed in 14 days.

@github-actions github-actions bot added the Stale Stale issues/PRs label Oct 14, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
question Further information is requested Stale Stale issues/PRs
Projects
None yet
Development

No branches or pull requests

2 participants