Perf/better quad ids #318

Closed
jeswr wants to merge 29 commits

@jeswr jeswr commented Jan 4, 2023

This is a draft PR that branches off #311 and improves the indexing of quads in the store by using the existing numeric ids of a quad's component terms to generate the id of the quad, as mentioned in #311 (comment).
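
The indexing idea described above can be illustrated with a minimal sketch. This is not the N3.js implementation: the `TermIdIndex` class, its method names, and the key format are hypothetical, and terms are modelled as plain objects with RDF/JS-style `termType` and `value` fields. It only shows how a quad's id can be derived from the numeric ids of its component terms rather than from a stringified form of the whole quad.

```javascript
// Hypothetical sketch, not the N3.js implementation.
class TermIdIndex {
  constructor() {
    this._ids = new Map(); // string key -> numeric id
    this._nextId = 1;
  }

  // Return the numeric id for a key, allocating a new one on first use.
  _idForKey(key) {
    let id = this._ids.get(key);
    if (id === undefined) {
      id = this._nextId++;
      this._ids.set(key, id);
    }
    return id;
  }

  // Terms are plain objects: { termType, value } or, for quads,
  // { termType: 'Quad', subject, predicate, object, graph }.
  idFor(term) {
    if (term.termType !== 'Quad')
      return this._idForKey(`${term.termType}:${term.value}`);
    // Resolve the component ids first (recursively for nested quads),
    // then key the quad by those numbers instead of by its full string.
    const parts = [term.subject, term.predicate, term.object, term.graph]
      .map(t => this.idFor(t));
    return this._idForKey(`.${parts.join('.')}`);
  }
}

// Usage: two structurally equal quads map to the same id.
const namedNode = value => ({ termType: 'NamedNode', value });
const quad = graphName => ({
  termType: 'Quad',
  subject: namedNode('s1'),
  predicate: namedNode('p1'),
  object: namedNode('o2'),
  graph: namedNode(graphName),
});
const index = new TermIdIndex();
console.log(index.idFor(quad('g')) === index.idFor(quad('g')));  // same quad, same id
console.log(index.idFor(quad('g')) === index.idFor(quad('g2'))); // different graph, different id
```

Quad keys start with `.` while term keys start with the term type, so in this sketch the two key spaces cannot collide.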

Note that there have been some changes to #311 since this PR was first opened, so merge with care.

@jeswr jeswr marked this pull request as ready for review January 5, 2023 04:45
Comment on lines +374 to +383
store.getQuads(
new Quad(new NamedNode('s1'), new NamedNode('p1'), new NamedNode('o2'), new NamedNode('g')),
new NamedNode('p1'),
null
).length.should.equal(1);
store.getQuads(
new Quad(new NamedNode('s1'), new NamedNode('p1'), new NamedNode('o2'), new NamedNode('g2')),
new NamedNode('p1'),
null
).length.should.equal(0);

To avoid unnecessary breakage downstream, I have written this so that nested quads whose graph term is not the default graph can still be added to the store (this was the existing behavior when we were just stringifying quad terms).

This is never necessary for RDF-star compliant quoted triples, so we may wish to break this behavior at some point.
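
As a rough illustration of the distinction drawn above, the following sketch checks whether a nested quad would count as an RDF-star style quoted triple, assuming that means its graph term is the default graph. The function `isStarCompliantQuotedTriple` is hypothetical and not part of N3.js; terms are again plain objects with RDF/JS-style `termType` fields.

```javascript
// Hypothetical helper, not part of N3.js.
// Assumption: a compliant quoted triple is a Quad term whose graph is the
// default graph, and whose nested quads (if any) satisfy the same rule.
function isStarCompliantQuotedTriple(term) {
  if (term.termType !== 'Quad') return false;
  if (term.graph.termType !== 'DefaultGraph') return false;
  // Quoted triples may themselves appear in subject or object position.
  return [term.subject, term.object].every(
    t => t.termType !== 'Quad' || isStarCompliantQuotedTriple(t));
}

// Usage with plain-object terms.
const nn = value => ({ termType: 'NamedNode', value });
const defaultGraph = { termType: 'DefaultGraph', value: '' };
const makeQuad = (s, p, o, g) =>
  ({ termType: 'Quad', subject: s, predicate: p, object: o, graph: g });

const inDefaultGraph = makeQuad(nn('s1'), nn('p1'), nn('o2'), defaultGraph);
const inNamedGraph = makeQuad(nn('s1'), nn('p1'), nn('o2'), nn('g'));
console.log(isStarCompliantQuotedTriple(inDefaultGraph)); // true
console.log(isStarCompliantQuotedTriple(inNamedGraph));   // false
```

A store that wanted to drop the legacy behavior could reject terms for which a check like this returns false. Whether that is desirable is the open question raised in the comment.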

Co-authored-by: Ted Thibodeau Jr <tthibodeau@openlinksw.com>
jeswr and others added 2 commits January 5, 2023 15:49
Co-authored-by: Ted Thibodeau Jr <tthibodeau@openlinksw.com>

jeswr commented Jan 5, 2023

This has a reasonable impact. With the script I just added, we get:

Main

N3Store performance test
- Adding 5153632 triples to the default graph: 4.302s
* Memory usage for triples: 363MB
- Finding all 5153632 triples in the default graph 484 times (0 variables): 12.121s
- Finding all 10648 triples in the default graph 968 times (1 variable subject): 2.264s
- Finding all 0 triples in the default graph 968 times (1 variable predicate): 0.613ms
- Finding all 22 triples in the default graph 1936 times (1 variable predicate): 1.256s
- Finding all 0 triples in the default graph 968 times (1 variable object): 1.331ms
- Finding all 22 triples in the default graph 1936 times (1 variable objects): 1.268s
- Finding all 484 triples in the default graph 484 times (2 variables): 608.382ms

Here

N3Store performance test
- Adding 5153632 triples to the default graph: 3.358s
* Memory usage for triples: 357MB
- Finding all 5153632 triples in the default graph 484 times (0 variables): 10.789s
- Finding all 10648 triples in the default graph 968 times (1 variable subject): 2.121s
- Finding all 0 triples in the default graph 968 times (1 variable predicate): 0.691ms
- Finding all 22 triples in the default graph 1936 times (1 variable predicate): 1.071s
- Finding all 0 triples in the default graph 968 times (1 variable object): 0.528ms
- Finding all 22 triples in the default graph 1936 times (1 variable objects): 1.066s
- Finding all 484 triples in the default graph 484 times (2 variables): 570.413ms
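
To put the two runs above side by side, the relative change can be computed directly from the reported timings. The snippet below is only a convenience sketch: the `percentReduction` helper and the row labels are made up for illustration, and the numbers are copied from the "Main" and "Here" outputs above.

```javascript
// Percentage reduction going from `before` to `after`
// (positive means the second run took less time).
function percentReduction(before, after) {
  return ((before - after) / before) * 100;
}

// Timings in seconds, copied from the two runs above.
const timings = [
  { label: 'add triples', main: 4.302, here: 3.358 },
  { label: 'find all (0 variables)', main: 12.121, here: 10.789 },
  { label: 'find (1 variable subject)', main: 2.264, here: 2.121 },
];

for (const { label, main, here } of timings)
  console.log(`${label}: ${percentReduction(main, here).toFixed(1)}% less time`);
```

For the add step this comes to roughly 22% less time, from a single run on each branch.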

jeswr commented Jan 5, 2023

With the existing N3Store-perf.js there is a bit of a performance hit, so it may be worth optimizing the function calls a bit, or using a parameter to change the way that we generate quad ids.

Main

$ node perf/N3Store-perf.js 128
N3Store performance test
- Adding 2097152 triples to the default graph: 745.913ms
* Memory usage for triples: 150MB
- Finding all 2097152 triples in the default graph 16384 times (0 variables): 3.333s
- Finding all 2097152 triples in the default graph 32768 times (1 variable): 647.21ms
- Finding all 2097152 triples in the default graph 49152 times (2 variables): 582.631ms

- Adding 1048576 quads: 473.093ms
* Memory usage for quads: 124MB
- Finding all 1048576 quads 131072 times: 448.73ms
N3 Store tests for sparsely connected entities
- Adding 1048576 with all different IRIs: 3.513s
* Retrieving all 1048576 quads: 611.111ms
* Retrieving single by subject: 1.535s
* Retrieving single by predicate: 1.523s
* Retrieving single by object: 1.904s
* Retrieving single by subject-predicate: 2.210s
* Retrieving single by subject-object: 2.043s
* Retrieving single by predicate-object: 2.087s
* Retrieving single by subject-predicate-object: 1.721s

Here

$ node perf/N3Store-perf.js 128
N3Store performance test
- Adding 2097152 triples to the default graph: 768.303ms
* Memory usage for triples: 150MB
- Finding all 2097152 triples in the default graph 16384 times (0 variables): 3.478s
- Finding all 2097152 triples in the default graph 32768 times (1 variable): 725.156ms
- Finding all 2097152 triples in the default graph 49152 times (2 variables): 689.219ms

- Adding 1048576 quads: 470.674ms
* Memory usage for quads: 124MB
- Finding all 1048576 quads 131072 times: 506.057ms
N3 Store tests for sparsely connected entities
- Adding 1048576 with all different IRIs: 3.514s
* Retrieving all 1048576 quads: 627.364ms
* Retrieving single by subject: 1.592s
* Retrieving single by predicate: 1.591s
* Retrieving single by object: 1.569s
* Retrieving single by subject-predicate: 1.806s
* Retrieving single by subject-object: 1.782s
* Retrieving single by predicate-object: 1.817s
* Retrieving single by subject-predicate-object: 1.771s

jeswr commented Jan 5, 2023

For N3StoreStarViews-perf.js

Main

N3Store performance test
- Adding 1073741824 triples to the default graph: 2.686s
* Memory usage for triples: 584MB
- Finding all 1073741824 triples in the default graph 4096 times (0 variables): 5.680s
- Finding all 262144 triples in the default graph 8192 times (1 variable subject): 1.876s
- Finding all 0 triples in the default graph 8192 times (1 variable predicate): 1.118ms
- Finding all 3 triples in the default graph 786432 times (1 variable predicate): 2.395s
- Finding all 0 triples in the default graph 8192 times (1 variable object): 1.386ms
- Finding all 3 triples in the default graph 786432 times (1 variable objects): 2.384s
- Finding all 9 triples in the default graph 262144 times (2 variables): 1.088s

Here

N3Store performance test
- Adding 1073741824 triples to the default graph: 2.370s
* Memory usage for triples: 554MB
- Finding all 1073741824 triples in the default graph 4096 times (0 variables): 5.102s
- Finding all 262144 triples in the default graph 8192 times (1 variable subject): 1.803s
- Finding all 0 triples in the default graph 8192 times (1 variable predicate): 8.72ms
- Finding all 3 triples in the default graph 786432 times (1 variable predicate): 2.378s
- Finding all 0 triples in the default graph 8192 times (1 variable object): 5.433ms
- Finding all 3 triples in the default graph 786432 times (1 variable objects): 2.354s
- Finding all 9 triples in the default graph 262144 times (2 variables): 1.032s

@jeswr jeswr mentioned this pull request Jan 11, 2023
@jeswr jeswr mentioned this pull request Mar 24, 2023
@jeswr jeswr closed this Mar 24, 2023