Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

High impact on cpu and memory when running deep taxonomy benchmark #1053

Closed
josd opened this issue Oct 15, 2021 · 22 comments
Closed

High impact on cpu and memory when running deep taxonomy benchmark #1053

josd opened this issue Oct 15, 2021 · 22 comments

Comments

@josd
Copy link
Contributor

josd commented Oct 15, 2021

While testing https://github.com/josd/retina/tree/cf0358f59cad43a8380ed828bf0c77c48bec547f with Scryer Prolog all examples and test cases in etc seem to work fine
but there is a practical concern with the high impact on cpu and memory when running the deep taxonomy benchmark
https://github.com/josd/retina/blob/cf0358f59cad43a8380ed828bf0c77c48bec547f/etc/dt.pl

$ /usr/bin/time -f "sec=%e0 kB=%M cpu=%P" scryer-prolog -g run retina.pl etc/dt.pl
namespace_prefix('rdf:','http://www.w3.org/1999/02/22-rdf-syntax-ns#').
namespace_prefix('etc:','https://josd.github.io/retina/etc/ns#').
'rdf:type'('etc:z','etc:n1').
'rdf:type'('etc:z','etc:n10').
'rdf:type'('etc:z','etc:n100').
'rdf:type'('etc:z','etc:n1000').
'rdf:type'('etc:z','etc:n10000').
sec=111.750 kB=973264 cpu=99%

So it takes almost 2 minutes and 1GB memory.
Compare this for instance with

$ /usr/bin/time -f "sec=%e0 kB=%M cpu=%P" tpl -g run retina.pl etc/dt.pl
namespace_prefix('rdf:','http://www.w3.org/1999/02/22-rdf-syntax-ns#').
namespace_prefix('etc:','https://josd.github.io/retina/etc/ns#').
'rdf:type'('etc:z','etc:n1').
'rdf:type'('etc:z','etc:n10').
'rdf:type'('etc:z','etc:n100').
'rdf:type'('etc:z','etc:n1000').
'rdf:type'('etc:z','etc:n10000').
sec=0.170 kB=19368 cpu=100%

This benchmark was designed by the late Harold Boley http://www.cs.unb.ca/~boley/
and is interesting to evaluate the subsumption capabilities of reasoners (using RDF and OWL)
e.g.
[[
SNOMED CT currently contains more than 300,000 medical concepts, divided into
hierarchies as diverse as body structure, clinical findings, geographic location and
pharmaceutical/biological product
]] -- https://searchhealthit.techtarget.com/definition/SNOMED-CT

@josd
Copy link
Contributor Author

josd commented Oct 15, 2021

So rdf:type is defined on the web at https://www.w3.org/1999/02/22-rdf-syntax-ns#type
i.e. prefixing type with namespace_prefix('rdf:','http://www.w3.org/1999/02/22-rdf-syntax-ns#').
For more information look at https://www.w3.org/TR/rdf-mt/

@josd
Copy link
Contributor Author

josd commented Oct 19, 2021

For the moment we have a workaround and use 'etc:isTypeOf' as 'owl:SymmetricProperty' of 'rdf:type'
https://raw.githubusercontent.com/josd/retina/28d42333c240585ee1dafdd24a679271ae1fac69/etc/dt.pl

So now we get

$ /usr/bin/time -f "sec=%e0 kB=%M cpu=%P" scryer-prolog -g run,halt retina.pl etc/dt.pl
namespace_prefix('rdf:','http://www.w3.org/1999/02/22-rdf-syntax-ns#').
namespace_prefix('etc:','https://josd.github.io/retina/etc/ns#').
'rdf:type'('etc:z','etc:n1').
'rdf:type'('etc:z','etc:n10').
'rdf:type'('etc:z','etc:n100').
'rdf:type'('etc:z','etc:n1000').
'rdf:type'('etc:z','etc:n10000').
sec=16.550 kB=1083564 cpu=99%

which is still a high impact on memory but a very good impact on cpu thanks to first argument indexing
and when the files are loaded the time is even excellent:

$ scryer-prolog retina.pl etc/dt.pl
?- use_module(library(time)).
   true.
?- time(run).
namespace_prefix('rdf:','http://www.w3.org/1999/02/22-rdf-syntax-ns#').
namespace_prefix('etc:','https://josd.github.io/retina/etc/ns#').
'rdf:type'('etc:z','etc:n1').
'rdf:type'('etc:z','etc:n10').
'rdf:type'('etc:z','etc:n100').
'rdf:type'('etc:z','etc:n1000').
'rdf:type'('etc:z','etc:n10000').
   % CPU time: 0.032 seconds
   % CPU time: 0.040 seconds
   true.
?-

@josd
Copy link
Contributor Author

josd commented Jan 25, 2022

It's been a while and I was recently testing our latest deep taxonomy benchmark which uses 300000 RDF subclasses (like in the above mentioned SNOMED CT) using the latest rebis-dev.

The good news is that it works, but the performance is to be improved before we can practically use it:

$ scryer-prolog dt.pl
?- use_module(library(time))
?- time('http://www.w3.org/1999/02/22-rdf-syntax-ns#type'(ELEMENT,'http://example.org/etc#N100000')).
   % CPU time: 433.831s
   ELEMENT = 'http://example.org/etc#z'

For Trealla we get
$ tpl dt.pl

?- time('http://www.w3.org/1999/02/22-rdf-syntax-ns#type'(ELEMENT,'http://example.org/etc#N100000')).
Time elapsed 0.118 secs
   ELEMENT = 'http://example.org/etc#z'.

I guess this is because there is no JIT indexing yet.

@mthom
Copy link
Owner

mthom commented Jan 25, 2022

That is definitely a bug. Only the second argument of each RDF clause is instantiated, right? First instantiated argument indexing should activate there.

@josd
Copy link
Contributor Author

josd commented Jan 25, 2022

Yes @mthom, the second argument of rdf:type is an rdfs:Class which is always instiated with one of the 300000 classes.

@josd josd reopened this Jan 25, 2022
@josd
Copy link
Contributor Author

josd commented Jan 25, 2022

Maybe related with the high impact on cpu is the high impact on memory:

$ /usr/bin/time -f "sec=%e0 kB=%M cpu=%P" scryer-prolog -g "'http://www.w3.org/1999/02/22-rdf-synta
x-ns#type'(ELEMENT,'http://example.org/etc#N100000'),write(ELEMENT),nl,halt" dt.pl
http://example.org/etc#z
sec=488.290 kB=2075216 cpu=99%

versus

$ /usr/bin/time -f "sec=%e0 kB=%M cpu=%P" tpl -g "'http://www.w3.org/1999/02/22-rdf-syntax-ns#type'
(ELEMENT,'http://example.org/etc#N100000'),write(ELEMENT),nl,halt" dt.pl
http://example.org/etc#z
sec=1.950 kB=160532 cpu=99%

which is 2.1 GB versus 161 MB.

@UWN
Copy link

UWN commented Jan 25, 2022

@josd: Trealla is structure sharing that might change a lot here. Rather compare it to a structure copying system like GNU, SICStus or SWI.

@triska
Copy link
Contributor

triska commented Jan 25, 2022

@mthom: I think the first argument is instantiated in the very first clause:

'http://www.w3.org/1999/02/22-rdf-syntax-ns#type'('http://example.org/etc#z','http://example.org/etc#N0').

So, first instantiated argument indexing will select the first argument for indexing.

If indexing is the cause of this slowdown, then it may be worth refining the indexing strategy a bit. But a more robust solution would probably be #192.

@josd
Copy link
Contributor Author

josd commented Jan 25, 2022

Trealla is structure sharing that might change a lot here. Rather compare it to a structure copying system like GNU, SICStus or SWI.

@UWN, SWI actually uses even less memory ?!

$ /usr/bin/time -f "sec=%e0 kB=%M cpu=%P" swipl -g "'http://www.w3.org/1999/02/22-rdf-syntax-ns#type'(ELEMENT,'http://example.org/etc#N100000'),writeq(ELEMENT),nl,halt" dt.pl
ERROR: /home/jdroo/github.com/josd/josd.github.io/temp/heiseneye/etc/dt.pl:300010:
ERROR:    No permission to assert procedure `'http://www.w3.org/1999/02/22-rdf-syntax-ns#type'/2'
ERROR: /home/jdroo/github.com/josd/josd.github.io/temp/heiseneye/etc/dt.pl:300011:
ERROR:    No permission to assert procedure `'http://www.w3.org/1999/02/22-rdf-syntax-ns#type'/2'
ERROR: /home/jdroo/github.com/josd/josd.github.io/temp/heiseneye/etc/dt.pl:300012:
ERROR:    No permission to assert procedure `'http://www.w3.org/1999/02/22-rdf-syntax-ns#type'/2'
ERROR: /home/jdroo/github.com/josd/josd.github.io/temp/heiseneye/etc/dt.pl:300013:
ERROR:    No permission to assert procedure `'http://www.w3.org/1999/02/22-rdf-syntax-ns#type'/2'
ERROR: /home/jdroo/github.com/josd/josd.github.io/temp/heiseneye/etc/dt.pl:300014:
ERROR:    No permission to assert procedure `'http://www.w3.org/1999/02/22-rdf-syntax-ns#type'/2'
'http://example.org/etc#z'
sec=5.250 kB=142064 cpu=90%

i.e. 142 MB for SWI versus 161 MB for Trealla.

The ERROR's are due to my stubborn nature to use => as logical implication like explained at Heiseneye reasoning and that conflicts with SWI's Single Sided Unification rules.
I argued/protested at https://swi-prolog.discourse.group/t/do-the-picat-operators-and-have-a-formal-name-in-swi-prolog/3594/6?u=josderoo but lost 😢

@triska
Copy link
Contributor

triska commented Jan 26, 2022

@josd: If you want to try a different strategy, you could start for example here:

let optimal_index = match Self::first_instantiated_index(&clauses) {

For instance, does it make a difference for your program if you, as an experimental modification, hard-code the second argument for indexing (for predicates that have at least 2 arguments)?

You could also consider implementing a heuristic that analyzes the clauses more thoroughly, especially for cases like yours where it is rather clear which argument should be chosen for indexing.

@josd
Copy link
Contributor Author

josd commented Jan 27, 2022

Thanks for the sugestions and I was trying a few tweaks during the past hours, but all I got was several thread 'main' panicked at 'out-of-memory' (on a system with 12 GB Mem plus 4 GB Swp) or even longer reasoning times

$ /usr/bin/time -f "sec=%e0 kB=%M cpu=%P" scryer-prolog -g run,halt ../heiseneye.pl dt.pl
'http://www.w3.org/1999/02/22-rdf-syntax-ns#type'('http://example.org/etc#z','http://example.org/etc#N10') => yes.
'http://www.w3.org/1999/02/22-rdf-syntax-ns#type'('http://example.org/etc#z','http://example.org/etc#N100') => yes.
'http://www.w3.org/1999/02/22-rdf-syntax-ns#type'('http://example.org/etc#z','http://example.org/etc#N1000') => yes.
'http://www.w3.org/1999/02/22-rdf-syntax-ns#type'('http://example.org/etc#z','http://example.org/etc#N10000') => yes.
'http://www.w3.org/1999/02/22-rdf-syntax-ns#type'('http://example.org/etc#z','http://example.org/etc#N100000') => yes.
sec=1746.610 kB=2076628 cpu=99%

My current reference is the N3 implementation https://github.com/josd/eye/tree/master/reasoning/dt using swipl and which takes about 1 sec

$ /usr/bin/time -f "sec=%e0 kB=%M cpu=%P" swipl -x dtpe.pvm -- "$@" --wcache http://josd.github.io/eye/reasoning .. --nope http://josd.github.io/eye/reasoning/dt/test-facts.n3 --query http://josd.github.io/eye/reasoning/dt/test-query.n3
eye --wcache http://josd.github.io/eye/reasoning .. http://josd.github.io/eye/reasoning/dt/test-dt.n3 --wcache http://josd.github.io/eye/reasoning .. --nope http://josd.github.io/eye/reasoning/dt/test-facts.n3 --query http://josd.github.io/eye/reasoning/dt/test-query.n3
EYE v22.0117.1432 josd
SWI-Prolog version 8.5.6-17-ge8a965252
starting 441 [msec cputime] 440 [msec walltime]
#Processed by EYE v22.0117.1432 josd
#eye --wcache http://josd.github.io/eye/reasoning .. http://josd.github.io/eye/reasoning/dt/test-dt.n3 --wcache http://josd.github.io/eye/reasoning .. --nope http://josd.github.io/eye/reasoning/dt/test-facts.n3 --query http://josd.github.io/eye/reasoning/dt/test-query.n3

GET http://josd.github.io/eye/reasoning/dt/test-facts.n3 FROM ../dt/test-facts.n3 SC=2
GET http://josd.github.io/eye/reasoning/dt/test-query.n3 FROM ../dt/test-query.n3 SC=1
networking 134 [msec cputime] 139 [msec walltime]
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX : <http://eulersharp.sourceforge.net/2009/12dtb/test#>
PREFIX e: <http://eulersharp.sourceforge.net/2003/03swap/log-rules#>

:TestVariable a :A2.

reasoning 473 [msec cputime] 503 [msec walltime]
#2022-01-27T00:28:10.692Z in=300004 out=1 ent=1 step=1 brake=1 inf=1718597 sec=1.048 inf/sec=1639883
#ENDS

2022-01-27T00:28:10.692Z in=300004 out=1 ent=1 step=1 brake=1 inf=1718597 sec=1.048 inf/sec=1639883

sec=1.090 kB=315416 cpu=99%

@mthom
Copy link
Owner

mthom commented Jan 27, 2022

If I omit the first clause from the RDF predicate and leave the rest as-is, I get an out of memory error! That triggers it to build a 299,999 item index, which does not happen if just the first argument is instantiated in the first clause.

Ideally, this code should be generated:

try_me_else(N).
... % unify X1 to first atom argument.
proceed.
trust_me.
switch_on_term(2, ...). % switch on second argument in remaining 299,999 clauses.
...

@josd
Copy link
Contributor Author

josd commented Jan 27, 2022

Exactly and similarly for instance with

'http://example.org/etc#isTypeOf'('http://example.org/etc#N0','http://example.org/etc#z').
'http://www.w3.org/1999/02/22-rdf-syntax-ns#type'(X,Y) :- 'http://example.org/etc#isTypeOf'(Y,X).

or

'http://example.org/etc#N0'('http://example.org/etc#z').
'http://www.w3.org/1999/02/22-rdf-syntax-ns#type'(X,'http://example.org/etc#N0') :- 'http://example.org/etc#N0'(X).

instead of

'http://www.w3.org/1999/02/22-rdf-syntax-ns#type'('http://example.org/etc#z','http://example.org/etc#N0').

@josd
Copy link
Contributor Author

josd commented Jan 28, 2022

For the time being in Heisseneye I am working around rdf:type and webizing classes as unary predicates like 'http://example.org/etc#Woman'('http://example.org/etc#Emmy_Noether').

@mthom
Copy link
Owner

mthom commented Feb 12, 2022

The deep taxonomy benchmark should work properly on rebis-dev now.

@josd
Copy link
Contributor Author

josd commented Feb 12, 2022

Thank you very much @mthom and for the deep taxonomy benchmark with 30000 classes we now get

$ /usr/bin/time -f "sec=%e0 kB=%M cpu=%P" scryer-prolog -g run,halt ../heiseneye.pl dt.pl
'http://www.w3.org/1999/02/22-rdf-syntax-ns#type'('http://example.org/etc#z','http://example.org/etc#N10') => yes.
'http://www.w3.org/1999/02/22-rdf-syntax-ns#type'('http://example.org/etc#z','http://example.org/etc#N100') => yes.
'http://www.w3.org/1999/02/22-rdf-syntax-ns#type'('http://example.org/etc#z','http://example.org/etc#N1000') => yes.
'http://www.w3.org/1999/02/22-rdf-syntax-ns#type'('http://example.org/etc#z','http://example.org/etc#N10000') => yes.
sec=5.640 kB=225360 cpu=99%

and for 300000 classes it is now a nice linear performance

$ /usr/bin/time -f "sec=%e0 kB=%M cpu=%P" scryer-prolog -g run,halt ../heiseneye.pl dt.pl
'http://www.w3.org/1999/02/22-rdf-syntax-ns#type'('http://example.org/etc#z','http://example.org/etc#N10') => yes.
'http://www.w3.org/1999/02/22-rdf-syntax-ns#type'('http://example.org/etc#z','http://example.org/etc#N100') => yes.
'http://www.w3.org/1999/02/22-rdf-syntax-ns#type'('http://example.org/etc#z','http://example.org/etc#N1000') => yes.
'http://www.w3.org/1999/02/22-rdf-syntax-ns#type'('http://example.org/etc#z','http://example.org/etc#N10000') => yes.
'http://www.w3.org/1999/02/22-rdf-syntax-ns#type'('http://example.org/etc#z','http://example.org/etc#N100000') => yes.
sec=49.890 kB=2159432 cpu=99%

@josd josd closed this as completed Feb 12, 2022
@josd josd reopened this Feb 17, 2022
@josd
Copy link
Contributor Author

josd commented Feb 17, 2022

Looking at some historical results I saw that they were done via forward chaining and so I changed dt.pl using gendt.pl by adapting line 15 and line 25.
The current results for Scryer are:

depth [sec]
1000 6.340
10000 403.450
100000 38166.890

Also the memory usage for depth 100000 is quite huge kB=7374872.

So for Scryer and depth 100000 we currently have

$ /usr/bin/time -f "sec=%e0 kB=%M cpu=%P" scryer-prolog -g run,halt webeye.pl examples/dt.pl
'http://www.w3.org/1999/02/22-rdf-syntax-ns#type'('http://example.org/ns#z','http://example.org/ns#N100000') => true.
sec=38166.890 kB=7374872 cpu=99%

and we compare it with what we currently have in eye/reasoning/dt

$ /usr/bin/time -f "sec=%e0 kB=%M cpu=%P" ./test --quiet
eye --wcache http://josd.github.io/eye/reasoning .. http://josd.github.io/eye/reasoning/dt/test-dt.n3 --quiet --wcache http://josd.github.io/eye/reasoning .. --nope http://josd.github.io/eye/reasoning/dt/test-facts.n3 --query http://josd.github.io/eye/reasoning/dt/test-query.n3
EYE v22.0128.1551 josd
SWI-Prolog version 8.5.7-8-g7f1517ec1
starting 281 [msec cputime] 278 [msec walltime]
GET http://josd.github.io/eye/reasoning/dt/test-facts.n3 FROM ../dt/test-facts.n3 SC=2
GET http://josd.github.io/eye/reasoning/dt/test-query.n3 FROM ../dt/test-query.n3 SC=1
networking 2 [msec cputime] 2 [msec walltime]
reasoning 4835 [msec cputime] 4840 [msec walltime]
2022-02-17T10:45:49.426Z in=300004 out=1 ent=299999 step=599998 brake=2 inf=79218192 sec=5.118 inf/sec=15478349

sec=5.130 kB=224132 cpu=99%

@josd
Copy link
Contributor Author

josd commented Mar 7, 2022

We are now moving back to backward chaining and the performance is fine.

@josd josd closed this as completed Mar 7, 2022
@mthom
Copy link
Owner

mthom commented Mar 7, 2022

I'm highly certain the previous blow up was caused by the lack of deep indexing.

@josd
Copy link
Contributor Author

josd commented Mar 7, 2022

Me too and in SWI it is indeed like n*log(n).
Anyhow, we can live with the current indexing in Scryer and one day it might indeed support deep indexing 😉
Thank you very much @mthom

@mthom
Copy link
Owner

mthom commented Mar 8, 2022

Anyhow, we can live with the current indexing in Scryer and one day it might indeed support deep indexing

I hope so! There's quite a lot to do. I wonder if writing developer documents on the Rust side of Scryer would encourage, e.g., work on that front. I have an initial document I'm now hammering out/procrastinating on explaining how instructions.rs is generated.

@josd
Copy link
Contributor Author

josd commented Mar 8, 2022

That sounds like the best news in years @pbonte and @RubenVerborgh

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants