Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Memory overflows cannot be caught #226

Open
UWN opened this issue May 14, 2022 · 17 comments
Open

Memory overflows cannot be caught #226

UWN opened this issue May 14, 2022 · 17 comments

Comments

@UWN
Copy link

UWN commented May 14, 2022

What is really nice is that you have chosen current_prolog_flag(max_arity, unbounded)!

?- length(_,E), N is 2^E, writeq(N),nl, catch(functor(F,f,N), Error, true), nonv
ar(Error).
1
2
4
8
16
32
64
128
256
512
1024
2048
4096
8192
16384
32768
65536
131072
262144
524288
1048576
2097152
4194304
8388608
16777216
33554432
67108864
134217728
fatal error: runtime: out of memory

Expected: Error = error(resource_error(memory), _Imp_def).

@ichiban
Copy link
Owner

ichiban commented Dec 18, 2022

I've been digging into Go issues/proposals related to OOM but it looks like there's no way to tell if allocation succeeds without OOM for sure.

We'd be better off just setting a sensible max_arity, let's say, 1023 in the track of scryer.

$ scryer-prolog 
?- current_prolog_flag(max_arity, X).
   X = 1023.

@UWN
Copy link
Author

UWN commented Dec 19, 2022

The limited arity is not the best feature of Scryer. SWI's arity is unbounded too. And just replace functor(F,f,N) by length(F, N) to see that the problem would not go away by this.

@ichiban
Copy link
Owner

ichiban commented Dec 20, 2022

I see. We should be able to mitigate the risk by limiting max_arity in the case of functor(F,f,N) and if we take the route, we should limit the length of lists as well, which is not ideal.

Why don't we let it crash, call it a limitation of this processor, and ask users to call functor/3 and length/2 with care?

@UWN
Copy link
Author

UWN commented Dec 20, 2022

Why don't we let it crash, call it a limitation of this processor,

You can always resort to that at least for standard conformity. For that matter, you can even claim exit 1 to be a conforming system. See this for more. But by this you are reducing the domain of applicability of your system. Non-termination and thus such overflows are happening all the time to (at least) beginners. And a crashing system isn't helpful in such situations.

and ask users to call functor/3 and length/2 with care?

Limiting max_arity means that you are putting the burden onto the programmer. So the programs have to be more complex than otherwise necessary. Instead of a simple arg/3 one has to program something more complex.

Well, it is up to you to decide which way you want to go.

@ichiban
Copy link
Owner

ichiban commented Dec 24, 2022

Instead of allocating a huge chunk of memory right away, we can gradually allocate a large enough chunk for the arguments that are actually referred.

In sparse-compounds branch, I implemented *engine.sparseCompound, an experimental representation for such compounds. It's a compound backed by sparse vector. It'll still crash when a large enough number of arguments are actually used but it doesn't in this particular query.

Also, I implemented a sparse list as well. So, it can handle length(_,E), N is 2^E, writeq(N),nl, catch(length(F,N), Error, true), nonvar(Error)., too.

@UWN Do you think this will help in the cases of such overflows?

$ go version
go version go1.19.2 darwin/arm64
$ git checkout sparse-compounds
Already on 'sparse-compounds'
Your branch is up to date with 'origin/sparse-compounds'.
$ go install github.com/ichiban/prolog/cmd/1pl
$ $(go env GOPATH)/bin/1pl
Top level for ichiban/prolog (devel)
This is for testing purposes only!
See https://github.com/ichiban/prolog for more details.
Type Ctrl-C or 'halt.' to exit.
?- length(_,E), N is 2^E, writeq(N),nl, catch(functor(F,f,N), Error, true), nonv
ar(Error).
1
2
4
8
16
32
64
128
256
512
1024
2048
4096
8192
16384
32768
65536
131072
262144
524288
1048576
2097152
4194304
8388608
16777216
33554432
67108864
134217728
268435456
536870912
1073741824
2147483648
4294967296
8589934592
17179869184
34359738368
68719476736
137438953472
274877906944
549755813888
1099511627776
2199023255552
4398046511104
8796093022208
17592186044416
35184372088832
70368744177664
140737488355328
281474976710656
562949953421312
1125899906842624
2251799813685248
4503599627370496
9007199254740992
18014398509481984
36028797018963968
72057594037927936
144115188075855872
288230376151711744
576460752303423488
1152921504606846976
2305843009213693952
4611686018427387904
2022/12/24 11:46:07 error(evaluation_error(int_overflow),is/2)
?- 

@UWN
Copy link
Author

UWN commented Dec 25, 2022

Replace functor(F,f,N) by length(L,N) to see why this does not help.

@ichiban
Copy link
Owner

ichiban commented Dec 26, 2022

With length(L,N), L is unified with a newly introduced *sparseList as well. It'll delay memory allocation until its element is unified.

$ $(go env GOPATH)/bin/1pl
Top level for ichiban/prolog (devel)
This is for testing purposes only!
See https://github.com/ichiban/prolog for more details.
Type Ctrl-C or 'halt.' to exit.
?- length(_,E), N is 2^E, writeq(N),nl, catch(length(L,N), Error, true), nonvar(
Error).
1
2
4
8
16
32
64
128
256
512
1024
2048
4096
8192
16384
32768
65536
131072
262144
524288
1048576
2097152
4194304
8388608
16777216
33554432
67108864
134217728
268435456
536870912
1073741824
2147483648
4294967296
8589934592
17179869184
34359738368
68719476736
137438953472
274877906944
549755813888
1099511627776
2199023255552
4398046511104
8796093022208
17592186044416
35184372088832
70368744177664
140737488355328
281474976710656
562949953421312
1125899906842624
2251799813685248
4503599627370496
9007199254740992
18014398509481984
36028797018963968
72057594037927936
144115188075855872
288230376151711744
576460752303423488
1152921504606846976
2305843009213693952
4611686018427387904
2022/12/26 09:44:27 error(evaluation_error(int_overflow),is/2)
?- 

@UWN
Copy link
Author

UWN commented Dec 26, 2022

I don't see your point at all: It seems you are adding quite some complexity to your implementation for handling extremely rare cases. What would be much more helpful (and that was my motivation to give these examples) is to signal actual resource errors such that they can be caught and recovered.

Just add maplist(=(_),L)to my query.

@ichiban
Copy link
Owner

ichiban commented Dec 26, 2022

Can any of existing implementations handle these queries and raise resource_error(memory)?

As far as I know, when there's no enough memory, OSs overcommit memory and return unusable memory chunks on malloc(). When the program tries to use the chunk, OOM Killer kills the process. The program can't catch and/or recover the error.

@UWN
Copy link
Author

UWN commented Dec 26, 2022

SICStus and SWI. For SICStus, you have to ulimit -v memory first (under Linux), otherwise you get roughly the behavior you describe. For SWI, there is a relatively small default limit of 1GB or so. SICStus produces resource_error(memory) and SWI resource_error(stack). Both with an error/2 term around, indeed.

@ichiban
Copy link
Owner

ichiban commented Dec 29, 2022

Thank you for the examples. I think our implementation is more like SICStus except Go runtime doesn't let us handle the case of runtime: out of memory. SWI takes a totally different approach of memory allocation but we can learn from them and set an internal memory limit for our implementation.

Go has a soft memory limit so I'm going to limit allocation of large chunks of memory to respect the limit.

@UWN
Copy link
Author

UWN commented Dec 29, 2022

You cannot ulimit -v Go.

Another way to (partially) address this problem is to provide some mechanism to limit the number of inferences. This helps, at least for the typical cases of non-termination beginners make.

@ichiban
Copy link
Owner

ichiban commented Dec 30, 2022

You can, but Go runtime immediately exit(2) when it reached to the limit.

I'm preparing a fix to respect Go's builtin soft memory limit in #279. You can set the limit with an environment variable GOMEMLIMIT. functor/3 and length/2 will respect the limit and when they are likely to breach the limit, they'll throw error(resource_error(memory), _).

$ GOMEMLIMIT=500MiB $(go env GOPATH)/bin/1pl
Top level for ichiban/prolog (devel)
This is for testing purposes only!
See https://github.com/ichiban/prolog for more details.
Type Ctrl-C or 'halt.' to exit.
?- length(_,E), N is 2^E, writeq(N),nl, catch(functor(F,f,N), Error, true), nonv
ar(Error).
1
2
4
8
16
32
64
128
256
512
1024
2048
4096
8192
16384
32768
65536
131072
262144
524288
1048576
2097152
4194304
8388608
16777216
E = 24,
Error = error(resource_error(memory),functor/3),
F = _16780056,
N = 16777216

When combined with ulimit -v, Go's official document recommends a 5-10% buffer.

In this case, a good rule of thumb is to leave an additional 5-10% of headroom to account for memory sources the Go runtime is unaware of.
https://tip.golang.org/doc/gc-guide

@ichiban
Copy link
Owner

ichiban commented Dec 31, 2022

Merged the fix and released as v0.15.1.

$ go install github.com/ichiban/prolog/cmd/1pl@latest
go: downloading github.com/ichiban/prolog v0.15.1
$ GOMEMLIMIT=500MiB $(go env GOPATH)/bin/1pl
Top level for ichiban/prolog v0.15.1
This is for testing purposes only!
See https://github.com/ichiban/prolog for more details.
Type Ctrl-C or 'halt.' to exit.
?- length(_,E), N is 2^E, writeq(N),nl, catch(functor(F,f,N), Error, true), nonv
ar(Error).
1
2
4
8
16
32
64
128
256
512
1024
2048
4096
8192
16384
32768
65536
131072
262144
524288
1048576
2097152
4194304
8388608
16777216
E = 24,
Error = error(resource_error(memory),functor/3),
F = _16780056,
N = 16777216.
?- length(_,E), N is 2^E, writeq(N),nl, catch(length(L,N), Error, true), nonvar(
Error).
1
2
4
8
16
32
64
128
256
512
1024
2048
4096
8192
16384
32768
65536
131072
262144
524288
1048576
2097152
4194304
8388608
16777216
E = 24,
Error = error(resource_error(memory),length/2),
L = _33559888,
N = 16777216.
?- 

@UWN
Copy link
Author

UWN commented Jun 5, 2023

Here is a case where this limit is not respected:

ulrich@p0:/opt/gupu/ichiban-prolog$ go version
go version go1.20.4 linux/amd64
ulrich@p0:/opt/gupu/ichiban-prolog$ GOMEMLIMIT=1MiB $(go env GOPATH)/bin/1pl
Top level for ichiban/prolog v1.1.0
This is for testing purposes only!
See https://github.com/ichiban/prolog for more details.
Type Ctrl-C or 'halt.' to exit.
?- length(L,8).
L = [_65,_66,_67,_68,_69,_70,_71,_72];
?- length(L,16).
2023/06/05 12:17:38 error(resource_error(memory),length/2)  % that is perfect!
?- append(L,_,_),length(L,16).
L = [_85,_90,_95,_100,_105,_110,_115,_120,_125,_130,_135,_140,_145,_150,_155,_160].  % why now?
?- append(L,_,_),length(L,1024),!,fail.
false. % even larger
?- append(L,_,_),length(L,32768),!,fail.
   running. % still running for more than 15'
   false. % finally failed after 28'

@ichiban
Copy link
Owner

ichiban commented Jun 21, 2023

This is because the memory overflow detection currently implemented can only detect obvious cases such as length(L, 1024). before it exceeds the soft memory limit.

Since append/3 is implemented as below (actually, it's in Go but essentially same). We allocate memory for many tiny chunks for ./2 and they're undetected.

append([], L, L).
append([X|L1], L2, [X|L3]) :- append(L1, L2, L3).

Maybe we should implement another layer of memory overflow detection for tiny memory allocations which detects overflows after allocating a tiny object and exceeding the soft memory limit.

@UWN
Copy link
Author

UWN commented Jun 22, 2023

There is no need to have very precise overflow detection. It could be performed every nth inference only.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants