QCheck2.Gen.list_size stackoverflow on big sizes #156

sir4ur0n · 2021-08-17T08:41:03Z

QCheck2.Gen.list_size crashes with Stack_overflow

I added a few tests with a sized list generator from issue #64 which crashes the QCheck2.Gen.list_size generator with Stack_overflow. I believe that this is because it isn't tail recursive like foldn underlying QCheck.Gen.list_size:
https://github.com/c-cube/qcheck/pull/153/files#diff-44df483bf5666cb8dd729f02820e34cec6c7b7199bedcb551d82fcfccdff8c05R160-R176
Can it be formulated with an accumulator instead?

Originally posted by @jmid in #153 (comment)

The text was updated successfully, but these errors were encountered:

sir4ur0n · 2021-08-17T08:51:14Z

I took a look at it yesterday and today.

⚠️ Disclaimer: I am not seasoned in performance/debugging in OCaml, so my investigations should be taken with a grain of salt. Any more experienced person is more than welcome to correct me or help me better analyze the problem 😁

Making list_size tail recursive is apparently not enough: I still had StackOverflow errors (but to my surprise, the error backtrace only contained 4 stack layers...):

Test lists equal to duplication failed:

ERROR: uncaught exception in generator for test lists equal to duplication after 100 steps:
Exception: Stack overflow
Backtrace: Raised by primitive operation at QCheck2.Gen.liftA2 in file "src/core/QCheck2.ml", line 247, characters 4-18
Called from QCheck2.Gen.ap in file "src/core/QCheck2.ml", line 239, characters 74-80
Called from QCheck2.Tree.bind in file "src/core/QCheck2.ml", line 191, characters 28-31
Called from QCheck2.Test.check_state in file "src/core/QCheck2.ml", line 1520, characters 12-32

Note that I added let () = Printexc.record_backtrace true before the test to get backtraces.

And this is weird to me, cause I am used in other languages to having a HUGE stack printed whenever a StackOverflow happens. I have no idea why in OCaml this happens...

That being said, I noticed that replacing list_size implementation (even tail recursive) with a for loop seems to work?

(* Current implementation on master.
   The stackoverflow contains a huge amount of recursive calls of QCheck2.Gen.ap which is called by liftA2 *)
  let list_size (size : int t) (gen : 'a t) : 'a list t =
    size >>= fun size ->
    let rec loop n =
      if n <= 0
      then pure []
      else liftA2 List.cons gen (loop (n - 1))
    in
    loop size

(* Tail recursive implementation, if I'm not mistaken.
   As explained above, there are only 4 layers in the backtrace, which I don't understand *)
  let list_size (size : int t) (gen : 'a t) : 'a list t =
    let rec loop n acc =
      if n <= 0
      then acc
      else (loop [@tailcall]) (n - 1) (liftA2 List.cons gen acc)
    in
    loop size (pure [])

(* For loop.
   Passes. *)
  let list_size (size : int t) (gen : 'a t) : 'a list t =
    fun st ->
    Tree.bind (size st) @@ fun size ->
    let acc = ref @@ Tree.pure [] in
    for _ = 0 to (size - 1) do
      acc := Tree.liftA2 List.cons (gen st) !acc
    done;
    assert (List.length (Tree.root !acc) = size);
    !acc

Since I am neither sure of my analysis nor of my for-loop solution, I'm waiting for feedback from @jmid @c-cube or any other experienced OCamler before submitting any kind of patch/PR.

jmid · 2021-08-18T09:33:22Z

I'm looking at this now.

First a quick comment: I would characterize this issue as useful (if not important).
Here's an example to illustrate adopted from
https://github.com/jmid/sm2-tes21/blob/main/lec06/lec06.ml
https://www.janmidtgaard.dk/quickcheck/slides/lec06.pdf

(* a recursive sum function *)
let rec sum xs = match xs with
  | [] -> 0
  | x::xs -> x + sum xs

(* a tail-recursive sum function, passing an accumulator *)
let sum' xs =
  let rec sum_local xs acc = match xs with
    | [] -> acc
    | x::xs -> sum_local xs (x+acc)
  in sum_local xs 0

We can use the Gen combinators to quickly test whether these functions are actually tail recursive:

utop # Gen.(generate1 (list_size (return 10) (return 1)));;
- : int list = [1; 1; 1; 1; 1; 1; 1; 1; 1; 1]
utop # sum Gen.(generate1 (list_size (return 200) (return 1)));;
- : int = 200
utop # sum Gen.(generate1 (list_size (return 2_000_000) (return 1)));;
Stack overflow during evaluation (looping recursion?).
utop # sum' Gen.(generate1 (list_size (return 200) (return 1)));;
- : int = 200
utop # sum' Gen.(generate1 (list_size (return 2_000_000) (return 1)));;
- : int = 2000000

Since sum' doesn't require 2 million stack frames to sum a 2-million element list it is probably tail recursive 😀

jmid · 2021-08-18T12:31:18Z

OK, I can confirm that the 3rd version passes. I also get Stack_overflow with the 2nd version (with an inserted size >>= fun size -> to make it type check):

  let list_size (size : int t) (gen : 'a t) : 'a list t =
    size >>= fun size ->
    let rec loop n acc =
      if n <= 0
      then acc
      else (loop [@tailcall]) (n - 1) (liftA2 List.cons gen acc)
    in
    loop size (pure [])

This difference also puzzled me. Then I noticed that your 3rd version loops over Tree.t rather than Gen.t like the 2nd. I therefore wrote a tail-recursive version corresponding to the for-loop over Tree.t:

  let list_size (size : int t) (gen : 'a t) : 'a list t =
    fun st ->
    Tree.bind (size st) @@ fun size ->
    let rec loop n acc =
      if n <= 0
      then acc
      else (loop [@tailcall]) (n - 1) (Tree.liftA2 List.cons (gen st) acc)
    in
    loop size (Tree.pure [])

This one also doesn't Stack_overflow.

So what is the difference between these two?
The last one builds up in a tail-recursive loop an 'a list Tree.t
whereas the previous one builds up in a tail-recursive loop an 'a list Gen.t.
Since 'a Gen.t = RS.t -> 'a Tree.t this means we build up a generator (without Stack_overflow) of type RS.t -> 'a Tree.t (stage 1) that may later raise Stack_overflow when it is later passed a Random.State (stage 2). As such, the accumulator (and currying) just pushed the problem to stage 2, where a non-tail recursive chain of Gen.liftA2 calls await a Random.State argument.

jmid · 2021-08-18T12:40:58Z

Regarding stack traces, like you I am able to produce short ones by running OCAMLRUNPARAM="b" dune exec test/core/QCheck2_expect_test.exe:

--- Failure --------------------------------------------------------------------

Test lists shorter than 432 failed:

ERROR: uncaught exception in generator for test lists shorter than 432 after 100 steps:
Exception: Stack overflow
Backtrace: Raised by primitive operation at QCheck2.Gen.liftA2 in file "src/core/QCheck2.ml", line 244, characters 4-19
Called from QCheck2.Gen.ap in file "src/core/QCheck2.ml", line 239, characters 74-80
Called from QCheck2.Tree.bind in file "src/core/QCheck2.ml", line 191, characters 28-31
Called from QCheck2.Test.check_state in file "src/core/QCheck2.ml", line 1549, characters 12-32


--- Failure --------------------------------------------------------------------

Test lists shorter than 4332 failed:

ERROR: uncaught exception in generator for test lists shorter than 4332 after 100 steps:
Exception: Stack overflow
Backtrace: Raised by primitive operation at QCheck2.Gen.liftA2 in file "src/core/QCheck2.ml", line 244, characters 4-19
Called from QCheck2.Gen.ap in file "src/core/QCheck2.ml", line 239, characters 74-80
Called from QCheck2.Tree.bind in file "src/core/QCheck2.ml", line 191, characters 28-31
Called from QCheck2.Test.check_state in file "src/core/QCheck2.ml", line 1549, characters 12-32
Called from QCheck2.Gen.(>|=) in file "src/core/QCheck2.ml", line 233, characters 18-25


--- Failure --------------------------------------------------------------------

Test lists equal to duplication failed:

ERROR: uncaught exception in generator for test lists equal to duplication after 100 steps:
Exception: Stack overflow
Backtrace: Raised by primitive operation at QCheck2.Gen.liftA2 in file "src/core/QCheck2.ml", line 244, characters 4-19
Called from QCheck2.Gen.ap in file "src/core/QCheck2.ml", line 239, characters 74-80
Called from QCheck2.Tree.bind in file "src/core/QCheck2.ml", line 191, characters 28-31
Called from QCheck2.Test.check_state in file "src/core/QCheck2.ml", line 1549, characters 12-32
Called from QCheck2.Gen.(>|=) in file "src/core/QCheck2.ml", line 233, characters 18-25
Called from QCheck2.Gen.liftA2 in file "src/core/QCheck2.ml", line 244, characters 4-19
Called from QCheck2.Gen.ap in file "src/core/QCheck2.ml", line 239, characters 74-80
Called from QCheck2.Tree.bind in file "src/core/QCheck2.ml", line 191, characters 28-31
Called from QCheck2.Test.check_state in file "src/core/QCheck2.ml", line 1549, characters 12-32

I cannot tell why more entries aren't included above. In F# I know currying can affect the number of lines negatively and that eta-expansion can help to include more 🤷‍♂️

jmid · 2021-08-18T15:29:52Z

OK, I just created a PR in #160 to address this.

I also just realized something: Both QCheck and QCheck2's Gen.list_size generates lists from the back (tail-to-front).
For QCheck2 this was not obvious from the implementation since it contained no explicit generator calls:

  let list_size (size : int t) (gen : 'a t) : 'a list t =
    size >>= fun size ->
    let rec loop n =
      if n <= 0
      then pure []
      else liftA2 List.cons gen (loop (n - 1))
    in
    loop size

As far as I can tell, this is a consequence of the current Gen.ap implementation (and right-to-left argument evaluation):

  let ap (f : ('a -> 'b) t) (x : 'a t) : 'b t = fun st -> Tree.ap (f st)  (x st)

  let (<*>) = ap

  let liftA2 (f : 'a -> 'b -> 'c) (a : 'a t) (b : 'b t) : 'c t =
    (a >|= f) <*> b

So what? For QCheck2 with integrated shrinking, this means that lists will be shrunk from the right(!)

I've confirmed the above by replacing the above ap definition by the following (which triggers a mismatch with the expected outputs):

  let ap (f : ('a -> 'b) t) (x : 'a t) : 'b t =
    fun st ->
      let fst = f st in
      let xst = x st in
      Tree.ap fst xst

I think this is another argument for a splittable RNG for QCheck2 😀

jmid · 2021-08-19T09:39:47Z

Here's an example to illustrate the point - a stateful generator:

utop # 
  let shrink_nil = fun _ -> Seq.empty

  let stateful_gen,reset_state =
    let open QCheck2 in
    let state = ref 0 in
    Gen.make_primitive
      ~gen:(fun _st ->
              let v = !state in
              incr state;
              v)
      ~shrink:shrink_nil,
    (fun () -> state := 0);;
val shrink_nil : 'a -> 'b Seq.t = <fun>
val stateful_gen : int QCheck2.Gen.t = <abstr>
val reset_state : unit -> unit = <fun>
utop # QCheck2.Gen.(generate1 (small_list stateful_gen));;
- : int list = [6; 5; 4; 3; 2; 1; 0]

Based on this, we can now observe how the derived size-based shrinker operates:

utop # 
  let () = reset_state ()
  let print_list xs = print_endline QCheck2.Print.(list int xs)
  let list_genorder_test =
    let open QCheck2 in
    Test.make ~name:"list genorder test" ~count:1 ~print:Print.(list int)
      (Gen.small_list stateful_gen) (fun xs -> print_list xs; xs=[]);;
val print_list : int list -> unit = <fun>
val list_genorder_test : QCheck2.Test.t = QCheck2.Test.Test <abstr>
utop # QCheck2.Test.check_exn list_genorder_test;;
[3; 2; 1; 0]
[]
[5; 4]
[]
[6]
[]
Exception:
test `list genorder test` failed on ≥ 1 cases: [6] (after 2 shrink steps)

c-cube · 2021-08-19T13:03:42Z

nice illustration, and a useful trick 🙂 . I'll have to remember that one.

jmid · 2021-08-19T19:16:59Z

So what? For QCheck2 with integrated shrinking, this means that lists will be shrunk from the right(!)

Actually this is not necessarily the case. Lists are generated right-to-left but the individual elements are at least shrunk left-to-right. It is however harder to observe how the size/spine-shrinker proceeds, as the element-generator is restarted in a different Random.State for each size in the size shrink tree (we really need that splittable RNG):

utop # 
let t = QCheck2.(Test.make ~print:Print.(list int)
                           Gen.(list small_int) (fun xs -> Print.(list int) xs |> print_endline; List.length xs < 4));;
val t : QCheck2.Test.t = QCheck2.Test.Test <abstr>
utop # QCheck_base_runner.run_tests [t];;
[6; 77; 5; 3; 8; 0; 5; 9; 8; 4; 5; 4; 2; 7; 62; 6; 9; 6; 0; 3; 50; 1; 6; 74; 3; 52; 7; 36; 5; 9; 37; 51; 2; 6; 7; 7; 1; 2; 6; 3; 4; 6; 2; 74; 8; 3; 8; 6; 9; 6; 3; 9; 9; 7; 8; 9; 3; 9; 3; 16; 1; 3; 3; 3; 2; 0; 8; 0; 2; 2; 3; 22; 1; 42; 4; 68; 86; 6; 7; 0; 3; 1; 5; 6; 6; 5; 57; 9; 6; 7; 6; 6; 8; 5; 7; 2; 7; 65; 7; 2; 6; 34; 5; 59; 3; 1; 6; 28; 6; 3; 63; 8; 62; 4; 6; 6; 92; 6; 3; 7; 6; 1; 8; 27; 3; 5; 2; 1; 3; 10; 2; 8; 7; 9; 8; 0; 9; 4; 2; 72; 9; 6; 8; 0; 5; 14; 3; 28; 3; 5; 1; 5; 44; 41; 5; 5; 0; 8; 55; 5; 1; 2; 5; 9; 8; 3; 8; 22; 7; 5; 8; 9; 25; 5; 7; 70; 3; 2; 6; 6; 1; 93; 1; 7; 7; 2; 4; 5; 6; 25; 2; 9; 5; 2; 3; 8; 2; 6; 4; 37; 18; 5; 88; 69; 7; 44; 5; 8; 6; 7; 51; 69; 9; 6; 6; 69; 6; 1; 26; 2; 4; 4; 45; 8; 7; 9; 2; 8; 68; 5; 0; 6; 6; 8; 80; 74; 4; 4; 9; 5; 97; 10; 41; 6; 44; 35; 60; 0; 7; 75]
[]
[28; 2; 4; 1; 5; 2; 5; 4; 6; 52; 3; 1; 69; 25; 6; 5; 4; 5; 8; 61; 86; 1; 0; 8; 81; 7; 1; 5; 85; 3; 98; 96; 65; 9; 3; 7; 15; 4; 7; 7; 20; 6; 4; 1; 8; 1; 8; 38; 1; 5; 4; 7; 1; 7; 3; 1; 6; 0; 1; 21; 2; 10; 2; 9; 9; 4; 18; 2; 3; 8; 9; 4; 2; 2; 5; 4; 3; 8; 2; 8; 1; 43; 8; 8; 6; 7; 8; 4; 3; 2; 42; 7; 68; 6; 1; 3; 0; 8; 7; 0; 0; 9; 54; 1; 8; 3; 40; 1; 6; 85; 8; 8; 0; 7; 9; 4; 4; 7; 8; 9; 59; 3; 8; 2; 7]
[]
[78; 3; 4; 9; 0; 5; 37; 9; 57; 29; 9; 0; 8; 6; 2; 4; 9; 0; 92; 9; 4; 92; 7; 0; 8; 8; 7; 0; 4; 9; 8; 74; 91; 9; 0; 3; 9; 9; 0; 6; 0; 3; 6; 7; 1; 15; 3; 8; 2; 0; 65; 84; 59; 9; 0; 2; 0; 9; 9; 3; 5; 86]
[]
[0; 4; 58; 0; 1; 9; 8; 1; 9; 2; 3; 4; 20; 84; 8; 79; 7; 7; 8; 2; 42; 3; 4; 97; 4; 2; 8; 44; 67; 81; 3]
[]
[0; 28; 3; 69; 9; 4; 8; 1; 6; 0; 3; 6; 1; 8; 73]
[]
[6; 9; 5; 5; 2; 3; 5]
[]
[46; 2; 5]
[7; 58; 2; 51; 0]
[]
[2; 5]
[4; 7; 3]
[2; 3; 8; 8]
[]
[71; 11]
[74; 1; 2]
[0; 3; 8; 8]
[0; 0; 8; 8]
[0; 0; 0; 8]
[0; 0; 0; 0]

--- Failure --------------------------------------------------------------------

Test anon_test_10 failed (11 shrink steps):

[0; 0; 0; 0]

Here, the individual elements from the first 4-element counter example [2; 3; 8; 8] are shrunk left-to-right.

jmid added a commit to jmid/qcheck that referenced this issue Aug 18, 2021

Fix stack_overflow in c-cube#156 and update expected output

57a5d42

jmid mentioned this issue Aug 18, 2021

Fix QCheck2.gen.list_size Stack overflow #160

Merged

jmid linked a pull request Sep 7, 2021 that will close this issue

Fix QCheck2.gen.list_size Stack overflow #160

Merged

jmid closed this as completed Sep 7, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

QCheck2.Gen.list_size stackoverflow on big sizes #156

QCheck2.Gen.list_size stackoverflow on big sizes #156

sir4ur0n commented Aug 17, 2021

sir4ur0n commented Aug 17, 2021

jmid commented Aug 18, 2021

jmid commented Aug 18, 2021

jmid commented Aug 18, 2021

jmid commented Aug 18, 2021

jmid commented Aug 19, 2021

c-cube commented Aug 19, 2021

jmid commented Aug 19, 2021

QCheck2.Gen.list_size stackoverflow on big sizes #156

QCheck2.Gen.list_size stackoverflow on big sizes #156

Comments

sir4ur0n commented Aug 17, 2021

sir4ur0n commented Aug 17, 2021

jmid commented Aug 18, 2021

jmid commented Aug 18, 2021

jmid commented Aug 18, 2021

jmid commented Aug 18, 2021

jmid commented Aug 19, 2021

c-cube commented Aug 19, 2021

jmid commented Aug 19, 2021