Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix issue 929 #1209

Draft
wants to merge 2 commits into
base: next
Choose a base branch
from
Draft

Fix issue 929 #1209

wants to merge 2 commits into from

Conversation

Halbaroth
Copy link
Collaborator

This PR fixes the issue #929 about model arrays.

Our implementation of ArrayEx is complete but we do not split on literals a = b where a and b are any arrays of the problem. Doing these splits during the search proof would be a bad idea because we do not need them to be complete. In the example of the issue #929, we obtain an equality of the form (store a1 x y) = (store a2 x y) and we know that (select a1 x) = (select a2 x). We never try to decide a1 = a2.

Notice that I update the set of arrays by tracking literals Eq of origin Subst. We know that this design is fragile because it could happen that substitutions are not sent to theories. The worst that can happen is to produced wrong models for arrays in very tricky cases. A better implementation consists in using our new global domains system to track selects in Adt_rel and use the keys of this map instead of arrays. I did not opt to this implementation to keep this PR simple and atomic.

@Halbaroth Halbaroth added bug models This issue is related to model generation. labels Aug 14, 2024
@Halbaroth Halbaroth linked an issue Aug 14, 2024 that may be closed by this pull request
This PR fixes the issue OCamlPro#929 about model arrays.
Our implementation of `ArrayEx` is complete but we do not split on
literals `a = b` where `a` and `b` are any arrays of the problem. Doing
these splits during the search proof would be a bad idea because we do
not need them to be complete. In the example of the issue OCamlPro#929, we
obtain an equality of the form `(store a1 x y) = (store a2 x y)` and
we know that `(select a1 x) = (select a2 x)`. We never try to decide `a1
= a2`.

Notice that I update the set of arrays by tracking literals `Eq` of
origin `Subst`. We know that this design is fragile because it could
happen that substitutions are not sent to theories. The worst that can
happen is to produced wrong models for arrays in very tricky cases.
A better implementation consists in using our new global domains system to
track `selects` in `Adt_rel` and use the keys of this map instead of
`arrays`. I did not opt to this implementation to keep this PR simple
and atomic.
Copy link
Collaborator

@bclement-ocp bclement-ocp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand the changes to the arrays_rel module to allow splitting on array equality. I am not sure I understand the changes to the uf module; at least I don't understand what the expected behavior change is.

let two = Numbers.Q.from_int 2

let case_split env _uf ~for_model:_ =
let two = Numbers.Q.from_int 3
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please don't write let two = 3; I would like to keep what remains of my sanity intact.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happened??? I have never touch this piece of code!

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd wager on a misplaced C-a keystroke 😉

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's my guess too. Actually I do not use this feature, may be I should disable these keystrokes.

Comment on lines +436 to +445
let (let*) = Option.bind

let split_indices env =
let* cs = LR.Set.choose_opt env.split in
Some (LR.neg cs)

let next_split ~for_model env =
match split_indices env with
| None -> split_arrays ~for_model env
| cs -> cs
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a somewhat verbose way of writing:

let next_split ~for_model env =
  match LR.Set.choose_opt env.split with
  | Some cs -> Some (LR.neg cs)
  | None -> split_arrays ~for_model env

symbol. *)
}

let mk () = {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's call this create; it is the de-facto standard name in OCaml for this type of function.

Copy link
Collaborator Author

@Halbaroth Halbaroth Aug 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Honestly, both conventions are used in this codebase and other ones. I do not care about these kind of details.

Copy link
Collaborator Author

@Halbaroth Halbaroth Aug 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uhm I guess you meant this is the convention in the stdlib for mutable data structures, indeed.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uhm I guess you meant this is the convention in the stdlib for mutable data structures, indeed.

Exactly this

type t = {
arrays : (Expr.t, unit) Hashtbl.t;
(** Set of arrays for which we need to generate a model value.
Notice that this set also contains embedded arrays. *)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you clarify what you mean by an "embedded array"?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, using Expr.t as keys to an Hashtbl.t is wrong. The hashing function from Expr should be used with Hashtbl.Make(Expr).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also don't understand why this new table is useful. For any key t here I believe that the representative of t is always guaranteed to be in the selects table; can't we just use the selects table directly?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By embedded arrays I mean an array used in a payload of an ADT. Currently, we have a test embedded-array.models.smt2 to ensure we produce correct models for these kind of expressions.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess we can use selects table directly because we always add an array, even if there is no select term using it in our problem.

Notice that this set also contains embedded arrays. *)

selects: (X.r, (Expr.t, Expr.t) Hashtbl.t) Hashtbl.t;
(** Contains all the accesses to arrays. *)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: This comment should explain what an entry in this table looks like. I know this comment is moved from an existing location, but it does not help understand the meaning of the table.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, using X.r as a key to a Hashtbl.t is wrong. The Shostak.HX module should be used instead.


let get_abstract_for abstracts_cache env (t : Expr.t) =
let retrieve_selects cache env (t : Expr.t) =
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not clear from its name that this function can raise Not_found. If this is intentional, this function should be called retrieve_selects_exn instead (I'd even call it find_selects_exn; retrieve usually has an implication that work needs to be performed to construct the data, e.g. fetch it from the network).

let r, _ = find env t in
match Hashtbl.find cache.selects r with
| exception Not_found ->
Hashtbl.add cache.selects r (Hashtbl.create 17)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the comment from the previous moved code is valuable and could be kept here.

selects: (X.r, (Expr.t, Expr.t) Hashtbl.t) Hashtbl.t;
(** Contains all the accesses to arrays. *)

abstracts: (r, Expr.t) Hashtbl.t;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto; this should use Shostak.HX instead of Hashtbl.t.

List.fold_left
(fun env l ->
match l with
| Xliteral.Eq (r1, r2), _, _, Th_util.Subst when is_array r2 ->
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: ideally we would ignore Sy.Set arrays here but they can be UF representatives so I don't think we can :(

match SX.choose env.arrays with
| exception Not_found -> None
| a1 ->
match SX.find_first (fun a -> not @@ X.equal a a1) env.arrays with
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to take into consideration the types of arrays. We must not create an equality literal between arrays of different types. Probably the arrays table needs to be segregated by both index and value types.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, the current implementation loops trying to assert the same disequality. See for instance:

(set-logic ALL)
(set-option :produce-models true)
(declare-const a (Array Int Int))
(declare-const b (Array Int Int))
(declare-const c (Array Int Int))
(declare-const i Int)
(assert (= (select a i) (select b i)))
(assert (= (select a i) (select c i)))
(check-sat)
(get-model)

With -d split we can see that we loop repeatedly selecting b <> a as a split during model generation. Uf.already_distinct would need to be checked.

@Halbaroth Halbaroth marked this pull request as draft August 22, 2024 11:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug models This issue is related to model generation.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Wrong model for arrays
2 participants