Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make meet in AddressDomain more precise #1468

Open
wants to merge 7 commits into
base: master
Choose a base branch
from
Open

Conversation

michael-schwarz
Copy link
Member

@michael-schwarz michael-schwarz commented May 15, 2024

Currently, in the address domain inside the buckets, for the meets we use ProjectiveSetPairwiseMeet

let meet m1 m2 =
let meet_buckets b1 b2 acc =
B.fold (fun e1 acc ->
B.fold (fun e2 acc ->
if B.may_be_equal e1 e2 then
add e1 (add e2 acc)
else
acc
) b2 acc
) b1 acc
in
fold_buckets (fun _ b1 acc ->
fold_buckets (fun _ b2 acc ->
meet_buckets b1 b2 acc
) m2 acc
) m1 (empty ())

where B.may_be_equal delegates to

let may_be_equal a b = Option.value (Addr.semantic_equal a b) ~default:true

Addr.semantic_equal given by

let semantic_equal x y = match x, y with
| Addr x, Addr y -> Mval.semantic_equal x y
| StrPtr s1, StrPtr s2 -> SD.semantic_equal s1 s2
| NullPtr, NullPtr -> Some true
| UnknownPtr, UnknownPtr
| UnknownPtr, Addr _
| Addr _, UnknownPtr
| UnknownPtr, StrPtr _
| StrPtr _, UnknownPtr -> None
| _, _ -> Some false

which calls SD.sematic_equal

let semantic_equal x y =
match x, y with
| None, _
| _, None -> Some true
| Some a, Some b -> if a = b then None else Some false

this is needed to handle all sorts of different offsets that may be semantically equal, i.e., evaluate to the same physical address. In this case it keeps both elements.

After some discussion with @jerhard and @hseidl, we reached the following insights:

This behavior is overly conservative, as the mess of the address lattice contains some sub-lattices where the meet operation is semantically correct.

This now checks via a new predicate amenable_to_meet whether for the two addresses a and b this relationship between values and the meet holds. If this is the case, a meet is performed and the element added to the resulting set only if it is distinct from bot.

Currently, we answer amenable_to_meetby true whenever both arguments are strings, or when the both are addresses that differ in the numeric(!) offsets only.

This PR also adds two tests where this yields additional precision.

Closes #1467

@michael-schwarz michael-schwarz changed the title Make meet in AddressDomain more Precise Make meet in AddressDomain more precise May 15, 2024
@michael-schwarz michael-schwarz requested a review from sim642 May 15, 2024 14:17
@michael-schwarz michael-schwarz marked this pull request as ready for review May 15, 2024 14:17
src/cdomain/value/cdomains/addressDomain_intf.ml Outdated Show resolved Hide resolved
src/domain/disjointDomain.ml Outdated Show resolved Hide resolved
@@ -110,6 +110,11 @@ struct
| StrPtr _, UnknownPtr -> None
| _, _ -> Some false

let amenable_to_meet x y = match x,y with
| StrPtr _, StrPtr _ -> true
| Addr x, Addr y when Mval.equal (Mval.top_indices x) (Mval.top_indices y) -> true
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't the address set buckets already guarantee this to only be called in such cases because Mval.top_indices are the representatives?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even if the buckets guarantee this already, this function can be called with two arbitrary addresses, so we can not just return true.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's true, although it could just be an implementation detail (like the buckets themselves are), which isn't exposed to the outside by the interface. Then such misuse would be impossible.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I followed this assumption now to just assume that things placed into the same bucket are amenable to meet, which seems to be the case in our implementation and I think also makes sense semantically.

This largely simplifies the implementation and makes all sorts of extra considerations obsolete.

tests/regression/27-inv_invariants/22-meet-ptrs.c Outdated Show resolved Hide resolved
@michael-schwarz michael-schwarz self-assigned this Dec 15, 2024
@michael-schwarz
Copy link
Member Author

Sorry for being MIA, I will take this up and hopefully get it mergeable before the holidays.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Meet of string pointers in address domain imprecise
2 participants