Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Finite Data Structures #15

Open
wants to merge 4 commits into
base: main
Choose a base branch
from
Open

Finite Data Structures #15

wants to merge 4 commits into from

Conversation

AHaliq
Copy link

@AHaliq AHaliq commented Nov 14, 2023

Finite Data Structures Experiments

Here we experiment implementing a Finite Map and Finite Set data structure. We tried 3 approaches

1. LinkedList and LinkedListNodup

These are our own defined structs and Iterator

The structs were tested with property based testing, verify with the following

cargo test

relevant files

  • src/linkedlist/mod.rs
  • tests/linkedlist_test.rs
  • src/noduplist/mod.rs
  • tests/linkedlistnodup_test.rs
  • src/finset/linkedlistnodup.rs
  • src/finmap/linkedlistnodup.rs

2. seq::Seq

An enum in the seq crate

The problem with seq is that it does not implement from_iter and its into_iter is implemented on a borrow of Self and not the type itself.

seq was intended to be used for modular reusable linked lists that can be composed with one another.

Our objective of making an encapsulated list-like data structure without having to manage sub lists is at odds with this design as evident with the numerous borrow checker errors in its development. The solution was to use alot of Copy and Box which goes against the point of Seq to begin with

relevant files

  • src/seqlist/mod.rs

3. std::vec::Vec

Vec behaves exactly how we intended the list built on top of Seq supposed to

implementing finite map and finite set was very straightforward

leveraging sort and dedup functions to maintain unique elements in the structure

relevant files

  • src/finset/vecset.rs
  • src/finmap/vecmap.rs

Map and Set

The data structures themselves are defined with the following postconditions

We expressed them via the ensures macro

relevant files

  • src/finset
  • src/finmap

UPDATE

removed linkedlist and seq implementations

@AHaliq AHaliq marked this pull request as draft November 14, 2023 11:07
@AHaliq AHaliq force-pushed the haliq/finmap branch 4 times, most recently from b508b43 to b0a5996 Compare November 15, 2023 08:33
@AHaliq
Copy link
Author

AHaliq commented Nov 20, 2023

todo: look into property based testing

finmap/src/finmap.rs Outdated Show resolved Hide resolved
finmap/src/finmap.rs Outdated Show resolved Hide resolved
finmap/src/finmap.rs Outdated Show resolved Hide resolved
findatastructures/src/finmap/linkedlistnodup.rs Outdated Show resolved Hide resolved
findatastructures/src/finmap/linkedlistnodup.rs Outdated Show resolved Hide resolved
findatastructures/src/finmap/linkedlistnodup.rs Outdated Show resolved Hide resolved
findatastructures/src/finmap/linkedlistnodup.rs Outdated Show resolved Hide resolved
findatastructures/src/finmap/linkedlistnodup.rs Outdated Show resolved Hide resolved
@spitters
Copy link
Collaborator

spitters commented Dec 6, 2023

@AHaliq if you believe this is ready, you can remove the Draft status.
It would probably be good to remove LinkedList and Seq.
It's nice to see the experiments, but they should probably not be in the final version.
@franziskuskiefer Is hacspec/spec still the place where this should go, or is there a new version for hax?

@AHaliq AHaliq changed the title Draft FinMap Finite Data Structures Dec 6, 2023
@AHaliq
Copy link
Author

AHaliq commented Dec 7, 2023

the PR is ready. removed the Seq and LinkedList implementations

@annenkov
Copy link

annenkov commented Dec 8, 2023

Hi @AHaliq, the PR looks good! Looks like all the basic functionality is implemented. I think it is a very useful addition.

Just wanted to mention some small additions you might consider:

  1. Implement Iterator (or IntoIterator for references). This would allow one to iterate over the contents without consuming the map/set.
  2. add keys() and values(), but this is completely optional because it can be implemented using iterators. However, these methods are part of the standard collections, so they could be useful at some point.

Copy link
Member

@franziskuskiefer franziskuskiefer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I left a couple questions and suggestions.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add some doc comments everywhere. Here should go what this crate is doing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this module doing? Please add some comments.

findatastructures/src/finmap/mod.rs Show resolved Hide resolved
/// Get the number of key-value pairs in the map
fn len(&self) -> usize;
/// Get the keys in the map
fn keys(&self) -> Vec<&Self::K>;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

keys and values functions usually return an iterator. You could also use a slice if iterators are too complex. What is the design rational behind using a vector here?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

iterators are indeed complex, having to add lifetime variables to the trait and impl, seems like another battle with the borrowchecker.

I've also tried to look into slices, the issue is that the struct is a single vec of pairs, returning slices would require changing the struct to two parallel vecs one for keys another for values.

In light of this, should we change the struct? Or is Vec as a return type justified?

/// Check if the map contains a key
fn contains_key(&self, k: &Self::K) -> bool;
/// Insert a key-value pair into the map
fn insert(&self, k: Self::K, v: Self::V) -> Self;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

insert usually returns the value that used to be in the map at k. What's the design rational behind returning a copy? Also, when returning the object itself, it should consume self. Similarly for remove.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

   = help: message: For now, ensures clause don't work on function that have `&mut` inputs (see https://github.com/hacspec/hacspec-v2/issues/290)

If I understand your comment correctly, these functions ideally should mutate the struct rather than return a copy, however the design is as such due to the limitations in the ensures macro

Thus if this function were to return just the replaced value while being immutable, it is equivalent to get

findatastructures/src/finset/mod.rs Show resolved Hide resolved
/// Check if the set contains a value
fn contains(&self, v: &Self::V) -> bool;
/// Insert a value into the set
fn insert(&self, v: Self::V) -> Self;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar comments to the map. What's the design rational for these functions, esp. in deviating from the common behaviour in Rust?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if its about not mutating the struct, again its due to the ensures macro

@spitters
Copy link
Collaborator

spitters commented Jan 4, 2024

@AHaliq What is the status of this? I see you're waiting for some answers from @franziskuskiefer . Have you resolved his other suggestions?

@AHaliq
Copy link
Author

AHaliq commented Jan 4, 2024

@spitters yes, the other comments about the unecessary traits has been fixed but not pushed upstream yet. as for documentation, it would be best to finalize the functions before having to redo them again.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants