Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Possibility of only allowing leaves to hold values #41

Open
kyrias opened this issue Dec 25, 2017 · 5 comments
Open

Possibility of only allowing leaves to hold values #41

kyrias opened this issue Dec 25, 2017 · 5 comments

Comments

@kyrias
Copy link
Contributor

kyrias commented Dec 25, 2017

I have some use-cases where it would be preferable if nodes with children were not able to hold values themselves. I'm not sure how easy it would be to support this in the current codebase though, since the inserts are done with folds, which would make it hard to error out in the middle. At the same time it would be quite expensive to try to manually do the check in the code that uses sequence_trie before running insert.

So I guess I'm mostly interested in your thoughts over how feasible it would be to add to this crate, or if I should try to throw a custom crate together?

@kyrias
Copy link
Contributor Author

kyrias commented Dec 29, 2017

So I actually just quickly hacked together support for this here: kyrias@756d553.

Downside is that some of the function signatures change, but it's a fairly small change. Most of the change just consists of bubbling up the proper error to the caller on insert, but unless the sequence_trie was explicitly created as a values-at-leaves trie, the caller still doesn't have to care.about the return value, since it'll always be Ok(..).

Would appreciate any thoughts on it though.

@kyrias
Copy link
Contributor Author

kyrias commented Feb 24, 2018

@michaelsproul ping?

@michaelsproul
Copy link
Owner

Sorry for the slow reply @kyrias!

I'm reluctant to add a (runtime) flag to the code to enable this behaviour, because it seems to add a fair amount of ergonomic overhead. I don't think the Result<Option<T>, Error> interface is particularly nice as it forces you to handle the Result even if you've set values_at_leaves=false...

I wonder if there's a cleaner way to accomplish the effect you're after, maybe by encoding the termination of a key in the key itself? I'm thinking something like this:

// I can't think of a good name for this, but it's isomorphic to `Option`, so we could just use that...
enum Wrapper<T> {
   Value(T),
   Terminus
}

/// Wrap the parts of a key, adding a terminus value at the end
pub fn wrap(key: Vec<T>) -> Vec<Wrapper<T>> {
   key.into_iter()
        .map(Wrapper::Value)
        .chain(Some(Wrapper::Terminus))
        .collect()
}

Then, rather than keying the trie by keys of type Vec<T>, you would use Vec<Wrapper<T>>.

To maintain the property that the trie's internal nodes do not hold values, you would just need to ensure that only properly terminated inputs are inserted (you could make the wrap function part of the constructor for some trie key type, so that instances of the type always have this property).

For example, a trie with the key value pairs:

[Value("hello"), Value("world"), Terminus] => 111
[Value("hello"), Value("world"), Value("wow"), Terminus] => 222

would be represented like:

Node {
  value: None,
  children: hashmap! {
    "hello" => Node {
      value: None,
      children: hashmap! {
        "world" => Node {
          value: None,
          children: hashmap! {
            Terminus => Node {
              value: Some(111),
              children: hashmap! {},
            },
            "wow" => Node {
              value: None,
              children: hashmap! {
                Terminus => Node {
                  value: Some(222),
                  children: hashmap! {},
                }
              }
            }
          }
        }
      }
    }
  }
}

(where the strings are wrapped in Value(..) as appropriate)

Would something like this suit your use case?

@kyrias
Copy link
Contributor Author

kyrias commented Feb 25, 2018

Well, that would effectively be the same as how it currently works where ["foo", "bar"] and ["foo", "bar", "baz"] can both hold values at the same time, which in what I want to be able to prevent.

@michaelsproul
Copy link
Owner

Oh, right... I got caught up in the internal representation side of things, and didn't realise that this still provides the same behaviour at the level of keys... hmmm

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants