Design to allow multiple versions of primitive VSA operators #73
Allow for multiple, different versions of the primitive VSA operators (bind, bundle, permute, and maybe others).

This is needed, for example, to introduce HRR binding and randsel bundling (#75).

This is related to #72 and #25, which appear to imply dispatching on hypervector type. However, people will want to introduce and compare different operators applied to the same hypervectors, so this issue is orthogonal to hypervector type.

---
How about an API such as this (not an exhaustive list, just an illustration):

```python
import torchhd

# binding operations
torchhd.multiply(a, b)  # works for all data types (implements XOR for boolean tensors)
torchhd.convolution(a, b)
...

# bundling operations
torchhd.sum(a, b)
torchhd.mean(a, b)  # normalizes the result of sum
torchhd.randsel(a, b)
...
```

With this design, each of our current generic operations would correspond to one of several named implementations. We should add sections in the documentation to group the different binding and bundling implementations so it's clear for people what they are meant to do.
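To make the type-dependent dispatch concrete, here is a minimal standalone sketch in plain torch; this `multiply` is illustrative only, not the actual Torchhd implementation:

```python
import torch

def multiply(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    # one function, dispatching on dtype: XOR for boolean hypervectors,
    # elementwise product for everything else
    if a.dtype == torch.bool:
        return torch.logical_xor(a, b)
    return torch.mul(a, b)

# boolean hypervectors: binding behaves like XOR
a = torch.randint(0, 2, (10000,)).bool()
b = torch.randint(0, 2, (10000,)).bool()
multiply(a, b)

# bipolar hypervectors in {-1, +1}: binding is elementwise multiplication
x = torch.randint(0, 2, (10000,)) * 2 - 1
y = torch.randint(0, 2, (10000,)) * 2 - 1
multiply(x, y)
```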
---

Sorry - I am so totally not a pythonista that I can't tell what that does. If it's any help, my motivation for the request is that if the encoding functions are defined in terms of abstract bind and bundle, then the impact of different instantiations of bind and bundle can be investigated. This is also the motivation for #79.
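A small sketch of that idea in plain torch (all names are local to the example, not Torchhd API): an encoding function that refers only to abstract operators can be re-run with different instantiations without any change to the encoder itself.

```python
import torch

def encode_record(keys, values, bind, bundle):
    # key-value record encoding written purely against abstract operators:
    # bundle over bind(key_i, value_i)
    encoded = bind(keys[0], values[0])
    for k, v in zip(keys[1:], values[1:]):
        encoded = bundle(encoded, bind(k, v))
    return encoded

keys = torch.randn(3, 10000)
values = torch.randn(3, 10000)

# one instantiation: multiplicative binding, additive bundling
r1 = encode_record(keys, values, bind=torch.mul, bundle=torch.add)

# swap in a different bundling (elementwise max); the encoder is untouched
r2 = encode_record(keys, values, bind=torch.mul, bundle=torch.maximum)
```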
---

In my comment above I tried to present a sketch of the functions that Torchhd could expose to allow users to switch between multiple methods for the primitive operations. So besides the current defaults (sum for bundling and product for binding) we would also expose circular convolution for binding and randsel for bundling, among many other possible methods.

I think to make this design very ergonomic for the user we should still expose generic bind, bundle and permute functions but allow users to override them. For this API design I was thinking we could take inspiration from [...]. My idea is a design as follows:

```python
import torchhd

A, B = torchhd.random_hv(2, 10000)

torchhd.bundle(A, B)  # addition by default
torchhd.bind(A, B)  # product by default

# here comes the new part
torchhd.set_bundle_method(torchhd.randsel)
# torchhd.set_bind_method(...)
# torchhd.set_permute_method(...)

# from now on any call to torchhd.bundle will execute randsel
torchhd.bundle(A, B)
# you could equivalently do
torchhd.randsel(A, B)
```

A benefit of this design is that our data structures such as hash tables and graphs internally use the bundle and bind functions, which means that their behavior will automatically update when the user sets a different method. Custom methods would work the same way:

```python
import torchhd

def my_custom_bundle(a, b):
    return a + b

torchhd.set_bundle_method(my_custom_bundle)

A, B = torchhd.random_hv(2, 10000)
torchhd.bundle(A, B)  # uses my_custom_bundle
```

I think this could be very useful for research purposes. By default, the code as users write it with the current version stays the same, but more flexibility becomes available because the primitives can be changed.

@rgayler could you perhaps help me compile or point to an overview of all the various versions of bind, bundle and permute that we should implement? I am looking for general descriptions of the methods such that we can implement them for the broadest range of hypervector types. For example, multiply binding is equivalent to XOR for binary hypervectors, so I would like to keep these as one function whose execution depends on the hypervector type, instead of providing both.
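One way the overriding itself could work internally is a module-level registry that the generic functions dispatch through. A minimal sketch in plain torch; the internal names here are assumptions, not actual Torchhd internals:

```python
import torch

_bundle_method = torch.add  # default bundling: elementwise addition

def set_bundle_method(fn):
    # swap out the implementation that bundle() dispatches to
    global _bundle_method
    _bundle_method = fn

def bundle(a, b):
    # generic bundle: delegates to whichever method is currently registered
    return _bundle_method(a, b)

def randsel(a, b):
    # random-selection bundling: take each element from a or b with p = 0.5
    mask = torch.rand_like(a) < 0.5
    return torch.where(mask, a, b)

A, B = torch.randn(2, 10000)
bundle(A, B)               # default: addition
set_bundle_method(randsel)
bundle(A, B)               # now executes randsel
```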
---

One potential problem I thought of is that currently we also provide multi-input versions such as `torchhd.functional.multibundle`, which have efficient implementations that a custom binary method would not cover. One way to combat this is to allow the user to override both the single and the multi version of each primitive. When a user only provides an implementation for the single version, we can fall back to a for-loop in the multi-method implementation. And for the built-in methods, the user could specify them as a string so that we can set the single and multi versions correctly behind the scenes. Here are some examples that will hopefully make it more clear:

```python
import torchhd, torch

# assuming we have randsel implemented in the library
torchhd.set_bundle_method("randsel")

hypervectors = torchhd.random_hv(2, 10000)
A, B = hypervectors

torchhd.bundle(A, B)  # randsel
torchhd.functional.multibundle(hypervectors)  # efficient multi-randsel


def my_custom_bundle(a, b):
    return torch.add(a, b)

# user only specifies a custom single bundle method
torchhd.set_bundle_method(my_custom_bundle)

torchhd.bundle(A, B)  # uses torch.add
torchhd.functional.multibundle(hypervectors)  # will use a for-loop with torch.add


def my_custom_multibundle(a):
    return torch.sum(a, dim=-2)

# user specifies both a custom single and multi bundle method
torchhd.set_bundle_method(my_custom_bundle, my_custom_multibundle)

torchhd.bundle(A, B)  # uses torch.add
torchhd.functional.multibundle(hypervectors)  # uses torch.sum
```
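The single/multi fallback could then look something like the following. Again a sketch with assumed internal names, not the actual implementation:

```python
import torch
from functools import reduce

_bundle_method = torch.add
_multibundle_method = None  # None: derive the multi version from the single one

def set_bundle_method(single, multi=None):
    global _bundle_method, _multibundle_method
    _bundle_method = single
    _multibundle_method = multi

def multibundle(hypervectors):
    # use the efficient multi implementation when the user provided one
    if _multibundle_method is not None:
        return _multibundle_method(hypervectors)
    # fallback: fold the binary method over the inputs one at a time
    return reduce(_bundle_method, hypervectors.unbind(dim=-2))

hvs = torch.randn(5, 10000)
multibundle(hvs)  # for-loop of torch.add calls

set_bundle_method(torch.add, lambda x: torch.sum(x, dim=-2))
multibundle(hvs)  # a single efficient torch.sum call
```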
---

I'm working in the [...]
---

@mikeheddes there are a few survey papers around (I'll get pointers to the ones I know) that list multiple instantiations of the generic operators. From memory, these don't make a point of highlighting which instantiations are special cases of others. FWIW, binding with XOR and bipolar multiplicative binding can both be seen as equivalent to complex multiplicative binding where the phase angle has been quantised to two levels. I understand why it's attractive to minimise the number of function names and dispatch on argument type, but if it turns out that's too hard I don't think it would be a tragedy to have a bunch of related instantiations of primitive operators and just note the relationships in the documentation.
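For what it's worth, that equivalence is easy to check numerically; a small sketch in plain torch:

```python
import math
import torch

# bipolar hypervectors in {-1, +1}
a = torch.randint(0, 2, (10000,)) * 2 - 1
b = torch.randint(0, 2, (10000,)) * 2 - 1

# view them as complex phasors with phase quantised to two levels {0, pi}:
# +1 -> exp(i*0), -1 -> exp(i*pi)
ca = torch.exp(1j * (a < 0).float() * math.pi)
cb = torch.exp(1j * (b < 0).float() * math.pi)

# complex multiplicative binding matches bipolar multiplicative binding
assert torch.allclose((ca * cb).real, (a * b).float(), atol=1e-6)

# and XOR on the {0, 1} phase bits gives that same binding once more
assert torch.equal(torch.logical_xor(a < 0, b < 0), (a * b) < 0)
```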
---

These are the obvious recent surveys:
---

@rgayler thank you for the papers, I wanted to make sure there were no important ones that we weren't aware of yet. I will try to make a table where I group the special cases of methods. Then we can refine that before we implement them.
---

I can foresee that once we have a broader range of VSA/HDC models implemented, we could add a method like [...]