Support String output tensors #67

Draft: wants to merge 3 commits into master

marshallpierce
Contributor

Based on #65.

This approach allocates an owned String for each element, which works but stresses the allocator and incurs unnecessary copying.

Part of the complication stems from the limitation that in Rust, a field can't be a reference to another field in the same struct. This means that having a Vec<u8> of copied data, referred to by a Vec<&str>, which is then referred to by an ArrayView, requires a sequence of 3 structs to express. Building a Vec<String> gets rid of the references, but also loses the efficiency of 1 allocation with strs pointing into it.
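To make the tradeoff concrete, here is a minimal sketch of the two strategies using only the standard library. `RawStrings`, its fields, and both method names are hypothetical stand-ins for illustration, not types from this PR or from the ONNX Runtime bindings; the `ArrayView` layer is left out.

```rust
use std::str::Utf8Error;

/// Hypothetical stand-in for string data copied out of a tensor:
/// every element concatenated into one buffer, plus each element's start offset.
struct RawStrings {
    bytes: Vec<u8>,      // a single allocation holding all elements
    offsets: Vec<usize>, // start of each element within `bytes`, ascending
}

impl RawStrings {
    /// Borrowing strategy: each &str points into `bytes`.
    /// The returned Vec borrows from `self`, which is why it cannot be stored
    /// as a field next to `bytes` in the same struct.
    /// Panics if an offset is out of range (assumed valid in this sketch).
    fn borrowed_strs(&self) -> Result<Vec<&str>, Utf8Error> {
        let mut out = Vec::with_capacity(self.offsets.len());
        for (i, &start) in self.offsets.iter().enumerate() {
            // An element ends where the next one starts, or at the end of the buffer.
            let end = self.offsets.get(i + 1).copied().unwrap_or(self.bytes.len());
            out.push(std::str::from_utf8(&self.bytes[start..end])?);
        }
        Ok(out)
    }

    /// Owning strategy: one heap allocation and one copy per element,
    /// but the result carries no lifetime.
    fn to_owned_strings(&self) -> Result<Vec<String>, Utf8Error> {
        Ok(self.borrowed_strs()?.into_iter().map(String::from).collect())
    }
}
```

The borrowed form keeps a single allocation but ties the result's lifetime to the buffer; the owned form frees the caller from that lifetime at the cost of a copy per element.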

I'm not terribly happy with this implementation's efficiency but it does work, so I figured it was a decent starting point for later improvements. I'm playing with &str output in another branch but I haven't gotten the lifetimes to work out yet.

…rom a tensor

This is some prep work for string output types and tensor types that vary across the model outputs. For now, the supported types are just the basic numeric types.

Since strings have to be copied out of a tensor, it only makes sense for `String`, not `str`, to be an output type. Hence the new type, which lets us support more input types than output types.
Outputs aren't all the same type for a single model, so this allows extracting different types per tensor.
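As a rough illustration of separating input types from output types and extracting a different type per tensor, consider the sketch below. All names here (`RawTensor`, `InputElement`, `OutputElement`, `extract`) are hypothetical and are not the traits introduced by these commits.

```rust
/// Hypothetical stand-in for an untyped output tensor handed back by the runtime.
enum RawTensor {
    F32(Vec<f32>),
    I64(Vec<i64>),
    Str(Vec<Vec<u8>>), // one byte string per element
}

/// Element types accepted as inputs (hypothetical marker trait).
/// `str` can be an input because input data is only read.
trait InputElement {}
impl InputElement for f32 {}
impl InputElement for i64 {}
impl InputElement for str {}
impl InputElement for String {}

/// Element types that can be extracted from an output (hypothetical trait).
/// `str` is deliberately absent: string data has to be copied out, so only
/// the owned `String` is an output type.
trait OutputElement: Sized {
    fn extract(raw: &RawTensor) -> Option<Vec<Self>>;
}

impl OutputElement for f32 {
    fn extract(raw: &RawTensor) -> Option<Vec<Self>> {
        match raw {
            RawTensor::F32(v) => Some(v.clone()),
            _ => None,
        }
    }
}

impl OutputElement for i64 {
    fn extract(raw: &RawTensor) -> Option<Vec<Self>> {
        match raw {
            RawTensor::I64(v) => Some(v.clone()),
            _ => None,
        }
    }
}

impl OutputElement for String {
    fn extract(raw: &RawTensor) -> Option<Vec<Self>> {
        match raw {
            // Collecting into Option yields None if any element is not valid UTF-8.
            RawTensor::Str(v) => v
                .iter()
                .map(|b| String::from_utf8(b.clone()).ok())
                .collect(),
            _ => None,
        }
    }
}

/// The caller chooses the element type per tensor, so one model's outputs
/// can be extracted as different types.
fn extract<T: OutputElement>(raw: &RawTensor) -> Option<Vec<T>> {
    T::extract(raw)
}
```

With this shape, a model whose first output is numeric and whose second is textual could be read with `extract::<f32>` and `extract::<String>` respectively, and a mismatched request returns `None` rather than reinterpreting the data.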
This approach allocates owned Strings for each element, which works, but stresses the allocator, and incurs unnecessary copying.

Part of the complication stems from the limitation that in Rust, a field can't be a reference to another field in the same struct. This means that having a Vec<u8> of copied data, referred to by a Vec<&str>, which is then referred to by an ArrayView, requires a sequence of 3 structs to express. Building a Vec<String> gets rid of the references, but also loses the efficiency of 1 allocation with strs pointing into it.
marshallpierce force-pushed the mp/string-output-owned branch from 80b68d1 to d2f1ebe on March 9, 2021, 22:16