Add support for hex representation for mixed tensors in queries #32231
I still experience the same issue with [8.424.11]. I use `input.query(qt)={"0":3DE38E393E638E393EAAAAAB}`.
It's IMHO unfortunate that one then needs one format for the JSON feed and a different string format, without quotes, for queries. When I have a dict&lt;string,string&gt;, I now need to write a custom routine to produce a string for the query instead of just using the JSON representation of the dict&lt;string,string&gt;.
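For illustration, a minimal sketch (not from the issue) of the kind of extra routine this refers to: turning a per-key mapping of float lists into the unquoted literal form that queries accept today. The helper name `to_query_literal` is hypothetical.

```python
def to_query_literal(embedding: dict) -> str:
    # {0: [0.111, 0.222], 1: [0.333, 0.444]} -> "{0:[0.111,0.222],1:[0.333,0.444]}"
    cells = ",".join(
        f"{key}:[{','.join(str(v) for v in values)}]"
        for key, values in embedding.items()
    )
    return "{" + cells + "}"
```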
Snippet from a notebook:

```python
import struct
from typing import List

import numpy as np
import torch

# Imports assumed from the rest of the notebook (not shown in the original
# comment): pyvespa's async client/response types and ir_measures' ScoredDoc.
from vespa.application import VespaAsync
from vespa.io import VespaQueryResponse
from ir_measures import ScoredDoc
def binarize_tensor(tensor: torch.Tensor) -> str:
    """
    Binarize a floating-point 1-d tensor by thresholding at zero
    and packing the bits into bytes. Returns the hex str representation of the bytes.
    """
    if not tensor.is_floating_point():
        raise ValueError("Input tensor must be of floating-point type.")
    return np.packbits(np.where(tensor > 0, 1, 0), axis=0).astype(np.int8).tobytes().hex()
def tensor_to_hex_bfloat16(tensor: torch.Tensor) -> str:
    if not tensor.is_floating_point():
        raise ValueError("Input tensor must be of floating-point type.")

    def float_to_bfloat16_hex(f: float) -> str:
        # Truncate a float32 to bfloat16 by keeping its upper 16 bits.
        # '=f'/'=H' use native byte order, so this assumes a little-endian machine.
        packed_float = struct.pack('=f', f)
        bfloat16_bits = struct.unpack('=H', packed_float[2:])[0]
        return format(bfloat16_bits, '04X')

    hex_list = [float_to_bfloat16_hex(float(val)) for val in tensor.flatten()]
    return "".join(hex_list)
```
```python
async def get_vespa_response(
        embedding: torch.Tensor,
        qid: str,
        session: VespaAsync,
        depth: int = 20,
        profile: str = "float-float") -> List[ScoredDoc]:
    # The query tensor API does not support hex formats yet,
    # so this format will throw a parse error
    float_embedding = {index: tensor_to_hex_bfloat16(vector)
                       for index, vector in enumerate(embedding)}
    binary_embedding = {index: binarize_tensor(vector)
                        for index, vector in enumerate(embedding)}
    response: VespaQueryResponse = await session.query(
        yql="select id from pdf_page where true",  # brute-force search, rank all pages
        ranking=profile,
        hits=5,
        timeout=10,
        body={
            "input.query(qt)": float_embedding,
            "input.query(qtb)": binary_embedding,
            "ranking.rerankCount": depth
        }
    )
    assert response.is_successful()
    scored_docs = []
```

This will not work with the custom tensor format, but it works for feeding:

```python
vespa_docs = []
for row, embedding in zip(ds, embeddings):
    embedding_full = dict()
    embedding_binary = dict()
    # You can experiment with pooling if you want to reduce the number of embeddings
    # pooled_embedding = pool_embeddings(embedding, pool_factor=2)  # reduce the number of embeddings by a factor of 2
    for j, emb in enumerate(embedding):
        embedding_full[j] = tensor_to_hex_bfloat16(emb)
        embedding_binary[j] = binarize_tensor(emb)
    vespa_doc = {
        "id": row['docId'],
        "embedding": embedding_full,
        "binary_embedding": embedding_binary
    }
    vespa_docs.append(vespa_doc)
```
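For completeness, a sketch of feeding these documents with pyvespa (my addition, not from the comment; `app` is assumed to be a connected `Vespa` instance, and the schema name is assumed to match the `pdf_page` schema used in the query above):

```python
for doc in vespa_docs:
    response = app.feed_data_point(
        schema="pdf_page",   # assumed schema name, matching the YQL above
        data_id=doc["id"],
        fields=doc,
    )
    assert response.is_successful()
```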
There are many differences between the JSON formats and the "literal form". We can try to smooth over some of these differences, but there's no way to get rid of them all.
Maybe we should support inputting tensors in JSON format somehow?
I understand that not all tensor formats translate to something representable in JSON, but I do think that mixed tensors with one mapped dimension and one indexed dimension could be. Right now I need two functions, one for feed and one for queries.
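For concreteness, this is the kind of representation meant (my illustration, written as a Python dict, using Vespa's document JSON "blocks" short form for a mixed tensor with one mapped and one indexed dimension):

```python
# e.g. for a field of type tensor<bfloat16>(patch{}, v[3]):
# one dense array per mapped-dimension label
embedding_field = {
    "blocks": {
        "0": [0.111, 0.222, 0.333],
        "1": [0.444, 0.555, 0.666],
    }
}
```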
We support hex format in document JSON, but not for queries. The hex form is valid when feeding documents, but attempting to send the same format in a query will throw a 400 Bad Request.
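As an illustration of that asymmetry (my example, not from the issue; it reuses the hex string from the comment above):

```python
# Valid in document JSON for feeding: hex-encoded cell values in the
# "blocks" short form. The same value in input.query(...) currently
# fails with 400 Bad Request, which is what this issue asks to change.
embedding_field = {
    "blocks": {
        "0": "3DE38E393E638E393EAAAAAB"
    }
}
```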