Investigate: compute embeddings via CoreML model #63
Comments
Super appreciate the investigation! Crazy that it's so difficult. FWIW, I found a bug in my thinking: I was paying too much attention to the C bindings and not enough to the embedding logic itself, and forgot to take the mean 😅 That's now fixed in https://github.com/jasonjmcghee/rust_embedding_lib. I still haven't taken the time to do the final step of making it a framework.
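For anyone following along, "taking the mean" most likely refers to mean pooling: averaging the per-token vectors of a BERT-style model into one sentence embedding while skipping padding tokens. The snippet below is only a minimal sketch of that general technique in plain Python, not the code from rust_embedding_lib; `mean_pool` and its parameter names are made up for illustration.

```python
from typing import List, Sequence


def mean_pool(token_embeddings: Sequence[Sequence[float]],
              attention_mask: Sequence[int]) -> List[float]:
    """Average the token vectors whose mask entry is non-zero (padding is skipped)."""
    if len(token_embeddings) != len(attention_mask):
        raise ValueError("one mask entry is required per token")
    # Keep only the vectors for real (non-padding) tokens.
    kept = [vec for vec, keep in zip(token_embeddings, attention_mask) if keep]
    if not kept:
        raise ValueError("attention mask selects no tokens")
    dim = len(kept[0])
    # Sum each dimension across the kept tokens, then divide by the token count.
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


# Three token vectors; the last one is padding and is masked out.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
print(mean_pool(tokens, mask))  # prints [2.0, 3.0]
```

Without the mask, the padding vector would be averaged in and skew the result, which is an easy step to miss when the focus is on the bindings.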
I met the creator of https://github.com/unum-cloud/usearch today and they have a swift offering. Could be another option instead of @sqlite-vss.
Here's a model they built that supports images and text. https://huggingface.co/unum-cloud/uform-vl-english
@roblg if you're interested in taking another shot at this, https://github.com/ashvardanian/SwiftSemanticSearch looks super promising!
(Splitting out this discussion from #17; putting it here to document what I tried in case someone else wants to follow up)
I attempted to convert the `gte-small` model from HuggingFace from PyTorch --> CoreML and integrate it into rem.
Attempt #1: just use the CoreML model that somebody uploaded to the HF repo a few weeks ago
Result: I was able to easily get a tokenizer imported via `swift-transformers` and import the CoreML model, but the actual model prediction resulted in `NaN`s.
Attempt #2: convert the model myself using the huggingface exporters project
Result: conversion fails in the validation phase, because it outputs `NaN`s... (see a pattern here? :) )
Attempt #3: manual conversion by following the coremltools documentation
Result: kind of a few different things, but mostly: `NaN`s.
I'm unclear whether conversion of a PyTorch model for embeddings specifically is something that's supported/intended by coremltools. They have a lot of models included that seem much more complicated than a BERT embedding model should be but 🤷.
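Since all three attempts ended in `NaN` outputs, a small check like the one below could be run on any converted model's output before wiring it into the app. This is a rough sketch in plain Python, not what the exporters validation phase actually does; `check_embedding`, its parameters, and the 0.99 cosine threshold are all assumptions chosen for illustration. It compares a candidate embedding (e.g. from the converted model) against a reference embedding (e.g. from the original PyTorch model) for the same input.

```python
import math
from typing import List, Sequence


def check_embedding(reference: Sequence[float],
                    candidate: Sequence[float],
                    min_cosine: float = 0.99) -> List[str]:
    """Return a list of problems found; an empty list means the check passed."""
    if len(reference) != len(candidate):
        return [f"dimension mismatch: {len(reference)} vs {len(candidate)}"]
    # NaN or inf anywhere makes the similarity meaningless, so report and stop.
    bad = [i for i, x in enumerate(candidate) if math.isnan(x) or math.isinf(x)]
    if bad:
        return [f"{len(bad)} non-finite value(s), first at index {bad[0]}"]
    dot = sum(a * b for a, b in zip(reference, candidate))
    norm = (math.sqrt(sum(a * a for a in reference))
            * math.sqrt(sum(b * b for b in candidate)))
    if norm == 0.0:
        return ["zero-length vector"]
    cosine = dot / norm
    if cosine < min_cosine:
        return [f"cosine similarity {cosine:.4f} is below {min_cosine}"]
    return []


reference = [0.1, 0.2, 0.3]
print(check_embedding(reference, [0.1, float("nan"), 0.3]))
print(check_embedding(reference, [0.1, 0.2, 0.3]))
```

The first call reports the non-finite value and the second returns an empty list. Checking for `NaN` before computing similarity matters because a `NaN` cosine compares false against any threshold and would be easy to misread.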
After a lot of poking and tweaking of inputs, I was able to get the PyTorch model loaded into CoreML in fp16 format (it was defaulting to fp32 for some reason; I think that's why the model uploaded to HF was so big to begin with). At this point I get fp32 <--> fp16 compatibility issues from coremltools, which is a definite improvement, but... still not functional.
Error:
Summary
So... I'm going to table this for now, given that there's already a more flexible/probably less finicky alternative (the rust lib + bindings). It was fun while it lasted, but there are only so many hours in the day. 😅
(feel free to close this, I just didn't want to carp up #17 given that there are ~3 discussions happening there right now.)