
Unable to load DeepSparseSentenceTransformer #1649

Open
Capt4in-Levi opened this issue May 15, 2024 · 2 comments
Labels
bug Something isn't working

Comments

Capt4in-Levi commented May 15, 2024

Describe the bug
I'm trying to replicate the sample code from the DeepSparseSentenceTransformer documentation, but I'm hitting errors when executing it. It looks like a version-compatibility problem between modules, but I'm stuck trying to pinpoint the exact cause. Can you please help with this?
Expected behavior
To load the DeepSparseSentenceTransformer model without any errors.
Environment
Include all relevant environment information:

  1. OS [e.g. Ubuntu 18.04]: Linux-5.10.215-203.850.amzn2.x86_64-x86_64-with-glibc2.26
  2. Python version [e.g. 3.8]: 3.10.14
  3. DeepSparse version or commit hash [e.g. 0.1.0, f7245c8]: deepsparse-1.7.1
  4. ML framework version(s) [e.g. torch 1.7.1]: torch-2.1.0
  5. Other Python package versions [e.g. SparseML, Sparsify, numpy, ONNX]:
    onnx-1.14.1 onnxruntime-1.16.3 sparsezoo-1.7.0 sparsezoo-nightly-1.8.0.20240401
  6. CPU info - output of deepsparse/src/deepsparse/arch.bin or output of cpu_architecture() as follows:
```
>>> import deepsparse.cpu
>>> print(deepsparse.cpu.cpu_architecture())
{'L1_data_cache_size': 32768, 'L1_instruction_cache_size': 32768, 'L2_cache_size': 262144, 'L3_cache_size': 47185920, 'architecture': 'x86_64', 'available_cores_per_socket': 2, 'available_num_cores': 2, 'available_num_hw_threads': 2, 'available_num_numa': 1, 'available_num_sockets': 1, 'available_sockets': 1, 'available_threads_per_core': 1, 'bf16': False, 'cores_per_socket': 2, 'dotprod': False, 'i8mm': False, 'isa': 'avx2', 'num_cores': 2, 'num_hw_threads': 2, 'num_numa': 1, 'num_sockets': 1, 'threads_per_core': 1, 'vbmi': False, 'vbmi2': False, 'vendor': 'GenuineIntel', 'vendor_id': 'Intel', 'vendor_model': 'Intel(R) Xeon(R) CPU E5-2686 v4 @ 2.30GHz', 'vnni': False, 'zen1': False}
```

To Reproduce
Exact steps to reproduce the behavior:
```
!pip install deepsparse[sentence_transformers]
!pip install tf-keras  # prompted by deepsparse to install this
```

```python
from deepsparse.sentence_transformers import DeepSparseSentenceTransformer
model = DeepSparseSentenceTransformer('neuralmagic/bge-small-en-v1.5-quant', export=False)

# Our sentences we'd like to encode
sentences = ['This framework generates embeddings for each input sentence',
    'Sentences are passed as a list of string.',
    'The quick brown fox jumps over the lazy dog.']

# Sentences are encoded by calling model.encode()
import time
st = time.time()
embeddings = model.encode(sentences)
ed = time.time()
print("time taken is:", ed - st)

# Print the embeddings
for sentence, embedding in zip(sentences, embeddings):
    print("Sentence:", sentence)
    print("Embedding:", embedding.shape)
    print("")
```

Errors
```
RuntimeError: Failed to import optimum.deepsparse.modeling because of the following error (look up to see its traceback):
Failed to import optimum.exporters.onnx.main because of the following error (look up to see its traceback):
cannot import name 'is_torch_less_than_1_11' from 'transformers.pytorch_utils' (/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/transformers/pytorch_utils.py)
```
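For context on the traceback: `is_torch_less_than_1_11` was a small torch-version check that newer transformers releases removed from `transformers.pytorch_utils`, while the installed `optimum.deepsparse` still imports it. A standalone sketch of what such a helper does (an illustration only, not the transformers implementation, and the exact release that removed it is not pinned here):

```python
def torch_less_than(installed: str, cutoff: str = "1.11") -> bool:
    """Return True if `installed` (a torch version string) precedes `cutoff`.

    Pure-string illustration of the removed helper; production code should
    compare versions with packaging.version instead.
    """
    def parse(v: str) -> tuple[int, int]:
        # Drop local suffixes like "+cu118" and keep major.minor only.
        parts = v.split("+")[0].split(".")
        return int(parts[0]), int(parts[1])

    return parse(installed) < parse(cutoff)

print(torch_less_than("2.1.0"))   # the torch version from this environment -> False
print(torch_less_than("1.10.2"))  # -> True
```

The usual fix for this class of error is aligning the transformers version with what the installed optimum/deepsparse releases expect, rather than redefining the symbol.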


@Capt4in-Levi Capt4in-Levi added the bug Something isn't working label May 15, 2024
Capt4in-Levi commented May 20, 2024

Sharing the working versions of torch, transformers, optimum, and deepsparse would be fine too, in case I'm missing something obvious - @mgoin

@dilip467

What is the time taken for one embedding generation?
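In case it helps answer this, a minimal per-sentence timing harness, with a stand-in `encode` function (substitute `model.encode` from the reproduction code above; the 384-dim size matches bge-small but is otherwise illustrative):

```python
import time

def encode(sentence: str) -> list[float]:
    # Stand-in for a single-sentence model.encode call; replace with the
    # DeepSparseSentenceTransformer model from the reproduction code.
    return [0.0] * 384

sentences = ['This framework generates embeddings for each input sentence',
             'The quick brown fox jumps over the lazy dog.']

for sentence in sentences:
    start = time.perf_counter()
    embedding = encode(sentence)
    elapsed = time.perf_counter() - start
    print(f"{elapsed * 1000:.3f} ms -> {len(embedding)}-dim embedding")
```

Note that timing sentences one at a time measures per-call latency, while the batch `model.encode(sentences)` in the reproduction measures amortized throughput; the two numbers will differ.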
