I have searched the existing issues, and I could not find an existing issue for this bug
Current Behavior
Invoking BM25Encoder.default() fails to download its supporting files with an SSL certificate error:
  File "****.py", line 45, in sparseVectorQuery
    bm25 = BM25Encoder.default()
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/pinecone_text/sparse/bm25_encoder.py", line 261, in default
    wget.download(url, str(tmp_path))
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/wget.py", line 526, in download
    (tmpfile, headers) = ulib.urlretrieve(binurl, tmpfile, callback)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/urllib/request.py", line 241, in urlretrieve
    with contextlib.closing(urlopen(url, data)) as fp:
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/urllib/request.py", line 216, in urlopen
    return opener.open(url, data, timeout)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/urllib/request.py", line 519, in open
    response = self._open(req, data)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/urllib/request.py", line 536, in _open
    result = self._call_chain(self.handle_open, protocol, protocol +
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/urllib/request.py", line 496, in _call_chain
    result = func(*args)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/urllib/request.py", line 1391, in https_open
    return self.do_open(http.client.HTTPSConnection, req,
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/urllib/request.py", line 1351, in do_open
    raise URLError(err)
Expected Behavior
BM25Encoder.default() should download its supporting files successfully when they are not already present in the environment.
Steps To Reproduce
from pinecone_text.sparse import BM25Encoder

def createSparseVectors(chunks):
    # encoder = tiktoken.encoding_for_model("gpt-4")
    # extract text to form corpus
    corpus = []
    for chunk in chunks:
        corpus.append(chunk["content"])
    bm25 = BM25Encoder.default()  # <<<<<<<<<<<<<<<
    bm25.fit(corpus)
    sparse_vectors = bm25.encode_documents(corpus)
    return sparse_vectors

x = [
    {"content": "apples are red"},
    {"content": "bananas are yellow"}
]
createSparseVectors(x)
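A possible way to narrow this down, offered as a sketch and not as a confirmed diagnosis: the traceback above is cut off before the final error line, so the exact SSL failure is not shown. One common cause with python.org framework builds on macOS is that no root certificates are installed until the bundled "Install Certificates.command" script has been run. The two helpers below (`describe_ca_paths` and `install_https_opener` are hypothetical names, not part of pinecone_text or wget) use only the standard library to report which CA locations Python checks, and to install a global urllib opener with an explicit CA bundle. Since the traceback shows wget.download going through urllib's urlretrieve and urlopen, a globally installed opener should be picked up by that call path.

```python
import os
import ssl
import urllib.request


def describe_ca_paths():
    """Report the CA locations the ssl module checks by default.

    Returns a dict mapping each location name to (path, exists).
    """
    paths = ssl.get_default_verify_paths()
    candidates = {
        "cafile": paths.cafile,
        "capath": paths.capath,
        "openssl_cafile": paths.openssl_cafile,
        "openssl_capath": paths.openssl_capath,
    }
    return {
        name: (path, bool(path) and os.path.exists(path))
        for name, path in candidates.items()
    }


def install_https_opener(cafile=None):
    """Install a global urllib opener that verifies HTTPS against cafile.

    With cafile=None the system default certificates are used.
    """
    context = ssl.create_default_context(cafile=cafile)
    handler = urllib.request.HTTPSHandler(context=context)
    urllib.request.install_opener(urllib.request.build_opener(handler))
    return context


if __name__ == "__main__":
    # If every entry reports exists=False, missing root certificates are a likely cause.
    for name, (path, exists) in describe_ca_paths().items():
        print(f"{name}: {path} (exists={exists})")
```

If the default locations turn out to be empty, one option (an assumption, since it needs the third-party certifi package) is to call `install_https_opener(certifi.where())` before `BM25Encoder.default()`. Verification stays enabled in this approach, which is preferable to disabling certificate checks.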
newgolddream changed the title from "BM25Encoder.getDefault() Fails to download stopwords on a cert error" to "BM25Encoder.default() Fails to download stopwords on a cert error" on Feb 12, 2024
Relevant log output
No response
Environment
Additional Context
No response