Repository for storing and sharing vector stores / embeddings for LLMs
DocsHUB is an open-source solution for storing vectors for LLMs. Like a package manager, but for vector stores.
Say goodbye to time-consuming scraping and embedding, and let DocsHUB speed up your project by providing you with the latest embeddings.
- GitHub workflows to prepare a JSON document with indexes ✅
- Website with search and index
- Plugin with DocsGPT
- Plugins for other projects
- API for search
vectors - where all vector stores are
ingestors - scripts to prepare and ingest data into vector stores
Navigate to the folder you need, vectors/<language>/<library_name>/<version>/, and download both files:
docs.index, faiss_store.pkl
You can also use this index, which is updated on every push, to find the items you need:
https://d3dg1063dc54p9.cloudfront.net/combined.json
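One way to search the combined index from Python is sketched below. The layout of combined.json is not described here, so this sketch assumes only that it is valid JSON containing objects with a "name" field somewhere inside, and it walks whatever nesting it finds. The helper names `load_index` and `find_entries` are hypothetical.

```python
import json
from urllib.request import urlopen

INDEX_URL = "https://d3dg1063dc54p9.cloudfront.net/combined.json"


def load_index(url=INDEX_URL):
    """Download and parse the combined index."""
    with urlopen(url) as response:
        return json.load(response)


def find_entries(node, name):
    """Collect every JSON object under `node` whose "name" equals `name`.

    Assumption: entries carry a "name" field, as metadata.json does.
    The nesting of the index is not assumed, so dicts and lists are
    searched recursively.
    """
    matches = []
    if isinstance(node, dict):
        if node.get("name") == name:
            matches.append(node)
        for value in node.values():
            matches.extend(find_entries(value, name))
    elif isinstance(node, list):
        for item in node:
            matches.extend(find_entries(item, name))
    return matches
```

Calling `find_entries(load_index(), "pandas")` should then return the matching objects, if any, without depending on a particular nesting.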
Anyone can create a pull request. It should contain 3 files:
- docs.index
- faiss_store.pkl
- metadata.json
Ensure the path is correct:
vectors/<language>/<library_name>/<version>/
If the docs are for a language itself rather than a library (Python, for example), use:
vectors/python/.project/version/
Place the three files in the corresponding path.
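Before opening a pull request, the path and file requirements above could be checked locally with something like the following. This is a minimal sketch; `vector_store_path` and `missing_files` are hypothetical helper names, not part of DocsHUB.

```python
from pathlib import Path

# The three files a contribution should contain.
REQUIRED_FILES = ("docs.index", "faiss_store.pkl", "metadata.json")


def vector_store_path(language, library_name, version):
    """Build vectors/<language>/<library_name>/<version>/ as a relative path."""
    return Path("vectors") / language / library_name / version


def missing_files(folder):
    """Return the required file names that are not present in `folder`."""
    folder = Path(folder)
    return [name for name in REQUIRED_FILES if not (folder / name).is_file()]
```

For example, `missing_files(vector_store_path("python", "pandas", "1.5.3"))` lists whichever of the three files still need to be added to that folder.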
Metadata is a JSON document with these fields:
- name
- language
- version
- description (one or two sentences)
- fullName (full project name, not a slug)
- date (to know when it was last updated)
- docLink (link to the documentation that was used for it)
Example of metadata.json:
{
    "name": "pandas",
    "language": "python",
    "version": "1.5.3",
    "description": "Pandas is a library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language.",
    "fullName": "Pandas",
    "date": "07/02/2023",
    "docLink": "https://pandas.pydata.org/docs/"
}
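A metadata.json file can be sanity-checked against the field list above before it is submitted. The sketch below is one possible check, not an official validator: the function name `metadata_problems` is hypothetical, and it assumes every field is a non-empty string, as in the example.

```python
import json

# Field names taken from the list above.
REQUIRED_FIELDS = ("name", "language", "version", "description",
                   "fullName", "date", "docLink")


def metadata_problems(text):
    """Return a list of problems found in the text of a metadata.json file."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc}"]
    if not isinstance(data, dict):
        return ["top-level value must be a JSON object"]
    problems = []
    for field in REQUIRED_FIELDS:
        if field not in data:
            problems.append(f"missing field: {field}")
        # Assumption: each field is a non-empty string, as in the example.
        elif not isinstance(data[field], str) or not data[field].strip():
            problems.append(f"field must be a non-empty string: {field}")
    return problems
```

An empty list means the file has all seven fields; the date format is deliberately not checked, since it is not specified here.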
Built with 🦜️🔗 LangChain