This repository demonstrates semantic search using a locally served embedding model and cosine similarity. It includes a setup for generating embeddings and for retrieving the information most relevant to a user query. The project uses Python, Torch, and a custom embedding model served locally via an API.


Semantic Vector Search

Prerequisites

  • Python 3.7+: Ensure you have Python installed. You can download it from python.org.
  • Git: To clone the repository. Download from git-scm.com.
  • Ollama: Required to serve the embedding model locally. Download from ollama.com.

Setup Steps

  1. Clone the Repository

    git clone https://github.com/a1brz/semantic-vector-search.git
    cd semantic-vector-search
  2. Set Up Python Virtual Environment

    • Unix/macOS:
      python3 -m venv venv
      source venv/bin/activate
    • Windows:
      python -m venv venv
      venv\Scripts\activate
  3. Install Required Modules

    pip install --upgrade pip
    pip install -r requirements.txt
  4. Install and Configure Ollama

    • Download Ollama: Visit the Ollama Download Page and follow the installation instructions for your operating system.
    • Start Ollama Server:
      ollama serve

      Note: Ensure that Ollama is running before proceeding to the next step.

  5. Download Embeddings Model

    ollama pull mxbai-embed-large

    Note: This command downloads the mxbai-embed-large model. Ensure you have a stable internet connection.

  6. Run the Semantic Search Script

    python semantic_search.py
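The contents of semantic_search.py are not reproduced in this README, so the following is only a minimal sketch of the technique the project describes: embed the documents and the query with the local model, then rank the documents by cosine similarity. It assumes Ollama's default address (http://localhost:11434) and its /api/embeddings endpoint. The function names embed, cosine_similarity, and search are hypothetical, and the sketch uses only the standard library rather than Torch.

```python
import json
import math
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # assumption: Ollama's default address
MODEL = "mxbai-embed-large"            # model pulled in step 5


def embed(text, model=MODEL, base_url=OLLAMA_URL):
    """Request one embedding vector from a running Ollama server."""
    payload = json.dumps({"model": model, "prompt": text}).encode("utf-8")
    request = urllib.request.Request(
        base_url + "/api/embeddings",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.loads(response.read().decode("utf-8"))["embedding"]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction; treat it as unrelated
    return dot / (norm_a * norm_b)


def search(query, documents, embed_fn=embed):
    """Return (document, score) pairs, most similar to the query first.

    embed_fn is a parameter of this sketch so that the ranking logic can
    be exercised without a running server.
    """
    query_vector = embed_fn(query)
    scored = [(doc, cosine_similarity(query_vector, embed_fn(doc)))
              for doc in documents]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

With the Ollama server running and the model pulled, calling search("your question", list_of_texts) would return the texts ordered by similarity. Embedding every document on each call is wasteful for larger collections; caching the document vectors is the usual next step.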

Additional Tips

  • Deactivate Virtual Environment: When you're done, you can deactivate the virtual environment by running:

    deactivate
  • Troubleshooting:

    • If ollama serve fails to start, check that no other service is already bound to its port (11434 by default).
    • Verify that all dependencies are correctly installed. You can list installed packages using:
      pip list
  • Customization: Feel free to modify semantic_search.py to better suit your specific use case or to experiment with different embedding models.
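For the port-conflict case in the troubleshooting tips, a small diagnostic along these lines may help. It is a sketch under the assumption that Ollama listens on its default address, 127.0.0.1:11434, and answers a plain GET on the root path. The helper names port_in_use and ollama_is_running are hypothetical.

```python
import http.client
import socket
import urllib.request

OLLAMA_HOST = "127.0.0.1"
OLLAMA_PORT = 11434  # assumption: Ollama's default port


def port_in_use(host=OLLAMA_HOST, port=OLLAMA_PORT, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise
        return sock.connect_ex((host, port)) == 0


def ollama_is_running(host=OLLAMA_HOST, port=OLLAMA_PORT, timeout=2.0):
    """Return True if an HTTP GET on the server root succeeds."""
    url = "http://{}:{}/".format(host, port)
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (OSError, http.client.HTTPException):
        # OSError covers URLError, refused connections, and timeouts
        return False
```

If port_in_use() returns True while ollama_is_running() returns False, another process is probably occupying the port. If both return False, the server has most likely not been started.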

Repository Structure

semantic-vector-search/
├── venv/                  # Python virtual environment (created in step 2)
├── requirements.txt       # Python dependencies
├── semantic_search.py     # Main script for semantic search
├── README.md              # This readme file
├── .gitignore             # Git ignore file
└── ...                    # Additional files and folders
