This project creates a real-time conversational AI, either serverless via SvelteKit/Static or using LangChain with FastAPI as a web server, streaming GPT model responses and supporting in-browser LLMs via webllm.

Multishot.ai is a stateless UI built with static Svelte and webllm, backed by FastAPI and LangChain

multishot.ai is a free AI/LLM chatbot

Go to multishot.ai in Chrome or another WebGPU-enabled browser to try it out.

Features

This project demonstrates how to create a real-time conversational AI from models hosted in your browser or from commercially available APIs. It uses FastAPI to create a web server that accepts user inputs and streams generated responses back to the user in a Svelte UI app.
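
As a rough sketch of the server side, the snippet below shows how FastAPI can stream a response chunk by chunk. The /chat route, request shape, and token generator are hypothetical placeholders, not this project's actual code.

```python
# A minimal sketch of a streaming endpoint. The /chat route, request shape,
# and token generator are hypothetical placeholders, not this project's code.
from fastapi import FastAPI
from fastapi.responses import StreamingResponse
from pydantic import BaseModel

app = FastAPI()

class ChatRequest(BaseModel):
    prompt: str

async def generate_tokens(prompt: str):
    # Stand-in for a real LLM: yield the response one chunk at a time.
    for token in ["Streaming", " keeps", " the", " UI", " responsive."]:
        yield token

@app.post("/chat")
async def chat(request: ChatRequest):
    # Each yielded chunk is flushed to the client as soon as it is produced,
    # so the UI can render tokens while the model is still generating.
    return StreamingResponse(generate_tokens(request.prompt), media_type="text/plain")
```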

The app also supports running LLMs in the browser via webllm, which keeps your conversations completely private.

Have a look at the live version here, multishot.ai; note that it requires Chrome or Edge with WebGPU support.

Goals

  1. App is stateless ✅
    1. Most AI apps ship with a Streamlit, Gradio, Flask, or Django UI, to name a few. These frameworks are fine for the desktop but cost-prohibitive to run 24/7 in the cloud. This app, by contrast, is designed to be stateless: it can run on a serverless platform like Vercel or Netlify, or from AWS S3 static hosting, for free or at very low cost.
  2. webllm ✅ - it's like Ollama but runs completely in the browser.
    1. webllm is a high-performance, in-browser language model inference engine that leverages WebGPU for hardware acceleration, enabling powerful LLM operations directly within web browsers without server-side processing. Thanks to open-source efforts like LLaMA, Alpaca, Vicuna, and Dolly, we are starting to see an exciting future of building our own open-source language models and personal AI assistants.
    2. Switch the toggle from remote to local to see the local models; the app will download one for you, and you're good to start chatting.
    3. Run Llama 8B or any other webllm model.
    4. Completely private and secure ✅ - no one can listen in on your conversations.
  3. Composability ✅
    1. Svelte
      1. version 4 ✅
    2. Embeddable and composable design ✅
    3. Serverless ready ✅
    4. CDN support ✅
  4. Responsive design for mobile phones ✅
    1. Mobile first approach ✅
    2. Skeleton
    3. Tailwind CSS
    4. Theme selector and persistence
    5. Contains UI animations ✅
  5. Python backend supporting FastAPI ✅
    1. LangChain
  6. Frontend and backend support multiple models (agentic in nature). The app supports the following APIs (see the sketch after this list):
    1. OpenAI
    2. Anthropic
    3. Ollama (local/remote) ✅
    4. Groq
  7. Code highlighting ✅
    1. Copy code button ✅
  8. What's next
    1. add web scraping
    2. document upload
    3. deletion of chats ✅
    4. workflows
    5. RAG with RAPTOR
    6. Stable Diffusion?
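
To illustrate how one code path can serve all four providers, here is a minimal sketch using LangChain's chat-model integrations. The integration packages and wiring shown are assumptions about a common LangChain setup, not necessarily how this repository is organized.

```python
# A sketch of provider-agnostic model selection with LangChain chat models.
# The integration packages (langchain_openai, langchain_anthropic,
# langchain_groq, langchain_community) are assumed to be installed; the
# repository's actual wiring may differ.
from langchain_openai import ChatOpenAI
from langchain_anthropic import ChatAnthropic
from langchain_groq import ChatGroq
from langchain_community.chat_models import ChatOllama  # local/remote Ollama

def get_chat_model(provider: str, model: str):
    # Every branch returns an object with the same .invoke()/.stream() API.
    if provider == "openai":
        return ChatOpenAI(model=model)
    if provider == "anthropic":
        return ChatAnthropic(model=model)
    if provider == "groq":
        return ChatGroq(model=model)
    if provider == "ollama":
        return ChatOllama(model=model)
    raise ValueError(f"Unknown provider: {provider}")

# The shared interface is what makes swapping models trivial:
llm = get_chat_model("ollama", "llama3")
for chunk in llm.stream("Why is the sky blue?"):
    print(chunk.content, end="", flush=True)
```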

Screenshot of MultiShot.AI

Installation and Usage

Now that models can run locally in your browser with webllm, you can run this app without the Python backend: skip that installation and jump straight to running the UI in the ./ui/ directory with npm run dev -- --port=3333. Select the local toggle button in the UI to download a model and run prompts locally without sending data over the web. It's a great way to keep your LLM chats private, and it costs the author nothing when you use the live demo at multishot.ai.

  1. Clone the repository
    1. If you want to run the app without a Python backend, skip the following steps and start at the bottom with pnpm or npm.
  2. Install Python (Python 3.7+ is recommended).
    1. Create a virtual environment python -m venv .venv
    2. Activate your virtual environment source .venv/bin/activate
  3. Install necessary libraries. This project uses FastAPI, uvicorn, LangChain, among others.
    1. If you haven't already, activate your virtual environment: source .venv/bin/activate
    2. In the server directory, run: pip install -r requirements.txt
  4. Add your OpenAI API key to ./server/.env, using example.env in the server directory as a template.
  5. Start the FastAPI server by running uvicorn server.main:app --reload
  6. Start the UI with pnpm (you can use npm if you prefer and have time).
    1. cd ./ui/
    2. pnpm install --save-dev vite
    3. pnpm build - if you want to build for production
    4. pnpm exec vite --port=3333 or npm run dev -- --port=3333
  7. Your UI will run on http://localhost:3333/ and your backend on http://127.0.0.1:8000/static/index.html.
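
To verify the backend is streaming, you can consume the response incrementally from a small Python client. This uses the same hypothetical /chat route as the sketch above; adjust the path and body to match the app's real API.

```python
# Smoke-test the streaming backend from Python. The /chat path and JSON body
# match the hypothetical sketch above; adjust them to the app's real API.
import httpx

with httpx.stream(
    "POST",
    "http://127.0.0.1:8000/chat",
    json={"prompt": "Hello"},
    timeout=None,
) as response:
    # iter_text() yields chunks as the server flushes them, rather than
    # waiting for the full response body.
    for chunk in response.iter_text():
        print(chunk, end="", flush=True)
```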
