Welcome to BioMANIA! This guide provides detailed instructions on how to set up, run, and interact with the BioMANIA chatbot interface, which connects to various APIs to deliver information across numerous libraries and frameworks.
Project Overview:
🌟 We warmly invite you to share your trained models and datasets in our issues section, making it easier for others to utilize and extend your work, thus amplifying its impact. Feel free to explore and provide feedback on tools shared by other contributors as well! 🚀🔍
🤗 If you encounter any problems during your exploration, please refer to the Q&A section, and feel free to open issues for discussion! 🧐 👨💻
Our demonstration shows how to use a chatbot to work with scanpy and squidpy in a single conversation, including loading data, invoking functions for analysis, and presenting outputs in the form of code, images, and tables.
We also offer a command-line interface (CLI) demo through the terminal.
We provide an online demo hosted on our server!
(240929: For the online demo, note that connections may be delayed when multiple users are active. We check the demo every day, and any issues will be fixed by the next day. For now we recommend asking questions in English, as the corpus is designed for English and the results will be more accurate.)
We provide several ways to run the service: Python script, terminal CLI, Docker, and Colab demo. Among these, the terminal CLI is the easiest way to start.
# setup the environment
pip install git+https://github.com/batmen-lab/BioMANIA.git --index-url https://pypi.org/simple
# setup OPENAI_API_KEY
echo 'OPENAI_API_KEY="sk-proj-xxxx"' >> .env
# (optional) setup github token
echo "GITHUB_TOKEN=your_github_token" >> .env
# download data, retriever, and resources from Google Drive, and put them in
# - data/standard_process/{LIB} and
# - hugging_models/retriever_model_finetuned/{LIB} and
# - ../../resources/
pip install gdown
gdown https://drive.google.com/uc?id=1nT28pIJ_dsdvi2yD8ffWt_aePXsSWdqI
sh download_data_model.sh
# setup the PYTHONPATH
export PYTHONPATH=$PYTHONPATH:$(pwd)
# CLI service quick start!
pip install gradio
python -m BioMANIA.deploy.cli_demo
# or the gradio app. (TODO 240509: image display is still under development!)
#python -m BioMANIA.deploy.cli_gradio
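Before launching the CLI, it can help to confirm that the .env file written by the commands above actually contains the key. The following is a minimal sketch, not part of BioMANIA: the helper names read_env_file and missing_keys are hypothetical, and the parser only handles the simple KEY=VALUE lines used in this guide.

```python
from pathlib import Path


def read_env_file(path):
    """Parse simple KEY=VALUE lines from a .env file into a dict."""
    values = {}
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes, as in OPENAI_API_KEY="sk-proj-xxxx".
        values[key.strip()] = value.strip().strip("'\"")
    return values


def missing_keys(path, required=("OPENAI_API_KEY",)):
    """Return the required keys that are absent or empty in the .env file."""
    env = read_env_file(path) if Path(path).exists() else {}
    return [key for key in required if not env.get(key)]
```

Calling missing_keys(".env") from the folder that holds the file returns an empty list when the key is present. GITHUB_TOKEN is optional in this guide, so it is not checked by default.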
For ease of use, we provide a Docker image containing scanpy, squidpy, ehrapy, and snapatac2. You can find the detailed tool list on Docker Hub.
# Pull back-end service and front-end UI service with:
# 241016 updated
sudo docker pull chatbotuibiomania/biomania-together:v1.1.12-cuda12.6-ubuntu22.04
Start the service with:
# run on gpu
sudo docker run -e LIB=scanpy -e OPENAI_API_KEY=[your_openai_api_key] -e GITHUB_TOKEN=[github_pat_xxx] --gpus all -d -p 3000:3000 chatbotuibiomania/biomania-together:v1.1.12-cuda12.6-ubuntu22.04
# or on cpu
sudo docker run -e LIB=scanpy -e OPENAI_API_KEY=[your_openai_api_key] -e GITHUB_TOKEN=[github_pat_xxx] -d -p 3000:3000 chatbotuibiomania/biomania-together:v1.1.12-cuda12.6-ubuntu22.04
Then check the UI service at http://localhost:3000/en.
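Because the container is started in detached mode, the UI may take a moment to come up. As a rough sketch (the function name wait_for_service and its default timings are assumptions, not part of BioMANIA), the URL can be polled with the Python standard library until it answers:

```python
import time
import urllib.error
import urllib.request


def wait_for_service(url, timeout=60.0, interval=2.0):
    """Poll `url` until it answers with HTTP 200 or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            with urllib.request.urlopen(url, timeout=interval) as response:
                if response.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            # Connection refused, timed out, or an HTTP error status:
            # the service is probably still starting, so keep polling.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

For example, wait_for_service("http://localhost:3000/en") returns True once the page responds, and False if nothing answers within the timeout.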
Important Tips for Running Docker Without Bugs:
- To run Docker on a GPU, you need to install nvidia-docker and the NVIDIA Container Toolkit. Run docker info | grep "Default Runtime" to check whether your device can run Docker with a GPU.
- Feel free to adjust the CUDA image version inside the Dockerfile to match a CUDA setup that is compatible with your device.
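The same runtime check can be scripted. The sketch below is an illustration under assumptions: the helper names are hypothetical, and it only reports the "Default Runtime" line that docker info prints, leaving the interpretation of the value to you.

```python
import subprocess


def parse_default_runtime(docker_info_text):
    """Return the value of the 'Default Runtime' line, or None if it is absent."""
    for line in docker_info_text.splitlines():
        name, sep, value = line.partition(":")
        if sep and name.strip() == "Default Runtime":
            return value.strip()
    return None


def docker_default_runtime():
    """Run `docker info` and extract the default runtime.

    Returns None if the docker command is missing or exits with an error
    (for example, when it needs sudo on this machine).
    """
    try:
        result = subprocess.run(
            ["docker", "info"], capture_output=True, text=True, check=True
        )
    except (FileNotFoundError, subprocess.CalledProcessError):
        return None
    return parse_default_runtime(result.stdout)
```

Keeping the parsing separate from the subprocess call makes the text handling easy to test without Docker installed.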
We understand the desire to run the service on a server and view it locally. You can start the ngrok service by running this command on your server:
ngrok http 3000
Then take the resulting URL, which looks like https://[ngrok_id].ngrok-free.app, and open it in Chrome to start!
This section is for users who want to customize the setup for more flexible functionality.
Let's take scanpy as an example. Detailed library support information can be found in the Q&A.
To prepare your environment for the BioMANIA project, follow these steps:
- Clone the repository and install dependencies:
git clone https://github.com/batmen-lab/BioMANIA.git
cd BioMANIA
conda create -n biomania python=3.9
conda activate biomania
pip install -r requirements.txt --index-url https://pypi.org/simple
export PYTHONPATH=$PYTHONPATH:$(pwd)
- Set up your OpenAI API key in the BioMANIA/.env file.
echo 'OPENAI_API_KEY="sk-proj-xxxx"' >> .env
- For inference purposes, a standard OpenAI API key is sufficient.
- If you intend to use functionalities such as instruction generation or GPT API predictions, a paid OpenAI account is required, because these tasks may otherwise hit the rate limit.
- Feel free to switch to model_name='gpt-3.5-turbo-0125' or gpt-4-0125-preview in src/models/model.py if you want.
Download the necessary data and models from our Google Drive link. For the library data, you can download only the libraries you need.
We provide a script for downloading models and data from Google Drive, using scanpy as an example. This works if you have access to Google.
gdown https://drive.google.com/uc?id=1nT28pIJ_dsdvi2yD8ffWt_aePXsSWdqI
sh download_data_model.sh
Organize the downloaded files under BioMANIA/data and BioMANIA/hugging_models as follows (the base folder is required):
data
├── conversations
├── others-data
└── standard_process
    ├── base
    │   ├── API_composite.json
    │   └── ...
    ├── scanpy
    │   ├── API_composite.json
    │   └── ...
    ├── {LIB}
    │   ├── API_composite.json
    │   └── ...
    └── ...
hugging_models
└── retriever_model_finetuned
    ├── {LIB}
    └── ...
../../resources
By meticulously following the steps above, you'll have all the essential data and models perfectly organized for the project.
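To double-check the layout after downloading, a small script can report what is missing. This is a sketch under assumptions: the function name missing_layout_paths is hypothetical, and it checks only the entries named in the tree above (the API_composite.json files for base and the chosen library, plus the library's retriever folder), not the elided "..." contents.

```python
from pathlib import Path


def missing_layout_paths(root, lib):
    """List expected files and folders for `lib` that are absent under `root`."""
    root = Path(root)
    expected = [
        # The base folder is required regardless of the chosen library.
        root / "data" / "standard_process" / "base" / "API_composite.json",
        root / "data" / "standard_process" / lib / "API_composite.json",
        root / "hugging_models" / "retriever_model_finetuned" / lib,
    ]
    return [
        path.relative_to(root).as_posix() for path in expected if not path.exists()
    ]
```

Running missing_layout_paths(".", "scanpy") from the BioMANIA folder returns an empty list when these entries are in place.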
We also offer some demo chats, which you can find in ./examples. Note that these demo chats are converted from the Read the Docs tutorials of the PyPI libraries. You can find the original tutorial links in tutorial_links.txt.
The front-end UI is compatible with Node.js version 19.
# Under folder BioMANIA/chatbot_ui_biomania
npm install && npm run build
Start both services for back-end and front-end UI with:
# Under folder `BioMANIA/`
# backend, in one terminal
python -m src.deploy.inference_dialog_server
# frontend, in another terminal
cd chatbot_ui_biomania/
npm run dev
Your chatbot server is now operational at http://localhost:3000/en, primed to process user queries.
When you select a different library on the UI page, the retriever's path automatically changes to match the selected library.
For users who wish to customize functionality more deeply, we provide a script example that demonstrates direct interaction with the BioMANIA library via a Python script. In this example, users can
- switch the initially loaded library
- change the LLM type to either an Ollama-supported model, e.g. llama3, or an OpenAI-supported model, e.g. gpt-3.5-turbo
- manage the conversation state, either continuing a previously saved session or starting a new conversation
This method is particularly suited for developers and researchers who want to quickly adjust and test different data processing strategies based on specific research needs.
# under BioMANIA/
from src.deploy.model import Model
conversation_started = True
model = Model(logger=None, device='cpu', model_llm_type='llama3')
user_input = "Could you load the built in dataset?"
library = "scanpy"
# for the first turn of a dialog, use conversation_started=True, then use conversation_started=False for the following dialogs
# if you want to use previous session, use the same session_id as before and conversation_started = False
model.run_pipeline(user_input, library, top_k=1, files=[], conversation_started=conversation_started, session_id="")
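For a dialog with several turns, the conversation_started rule from the comments above can be wrapped in a small helper. This is only a sketch: run_dialog and its resume parameter are hypothetical names, it relies solely on the run_pipeline arguments shown in the example above, and it makes no assumption about what run_pipeline returns beyond collecting it.

```python
def run_dialog(model, library, turns, session_id="", resume=False):
    """Send each user message in `turns` through model.run_pipeline in order.

    `model` is any object with a run_pipeline method accepting the arguments
    used in the example above. Set resume=True to continue a previous session
    (reuse its session_id), so that no turn restarts the conversation.
    """
    results = []
    for index, user_input in enumerate(turns):
        result = model.run_pipeline(
            user_input,
            library,
            top_k=1,
            files=[],
            # Only the first turn of a brand-new dialog starts the conversation.
            conversation_started=(index == 0 and not resume),
            session_id=session_id,
        )
        results.append(result)
    return results
```

With the model object from the example above, run_dialog(model, "scanpy", ["Could you load the built in dataset?"]) sends a single first turn; later messages can be appended to the list to continue the same dialog.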
Please refer to the separate README for tutorials on converting different coding tools into our APP.
If you want to share your pretrained APP with others, there are two ways.
You can build a Docker image, push it to Docker Hub, and share the image URL in our issues. For the environment settings of your tool, please refer to BioMANIA/docker_utils/{LIB}/ to add the env files, or modify the Dockerfile to build your environment.
# cd BioMANIA
sudo docker build --build-arg LIB=[your_tool_name] -t [docker_image_name] -f Dockerfile ./
# (optional) push to Docker Hub
sudo docker push [your_docker_repo]/[docker_image_name]:[tag]
Note that if you want to include data inside the Docker image, please modify the Dockerfile carefully to copy the folders to /app. Also add your PyPI or Git pip install URL to requirements.txt before packaging the Docker image.
Alternatively, you can simply share your data and hugging_models folders and logo image via a drive link in our issues.
We extend our gratitude to the following references:
Thank you for choosing BioMANIA. We hope this guide assists you in navigating through our project with ease.
- v1.1.12 (2024-10-16)
- Update code scripts, upload data and models, and update Docker to align with the paper.
- Will renew the scripts for generating report, documents for Git2APP, R2APP soon.
- Update report generation.
- Update R2APP and Git2APP document.
View version_history for more details!
Please cite our paper if you find our data, model, or code useful.
@article{dong2023biomania,
title={BioMANIA: Simplifying bioinformatics data analysis through conversation},
author={Dong, Zhengyuan and Zhong, Victor and Lu, Yang},
journal={bioRxiv},
pages={2023--10},
year={2023},
publisher={Cold Spring Harbor Laboratory}
}