sd-webui-blip2 is a Stable Diffusion WebUI extension that generates image captions with BLIP-2. Using that caption as a prompt may help you get closer to your ideal picture.
Open "Extensions" -> "Install from URL" and paste the link below.
https://github.com/Tps-F/sd-webui-blip2.git
If you receive the message "Can't install salesforce-lavis", please follow the steps below.
Windows: Open PowerShell as administrator in your stable-diffusion-webui directory and type
Set-ExecutionPolicy RemoteSigned -Scope CurrentUser -Force
./venv/scripts/activate
pip install salesforce-lavis
Mac: Open Terminal in your stable-diffusion-webui directory and type
source venv/bin/activate
pip install salesforce-lavis
Build from source
A C++ build environment is required.
git clone https://github.com/salesforce/LAVIS.git
cd LAVIS
pip install -e .
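Whichever install route you used, it can help to confirm that the package is visible to the Python interpreter inside the venv before restarting the WebUI. The snippet below is a minimal sketch of such a check, not part of the extension. It assumes the import name of salesforce-lavis is `lavis`; the helper `is_module_available` is a hypothetical name used only for illustration.

```python
import importlib.util


def is_module_available(module_name: str) -> bool:
    """Return True if `module_name` can be found by the current interpreter."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # "lavis" is the import name assumed for the salesforce-lavis package.
    if is_module_available("lavis"):
        print("lavis is importable from this environment")
    else:
        print("lavis was not found; check that the venv is activated")
```

Run it with the venv activated, so that it inspects the same environment the WebUI uses.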
First, select a model. If that model has not been downloaded yet, the download will begin automatically. Please be patient... Next, select the image you want to caption and press "Generate Caption"!
Information about the parameters is as follows:
Name | Description |
---|---|
Beam Search | Generates a single prompt |
Nucleus Sampling | Generates three prompts |
Length Penalty | Controls the answer length |
Repeat Penalty | Larger values discourage repetition |
Temperature | Higher temperature makes low-probability tokens more likely to be sampled |
If there are any other features you need, please report them in an issue!
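To make the Temperature row of the parameter table more concrete, here is a small sketch of the usual temperature-scaled softmax. It illustrates the general technique only; how the extension or LAVIS applies temperature internally is not shown here. The function name and the three scores are hypothetical values chosen for illustration.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert raw scores to probabilities, scaled by a temperature > 0."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for three candidate tokens (illustrative values only).
logits = [2.0, 1.0, 0.0]
for t in (0.5, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

As the temperature rises, the probability of the lowest-scoring token grows and the distribution flattens, which is why higher values tend to give more varied captions.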
Windows
winget install Microsoft.VisualStudio.2022.BuildTools
Then install the Windows 11 SDK from
Visual Studio Installer > Modify > check "Desktop development with C++"
or download it from
https://developer.microsoft.com/ja-jp/windows/downloads/windows-sdk/
Linux
sudo apt-get install cmake
or, on Arch Linux,
sudo pacman -Syu cmake
Mac (has some issues)
xcode-select --install
brew install cmake
ImportError: cannot import name 'ALL_LAYERNORM_LAYERS' from 'transformers.pytorch_utils' (/Users/ftps/stable-diffusion-webui/venv/lib/python3.10/site-packages/transformers/pytorch_utils.py)
Open stable-diffusion-webui/venv/lib/python3.10/site-packages/transformers/pytorch_utils.py and add the following line:
ALL_LAYERNORM_LAYERS = [nn.LayerNorm]
Add it before this line:
logger = logging.get_logger(__name__)
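If you prefer to script this edit rather than make it by hand, the idea can be sketched as a small text transformation. This is only an illustration run on an in-memory stand-in string, not on the real file. The helper `insert_line_before` is a hypothetical name, and the sample text is not the actual contents of pytorch_utils.py.

```python
def insert_line_before(source: str, marker: str, new_line: str) -> str:
    """Insert `new_line` before the first line equal to `marker`.

    Returns the source unchanged if `new_line` is already present,
    and raises ValueError if the marker line cannot be found.
    """
    lines = source.splitlines()
    if new_line in lines:
        return source  # already patched, nothing to do
    if marker not in lines:
        raise ValueError(f"marker line not found: {marker!r}")
    index = lines.index(marker)
    lines.insert(index, new_line)
    return "\n".join(lines) + "\n"


# Tiny stand-in for the relevant part of pytorch_utils.py (not the real file).
sample = "from torch import nn\n\nlogger = logging.get_logger(__name__)\n"
patched = insert_line_before(
    sample,
    "logger = logging.get_logger(__name__)",
    "ALL_LAYERNORM_LAYERS = [nn.LayerNorm]",
)
print(patched)
```

The early return makes the helper safe to run twice. Back up the real file before applying anything like this to it.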
PermissionError: [Errno 13] Permission denied: '/Users/ftps/.cache/torch/hub/checkpoints
Open Terminal in your home directory and type
chmod +x .cache
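Before or after changing permissions, you may want to see what the current user is actually allowed to do with the cache directory. The following is a minimal diagnostic sketch using only the standard library. The helper `describe_access` is a hypothetical name, and `~/.cache` is assumed from the error message above.

```python
import os
from pathlib import Path


def describe_access(path):
    """Report whether `path` exists and what the current user may do with it."""
    p = Path(path).expanduser()
    return {
        "path": str(p),
        "exists": p.exists(),
        "readable": os.access(p, os.R_OK),
        "writable": os.access(p, os.W_OK),
        "searchable": os.access(p, os.X_OK),
    }


if __name__ == "__main__":
    # ~/.cache is taken from the error message above; adjust if yours differs.
    for key, value in describe_access("~/.cache").items():
        print(f"{key}: {value}")
```

If `writable` is reported as False, the model download cannot create files there, which would be consistent with the PermissionError shown above.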