This is a single-file ggml implementation of Depth-Anything-V2.
ggml needs to be installed globally.
To compile the source code, use CMake:

    mkdir build && cd build
    cmake .. && make

Please refer to Hugging Face for the required model weights.
Model | Download Link |
---|---|
Small | Download |
Base | Download |
Large | Download |

    usage: ./dptv2 <s|b|l> <WEIGHTS> <INPUT_FILE> <OUTPUT_FILE>

    Run DPTv2 model on an input image.

    options:
      vit-size {s,b,l}      Specify the Vision Transformer size (`s`, `b`, `l`).
      weights WEIGHTS       Path to the model weights file (.safetensors).
      input INPUT_FILE      Path to the input image.
      output OUTPUT_FILE    Path to save the output image.

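The four positional arguments can be checked before launching the model. The following is a minimal sketch of such a check, not part of `dptv2` itself: the helper name `check_dptv2_args` is hypothetical, and it only assumes the three sizes from the usage line and that the weights and input paths must already exist.

```shell
# Hypothetical pre-flight check; dptv2 does not ship this helper.
# Returns 0 when the arguments look usable, 1 otherwise.
check_dptv2_args() {
    if [ "$#" -ne 4 ]; then
        echo "expected 4 arguments: <s|b|l> <WEIGHTS> <INPUT_FILE> <OUTPUT_FILE>" >&2
        return 1
    fi
    # Only the sizes named in the usage line are accepted.
    case "$1" in
        s|b|l) ;;
        *) echo "unknown vit-size: $1" >&2; return 1 ;;
    esac
    # The weights and the input image are read, so both must exist.
    [ -f "$2" ] || { echo "weights file not found: $2" >&2; return 1; }
    [ -f "$3" ] || { echo "input image not found: $3" >&2; return 1; }
    return 0
}
```

It could be used as `check_dptv2_args s weights/vits.safetensors docs/input.jpg docs/vits.jpg && ./dptv2 s weights/vits.safetensors docs/input.jpg docs/vits.jpg`.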
To save the output to a file:

    dptv2 s weights/vits.safetensors docs/input.jpg docs/vits.jpg
    dptv2 b weights/vitb.safetensors docs/input.jpg docs/vitb.jpg
    dptv2 l weights/vitl.safetensors docs/input.jpg docs/vitl.jpg

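The commands above differ only in the size letter, so they can be wrapped in a loop. This is a sketch under assumptions: the function name `run_all_sizes` and the `DPTV2` environment variable (used to override the binary path) are hypothetical, and the weights are assumed to follow the `weights/vit<size>.safetensors` naming used in the examples.

```shell
# Run every model size on one input image.
# DPTV2 is a hypothetical override for the binary location; defaults to "dptv2".
run_all_sizes() {
    input="$1"    # path to the input image
    outdir="$2"   # directory that receives vits.jpg, vitb.jpg, vitl.jpg
    for size in s b l; do
        "${DPTV2:-dptv2}" "$size" "weights/vit${size}.safetensors" \
            "$input" "$outdir/vit${size}.jpg" || return 1
    done
}
```

For example, `DPTV2=./dptv2 run_all_sizes docs/input.jpg docs` would produce the same three output files as the commands listed above, stopping at the first failure.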
Thanks to safetensors-cpp, stb and tinycolormap for their beautiful work.