Add openclip docs (#1131)
---------

Co-authored-by: Omar Sanseviero <[email protected]>
Co-authored-by: Pedro Cuenca <[email protected]>
3 people authored Nov 27, 2023
1 parent f376c2a commit a3fb686
Showing 3 changed files with 79 additions and 0 deletions.
2 changes: 2 additions & 0 deletions docs/hub/_toctree.yml
@@ -75,6 +75,8 @@
     title: Keras
   - local: ml-agents
     title: ML-Agents
+  - local: open_clip
+    title: OpenCLIP
   - local: paddlenlp
     title: PaddleNLP
   - local: rl-baselines3-zoo
1 change: 1 addition & 0 deletions docs/hub/models-libraries.md
@@ -20,6 +20,7 @@ The table below summarizes the supported libraries and their level of integration
 | [MidiTok](https://github.com/Natooz/MidiTok) | Tokenizers for symbolic music / MIDI files. |||||
 | [ML-Agents](https://github.com/huggingface/ml-agents) | Enables games and simulations made with Unity to serve as environments for training intelligent agents. |||||
 | [NeMo](https://github.com/NVIDIA/NeMo) | Conversational AI toolkit built for researchers |||||
+| [OpenCLIP](https://github.com/mlfoundations/open_clip) | Library for open-source implementation of OpenAI's CLIP |||||
 | [PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP) | Easy-to-use and powerful NLP library built on PaddlePaddle |||||
 | [Pyannote](https://github.com/pyannote/pyannote-audio) | Neural building blocks for speaker diarization. |||||
 | [PyCTCDecode](https://github.com/kensho-technologies/pyctcdecode) | Language model supported CTC decoding for speech recognition |||||
76 changes: 76 additions & 0 deletions docs/hub/open_clip.md
@@ -0,0 +1,76 @@
# Using OpenCLIP at Hugging Face

[OpenCLIP](https://github.com/mlfoundations/open_clip) is an open-source implementation of OpenAI's CLIP.

## Exploring OpenCLIP on the Hub

You can find OpenCLIP models by filtering at the left of the [models page](https://huggingface.co/models?library=open_clip&sort=trending).

All OpenCLIP models hosted on the Hub come with a model card that provides useful information about the model. Thanks to OpenCLIP's Hugging Face Hub integration, you can load these models with a few lines of code. You can also deploy them with [Inference Endpoints](https://huggingface.co/inference-endpoints).

## Installation

To get started, you can follow the [OpenCLIP installation guide](https://github.com/mlfoundations/open_clip#usage), or use the following one-line pip install:

```bash
$ pip install open_clip_torch
```
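
After installation, you can run a quick sanity check by listing a few of the pretrained checkpoints OpenCLIP ships with. This is a minimal sketch using `open_clip.list_pretrained()`, which returns `(architecture, pretrained_tag)` pairs:

```py
import open_clip

# Each entry is an (architecture, pretrained_tag) pair
for model_name, pretrained in open_clip.list_pretrained()[:5]:
    print(model_name, pretrained)
```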

## Using existing models

All OpenCLIP models can easily be loaded from the Hub:

```py
import open_clip

model, preprocess = open_clip.create_model_from_pretrained('hf-hub:laion/CLIP-ViT-g-14-laion2B-s12B-b42K')
tokenizer = open_clip.get_tokenizer('hf-hub:laion/CLIP-ViT-g-14-laion2B-s12B-b42K')
```
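
If you also need separate train-time and eval-time preprocessing (for example, for fine-tuning), `open_clip.create_model_and_transforms` accepts the same `hf-hub:` identifier; here is a minimal sketch with the same checkpoint. The zero-shot example below continues to use `model` and `preprocess` from the snippet above.

```py
import open_clip

# Returns the model plus separate train-time and eval-time preprocessing
model, preprocess_train, preprocess_val = open_clip.create_model_and_transforms(
    'hf-hub:laion/CLIP-ViT-g-14-laion2B-s12B-b42K'
)
```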

Once loaded, you can encode an image and a set of candidate text labels to do [zero-shot image classification](https://huggingface.co/tasks/zero-shot-image-classification):

```py
import requests
import torch
from PIL import Image

# Download an example image and apply the model's preprocessing
url = 'http://images.cocodataset.org/val2017/000000039769.jpg'
image = Image.open(requests.get(url, stream=True).raw)
image = preprocess(image).unsqueeze(0)
text = tokenizer(["a diagram", "a dog", "a cat"])

with torch.no_grad(), torch.cuda.amp.autocast():
    image_features = model.encode_image(image)
    text_features = model.encode_text(text)
    # Normalize so the dot product below is a cosine similarity
    image_features /= image_features.norm(dim=-1, keepdim=True)
    text_features /= text_features.norm(dim=-1, keepdim=True)

    text_probs = (100.0 * image_features @ text_features.T).softmax(dim=-1)

print("Label probs:", text_probs)
```

It outputs the probability of each possible class:

```text
Label probs: tensor([[0.0020, 0.0034, 0.9946]])
```
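
To map these probabilities back to a predicted label, pair them with the prompts you tokenized. A small follow-up sketch:

```py
labels = ["a diagram", "a dog", "a cat"]

# text_probs has shape (num_images, num_labels)
top_idx = text_probs.argmax(dim=-1).item()
print(f"Predicted label: {labels[top_idx]} ({text_probs[0, top_idx].item():.4f})")
```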

If you want to load a specific OpenCLIP model, you can click `Use in OpenCLIP` on the model card and you will get a working snippet!

<div class="flex justify-center">
<img class="block dark:hidden" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/hub/openclip_repo_light.png"/>
<img class="hidden dark:block" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/hub/openclip_repo.png"/>
</div>
<div class="flex justify-center">
<img class="block dark:hidden" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/hub/openclip_snippet_light.png"/>
<img class="hidden dark:block" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/hub/openclip_snippet.png"/>
</div>

## Additional resources

* OpenCLIP [repository](https://github.com/mlfoundations/open_clip)
* OpenCLIP [docs](https://github.com/mlfoundations/open_clip/tree/main/docs)
* OpenCLIP [models in the Hub](https://huggingface.co/models?library=open_clip&sort=trending)
