In this example, we use BiT (Big Transfer) to build an end-to-end neural image search system. You can use this demo to index an image dataset and query it for the most similar images.
Features that come out of the box:
- Interactive query
- Index with shards
- REST and gRPC gateway
- Dashboard monitor
To save you from dependency hell, we'll use the containerized version in these instructions. That means you only need to have Docker installed. No Python virtualenv, no Python package (un)install.
NOTE: Use Python 3.7 for this example.
Table of Contents
- TL;DR: Just Show Me the Pokemon!
- Run outside of Docker
- Troubleshooting
- Documentation
- Community
- License
I want Pokémon! I don't care about Jina cloud-native neural search or whatever big names you throw around, just show me the Pokémon!
We have a pre-built Docker image ready to use. Run this in your console:
docker run -p 45678:45678 jinahub/app.example.pokedexwithbit:0.0.1-0.9.20
So now you're ready to query! And for that you have two options:
- You can use Jinabox.js to find the Pokemon that matches most closely. Just set the endpoint to http://127.0.0.1:45678/api/search and drag from the thumbnails on the left or from your file manager.
- Or you can curl/query/js it via an HTTP POST request. Details here.
For this example we're using Pokemon sprites from veekun.com. To download them run:
sh ./get_data.sh
This example uses the BiT (Big Transfer) model. To download it, run:
sh ./download.sh
python app.py -t index
After indexing completes, you should see a new workspace folder, which contains all the encoded data generated during indexing.
python app.py -t query_restful
And then follow the Jinabox instructions from the Query from Docker section above.
Jina's REST API uses the data URI scheme to represent multimedia data. To query your indexed data, organize your picture(s) into this scheme and send a POST request to http://0.0.0.0:45678/api/search, e.g.:
curl --verbose --request POST -d '{"top_k": 10, "mode": "search", "data": ["data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAgAAAAICAIAAABLbSncAAAA2ElEQVR4nADIADf/AxWcWRUeCEeBO68T3u1qLWarHqMaxDnxhAEaLh0Ssu6ZGfnKcjP4CeDLoJok3o4aOPYAJocsjktZfo4Z7Q/WR1UTgppAAdguAhR+AUm9AnqRH2jgdBZ0R+kKxAFoAME32BL7fwQbcLzhw+dXMmY9BS9K8EarXyWLH8VYK1MACkxlLTY4Eh69XfjpROqjE7P0AeBx6DGmA8/lRRlTCmPkL196pC0aWBkVs2wyjqb/LABVYL8Xgeomjl3VtEMxAeaUrGvnIawVh/oBAAD///GwU6v3yCoVAAAAAElFTkSuQmCC", "data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAgAAAAICAIAAABLbSncAAAA2ElEQVR4nADIADf/AvdGjTZeOlQq07xSYPgJjlWRwfWEBx2+CgAVrPrP+O5ghhOa+a0cocoWnaMJFAsBuCQCgiJOKDBcIQTiLieOrPD/cp/6iZ/Iu4HqAh5dGzggIQVJI3WqTxwVTDjs5XJOy38AlgHoaKgY+xJEXeFTyR7FOfF7JNWjs3b8evQE6B2dTDvQZx3n3Rz6rgOtVlaZRLvR9geCAxuY3G+0mepEAhrTISES3bwPWYYi48OUrQOc//IaJeij9xZGGmDIG9kc73fNI7eA8VMBAAD//0SxXMMT90UdAAAAAElFTkSuQmCC"]}' -H 'Content-Type: application/json' 'http://0.0.0.0:45678/api/search'
JSON payload syntax and spec can be found in the docs.
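The same request can also be sent from Python. The following is a minimal sketch using only the standard library. It assumes the gateway is listening at the address used in the curl example, and the helper names (`to_data_uri`, `build_payload`, `search`) are illustrative, not part of Jina's API. The payload fields mirror the curl example; nothing is assumed about the response beyond it being JSON.

```python
import base64
import json
import mimetypes
import urllib.request


def to_data_uri(image_bytes, mime_type="image/png"):
    """Encode raw image bytes as a base64 data URI."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def build_payload(image_paths, top_k=10):
    """Build a JSON body with the same fields as the curl example."""
    uris = []
    for path in image_paths:
        # Fall back to PNG when the type cannot be guessed from the file name.
        mime_type = mimetypes.guess_type(path)[0] or "image/png"
        with open(path, "rb") as f:
            uris.append(to_data_uri(f.read(), mime_type))
    return {"top_k": top_k, "mode": "search", "data": uris}


def search(image_paths, endpoint="http://0.0.0.0:45678/api/search", top_k=10):
    """POST the payload to the REST gateway and return the decoded JSON reply."""
    body = json.dumps(build_payload(image_paths, top_k)).encode("utf-8")
    request = urllib.request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

For instance, `search(["pikachu.png"])` would query with a local file (the file name here is hypothetical) once the REST gateway is running.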
The above explains how to use a REST gateway, but by default Jina uses a gRPC gateway, which has much higher performance and richer features. Read our documentation on Jina IO for more information.
After playing with it for a while, you may want to change the code and rebuild the image. Simply run:
docker build -t jinaai/app.examples.pokedexwithbit .
If it's running successfully, you should be able to see logs scrolling in the console and in the dashboard:
Under $(pwd)/workspace, you'll see a list of directories chunk_compound_indexer-* after indexing. This is because we set shards to 8.
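A quick way to confirm the shard count is to list those directories. This is a sketch that assumes the default `workspace` folder in the current directory; `list_shard_dirs` is an illustrative helper name, not something the example ships with.

```python
from pathlib import Path


def list_shard_dirs(workspace="workspace", prefix="chunk_compound_indexer-"):
    """Return the shard directories under the workspace, sorted by path."""
    root = Path(workspace)
    if not root.is_dir():
        # Nothing has been indexed yet (or the path is wrong).
        return []
    # Keep only directories whose name starts with the shard prefix.
    return sorted(p for p in root.glob(prefix + "*") if p.is_dir())
```

With shards set to 8, `len(list_shard_dirs())` would be expected to return 8 after a complete indexing run.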
The BiT model seems pretty resource-hungry. If you are using Docker Desktop, make sure to assign enough memory to your Docker container, especially when you have multiple shards. Below are my macOS settings with two shards:
Incremental indexing and entry-level deletion are not yet supported in this demo. Duplicate indexing may not throw exceptions, but it may produce strange results, so make sure to clean $(pwd)/workspace before each run.
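Cleaning the workspace can be scripted. Here is a minimal sketch, assuming the workspace lives under `workspace` in the current directory; `clean_workspace` is a hypothetical helper, not something app.py provides.

```python
import shutil
from pathlib import Path


def clean_workspace(workspace="workspace"):
    """Delete the workspace folder if it exists.

    Returns True if a folder was removed, False if there was nothing to remove.
    """
    path = Path(workspace)
    if path.is_dir():
        # Removes the folder and everything inside it, including all shards.
        shutil.rmtree(path)
        return True
    return False
```

Calling `clean_workspace()` before `python app.py -t index` is one way to make sure each run starts from an empty index.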
Run into other problems? Check our troubleshooting guide or submit a GitHub issue.
The best way to learn Jina in depth is to read our documentation. Documentation is built on every push, merge, and release event of the master branch. You can find more details about the following topics in our documentation.
- Jina command line interface arguments explained
- Jina Python API interface
- Jina YAML syntax for executor, driver and flow
- Jina Protobuf schema
- Environment variables used in Jina
- ... and more
- Slack channel - a communication platform for developers to discuss Jina
- Community newsletter - subscribe to the latest update, release and event news of Jina
- LinkedIn - get to know Jina AI as a company and find job opportunities
- Twitter - follow us and interact with us using hashtag #JinaSearch
- Company - learn more about our company; we are fully committed to open source!
Copyright (c) 2021 Jina AI Limited. All rights reserved.
Jina is licensed under the Apache License, Version 2.0. See LICENSE for the full license text.