This service, built with Python and Flask, utilizes Optical Character Recognition (OCR) technology to pinpoint words in images with remarkable precision. Simply submit an image along with a target word via a POST request, and the tool goes to work. If the word is located, it returns the exact coordinates of where it was found within the image. If the word remains elusive, the service will let you know it couldn't find it.
- Keras OCR Flask Service
- API
- Running with Docker Compose
- Development docker container
- Running natively
- Send a POST request to /process as form-data
- Include the screenshot as "file" and the word you are searching for as "word"
- The service will return a JSON response
{
"result": "found",
"x": 3464,
"y": 1872
}
or
{
"result": "not found"
}
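For example, you can call the endpoint from Python with the requests library. This is a minimal sketch; the host and port are assumptions (Flask's default of 5000 is used here), so substitute whatever address your deployment exposes.

```python
import requests

# Assumed endpoint; use the host/port your deployment actually binds to
URL = "http://localhost:5000/process"

with open("screenshot.png", "rb") as f:
    response = requests.post(
        URL,
        files={"file": f},           # the screenshot image
        data={"word": "inventory"},  # the word to search for
    )

result = response.json()
if result["result"] == "found":
    print(f"found at ({result['x']}, {result['y']})")
else:
    print("word not found")
```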
For these instructions we use Ubuntu Desktop 24.04 LTS.
We follow the official Docker instructions for installing on Ubuntu. We implore you to read their instructions in full, but here are the steps for brevity's sake.
Docker Desktop for Ubuntu does not work with this project; ensure you are installing Docker Engine and not Docker Desktop.
- Uninstall old versions
for pkg in docker.io docker-doc docker-compose docker-compose-v2 podman-docker containerd runc; do sudo apt-get remove $pkg; done
- Set up Docker's apt repository
# Add Docker's official GPG key:
sudo apt-get update
sudo apt-get install ca-certificates curl
sudo install -m 0755 -d /etc/apt/keyrings
sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
sudo chmod a+r /etc/apt/keyrings/docker.asc
# Add the repository to Apt sources:
echo \
"deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu \
$(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt-get update
- Install the docker packages
sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
- Manage docker as a non-root user
# add your current user to the docker group
sudo usermod -aG docker $USER
Log out and log back in after this step for the new group membership to take effect.
# test you can run docker without sudo
docker run hello-world
- (Optional) If Docker is not starting for you, try this workaround to fix a known issue with Ubuntu 24.04 and Docker Desktop (which you shouldn't have installed if you followed these instructions!)
sudo sysctl -w kernel.apparmor_restrict_unprivileged_userns=0
This project will run on a CPU, but its success in game automation largely depends on the performance boost from accelerating TensorFlow on a CUDA-enabled NVIDIA GPU.
- Check if you already have a supported driver installed. A working driver is commonly set up during Ubuntu's installation if you allowed third-party drivers.
# if you receive a table of driver and GPU information, you have a working driver
nvidia-smi
If no output is given, or the command is not found, follow these instructions to install an appropriate driver:
See CUDA GPUs - Compute Capability to find out which compute capability your GPU supports. For example, the RTX 4090 supports compute capability 8.9.
# use below if you want automatic detection
sudo ubuntu-drivers install
# you can view available drivers and select a specific one to install instead
sudo ubuntu-drivers list
Again, like the sections above, we implore you to read the official NVIDIA Container Toolkit documentation and use ours as a reference.
- Add the repository to apt
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
&& curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
- Update the list
sudo apt-get update
- Install the packages
sudo apt-get install -y nvidia-container-toolkit
- Configure docker
sudo nvidia-ctk runtime configure --runtime=docker
- Restart docker
sudo systemctl restart docker
- Verify it all worked
sudo docker run --rm --gpus all nvidia/cuda:11.0.3-base-ubuntu20.04 nvidia-smi
This command should produce the same output as running nvidia-smi on the host system.
- Open a terminal in the root directory of the repository
- Build and start the service
docker compose up
- Stop the service
docker compose stop
- Restart the service
docker compose restart
- Stop and remove all containers
docker compose down
You can build and run the container this way if you want to make changes to the code and have them picked up without rebuilding the whole container.
docker build -t <tag-name> -f Dockerfile.dev .
Ensure you run this command from the root of the repository so the correct directory is mounted into the container.
sudo docker run --shm-size=1g --ulimit memlock=-1 --name keras-ocr -it -v $(pwd):/repo --gpus all <tag-name>
Once you run the command above, you should be greeted with an interactive shell session inside the running container.
# install general dependencies
pip install -r requirements_docker.txt
# install tensorflow
pip install tensorflow==2.12.0
# test if GPU is detected
python test_cudapresence.py
# run the service
python keras_server.py
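For reference, checking for a GPU boils down to asking TensorFlow to enumerate its physical GPU devices. A minimal sketch of such a check (the actual test_cudapresence.py in the repo may differ):

```python
import tensorflow as tf

# List the GPUs TensorFlow can see; an empty list means no CUDA device is visible
gpus = tf.config.list_physical_devices("GPU")
print("Num GPUs available:", len(gpus))
```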
To start we will install TensorFlow for Linux following the official documentation. Our instructions assume you are using an NVIDIA graphics card for CUDA acceleration.
- We standardized on installing Keras OCR on Ubuntu Server 20.04 and these instructions are from a fresh install.
- This project is currently using TensorFlow 2.12.0
Miniconda is the recommended approach for installing TensorFlow with GPU support, and we follow this advice.
- Execute
curl https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -o Miniconda3-latest-Linux-x86_64.sh
- Then execute
bash Miniconda3-latest-Linux-x86_64.sh
- You may need to restart your terminal or source ~/.bashrc to enable the conda command.
- Use conda -V to test if it is installed successfully.
We will create a conda environment in which to operate. In Labs we use the /home/<user> directory. We stick with the home directory because most of our deploys are to native machines with no other services; they are meant just for Keras.
Starting in the home directory of the user (cd ~):
- Execute
conda create --name tf python=3.10
- Activate the environment with
conda activate tf
You can skip this part if you just want to run Keras on the CPU; however, many of the game tests that use Keras will fail, as the CPU is not fast enough for some of the timings expected in the game tests.
Use CUDA GPUs - Compute Capability to find out which compute capability your GPU supports. For example, the RTX 4090 supports compute capability 8.9.
See the TensorFlow tested build configurations for the CUDA and cuDNN versions that match each TensorFlow release.
- Install the graphics card driver.
- Execute the following if you want automatic detection
sudo ubuntu-drivers install
- If you want to see the available drivers, run
sudo ubuntu-drivers list
- Use the following command to verify it is installed
nvidia-smi
- Install the CUDA Toolkit with conda.
- Execute
conda install -c conda-forge cudatoolkit=11.8.0
- Install cuDNN with pip.
- Execute
pip install nvidia-cudnn-cu11==8.9.5
Use version 8.6.0.163 for 30-series GPUs.
- Configure the system paths. You can do this with the following commands every time you start a new terminal after activating your conda environment.
CUDNN_PATH=$(dirname $(python -c "import nvidia.cudnn;print(nvidia.cudnn.__file__)"))
export LD_LIBRARY_PATH=$CUDNN_PATH/lib:$CONDA_PREFIX/lib/:$LD_LIBRARY_PATH
- For your convenience it is recommended that you automate this with the following commands. The system paths will then be configured automatically when you activate this conda environment.
mkdir -p $CONDA_PREFIX/etc/conda/activate.d
echo 'CUDNN_PATH=$(dirname $(python -c "import nvidia.cudnn;print(nvidia.cudnn.__file__)"))' >> $CONDA_PREFIX/etc/conda/activate.d/env_vars.sh
echo 'export LD_LIBRARY_PATH=$CUDNN_PATH/lib:$CONDA_PREFIX/lib/:$LD_LIBRARY_PATH' >> $CONDA_PREFIX/etc/conda/activate.d/env_vars.sh
- Install TensorFlow
pip install --upgrade pip
pip install tensorflow==2.12.0
Test it works on CPU
python3 -c "import tensorflow as tf; print(tf.reduce_sum(tf.random.normal([1000, 1000])))"
Test it works on GPU
python3 -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"
Now we can install the rest of the dependencies and test if our API is working.
- Install the rest of the dependencies.
- Execute
pip install -r requirements.txt
- Test Keras and TensorFlow.
- Execute
python3 test_cudapresence.py
- It should print out that a GPU is available.
- Run the service.
- Execute
run-keras-service.sh
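For reference, the word lookup the service performs conceptually reduces to running the keras-ocr pipeline over the image and returning the center of the first matching bounding box. A sketch under that assumption (the repo's actual keras_server.py may differ in its details):

```python
import keras_ocr

# Loads the pretrained detector and recognizer weights on first use
pipeline = keras_ocr.pipeline.Pipeline()

def find_word(image_path: str, target: str) -> dict:
    # recognize() returns one list of (word, box) pairs per input image;
    # box is a 4x2 array of bounding box corner coordinates
    predictions = pipeline.recognize([keras_ocr.tools.read(image_path)])[0]
    for word, box in predictions:
        if word.lower() == target.lower():
            # Report the center of the bounding box
            return {"result": "found",
                    "x": int(box[:, 0].mean()),
                    "y": int(box[:, 1].mean())}
    return {"result": "not found"}
```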