A self-contained, minimal example of an ML model inference service. The application demonstrates a wrapper API around a model that serves predictions in batches. It includes end-to-end tests covering requests and prediction responses.
The application expects the provided model object to expose an sklearn-style predict function.
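As a rough illustration of what "sklearn-style predict" means here, the sketch below delegates a batch of rows to any object with a predict method. It is not the repository's actual code: MeanThresholdModel and predict_batch are hypothetical names made up for this example, and the validation rules are assumptions.

```python
from typing import Any, List, Sequence


class MeanThresholdModel:
    """Hypothetical stand-in for a trained model; only predict() matters here."""

    def predict(self, rows: Sequence[Sequence[float]]) -> List[int]:
        # Label a row 1 when its mean exceeds 2, otherwise 0.
        return [1 if sum(row) / len(row) > 2 else 0 for row in rows]


def predict_batch(model: Any, rows: Sequence[Sequence[float]]) -> List[Any]:
    """Validate a batch, then delegate to the model's sklearn-style predict()."""
    if not callable(getattr(model, "predict", None)):
        raise TypeError("model must expose a callable predict(rows)")
    if not rows or not rows[0] or any(len(row) != len(rows[0]) for row in rows):
        raise ValueError("batch must be non-empty with equal-length rows")
    # list() turns array-like outputs into a plain Python list.
    return list(model.predict(rows))


predictions = predict_batch(MeanThresholdModel(), [[1, 2, 3, 4], [1, 1, 1, 1]])
print(predictions)  # [1, 0]
```

Because the wrapper only relies on predict, a fitted scikit-learn estimator could be passed in place of the stand-in model.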
pip install -r requirements.txt
python -m pytest
coverage run -m pytest
coverage report -m
flask run
docker build -t infer .
docker run -it infer
gcloud builds submit --tag <project-id>/infer
gcloud run deploy --image <project-id>/infer
[[1,2,3,4],[1,1,1,1]]
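The bracketed value [[1,2,3,4],[1,1,1,1]] reads like a sample batch of two rows with four features each. A minimal sketch of packaging such a batch as a JSON POST request follows, using only the standard library. The /predict route and the bare-list body format are assumptions, not something this README documents; port 5000 is simply Flask's default for flask run.

```python
import json
import urllib.request

# Hypothetical URL: the route name is an assumption, not taken from the service.
SERVICE_URL = "http://localhost:5000/predict"

batch = [[1, 2, 3, 4], [1, 1, 1, 1]]


def build_request(url: str, rows: list) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request carrying a batch of rows."""
    body = json.dumps(rows).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_request(SERVICE_URL, batch)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

With the service running, passing the request to urllib.request.urlopen would send it; the response format depends on the service itself.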