Update README.md to enhance setup instructions and clarify API details
cjlee7128 committed Nov 12, 2024
1 parent f6f5533 commit 0520cfa
Showing 1 changed file with 103 additions and 32 deletions.
135 changes: 103 additions & 32 deletions README.md
@@ -2,7 +2,60 @@

This repository provides the main parts of the MLModelScope API

## Running
## Getting started

### Prerequisites

* [Docker](https://docs.docker.com/get-docker/)
* [Docker Compose](https://docs.docker.com/compose/install/)
* [Python](https://www.python.org/downloads/) (for the Python API, version 3.8 or later)

### Cloning the repository

To clone the repository, run:

```bash
git clone https://github.com/c3sr/mlmodelscope-api.git
cd mlmodelscope-api
```

### Environment variables

The API requires two environment files to be present in the project root folder `mlmodelscope-api`:

* `.env` - contains environment variables for the API
* `.env.companion` - contains environment variables for the Companion service

The `.env` file should contain the following variables:

- `DOCKER_REGISTRY` - the Docker registry to pull images from, defaults to `c3sr`
- `ENVIRONMENT` - the environment the API is running in, such as `local`
- `API_VERSION` - the version of the API
- `DB_HOST` - the hostname of the PostgreSQL database
- `DB_PORT` - the port of the PostgreSQL database
- `DB_USER` - the username for the PostgreSQL database
- `DB_PASSWORD` - the password for the PostgreSQL database
- `DB_DBNAME` - the name of the PostgreSQL database
- `MQ_HOST` - the hostname of the RabbitMQ message queue
- `MQ_PORT` - the port of the RabbitMQ message queue
- `MQ_USER` - the username for the RabbitMQ message queue
- `MQ_PASSWORD` - the password for the RabbitMQ message queue
- `MQ_ERLANG_COOKIE` - the Erlang cookie for RabbitMQ
- `tracer_PORT` - the port for the Jaeger tracer
- `tracer_HOST` - the hostname for the Jaeger tracer
- `TRACER_ADDRESS` - the address for the Jaeger tracer
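
Before starting the containers, it can help to confirm that `.env` defines every variable listed above. The following is a minimal sketch, not part of the repository: the function names `parse_env_text` and `missing_vars` are hypothetical, and the parser assumes plain `KEY=VALUE` lines without quoting or `export` prefixes. `DOCKER_REGISTRY` is left out of the required list because it has a default.

```python
from pathlib import Path

# Names are spelled exactly as in the list above.
# DOCKER_REGISTRY is omitted because it defaults to `c3sr`.
REQUIRED_ENV_VARS = [
    "ENVIRONMENT", "API_VERSION",
    "DB_HOST", "DB_PORT", "DB_USER", "DB_PASSWORD", "DB_DBNAME",
    "MQ_HOST", "MQ_PORT", "MQ_USER", "MQ_PASSWORD", "MQ_ERLANG_COOKIE",
    "tracer_PORT", "tracer_HOST", "TRACER_ADDRESS",
]


def parse_env_text(text):
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_vars(text, required=REQUIRED_ENV_VARS):
    """Return the required names that are absent or empty in the text."""
    values = parse_env_text(text)
    return [name for name in required if not values.get(name)]


if __name__ == "__main__":
    env_path = Path(".env")  # assumes the script is run from the project root
    if not env_path.exists():
        print("No .env file found in the current directory")
    else:
        missing = missing_vars(env_path.read_text())
        print("Missing variables:", ", ".join(missing) if missing else "none")
```

Passing a different list as `required` lets the same check be reused for `.env.companion`.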

The `.env.companion` file should contain the following variables:

- `COMPANION_AWS_KEY` - the AWS key for the Companion service
- `COMPANION_AWS_SECRET` - the AWS secret for the Companion service
- `COMPANION_AWS_BUCKET` - the AWS bucket for the Companion service
- `COMPANION_AWS_REGION` - the AWS region for the Companion service

As this file will likely contain private credentials, it should **never** be committed to source control!

### Running

To run a local version of the API you will need to have Docker and Docker
Compose installed. The following command will build the latest API image
@@ -15,7 +68,7 @@ The additional components launched are:
* RabbitMQ - message queue providing communication between the API and ML agents
* PostgreSQL - the database
* The database is initialized from the file `docker/data/c3sr-bootstrap.sql.gz`
* Companion - assists in cloud storage uploads, see below for details
* Companion - assists in cloud storage uploads
* Traefik - reverse proxy, see below for details
* Consul - service discovery
* A suite of services to support monitoring with Prometheus/Grafana
@@ -36,10 +89,13 @@ You can read more about the Docker Compose configuration [here](docs/docker-comp

## API

The `/api` directory contains an application that provides most of
the API endpoints for mlmodelscope.
There are two API implementations in this repository: a Go API and a Python API. The Go API is the default API for the project, and the Python API is provided for compatibility with existing code.

### Go API

The `/api` directory contains a Go application that provides API endpoints for mlmodelscope. Docker Compose is configured to run the Go API by default.

### Running unit tests
#### Running unit tests

To run the unit tests, change to the `/api` directory and run:

@@ -53,19 +109,9 @@ Add the `-v` flag to see detailed output from the tests:
go test -v ./...
```

### Running integration tests

To run the integration tests, change to the `/api` directory and run:

```bash
scripts/run-integration-tests.sh
```

This script will start the required services (RabbitMQ, PostgreSQL, and a Mock agent) in Docker containers and run the tests. When the tests are complete the containers will be stopped and removed.

### Debugging in a container
#### Debugging in a container

It is possible to debug the API endpoints while they run in a container
It is possible to debug the Go API endpoints while they run in a container
(this can be useful to test behavior when the API is running on a Docker
network alongside ML agents). To enable debugging in the container, run
the API from the `docker/Dockerfile.api-debug` Dockerfile. This Dockerfile
@@ -74,21 +120,39 @@ debugger attached. Delve listens on port 2345, which is exposed to the host
machine. The API itself will not begin running until a debugging client is
attached to Delve.

## Companion
### Python API

[Companion](https://uppy.io/docs/companion/) is a service used to enable direct
uploads to a cloud storage provider. It requires additional environment variables
for configuration:
The `/python_api` directory contains a Python application that provides API endpoints for mlmodelscope.

* COMPANION_AWS_KEY
* COMPANION_AWS_SECRET
* COMPANION_AWS_BUCKET
* COMPANION_AWS_REGION
#### Setting up an environment

In local development these variables should be provided in an environment
file named `.env.companion` in the project root folder. As this file will
likely contain private credentials it should **never** be committed to source
control!
The Python API requires Python 3.8 or later. We recommend using a virtual environment tool such as `virtualenv` or `conda` to manage dependencies.

To install the dependencies, run:

```bash
pip install -r python_api/requirements.txt
```

#### Editing the configuration

The Python API uses environment variables to configure its connections to the database and message queue. These variables are set up in the `python_api/db.py` and `python_api/mq.py` files. You can either edit these files to change the configuration or set the environment variables in your shell.
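
As an illustration of the pattern, the sketch below reads the `DB_*` variables from the environment and falls back to defaults when they are unset. It is not the contents of `python_api/db.py`: the `DbSettings` class, the `load_db_settings` function, and the fallback values are all assumptions made for the example.

```python
import os
from dataclasses import dataclass


@dataclass
class DbSettings:
    """Hypothetical container for the database connection settings."""
    host: str
    port: int
    user: str
    password: str
    dbname: str


def load_db_settings(env=os.environ):
    """Build settings from DB_* variables, using placeholder defaults."""
    return DbSettings(
        host=env.get("DB_HOST", "localhost"),
        port=int(env.get("DB_PORT", "5432")),  # 5432 is PostgreSQL's default port
        user=env.get("DB_USER", "postgres"),  # placeholder default
        password=env.get("DB_PASSWORD", ""),  # placeholder default
        dbname=env.get("DB_DBNAME", "postgres"),  # placeholder default
    )


if __name__ == "__main__":
    settings = load_db_settings()
    # Avoid printing the password.
    print(f"{settings.user}@{settings.host}:{settings.port}/{settings.dbname}")
```

With this approach, exporting a variable in the shell (for example `export DB_HOST=...`) overrides the default without editing any file.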

#### Running the API

To run the Python API, change to the `/python_api` directory and run:

```bash
fastapi run api.py --reload
```

The default port is `8000`. To use a different port, pass the `--port` flag:

```bash
fastapi run api.py --reload --port 8001
```

If you use the API together with [`mlmodelscope`](https://github.com/c3sr/mlmodelscope), you may need to set the port to match the value configured as `REACT_APP_API_PORT` in the `mlmodelscope/.env` file.

## Traefik

@@ -103,16 +167,23 @@ will proxy that at http://local.mlmodelscope.org/.
The `scripts/run-agent.sh` script will run an agent container for one of the
following ML frameworks:

* mxnet
* onnxruntime
* pytorch
* tensorflow
* onnxruntime
* jax
* mxnet

For example, to run a PyTorch agent, run:

```bash
./scripts/run-agent.sh pytorch
```

The `docker/carml-config-example.yml` file will be copied to `.carml_config.yml` and
that file will be mapped into the running container as a Docker volume. If you
need to modify the configuration in any way, you should edit the `.carml_config.yml`
file and **not** `docker/carml-config-example.yml`.

### Project Wiki
## Project Wiki

https://wiki.mlmodelscope.org/
