This repository has been archived by the owner on Nov 26, 2021. It is now read-only.

Setting up the Training Environment

Fabian Hueske edited this page Jul 31, 2020 · 10 revisions

The Apache Flink SQL Training is based on Flink's SQL CLI client, an interactive client for submitting SQL queries to Flink and visualizing the results. The training assumes basic knowledge of SQL. You will not need to write Java or Scala code or use an IDE.

The training environment is set up using Docker Compose. The Docker Compose setup consists of multiple Docker containers that run different services:

  • a Flink SQL client container to submit queries and visualize their results,
  • a Flink master and a Flink worker container to execute queries,
  • an Apache Kafka container to produce input streams and consume result streams,
  • an Apache ZooKeeper container (required by Kafka),
  • a MySQL container to maintain an external materialized table, and
  • a MinIO container to provide S3-compatible storage.

All containers are started from images that are publicly available on Docker Hub.

Requirements

  • You only need to have Docker installed on your machine.

Docker is available for Linux, macOS, and Windows.

Note: You need to configure Docker with sufficient resources to keep the training environment from becoming unresponsive. We have had good results running Docker with 3-4 GB of memory and 3-4 CPU cores.
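To see how many resources the Docker engine currently has, you can ask the engine itself. The following is a minimal sketch, assuming a POSIX-compatible shell and a running Docker engine. The helper names `format_resources` and `show_docker_resources` are made up for this example and are not part of the training. The `NCPU` and `MemTotal` fields come from the output of `docker info`.

```shell
# Hypothetical helper: formats a CPU count and a memory size in bytes
# (the two values printed by `docker info` below) as a readable summary.
format_resources() {
  echo "CPUs: $1, memory: $(( $2 / 1024 / 1024 )) MiB"
}

# Asks the Docker engine for its CPU count and total memory.
# Requires a running Docker engine.
show_docker_resources() {
  info=$(docker info --format '{{.NCPU}} {{.MemTotal}}') || return 1
  # Intentionally unquoted: split "NCPU MemTotal" into two arguments.
  format_resources $info
}
```

Calling `show_docker_resources` prints one summary line that you can compare against the recommendation above. The reported memory is usually slightly below the configured limit, so treat the comparison as approximate.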

Getting the Docker Compose Configuration

Docker Compose environments are configured with a YAML file. The default path of the file is docker-compose.yml.

Get the configuration file for the training environment by cloning our Git repository:

git clone https://github.com/ververica/sql-training
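By default, the clone lands in a directory named sql-training. If you are unsure which directory inside it holds the docker-compose.yml file, a small search can locate it. This is only a sketch: the helper name `find_compose_dir` is hypothetical, and it simply reports the first match it finds.

```shell
# Hypothetical helper: prints the directory of the first docker-compose.yml
# found below the given root (default: current directory).
find_compose_dir() {
  find "${1:-.}" -name docker-compose.yml -type f -exec dirname {} \; | head -n 1
}
```

For example, `cd "$(find_compose_dir sql-training)"` changes into that directory.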

Starting the Training Environment

To run the training environment, the Docker engine needs to run on your machine.

Moreover, all required Docker images need to be present in the local image store. Docker will automatically check for missing images and download them from Docker Hub. It will take a couple of minutes to download all required images (approx. 2.3 GB) when you run the command for the first time. Once the images are available, the environment starts in a few seconds.

In order to start the training environment, open a terminal (Windows users can use cmd), enter the directory that contains the docker-compose.yml file, and run the following command(s).

  • Linux & macOS
docker-compose up -d
  • Windows
set COMPOSE_CONVERT_WINDOWS_PATHS=1
docker-compose up -d

The docker-compose command starts all containers of the Docker Compose configuration in detached mode. You can check whether the environment is running by accessing Flink's web UI at http://localhost:8081.
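Because the containers start in the background, the web UI may need a few seconds before it responds. The same check can be done from the terminal by polling the URL. The sketch below assumes that curl is installed. The helper name `wait_for_url` and its default of 30 attempts with a 2 second pause are illustrative choices, not part of the training.

```shell
# Hypothetical helper: polls a URL until it answers or the attempts run out.
# Arguments: URL, maximum number of attempts, seconds to wait between attempts.
wait_for_url() {
  url=$1
  attempts=${2:-30}
  delay=${3:-2}
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -s: silent, -f: treat HTTP errors as failures, -o: discard the body
    if curl -sf -o /dev/null "$url"; then
      echo "up after $i attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "no response from $url after $attempts attempt(s)" >&2
  return 1
}
```

A typical call is `wait_for_url http://localhost:8081 && docker-compose ps`, which waits for the web UI and then lists the state of all containers.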

Entering the SQL CLI client

To enter the SQL CLI client run:

docker-compose exec sql-client ./sql-client.sh

The command starts the SQL CLI client in the container. You should see the welcome screen of the CLI client.


                                   ▒▓██▓██▒
                               ▓████▒▒█▓▒▓███▓▒
                            ▓███▓░░        ▒▒▒▓██▒  ▒
                          ░██▒   ▒▒▓▓█▓▓▒░      ▒████
                          ██▒         ░▒▓███▒    ▒█▒█▒
                            ░▓█            ███   ▓░▒██
                              ▓█       ▒▒▒▒▒▓██▓░▒░▓▓█
                            █░ █   ▒▒░       ███▓▓█ ▒█▒▒▒
                            ████░   ▒▓█▓      ██▒▒▒ ▓███▒
                         ░▒█▓▓██       ▓█▒    ▓█▒▓██▓ ░█░
                   ▓░▒▓████▒ ██         ▒█    █▓░▒█▒░▒█▒
                  ███▓░██▓  ▓█           █   █▓ ▒▓█▓▓█▒
                ░██▓  ░█░            █  █▒ ▒█████▓▒ ██▓░▒
               ███░ ░ █░          ▓ ░█ █████▒░░    ░█░▓  ▓░
              ██▓█ ▒▒▓▒          ▓███████▓░       ▒█▒ ▒▓ ▓██▓
           ▒██▓ ▓█ █▓█       ░▒█████▓▓▒░         ██▒▒  █ ▒  ▓█▒
           ▓█▓  ▓█ ██▓ ░▓▓▓▓▓▓▓▒              ▒██▓           ░█▒
           ▓█    █ ▓███▓▒░              ░▓▓▓███▓          ░▒░ ▓█
           ██▓    ██▒    ░▒▓▓███▓▓▓▓▓██████▓▒            ▓███  █
          ▓███▒ ███   ░▓▓▒░░   ░▓████▓░                  ░▒▓▒  █▓
          █▓▒▒▓▓██  ░▒▒░░░▒▒▒▒▓██▓░                            █▓
          ██ ▓░▒█   ▓▓▓▓▒░░  ▒█▓       ▒▓▓██▓    ▓▒          ▒▒▓
          ▓█▓ ▓▒█  █▓░  ░▒▓▓██▒            ░▓█▒   ▒▒▒░▒▒▓█████▒
           ██░ ▓█▒█▒  ▒▓▓▒  ▓█                █░      ░░░░   ░█▒
           ▓█   ▒█▓   ░     █░                ▒█              █▓
            █▓   ██         █░                 ▓▓        ▒█▓▓▓▒█░
             █▓ ░▓██░       ▓▒                  ▓█▓▒░░░▒▓█░    ▒█
              ██   ▓█▓░      ▒                    ░▒█▒██▒      ▓▓
               ▓█▒   ▒█▓▒░                         ▒▒ █▒█▓▒▒░░▒██
                ░██▒    ▒▓▓▒                     ▓██▓▒█▒ ░▓▓▓▓▒█▓
                  ░▓██▒                          ▓░  ▒█▓█  ░░▒▒▒
                      ▒▓▓▓▓▓▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒░░▓▓  ▓░▒█░
          
    ______ _ _       _       _____  ____  _         _____ _ _            _  BETA   
   |  ____| (_)     | |     / ____|/ __ \| |       / ____| (_)          | |  
   | |__  | |_ _ __ | | __ | (___ | |  | | |      | |    | |_  ___ _ __ | |_ 
   |  __| | | | '_ \| |/ /  \___ \| |  | | |      | |    | | |/ _ \ '_ \| __|
   | |    | | | | | |   <   ____) | |__| | |____  | |____| | |  __/ | | | |_ 
   |_|    |_|_|_| |_|_|\_\ |_____/ \___\_\______|  \_____|_|_|\___|_| |_|\__|
          
        Welcome! Enter 'HELP;' to list all available commands. 'QUIT;' to exit.

Running a Simple Query

To execute a simple query, run:

SELECT * FROM Rides;

The CLI client will enter the result visualization mode and display the results.

rideId                    taxiId                   isStart                       lon                       lat                   rideTime                   psgCnt
  2706                2013002631                      true                -73.961716                   40.8058     2013-01-01 00:10:00.0                         1
  2707                2013002632                      true                -73.987404                  40.77599     2013-01-01 00:10:00.0                         1
  2708                2013002633                      true                 -73.98752                 40.719883     2013-01-01 00:10:00.0                         1
  2709                2013002634                      true                 -73.99147                 40.712574     2013-01-01 00:10:00.0                         1
  2710                2013002635                      true                 -73.98552                 40.768192     2013-01-01 00:10:00.0                         1
  2711                2013002636                      true                -73.870865                 40.773773     2013-01-01 00:10:00.0                         1
  2712                2013002637                      true                 -73.96637                 40.794533     2013-01-01 00:10:00.0                         1
  2713                2013002638                      true                -74.009796                 40.738075     2013-01-01 00:10:01.0                         2
  2714                2013002639                      true                 -73.99565                  40.75974     2013-01-01 00:10:01.0                         1
  2715                2013002640                      true                 -74.00353                 40.732105     2013-01-01 00:10:01.0                         1
  2716                2013002641                      true                 -73.96548                 40.790794     2013-01-01 00:10:01.0                         1
  2717                2013002642                      true                 -74.00452                 40.742096     2013-01-01 00:10:01.0                         1
  1703                2013001680                     false                 -73.94827                 40.772167     2013-01-01 00:10:02.0                         3
  2718                2013002643                      true                 -73.99971                  40.71458     2013-01-01 00:10:02.0                         3

Flink's web UI lists the running query at http://localhost:8081.
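The web UI is backed by Flink's REST API on the same port, so the running query can also be checked from a second terminal. The sketch below assumes that curl is installed and that the `/jobs/overview` endpoint is available in the Flink version used by the training images. The helper names are hypothetical, and the matching is plain text search rather than real JSON parsing.

```shell
# Hypothetical helper: counts jobs reported as RUNNING in a /jobs/overview
# JSON response read from standard input. Plain-text matching, not a JSON parser.
count_running_jobs() {
  grep -o '"state" *: *"RUNNING"' | wc -l | tr -d ' '
}

# Fetches the job overview from the Flink REST API and prints the count.
# Argument: base URL of the web UI (default: http://localhost:8081).
running_jobs() {
  curl -s "${1:-http://localhost:8081}/jobs/overview" | count_running_jobs
}
```

While the query above is displayed in the SQL CLI client, `running_jobs` should report at least one running job.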

To leave the result visualization mode and terminate the query, press q.

Shutting Down the Training Environment

You can leave the SQL CLI client by typing quit; or exit;.

Shut down the training environment by running:

docker-compose down

Troubleshooting

If the training environment becomes unresponsive, restart the Docker engine and give Docker more resources.

We have had good results running Docker with 3-4 GB of memory and 3-4 CPU cores.
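Before restarting, it can help to look at the state of the containers and their most recent log output. The following is a rough sketch using the `ps` and `logs` subcommands of Docker Compose. The helper name `diagnose_environment` and the `COMPOSE` variable are assumptions made for this example. Run it from the directory that contains the docker-compose.yml file.

```shell
# Hypothetical helper: prints the container states followed by the last
# log lines of every service. Set COMPOSE to override the Compose command.
# Argument: number of log lines per service (default: 20).
diagnose_environment() {
  compose=${COMPOSE:-docker-compose}
  lines=${1:-20}
  echo "== container states =="
  $compose ps
  echo "== last $lines log lines per service =="
  $compose logs --tail="$lines"
}
```

Containers that are not listed as running, or logs that end in out-of-memory errors, are a hint that Docker needs more resources.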