
Integration Test Revisions


Revisions

  • The base image could contain all the required dependencies.
  • Single setup script rather than the current hodge-podge
  • Move all entry point functionality into a script
  • Use the build created in distribution.
  • Use standard Docker ways to manage processes.

Proposed Plan of Attack

Analysis:

  • Work out the Dockerfile used to create the Docker images.
  • What does that Dockerfile need? Where does it get the Druid artifacts? What other artifacts does it need? (Done)
  • What is the runtime setup? What is expected to occur in the shared directory?
  • How many container images does the build create? How do they differ?
  • How do the integration tests launch the desired Docker containers?
  • How do the integration tests interact with the Docker containers?
  • How do we build the public Dockerfile and its scripts, as used by the tutorial?

From this, we can work out an alternative design:

  • Modify the pom.xml files and projects to produce the needed artifacts in the correct way.
  • Produce Docker images as Maven artifacts (though, unfortunately, they can't live in the Maven build tree.)
  • Work out the proper way to launch the images and run the integration tests.

The alternative design:

  • Build the context manually. Keep notes, or create an ad-hoc script.
  • Test the original Docker file.
  • Validate the container manually.
  • Create a script to validate the container.
  • Launch the container to at least ensure the service starts.
  • Modify the Dockerfile and the associated scripts.
  • Re-validate the container build.
  • Create a new druid-docker project.
  • Create the Docker context in that project's target directory in the test-compile phase.
  • Create the Docker container in the package phase.
  • Run tests from the integration-test module.

Revised design:

  • Create a new branch, cherry-pick changes from extn branch (Done)
  • Create a docker directory for docker-related items. (Done)
  • Create a base-docker project (Done)
  • Create a trivial container following this outline (Done)
  • Add additional content step-by-step (Done)
    • ZK
    • Python
    • MySQL
    • Scripts
    • Keys
    • Other odd bits
    • Entry point
  • Create the shared folder contents
    • Put in base-docker/target/shared
    • Copy files in the compile phase
  • Create the test-docker project. (Done)
    • Expand the Druid tarball into directories (sketched below)

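A rough sketch of the tarball expansion, assuming the distribution module has already produced the -bin tarball (paths and the version are illustrative):

# Expand the Druid distribution tarball into the test-docker build area.
# Paths and the version are assumptions; adjust to the actual module layout.
VERSION=0.23.0-SNAPSHOT
TARBALL=distribution/target/apache-druid-${VERSION}-bin.tar.gz
OUT=docker/test-docker/target/druid

mkdir -p "$OUT"
tar -xzf "$TARBALL" -C "$OUT" --strip-components=1
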
The Docker work was the easy part. The hard part is the tests: they are too complex to tackle all at once. Instead:

  • Pick one group, say the high-availability group.
  • Create a new Maven project.
  • Migrate the tests for that group to the new project. Ensure the code builds.
  • Launch cluster by hand
    • Simple one-service script (sketched after this list)
    • Repeat for each service
    • Launch script with env vars
    • Test-specific cluster script works
  • Detangle the launch scripts. Create a new one (or ones).
  • Move the Docker Compose scripts.
  • Copy base shared folder to a temp location
  • Launch containers
  • Run tests
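
A sketch of the "simple one-service script" step, using plain docker run plus the environment variables described under Env Vars below (image name, network, and mounts are assumptions):

# Launch one Druid service container by hand; repeat per service.
SERVICE=${1:-router}

docker network create druid-it 2>/dev/null || true

docker run -d \
  --name "$SERVICE" \
  --network druid-it \
  -e DRUID_SERVICE="$SERVICE" \
  -e COMMON_DRUID_JAVA_OPTS="-Duser.timezone=UTC -Dfile.encoding=UTF-8" \
  -e SERVICE_DRUID_JAVA_OPTS="-Xmx64m -Xms64m" \
  -v /tmp/shared:/shared \
  org.apache.druid/test-docker:0.23.0-SNAPSHOT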

Stages:

  • Basic functionality
    • Keep the shared directory in home
    • Keep docker compose files as-is
  • Build the shared directory in target somewhere
  • Push group-specific stuff into the container script, out of docker-compose
  • Different solution for configs

Migration path:

  • Create the above in parallel with the existing material
  • Migrate integration tests
  • Create the test which is the goal of this whole mess.
    • Port the test code
  • Later migrate the docker build
  • What to do with the K8s integration test stuff?

References:

Design Details

test-docker

Placed in the build sequence after distribution, before integration-tests.

docker-tests

Root project for ITs. Migrated from integration-tests. Holds common files.

Integration Test Projects

  • Sources for a group
  • Resources used by the tests
  • Docker-compose file

Integration Test Usage

Project contains a start-cluster.sh script to bring up the Docker cluster manually, and a stop-cluster.sh to bring it down.
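
A minimal sketch of what the two scripts might wrap, assuming a per-project docker-compose.yaml (file locations are illustrative):

# start-cluster.sh -- bring the test cluster up in the background
docker-compose -f docker-compose.yaml up -d

# stop-cluster.sh -- tear the cluster down and discard its volumes
docker-compose -f docker-compose.yaml down -v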

To run tests manually (a full command sequence is sketched after this list):

  • Build the project up through distribution
  • Build base image (only when things change)
  • Build Druid image (when build changes)
  • start-cluster.sh
  • In the debugger, run a test
  • stop-cluster.sh
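
Putting those steps together, one possible command sequence (module paths, the dist profile, and project names are assumptions):

# 1. Build Druid itself up through the distribution module
mvn -q install -DskipTests -Pdist -pl distribution -am

# 2. Rebuild the images only when their inputs change
mvn -q install -pl docker/base-docker   # base image (rarely changes)
mvn -q install -pl docker/test-docker   # Druid image (when the build changes)

# 3. Bring up the cluster, run the tests (or one test from the IDE debugger), tear down
./start-cluster.sh
mvn -q verify -pl docker-tests/high-availability
./stop-cluster.sh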

Operational Notes

List images:

docker images

Launch a disposable container with a shell:

docker run --rm -it --entrypoint sh org.apache.druid/test-docker:0.23.0-SNAPSHOT
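
Two more commands that are handy when poking at a running cluster (container names are whatever Docker Compose assigns):

# Follow the logs of a running container
docker logs -f <container-name>

# Open a shell inside a running container, e.g. to inspect /shared
docker exec -it <container-name> sh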

Redesign Approach

Outline

  • Use official images for dependencies
  • Generate configs, docker-compose, shared
  • Split test groups into projects
  • Maven, or Java, generates files, launches cluster, shuts down
  • Druid configs (and build) mounted into container

Tasks

  • Review how tests actually run
  • Cluster verification tests (simple shell probes are sketched after this list)
    • ZK
    • MySQL
    • Kafka
  • Work out how to create the Druid containers
    • Verify Druid
  • Port one test
  • Work out cluster config
  • Work out cluster start/stop
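
The dependency checks can start life as simple shell probes before becoming proper tests; hosts, ports, and credentials are assumptions:

# ZooKeeper: the 'ruok' four-letter command should answer 'imok'
echo ruok | nc localhost 2181

# MySQL: the metastore should accept a trivial query
mysql -h 127.0.0.1 -P 3306 -u <user> -p'<pwd>' -e 'SELECT 1'

# Kafka: listing topics verifies the broker is reachable
kafka-topics.sh --bootstrap-server localhost:9092 --list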

YAML Config

Move all configuration to a YAML file. Rough structure:

test:
  name: <name>
  include: <base file>
  <key>: <value>
  zk:
    - host: <host>
      port: <port>
  metastore:
    type: <type>
    host: <host>
    port: <port>
    user: <user>
    pwd: <pwd>
  services:
    <name>:
      type: <service>
      properties:
        <key>: <value>
      jvm:
        - <arg>

The file is used to generate everything else.

  • Generates the Docker Compose file.
  • Generates the properties for the test.
  • Defines the configs to be created.

Upstream Images

Official images exist for ZK and MySQL. For Kafka there are three candidate images, only the last of which clearly maps to Kafka 3.1.0 (the version Druid uses).
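
For the first two, pulling the official Docker Hub images is enough (tags are illustrative):

docker pull zookeeper:3.5
docker pull mysql:5.7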

Druid Launch

In the new model, we dispense with Supervisor, since it is not needed. For the MiddleManager, however, we need something to reap its child processes; perhaps the Docker-provided init is enough?

Working backwards:

  • druid-service.sh does the work previously done by the entrypoint and by Supervisor (sketched after this list).
  • druid-service.sh uses druid.sh, just as the entrypoint did.
  • Docker Compose builds up the environment variables as today.
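
A rough sketch of druid-service.sh under those assumptions (the classpath and config layout are illustrative; the real script would reuse druid.sh):

#!/bin/sh
# druid-service.sh -- replaces the old entrypoint + Supervisor combination.
# DRUID_SERVICE, DRUID_INSTANCE, DRUID_CLASSPATH, COMMON_DRUID_JAVA_OPTS,
# SERVICE_DRUID_JAVA_OPTS, DEBUG_OPTS and LOG4J_CONFIG come from Docker Compose.

DRUID_HOME=${DRUID_HOME:-/usr/local/druid}
LOG_DIR=/shared/logs
mkdir -p "$LOG_DIR"

# Optional log4j config supplied via LOG4J_CONFIG (see the table below)
LOG4J_OPTS=""
[ -n "$LOG4J_CONFIG" ] && LOG4J_OPTS="-Dlog4j.configurationFile=$LOG4J_CONFIG"

# exec so the Druid process is PID 1 and receives container signals directly
exec java $COMMON_DRUID_JAVA_OPTS $SERVICE_DRUID_JAVA_OPTS $DEBUG_OPTS $LOG4J_OPTS \
  -cp "$DRUID_HOME/conf/druid/$DRUID_SERVICE:$DRUID_HOME/lib/*:$DRUID_CLASSPATH" \
  org.apache.druid.cli.Main server "$DRUID_SERVICE" \
  > "$LOG_DIR/$DRUID_SERVICE-$DRUID_INSTANCE.log" 2>&1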

Router as Example

  • Create the launch script
  • Split the lib path
    • Druid's libs (hardcode in scripts)
  • Determine mount points
  • Change image entry point to use the script
  • Figure out the security setup. Remove it for most tests?
  • Copy common and router files to new project.
  • Review for changed service names, etc.
  • Revise the docker compose entries
  • Add init where needed
  • Test file to launch ZK, router
  • Examine container to ensure things are set up correctly
  • Use a browser to hit the Router to check visibility (or update the service tests); a curl check is sketched after this list
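
Instead of a browser, curl can confirm visibility; 8888 is the Router's default port, and the host port mapping is an assumption:

# Basic liveness
curl -s http://localhost:8888/status

# The Router should be able to see the Brokers it discovered via ZK
curl -s http://localhost:8888/druid/router/v1/brokers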

Env Vars

  • DRUID_SERVICE
    • Must be set to service name
    • Also used for the log file name, which is a problem if two services of the same type run at once
  • SERVICE_DRUID_JAVA_OPTS
  • COMMON_DRUID_JAVA_OPTS
    • Set in common
    • Includes -Dlog4j.configurationFile=/shared/docker/lib/log4j2.xml, but this file does not exist.
  • DRUID_DEP_LIB_DIR
    • Should be all of druid lib?
    • DRUID_DEP_LIB_DIR=/shared/hadoop_xml:/shared/docker/lib/*:/usr/local/druid/lib/mysql-connector-java.jar
  • DRUID_LOG_PATH
    • Set in the container (a sample env file covering these variables is sketched below)
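
A sample env file for one service, pulling the variables above together (all values are illustrative):

# router.env -- illustrative values only
DRUID_SERVICE=router
DRUID_INSTANCE=
COMMON_DRUID_JAVA_OPTS=-Duser.timezone=UTC -Dfile.encoding=UTF-8 -XX:+ExitOnOutOfMemoryError -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/shared/logs
SERVICE_DRUID_JAVA_OPTS=-Xmx64m -Xms64m
DRUID_CLASSPATH=/shared/hadoop-xml
LOG4J_CONFIG=/shared/conf/log4j2.xml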

Relative to the previous code:

| Item | Previous | Revised |
| --- | --- | --- |
| Druid build | Ad-hoc | Distribution |
| MySQL Connector | Downloaded | Maven |
| Kafka Protobuf | Downloaded | Maven |
| Kafka, ZK, MySQL | In single image | Official images |
| Druid launch | Supervisor | Direct |
| DRUID_DEP_LIB_DIR | | DRUID_CLASSPATH |
| | /shared/hadoop_xml | /shared/hadoop-xml |
| | /shared/docker/lib/* | $DRUID_HOME/LIB/* |
| | /usr/local/druid/lib/mysql-connector-java.jar | Already in lib |
| DRUID_SERVICE | Name of service | Same |
| DRUID_SERVICE | Name of log | $DRUID_SERVICE-$DRUID_INSTANCE |
| COMMON_DRUID_JAVA_OPTS | | |
| | -Duser.timezone=UTC -Dfile.encoding=UTF-8 | Same |
| | -Dlog4j.configurationFile=/shared/docker/lib/log4j2.xml | LOG4J_CONFIG |
| | -XX:+ExitOnOutOfMemoryError -XX:+HeapDumpOnOutOfMemoryError | Same |
| | -XX:HeapDumpPath=/tmp | -XX:HeapDumpPath=/shared/logs |
| DRUID_LOG_DIR | ? | Implicit at /shared/logs |
| SERVICE_DRUID_JAVA_OPTS | | Reduced |
| | -server | -> COMMON_DRUID_JAVA_OPTS |
| | -Xmx64m -Xms64m | Same |
| | -XX:+UseG1GC | -> COMMON_DRUID_JAVA_OPTS |
| | -agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=5009 | -> DEBUG_OPTS |

Docker Compose

  • Base directory: common files, not used to launch
  • Each test
    • docker-compose.yaml for that one test
  • Use default services where possible
  • Use test-specific configs to set up special cases: multiple items, etc.
  • The directory name is the name of the Docker Compose app: druid-cluster. It is easier to use one fixed name than the test group's name (see the command sketch after this list).
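
Because Compose derives the app name from the directory, every group can be driven with the same commands (the directory path is an assumption):

# The project name is "druid-cluster" regardless of which test project it lives in
cd docker-tests/high-availability/druid-cluster
docker-compose up -d
docker-compose down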

Notes & TODO

  • Why does docker-compose.base.yml mount src/test/resources as a volume?
    • Create a temporary hack. Remove it: docker-tests/src/test/resources
  • Rename env files to end with .env
  • Rename .yml files to .yaml
  • Can there be just one supervisor conf file? With contents for target service?
  • Rename Docker compose files to just docker-compose.yaml.
  • Rename base-docker to base-test-image
  • Rename test-docker to test-image
  • Move start-mysql.sh to /usr/local (Obsolete)
  • Put Druid libs in /usr/local/druid-libs to make merging easier
  • Rename the docker directories to the app name, probably the group name.
  • Remove the static IP addresses: rely on service host names
  • Remove the base image project
  • Split external dependencies from Druid in compose
  • Find a new name for the data setup steps from launch.sh