Build and Debug

Paul Rogers edited this page Sep 29, 2021 · 16 revisions

Java

Druid officially uses Java 8. However, developers have found that everything except a few extensions works well with Java 11.

Other products you work on may need a newer Java version. If you must also use Java 8 on the Mac, use jenv and brew to manage the installed JDKs. See this post.

brew install jenv
brew tap adoptopenjdk/openjdk
brew install --cask adoptopenjdk8
jenv add /Library/Java/JavaVirtualMachines/adoptopenjdk-8.jdk/Contents/Home
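
Since the rest of this page depends on which Java is active, it can help to check the version from the shell. The following is a minimal sketch, not part of Druid or jenv: the function names java_major and current_java_major are made up here, and the parsing assumes the usual `java -version` output with a quoted version string.

```shell
# Print the major Java version (8, 11, 16, ...) for a version string
# such as "1.8.0_292" or "11.0.12". Hypothetical helper.
java_major() (
  ver=$1
  case "$ver" in
    1.*) ver=${ver#1.} ;;   # legacy scheme: 1.8.0_292 means Java 8
  esac
  # Keep only the leading digits: "8.0_292" becomes "8", "11.0.12" becomes "11"
  echo "${ver%%[!0-9]*}"
)

# Major version of whatever `java` is first on the PATH.
# `java -version` writes to stderr, hence the 2>&1.
current_java_major() (
  ver=$(java -version 2>&1 | sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1)
  java_major "$ver"
)
```

For example, run `[ "$(current_java_major)" = 8 ] || echo "not Java 8"` before a build. `jenv versions` lists the names jenv assigned to the JDKs you added.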

Java Version in Eclipse

If you use the newest Eclipse, you must change the JVM. The newest Eclipse comes configured to use Java 16 by default. That version is strict about module enforcement, so you'll get the following exception:

...module java.base does not "opens java.lang" to unnamed module...

See this StackOverflow article.

The solution is to configure Eclipse to use Java 11 instead.

Build Scripts

mvn clean package -Pskip-static-checks -Pskip-tests -Dmaven.javadoc.skip=true -T1.0C

Note that if the Java version is not 8 (say, 11), the above build will appear to work, but the tar.gz file is not produced.
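
Because the build can appear to succeed without producing the tarball, a quick check after the build saves confusion later. This is a sketch only: druid_tarball_built is a made-up name, and the path is taken from the tar command in the Standalone Launch section.

```shell
# Succeed only if the distribution tarball exists and is non-empty.
# $1 = Druid source root, $2 = Druid version, e.g. 0.23.0-SNAPSHOT
# Hypothetical helper, not part of the Druid build.
druid_tarball_built() (
  tarball="$1/distribution/target/apache-druid-$2-bin.tar.gz"
  if [ -s "$tarball" ]; then
    echo "found: $tarball"
  else
    echo "missing: $tarball (was the build run with Java 8?)" >&2
    exit 1   # exits only the function's subshell
  fi
)
```

Usage, from anywhere: `druid_tarball_built <path to druid> 0.23.0-SNAPSHOT`.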

Or

# Druid aliases
alias druid-quick="mvn -T 8 -DskipTests -Dforbiddenapis.skip=true -Dcheckstyle.skip=true -Dpmd.skip=true -Dmaven.javadoc.skip=true -Danimal.sniffer.skip=true -Denforcer.skip=true -Dspotbugs.skip=true clean install"

Standalone Launch

See the documentation.

Create a directory for the Druid install, say ~/bin.

export DRUID_DEV=<path to druid>
export BIN_DIR=~/bin
export DRUID_VER=0.23.0-SNAPSHOT
export DRUID_HOME=$BIN_DIR/apache-druid-${DRUID_VER}
mkdir -p $BIN_DIR
cd $BIN_DIR
tar -xzf $DRUID_DEV/distribution/target/apache-druid-${DRUID_VER}-bin.tar.gz
cd $DRUID_HOME

It can be handy to put the environment variables in the shell startup script (.zshrc on the Mac), and the other commands in a script to run after each build.
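
One way to package the unpack steps as a script to run after each build is sketched below. The function name druid_install is made up; its three arguments stand in for DRUID_DEV, BIN_DIR and DRUID_VER. It unpacks over any existing install rather than deleting it first, which is an assumption about what you want.

```shell
# Unpack a freshly built Druid tarball into the install directory.
# $1 = DRUID_DEV, $2 = BIN_DIR, $3 = DRUID_VER  (hypothetical helper)
druid_install() (
  tarball="$1/distribution/target/apache-druid-$3-bin.tar.gz"
  if [ ! -f "$tarball" ]; then
    echo "no tarball at $tarball" >&2
    exit 1
  fi
  # mkdir -p succeeds even when the directory already exists
  mkdir -p "$2" &&
    tar -xzf "$tarball" -C "$2" &&
    echo "$2/apache-druid-$3"   # print the resulting DRUID_HOME
)
```

Usage: `cd "$(druid_install "$DRUID_DEV" "$BIN_DIR" "$DRUID_VER")"`.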

Note: if you use Jupyter, its default port (8888) is the same as Druid's default port. Change one of them. For Jupyter:

jupyter notebook --port 9000

Launch locally:

cd $DRUID_HOME
./bin/start-micro-quickstart

Visit the UI: http://localhost:8888

It seems that the micro-quickstart setup includes the needed ZooKeeper (ZK) and database.

Druid Configuration

Every system has its own way to handle configuration. Configuration has to work in both the production and development environments. Druid runs each of its services as a distinct Java process. The chain of events is:

  • $DRUID_HOME/bin/start-micro-quickstart - A shell script which invokes the service script with the configuration to run.
  • $DRUID_HOME/conf/supervise/single-server/micro-quickstart.conf - A config file, read by a Perl script, which lists the command that runs each service:
    broker bin/run-druid broker conf/druid/single-server/micro-quickstart

Each service is configured via a specific directory under the path named in the above files, for example $DRUID_HOME/conf/druid/single-server/micro-quickstart. There is one directory per service. Within a directory, say broker, there are three files:

  • jvm.config - JVM config passed on the Java command line.
  • main.config - The main class and its arguments, which select the "main" routine and the service to run.
  • runtime.properties - Druid properties, found via the class path, to configure Druid itself.

In production, a script assembles this information into a launch command. Our job, when running in the debugger, is to use the IDE to do this work.
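
To make the assembly step concrete, here is a rough sketch of it. This is not the actual Druid script: the name druid_launch_command is made up, the lib/* class path entry is an assumption, the real script adds more class path entries, and the function prints the command instead of running it.

```shell
# Print (not run) a java command line built from a service's config directory.
# $1 = config root, e.g. conf/druid/single-server/micro-quickstart
# $2 = service name, e.g. broker
# Assumes jvm.config and main.config hold whitespace-separated arguments.
druid_launch_command() (
  set -f   # no filename expansion when the arguments are split below
  conf=$1
  svc=$2
  for f in jvm.config main.config runtime.properties; do
    [ -f "$conf/$svc/$f" ] || { echo "missing $conf/$svc/$f" >&2; exit 1; }
  done
  jvm_args=$(tr '\n' ' ' < "$conf/$svc/jvm.config")
  main_args=$(tr '\n' ' ' < "$conf/$svc/main.config")
  # The service and _common directories go on the class path so the
  # config files in them can be found; lib/* is an assumed jar location.
  echo java $jvm_args -cp "$conf/$svc:$conf/_common:lib/*" $main_args
)
```

Usage, from $DRUID_HOME: `druid_launch_command conf/druid/single-server/micro-quickstart broker`. Comparing the output with a command line captured from a running cluster shows which parts the IDE launch configuration has to reproduce.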

Configure Eclipse

Druid must run as four processes (at least). We don't want to launch all four from the IDE. A trick from Gian is to start with a pre-built Druid: either one downloaded from the Druid project, or built locally. Let's assume we're using the one we built above. We start by running the "cluster" using the standard script as shown above. Ingest data and ensure all works properly.

First, while the micro-quickstart cluster is running, let's get the configuration we need.

ps aux | grep "java -server.*apache-druid" > /tmp/services.txt

In your IDE, create a launch configuration. These are the instructions for Eclipse:

  • Kind: Java Application
  • Name: Historical
  • Project: services (?)
  • Main Class: org.apache.druid.cli.Main
  • Program Arguments: server historical
  • Working Directory: the value of $DRUID_HOME
  • JVM Arguments: see below
  • Dependencies/Classpath: see below

JVM Arguments: Here we want to doctor up the arguments we captured above. Pull out the JVM arguments other than the class path and the standard Java setup. Here is an example; double-check that it is valid in your case. Also, note that $DRUID_HOME is not available in Eclipse, so fill in the actual path.

-Duser.timezone=UTC
-Dfile.encoding=UTF-8
-Djava.io.tmpdir=var/tmp
-Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager

Some other settings you might want to use:

-Dlog4j.configurationFile=<some path>/log4j2.xml
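
Picking the arguments out of the captured file by hand is error-prone, so here is a small sketch that does it. The name druid_jvm_args is made up, and the function assumes that no argument contains a space and that the service's command line contains "server <name>", as in the Program Arguments above.

```shell
# Read captured `ps` lines on stdin and print, one per line, the -D, -X and
# -XX options of the first line for the given service. The class path and
# everything else are skipped. Breaks if an argument contains a space.
# $1 = service name as it appears after "server", e.g. historical
druid_jvm_args() {
  grep -- " server $1" | head -n 1 | tr ' ' '\n' | grep -E -- '^-(D|X)'
}
```

Usage: `druid_jvm_args historical < /tmp/services.txt`. Relative paths in the output, such as var/tmp, resolve against the working directory set in the launch configuration.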

The command line we captured earlier identifies config files we need on the class path. Go to the Dependencies tab, click on Classpath, then "Advanced", and select "Add External Folder". Repeat to add each of the following:

$DRUID_HOME/conf/druid/single-server/micro-quickstart/historical
$DRUID_HOME/conf/druid/single-server/micro-quickstart/_common

Oddly, the following entries, which are listed on the captured command line, don't seem to actually exist:

$DRUID_HOME/conf/druid/single-server/micro-quickstart/_common/hadoop-xml
$DRUID_HOME/conf/druid/single-server/micro-quickstart/../_common
$DRUID_HOME/conf/druid/single-server/micro-quickstart/../_common/hadoop-xml

Move the folders you added to the top of the classpath list to mimic the captured command line.

To verify that all is good, click "Show Command Line". Ignore the Eclipse-provided items in the class path. Ensure that the rest looks like the command line we captured above.
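
To see which class path entries on a captured command line actually exist, something like the following can help. It is a sketch under assumptions: druid_check_classpath is a made-up name, the class path is taken to be the value right after -cp, and entries must not contain spaces.

```shell
# Read a command line on stdin, take the value that follows -cp, and report
# whether each class path entry exists. Entries ending in /* are checked as
# directories. Relative entries resolve against the current directory.
druid_check_classpath() {
  tr ' ' '\n' | sed -n '/^-cp$/{n;p;q;}' | tr ':' '\n' |
  while IFS= read -r entry; do
    dir=${entry%/\*}   # strip a trailing /* wildcard, if any
    if [ -e "$dir" ]; then
      echo "ok      $entry"
    else
      echo "missing $entry"
    fi
  done
}
```

Usage, from $DRUID_HOME so that relative entries resolve: `grep " server historical" /tmp/services.txt | druid_check_classpath`. The same check works on the text copied from "Show Command Line".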

Launch from Eclipse

Shut down the cluster by typing ^C (control-C) in the console window where the cluster is running.

Now, find the supervise config file mentioned above, say $DRUID_HOME/conf/supervise/single-server/micro-quickstart.conf. Comment out the process we want to run, say historical:

#historical bin/run-druid historical conf/druid/single-server/micro-quickstart
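
If you switch services often, the edit can be scripted. The two functions below are a sketch with made-up names; they only handle lines that begin with the service name, and they keep a copy of the file with a .bak suffix.

```shell
# Comment out the line for one service in a supervise config file.
# $1 = path to the .conf file, $2 = service name, e.g. historical
# Lines that are already commented out are left alone.
druid_disable_service() {
  sed -i.bak "s|^\($2 \)|#\1|" "$1"
}

# Undo: remove the leading # from that service's line.
druid_enable_service() {
  sed -i.bak "s|^#\($2 \)|\1|" "$1"
}
```

Usage: `druid_disable_service $DRUID_HOME/conf/supervise/single-server/micro-quickstart.conf historical`. Remember to enable the service again when you are done debugging.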

Launch the cluster again:

cd $DRUID_HOME
./bin/start-micro-quickstart

Wait for the services to start, then use the "Services" tab in the Druid UI to ensure that all services except historical are running.

Use the launch configuration created earlier to launch our historical node.