Build and Debug
Druid officially uses Java 8. However, developers have found that everything except a few extensions works well with Java 11.
Other products have newer dependencies. If you must use Java 8, then on the Mac, use jenv and brew to manage Java versions. See this post. Also, see this post for Java 8 in particular.
brew install jenv
brew tap adoptopenjdk/openjdk
brew install --cask adoptopenjdk8
jenv add /Library/Java/JavaVirtualMachines/adoptopenjdk-8.jdk/Contents/Home
If you use the newest Eclipse, you must change the JVM. The newest comes configured to use Java 16 by default. However, that version is strict about module enforcement and you'll get the following exception:
...module java.base does not "opens java.lang" to unnamed module...
See this StackOverflow article.
The solution is to configure Eclipse to use Java 11 instead.
mvn clean package -Pskip-static-checks -Pskip-tests -Dmaven.javadoc.skip=true -T1.0C
Note that if the Java version is not 8 (say, 11), the above build will appear to work, but the tar.gz file is not produced.
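Because the build can report success without producing the tarball, a quick check after the build can catch the problem early. This is a minimal sketch, assuming the tarball lands under distribution/target in the Druid source tree, as in the install steps on this page. The function name check_druid_dist is made up for illustration.

```shell
# Hypothetical helper: report whether the build produced a distribution tarball.
# Assumes the tarball is written to <druid source>/distribution/target.
check_druid_dist() {
  src_dir="$1"
  found=""
  for f in "$src_dir"/distribution/target/apache-druid-*-bin.tar.gz; do
    # If the glob matched nothing, $f is the literal pattern and -f fails.
    if [ -f "$f" ]; then
      found="$f"
      echo "Found: $f"
    fi
  done
  if [ -z "$found" ]; then
    echo "No tarball found; check the Java version used for the build" >&2
    return 1
  fi
}
```

Run it as `check_druid_dist <path to druid>` right after the Maven build.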
Or
# Druid aliases
alias druid-quick="mvn -T 8 -DskipTests -Dforbiddenapis.skip=true -Dcheckstyle.skip=true -Dpmd.skip=true -Dmaven.javadoc.skip=true -Danimal.sniffer.skip=true -Denforcer.skip=true -Dspotbugs.skip=true clean install"
See the documentation.
Create a directory for the Druid install, say ~/bin.
export DRUID_DEV=<path to druid>
export BIN_DIR=~/bin
export DRUID_VER=0.23.0-SNAPSHOT
export DRUID_HOME=$BIN_DIR/apache-druid-${DRUID_VER}
mkdir -p $BIN_DIR
cd $BIN_DIR
tar -xzf $DRUID_DEV/distribution/target/apache-druid-${DRUID_VER}-bin.tar.gz
cd $DRUID_HOME
It can be handy to put the environment variables in the shell startup script (.zshrc on the Mac), and the other commands in a script to run after each build.
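One way to package the post-build commands is a small shell function. This is a sketch only: install_druid is a hypothetical name, and it assumes DRUID_DEV, BIN_DIR and DRUID_VER are already exported as shown above.

```shell
# Hypothetical post-build helper: unpack a freshly built Druid into BIN_DIR.
# Expects DRUID_DEV, BIN_DIR and DRUID_VER to be set in the environment.
install_druid() {
  tarball="$DRUID_DEV/distribution/target/apache-druid-${DRUID_VER}-bin.tar.gz"
  if [ ! -f "$tarball" ]; then
    echo "Missing $tarball; did the build produce it?" >&2
    return 1
  fi
  # -p makes this safe to repeat after every build.
  mkdir -p "$BIN_DIR"
  # Extract over any earlier install; files in the tarball are overwritten.
  tar -xzf "$tarball" -C "$BIN_DIR"
}
```

Extracting over an existing install overwrites the files that ship in the tarball, so any local edits to those files need to be reapplied afterwards.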
Note: if using Jupyter, be aware that Jupyter's default port (8888) is the same as Druid's default port. Change one of them. For Jupyter:
jupyter notebook --port 9000
Launch locally:
cd $DRUID_HOME
./bin/start-micro-quickstart
Visit the UI: http://localhost:8888
It seems that this configuration includes the needed ZooKeeper and database.
Every system has its own way of handling configuration, and configuration has to work in both the production and development environments. Druid runs each of its services as a distinct Java process. The chain of events is:
- $DRUID_HOME/bin/start-micro-quickstart - A shell script which invokes the service script with the configuration to run.
- $DRUID_HOME/conf/supervise/single-server/micro-quickstart.conf - A Perl file which invokes each service:
broker bin/run-druid broker conf/druid/single-server/micro-quickstart
Each service is configured via a specific directory, pointed to by the above files. For example, $DRUID_HOME/conf/druid/single-server/micro-quickstart. There is one directory per service. Within a directory, say broker, there are three files:
- jvm.config - JVM config passed on the Java command line.
- main.config - JVM command passed to select the "main" routine.
- runtime.properties - Druid properties, passed as JVM -D settings, to configure Druid itself.
In production, a script assembles this information into a launch command. Our job, when running in the debugger, is to use the IDE to do this work.
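To make the assembly step concrete, here is a rough sketch of how such a command could be built from a service's config directory. It is not Druid's actual launch script: print_launch_cmd is a hypothetical name, the class path is simplified, and the lib/* entry for the Druid jars is an assumption.

```shell
# Hypothetical sketch: print a java command line for one service from its
# config directory. Not Druid's actual launch script.
print_launch_cmd() {
  conf_dir="$1"   # e.g. conf/druid/single-server/micro-quickstart
  service="$2"    # e.g. broker
  # Each file holds one argument per line; join the lines with spaces.
  jvm_args=$(paste -s -d ' ' "$conf_dir/$service/jvm.config")
  main_args=$(paste -s -d ' ' "$conf_dir/$service/main.config")
  # Class path: the service's config directory, the shared _common
  # directory, and (assumed) the Druid jars under lib/.
  printf '%s\n' "java $jvm_args -cp $conf_dir/$service:$conf_dir/_common:lib/* $main_args"
}
```

The function only prints the command, so the output can be compared against what the real scripts launch before copying pieces of it into an IDE.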
Druid must run as four processes (at least). We don't want to launch all four from the IDE. A trick from Gian is to start with a pre-built Druid: either one downloaded from the Druid project, or built locally. Let's assume we're using the one we built above. We start by running the "cluster" using the standard script as shown above. Ingest data and ensure all works properly.
First, while the micro-quickstart cluster is running, let's get the configuration we need.
ps aux | grep "java -server.*apache-druid" > /tmp/services.txt
In your IDE, create a launch configuration. These are the instructions for Eclipse:
- Kind: Java Application
- Name: Historical
- Project: services (?)
- Main Class: org.apache.druid.cli.Main
- Program Arguments: server historical
- Working Directory: the value of $DRUID_HOME
- JVM Arguments: see below
- Dependencies/Classpath: see below
JVM Arguments: Here we want to doctor up the arguments we captured above. Pull out the JVM arguments other than the class path and the standard Java setup. Here is an example; double-check that it is valid in your case. Also, note that $DRUID_HOME is not available in Eclipse, so fill in the actual path.
-Duser.timezone=UTC
-Dfile.encoding=UTF-8
-Djava.io.tmpdir=var/tmp
-Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager
Some other settings you might want to use:
-Dlog4j.configurationFile=<some path>/log4j2.xml
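Picking the options out of a long captured command line by eye is error-prone, so a small filter can help. The sketch below is an illustration, not part of Druid. It assumes that no argument contains embedded spaces, and the name extract_jvm_args is made up.

```shell
# Hypothetical helper: print the -D and -X options from one captured java
# command line, one per line. The class path (-cp and its value) and plain
# flags such as -server do not match the pattern and are skipped.
extract_jvm_args() {
  # Split on spaces; assumes no argument contains embedded spaces.
  printf '%s\n' "$1" | tr ' ' '\n' | grep -E '^-(D|X)'
}
```

For example, `extract_jvm_args "$(grep 'server historical' /tmp/services.txt)"` lists the candidate arguments for the historical process. Review the result before pasting it into the launch configuration.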
The command line we captured earlier identifies config files we need on the class path. Go to the Dependencies tab, click on Classpath, then "Advanced", and select "Add External Folder". Repeat to add each of the following:
$DRUID_HOME/conf/druid/single-server/micro-quickstart/historical
$DRUID_HOME/conf/druid/single-server/micro-quickstart/_common
Oddly, the following listed on the captured command line don't seem to actually exist:
$DRUID_HOME/conf/druid/single-server/micro-quickstart/_common/hadoop-xml
$DRUID_HOME/conf/druid/single-server/micro-quickstart/../_common
$DRUID_HOME/conf/druid/single-server/micro-quickstart/../_common/hadoop-xml
Move the entries you added to the top of the classpath list to mimic the captured command line.
To verify that all is good, click "Show Command Line". Ignore the Eclipse-provided items in the class path. Ensure that the rest looks like the command line we captured above.
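The missing directories noted above can also be spotted mechanically. The following sketch splits a class path on colons and reports which entries exist relative to the current directory. check_classpath is a hypothetical helper, and the class path value has to be copied by hand from the captured command line.

```shell
# Hypothetical helper: print each entry of a colon-separated class path and
# mark whether it exists relative to the current directory.
check_classpath() {
  printf '%s\n' "$1" | tr ':' '\n' | while IFS= read -r entry; do
    case "$entry" in
      # Entries ending in * are wildcards that the JVM expands itself.
      *'*') echo "wildcard $entry" ;;
      *)
        if [ -e "$entry" ]; then
          echo "ok $entry"
        else
          echo "missing $entry"
        fi
        ;;
    esac
  done
}
```

Run it from $DRUID_HOME, since the captured entries may be relative to that directory.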
Shut down the cluster by typing ^C (control-C) in the console window where the cluster is running.
Now, find the supervise config file mentioned above, say $DRUID_HOME/conf/supervise/single-server/micro-quickstart.conf. Comment out the process we want to run, say historical:
#historical bin/run-druid historical conf/druid/single-server/micro-quickstart
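If you switch services often, the edit can be scripted. This is a minimal sketch, assuming each service line starts with the service name followed by a space, as in the line above. The function name disable_service is made up for illustration.

```shell
# Hypothetical helper: comment out one service line in a supervise config.
# Keeps a backup copy with a .bak suffix next to the original.
disable_service() {
  conf_file="$1"
  service="$2"
  # -i.bak works with both GNU sed (Linux) and BSD sed (macOS).
  sed -i.bak "s|^${service} |#${service} |" "$conf_file"
}
```

For example, `disable_service $DRUID_HOME/conf/supervise/single-server/micro-quickstart.conf historical`. Running it a second time changes nothing, because the line no longer starts with the service name.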
Launch the cluster again:
cd $DRUID_HOME
./bin/start-micro-quickstart
Wait for the services to start, then use the "Services" tab in the Druid UI to ensure that all services except historical are running.
Use the launch configuration created earlier to launch our historical node.