Initial commit
kanakb committed Jul 27, 2016
0 parents commit 865a1e1
Showing 11 changed files with 1,638 additions and 0 deletions.
12 changes: 12 additions & 0 deletions .gitignore
@@ -0,0 +1,12 @@
cscope.*
.classpath
.project
.svn
target/
.idea
*.iml
*.ipr
*.iws
.settings/
out/
.DS_Store
307 changes: 307 additions & 0 deletions LICENSE

Large diffs are not rendered by default.

43 changes: 43 additions & 0 deletions NOTICE
@@ -0,0 +1,43 @@
kafka-assigner
Copyright 2016 Sift Science.

I. Included Software

This product includes software developed at
Sift Science (http://www.siftscience.com/).
Licensed under the Apache License 2.0.

This product includes software developed at
The Apache Software Foundation (http://www.apache.org/).
Licensed under the Apache License 2.0.

This product includes software developed at
args4j (http://args4j.kohsuke.org/).
Licensed under the MIT License.

This product includes software developed at
junit (http://junit.org/).
Licensed under the Eclipse Public License.

This product includes software developed at
Google (http://www.google.com/).
Licensed under the Apache License 2.0.

This product includes software developed at
json.org (http://www.json.org/).
Licensed under the JSON License.

This product includes software developed at
zkclient (https://github.com/sgroschupf/zkclient).
Licensed under the Apache License 2.0.

This product includes software developed at
scala (http://www.scala-lang.org/).
Licensed under the Scala License.

II. License Summary
- Apache License 2.0
- MIT License
- Eclipse Public License
- JSON License
- Scala License
85 changes: 85 additions & 0 deletions README.md
@@ -0,0 +1,85 @@
kafka-assigner
==============
This is a rack-aware tool for assigning Kafka partitions to brokers that minimizes data movement. It also includes the ability to inspect the current live brokers in the cluster and the current partition assignment.

**Using this tool will greatly simplify operations like decommissioning a broker, adding a new broker, or replacing a broker.**

# Why is this necessary?
Kafka's built-in algorithm is easy to use and monitor, but it does not take into account existing assignments of partitions to nodes. Instead, the burden is on the operator to either move entire topics across brokers, or come up with a sane way of moving some number of partitions of existing topics. This is extremely disruptive.

This tool _minimizes_ the number of partitions already assigned that need to leave a given node, while ensuring that each broker is responsible for a similar number of partitions. This enables use cases like node replacement, in which we would like to bring up a broker that is responsible for the same data as a misbehaving broker that it is replacing.
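
One way to make "data movement" concrete is to count, for each broker, the partitions it held before a reassignment but no longer holds afterwards. The following is a minimal sketch of that metric, not code from this tool: the class and method names are hypothetical, and it assumes each partition has a single owner (no replication).

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical helper for illustration only; not part of kafka-assigner.
final class MovementCountSketch {

    /** Counts partitions that a broker holds in {@code before} but not in {@code after}. */
    static int partitionsLeaving(Map<Integer, List<Integer>> before,
                                 Map<Integer, List<Integer>> after) {
        int moved = 0;
        for (Map.Entry<Integer, List<Integer>> entry : before.entrySet()) {
            // Partitions the same broker still holds after the reassignment.
            Set<Integer> stillHeld = new HashSet<>();
            List<Integer> nowHeld = after.get(entry.getKey());
            if (nowHeld != null) {
                stillHeld.addAll(nowHeld);
            }
            for (Integer partition : entry.getValue()) {
                if (!stillHeld.contains(partition)) {
                    moved++;
                }
            }
        }
        return moved;
    }

    public static void main(String[] args) {
        // Broker ID -> partitions. Broker 3 is being replaced by broker 4.
        Map<Integer, List<Integer>> before = new TreeMap<>();
        before.put(1, Arrays.asList(0, 1));
        before.put(2, Arrays.asList(2, 3));
        before.put(3, Arrays.asList(4, 5));

        // An assignment that preserves existing placements where possible.
        Map<Integer, List<Integer>> preserving = new TreeMap<>();
        preserving.put(1, Arrays.asList(0, 1));
        preserving.put(2, Arrays.asList(2, 3));
        preserving.put(4, Arrays.asList(4, 5));

        // An assignment computed from scratch, round-robin over brokers 1, 2, 4.
        Map<Integer, List<Integer>> fromScratch = new TreeMap<>();
        fromScratch.put(1, Arrays.asList(0, 3));
        fromScratch.put(2, Arrays.asList(1, 4));
        fromScratch.put(4, Arrays.asList(2, 5));

        System.out.println(partitionsLeaving(before, preserving));  // prints 2
        System.out.println(partitionsLeaving(before, fromScratch)); // prints 5
    }
}
```

In this toy example, only the two partitions on the replaced broker move when existing placements are preserved, while the from-scratch assignment moves five of the six.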

# How does this work?
This tool uses a strategy that behaves similarly to [Apache Helix](http://helix.apache.org)'s auto-rebalancing algorithm. It first assigns as many already-assigned partitions back to nodes as it can (while ensuring that no node is overloaded), and then evenly assigns all other partitions such that every node eventually ends up responsible for roughly the same number of partitions.
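
The two phases described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the tool's actual implementation: it ignores replication and rack awareness, assumes each partition has exactly one owner and that broker IDs are distinct, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical sketch of a two-phase "keep, then fill" assignment.
final class StickyAssignmentSketch {

    /**
     * @param current     existing broker -> partitions mapping (may include dead brokers)
     * @param liveBrokers brokers that are allowed to own partitions
     */
    static Map<Integer, List<Integer>> assign(Map<Integer, List<Integer>> current,
                                              List<Integer> liveBrokers) {
        if (liveBrokers.isEmpty()) {
            throw new IllegalArgumentException("need at least one live broker");
        }
        // Every known partition starts out unassigned.
        TreeSet<Integer> unassigned = new TreeSet<>();
        for (List<Integer> partitions : current.values()) {
            unassigned.addAll(partitions);
        }
        // Each broker gets `base` partitions; `extra` brokers get one more.
        int base = unassigned.size() / liveBrokers.size();
        int extra = unassigned.size() % liveBrokers.size();

        // Phase 1: give partitions back to their previous owner, up to a cap.
        Map<Integer, List<Integer>> result = new TreeMap<>();
        for (Integer broker : liveBrokers) {
            List<Integer> previous = current.get(broker);
            if (previous == null) {
                previous = new ArrayList<>();
            }
            int limit = base;
            if (extra > 0 && previous.size() > base) {
                limit = base + 1; // let an already-full broker keep one spare slot
                extra--;
            }
            List<Integer> kept = new ArrayList<>();
            for (Integer partition : previous) {
                if (kept.size() < limit && unassigned.remove(partition)) {
                    kept.add(partition);
                }
            }
            result.put(broker, kept);
        }

        // Phase 2: hand each remaining partition to the least-loaded broker.
        for (Integer partition : unassigned) {
            Integer target = null;
            for (Map.Entry<Integer, List<Integer>> entry : result.entrySet()) {
                if (target == null || entry.getValue().size() < result.get(target).size()) {
                    target = entry.getKey();
                }
            }
            result.get(target).add(partition);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<Integer, List<Integer>> current = new TreeMap<>();
        current.put(1, Arrays.asList(0, 1));
        current.put(2, Arrays.asList(2, 3));
        current.put(3, Arrays.asList(4, 5));
        // Broker 3 is replaced by broker 4.
        System.out.println(assign(current, Arrays.asList(1, 2, 4)));
        // prints {1=[0, 1], 2=[2, 3], 4=[4, 5]}
    }
}
```

In the replacement scenario shown in `main`, brokers 1 and 2 keep everything they had, and the new broker 4 picks up exactly the partitions that broker 3 owned.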

# How is this tool used?

## Get the tool
1. Download from the "Releases" page
2. `tar xf kafka-assigner-1.0-pkg.tar`
3. `cd kafka-assigner-1.0/bin`

## Run the tool
Requires Java 1.7+

```
./kafka-assignment-generator.sh [options...] arguments...
 --broker_hosts VAL            : comma-separated list of broker hostnames
                                 (instead of broker IDs)
 --broker_hosts_to_remove VAL  : comma-separated list of broker hostnames to
                                 exclude (instead of broker IDs)
 --disable_rack_awareness      : set to true to ignore rack configurations
 --integer_broker_ids VAL      : comma-separated list of Kafka broker IDs
                                 (integers)
 --mode [PRINT_CURRENT_ASSIGNMENT | PRINT_CURRENT_BROKERS | PRINT_REASSIGNMENT]
                               : the mode to run (PRINT_CURRENT_ASSIGNMENT,
                                 PRINT_CURRENT_BROKERS, PRINT_REASSIGNMENT)
 --topics VAL                  : comma-separated list of topics
 --zk_string VAL               : ZK quorum as comma-separated host:port pairs
```

### Example: reassign partitions to all live hosts
```
./kafka-assignment-generator.sh --zk_string my-zk-host:2181 --mode PRINT_REASSIGNMENT
```

The output JSON can then be fed into Kafka's reassign partitions command. See [here](http://kafka.apache.org/0100/ops.html#basic_ops_partitionassignment) for instructions.
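
For orientation, Kafka's reassignment command reads a JSON document with a `version` field and a `partitions` array whose entries name a topic, a partition, and its replica broker IDs. The sketch below renders that shape from an in-memory map. It is an illustration based on Kafka's documented format, not this tool's output code: the class name, method name, and the `events` topic are hypothetical, and it assumes topic names need no JSON escaping.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch; shows the JSON shape that Kafka's reassignment command reads.
final class ReassignmentJsonSketch {

    /** Renders partition -> replica broker IDs for one topic as reassignment JSON. */
    static String toJson(String topic, Map<Integer, List<Integer>> replicasByPartition) {
        StringBuilder json = new StringBuilder("{\"version\":1,\"partitions\":[");
        boolean first = true;
        for (Map.Entry<Integer, List<Integer>> entry : replicasByPartition.entrySet()) {
            if (!first) {
                json.append(',');
            }
            first = false;
            json.append("{\"topic\":\"").append(topic)
                .append("\",\"partition\":").append(entry.getKey())
                .append(",\"replicas\":[");
            List<Integer> replicas = entry.getValue();
            for (int i = 0; i < replicas.size(); i++) {
                if (i > 0) {
                    json.append(',');
                }
                json.append(replicas.get(i));
            }
            json.append("]}");
        }
        return json.append("]}").toString();
    }

    public static void main(String[] args) {
        // TreeMap keeps partitions in ascending order for stable output.
        Map<Integer, List<Integer>> replicas = new TreeMap<>();
        replicas.put(0, Arrays.asList(1, 2));
        replicas.put(1, Arrays.asList(2, 3));
        System.out.println(toJson("events", replicas));
    }
}
```

In Kafka 0.10, a file with this content is passed to `kafka-reassign-partitions.sh` via `--reassignment-json-file` together with `--execute`; the linked instructions remain the authoritative reference.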

### Example: reassign partitions to all but a few live hosts
This mode is useful for decommissioning or replacing a node. The partitions will be assigned to all live hosts, excluding the hosts that are specified.
```
./kafka-assignment-generator.sh --zk_string my-zk-host:2181 --mode PRINT_REASSIGNMENT --broker_hosts_to_remove misbehaving-host1,misbehaving-host2
```

The output JSON can then be fed into Kafka's reassign partitions command. See [here](http://kafka.apache.org/0100/ops.html#basic_ops_partitionassignment) for instructions.

### Example: reassign partitions to specific hosts
Note that in this mode, every host that should own partitions must be specified, including hosts that already own partitions.
```
./kafka-assignment-generator.sh --zk_string my-zk-host:2181 --mode PRINT_REASSIGNMENT --broker_hosts host1,host2,host3
```

The output JSON can then be fed into Kafka's reassign partitions command. See [here](http://kafka.apache.org/0100/ops.html#basic_ops_partitionassignment) for instructions.

### Example: print current brokers
```
./kafka-assignment-generator.sh --zk_string my-zk-host:2181 --mode PRINT_CURRENT_BROKERS
```

### Example: print current assignment
```
./kafka-assignment-generator.sh --zk_string my-zk-host:2181 --mode PRINT_CURRENT_ASSIGNMENT
```

# Building
Requires Java 1.7+ and Maven 3.2+

1. Clone this repository
2. `mvn package`
3. Artifacts are in `target/kafka-assigner-pkg`

# License
Licensed under the Apache License 2.0.
133 changes: 133 additions & 0 deletions pom.xml
@@ -0,0 +1,133 @@
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>siftscience</groupId>
<artifactId>kafka-assigner</artifactId>
<packaging>jar</packaging>
<version>1.0</version>

<name>kafka-assigner</name>
<description>Tools for reassigning Kafka partitions with minimal movement</description>
<url>http://maven.apache.org</url>

<licenses>
<license>
<name>Apache License, Version 2.0</name>
<url>http://www.apache.org/licenses/LICENSE-2.0.txt</url>
<distribution>repo</distribution>
</license>
</licenses>

<dependencies>
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<version>4.8.1</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>args4j</groupId>
<artifactId>args4j</artifactId>
<version>2.0.29</version>
</dependency>
<dependency>
<groupId>com.google.guava</groupId>
<artifactId>guava</artifactId>
<version>13.0.1</version>
</dependency>
<dependency>
<groupId>org.json</groupId>
<artifactId>json</artifactId>
<version>20131018</version>
</dependency>
<dependency>
<groupId>org.apache.commons</groupId>
<artifactId>commons-lang3</artifactId>
<version>3.1</version>
</dependency>
<dependency>
<groupId>org.apache.kafka</groupId>
<artifactId>kafka_2.11</artifactId>
<version>0.10.0.0</version>
</dependency>
<dependency>
<groupId>org.apache.kafka</groupId>
<artifactId>kafka-clients</artifactId>
<version>0.10.0.0</version>
</dependency>
</dependencies>

<build>
<defaultGoal>clean install</defaultGoal>
<plugins>

<plugin>
<groupId>org.codehaus.mojo</groupId>
<artifactId>appassembler-maven-plugin</artifactId>
<version>1.1.1</version>
<configuration>
<binFileExtensions>
<unix>.sh</unix>
</binFileExtensions>
<!-- Set the target configuration directory to be used in the bin scripts -->
<configurationDirectory>conf</configurationDirectory>
<!-- Copy the contents from "/src/main/config" to the target configuration directory in the assembled application -->
<copyConfigurationDirectory>true</copyConfigurationDirectory>
<!-- Include the target configuration directory in the beginning of the classpath declaration in the bin scripts -->
<includeConfigurationDirectoryInClasspath>true</includeConfigurationDirectoryInClasspath>
<assembleDirectory>${project.build.directory}/${project.artifactId}-pkg</assembleDirectory>
<!-- Extra JVM arguments that will be included in the bin scripts -->
<extraJvmArguments>-Xms512m -Xmx512m</extraJvmArguments>
<!-- Generate bin scripts for windows and unix by default -->
<platforms>
<platform>windows</platform>
<platform>unix</platform>
</platforms>
<programs>
<program>
<mainClass>siftscience.kafka.tools.KafkaAssignmentGenerator</mainClass>
<name>kafka-assignment-generator</name>
</program>
</programs>
</configuration>
<executions>
<execution>
<phase>package</phase>
<goals>
<goal>assemble</goal>
</goals>
</execution>
</executions>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-assembly-plugin</artifactId>
<version>2.3</version>
<configuration>
<descriptors>
<descriptor>${project.basedir}/src/assemble/assembly.xml</descriptor>
</descriptors>
</configuration>
<executions>
<execution>
<phase>package</phase>
<goals>
<goal>single</goal>
</goals>
</execution>
</executions>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-compiler-plugin</artifactId>
<version>3.0</version>
<configuration>
<source>1.5</source>
<target>1.5</target>
</configuration>
</plugin>
</plugins>
</build>

</project>
59 changes: 59 additions & 0 deletions src/assemble/assembly.xml
@@ -0,0 +1,59 @@
<?xml version="1.0" encoding="UTF-8"?>
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->
<assembly>
<id>pkg</id>
<formats>
<format>tar</format>
</formats>
<fileSets>
<fileSet>
<directory>${project.build.directory}/${project.artifactId}-pkg/bin</directory>
<outputDirectory>bin</outputDirectory>
<lineEnding>unix</lineEnding>
<fileMode>0755</fileMode>
<directoryMode>0755</directoryMode>
</fileSet>
<fileSet>
<directory>${project.build.directory}/${project.artifactId}-pkg/repo/</directory>
<outputDirectory>repo</outputDirectory>
<fileMode>0755</fileMode>
<directoryMode>0755</directoryMode>
<excludes>
<exclude>**/*.xml</exclude>
</excludes>
</fileSet>
<fileSet>
<directory>${project.build.directory}/${project.artifactId}-pkg/conf</directory>
<outputDirectory>conf</outputDirectory>
<lineEnding>unix</lineEnding>
<fileMode>0755</fileMode>
<directoryMode>0755</directoryMode>
</fileSet>
<fileSet>
<directory>${project.basedir}</directory>
<outputDirectory>/</outputDirectory>
<includes>
<include>LICENSE</include>
<include>NOTICE</include>
</includes>
<fileMode>0755</fileMode>
</fileSet>
</fileSets>
</assembly>
31 changes: 31 additions & 0 deletions src/main/config/log4j.properties
@@ -0,0 +1,31 @@
#
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
#

# Set root logger level to ERROR and its only appender to A1.
log4j.rootLogger=ERROR,A1

# A1 is set to be a ConsoleAppender.
log4j.appender.A1=org.apache.log4j.ConsoleAppender

# A1 uses PatternLayout.
log4j.appender.A1.layout=org.apache.log4j.PatternLayout
log4j.appender.A1.layout.ConversionPattern=%-4r [%t] %-5p %c %x - %m%n

log4j.logger.org.I0Itec=ERROR
log4j.logger.org.apache=ERROR