Jumbo is a tool that allows you to deploy a virtualized Hadoop cluster on a local machine in minutes. It is made to help you quickly bootstrap development environments without struggling with node and service configuration.
Jumbo is written in Python and relies on other tools that it coordinates:
- Vagrant, to manage the virtual machines;
- Ansible, to configure the cluster;
- Apache Ambari, to provision and manage the Hadoop cluster.
The distribution used for the Hadoop cluster is Hortonworks Data Platform.
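To make the division of labor above concrete, here is a minimal sketch of how a Python tool could chain Vagrant and Ansible. It is not Jumbo's actual code: the function names `build_pipeline` and `render`, and the default file names `inventory` and `playbook.yml`, are hypothetical. Only `vagrant up` and `ansible-playbook -i` are real commands. The Ambari provisioning step is left out of the sketch.

```python
import shlex


def build_pipeline(inventory="inventory", playbook="playbook.yml"):
    """Return the ordered commands for a Vagrant-then-Ansible bring-up.

    The file names are illustrative assumptions, not Jumbo's real layout.
    """
    return [
        # 1. Create and boot the virtual machines described by the Vagrantfile.
        ["vagrant", "up"],
        # 2. Configure the booted machines with an Ansible playbook.
        ["ansible-playbook", "-i", inventory, playbook],
    ]


def render(commands):
    """Format each command as a single shell-safe string for display."""
    return [shlex.join(cmd) for cmd in commands]
```

Each command list can then be executed in order with `subprocess.run(cmd, check=True)`, which stops the sequence as soon as one step fails.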
Jumbo was originally designed for developers with limited knowledge of the Hadoop deployment process, but it can be helpful to others too. Jumbo does everything needed to create and deploy a Hadoop cluster, so it is also useful if you need several environments (e.g. for different projects or for testing).
Complete documentation is available on the Jumbo website. Installation instructions are available on the installation page.
If you want local documentation, it is also available in GitBook format in the docs/ folder.
Current version: v0.4.4
- Add Kerberos support
- Add a -r option on addservice for automatic dependency installation
- "Proxify" Vagrant commands into Jumbo: start, stop, status, restart, delete
- Start HDP services on vagrant start
- Host the documentation on a website (jumbo.adaltas.com)
- Allow custom configurations via JSON (versions, urls...)
- Add informative commands (info, versions, available services...)
- Add support for all HDP services
- Generalize HA support
- Smart cluster topology based on available resources
- Allow duplicating an existing cluster under a different name
Jumbo is a very recent project. We would be happy to get feedback, so don't hesitate to open issues or even submit a PR if you need extra features!
Jumbo was developed by Gauthier Leonard and Xavier Hermand at Adaltas.
Jumbo is licensed under the MIT License. See LICENSE for the full license text.