This article is a detailed guide to deploying Zookeeper on AWS using Exhibitor for cluster management. Exhibitor is a great help for managing your cluster, but getting it up and running is not well documented. Hopefully this article corrects that deficiency.

tl;dr: The full source for this project can be found on GitHub.

We are going to use Ansible to provision our servers. If you aren’t familiar with Ansible, this article can serve as a practical guide to getting started.

The first step is downloading and installing Zookeeper. This is a fairly straightforward process of downloading the tarball from Apache and unpacking it into your installation directory. Our first provisioning file installs the required packages using apt, creates a zookeeper user and group, and downloads and unpacks the Zookeeper tarball. Save this file as provision.yml.

---
- name: Install Zookeeper
  hosts: all
  vars:
    version: 3.4.6
    user: zookeeper
    group: zookeeper
    install_dir: /opt/zookeeper
    data_dir: /opt/zookeeper/data
    log_dir: /opt/zookeeper/log

  tasks:
    - name: Install Required Packages
      sudo: true
      apt:
        update_cache: true
        pkg: "{{ item }}"
        state: present
      with_items:
        - wget
        - supervisor
        - openjdk-7-jdk

    - name: Create Group
      sudo: true
      group:
        name: "{{group}}"
        state: present

    - name: Create User
      sudo: true
      user:
        name: "{{user}}"
        group: "{{group}}"
        state: present

    - name: Download Zookeeper
      get_url:
        url: http://mirror.csclub.uwaterloo.ca/apache/zookeeper/zookeeper-{{version}}/zookeeper-{{version}}.tar.gz
        dest: /tmp/zookeeper-{{version}}.tar.gz

    - name: Extract Zookeeper
      sudo: true
      unarchive:
        src: /tmp/zookeeper-{{version}}.tar.gz
        dest: /opt/
        copy: no

    - name: Remove Existing Install
      sudo: true
      file:
        path: "{{ install_dir }}"
        state: absent

    - name: Move Zookeeper Install Directory
      sudo: true
      command: mv -f /opt/zookeeper-{{version}} "{{ install_dir }}"

    - name: Create Data Directory
      sudo: true
      file:
        path: "{{data_dir}}"
        owner: "{{user}}"
        group: "{{group}}"
        mode: 0755
        state: directory

    - name: Create Log Directory
      sudo: true
      file:
        path: "{{log_dir}}"
        owner: "{{user}}"
        group: "{{group}}"
        mode: 0755
        state: directory

    - name: Update Permissions
      sudo: true
      file:
        path: "{{ install_dir }}"
        owner: "{{ user }}"
        group: "{{ group }}"
        recurse: yes
        mode: 0755
        state: directory

We use Packer to create an AMI from this provisioning script that can be deployed to AWS. You will need to fill in your own source AMI, subnet, and security group to match your AWS setup. To build an AMI, Packer launches an EC2 instance, runs your provisioners on that instance, and then creates the AMI from it. Because the playbook runs locally on the EC2 host, the first provisioner must install Ansible there. Once Ansible is installed, the ansible-local provisioner runs our playbook to produce the AMI. Save this file as packer.json.

{
  "variables": {
    "aws_access_key": "",
    "aws_secret_key": ""
  },
  "builders": [{
    "type": "amazon-ebs",
    "access_key": "{{user `aws_access_key`}}",
    "secret_key": "{{user `aws_secret_key`}}",
    "ami_name": "zookeeper {{timestamp}}",
    "instance_type": "t2.micro",
    "source_ami": "ami-...",
    "region": "us-west-2",
    "subnet_id": "subnet-...",
    "security_group_id": "sg-...",
    "ssh_username": "ubuntu",
    "associate_public_ip_address": true
  }],
  "provisioners": [
    {
      "type": "shell",
      "inline": [
        "sleep 30",
        "sudo apt-add-repository ppa:rquillo/ansible",
        "sudo /usr/bin/apt-get update",
        "sudo /usr/bin/apt-get -y install ansible"
      ]
    },
    {
      "type": "ansible-local",
      "playbook_file": "provision.yml",
      "playbook_dir": "."
    }
  ]
}
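
Before building, it can help to confirm that the elided values in the template have been replaced. The following is a minimal Python sketch, not part of the original project: the function name unfilled_builder_keys and the set of keys it checks are assumptions made for illustration, based on the three values this article says you must supply yourself.

```python
import json

# Builder keys the article says you must fill in for your own AWS setup.
REQUIRED_KEYS = ("source_ami", "subnet_id", "security_group_id")


def unfilled_builder_keys(template_text):
    """Return (builder index, key) pairs whose value is empty or still elided."""
    template = json.loads(template_text)
    problems = []
    for index, builder in enumerate(template.get("builders", [])):
        for key in REQUIRED_KEYS:
            value = builder.get(key, "")
            # The template in this article ships with values such as "ami-...".
            if not value or value.endswith("..."):
                problems.append((index, key))
    return problems
```

Passing the contents of packer.json to this function returns an empty list once every placeholder has a real value. It only inspects the text of the template and does not check that the AMI, subnet, or security group actually exist in your account.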

At this point you can create a Makefile encapsulating the steps for building your AMI.

build-ami:
	packer build \
	-var 'aws_access_key=${AWS_ACCESS_KEY}' \
	-var 'aws_secret_key=${AWS_SECRET_KEY}' \
	packer.json

You can then deploy the AMI using EC2.

Given this starting point, you can expand the Ansible playbook to install Exhibitor. Those details are left out of this article; consult the GitHub repo for the full source code.

Configuring Zookeeper with Exhibitor

Pre-check list:

  • Your Zookeeper/Exhibitor instances have an IAM Role with full S3 access.
  • Your Zookeeper/Exhibitor instances have public IP addresses.
  • Your security group allows access to port 8080.
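
The last item on this list can be checked from your workstation before you open a browser. This is a small Python sketch using only the standard library; the function name port_is_open and the timeout value are illustrative choices, and the check says nothing about the IAM role or the S3 permissions.

```python
import socket


def port_is_open(host, port=8080, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hostnames.
        return False
```

Calling port_is_open with the public IP address of one of your instances should return True if the security group and the Exhibitor process are both set up. A False result does not tell you which of the two is at fault.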

Now you can navigate to port 8080 of one of your instances with your browser. You will be presented with the Exhibitor Control Panel listing your instances and their status.

Exhibitor Control Panel

If you’ve gotten this far, it’s time to configure your instances. Navigate to the Config tab.

Exhibitor Config Tab

Enter edit mode to make changes to your configuration. All changes will be saved to S3 and shared between running Zookeeper instances.

Exhibitor Editing

The change you want to make is to the list of servers in the cluster. The format is S:NodeId:IpAddress, where S signifies a server node, NodeId is a unique identifier for the node, and IpAddress is the IP address of the node. For the IpAddress field, enter the private address of each node. You only need to enter these values on one node; all nodes will see the changes (after restarting).

Here’s an example for three nodes with ids 1, 2, and 3 that have Private DNS entries like the following:

ip-192-32-7-100.us-west-2.compute.internal
ip-192-32-7-101.us-west-2.compute.internal
ip-192-32-7-102.us-west-2.compute.internal

The server list for these nodes is:

S:1:ip-192-32-7-100,S:2:ip-192-32-7-101,S:3:ip-192-32-7-102
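
If you have more than a handful of nodes, the server list can be generated rather than typed by hand. The sketch below is one way to do it in Python; the helper names short_hostname and servers_spec are made up for this example, and ids are assigned in list order starting from 1, which is an assumption rather than something Exhibitor requires.

```python
def short_hostname(private_dns):
    """Drop the domain suffix, keeping only the first label of the DNS name."""
    return private_dns.split(".", 1)[0]


def servers_spec(private_dns_names, first_id=1):
    """Build a server list of the form S:NodeId:Host,S:NodeId:Host,..."""
    entries = []
    for node_id, name in enumerate(private_dns_names, start=first_id):
        entries.append(f"S:{node_id}:{short_hostname(name)}")
    return ",".join(entries)


nodes = [
    "ip-192-32-7-100.us-west-2.compute.internal",
    "ip-192-32-7-101.us-west-2.compute.internal",
    "ip-192-32-7-102.us-west-2.compute.internal",
]
print(servers_spec(nodes))
# prints S:1:ip-192-32-7-100,S:2:ip-192-32-7-101,S:3:ip-192-32-7-102
```

Because the ids come from the position in the list, keep the order stable between runs so that an existing node does not end up with a different id.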

When you save this configuration, the Zookeeper instances restart and use the new configuration copied from S3.

You can verify the status of your cluster using the Exhibitor Control Panel UI.

Exhibitor Control Panel

At this point you have a running Zookeeper cluster, and Exhibitor will make sure the Zookeeper process stays running.
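
Besides the Exhibitor UI, each node can also be probed directly with Zookeeper's four-letter commands, such as ruok (a healthy server replies imok) and stat. The following Python sketch makes several assumptions not covered in this article: Zookeeper is listening on its default client port 2181, your security group allows that port from wherever you run the script, and the four-letter commands are enabled (newer Zookeeper releases can restrict them). The function names are illustrative.

```python
import socket


def four_letter_word(host, command, port=2181, timeout=3.0):
    """Send a four-letter command such as 'ruok' or 'stat' and return the reply."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(command.encode("ascii"))
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:  # the server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_mode(stat_reply):
    """Return the value of the 'Mode:' line in a stat reply, or None if absent."""
    for line in stat_reply.splitlines():
        if line.startswith("Mode:"):
            return line.split(":", 1)[1].strip()
    return None
```

For example, parse_mode(four_letter_word(host, "stat")) run against each node in turn should report one leader and the rest as followers in a healthy three-node cluster, where host is the address of a node reachable from your machine.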