Remote Monitoring of Apache Cassandra running in Docker via JMX using Datadog


This is a step-by-step guide on how to monitor an Apache Cassandra database running as a Docker container using the cloud monitoring service Datadog.

1. Create your own Docker image of Cassandra

If you haven’t done it already, create a new Git repository and add two files there:

  • Dockerfile
  • jmxremote.password

Dockerfile:

FROM cassandra:latest

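# The JMX password file lets the Datadog agent authenticate over JMX
COPY ./jmxremote.password /etc/cassandra/jmxremote.password
# The JVM rejects a password file that other users can read
RUN chmod 400 /etc/cassandra/jmxremote.password

# Second copy in the OpenJDK 8 management directory (this path is specific to Java 8)
COPY ./jmxremote.password /etc/java-8-openjdk/management/jmxremote.password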

jmxremote.password:

monitorRole QED

This allows a user named “monitorRole” with the password “QED” to connect to Cassandra over JMX.
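Building the image could look like the sketch below. The image tag `my-cassandra-jmx` is a made-up name used only for illustration, and the command is wrapped in a small function so nothing runs until you call it.

```shell
# Hypothetical image tag; choose whatever fits your registry.
CASSANDRA_IMAGE="my-cassandra-jmx"

# Build the image from the directory that holds Dockerfile and jmxremote.password.
build_cassandra_image() {
  docker build -t "${CASSANDRA_IMAGE}" .
}
```

Run `build_cassandra_image` from the root of the repository created above.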

2. Run Cassandra Docker image with additional parameters

Run the Docker image created in the previous step with two additional environment variables:

  • JVM_OPTS=-Djava.rmi.server.hostname=[HERE GOES HOSTNAME OF YOUR CASSANDRA]
  • LOCAL_JMX=no

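By default, Cassandra allows local JMX connections only. Setting LOCAL_JMX=no makes it accept remote connections, authenticated against /etc/cassandra/jmxremote.password, and java.rmi.server.hostname sets the hostname that remote JMX clients are told to connect to.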
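As a rough sketch, the two environment variables can be passed with `-e` on `docker run`. The hostname `cassandra.example.com`, the image tag `my-cassandra-jmx` and the container name are placeholders, and publishing port 7199 assumes the default JMX port is in use.

```shell
# Placeholders: replace with your own hostname and image tag.
CASSANDRA_HOST="cassandra.example.com"
CASSANDRA_IMAGE="my-cassandra-jmx"

# Start Cassandra with remote JMX enabled.
start_cassandra() {
  # -p 7199:7199 publishes the JMX port (assumed to be the default 7199).
  docker run -d --name cassandra \
    -e JVM_OPTS="-Djava.rmi.server.hostname=${CASSANDRA_HOST}" \
    -e LOCAL_JMX=no \
    -p 7199:7199 \
    "${CASSANDRA_IMAGE}"
}
```

Calling `start_cassandra` starts the container in the background. The hostname should be one that the Datadog agent can resolve and reach.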

3. Create your own Docker image of the Datadog agent

Create a new Git repository and put two files there:

  • Dockerfile
  • cassandra.yaml

Dockerfile:

# Datadog agent image with the Cassandra check enabled
FROM datadog/docker-dd-agent

# Install JMXFetch dependencies
RUN apt-get update \
&& apt-get install openjdk-7-jre-headless -qq --no-install-recommends

# Add Cassandra check configuration
ADD cassandra.yaml /etc/dd-agent/conf.d/cassandra.yaml

cassandra.yaml:

instances:
  - host: [HERE GOES HOSTNAME OF YOUR CASSANDRA]
    port: [HERE GOES JMX PORT OF YOUR CASSANDRA, TYPICALLY 7199]
    cassandra_aliasing: true
    user: monitorRole
    password: QED
    #name: cassandra_instance
    #trust_store_path: /path/to/trustStore.jks # Optional, should be set if ssl is enabled
    #trust_store_password: password
    #java_bin_path: /path/to/java #Optional, should be set if the agent cannot find your java executable

init_config:
  # List of metrics to be collected by the integration
  # Read http://docs.datadoghq.com/integrations/java/ to learn how to customize it
  conf:
    - include:
        domain: org.apache.cassandra.metrics
        type: ClientRequest
        scope:
          - Read
          - Write
        name:
          - Latency
          - Timeouts
          - Unavailables
        attribute:
          - Count
          - OneMinuteRate
    - include:
        domain: org.apache.cassandra.metrics
        type: ClientRequest
        scope:
          - Read
          - Write
        name:
          - TotalLatency
    - include:
        domain: org.apache.cassandra.metrics
        type: Storage
        name:
          - Load
          - Exceptions
    - include:
        domain: org.apache.cassandra.metrics
        type: ColumnFamily
        name:
          - TotalDiskSpaceUsed
          - BloomFilterDiskSpaceUsed
          - BloomFilterFalsePositives
          - BloomFilterFalseRatio
          - CompressionRatio
          - LiveDiskSpaceUsed
          - LiveSSTableCount
          - MaxRowSize
          - MeanRowSize
          - MemtableColumnsCount
          - MemtableLiveDataSize
          - MemtableSwitchCount
          - MinRowSize
      exclude:
        keyspace:
          - system
          - system_auth
          - system_distributed
          - system_traces
    - include:
        domain: org.apache.cassandra.metrics
        type: Cache
        name:
          - Capacity
          - Size
        attribute:
          - Value
    - include:
        domain: org.apache.cassandra.metrics
        type: Cache
        name:
          - Hits
          - Requests
        attribute:
          - Count
    - include:
        domain: org.apache.cassandra.metrics
        type: ThreadPools
        path: request
        name:
          - ActiveTasks
          - CompletedTasks
          - PendingTasks
          - CurrentlyBlockedTasks
    - include:
        domain: org.apache.cassandra.db
        attribute:
          - UpdateInterval

The cassandra.yaml file contains the connection information for the Datadog agent and the list of metrics to collect.

4. Run the Datadog agent

It probably makes sense to run the Datadog agent container on the same machine as Cassandra, so that it also collects host metrics for the hardware Cassandra runs on. I have not verified that this is the best setup, though.
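A minimal sketch of building and starting the agent image follows. It assumes the agent image accepts the API key through the `API_KEY` environment variable and uses the usual volume mounts for host and Docker metrics. The tag `my-dd-agent-cassandra`, the container name `dd-agent` and the `DD_API_KEY` shell variable are names invented for this example.

```shell
# Hypothetical image tag for the agent image built in step 3.
AGENT_IMAGE="my-dd-agent-cassandra"

# Build the image from the directory that holds Dockerfile and cassandra.yaml.
build_agent_image() {
  docker build -t "${AGENT_IMAGE}" .
}

# Start the agent; DD_API_KEY must contain your own Datadog API key.
start_agent() {
  # The read-only mounts let the agent see Docker and host metrics.
  docker run -d --name dd-agent \
    -v /var/run/docker.sock:/var/run/docker.sock:ro \
    -v /proc/:/host/proc/:ro \
    -v /sys/fs/cgroup/:/host/sys/fs/cgroup:ro \
    -e API_KEY="${DD_API_KEY:?set DD_API_KEY first}" \
    "${AGENT_IMAGE}"
}
```

Call `build_agent_image` once, then `start_agent` with `DD_API_KEY` exported in your shell.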

5. Enable Cassandra integration in Datadog

To start collecting data, you have to install the Cassandra integration in Datadog. As a quick check, visualise the cassandra.latency.one_minute_rate metric, which represents the rate of read/write requests.
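If no data shows up, one way to investigate is to ask the agent for its status report. The sketch below assumes the agent container is named `dd-agent` and that the image ships the `service datadog-agent info` command. Both are assumptions about your setup.

```shell
# Assumption: name given to the agent container when it was started.
AGENT_CONTAINER="dd-agent"

# Print the agent's status report, which should list the Cassandra check.
agent_info() {
  docker exec "${AGENT_CONTAINER}" service datadog-agent info
}
```

Running `agent_info` prints the report to the terminal. Look for the Cassandra instance and any connection errors.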