# Cluster Monitoring Deployment
After deploying a GreptimeDB cluster with the GreptimeDB Operator, its components (Metasrv / Datanode / Frontend) by default expose a `/metrics` endpoint on their HTTP port (default `4000`) for Prometheus metrics.
We provide two approaches to monitor the GreptimeDB cluster:

- **Enable GreptimeDB Self-Monitoring**: The GreptimeDB Operator launches an additional GreptimeDB Standalone instance and a Vector sidecar container to collect and store metrics and logs from the GreptimeDB cluster.
- **Use Prometheus Operator to Configure Prometheus Metrics Monitoring**: Users first need to deploy the Prometheus Operator and create a Prometheus instance, then use the Prometheus Operator's `PodMonitor` to write GreptimeDB cluster metrics into Prometheus.
Users can choose the appropriate monitoring approach based on their needs.
## Enable GreptimeDB Self-Monitoring
In self-monitoring mode, the GreptimeDB Operator launches an additional GreptimeDB Standalone instance to collect metrics and logs from the GreptimeDB cluster, including cluster logs and slow query logs. To collect log data, the GreptimeDB Operator starts a Vector sidecar container in each Pod. When this mode is enabled, JSON-format logging is automatically enabled for the cluster.
If you deploy the GreptimeDB cluster using the Helm Chart (refer to Getting Started), you can configure the `values.yaml` file as follows:
```yaml
monitoring:
  enabled: true
```
This will deploy a GreptimeDB Standalone instance named `${cluster}-monitoring` to collect metrics and logs. You can check it with:

```bash
kubectl get greptimedbstandalones.greptime.io ${cluster}-monitoring -n ${namespace}
```
By default, this GreptimeDB Standalone instance stores monitoring data in local storage using the Kubernetes default StorageClass. You can adjust this based on your needs.
The GreptimeDB Standalone instance can be configured via the `monitoring.standalone` field in `values.yaml`, for example:
```yaml
monitoring:
  enabled: true
  standalone:
    base:
      main:
        # Configure GreptimeDB Standalone instance image
        image: "greptime/greptimedb:latest"
        # Configure GreptimeDB Standalone instance resources
        resources:
          requests:
            cpu: "2"
            memory: "4Gi"
          limits:
            cpu: "2"
            memory: "4Gi"
    # Configure object storage for GreptimeDB Standalone instance
    objectStorage:
      s3:
        # Configure bucket
        bucket: "monitoring"
        # Configure region
        region: "ap-southeast-1"
        # Configure secret name
        secretName: "s3-credentials"
        # Configure root path
        root: "standalone-with-s3-data"
```
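The `secretName` above references a Kubernetes Secret holding the S3 credentials, which you must create separately. As a sketch, such a Secret might look like the following; the key names (`access-key-id`, `secret-access-key`) and namespace are assumptions, so check the GreptimeDB Operator documentation for the exact keys it expects:

```yaml
apiVersion: v1
kind: Secret
metadata:
  # Must match `objectStorage.s3.secretName`
  name: s3-credentials
  namespace: default
type: Opaque
stringData:
  # Key names are assumptions; consult the operator docs for the expected keys
  access-key-id: "<YOUR_ACCESS_KEY_ID>"
  secret-access-key: "<YOUR_SECRET_ACCESS_KEY>"
```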
The GreptimeDB Standalone instance will expose services using `${cluster}-monitor-standalone` as the Kubernetes Service name. You can use the following addresses to read monitoring data:

- Prometheus metrics: `http://${cluster}-monitor-standalone.${namespace}.svc.cluster.local:4000/v1/prometheus`
- SQL logs: `${cluster}-monitor-standalone.${namespace}.svc.cluster.local:4002`

By default, cluster logs are stored in the `public._gt_logs` table and slow query logs are stored in the `public._gt_slow_queries` table.
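The two addresses follow directly from the cluster name and namespace. A minimal sketch, using the placeholder values `mycluster` and `default`:

```shell
# Derive the self-monitoring endpoints from the cluster name and namespace
# (mycluster/default are placeholders for illustration)
cluster=mycluster
namespace=default

metrics_url="http://${cluster}-monitor-standalone.${namespace}.svc.cluster.local:4000/v1/prometheus"
logs_addr="${cluster}-monitor-standalone.${namespace}.svc.cluster.local:4002"

echo "$metrics_url"
echo "$logs_addr"
```

The logs address speaks the MySQL protocol, so once connected you can query the log tables with ordinary SQL, e.g. `SELECT * FROM public._gt_logs LIMIT 10;`.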
The Vector sidecar configuration for log collection can be customized via the `monitoring.vector` field:
```yaml
monitoring:
  enabled: true
  vector:
    # Configure Vector image registry
    registry: docker.io
    # Configure Vector image repository
    repository: timberio/vector
    # Configure Vector image tag
    tag: nightly-alpine
    # Configure Vector resources
    resources:
      requests:
        cpu: "50m"
        memory: "64Mi"
      limits:
        cpu: "50m"
        memory: "64Mi"
```
If you're not using the Helm Chart, you can manually configure self-monitoring mode in the `GreptimeDBCluster` YAML:
```yaml
apiVersion: greptime.io/v1alpha1
kind: GreptimeDBCluster
metadata:
  name: basic
spec:
  base:
    main:
      image: greptime/greptimedb:latest
  frontend:
    replicas: 1
  meta:
    replicas: 1
    etcdEndpoints:
      - "etcd.etcd-cluster.svc.cluster.local:2379"
  datanode:
    replicas: 1
  monitoring:
    enabled: true
```
The `monitoring` field configures self-monitoring mode. See the `GreptimeDBCluster` API docs for details.
## Use Prometheus Operator to Configure Prometheus Metrics Monitoring
Users need to first deploy the Prometheus Operator and create a Prometheus instance. For example, you can use kube-prometheus-stack to deploy the Prometheus stack; refer to its official documentation for more details.
After deploying the Prometheus Operator and a Prometheus instance, you can configure Prometheus monitoring via the `prometheusMonitor` field in `values.yaml`:
```yaml
prometheusMonitor:
  # Enable Prometheus monitoring - this will create PodMonitor resources
  enabled: true
  # Configure scrape interval
  interval: "30s"
  # Configure labels
  labels:
    release: prometheus
```
The `labels` field must match the `matchLabels` field used to create the Prometheus instance; otherwise, metrics collection won't work properly.
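For illustration, here is a minimal sketch of the matching side, assuming a Prometheus instance defined via the Prometheus Operator's `Prometheus` CRD (the instance name is hypothetical):

```yaml
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: prometheus
spec:
  # Only PodMonitors carrying these labels are selected by this Prometheus
  # instance, so they must match the `labels` set under `prometheusMonitor`
  podMonitorSelector:
    matchLabels:
      release: prometheus
```

If you deployed Prometheus via kube-prometheus-stack, the selector typically matches the Helm release label, which is why `release: prometheus` is a common value.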
After configuring `prometheusMonitor`, the GreptimeDB Operator will automatically create `PodMonitor` resources and import metrics into Prometheus. You can check the `PodMonitor` resources with:
```bash
kubectl get podmonitors.monitoring.coreos.com -n ${namespace}
```
If you're not using the Helm Chart, you can manually configure Prometheus monitoring in the `GreptimeDBCluster` YAML:
```yaml
apiVersion: greptime.io/v1alpha1
kind: GreptimeDBCluster
metadata:
  name: basic
spec:
  base:
    main:
      image: greptime/greptimedb:latest
  frontend:
    replicas: 1
  meta:
    replicas: 1
    etcdEndpoints:
      - "etcd.etcd-cluster.svc.cluster.local:2379"
  datanode:
    replicas: 1
  prometheusMonitor:
    enabled: true
    interval: "30s"
    labels:
      release: prometheus
```
The `prometheusMonitor` field configures Prometheus monitoring.
## Import Grafana Dashboards
The GreptimeDB cluster currently provides 3 Grafana dashboards:

- Cluster Metrics Dashboard
- Cluster Logs Dashboard
- Slow Query Logs Dashboard

The Cluster Logs Dashboard and Slow Query Logs Dashboard are only for self-monitoring mode, while the Cluster Metrics Dashboard works for both self-monitoring and Prometheus monitoring modes.
If using the Helm Chart, you can enable `grafana.enabled` to deploy Grafana and import the dashboards automatically (see Getting Started):

```yaml
grafana:
  enabled: true
```
If you already have Grafana deployed, follow these steps to import the dashboards:

1. **Add Data Sources**

   You can refer to Grafana's datasources docs to add the following 3 data sources:

   - `metrics` data source: For importing Prometheus metrics; works with both monitoring modes. For self-monitoring mode, use `http://${cluster}-monitor-standalone.${namespace}.svc.cluster.local:4000/v1/prometheus` as the URL. For your own Prometheus instance, use your Prometheus instance's URL.
   - `information-schema` data source: For importing cluster metadata via SQL; works with both monitoring modes. Use `${cluster}-frontend.${namespace}.svc.cluster.local:4002` as the SQL address, with database `information_schema`.
   - `logs` data source: For importing cluster and slow query logs via SQL; only works with self-monitoring mode. Use `${cluster}-monitor-standalone.${namespace}.svc.cluster.local:4002` as the SQL address, with database `public`.

2. **Import Dashboards**

   You can refer to Grafana's Import dashboards docs.
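Instead of adding the data sources by hand, they can also be provisioned declaratively via Grafana's provisioning mechanism. A minimal sketch, assuming a cluster named `mycluster` in namespace `default` (names, URLs, and the use of the MySQL data source type for the SQL endpoints are illustrative assumptions):

```yaml
# Grafana provisioning file, e.g. /etc/grafana/provisioning/datasources/greptimedb.yaml
apiVersion: 1
datasources:
  # Prometheus metrics (self-monitoring mode endpoint shown)
  - name: metrics
    type: prometheus
    url: http://mycluster-monitor-standalone.default.svc.cluster.local:4000/v1/prometheus
  # Cluster metadata over the MySQL protocol
  - name: information-schema
    type: mysql
    url: mycluster-frontend.default.svc.cluster.local:4002
    database: information_schema
  # Cluster and slow query logs over the MySQL protocol (self-monitoring mode only)
  - name: logs
    type: mysql
    url: mycluster-monitor-standalone.default.svc.cluster.local:4002
    database: public
```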