# Cluster Monitoring Deployment
After deploying a GreptimeDB cluster with the GreptimeDB Operator, its components (Metasrv / Datanode / Frontend) by default expose a `/metrics` endpoint on their HTTP port (default `4000`) for Prometheus metrics.
We provide two approaches to monitor the GreptimeDB cluster:

- **Enable GreptimeDB Self-Monitoring**: The GreptimeDB Operator launches an additional GreptimeDB Standalone instance and a Vector sidecar container to collect and store metrics and logs from the GreptimeDB cluster.
- **Use Prometheus Operator to Configure Prometheus Metrics Monitoring**: Users first need to deploy the Prometheus Operator and create a Prometheus instance, then use the Prometheus Operator's `PodMonitor` to write GreptimeDB cluster metrics into Prometheus.
Users can choose the appropriate monitoring approach based on their needs.
## Enable GreptimeDB Self-Monitoring
In self-monitoring mode, the GreptimeDB Operator launches an additional GreptimeDB Standalone instance to collect metrics and logs from the GreptimeDB cluster, including cluster logs and slow query logs. To collect log data, the GreptimeDB Operator starts a Vector sidecar container in each Pod. When this mode is enabled, JSON-format logging is automatically enabled for the cluster.
If you deploy the GreptimeDB cluster using the Helm Chart (refer to Getting Started), you can configure the `values.yaml` file as follows:
```yaml
monitoring:
  enabled: true
```
This will deploy a GreptimeDB Standalone instance named `${cluster}-monitoring` to collect metrics and logs. You can check it with:

```bash
kubectl get greptimedbstandalones.greptime.io ${cluster}-monitoring -n ${namespace}
```
By default, this GreptimeDB Standalone instance stores monitoring data in local storage using the Kubernetes default StorageClass. You can adjust this based on your needs.
The GreptimeDB Standalone instance can be configured via the `monitoring.standalone` field in `values.yaml`, for example:
```yaml
monitoring:
  enabled: true
  standalone:
    base:
      main:
        # Configure GreptimeDB Standalone instance image
        image: "greptime/greptimedb:latest"
        # Configure GreptimeDB Standalone instance resources
        resources:
          requests:
            cpu: "2"
            memory: "4Gi"
          limits:
            cpu: "2"
            memory: "4Gi"
    # Configure object storage for GreptimeDB Standalone instance
    objectStorage:
      s3:
        # Configure bucket
        bucket: "monitoring"
        # Configure region
        region: "ap-southeast-1"
        # Configure secret name
        secretName: "s3-credentials"
        # Configure root path
        root: "standalone-with-s3-data"
```
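The `secretName` above references a Kubernetes Secret holding the S3 credentials, which you must create separately. As a sketch, such a Secret might look like the following; the key names (`access-key-id`, `secret-access-key`) and namespace are assumptions, so check the GreptimeDB Operator documentation for the exact keys it expects:

```yaml
apiVersion: v1
kind: Secret
metadata:
  # Must match `objectStorage.s3.secretName`
  name: s3-credentials
  namespace: default
type: Opaque
stringData:
  # Key names are assumptions; consult the operator docs for the expected keys
  access-key-id: "<YOUR_ACCESS_KEY_ID>"
  secret-access-key: "<YOUR_SECRET_ACCESS_KEY>"
```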
The GreptimeDB Standalone instance will expose services using `${cluster}-monitor-standalone` as the Kubernetes Service name. You can use the following addresses to read monitoring data:

- Prometheus metrics: `http://${cluster}-monitor-standalone.${namespace}.svc.cluster.local:4000/v1/prometheus`
- SQL logs: `${cluster}-monitor-standalone.${namespace}.svc.cluster.local:4002`

By default, cluster logs are stored in the `public._gt_logs` table and slow query logs are stored in the `public._gt_slow_queries` table.
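The two addresses follow directly from the cluster name and namespace. A minimal sketch, using the placeholder values `mycluster` and `default`:

```shell
# Derive the self-monitoring endpoints from the cluster name and namespace
# (mycluster/default are placeholders for illustration)
cluster=mycluster
namespace=default

metrics_url="http://${cluster}-monitor-standalone.${namespace}.svc.cluster.local:4000/v1/prometheus"
logs_addr="${cluster}-monitor-standalone.${namespace}.svc.cluster.local:4002"

echo "$metrics_url"
echo "$logs_addr"
```

The logs address speaks the MySQL protocol, so once connected you can query the log tables with ordinary SQL, e.g. `SELECT * FROM public._gt_logs LIMIT 10;`.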
The Vector sidecar configuration for log collection can be customized via the `monitoring.vector` field:
```yaml
monitoring:
  enabled: true
  vector:
    # Configure Vector image registry
    registry: docker.io
    # Configure Vector image repository
    repository: timberio/vector
    # Configure Vector image tag
    tag: nightly-alpine
    # Configure Vector resources
    resources:
      requests:
        cpu: "50m"
        memory: "64Mi"
      limits:
        cpu: "50m"
        memory: "64Mi"
```
If you're not using the Helm Chart, you can manually configure self-monitoring mode in the `GreptimeDBCluster` YAML:
```yaml
apiVersion: greptime.io/v1alpha1
kind: GreptimeDBCluster
metadata:
  name: basic
spec:
  base:
    main:
      image: greptime/greptimedb:latest
  frontend:
    replicas: 1
  meta:
    replicas: 1
    etcdEndpoints:
      - "etcd.etcd-cluster.svc.cluster.local:2379"
  datanode:
    replicas: 1
  monitoring:
    enabled: true
```
The `monitoring` field configures self-monitoring mode. See the `GreptimeDBCluster` API docs for details.
## Use Prometheus Operator to Configure Prometheus Metrics Monitoring
Users need to first deploy the Prometheus Operator and create a Prometheus instance. For example, you can use kube-prometheus-stack to deploy the Prometheus stack; refer to its official documentation for more details.
After deploying the Prometheus Operator and a Prometheus instance, you can configure Prometheus monitoring via the `prometheusMonitor` field in `values.yaml`:
```yaml
prometheusMonitor:
  # Enable Prometheus monitoring - this will create PodMonitor resources
  enabled: true
  # Configure scrape interval
  interval: "30s"
  # Configure labels
  labels:
    release: prometheus
```
The `labels` field must match the `matchLabels` field used to create the Prometheus instance; otherwise, metrics collection won't work properly.
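For illustration, here is a minimal sketch of the matching side, assuming a Prometheus instance defined via the Prometheus Operator's `Prometheus` CRD (the instance name is hypothetical):

```yaml
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: prometheus
spec:
  # Only PodMonitors carrying these labels are selected by this Prometheus
  # instance, so they must match the `labels` set under `prometheusMonitor`
  podMonitorSelector:
    matchLabels:
      release: prometheus
```

If you deployed Prometheus via kube-prometheus-stack, the selector typically matches the Helm release label, which is why `release: prometheus` is a common value.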
After configuring `prometheusMonitor`, the GreptimeDB Operator will automatically create `PodMonitor` resources and import metrics into Prometheus. You can check the `PodMonitor` resources with:
```bash
kubectl get podmonitors.monitoring.coreos.com -n ${namespace}
```
If you're not using the Helm Chart, you can manually configure Prometheus monitoring in the `GreptimeDBCluster` YAML:
```yaml
apiVersion: greptime.io/v1alpha1
kind: GreptimeDBCluster
metadata:
  name: basic
spec:
  base:
    main:
      image: greptime/greptimedb:latest
  frontend:
    replicas: 1
  meta:
    replicas: 1
    etcdEndpoints:
      - "etcd.etcd-cluster.svc.cluster.local:2379"
  datanode:
    replicas: 1
  prometheusMonitor:
    enabled: true
    interval: "30s"
    labels:
      release: prometheus
```
The `prometheusMonitor` field configures Prometheus monitoring.
## Import Grafana Dashboards
The GreptimeDB cluster currently provides 3 Grafana dashboards:

- Cluster Metrics Dashboard
- Cluster Logs Dashboard
- Slow Query Logs Dashboard

The Cluster Logs Dashboard and Slow Query Logs Dashboard are only for self-monitoring mode, while the Cluster Metrics Dashboard works for both self-monitoring and Prometheus monitoring modes.
If using the Helm Chart, you can enable `grafana.enabled` to deploy Grafana and import the dashboards automatically (see Getting Started):

```yaml
grafana:
  enabled: true
```
If you already have Grafana deployed, follow these steps to import the dashboards:

1. **Add Data Sources**

   You can refer to Grafana's datasources docs to add the following 3 data sources:

   - `metrics` data source: For importing Prometheus metrics; works with both monitoring modes. For self-monitoring mode, use `http://${cluster}-monitor-standalone.${namespace}.svc.cluster.local:4000/v1/prometheus` as the URL. For your own Prometheus instance, use your Prometheus instance's URL.
   - `information-schema` data source: For importing cluster metadata via SQL; works with both monitoring modes. Use `${cluster}-frontend.${namespace}.svc.cluster.local:4002` as the SQL address, with database `information_schema`.
   - `logs` data source: For importing cluster and slow query logs via SQL; only works with self-monitoring mode. Use `${cluster}-monitor-standalone.${namespace}.svc.cluster.local:4002` as the SQL address, with database `public`.

2. **Import Dashboards**

   You can refer to Grafana's Import dashboards docs.
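Instead of adding the data sources by hand, they can also be provisioned declaratively via Grafana's provisioning mechanism. A minimal sketch, assuming a cluster named `mycluster` in namespace `default` (names, URLs, and the use of the MySQL data source type for the SQL endpoints are illustrative assumptions):

```yaml
# Grafana provisioning file, e.g. /etc/grafana/provisioning/datasources/greptimedb.yaml
apiVersion: 1
datasources:
  # Prometheus metrics (self-monitoring mode endpoint shown)
  - name: metrics
    type: prometheus
    url: http://mycluster-monitor-standalone.default.svc.cluster.local:4000/v1/prometheus
  # Cluster metadata over the MySQL protocol
  - name: information-schema
    type: mysql
    url: mycluster-frontend.default.svc.cluster.local:4002
    database: information_schema
  # Cluster and slow query logs over the MySQL protocol (self-monitoring mode only)
  - name: logs
    type: mysql
    url: mycluster-monitor-standalone.default.svc.cluster.local:4002
    database: public
```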