Version: Nightly

Frequently Asked Questions

Core Capabilities

What is GreptimeDB?

GreptimeDB is an open-source, cloud-native unified observability database designed to store and analyze metrics, logs, and traces in a single system. Built with Rust for high performance, it offers:

Up to 50x lower operational and storage costs
Sub-second query responses on petabyte-scale datasets
Native OpenTelemetry support
SQL, PromQL, and stream processing capabilities
Compute-storage separation for flexible scaling

How does GreptimeDB's performance compare to other solutions?

In the published benchmark reports listed below, GreptimeDB compares favorably across observability workloads:

Write Performance:

2-4.7x faster than Elasticsearch (up to 470% throughput)
1.5x faster than Loki (121k vs 78k rows/s)
2x faster than InfluxDB (250k-360k rows/s)
Matches ClickHouse performance (111% throughput)

Query Performance:

40-80x faster than Loki for log queries
500x faster for repeated queries (with caching)
2-11x faster than InfluxDB for complex time-series queries
Competitive with ClickHouse across different query patterns

Storage & Cost Efficiency:

87% less storage than Elasticsearch (12.7% footprint)
50% less storage than ClickHouse
Roughly 50% less storage than Loki (3.03 GB vs 6.59 GB compressed)
Up to 50x lower operational costs vs traditional stacks

Resource Optimization:

40% less CPU usage compared to previous versions
Lowest memory consumption among tested databases
Consistent performance on object storage (S3/GCS)
Superior high-cardinality data handling

Unique Advantages:

Single database for metrics, logs, and traces
Cloud-native architecture
Horizontal scalability (handles 1.15B+ rows)
Full-text search with native indexing

Benchmark reports: vs InfluxDB | vs Loki | Log Benchmark
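
The rounded claims above can be checked against the raw figures quoted next to them. The sketch below only redoes that arithmetic; the variable and function names are illustrative, and no new measurements are introduced.

```python
# Figures copied from the lists above; nothing here is newly measured.
greptime_rows_per_s = 121_000  # GreptimeDB write throughput in the Loki comparison
loki_rows_per_s = 78_000       # Loki write throughput in the same comparison
greptime_gb = 3.03             # GreptimeDB compressed size
loki_gb = 6.59                 # Loki compressed size
es_footprint = 0.127           # GreptimeDB size as a fraction of Elasticsearch


def speedup(ours: float, theirs: float) -> float:
    """How many times higher `ours` is than `theirs`."""
    return ours / theirs


def reduction(ours: float, theirs: float) -> float:
    """Fraction of `theirs` that is saved by using `ours`."""
    return 1.0 - ours / theirs


write_speedup = speedup(greptime_rows_per_s, loki_rows_per_s)
loki_saving = reduction(greptime_gb, loki_gb)
es_saving = 1.0 - es_footprint

print(f"write speedup vs Loki: {write_speedup:.2f}x")
print(f"storage saving vs Loki: {loki_saving:.0%}")
print(f"storage saving vs Elasticsearch: {es_saving:.1%}")
```

The quoted "1.5x" and "50%" are therefore slightly conservative roundings of about 1.55x and 54%, and "87% less" matches the 12.7% footprint.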

How does GreptimeDB handle metrics, logs, and traces?

GreptimeDB is designed as a unified observability database that natively supports all three telemetry types:

Metrics: Full Prometheus compatibility with PromQL support
Logs: Full-text indexing, Loki protocol support, and efficient compression
Traces: Experimental OpenTelemetry trace storage with scalable querying

This unified approach eliminates data silos and enables cross-signal correlation without complex data pipelines.

For detailed documentation:

What are the main use cases for GreptimeDB?

GreptimeDB excels in:

Unified Observability: Replace complex monitoring stacks with a single database
Edge and Cloud Data Management: Seamless data synchronization across environments
IoT and Automotive: Process high-volume sensor data efficiently
AI/LLM Monitoring: Track model performance and behavior
Real-time Analytics: Sub-second queries on petabyte-scale datasets

Architecture & Performance

Can GreptimeDB replace my Prometheus setup?

Yes, GreptimeDB provides:

Native PromQL support with near 100% compatibility
Prometheus remote write protocol support
Efficient handling of high-cardinality metrics
Long-term storage without downsampling
Better resource efficiency than traditional Prometheus+Thanos stacks

What indexing capabilities does GreptimeDB offer?

GreptimeDB provides rich indexing options:

Inverted indexes: Fast lookups on tag columns
Full-text indexes: Efficient log searching
Skipping indexes: Accelerate range queries
Vector indexes: Support for AI/ML workloads

These indexes enable sub-second queries even on petabyte-scale datasets.

For configuration details, see Index Management.
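
As a rough illustration of why an inverted index speeds up tag lookups, the sketch below maps each distinct tag value to the rows that contain it, so a filter such as host = 'web-1' touches only matching rows. This is a generic, in-memory toy with made-up rows and a hypothetical build_inverted_index helper; it is not GreptimeDB's index format.

```python
from collections import defaultdict


def build_inverted_index(rows, column):
    """Map each distinct value of `column` to the list of row ids holding it."""
    index = defaultdict(list)
    for row_id, row in enumerate(rows):
        index[row[column]].append(row_id)
    return dict(index)


# Made-up sample rows; "host" and "region" play the role of tag columns.
rows = [
    {"host": "web-1", "region": "us-east", "cpu": 0.42},
    {"host": "web-2", "region": "us-west", "cpu": 0.13},
    {"host": "web-1", "region": "us-east", "cpu": 0.57},
]

host_index = build_inverted_index(rows, "host")

# Lookup by tag value reads the posting list instead of scanning every row.
matching = [rows[i] for i in host_index.get("web-1", [])]
```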

How does GreptimeDB achieve cost efficiency?

GreptimeDB reduces costs through:

Columnar storage: Superior compression ratios
Compute-storage separation: Independent scaling of resources
Efficient cardinality management: Handles high-cardinality data without explosion
Unified platform: Eliminates the need for multiple specialized databases

Result: Up to 50x lower operational and storage costs compared to traditional stacks.
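
To give a feel for why column-oriented layouts tend to compress well, the sketch below applies two textbook encodings to single columns: delta encoding for regularly spaced timestamps and run-length encoding for repeated values. It is a general illustration with invented data and helper names, not a description of the encodings GreptimeDB actually applies.

```python
def delta_encode(values):
    """Keep the first value, then store only differences (assumes a non-empty list)."""
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]


def run_length_encode(values):
    """Collapse consecutive repeats into (value, count) pairs."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return [tuple(r) for r in runs]


# One column at a time: neighbouring values are similar, so they encode compactly.
timestamps = [1000, 1010, 1020, 1030, 1040]
hosts = ["web-1", "web-1", "web-1", "web-2", "web-2"]

encoded_ts = run_length_encode(delta_encode(timestamps))
encoded_hosts = run_length_encode(hosts)
```

Five timestamps collapse to two runs and five host strings to two runs; a row-oriented layout would interleave the columns and break up these runs.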

What makes GreptimeDB cloud-native?

GreptimeDB is purpose-built for Kubernetes with:

Disaggregated architecture: Separate compute and storage layers
Elastic scaling: Add/remove nodes based on workload
Multi-cloud support: Runs on AWS, GCP, and Azure
Kubernetes operators: Simplified deployment and management
Object storage backend: Use S3, GCS, or Azure Blob for data persistence

For Kubernetes deployment details, see the Kubernetes Deployment Guide.

Does GreptimeDB support schemaless data ingestion?

Yes, GreptimeDB supports automatic schema creation when using:

gRPC protocol
InfluxDB Line Protocol
OpenTSDB protocol
Prometheus Remote Write
OpenTelemetry protocol
Loki protocol (for log data)
Elasticsearch-compatible APIs (for log data)

Tables and columns are created automatically on first write, eliminating manual schema management.
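
As an example of what a schemaless write can look like, the sketch below formats one record as InfluxDB Line Protocol text. Under the usual line-protocol reading, the measurement would name the table and the tags and fields would become its columns. The to_line helper and the sample values are hypothetical, the measurement name is assumed to contain no special characters, and no request is sent to a server.

```python
def escape_tag(value: str) -> str:
    """Escape commas, equals signs, and spaces in tag keys, tag values, and field keys."""
    return value.replace(",", "\\,").replace("=", "\\=").replace(" ", "\\ ")


def format_field(value) -> str:
    """Render a field value using line-protocol type markers."""
    # bool must be checked before int because bool is a subclass of int.
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"  # trailing "i" marks an integer
    if isinstance(value, float):
        return repr(value)
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def to_line(measurement, tags, fields, ts_ns):
    """Build one line: measurement,tag=val,... field=val,... timestamp."""
    tag_part = "".join(
        f",{escape_tag(k)}={escape_tag(v)}" for k, v in sorted(tags.items())
    )
    field_part = ",".join(
        f"{escape_tag(k)}={format_field(v)}" for k, v in fields.items()
    )
    return f"{measurement}{tag_part} {field_part} {ts_ns}"


line = to_line(
    "cpu",
    {"host": "web-1", "region": "us east"},
    {"usage": 0.42, "cores": 8},
    1700000000000000000,
)
print(line)
```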

Integration & Compatibility

What protocols and tools does GreptimeDB support?

GreptimeDB provides extensive compatibility:

Protocols: OpenTelemetry, Prometheus Remote Write, InfluxDB Line Protocol, Loki, Elasticsearch, MySQL, PostgreSQL (see Protocols Overview)
Query Languages: SQL, PromQL
Visualization: Grafana integration, any MySQL/PostgreSQL compatible tool
Data Pipeline: Vector, Fluent Bit, Telegraf, Kafka
SDKs: Go, Java, Rust, Erlang, Python

Is GreptimeDB compatible with Grafana?

Yes, GreptimeDB offers:

Grafana integration through an official plugin
MySQL/PostgreSQL protocol support for standard Grafana data sources
Native PromQL for Prometheus-style queries
SQL support for complex analytics

How does GreptimeDB integrate with OpenTelemetry?

GreptimeDB is OpenTelemetry-native:

Direct OTLP ingestion for metrics, logs, and traces
No translation layer or data loss
Supports OpenTelemetry Collector and SDKs
Preserves semantic conventions and resource attributes

What SDKs are available for GreptimeDB?

Go: greptimedb-ingester-go
Java: greptimedb-ingester-java
Rust: greptimedb-ingester-rust
Erlang: greptimedb-ingester-erl
Python: Via SQL drivers (MySQL/PostgreSQL compatible)

How can I migrate from other databases to GreptimeDB?

GreptimeDB provides migration guides for popular databases:

From ClickHouse: Table schema and data migration
From InfluxDB: Line protocol and data migration
From Prometheus: Remote write and historical data migration
From MySQL/PostgreSQL: SQL-based migration

For detailed migration instructions, see Migration Overview.

What disaster recovery options does GreptimeDB provide?

GreptimeDB offers multiple disaster recovery strategies to meet different availability requirements:

Standalone DR Solution: Uses remote WAL and object storage, achieving RPO=0 and RTO in minutes for small-scale scenarios
Region Failover: Automatic failover for individual regions with minimal downtime
Active-Active Failover (Enterprise): Synchronous request replication between nodes for high availability
Cross-Region Single Cluster: Spans three regions with zero RPO and region-level error tolerance
Backup and Restore: Periodic data backups with configurable RPO based on backup frequency

Choose the appropriate solution based on your availability requirements, deployment scale, and cost considerations. For detailed guidance, see Disaster Recovery Overview.

Data Management & Processing

How does GreptimeDB handle data lifecycle?

Retention Policies:

Database-level and table-level TTL settings
Automatic data expiration without manual cleanup
Configuration is described in the TTL Documentation
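
The effect of a TTL can be pictured as a moving cutoff: anything older than "now minus TTL" becomes eligible for removal. The sketch below shows only that rule, with a hypothetical expired helper and made-up timestamps. When and how GreptimeDB physically drops expired data is not modeled here.

```python
from datetime import datetime, timedelta, timezone


def expired(row_ts: datetime, ttl: timedelta, now: datetime) -> bool:
    """A row is expired once it is older than the cutoff `now - ttl`."""
    return row_ts < now - ttl


now = datetime(2024, 1, 8, tzinfo=timezone.utc)
ttl = timedelta(days=7)  # illustrative 7-day retention

row_timestamps = [
    datetime(2023, 12, 31, tzinfo=timezone.utc),  # older than the cutoff
    datetime(2024, 1, 1, tzinfo=timezone.utc),    # exactly at the cutoff, kept
    datetime(2024, 1, 7, tzinfo=timezone.utc),    # recent, kept
]

kept = [ts for ts in row_timestamps if not expired(ts, ttl, now)]
```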

Data Export:

COPY TO command for exporting to S3 or local files
Standard SQL queries via any compatible client
Export functionality for backup and disaster recovery: Back up & Restore Data

How does GreptimeDB handle high-cardinality and real-time processing?

High-Cardinality Management:

Advanced indexing strategies prevent cardinality explosion
Columnar storage with intelligent compression
Distributed query execution with data pruning
Handles millions of unique time series efficiently

Learn more about indexing: Index Management

Real-Time Processing:

Flow Engine: Real-time stream processing system that enables continuous, incremental computation on streaming data with automatic result table updates
Pipeline: Data parsing and transformation mechanism for processing incoming data in real-time, with configurable processors for field extraction and data type conversion across multiple data formats
Output Tables: Persist processed results for analysis
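
The idea of continuous, incremental computation can be sketched as follows: instead of re-scanning all data for each query, a small running state per key and time window is updated as each row arrives, and the result table is read from that state. The IncrementalAvg class, the 60-second window, and the sample rows are illustrative assumptions, not the Flow Engine's API or internals.

```python
from collections import defaultdict


class IncrementalAvg:
    """Running average per (key, window); each new row updates the state in O(1)."""

    def __init__(self, window_s: int):
        self.window_s = window_s
        # (key, window_start) -> [sum, count]
        self.state = defaultdict(lambda: [0.0, 0])

    def ingest(self, key: str, ts_s: int, value: float) -> None:
        """Fold one incoming row into the window it belongs to."""
        window_start = ts_s - ts_s % self.window_s
        acc = self.state[(key, window_start)]
        acc[0] += value
        acc[1] += 1

    def result_table(self):
        """Current aggregated view, as it would be persisted to an output table."""
        return {k: s / n for k, (s, n) in sorted(self.state.items())}


flow = IncrementalAvg(window_s=60)
for key, ts, value in [
    ("nic-1", 5, 10.0),
    ("nic-1", 30, 20.0),
    ("nic-1", 65, 40.0),
    ("nic-2", 10, 7.0),
]:
    flow.ingest(key, ts, value)

results = flow.result_table()
```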

What are GreptimeDB's scalability characteristics?

Scale Limits:

No strict limitations on table or column count
Hundreds of tables with minimal performance impact
Performance scales with primary key design, not table count
Column-oriented storage ensures efficient partial reads

Partitioning & Distribution:

Automatic time-based organization within regions
Manual distributed sharding via PARTITION clause (see Table Sharding Guide)
Automatic region splitting planned for future releases
Dynamic partitioning without configuration (Enterprise feature)

Core Scalability Features:

Multi-tiered caching: Write cache (disk-backed) and read cache (LRU policy) for optimal performance
Object storage backend: Virtually unlimited storage via S3/GCS/Azure Blob
Asynchronous WAL: Efficient write-ahead logging with optional per-table controls
Distributed query execution: Multi-node coordination for large datasets
Manual compaction: Available via admin commands
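
For readers unfamiliar with the LRU policy mentioned for the read cache, here is a minimal sketch: every access marks an entry as most recently used, and when the cache is full the least recently used entry is evicted. The LRUCache class, the capacity of 2, and the "sst-N" keys are hypothetical and say nothing about how GreptimeDB sizes or keys its caches.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used entry."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key):
        """Return the cached value, or None on a miss; a hit refreshes recency."""
        if key not in self._items:
            return None
        self._items.move_to_end(key)
        return self._items[key]

    def put(self, key, value) -> None:
        """Insert or update an entry, evicting the oldest one if over capacity."""
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)


cache = LRUCache(capacity=2)
cache.put("sst-1", b"a")
cache.put("sst-2", b"b")
cache.get("sst-1")        # touch sst-1 so sst-2 becomes the oldest entry
cache.put("sst-3", b"c")  # over capacity: sst-2 is evicted
```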

Enterprise Scale Features:

Advanced partitioning and automatic rebalancing
Enhanced multi-tenancy and isolation
Enterprise-grade monitoring and management tools

For architecture details, see the storage architecture blog.

What are GreptimeDB's design trade-offs?

GreptimeDB is optimized for observability workloads with intentional limitations:

No ACID transactions: Prioritizes high-throughput writes over transactional consistency
Limited delete operations: Designed for append-heavy observability data
Time-series focused: Optimized for IoT, metrics, logs, and traces rather than general OLTP
Simplified joins: Optimized for time-series queries over complex relational operations

Deployment & Operations

What are the deployment options for GreptimeDB?

Cluster Deployment (Production):

Minimum of 3 nodes for high availability
Each node runs the metasrv, frontend, and datanode services
Services can be deployed separately for larger-scale deployments
See the Capacity Planning Guide

Edge & Standalone:

Android ARM64 platforms (prebuilt binaries available)
Raspberry Pi and constrained environments
Single-node mode for development/testing
Efficient resource usage for IoT scenarios

Storage Backends:

Production: S3, GCS, Azure Blob for data persistence
Development: Local storage for testing
Metadata: MySQL/PostgreSQL backend support for metasrv

For deployment and administration details: Deployments & Administration Overview

How does data distribution work?

Current State:

Manual partitioning via PARTITION clause during table creation (see Table Sharding Guide)
Time-based automatic organization within regions
Manual region migration support for load balancing (see Region Migration Guide)
Automatic region failover for disaster recovery (see Region Failover)
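
Manual partitioning amounts to declaring rules that decide which region receives each row. The sketch below models the simplest case, range boundaries on one column, and routes values with a binary search. The boundaries, the region numbering, and the region_for helper are hypothetical; the actual PARTITION clause syntax is described in the Table Sharding Guide.

```python
from bisect import bisect_right


def region_for(value, boundaries):
    """Index of the range partition that `value` falls into (boundaries sorted ascending)."""
    return bisect_right(boundaries, value)


# Hypothetical rule on a device_id column:
#   region 0: device_id <  'f'
#   region 1: 'f' <= device_id < 's'
#   region 2: device_id >= 's'
boundaries = ["f", "s"]

routing = {
    device: region_for(device, boundaries)
    for device in ["alpha", "f", "mango", "s", "zebra"]
}
```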

Roadmap:

Automatic region splitting and rebalancing
Dynamic workload distribution across nodes

How do I monitor and troubleshoot GreptimeDB?

GreptimeDB provides comprehensive monitoring capabilities including metrics collection, health checks, and observability integrations. For detailed monitoring setup and troubleshooting guides, see the Monitoring Overview.

Open Source vs Enterprise vs Cloud

What are the differences between GreptimeDB versions?

Open Source Version:

High-performance ingestion and query capabilities
Cluster deployment with basic read-write separation
Multiple protocol support (OpenTelemetry, Prometheus, InfluxDB, etc.)
Basic authentication and access control
Basic data encryption
Community support

Enterprise Version (all Open Source features plus):

Cost-based query optimizer for better performance
Advanced read-write separation and active-active failover (see Active-Active Failover)
Automatic scaling, indexing, and load balancing
Layered caching and enterprise-level web console
Enterprise authorization (RBAC/LDAP integration)
Enhanced security and audit features
One-on-one technical support with 24/7 service response
Professional customization services

GreptimeCloud (fully managed, all Enterprise features plus):

Serverless autoscaling with pay-as-you-go pricing
Fully managed deployment with seamless upgrades
Independent resource pools with isolated networks
Configurable read/write capacity and unlimited storage
Advanced dashboard with Prometheus workbench
SLA guarantees and automated disaster recovery

For detailed comparison, see Pricing & Features.

What security features are available?

Open Source:

Basic username/password authentication
TLS/SSL support for connections

Enterprise/Cloud:

Role-based access control (RBAC)
Team management and API keys
Data encryption at rest
Audit logging for compliance

Technical Details

How does GreptimeDB extend Apache DataFusion?

GreptimeDB builds on DataFusion with:

Query Languages: Native PromQL alongside SQL
Distributed Execution: Multi-node query coordination
Custom Functions: Time-series specific UDFs/UDAFs
Optimizations: Rules tailored for observability workloads
Counter Handling: Automatic reset detection in rate() and delta() functions
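
Counter reset detection can be illustrated with a short sketch: a monotonic counter should never decrease, so a drop between two samples is treated as a restart from zero and the new value is counted as the increase. The function names and sample series are made up, and the boundary extrapolation that Prometheus-style rate() also performs is deliberately left out.

```python
def counter_increase(samples):
    """Total increase of a counter, treating any drop as a reset to zero."""
    total = 0.0
    prev = None
    for value in samples:
        if prev is not None:
            # After a reset the counter restarted at 0, so `value` is the increase.
            total += value if value < prev else value - prev
        prev = value
    return total


def simple_rate(samples, window_s):
    """Per-second rate over the window (no extrapolation at the window edges)."""
    return counter_increase(samples) / window_s


# The counter restarts between 130 and 10.
samples = [100, 130, 10, 40]
increase = counter_increase(samples)  # 30 + 10 + 30
per_second = simple_rate(samples, window_s=60)
```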

For custom function development: Function Documentation

What's the difference between GreptimeDB and InfluxDB?

Key differences:

Open Source: GreptimeDB's entire distributed system is fully open source
Architecture: Region-based design optimized for observability workloads
Query Languages: SQL + PromQL vs InfluxQL + SQL
Unified Model: Native support for metrics, logs, and traces in one system
Storage: Pluggable engines with dedicated optimizations
Cloud-Native: Built for Kubernetes with disaggregated compute/storage (see Kubernetes Deployment Guide)

For detailed comparisons, see GreptimeDB vs InfluxDB. Additional product comparisons (vs. ClickHouse, Loki, etc.) are available in the Resources menu on our website.

How does GreptimeDB's storage engine work?

LSM-Tree Architecture:

Based on Log-Structured Merge Tree (LSMT) design
WAL can use local disk or distributed services (e.g., Kafka) via Log Store API
SST files are flushed to object storage (S3/GCS) or local disk
Designed for cloud-native environments with object storage as primary backend
Optimized for time-series workloads with TWCS (Time-Window Compaction Strategy)
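
The core idea of a time-window compaction strategy is to group files by the time window they cover and to merge only files within the same window, so old windows settle and stop being rewritten. The sketch below is a deliberately simplified model: the file tuples, the one-hour window, and the min_files threshold are assumptions, and it does not reproduce GreptimeDB's actual compaction logic.

```python
from collections import defaultdict


def group_by_time_window(ssts, window_s):
    """Bucket SST files by the window containing their newest timestamp."""
    windows = defaultdict(list)
    for name, _min_ts, max_ts in ssts:
        windows[max_ts - max_ts % window_s].append(name)
    return dict(windows)


def compaction_candidates(windows, min_files=2):
    """Only windows holding several files are worth merging."""
    return {w: files for w, files in windows.items() if len(files) >= min_files}


# (file name, min timestamp, max timestamp) in seconds; values are invented.
ssts = [
    ("a.sst", 0, 3500),
    ("b.sst", 1000, 3599),
    ("c.sst", 3600, 7000),
    ("d.sst", 100, 200),
]

windows = group_by_time_window(ssts, window_s=3600)
candidates = compaction_candidates(windows)
```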

Performance Considerations:

Timestamps: Writing timestamps in datetime format (yyyy-MM-dd HH:mm:ss) has no performance impact
Compression: When measuring the compression ratio, measure only the data directory; the WAL is cyclically reused
Append-only tables: Recommended for better write and query performance, especially for log scenarios
Flow Engine: Currently SQL-based; PromQL support is under evaluation

What are best practices for specific use cases?

Network Monitoring (e.g., thousands of NICs):

Use Flow tables for continuous aggregation
Manual downsampling via Flow Engine for data reduction
Output to regular tables for long-term storage

Log Analytics:

Use append-only tables for better write and query performance
Create indexes on frequently queried fields (Index Management)
Storage footprint: about 50% of ClickHouse's and 12.7% of Elasticsearch's in the log benchmarks

Table Design & Performance:

For table modeling guidance: Design Table
For performance optimization: Performance Tuning Tips

Getting Started

Where can I find documentation and benchmarks?

Performance & Benchmarks:

Installation & Deployment:

How can I contribute to GreptimeDB?

Welcome to the community! Get started:

Code: Contribution Guide
First Issues: Good First Issues
Community: Slack Channel
Documentation: Help improve these docs!

What's next?

Try GreptimeCloud: Free serverless tier
Self-host: Follow the installation guide
Explore Integrations: GreptimeDB supports extensive integrations with Prometheus, Vector, Kafka, Telegraf, EMQX, Metabase, and many more. See Integrations Overview for the complete list, or start with OpenTelemetry or Prometheus
Join Community: Connect with users and maintainers on Slack
