--- # Generated from skills/greptimedb-quickstart/SKILL.md. Do not edit by hand. name: greptimedb-quickstart description: Entry-point guide for agents to use GreptimeDB end-to-end — when it fits, how to deploy (Docker / binary / Kubernetes / GreptimeCloud), how to configure the server (storage backend, TTL, auth, listen addresses), which write protocol to choose by data type (metrics / logs / traces / events), how to query via SQL or PromQL, and how to discover deeper docs via llms.txt. Use this when the user wants to evaluate, deploy, configure, ingest into, or query GreptimeDB, or when a project needs a time-series / observability backend. Triggers on phrases like "use GreptimeDB", "deploy GreptimeDB", "install GreptimeDB", "configure GreptimeDB", "store metrics", "store logs", "store traces", "OTLP backend", "replace Prometheus", "replace Loki", "replace Elasticsearch", "时序数据库", "可观测性存储", "GreptimeDB 怎么用", "GreptimeDB 配置", "ingest observability data". --- # GreptimeDB Quickstart Guide GreptimeDB is an open-source unified observability database for metrics, logs, traces, and wide events. It speaks Prometheus remote write, OpenTelemetry OTLP, InfluxDB line protocol, Loki, Elasticsearch ingest API, MySQL, PostgreSQL, and a native gRPC protocol — most users do not need to write GreptimeDB-specific code to start ingesting. This skill is the **entry point**. It helps an agent decide whether GreptimeDB fits, pick the right install and write path, run a first query, and find more detail via `llms.txt`. For deeper functional areas, hand off to the sister skills (each hosted as a fetchable markdown file at the URL below): - [`greptimedb-pipeline`](https://docs.greptime.com/skills/greptimedb-pipeline/SKILL.md) — parse, transform, and route logs. - [`greptimedb-flow`](https://docs.greptime.com/skills/greptimedb-flow/SKILL.md) — continuous aggregation / materialized views. - [`greptimedb-trigger`](https://docs.greptime.com/skills/greptimedb-trigger/SKILL.md) — alerting rules (Enterprise). To load a sister skill at runtime, fetch its `SKILL.md` URL and follow the instructions inside. To install it persistently into the user's agent config, suggest: ```shell npx skills add https://github.com/GreptimeTeam/docs/tree/main/skills/ ``` ## Phase 1. Decide if GreptimeDB fits GreptimeDB is the database for Observability 2.0 — metrics, logs, and traces stored as one thing, queried with one engine, on object storage. Use it when the user is doing any of the following: - **Consolidating Prometheus + Loki + Elasticsearch (or Tempo / Jaeger / Datadog) into a single backend.** This is the primary fit. One storage layer, SQL + PromQL, OTLP / Prometheus / Jaeger / MySQL / PostgreSQL protocols all in one. OTLP-native, no sidecar translation layer. - **Hitting a wall on one specific pillar and shopping for a replacement:** - **Logs** — Loki full-scan is too slow, or ELK storage cost / ops burden is unsustainable. GreptimeDB offers full-text indexes + SQL + pipeline parsing. - **Traces** — Elasticsearch behind Jaeger is exploding in volume; retention is capped by local disk. GreptimeDB stores traces on S3 with Jaeger-compatible query. - **Metrics** — Prometheus / Thanos / Cortex / Mimir operations have become a tax. GreptimeDB is PromQL-compatible with decoupled compute and storage. - **Adopting wide events** — high-cardinality, high-dimensional append-mostly rows (AI agent traces, security events, audit trails, click streams). The user wants raw events instead of pre-aggregated metrics, queried later with SQL. - **Putting telemetry on object storage** — S3 / GCS / Azure Blob / OSS as primary storage, local disk as cache. Decoupled compute and storage; retention is not limited by node capacity. GreptimeDB is **not** the right tool for: - Transactional OLTP — use Postgres / MySQL. - General-purpose OLAP / data warehouse on business data — use ClickHouse / Snowflake / DuckDB. - Vector search / RAG memory — use Qdrant / pgvector / Pinecone. - A single small Prometheus instance with no scale pressure — Prometheus alone is fine. If the workload is one of these, recommend the appropriate alternative instead of bending GreptimeDB to fit. If any positive bullet above matches, proceed. ## Phase 2. Pick a deployment | Scenario | Choice | Doc | |----------|--------|-----| | Local dev, single-node trial | Standalone (Docker or binary) | | | Production self-host | Cluster on Kubernetes via `greptimedb-operator` | | | Managed cloud (no ops) | GreptimeCloud | | **Docker is the recommended path** (isolated, single command, no host dependencies). Bare-metal binary is a fine fallback when Docker is unavailable or undesirable. ### Docker quickstart Single-node, ephemeral on container exit; mount a volume to persist: ```shell docker run -p 127.0.0.1:4000-4003:4000-4003 \ -v "$(pwd)/greptimedb_data:/greptimedb_data" \ --name greptime --rm \ greptime/greptimedb:v1.0.2 standalone start \ --http-addr 0.0.0.0:4000 \ --rpc-bind-addr 0.0.0.0:4001 \ --mysql-addr 0.0.0.0:4002 \ --postgres-addr 0.0.0.0:4003 ``` ### Bare-metal binary (Linux / macOS) Download the matching `greptime` binary into the current directory, then run it: ```shell curl -fsSL \ https://raw.githubusercontent.com/greptimeteam/greptimedb/main/scripts/install.sh \ | sh -s v1.0.2 ./greptime standalone start \ --http-addr 0.0.0.0:4000 \ --rpc-bind-addr 0.0.0.0:4001 \ --mysql-addr 0.0.0.0:4002 \ --postgres-addr 0.0.0.0:4003 ``` Data is stored relative to the working directory by default. For persistent or production paths, pass `--data-home ` or use a config file (see "Configuring the server" below). For Windows, point the user at the binary download on or use WSL. ### Default ports - `4000` — HTTP API and built-in Dashboard (`http://127.0.0.1:4000/dashboard`). - `4001` — gRPC (native protocol used by official ingester SDKs). - `4002` — MySQL wire protocol. - `4003` — PostgreSQL wire protocol. For production deployment, fetch the Kubernetes operator guide and follow it end-to-end; do not improvise raw `StatefulSet` YAML. ### Configuring the server Tunable knobs (object-store backend, storage paths, TTL defaults, memory limits, WAL, logging, tracing) live in the TOML config file. Pass it with `--config-file`, or override individual options via CLI flags / environment variables (`GREPTIMEDB_*`). The full option reference is at ; the CLI flag list is at . Common things an agent should reach for the config doc to set up: - **Object storage backend** (S3 / GCS / Azure Blob / OSS) — under `[storage]`. - **TTL defaults / retention** at the table or database level. - **HTTP / gRPC / MySQL / Postgres listen addresses** beyond the defaults. - **Enabling authentication** (see auth doc linked in Phase 4). - **Tracing and logging** to OTLP / files. ## Phase 3. Pick a write path Route by data type. Recommended path first, fallback second. | Data type | Recommended | Fallback | Doc | |-----------|-------------|----------|-----| | Prometheus-shaped metrics | Prometheus remote write | OTLP metrics, InfluxDB line | | | OpenTelemetry signals (any) | OTLP HTTP | OTLP gRPC | | | Logs needing parsing / transformation | Pipeline + HTTP ingest → use `greptimedb-pipeline` skill | greptime_identity built-in pipeline | | | Logs already structured, no parse | Loki API or `greptime_identity` pipeline | OTLP logs | | | Distributed traces | OTLP HTTP | OTLP gRPC | | | Structured rows, SQL-friendly | `INSERT` via MySQL :4002 or PostgreSQL :4003 | HTTP SQL endpoint | | | High-throughput application code | gRPC Ingester SDK (official SDKs: Rust, Go, Java, .NET, TypeScript, Erlang, C) | HTTP | | | InfluxDB v1/v2 line protocol clients | InfluxDB line protocol | OTLP | | | Elasticsearch ingest API clients | Elasticsearch protocol | OTLP logs | | GreptimeDB **auto-creates tables** for protocol-based ingestion paths (Prometheus, OTLP, InfluxDB line, Loki, Elasticsearch). The user does not need to write `CREATE TABLE` first unless they want to control column types or indexes up front. > **Heads up on column names.** Auto-created tables use Greptime's naming, not > the client's. The time index column is `greptime_timestamp` for InfluxDB > line, Loki, Prometheus remote write, and Elasticsearch ingest — *not* `ts`, > `time`, or `timestamp`. Confirm with `DESC TABLE ` before composing > the first query. ### Inspect the auto-created schema before querying After the first successful write, run `DESC TABLE ` (or `describe_table` via MCP) before composing SQL / PromQL. Common conventions to expect: - **Time index column**: `greptime_timestamp` for Prometheus remote write, InfluxDB line, Loki, Elasticsearch ingest, and **OTLP metrics** (which follow the Prom data model). **OTLP logs and traces** use a column named `timestamp` instead — for logs sourced from the payload's `time_unix_nano`; for spans sourced from `start_time_unix_nano` (spans also get a `timestamp_end` column from `end_time_unix_nano`). - **Tags / labels** become primary-key string columns with semantic type `TAG`; **fields / values** become value columns (`Float64`, `String`, etc.) with semantic type `FIELD`. - **Loki `structured_metadata`** lands in a single `Json` column — query with `json_get_*` functions. - **Elasticsearch nested fields** are flattened into dotted columns (e.g. `meta.count`, `meta.city`). Scalar JSON types are preserved (`Int64` / `Float64` / `Boolean` / `String`); arrays are stringified into a `String` column. Not a JSON column — `DESC TABLE` first to pick the right cast. - **Default table name** is determined by the protocol: InfluxDB line uses the measurement name; Loki defaults to `loki_logs` (one table for all streams); Elasticsearch takes the bulk action's `_index`, falling back to the URL path index when using `/v1/elasticsearch//_bulk`. Skipping this step is the most common cause of "column not found" failures in agent-driven workflows — one `DESC TABLE` saves one wasted query round-trip. When the user needs to parse text logs (regex, dissect, JSON extraction, field math, multi-table routing), **hand off to the [`greptimedb-pipeline`](https://docs.greptime.com/skills/greptimedb-pipeline/SKILL.md) skill** — do not try to reimplement it inline here. ## Phase 4. Query the data ### Prefer MCP tools when available If the user has the GreptimeDB MCP server (`greptimedb-mcp-server`) configured in their agent, **use its tools instead of raw HTTP / SQL clients**: - `execute_sql`, `execute_tql`, `query_range` — SQL / PromQL / time-window range queries. - `describe_table`, `explain_query`, `health_check` — introspection. - `list_pipelines`, `create_pipeline`, `dryrun_pipeline`, `delete_pipeline` — pipeline management (works with the `greptimedb-pipeline` skill). - `list_dashboards`, `create_dashboard`, `delete_dashboard` — Perses dashboard management. MCP tools handle endpoint, auth, encoding, and result parsing for the agent. Falling back to raw clients should only happen when MCP is not configured. **If the user does not have it installed yet**, suggest: ```shell pip install greptimedb-mcp-server ``` Then merge the following entry into the host agent's `mcpServers` block. The config file location depends on the host: - **Claude Code** — `~/.claude.json` (top-level `mcpServers` is user-wide; per-project entries live under `projects..mcpServers`). Or run `claude mcp add -s user greptimedb -- greptimedb-mcp-server --host localhost --database public` to install user-wide (drop `-s user` for project-only). - **Claude Desktop** — `~/Library/Application Support/Claude/claude_desktop_config.json` on macOS; `%APPDATA%\Claude\claude_desktop_config.json` on Windows. - **Cursor** — `~/.cursor/mcp.json` (global) or `/.cursor/mcp.json` (per-project). - **Other hosts** — if the user already has a working MCP host, merge the entry below into their existing `mcpServers` map; do not overwrite the file. ```json { "mcpServers": { "greptimedb": { "command": "greptimedb-mcp-server", "args": ["--host", "localhost", "--database", "public"] } } } ``` Defaults: connects to `localhost:4002` (MySQL protocol) with database `public`, no password. For non-default credentials, ports, HTTP server mode, or data-masking options, see the project README at . ### Query modes (without MCP) | Mode | When to use | Connection | |------|-------------|------------| | SQL (MySQL protocol) | General-purpose queries from MySQL clients / drivers | `mysql -h 127.0.0.1 -P 4002` | | SQL (PostgreSQL protocol) | General-purpose queries from PostgreSQL clients / drivers | `psql -h 127.0.0.1 -p 4003 -d public` | | SQL (HTTP) | Stateless calls from any language | `POST http://127.0.0.1:4000/v1/sql?db=` | | PromQL | Existing Prometheus dashboards or alerts | `GET http://127.0.0.1:4000/v1/prometheus/api/v1/query` | | Range queries (`RANGE` / `ALIGN`) | Time-windowed aggregations in SQL | SQL on any of the above | | Log search | Term / phrase matching on log tables | SQL via `matches_term(column, 'value')` or the `@@` shorthand — see | **HTTP SQL — minimal request shape.** Body is `application/x-www-form-urlencoded`, the SQL goes in a `sql=` field, and the database is a query-string parameter (not a header, not a JSON body): ```shell curl -X POST \ -H 'Content-Type: application/x-www-form-urlencoded' \ --data-urlencode 'sql=SELECT * FROM monitor LIMIT 10' \ 'http://127.0.0.1:4000/v1/sql?db=public' ``` For auth-enabled deployments, add `-H 'Authorization: Basic '`. Optional knobs: `format=table|csvWithNames|...` in the query string, `X-Greptime-Timeout: 120s` header for long queries. Full reference at . **PromQL — selecting a specific field on multi-field tables.** GreptimeDB tables can have multiple `FIELD` columns; vanilla PromQL has no syntax for this, so Greptime adds a `__field__` matcher. Without it, the query runs across *every* field column, which is rarely what you want: ```promql # Pick one field cpu{host="web-01", __field__="usage_user"} # Subset via regex cpu{__field__=~"usage_(user|system)"} ``` Greptime extension only — Prometheus users will not recognize it. See . Authentication is **off by default** in standalone mode; production deployments should enable it. See . ### Hand-off to sister skills - **Continuous aggregation / materialized view** — downsampling, time-window rollups, streaming aggregation — hand off to [`greptimedb-flow`](https://docs.greptime.com/skills/greptimedb-flow/SKILL.md). - **Alerting rules / Alertmanager webhook** — hand off to [`greptimedb-trigger`](https://docs.greptime.com/skills/greptimedb-trigger/SKILL.md) (Enterprise only; for open source, point the user at Prometheus Alertmanager with GreptimeDB as the PromQL backend). ## Phase 5. Use llms.txt for deeper detail The docs site publishes machine-friendly markdown alongside its HTML. Prefer these endpoints over the HTML pages when fetching content: - **** — sectioned index of every doc page in the latest stable version. Start here when you need to find the right page. - **** — single concatenated file containing the full content of every page. Use when you want to grep the whole corpus locally. - **`https://docs.greptime.com/.md`** — every HTML page at `https://docs.greptime.com//` has a `.md` sibling at the same path with `.md` appended. The HTML pages also embed `` in ``, so any link can be rewritten to `.md` for clean markdown content. The docs are mirrored across two hosts: `docs.greptime.com` (global) and `docs.greptime.cn` (mainland China). Paths are identical; use whichever host you fetched this skill from. When fetching a doc URL, default to the `.md` form. The HTML form is for humans; the `.md` form is for agents. ## Reference Direct `.md` entry points: - Quickstart walkthrough: - Data model concepts: - Protocols overview: - Ingest data overview: - Query data overview: - Kubernetes deployment overview: - Server configuration reference: - CLI command reference: - Deployments & administration overview: Sister skills (fetch the URL to load the skill content): - [`greptimedb-pipeline`](https://docs.greptime.com/skills/greptimedb-pipeline/SKILL.md) — log parsing, transformation, multi-table routing. - [`greptimedb-flow`](https://docs.greptime.com/skills/greptimedb-flow/SKILL.md) — continuous aggregation, materialized views, downsampling. - [`greptimedb-trigger`](https://docs.greptime.com/skills/greptimedb-trigger/SKILL.md) — alerting rules with Alertmanager-compatible webhooks (Enterprise).