GreptimeDB consists of the following key components:
- Frontend, which exposes read and write services over various protocols and forwards requests to Datanode.
- Datanode, which is responsible for storing data in persistent storage, such as local disk or cloud object storage like AWS S3 and Azure Blob Storage.
- Metasrv, the server that coordinates operations between Frontend and Datanode.
To better understand GreptimeDB, a few concepts need to be introduced:
- A table is where user data is stored in GreptimeDB. A table has a schema and a totally ordered primary key. A table is split into segments called regions by its partition key.
- A region is a contiguous segment of a table, comparable to a partition in some relational databases. A region can be replicated on multiple datanodes; only one of these replicas is writable and can serve write requests, while any replica can serve read requests.
- A datanode stores and serves regions. One datanode can serve multiple regions, and one region can be served by multiple datanodes.
- The metasrv stores the metadata of the cluster, such as tables, datanodes, and the regions of each table. It also coordinates frontends and datanodes.
- Each frontend has a catalog implementation, which fetches the metadata from metasrv and tells which region of a table is served by which datanode.
- A frontend is a stateless service that serves requests from clients. It acts as a proxy that forwards read and write requests to the corresponding datanode, according to the mapping from the catalog.
- A timeseries of a table is identified by its primary key. Each table must have a timestamp column, as GreptimeDB is a timeseries database. Data in a table is sorted by its primary key and timestamp, but the actual order is implementation specific and may change in the future.
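The relationships between tables, regions, and datanodes can be pictured with a small model. The following is only an illustrative sketch, not GreptimeDB code: the class names, the `datanode-N` names, and the use of string range bounds as the partition rule are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class RegionReplica:
    datanode: str            # hypothetical datanode name
    writable: bool = False   # only one replica per region accepts writes


@dataclass
class Region:
    region_id: int
    lower: str               # inclusive lower bound of the partition-key range (assumed rule)
    upper: Optional[str]     # exclusive upper bound; None means unbounded
    replicas: List[RegionReplica] = field(default_factory=list)

    def contains(self, partition_key: str) -> bool:
        return self.lower <= partition_key and (
            self.upper is None or partition_key < self.upper
        )

    def writer(self) -> str:
        # Exactly one replica of a region serves write requests.
        writers = [r.datanode for r in self.replicas if r.writable]
        if len(writers) != 1:
            raise ValueError("a region must have exactly one writable replica")
        return writers[0]

    def readers(self) -> List[str]:
        # Any replica can serve read requests.
        return [r.datanode for r in self.replicas]


@dataclass
class Table:
    name: str
    regions: List[Region]

    def region_for(self, partition_key: str) -> Region:
        # A table is split into contiguous regions by its partition key.
        for region in self.regions:
            if region.contains(partition_key):
                return region
        raise KeyError(partition_key)


# One datanode (datanode-2) serves two regions; each region has two replicas.
table = Table("cpu_metrics", [
    Region(0, "", "m", [RegionReplica("datanode-1", writable=True),
                        RegionReplica("datanode-2")]),
    Region(1, "m", None, [RegionReplica("datanode-2", writable=True),
                          RegionReplica("datanode-3")]),
])

region = table.region_for("host-a")
print(region.region_id, region.writer(), region.readers())
```

In this sketch a write for `host-a` can only go to the single writable replica of its region, while a read may be served by any replica of that region.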
How it works
Before diving into each component, let's take a high-level view of how the database works.
- Users can interact with the database via various protocols, such as ingesting data using the InfluxDB line protocol, then exploring the data using SQL or PromQL. The frontend is the component that users or clients connect to and operate on, hiding datanode and metasrv behind it.
- Assume a user inserts data through the HTTP API by sending an HTTP request to a frontend instance. When the frontend receives the request, it parses the request body using the corresponding protocol parser and finds the table to write to from a catalog manager based on metasrv.
- The frontend relies on a push-pull strategy to cache metadata from metasrv, so it knows which datanode, or more precisely which region, a request should be sent to. A request may be split and sent to multiple regions if its contents need to be stored in different regions.
- When a datanode receives the request, it writes the data to the region and then sends a response back to the frontend. Writing to the region in turn writes to the underlying storage engine, which eventually puts the data on a persistent device.
- Once the frontend has received all responses from the target datanodes, it sends the result back to the user.
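The write path described above can be sketched in a few lines. This is a simplified, in-memory illustration under stated assumptions, not the actual implementation: `ROUTES` stands in for the metadata a frontend caches from metasrv, the `Frontend` and `Datanode` classes and their methods are hypothetical, protocol parsing is skipped (rows arrive already parsed), and sorting a Python list stands in for the storage engine.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

Row = Tuple[str, int, float]  # (primary key, timestamp, value)

# Hypothetical cached routes: (exclusive upper bound of key range, region id, datanode).
ROUTES: List[Tuple[Optional[str], int, str]] = [
    ("m", 0, "datanode-1"),
    (None, 1, "datanode-2"),
]


def route(key: str) -> Tuple[int, str]:
    """Look up which region, and which datanode, a key belongs to."""
    for upper, region_id, datanode in ROUTES:
        if upper is None or key < upper:
            return region_id, datanode
    raise KeyError(key)


class Datanode:
    def __init__(self, name: str) -> None:
        self.name = name
        self.regions: Dict[int, List[Row]] = defaultdict(list)

    def write(self, region_id: int, rows: List[Row]) -> int:
        # Stand-in for the storage engine: keep rows ordered by
        # primary key and timestamp, then report how many were written.
        self.regions[region_id].extend(rows)
        self.regions[region_id].sort(key=lambda r: (r[0], r[1]))
        return len(rows)


class Frontend:
    def __init__(self, datanodes: List[Datanode]) -> None:
        self.datanodes = {d.name: d for d in datanodes}

    def insert(self, rows: List[Row]) -> int:
        # Split the request by target region.
        batches: Dict[Tuple[int, str], List[Row]] = defaultdict(list)
        for row in rows:
            batches[route(row[0])].append(row)
        # Send each batch to its datanode and collect all responses
        # before answering the user.
        affected = 0
        for (region_id, datanode), batch in batches.items():
            affected += self.datanodes[datanode].write(region_id, batch)
        return affected


dn1, dn2 = Datanode("datanode-1"), Datanode("datanode-2")
frontend = Frontend([dn1, dn2])
affected = frontend.insert([("host-a", 2, 0.5), ("web-1", 1, 0.9), ("host-a", 1, 0.4)])
print(affected)
```

Here a single insert of three rows is split into two batches, one per region, and the frontend reports the combined row count only after both datanodes have responded.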
For more details on each component, see the following guides: