Configuration
GreptimeDB supports layered configuration with the following precedence order (where each item overrides the one below it):
- Greptime command line options
- Configuration file options
- Environment variables
- Default values
You only need to set up the configurations you require. GreptimeDB will assign default values for any settings not configured.
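The layered lookup described above can be sketched in a few lines. This is an illustrative sketch using Python's `collections.ChainMap`, not GreptimeDB code, and the option names (`http_addr`, `log_level`) are invented for the example:

```python
# Illustrative sketch of layered configuration, NOT GreptimeDB code.
# ChainMap searches its maps left to right, which mirrors the precedence
# order above: CLI options override the config file, which overrides
# environment variables, which override defaults.
from collections import ChainMap

defaults    = {"http_addr": "127.0.0.1:4000", "log_level": "info"}  # built-in defaults
env_vars    = {"log_level": "debug"}                                # from the environment
config_file = {}                                                    # nothing set in the file
cli_options = {"http_addr": "0.0.0.0:4000"}                         # from the command line

settings = ChainMap(cli_options, config_file, env_vars, defaults)
print(settings["http_addr"])  # "0.0.0.0:4000" — the CLI value wins
print(settings["log_level"])  # "debug" — falls through to the environment variable
```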
How to set up configurations
Greptime command line options
You can specify several configurations using command line arguments. For example, to start GreptimeDB in standalone mode with a configured HTTP address:
greptime standalone start --http-addr 127.0.0.1:4000
For all the options supported by the Greptime command line, refer to the GreptimeDB Command Line Interface.
Configuration file options
You can specify configurations in a TOML file.
For example, create a configuration file `standalone.example.toml` as shown below:
[storage]
type = "File"
data_home = "/tmp/greptimedb/"
Then, specify the configuration file using the command line argument `-c [file_path]`.
greptime [standalone | frontend | datanode | metasrv] start -c config/standalone.example.toml
For example, to start in standalone mode:
greptime standalone start -c standalone.example.toml
Example files
Below are example configuration files for each GreptimeDB component, including all available options. In practice, you only need to configure the options you require; there is no need to set every option as in the sample files.
Environment variable
Every item in the configuration file can be mapped to an environment variable. For example, to set the `data_home` configuration item for the datanode using an environment variable:
# ...
[storage]
data_home = "/data/greptimedb"
# ...
Set the environment variable with the following shell command:
export GREPTIMEDB_DATANODE__STORAGE__DATA_HOME=/data/greptimedb
Environment Variable Rules
- Each environment variable should have the component prefix, for example:
  - GREPTIMEDB_FRONTEND
  - GREPTIMEDB_METASRV
  - GREPTIMEDB_DATANODE
  - GREPTIMEDB_STANDALONE
- Use double underscore `__` separators. For example, the data structure `storage.data_home` is transformed to `STORAGE__DATA_HOME`.
- An environment variable also accepts lists separated by commas `,`, for example:

GREPTIMEDB_METASRV__META_CLIENT__METASRV_ADDRS=127.0.0.1:3001,127.0.0.1:3002,127.0.0.1:3003
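The naming rules above amount to a simple transformation. Here is a hedged sketch (plain Python, not part of GreptimeDB) of how a component name plus a TOML key path becomes an environment variable name:

```python
# Sketch of the documented naming rule, not GreptimeDB code:
# "GREPTIMEDB_" prefix, then the component name and each TOML path
# segment, upper-cased and joined with double underscores.
def env_var_name(component: str, key_path: str) -> str:
    segments = [component] + key_path.split(".")
    return "GREPTIMEDB_" + "__".join(s.upper() for s in segments)

print(env_var_name("datanode", "storage.data_home"))
# GREPTIMEDB_DATANODE__STORAGE__DATA_HOME
```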
Options
In this section, we introduce some of the main configuration options. For all options, refer to the Configuration Reference on GitHub.
Protocol options
Protocol options are valid in the `frontend` and `standalone` subcommands, specifying protocol server addresses and other protocol-related options. Below is an example configuration with default values. You can change the values or disable certain protocols in your configuration file. For example, to disable OpenTSDB protocol support, set the `enable` parameter to `false`. Note that HTTP and gRPC protocols cannot be disabled for the database to function correctly.
[http]
addr = "127.0.0.1:4000"
timeout = "30s"
body_limit = "64MB"
[grpc]
addr = "127.0.0.1:4001"
runtime_size = 8
[mysql]
enable = true
addr = "127.0.0.1:4002"
runtime_size = 2
[mysql.tls]
mode = "disable"
cert_path = ""
key_path = ""
[postgres]
enable = true
addr = "127.0.0.1:4003"
runtime_size = 2
[postgres.tls]
mode = "disable"
cert_path = ""
key_path = ""
[opentsdb]
enable = true
[influxdb]
enable = true
[prom_store]
enable = true
with_metric_engine = true
The following table describes the options in detail:
| Option | Key | Type | Description |
|---|---|---|---|
| http | | | HTTP server options |
| | addr | String | Server address, "127.0.0.1:4000" by default |
| | timeout | String | HTTP request timeout, "30s" by default |
| | body_limit | String | HTTP max body size, "64MB" by default |
| | is_strict_mode | Boolean | Whether to enable the strict verification mode of the protocol, which slightly affects performance; false by default |
| grpc | | | gRPC server options |
| | addr | String | Server address, "127.0.0.1:4001" by default |
| | runtime_size | Integer | The number of server worker threads, 8 by default |
| mysql | | | MySQL server options |
| | enable | Boolean | Whether to enable the MySQL protocol, true by default |
| | addr | String | Server address, "127.0.0.1:4002" by default |
| | runtime_size | Integer | The number of server worker threads, 2 by default |
| influxdb | | | InfluxDB protocol options |
| | enable | Boolean | Whether to enable the InfluxDB protocol in the HTTP API, true by default |
| opentsdb | | | OpenTSDB protocol options |
| | enable | Boolean | Whether to enable the OpenTSDB protocol in the HTTP API, true by default |
| prom_store | | | Prometheus remote storage options |
| | enable | Boolean | Whether to enable Prometheus remote write and read in the HTTP API, true by default |
| | with_metric_engine | Boolean | Whether to use the metric engine for Prometheus remote write, true by default |
| postgres | | | PostgreSQL server options |
| | enable | Boolean | Whether to enable the PostgreSQL protocol, true by default |
| | addr | String | Server address, "127.0.0.1:4003" by default |
| | runtime_size | Integer | The number of server worker threads, 2 by default |
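As the table shows, each optional protocol carries its own `enable` flag, so an individual protocol can be switched off without touching the others. For example, this fragment disables only the OpenTSDB endpoint while leaving the remaining protocols at their defaults:

```toml
[opentsdb]
enable = false
```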
Storage options
The `storage` options are valid in the datanode and standalone modes, specifying the database data directory and other storage-related options.
GreptimeDB supports storing data in the local file system, AWS S3 and compatible services (including MinIO, DigitalOcean Spaces, Tencent Cloud Object Storage (COS), Baidu Object Storage (BOS), and so on), Azure Blob Storage, and Aliyun OSS.
| Option | Key | Type | Description |
|---|---|---|---|
| storage | | | Storage options |
| | type | String | Storage type; supports "File", "S3", "Oss", etc. |
| File | | | Local file storage options, valid when type="File" |
| | data_home | String | Database storage root directory, "/tmp/greptimedb" by default |
| S3 | | | AWS S3 storage options, valid when type="S3" |
| | bucket | String | The S3 bucket name |
| | root | String | The root path in the S3 bucket |
| | endpoint | String | The API endpoint of S3 |
| | region | String | The S3 region |
| | access_key_id | String | The S3 access key id |
| | secret_access_key | String | The S3 secret access key |
| Oss | | | Aliyun OSS storage options, valid when type="Oss" |
| | bucket | String | The OSS bucket name |
| | root | String | The root path in the OSS bucket |
| | endpoint | String | The API endpoint of OSS |
| | access_key_id | String | The OSS access key id |
| | secret_access_key | String | The OSS secret access key |
| Azblob | | | Azure Blob Storage options, valid when type="Azblob" |
| | container | String | The container name |
| | root | String | The root path in the container |
| | endpoint | String | The API endpoint of Azure Blob Storage |
| | account_name | String | The account name of Azure Blob Storage |
| | account_key | String | The access key |
| | sas_token | String | The shared access signature |
| Gcs | | | Google Cloud Storage options, valid when type="Gcs" |
| | root | String | The root path in the GCS bucket |
| | bucket | String | The GCS bucket name |
| | scope | String | The GCS service scope |
| | credential_path | String | The GCS credentials path |
| | endpoint | String | The API endpoint of GCS |
A file storage sample configuration:
[storage]
type = "File"
data_home = "/tmp/greptimedb/"
An S3 storage sample configuration:
[storage]
type = "S3"
bucket = "test_greptimedb"
root = "/greptimedb"
access_key_id = "<access key id>"
secret_access_key = "<secret access key>"
Storage engine provider
`[[storage.providers]]` sets up the table storage engine providers. Based on these providers, you can create a table with a specified storage; see create table:
# Allows using multiple storages
[[storage.providers]]
type = "S3"
bucket = "test_greptimedb"
root = "/greptimedb"
access_key_id = "<access key id>"
secret_access_key = "<secret access key>"
[[storage.providers]]
type = "Gcs"
bucket = "test_greptimedb"
root = "/greptimedb"
credential_path = "<gcs credential path>"
All configured providers can be used as the `storage` option when creating tables.
Object storage cache
When using S3, OSS, or Azure Blob Storage, it's better to enable object storage caching to speed up data queries:
[storage]
type = "S3"
bucket = "test_greptimedb"
root = "/greptimedb"
access_key_id = "<access key id>"
secret_access_key = "<secret access key>"
## Enable object storage caching
cache_path = "/var/data/s3_local_cache"
cache_capacity = "256MiB"
`cache_path` is the local directory that keeps cache files, and `cache_capacity` is the maximum total file size in the cache directory.
WAL options
The `[wal]` section in the datanode or standalone config file configures the options of the Write-Ahead Log:
Local WAL
[wal]
provider = "raft_engine"
file_size = "256MB"
purge_threshold = "4GB"
purge_interval = "10m"
read_batch_size = 128
sync_write = false
- `dir`: the directory where logs are written. When using `File` storage, it's `{data_home}/wal` by default. It must be configured explicitly when using other storage types such as `S3`.
- `file_size`: the maximum size of a WAL log file, `256MB` by default.
- `purge_threshold` and `purge_interval`: control the purging of WAL files; the default threshold is `4GB`.
- `sync_write`: whether to call `fsync` when writing every log.
Remote WAL
[wal]
provider = "kafka"
broker_endpoints = ["127.0.0.1:9092"]
max_batch_bytes = "1MB"
consumer_wait_timeout = "100ms"
backoff_init = "500ms"
backoff_max = "10s"
backoff_base = 2
backoff_deadline = "5mins"
- `broker_endpoints`: the Kafka broker endpoints.
- `max_batch_bytes`: the max size of a single producer batch.
- `consumer_wait_timeout`: the consumer wait timeout.
- `backoff_init`: the initial backoff delay.
- `backoff_max`: the maximum backoff delay.
- `backoff_base`: the exponential backoff rate.
- `backoff_deadline`: the deadline of retries.
Remote WAL Authentication (Optional)
[wal.sasl]
type = "SCRAM-SHA-512"
username = "user"
password = "secret"
The SASL configuration for the Kafka client. Available SASL mechanisms: `PLAIN`, `SCRAM-SHA-256`, `SCRAM-SHA-512`.
Remote WAL TLS (Optional)
[wal.tls]
server_ca_cert_path = "/path/to/server_cert"
client_cert_path = "/path/to/client_cert"
client_key_path = "/path/to/key"
The TLS configuration for the Kafka client. Supported modes: TLS using system CA certs, TLS with a specified CA cert, and mTLS.
Examples:
TLS (using system ca certs)
[wal.tls]
TLS (with specified ca cert)
[wal.tls]
server_ca_cert_path = "/path/to/server_cert"
mTLS
[wal.tls]
server_ca_cert_path = "/path/to/server_cert"
client_cert_path = "/path/to/client_cert"
client_key_path = "/path/to/key"
Logging options
`frontend`, `metasrv`, `datanode`, and `standalone` can all configure log- and tracing-related parameters in the `[logging]` section:
[logging]
dir = "/tmp/greptimedb/logs"
level = "info"
enable_otlp_tracing = false
otlp_endpoint = "localhost:4317"
append_stdout = true
[logging.tracing_sample_ratio]
default_ratio = 1.0
- `dir`: log output directory.
- `level`: output log level; available levels are `info`, `debug`, `error`, and `warn`; the default level is `info`.
- `enable_otlp_tracing`: whether to turn on tracing; off by default.
- `otlp_endpoint`: the target endpoint for exporting tracing via the gRPC-based OTLP protocol; the default value is `localhost:4317`.
- `append_stdout`: whether to append logs to stdout; defaults to `true`.
- `tracing_sample_ratio`: configures the sampling rate of tracing. For how to use `tracing_sample_ratio`, please refer to How to configure tracing sampling rate.
For how to use distributed tracing, please refer to Tracing.
Region engine options
The parameters corresponding to different storage engines can be configured for `datanode` and `standalone` in the `[region_engine]` section. Currently, only options for the `mito` region engine are available.
Frequently used options:
[[region_engine]]
[region_engine.mito]
num_workers = 8
manifest_checkpoint_distance = 10
max_background_jobs = 4
auto_flush_interval = "1h"
global_write_buffer_size = "1GB"
global_write_buffer_reject_size = "2GB"
sst_meta_cache_size = "128MB"
vector_cache_size = "512MB"
page_cache_size = "512MB"
sst_write_buffer_size = "8MB"
scan_parallelism = 0
[region_engine.mito.inverted_index]
create_on_flush = "auto"
create_on_compaction = "auto"
apply_on_query = "auto"
mem_threshold_on_create = "64M"
intermediate_path = ""
[region_engine.mito.memtable]
type = "time_series"
The `mito` engine provides an experimental memtable that optimizes write performance and memory efficiency under large numbers of time series. Its read performance might not be as fast as the default `time_series` memtable.
[region_engine.mito.memtable]
type = "partition_tree"
index_max_keys_per_shard = 8192
data_freeze_threshold = 32768
fork_dictionary_bytes = "1GiB"
Available options:
| Key | Type | Default | Descriptions |
|---|---|---|---|
| num_workers | Integer | 8 | Number of region workers. |
| manifest_checkpoint_distance | Integer | 10 | Number of meta action updates to trigger a new checkpoint for the manifest. |
| max_background_jobs | Integer | 4 | Max number of running background jobs. |
| auto_flush_interval | String | 1h | Interval to auto flush a region if it has not flushed yet. |
| global_write_buffer_size | String | 1GB | Global write buffer size for all regions. If not set, it defaults to 1/8 of OS memory with a max limitation of 1GB. |
| global_write_buffer_reject_size | String | 2GB | Global write buffer size threshold to reject write requests. If not set, it defaults to 2 times global_write_buffer_size. |
| sst_meta_cache_size | String | 128MB | Cache size for SST metadata. Set it to 0 to disable the cache. If not set, it defaults to 1/32 of OS memory with a max limitation of 128MB. |
| vector_cache_size | String | 512MB | Cache size for vectors and arrow arrays. Set it to 0 to disable the cache. If not set, it defaults to 1/16 of OS memory with a max limitation of 512MB. |
| page_cache_size | String | 512MB | Cache size for pages of SST row groups. Set it to 0 to disable the cache. If not set, it defaults to 1/8 of OS memory. |
| selector_result_cache_size | String | 512MB | Cache size for time series selector results (e.g. `last_value()`). Set it to 0 to disable the cache. If not set, it defaults to 1/8 of OS memory. |
| sst_write_buffer_size | String | 8MB | Buffer size for SST writing. |
| scan_parallelism | Integer | 0 | Parallelism to scan a region. `0`: use the default value (1/4 of CPU cores); `1`: scan in the current thread; `n`: scan with parallelism `n`. |
| inverted_index | -- | -- | The options for the inverted index in the Mito engine. |
| inverted_index.create_on_flush | String | auto | Whether to create the index on flush. `auto`: automatically; `disable`: never. |
| inverted_index.create_on_compaction | String | auto | Whether to create the index on compaction. `auto`: automatically; `disable`: never. |
| inverted_index.apply_on_query | String | auto | Whether to apply the index on query. `auto`: automatically; `disable`: never. |
| inverted_index.mem_threshold_on_create | String | 64M | Memory threshold for performing an external sort during index creation. Setting it to empty disables external sorting, forcing all sorting operations to happen in memory. |
| inverted_index.intermediate_path | String | "" | File system path to store intermediate files for external sorting (default `{data_home}/index_intermediate`). |
| memtable.type | String | time_series | Memtable type. `time_series`: time-series memtable; `partition_tree`: partition tree memtable (experimental). |
| memtable.index_max_keys_per_shard | Integer | 8192 | The max number of keys in one shard. Only available for the partition_tree memtable. |
| memtable.data_freeze_threshold | Integer | 32768 | The max rows of data inside the actively writing buffer in one shard. Only available for the partition_tree memtable. |
| memtable.fork_dictionary_bytes | String | 1GiB | Max dictionary bytes. Only available for the partition_tree memtable. |