# Elasticsearch

## Overview
GreptimeDB supports data ingestion through Elasticsearch's `_bulk` API. We map Elasticsearch's Index concept to GreptimeDB's Table, and users can specify the database name using the `db` URL parameter. Unlike native Elasticsearch, this API only supports data insertion, not modification or deletion. At the implementation level, both `index` and `create` commands in native Elasticsearch `_bulk` API requests are treated as creation operations by GreptimeDB. Additionally, GreptimeDB only parses the `_index` field from native `_bulk` API command requests and ignores all other fields.
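For example, the following two `_bulk` command lines are handled identically by GreptimeDB — both result in plain inserts (the `logs` index name and the document bodies here are illustrative, not part of the API):

```json
{"index": {"_index": "logs", "_id": "1"}}
{"message": "written via the index command"}
{"create": {"_index": "logs", "_id": "2"}}
{"message": "written via the create command"}
```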
## HTTP API
In most log collectors (such as Logstash and Filebeat, covered below), you only need to configure the HTTP endpoint in the form `http://${db_host}:${db_http_port}/v1/elasticsearch`, for example `http://localhost:4000/v1/elasticsearch`.

GreptimeDB supports data ingestion through the Elasticsearch protocol by implementing the following two HTTP endpoints:
- `/v1/elasticsearch/_bulk`: Users can use the POST method to write data in NDJSON format to GreptimeDB. For example, the following request will create a table named `test` and insert two records:

  ```json
  POST /v1/elasticsearch/_bulk
  {"create": {"_index": "test", "_id": "1"}}
  {"name": "John", "age": 30}
  {"create": {"_index": "test", "_id": "2"}}
  {"name": "Jane", "age": 25}
  ```

- `/v1/elasticsearch/${index}/_bulk`: Users can use the POST method to write data in NDJSON format to the `${index}` table in GreptimeDB. If the POST request also contains an `_index` field, the `${index}` in the URL will be ignored. For example, the following request will create tables named `test` and `another_index`, and insert the corresponding data:

  ```json
  POST /v1/elasticsearch/test/_bulk
  {"create": {"_id": "1"}}
  {"name": "John", "age": 30}
  {"create": {"_index": "another_index", "_id": "2"}}
  {"name": "Jane", "age": 25}
  ```
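The NDJSON bodies above can also be built programmatically before posting them to either endpoint. A minimal sketch using only Python's standard library (the helper name and the sample documents are ours, not part of the API):

```python
import json

def build_bulk_payload(index, docs):
    """Build an NDJSON _bulk body: each record is a command line
    followed by a document line, and the body ends with a newline."""
    lines = []
    for doc_id, doc in docs:
        lines.append(json.dumps({"create": {"_index": index, "_id": doc_id}}))
        lines.append(json.dumps(doc))
    return "\n".join(lines) + "\n"

payload = build_bulk_payload("test", [
    ("1", {"name": "John", "age": 30}),
    ("2", {"name": "Jane", "age": 25}),
])
print(payload)
```

The resulting string can then be sent as the request body of a POST to `/v1/elasticsearch/_bulk`.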
### HTTP Headers

- `x-greptime-db-name`: Specifies the database name. Defaults to `public` if not specified.
- `x-greptime-pipeline-name`: Specifies the pipeline name. Defaults to GreptimeDB's internal pipeline `greptime_identity` if not specified.
- `x-greptime-pipeline-version`: Specifies the pipeline version. Defaults to the latest version of the corresponding pipeline if not specified.
For more details about Pipeline, please refer to the Manage Pipelines documentation.
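As an illustration, these headers can be attached to a `_bulk` request with Python's standard library. The endpoint and payload are assumptions, and the final `urlopen` call is commented out because it needs a running GreptimeDB instance:

```python
import urllib.request

# Hypothetical local endpoint; adjust host and port for your deployment.
url = "http://localhost:4000/v1/elasticsearch/_bulk"

# NDJSON body: one command line followed by one document line.
body = b'{"create": {"_index": "test", "_id": "1"}}\n{"name": "John", "age": 30}\n'

req = urllib.request.Request(
    url,
    data=body,
    method="POST",
    headers={
        "Content-Type": "application/json",
        "x-greptime-db-name": "public",                   # target database
        "x-greptime-pipeline-name": "greptime_identity",  # default identity pipeline
    },
)
# urllib.request.urlopen(req)  # uncomment with a running GreptimeDB instance
```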
### URL Parameters

You can use the following HTTP URL parameters:

- `db`: Specifies the database name. Defaults to `public` if not specified.
- `pipeline_name`: Specifies the pipeline name. Defaults to GreptimeDB's internal pipeline `greptime_identity` if not specified.
- `version`: Specifies the pipeline version. Defaults to the latest version of the corresponding pipeline if not specified.
- `msg_field`: Specifies the name of the JSON field that contains the original log data. For example, in Logstash and Filebeat this field is typically `message`. If specified, GreptimeDB attempts to parse the field's value as JSON; if parsing fails, the value is treated as a plain string. This option currently only takes effect as a URL parameter.
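To make the `msg_field` fallback concrete, here is a small sketch of the documented behavior — try to parse the named field as JSON, and keep the raw string if that fails. This is an illustration only, not GreptimeDB's actual implementation:

```python
import json

def extract_msg_field(doc, msg_field):
    """Parse doc[msg_field] as JSON if possible; otherwise keep it as-is."""
    raw = doc.get(msg_field)
    if raw is None:
        return doc
    try:
        return json.loads(raw)  # field held structured JSON text
    except (json.JSONDecodeError, TypeError):
        return raw  # field stays a plain string

# A Logstash-style event whose "message" field holds JSON text:
print(extract_msg_field({"message": '{"level": "info", "msg": "server started"}'}, "message"))

# A non-JSON message falls back to the raw string:
print(extract_msg_field({"message": "plain text line"}, "message"))
```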
### Authentication Header
For detailed information about the authentication header, please refer to the Authorization documentation.
## Usage

### Use the HTTP API to ingest data
You can create a `request.json` file with the following content:

```json
{"create": {"_index": "es_test", "_id": "1"}}
{"name": "John", "age": 30}
{"create": {"_index": "es_test", "_id": "2"}}
{"name": "Jane", "age": 25}
```
Then use the `curl` command to send this file as the request body to GreptimeDB:

```shell
curl -XPOST http://localhost:4000/v1/elasticsearch/_bulk \
  -H "Authorization: Basic {{authentication}}" \
  -H "Content-Type: application/json" \
  -d @request.json
```
We can then use a `mysql` client to connect to GreptimeDB and execute the following SQL to view the inserted data:

```sql
SELECT * FROM es_test;
```
We will see the following results:

```sql
mysql> SELECT * FROM es_test;
+------+------+----------------------------+
| age  | name | greptime_timestamp         |
+------+------+----------------------------+
|   30 | John | 2025-01-15 08:26:06.516665 |
|   25 | Jane | 2025-01-15 08:26:06.521510 |
+------+------+----------------------------+
2 rows in set (0.13 sec)
```
### Logstash
If you are using Logstash to collect logs, you can use the following configuration to write data to GreptimeDB:
```
output {
    elasticsearch {
        hosts => ["http://localhost:4000/v1/elasticsearch"]
        index => "my_index"
        parameters => {
            "pipeline_name" => "my_pipeline"
            "msg_field" => "message"
        }
    }
}
```
Please pay attention to the following GreptimeDB-related configurations:

- `hosts`: Specifies the HTTP address of GreptimeDB's Elasticsearch protocol, which is `http://${db_host}:${db_http_port}/v1/elasticsearch`.
- `index`: Specifies the table name that the data will be written to.
- `parameters`: Specifies the URL parameters for the write; the example above sets two parameters: `pipeline_name` and `msg_field`.
### Filebeat
If you are using Filebeat to collect logs, you can use the following configuration to write data to GreptimeDB:
```yaml
output.elasticsearch:
  hosts: ["http://localhost:4000/v1/elasticsearch"]
  index: "my_index"
  parameters:
    pipeline_name: my_pipeline
    msg_field: message
```
Please pay attention to the following GreptimeDB-related configurations:

- `hosts`: Specifies the HTTP address of GreptimeDB's Elasticsearch protocol, which is `http://${db_host}:${db_http_port}/v1/elasticsearch`.
- `index`: Specifies the table name that the data will be written to.
- `parameters`: Specifies the URL parameters for the write; the example above sets two parameters: `pipeline_name` and `msg_field`.
### Telegraf
If you are using Telegraf to collect logs, you can use its Elasticsearch plugin to write data to GreptimeDB, as shown below:
```toml
[[outputs.elasticsearch]]
  urls = [ "http://localhost:4000/v1/elasticsearch" ]
  index_name = "test_table"
  health_check_interval = "0s"
  enable_sniffer = false
  flush_interval = "1s"
  manage_template = false
  template_name = "telegraf"
  overwrite_template = false
  namepass = ["tail"]

  [outputs.elasticsearch.headers]
    "X-GREPTIME-DB-NAME" = "public"
    "X-GREPTIME-PIPELINE-NAME" = "greptime_identity"

[[inputs.tail]]
  files = ["/tmp/test.log"]
  from_beginning = true
  data_format = "value"
  data_type = "string"
  character_encoding = "utf-8"
  interval = "1s"
  pipe = false
  watch_method = "inotify"
```
Please pay attention to the following GreptimeDB-related configurations:

- `urls`: Specifies the HTTP address of GreptimeDB's Elasticsearch protocol, which is `http://${db_host}:${db_http_port}/v1/elasticsearch`.
- `index_name`: Specifies the table name that the data will be written to.
- `outputs.elasticsearch.headers`: Specifies the HTTP headers for the write; the example above sets two headers: `X-GREPTIME-DB-NAME` and `X-GREPTIME-PIPELINE-NAME`.