# Elasticsearch

## Overview
GreptimeDB supports data ingestion through Elasticsearch's `_bulk` API. We map Elasticsearch's Index concept to GreptimeDB's Table, and users can specify the database name using the `db` URL parameter. Unlike native Elasticsearch, this API only supports data insertion, not modification or deletion. At the implementation level, both `index` and `create` commands in native Elasticsearch `_bulk` API requests are treated as creation operations by GreptimeDB. Additionally, GreptimeDB only parses the `_index` field from native `_bulk` API command requests while ignoring other fields.
## HTTP API

GreptimeDB supports data ingestion via the Elasticsearch protocol through two HTTP endpoints:

- `/v1/elasticsearch/_bulk`: Users can use the POST method to write data in NDJSON format to GreptimeDB.

  For example, the following request will create a table named `test` and insert two records:

  ```
  POST /v1/elasticsearch/_bulk

  {"create": {"_index": "test", "_id": "1"}}
  {"name": "John", "age": 30}
  {"create": {"_index": "test", "_id": "2"}}
  {"name": "Jane", "age": 25}
  ```

- `/v1/elasticsearch/${index}/_bulk`: Users can use the POST method to write data in NDJSON format to the `${index}` table in GreptimeDB. If the POST request also contains an `_index` field, the `${index}` in the URL will be ignored.

  For example, the following request will create tables named `test` and `another_index`, and insert the corresponding data:

  ```
  POST /v1/elasticsearch/test/_bulk

  {"create": {"_id": "1"}}
  {"name": "John", "age": 30}
  {"create": {"_index": "another_index", "_id": "2"}}
  {"name": "Jane", "age": 25}
  ```
Users can also use the following HTTP URL parameters (a combined example follows this list):

- `db`: Specifies the database name. Defaults to `public` if not specified.
- `pipeline_name`: Specifies the pipeline name. Defaults to GreptimeDB's internal pipeline `greptime_identity` if not specified.
- `version`: Specifies the pipeline version. Defaults to the latest version of the corresponding pipeline if not specified.
- `msg_field`: Specifies the JSON field name containing the original log data. For example, in Logstash and Filebeat, this field is typically `message`. If specified, GreptimeDB will attempt to parse the data in this field as JSON; if parsing fails, the field will be treated as a string.
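As a sketch of how these parameters combine (the table name `es_params_test` and the `message` payload are illustrative; `db=public` and `pipeline_name=greptime_identity` simply spell out the defaults documented above):

```shell
# Write to database `public` through the default `greptime_identity`
# pipeline, asking GreptimeDB to parse the `message` field as JSON.
# If the field's content is not valid JSON, it is stored as a string.
curl -XPOST "http://localhost:4000/v1/elasticsearch/_bulk?db=public&pipeline_name=greptime_identity&msg_field=message" \
  -H "Content-Type: application/json" \
  --data-binary $'{"create": {"_index": "es_params_test"}}\n{"message": "{\\"level\\": \\"info\\", \\"msg\\": \\"hello\\"}"}\n'
```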
## Usage

### Use HTTP API to ingest data

You can create a `request.json` file with the following content:

```
{"create": {"_index": "es_test", "_id": "1"}}
{"name": "John", "age": 30}
{"create": {"_index": "es_test", "_id": "2"}}
{"name": "Jane", "age": 25}
```
Then use the `curl` command to send this file as the request body to GreptimeDB (use `--data-binary` rather than `-d` so that curl preserves the newlines NDJSON requires):

```shell
curl -XPOST http://localhost:4000/v1/elasticsearch/_bulk \
  -H "Content-Type: application/json" --data-binary @request.json
```
We can use a `mysql` client to connect to GreptimeDB and execute the following SQL to view the inserted data:

```sql
SELECT * FROM es_test;
```

We will see the following results:
```
mysql> SELECT * FROM es_test;
+------+------+----------------------------+
| age  | name | greptime_timestamp         |
+------+------+----------------------------+
|   30 | John | 2025-01-15 08:26:06.516665 |
|   25 | Jane | 2025-01-15 08:26:06.521510 |
+------+------+----------------------------+
2 rows in set (0.13 sec)
```
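Alternatively, if you prefer to stay in HTTP, the same check can be done through GreptimeDB's SQL-over-HTTP endpoint (a sketch, assuming the endpoint is served at `/v1/sql` on the same port):

```shell
# Query the table back over HTTP instead of the MySQL protocol.
curl -XPOST "http://localhost:4000/v1/sql" \
  -H "Content-Type: application/x-www-form-urlencoded" \
  -d "sql=SELECT * FROM es_test"
```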
### Logstash

If you are using Logstash to collect logs, you can use the following configuration to write data to GreptimeDB:

```
output {
    elasticsearch {
        hosts => ["http://localhost:4000/v1/elasticsearch"]
        index => "my_index"
        parameters => {
            "pipeline_name" => "my_pipeline"
            "msg_field" => "message"
        }
    }
}
```
The `parameters` section is optional, while `hosts` and `index` should be adjusted according to your actual setup.
### Filebeat

If you are using Filebeat to collect logs, you can use the following configuration to write data to GreptimeDB:

```yaml
output.elasticsearch:
  hosts: ["http://localhost:4000/v1/elasticsearch"]
  index: "my_index"
  parameters:
    pipeline_name: my_pipeline
    msg_field: message
```
The `parameters` section is optional, while `hosts` and `index` should be adjusted according to your actual setup.