# Elasticsearch

## Overview
GreptimeDB supports data ingestion through Elasticsearch's `_bulk` API. We map Elasticsearch's Index concept to GreptimeDB's Table, and users can specify the database name using the `db` URL parameter. Unlike native Elasticsearch, this API only supports data insertion, not modification or deletion. At the implementation level, both `index` and `create` commands in native Elasticsearch `_bulk` API requests are treated as creation operations by GreptimeDB. Additionally, GreptimeDB only parses the `_index` field from native `_bulk` API command requests and ignores all other fields.
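For example, the following two `_bulk` command lines are handled identically by GreptimeDB — both result in plain inserts (the `logs` index name and the document bodies here are illustrative, not part of the API):

```json
{"index": {"_index": "logs", "_id": "1"}}
{"message": "written via the index command"}
{"create": {"_index": "logs", "_id": "2"}}
{"message": "written via the create command"}
```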
## HTTP API
In most log collectors (such as Logstash and Filebeat, covered below), you only need to configure the HTTP endpoint in the form `http://${db_host}:${db_http_port}/v1/elasticsearch`, for example `http://localhost:4000/v1/elasticsearch`.

GreptimeDB supports data ingestion through the Elasticsearch protocol by implementing the following two HTTP endpoints:
- `/v1/elasticsearch/_bulk`: Users can use the POST method to write data in NDJSON format to GreptimeDB. For example, the following request will create a table named `test` and insert two records:

  ```json
  POST /v1/elasticsearch/_bulk
  {"create": {"_index": "test", "_id": "1"}}
  {"name": "John", "age": 30}
  {"create": {"_index": "test", "_id": "2"}}
  {"name": "Jane", "age": 25}
  ```

- `/v1/elasticsearch/${index}/_bulk`: Users can use the POST method to write data in NDJSON format to the `${index}` table in GreptimeDB. If the POST request also contains an `_index` field, the `${index}` in the URL will be ignored. For example, the following request will create tables named `test` and `another_index`, and insert the corresponding data:

  ```json
  POST /v1/elasticsearch/test/_bulk
  {"create": {"_id": "1"}}
  {"name": "John", "age": 30}
  {"create": {"_index": "another_index", "_id": "2"}}
  {"name": "Jane", "age": 25}
  ```
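The NDJSON bodies above can also be built programmatically before posting them to either endpoint. A minimal sketch using only Python's standard library (the helper name and the sample documents are ours, not part of the API):

```python
import json

def build_bulk_payload(index, docs):
    """Build an NDJSON _bulk body: each record is a command line
    followed by a document line, and the body ends with a newline."""
    lines = []
    for doc_id, doc in docs:
        lines.append(json.dumps({"create": {"_index": index, "_id": doc_id}}))
        lines.append(json.dumps(doc))
    return "\n".join(lines) + "\n"

payload = build_bulk_payload("test", [
    ("1", {"name": "John", "age": 30}),
    ("2", {"name": "Jane", "age": 25}),
])
print(payload)
```

The resulting string can then be sent as the request body of a POST to `/v1/elasticsearch/_bulk`.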
### HTTP Headers

- `x-greptime-db-name`: Specifies the database name. Defaults to `public` if not specified.
- `x-greptime-pipeline-name`: Specifies the pipeline name. Defaults to GreptimeDB's internal pipeline `greptime_identity` if not specified.
- `x-greptime-pipeline-version`: Specifies the pipeline version. Defaults to the latest version of the corresponding pipeline if not specified.
For more details about Pipeline, please refer to the Manage Pipelines documentation.
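As an illustration, these headers can be attached to a `_bulk` request with Python's standard library. The endpoint and payload are assumptions, and the final `urlopen` call is commented out because it needs a running GreptimeDB instance:

```python
import urllib.request

# Hypothetical local endpoint; adjust host and port for your deployment.
url = "http://localhost:4000/v1/elasticsearch/_bulk"

# NDJSON body: one command line followed by one document line.
body = b'{"create": {"_index": "test", "_id": "1"}}\n{"name": "John", "age": 30}\n'

req = urllib.request.Request(
    url,
    data=body,
    method="POST",
    headers={
        "Content-Type": "application/json",
        "x-greptime-db-name": "public",                   # target database
        "x-greptime-pipeline-name": "greptime_identity",  # default identity pipeline
    },
)
# urllib.request.urlopen(req)  # uncomment with a running GreptimeDB instance
```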
### URL Parameters

You can use the following HTTP URL parameters:

- `db`: Specifies the database name. Defaults to `public` if not specified.
- `pipeline_name`: Specifies the pipeline name. Defaults to GreptimeDB's internal pipeline `greptime_identity` if not specified.
- `version`: Specifies the pipeline version. Defaults to the latest version of the corresponding pipeline if not specified.
- `msg_field`: Specifies the name of the JSON field that contains the original log data. For example, in Logstash and Filebeat this field is typically `message`. If specified, GreptimeDB attempts to parse the field's value as JSON; if parsing fails, the value is treated as a plain string. This option currently only takes effect as a URL parameter.
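To make the `msg_field` fallback concrete, here is a small sketch of the documented behavior — try to parse the named field as JSON, and keep the raw string if that fails. This is an illustration only, not GreptimeDB's actual implementation:

```python
import json

def extract_msg_field(doc, msg_field):
    """Parse doc[msg_field] as JSON if possible; otherwise keep it as-is."""
    raw = doc.get(msg_field)
    if raw is None:
        return doc
    try:
        return json.loads(raw)  # field held structured JSON text
    except (json.JSONDecodeError, TypeError):
        return raw  # field stays a plain string

# A Logstash-style event whose "message" field holds JSON text:
print(extract_msg_field({"message": '{"level": "info", "msg": "server started"}'}, "message"))

# A non-JSON message falls back to the raw string:
print(extract_msg_field({"message": "plain text line"}, "message"))
```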
### Authentication Header
For detailed information about the authentication header, please refer to the Authorization documentation.
## Usage

### Use the HTTP API to ingest data
You can create a `request.json` file with the following content:

```json
{"create": {"_index": "es_test", "_id": "1"}}
{"name": "John", "age": 30}
{"create": {"_index": "es_test", "_id": "2"}}
{"name": "Jane", "age": 25}
```
Then use the `curl` command to send this file as the request body to GreptimeDB:

```shell
curl -XPOST http://localhost:4000/v1/elasticsearch/_bulk \
  -H "Authorization: Basic {{authentication}}" \
  -H "Content-Type: application/json" \
  -d @request.json
```
We can then use a `mysql` client to connect to GreptimeDB and execute the following SQL to view the inserted data:

```sql
SELECT * FROM es_test;
```
We will see the following results:

```sql
mysql> SELECT * FROM es_test;
+------+------+----------------------------+
| age  | name | greptime_timestamp         |
+------+------+----------------------------+
|   30 | John | 2025-01-15 08:26:06.516665 |
|   25 | Jane | 2025-01-15 08:26:06.521510 |
+------+------+----------------------------+
2 rows in set (0.13 sec)
```
### Logstash
If you are using Logstash to collect logs, you can use the following configuration to write data to GreptimeDB:
```
output {
    elasticsearch {
        hosts => ["http://localhost:4000/v1/elasticsearch"]
        index => "my_index"
        parameters => {
            "pipeline_name" => "my_pipeline"
            "msg_field" => "message"
        }
    }
}
```
Please pay attention to the following GreptimeDB-related configurations:

- `hosts`: Specifies the HTTP address of GreptimeDB's Elasticsearch protocol, which is `http://${db_host}:${db_http_port}/v1/elasticsearch`.
- `index`: Specifies the table name that the data will be written to.
- `parameters`: Specifies the URL parameters for the write; the example above sets two parameters: `pipeline_name` and `msg_field`.
### Filebeat
If you are using Filebeat to collect logs, you can use the following configuration to write data to GreptimeDB:
```yaml
output.elasticsearch:
  hosts: ["http://localhost:4000/v1/elasticsearch"]
  index: "my_index"
  parameters:
    pipeline_name: my_pipeline
    msg_field: message
```
Please pay attention to the following GreptimeDB-related configurations:

- `hosts`: Specifies the HTTP address of GreptimeDB's Elasticsearch protocol, which is `http://${db_host}:${db_http_port}/v1/elasticsearch`.
- `index`: Specifies the table name that the data will be written to.
- `parameters`: Specifies the URL parameters for the write; the example above sets two parameters: `pipeline_name` and `msg_field`.
### Telegraf
If you are using Telegraf to collect logs, you can use its Elasticsearch plugin to write data to GreptimeDB, as shown below:
```toml
[[outputs.elasticsearch]]
  urls = [ "http://localhost:4000/v1/elasticsearch" ]
  index_name = "test_table"
  health_check_interval = "0s"
  enable_sniffer = false
  flush_interval = "1s"
  manage_template = false
  template_name = "telegraf"
  overwrite_template = false
  namepass = ["tail"]

  [outputs.elasticsearch.headers]
    "X-GREPTIME-DB-NAME" = "public"
    "X-GREPTIME-PIPELINE-NAME" = "greptime_identity"

[[inputs.tail]]
  files = ["/tmp/test.log"]
  from_beginning = true
  data_format = "value"
  data_type = "string"
  character_encoding = "utf-8"
  interval = "1s"
  pipe = false
  watch_method = "inotify"
```
Please pay attention to the following GreptimeDB-related configurations:

- `urls`: Specifies the HTTP address of GreptimeDB's Elasticsearch protocol, which is `http://${db_host}:${db_http_port}/v1/elasticsearch`.
- `index_name`: Specifies the table name that the data will be written to.
- `outputs.elasticsearch.headers`: Specifies the HTTP headers for the write; the example above sets two headers: `X-GREPTIME-DB-NAME` and `X-GREPTIME-PIPELINE-NAME`.