Elasticsearch is a document-based NoSQL database. It stores JSON documents in a distributed, RESTful search and analytics engine built on Lucene.
Supported services
Fivetran supports two different Elasticsearch services:
- Elastic Cloud
- Self-Hosted Elasticsearch
Supported configurations
Fivetran supports the following Elasticsearch configurations:
Supportability Category | Supported Values |
---|---|
Database versions | 7.10.0 - 7.16.0 |
Maximum throughput * | 5.0 MBps |
Connector limit per database | No limit |
* Maximum throughput is your connector’s end-to-end update speed, measured in megabytes per second (MBps). We calculate the maximum throughput by averaging the number of rows synced per second during your connector’s last 3-4 syncs. To learn more about sync speed, see the Replication speeds section.
Network protocol | Supported Versions | Notes |
---|---|---|
Transport Layer Security (TLS) | TLS 1.0, TLS 1.1, TLS 1.2 | We can only support TLS versions that your corresponding version of the database supports. |
Known limitations
Elasticsearch field names are case-sensitive, but columns are case-insensitive in your Fivetran destination. We therefore reject fields with the same name but different capitalization as duplicate columns. For example, Field and field are different field names in Elasticsearch but would both map to the field column in your destination. To avoid this error, give each field a name that differs by more than capitalization.
Learn more in Fivetran’s naming conventions documentation.
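To check an index for this problem before connecting it, you can group its field names case-insensitively and look for groups with more than one member. The following is a minimal sketch; the function name `find_case_collisions` is hypothetical, and lowercasing is only an assumed stand-in for how a destination normalizes column names.

```python
from collections import defaultdict


def find_case_collisions(field_names):
    """Group field names that differ only by capitalization.

    Assumption: lowercasing approximates the destination's
    case-insensitive column matching.
    """
    groups = defaultdict(list)
    # dict.fromkeys drops exact duplicates while keeping input order.
    for name in dict.fromkeys(field_names):
        groups[name.lower()].append(name)
    # Keep only the groups where two or more distinct names collide.
    return {key: names for key, names in groups.items() if len(names) > 1}


collisions = find_case_collisions(["Field", "field", "user_id", "created_at"])
print(collisions)
```

Any name that appears in the result would need to be renamed in the source before the sync can succeed.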
Features
Feature Name | Supported | Notes |
---|---|---|
Capture deletes | check | All tables and fields |
Custom data | check | |
Data blocking | check | Column level |
Column hashing | check | |
Re-sync | check | Connector and table level |
History | | |
API configurable | | |
Priority-first sync | | |
Fivetran data models | | |
Private networking | | |
Setup guide
For specific instructions on how to set up your Elasticsearch connector, see the setup guide for your Elasticsearch service type (Elastic Cloud or Self-Hosted Elasticsearch).
Sync overview
Once Fivetran is connected to your Elasticsearch instance, we fetch all historical data up to the present state. We then sync only the most recent inserts and updates at regular intervals using the sequence number and version fields on the documents. We capture deleted data using Fivetran Teleport Sync.
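The incremental step can be pictured as a checkpoint comparison on each document's sequence number. The sketch below works on in-memory dictionaries rather than a live cluster; `select_changed` and the sample documents are hypothetical, and it ignores the fact that Elasticsearch assigns sequence numbers per shard, so a real implementation would likely track one checkpoint per shard. It is not Fivetran's actual implementation.

```python
def select_changed(documents, last_seq_no):
    """Return documents changed since the checkpoint and the new checkpoint.

    Assumption: each document carries its `_seq_no` metadata, and all
    documents come from a single shard.
    """
    changed = sorted(
        (doc for doc in documents if doc["_seq_no"] > last_seq_no),
        key=lambda doc: doc["_seq_no"],
    )
    # If nothing changed, the checkpoint stays where it was.
    new_checkpoint = changed[-1]["_seq_no"] if changed else last_seq_no
    return changed, new_checkpoint


docs = [
    {"_id": "a", "_seq_no": 0, "_version": 1},
    {"_id": "b", "_seq_no": 3, "_version": 2},  # updated after the checkpoint
    {"_id": "c", "_seq_no": 2, "_version": 1},  # inserted after the checkpoint
]
changed, checkpoint = select_changed(docs, last_seq_no=1)
print([doc["_id"] for doc in changed], checkpoint)
```

Because a deleted document no longer appears in the index, a comparison like this cannot see deletes, which is why a separate mechanism is needed for them.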
Fivetran Teleport Sync
Fivetran Teleport Sync is a proprietary database replication method that offers the completeness of snapshots while approaching the speed of log-based systems. With this sync mechanism, Fivetran can incrementally sync deleted data with no additional setup other than a read-only connection.
Fivetran Teleport Sync’s queries perform the following operations on your Elasticsearch instance:
- Do a full scan of each synced index’s unique IDs
- Aggregate a compressed unique ID snapshot in the instance’s memory
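Teleport Sync itself is proprietary, so the following is only a conceptual sketch of why a snapshot of unique IDs is enough to detect deletes: comparing the previous snapshot with the current one reveals which IDs disappeared. The function name `diff_snapshots` is hypothetical, and plain Python sets stand in for the compressed snapshot described above.

```python
def diff_snapshots(previous_ids, current_ids):
    """Compare two ID snapshots of the same index.

    Assumption: each snapshot is a complete collection of the index's
    document IDs at the time of the scan.
    """
    previous, current = set(previous_ids), set(current_ids)
    return {
        "deleted": sorted(previous - current),   # present before, gone now
        "inserted": sorted(current - previous),  # new since the last scan
    }


result = diff_snapshots(["a", "b", "c"], ["b", "c", "d"])
print(result)
```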
For optimum Fivetran Teleport Sync performance, we recommend that you make the following resources available in your Elasticsearch instance:
- 1 GB of free RAM
- 1 free CPU core
- Available IOPS (Teleport Sync times decrease linearly as available IOPS increase)
Replication speeds
Two major factors can cause disparities between our estimates and the exact replication speed for your Fivetran-connected databases: network latency and discrepancies in the format of the data we receive versus how the data is stored at rest in the data destination.
The ability to sync changes quickly also depends on the sync frequency you configure. The risk of the sync falling behind, or being unable to keep up with data changes, decreases as the sync frequency increases. We recommend a higher sync frequency for data sources with a high rate of data changes.
Schema information
Fivetran tries to replicate the exact indices from your Elasticsearch source database to your destination. For every index in the Elasticsearch database that you connect to Fivetran, we create a table in your destination that maps to its native schema.
Fivetran-generated columns
Fivetran adds the following columns to every table in your destination:
- _fivetran_deleted (BOOLEAN) marks rows that were deleted in the source database.
- _fivetran_synced (UTC TIMESTAMP) indicates the time when Fivetran last successfully synced the row.
We add these columns to give you insight into the state of your data and the progress of your data syncs.
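As a small illustration of how these two columns can be used downstream, the sketch below filters out soft-deleted rows and finds the most recent sync time. The rows are made-up sample data represented as dictionaries, and the helper names `active_rows` and `latest_sync` are hypothetical; in practice you would more likely express the same filter in your destination's query language.

```python
from datetime import datetime, timezone

# Hypothetical rows as they might look after loading into a destination.
rows = [
    {"_id": "a", "_fivetran_deleted": False,
     "_fivetran_synced": datetime(2022, 1, 1, 12, 0, tzinfo=timezone.utc)},
    {"_id": "b", "_fivetran_deleted": True,
     "_fivetran_synced": datetime(2022, 1, 2, 12, 0, tzinfo=timezone.utc)},
]


def active_rows(rows):
    """Keep only rows that still exist in the source."""
    return [row for row in rows if not row["_fivetran_deleted"]]


def latest_sync(rows):
    """Return the most recent successful sync time across all rows."""
    return max(row["_fivetran_synced"] for row in rows)


print([row["_id"] for row in active_rows(rows)], latest_sync(rows))
```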
Type transformations and mapping
As we extract your data, we match Elasticsearch data types to types that Fivetran supports. Our system attempts to infer the types of any columns with data types that we don’t recognize.
The following table illustrates how we transform your Elasticsearch data types into Fivetran supported types:
Elasticsearch Type | Fivetran Type | Fivetran Supported |
---|---|---|
BINARY | BINARY | True |
BOOLEAN | BOOLEAN | True |
TEXT | STRING | True |
KEYWORD | STRING | True |
CONSTANT_KEYWORD | STRING | True |
WILDCARD | STRING | True |
INTEGER | INT | True |
SHORT | SHORT | True |
BYTE | - | False |
DOUBLE | DOUBLE | True |
FLOAT | FLOAT | True |
HALF_FLOAT | FLOAT | True |
SCALED_FLOAT | BIGDECIMAL | True |
LONG | LONG | True |
UNSIGNED_LONG | LONG | True |
DATE | INSTANT | True |
DATE_NANOS | INSTANT | True |
OBJECT | JSON | True |
FLATTENED | JSON | True |
NESTED | JSON | True |
JOIN | JSON | True |
ALIAS | - | False |
Elasticsearch allows you to put more than one value into a field as an array. We do not support syncing Elasticsearch arrays.
If we are missing an important data type that you need, please reach out to support.
In some cases, when loading data into your destination, we may need to convert Fivetran data types into data types that are supported by the destination. For more information, see the individual data destination pages.
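If you want to audit an index mapping against the type table above before syncing, the table can be transcribed into a lookup. This is a sketch only: `TYPE_MAP` copies the rows of the table, `None` marks the unsupported types, and the helper `fivetran_type` is a hypothetical name. Elasticsearch mappings spell type names in lowercase, so the lookup upper-cases its input.

```python
# Transcribed from the type table; None means the type is not supported.
TYPE_MAP = {
    "BINARY": "BINARY", "BOOLEAN": "BOOLEAN",
    "TEXT": "STRING", "KEYWORD": "STRING",
    "CONSTANT_KEYWORD": "STRING", "WILDCARD": "STRING",
    "INTEGER": "INT", "SHORT": "SHORT", "BYTE": None,
    "DOUBLE": "DOUBLE", "FLOAT": "FLOAT", "HALF_FLOAT": "FLOAT",
    "SCALED_FLOAT": "BIGDECIMAL",
    "LONG": "LONG", "UNSIGNED_LONG": "LONG",
    "DATE": "INSTANT", "DATE_NANOS": "INSTANT",
    "OBJECT": "JSON", "FLATTENED": "JSON", "NESTED": "JSON", "JOIN": "JSON",
    "ALIAS": None,
}


def fivetran_type(es_type):
    """Return the mapped type, or None if the table marks it unsupported.

    Raises KeyError for types that the table does not list at all.
    """
    return TYPE_MAP[es_type.upper()]


print(fivetran_type("half_float"), fivetran_type("byte"))
```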
Nested data
If your data is nested, we extract the topmost layer of data and sync the rest as JSON. For example, the following source document…
{
  "foo": 1,
  "bar": 2,
  "nested": {
    "baz": 3
  }
}
…is converted to the following table when we load it into your destination:
foo INTEGER | bar INTEGER | nested JSON |
---|---|---|
1 | 2 | {"baz":3} |
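The transformation in this example can be sketched in a few lines: top-level scalar fields become columns, and any nested object is serialized to a JSON string. The function name `flatten_top_level` is hypothetical, the sketch is not Fivetran's actual code, and it handles only nested objects, not arrays.

```python
import json


def flatten_top_level(document):
    """Turn one source document into a single destination row."""
    row = {}
    for key, value in document.items():
        if isinstance(value, dict):
            # Nested objects are kept whole and stored as compact JSON text.
            row[key] = json.dumps(value, separators=(",", ":"))
        else:
            row[key] = value
    return row


row = flatten_top_level({"foo": 1, "bar": 2, "nested": {"baz": 3}})
print(row)
```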
Excluding source data
If you don’t want to sync all the data from your source database, you can exclude indices from your syncs on your Fivetran dashboard. To do so, go to your connector details page and uncheck the indices you would like to omit from syncing. For more information, see our Data Blocking documentation.