MariaDB is an open source fork of the popular MySQL relational database. Its development is led by some of the original developers of MySQL.
Supported serviceslink
Fivetran supports three different MariaDB database services:
Supported versions | 10.1.2 - 10.4 |
Maximum throughput | 5.0 MBps |
MariaDB versions earlier than 10.1.2 may function correctly with the exception of fractional seconds in TIME, TIMESTAMP, and DATETIME columns. Prior to version 10.1.2, fractional seconds were stored differently, which causes problems with our incremental updates.
Maximum Speed (MBps) measures are based on measured end-to-end update speeds across Fivetran connectors.
Two major factors can cause disparities between our estimates and the exact replication speed for your Fivetran-connected databases: network latency and discrepancies in the format of the data we receive versus how the data is stored at rest in the data destination.
The ability to sync changes quickly also depends on configurable sync frequency. The risk of the sync falling behind, or being unable to keep up with data changes, decreases as the sync frequency increases. Fivetran recommends a higher sync frequency for data sources with a high rate of data changes.
To measure the rate of new data in your database, check the disk space usage metrics over time for databases hosted on cloud providers. For self-hosted databases, you can run the following query to determine disk space usage:
SELECT
table_schema as `Database`,
table_name AS `Table`,
round(((data_length + index_length) / 1024 / 1024), 2) `Size in MB`
FROM information_schema.TABLES
WHERE table_schema = 'DB_NAME'
ORDER BY (data_length + index_length) DESC;
Featureslink
Feature Name | Supported | Notes |
---|---|---|
Capture deletes | check | All tables and fields |
Custom data | check | All tables and fields |
Data blocking | check | Column level, table level, and schema level |
Column hashing | check | |
Re-sync | check | Table level |
History | ||
API configurable | check | |
Priority-first sync | ||
Private networking | check | AWS PrivateLink: Generic MariaDB on EC2, MariaDB RDS Azure Private Link: Azure MariaDB |
Fivetran data models |
Setup guidelink
In your master database, you need to do the following:
- Enable ROW format binary log replication
- Set the
binlog_row_image
binary logging system variable toFULL
in global scope - Set binary log expiration to a minimum of one day (24 hours). We recommend setting the log expiration for seven days.
- Allow access to your MariaDB database via Fivetran’s IP
- (Optional) Allow access to a read-replica of your MariaDB database if you do not want to connect Fivetran to your production instance
- Make sure that the
mysql56_temporal_format
system variable is set to ON (this is the default). - Create a Fivetran-specific MariaDB user with read-level and replication permissions to binary logs
WARNING: This user must be reserved for Fivetran use only. At the beginning of each sync, a Fivetran user will attempt to kill any zombie processes it left behind on previous syncs. If you try to run any operations as this user, they may be killed. Also, if you are creating multiple connectors targeting the same source database, create a unique user for each connector. If you set up multiple connectors with the same Fivetran user, they can kill one another’s connections, slowing down your sync unnecessarily.
For specific instructions on how to set up your database, see the setup guide for your MariaDB database type:
Since MariaDB is a fork of the MySQL code base (up to version 5.5), the setup guides are nearly identical to the MySQL guides.
Sync overviewlink
Once Fivetran is connected to your MariaDB master database or read replica, we pull a full dump of all selected data from your database. We then connect to your binary log to pull all your new and changed data at regular intervals. The binary log is the same change tracking mechanism that MariaDB uses to perform its own replication for backups. If data in the source changes (for example, you add new tables or change a data type), Fivetran automatically detects and persists these changes into your destination.
Schema informationlink
Fivetran tries to replicate the exact schema and tables from your MariaDB source database to your destination according to our standard database update strategies. For every schema in the MariaDB database that you connect, we create a schema in your destination that maps directly to its native schema. This ensures that the data in your destination is in a familiar format to work with.
Fivetran-generated columnslink
Fivetran adds the following columns to every table in your destination:
_fivetran_deleted
(BOOLEAN) marks rows that were deleted in the source database._fivetran_synced
(UTC TIMESTAMP) indicates the time when Fivetran last successfully synced the row._fivetran_index
(INTEGER) shows the order of updates for tables that do not have a primary key._fivetran_id
(STRING) is the hash of the non-Fivetran values of each row. It’s a unique ID that Fivetran uses to avoid duplicate rows in tables that do not have a primary key.
We add these columns to give you insight into the state of your data and the progress of your data syncs.
Type transformations and mappinglink
As we extract your data, we match MariaDB data types to types that Fivetran supports. If we don’t support a certain data type, we automatically change that type to the closest supported type or, for some types, don’t load that data at all. Our system automatically skips columns with data types that we don’t accept or transform.
The following table illustrates how we transform your MariaDB data types into Fivetran supported types:
MariaDB Type | Fivetran Type | Fivetran Supported | Notes |
---|---|---|---|
BINARY | BINARY | True | |
BIGINT | LONG | True | |
BIT | BOOLEAN | True | BIT type with a single digit is supported |
BLOB | BINARY | True | |
CHAR | STRING | True | |
DATE | DATE | True | Invalid values will be loaded as NULL |
DATETIME | TIMESTAMP_NTZ | True | Invalid values will be loaded as NULL. Versions prior to 10.1.2 will break if any precision greater than 0 is used. |
DECIMAL/ NUMERIC | DECIMAL | True | |
DOUBLE | DOUBLE | True | |
ENUM | STRING | True | |
FLOAT | DOUBLE | True | |
GEOMETRY | JSON | True | |
GEOMETRYCOLLECTION | JSON | True | |
JSON | JSON | True | |
INT | INTEGER | True | |
LINESTRING | JSON | True | |
LONGBLOB | BINARY | True | |
LONGTEXT | STRING | True | |
MEDIUMBLOB | BINARY | True | |
MEDIUMINT | INTEGER | True | |
MEDIUMTEXT | STRING | True | |
MULTILINESTRING | JSON | True | |
MULTIPOINT | JSON | True | |
MULTIPOLYGON | JSON | True | |
POINT | JSON | True | |
POLYGON | JSON | True | |
SET | STRING | True | |
SMALLINT | INTEGER | True | |
TIME | STRING | True | Versions prior to 10.1.2 will break if any precision greater than 0 is used. |
TIMESTAMP | TIMESTAMP | True | MariaDB always stores timestamps in UTC Invalid values will be loaded as NULL. Versions prior to 10.1.2 will break if any precision greater than 0 is used. |
TINYBLOB | BINARY | True | |
TINYINT | BOOLEAN | True | If the source TINYINT column has a specified width of exactly 1 (i.e., TINYINT(1) ) and contains values of only 0s and 1s, the destination type will be BOOLEAN. If it contains values other than 0 or 1, the destination type will be INTEGER. |
TINYINT | INTEGER | True | In all other cases, the destination type for TINYINT columns will be INTEGER. If the width isn’t specified to be exactly 1 (either no specification or a value other than 1), the destination type will be INTEGER, even if the column contains only 1s or 0s. |
TINYTEXT | STRING | True | |
UNSIGNED BIGINT | DECIMAL | True | |
UNSIGNED INT | LONG | True | |
UNSIGNED SMALLINT | INTEGER | True | |
VARCHAR | STRING | True | |
VARBINARY | BINARY | True | |
YEAR | INTEGER | True | |
CURVE | False | ||
MULTICURVE | False | ||
MULTISURFACE | False | ||
SURFACE | False |
If we are missing an important data type that you need, please reach out to support.
In some cases, when loading data into your destination, we may need to convert Fivetran data types into data types that are supported by the destination. For more information, see the individual destination pages.
Excluding source datalink
If you don’t want to sync all the data from your master database, you can exclude schemas, tables, or columns from your syncs on your Fivetran dashboard. To do so, go to your connector details page and uncheck the objects you would like to omit from syncing. For more information, see our Column Blocking documentation.
Alternatively, you can change the permissions of the Fivetran user you created and restrict its access to certain tables or columns.
Note that column permissions will not shield confidential data, such as PII (Personally Identifiable Information), from Fivetran because they do not apply to binary logs (binlogs). Via your binlogs, Fivetran has access to the full contents of any changed rows, including columns that have no SELECT permissions. However, we filter out the prohibited columns at the earliest possible stage of our syncs and do not load them into your destination.
Initial synclink
When Fivetran connects to a new database, we first copy all rows from every table in every schema for which we have SELECT permission (except those you have excluded in your Fivetran dashboard) and add Fivetran-generated columns. Tables are copied in ascending size order (from smallest to largest). We copy rows by performing a SELECT statement on each table. For large tables, we copy a limited number of rows at a time so that we don’t have to start the sync over from the beginning if our connection is lost midway.
The duration of initial syncs can vary depending on the number and size of tables to be imported. We therefore interleave incremental updates with the table imports during the initial sync.
Updating datalink
Once the initial sync is complete, Fivetran performs incremental updates of any new or modified data from your source database. We use your binlogs to request only the data that has changed since our last sync. We require the binlogs to be in row format, so that they contain a separate event in each row. We pull INSERT, UPDATE, and DELETE events from your binlogs at regular intervals and upload them to your destination.
How we load UPDATE events into your destination depends on whether or not the table has a primary key. To find out which of your tables have primary keys, run this query in your source database:
SELECT
max(tc.CONSTRAINT_TYPE = 'PRIMARY KEY') IS NOT NULL
AND max(tc.CONSTRAINT_TYPE = 'PRIMARY KEY') = TRUE AS has_primary_key,
c.TABLE_SCHEMA AS table_schema,
c.TABLE_NAME AS table_name
FROM information_schema.COLUMNS c
LEFT JOIN information_schema.KEY_COLUMN_USAGE k ON k.TABLE_SCHEMA = c.TABLE_SCHEMA AND
k.TABLE_NAME = c.TABLE_NAME AND
k.COLUMN_NAME = c.COLUMN_NAME
LEFT JOIN information_schema.TABLE_CONSTRAINTS tc ON tc.TABLE_SCHEMA = k.TABLE_SCHEMA AND
tc.TABLE_NAME = k.TABLE_NAME AND
tc.CONSTRAINT_NAME = k.CONSTRAINT_NAME
LEFT JOIN information_schema.TABLES t ON t.TABLE_SCHEMA = c.TABLE_SCHEMA AND
t.TABLE_NAME = c.TABLE_NAME
WHERE t.TABLE_TYPE = 'BASE TABLE'
AND c.TABLE_SCHEMA NOT IN ('performanceschema', 'informationschema', 'mariadb', 'innodb', 'lookerscratch', 'tmp')
GROUP BY c.TABLE_SCHEMA, c.TABLE_NAME ORDER BY has_primary_key, c.TABLE_SCHEMA, c.TABLE_NAME;
Tables with a primary keylink
We merge changes to tables with primary keys into the corresponding tables in your destination:
- An INSERT in the source table generates a new row in the destination with
_fivetran_deleted = FALSE
. - A DELETE in the source table updates the corresponding row in the destination with
_fivetran_deleted = TRUE
. - An UPDATE in the source updates the data in the corresponding row in the destination.
Tables without a primary keylink
We handle changes to tables without a primary key differently:
- An INSERT in the source table generates a new row in the destination with
_fivetran_deleted = FALSE
. - The
_fivetran_id
column helps us handle DELETE operations:- If there is a row in the destination that has a corresponding
_fivetran_id
value, that row will be updated with_fivetran_deleted = TRUE
. - If there is not a row in the destination that has a corresponding
_fivetran_id
value, a new row will be added with_fivetran_deleted = TRUE
.
- If there is a row in the destination that has a corresponding
- An UPDATE in the source table is treated as a DELETE followed by an INSERT, so it results in two rows in the destination:
- A row containing the old values with
_fivetran_deleted = TRUE
- A row containing the new values with
_fivetran_deleted = FALSE
- A row containing the old values with
As a result, one record in your source database may have several corresponding rows in your destination. For example, suppose you have a products
table in your source database with no primary key:
description | quantity |
---|---|
Shrink-ray gun | 1 |
Boogie robot | 2 |
Cookie robot | 3 |
You load this table into your destination during your initial sync, creating this destination table:
description | quantity | _fivetran_synced | _fivetran_index | _fivetran_deleted | _fivetran_id |
---|---|---|---|---|---|
Shrink-ray gun | 1 | ‘2000-01-01 00:00:00’ | 0 | FALSE | asdf |
Cookie robot | 2 | ‘2000-01-01 00:00:00’ | 1 | FALSE | dfdf |
Boogie robot | 3 | ‘2000-01-01 00:00:00’ | 2 | FALSE | ewra |
You then update a row:
UPDATE products SET quantity = 4 WHERE description = 'Cookie robot';
After your UPDATE operation, your destination table will look like this:
description | quantity | _fivetran_synced | _fivetran_index | _fivetran_deleted | _fivetran_id |
---|---|---|---|---|---|
Shrink-ray gun | 1 | ‘2000-01-01 00:00:00’ | 0 | FALSE | asdf |
Cookie robot | 2 | ‘2000-01-01 00:00:00’ | 3 | TRUE | dfdf |
Boogie robot | 3 | ‘2000-01-01 00:00:00’ | 2 | FALSE | ewra |
Cookie robot | 4 | ‘2000-01-01 00:00:00’ | 4 | FALSE | zxfd |
You then delete a row:
DELETE FROM products WHERE description = 'Boogie robot';
After your DELETE operation, your destination table will look like this:
description | quantity | _fivetran_synced | _fivetran_index | _fivetran_deleted | _fivetran_id |
---|---|---|---|---|---|
Shrink-ray gun | 1 | ‘2000-01-01 00:00:00’ | 0 | FALSE | asdf |
Cookie robot | 2 | ‘2000-01-01 00:00:02’ | 3 | TRUE | dfdf |
Cookie robot | 4 | ‘2000-01-01 00:00:02’ | 4 | FALSE | zxfd |
Boogie robot | 3 | ‘2000-01-01 00:00:02’ | 5 | TRUE | ewra |
So, while there may be just one record in your source database where description = Cookie robot
, there are two in your destination - an old version where _fivetran_deleted = TRUE
, and a new version where _fivetran_deleted = FALSE
.
We also de-duplicate rows before we load them into your destination. We use the _fivetran_id
field, which is the hash of the non-Fivetran values in every row, to avoid creating multiple rows with identical contents. If, for example, you have the following table in your source:
description | quantity |
---|---|
Shrink-ray gun | 1 |
Shrink-ray gun | 1 |
Shrink-ray gun | 1 |
Then your destination table will look like this:
description | quantity | _fivetran_synced | _fivetran_index | _fivetran_deleted | _fivetran_id |
---|---|---|---|---|---|
Shrink-ray gun | 1 | ‘2000-01-01 00:00:00’ | 0 | FALSE | asdf |
Deleted rowslink
We don’t delete rows from the destination, though the way for how we process deletes differs for tables with primary keys and tables without primary keys. We handle deletes as part of streaming changes from the binlog. Note that we only process DELETE events from the binlog.
Unsupported MariaDB commandslink
Fivetran does not support several MariaDB commands:
TRUNCATE
DROP
(and re-creating the dropped table withCREATE
)LOAD
RENAME
CASCADING DELETES
CASCADING UPDATES
If you use any of these unsupported commands to delete or update the contents of a table, your changes will not be recorded in the binlogs. As a result, those records won’t be replicated correctly in your destination.
Migrating service providerslink
If you want to migrate service providers, we will need to do a full re-sync of your data because the new service provider won’t retain the same change tracking data as your original MariaDB database.