Follow these instructions to replicate your Amazon RDS PostgreSQL database to your destination using Fivetran.
Prerequisites
To connect your Amazon RDS PostgreSQL database to Fivetran, you need:
- PostgreSQL version 8.4 - 14.x
- An RDS account with an account administrator role
- Access to your database server
- Access to your database host’s machine
- Your database host’s IP (e.g., 1.2.3.4) or domain (your.server.com)
- Your database’s port (usually 5432)
- TLS enabled on your database. Follow Amazon’s TLS setup instructions to enable TLS on your database.
- (If you want to connect using SSH) An SSH server
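The version, TLS, and port prerequisites can be checked from any SQL client connected to the database. The queries below are a minimal sketch that uses only standard PostgreSQL settings; they assume your session is allowed to read them.

```sql
-- Report the server version; compare it with the supported range above.
SHOW server_version;

-- Report whether the server accepts TLS connections ('on' when enabled).
SHOW ssl;

-- Report the port the server listens on (5432 unless you changed the default).
SHOW port;
```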
Setup instructions
Choose incremental sync mechanism
To keep your data up to date after the initial sync, we use one of the following incremental sync methods:
- logical replication with the pgoutput plugin
- logical replication with the test_decoding plugin
- XMIN
- Fivetran Teleport Sync BETA
The first three methods keep a record of recent data changes, which allows Fivetran to update only the data that has changed since our last sync. Fivetran Teleport Sync instead takes snapshots of tables to calculate differences.
TIP: We recommend using logical replication as your incremental update mechanism because it is faster than XMIN replication and allows Fivetran to detect deleted rows for tables with primary keys. Learn more in our Updating data documentation.
Choose logical replication with the pgoutput plugin, logical replication with the test_decoding plugin, XMIN, or Fivetran Teleport Sync as your incremental update mechanism. You will configure your incremental update mechanism in later steps.
Logical replication with the pgoutput plugin
IMPORTANT: Because of AWS limitations, you can only use logical replication if you connect Fivetran to your Amazon RDS PostgreSQL master instance. You can only enable logical replication with the pgoutput plugin if your PostgreSQL version is 10.4 or later. Prior minor versions have breaking bugs.
Logical replication is based on logical decoding of the PostgreSQL write-ahead log (WAL). Fivetran reads the WAL using the pgoutput plugin to detect any new or changed data. This plugin replicates from your custom publication without needing additional libraries.
To learn more, see our logical replication with the pgoutput plugin documentation.
Logical replication with the test_decoding plugin
IMPORTANT: Because of AWS limitations, you can only use logical replication if you connect Fivetran to your Amazon RDS PostgreSQL master instance. You can only enable logical replication with the test_decoding plugin if your PostgreSQL server version is 9.4 - 13.2. Prior minor versions have breaking bugs.
Logical replication is based on logical decoding of the PostgreSQL write-ahead log (WAL). Fivetran reads the WAL using the test_decoding plugin to detect any new or changed data. This plugin receives WAL changes through the logical decoding mechanism and converts them into human-readable text.
To learn more, see our logical replication with the test_decoding plugin documentation.
XMIN
NOTE: You can use XMIN for any Amazon RDS PostgreSQL deployment, including a read replica.
The XMIN method is based on the hidden xmin system column that is present in all PostgreSQL tables. With XMIN, Fivetran must scan every table in full to detect updated data. We do not recommend XMIN for near real-time data needs because XMIN replication is slower than logical replication and doesn’t allow Fivetran to detect deleted rows.
Learn more in our XMIN documentation.
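To see the column that the XMIN method relies on, you can select it by name. This is an illustrative sketch only: "some_schema"."some_table" is a placeholder, and the query is not the one Fivetran itself runs.

```sql
-- xmin is a system column and is not returned by SELECT *,
-- so it must be named explicitly.
-- "some_schema"."some_table" is a placeholder for one of your tables.
SELECT xmin, *
FROM "some_schema"."some_table"
LIMIT 10;
```

Each xmin value is the ID of the transaction that wrote that row version. A deleted row no longer appears in the result at all, which is why this method cannot detect deletes.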
Fivetran Teleport Sync BETA
Fivetran Teleport Sync is a proprietary database replication method that allows Fivetran to incrementally replicate your database with no additional setup other than a read-only SQL connection.
Learn more in our Fivetran Teleport Sync documentation.
Create read replica (XMIN or Fivetran Teleport Sync only)
IMPORTANT: You can only use a read replica if you chose XMIN or Fivetran Teleport Sync as your incremental update mechanism.
If you’d like, create a read replica for Fivetran’s exclusive use. Using a read replica allows Fivetran to integrate your data without putting unnecessary load on or interrupting the queries running on your master server. We recommend that you connect a read replica to Fivetran, but it’s not required.
If you chose logical replication as your incremental update mechanism, want to connect Fivetran to your master database, or already have a read replica, skip ahead to Step 4.
1. In your Amazon RDS Dashboard, select the PostgreSQL instance you want to replicate.
2. Click Actions, then select Create read replica from the drop-down menu.
3. In the Instance specifications section, specify the instance type for the read replica. It does not need to be as large as your master instance.
4. In the Network Security section, set the Public accessibility setting to Yes to ensure that the read replica is accessible from outside your VPC.
5. In the Settings section, enter your chosen instance ID.
6. Click Create read replica.
7. The replica’s status should now be creating. It will take a few minutes for the read replica to finish being created. The status will change to available when it is done.
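Once the replica is available, you can confirm which kind of instance a given connection points at. The query below is a small sketch using a built-in PostgreSQL function; it matters because logical replication has to be configured on the master instance, while XMIN and Teleport can use a replica.

```sql
-- Returns true on a read replica (standby) and false on the master instance.
SELECT pg_is_in_recovery();
```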
Enable database access
Grant Fivetran’s data processing servers access to your database server. How you grant access depends on whether or not your database instance is in a VPC.
If your instance is in a VPC, you must configure the two mechanisms that control access: VPC security groups and network access control lists (ACLs). If your instance is not in a VPC, you only need to configure security groups.
Configure security group
NOTE: These instructions assume that your database instance is in a VPC. If your database instance is not in a VPC, you can still use these instructions because configuring a non-VPC security group is an almost identical process.
1. In your Amazon RDS dashboard, click on the database instance you want to connect to Fivetran.
2. In the Connectivity & security section, find the database’s port number and make a note of it. You will need the port number to configure Fivetran.
3. In the Security column, click the link to the database instance’s security group.
4. In the Security Group panel, click Actions, then select Edit inbound rules from the drop-down menu.
5. Click Add Rule. This creates a new Custom TCP Rule at the bottom of the list.
6. Fill in the new Custom TCP Rule.
   - In the Port Range field, enter your database instance’s port number. If you created a new read replica for Fivetran, this is the port number that you found in step 2 of this section. (The port will be 5432 for direct connections, unless you changed the default.)
   - What you enter in the Custom IP field depends on whether you’re connecting directly or using an SSH tunnel.
     - If you’re connecting directly, enter Fivetran’s IPs for your database’s region.
     - If you’re connecting using an SSH tunnel, enter {your-ssh-tunnel-server-ip-address}/32.
   - (Optional) Enter a brief description in the Description field.
7. Click Save rules.
Configure network ACLs (VPC only)
If your database instance is not in a VPC, skip ahead to Step 5.
1. Return to the RDS dashboard.
2. Click on your database instance.
3. Click the link to the instance’s VPC.
4. Select the VPC ID.
5. In the Summary tab, click the Network ACL link. You will see tabs for Inbound Rules and Outbound Rules. You must edit both.
Edit inbound rules
1. Select Inbound Rules.
2. If you have a default VPC that was automatically created by AWS, the settings already allow all incoming traffic. To verify that the settings allow incoming traffic, confirm that the Source value is 0.0.0.0/0 and that the ALLOW entry is listed above the DENY entry.
3. If your inbound rules don’t include an ALL - 0.0.0.0/0 - ALLOW entry, edit the rules to allow the Source to access the port number of your database instance. (The port will be 5432 for direct connections, unless you changed the default.) For additional help, see Amazon’s Network ACLs documentation.
   - If you’re connecting directly, enter Fivetran’s IPs for your database’s region.
   - If you’re connecting using an SSH tunnel, enter {your-ssh-tunnel-server-ip-address}/32.
Edit outbound rules
1. Select Outbound Rules.
2. If you have a default VPC that was automatically created by AWS, the settings already allow all outbound traffic. To verify that the settings allow outbound traffic, confirm that the Destination value is 0.0.0.0/0 and that the ALLOW entry is listed above the DENY entry.
3. If your outbound rules don’t include an ALL - 0.0.0.0/0 - ALLOW entry, edit the rules to allow outbound traffic to all ports 1024-65535 for the following Destinations:
   - If you’re connecting directly, enter Fivetran’s IPs for your database’s region.
   - If you’re connecting using an SSH tunnel, enter {your-ssh-tunnel-server-ip-address}/32.
Create user
Create a database user for Fivetran’s exclusive use.
1. Open a connection to your Amazon RDS PostgreSQL database.
2. Create a user for Fivetran by executing the following SQL command. Replace <username> and some-password with a username and password of your choice.
   CREATE USER <username> PASSWORD 'some-password';
Grant user read-only access
Grant the Fivetran user read-only access to all tables by running the following commands. To grant access to a schema other than PostgreSQL’s default public schema, replace public with the schema name.
GRANT USAGE ON SCHEMA "public" TO <username>;
GRANT SELECT ON ALL TABLES IN SCHEMA "public" TO <username>;
ALTER DEFAULT PRIVILEGES IN SCHEMA "public" GRANT SELECT ON TABLES TO <username>;
NOTE: The last command makes sure that any future tables will be accessible to Fivetran.
If you want to grant access to multiple schemas, you must run these three commands for each schema.
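When several schemas are involved, the three commands can be wrapped in a loop instead of being repeated by hand. The block below is a sketch under stated assumptions: fivetran_user stands in for the user you created, and public and sales are hypothetical schema names. It relies on format() and FOREACH, which are available in PostgreSQL 9.1 and later.

```sql
-- Assumptions: 'fivetran_user' is the user created earlier, and
-- 'public' and 'sales' are the schemas to expose. Replace all three.
DO $$
DECLARE
  s text;
BEGIN
  FOREACH s IN ARRAY ARRAY['public', 'sales'] LOOP
    -- %I quotes each value as an identifier, so unusual names stay safe.
    EXECUTE format('GRANT USAGE ON SCHEMA %I TO %I', s, 'fivetran_user');
    EXECUTE format('GRANT SELECT ON ALL TABLES IN SCHEMA %I TO %I', s, 'fivetran_user');
    EXECUTE format('ALTER DEFAULT PRIVILEGES IN SCHEMA %I GRANT SELECT ON TABLES TO %I', s, 'fivetran_user');
  END LOOP;
END
$$;
```

Keep in mind that ALTER DEFAULT PRIVILEGES without a FOR ROLE clause affects only tables later created by the role that runs it. If a different role creates your tables, run the block as that role or add FOR ROLE to the statement.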
Restrict access to tables (optional)
If you want to limit Fivetran’s access to your tables, grant the Fivetran user access to only the tables that you would like to sync. You must individually grant access for each table that you want to sync. You cannot grant access to all tables and then revoke access for a subset of tables.
1. Ensure that the Fivetran user has access to the schema that contains your table(s).
   GRANT USAGE ON SCHEMA "some_schema" TO <username>;
2. Revoke any previously granted permissions to all tables in that schema.
   ALTER DEFAULT PRIVILEGES IN SCHEMA "some_schema" REVOKE SELECT ON TABLES FROM <username>;
   REVOKE SELECT ON ALL TABLES IN SCHEMA "some_schema" FROM <username>;
3. Repeat the following command for each table you want Fivetran to sync.
   GRANT SELECT ON "some_schema"."some_table" TO <username>;
4. By default, any tables that you create in the future will be excluded from the Fivetran user’s access. To grant access to new tables, run the following command.
   ALTER DEFAULT PRIVILEGES IN SCHEMA "some_schema" GRANT SELECT ON TABLES TO <username>;
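After granting access table by table, it can help to list what the Fivetran user can actually read. The query below is a minimal sketch using the built-in has_table_privilege function; <username> and some_schema are the same placeholders used in the steps above.

```sql
-- Lists the tables in some_schema that the Fivetran user can SELECT from.
-- Replace <username> and some_schema with your own values.
SELECT schemaname, tablename
FROM pg_tables
WHERE schemaname = 'some_schema'
  AND has_table_privilege('<username>',
                          format('%I.%I', schemaname, tablename),
                          'SELECT')
ORDER BY tablename;
```

Any table missing from the result is one the Fivetran user cannot read, so it will not be available to sync.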
Restrict access to columns (optional)
If you want to limit Fivetran’s access to the columns in your tables, grant the Fivetran user access to only certain columns. You must individually grant access for each column that you want to sync.
1. Ensure that you have revoked any previously granted permission to read all columns in the table.
   REVOKE SELECT ON "some_schema"."some_table" FROM <username>;
2. Grant permission to the specific columns you want to sync (for example, some_column and other_column).
   NOTE: If you chose XMIN as your incremental update mechanism, you must grant us access to the hidden system column xmin.
   GRANT SELECT (xmin, "some_column", "other_column") ON "some_schema"."some_table" TO <username>;
Once you restrict access to columns within a table, the Fivetran user will not have access to any new columns added to that table in the future. To grant access to new columns, you must rerun the command above.
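Column-level grants can be checked with the built-in has_column_privilege function. The sketch below tests the two example columns named above; all identifiers are placeholders to replace with your own.

```sql
-- Each expression returns true when the Fivetran user may SELECT that column.
SELECT
  has_column_privilege('<username>', '"some_schema"."some_table"',
                       'some_column', 'SELECT')  AS some_column_readable,
  has_column_privilege('<username>', '"some_schema"."some_table"',
                       'other_column', 'SELECT') AS other_column_readable;
```

Running the same check for a newly added column is a quick way to see whether the grant needs to be rerun.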
Configure incremental updates
Configure your chosen incremental update mechanism.
Logical replication with the test_decoding plugin
To enable logical replication with the test_decoding plugin, follow these steps:
1. Go to your Amazon RDS PostgreSQL master database. You cannot enable logical replication on a read replica.
2. Ensure that your server has ample free space for the logs. As soon as Fivetran processes a log, we delete it. However, we don’t delete logs if the sync is interrupted (for example, if we lose access to your database). In this case, logs may accumulate on your server and consume additional storage. The amount of additional disk space that these logs consume is proportional to the number of changes committed on the server. If we can’t resume a lost connection quickly enough and you need more disk space, you can drop the replication slot, which deletes its unconsumed logs.
3. Set the rds.logical_replication parameter to 1 by following these steps:
   i. Create a new parameter group (non-default group).
   ii. Enable the rds.logical_replication flag in that group by setting the value to 1.
   iii. Set wal_sender_timeout to 0 to disable the timeout mechanism.
      NOTE: Disabling the timeout mechanism ensures that the database won’t end the wal_sender process before Fivetran is able to establish a connection. This helps us stay up to date with the latest changes in the database’s WAL slot.
   iv. Apply the parameter group to the database.
   v. Wait until the status changes to pending-reboot, then reboot the database to apply the new parameter group.
4. Log in to a PostgreSQL console (such as a SQL workbench or psql) as a superuser. Superusers have the rds_superuser role.
5. Create a logical replication slot for the database you want to sync by running the following command. You must use the output plugin test_decoding supplied in the postgresql-contrib subpackage.
   IMPORTANT: The replication slot name fivetran_replication_slot quoted throughout this guide is used purely as an example. The actual replication slot name should be unique for every connector using the same PostgreSQL cluster. Replication slot names cannot start with a number.
   SELECT pg_create_logical_replication_slot('fivetran_replication_slot', 'test_decoding');
6. Grant permission to the Fivetran user for reading the replication slot.
   GRANT rds_replication TO <username>;
   NOTE: The Fivetran user does not need the rds_superuser role.
7. Log in as the Fivetran user.
8. Verify that the Fivetran user can read the replication slot by running the following command:
   SELECT count(*) FROM pg_logical_slot_peek_changes('fivetran_replication_slot', null, null);
   If the query succeeds, then permissions are sufficient.
IMPORTANT: You must periodically tune the checkpoint_timeout and max_wal_size parameters based on your PostgreSQL database operations. If you do not, you may experience replication failures. To learn how to tune, read this tuning checkpoints documentation.
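After the reboot, the settings and the slot from the steps above can be inspected from a SQL session. This is a hedged sketch: the expected values in the comments are what these settings normally report on RDS once the parameter group is in effect, and fivetran_replication_slot is the example slot name from this guide.

```sql
-- Should report 'on' once the parameter group is applied and the instance rebooted.
SHOW rds.logical_replication;

-- Should report 'logical' when logical replication is enabled.
SHOW wal_level;

-- Should report 0 if the timeout was disabled as described above.
SHOW wal_sender_timeout;

-- Confirms that the slot exists, which plugin it uses, and which database it belongs to.
SELECT slot_name, plugin, slot_type, database, active
FROM pg_replication_slots
WHERE slot_name = 'fivetran_replication_slot';
```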
Logical replication with the pgoutput plugin
To enable logical replication with the pgoutput plugin, follow these steps:
1. Go to your Amazon RDS PostgreSQL master database. You cannot enable logical replication on a read replica.
2. Ensure that your server has ample free space for the logs. As soon as Fivetran processes a log, we delete it. However, we don’t delete logs if the sync is interrupted (for example, if we lose access to your database). In this case, logs may accumulate on your server and consume additional storage. The amount of additional disk space that these logs consume is proportional to the number of changes committed on the server. If we can’t resume a lost connection quickly enough and you need more disk space, you can drop the replication slot, which deletes its unconsumed logs.
3. Set the rds.logical_replication parameter to 1 by following these steps:
   i. Create a new parameter group (non-default group).
   ii. Enable the rds.logical_replication flag in that group by setting the value to 1.
   iii. Set wal_sender_timeout to 0.
   iv. Apply the parameter group to the database.
   v. Wait until the status changes to pending-reboot, then reboot the database to apply the new parameter group.
4. Log in to a PostgreSQL console (such as a SQL workbench or psql) as a superuser. Superusers have the rds_superuser role.
5. Create a logical replication slot for the database you want to sync by running the following command. You must use the standard output plugin pgoutput.
   IMPORTANT: The replication slot name fivetran_pgoutput_slot quoted throughout this guide is used purely as an example. The actual replication slot name should be unique for every connector using the same PostgreSQL cluster. Replication slot names cannot start with a number.
   SELECT pg_create_logical_replication_slot('fivetran_pgoutput_slot', 'pgoutput');
6. Create a publication for your tables. If you want, you can create a publication for only certain tables so that you can add or remove tables from the publication later on. Only changes from tables in the publication are replicated to Fivetran. Each database can have multiple distinct publications. You must have CREATE privileges or above to run this command.
   IMPORTANT: The publication name fivetran_pub quoted throughout this guide is used purely as an example. The actual publication name should be unique for every database and cannot start with a number.
   CREATE PUBLICATION fivetran_pub FOR TABLE table2, table4, table8;
   To add or remove a table from a publication, run one of the following commands. You must have ownership rights over the table(s).
   ALTER PUBLICATION fivetran_pub ADD TABLE table_name;
   ALTER PUBLICATION fivetran_pub DROP TABLE table_name;
   Alternatively, you can create a publication for all of your tables. However, you cannot remove any table from this publication later on. You must have superuser privileges to run this command.
   CREATE PUBLICATION fivetran_pub FOR ALL TABLES;
   (Optional) You can choose which operations to include in the publication. For example, the following publication includes only INSERT and UPDATE operations.
   CREATE PUBLICATION insert_only_pub FOR TABLE table1 WITH (publish = 'INSERT, UPDATE');
7. Verify that your chosen tables are in the publication.
   SELECT * FROM pg_publication_tables;
8. Grant the Fivetran user permission to read the replication slot.
   GRANT rds_replication TO <username>;
9. Log in as the Fivetran user.
10. Verify that the Fivetran user can read the replication slot by running the following command. Replace fivetran_pgoutput_slot with your replication slot name and fivetran_pub with the publication name.
   SELECT count(*) FROM pg_logical_slot_peek_binary_changes('fivetran_pgoutput_slot', null, null, 'proto_version', '1', 'publication_names', 'fivetran_pub');
   If the query succeeds, then permissions are sufficient.
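If a database holds more than one publication, the verification in the steps above is easier to read when it is narrowed to a single publication. The queries below are a sketch against the standard pg_publication_tables and pg_publication catalogs, using the example name fivetran_pub.

```sql
-- Tables included in the example publication.
SELECT pubname, schemaname, tablename
FROM pg_publication_tables
WHERE pubname = 'fivetran_pub'
ORDER BY schemaname, tablename;

-- Operations the publication replicates, and whether it covers all tables.
SELECT pubname, puballtables, pubinsert, pubupdate, pubdelete
FROM pg_publication
WHERE pubname = 'fivetran_pub';
```

A publication created with a restricted publish list, such as the optional INSERT and UPDATE example, shows false in the pubdelete column.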
IMPORTANT: You must periodically tune the checkpoint_timeout and max_wal_size parameters based on your PostgreSQL database operations. If you do not, you may experience replication failures. To learn how to tune, read this tuning checkpoints documentation.
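Because unconsumed logs accumulate while a sync is interrupted, it is useful to watch how much WAL each slot is holding back. The following is a minimal monitoring sketch, not a Fivetran-provided query. It uses the PostgreSQL 10+ function names, which fits the pgoutput version requirement, and must be run on the master instance.

```sql
-- Approximate amount of WAL retained on disk for each replication slot.
SELECT slot_name,
       active,
       pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS retained_wal
FROM pg_replication_slots;

-- Last resort when disk space runs out: drop the slot to release its logs.
-- The call fails while the slot is still in active use.
-- SELECT pg_drop_replication_slot('fivetran_pgoutput_slot');
```

The drop statement is left commented out on purpose, since it discards the slot's unconsumed logs as described in the steps above.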
XMIN
You do not need to do any additional configuration for the XMIN method.
Fivetran Teleport Sync BETA
If you are trying to connect with a standby or read replica, run the following SQL command:
CREATE AGGREGATE BIT_XOR(IN v bigint) (SFUNC = int8xor, STYPE = bigint);
If you are not connecting with a read replica, you do not need to do any additional configuration. The aggregate that the Teleport mechanism will later use is automatically created for you.
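To check whether an aggregate named bit_xor is already present, you can query the system catalog. This is a sketch only. Two assumptions are worth stating: a read replica cannot execute DDL, so the CREATE AGGREGATE statement would presumably be run on the master instance and reach the replica through replication, and PostgreSQL 14 already ships a built-in bit_xor in the pg_catalog schema.

```sql
-- Lists every function or aggregate named bit_xor and the schema it lives in.
SELECT n.nspname AS schema_name,
       p.proname AS aggregate_name
FROM pg_proc p
JOIN pg_namespace n ON n.oid = p.pronamespace
WHERE p.proname = 'bit_xor';
```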
Finish Fivetran configuration
1. In your connector setup form, enter a destination schema prefix. This prefix applies to each replicated schema and cannot be changed once your connector is created.
2. In the Host field, enter your database host’s IP (for example, 1.2.3.4) or domain (for example, your-database.cp0rdhwjbsae.us-east-1.rds.amazonaws.com).
3. Enter your database instance’s port number. The port will be 5432, unless you changed the default.
4. Enter the Fivetran-specific user that you created in Step 5.
5. Enter the password for the Fivetran-specific user that you created in Step 5.
6. Enter the name of your database (for example, your_database).
7. Choose your connection method. If you selected Connect via an SSH tunnel, provide the following information:
   - SSH host (do not use a load balancer’s IP address/hostname)
   - SSH port
   - SSH user
8. Choose your update method. If you selected Logical replication of the WAL using the test_decoding plugin, enter the name of your database’s replication slot. If you selected Logical replication of the WAL using pgoutput plugin, enter both the name of your database’s replication slot and publication name accordingly.
9. Click Save & Test. Fivetran tests and validates our connection to your Amazon RDS PostgreSQL database. Upon successful completion of the setup tests, you can sync your data using Fivetran.
Setup tests
Fivetran performs the following tests to ensure that we can connect to your PostgreSQL RDS database and that it is properly configured:
- The Connecting to SSH Tunnel Test validates the SSH tunnel details you provided in the setup form. It then checks that we can connect to your database using the SSH Tunnel. (We skip this test if you aren’t connecting using SSH.)
- The Connecting to Host Test validates the database credentials you provided in the setup form. The test verifies that the host is not private and then checks the connectivity to the host.
- The Validating Certificate Test generates a pop-up window where you must choose which certificate you want Fivetran to use. It then validates that certificate and checks that we can connect to your database using TLS. (We skip this test if you aren’t connecting directly.)
- The Connecting to Database Test checks that we can access your database.
- The Connecting to WAL Replication Slot Test confirms that the database associated with the replication slot matches the name you supplied in the setup form. It then verifies that the replication slot uses the pgoutput plugin if you selected the WAL with pgoutput update method, or the test_decoding plugin if you selected the WAL with test_decoding update method. Lastly, it makes sure that the Fivetran user has replication privileges. (We skip this test if you selected XMIN as your incremental update mechanism.)
- The Checking Configuration Values Test checks a set of WAL-configured values against the recommended settings and detects if they are below the recommended range. (We skip this test if you selected XMIN as your incremental update mechanism.)
- The Publication Test verifies that the supplied publication name exists in your database. (We skip this test if you selected XMIN or WAL with the test_decoding plugin as your incremental update mechanism.)
- The Validating Speed Setup Test validates that Fivetran can fetch data from your source database quickly enough. During this test, we measure our ability to download sample data from your source database to Fivetran, but we do not perform a full sync. We start a timer, then download the sample data in memory. We then calculate the connector speed based on how much data we downloaded and how long it took to download. The test shows a warning if the download speed is less than 5 MB/sec.
NOTE: The tests may take a few minutes to finish running.
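Some of the conditions these tests look at can be inspected in advance from a SQL session. The queries below only approximate the checks and are not what Fivetran itself runs. They assume the pgoutput example names from this guide (fivetran_pgoutput_slot, fivetran_pub) and the <username> placeholder.

```sql
-- The slot exists, belongs to the expected database, and uses the expected plugin.
SELECT slot_name, plugin, database
FROM pg_replication_slots
WHERE slot_name = 'fivetran_pgoutput_slot';

-- Returns true when the Fivetran user is a member of rds_replication.
SELECT pg_has_role('<username>', 'rds_replication', 'MEMBER');

-- The publication exists (relevant to the pgoutput method only).
SELECT pubname
FROM pg_publication
WHERE pubname = 'fivetran_pub';
```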