Connector Improvement: Improve handling of large historical resyncs for Postgres

I'm trying to run an initial sync of a large Aurora Postgres database into my new Snowflake destination. I currently sync this database into our Redshift destination. However, when I try to historically sync all tables from this database into Snowflake, I run into errors. I've worked with Michael Chau from Fivetran support (support case #95522), and he suggested that I batch the historical sync by table and schema: historically sync all tables in one schema, then move on to a second schema, and so on. This works around any throughput constraints I might be hitting (see the thread for case #95522 for details). The Postgres database is around 700 GB on disk, spread across 600-700 tables.

Feature request: have Fivetran automate this batching. I'm planning to add one schema, let it sync, add another schema, let that one sync, and so on until I have synced all tables in all schemas. It would be great if Fivetran could instead:

1. Automatically detect these "large batch" situations.
2. On detecting one, automate the sub-batch syncs, i.e. do what I'm doing manually, but in a for loop.
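
As a rough illustration of the two steps above, here is a minimal Python sketch. It is not Fivetran's implementation and calls no real Fivetran API: the function names, the 200 GB thresholds, the per-schema sizes, and the `enable_schemas` and `wait_for_sync` callbacks are all hypothetical placeholders for whatever the connector would actually do.

```python
from typing import Callable, Dict, List

# Hypothetical threshold: above this total size, treat the sync as a "large batch".
LARGE_SYNC_THRESHOLD_GB = 200.0


def is_large_sync(schema_sizes_gb: Dict[str, float],
                  threshold_gb: float = LARGE_SYNC_THRESHOLD_GB) -> bool:
    """Step 1: detect a "large batch" situation from the total on-disk size."""
    return sum(schema_sizes_gb.values()) > threshold_gb


def plan_batches(schema_sizes_gb: Dict[str, float],
                 max_batch_gb: float) -> List[List[str]]:
    """Group schemas into batches, closing a batch before it would exceed max_batch_gb.

    Schemas are taken largest first. A single schema larger than
    max_batch_gb still gets a batch of its own.
    """
    batches: List[List[str]] = []
    current: List[str] = []
    current_gb = 0.0
    ordered = sorted(schema_sizes_gb.items(), key=lambda kv: kv[1], reverse=True)
    for schema, size_gb in ordered:
        if current and current_gb + size_gb > max_batch_gb:
            batches.append(current)
            current, current_gb = [], 0.0
        current.append(schema)
        current_gb += size_gb
    if current:
        batches.append(current)
    return batches


def run_batched_sync(batches: List[List[str]],
                     enable_schemas: Callable[[List[str]], None],
                     wait_for_sync: Callable[[], None]) -> List[str]:
    """Step 2: the for loop. Enable one batch, wait for it to sync, repeat."""
    synced: List[str] = []
    for batch in batches:
        enable_schemas(batch)   # hypothetical: add these schemas to the connector
        wait_for_sync()         # hypothetical: block until the historical sync finishes
        synced.extend(batch)
    return synced


if __name__ == "__main__":
    # Made-up schema sizes that add up to roughly 700 GB.
    sizes = {"orders": 300.0, "events": 250.0, "users": 100.0, "misc": 50.0}
    if is_large_sync(sizes):
        plan = plan_batches(sizes, max_batch_gb=350.0)
        run_batched_sync(plan,
                         enable_schemas=lambda b: print("syncing", b),
                         wait_for_sync=lambda: None)
```

Batching by schema mirrors the manual workaround; a real implementation might instead batch by table, or pick batch sizes from observed throughput.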

Doing this manually will be very tedious, and I will need to do it over the course of a day or two. It would be great if it "just worked".
