Connector Improvement: Sample sync of connectors in dev/Beta
Not planned
We have a business use case for syncing only a sample of data for selected connectors in development/Beta environments so that we can test our pipelines. This is crucial both at initial setup and for later updates. In short, we would like to avoid syncing very large tables multiple times (dev/beta/prod) and instead have the option to sync only a sample for selected connectors in non-prod environments; even a simple LIMIT clause would do.
The goal is to have the tables set up in every environment, especially new tables within an existing, previously synced connector, so that pipelines can start using them without failing (currently they fail unless we unpause the connector). Even if a connector had its initial sync months ago, new tables keep appearing, and we need those in all environments as well. However, restarting/unpausing the connector is not an option, because all data for the other tables would then be resynced too. Finally, we cannot sync only the new tables while the connector itself is paused: "If the connector is paused, the table sync will be scheduled to be performed when the connector is re-enabled."
Please help us with the above issue.
Many Thanks,
Miki
-
Official comment
What an interesting request! Thanks for submitting. This will be easier to accomplish with connectors featuring priority-first sync. It should be possible for all connectors by customizing where the cursor is set. I recommend working with your account team to get this special behavior. It is not currently on the roadmap. However, I see there are already two upvotes, and we will certainly dive deeper into the possibility of adding this to the product if many people need it. We try to keep the right balance between simplicity and configurability.
-
Thank you for the reply, Alexander! For us, one of our key connectors is MySQL, which, if I'm correct, does not have priority-first sync enabled yet. Also, do you know whether this feature allows doing only the forward sync, without a backfill afterwards?
I sense that, for most connectors, priority-first sync combined with syncing empty tables and columns could be leveraged to solve this issue, so it is definitely a good starting point. However, it looks like it is not rolled out evenly across all connectors yet.