Connector Improvement: Same sync time across tables
Say I have two tables, Orders and Customers. It is possible that the Customers table is synced first, then Orders, but in between a new customer signs up and places an order, so my data warehouse will see an Order without a corresponding Customer.
So it would be good to have a single cutoff for all tables, for example the time the sync started. This is a potential issue regardless of the sync frequency.
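To make the idea concrete, here is a rough sketch in Python of what a shared cutoff could look like. This is not how any connector is implemented; the `sync_with_cutoff` function, the `updated_at` field, and the sample rows are all assumptions made up for illustration.

```python
from datetime import datetime, timezone


def sync_with_cutoff(tables, cutoff):
    """Keep, for every table, only rows last modified at or before the cutoff."""
    return {
        name: [row for row in rows if row["updated_at"] <= cutoff]
        for name, rows in tables.items()
    }


# Hypothetical source data: customer 2 and order 11 appear after the sync began.
sync_started = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
before = datetime(2024, 1, 1, 11, 0, tzinfo=timezone.utc)
after = datetime(2024, 1, 1, 12, 5, tzinfo=timezone.utc)

source = {
    "customers": [
        {"id": 1, "updated_at": before},
        {"id": 2, "updated_at": after},
    ],
    "orders": [
        {"id": 10, "customer_id": 1, "updated_at": before},
        {"id": 11, "customer_id": 2, "updated_at": after},
    ],
}

# Every table is filtered against the same timestamp, regardless of read order.
synced = sync_with_cutoff(source, sync_started)
```

Because both tables are cut at the same instant, order 11 and customer 2 are both held back until the next sync, so no Order lands without its Customer. The sketch assumes the source exposes a trustworthy modification timestamp on every row.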
Official comment
Hi Mohammed Khalil and Nicholas Moulton - are there particular data sources for which this is important to you?
The core of your request is about data consistency. Fivetran today can only _guarantee_ eventual consistency. For database connectors using a log-based replication method, we _effectively_ have snapshot consistency at the end of a sync. You can take advantage of this with our new Integrated Scheduling feature, which will orchestrate downstream transformations after the sync is finished.
For API connectors, improving consistency is harder than you would think. We've found that many APIs are themselves only eventually consistent when they abstract a distributed system. We call this data integrity error "late arriving events", and the only solution is to re-sync time slices repeatedly to catch these events. Because we only charge on active primary keys, this behavior has no impact on your pricing.
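As a rough illustration of re-syncing time slices, here is a minimal Python sketch. It is not Fivetran's actual implementation; the `next_window` and `merge_by_key` helpers, the one-hour lookback, and the sample rows are all hypothetical.

```python
from datetime import datetime, timedelta, timezone


def next_window(last_cursor, now, lookback):
    """Return the (start, end) range to request, overlapping the previous sync."""
    return (last_cursor - lookback, now)


def merge_by_key(destination, fetched):
    """Upsert fetched rows by primary key so re-read rows are not duplicated."""
    for row in fetched:
        destination[row["id"]] = row
    return destination


# The previous sync stopped at 12:00; this one runs at 12:15 with a 1 hour lookback.
last_cursor = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
now = datetime(2024, 1, 1, 12, 15, tzinfo=timezone.utc)
start, end = next_window(last_cursor, now, timedelta(hours=1))

# Row 1 was already synced; row 2 is a late arriving event inside the old slice.
destination = {1: {"id": 1, "status": "new"}}
fetched = [{"id": 1, "status": "new"}, {"id": 2, "status": "late"}]
merge_by_key(destination, fetched)
```

The request window starts an hour before the last cursor, so the late row is picked up, and the upsert by primary key keeps the row that was read twice from being duplicated.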
To your suggestion - we could limit how recent a record we sync, but that comes at the cost of increased overall latency. Our general recommendation is to run the connector more frequently to reduce the inconsistency.
Our company has run into the same issue. It would be ideal to have referential integrity supported for the connectors, but I can see how that would be very difficult to develop.
In place of that, having a priority list to flag specific tables as "primary" tables would be simple and in most cases just as effective.
Hi Fraser
Yes, running the sync more frequently would reduce the prevalence of this issue, but it will still occur if a lot of data is being added to both tables while the sync is ongoing.