Connector Improvement: Incremental Update / Sync Handling
Not planned
To capture late-arriving records during incremental syncs, Fivetran uses a one-size-fits-all approach. On every sync, it computes `max(lastModifiedTime) - 1 hour` and deletes and re-inserts all records in that window. This logic is applied on every sync, even when no late-arriving data exists.
Ideally, this behavior should be conditional:
- If `max(lastModifiedTime) ≤ last _fivetran_synced`, there are no new source updates beyond the previous sync boundary. In this case, no delete/re-insert should occur, and the next sync should simply continue from `last _fivetran_synced` forward.
- If `max(lastModifiedTime) > last _fivetran_synced`, late-arriving updates are possible. In this case, applying a bounded lookback window (e.g. `last _fivetran_synced - 1 hour`) is reasonable to safely capture out-of-order updates.
- The fixed 1-hour window, applied irrespective of the Fivetran sync frequency, is also not very robust. Ideally, the lookback should be a sliding window that depends on the sync frequency.
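To make the request concrete, the conditional behavior described above could be sketched roughly as follows. This is only an illustration in Python and not Fivetran's actual implementation: the function name `lookback_start`, its parameters, and the choice to scale the lookback as `min(sync_interval, max_lookback)` are all assumptions made for the example.

```python
from datetime import datetime, timedelta
from typing import Optional


def lookback_start(
    max_last_modified: datetime,
    last_synced: datetime,
    sync_interval: timedelta,
    max_lookback: timedelta = timedelta(hours=1),
) -> Optional[datetime]:
    """Return the start of the delete/re-insert window, or None to skip it.

    Hypothetical helper: all names and the scaling rule are illustrative.
    """
    # Case 1: the source has nothing newer than the previous sync boundary,
    # so no delete/re-insert is needed.
    if max_last_modified <= last_synced:
        return None

    # Case 2: newer updates exist, so late-arriving records are possible.
    # Assumption: the lookback follows the sync interval, capped at max_lookback.
    lookback = min(sync_interval, max_lookback)
    return last_synced - lookback


last_synced = datetime(2024, 1, 1, 12, 0)

# Source unchanged since the last sync: skip the lookback entirely.
unchanged = lookback_start(datetime(2024, 1, 1, 11, 30), last_synced, timedelta(minutes=15))

# Source has newer updates: re-process a 15-minute window before the boundary.
changed = lookback_start(datetime(2024, 1, 1, 12, 5), last_synced, timedelta(minutes=15))
```

Here `unchanged` is `None`, while `changed` is 11:45 on the same day, i.e. the previous sync boundary minus one sync interval. Other scaling rules (for example, a multiple of the sync interval) would fit the same structure.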
The current one-size-fits-all handling of incremental syncs results in unnecessary downstream data processing even when the source has not been modified. We request that this be investigated further and supported.
Official comment
Hi Raghav,
We apply this window approach because there's no way for us to determine whether the data is late-arriving without attempting to merge it. Fivetran is designed to retain minimal data, so the only way for us to check for late arrivals is to actually attempt to merge a small overlap into your destination.
If this merge is resulting in significant work on your destination, could you provide some details on the table size, and how you measure cost?
Thanks,
Eric