Connector Improvement: Improve Poor Information on Hidden Stripe Connector Re-sync Behavior
TLDR
The Stripe connector's re-sync behavior is unclear, both in what a re-sync actually does and in how far along it is. Better messaging and information can be conveyed to users to
- prevent accidental obstruction of day-to-day incremental loads
- prevent users from accidentally loading only partially-updated (and thus completely untrustworthy) data for certain child tables over extended periods of time
---
Some Stripe tables require predecessor Stripe tables to be up to date in order for them to also have accurate data. For example, Card needs three upstream tables loaded before its own data is properly updated: [CHARGE, SOURCE, ACCOUNT].
When historically re-syncing a "downstream" table such as Card, the re-sync first historically re-syncs all upstream tables: in Card's case, all of Charge, all of Source, and all of Account. This behavior in itself isn't necessarily bad, but it becomes bad if it isn't clearly communicated to the user. The user cedes control of data availability to Fivetran when conducting re-syncs. Without understanding the degree and impact of this, a historical re-sync can end up being a bullet into the heart of day-to-day business decision making, one that can't be undone without the help of Fivetran support to stop the unintended mass upstream re-syncs.
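To make the scope concrete, here is a minimal sketch of how the full set of re-synced tables could be derived from a dependency map. Only the Card entry (CHARGE, SOURCE, ACCOUNT) comes from what we observed; the `UPSTREAM` map and the `resync_scope` helper are hypothetical names for illustration and are not Fivetran's actual implementation or API.

```python
from collections import deque

# Hypothetical dependency map. Only the CARD entry reflects behavior
# described in this post; other tables would need their own entries.
UPSTREAM = {
    "CARD": ["CHARGE", "SOURCE", "ACCOUNT"],
}


def resync_scope(table, upstream=UPSTREAM):
    """Return the requested table plus every upstream table, transitively."""
    scope = []
    seen = {table}
    queue = deque([table])
    while queue:
        current = queue.popleft()
        scope.append(current)
        # Tables with no entry in the map are treated as having no parents.
        for parent in upstream.get(current, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return scope


print(resync_scope("CARD"))
```

A list like this, shown before the re-sync is confirmed, is the kind of information we would have needed up front.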
It's an abrasive and frustrating experience to have to learn this hidden and obscured behavior on the fly. General information and sync state can be much better expressed:
Update Connector Docs
- There doesn’t appear to be any language talking about certain tables potentially needing parent tables backfilled in Fivetran’s connector and re-sync FAQ page (https://fivetran.com/docs/using-fivetran/fivetran-dashboard/connectors/faq).
- The Fivetran Stripe connector docs page (https://fivetran.com/docs/applications/stripe) also does not talk about the behavior of some tables needing upstream tables to fully load correctly, which has led to us incorrectly pulling in some data for weeks/months. There is now Card-specific information, but that was only added after our org submitted a ticket asking about this newly-discovered re-sync behavior.
- There could be an indicator in a company's Fivetran console/web UI showing that data might not be up to date because predecessor/upstream tables are not loading. We had to learn this on the fly, and it's not clearly expressed (or expressed at all) in any relevant documentation linked above.
- In the Fivetran web interface, there is a gray visual "I" icon that suggests only a subset of the tables that will be backfilled if a re-sync is started. This doesn't completely convey the extent of the re-sync scope, and the user has to guess which tables will have to be completely historically re-synced. This icon could convey more urgency, as dull gray is possibly the least urgent color for this critical information. Additionally, this critical information should be expressed clearly and completely: every table should be listed, and it should not take a mouseover in an interactive console to understand what will happen. This information should be expressed in static documentation and not obscured.

Better Express Sync State in the UI
- Give some kind of visual indicator expressing the progress of an ongoing sync, be it an incremental sync or a historical re-sync. Right now, we have to parse Fivetran logs, convert and do math on the Unix epoch timestamps passed to Stripe API endpoints to roughly estimate how much data has been ingested in the sync, and then guess for ourselves how much time is remaining.
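For reference, the manual estimate we currently do looks roughly like the sketch below. It assumes the sync walks forward through time from a start timestamp to an end timestamp and that data volume is spread evenly over that window, which is rarely true, so the result is only a rough guess. The function and parameter names are our own and do not correspond to any Fivetran or Stripe field.

```python
from datetime import datetime, timezone


def epoch_to_utc(ts):
    """Convert a Unix epoch timestamp (seconds) to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(ts, tz=timezone.utc)


def estimate_progress(start_ts, cursor_ts, end_ts, elapsed_seconds):
    """Estimate the fraction of the time window covered and the seconds remaining.

    start_ts, end_ts: epoch bounds of the window being synced (assumed known).
    cursor_ts: latest epoch timestamp seen in the logs.
    elapsed_seconds: wall-clock time the sync has been running.
    """
    total = end_ts - start_ts
    if total <= 0:
        raise ValueError("end_ts must be later than start_ts")
    # Clamp to [0, 1] in case the cursor falls outside the window.
    fraction = min(max((cursor_ts - start_ts) / total, 0.0), 1.0)
    if fraction == 0:
        return fraction, None  # no progress yet, so no basis for an estimate
    remaining = elapsed_seconds * (1 - fraction) / fraction
    return fraction, remaining


print(epoch_to_utc(0).isoformat())
print(estimate_progress(0, 50, 100, 600))
```

A progress indicator in the UI would replace this guesswork with numbers Fivetran already has.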
Benefits
- Fivetran customers don't stumble into irreversible (by ourselves, anyway) and prohibitively expensive mass re-sync operations
- Improved trust in, and reliability of, the Fivetran product, especially for critical data like Stripe financial information