Connector Dependencies
Answered
Hi All,
I think this would be an interesting feature to add to Fivetran.
We have multiple databases in our application, and some of them depend on unique keys that are generated in another one. Syncing the databases independently creates a race condition: data from a dependent database can arrive before data from the originating database, and our tests in the data warehouse then fail.
For a more concrete example, our user identifier is created in database1 but is also used in database2. If database2 syncs before database1, any referential integrity tests we have fail because the originator table hasn't synced yet and the new identifiers are unavailable.
It would be very beneficial to have one connector sync and then have another sync right away, as if the completion of the first kicks off the next.
I could see this being part of the scheduling section of each connector: a checkbox or list of the other connectors it depends on. This is especially useful because the “every 6 hours” interval is based on when you kick it off, so 6 hours for one connector is a totally different 6 hours for another one.
Having connectors “depend” on the completion of another would add more predictability to when the data would arrive, solving any dependencies between data sources.
Thanks
-
Official comment
Hi Craig,
Thank you for sharing your use case here! You are correct that the dashboard UI does not currently support associating connectors with each other. One of our key goals and product principles is to offer one simple default choice for our users when using Fivetran.
That said, it is common for our users to want to schedule or plan syncs in a certain way. We have an API that allows for this, which you can read about here: https://fivetran.com/docs/rest-api/connectors#usecasehowtotriggersyncsonlywhenyouneedthem
While this doesn't automatically trigger a second connector sync after a particular one completes, it could be a starting point. In the meantime, I will bring this up with our internal teams to see if it is a use case we should investigate further. At the moment, we do not have a capability like this planned for our product.
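As a rough starting point, the chaining could be scripted on top of the API along these lines. This is only a sketch under assumptions, not an official recipe: it assumes the `POST /v1/connectors/{connector_id}/sync` and `GET /v1/connectors/{connector_id}` endpoints with basic authentication, and it assumes the connector details include `succeeded_at` and `failed_at` timestamps that change when a sync finishes. The connector IDs, credentials, helper names, and polling values are placeholders; please check the field names against the API docs before relying on them.

```python
import base64
import json
import time
import urllib.request
from functools import partial

API_BASE = "https://api.fivetran.com/v1"


def fivetran_request(api_key, api_secret, method, path, body=None):
    """Send one authenticated request and return the parsed JSON response."""
    token = base64.b64encode(f"{api_key}:{api_secret}".encode()).decode()
    data = json.dumps(body).encode() if body is not None else None
    req = urllib.request.Request(
        f"{API_BASE}{path}",
        data=data,
        method=method,
        headers={"Authorization": f"Basic {token}",
                 "Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def sync_and_wait(connector_id, request, poll_seconds=60, max_polls=120,
                  sleep=time.sleep):
    """Trigger one sync and block until it succeeds; raise if it fails."""
    path = f"/connectors/{connector_id}"
    # Remember the last completion timestamps so a change can be detected.
    before = request("GET", path)["data"]
    # Assumption: {"force": True} asks for a sync to start right away.
    request("POST", f"{path}/sync", {"force": True})
    for _ in range(max_polls):
        sleep(poll_seconds)
        now = request("GET", path)["data"]
        if now.get("failed_at") != before.get("failed_at"):
            raise RuntimeError(f"sync failed for connector {connector_id}")
        if now.get("succeeded_at") != before.get("succeeded_at"):
            return
    raise TimeoutError(f"sync did not finish for connector {connector_id}")


def sync_in_order(connector_ids, request, **wait_options):
    """Sync connectors one after another, originator first."""
    for connector_id in connector_ids:
        sync_and_wait(connector_id, request, **wait_options)


if __name__ == "__main__":
    # Placeholder credentials and connector IDs: replace with your own.
    request = partial(fivetran_request, "MY_API_KEY", "MY_API_SECRET")
    sync_in_order(["database1_connector_id", "database2_connector_id"], request)
```

Passing the HTTP helper in as `request` keeps the ordering logic separate from the network call, so the loop can be exercised with a stub before pointing it at real connectors.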
-
Hi Craig,
This is also a good use case for a scheduling tool such as Apache Airflow or Prefect. These tools allow you to create a directed acyclic graph (DAG), which guarantees the order of execution of sync jobs and ensures that the data is consistent before downstream processes start. Fivetran has created integrations for Airflow (Fivetran Provider) and Prefect (Fivetran Task) so that you can schedule and monitor your Fivetran sync jobs from these tools. If you are already using one of these scheduling systems, or you decide to try them out, please let me know! We'd be happy to help.
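To make the ordering guarantee concrete, here is a small scheduler-independent sketch using only Python's standard library (`graphlib`, Python 3.9+). It is not Airflow or Prefect code; the connector names and the `sync_order` helper are made up for illustration. Each connector lists the connectors it depends on, and a topological sort yields an order in which every originator comes before its dependents.

```python
from graphlib import TopologicalSorter


def sync_order(depends_on):
    """Return connector names so each one comes after everything it depends on.

    depends_on maps a connector name to the set of connectors that must
    finish syncing first. A circular dependency raises graphlib.CycleError.
    """
    return list(TopologicalSorter(depends_on).static_order())


# Hypothetical setup from the original post: database2 uses identifiers
# that are created in database1, so database1 has to sync first.
dependencies = {
    "database1": set(),
    "database2": {"database1"},
}

for connector in sync_order(dependencies):
    print(f"sync {connector}")
```

A scheduler applies the same idea, and additionally waits for each sync to complete before starting the ones that depend on it.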