Date/Time Picker for Sync Frequency on Transformations & Connectors
Answered
Hi, I'd love more time/date options when deciding sync schedules on my connectors & transformations. Every 15 min to 24 hours is great, but I'd like to pick a time of day for that to fire off (like kick off every 24 hrs at 1am) or (1st of the month at 3am) or something… Right now my transformations run at all weird hours of the day, so I gotta stay up till 1am to kick them off at that time.
-
Official comment
It's a conscious decision to have relatively simple & inflexible scheduling in the UI. Trying to time when connectors should run is an anti-pattern because it creates a fragile tightly coupled system. For example, what happens if the upstream job fails & the fresh data is not ready when the Fivetran connector runs once a day? This idea that tightly coupled complex systems are fragile & prone to error is explored in the classic book Normal Accidents by Charles Perrow.
A better strategy is to loosely couple your upstream data processing and Fivetran. The way to do this is to run Fivetran frequently. Effectively, the Fivetran connector will be polling for changes & syncing them as they occur. Our pricing model only charges you for Monthly Active Rows, so frequent connector syncs that don't move any data are free.
-
Hi Gary! Thanks for submitting this request.
Regarding sync schedules for connectors-- we have that capability today via our API. Would that meet your needs or do you prefer a solution in the dashboard?
Regarding schedules for transformations, our dbt Transformations (newly released in beta) supports custom schedules like you mention.
-
Hi Amy!
That's awesome the API has it. But I'd love to have the solution in the UI as well. I'm technical enough to be dangerous with APIs, but would much rather prefer a UI :)
As for transformations, I just commented on the other post too: I'd love to use dbt, but that's a separate tool that I would have to get approval and budget for (while I'll definitely make the case, it's just hard during these times). My IT team doesn't like us using free tools for sensitive data, so I couldn't use their free tier either, which is why I asked for this. It'd be great to just ever so slightly expand the capabilities that you all traditionally have until I can get dbt going.
thank you!
-
Gary Sahota Thanks for sharing! We currently let you choose the time of day the connector runs when you select the every-24-hours option, but we don't yet have this for Transformations. Is that something that would work for this use case?
-
@... - This could work! Would love to be able to fire off transformations on specific date/times (like the 1st Monday of the month or something), but this solves my connector use case!
-
Fabulous news! Gary Sahota curious why you'd want to do first Monday of the month for a transformation?
-
Sure thing, @...! We actually take raw usage data from Mixpanel and load it into Snowflake using Fivetran (to ultimately display via Tableau dashboards). One metric we'd like to display is Monthly Active Users. So, on the first of a given month, we take all the previous month's usage data to get MAU by feature.
This transformation only needs to kick off once a month, so it doesn't really make sense to recalculate the same metric every 24 hours (or even every week). So having a specific date/time picker would let me Calculate Monthly Active, Weekly Active, Daily Active in 3 transformations fired off once a month, week, day respectively.
Happy to chat live if needed!
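The monthly/weekly/daily split described above can be sketched as a small date check that an external scheduler could run once a day. This is only an illustration: the transformation names (`daily_active`, `weekly_active`, `monthly_active`) and the choice of Monday as the weekly trigger are assumptions, not anything Fivetran provides.

```python
from datetime import date


def is_first_of_month(d: date) -> bool:
    """True on the 1st day of any month."""
    return d.day == 1


def is_first_monday_of_month(d: date) -> bool:
    """True only on the first Monday of a month.

    Monday is weekday() == 0, and the first Monday always falls on day 1..7.
    """
    return d.weekday() == 0 and d.day <= 7


def due_transformations(d: date) -> list[str]:
    """Return the (hypothetical) transformation names that should run on date d."""
    due = ["daily_active"]            # recalculated every day
    if d.weekday() == 0:              # assumed: weekly metric runs on Mondays
        due.append("weekly_active")
    if is_first_of_month(d):          # monthly metric covers the previous month
        due.append("monthly_active")
    return due
```

For example, `due_transformations(date(2024, 3, 1))` lists the daily and monthly jobs but not the weekly one, because 1 March 2024 is a Friday.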
-
Being able to set a connector to run at a specific time say 6am after the source is updated, would be great instead of saying to run it every 24 hours. We have tables in the source that are updated once a day at a specific time, so it would be good to schedule the connector to run right after they are updated.
-
Some competitors allow for CRON notation scheduling of connectors. It would be an awesome feature for Fivetran, giving more control over intra-day schedules or schedules that only need to run once a week / month.
-
This is a feature I would like to see too. I have a connector that takes more than 3 hours to run. I'd like the connector to start in the morning at 3am and finish by 6 or 7am when our morning reports go out. Even if I increase the sync frequency, our daily morning reports might be stale. When can we expect this feature to be available?
-
Hi Arpit - you can select the start time for once-a-day syncs (24 hour frequency)
Cheers,
Fraser
-
We'd also love better control on the sync schedule through the UI. We have a couple of obvious scenarios:
- Some data needs to be synced frequently during office hours for nearish live reporting. Out of hours no-one cares, it just costs, which is bad.
- Month end reporting. For 4 days around month end we do frequent syncs between systems, but for the rest of the month it's not important.
In both these instances, I would have to manually update the schedule.
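As a rough sketch of the two scenarios above, the rule for "which frequency do I want right now" could be written as a single function. The office hours (Mon-Fri, 09:00-18:00), the two-days-each-side month-end window, and the 15 and 1440 minute values are all assumptions for illustration; applying the result would still mean changing the setting by hand or through the API.

```python
import calendar
from datetime import datetime

FREQUENT_MINUTES = 15     # assumed cadence for near-live reporting
RELAXED_MINUTES = 1440    # assumed cadence when nobody is looking (once a day)


def in_office_hours(now: datetime, start_hour: int = 9, end_hour: int = 18) -> bool:
    """Monday-Friday between start_hour (inclusive) and end_hour (exclusive)."""
    return now.weekday() < 5 and start_hour <= now.hour < end_hour


def in_month_end_window(now: datetime, days_each_side: int = 2) -> bool:
    """True for the last and first `days_each_side` days of a month (4 days total by default)."""
    last_day = calendar.monthrange(now.year, now.month)[1]
    return now.day > last_day - days_each_side or now.day <= days_each_side


def desired_sync_frequency(now: datetime) -> int:
    """Return the sync frequency in minutes that the rules above ask for."""
    if in_office_hours(now) or in_month_end_window(now):
        return FREQUENT_MINUTES
    return RELAXED_MINUTES
```

The two rules are combined here for brevity; in practice they might apply to different connectors.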
-
Thanks Steven Wilber for the feedback. I've noted this use-case down and may reach out in the future as we think through enhancements to our frequency settings. I do want to call out that you can leverage our API for manually triggering on custom schedules as a workaround.
-
Adding support from Rowell Belen:
We have a large DBT transformation that runs between 2 am and 4 am. I would like to pause the connector temporarily while the job is running and then resume its normal 5-minute sync. The reason for this is that the DBT transformation sometimes fails when Fivetran is trying to insert/update a large amount of data at the same time.
We could create a background process that calls the Fivetran API when we want to temporarily pause the sync and then resume after the transformation but it's not ideal. I believe a CRON schedule functionality would also be a better solution here.
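The background process mentioned above might look roughly like the sketch below, which only builds the pause/resume request and leaves sending it to the caller. The base URL, the `PATCH /connectors/{id}` path, the `paused` field, and HTTP Basic auth with an API key and secret reflect my understanding of the Fivetran REST API and should be checked against the current API reference; `build_pause_request` is a hypothetical helper name.

```python
import base64
import json
import urllib.request

API_BASE = "https://api.fivetran.com/v1"  # assumed base URL; verify in the API docs


def build_pause_request(connector_id: str, paused: bool,
                        api_key: str, api_secret: str) -> urllib.request.Request:
    """Build (but do not send) a request that pauses or resumes a connector."""
    # HTTP Basic auth: base64("key:secret")
    token = base64.b64encode(f"{api_key}:{api_secret}".encode()).decode()
    body = json.dumps({"paused": paused}).encode()  # assumed field name
    return urllib.request.Request(
        url=f"{API_BASE}/connectors/{connector_id}",
        data=body,
        method="PATCH",
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json",
        },
    )
```

A wrapper around the transformation job could send `build_pause_request(..., paused=True, ...)` with `urllib.request.urlopen` before the 2 am run and the `paused=False` variant once it finishes.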
-
Adding support from Plamen:
It would be really great if we can use cron syntax when setting Sync Frequency.
The current options in Sync Frequency are quite limiting.
-
I think, anti-pattern assertions aside, that for some systems and in some circumstances it would be useful to be able to at least suggest a time range when a connection should execute, especially if it's specified for a longer cadence.
In our case, we have a source system where the large majority of its data actually only changes once every 24 hours. So, for that source it makes sense to set the cadence to the same duration, as a more frequent sync cycle would pick up little or no changed data.
However, because the data update in the source system is large and processor intensive when it happens, we need to be able to specify when we want (or, even more precisely, when we don't want) the sync to run. It would be helpful to have that capability.
Just my 2 cents.
Cheers,
Albert
-
Albert Bupp - Thanks for the feedback. We do allow connectors that have a sync frequency of 24 hours to be set to run at a specific hour each day. Does that address the use-case you've described?
-
Hi Andrew Morse,
Well, hot damn, I guess I just didn't realize that I could click the "change" link next to the sync frequency setting (which only appears when it's set to 24 hours) in order to set the run time, just as I was imagining it would work.
I do feel a little silly now. So, in honor of Gilda Radner and her Emily Litella Character, I will now somewhat contritely say, "never mind".
Cheers,
Albert
-
Hi, I would also like to have fixed-time scheduling. Please add this feature when possible. Thanks.
-
Hi Ashish Tiwari - Thanks for the response. Can you please share more details about what you're looking for and why? We do have some options available today that we feel solve the majority of user needs:
- If you have a 24 hour sync frequency, we allow you to pick the time when we sync each day.
- If you have a unique custom schedule, you can use our API to control when we sync your connector.
If none of these solve your use-case, I'd love to understand it more so we can take it into consideration in the future.
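For the second option, a custom schedule driven from outside Fivetran (cron, a workflow tool, etc.) would call the API at the chosen times. The sketch below only constructs such a call. The `POST /connectors/{id}/sync` path and the `force` body field are my understanding of the Fivetran REST API rather than something stated in this thread, so treat them as assumptions; `build_sync_trigger_request` is a hypothetical helper name.

```python
import base64
import json
import urllib.request

API_BASE = "https://api.fivetran.com/v1"  # assumed base URL; verify in the API docs


def build_sync_trigger_request(connector_id: str, api_key: str, api_secret: str,
                               force: bool = False) -> urllib.request.Request:
    """Build (but do not send) a request asking Fivetran to start a sync now."""
    token = base64.b64encode(f"{api_key}:{api_secret}".encode()).decode()
    body = json.dumps({"force": force}).encode()  # assumed optional flag
    return urllib.request.Request(
        url=f"{API_BASE}/connectors/{connector_id}/sync",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json",
        },
    )
```

An external scheduler entry for "every Friday" or "1st of the month at 3am" would then send this request with `urllib.request.urlopen`.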
-
I'd love to be able to select a specific offset too as I'm running 100+ connectors every ~6 hours against a Postgres database. They all start at the same time which is impacting performance.
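To illustrate what such an offset could look like, here is a minimal sketch that spreads connector start times evenly across the sync interval. It is plain arithmetic, not a Fivetran feature; the connector IDs are placeholders, and the resulting offsets would have to be applied by whatever triggers the syncs.

```python
def staggered_offsets(connector_ids: list[str], interval_minutes: int = 360) -> dict[str, int]:
    """Assign each connector a start offset (in whole minutes) within the interval.

    Offsets are spaced evenly so that no two connectors start together
    unless there are more connectors than minutes in the interval.
    """
    n = len(connector_ids)
    if n == 0:
        return {}
    step = interval_minutes / n  # minutes between consecutive starts
    return {cid: int(i * step) for i, cid in enumerate(connector_ids)}
```

With 120 connectors on a 6-hour (360-minute) cadence, starts land 3 minutes apart instead of all at once.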
-
Related to this, it would be great to have the ability to schedule a transformation to run once per month. For models that are updated monthly, being able to run the model on the first day of a month to update our downstream tables through the prior month-end would be great.
-
Hi,
We have a use case where we need to sync one of the connectors just once a week (every Friday). It would be great if you could make scheduling a little more flexible in terms of start time/end time and also frequencies (more than 24 hours).
-
It's hard to imagine a data ingestion pipeline that doesn't let users pick a specific run time and only offers fixed intervals.
Please add this feature ASAP. Take a sample use case:
I have more than 150 Instagram business accounts, and I want to pull data from all of them every hour. This data contains media stats: likes, comments, shares, saves, etc.
The way the Instagram Graph API works, it returns point-in-time data as of the moment the API call was made. I have a board member who wants to see an attribution model for all these stats every hour. But I can't do that, because Fivetran pulls data at an arbitrary minute of the hour. So when I want to see the attribution model between 10:00 PM and 11:00 PM, I can't, because my data is pulled at a random time instead of in the first minute of the hour.
-
Hey team,
Any update on this? CRON notation scheduling of connectors in the UI is a required feature.
-
Allowing CRON scheduling would be a basic enough feature! In our case, we have a management meeting at a certain time each day, so having the freshest data then is quite valuable.
-
It would be great if we could enter a crontab schedule expression in the "sync frequency" section. That way we could stop the connector during the weekend, for example, and save some money on processing costs in both the source and target databases.
-
We should be able to limit runs based on our choosing.
If no new data occurs between 12am and 6am, or if it's acceptable that any new data during that time only hits the destination after 6am, then we could save 25% of our Snowflake compute costs for the pipeline. Right now the pipelines spin up a compute resource every 15 minutes.
If we can use the API, why not allow it in the interface?
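The 25% figure above follows from the quiet window being 6 of 24 hours. A small worked check, assuming compute cost scales with the number of sync runs (an assumption; actual warehouse billing may differ):

```python
def quiet_fraction(quiet_start_hour: int, quiet_end_hour: int) -> float:
    """Fraction of the day covered by the quiet window."""
    return (quiet_end_hour - quiet_start_hour) / 24


def syncs_in_hours(hours: int, interval_minutes: int) -> int:
    """Number of syncs started in `hours` at a fixed interval."""
    return hours * 60 // interval_minutes


total_per_day = syncs_in_hours(24, 15)    # syncs per day at a 15-minute cadence
skipped_per_day = syncs_in_hours(6, 15)   # syncs that fall between 12am and 6am
savings = quiet_fraction(0, 6)            # share of runs (and assumed cost) avoided
```

At a 15-minute cadence that is 24 of 96 daily runs skipped.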