Orchestrate custom data transformations in your destination with Transformations for dbt Core*.
NOTE: Contact Fivetran Support to enable Transformations for dbt Core - Scheduled in Fivetran for your account.
Fivetran integrates with dbt Core to power our transformations. dbt Core, by dbt Labs, is an open-source transformation tool that enables you to perform sophisticated data transformations in your destination using simple SQL statements. With dbt Core, you can:
- Write and test SQL transformations
- Use version control with your transformations
- Create and share documentation about your transformations
Once you have set up dbt Core, you write SQL SELECT statements (also known as “dbt models”) in a Git repository to transform your data. dbt Core runs these SQL statements in your destination to build tables and views. dbt Core honors dependencies between your dbt models, so everything is built in the correct order.
To work with dbt Core, you can either use the dbt CLI, a free and open-source command line interface, or dbt Cloud, dbt Labs’ hosted service. Fivetran can run dbt projects created with either dbt Cloud or dbt CLI.
There are two types of Transformations for dbt Core:
- Scheduled in Fivetran (recommended): We run your dbt models in your destination according to the schedule that you set in the Fivetran dashboard.
- Scheduled in Code: We run your dbt models in your destination according to the schedule that you set in your dbt project.
Scheduled in Fivetran
Fivetran connects to your Git provider and runs your dbt models in your destination according to the schedule that you choose in the Fivetran dashboard. We sync your dbt models from your Git provider every few minutes to ensure that we are up to date.
You create a transformation in the Fivetran dashboard for each dbt model that you want Fivetran to run. Each transformation consists of the following elements:
- Output model: A dbt model that transforms your data so it’s ready for analytics.
- Output model lineage: All upstream models that are needed to produce the output model, starting from your source table references in dbt Core.
- Schedule: A customizable schedule that determines how often Fivetran runs your transformation.
IMPORTANT: Each transformation references a single output model but executes all upstream models during each run.
By default, new transformations have the same schedule as their associated connectors, which means that Fivetran automatically runs your transformations as soon as we update your destination data. These integrated schedules reduce data latency and ensure that your analytics tools reflect new data as quickly as possible. Integrated schedules can also reduce compute costs, since downstream transformations do not run if their associated connector fails to sync. Learn more in the integrated scheduling section.
Learn how to manage transformations in your Fivetran dashboard.
TIP: If you want to customize your transformation schedule, we recommend that you schedule transformation runs in your Fivetran dashboard. However, you can use a configuration file in your Git repository instead if you prefer.
To run a transformation with integrated scheduling, Fivetran performs the following steps:
- Compile your dbt project and inspect the automatically generated manifest file to build a complete data lineage graph for your dbt models.
- Match source table references in the dbt models to the source table names written by Fivetran connectors.
- Unify your pipelines into end-to-end directed graphs.
- Execute the pipelines in order, which minimizes latency on the analytics-ready tables in your destination.
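The steps above can be sketched in a few lines of Python. This is only an illustration, not Fivetran's implementation: the node IDs, the `source_to_connector` mapping, and the `unified_pipeline` helper are hypothetical, and the `parent_map` dictionary is a hand-written stand-in shaped like the `parent_map` section of a dbt manifest.

```python
from graphlib import TopologicalSorter

# Hypothetical lineage, shaped like the "parent_map" section of a dbt manifest:
# each node maps to the list of nodes it depends on.
parent_map = {
    "source.shop.oracle.orders": [],
    "model.shop.customers": ["source.shop.oracle.orders"],
    "model.shop.churn": ["model.shop.customers"],
}

# Hypothetical result of matching dbt source references to the connector
# that writes the corresponding source table.
source_to_connector = {"source.shop.oracle.orders": "connector.oracle"}


def unified_pipeline(parent_map, source_to_connector):
    """Attach each matched source to its connector, then order the whole graph."""
    graph = {node: list(parents) for node, parents in parent_map.items()}
    for source, connector in source_to_connector.items():
        graph.setdefault(source, []).append(connector)
    # static_order() yields every node after all of its predecessors.
    return list(TopologicalSorter(graph).static_order())


order = unified_pipeline(parent_map, source_to_connector)
print(order)
```

Because the connector is added as a parent of the source table it writes, the connector sync and the dbt models end up in one directed graph, and a topological sort gives an order in which every node runs only after everything it depends on.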
Fivetran pipelines use the following elements:
- The start is the interval that initiates the pipeline.
- A connector updates source tables in the destination.
- A junction waits for multiple connectors to finish syncing before it triggers a dbt transformation.
- A transformation is a model or a collection of models that updates downstream tables in the destination.
- An output model generates an analytics-ready table. It is typically a leaf node on your data lineage graph.
- A test is an assertion that you make about the models in your dbt project. A test may succeed or fail independently of model execution.
Each start node defines its own data pipeline. In the following example, the start node is the connector sync frequency. The oracle connector runs every 15 minutes. When it successfully finishes syncing, that initiates downstream transformations to produce the churn output model.
You may prefer not to run some transformations that are logically downstream of the start node. For example, if the churn calculation is very expensive, you may want to run it hourly instead of every 15 minutes with the oracle connector and the customers model. In this case, you can create a new schedule for the churn model, which introduces a separate start node. Whenever an output model is executed in Fivetran, all upstream models are rebuilt as part of the transformation.
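To illustrate the last point, here is a minimal sketch of how the set of rebuilt models could be derived for a given output model. It assumes, for illustration only, that customers is the sole upstream model of churn; the `parents` dictionary and the `upstream_models` helper are hypothetical.

```python
def upstream_models(output_model, parents):
    """Collect every model that sits upstream of output_model."""
    seen = set()
    stack = list(parents.get(output_model, []))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            # Keep walking towards the sources.
            stack.extend(parents.get(node, []))
    return seen


# Assumed lineage for the example: churn depends on customers.
parents = {"churn": ["customers"], "customers": []}

# Everything that is built when the churn transformation runs.
rebuilt = upstream_models("churn", parents) | {"churn"}
print(sorted(rebuilt))
```

Under this assumption, an hourly churn transformation also rebuilds customers on each run, in addition to the runs that customers gets on its own 15-minute schedule.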
While you can set downstream models on varied schedules, you can only integrate downstream models with connectors when their schedules match. In the example below, all connectors run every 15 minutes. The customers model runs every 15 minutes and is therefore integrated with upstream connectors, but the revenue model runs every hour and the churn model runs once every 24 hours.
Fivetran comes with a fixed set of start nodes corresponding to different sync frequencies. When you select a frequency in the dashboard, the pipelines that activate those syncs are aware of overlaps and automatically adjust to them. In the example below, the oracle connector is on a 15-minute schedule, the netsuite connector is on an hourly schedule, and the salesforce connector is on a 24-hour schedule.
- The 15-minute node activates every 15 minutes, except when the 1-hour or 24-hour node activates.
- The 1-hour node activates every hour, except when the 24-hour node activates.
- The 24-hour node activates all three connector syncs.
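The overlap rules can be modeled with a small Python sketch. It rests on two assumptions that the text does not state: all start nodes are aligned to midnight, and a node that activates syncs every connector whose frequency divides the node's interval (the text confirms this only for the 24-hour node). The function names are hypothetical.

```python
MINUTES_PER_DAY = 24 * 60

# Connector sync frequencies from the example, in minutes.
CONNECTOR_FREQUENCY = {"oracle": 15, "netsuite": 60, "salesforce": MINUTES_PER_DAY}


def active_start_node(minute):
    """Return the interval (in minutes) of the one node that activates, or None.

    Assumption: every node is aligned to midnight (minute 0).
    """
    # Check the longest interval first so it takes precedence on an overlap.
    for interval in (MINUTES_PER_DAY, 60, 15):
        if minute % interval == 0:
            return interval
    return None


def connectors_to_sync(minute):
    """Assumption: the active node syncs every connector whose frequency divides its interval."""
    interval = active_start_node(minute)
    if interval is None:
        return []
    return sorted(
        name for name, freq in CONNECTOR_FREQUENCY.items() if interval % freq == 0
    )


for minute in (0, 15, 60, 7):
    print(minute, active_start_node(minute), connectors_to_sync(minute))
```

Only one node activates at any given minute, so a connector is never triggered twice when two frequencies coincide.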
Note the following limitations:
- You cannot manually trigger a transformation run. If you want to change when a transformation runs, you must edit its schedule.
- You cannot cancel a transformation run. If you want a transformation to stop running, you must delete it.
- You cannot manually trigger a connector sync if you have Transformations for dbt Core enabled.
Scheduled in Code
Fivetran connects to your Git provider and runs your dbt models in your destination according to the schedule that you set in your dbt project’s deployment.yml file. We sync your dbt models from your Git provider every few minutes to ensure that we are up to date.
Fivetran supports Transformations for dbt Core for the following destinations:
Fivetran data models
IMPORTANT: To use Fivetran’s data models, you must have a BigQuery, Redshift, or Snowflake destination.
To learn how to use Transformations for dbt Core, follow the setup guide that applies to you:
- To schedule transformations in the Fivetran dashboard, follow the Scheduled in Fivetran setup guide.
- To schedule transformations in your dbt code, follow the Scheduled in Code setup guide.
For common use cases for Transformations for dbt Core - Scheduled in Fivetran, see our Use Cases documentation.
* dbt Core is a trademark of dbt Labs, Inc. All rights therein are reserved to dbt Labs, Inc. Fivetran Transformations is not a product or service of or endorsed by dbt Labs, Inc.