Optional Date-Based Partitioning for Connector File Outputs

Answered

Henrique Asakura User

May 11, 2026 18:41

Currently, Fivetran stores ingested files in a delta-style structure, which works well for replication purposes. However, it would be extremely valuable to have an optional setting across connectors to organize files using date-based partitioning.

For example, instead of only storing files in a delta format, allow customers to define a structure such as:

source/year/month/day/files.parquet

source=<source_name>/year=/month=/day=/files.parquet
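The two layouts above could be generated along these lines. This is only a sketch of the requested structure, not an existing Fivetran setting: the function name `partition_path`, the `hive_style` flag, the zero-padded month and day values, and the `salesforce` source name are all assumptions made for illustration.

```python
from datetime import date
from pathlib import PurePosixPath


def partition_path(source_name: str, day: date,
                   filename: str = "files.parquet",
                   hive_style: bool = True) -> str:
    """Build a date-partitioned object key for one output file.

    hive_style=True  -> source=<name>/year=YYYY/month=MM/day=DD/<file>
    hive_style=False -> <name>/YYYY/MM/DD/<file>
    Zero-padding is an assumption; it keeps keys sorted lexicographically.
    """
    year, month, dom = f"{day.year:04d}", f"{day.month:02d}", f"{day.day:02d}"
    if hive_style:
        parts = [f"source={source_name}", f"year={year}",
                 f"month={month}", f"day={dom}"]
    else:
        parts = [source_name, year, month, dom]
    return str(PurePosixPath(*parts, filename))


# Hypothetical source name, using the date of this post.
print(partition_path("salesforce", date(2026, 5, 11)))
print(partition_path("salesforce", date(2026, 5, 11), hive_style=False))
```

The `key=value` form is the one that engines such as Spark and Athena can map to partition columns directly from the path, which is why it is listed separately from the plain `year/month/day` form.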

This would provide major benefits for downstream processing and analytics platforms such as Spark, Athena, Redshift Spectrum, Snowflake, Databricks, and dbt workflows, improving:

Query performance
Partition pruning
Lifecycle management
Cost optimization
Data lake organization standards

Ideally, this could be an optional configuration at the connector or destination level.

Comments

Official comment

Casey Karst Fivetranner
June 05, 2026 18:17
Thanks for the request, Henrique.
In your thinking, would you want the time-based partitioning to be keyed on the fivetran_sync time or on a user-defined time value in the dataset?

For most of our workloads, we have observed a natural ordering of data by fivetran_sync time. As a result, the Parquet min/max statistics generated for each file cover narrow, mostly non-overlapping time ranges, and query engines can use them to prune files. Curious whether you are seeing any of this with your query engine.
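To make the pruning idea concrete, here is a minimal sketch of how a reader could skip files using per-file min/max values for the sync time. It is not how any particular query engine or Fivetran itself implements this; the `FileStats` type, the `prune_files` function, and the file names are hypothetical, and a real engine would read the statistics from Parquet footers or table metadata.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class FileStats:
    """Per-file min/max of the sync-time column (hypothetical structure)."""
    path: str
    min_synced: datetime
    max_synced: datetime


def prune_files(files: list[FileStats],
                start: datetime, end: datetime) -> list[str]:
    """Return paths whose [min, max] range overlaps the query window [start, end)."""
    return [f.path for f in files
            if f.max_synced >= start and f.min_synced < end]


# Files written in sync order have narrow, non-overlapping ranges,
# so a one-day filter only needs to open one of them.
files = [
    FileStats("part-0.parquet", datetime(2026, 5, 9), datetime(2026, 5, 9, 23, 59)),
    FileStats("part-1.parquet", datetime(2026, 5, 10), datetime(2026, 5, 10, 23, 59)),
    FileStats("part-2.parquet", datetime(2026, 5, 11), datetime(2026, 5, 11, 23, 59)),
]
print(prune_files(files, datetime(2026, 5, 11), datetime(2026, 5, 12)))
```

If the filter is on a different, user-defined time column whose values are scattered across files, the ranges overlap and far fewer files can be skipped, which is where explicit date partitioning would matter more.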

-Casey
