
Destination Improvement: Add Compaction for Managed Data Lake (Apache Iceberg)

Answered

Chase Grainger User

February 10, 2026 13:23
Edited

We use Fivetran's minute sync frequency on quite a few connectors.

Fivetran does not perform compaction on Apache Iceberg databases, and it does not allow automatic downstream compaction through AWS Glue. The resulting buildup of many small files raises S3 costs, because every file adds requests, and degrades query performance on the Iceberg tables themselves. Until compaction is added, costs will keep rising and queries will keep getting slower.
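
To put rough numbers on the small-file problem, here is a back-of-the-envelope sketch in Python. Every input (table count, one file per sync per table, 100 KiB files, a 128 MiB target size) is an assumption chosen for illustration, not a measured value and not a Fivetran or AWS figure. It also assumes every sync writes to every table, which overstates the count for tables that change rarely.

```python
# Rough estimate of how many data files minute-level syncs can produce per day,
# and how many files the same data would need at a larger target file size.
# All inputs are illustrative assumptions, not measured or vendor-provided values.

def files_per_day(syncs_per_hour: int, tables: int, files_per_sync: int = 1) -> int:
    """Data files written per day if every sync writes to every table."""
    return syncs_per_hour * 24 * tables * files_per_sync


def files_after_compaction(total_bytes: int, target_file_bytes: int) -> int:
    """Files needed to hold the same data at the target size (ceiling division)."""
    return -(-total_bytes // target_file_bytes)


if __name__ == "__main__":
    small_file_bytes = 100 * 1024          # assumed average size of one sync's file
    target_file_bytes = 128 * 1024 * 1024  # assumed compaction target size

    daily_files = files_per_day(syncs_per_hour=60, tables=50)
    compacted = files_after_compaction(daily_files * small_file_bytes, target_file_bytes)

    print(f"files written per day without compaction: {daily_files}")
    print(f"files needed for the same data after compaction: {compacted}")
```

Under these assumptions the uncompacted file count grows linearly with every day of syncing, and a query that scans a table has to open each of those files, which is where both the request cost and the slowdown come from.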


Comments

2 comments

Official comment

Casey Karst Fivetranner
- February 10, 2026 18:03
Thanks for the feedback @Chase.
We are currently exploring how to enhance Fivetran's Managed Data Lake compaction strategy. Today we run compaction only on the data included in each sync. For most workloads this results in sufficient compaction to reduce S3 I/O costs and prevent Iceberg performance degradation.

I'd be happy to chat more about the specifics of your use case. Would you mind emailing me at (casey dot karst at fivetran.com)?

Thanks,
Casey
Chris Redekop User
- April 10, 2026 18:05
This is pretty shocking to me, honestly. I expected compaction to be a core part of the managed experience, particularly for near real-time Apache Iceberg datasets.

Compaction isn’t really optional in Iceberg - it’s a fundamental maintenance task. If Fivetran doesn’t provide this as part of the managed service, there at least needs to be a clear and supported way for users to perform compaction themselves.