Destination Improvement: Liquid Clustering support for Databricks Delta table destination
Our team recently started using Liquid Clustering on Databricks, and while trying to sync those tables we're getting errors related to the "Clustering Information" column.
The issue is basically as follows:
Entries beginning with a hash (#) in the table's DESCRIBE output are not actual column names of the table; they are sub-heading names that group other table details. # Clustering Information indicates that the rows below it describe the columns involved in clustering, # Partition Information indicates that the rows below it describe the partition columns, and # Detailed Table Information introduces other details about the table. This is expected behavior, and these rows are deliberately differentiated by the '#' at the beginning.
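To illustrate, here is a minimal sketch (assuming a PySpark session and a placeholder table name) of how those '#' rows could be skipped when reading the DESCRIBE output; everything from the first blank or '#' row onward is table metadata, not column data:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# DESCRIBE returns rows of (col_name, data_type, comment); the real columns
# come first, followed by sections such as "# Clustering Information",
# "# Partition Information" and "# Detailed Table Information".
rows = spark.sql("DESCRIBE TABLE my_catalog.my_schema.my_table").collect()

actual_columns = []
for row in rows:
    name = row["col_name"].strip()
    if not name or name.startswith("#"):
        break  # blank separator or section heading: no more real columns
    actual_columns.append((name, row["data_type"]))

print(actual_columns)
```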
We would appreciate it if Fivetran could prioritize addressing this issue for the Delta Table destination. Thank you!
-
Resolving this bug is critical for enterprise customers. Workarounds require needless full copies of source tables so that "indexes" (of sorts) can be added to the copies. Without clustering, JOINs can run for an unreasonably long time and are so wasteful that clustering those tables is the only way to perform the required processing efficiently within our timeline requirements. As noted by Terence, the clustering detail should simply be ignored as irrelevant to the Fivetran update process, since it is local metadata for Spark / Databricks to leverage internally.
-
Adding to this. We sync the majority of our data into Databricks via Fivetran (Microservices), and the raw data is used by multiple teams across the organization. Some of these sources have very large tables that are joined and filtered multiple times depending on the use case, and they account for some of our most expensive operations.
Databricks recently released Auto Liquid Clustering, which is designed to optimize query performance by dynamically clustering data based on query patterns, and our account team recommended we try it to improve query performance and manage resource utilization.
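For reference, enabling liquid clustering on an existing table is a one-line DDL change. The sketch below (placeholder catalog/table/column names, run from a Databricks notebook session) shows both manual keys and the automatic mode; please verify the exact syntax against the Databricks docs for your runtime version:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Manual liquid clustering on explicitly chosen keys
spark.sql("""
    ALTER TABLE my_catalog.raw.orders
    CLUSTER BY (customer_id, order_date)
""")

# Automatic liquid clustering: Databricks selects keys from query patterns
spark.sql("ALTER TABLE my_catalog.raw.orders CLUSTER BY AUTO")

# Incrementally recluster existing data according to the clustering keys
spark.sql("OPTIMIZE my_catalog.raw.orders")
```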
Currently, the only workaround is to re-replicate data that Fivetran has already synced into another set of tables, which is both wasteful and expensive, especially for very large tables. Given this new development and how long liquid clustering has been GA on Databricks, it would be very helpful if Fivetran would consider supporting this feature.