Connector Improvement: Support for Partitioning Delta Tables - Delta Lake Destination
Answered
Currently, we have a table with approximately 10 billion records, and while the Delta table format provides significant advantages in terms of ACID compliance and transaction management, we are encountering performance challenges when querying this data. Specifically, the query response times are becoming increasingly slow due to the lack of appropriate partitioning in the Delta table.
Problem:
- Large Dataset: Our table contains 10 billion records, making unpartitioned queries extremely slow.
- No Partitioning Support: Fivetran writes data to Delta tables without an option to define partitions, causing full table scans for even simple queries.
Proposed Solution:
- Introduce a feature that allows users to define partition keys when configuring a Fivetran sync to Delta tables. For example, enabling partitioning based on columns such as date, region, or any other key relevant to the dataset.
- Optionally, allow users to specify custom partitioning strategies during initial setup or as a configuration update.
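To illustrate what the requested partition keys would control, here is a minimal sketch in plain Python of grouping rows into Hive-style partition directories (`column=value/...`), the layout conventionally used for partitioned Delta tables. The helper names (`partition_path`, `group_by_partition`) and the sample rows are hypothetical, and this is not how Fivetran or Delta Lake implements it; it only shows the idea.

```python
from collections import defaultdict


def partition_path(row, partition_keys):
    """Build a Hive-style relative directory for one row,
    e.g. 'date=2024-01-01/region=eu'."""
    return "/".join(f"{key}={row[key]}" for key in partition_keys)


def group_by_partition(rows, partition_keys):
    """Group rows by the partition directory they would be written to."""
    groups = defaultdict(list)
    for row in rows:
        groups[partition_path(row, partition_keys)].append(row)
    return dict(groups)


# Hypothetical sample data; column names are illustrative only.
rows = [
    {"id": 1, "date": "2024-01-01", "region": "eu"},
    {"id": 2, "date": "2024-01-01", "region": "us"},
    {"id": 3, "date": "2024-01-02", "region": "eu"},
    {"id": 4, "date": "2024-01-01", "region": "eu"},
]

layout = group_by_partition(rows, ["date", "region"])
for path, members in sorted(layout.items()):
    print(path, [r["id"] for r in members])
```

Choosing low-cardinality keys such as a date or region keeps the number of directories manageable; a high-cardinality key (for example a user ID) would produce many tiny partitions.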
Benefits:
- Improved Query Performance: Partitioning the Delta table will significantly reduce response times for queries that filter on the partition columns, because the engine can skip partitions that cannot match and scan far less data.
- Cost Efficiency: Query costs in platforms like Databricks or other compute engines will decrease due to reduced resource usage.
- Scalability: This feature will help manage and query massive datasets more effectively, aligning with the needs of large-scale data operations.
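As a rough sketch of why partitioning reduces the data scanned, the Python below models partition pruning: a filter on the partition column lets the engine skip every file outside the matching partition. The file listing, sizes, and the `prune` helper are hypothetical and stand in for what a query engine does internally.

```python
def prune(files, predicate):
    """Keep only files whose partition values satisfy the predicate."""
    return [f for f in files if predicate(f["partition"])]


# Hypothetical listing: 30 daily partitions with 4 data files each.
files = [
    {
        "path": f"date=2024-01-{day:02d}/part-{n}.parquet",
        "partition": {"date": f"2024-01-{day:02d}"},
    }
    for day in range(1, 31)
    for n in range(4)
]

# A query filtering on one day only needs that day's files.
selected = prune(files, lambda p: p["date"] == "2024-01-15")
scanned_fraction = len(selected) / len(files)

print(f"files scanned: {len(selected)} of {len(files)}")
```

In this toy layout a single-day query touches 4 of 120 files. A query that does not filter on `date` would still read every file, so the benefit depends on partition keys matching common query filters.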
We believe this feature would add immense value for users dealing with large datasets and further enhance Fivetran's usability and performance in enterprise data environments.
Official comment
Hi Saroj,
Thank you for your suggestion! This feature is on our roadmap. Once it is prioritized for development, we will update its status and notify all subscribed stakeholders. We appreciate your understanding and continued interest in our product evolution. If you have any further questions or require more details on feature prioritization, please don't hesitate to reach out.
Best regards,
Egidio