Destination Improvement: AWS S3 Data Lake: Add Query Support For Apache Spark

Sean McEwen User

July 12, 2023 17:13

Hello,

I would like to request a feature enhancement to the S3 Data Lake destination. Currently, the destination only supports querying through Athena and Dremio. A company pursuing a data lake strategy on S3 will likely want to use other query engines that can read from the Glue Data Catalog, typically Spark via EMR/Databricks or AWS Glue. I think it is crucial that this support be added.

The urgency here is relatively high for our organization. It doesn't need to be added today, but we would need to understand where this feature sits on the roadmap. Thanks!

Comments

Sean McEwen User
- July 25, 2023 20:38
Hey, has there been any investigation into this request?
Coral Trivedi User
- April 03, 2024 00:38
Hi Sean,

Thank you for reaching out, and thanks for your patience. We have since expanded the supported query engines; more details are available here.

Would you mind sharing a bit more about your use case? Are you looking to query Iceberg or Delta Lake tables with EMR or Databricks?

Best,

Coral