Destination Improvement: Add configurable CDC metadata passthrough in Managed Data Lake
When using the Managed Data Lake destination, important CDC metadata available from the source is not currently exposed in the final dataset. In particular, fields such as transaction ID and commit sequence/order are often captured in the source (e.g. via WAL or binlog) but appear to be discarded or abstracted away in the Iceberg output.
Having the option to pass through or surface this metadata would be extremely valuable for certain analytics and auditing scenarios. For example:
- Reconstructing which changes occurred as part of a single logical transaction spanning multiple tables
- Performing audit trail analysis for compliance or debugging purposes
- Performing usage and change pattern analysis (e.g. frequency or size of multi-table edits)
Ideally this could be implemented as metadata columns (e.g. _fivetran_txid, _fivetran_commit_seq) that users could enable per destination or per table.
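To illustrate the first use case, here is a minimal Python sketch of how rows from several tables could be regrouped into logical transactions. It assumes the proposed columns `_fivetran_txid` and `_fivetran_commit_seq` exist, which they currently do not; the table names, sample rows, and the `group_by_transaction` helper are all hypothetical.

```python
from collections import defaultdict

# Hypothetical rows, shaped as they might look if the proposed
# metadata columns were passed through to the destination.
orders = [
    {"id": 1, "status": "paid", "_fivetran_txid": 9001, "_fivetran_commit_seq": 1},
    {"id": 2, "status": "new", "_fivetran_txid": 9002, "_fivetran_commit_seq": 3},
]
payments = [
    {"id": 10, "order_id": 1, "_fivetran_txid": 9001, "_fivetran_commit_seq": 2},
]


def group_by_transaction(tables):
    """Group changes from all tables by transaction ID, ordered by commit sequence."""
    grouped = defaultdict(list)
    for table_name, rows in tables.items():
        for row in rows:
            change = (row["_fivetran_commit_seq"], table_name, row["id"])
            grouped[row["_fivetran_txid"]].append(change)
    # Sorting the tuples orders each transaction's changes by commit sequence.
    return {txid: sorted(changes) for txid, changes in grouped.items()}


transactions = group_by_transaction({"orders": orders, "payments": payments})

# Keep only transactions that touched more than one table.
multi_table = {
    txid: changes
    for txid, changes in transactions.items()
    if len({table for _, table, _ in changes}) > 1
}
```

With the sample rows above, transaction 9001 is the only one that spans both tables, which is the kind of multi-table edit the audit and usage-pattern scenarios would need to detect.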
-
Hi Chris!
Thanks for reaching out.
The CDC metadata comes from each individual source and is not defined by MDLS. Which source are you using to write to MDLS? I can make sure we route the ask to the right folks internally.
Thanks,
Casey
-
We are currently using primarily the Postgres (Aurora) connector, but also the SQL Server and Oracle connectors.