Connector Improvement: Option to synchronize empty fields

Ivan Shomnikov User

April 11, 2022 23:40

Azure Blob Storage connector skips fields completely and doesn't create them in destination schema if they don't have values.

That's not very convenient for data modelling / ETL development as destination schema is created incomplete (early on not all source data exists). Plus it makes it impossible to perform data quality related work.

Comments

11 comments

Official comment

Alison Fivetranner
- May 09, 2022 18:51
Hi Ivan,

Thank you for taking the time to provide your feedback. Not syncing empty columns is a core behavior of our system and has historically helped customers avoid ending up with confusing and unhelpful extra columns in their data warehouses (doc link).

I'm interested to understand more about how it makes it impossible to perform data quality related work. I believe many of our customers who need to control the exact columns send a single file with dummy data in every field to get the initial columns configured. Would that work for you?
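A minimal sketch of that dummy-file workaround, assuming CSV source files. The column names and placeholder values below are hypothetical and would need to match the real header of your files; nothing here is a Fivetran API.

```python
import csv
import io

# Hypothetical column list; replace with the real header of your source files.
EXPECTED_COLUMNS = ["id", "name", "signup_date", "notes"]

# Placeholder values chosen so every column holds a non-empty sample.
# These are assumptions for illustration, not values Fivetran requires.
DUMMY_VALUES = {
    "id": "0",
    "name": "placeholder",
    "signup_date": "1970-01-01",
    "notes": "placeholder",
}


def build_seed_csv(columns, dummy_values):
    """Return CSV text with a header and one fully populated dummy row."""
    missing = [c for c in columns if not dummy_values.get(c)]
    if missing:
        raise ValueError(f"no dummy value for columns: {missing}")
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    writer.writerow({c: dummy_values[c] for c in columns})
    return buffer.getvalue()


seed_csv = build_seed_csv(EXPECTED_COLUMNS, DUMMY_VALUES)
print(seed_csv)
```

The resulting text would be uploaded as the first file in the container, and the dummy row filtered out downstream once the columns exist.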

Alison
David Irvine User
- July 25, 2022 13:23
This is a feature that I would appreciate across all destinations.

Our use case is reporting that monitors the source system for field population. Highlighting fields that are empty is a key part of this.
Bobby Neelon User
- August 16, 2022 16:07
- Edited
I am in a similar boat with the SFTP connector. I have empty columns in the initial CSV file I'm loading, but those fields may be populated in future files we receive from the vendor.

To Ivan Shomnikov's point, if I want my dbt staging model to handle all possible fields, I either can't include all the columns or have to write ```null as <empty column name>``` for each missing one, because Fivetran didn't create every column that could ever appear. I would need to check each time a file is added to see whether those fields now contain data, which defeats the purpose of an automated solution like Fivetran.
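To illustrate the workaround described above, here is a rough sketch that builds the staging SELECT from a list of expected columns and the columns that actually exist in the destination. The function name, table name, and column names are hypothetical, and in practice the existing columns would come from the warehouse's information schema rather than a hard-coded list.

```python
def staging_select(expected_columns, existing_columns, source_relation):
    """Build a SELECT that emits every expected column, using NULL for absent ones."""
    existing = {c.lower() for c in existing_columns}
    select_items = []
    for column in expected_columns:
        if column.lower() in existing:
            # Column was created in the destination, so select it directly.
            select_items.append(f"    {column}")
        else:
            # Column was never synced because it had no values; emit a NULL stand-in.
            select_items.append(f"    null as {column}")
    return "select\n" + ",\n".join(select_items) + f"\nfrom {source_relation}"


sql = staging_select(
    expected_columns=["order_id", "vendor_note", "discount_code"],
    existing_columns=["order_id"],  # hypothetical: only this column has been synced
    source_relation="raw.vendor_orders",  # hypothetical table name
)
print(sql)
```

This still has to be regenerated whenever a new file arrives, which is exactly the manual check the toggle would remove.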

Again, I think a simple toggle in the connector settings (defaulting to False) that syncs empty columns when set to True would be very nice.

Just so I understand how Fivetran handles it now, if a new file comes in where one of the previously empty columns now has data, will Fivetran update the schema of the destination table?

Thank you!
Alison Fivetranner
- September 21, 2022 15:46
Hi Bobby,

Thank you for the details around your use case. It really helps me understand your needs and their drivers. I'll add the feature to our backlog.

With regard to your question, "if a new file comes in where one of the previously empty columns now has data, will Fivetran update the schema of the destination table?":
I can confirm that our Schema Migration code kicks in and the new column will be added to the table. Note that it is possible to configure our Column Blocking & Hashing functionality to stop new data being added to your schema, which would prevent the Schema Migration code from doing its thing.

I hope that helps

Best Alison
Derek Synan User
- December 13, 2022 17:30
Alison, was this request added to your backlog for only Azure Blob Storage or for all connectors? I'm assuming it would be an option that could be enabled or disabled during connector setup. Also, where does this stand for priority and timing on your backlog?
Alison Fivetranner
- December 13, 2022 21:57
Hi Derek,

It's something we are exploring; however, it goes against one of the core principles of our pipeline, which is to deliver only ready-to-use, valuable data into your warehouse. One of the challenges is that we infer data types from the values in a column, and if there are no values, that inference can't be made. Right now we are thinking about the right way to approach the need before committing a slot on the roadmap to it.
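As a rough illustration of why an empty column is a problem for value-based inference, here is a simplified sketch. It is not Fivetran's actual inference logic; the function name and the type names are assumptions made for the example.

```python
from typing import Iterable, Optional


def infer_column_type(values: Iterable[Optional[str]]) -> Optional[str]:
    """Infer a simple type name from string values; None if there is nothing to infer from."""
    non_empty = [v for v in values if v is not None and v.strip() != ""]
    if not non_empty:
        # An entirely empty column gives the inference nothing to work with.
        return None

    def all_parse(parser):
        # True only if every non-empty value parses with the given parser.
        for v in non_empty:
            try:
                parser(v)
            except ValueError:
                return False
        return True

    if all_parse(int):
        return "INTEGER"
    if all_parse(float):
        return "DOUBLE"
    return "STRING"


print(infer_column_type(["1", "2", ""]))
print(infer_column_type(["", None]))
```

The second call returns `None`, which is the case where a type would have to come from somewhere other than the data.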

I hope that helps

Alison
David Irvine User
- December 14, 2022 09:59
Alison
https://support.fivetran.com/hc/en-us/community/posts/5384855869207/comments/10937833082519
Surely if the field is empty you could pick any type you like and just update it when data enters the system?
Although I would have thought inferring the type from the data is a bit risky for a sparsely populated field, since it may get an extended domain when more values are entered. Surely a more reliable approach would be to ask the source system, where possible, what the field format is and map to that.

For our use case, the existence of an empty field is useful information that we want to be able to report on. Some of our reports are about DB health: we want to catch fields that have been created but never populated, to encourage removal of those fields.
Amit Mittal User
- July 31, 2023 16:42
Is there an update on this feature request? To give you our business context, we ingest data into Snowflake and apply masking policies. Since data may arrive in currently blank columns in the future, we need all columns to be available during the initial sync so that we can apply all masking policies upfront.
Alison Fivetranner
- August 01, 2023 14:32
Thank you Amit for the additional context.
We are still working this into our roadmap.

Alison
Glen Casey User
- May 27, 2025 09:30
What is the status of this feature request? We have similar use cases where we need all columns (empty or not) replicated from the source to the destination.

Thanks - Glen
Parmeet Kohli Fivetranner
- August 29, 2025 15:10
- Edited
Hi Glen, this is on our roadmap. I don't have a concrete timeline yet but will keep you posted on this thread.

Thank you,
Parmeet