Skip to main content

Community

Connector Improvement: Option to synchronize empty fields

Please sign in to leave a comment.

Comments

9 comments

  • Official comment

    Hi Ivan,

    Thank you for taking the time to provide your feedback. The not syncing of empty columns is a core service of our system and has historically helped customers not end up with confusing and unhelpful extra columns in their Dataware houses (doc link).

    I'm interested to understand more about how it make it impossible to perform data quality related work? I believe many of our customers facing the need to control exact columns send a single file with dummy data in every field to get the initial columns configured - would that work for you?

    Alison

    This is a feature that i would appreciate across all destinations.

    Our use case is reporting that monitors the source system for fields population. Highlighting fields that are empty is a key part of this.

    I am in a similar boat with the SFTP connector. I have empty columns in the initial CSV file I'm loading, but those fields may be populated in future files we receive from the vendor. 

    To Ivan Shomnikov's point, if I want to create my staging model in dbt to handle all possible fields, I can't include all the columns or need to do ```null as <empty column name>``` because Fivetran didn't create all the possible columns that could ever be created. I would need to check each time a file is added to see if those Fields now contain data or not which defeats the purpose of an automated solution like Fivetran.

    Again, I think a simple toggle button in the connector settings(can default to False) that allows you to sync empty columns if set to True would be very nice.

    Just so I understand how Fivetran handles it now, if a new file comes in where one of the previously empty columns now has data, will Fivetran update the schema of the destination table?

    Thank you!

    Hi Bobby,

    Thank you for the details around your use case it really helps me understand your needs and their drivers. I'll add the feature to our backlog.

    With regard to your question "if a new file comes in where one of the previously empty columns now has data, will Fivetran update the schema of the destination table?"
    I can confirm that our Schema Migration code kicks in and the new column will be added to the table. Note it is possible to configure our Column Blocking & Hashing functionally to stop new data being added to your schema which would prevent the Schema Migration code doing it thing.

    I hope that helps

    Best Alison

    Alison Was this request added to your backlog for only Azure Blob Storage or for all connectors? Assuming it was an option that could be enabled or disabled during connector setup. Also, where does this stand for priority and timing on your backlog?

    Hi Derek,

    Its something we are exploring, however it goes against one of the core things in our pipeline, where we try to deliver only the ready to be used valuable data into your warehouse. One of the challenges is we infer data type from the values in a column, if there are no values then that inference can't be used. Right now we are thinking about the right way to approach the need prior to committing a slot of the roadmap to it.

    I hope that helps

    Alison

    Alison
    https://support.fivetran.com/hc/en-us/community/posts/5384855869207/comments/10937833082519
    Surely if the field is empty you could pick any type you like and just update it when data enters the system?
    Although i would have through infering the type from the data is a bit risky for a sparsely populated field. That may get an extended domain when more values are entered. Surely a more reliable approach would be to ask the source sysetm where possible what the field format is and map to that.

    For our use case the existing of an empty field is useful information that we want to be able to report on. Some of our reports are around the DB health we want to be able to catch fields that have been created but have not been populated to encourage removal of these fields again.

    Is there is update to this feature request? To give you our business context, we ingest data to Snowflake and apply masking policies. Since data may come to blank columns in future, we need to  have all columns available during initial sync so that we can apply all masking policies upfront.

    Thank you Amit for the additional context.
    We at still working this into our roadmap

    Alison