Community

Connector Improvement: SFTP and S3 - Add All Columns to Destination

Answered

Lucas Cunningham User

October 28, 2023 12:18

While the documentation for SFTP and S3 file connectors says they will not sync empty columns this make them less than ideal solutions for ongoing file ingestion, in our case files are typically csv.

Would it be possible to add all fields to the destination regardless of whether there is data? We find it important it to maintain not only all the fields but their order as well, and having a field added out of order at a later date doesn't fit for us.

Thanks for you consideration!

Please sign in to leave a comment.

Comments

2 comments

Official comment

Alison Fivetranner
- November 01, 2023 19:02
Hi Lucas,

Thank you so much for using Fivetran and for taking the time to submit this request. This is a very interesting idea that I'd love to understand in more depth.
Can you share more about your data and pipelines so I understand your challenges better?
What sort of data is it that you are syncing via our file connectors?
Why is the column order important?
When would you expect the initially empty columns to populate with data?

Understanding this information will help me prioritize the ask.

Thank you once again,

Alison
Lucas Cunningham User
- November 02, 2023 15:45
- Edited
Thanks very much for the question, Alison.

One use case we have is with a data set exported from a provider as csv. The data set is rows of transactional data but each row could be a different type of transaction. There are ~50 columns in all, not all transaction types will populate each column, so a given row will likely never have data across all possible fields. We’re currently importing this data through another means but would love to transition to Fivetran.

As such, data analyses and reports have already been built that reference various fields. If we start a new client with this data set using Fivetran and not our existing import strategy, some fields will be dropped and we’d need to make adjustments down the line. Eventually all columns would be added and then we’d need to readjust other processes, etc.

So, in our case it would be best if all fields represented in a file ended up as fields in the table. And I can see our clients asking questions about data they upload if a field isn’t present in the resulting table. Their first instinct will be to say that not all data was loaded, leading us to troubleshoot their files and determine if the field in question was simply empty and needing to have this conversation with them. While it’s clear in your documentation it isn’t intuitive in my experience with loading data. We do a lot of file uploads for our clients, and I’d like to transition that to Fivetran but won’t be able to if we can’t be certain of the field mapping from file to table is consistent.

Thanks again for your attention, it’s greatly appreciated,

Lucas

EDIT: thinking about how to define a field when it has no data:

1. Offer a schema layout screen where the user could make that decision?

1a. Maybe a separate feature request but allow the schema screen to be editable by the user on a file connector. I could see this being useful if new data in a given text field exceeds the length of anything previously uploaded (unless you already handle expanding fields in such cases?).

2. Create the field as varchar(255) when empty.