We have redesigned the Skip Header Lines and Skip Footer Lines setup form fields for our Amazon S3, Dropbox, and Google Cloud Storage file connectors. To set these advanced options, enable them using the toggles and specify the number of skipped lines in the input fields.
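As a rough illustration of what these two options do to a file, the sketch below drops a fixed number of lines from the start and end of a file's contents. The function name `skip_lines` and its parameters are hypothetical and chosen only for this example; they are not part of the connector's configuration or API.

```python
from typing import List


def skip_lines(lines: List[str], skip_header: int = 0, skip_footer: int = 0) -> List[str]:
    """Return `lines` without the first `skip_header` and last `skip_footer` entries.

    Hypothetical helper: it mirrors the idea behind the two setup form
    fields, not the connector's actual implementation.
    """
    if skip_header < 0 or skip_footer < 0:
        raise ValueError("skip counts must be non-negative")
    # Slicing with an explicit end index keeps skip_footer=0 working
    # (lines[a:-0] would return an empty list).
    return lines[skip_header:len(lines) - skip_footer]


sample = ["report generated 2024", "id,name", "1,alice", "2,bob", "total rows: 2"]
print(skip_lines(sample, skip_header=1, skip_footer=1))
```

If the two counts together cover the whole file, the result is simply an empty list.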
[Screenshots: the setup form before the change and the setup form now]
You can now opt to use a custom service account to authenticate access to your Google Cloud Storage bucket. For more information, see our setup instructions. We are gradually rolling out this improvement to all existing customers.
We now support syncing headerless delimited-format files (CSV, TSV, log) for Google Cloud Storage connectors. We create generic column names for CSV files without a header line. This feature is in beta and available to all customers. See the configuration options in our files documentation for details.
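To make the idea of generic column names concrete, here is a minimal sketch that reads a headerless delimited file and assigns placeholder names. The naming scheme (`column_0`, `column_1`, ...) and the function `read_headerless` are assumptions made for this example; the names the connector actually generates may differ.

```python
import csv
import io
from typing import Dict, List, Optional


def read_headerless(text: str, delimiter: str = ",") -> List[Dict[str, Optional[str]]]:
    """Parse delimited text that has no header row.

    Hypothetical sketch: column names are generated as column_0, column_1, ...
    (an assumed scheme), and short rows are padded with None.
    """
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    if not rows:
        return []
    width = max(len(row) for row in rows)
    names = [f"column_{i}" for i in range(width)]
    return [dict(zip(names, row + [None] * (width - len(row)))) for row in rows]


for record in read_headerless("1,alice\n2,bob\n"):
    print(record)
```

Passing `delimiter="\t"` handles TSV input in the same way.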
We have improved the way we track which files we have already synced, so that we only pull new or changed data from the source containers. Previously, we re-synced every file created at the same time as the last observed cursor position. That ensured we never missed a file created while we were syncing your data, but it also meant that we sometimes synced the same file twice. Now, in addition to the timestamp, we track the names of the files we have already synced, storing up to 1,000 file names. We sync a file created at the time of the last observed cursor position only if it is not in our list of synced files for that timestamp.
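The selection rule described above can be sketched roughly as follows. This is an illustration only: the `Cursor` class, the `select_files_to_sync` function, and the use of integer timestamps are hypothetical, and the behavior once the 1,000-name cap is reached (names beyond the cap are simply not recorded) is an assumption rather than documented behavior.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Set, Tuple

MAX_TRACKED_NAMES = 1000  # cap on stored file names, as stated in the text


@dataclass
class Cursor:
    """Hypothetical cursor: last observed timestamp plus names synced at it."""
    timestamp: int = -1
    names_at_timestamp: Set[str] = field(default_factory=set)


def select_files_to_sync(files: Iterable[Tuple[str, int]], cursor: Cursor) -> List[str]:
    """Return names of files to sync and advance the cursor in place.

    `files` holds (name, created_at) pairs. Files older than the cursor are
    skipped; files at the cursor timestamp are skipped only if already tracked.
    """
    to_sync: List[str] = []
    for name, created_at in sorted(files, key=lambda f: (f[1], f[0])):
        if created_at < cursor.timestamp:
            continue  # strictly older than the cursor: already synced
        if created_at == cursor.timestamp and name in cursor.names_at_timestamp:
            continue  # same timestamp and already recorded: skip the duplicate
        to_sync.append(name)
        if created_at > cursor.timestamp:
            # Cursor moves forward, so start a fresh name list for the new timestamp.
            cursor.timestamp = created_at
            cursor.names_at_timestamp = set()
        if len(cursor.names_at_timestamp) < MAX_TRACKED_NAMES:
            # Assumption: once the cap is hit, further names are not recorded.
            cursor.names_at_timestamp.add(name)
    return to_sync


cursor = Cursor()
print(select_files_to_sync([("a.csv", 10), ("b.csv", 20)], cursor))
print(select_files_to_sync([("a.csv", 10), ("b.csv", 20), ("c.csv", 20)], cursor))
```

In the second call, `b.csv` shares the cursor timestamp but is already in the tracked list, so only `c.csv` is selected.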
Our Google Cloud Storage connector can now sync Parquet files. We support Parquet format 2.4.0. This feature is in beta.