We have added a new advanced configuration option List Strategy to the connector setup form. Enable the Show Advanced Options toggle and select the listing strategy you want to use:
complete_listing: The default listing strategy where we list all the files in the bucket and filter new and modified files.
time_based_pattern_listing: An optimized listing strategy to improve sync speeds. We fetch the files in lexicographic order, starting from the last file of the previous sync. You can opt to use this strategy if your files are named based on the date or time they are added to the bucket. For example, the following bucket has a time-based pattern in which every new file is added in lexicographically increasing order:
Filename          Last Modified
2021/01/01.csv    January 1, 2021
2021/01/02.csv    January 2, 2021
2021/01/03.csv    January 3, 2021
2021/01/04.csv    January 4, 2021
NOTE: If we are unable to identify a time-based pattern in the file names, we will use the default strategy and list all the files.
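The idea behind time_based_pattern_listing can be sketched in a few lines of Python. This is only an illustration, not the connector's implementation: the function names follows_time_pattern and keys_to_sync are hypothetical, and the pattern check (file-name order must agree with last-modified order) is an assumption about how such a pattern could be detected.

```python
from bisect import bisect_right
from datetime import date


def follows_time_pattern(objects):
    """Return True if sorting by file name also sorts by last-modified date.

    `objects` is a list of (key, last_modified) tuples. This check is an
    assumed stand-in for whatever pattern detection the connector performs.
    """
    by_name = sorted(objects, key=lambda o: o[0])
    times = [o[1] for o in by_name]
    return all(a <= b for a, b in zip(times, times[1:]))


def keys_to_sync(objects, last_synced_key):
    """Pick the keys to examine in the next sync.

    With a time-based pattern, only keys that sort strictly after the last
    synced key are returned. Otherwise every key is returned, mirroring the
    fallback to a complete listing (filtering for new and modified files
    would then happen in a later step, not shown here).
    """
    keys = sorted(key for key, _ in objects)
    if last_synced_key is None or not follows_time_pattern(objects):
        return keys
    return keys[bisect_right(keys, last_synced_key):]


objects = [
    ("2021/01/01.csv", date(2021, 1, 1)),
    ("2021/01/02.csv", date(2021, 1, 2)),
    ("2021/01/03.csv", date(2021, 1, 3)),
    ("2021/01/04.csv", date(2021, 1, 4)),
]
print(keys_to_sync(objects, "2021/01/02.csv"))
```

With the example bucket and a previous sync that ended at 2021/01/02.csv, only the two later files are returned, which is why this strategy can be faster than listing the whole bucket.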
We now support syncing headerless delimited format files (CSV, TSV, log) for S3 connectors. We will create generic column names for CSV files without a header line. This feature is in beta and available to all customers. See the configuration options in our files documentation for details.
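As a rough illustration of what generic column names for a headerless file might look like, the following Python sketch reads delimited text and labels the columns by position. The naming scheme column_0, column_1, ... and the function read_headerless are assumptions made for this example; the actual generated names are described in the files documentation.

```python
import csv
import io


def read_headerless(text, delimiter=","):
    """Parse delimited text that has no header line.

    Column names are generated by position (column_0, column_1, ...).
    This naming scheme is hypothetical, chosen only for illustration.
    """
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    width = max((len(row) for row in rows), default=0)
    names = [f"column_{i}" for i in range(width)]
    return [dict(zip(names, row)) for row in rows]


records = read_headerless("1,alice\n2,bob\n")
print(records)
```

The same function handles TSV input when called with delimiter="\t".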
We now exclude S3 objects that have been archived to Glacier storage class from our data syncs. If you want us to sync these objects, restore them to standard storage.
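To preview which objects would be skipped, you could filter a bucket listing by storage class. The sketch below is an assumption-laden illustration: it expects dictionaries shaped like the entries in an S3 object listing (with Key and StorageClass fields), and the set of classes treated as archived is a guess for this example, not a statement of the connector's exact rule.

```python
# Storage classes treated as archived in this sketch (an assumption).
ARCHIVED_CLASSES = {"GLACIER", "DEEP_ARCHIVE"}


def syncable_objects(listing):
    """Return the entries whose storage class is not an archived class.

    Entries without a StorageClass field are treated as STANDARD.
    """
    return [
        obj for obj in listing
        if obj.get("StorageClass", "STANDARD") not in ARCHIVED_CLASSES
    ]


listing = [
    {"Key": "2021/01/01.csv", "StorageClass": "STANDARD"},
    {"Key": "2020/12/31.csv", "StorageClass": "GLACIER"},
    {"Key": "2021/01/02.csv"},
]
print([obj["Key"] for obj in syncable_objects(listing)])
```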
You can now connect to public S3 buckets without needing an AWS account.
Our Amazon S3 connector can now sync Parquet files. We support Parquet format 2.4.0. This feature is in beta.
We now require you to configure a Role ARN when you set up an Amazon S3 connector, even if your bucket is a public bucket.
Existing Amazon S3 connectors will not be affected. However, if you reconfigure an existing connector or create a new one, you will have to configure a Role ARN even if your bucket is a public bucket.
We have added a new connection test to the Amazon S3 setup form that helps you adopt security best practices. The test checks your bucket policy settings to make sure you have created your Role ARN using the given externalID.
If you mistakenly create your Role ARN without using the given externalID, the configuration will now fail the test, and you will have to correct your Role ARN.
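If you want to check a role yourself before running the connection test, one option is to inspect the role's trust policy for an sts:ExternalId condition. The Python sketch below is not the connection test itself; the function trust_policy_uses_external_id is hypothetical, and the account ID and external ID values are placeholders.

```python
def trust_policy_uses_external_id(policy, external_id):
    """Return True if an Allow statement requires the given external ID.

    `policy` is an IAM trust policy document as a dict. Only the
    StringEquals / sts:ExternalId condition is examined in this sketch.
    """
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):
        # A policy may contain a single statement instead of a list.
        statements = [statements]
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        condition = statement.get("Condition", {}).get("StringEquals", {})
        value = condition.get("sts:ExternalId")
        values = value if isinstance(value, list) else [value]
        if external_id in values:
            return True
    return False


# Placeholder values for illustration only.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"AWS": "arn:aws:iam::123456789012:root"},
            "Action": "sts:AssumeRole",
            "Condition": {"StringEquals": {"sts:ExternalId": "example_external_id"}},
        }
    ],
}
print(trust_policy_uses_external_id(policy, "example_external_id"))
```

A policy with no Condition block, or with a different external ID, makes the function return False, which corresponds to the misconfiguration described above.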
We can now sync data from public buckets.
Our setup form now correctly handles directory patterns that don’t end with