When you don’t want sensitive data such as personally identifiable information (PII) to be synced to your destination, you can use either of the following methods:
- Column Blocking - exclude the columns containing sensitive data from your syncs.
- Column Hashing - hash the values of the columns that store sensitive data. Hashing anonymizes data by replacing each actual value with a hash value. With this method, the columns are still synced to your destination, but they store hashed values, so you can join datasets using the hashed columns as keys.
The following sections describe the Column Blocking and Column Hashing features in detail, list their specifics and limitations, and provide useful tips and answers to frequently asked questions.
You can block specific columns from replicating to your destination. Column Blocking lets you avoid sending personally identifiable information (PII) to your destination. This may be helpful as a part of your strategy for GDPR compliance.
NOTE: You cannot block primary key columns.
All Fivetran connectors, except Magic Folder connectors, support column blocking.
Does column blocking prevent data from being stored on Fivetran’s servers?
We retain your data for as little time as possible. After you uncheck a column on the connector details page, the data in that column still passes through our systems and may be stored temporarily during a sync. See our Retention of Customer Data documentation for more details.
Learn how to configure Column Blocking and Hashing in our Configure Column Blocking and Hashing Guide.
Column hashing is a method for anonymizing data in your destination while preserving its analytical value. You can join across datasets without introducing sensitive data to your destination. Because column hashing lets you anonymize personally identifiable information (PII) and store it in your destination, it may help with GDPR compliance.
Unlike encryption, where data can be encrypted and then decrypted, hashing is a one-way operation. Once data is hashed, it cannot be restored to its original value.
All Fivetran connectors, except Magic Folder connectors, support column hashing.
When you select Hashed, the next time your data syncs, Fivetran ingests your data, hashes it, and then writes the hashed data to your destination. To add an extra layer of security, Fivetran uses a unique salt per destination to ensure that the data cannot be decoded based on knowledge of the default Fivetran algorithm.
The salt is per destination so that all identical fields, such as email addresses, are still joinable between all data Fivetran loads into that destination.
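As a rough illustration of why a shared salt keeps hashed columns joinable, the sketch below hashes the same email address in two tables with one salt and joins on the result. The `hash_value` helper, the salt strings, and the table contents are hypothetical, and SHA-256 is only a stand-in, since the actual algorithm and salt must be requested from Support.

```python
import base64
import hashlib


def hash_value(value, salt):
    # SHA-256 is only a stand-in; the real algorithm is provided by Support.
    digest = hashlib.sha256((value + salt).encode("utf-8")).digest()
    return base64.b64encode(digest).decode("ascii")


DESTINATION_SALT = "example_destination_salt"  # hypothetical salt value

# Two tables loaded into the same destination, both with a hashed email column.
users = [{"email": hash_value("jane@example.com", DESTINATION_SALT), "plan": "pro"}]
tickets = [{"email": hash_value("jane@example.com", DESTINATION_SALT), "ticket_id": 17}]

# Join on the hashed column: equal inputs plus the same salt give equal hashes.
plan_by_email = {row["email"]: row["plan"] for row in users}
joined = [
    {"ticket_id": t["ticket_id"], "plan": plan_by_email[t["email"]]}
    for t in tickets
    if t["email"] in plan_by_email
]

# A different salt (as another destination would have) yields a different hash.
other_destination_hash = hash_value("jane@example.com", "another_destination_salt")
```

Here `joined` contains the ticket matched to its plan even though neither table stores the raw email address, while `other_destination_hash` does not match the hashes above.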
NOTE: Once you choose to hash a column, we apply column hashing to all future syncs. If you also wish to hash the historical data in that column, you must re-sync.
If you are loading data with your own process into the same destination and would like to match our hashing, a user with the Account Administrator role of your Fivetran account can request the salt and hashing method from Fivetran support.
Calculate your own hashes
To generate the same hash value that we use for columns, perform the following steps:
1. Contact our Support team for the salt value and the hashing algorithm used for your connector.
2. Add the salt value as a suffix to the original value of the column. For example, if the value of your column is `foo_bar` and the salt value is `i_am_the_secret_salt`, the concatenated value is `foo_bari_am_the_secret_salt`.
   TIP: If the original column value is `null`, replace the value with an empty string.
3. Convert the value (from Step 2) to byte array (`byte[]`) format using UTF-8 encoding.
   TIP: In Java, you can convert using the `String.getBytes(StandardCharsets.UTF_8)` method.
4. Use the byte array (from Step 3) as an input to the hashing algorithm.
5. Convert the output byte array to a Base64 encoded string. This encoded string is the final hash value.
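The steps above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not Fivetran's implementation: the function name `salted_hash` is hypothetical, column values are assumed to be strings, and the `sha256` default is a placeholder for whichever algorithm Support tells you is used for your connector.

```python
import base64
import hashlib


def salted_hash(value, salt, algorithm="sha256"):
    """Hash a column value with a suffix salt and return a Base64 string.

    `algorithm` is an assumption: pass the name Support gives you, as long
    as it is one that hashlib supports.
    """
    # A null (None) value is replaced with an empty string.
    if value is None:
        value = ""
    # Step 2: append the salt as a suffix.
    salted = value + salt
    # Step 3: convert to bytes using UTF-8.
    data = salted.encode("utf-8")
    # Step 4: feed the bytes to the hashing algorithm.
    digest = hashlib.new(algorithm, data).digest()
    # Step 5: Base64-encode the raw digest bytes.
    return base64.b64encode(digest).decode("ascii")


# Values taken from the example in Step 2.
hashed = salted_hash("foo_bar", "i_am_the_secret_salt")
print(hashed)
```

Note that the Base64 encoding is applied to the raw digest bytes, not to a hexadecimal string of the digest; mixing the two up is a common reason for mismatched hashes.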
Learn how to configure Column Blocking and Hashing and schema change settings in our Configure Column Blocking and Hashing Guide.