Question
Operations on string fields are failing with an error similar to the following:
[Microsoft][ODBC Driver 17 for SQL Server][SQL Server] Cannot resolve collation conflict between "Latin1_General_BIN" and "SQL_Latin1_General_CP1_CI_AS" in the equal operation.
[Microsoft][ODBC Driver 17 for SQL Server][SQL Server] Statement(s) could not be prepared.
Environment
Destination: SQL Server
Answer
One of the following three options can typically resolve this issue:
- Change the
database_default
collation to match that of Fivetran. - Account for this COLLATE setting in downstream operations (e.g.
JOIN ON fivetran_replicated_table.id COLLATE SQL_Latin1_General_CP1_CI_AS = table.id COLLATE SQL_Latin1_General_CP1_CI_AS
) - As a very last resort, contact Customer Support who can add a feature flag that allows joins to occur without collation. It is recommended that you only request this if you are confident that all character handling is handled properly in every source.
Cause
This error is caused by incompatible collations. Fivetran uses the Latin1_General_BIN
collate encoding setting during syncs so that string comparisons (e.g., JOIN operations) can be performed in a way that handles:
- Case sensitive strings
- Unusual or non-ASCII values
Its usage handles an edge case where a table has a primary key that uses strings (e.g., a result of a hash function).
Take two records that have primary keys with the same character representations but use different cases, such as abc and ABC. The collate setting Latin1_General_BIN
will recognise them as different values. This is because it uses case-sensitive encoding.
In contrast, if a setting such as SQL_Latin1_General_CP1_CI_AS
is used (which uses case-insensitive encoding), it may mark the two records as the same, even though they're not. It could also cause an error similar to the following:
java.lang.RuntimeException: com.microsoft.sqlserver.jdbc.SQLServerException: Violation of PRIMARY KEY constraint 'PK__m_table__3213E83F132AEF4A'.
Cannot insert duplicate key in object 'test_20201112_1923_14xzddh264qyu1c3b7sjs3yuix.m_table'. The duplicate key value is (abc).