Question
Operations on string fields are failing with an error similar to the following:
[Microsoft][ODBC Driver 17 for SQL Server][SQL Server] Cannot resolve collation conflict between "Latin1_General_BIN" and "SQL_Latin1_General_CP1_CI_AS" in the equal operation.
[Microsoft][ODBC Driver 17 for SQL Server][SQL Server] Statement(s) could not be prepared.
Environment
Destination: SQL Server
Answer
One of the three following options can resolve this issue:
- Change the
database_default
collation to match that of Fivetran - Account for this COLLATE setting in downstream operations (e.g.
JOIN ON fivetran_replicated_table.id COLLATE SQL_Latin1_General_CP1_CI_AS = table.id COLLATE SQL_Latin1_General_CP1_CI_AS
) - As a very last resort, contact Customer Support who can add a feature flag which allows joins to occur without collation. Only request this if you are 100% confident that all character handling is handled properly in every source.
Cause
This error is caused by incompatible collations. Fivetran uses the Latin1_General_BIN
collate encoding setting during syncs so that string comparisons (e.g. JOIN operations) can be performed in a way that handles:
- Case sensitive strings
- Unusual or non-ASCII values
It's usage handles an edge case where a table has a primary key that uses strings (e.g. result of a hash function).
Take two records which have primary keys with the same character representations but use different cases (e.g. 'abc' and 'ABC'). The collate setting Latin1_General_BIN
will recognise them as different values as it uses case sensitive encoding.
In contrast, if a setting such as SQL_Latin1_General_CP1_CI_AS
is used (which uses case insensitive encoding), it may mark one record as the same as the other (even though it's not). It could also cause a failure similar to the following:
java.lang.RuntimeException: com.microsoft.sqlserver.jdbc.SQLServerException: Violation of PRIMARY KEY constraint 'PK__m_table__3213E83F132AEF4A'.
Cannot insert duplicate key in object 'test_20201112_1923_14xzddh264qyu1c3b7sjs3yuix.m_table'. The duplicate key value is (abc).