Other: Automatic Unicode Support for Target Columns Using NVARCHAR/NCHAR
AnsweredSummary:
Enable HVR to automatically map source UTF-8/CESU-8 encoded character columns to NVARCHAR (or NCHAR) on the target system to preserve Unicode characters and prevent encoding issues.
Current Behavior
- Source: Oracle Database (UTF-8 → CESU-8)
- Target: Azure SQL Database (SQL_Latin1_General_CP1_CI_AS → WINDOWS-1252)
- When replicating data, columns defined as
VARCHARon the target cannot store UTF-8 characters. - Example:
-- Source
select substr(groes,5,1), ascii(substr(groes,5,1)) from mara where matnr='000000000001253780';
Result: ⅛ (ASCII: 14845339)
-- Target
select substring(groes,5,1), ascii(substring(groes,5,1)) from mara where matnr='000000000001253780';
Result: ? (ASCII: 63)
- Data integrity is lost because special characters are replaced with
?.
Problem
- Current workaround requires manual configuration:
- Adding
ColumnPropertiesaction withDatatype=nvarcharand specifying column length. - This is impractical for large schemas and dynamic environments.
- Adding
- With table property with the coercion type 'ENCODING'
- End users see incorrect text in reports and analytics, see above.
Expected Behavior
- HVR should automatically detect source encoding (UTF-8/CESU-8) and map all
CHAR/VARCHARcolumns toNCHAR/NVARCHARon the target, if target does not support UTF-8. - Preserve all Unicode characters during replication.
Business Impact
- Ensures accurate data representation in data warehouses.
- Eliminates coercion errors
- Reduces manual intervention and risk of human error.
-
Hi Marc,
Thanks , interesting use case. We’ll investigate whether HVR can auto-detect UTF-8/CESU-8 and map CHAR/VARCHAR → NCHAR/NVARCHAR when a target doesn’t support UTF-8 and if we can add those capabilities to the roadmap. In the meantime, you would best use ColumnProperties with Datatype=nvarchar (and appropriate length) or the table property coercion type ENCODING, as you are probably already doing.
Best regards,
Edwin
Please sign in to leave a comment.
Comments
1 comment