Community

Connector Improvement: Ignore duplicate with Bulk integration and COALESCE

Answered

Tushar Sharma User

July 29, 2024 10:20

Ignore duplicate with Bulk integration and COALESCE. This is to resolve if Source and target have no primary key and accept duplicate records. HVR consider duplicate during integration

Example error : F_JT0527: Coalesce encountered illegal insert-insert pair for table

or any other combination

Workaround is not efficient and cause huge latency in days for high volume data.

Please sign in to leave a comment.

Comments

4 comments

Mark Van de Wiel User
- August 01, 2024 03:41
Hi Tushar,

HVR provides an option NoDuplicateRows for action TableProperties (https://fivetran.com/docs/hvr6/action-reference/tableproperties#noduplicaterows). Is this what you refer to when you refer to a workaround that is not efficient? Have you used this option? Note that on a table with no keys on the source and duplicate rows the use of this option will result in out of sync conditions.

Thanks,
Mark.
Tushar Sharma User
- August 01, 2024 10:54
Hi Mark

This option was shared and not tested. If using this option cause data inconsistency then its not an option to use.

The table here highlighted the issue is an Audit table with multiple rows for same record inserted through trigger and is part of very high volume transactions and and one of the largest in replications.
Mark Van de Wiel User
- August 01, 2024 20:55
Tushar,

Can you please explain what workaround you refer to? Selecting a rowid?

Note that in my opinion if we know to expect duplicates then we should not generate illegal insert-insert pairs; that would just be a bug.

Thanks,
Mark.
Tushar Sharma User
- August 02, 2024 06:54
Hi Mark

Workaround provided as , to use CycleByteLimit = 1 so only 1 row will process at a time avoid duplicate in single bulk integration batch.

It was a duplicate in some 100MB of data. pending to integrate due to latency.

And we are expecting multiple insert into audit table as main table may have multiple update. each time a new update create a new record in audit. if there is latency and both insert in audit fall into 100MB CycleByteLimit then its shows us duplicate . Index at source (Sybase) is created with allow duplicate and Target SF does not honor any constrains so if HVR allows duplicate to process it will go through.