This refresh procedure is used for a specific table or tables that have data consistency problems.
Full Load for Small/Medium Sized Tables
Open the refresh dialog
Select source and target location(s)
Select the tables that need to be refreshed
Select the level of parallelism for tables
The Online Refresh options need to be checked, and the option needs to be set to “Skip
Previous Integrate”. This is because we assume two integrate locations. See here for other option explanations
Check the Schedule box and chose a sensible name for the Refresh Task
Click Schedule
Suspend integrate jobs, but capture jobs can be left running
Optionally - Go on to the HVR Scheduler, select the refresh job that has just been
created and chose “New Attribute”. Create a retry_max attribute and set this to the desired value. 0 means no retry.
Right mouse click the refresh job and chose start to kick off the job
Selective Refresh for Large Tables
Open the refresh dialog
Select source and target location(s)
Select the tables that need to be refreshed
Select the applicable context that activates the Restrict /RefreshCondition
Select the level of parallelism for tables (only applicable if multiple tables are refreshed)
The Online Refresh options need to be checked, and the option needs to be set to “Only Resilience”. This is because only a slice of the tables is being refreshed
Check the Schedule box and chose a sensible name for the Refresh Task
Click Schedule
Suspend integrate jobs, but capture jobs can be left running
Optionally - Go on to the HVR Scheduler, select the refresh job that has just been created and chose “New Attribute”. Create a retry_max attribute and set this to the desired value. 0 means no retry
Right mouse click the refresh job and chose start to kick off the job
Refreshes that use a Restrict /RefreshCondition, almost always use "Only Resilience". The only exception is when the /RefreshCondition will match a /CaptureCondition or /Integrate Condition that prevents some data set from being replicated. Selective refreshes to correct for missed data should always be done with ‘Only Resilience’.
For selective refreshes, to correct for missed updates, with conditions on e.g. last_update date; duplicates can occur after the refresh because the old row will still be in there. These duplicates need to be (hard) deleted manually.
Example of Restrict /RefreshCondition
Below an example Restrict /RefreshCondition is given based upon the creation_date and last_update_date columns, where creation_date stands for the date when the row was inserted and last_update_date stands for the last date when the row was updated. Using creation_date is applicable when inserts are missed; using last_update_date is applicable when updates are missed. A context parameter is added /Context=<ctx_name>. This allows the creation of specific refresh tasks with different /RefreshCondition applied.
Example based on creation_date
Source table:
Target table:
The row where CREATION_DATE = 2019-01-02 is missing from the target, therefore we have to gives this date as the condition of the refresh. The syntax of the condition depends on the source and target database. In this case, both the source and target databases are Oracle:
The condition has to be applied to both location groups (source and target), this way the row(s) where CREATION_DATE = 2019-01-02 will be selected from the source table and the rows where CREATION_DATE <> 2019-01-02 won’t be deleted from the target table.
2. Schedule and start the refresh:
a.
b.
c.
3. After scheduling the job, start it and wait until it is finished:
The target table after the refresh:
Source table:
Target table:
Example based on last_update_date
Source table:
Target table:
For some reason an update was not replicated to the target table which is visible from the difference in the last_update column of the row where C1 = 2. To correct this, we have to do a refresh, but first the differing row has to be deleted from the target. If we don’t delete it and we create a condition based on the source table’s last_update_date column’s value, then the refresh job won’t be able to locate the row in the target (since the target last_update_date value differs) and the target row where C1 = 2 won’t be deleted before the row with the correct values is inserted. This is important especially if there are key/unique columns in the target, because this scenario will lead to a violation of the key/unique constraint.
Delete the different row from the target:
Source table:
Target table:
2. Create the refresh condition:
3. Schedule and start the refresh (the steps are detailed in the previous example’s second step)
4. After the refresh is finished, the previously deleted row is inserted with the correct last_update_date column value:
Source table:
Target table:
On the source Restrict /RefreshCondition=”creation_date>to_date('15-MAR-18','dd-mon-yy') and creation_date<to_date('20-MAR-18','dd-mon-yy')” /Context=create_date
On the target Restrict /RefreshCondition=”creation_date > '2018-03-15' and creation_date < '2018-03-20' and hvr_is_deleted=0” /Context=create_date