Question
What does HVR Online Refresh do?
Environment
Local Data Processing
Answer
An online refresh is a refresh procedure that can be performed while users are making changes to the source database. The procedure consists of the following steps:
- Stop integration jobs.
- Capture jobs may be left running.
- Choose HVR Refresh in the GUI and select the Online option (parallelism and bulk can also be set here).
- After the refresh completes, restart the integrate jobs.
- Check that the capture job is still running. If it has stopped, restart it.
Stopping the capture job is optional. Stopping it can improve performance, but if the refresh takes a long time, a stopped capture job may then need a long time to catch up.
Please note that the Online Refresh comes with two skipping options (option '-qrw' Read/Write and option '-qwo' Write only) and a no-skipping option ('-qno'). Skipping on the read (capture) side of the refresh only makes sense if the capture job is stopped. No skipping should be used when not all of the data for a particular table will be refreshed (e.g. due to a Restrict /RefreshCondition).
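The guidance above can be condensed into a small decision helper. This is only a sketch of the rules of thumb in this article, not part of HVR; the function name and its two boolean parameters are hypothetical.

```python
def choose_skip_option(capture_stopped: bool, partial_refresh: bool) -> str:
    """Suggest an online-refresh skipping option (hypothetical helper).

    capture_stopped: True if the capture job is stopped during the refresh.
    partial_refresh: True if only part of a table is refreshed,
                     e.g. because of a Restrict /RefreshCondition.
    """
    if partial_refresh:
        # Not all rows are refreshed, so no changes should be skipped.
        return "-qno"
    if capture_stopped:
        # Read-side skipping only makes sense when capture is stopped.
        return "-qrw"
    # Capture keeps running: skip on the write (integrate) side only.
    return "-qwo"


if __name__ == "__main__":
    print(choose_skip_option(capture_stopped=False, partial_refresh=False))
```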
Some Background
The online refresh uses control files to signal to the replication jobs that a refresh of a certain table has taken place. When Read/Write skipping is enabled, the control file instructs the capture job to skip all changes from before the refresh. It also instructs the integrate job to skip all changes from before the refresh and to use resilient integration for changes that happened during the refresh. The integrate job has to be resilient to these changes because we cannot be sure whether the refresh has already picked them up. Resilience means that an insert is turned into an update if the row already exists, an update is turned into an insert if the row does not exist, and a delete of a row that does not exist is ignored.
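The resilience rules can be illustrated with a toy model. This is a minimal sketch that assumes the target table is a plain dictionary keyed by primary key; the function name is hypothetical and the code only mirrors the three rules stated above, not HVR's actual implementation.

```python
def apply_resilient(table: dict, op: str, key, row=None) -> str:
    """Apply one change resiliently and return the action actually taken."""
    if op in ("insert", "update"):
        # Insert on an existing row becomes an update;
        # update on a missing row becomes an insert.
        action = "update" if key in table else "insert"
        table[key] = row
        return action
    if op == "delete":
        if key in table:
            del table[key]
            return "delete"
        # Delete of a row that does not exist is ignored.
        return "ignored"
    raise ValueError(f"unknown operation: {op!r}")


if __name__ == "__main__":
    target = {1: {"name": "refreshed"}}
    print(apply_resilient(target, "insert", 1, {"name": "captured"}))
    print(apply_resilient(target, "update", 2, {"name": "new"}))
    print(apply_resilient(target, "delete", 3))
```

Because each change ends in the same table state whether or not the refresh had already copied the row, replaying changes from the refresh window is safe.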
Write-only skipping means that changes are skipped only on the integrate side; in this case, too, the integrate job is resilient to changes that happened during the refresh.
No-skipping means that the integrate job will be resilient to changes that happened during the refresh, but no changes will be skipped.
The integrate job needs to be suspended because the refresh writes these control files at the very end of the refresh. If the integrate job were left running, all the changes would already have been integrated and the skipping/resilience would have no effect.
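The three modes can be compared side by side in a simplified model. This sketch assumes each change carries a timestamp that can be compared with the start and end of the refresh; in HVR the information is passed through control files, and the function names and mode strings here are hypothetical. How changes from before the refresh are applied under no-skipping is also an assumption: the article only says they are not skipped.

```python
MODES = ("rw", "wo", "no")  # Read/Write skipping, Write-only skipping, no skipping


def capture_skips(mode: str, change_time: float, refresh_start: float) -> bool:
    """True if the capture job skips this change (Read/Write skipping only)."""
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    return mode == "rw" and change_time < refresh_start


def integrate_action(mode: str, change_time: float,
                     refresh_start: float, refresh_end: float) -> str:
    """Return 'skip', 'apply_resilient' or 'apply' for the integrate job."""
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    if change_time < refresh_start:
        # Changes from before the refresh are skipped unless no-skipping is used.
        # Assumption: with no-skipping they are applied as normal changes.
        return "skip" if mode in ("rw", "wo") else "apply"
    if change_time <= refresh_end:
        # The refresh may or may not have picked up these changes already.
        return "apply_resilient"
    return "apply"


if __name__ == "__main__":
    for mode in MODES:
        print(mode, [integrate_action(mode, t, 10, 20) for t in (5, 15, 25)])
```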
Before the refresh starts, "hvrinit" (previously known as "hvrload") can be used to define the start-capture moment, which is not the same as running the capture job. hvrinit defines the start moment of the capture: when the capture job is triggered at a later moment, it goes back to the moment of the hvrinit, or to a specific timestamp in the past if capture rewind has been used (hvrinit -i).
This step is only really needed if you want to forcibly skip all changes from before the hvrinit moment (e.g. because the capture job has been in a failed state for a week and you no longer want to process these changes) or if the capture job has never run before.
It is not necessary to define /OnErrorSaveFailed or /Resilience for an online refresh (this was needed in older versions of HVR).