How To: Add parallelism by defining multiple Integration Jobs
Scenario:
- There are times when a single replication channel requires multiple integrate jobs to keep the latency within expected service levels. In this scenario, we are demonstrating how to define a single Capture job which routes data to four separate integrate jobs.
- The following example replicates tables from Oracle to SQL Server using the TPCC schema which contains eight tables. The steps are the same regardless of your target database. Two tables will be replicated by each of four integrate jobs.
- This solution uses an approach of defining four separate locations and four location groups. Defining separate Integrate Actions for each table will identify which location group will be assigned for each table.
- The following example assumes that a primary location has already been defined and tested for both Oracle and SQL Server. The exercise will then make new locations for the SQL Server to spread the workload across multiple integrate jobs.
Steps:
Step 1] Make three new locations for your target database location
A] As this solution is using four integrate jobs, we will start by making three new locations for the target database location. Do not make a copy of the location, but instead select New Location.
B] Add a new location named mss2 using the same details as used in the original mss location. Simply append the number 2 to the location name to make it mss2, then hit O Repeat this step until you have a total of four locations for SQL Server with the names of: mss, mss2, mss3, and mss4.
Step 2] Create your Channel
This example is going to create a new replication channel named one2many which will include one source capture job and four integrate jobs for eight tables.
A]Right-click on Channel Definitions and select New Channel
B] Enter any name and description for your new channel. This example used one2many for the channel name with a description that this is Solution for multiple integrate jobs.
Step 3: Create Location Groups
A] Create a total of five locations. One for the source Capture locations. Four for the Integrate locations. This example will name the source as SRC and the target groups as TGT1, TGT2, TGT3, and TGT4.
B].Right-click on Location Groups and New Group
C].Enter a name and description for the Source location, such as SRC and click OK
D].Repeat the process for all four of the Integrate locations.
Step 4: Define which tables are to be replicated by this channel
Using Table Explore, explore the tables located on the Oracle source location.
A] Right-click on Tables and select Table Explore
B] Select the location for Oracle and click Connect
A window will be displayed showing all tables that are available in the database and/or tables that are already defined in another channel.
C] Select the first eight tables in the schema named tpcc click Add
The following window is displaying showing the Base Table Name and the Local Data Processing Table name.
D] Click OK to continue.
The Console returns you to the Table Explorer dialog.
E] Click Close to close the dialog and return
Step 5: Define Capture Action
You will need to define a single Capture action for all source tables that are part of the SRC location group.
A] Right-click on the name of the location group SRC, select à New Action à Capture
All of the default Capture Action parameter values are acceptable for this channel.
B] Click OK to add the Capture Action.
Note that in each of the actions for configuring the Capture job that the action is configured at the group and not the location. The Configuration Action box should not be checked and the location stays grayed out. See below
Step 6: Define Integrate Actions
For all eight tables, you need to define a new Integrate Action to identify which location connection is used for the Integrate job. You will assign the first two tables (customer, district) to Location Group TGT1. The second set of tables (history, item) are assigned to group TGT2. The third set of tables (new_order, order_lines) to group TGT3. And finally, the last two tables (orders, stock) are assigned to TGT4 group.
A] Select the drop down for Tables, and select the first table, customer.
Note that in each of the actions for configuring the Integrate jobs that the action is configured at the group and not the location. The Configuration Action box should not be checked and the location stays grayed out. See below
B] Repeat this process for the second table which should also be assigned the TGT1 group. Select the drop down for Tables, and select the first table, district. Repeat this process by assigning the next two tables to TGT2, TGT3, and TGT4.
C]The resulting Integrate actions should appear like the figure below.
Step 7: Execute Local Data Processing Initialize box
In order to save the runtime components for the channel, execute the Local Data Processing Initialize step.
A] Right-click on the one2many channel and select Local Data Processing Initialize.
For this exercise, we will leave all of the default values for both the Locations tab and the Advanced Options tab. Make sure that the Create or Replace Objects is selected.
You can click on the Advanced options tab to see all the default settings when generating the runtime components.
B] Click on Initialize
The following message is displayed to show that Local Data Processing created five replication jobs, one for Capture and four for Integrate.
a.Click OK to return to the Initialize window
C] Click Close to return back to the Local Data Processing Console
Congratulations! You have completed defining a replication channel that replicates eights tables using a single Capture job while distributing the replication to four Integrate jobs on the target server.
This solution document does not cover the remain steps which is to initially load the data and start the incremental Capture/Integrate jobs with the Local Data Processing scheduler.
Appendix.
Results of running Refresh:
Output from Refresh:
one2many-refr-ora-mss: 2 tables recreated because ‘Always Recreate’ was selected. [option -cf]
Refreshed ‘district’ from location ‘ora’ to location ‘mss’ (10 rows 210 wide compression=49.5%). This bulk refresh took 0.15 seconds.
Refreshed ‘customer’ from location ‘ora’ to location ‘mss’ (30000 rows 861 wide compression=32.1%). This bulk refresh took 1.59 seconds.
Refreshed 2 tables from location ‘ora’ to ‘mss’.
one2many-refr-ora-mss2: 2 tables recreated because ‘Always Recreate’ was selected. [option -cf]
Refreshed ‘history’ from location ‘ora’ to location ‘mss2’ (72958 rows 173 wide compression=89.2%). This bulk refresh took 1.48 seconds.
Refreshed ‘item’ from location ‘ora’ to location ‘mss2’ (100000 rows 149 wide compression=45.2%). This bulk refresh took 1.47 seconds.
Refreshed 2 tables from location ‘ora’ to ‘mss2’.
one2many-refr-ora-mss3: 2 tables recreated because ‘Always Recreate’ was selected. [option -cf]
Refreshed ‘new_order’ from location ‘ora’ to location ‘mss3’ (9175 rows 69 wide compression=99.8%). This bulk refresh took 0.14 seconds.
Refreshed ‘order_line’ from location ‘ora’ to location ‘mss3’ (725176 rows 217 wide compression=81.3%). This bulk refresh took 10.14 seconds.
Refreshed 2 tables from location ‘ora’ to ‘mss3’.
one2many-refr-ora-mss4: 2 tables recreated because ‘Always Recreate’ was selected. [option -cf]
Refreshed ‘orders’ from location ‘ora’ to location ‘mss4’ (72545 rows 169 wide compression=94.3%). This bulk refresh took 1.45 seconds.
Refreshed ‘stock’ from location ‘ora’ to location ‘mss4’ (100000 rows 441 wide compression=26.0%). This bulk refresh took 2.91 seconds.
Refreshed 2 tables from location ‘ora’ to ‘mss4’.
Refresh of 8 tables from location ‘ora’ to 4 other locations finished.
Results of running Integrate jobs: