Administering a Coordinated Replicat Configuration

This section contains instructions for coordinating threads and re-partitioning the workload among new or different threads. A coordinated Replicat should be stopped cleanly with the STOP REPLICAT command before making modifications to the partition specifications in THREAD or THREADRANGE clauses of the MAP statements. A clean stop ensures that all of the threads, which may be at different locations in the trail at any given point, finish their work and arrive at a common trail location.

At startup, Replicat issues an error and abends if it detects that the last shutdown was not clean and that the partitioning in the MAP statements was changed to contain a different number of threads (threads were added or removed). However, if the same threads are kept in the parameter file but simply rearranged among different MAP statements, Replicat issues a warning but does not abend. Such a rearrangement can still result in missing or duplicate records, because there is no way to ensure continuity of the thread-to-workload allocations from the previous run.

The following example illustrates this condition. The original partitioning scheme is:

MAP source, target, THREADRANGE(1-5);
MAP source1, target1, THREADRANGE(6-10);

The following re-partitioning of the original scheme produces only a warning:

MAP source, target, THREADRANGE(1-4);
MAP source1, target1, THREADRANGE(5-10);
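
The rule described above can be checked before restarting Replicat. The following is a minimal Python sketch, not part of Oracle GoldenGate, that collects the thread IDs named in THREAD and THREADRANGE clauses and compares two sets of MAP statements. The function names are hypothetical, and treating a same-sized but different set of thread IDs as "threads added or removed" is an assumption.

```python
import re

# Matches THREAD(n) or THREADRANGE(lo-hi ...) inside a MAP statement.
_SPEC = re.compile(r"\bTHREAD(RANGE)?\s*\(\s*(\d+)(?:\s*-\s*(\d+))?", re.IGNORECASE)


def thread_ids(map_statements):
    """Return the set of thread IDs referenced by the given MAP statements."""
    ids = set()
    for stmt in map_statements:
        for is_range, lo, hi in _SPEC.findall(stmt):
            if is_range and hi:
                ids.update(range(int(lo), int(hi) + 1))
            else:
                ids.add(int(lo))
    return ids


def classify_repartitioning(old_maps, new_maps):
    """Predict the startup outcome of a change made after an unclean shutdown."""
    if thread_ids(old_maps) != thread_ids(new_maps):
        # Assumption: any difference in the thread set counts as added/removed threads.
        return "abend"
    if list(old_maps) != list(new_maps):
        # Same threads, rearranged among the MAP statements.
        return "warning"
    return "unchanged"
```

Applied to the two schemes shown above, both of which use threads 1 through 10, classify_repartitioning returns "warning".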

This section provides instructions for cleanly shutting down Replicat before performing a re-partitioning, as well as instructions for attempting to recover Replicat continuity when a re-partitioning is performed after an unclean shutdown.

The following tasks can be performed for a Replicat group in coordinated mode.

Performing a Planned Re-partitioning of the Workload

A planned re-partitioning is when Replicat is allowed to shut down cleanly before it is started again with a new parameter file that contains updated thread partitioning. A clean shutdown enables all of the threads to arrive at a common checkpoint position in the trail. At that point, the new partitioning scheme can be applied in the next run.

  1. Run Admin Client.
  2. Stop Replicat.
    STOP REPLICAT group
    
  3. Open the parameter file for editing.
    EDIT PARAMS group
    
  4. Make the required changes to the THREAD or THREADRANGE specifications in the MAP statements.
  5. Save and close the parameter file.
  6. Start Replicat.
    START REPLICAT group

Recovering Replicat After an Unplanned Re-partitioning

An unplanned re-partitioning is when Replicat is not allowed to shut down cleanly before it is started again with a new parameter file that contains updated thread partitioning. In this scenario, some or all of the old threads were not able to finish their work and arrive at a common checkpoint. Upon restart, the coordinator thread attempts to apply the old partitioning scheme, and Replicat abends with an error. You can recover the coordinated Replicat group from this condition in one of the following ways:

  • Use the auto-saved copy of the parameter file

  • Reprocess from the low watermark with HANDLECOLLISIONS

Reprocessing From the Low Watermark with HANDLECOLLISIONS

In this procedure, you reposition all of the threads to the low watermark position. This is the earliest checkpoint position recorded among all of the threads. To state it another way, the low watermark position is the last record processed by the slowest thread before the unclean stop. When you start Replicat, the threads reprocess the operations that they were processing before Replicat stopped. The HANDLECOLLISIONS parameter handles any duplicate-record and missing-record errors that occur as the faster threads reprocess operations that they already applied before the unclean stop.

  1. Add the HANDLECOLLISIONS parameter to the Replicat parameter file. It is not necessary to use any THREADS options.
  2. Issue the INFO REPLICAT command for the Replicat group as a whole (the coordinator thread). Make a record of the RBA of the checkpoint. This is the low watermark value. This output also shows you the active thread IDs under the Group Name column. Make a record of these, as well.
    INFO REPLICAT group

    For example:

    OGG(slc03jgo) 3> info ra detail
    REPLICAT   RA       Last Started 2013-05-01 14:15   Status ABENDED
    COORDINATED          Coordinator                      MAXTHREADS 15
    Checkpoint Lag       00:00:00 (updated 00:00:07 ago)
    Process ID           11445
    Log Read Checkpoint  File ./dirdat/withMaxTransOp/bg000000001
                         2013-05-02 07:49:45.975662  RBA 44704
    Lowest Log BSN value: (requires database login)
    Active Threads:
    ID  Group Name PID   Status   Lag at Chkpt  Time Since Chkpt
    1   RA001     11454 ABENDED  00:00:00      00:00:01
    2   RA002     11455 ABENDED  00:00:00      00:00:04
    3   RA003     11456 ABENDED  00:00:00      00:00:01
    5   RA005     11457 ABENDED  00:00:00      00:00:02
    6   RA006     11458 ABENDED  00:00:00      00:00:04
    7   RA007     11459 ABENDED  00:00:00      00:00:04
    
  3. Issue the INFO REPLICAT command for each processing thread ID and record the RBA position of each thread. Make a note of the highest RBA. This is the high watermark of the Replicat group.
    INFO REPLICAT threadID

    For example:

    info ra002
    REPLICAT   RA002    Last Started 2013-05-01 14:15   Status ABENDED
    COORDINATED          Replicat Thread                  Thread 2
    Checkpoint Lag       00:00:00 (updated 00:00:06 ago)
    Process ID           11455
    Log Read Checkpoint  File ./dirdat/withMaxTransOp/bg000000001
                         2013-05-02 07:49:15.837271  RBA 45603
    
  4. Issue the ALTER REPLICAT command for the coordinator thread (Replicat as a whole, without any thread ID) and position to the low watermark RBA that you recorded.
    ALTER REPLICAT group EXTRBA low_watermark_rba
  5. Start Replicat.
    START REPLICAT group
  6. Issue the basic INFO REPLICAT command repeatedly until it shows an RBA that is higher than the high watermark that you recorded. HANDLECOLLISIONS handles any collisions that occur due to previously applied transactions.
    INFO REPLICAT group
  7. Stop Replicat.
    STOP REPLICAT group
  8. Remove or comment out the HANDLECOLLISIONS parameter.
  9. Start Replicat.
    START REPLICAT group
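
The bookkeeping in steps 2, 3, and 6 can be sketched in Python. This is an illustrative helper, not an Oracle GoldenGate tool: it pulls the RBA out of saved INFO REPLICAT output, derives the low and high watermarks, and tests whether Replicat has passed the high watermark. The function names are hypothetical, and the sketch assumes that all checkpoints fall within the same trail file, as in the example output above, so that RBA values are directly comparable.

```python
import re

# The checkpoint RBA appears as "RBA <number>" in INFO REPLICAT output.
_RBA = re.compile(r"\bRBA\s+(\d+)")


def checkpoint_rba(info_output):
    """Extract the first checkpoint RBA from INFO REPLICAT output text."""
    match = _RBA.search(info_output)
    if match is None:
        raise ValueError("no RBA found in INFO REPLICAT output")
    return int(match.group(1))


def watermarks(coordinator_output, thread_outputs):
    """Return (low, high): the coordinator RBA and the highest thread RBA."""
    low = checkpoint_rba(coordinator_output)
    high = max(checkpoint_rba(text) for text in thread_outputs)
    return low, high


def past_high_watermark(current_output, high):
    """True once the reported RBA is strictly higher than the high watermark."""
    return checkpoint_rba(current_output) > high
```

With the example output shown in steps 2 and 3, the coordinator reports RBA 44704 and thread RA002 reports RBA 45603. If RA002 holds the highest RBA among the threads, the low watermark is 44704 and Replicat must advance beyond 45603 before HANDLECOLLISIONS is removed.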

Using the Auto-Saved Parameter File

A copy of the original parameter file is saved whenever the parameter file is edited without Replicat first being shut down cleanly. You can revert to this parameter file and then resynchronize the threads so that they all catch up to the thread that had the most recent checkpoint position. Once the threads are synchronized, you can switch to the new parameter file and then start Replicat.

  1. Save the new parameter file to a different name, and then rename the saved original parameter file to the correct name (same as the group name). The saved parameter file has a .backup suffix and is stored in the dirprm subdirectory of the Oracle GoldenGate installation directory.
  2. Issue the following command to synchronize the Replicat threads to the maximum checkpoint position. This command automatically starts Replicat and executes the threads until they reach the maximum checkpoint position.
    SYNCHRONIZE REPLICAT group
    
  3. Issue the STATUS REPLICAT command repeatedly until it shows that Replicat stopped cleanly.
    STATUS REPLICAT group
    
  4. Save the original parameter file to a different name, and then rename the new parameter file to the group name.
  5. Start Replicat.
    START REPLICAT group
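
The file renames in steps 1 and 4 follow the same pattern: park the active parameter file under another name, then move a replacement into its place. The following Python sketch shows that pattern under stated assumptions. The function name is hypothetical, and the caller supplies every path, because the exact file names in the dirprm subdirectory are not specified here.

```python
from pathlib import Path


def swap_param_file(active, replacement, parked):
    """Move `active` to `parked`, then move `replacement` to `active`."""
    active, replacement, parked = Path(active), Path(replacement), Path(parked)
    # Check everything first so that a failed swap leaves the files untouched.
    for required in (active, replacement):
        if not required.exists():
            raise FileNotFoundError(required)
    if parked.exists():
        raise FileExistsError(parked)
    active.rename(parked)
    replacement.rename(active)
```

For step 1, `active` would be the new parameter file, `replacement` the saved copy with the .backup suffix, and `parked` any unused name. For step 4, the roles are reversed: the parked new file becomes the replacement.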