Detecting a Network Bottleneck that is Affecting Oracle GoldenGate

To detect a network bottleneck that is affecting the throughput of Oracle GoldenGate, follow these steps.

  1. Issue the following command to view the ten most recent Extract checkpoints. If you are using a data-pump Extract on the source system, issue the command for the primary Extract and also for the data pump.
    INFO EXTRACT group, SHOWCH 10
    
  2. Look for the Write Checkpoint statistic, which shows the position in the trail where Extract is currently writing.
    Write Checkpoint #1
    
    GGS Log Trail
    Current Checkpoint (current write position):
       Sequence #: 2
       RBA: 2142224
       Timestamp: 2011-01-09 14:16:50.567638
       Extract Trail: ./dirdat/eh
    
  3. For both the primary Extract and data pump:
    • Determine how many checkpoints are listed: only one or two, or more. There can be up to ten.

    • Find the Write Checkpoint n heading that has the highest increment number (for example, Write Checkpoint #8) and make a note of the Sequence, RBA, and Timestamp values. This is the most recent checkpoint.

  4. Refer to the information that you noted, and make the following validation:
    • Is the primary Extract generating a series of checkpoints, or just the initial checkpoint?

    • If a data pump is in use, is it generating a series of checkpoints, or just one?

  5. Issue INFO EXTRACT for the primary and data pump Extract processes again.
    • Has the most recent write checkpoint advanced? Compare the most recent Sequence, RBA, and Timestamp values with the values that you noted from the previous INFO EXTRACT command.

  6. Issue the following command to view the status of the Replicat process.
    SEND REPLICAT group, STATUS
    
    • The status indicates whether Replicat is delaying (waiting for data to process), processing data, or at the end of the trail (EOF).
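
The comparison in steps 2 through 5 can be scripted. The following Python sketch is illustrative only: it assumes that the SHOWCH output has been captured as text and that each write checkpoint follows the layout of the sample in step 2. The names WriteCheckpoint, parse_write_checkpoints, latest_write_checkpoint, and has_advanced are hypothetical helpers, not part of Oracle GoldenGate.

```python
import re
from typing import List, NamedTuple, Optional


class WriteCheckpoint(NamedTuple):
    number: int      # the n in "Write Checkpoint #n"
    sequence: int    # trail file sequence number
    rba: int         # relative byte address within that trail file
    timestamp: str   # kept as text; used only for the operator's notes


# Assumed heading layout, taken from the sample output in step 2.
_HEADING = re.compile(r"^[ \t]*Write Checkpoint #(\d+)[ \t]*$", re.MULTILINE)


def parse_write_checkpoints(text: str) -> List[WriteCheckpoint]:
    """Return every write checkpoint found in captured SHOWCH output."""
    headings = list(_HEADING.finditer(text))
    found = []
    for i, heading in enumerate(headings):
        # A section runs from its heading to the next heading (or end of text).
        end = headings[i + 1].start() if i + 1 < len(headings) else len(text)
        section = text[heading.end():end]
        seq = re.search(r"Sequence #:\s*(\d+)", section)
        rba = re.search(r"RBA:\s*(\d+)", section)
        ts = re.search(r"Timestamp:\s*(\S+ \S+)", section)
        if seq and rba and ts:
            found.append(WriteCheckpoint(int(heading.group(1)),
                                         int(seq.group(1)),
                                         int(rba.group(1)),
                                         ts.group(1)))
    return found


def latest_write_checkpoint(text: str) -> Optional[WriteCheckpoint]:
    """Pick the checkpoint with the highest heading number (step 3)."""
    checkpoints = parse_write_checkpoints(text)
    return max(checkpoints, key=lambda c: c.number) if checkpoints else None


def has_advanced(before: WriteCheckpoint, after: WriteCheckpoint) -> bool:
    """True if the write position moved forward between two samples (step 5)."""
    return (after.sequence, after.rba) > (before.sequence, before.rba)


# Sample text copied from step 2 of this procedure.
SAMPLE_SHOWCH = """\
Write Checkpoint #1

GGS Log Trail
Current Checkpoint (current write position):
   Sequence #: 2
   RBA: 2142224
   Timestamp: 2011-01-09 14:16:50.567638
   Extract Trail: ./dirdat/eh
"""
```

To use the sketch, capture the SHOWCH output twice, some time apart, call latest_write_checkpoint on each capture, and pass the two results to has_advanced. Sequence and RBA are compared as a pair, so a rollover to a new trail file with a lower RBA still counts as forward progress.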

There is a network bottleneck if Replicat is either in delay mode or at the end of the trail file, and either of the following is true:

  • You are using only a primary Extract, and its write checkpoint is not increasing or is increasing too slowly. Because this Extract process is responsible for sending data across the network, it will eventually run out of memory to hold the backlog of extracted data and abend.

  • You are using a data pump, and its write checkpoint is not increasing, but the write checkpoint of the primary Extract is increasing. In this case, the primary Extract can write to its local trail, but the data pump cannot write to the remote trail. The data pump will abend when it runs out of memory to contain the backlog of extracted data. The primary Extract will run until it reaches the last file in the trail sequence and will abend because it cannot make a checkpoint.
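
The decision rule above can be written as a small predicate. This is a minimal Python sketch, not an Oracle GoldenGate utility: the function name and its parameters are hypothetical, and the caller is assumed to have already judged from the GGSCI output whether each checkpoint is advancing. A checkpoint that is increasing too slowly should be passed as not advancing.

```python
from typing import Optional


def network_bottleneck_suspected(replicat_waiting: bool,
                                 primary_advancing: bool,
                                 pump_advancing: Optional[bool] = None) -> bool:
    """Apply the rule stated in this section.

    replicat_waiting  -- Replicat is in delay mode or at the end of the trail
    primary_advancing -- the primary Extract write checkpoint is increasing
    pump_advancing    -- the data pump write checkpoint is increasing,
                         or None when no data pump is in use
    """
    if not replicat_waiting:
        # Replicat is still processing data, so the rule does not apply.
        return False
    if pump_advancing is None:
        # Primary Extract only: it writes across the network itself.
        return not primary_advancing
    # Data pump in use: the local trail grows, but the remote trail does not.
    return primary_advancing and not pump_advancing
```

For example, network_bottleneck_suspected(True, True, False) evaluates to True, which matches the data pump case. When a data pump is in use and neither checkpoint is advancing, the function returns False because the rule in this section does not cover that combination.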

Note:

Even when there is a network outage, Replicat will process in a normal manner until it applies all of the remaining data from the trail to the target. Eventually, it will report that it reached the end of the trail file.