Oracle GoldenGate topology
The Oracle GoldenGate topology is a representation of the databases in a GoldenGate environment, the GoldenGate components configured on each server, and the flow of data between these components.
Data flowing through the separate trails is read, written, validated, and checkpointed at each stage. GoldenGate is written in the C programming language and, because it runs natively on the operating system, it is very fast: sending, receiving, and validating data has very little impact on overall machine performance. Should performance become an issue because of the sheer volume of data being replicated, consider configuring parallel Extract and/or Replicat processes.
Process topology
The following sections describe the process topology: first the rules that you must adhere to when implementing GoldenGate, then the order in which the processes must execute for end-to-end data replication.
The rules
When using parallel Extract and/or Replicat processes, keep related DDL and DML together in the same process group to preserve data integrity. The topology rules for configuring the processes are as follows:
- All objects that are related to an object are processed by the same group as the parent object
- All DDL and DML for any given database object are processed by the same Extract group and Replicat group
Should a referential constraint exist between tables, the child table with the foreign key must be included in the same Extract and Replicat group as the parent table having the primary key.
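The grouping rule above can be checked mechanically before splitting a workload across parallel groups. The following is a minimal sketch, not a GoldenGate utility: it assumes you have already exported the table names and the (child, parent) foreign key pairs from the data dictionary, and the function name `group_related_tables` and the sample schema are hypothetical. It uses a union-find structure to collect every table that is linked by a referential constraint, directly or through a chain, into one set that must stay in the same Extract and Replicat group.

```python
from collections import defaultdict


def group_related_tables(tables, foreign_keys):
    """Return sorted lists of tables that must share one process group.

    tables: iterable of table names.
    foreign_keys: iterable of (child_table, parent_table) pairs.
    """
    # Each table starts as the representative of its own set.
    parent = {t: t for t in tables}

    def find(t):
        # Walk up to the set representative, halving the path on the way.
        while parent[t] != t:
            parent[t] = parent[parent[t]]
            t = parent[t]
        return t

    for child, ref in foreign_keys:
        # Tolerate tables that appear only in a constraint.
        parent.setdefault(child, child)
        parent.setdefault(ref, ref)
        # Merge the child's set into the parent's set.
        parent[find(child)] = find(ref)

    groups = defaultdict(set)
    for t in parent:
        groups[find(t)].add(t)
    return sorted(sorted(members) for members in groups.values())


# Hypothetical schema used only for illustration.
TABLES = [
    "hr.departments", "hr.employees", "hr.job_history",
    "sales.orders", "sales.order_items", "ref.currencies",
]
FOREIGN_KEYS = [
    ("hr.employees", "hr.departments"),
    ("hr.job_history", "hr.employees"),
    ("sales.order_items", "sales.orders"),
]

if __name__ == "__main__":
    for members in group_related_tables(TABLES, FOREIGN_KEYS):
        print(members)
```

Each list that is printed is a unit that cannot be split: the sets themselves can be distributed across parallel groups, but the tables inside one set cannot.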
Tip
The Replicat process, when configured in Integrated Delivery or Coordinated Delivery mode (either of which can spawn multiple processes), provides built-in intelligence to manage data dependencies, conflict detection, and error handling.
Position
The following tables and associated diagrams help to describe the GoldenGate replication dataflow and position of each link in the process topology for the following two configuration options:
- CDC and data delivery with a data pump
- CDC and data delivery without a data pump
The following diagram illustrates the dataflow for the CDC and data delivery that includes a data pump process:
The following table describes the position of each process in the dataflow.
Start component | End component | Position |
---|---|---|
Extract process | Local trail file | 1 |
Local trail file | Data pump | 2 |
Data pump | Server collector | 3 |
Server collector | Remote trail file | 4 |
Remote trail file | Replicat process | 5 |
The following diagram illustrates the dataflow for the CDC and data delivery without a data pump. Here, the Extract process communicates directly with the server collector.
The following table describes the position of each process in the dataflow.
Start component | End component | Position |
---|---|---|
Extract process | Server collector | 1 |
Server collector | Remote trail file | 2 |
Remote trail file | Replicat process | 3 |
The former topology, which includes a data pump, is preferred because the data pump adds the safeguard of an additional checkpoint in the process dataflow.
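To make the difference between the two options concrete, the sketch below models each dataflow as an ordered list of components and derives the position numbers from that order. It is an illustration only: the component names are taken from the tables above, while the helper names `positions` and `trail_files` are hypothetical. Counting the trail files in each path shows where the extra safeguard comes from, since only the data pump topology writes a trail on the source side before data is sent to the server collector.

```python
# Component order for each configuration option, as listed in the tables above.
WITH_PUMP = [
    "Extract process", "Local trail file", "Data pump",
    "Server collector", "Remote trail file", "Replicat process",
]
WITHOUT_PUMP = [
    "Extract process", "Server collector",
    "Remote trail file", "Replicat process",
]


def positions(components):
    """Pair each component with its successor and number the links from 1."""
    links = zip(components, components[1:])
    return [(start, end, pos) for pos, (start, end) in enumerate(links, start=1)]


def trail_files(components):
    """Return the trail files that appear in a dataflow."""
    return [c for c in components if c.endswith("trail file")]


if __name__ == "__main__":
    for name, flow in (("with data pump", WITH_PUMP),
                       ("without data pump", WITHOUT_PUMP)):
        print(name)
        for start, end, pos in positions(flow):
            print(f"  {pos}: {start} -> {end}")
        print(f"  trail files: {trail_files(flow)}")
```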
Statistics
For performance monitoring, the GGSCI tool provides real-time statistics as well as comprehensive reports for each process configured in the GoldenGate topology. In addition to reporting on demand, it is also possible to schedule reports to run. This can be particularly useful when tuning the performance of a process for a given load and period.
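When the status output is collected by a monitoring script rather than read at the prompt, it helps to turn it into structured data. The following is a minimal sketch under stated assumptions: it assumes the text was captured from the GGSCI `INFO ALL` command and follows the usual column layout of program, status, group, lag at checkpoint, and time since checkpoint. The sample text is hand-written for illustration and is not output from a real system, and the helper names and the 60-second threshold are hypothetical.

```python
# Hand-written sample in the assumed INFO ALL column layout (not real output).
SAMPLE = """\
Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING
EXTRACT     RUNNING     EXT1        00:00:02      00:00:05
EXTRACT     RUNNING     PMP1        00:00:00      00:00:03
REPLICAT    ABENDED     REP1        00:12:30      01:04:10
"""

PROGRAMS = {"MANAGER", "EXTRACT", "REPLICAT"}


def to_seconds(hms):
    """Convert an HH:MM:SS string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def parse_info_all(text):
    """Parse the process rows of INFO ALL-style text into dictionaries."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines, the header, and anything that is not a process row.
        if len(fields) < 2 or fields[0] not in PROGRAMS:
            continue
        row = {"program": fields[0], "status": fields[1]}
        if len(fields) >= 5:  # The Manager row has no group or lag columns.
            row["group"] = fields[2]
            row["lag_seconds"] = to_seconds(fields[3])
            row["since_chkpt_seconds"] = to_seconds(fields[4])
        rows.append(row)
    return rows


def needs_attention(rows, max_lag_seconds=60):
    """Name the processes that are not running or exceed the lag threshold."""
    return [
        row.get("group", row["program"])
        for row in rows
        if row["status"] != "RUNNING"
        or row.get("lag_seconds", 0) > max_lag_seconds
    ]


if __name__ == "__main__":
    print(needs_attention(parse_info_all(SAMPLE)))
```

Because the column layout is an assumption, verify it against the output of your own GoldenGate version before relying on a parser like this.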
The INFO ALL command provides a comprehensive overview of process status and lag, whereas the STATS command shows more detail. Both commands offer real-time reporting. The following example shows the statistical summary of the available information: