Replication lag is the amount of time by which a secondary trails the primary. The greater the lag, the more likely it is that read operations directed at a secondary return outdated data. Excessive lag can result from several factors, including the following:
- Network latency: Check local area network (LAN) configuration, firewalls, routing, and communications media.
- Slow disk throughput: Check the state of the filesystem on the server or container hosting the replica set member. Consider using a faster and more up-to-date filesystem.
- Concurrency: Resource-intensive applications can tie up the primary, creating a bottleneck in replication to the secondaries. You may need to refactor such applications. One possibility is to add an appropriate write concern so that writes wait for acknowledgment, which gives secondaries a chance to catch up. This is covered in more detail in the next section of this chapter.
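Whatever the cause, it helps to put a number on the lag. The following is a minimal Python sketch, not a definitive implementation: it assumes a status document shaped like the output of the `replSetGetStatus` command, whose `members` entries carry `name`, `stateStr`, and `optimeDate` fields. The helper name `replication_lag`, the host names, and the sample timestamps are hypothetical and exist only for illustration.

```python
from datetime import datetime, timedelta


def replication_lag(status):
    """Return {member name: lag in seconds} for every secondary.

    `status` is assumed to look like the document returned by the
    replSetGetStatus command: a "members" list whose entries have
    "name", "stateStr", and "optimeDate" keys.
    """
    members = status["members"]
    primary = next((m for m in members if m["stateStr"] == "PRIMARY"), None)
    if primary is None:
        raise ValueError("no primary found in status document")
    # Lag is the gap between the last operation applied on the primary
    # and the last operation applied on each secondary.
    return {
        m["name"]: (primary["optimeDate"] - m["optimeDate"]).total_seconds()
        for m in members
        if m["stateStr"] == "SECONDARY"
    }


# Hypothetical sample data; with a live deployment the document could
# instead come from a driver call that runs replSetGetStatus.
now = datetime(2024, 1, 1, 12, 0, 0)
sample_status = {
    "members": [
        {"name": "db1:27017", "stateStr": "PRIMARY", "optimeDate": now},
        {"name": "db2:27017", "stateStr": "SECONDARY",
         "optimeDate": now - timedelta(seconds=2)},
        {"name": "db3:27017", "stateStr": "SECONDARY",
         "optimeDate": now - timedelta(seconds=45)},
    ]
}

lags = replication_lag(sample_status)
for name, seconds in lags.items():
    print(f"{name} is {seconds:.0f}s behind the primary")
```

In this made-up sample, `db3:27017` trails the primary by 45 seconds and would be the member to investigate against the factors listed above.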
To get information on oplog data synchronization for a given...