Delivery
Making a service available from a single resource has a number of problems. You can only scale up (that is, make the resource bigger), and that has limits. If the resource fails or needs maintenance, the entire service is unavailable. The preferred approach is to scale out by running multiple instances, although this introduces its own set of challenges. These challenges grow when you want to offer the service from multiple locations, with each location having its own resilient instance. You cannot give customers of a service five different URLs and tell them to try each in order; they need a single entry point that distributes requests to a number of backend resources. There are two levels of this load balancing: within a region and between regions.
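To make the single-entry-point idea concrete, here is a minimal sketch in Python. It assumes a simple round-robin policy that skips backends marked unhealthy; the policy, the class name RoundRobinEntryPoint, and the backend names are all hypothetical and chosen for illustration only. It does not describe how any particular Microsoft load balancer is implemented.

```python
class RoundRobinEntryPoint:
    """Hypothetical single entry point that spreads requests over backends."""

    def __init__(self, backends):
        if not backends:
            raise ValueError("at least one backend is required")
        self._backends = list(backends)
        self._healthy = set(self._backends)
        self._next = 0  # index of the backend to try next

    def mark_down(self, backend):
        # Take a backend out of rotation (failure or maintenance).
        self._healthy.discard(backend)

    def mark_up(self, backend):
        # Return a known backend to rotation.
        if backend in self._backends:
            self._healthy.add(backend)

    def pick(self):
        # Try each backend at most once per call, in rotation order,
        # and return the first one that is currently healthy.
        for _ in range(len(self._backends)):
            backend = self._backends[self._next]
            self._next = (self._next + 1) % len(self._backends)
            if backend in self._healthy:
                return backend
        raise RuntimeError("no healthy backends available")


# Assumed backend names; callers only ever talk to the entry point.
entry_point = RoundRobinEntryPoint(["vm-a", "vm-b", "vm-c"])
first_four = [entry_point.pick() for _ in range(4)]
```

Because callers see only the entry point, a backend can fail or be taken down for maintenance (mark_down) without the service as a whole becoming unavailable, which is the benefit of scaling out described above.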
Intra-Region Load Balancing
Microsoft provides two solutions for load balancing within a region. They operate at different layers, so each is the right choice in specific scenarios...