You're reading from Building Serverless Web Applications Develop scalable web apps using the Serverless Framework on AWS

Product type Paperback

Published in Jul 2017

Publisher Packt

ISBN-13 9781787126473

Length 354 pages

Edition 1st Edition

Languages

JavaScript

Tools

AWS

Concepts

Serverless

Author (1):

Diego Zanon

View More author details

Introducing serverless

Serverless can be a model, a type of architecture, a pattern, or anything else you prefer to call it. For me, serverless is an adjective, a word that qualifies a way of thinking. It's a way to abstract how the code that you write will be executed. Thinking serverless is to not think in servers. You code, you test, you deploy, and that's (almost) enough.

Serverless is a buzzword. You still need servers to run your applications, but you should not worry about them that much. Maintaining a server is none of your business. The focus is on development and writing code, and not in the operation.

DevOps is still necessary, although with a smaller role. You need to automate the deployment and have at least a minimal monitoring of how your application is operating and how much it costs, but you don't need to start or stop machines to match the usage and neither do you need to replace failed instances or apply security patches to the operating system.

Thinking serverless

A serverless solution is entirely event-driven. Every time that a user requests some information, a trigger will notify your cloud vendor to pick your code and execute it to retrieve the answer. In contrast, a traditional solution also works to answer requests, but the code is always up and running, consuming machine resources that were reserved specifically for you, even when no one is using your system.

In a serverless architecture, it's not necessary to load the entire codebase into a running machine to process a single request. For a faster loading step, only the code that is necessary to answer the request is selected to run. This small piece of the solution is referenced as a function. So we only run functions on demand.

Although we call it simply as a function, it's usually a zipped package that contains a piece of code that runs as an entry point along with its dependencies.

In the following diagram, the serverless model is illustrated in a sequence of steps. It's an example of how a cloud provider could implement the concept, though it doesn't have to implement it in this way:

Let's understand the following steps shown in the preceding diagram:

The user sends a request to an address handled by the cloud provider.
Based on the message, the cloud service tries to locate which package must be used to answer the request.
The package (or function) is selected and loaded into a Docker container.
The container is executed and outputs an answer.
The answer is sent to the original user.

What makes the serverless model so interesting is that you are only billed for the time that was needed to execute your function, usually measured in fractions of seconds, not hours of use. If no one is using your service, you pay nothing.

Also, if you have a sudden peak of users accessing your application, the cloud service will load different instances to handle all simultaneous requests. If one of those cloud machines fails, another one will be made available automatically, without needing to configure anything.

Serverless and PaaS

Serverless is often confused with Platform as a Service (PaaS). PaaS is a kind of cloud computing model that allows developers to launch applications without worrying about the infrastructure. According to this definition, they have the same objective! And they do. Serverless is like a rebranding of PaaS, or you can call it the next generation of PaaS.

The main difference between PaaS and serverless is that in PaaS you don't manage machines, but you are billed by provisioning them, even if there is no user actively browsing your website. In PaaS, your code is always running and waiting for new requests. In serverless, there is a service that is listening for requests and will trigger your code to run only when necessary. This is reflected in your bill. You will pay only for the fractions of seconds that your code was executed and the number of requests that were made to this listener. Also, serverless has an immutable state between invocations, so it's always a fresh environment for every invocation. Even if the container is reused in a subsequent call, the filesystem is renewed.

IaaS and On-Premises

Besides PaaS, serverless is frequently compared with Infrastructure as a Service (IaaS) and On-Premises solutions to expose its differences. IaaS is another strategy to deploy cloud solutions where you hire virtual machines and is allowed to connect to them to configure everything that you need in the guest operating system. It gives you greater flexibility, but it comes with more responsibilities. You need to apply security patches, handle occasional failures, and set up new servers to handle usage peaks. Also, you pay the same per hour whether you are using 5% or 100% of the machine s CPU.

On-Premises is the traditional kind of solution where you buy the physical computers and run them inside your company. You get total flexibility and control with this approach. Hosting your own solution can be cheaper, but it happens only when your traffic usage is extremely stable. Over or under provisioning computers is so frequent that it's hard to have real gains using this approach, even more when you add the risks and costs to hire a team to manage those machines. Cloud providers may look expensive, but several detailed use cases prove that the return of investment (ROI) is larger running on the cloud than On-Premises. When using the cloud, you benefit from the economy of scale of many gigantic data centers. Running on your own exposes your business to a wide range of risks and costs that you'll never be able to anticipate.

The main goals of serverless

To define a service as serverless, it must have at least the following features:

Scale as you need: There is no under or over-provisioning
Highly available: It is fault tolerant and always online
Cost-efficient: You will never pay for idle servers

Scalability

With IaaS, you can achieve infinite scalability with any cloud service. You just need to hire new machines as your usage grows. You can also automate the process of starting and stopping servers as your demand changes. But this is not a fast way to scale. When you start a new machine, you usually need to wait for around 5 minutes before it can be usable to process new requests. Also, as starting and stopping machines is costly, you only do this after you are certain that you need. So, your automated process will wait some minutes to confirm that your demand has changed before taking any action.

Infinite scalability is used as a way to highlight that you can usually grow without worrying if the cloud provider has enough capacity to offer. That's not always true. Each cloud provider has limitations that you must consider if you are thinking in large applications. For example, AWS limits the number of running virtual machines (IaaS) of a specific type to 20 and the number of concurrent Lambda functions (serverless) to 1,000.

IaaS is able to handle well-behaved usage changes, but it can't handle unexpected high peaks that happen after announcements or marketing campaigns. With serverless, your scalability is measured in milliseconds and not minutes. Besides being scalable, it's very fast to scale. Also, it scales per invocation without needing to provision capacity.

When you consider a high usage frequency in a scale of minutes, IaaS suffers to satisfy the needed capacity while serverless meets even higher usages in less time.

In the following graph, the left-hand side graph shows how scalability occurs with IaaS. The right-hand side graph shows how well the demand can be satisfied using a serverless solution:

With an On-Premises approach, this is a bigger problem. As the usage grows, new machines must be bought and prepared, but increasing the infrastructure requires purchase orders to be created and approved, you need to wait the new servers to arrive and you need to give time to your team to configure and test them. It can take weeks to grow, or even months if the company is very big and requests many steps and procedures to be filled in.

Availability

A highly available solution is the one that is fault tolerant to hardware failures. If one machine goes out, you must keep running the application with a satisfactory performance. If you lose an entire data center due to a power outage, you must have machines in another data center to keep the service online. Having high availability generally means to duplicate your entire infrastructure, placing each half in a different data center.

Highly available solutions are usually very expensive in IaaS and On-Premises. If you have multiple machines to handle your workload, placing them in different physical places and running a load balancing service can be enough. If one data center goes out, you keep the traffic in the remaining machines and scale to compensate. However, there are cases where you will pay extra without using those machines.

For example, if you have a huge relational database that is scaled vertically, you will end up paying for another expensive machine as a slave just to keep the availability. Even for NoSQL databases, if you set a MongoDB replica set in a consistent model, you will pay for instances that will act only as secondaries, without serving to alleviate read requests.

Instead of running idle machines, you can set them in a cold start state, meaning that the machine is prepared, but is off to reduce costs. However, if you run a website that sells products or services, you can lose customers even in small downtimes. A cold start for web servers can take a few minutes to recover, but needs several more minutes for databases.

Considering these scenarios, in serverless, you get high availability for free. The cost is already considered in what you pay to use.

Another aspect of availability is how to handle Distributed Denial of Service (DDoS) attacks. When you receive a huge load of requests in a very short time, how do you handle it? There are some tools and techniques that help mitigate the problem, for example, blacklisting IPs that go over a specific request rate, but before those tools start to work, you need to scale the solution, and it needs to scale really fast to prevent the availability from being compromised. In this, again, serverless has the best scaling speed.

Cost efficiency

It's impossible to match the traffic usage with what you have provisioned. With IaaS or On-Premises, as a rule of thumb, CPU and RAM usage must always be lower than 90% for the machine to be considered healthy, and ideally CPU should be using less than 20% of the capacity with normal traffic. In this case, you are paying for 80% of waste when the capacity is in an idle state. Paying for computer resources that you don't use is not efficient.

Many cloud vendors advertise that you just pay for what you use, but they usually offer significant discounts when you provision for 24 hours of uptime in a long term (one year or more). It means that you pay for machines that you will keep running even in very low traffic hours. Also, even if you want to shut down machines to reduce costs, you need to keep at least a minimum infrastructure 24/7 to keep your web server and databases always online. Regarding high availability, you need extra machines to add redundancy. Again, it's a waste of resources.

Another efficiency problem is related with the databases, especially relational ones. Scaling vertically is a very troublesome task, so relational databases are always provisioned considering max peaks. It means that you pay for an expensive machine when most of the time you don't need one.

In serverless, you shouldn't worry about provisioning or idle times. You should pay exactly the CPU and RAM time that is used, measured in fractions of seconds and not hours. If it's a serverless database, you need to store data permanently, so this represents a cost even if no one is using your system. However, storage is very cheap compared to CPU time. The higher cost, which is the CPU needed to run the database engine that runs queries, will be billed only by the amount of time used without considering idle times.

Running a serverless system continuously for one hour has a much higher cost than one hour in a traditional infra. However, the difference is that serverless is designed for applications with variable usage, where you will never keep one machine at 100% for one hour straight. The cost efficiency of serverless is not perceived in websites with flat traffic.