Using App Engine to run your code
Now, it’s time to move from FaaS to PaaS and introduce the second option for our serverless deployments: App Engine.
Introducing App Engine
App Engine (https://cloud.google.com/appengine) is a serverless PaaS product for developing and hosting our web applications. We can choose among many popular programming languages and use any framework or library to build our application, and Google Cloud will handle the infrastructure, including a demand-based scaling system that ensures we always have enough capacity for our users.
This product is a very good fit for microservices-based architectures and requires zero server management and zero configuration deployment tasks, so we can focus on developing amazing applications. Indeed, we can use App Engine to host different versions of our app and use this feature to create separate environments for development, testing, staging, and production.
It’s important that you know that there can only be one App Engine application in each Google Cloud project, and that whatever region you choose when you create it will become permanent, so please make that choice wisely.
An App Engine application (https://cloud.google.com/appengine/docs/legacy/standard/python/an-overview-of-app-engine) is made up of one or more services, each of which can use a different runtime and have its own customized performance settings. Each of our services can have multiple versions deployed, which will run within one or more instances, depending on the amount of traffic that we configured it to handle.
All the resources of an application will be created in the region that we choose when we create our App Engine app, including code, settings, credentials, and all the associated metadata. Our application will include one or more services but must have at least what is called the default service, which can also have multiple versions deployed.
Each version of a service that we deploy in our app will contain both the source code that we want to run and the required configuration files. An updated version may contain changes in the code, the configuration, or both, and a new version will be created when redeploying the service after making changes to any of these elements.
The ability to have multiple versions of our application within each service will make it easier to switch between versions for cases such as rollbacks or testing and can also be very useful when we are migrating our service, allowing us to set up traffic splits to test new versions with a portion of the users before rolling them out to all of them.
The different deployed versions of our services will run on one or more instances, depending on the load at any given time. App Engine will scale our resources automatically: up, if required to maintain the performance level, or down, to avoid wasting resources and help reduce costs.
Each deployed version of a service must have a unique name, which can be used to target and route traffic to a specific resource. These names are used to build URLs that follow this naming convention:
https://<VERSION>-dot-<SERVICE>-dot-<PROJECT_ID>.<REGION_ID>.r.appspot.com
Note
The maximum length of <VERSION>-dot-<SERVICE>-dot-<PROJECT_ID> is 63 characters, where VERSION is the name of our version, SERVICE is the name of our service, and PROJECT_ID is our project ID; otherwise, a DNS lookup error will occur. Another limitation is that the name of the version and the service cannot start or end with a hyphen.

Any requests that our application receives will be routed only to those versions of our services that have been configured to handle the traffic. We can also use the configuration to define which specific services and versions will handle a request depending on parameters such as the URL path.
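The naming convention and the 63-character limit described above can be captured in a small helper. The following is just an illustrative sketch (the function name, the sample region ID, and the sample names are our own, not part of App Engine):

```python
REGION_ID = "uc"  # illustrative region ID; yours depends on where the app was created

def version_url(version, service, project_id, region_id=REGION_ID):
    """Build the version-specific URL following the App Engine naming convention."""
    host_prefix = f"{version}-dot-{service}-dot-{project_id}"
    # The combined prefix is limited to 63 characters; longer
    # host names cause DNS lookup errors.
    if len(host_prefix) > 63:
        raise ValueError("host prefix exceeds 63 characters")
    return f"https://{host_prefix}.{region_id}.r.appspot.com"

print(version_url("v2", "resume-server", "my-project-id"))
# → https://v2-dot-resume-server-dot-my-project-id.uc.r.appspot.com
```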
App Engine environment types
App Engine offers two different environment types.
The App Engine standard environment is the simplest offering, aimed at applications running specific versions of the supported programming languages. At the time of writing this chapter, you can write your application in Node.js, Java, Ruby, C#, Go, Python, or PHP. You can see the up-to-date list on this documentation page: https://cloud.google.com/appengine/docs/the-appengine-environments.
In a standard environment, our application runs on a lightweight instance inside a sandbox, which imposes a few restrictions that you should consider. For example, we can only run a limited set of binary libraries, and access to external Google Cloud services is only available through the App Engine APIs, instead of the standard ones. Other particularly important limitations are that App Engine standard applications cannot write to disk and that the CPU and memory options to choose from are limited. For all these reasons, App Engine standard is a particularly good fit for stateless web applications that respond to HTTP requests quickly, that is, microservices.
App Engine standard is especially useful in scenarios with sudden changes in traffic because this environment can scale very quickly and supports scaling up and down. This means that it can scale up your application quickly and effortlessly to thousands of instances to handle sudden peaks, and scale it down to zero if there is no traffic for some time.
If the mentioned limitations are not a problem for your use case, this can be a remarkably interesting choice to run your code, because you will pay close to nothing (or literally nothing).
App Engine standard instances are charged based on instance hours, but the good news is that all customers get 28 free instance hours per day in a standard environment, not charged against our credits, which is great for testing and even for running a small architecture virtually for free.
The second type is called the App Engine flexible environment. This one will give us more power, more options... and more responsibilities, at a higher cost. In this case, our application will be containerized with Docker and run inside a virtual machine. This is a perfect fit for applications that are expecting a reasonably steady demand and need to scale more gradually. The cons of this environment are that the minimum number of instances in App Engine flexible is 1 and that scaling in response to traffic will take significantly longer in comparison with standard environments.
On the list of pros, flexible environments allow us to choose any Compute Engine machine type to run our containerized application, which means that we have access to many more combinations of CPU, memory, and storage than in the case of a standard environment.
Besides, flexible environments have fewer requirements about which versions of the supported programming languages we can use, and they even offer the possibility of building custom runtimes, which we can use to add support for any other programming languages or versions that we may specifically need. This will require additional effort to set up, but it also opens the door to running web applications written in any version of any language.
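For reference, a custom runtime is enabled from the service’s configuration file. The following is a hypothetical sketch (the service name is our own choice); a Dockerfile describing the container image must sit in the same directory:

```yaml
# Hypothetical app.yaml for a flexible environment custom runtime.
# "runtime: custom" tells App Engine to build the Dockerfile that
# lives next to this file instead of using a standard runtime.
runtime: custom
env: flex
service: my-custom-service
```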
Flexible App Engine instances are billed based on resource usage, including vCPU, memory, and persistent disks.
Finally, most of the restrictions that affect App Engine standard instances do not apply to flexible environments: we can write to disk, use any library of our choice, run multiple processes, and use standard cloud APIs to access external services.
Note
Standard and flexible App Engine environments are not mutually exclusive, but complementary. The idea is that we run simple microservices using fast-scaling and cost-efficient standard environments whenever possible and complement them with flexible environments for those microservices that will not work under the limitations of a standard environment. Specific requirements such as needing more CPU or memory, requiring disk access, or making API calls to use cloud services will justify the use of flexible instances. When combining both instance types, inter-service communication can be implemented using Pub/Sub, HTTP, or Cloud Tasks, which makes App Engine a great choice for creating architectures that combine always-on and on-demand microservices.
You can read an interesting comparison table detailing the similarities and key differences between both environment instances in the following documentation section: https://cloud.google.com/appengine/docs/flexible/flexible-for-standard-users.
Scaling strategies in App Engine
App Engine applications are built on top of one or more instances, which are isolated from one another using a security layer. Received requests are balanced across any available instances.
We can choose whether we prefer a specific number of instances to run regardless of the traffic, or we can let App Engine handle the load by creating or shutting down instances as required. The scaling strategy can be customized using a configuration file called app.yaml. Automatic scaling is enabled by default, letting App Engine optimize the number of idle instances.
The following is a list of the three different scaling strategies available for App Engine:
- Manual scaling: A fixed number of instances will run regardless of changes in the amount of traffic received. This option makes sense for complex applications that use a lot of memory and require a fast response.
- Basic scaling: As its name suggests, this option will make things simple by creating new instances when requests are received and shutting them down when instances have been idle for some time. This is a nice choice for applications with occasional traffic.
- Automatic scaling: This is the most advanced option, suitable for applications needing to fine-tune their scaling to prevent performance issues. Automatic scaling will let us define multiple metrics with their associated thresholds in our YAML configuration file. App Engine will use these metrics to decide when it’s the best time to create new instances or shut down idle ones so that there is no visible effect on performance. We can also optionally use the automatic_scaling parameter to define the minimum number of instances to always keep running.
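The automatic scaling settings live in app.yaml; the following is a hedged sketch for a standard environment, where the threshold values are purely illustrative and should be tuned to your application:

```yaml
# Illustrative automatic scaling settings (standard environment).
runtime: python38
automatic_scaling:
  min_idle_instances: 1        # keep one warm instance to absorb sudden peaks
  max_instances: 20            # cap the total number of instances
  target_cpu_utilization: 0.65 # create new instances above this CPU load
  max_concurrent_requests: 50  # requests per instance before scaling out
```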
You can find a table comparing these scaling strategies in the documentation page about App Engine instance management: https://cloud.google.com/appengine/docs/legacy/standard/python/how-instances-are-managed
The differences between the strategies are quite simple to explain. In basic scaling, App Engine prioritizes cost savings over performance, even at the expense of increased latency in some scenarios, for example, after it scales down to zero. If low latency is an important requirement for your application, this option will probably not work for you.
Automatic scaling, however, uses an individual queue for each instance, whose length is periodically monitored and used to detect traffic peaks, deciding when new instances should be created. Also, instances with queues detected to be empty for a while will be turned off, but not destroyed, so they can be quickly reloaded if they are needed again later. While this process will reduce the time needed to scale up, it may still increase latency up to an unacceptable level for some users. However, we can mitigate this side effect by specifying a minimum number of idle instances to always keep running, so we can handle sudden peaks without seeing our performance hurt.
Using App Engine in microservice architectures
When we build an application using a microservice architecture, each microservice implements full isolation of code, which means that the only way to execute code in another service is through HTTP or a RESTful API call. One service will never be able to directly execute code running on another. Indeed, it’s common for different services to be written in different programming languages too. Besides, each service has its own custom configuration, so we may be combining multiple scaling strategies.
However, there are some App Engine resources, such as Cloud Datastore, Memcached, or Task Queues, which are shared between all services running in the same App Engine project. While this may have advantages, it may be a risk since a microservices-based application must maintain code and data isolation between its microservices.
While there are some architectural patterns that can help mitigate unwanted sharing, enforced separation can be achieved by using multiple App Engine projects at the expense of worse performance and more administrative overhead. A hybrid approach can also be a very valid option.
The App Engine documentation contains more information about microservices, including a comparison of service and project isolation approaches, so you can make a better choice for your architecture: https://cloud.google.com/appengine/docs/legacy/standard/python/microservices-on-app-engine.
Before getting to the example, let’s introduce configuration files, which are key for deploying App Engine applications.
Configuring App Engine services
Each version of an App Engine service has an associated .yaml file, which includes the name of the service and its version. For consistency, this file usually takes the same name as the service it defines, although this is not required. When we have multiple versions of a service, we can create multiple YAML files in the same directory, one for each version.
Usually, there is a separate directory for each service, where both its YAML and the code files are stored. There are some optional application-level configuration files, such as dispatch.yaml, cron.yaml, index.yaml, and queue.yaml, which will be located in the top-level directory of the app. However, if there is only one service or multiple versions of the same service, we may prefer to use a single directory to store all configuration files.
Each service’s configuration file defines the scaling type and instance class for a specific combination of service and version. Different scaling parameters are used depending on the chosen scaling strategy; otherwise, automatic scaling is used by default.
As we mentioned earlier, the YAML file can also be used to map URL paths to specific scripts or to identify static files and apply a specific configuration to improve overall efficiency.
There are four additional configuration files that control optional features that apply to all the services in an app:
- dispatch.yaml overrides default routing rules by sending incoming requests to a specific service based on the path or hostname in the URL
- cron.yaml configures regularly scheduled tasks that operate at defined times or regular intervals
- index.yaml specifies which indexes your app needs if using Datastore queries
- queue.yaml configures push and pull queues
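As an illustration of the first of these files, a dispatch.yaml might look like the following sketch, where the path pattern and the service name are hypothetical:

```yaml
# Hypothetical dispatch.yaml: route any request whose URL path
# starts with /api/ to a service named api-backend; everything
# else falls through to the default routing rules.
dispatch:
  - url: "*/api/*"
    service: api-backend
```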
After covering all the main topics related to App Engine, it’s time to deploy and run some code to see all the discussed concepts in action.
Writing, deploying, and running code with App Engine
We will now deploy our resume-serving application in App Engine and see the differences between this implementation and the one using Cloud Functions.
The first file that we will create for our application is app.yaml, which can be used to configure a lot of settings. In our case, it will include the following contents:

runtime: python38
service: resume-server

handlers:
- url: /favicon\.ico
  static_files: favicon.ico
  upload: favicon\.ico
First, we will define which runtime we want to use. In this case, it will be a Python 3.8 module. Then, we will define a service name. I chose resume-server just in case you were already using the default service for any other purposes. Please remember that if this parameter is not defined in the file, the app will be deployed to the default service.
Since App Engine is a full application server, I’m taking the chance to include a favicon, that is, an icon that the web browser will show next to the page title. In this case, we just add the icon file, called favicon.ico, and add a rule to serve the icon when it is requested. The runtime will forward the rest of the requests by default to a file called main.py, which will be the next file that we will talk about.
As its name may suggest, main.py contains the core of the code, and it is indeed quite similar to the version that we created as a cloud function. There are some differences at the beginning of the file because we will be using Flask to handle the requests and an instance of Cloud Logging when the app is deployed in production:
from flask import request, current_app, Flask
from google.cloud import storage
import google.cloud.logging
import logging

BUCKET_NAME = "<YOUR_BUCKET_NAME>"
DEFAULT_TEMPLATE_NAME = "english.html"

app = Flask(__name__)
app.debug = False
app.testing = False

# Configure logging
if not app.testing:
    logging.basicConfig(level=logging.INFO)
    client = google.cloud.logging.Client()
    # Attaches a Cloud Logging handler to the root logger
    client.setup_logging()
After these lines, you will see the same functions that we already covered earlier in this chapter, until we get to the last few lines of the file. Notice how now we have one line for routing requests to the root URL and how the last line runs the app, making it listen on the loopback interface for local executions:
DEFAULT_TEMPLATE = "english.html"

@app.route('/')
def get():
    template = request.args.get('template', DEFAULT_TEMPLATE)
    name = request.args.get('name', None)
    company = request.args.get('company', None)
    resume_html = return_resume(template, name, company)
    return resume_html

# This is only used when running locally. When running live,
# gunicorn runs the application.
if __name__ == '__main__':
    app.run(host='127.0.0.1', port=8080, debug=True)
The deployment package also includes a requirements.txt file. In this case, these are its contents:
Flask==2.2.2
google-cloud-storage==2.5.0
google-cloud-logging==3.2.4
Notice how all three imported packages have their version frozen, for the sake of stability in future deployments, as we already discussed.
Now we are ready for testing, and the four files have been copied to the same working directory: app.yaml, favicon.ico, main.py, and requirements.txt.
Python’s virtualenv and pytest can be used for fast local testing, and they are indeed recommended as the first option, rather than using dev_appserver, which is the local development server that the Google Cloud SDK provides. However, if you are still interested, there’s information about it in this section of the official documentation: https://cloud.google.com/appengine/docs/standard/tools/using-local-server.
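For instance, a fast local unit test in the pytest style might look like the following sketch. The return_resume function below is a trivial hypothetical stand-in for the real helper in main.py (which renders a template stored in Cloud Storage), so the test runs without any cloud access:

```python
# return_resume here is a hypothetical stub standing in for the
# real helper, which downloads and fills a resume template from
# Cloud Storage; stubbing it keeps the test fast and local.
def return_resume(template, name, company):
    return f"<html>{template} for {name} at {company}</html>"

def test_return_resume_fills_in_parameters():
    html = return_resume("english.html", "John", "StarTalent")
    assert "John" in html
    assert "StarTalent" in html

# pytest discovers and runs test_* functions automatically;
# calling one directly also works for a quick sanity check.
test_return_resume_fills_in_parameters()
```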
Please notice that simulated environments may not have exactly the same restrictions and limitations as the sandbox. For example, available system functions and runtime language modules may be restricted, but timeouts or quotas may not.
The local development server will also simulate calls to services such as Datastore, Memcached, and task queues by performing their tasks locally. When our application is running in the development server, we can still make real remote API calls to the production infrastructure using the Google API's HTTP endpoints.
Another option to simulate a production App Engine environment is to use a Web Server Gateway Interface (WSGI) server locally by installing gunicorn in Cloud Shell using the following command:
pip install gunicorn
Then, we will just run it using our app as an entry point, as in the following example:
gunicorn -b :$PORT main:app
Here, $PORT is the port number we defined for our application, 8080 by default, and main:app refers to the name of the Python file (main.py) and the Flask application object inside it that will handle the requests.
In my example, I invoked it using the following command line in Cloud Shell, so that it runs in the background:
/home/<user>/.local/bin/gunicorn -b :8080 main:app &
Now, we can send requests using curl and validate the output as part of our unit tests. For example, our usual test URL would now be triggered using the following command. Please don’t forget the double quotes; otherwise, only the first parameter will be received:
curl "http://127.0.0.1:8080/?template=english.html&name=John&company=StarTalent"
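Under the hood, gunicorn simply imports the module and invokes the application object, which must follow the WSGI calling convention; Flask implements this contract for us. The following stdlib-only sketch (not our Flask app) shows what that contract looks like:

```python
from wsgiref.util import setup_testing_defaults
from urllib.parse import parse_qs

# A minimal WSGI callable: this is the kind of object gunicorn
# looks up when given "main:app" (module main, variable app).
def app(environ, start_response):
    params = parse_qs(environ.get("QUERY_STRING", ""))
    name = params.get("name", ["stranger"])[0]
    body = f"Hello, {name}!".encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [body]

# Call the app directly, exactly as a WSGI server would.
environ = {}
setup_testing_defaults(environ)
environ["QUERY_STRING"] = "name=John"

captured = []
def start_response(status, headers):
    captured.append(status)

body = b"".join(app(environ, start_response))
print(captured[0])    # → 200 OK
print(body.decode())  # → Hello, John!
```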
Applications designed for flexible environments can also be directly executed for testing, given that they will have direct access to cloud services. Using emulators is often recommended in cases like this in order to avoid incurring excessive costs while running the tests.
After successfully passing all local tests, the application will be ready for deployment. And it couldn’t be any simpler than running the following command in the console from the working directory containing all the files previously mentioned:
gcloud app deploy app.yaml
This deployment command supports other flags, such as the following:
- --version to specify a custom version ID
- --no-promote to prevent traffic from being automatically routed to the new version
- --project to deploy to a specific Google Cloud project
As with Cloud Functions, you may be asked to authenticate yourself during the deployment, and you could also be asked to enable any required APIs the first time that you deploy an app. In the case of App Engine, this is an example of the output of a deployment command:
Services to deploy:

descriptor: [/home/clouddevelopersguide/App_Engine/app.yaml]
source: [/home/clouddevelopersguide/App_Engine]
target project: [cloud-developers-365616]
target service: [resume-server]
target version: [20221021t201413]
target url: [http://resume-server.cloud-developers-365616.uc.r.appspot.com]
target service account: [App Engine default service account]

Do you want to continue (Y/n)? Y

Beginning deployment of service [resume-server]...
Uploading 1 file to Google Cloud Storage
100%
File upload done.
Updating service [resume-server]...done.
Setting traffic split for service [resume-server]...done.
Deployed service [resume-server] to [http://resume-server.cloud-developers-365616.uc.r.appspot.com]

You can stream logs from the command line by running:
  $ gcloud app logs tail -s resume-server

To view your application in the web browser run:
  $ gcloud app browse -s resume-server
Notice how we can use the last section of the output to get the URL to the application, and the one right above it to print the logs in the command-line console. The deployment process involves copying our files to GCS, and then updating the service and setting its traffic split.
Once we obtain the URL, we can again append the parameters to test the application. In my case, this was the complete URL:
https://resume-server-dot-cloud-developers-365616.uc.r.appspot.com/?template=english.html&name=John+Smith&company=StarTalent
You can read more about testing and deploying your applications in App Engine in this section of the official documentation site: https://cloud.google.com/appengine/docs/standard/testing-and-deploying-your-app.
Debugging in App Engine
Luckily for us, App Engine is compatible with many of the tools that we already introduced for testing and debugging our cloud functions. With App Engine, we can also use Cloud Monitoring and Cloud Logging to monitor the health and performance of our app, and Error Reporting to diagnose and fix bugs quickly. Cloud Trace can also help us understand how requests propagate through our application.
Cloud Debugger can help us inspect the state of any of our running services without interfering with their normal behavior. Besides, some IDEs, such as IntelliJ, allow debugging App Engine standard applications by connecting to a local instance of dev_appserver. You can find more information in this section of the official documentation site: https://cloud.google.com/code/docs/intellij/deploy-local.
After completing the whole development cycle when using App Engine, it’s the perfect time to explain how we will be billed if we decide to use App Engine.
How much does it cost to run your code on App Engine?
App Engine pricing scales with our app’s usage, and there are a few basic components that will be included in the App Engine billing model, such as standard environment instances, flexible environment instances, and App Engine APIs and services.
As I mentioned earlier, flexible App Engine instances are billed based on resource utilization, including vCPU, memory, persistent disks, and outgoing network traffic. Standard App Engine instances follow a much simpler model based on the number of hours they have been running for. Any other APIs and services used will also be added to the bill, such as Memcached, Task Queues, or the Logs API.
For more details about the pricing, you can refer to this documentation section: https://cloud.google.com/appengine/pricing
Regarding the free tier, users get 28 hours of standard frontend instances and 9 hours of backend instances for free every day, and new customers also get $300 in free credits to spend on App Engine. You may find all the details about quotas in the following section of the documentation website: https://cloud.google.com/appengine/docs/standard/quotas.
To get an estimate of our bill, we can use the Google Cloud Pricing Calculator available in the following section of the documentation: https://cloud.google.com/products/calculator#tab=app-engine.
Tips and tricks for running your code on App Engine
If you read the Limits section at the end of the App Engine Overview section of the documentation (https://cloud.google.com/appengine/docs/legacy/standard/python/an-overview-of-app-engine), you will see that there are different limits for the number of services and instances depending on the application type (free or paid) and whether the app is hosted in us-central or in any other region. You should take these numbers into account when you decide which application type to use.
If our app uses automatic scaling, it will take approximately 15 minutes of inactivity for the idle instances to start shutting down. To keep one or more idle instances running, we should set the value of min_idle_instances to 1 or higher.
Regarding security, a component called the App Engine firewall can be used to set up access rules. Managed SSL/TLS certificates are included by default on custom domains at no additional cost.
This was all the information we need to know about App Engine. Now, it’s time to wrap up.