Turning the crank

Now that we have a plan, we get to do the fun part. We get to turn the crank and do the work. We pull the next task, write the code, push it to production, and see the fruits of our labor. Right?

Actually, up to this point, the process has been very flexible. It has been a lot of talk, sticky notes, and diagrams. It has been flexible on purpose—because we are experimenting.

But now, the rubber meets the road, so to speak, and we must get serious. After all, programming languages are notorious for doing exactly what we tell them to do, and we are about to deploy a change to production while the users are actually using the system.

So, this small part of the process needs due diligence. I call it the task branch workflow. This workflow governs how we turn the crank and do the work. It includes both automated and manual gates. But it is a straightforward workflow that becomes second nature since we do it many times a day, day after day.

And it is this muscle memory, which we have built up day after day, that we will rely on to jump into action and fail forward when we make an honest human error.

Task branch workflow

The task branch workflow is a GitOps-style continuous deployment approach that defines the machinery for moving focused and deliberate changes from development into production. It joins our issue trackers, developer tools, source code repositories, and CI/CD pipelines into a cohesive system that enables us to crank out working software. It is the automated bureaucracy that provides the needed guardrails so that we can work efficiently and safely.

The following diagram depicts the activities in a task branch workflow:

Figure 11.4: Task branch workflow

Let’s dig into the different activities of the workflow.

Create task branch

A task branch workflow begins when a developer pulls the next task from the backlog and creates a new branch off the master branch of the specific repository. We name the branch after the task, and we perform all work for the task on this branch. The branch will be short-lived, as we measure a task in hours.

The developer writes the new code and the all-important test cases for the new code.

We explicitly define the scope of each task when the team defines the task roadmap. Examples include seeding a new repository from a template, adding a new connector class, adding a new method to a model class, updating a mapper function for a stream processor, adding a new button to a screen, and so on. The key to success is that each change is focused and backward compatible. In other words, the change follows the Robustness principle, as we covered in the Achieving zero-downtime deployments section.
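
For example, a task to update a mapper function for a stream processor might look like the following sketch. The event and field names here are hypothetical; the point is that the change is purely additive, so existing consumers of the mapper's output are unaffected.

```typescript
// Hypothetical mapper for a stream processor (names are illustrative).
// The task adds the optional `displayName` field while preserving the
// existing output shape, so older consumers are unaffected.

interface ThingCreatedEvent {
  id: string;
  firstName: string;
  lastName: string;
  displayName?: string; // new, optional input field
}

interface ThingViewModel {
  id: string;
  name: string;
  displayName?: string; // new, optional output field
}

export const toThingViewModel = (event: ThingCreatedEvent): ThingViewModel => ({
  id: event.id,
  name: `${event.lastName}, ${event.firstName}`,
  // The new behavior is additive: fall back to the previous format when the
  // upstream event does not yet carry the new field.
  displayName: event.displayName ?? `${event.firstName} ${event.lastName}`,
});
```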

Once all the tests pass, the developer pushes the change to the remote branch. At this point, the CI pipeline will assert that all the tests pass. We will cover the details of the CI pipeline in the Continuous integration pipeline section.

Create draft pull request

The developer creates a draft pull request (PR) when the task is far enough along to warrant feedback and discussion. It is important to ask for feedback early and often. A good practice is to create the draft PR after the first push.

The small size and focus of a task branch are crucial to receiving prompt feedback from fellow developers, as they are working on their own tasks. They need to be able to zero in on the essence of the change, provide constructive feedback, and then get back to their own task.

Creating a draft PR also signals that it is time to deploy the change to the secondary/western region of the non-prod environment where continuous smoke tests are executing. We will cover regional canary deployments in the Continuous deployment pipeline section.

Ready for review

The developer moves the PR from draft mode into the ready-for-review status when the task is ready to go to production. The CI pipeline and the CD pipeline for the regional canary deployment to the non-prod environment must have already passed.

At this point, the developer opens the PR to a wider audience for code review and approval. How wide an audience and how many approvals will depend on the risk level of the change. If the change is to a new feature or a rarely used feature, then there is low risk. However, if it is the most critical part of the system, then it will need multiple approvals and, most likely, outside approval. The team will identify the risk level during story planning and notify any outside approvers in advance.

A PR for a presentation-layer change should include annotated screenshots and/or video recordings. This opens the review process to the non-technical stakeholders and provides context for the technical review.

Moving the PR to the ready-for-review status also signals that it is time to deploy the change to the primary/eastern region of the non-prod environment where continuous smoke tests are executing. We will cover this in the Continuous deployment pipeline section.

Merge to master

The team merges the PR into the master branch only after all the automated and manual gates have passed. These include all required approvals and successful execution of the following:

  1. The CI pipeline
  2. The CD pipelines for all the regions of the non-prod environment
  3. All the automated smoke tests in all the regions of the non-prod environment

Merging the PR into the master branch will trigger the execution of the CI pipeline against the master branch. Successful execution of the CI pipeline on the master branch triggers the deployment of the change to the secondary/western region of the production environment. We will cover regional canary deployments in the Continuous deployment pipeline section.

Merging to master is a manual gate. The team may merge straight away or wait for the optimal time. Again, the team will identify the need for any delay during story planning.

Accept canary deployment

Accepting the regional canary deployment is the final manual gate for a task-level change.

At this point, the team has deployed the change to one region in the production environment. If everything has gone according to plan, then any active users in this region are none the wiser, all the continuous smoke tests have succeeded, and no anomalies have been detected.

From here, the team will wait for a predetermined amount of time and then accept the regional canary deployment. This will trigger the deployment of the change to all remaining regions in the production environment.
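
As a rough illustration of this promotion step, whether a team member triggers it or the pipeline runs it after the soak period, it could be scripted along the following lines. The region names, the promoteToRegion helper, and the use of CloudWatch alarms as the anomaly signal are all assumptions for the sketch, not the book's specific tooling.

```typescript
import { CloudWatchClient, DescribeAlarmsCommand } from '@aws-sdk/client-cloudwatch';

// Illustrative region topology: the canary region receives the change first.
const CANARY_REGION = 'us-west-2';
const REMAINING_REGIONS = ['us-east-1'];

// Hypothetical wrapper around the team's deploy command (for example, a CLI call).
declare function promoteToRegion(region: string): Promise<void>;

export const acceptCanaryDeployment = async (soakTimeMs: number): Promise<void> => {
  // Wait out the predetermined soak period while smoke tests run in the canary region.
  await new Promise((resolve) => setTimeout(resolve, soakTimeMs));

  // Stop the line if any alarm fired in the canary region during the soak period.
  const cloudwatch = new CloudWatchClient({ region: CANARY_REGION });
  const { MetricAlarms = [] } = await cloudwatch.send(
    new DescribeAlarmsCommand({ StateValue: 'ALARM' }),
  );
  if (MetricAlarms.length > 0) {
    throw new Error(`Canary not accepted: ${MetricAlarms.length} alarm(s) in ${CANARY_REGION}`);
  }

  // Accepting the canary promotes the change to the remaining regions.
  for (const region of REMAINING_REGIONS) {
    await promoteToRegion(region);
  }
};
```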

If the team detects a problem that they did not catch in any of the regions of the non-prod environment, then they stop the line and jump into action. This is where they rely on their muscle memory to fail forward fast.

The high observability of the system and the limited scope of the change facilitate root-cause analysis (RCA). Meanwhile, the bulkheads of the feature limit the blast radius, so upstream and downstream features continue to operate unabated.

Now that we have asserted the stability of a deployment, it’s time to determine if we are delivering the right functionality.

Feature flipping

Decoupling deployment from delivery gives us flexibility—it allows us to divide a delivery into a series of zero-downtime deployments. The code is available in production, but it is inactive. We can selectively activate the code for exploratory testers and early adopters, and iterate on the feedback. Then, when we have the right functionality, we deliver (that is, activate) the feature for general availability. In other words, we have the flexibility to flip features on and off throughout the process of discovering the right fit.
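
A minimal sketch of the mechanics, assuming a simple in-memory flag lookup (the flag name and render functions are illustrative), might look like this:

```typescript
// Minimal sketch of decoupling deployment from delivery with a feature flag.
// The new code path ships to production but stays inactive until the flag flips.

// In practice the flag may live in configuration, a parameter store, or user
// claims; a simple in-memory map stands in for that here.
const featureFlags: Record<string, boolean> = {
  'new-order-summary': false, // deployed, not yet delivered
};

export const getOrderSummary = (orderId: string): string =>
  featureFlags['new-order-summary']
    ? renderNewOrderSummary(orderId) // activate selectively and iterate on feedback
    : renderLegacyOrderSummary(orderId); // existing behavior remains the default

// Hypothetical render functions for the two code paths.
declare function renderNewOrderSummary(orderId: string): string;
declare function renderLegacyOrderSummary(orderId: string): string;
```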

Exploratory testing

This is where the more traditional quality assurance (QA) activities come into play. The testers manually exercise the system to determine if a feature is fit for purpose. Once again, the focus is on discovery. The testers may have notional test scripts for what they expect to find, but this exercise is really all about finding the unexpected. What we think is right and what works in practice are not always the same, and this is an opportunity to find the gaps.

Testers usually perform exploratory testing in the non-prod environment, but they may test in the production environment as well. In either case, we need to turn the feature on for testers. For example, granting the necessary permissions to the Tester role enables a new feature for all testers.
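
For instance, a role-to-permission mapping along the following lines would let us turn the feature on for every tester simply by granting the permission to the Tester role; the role and permission names are illustrative, not taken from the book's examples.

```typescript
// Illustrative role-to-permission mapping; granting the permission to the
// Tester role turns the feature on for all testers without a code change.
const rolePermissions: Record<string, string[]> = {
  Tester: ['order-summary:view-new'],
  User: [], // standard users do not see the feature yet
};

const hasPermission = (roles: string[], permission: string): boolean =>
  roles.some((role) => (rolePermissions[role] ?? []).includes(permission));

// Later, at general availability, granting 'order-summary:view-new' to the
// User role (and then removing the check entirely) delivers the feature to everyone.
export const canViewNewOrderSummary = (roles: string[]): boolean =>
  hasPermission(roles, 'order-summary:view-new');
```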

Beta users

Feedback from beta users is invaluable. These are real users, using a feature to do real work. They are early adopters and willing participants who are vested in the outcome, and they are willing to volunteer their efforts to make the product better. They also understand that the functionality is incomplete and that their results may vary as the team experiments.

Beta users perform their work in the production environment. Again, we make the feature available to beta users with a feature flag. For example, we may create a special beta user role, and there may be a dedicated registration or invitation process.

To win the support of beta users, it may be necessary to synchronize their work with a legacy version of the functionality. A beta user’s time is valuable, and they do not want to have to do their work twice. They need the ability to switch between the old and the new. For example, they can start a task in the legacy version and complete it in the new version, and vice versa. If the new version is not working correctly, then they can switch back to the legacy version and pick up where they left off, to a reasonable extent. This does increase the level of effort, but it is a valuable risk-mitigation strategy and will likely reduce the level of effort in the long run. We covered legacy integration in Chapter 7, Bridging Intersystem Gaps.
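
One way to sketch that switch, with purely hypothetical names and URLs, is to route each task by the beta user's current preference while both versions share the same task identifier; the actual data synchronization relies on the bridging patterns from Chapter 7.

```typescript
// Illustrative routing for beta users. A beta user can flip between the legacy
// and new versions of a task; the underlying data is kept in sync using the
// intersystem bridging patterns covered in Chapter 7.

type TaskVersion = 'legacy' | 'new';

interface BetaUser {
  id: string;
  roles: string[];
  preferredVersion: TaskVersion; // the user can change this at any time
}

export const resolveTaskUrl = (user: BetaUser, taskId: string): string => {
  const useNew = user.roles.includes('BetaUser') && user.preferredVersion === 'new';
  // Both versions address the task by the same identifier, so a task started
  // in one version can be completed in the other.
  return useNew
    ? `https://new.example.com/tasks/${taskId}`
    : `https://legacy.example.com/tasks/${taskId}`;
};
```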

General availability

In due time, we make a feature available to all users. The end product may barely resemble what we thought it would look like in the beginning, but that is OK. We needed to experiment to find the right solution for the end user.

We typically refer to this as a General Availability (GA) release. We might think of exploratory testing as the alpha release and the beta users' version as the beta release, but this time it is official, and with it comes a new level of responsibility.

We have turned the new feature on for all users, and it is time to remove the technical debt of any feature flags that we no longer need. For example, the standard user role now has all the necessary permissions, so we need to remove the specific tester and beta user roles. We also need to remove any references to these roles from the code, along with any artificial feature flag logic, and push a cleanup deployment.
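
Continuing the earlier illustrative sketches, the cleanup deployment simply deletes the conditional and the special roles, leaving the new path as the only path:

```typescript
// After GA, the flag check and the Tester/BetaUser-specific roles are gone;
// the new behavior is the only behavior (names remain illustrative).
export const getOrderSummary = (orderId: string): string =>
  renderNewOrderSummary(orderId);

declare function renderNewOrderSummary(orderId: string): string;
```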

From here, any changes to a GA feature require additional scrutiny during story planning to ensure zero downtime. Any major changes to a GA feature may make it the new legacy and require concurrently deployed versions and data synchronization. But this is the kind of change and evolution we have designed our architecture to support, so bring it on.

Now, let’s look closer at the CI/CD pipelines.
