Orchestrating business processes
A business process is a long-lived flow of activities executed in a specific sequence to achieve a desired outcome. These activities consist of human tasks and atomic actions. The duration of a business process can range from minutes to days, or longer, depending on the nature of the human tasks that must be performed. For example, a food delivery process, which involves preparing and delivering food, is measured in minutes, whereas a business process that includes a management approval step may wait hours or days for a manager to approve a task.
There are two approaches to implementing business processes: choreography and orchestration. These terms are borrowed from the arts and serve as metaphors for software techniques. A team of dancers works together to perform a choreographed set of movements, but the choreographer is not in control of the actual performance, whereas an orchestra is led by a conductor as it performs a musical composition.
As we have seen, the choreography approach has limits, and it is recommended only for the most basic business processes. Orchestration is the preferred approach. It is distinct from choreography in that the flow of control is managed centrally by a conductor or mediator. This is the role that a control service plays. We can model the logic of a control service (that is, a business process) as a state machine using an activity diagram, such as the one shown in the following figure:
Figure 8.4: Activity diagram
A process is initiated by a specific event, and the end of the process is signaled by another event. Each step in the process represents a different state, indicating when a specific activity—such as a human task or an atomic action—should be performed. The directed lines between the states represent the state transitions and their guard conditions.
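To make the state-machine view concrete, here is a minimal sketch in plain JavaScript. The state names, the `TRANSITIONS` table, the `nextState` helper, and the rejection path are all assumptions made for illustration; they mirror the approval example discussed in this section and are not taken from any library.

```javascript
// Hypothetical transition table: each entry names the current state, the event
// that triggers the transition, an optional guard condition, and the next state.
const TRANSITIONS = [
  { from: 'Start', on: 'request-submitted', to: 'Task' },
  { from: 'Task', on: 'task-completed', guard: (e) => e.outcome === 'approved', to: 'Action' },
  { from: 'Task', on: 'task-completed', guard: (e) => e.outcome === 'rejected', to: 'End' },
  { from: 'Action', on: 'request-performed', to: 'End' },
];

// Returns the next state, or the current state when no transition applies.
const nextState = (state, event) => {
  const transition = TRANSITIONS.find((t) =>
    t.from === state && t.on === event.type && (!t.guard || t.guard(event)));
  return transition ? transition.to : state;
};

// Walk the happy path of the approval process.
const events = [
  { type: 'request-submitted' },
  { type: 'task-completed', outcome: 'approved' },
  { type: 'request-performed' },
];
const finalState = events.reduce(nextState, 'Start');
console.log(finalState); // End
```

Keeping the guards in the table, rather than scattered through handler code, is what lets the whole flow be read in one place.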
A control service is responsible for orchestrating (that is, mediating) state transitions, but not for implementing the individual activities, which are instead implemented by collaborating services—for example, a BFF service will implement a human task, and an ESG service might invoke an atomic action on an external system. Let’s take a deeper look at how we implement state transitions with entry and exit events.
Entry and exit events
If you are familiar with traditional business process management (BPM) tools, such as AWS Step Functions, then you will notice a difference in how control services orchestrate business processes. For example, to perform an atomic action step, a traditional BPM tool will directly invoke other services and wait for a response. It may also support publishing an entry event when the step starts and an exit event when a response is received so that others can observe the flow of control, but these events are completely optional.
Of course, having one service directly invoke another service violates our goal of creating autonomous services with fortified boundaries.
Control services, on the other hand, implement orchestration using just entry and exit events. This approach eliminates synchronous invocation dependencies and builds on the dependency inversion principle (DIP), the inversion of responsibility principle (IRP), and the Liskov substitution principle to create a completely decoupled solution. It also implicitly supports observers, such as an event lake, without additional effort.
Following this approach, the control service acts as a policy-setting module that establishes the contracts that other services must implement to participate in the business process. These contracts are the pairs of entry and exit events. The following sequence diagram provides an example of these events for a process that is similar to the one depicted in Figure 8.4. In this example, a user wants to perform an action, but approval is required first:
Figure 8.5: Orchestration sequence diagram
The Initiator BFF service signals its intention to start a business process by publishing a request-submitted event. The Some Process control service transitions to the Task state when it receives the request-submitted event and emits a task-initiated event to indicate that it has entered that state. To exit this state, the process expects to receive a task-completed event.
The Task BFF service implements the human activity. It consumes task-initiated events and presents tasks to users in a work list. When a user completes a task, it produces a task-completed event with the user’s disposition.
The Some Process control service receives the task-completed event, evaluates the conditions, and determines whether to transition to the next state. If the user completed the task with an approval, then the business process transitions to the Action state and emits a request-approved event to indicate that it has entered that state. To exit this state, the control service expects to receive a request-performed event.
The Some ESG service consumes the request-approved event, performs the atomic action, and produces a request-performed event when it has confirmation that the action is complete. At this point, the process is complete. The Some Process control service collects the request-performed event but has no further reaction.
The following code block shows how a control service can implement the business process using a set of rules. The first rule emits the task-initiated event in reaction to the request-submitted event. The second rule emits the request-approved event in reaction to the task-completed event. These rules are where we wire the outputs of one activity to the inputs of another activity as well; for example, in the first rule, uow.event.request is mapped to task.artifact, and in the second rule, uow.event.artifact is mapped to request:
    import { evaluate } from 'aws-lambda-stream';

    const RULES = [{
      // Entry into the Task state: react to the request and initiate a review task.
      id: 'o1',
      pattern: evaluate,
      eventType: ['request-submitted'],
      emit: (uow, rule, template) => ({
        ...template,
        type: 'task-initiated',
        task: {
          subject: `Review request: ${uow.event.request.id}`,
          role: 'Reviewer',
          artifact: uow.event.request, // pass the request along as the task's artifact
        },
      }),
    },
    {
      // Entry into the Action state: only when the task was completed with an approval.
      id: 'o2',
      pattern: evaluate,
      eventType: ['task-completed'],
      filter: (uow) => uow.event.task.outcome === 'approved',
      emit: (uow, rule, template) => ({
        ...template,
        type: 'request-approved',
        request: uow.event.artifact, // map the reviewed artifact back to the request
      }),
    }];
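To see how rules of this shape drive the flow, the following sketch runs simplified copies of the two rules through a hypothetical `react` helper. The helper stands in for the stream-processing pipeline and is not the `aws-lambda-stream` API. The event payloads and the `triggeredBy` template field are also assumptions made for illustration.

```javascript
// Simplified copies of rules o1 and o2. The `pattern` field is omitted because
// this sketch does not use the real pipeline.
const RULES = [{
  id: 'o1',
  eventType: ['request-submitted'],
  emit: (uow, rule, template) => ({
    ...template,
    type: 'task-initiated',
    task: {
      subject: `Review request: ${uow.event.request.id}`,
      role: 'Reviewer',
      artifact: uow.event.request,
    },
  }),
}, {
  id: 'o2',
  eventType: ['task-completed'],
  filter: (uow) => uow.event.task.outcome === 'approved',
  emit: (uow, rule, template) => ({
    ...template,
    type: 'request-approved',
    request: uow.event.artifact,
  }),
}];

// Hypothetical helper: apply every matching rule to one incoming event and
// return the events the control service would emit.
const react = (rules, event) => {
  const uow = { event };
  return rules
    .filter((rule) => rule.eventType.includes(event.type))
    .filter((rule) => !rule.filter || rule.filter(uow))
    .map((rule) => rule.emit(uow, rule, { triggeredBy: event.id }));
};

// Assumed payloads for the three cases of interest.
const submitted = { id: 'e1', type: 'request-submitted', request: { id: 'r1' } };
const approved = { id: 'e2', type: 'task-completed', task: { outcome: 'approved' }, artifact: { id: 'r1' } };
const rejected = { id: 'e3', type: 'task-completed', task: { outcome: 'rejected' }, artifact: { id: 'r1' } };

console.log(react(RULES, submitted).map((e) => e.type)); // [ 'task-initiated' ]
console.log(react(RULES, approved).map((e) => e.type)); // [ 'request-approved' ]
console.log(react(RULES, rejected).length); // 0
```

Note that a rejected task produces no event in this sketch; a rejection flow would need its own rule.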
Leveraging entry and exit events allows us to completely decouple the various collaborators—for example, the Initiator BFF service does not know or care which business process handles the request, and the Some Process control service can be initiated by any service. The control service, in turn, does not know or care which service reacts to an entry event, so long as it emits the expected exit event. These are examples of the flexibility provided by the IRP, as upstream services defer to downstream services that take responsibility for reacting to the upstream events.
Following the DIP, the high-level control service is only concerned with when to emit an entry event and not with how to react. It establishes the contract of each entry/exit event pair and delegates the details to the low-level boundary services, and—as just mentioned—the control service has no dependency on which services implement the activities.
Following the Liskov substitution principle (LSP), we can substitute different collaborators—for example, a legacy system may implement an activity until the capability has been re-architected. Multiple collaborators can also participate simultaneously, with each filtering for and reacting to a subset of the events. This can be useful for beta-testing different potential implementations or for supporting different needs under different scenarios. In any case, the important aspect is that these details can change independently of the control service that defines the business process.
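As a rough illustration of substitution, the sketch below has two collaborators that honor the same entry/exit contract while each reacts to a different subset of the events. The `beta` flag, the `performedBy` field, and the `dispatch` helper (a stand-in for event delivery) are hypothetical names introduced only for this example.

```javascript
// Two hypothetical collaborators that honor the same contract: consume a
// request-approved entry event and produce a request-performed exit event.
const legacyCollaborator = {
  accepts: (event) => event.type === 'request-approved' && !event.request.beta,
  handle: (event) => ({ type: 'request-performed', request: event.request, performedBy: 'legacy' }),
};

const betaCollaborator = {
  accepts: (event) => event.type === 'request-approved' && event.request.beta === true,
  handle: (event) => ({ type: 'request-performed', request: event.request, performedBy: 'beta' }),
};

// Every collaborator sees every event and reacts only to its own subset.
const dispatch = (collaborators, event) => collaborators
  .filter((c) => c.accepts(event))
  .map((c) => c.handle(event));

const collaborators = [legacyCollaborator, betaCollaborator];

const betaExits = dispatch(collaborators, { type: 'request-approved', request: { id: 'r1', beta: true } });
console.log(betaExits.map((e) => e.performedBy)); // [ 'beta' ]

const legacyExits = dispatch(collaborators, { type: 'request-approved', request: { id: 'r2' } });
console.log(legacyExits.map((e) => e.performedBy)); // [ 'legacy' ]
```

In both cases the control service would receive the same request-performed exit event, so retiring the legacy collaborator would require no change to the process rules.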
The low-level boundary services are also not coupled to the actual business process and can potentially participate in multiple business processes. For example, a boundary service such as the Task BFF service can define a generic pair of entry/exit events that it supports, such as task-initiated and task-completed, and thus can participate in any business process that uses those events.
An activity in a business process can also be implemented by another control service that is essentially implementing a sub-process. In this case, the entry event would initiate the sub-process, and the terminating event of the sub-process would be the exit event expected by the parent process.
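The relationship between a parent process and a sub-process can be sketched with three small rule tables. All event names here (`order-submitted`, `review-initiated`, and so on) and the `run` helper are hypothetical; the helper simply feeds each emitted event back to every service, which approximates what an event hub would do.

```javascript
// Hypothetical rules for a parent process. Each rule maps an incoming event
// type to the event type it emits.
const PARENT_RULES = [
  { eventType: 'order-submitted', emit: 'review-initiated' }, // entry event of the activity
  { eventType: 'review-completed', emit: 'order-approved' }, // reacts to the activity's exit event
];

// The sub-process is just another control service with its own rules.
const SUB_PROCESS_RULES = [
  { eventType: 'review-initiated', emit: 'check-initiated' }, // parent's entry event starts the sub-process
  { eventType: 'check-completed', emit: 'review-completed' }, // terminating event is the parent's exit event
];

// Stand-in for the boundary service that performs the sub-process activity.
const COLLABORATOR_RULES = [
  { eventType: 'check-initiated', emit: 'check-completed' },
];

// Feed each emitted event back through all services until nothing reacts.
const run = (firstType) => {
  const services = [PARENT_RULES, SUB_PROCESS_RULES, COLLABORATOR_RULES];
  const trace = [firstType];
  for (let i = 0; i < trace.length; i += 1) {
    services.forEach((rules) => rules
      .filter((rule) => rule.eventType === trace[i])
      .forEach((rule) => trace.push(rule.emit)));
  }
  return trace;
};

console.log(run('order-submitted').join(' -> '));
// order-submitted -> review-initiated -> check-initiated -> check-completed -> review-completed -> order-approved
```

The parent rules never mention the check events, so the sub-process can change its internal steps without affecting the parent.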
Ultimately, the most important characteristic of orchestrating business processes with control services is that we are implementing all the control-flow logic for the business process in one place. This has three key benefits, as follows:
- It makes learning and understanding the policies easier because they are not spread across the system.
- When the policies change, the impact has a higher likelihood of being limited to a single control service.
- The policies can be tested in isolation from the details of the boundary services.
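As a small illustration of the last point, a rule's filter and emit functions can be exercised as plain functions, with no boundary services or messaging infrastructure involved. The sketch below uses a simplified copy of an approval rule; the `check` and `uowFor` helpers and the event payload shape are assumptions made for this example.

```javascript
// Rule under test: a simplified approval rule without the pipeline wiring.
const approvalRule = {
  eventType: ['task-completed'],
  filter: (uow) => uow.event.task.outcome === 'approved',
  emit: (uow, rule, template) => ({
    ...template,
    type: 'request-approved',
    request: uow.event.artifact,
  }),
};

// Tiny assertion helper so the sketch has no dependencies.
const check = (condition, message) => {
  if (!condition) throw new Error(message);
};

// Builds a fake unit of work for a given task outcome (assumed payload shape).
const uowFor = (outcome) => ({
  event: { type: 'task-completed', task: { outcome }, artifact: { id: 'r1' } },
});

// The policy: only approvals move the process forward.
check(approvalRule.filter(uowFor('approved')) === true, 'approved tasks should pass the filter');
check(approvalRule.filter(uowFor('rejected')) === false, 'rejected tasks should be filtered out');

// The wiring: the reviewed artifact becomes the request of the next activity.
const emitted = approvalRule.emit(uowFor('approved'), approvalRule, {});
check(emitted.type === 'request-approved', 'should emit the entry event of the next state');
check(emitted.request.id === 'r1', 'should map the artifact to the request');

console.log('all policy checks passed');
```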
Now, let’s look at how to implement more complicated business processes with parallel execution paths.
Parallel execution
The previous business process example is considered to be a simple process because the steps in the process are executed sequentially. More complex business processes include parallel execution. Parallel execution can take the form of fan-outs or forks and joins.
A fan-out is applicable when multiple activities can be executed in parallel but there is no need for all the activities to be completed before continuing on to the next activity. For example, at a certain point in the flow, it may be necessary to send the customer a status update, but there is no need to hold up the rest of the process while this happens.
The following diagram depicts how this might look, with Step 1 fanning out to Step 2 and Step 3. Step 2 transitions to Step 4 when it is complete, whereas Step 3 has no follow-on steps. Implementing a fan-out is straightforward. We only need to emit an entry event, and there may not be a need to react to an exit event:
Figure 8.6: Fan-out
This is also an opportunity for multiple services to react in parallel to the same entry event, such as one ESG service sending a status update by email and another ESG service sending it by Short Message Service (SMS). Each will emit an exit event for auditing purposes, but the process does not need to react to these events.
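A fan-out can be sketched as two rules that react to the same event, plus the absence of any rule for the exit event of the side branch. The rule IDs, the event names, and the `react` helper below are hypothetical and only approximate how a rules-based control service might express this.

```javascript
// Hypothetical fan-out rules: Step 1 fans out to Step 2 and Step 3.
const RULES = [
  { id: 'f1', eventType: 'step1-completed', emit: 'step2-initiated' },
  { id: 'f2', eventType: 'step1-completed', emit: 'step3-initiated' }, // for example, a status update
  { id: 'f3', eventType: 'step2-completed', emit: 'step4-initiated' },
  // No rule reacts to step3-completed: it is recorded but drives no transition.
];

// Returns the entry events emitted in reaction to one incoming event type.
const react = (type) => RULES
  .filter((rule) => rule.eventType === type)
  .map((rule) => rule.emit);

console.log(react('step1-completed')); // [ 'step2-initiated', 'step3-initiated' ]
console.log(react('step2-completed')); // [ 'step4-initiated' ]
console.log(react('step3-completed')); // []
```

Because Step 4 depends only on step2-completed, a slow or failed status update would not hold up the main path in this sketch.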
Forks and joins are applicable when we need to execute multiple activities in parallel and they all must be completed before continuing to the next activity. For example, in our food delivery system, sending an order to the restaurant and selecting a driver can happen in parallel, but we may not want to dispatch the driver to the restaurant until we have confirmation that the order was received.
The following diagram depicts how this might look. The first black bar represents the fork, with Step 2 and Step 3 executing in parallel. The second black bar represents the join, which must occur before transitioning to Step 4. To implement a join, we leverage the correlated events that the control service keeps in its micro events store. The join rule asserts the presence of all required exit events before emitting the next entry event:
Figure 8.7: Fork and join
The following code block shows how a control service can implement a join using rules. We have already seen a similar example of CEP in the Dissecting the Control Service pattern section. The join utility function asserts whether all the listed events have been correlated in the micro events store. When each step is completed, the rule has an opportunity to evaluate whether the other step has been completed. The entry event for Step 4 is emitted once the expression evaluates to true:
    import { evaluate } from 'aws-lambda-stream';
    import { join } from './utils';

    const RULES = [{
      id: 'o3',
      pattern: evaluate,
      // Evaluate the rule whenever either parallel step completes.
      eventType: ['step2-completed', 'step3-completed'],
      // True only once both exit events have been correlated.
      expression: join(['step2-completed', 'step3-completed']),
      emit: 'step4-initiated',
    }];
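The join utility itself is not shown here, so the following is only one plausible sketch of it. It assumes that the correlated events from the micro events store are available to the expression as an array on `uow.correlated`; that property name and the event payloads are assumptions for illustration.

```javascript
// Hypothetical join helper: returns a predicate that is true once every listed
// event type is present among the correlated events of the unit of work.
// Assumes the correlated events are available as an array on `uow.correlated`.
const join = (eventTypes) => (uow) => eventTypes.every(
  (type) => uow.correlated.some((event) => event.type === type),
);

const expression = join(['step2-completed', 'step3-completed']);

// Only Step 2 has completed so far: the join is not satisfied.
const first = { correlated: [{ type: 'step1-completed' }, { type: 'step2-completed' }] };
console.log(expression(first)); // false

// Step 3 completes and is correlated with the earlier events: the join is satisfied.
const second = { correlated: [...first.correlated, { type: 'step3-completed' }] };
console.log(expression(second)); // true
```

The order in which the two steps finish does not matter in this sketch, because the predicate only checks for presence.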
Parallel execution is just one example of how a business process can become complex. First-order alternate paths, such as a rejection flow in an approval process, are straightforward to model with activity diagrams. However, second-, third-, and Nth-order alternate paths are much easier to reason about as straight rules, with help from tools such as decision tables. One pattern for handling alternate paths is called the Saga pattern. Let’s look at Sagas next, and we will cover decision tables in the Implementing complex event processing (CEP) logic section.