Correlation
Correlation is the mechanism that is used to associate a message with the conversation that it belongs to. There are two main classes of correlation:
Automatic correlation refers to mechanisms where the correlation is handled automatically. BPM uses mechanisms like WS-Addressing and JMS Message IDs to achieve automatic correlation.
Message-based correlation refers to mechanisms where the process developer defines keys that can be extracted from a message to determine which conversation it belongs to. Examples are given in the next section.
There are some occasions when message-based correlation is necessary because automatic correlation is not available, for example:
When the other participant does not support WS-Addressing, or
When a participant joins the conversation part way through and has only the data values, with no other information about the conversation
If you do not specify any settings for message-based correlation, the runtime engine will attempt to use automatic correlation. If that is not possible, you will get a correlation fault. The engine checks whether the called process or service supports WS-Addressing and, if so, inserts a WS-Addressing header into the call and then waits for a matching reply. Similarly, if JMS is being used to transport the message, the engine looks for a reply message whose JMS correlation ID matches the JMS message ID of the message it sent.
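To make the JMS case concrete, here is a minimal sketch of the matching rule in Python. It is a toy model, not the engine's implementation: the class name RequestReplyCorrelator, the dictionary-based messages, and the use of a UUID in place of a real JMS message ID are all assumptions made for illustration.

```python
import uuid


class RequestReplyCorrelator:
    """Toy model of automatic correlation (hypothetical, not the BPM engine)."""

    def __init__(self):
        # Message ID of each outstanding request -> the conversation waiting on it.
        self.pending = {}

    def send(self, conversation, payload):
        # A UUID stands in for the message ID that the transport would assign.
        message_id = str(uuid.uuid4())
        self.pending[message_id] = conversation
        return {"message_id": message_id, "payload": payload}

    def on_reply(self, reply):
        # The reply carries the request's message ID as its correlation ID.
        conversation = self.pending.pop(reply["correlation_id"], None)
        if conversation is None:
            raise LookupError("correlation fault: no request is waiting for this reply")
        return conversation


correlator = RequestReplyCorrelator()
request = correlator.send("conversation-A", {"orderNumber": 1001})

# The called service copies the request's message ID into the reply.
reply = {"correlation_id": request["message_id"], "payload": "accepted"}
matched = correlator.on_reply(reply)
print(matched)
```

A reply whose correlation ID matches no outstanding request raises an error in this sketch, which loosely mirrors the correlation fault described above.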
Correlation is especially important inside a loop construct, as there may be multiple threads/receives waiting at once, and the engine needs a way to know which reply belongs with which receive.
Correlation sets
When using message-based correlation, you define a set of keys that are used to determine which conversation a message belongs to. This set of keys is called a correlation set.
A correlation set is a list of the (minimum) set of attributes that are needed to uniquely identify the conversation. An example of a correlation set may be orderNumber plus customerNumber.
When the runtime engine sees a conversation that uses message-based correlation and has a correlation set attached to the start activity, it creates an MD5 hash from the values of the correlation keys and uses that hash to identify the correct reply message if and when it arrives.
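The idea of hashing the correlation key values can be sketched in a few lines of Python. The exact way the engine serializes the key values before hashing is not described here, so the key=value layout joined with "|" is purely an assumption, as are the function name correlation_hash and the instance names.

```python
import hashlib


def correlation_hash(keys):
    """Build an MD5 hash from correlation key values (illustrative format only)."""
    # Sort by key name so the same values always produce the same hash,
    # regardless of the order in which the keys were supplied.
    canonical = "|".join(f"{name}={keys[name]}" for name in sorted(keys))
    return hashlib.md5(canonical.encode("utf-8")).hexdigest()


# Two conversations are waiting, each identified by its correlation set values.
waiting = {
    correlation_hash({"orderNumber": 1001, "customerNumber": 42}): "instance-1",
    correlation_hash({"orderNumber": 1002, "customerNumber": 42}): "instance-2",
}

# An inbound message carries the key values among its other data.
incoming = {"customerNumber": 42, "orderNumber": 1002, "status": "shipped"}
key_values = {name: incoming[name] for name in ("orderNumber", "customerNumber")}

target = waiting[correlation_hash(key_values)]
print(target)  # prints "instance-2"
```

Note that customerNumber alone would not distinguish the two conversations in this sketch, which is why the correlation set must contain enough attributes to identify a conversation uniquely.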
When you are using message-based correlation, only the called process needs to be aware of correlation, not the calling process. The runtime engine will take care of details for the calling process, so you do not need to include any correlation details in the process model for the calling process.
Note
It is important to understand that these rules do not apply when the calling process wants to call the called process more than once, as is the case when the call is inside a loop, for example. This scenario will be discussed shortly.
In the called process, you need to include the correlation set definition, and specify that the appropriate events or tasks use correlation. Let's look at an example in the following diagram:
The receive task in this process has correlation specified in its properties. It has a correlation set identified, which contains a single key called ck_number, and the mode is set to Initiates as shown in the following screenshot. This tells the runtime engine that this process instance is going to use message-based correlation. It also has the Create Instance property set. This tells the runtime engine that an inbound message will start an instance of this process.
If there are other receive tasks or message catch events in this process, they need to have correlation defined with the same correlation set and the mode set to Uses. These are called mid-point receives, that is, places where the process instance can receive another message after it has already started executing. These could be used by the calling process to send a "cancel" message to tell the running instance of the called process to stop work, for example.
You do not need to define any correlation properties on the outputs of the process, such as its send task, end (message) nodes, or throw message events. Only inputs have correlation properties defined.
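The difference between the Initiates and Uses modes can be illustrated with a small Python model of the called process. This is a sketch under stated assumptions: the class CalledProcessEngine, its receive method, and the string mode values are hypothetical and do not correspond to any product API. Only the key name ck_number comes from the example above.

```python
class CalledProcessEngine:
    """Toy router for inbound messages of a called process (hypothetical)."""

    def __init__(self):
        # Value of ck_number -> state of the process instance it identifies.
        self.instances = {}

    def receive(self, mode, ck_number, message):
        if mode == "initiates":
            # Start receive with Create Instance set: the inbound message
            # starts a new instance and fixes the correlation key value.
            if ck_number in self.instances:
                raise ValueError("an instance already exists for this key")
            self.instances[ck_number] = {"status": "running", "messages": [message]}
        elif mode == "uses":
            # Mid-point receive: route the message to the running instance
            # whose correlation key value matches.
            instance = self.instances.get(ck_number)
            if instance is None:
                raise LookupError("correlation fault: no matching instance")
            instance["messages"].append(message)
            if message == "cancel":
                instance["status"] = "cancelled"
        else:
            raise ValueError(f"unknown correlation mode: {mode}")
        return self.instances[ck_number]


engine = CalledProcessEngine()
engine.receive("initiates", 7, "start")
engine.receive("initiates", 8, "start")

# A later "cancel" message reaches only the instance with ck_number 7.
engine.receive("uses", 7, "cancel")
print(engine.instances[7]["status"], engine.instances[8]["status"])
```

Nothing in this model attaches correlation to outbound messages, which matches the rule that only inputs have correlation properties defined.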
Correlation when there are multiple calls
There are some occasions when you will want to call a service or process several times from the same instance of a process. This commonly occurs when you want to call the service for every item in a collection, for example.
In this scenario, you need to place the send task and receive task (or throw and catch events) inside an embedded sub-process and define a scoped conversation inside the embedded sub-process. As mentioned previously, you will not need to define correlation information in the calling process, just the called process.
Here is an example of a process that contains a multi-instance embedded sub-process that iterates over an array of input data, calling another process to carry out some work on each element in that array, in parallel.
There is a scoped conversation defined inside the embedded sub-process as we see in the following image. The send and receive tasks each use this conversation, rather than the default conversation. We will build this process in the next chapter.
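Before we build the process, the following Python sketch shows why each iteration needs its own conversation. It is an analogy rather than a model of the engine: the functions called_process and process_all, the UUID conversation IDs, and the shuffled reply order are all assumptions used to imitate parallel calls whose replies return in no particular order.

```python
import random
import uuid


def called_process(request):
    # Stand-in for the called process: it echoes the conversation ID back
    # with its result so that the caller can match the reply.
    return {
        "conversation_id": request["conversation_id"],
        "result": request["item"].upper(),
    }


def process_all(items):
    pending = {}   # conversation ID -> index of the iteration waiting on it
    requests = []
    for index, item in enumerate(items):
        # Each iteration gets its own scoped conversation.
        conversation_id = str(uuid.uuid4())
        pending[conversation_id] = index
        requests.append({"conversation_id": conversation_id, "item": item})

    replies = [called_process(request) for request in requests]
    random.shuffle(replies)  # replies may arrive in any order

    results = [None] * len(items)
    for reply in replies:
        # The conversation ID tells us which iteration the reply belongs to.
        index = pending.pop(reply["conversation_id"])
        results[index] = reply["result"]
    return results


results = process_all(["apple", "banana", "cherry"])
print(results)  # prints ['APPLE', 'BANANA', 'CHERRY']
```

If every iteration shared a single conversation ID, the pending table would hold only one entry and the replies could not be told apart, which is the situation a scoped conversation avoids.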