How does the persistence subsystem work?

The persistence subsystem consists of four parts:

1.       Persistence Framework API A set of persistent interfaces that define the contract with Workflow Service Host to persists its state. These enable you to build any type of persistence provider from database backed to say distributed in-memory cache.

2.       The SQL Workflow Instance Store (SWIS) implements the abstract class InstanceStore of the Persistence Framework API. This class is used by the Workflow Service Host to create, load, save and delete durable instance data.

3.       The SQL Server Persistence database stores all the durable instance state. It also stores the additional metadata that is used to activate and recover a service instance.

4.       The Workflow Management Service (WMS) is a Windows Service that activates a Workflow Service Host whenever there are unloaded instances with expired durable timers, instances that need to be reactivated after a graceful shutdown or instances that need to be recovered after a crash. The WMS is also involved in the execution of instance control commands, which will be covered in a future blog post.

All workflow instances that are hosted by a Workflow Service Host that configures the SQL Workflow Instance Store save their instance state in the SQL Server Persistence Database. In addition to the instance state binary blob, the persistence store holds the following:
The message correlation key allows the Workflow Service Host to correlate an incoming message to a service instance if that instance is not loaded.
The instance lock indicates whether the service instance is loaded.
The service deployment information defines how the service is deployed in a IIS/WAS-hosted environment. It consists of site name, application path, service path and service name. Note that the instance store does not contain any deployment information for self-hosted workflows, meaning that the WMS is specifically tailored for IIS hosted services.

Loading and locking

If a workflow instance is loaded by a Workflow Service Host the persistence store locks that instance, and the instance cannot be loaded by any other service host. If the service host unloads the service instance the lock is released, and the instance can be loaded by a different service host. The new service host may reside on a different machine. This means that a workflow instance can run on multiple machines throughout its lifetime, whenever activation messages arrive. It also means that you can build a machine farm without employing an intelligent message router that remembers which machine is running a particular workflow instance. Instead, the router routes a message to any of the machines in the farm. If the message is correlated to an existing workflow instance, the instance will be loaded by that machine. Casually speaking, the service instance follows the message.

Processing of expired timers

If a workflow instance is executing a delay activity at the time it persists the instance store stores the expiration time of the activity. At the time the activity expires the SQL Workflow Instance Store notifies the Workflow Service Host, which then loads the workflow instance and processes the expired delay activity. If multiple machines are available, the instance may be loaded on any of these machines.
If no Workflow Service Host is running on a particular machine the WMS will activate a host on that machine. This causes the instances of a particular workflow type to be distributed among all machines in the farm.

Instance recovery

If a service host has loaded a workflow instance the Workflow Service Host must renew the instance's lock on a regular basis. If not renewed on time, the lock expires. An expired lock indicates that the service host or the machine the service host ran on has crashed. In this case another Workflow Service Host will load the instance. If no other Workflow Service Host is running the WMS will activate a host.
If the Workflow Service Host shuts down (e.g., due to an app domain recycle) it releases the locks of all the service instances it has loaded.