Certain features of lock-free Twol are derived from the sequential Twol back-end structure described in Chapter 2, so a brief review of that structure is presented here. Figure 5-2 illustrates the basic structure of a sequential (i.e. not lock-free) Twol back-end, which consists of two tiers. The first tier (T1) is an array of buckets, each containing an unsorted linked list; the second tier (T2) is a single unsorted linked list. The sequential Twol back-end is not, on its own, a priority queue. To create a priority queue, it has to be combined with a front-end structure, which can be any conventional priority queue (not shown in Figure 5-2), e.g. a sorted linked list or a tree-based structure. The two additional tiers of the Twol back-end reduce the sorting overhead in the front-end structure, resulting in a more efficient epoch-based deferred-sort priority queue.
Initially, all elements are queued in T2 in an unsorted manner. The first DeleteMin operation requires two transfer operations. The first transfer moves the elements in T2 to T1. Prior to this transfer, each bucket in T1 is assigned a key-range, so that an element whose key falls within a bucket's key-range is inserted into that bucket in an unsorted manner. The key-ranges are chosen so that, assuming uniformly distributed keys, each bucket holds one element on average. The second transfer then moves the elements in the first bucket of T1 to the front-end structure, where they are fully sorted so that the element with the smallest key can be deleted.
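The two transfers described above can be sketched as follows. This is only an illustrative sketch: it assumes numeric keys, equal-width key-ranges, and one bucket per element in T2 (which gives one element per bucket on average for uniformly distributed keys). Python lists stand in for the linked lists, and the function names are hypothetical rather than taken from the Twol implementation.

```python
def transfer_t2_to_t1(t2):
    """First transfer: move every key from the unsorted T2 into T1 buckets.

    Returns (t1, start, width), where bucket i covers the key-range
    [start + i * width, start + (i + 1) * width).
    """
    n = len(t2)
    if n == 0:
        return [], 0.0, 1.0
    lo, hi = min(t2), max(t2)
    # One bucket per element (assumption); fall back to width 1.0 when
    # all keys are equal so that the division below is well defined.
    width = (hi - lo) / n or 1.0
    t1 = [[] for _ in range(n)]
    for key in t2:
        idx = min(int((key - lo) / width), n - 1)  # clamp the maximum key
        t1[idx].append(key)                        # buckets stay unsorted
    t2.clear()                                     # elements have moved
    return t1, lo, width


def transfer_first_bucket(t1):
    """Second transfer: move bucket 0 to the front-end and sort it fully."""
    front_end = sorted(t1[0])
    t1[0] = []
    return front_end


t2 = [5.0, 1.0, 3.0, 9.0]
t1, start, width = transfer_t2_to_t1(t2)
front_end = transfer_first_bucket(t1)
smallest = front_end[0]  # the element a DeleteMin would now remove
```

Only the elements of one bucket are ever sorted at a time, which is where the saving over sorting the whole queue comes from.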
Once the T1 and T2 structures are set up, newly arriving elements are inserted in the front-end structure, in T1, or in T2, depending on their key values.
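One way to picture this routing decision is the sketch below. The thresholds are assumptions made for illustration: `t2_low` is a hypothetical lower bound above which keys go to T2, and `cur` is the index of the first T1 bucket that has not yet been transferred to the front-end. The text does not prescribe these names or this exact rule.

```python
import bisect


def route_insert(key, front_end, t1, t2, t1_start, width, cur, t2_low):
    """Insert key into the tier matching its value and report which one.

    front_end is a sorted list, t1 a list of unsorted buckets and t2 an
    unsorted list. All parameter names are hypothetical.
    """
    if key >= t2_low:
        t2.append(key)                     # far-future key: defer sorting
        return "T2"
    idx = min(int((key - t1_start) / width), len(t1) - 1)
    if idx >= cur:
        t1[idx].append(key)                # bucket not yet transferred
        return "T1"
    bisect.insort(front_end, key)          # key-range already in front-end
    return "front-end"


# State after bucket 0 has been transferred to the front-end.
front_end, t1, t2 = [1.0], [[], [3.0], [5.0], [9.0]], []
for key in (0.5, 6.0, 12.0):
    route_insert(key, front_end, t1, t2,
                 t1_start=1.0, width=2.0, cur=1, t2_low=9.0)
```

Only the insert into the front-end pays a sorting cost; inserts into T1 and T2 are simple appends.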
As more DeleteMin operations occur, the front-end structure depletes, causing elements from subsequent buckets of T1 to be transferred to the front-end structure. As this continues, T1 also depletes, eventually reaching a point where neither the front-end nor T1 holds any elements and all future events are found in T2. At this point, the old T1 tier is deleted, and the queue is back in its initial state where all elements are in T2.
We define an epoch as the period that begins when all elements are in T2, continues through the two transfer operations that create T1 and the front-end structure and through a transient sequence of Inserts and DeleteMins (during which all three structures are in place), and ends when no more elements are found in the front-end structure or in T1, so that T1 is deleted. In other words, sequential Twol proceeds in epochs: at the start of each epoch, two transfer operations occur and a new T1 is created. This epoch-based approach drastically reduces sorting overheads, particularly in the sorted front-end structure, leading to expected O(1) performance for many priority increment distributions (see Chapter 2).
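The epoch cycle can be made concrete with a compact sequential model. The sketch below is an assumption-laden illustration, not the implementation from Chapter 2: it uses Python lists in place of linked lists, a sorted list as the front-end, equal-width key-ranges with one bucket per transferred element, and hypothetical names throughout. The `epochs` counter exists only to make the epoch boundaries observable.

```python
import bisect


class SequentialTwolSketch:
    """Toy model of sequential Twol: front-end + T1 + T2, run in epochs."""

    def __init__(self):
        self.front = []                # sorted front-end structure
        self.t1 = []                   # array of unsorted buckets
        self.t2 = []                   # unsorted list
        self.start, self.width = 0.0, 1.0
        self.cur = 0                   # next T1 bucket to transfer
        self.t2_low = float("-inf")    # keys >= t2_low are sent to T2
        self.epochs = 0

    def _bucket(self, key):
        return min(int((key - self.start) / self.width), len(self.t1) - 1)

    def insert(self, key):
        if key >= self.t2_low:
            self.t2.append(key)
            return
        idx = self._bucket(key)
        if idx >= self.cur:
            self.t1[idx].append(key)
        else:
            bisect.insort(self.front, key)

    def _start_epoch(self):
        # The old T1 is discarded and a new T1 is built from T2.
        n, lo, hi = len(self.t2), min(self.t2), max(self.t2)
        self.start, self.width = lo, (hi - lo) / n or 1.0
        self.t1, self.cur = [[] for _ in range(n)], 0
        for key in self.t2:
            self.t1[self._bucket(key)].append(key)
        self.t2, self.t2_low = [], hi
        self.epochs += 1

    def delete_min(self):
        while not self.front:
            if self.cur < len(self.t1):
                # Transfer the next bucket to the front-end and sort it.
                self.front = sorted(self.t1[self.cur])
                self.t1[self.cur] = []
                self.cur += 1
            elif self.t2:
                self._start_epoch()    # front-end and T1 are both depleted
            else:
                raise IndexError("delete_min from an empty queue")
        return self.front.pop(0)


q = SequentialTwolSketch()
for key in [5.0, 1.0, 3.0, 9.0, 7.0]:
    q.insert(key)
first_two = [q.delete_min(), q.delete_min()]
```

In this model, a key that arrives during an epoch and is at least as large as the largest key transferred at the start of that epoch waits in T2 until the next epoch begins.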
Figure 5-2: Basic structure of sequential Twol back-end structure.
Twol Structure
Lock-free Twol, illustrated in Figure 5-3, appears very similar to the sequential Twol back-end but differs from it in a number of ways. The main reason for the differences is that, in a lock-free environment, conflicting operations can occur at the same time. For example, one process may try to insert an element in T1 while another process is performing a transfer (which involves removing T1 and creating a new T1), and an inconsistency may result. Hence, in a lock-free scenario, many operations that are straightforward in the sequential case, such as transfers, deletions and insertions, need to be CASed (i.e. compare first for consistency and then swap if consistent, otherwise return failed) or MCASed (defined in the same way as CASed, but operating on multiple memory locations).
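The compare-then-swap semantics, and the retry loop that usually surrounds a failed CAS, can be modelled as below. Python exposes no hardware CAS instruction, so this sketch emulates the atomicity of a single CAS with an internal lock; it illustrates the semantics only and is not itself lock-free. The class and function names are hypothetical, and MCAS is not modelled.

```python
import threading


class CasCell:
    """Models one memory word that supports compare-and-swap."""

    def __init__(self, value=None):
        self._value = value
        self._guard = threading.Lock()   # emulates hardware atomicity only

    def load(self):
        return self._value

    def compare_and_swap(self, expected, new):
        """Store new only if the cell still holds expected; report success."""
        with self._guard:
            if self._value is expected:
                self._value = new
                return True
            return False


class Node:
    def __init__(self, key, next_node):
        self.key = key
        self.next = next_node


def push_front(head, key):
    """Insert at the list head, retrying whenever another thread got in first."""
    while True:
        old = head.load()
        if head.compare_and_swap(old, Node(key, old)):
            return


def run_demo(num_threads=4, per_thread=200):
    """Push concurrently from several threads and count the resulting nodes."""
    head = CasCell(None)

    def worker(tid):
        for i in range(per_thread):
            push_front(head, (tid, i))

    threads = [threading.Thread(target=worker, args=(t,))
               for t in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    count, node = 0, head.load()
    while node is not None:
        count, node = count + 1, node.next
    return count
```

A failed CAS means that another thread changed the head between the load and the swap, so the operation re-reads the head and tries again; no insert is lost.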
A transfer operation is particularly overhead-hungry because of MCAS consistency checking. Hence, the first major difference in lock-free Twol, as illustrated in Figure 5-3, is that T1 is an array of buckets each containing a sorted linked list. T1 is therefore itself the front-end queue structure (no additional front-end queue structure is needed), which eliminates the second transfer operation described in the previous section. Lock-free Twol is nevertheless still epoch-based: as more DeleteMin operations are performed, the elements in the first bucket deplete, after which deletion proceeds from the second bucket, and so forth until the last bucket in T1. For convenience, we now make the following definitions:
Definition 5-1: Invalid Bucket – this is a bucket in T1 where all elements in that bucket have already been deleted. Newly arriving events will no longer be inserted in this bucket but will be inserted in subsequent valid buckets or in T2.
Definition 5-2: Earliest Bucket Index (also referred to as T1index) – the index of the valid bucket in T1 with the minimum key-range.
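Definitions 5-1 and 5-2 can be illustrated with a small single-threaded model of T1 in which each bucket is kept sorted. This is a sketch under stated assumptions: it ignores concurrency entirely, uses equal-width key-ranges, and treats any empty bucket that deletion has moved past as invalid. The class, its `valid` flags and the `t1_index` attribute are hypothetical names.

```python
import bisect


class T1Sketch:
    """Array of sorted buckets with an Earliest Bucket Index (t1_index)."""

    def __init__(self, start, width, num_buckets):
        self.start, self.width = start, width
        self.buckets = [[] for _ in range(num_buckets)]  # each kept sorted
        self.valid = [True] * num_buckets
        self.t1_index = 0

    def _skip_depleted(self):
        # Invalidate depleted buckets and advance to the earliest valid one.
        while (self.t1_index < len(self.buckets)
               and not self.buckets[self.t1_index]):
            self.valid[self.t1_index] = False
            self.t1_index += 1

    def insert(self, key):
        """Return the bucket index used, or None if key belongs in T2."""
        idx = int((key - self.start) // self.width)
        idx = max(idx, self.t1_index)    # never insert into an invalid bucket
        if idx >= len(self.buckets):
            return None
        bisect.insort(self.buckets[idx], key)
        return idx

    def delete_min(self):
        """Return the smallest key, or None once T1 is depleted."""
        self._skip_depleted()
        if self.t1_index == len(self.buckets):
            return None
        key = self.buckets[self.t1_index].pop(0)
        self._skip_depleted()
        return key


t1 = T1Sketch(start=0.0, width=10.0, num_buckets=3)
t1.insert(5.0)
t1.insert(25.0)
t1.delete_min()      # removes 5.0; buckets 0 and 1 become invalid
t1.insert(7.0)       # redirected to bucket 2, the earliest valid bucket
```

Because every bucket is sorted, a redirected key such as 7.0 still comes out before the larger keys already held in the bucket it lands in.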
Since lock-free Twol only involves one transfer operation, we also redefine epoch to be as follows:
Definition 5-3: Epoch – the period beginning from the time all elements of the lock-free structure are found in T2, followed by the first DeleteMin operation resulting in one (and only one) transfer operation from T2 to T1, to the time T1 is depleted of elements (i.e. all the buckets of T1 are invalidated), so that T1 is eventually deleted.
The second major difference is that the T2 tier of lock-free Twol is an array of m buckets, each containing an unsorted linked list, where m is the number of processes accessing the lock-free Twol. By contrast, the T2 tier of sequential Twol is merely a single unsorted linked list. The reason for this modification is to reduce MCAS overheads. If a single linked list were used for the T2 tier of lock-free Twol, every insert into T2 would incur MCAS overheads. For example, suppose two processes concurrently insert events at the head of T2. To prevent the pointer inconsistencies that arise when two concurrent inserts access the same queue position, an overhead-hungry MCAS operation must be employed. With the m-bucket array, each process inserts into its own bucket and therefore does not create a consistency problem with concurrent inserts into T2 by other processes. Hence, an insert operation into T2 can proceed with little MCAS overhead. This chapter also presents experimental studies of two variants of lock-free Twol: one where T2 is a single unsorted list and one where T2 is an array of m unsorted lists. As the numerical sections will show, the variant in which T2 is an array of m unsorted lists performs better.
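The layout of the partitioned T2 tier can be sketched as follows. The sketch models only the data layout, with one unsorted list per process and threads standing in for processes; it does not model MCAS or the transfer protocol, and the class and method names are hypothetical.

```python
import threading


class PartitionedT2:
    """T2 as an array of m unsorted lists, one per process."""

    def __init__(self, m):
        self.buckets = [[] for _ in range(m)]

    def insert(self, pid, key):
        # Only process pid ever writes to buckets[pid], so inserts by
        # different processes never contend for the same list head.
        self.buckets[pid].append(key)

    def drain(self):
        """Collect every element (e.g. for a transfer to T1) and empty T2."""
        items = [key for bucket in self.buckets for key in bucket]
        self.buckets = [[] for _ in self.buckets]
        return items


def run_demo(m=3, per_process=100):
    """Let m threads insert concurrently, each into its own bucket."""
    t2 = PartitionedT2(m)

    def worker(pid):
        for i in range(per_process):
            t2.insert(pid, pid * per_process + i)

    threads = [threading.Thread(target=worker, args=(p,)) for p in range(m)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return t2
```

The trade-off in this layout is that a reader of T2, such as a transfer, has to visit all m buckets instead of a single list.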
Figure 5-3: Basic structure of lock-free Twol structure.
Subtle Features of Lock-free Twol Structure