Lock-Free Programming is a technique that allows concurrent update updates of shared data structures without using explicit locks. This method ensures system wide progressthat no threads block for arbitrarily long times.
Advantages:-
- Could be used in places
...
- where locks
...
- must be avoided
...
- , such as interrupt handlers
- Efficiency benefits compared to lock-based algorithms
...
- for some workloads, including potential scalability benefits on multiprocessor machines
...
- Avoidance of priority inversion in real-time systems
Limitations:-
- Requires use of special atomic processor instructions such as CAS (compare and swap) or LL/SC (load linked/store conditional)
Applications:-
- “Read-copy-update†(RCU) in Linux 2.5 kernel
...
- Lock-Free Programming on AMD Multi-Core
...
- Systems
The ABA problem occurs during
...
Problem Associated with Lock-Free Approach(ABA):
ABA problem occurs during synchronization, when a location is read twice , and has the same value for both reads. However, another thread can execute has executed in-between the two reads and change modified the value, do did other work, then change modified the value back, thus fooling the first thread in to into thinking that "nothing has changed" even though the second thread did work that violates that violating this assumption.
The ABA problem occurs due to the internal reuse of nodes that have been popped off the list or by the reclamation of memory occupied by removed nodes.
Noncompliant Code Example (
...
GNU glib)
A This code implements a queue data structure implementation using lock-free method.programming. It is implemented using glib. The function CAS() internally uses g_atomic_pointer_compare_and_exchange().
| Code Block | ||
|---|---|---|
| ||
| Code Block | ||
| ||
#include <glib.h>#include <glib.h> #include <glib-object.h> #include <glib-object.h> struct Node { void *data; Node *next; }; struct Queue { Node *head; Node *tail; }; Queue* queue_new(void) { Queue *q = g_slice_new( sizeof(Queue)); q->head = q->tail = g_slice_new0( sizeof(Node)); return q; } void queue_enqueue(Queue *q, gpointer data) { Node *node,; Node *tail,; Node *next; node = g_slice_new(Node); node->data = data; node->next = NULL; while (TRUE) { tail = q->tail; next = tail->next; if (tail != q->tail) { continue; } if (next != NULL) { CAS(&q->tail, tail, next); continue; } if (CAS( &tail->next, null, node)) { break; } } CAS(&q->tail, tail, node); } gpointer queue_dequeue(Queue *q) { Node *node,; Node *tail,; Node *next; while (TRUE) { head = q->head; tail = q->tail; next = head->next; if (head != q->head) { continue; } if (next == NULL) { return NULL; // Empty } if (head == tail) { CAS(&q->tail, tail, next); continue; } data = next->data; if (CAS(&q->head, head, next)) { break; } } g_slice_free(Node, head); return data; } |
The above implementation works with glib. The function CAS uses g_atomic_pointer_compare_and_exchange() internallyLet's consider the following example:
Let us assume there are two Let us assume there are two threads (T1 and T2) operating simultaneously on the queue. The queue looks like this,
head -> A -> B -> C -
...
> tail
The following sequence of operation occurs.
Thread | Queue Before | Operation | Queue After |
|---|---|---|---|
| head -> A -> B -> C -> tail | Enters queue Enters | head -> A -> B -> C -> tail |
| head -> A -> B -> C -> tail | removes node A | head -> B -> C -> tail |
| head -> B -> C -> tail | removes node B | head -> C C -> tail |
| head -> C C -> tail | enqueues node A back into the queue | head -> A -> C -> tail |
| head -> A -> C -> tail | removes node C | head -> A -> tail |
| head -> A -> tail | enqueues a new node D | head -> A -> D -> tail |
| head -> A -> D -> tail | Thread 1 starts execution | undefined {} |
According to the above sequence of events now head will be pointing to a memory which was removedfreed.
Also if , if reclaimed memory (B) is returned to the operating system is returned to the operating system (e.g., using using munmap()), access to such memory locations can result in fatal access violation violation errors.
Compliant Solution (
...
Hazard
...
Pointers)
The core idea is to associate a number (typically one or two) of singleof single-writer writer multi-reader shared pointersreader shared pointers, called hazard pointers.
A hazard pointer either has a null value or points to a node that may be accessed later by that thread without further validation that the reference to the node is still valid. Each hazard pointer can may be written only by its owner thread, but can may be read by other threads.
This In this methodology communicates , communication with the associated algorithms is accomplished only through hazard pointers and a procedure RetireNode() that is called by threads to pass the addresses of retired nodes.
Pseudo Code:
| Code Block | ||
|---|---|---|
| ||
// Hazard Pointers Types and Strucutre structure HPRecType { { HP[K]:*Nodetype; Next:*HPRecType;} // The header of the HPRec list HeadHPRec: *HPRecType; //Per-thread private variables rlist: listType; // initially empty rcount: integer; // initially 0 //The Retired Node routine RetiredNode(node:*NodeType) { rlist.push(node); rcount++; if(rcount >= R) Scan(HeadHPRec); } // The scan routine Scan(head:*HPRecType) { // Stage1: Scan HP list and insert non null values in plist plist.init(); hprec<-head; while (hprec != null) { for (i<-0 to K-1) { hptr<-hprec^HP[i]; if (hptr!= null) plist.insert(hptr); } hprec<-hprec^Next; } // Stage 2: search plist tmplist<-rlist.popAll(); rcount<-0; node<-tmplist.pop(); while (node != null) { if (plist.lookup(node)) { rlist.push(node); rcount++; } else { PrepareForReuse(node); } node<-tmplist.pop(); } plist.free(); } |
The scan consists of two stages. The first stage involves scanning the HP list for nonThe first stage involves scanning the HP list for non-null values. Whenever a nonnull non-null value is encountered, it is inserted in a local list plist, which can be implemented as a hash table. The second stage of Scan involves checking each checking each node in rlist against the pointers in plist. If the lookup yields no match, the node is identified to be ready yields no match, the node is identified to be ready for arbitrary reuse. Otherwise, it is retained in rlist until the next scan by the current thread. Insertion and lookup in plist take constant expected time. The task of the memory reclamation method The task of the memory reclamation method is to determine when a retired node is eligible for reuse safely while allowing memory reclamation.
In the implementation, the pointer being removed is stored in the hazard pointer and thus prevents other threads from reusing it, thus avoiding the ABA problem.
Code:
| Code Block | ||
|---|---|---|
| ||
 voidvoid queue_enqueue(Queue *q, gpointer data) {void queue_enqueue(Queue *q, gpointer data) { Node *node; Node *tail; Node *node, *tail, *next; node = g_slice_new(Node); node->data = data; node->next = NULL; while (TRUE) { tail = q->tail; HAZARD_SET(0, tail); // Mark tail has hazardous if (tail != q->tail) { // Check tail hasn't changed continue; } next = tail->next; if (tail != q->tail) { continue; } if (next != NULL) { CAS(&q->tail, tail, next); continue; } if (CAS(&tail->next, null, node) { break; } } CAS(&q->tail, tail, node); } gpointer queue_dequeue(Queue *q) { Node *tail,node; Node *tail; Node *next, *head; while (TRUE) { head = q->head; LF_HAZARD_SET(0, head); // Mark head as hazardous if (head != q->head) { // Check head hasn't changed continue; } tail = q->tail; next = head->next; LF_HAZARD_SET(1, next); // Mark next has hazardous if (head != q->head) { continue; } if (next == NULL) { return NULL; // Empty } if (head == tail) { CAS(&q->tail, tail, next); continue; } data = next->data; if (CAS(&q->head, head, next)) { break; } } LF_HAZARD_UNSET(head); // Retire head, and perform // reclamation if needed. return data; } 5 Node *node, *tail, *next; 7 node = g_slice_new(Node); |
In the above code, the pointer on being removed is stored in the hazard pointer and thus prevents other threads from reusing it and avoiding the ABA problem.
Compliant Solution (using Mutex)
Code:
Compliant Solution (Mutex)
In this compliant solution, pthread_mutex_lock() is used to lock the queue.
When thread 1 locks on the queue to perform any operation then thread 2 cannot perform any operation on the queue, thus preventing the ABA problem.
| Code Blockcode | ||
|---|---|---|
| ||
 void queue_enqueue(Queue *q, gpointer data) {#include <glib.h> #include <glib-object.h> struct Node { void *data; Node *next; }; struct Queue { Node *head; Node *tail; pthread_mutex_t mutex; }; Queue* queue_new(void) { Queue *q = g_slice_new( sizeof(Queue)); q->head = q->tail = g_slice_new0( sizeof(Node)); return q; } void queue_enqueue(Queue *q, gpointer data) { Node *node,; Node *tail, *next; Node *next; // Lock the queue before accesing the contents pthread_mutex_lock(&(q->mutex)); node = g_slice_new(Node); node->data = data; node->next = NULL; if(q->head == NULL) { q->head = node; q->tail = node; } else { q->tail->next = node; q->tail = node; } // Unlock the mutex pthread_mutex_unlock(&(queue->mutex)); } gpointer queue_dequeue(Queue *q) { Node *node,; Node *tail,; Node *next; pthread_mutex_lock(&(q->mutex)); head = q->head; tail = q->tail; next = head->next; data = next->data; q->head = next; g_slice_free(Node, head); pthread_mutex_unlock(&(queue->mutex)); return data; } |
In the above code, pthread_mutex_lock() is used to lock the queue.
When thread 1(T1) locks on the queue to perform any operation then thread 2 cannot perform any operation on the queue. Thus preventing the ABA problem.
Risk Assessment
Risk Assessment
The likelihood of having a race condition is low. Once the race condition occurs then The likelihood of having a race condition is low. Once the race condition occurs then reading memory that has already been freed can lead to abnormal program termination or unintended information disclosure.
Guideline | Severity | Likelihood | Remediation Cost | Priority | Level |
|---|---|---|---|---|---|
CON39-C | Medium | unlikely | High | P4 | L3 |
Bibliography
Hazard Pointers : MichaelMichael, M.M. "Hazard pointers: Safe memory reclamation for lock-free objects,"IEEE Transactions on Parallel and Distributed Systems, vol. 15, no. 8, Aug. 2004.
Linux v2.5.53 : lock-free ipv4 route cache
AMD Developer Central: Articles & Whitepapers:
Lock-Free Programming on AMD Multi-Core System (http://developer.amd.com/documentation/articles/pages/125200689.aspx)
---- AMD Developer Central: Articles & Whitepapers: http://developer.amd.com/documentation/articles/pages/125200689.aspx
Asgher, Sarmad. http://www.drdobbs.com/go-parallel/article/showArticle.jhtml;jsessionid=IBA5HITYQUKK1QE1GHRSKH4ATMY32JVN?articleID=219500200, Dr. Dobbs Go-Parallel, August 26, 200
[http://www.fireeye.com/products/index.html}FireEye Anti-Malware and Anti-Botnet Security tool is an example.
...
CON37-C. Do not use more than one mutex for concurrent waiting operations on a condition variable 14. Concurrency (CON) 49. Miscellaneous (MSC)