You are viewing an old version of this page. View the current version.

Compare with Current View Page History

« Previous Version 3 Next »

Lock-Free Programming is a technique that allows concurrent update of shared data structures without using explicit locks.  This method ensures system wide progress.

Advantages:

-         Could be used in places were locks have to be avoided(interrupt handler)

-         Efficiency benefits compared to lock-based algorithms for some workloads, including potential scalability benefits on multiprocessor machines

-         Avoidance of priority inversion in real-time systems

Limitations:

-         use of special atomic processor instructions such as CAS (compare and swap) or LL/SC (load linked/store conditional)

Applications:

-         “Read-copy-update” (RCU) in 2.5 kernel

-         Lock-Free Programming on AMD Multi-Core Systems 

Problem Associated with Lock-Free Approach(ABA):

ABA problem occurs during synchronization, when a location is read twice, has the same value for both reads. However, another thread can execute between the two reads and change the value, do other work, then change the value back, thus fooling the first thread in to thinking "nothing has changed" even though the second thread did work that violates that assumption.

ABA occurs due to the internal reuse of nodes that have been popped off the list or by the reclamation of memory occupied by removed nodes.

Noncompliant Code Example (Queue)

A queue data structure implementation using lock free method.

#include <glib.h>#include <glib.h>
#include <glib-object.h>
#include <glib-object.h>
struct Node {
  void *data;
  Node *next;
};

struct Queue {
  Node *head;
  Node *tail;
};

Queue*
queue_new(void){
  Queue *q = g_slice_new(sizeof(Queue));
  q->head = q->tail = g_slice_new0(sizeof(Node));
  return q;
}
void
queue_enqueue(Queue *q, gpointer data){
  Node *node, *tail, *next;
  node = g_slice_new(Node);
  node->data = data;
  node->next = NULL;
  while (TRUE) {
    tail = q->tail;
    next = tail->next;
    if (tail != q->tail)
      continue;
    if (next != NULL) {
      CAS(&q->tail, tail, next);
      continue;
    }
    if (CAS(&tail->next, null, node)
      break;
  }
    CAS(&q->tail, tail, node);
}



gpointer
queue_dequeue(Queue *q){
  Node *node, *tail, *next;
  while (TRUE) {
    head = q->head;
    tail = q->tail;
    next = head->next;
    if (head != q->head)
      continue;
    if (next == NULL)
      return NULL; // Empty
    if (head == tail) {
      CAS(&q->tail, tail, next);
      continue;
    }
    data = next->data;
    if (CAS(&q->head, head, next))
      break;
  }
  g_slice_free(Node, head);
  return data;
}

The above implementation works with glib. The function CAS uses g_atomic_pointer_compare_and_exchange() internallyLet's consider the following example: 

Let us assume there are two threads (T1 and T2) operating simultaneously on the queue. The queue looks like this,

head -> A -> B -> C -> tail

The following sequence of operation occurs.

Thread

Queue Before

Operation

Queue After

T1

head -> A -> B -> C -> tail

Enters queue_dequeue function
head = A, tail = C
next = B
after executing data = next->data;
 this thread gets pre-empted

head -> A -> B -> C -> tail

T2

head -> A -> B -> C -> tail

removes node A

head -> B -> C -> tail

T2

head -> B -> C -> tail

removes node B

head ->  C -> tail

T2

head ->  C -> tail

enqueues node A back into the queue

head -> A -> C -> tail

T2

head -> A -> C -> tail

removes node C

head -> A -> tail

T2

head -> A -> tail

enqueues a new node D
After enqueue operation thread 2 gets preempted

head -> A -> D -> tail

T1

head -> A -> D -> tail

Thread 1 starts execution
compares the local head= q->head = A (true in this case)
updates q->head with node B (but node B is removed)

undefined

According to the above sequence of events now head will be pointing to a memory which was removed.

Also  if reclaimed memory(B) is returned  to  the operating  system  (e.g., using  munmap), access to such memory locations can result in fatal access violation  errors.

Compliant Solution (using Hazard pointers)

The core idea is to associate a number (typically one or two)  of single-writer  multi-reader  shared pointers,  called hazard pointers.

A hazard pointer either has a null value or points to a node that may be accessed later by that thread without further validation that the reference to the node is still valid. Each hazard pointer can be written only by its owner thread, but can be read by other threads.

This methodology communicates with the associated algorithms only through hazard pointers and a procedure RetireNode that is called by threads to pass the addresses of retired nodes.

Pseudo Code:

// Hazard Pointers Types and Strucutre
 structure HPRecType
     { HP[K]:*Nodetype; Next:*HPRecType;}
// The header of the HPRec list
 HeadHPRec: *HPRecType;
//Per-thread private variables
 rlist: listType; //initially empty
 rcount: integer; //initially 0



//The Retired Node routine
RetiredNode(node:*NodeType){
  rlist.push(node);
  rcount++;
  if(rcount >= R)
    Scan(HeadHPRec);
}


// The scan routine
Scan(head:*HPRecType){
  // Stage1: Scan HP list and insert non null values in plist
  plist.init();
  hprec<-head;
  while(hprec != null){
    for(i<-0 to K-1){
      hptr<-hprec^HP[i];
      if(hptr!= null)
        plist.insert(hptr);
    }
    hprec<-hprec^Next;
  }


  // Stage 2: search plist
  tmplist<-rlist.popAll();
  rcount<-0;
  node<-tmplist.pop();
  while(node != null){
    if(plist.lookup(node)){
      rlist.push(node);
      rcount++;
    }
    else{
      PrepareForReuse(node);
    }
    node<-tmplist.pop();
  }
  plist.free();
}

The scan consists of two stages.  The  first  stage  involves  scanning  the  HP  list  for non-null values. Whenever a nonnull value is encountered, it is inserted in a local list plist, which can be implemented as a hash table. The second stage of Scan involves checking each node in rlist against the pointers in plist. If the lookup yields  no  match,  the  node  is  identified  to  be  ready  for arbitrary reuse. Otherwise, it is retained in rlist until the next scan by the current thread. Insertion and lookup in plist take constant expected time. The  task  of  the  memory  reclamation  method  is to determine when a retired node is eligible for reuse safely while allowing memory reclamation.

Code:

 void queue_enqueue(Queue *q, gpointer data) {void queue_enqueue(Queue *q, gpointer data) {
  Node *node, *tail, *next;
  node = g_slice_new(Node);
  node->data = data;
  node->next = NULL;
  while (TRUE) {
    tail = q->tail;
    HAZARD_SET(0, tail);              // Mark tail has hazardous
    if (tail != q->tail)             // Check tail hasn't changed
      continue;
    next = tail->next;
    if (tail != q->tail)
      continue;
    if (next != NULL) {
      CAS(&q->tail, tail, next);
      continue;
    }
    if (CAS(&tail->next, null, node)
      break;
  }
  CAS(&q->tail, tail, node);
}


gpointer queue_dequeue(Queue *q) {
  Node *tail, *next, *head;
  while (TRUE) {
    head = q->head;
    LF_HAZARD_SET(0, head);    // Mark head as hazardous
    if (head != q->head)       // Check head hasn't changed
      continue;
    tail = q->tail;
    next = head->next;
    LF_HAZARD_SET(1, next);    // Mark next has hazardous
    if (head != q->head)
      continue;
    if (next == NULL)
      return NULL; // Empty
    if (head == tail) {
      CAS(&q->tail, tail, next);
      continue;
    }
    data = next->data;
    if (CAS(&q->head, head, next))
      break;
  }
  LF_HAZARD_UNSET(head);        // Retire head, and perform
                                       // reclamation if needed.
  return data;
}
5 Node *node, *tail, *next;
7 node = g_slice_new(Node);

In the above code, the pointer on being removed is stored in the hazard pointer and thus prevents other threads from reusing it and avoiding the ABA problem.

Compliant Solution (using Mutex)

Code:

 void queue_enqueue(Queue *q, gpointer data) {#include <glib.h>
#include <glib-object.h>

struct Node {
	void *data;
	Node *next;
};

struct Queue {
  Node *head;
  Node *tail;
  pthread_mutex_t mutex;
};

Queue*
queue_new(void){
  Queue *q = g_slice_new(sizeof(Queue));
  q->head = q->tail = g_slice_new0(sizeof(Node));
  return q;
}

void
queue_enqueue(Queue *q, gpointer data){

  Node *node, *tail, *next;
  // Lock the queue before accesing the contents
  pthread_mutex_lock(&(q->mutex));

  node = g_slice_new(Node);
  node->data = data;
  node->next = NULL;

  if(q->head == NULL){
    q->head = node;
    q->tail = node;
  }
  else {
    q->tail->next = node;
    q->tail = node;
  }
  //Unlock the mutex
  pthread_mutex_unlock(&(queue->mutex));
}

gpointer
queue_dequeue(Queue *q){
  Node *node, *tail, *next;

  pthread_mutex_lock(&(q->mutex));

  head = q->head;
  tail = q->tail;
  next = head->next;
  data = next->data;
  q->head = next;

  g_slice_free(Node, head);
  pthread_mutex_unlock(&(queue->mutex));

  return data;
}

In the above code, pthread_mutex_lock() is used to lock the queue.

When thread 1(T1) locks on the queue to perform any operation then thread 2 cannot perform any operation on the queue. Thus preventing the ABA problem.

Risk Assessment

The likelihood of having a race condition is low. Once the race condition occurs then reading memory that has already been freed can lead to abnormal program termination or unintended information disclosure.

Guideline

Severity

Likelihood

Remediation Cost

Priority

Level

CON39-C

Medium

unlikely

High

P4

L3

Bibliography

Hazard Pointers : Michael, M.M. "Hazard pointers: Safe memory reclamation for lock-free objects,"IEEE Transactions on Parallel and Distributed Systems, vol. 15, no. 8, Aug. 2004.

Linux v2.5.53 : lock-free ipv4 route cache

AMD Developer Central: Articles & Whitepapers:

Lock-Free Programming on AMD Multi-Core System (http://developer.amd.com/documentation/articles/pages/125200689.aspx)

----
CON37-C. Do not use more than one mutex for concurrent waiting operations on a condition variable 14. Concurrency (CON) 49. Miscellaneous (MSC)

  • No labels