
Lock-free programming is a technique that allows concurrent update of shared data structures without using explicit locks. It guarantees system-wide progress: at least one thread always completes its operation in a finite number of steps, regardless of how the other threads are scheduled.

Advantages:

- Can be used in places where locks must be avoided, such as interrupt handlers
- Efficiency benefits over lock-based algorithms for some workloads, including potential scalability benefits on multiprocessor machines
- Avoids priority inversion in real-time systems

Limitations:

- Requires special atomic processor instructions such as CAS (compare-and-swap) or LL/SC (load-linked/store-conditional)

Applications:

- "Read-copy-update" (RCU) in the Linux 2.5 kernel
- Lock-free programming on AMD multi-core systems
- API monitoring

Problem Associated with the Lock-Free Approach (ABA):

The ABA problem occurs during synchronization when a location is read twice and has the same value for both reads. However, another thread can execute between the two reads and change the value, do other work, then change the value back, thus fooling the first thread into thinking "nothing has changed" even though the second thread did work that violates that assumption.

ABA occurs due to the internal reuse of nodes that have been popped off the list, or due to the reclamation of the memory occupied by removed nodes.

Noncompliant Code Example (Queue)

A queue data structure implemented using the lock-free method:

#include <glib.h>
#include <glib-object.h>

/* CAS is a thin wrapper over glib's atomic compare-and-exchange. */
#define CAS(ptr, oldval, newval) \
  g_atomic_pointer_compare_and_exchange((gpointer *)(ptr), (oldval), (newval))

typedef struct Node Node;
struct Node {
  gpointer data;
  Node *next;
};

typedef struct Queue Queue;
struct Queue {
  Node *head;
  Node *tail;
};

Queue *
queue_new(void) {
  Queue *q = g_slice_new(Queue);
  q->head = q->tail = g_slice_new0(Node);  /* dummy node */
  return q;
}

void
queue_enqueue(Queue *q, gpointer data) {
  Node *node, *tail, *next;

  node = g_slice_new(Node);
  node->data = data;
  node->next = NULL;

  while (TRUE) {
    tail = q->tail;
    next = tail->next;
    if (tail != q->tail)
      continue;
    if (next != NULL) {
      /* Tail is lagging behind; help it along. */
      CAS(&q->tail, tail, next);
      continue;
    }
    if (CAS(&tail->next, NULL, node))
      break;
  }
  CAS(&q->tail, tail, node);
}

gpointer
queue_dequeue(Queue *q) {
  Node *head, *tail, *next;
  gpointer data;

  while (TRUE) {
    head = q->head;
    tail = q->tail;
    next = head->next;
    if (head != q->head)
      continue;
    if (next == NULL)
      return NULL; /* Queue is empty */
    if (head == tail) {
      /* Tail is lagging behind; help it along. */
      CAS(&q->tail, tail, next);
      continue;
    }
    data = next->data;
    if (CAS(&q->head, head, next))
      break;
  }
  g_slice_free(Node, head);
  return data;
}

The above implementation works with glib; the CAS function uses g_atomic_pointer_compare_and_exchange() internally. Consider the following example.

Assume two threads (T1 and T2) operate simultaneously on the queue, which initially looks like this:

head --> A --> B --> C --> tail

The following sequence of operations occurs:

Thread | Queue Before                | Operation                                                                                                          | Queue After
------ | --------------------------- | ------------------------------------------------------------------------------------------------------------------ | ---------------------------
T1     | head -> A -> B -> C -> tail | Enters queue_dequeue(); head = A, tail = C, next = B; after executing data = next->data; this thread is preempted    | head -> A -> B -> C -> tail
T2     | head -> A -> B -> C -> tail | Removes node A                                                                                                       | head -> B -> C -> tail
T2     | head -> B -> C -> tail      | Removes node B                                                                                                       | head -> C -> tail
T2     | head -> C -> tail           | Enqueues node A back into the queue                                                                                  | head -> A -> C -> tail
T2     | head -> A -> C -> tail      | Removes node C                                                                                                       | head -> A -> tail
T2     | head -> A -> tail           | Enqueues a new node D; after the enqueue operation, thread 2 is preempted                                            | head -> A -> D -> tail
T1     | head -> A -> D -> tail      | Resumes execution; compares its local head with q->head (both are A, so the CAS succeeds) and updates q->head to node B, which has been removed | undefined

After this sequence of events, head points to memory that has been removed from the queue.

Moreover, if the reclaimed memory (B) has been returned to the operating system (e.g., using munmap), accessing it can result in a fatal access violation.

Compliant Solution (using Hazard pointers)

The core idea is to associate a number (typically one or two) of single-writer, multi-reader shared pointers, called hazard pointers, with each thread.

A hazard pointer either has a null value or points to a node that may be accessed later by that thread without further validation that the reference to the node is still valid. Each hazard pointer can be written only by its owner thread, but can be read by other threads.

This methodology communicates with the associated algorithms only through the hazard pointers and a procedure RetireNode, which threads call to pass the addresses of retired nodes.

Pseudo Code:

// Hazard pointer types and structure
structure HPRecType {
  HP[K]: *NodeType;
  Next: *HPRecType;
}
// The header of the HPRec list
HeadHPRec: *HPRecType;
// Per-thread private variables
rlist: listType;   // initially empty
rcount: integer;   // initially 0

Waking up all waiting tasks (e.g., with pthread_cond_broadcast()) solves the problem because every task re-executes its predicate test; the one that finds it true continues execution to completion.

Compliant Solution (using pthread_cond_signal() but with a unique condition variable per thread)

Another way to solve the signal issue is to use a unique condition variable for each thread (all associated with a single mutex). In this case, the signal operation (pthread_cond_signal()) wakes up the only thread waiting on that condition variable.
NOTE: the predicate of the signaled thread must be true; otherwise, a deadlock may occur anyway.

#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>

#define NTHREADS  5

pthread_mutex_t mutex;
pthread_cond_t cond[NTHREADS];


void *run_step(void *t) {
  static int current_step = 0;
  int my_step = (int)t;
  int result;

  if ((result = pthread_mutex_lock(&mutex)) != 0) {
    /* Handle error condition */
  }

  printf("Thread %d has the lock\n", my_step);

  while (current_step != my_step) {
    printf("Thread %d is sleeping...\n", my_step);

    if ((result = pthread_cond_wait(&cond[my_step], &mutex)) != 0) {
      /* Handle error condition */
    }

    printf("Thread %d woke up\n", my_step);
  }

  /* Do processing... */
  printf("Thread %d is processing...\n", my_step);

  current_step++;

  /* Signal next step thread */
  if ((my_step + 1) < NTHREADS) {
    if ((result = pthread_cond_signal(&cond[my_step+1])) != 0) {
      /* Handle error condition */
    }
  }

  printf("Thread %d is exiting...\n", my_step);

  if ((result = pthread_mutex_unlock(&mutex)) != 0) {
    /* Handle error condition */
  }

  pthread_exit(NULL);
}


int main(int argc, char** argv) {
  int i;
  int result;
  pthread_attr_t attr;
  pthread_t threads[NTHREADS];
  int step[NTHREADS];

  if ((result = pthread_mutex_init(&mutex, NULL)) != 0) {
    /* Handle error condition */
  }

  for (i = 0; i< NTHREADS; i++) {
    if ((result = pthread_cond_init(&cond[i], NULL)) != 0) {
      /* Handle error condition */
    }
  }

  if ((result = pthread_attr_init(&attr)) != 0) {
    /* Handle error condition */
  }
  if ((result = pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_JOINABLE)) != 0) {
    /* Handle error condition */
  }

  /* Create threads */
  for (i = 0; i < NTHREADS; i++) {
    step[i] = i;
    if ((result = pthread_create(&threads[i], &attr, run_step, (void *)step[i])) != 0) {
      /* Handle error condition */
    }
  }

  /* Wait for all threads to complete */
  for (i = NTHREADS-1; i >= 0; i--) {
    if ((result = pthread_join(threads[i], NULL)) != 0) {
      /* Handle error condition */
    }
  }

  if ((result = pthread_mutex_destroy(&mutex)) != 0) {
    /* Handle error condition */
  }

  for (i = 0; i < NTHREADS; i++) {
    if ((result = pthread_cond_destroy(&cond[i])) != 0) {
      /* Handle error condition */
    }
  }

  if ((result = pthread_attr_destroy(&attr)) != 0) {
    /* Handle error condition */
  }

  pthread_exit(NULL);
}

In the above code, each thread has a unique associated condition variable, which is signaled when that particular thread needs to be woken up. This solution is more efficient than a broadcast because only the desired thread is awakened.

Risk Assessment

Signaling a single thread instead of all waiting threads can pose a threat to the liveness property of the system.

Guideline | Severity | Likelihood | Remediation Cost | Priority | Level
--------- | -------- | ---------- | ---------------- | -------- | -----
CON38-C   | low      | unlikely   | medium           | P2       | L3

Related Guidelines

[The CERT Oracle Secure Coding Standard for Java|java:The CERT Oracle Secure Coding Standard for Java]:
[THI04-J. Notify all waiting threads instead of a single thread|Java:THI04-J. Notify all waiting threads instead of a single thread].

Bibliography

[[Open Group|AA. Bibliography#OpenGroup04]] [pthread_cond_signal() pthread_cond_broadcast()|http://www.opengroup.org/onlinepubs/7990989775/xsh/pthread_cond_signal.html]