Consider the following example of a thread barrier. The thread barrier requires a variable to keep track of the total number of threads that have entered the barrier. Once enough threads have entered the barrier, it is lifted. A synchronization primitive such as a mutex is also needed to implement the thread barrier. This method is also known as a "centralized barrier", because the threads wait at a single "central barrier" until the expected number of threads has reached it, at which point it is lifted. This can be demonstrated by the following C example using
POSIX threads.

    #include <pthread.h>
    #include <stdio.h>

    #define TOTAL_THREADS 2
    #define THREAD_BARRIERS_NUMBER 3

    typedef struct Barrier {
        pthread_mutex_t lock;
        int barrier_count;
        int thread_count;
    } Barrier;

    Barrier barrier;

    void barrier_init(Barrier *bar, pthread_mutexattr_t *attr, int count) {
        pthread_mutex_init(&(bar->lock), attr);
        bar->barrier_count = count;
        bar->thread_count = 0; // Initialize total thread count to be 0
    }

    void barrier_wait(Barrier *bar) {
        if (!pthread_mutex_lock(&(bar->lock))) {
            bar->thread_count++;
            pthread_mutex_unlock(&(bar->lock));
        }

        while (bar->thread_count < bar->barrier_count) {
            // Implements busy-wait barrier (do nothing until enough threads arrive)
        }

        if (!pthread_mutex_lock(&(bar->lock))) {
            bar->thread_count--; // Decrease by one thread as it passes the thread barrier
            pthread_mutex_unlock(&(bar->lock));
        }
    }

    void barrier_destroy(Barrier *bar) {
        pthread_mutex_destroy(&(bar->lock));
    }

    void *thread_func([[maybe_unused]] void *p) {
        printf("Thread ID %ld is waiting at barrier, as insufficient (%d) threads are running...\n", pthread_self(), THREAD_BARRIERS_NUMBER);
        barrier_wait(&barrier);
        printf("Barrier lifted, thread ID %ld is running now\n", pthread_self());
        return NULL;
    }

    int main() {
        pthread_t thread_ids[TOTAL_THREADS];

        barrier_init(&barrier, NULL, THREAD_BARRIERS_NUMBER);
        for (int i = 0; i < TOTAL_THREADS; i++) {
            pthread_create(&thread_ids[i], NULL, thread_func, NULL);
        }

        // pthread_join() blocks until the given thread has finished
        for (int i = 0; i < TOTAL_THREADS; i++) {
            pthread_join(thread_ids[i], NULL);
        }

        barrier_destroy(&barrier);
        printf("Thread barrier lifted\n");
    }

In this program, the thread barrier, struct Barrier, consists of:
• lock: A POSIX thread mutex lock
• thread_count: The number of threads that have entered the barrier
• barrier_count: The total number of threads expected to enter the thread barrier so that it can be lifted

Based on the definition of a barrier, the implementation requires a function such as barrier_wait(), which monitors the number of threads that have arrived in order to lift the barrier. Every thread that calls barrier_wait() is blocked until THREAD_BARRIERS_NUMBER threads have reached the thread barrier. Because the barrier never sees three threads, the main thread stays blocked in pthread_join(), and the line that prints "Thread barrier lifted" is never reached. The result of the program is:

    Thread ID <thread_id> is waiting at barrier, as insufficient (3) threads are running...
    Thread ID <thread_id> is waiting at barrier, as insufficient (3) threads are running...

Only two threads are created with thread_func() as the thread function handler, which calls barrier_wait(&barrier), while the thread barrier expects three threads to call barrier_wait() before it is lifted. Upon changing TOTAL_THREADS to 3, the thread barrier is lifted:

    Thread ID <thread_id> is waiting at barrier, as insufficient (3) threads are running...
    Thread ID <thread_id> is waiting at barrier, as insufficient (3) threads are running...
    Thread ID <thread_id> is waiting at barrier, as insufficient (3) threads are running...
    Barrier lifted, thread ID <thread_id> is running now
    Barrier lifted, thread ID <thread_id> is running now
    Barrier lifted, thread ID <thread_id> is running now
    Thread barrier lifted
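
The counting idea behind a centralized barrier can also be sketched with C11 atomics in place of a mutex, so that spinning threads are guaranteed to observe updates to the shared count. The following is a minimal, single-use sketch under stated assumptions, not a definitive implementation: the names CounterBarrier, counter_barrier_wait(), counter_worker(), run_counter_barrier_demo() and NUM_THREADS are hypothetical, and sched_yield() is used only to be friendlier to the scheduler while spinning. Because the barrier is used only once, the count is never decremented.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

#define NUM_THREADS 3 // hypothetical thread count for this sketch

// Single-use centralized barrier: one shared atomic counter, never reset.
typedef struct {
    atomic_int arrived; // number of threads that have reached the barrier
    int expected;       // number of threads required to lift the barrier
} CounterBarrier;

static CounterBarrier counter_barrier;
static atomic_int work_done; // incremented by each thread before the barrier
static atomic_int saw_all;   // threads that saw all the work after the barrier

static void counter_barrier_wait(CounterBarrier *bar) {
    atomic_fetch_add(&bar->arrived, 1);
    // Busy-wait until every expected thread has arrived.
    while (atomic_load(&bar->arrived) < bar->expected) {
        sched_yield();
    }
}

static void *counter_worker(void *arg) {
    (void)arg;
    atomic_fetch_add(&work_done, 1); // pre-barrier work
    counter_barrier_wait(&counter_barrier);
    // After the barrier, every thread's pre-barrier work must be visible.
    if (atomic_load(&work_done) == NUM_THREADS) {
        atomic_fetch_add(&saw_all, 1);
    }
    return NULL;
}

// Runs NUM_THREADS workers and returns how many saw all pre-barrier work.
int run_counter_barrier_demo(void) {
    pthread_t threads[NUM_THREADS];

    atomic_store(&counter_barrier.arrived, 0);
    counter_barrier.expected = NUM_THREADS;
    atomic_store(&work_done, 0);
    atomic_store(&saw_all, 0);

    for (int i = 0; i < NUM_THREADS; i++) {
        int rc = pthread_create(&threads[i], NULL, counter_worker, NULL);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < NUM_THREADS; i++) {
        pthread_join(threads[i], NULL);
    }
    return atomic_load(&saw_all);
}
```

Calling run_counter_barrier_demo() from main() returns NUM_THREADS, since no thread can leave the barrier until every thread has performed its pre-barrier increment.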
Sense-reversal centralized barrier

Instead of decrementing the thread count as each thread passes the barrier, a thread barrier may use opposite values to mark each thread's state as passing or stopping. For example, 0 may denote stopping at the barrier while 1 denotes passing it. This is known as "sense reversal".

    #include <pthread.h>
    #include <stdio.h>

    #define TOTAL_THREADS 2
    #define THREAD_BARRIERS_NUMBER 3

    typedef struct Barrier {
        pthread_mutex_t lock;
        int barrier_count;
        int thread_count;
        bool flag;
    } Barrier;

    Barrier barrier;

    void barrier_init(Barrier *bar, pthread_mutexattr_t *attr, int count) {
        pthread_mutex_init(&(bar->lock), attr);
        bar->thread_count = 0;
        bar->barrier_count = count;
        bar->flag = false;
    }

    void barrier_wait(Barrier *bar) {
        // Per-thread sense; starts out equal to the initial value of bar->flag
        static thread_local bool local_flag = false;

        if (!pthread_mutex_lock(&(bar->lock))) {
            bar->thread_count++;
            local_flag = !local_flag;

            if (bar->thread_count == bar->barrier_count) {
                // Last thread to arrive: reset the count and release the waiters
                bar->thread_count = 0;
                bar->flag = local_flag;
                pthread_mutex_unlock(&(bar->lock));
            } else {
                pthread_mutex_unlock(&(bar->lock));
                while (bar->flag != local_flag) {
                    // wait for flag
                }
            }
        }
    }

    void barrier_destroy(Barrier *bar) {
        pthread_mutex_destroy(&(bar->lock));
    }

    void *thread_func([[maybe_unused]] void *p) {
        printf("Thread ID %ld is waiting at barrier, as insufficient (%d) threads are running...\n", pthread_self(), THREAD_BARRIERS_NUMBER);
        barrier_wait(&barrier);
        printf("Barrier lifted, thread ID %ld is running now\n", pthread_self());
        return NULL;
    }

    int main() {
        pthread_t thread_ids[TOTAL_THREADS];

        barrier_init(&barrier, NULL, THREAD_BARRIERS_NUMBER);
        for (int i = 0; i < TOTAL_THREADS; i++) {
            pthread_create(&thread_ids[i], NULL, thread_func, NULL);
        }

        // pthread_join() blocks until the given thread has finished
        for (int i = 0; i < TOTAL_THREADS; i++) {
            pthread_join(thread_ids[i], NULL);
        }

        barrier_destroy(&barrier);
        printf("Thread barrier lifted\n");
    }

This version of the previous centralized barrier implementation introduces two new variables:
• flag: A Boolean member of struct Barrier that records the sense of the most recently completed barrier
• local_flag: A thread-local Boolean in barrier_wait() that each thread toggles when it arrives at the barrier

A thread that arrives before the barrier is full waits while flag differs from its local_flag. The last thread to arrive resets thread_count to 0 and sets flag to its local_flag, which releases the waiting threads.
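
To illustrate why sense reversal makes a barrier reusable, the following is a minimal sketch that uses C11 atomics rather than a mutex and passes each thread's sense explicitly instead of using a thread-local variable. These are assumptions of the sketch, not part of the program above: SenseBarrier, sense_barrier_wait(), sense_worker(), run_sense_barrier_demo(), SR_THREADS and SR_PHASES are hypothetical names. Each thread crosses the same barrier several times and checks that all threads finished the current phase before any of them moved on.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

#define SR_THREADS 3 // hypothetical thread count
#define SR_PHASES 4  // hypothetical number of barrier crossings

typedef struct {
    atomic_int count;  // threads that have arrived in the current phase
    atomic_bool sense; // flipped by the last thread to arrive in each phase
    int expected;      // threads required to lift the barrier
} SenseBarrier;

static SenseBarrier sense_barrier;
static atomic_int phase_done[SR_PHASES]; // per-phase count of finished threads
static atomic_int violations;            // times a thread left a phase early

static void sense_barrier_wait(SenseBarrier *bar, bool *local_sense) {
    *local_sense = !*local_sense; // value that marks this phase as released
    if (atomic_fetch_add(&bar->count, 1) + 1 == bar->expected) {
        atomic_store(&bar->count, 0);            // reset for the next phase
        atomic_store(&bar->sense, *local_sense); // release the waiting threads
    } else {
        while (atomic_load(&bar->sense) != *local_sense) {
            sched_yield();
        }
    }
}

static void *sense_worker(void *arg) {
    (void)arg;
    bool local_sense = false; // matches the barrier's initial sense
    for (int p = 0; p < SR_PHASES; p++) {
        atomic_fetch_add(&phase_done[p], 1);
        sense_barrier_wait(&sense_barrier, &local_sense);
        // Every thread must have finished phase p before anyone proceeds.
        if (atomic_load(&phase_done[p]) != SR_THREADS) {
            atomic_fetch_add(&violations, 1);
        }
    }
    return NULL;
}

// Runs the workers and returns the number of observed ordering violations.
int run_sense_barrier_demo(void) {
    pthread_t threads[SR_THREADS];

    atomic_store(&sense_barrier.count, 0);
    atomic_store(&sense_barrier.sense, false);
    sense_barrier.expected = SR_THREADS;
    atomic_store(&violations, 0);
    for (int p = 0; p < SR_PHASES; p++) {
        atomic_store(&phase_done[p], 0);
    }

    for (int i = 0; i < SR_THREADS; i++) {
        int rc = pthread_create(&threads[i], NULL, sense_worker, NULL);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < SR_THREADS; i++) {
        pthread_join(threads[i], NULL);
    }
    return atomic_load(&violations);
}
```

In this sketch the count is reset before the shared sense is flipped, so a fast thread that re-enters the barrier for the next phase starts counting from zero while slower threads are still leaving the previous phase.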
Hardware barrier implementation

A hardware barrier uses hardware to implement the above basic barrier model.

== POSIX Threads barrier functions ==