0

In C++ do you need to lock a mutex before assigning to an atomic? I tried implementing the thread pool as shown here https://stackoverflow.com/a/32593825/2793618. In doing so, I created a thread safe queue and used atomics. In particular, in the shutdown method (or in my code the waitForCompletion) requires the thread pool loop function while loop variable to be set to true so that the thread can finish its work and join. But since atomics are thread safe, I didn't lock the mutex before assigning true to it in the shutdown method as shown below. This ended up causing a deadlock. Why is that the case?

ThreadPool.hpp:

#pragma once 

#include <atomic> 
#include <vector> 
#include <iostream> 
#include <thread>
#include <future>
#include <mutex>
#include <queue>
#include <functional>
#include <ThreadSafeQueue.hpp>

class ThreadPool{
    public: 
        ThreadPool(std::atomic_bool& result); 
        void waitForCompletion();
        void addJob(std::function<bool()> newJob);
        void setComplete();
    private: 
        void workLoop(std::atomic_bool& result); 
        int m_numThreads; 
        std::vector<std::thread> m_threads; 
        std::atomic_bool m_workComplete; 
        std::mutex m_mutex; 
        std::condition_variable m_jobWaitCondition; 
        ThreadSafeQueue<std::function<bool()>> m_JobQueue;
};

ThreadPool.cpp:

#include <ThreadPool.hpp> 

ThreadPool::ThreadPool(std::atomic_bool& result){ 
    m_numThreads = std::thread::hardware_concurrency();
    m_workComplete = false;
    for (int i = 0; i < m_numThreads; i++)
    {
        m_threads.push_back(std::thread(&ThreadPool::workLoop, this, std::ref(result)));
    }
}

// each thread executes this loop 
void ThreadPool::workLoop(std::atomic_bool& result){ 
    while(!m_workComplete){
        std::function<bool()> currentJob;
        bool popped;
        {
            std::unique_lock<std::mutex> lock(m_mutex); 
            m_jobWaitCondition.wait(lock, [this](){
                return !m_JobQueue.empty() || m_workComplete.load();
            });
            
            popped = m_JobQueue.pop(currentJob);
        }
        if(popped){
            result = currentJob() && result;
        }
    }
}

void ThreadPool::addJob(std::function<bool()> newJob){ 
    m_JobQueue.push(newJob);
    m_jobWaitCondition.notify_one();
}

void ThreadPool::setComplete(){
    m_workComplete = true; 
}

void ThreadPool::waitForCompletion(){
    {
        std::unique_lock<std::mutex> lock(m_mutex);
        m_workComplete.store(true);
    }
    
    m_jobWaitCondition.notify_all();

    for(auto& thread : m_threads){ 
        thread.join();
    }
    
    m_threads.clear();
}

ThreadSafeQueue.hpp:

#pragma once

#include <mutex>
#include <queue>

template <class T>
class ThreadSafeQueue {
   public:
    ThreadSafeQueue(){};
    void push(T element) {
        std::unique_lock<std::mutex> lock(m_mutex);
        m_queue.push(element);
    }
    bool pop(T& retElement) {
        std::unique_lock<std::mutex> lock(m_mutex);
        if (m_queue.empty()) {
            return false;
        }
        retElement = m_queue.front();
        m_queue.pop();
        return true;
    }
    bool empty(){ 
        std::unique_lock<std::mutex> lock(m_mutex); 
        return m_queue.empty();
    }

   private:
    std::queue<T> m_queue;
    std::mutex m_mutex;
};
user2793618
  • 305
  • 4
  • 10
  • `In C++ do you need to lock a mutex before assigning to an atomic?` NO! Please check out this C++ standard comity member talk: ["Lock-Free Programming (or, Juggling Razor Blades)"](https://youtu.be/c1gO9aB9nbs). – Marek R Mar 25 '22 at 16:34

2 Answers2

1

You have your dead lock while waiting for the condition. The condition although is only notified when there is a new job added. Your thread is waiting that condition to be notified. You may have non deterministic (from your point of view) checks of a condition "condition" but you may not rely them to exist.

You need to notify your condition when the task is completed.One possible place to do that is when you call for wait to complete or in any point where a completion state can be achieved.

I changed your code to this to illustrate:

// each thread executes this loop 
void ThreadPool::workLoop(std::atomic_bool& result){ 
    while(!m_workComplete)
    {
        std::function<bool()> currentJob;
        bool popped;
        {
        std::cout<<"Before the lock"<<std::endl;
            std::unique_lock<std::mutex> lock(m_mutex); 
        std::cout<<"After lock"<<std::endl;
            m_jobWaitCondition.wait(lock, [this]()
        {
            bool res = (!m_JobQueue.empty() || m_workComplete.load() );
        std::cout<<"res:"<<res<<std::endl;
                return res;
            });
        std::cout<<"After wait"<<std::endl;
            
            popped = m_JobQueue.pop(currentJob);
        }
        if(popped)
    { 
        std::cout<<"Popped"<<std::endl;
            result = currentJob() && result;
        std::cout<<"Popped 2"<<std::endl;
        }
    }
    std::cout<<"LEave"<<std::endl;
}

void ThreadPool::addJob(std::function<bool()> newJob){ 
    m_JobQueue.push(newJob);
    std::cout<<"before call notify"<<std::endl;
    m_jobWaitCondition.notify_one();
    std::cout<<"After call notify"<<std::endl;
}

I add a single job and the printed content is:

Before the lock After lock res:0 Before the lock After lock Before the lock Before the lock Before the lock res:0 Before the lock After lock res:0 After lock res:0 Before the lock After lock res:0 After lock before call notifyres:1

Before the lockAfter wait

Popped After lock res:0 After call notifyres:0

Popped 2 Before the lock res:0 res:0 res:0 res:0 After lock

res:0

After lock

res:0

Notice that last notify is called BEFORE the last "after lock" line (that precedes the condition wait)

  • After adding jobs, I immediately call waitForCompletion(), so the notify_all() call should fire. – user2793618 Mar 25 '22 at 17:10
  • 1
    But that is NOT when the deadlock happens. It happens after that. The condition is notified and go back to sleep eventually. It stays locked there after the jobs END! –  Mar 25 '22 at 18:18
  • I'm a bit confused. Wouldn't all of the threads wake after the notify_all() call and then since m_workComplete is true the thread would finish right? And to expand, in my "driver" code (consider it to be the main function), I create the ThreadPool object, add the jobs in a for loop then call waitForCompletion(). – user2793618 Mar 25 '22 at 18:21
  • 1
    You add the Jobs.. at each of such events the notify is called. Ok. But the dead lock is after that (I tried your code , and faced the deadlock). The deadlock happens because you add the job, the notification gives the control immediately to the pool thread.. that goes ahead and enter again into the loop (because the task is NOT completed) and get into the await immediately again . Only then the OS is returning the execution to your task. So it finishes when the condition is already waiting again. That is what happened when I ran your code. –  Mar 25 '22 at 18:53
  • 1
    Check the example I edited in my answer. Notice that last notify is called BEFORE the last "after lock" line (that precedes the condition wait) –  Mar 25 '22 at 19:03
  • Sorry this is the first one I’ve written a thread pool or done such low level concurrency, so it’s taking me a bit to understand. Could you expand on why the notify print out being before the last after lock line is a problem? Or indicative of deadlock? – user2793618 Mar 25 '22 at 19:16
  • 1
    Check my edited original answer. I added some code to debug and show explicitly the problem happening. it is normal for people to get confused. The trick is, the condition will NOT check its statement ( (!m_JobQueue.empty() || m_workComplete.load() ) magically it must be awoken for that. After it awakes, it checks and go to sleep again until someone awakes it again! It is not an automatic "awake me when this is not true" –  Mar 25 '22 at 19:20
  • Thanks for walking me though this! So I think I'm beginning to understand. The only times m_jobWaitCondition will check on the wait predicate is either from the notify_one() in the addJob() method or from notify_all() in waitForCompletion(). Otherwise the thread will sleep. So on the iteration where the thread is not sleeping, the thread will execute the job, and then return to the top of the while loop, where it will then sleep? The notify_all should wake all of the threads to check for the truth of m_workComplete, at which point the threads should all finish since workLoop() exits, right? – user2793618 Mar 25 '22 at 19:38
  • Adding on to my previous comment, the thread pool will not automatically check for the state of the queue, it will only do so when a new job is added right? So if all threads are busy then is that the cause of the deadlock? – user2793618 Mar 25 '22 at 19:42
  • 1
    You basically got it. There is s secondary behavior, that is not from the langauge but from the OS. A Sleeping thread may get spurious wake ups ( from your point of view , non deterministic). That is related to the computer architecture you do nto need to worry about that. What it means is that sometimes at seemingly random times the condition might check its predicate without you doing anything, but you should NEVER rely on it, because it may never happen. –  Mar 26 '22 at 11:27
  • I see, that makes much more sense now. So then would it be a child thread that sleeps, causing the deadlock or is it the main thread? I'm guessing that if the deadlocks happens before I can call the ```waitForCompletion``` function to execute, then it must be that a child thread is causing it? – user2793618 Mar 29 '22 at 19:42
0

You probably want to check m_workComplete after the wait() returns in workLoop(), otherwise you might be calling pop() on an empty queue, which is bad.

user18533375
  • 175
  • 1
  • I might just change this out, but ```pop()``` currently returns a bool for if it was able to pop. So this should be handled, but it is a bit weird. – user2793618 Mar 25 '22 at 15:47