面试题：C++多线程同步机制中的性能瓶颈与优化策略

常见同步机制在高并发场景下的性能问题

互斥锁（Mutex）
- 性能问题：高并发时，大量线程频繁竞争互斥锁，会导致锁争用严重，线程上下文切换频繁，消耗大量CPU资源。例如在一个多线程同时访问共享资源的场景下，每个线程都需要获取互斥锁才能访问，当线程数增多，等待锁的时间会显著增加。
读写锁（Read - Write Lock）
- 性能问题：写锁优先级过高时，读操作可能长时间被阻塞，影响读性能；而读锁优先级过高，写操作则会长时间等待，导致数据更新不及时。在高并发读写混合场景下，如果读线程过多，写线程可能长时间无法获取锁进行数据更新。
信号量（Semaphore）
- 性能问题：信号量计数操作本身需要原子操作保证，在高并发时，频繁的信号量计数更新会导致CPU开销增大。而且如果信号量初始值设置不合理，可能导致部分线程长时间等待，降低系统整体吞吐量。

优化策略

使用无锁数据结构
- 思路：利用原子操作实现数据结构，避免锁的使用，减少线程争用。例如无锁队列，通过使用原子指针和内存屏障，不同线程可以无锁地进行入队和出队操作。
- 示例：

#include <atomic>
#include <memory>

template<typename T>
class LockFreeQueue {
private:
    struct Node {
        T data;
        std::atomic<Node*> next;
        Node(const T& value) : data(value), next(nullptr) {}
    };
    std::atomic<Node*> head;
    std::atomic<Node*> tail;
public:
    LockFreeQueue() : head(nullptr), tail(nullptr) {}
    ~LockFreeQueue() {
        while (head.load() != nullptr) {
            Node* temp = head.load();
            head.store(temp->next.load());
            delete temp;
        }
    }
    bool enqueue(const T& value) {
        std::unique_ptr<Node> newNode = std::make_unique<Node>(value);
        Node* oldTail = tail.load();
        while (true) {
            Node* oldTailNext = oldTail->next.load();
            if (oldTailNext != nullptr) {
                tail.compare_exchange_weak(oldTail, oldTailNext);
                continue;
            }
            if (oldTail->next.compare_exchange_weak(oldTailNext, newNode.get())) {
                tail.compare_exchange_weak(oldTail, newNode.get());
                newNode.release();
                return true;
            }
        }
    }
    bool dequeue(T& value) {
        while (true) {
            Node* oldHead = head.load();
            if (oldHead == nullptr) {
                return false;
            }
            Node* oldTail = tail.load();
            Node* oldHeadNext = oldHead->next.load();
            if (oldHead == oldTail && oldHeadNext == nullptr) {
                return false;
            }
            if (head.compare_exchange_weak(oldHead, oldHeadNext)) {
                value = oldHead->data;
                delete oldHead;
                return true;
            }
        }
    }
};

减少锁争用范围
- 思路：将大的共享资源划分为多个小的独立部分，每个部分使用独立的锁。这样不同线程可以同时访问不同部分的资源，减少锁争用。
- 示例：假设有一个大的数组作为共享资源，将其分为多个子数组，每个子数组使用一个互斥锁。

#include <mutex>
#include <vector>

class PartitionedArray {
private:
    const int numPartitions = 10;
    std::vector<std::mutex> locks;
    std::vector<std::vector<int>> data;
public:
    PartitionedArray() : locks(numPartitions), data(numPartitions) {}
    void updateElement(int index, int value) {
        int partitionIndex = index / (100 / numPartitions);
        std::lock_guard<std::mutex> lock(locks[partitionIndex]);
        data[partitionIndex][index % (100 / numPartitions)] = value;
    }
};

高性能多线程队列设计思路及关键代码

设计思路
- 读操作优化：对于读操作，使用读锁（如读写锁中的读锁），允许多个线程同时读，提高读性能。如果读操作远多于写操作，可以适当提高读锁优先级。
- 写操作优化：写操作使用写锁，保证数据一致性。为减少写操作对读操作的影响，可以采用写时复制（Copy - On - Write）策略，即写操作时先复制一份数据，在副本上进行修改，修改完成后再替换原数据。
- 结合无锁操作：对于队列的入队和出队操作，可以使用无锁数据结构实现部分逻辑，减少锁的使用频率。
关键代码片段

#include <mutex>
#include <shared_mutex>
#include <vector>
#include <atomic>

template<typename T>
class HighPerformanceQueue {
private:
    std::vector<T> queueData;
    std::shared_mutex queueMutex;
    std::atomic<int> readIndex;
    std::atomic<int> writeIndex;
public:
    HighPerformanceQueue() : readIndex(0), writeIndex(0) {}
    void enqueue(const T& value) {
        std::unique_lock<std::shared_mutex> lock(queueMutex);
        queueData.push_back(value);
        writeIndex++;
    }
    bool dequeue(T& value) {
        std::shared_lock<std::shared_mutex> lock(queueMutex);
        if (readIndex >= writeIndex) {
            return false;
        }
        value = queueData[readIndex];
        readIndex++;
        return true;
    }
};

在上述代码中，使用std::shared_mutex作为同步机制，读操作使用std::shared_lock允许多个线程同时读，写操作使用std::unique_lock保证数据一致性。同时结合std::atomic类型的索引，在一定程度上减少锁争用范围。

面试题：C++多线程同步机制中的性能瓶颈与优化策略

知识考点

面试题答案

常见同步机制在高并发场景下的性能问题

优化策略

高性能多线程队列设计思路及关键代码