面试题：C++ 并行化冒泡排序

并行化策略

OpenMP：
- OpenMP通过#pragma omp parallel for指令来并行化循环。对于冒泡排序，冒泡排序有两层循环，外层循环控制比较轮数，内层循环进行相邻元素的比较和交换。我们可以并行化内层循环，因为每一轮内层循环的比较操作相互独立。
- 但是需要注意线程同步问题，由于多个线程可能同时访问和修改数组元素，需要使用互斥锁（mutex）来保证同一时间只有一个线程可以修改数组元素。
C++11线程库：
- C++11线程库通过std::thread来创建线程。可以根据CPU核心数创建相应数量的线程，每个线程负责处理数组的一部分数据。
- 对于冒泡排序，同样可以并行化内层循环。在线程间共享数据时，使用std::mutex来保护共享数据，避免数据竞争。同时，使用条件变量（std::condition_variable）来进行线程同步，确保所有线程完成一轮比较后再进入下一轮。

OpenMP实现代码

#include <iostream>
#include <omp.h>
#include <vector>

void bubbleSortParallelOpenMP(std::vector<int>& arr) {
    int n = arr.size();
    for (int i = 0; i < n - 1; ++i) {
        #pragma omp parallel for
        for (int j = 0; j < n - i - 1; ++j) {
            if (arr[j] > arr[j + 1]) {
                std::swap(arr[j], arr[j + 1]);
            }
        }
    }
}

C++11线程库实现代码

#include <iostream>
#include <vector>
#include <thread>
#include <mutex>
#include <condition_variable>
#include <atomic>

std::mutex mtx;
std::condition_variable cv;
std::atomic<int> completedThreads(0);
std::atomic<bool> allDone(false);

void bubbleSortPart(std::vector<int>& arr, int start, int end, int n) {
    while (!allDone) {
        for (int i = 0; i < n - 1; ++i) {
            for (int j = start; j < end && j < n - i - 1; ++j) {
                std::unique_lock<std::mutex> lock(mtx);
                if (arr[j] > arr[j + 1]) {
                    std::swap(arr[j], arr[j + 1]);
                }
                lock.unlock();
            }
        }
        completedThreads++;
        if (completedThreads == std::thread::hardware_concurrency()) {
            completedThreads = 0;
            cv.notify_all();
        } else {
            std::unique_lock<std::mutex> lock(mtx);
            cv.wait(lock, [] { return allDone || completedThreads == std::thread::hardware_concurrency(); });
        }
    }
}

void bubbleSortParallelC11(std::vector<int>& arr) {
    int n = arr.size();
    int numThreads = std::thread::hardware_concurrency();
    std::vector<std::thread> threads;
    int step = n / numThreads;
    for (int i = 0; i < numThreads; ++i) {
        int start = i * step;
        int end = (i == numThreads - 1)? n : (i + 1) * step;
        threads.emplace_back(bubbleSortPart, std::ref(arr), start, end, n);
    }
    std::this_thread::sleep_for(std::chrono::seconds(1));
    allDone = true;
    cv.notify_all();
    for (auto& th : threads) {
        th.join();
    }
}

时间复杂度分析

串行冒泡排序：时间复杂度为 (O(n^2))，其中 (n) 是数组的长度。因为有两层嵌套循环，外层循环执行 (n - 1) 次，内层循环执行 (n - i - 1) 次。
并行化后：
- OpenMP：理论上，并行化内层循环后，时间复杂度可以接近 (O(n \log n))。因为内层循环的比较操作并行执行，减少了整体的执行时间。但由于线程同步和调度等开销，实际效果可能达不到理想的 (O(n \log n))，仍然接近 (O(n^2))。
- C++11线程库：同样，理想情况下时间复杂度可以接近 (O(n \log n))，但线程创建、同步等开销会影响实际性能，整体时间复杂度仍然近似 (O(n^2))。

空间复杂度分析

串行冒泡排序：空间复杂度为 (O(1))，因为只需要有限的额外空间用于交换元素。
并行化后：
- OpenMP：空间复杂度仍然为 (O(1))，OpenMP主要通过指令并行化循环，不额外增加大量空间。
- C++11线程库：空间复杂度变为 (O(n))，因为需要创建多个线程，每个线程可能需要一定的栈空间，并且还需要一些额外的同步变量（如mutex、condition_variable等），总体空间开销增加。同时如果按照代码中线程划分数据的方式，虽然数据本身没有额外空间开销，但线程函数局部变量等会增加一定空间复杂度。

面试题：C++ 并行化冒泡排序

知识考点

面试题答案

并行化策略

OpenMP实现代码

C++11线程库实现代码

时间复杂度分析

空间复杂度分析