Interview Question: Performance Analysis and Optimization of C++ String Functions

Chosen C++ string function
- std::string::find: searches a string for a substring. It returns the index of the first occurrence, or std::string::npos if the substring is not found.
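
As a small illustration of that return-value contract, the sketch below counts non-overlapping occurrences by calling std::string::find repeatedly until it returns std::string::npos. The helper name countOccurrences is hypothetical and exists only for this example; treating an empty needle as "zero matches" is likewise an assumption made here to avoid an endless loop.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of the standard library): counts
// non-overlapping occurrences of `needle` in `haystack`.
std::size_t countOccurrences(const std::string& haystack, const std::string& needle) {
    if (needle.empty()) {
        // find("") succeeds at every position, so advancing by length 0 would
        // never terminate; this sketch simply reports no matches.
        return 0;
    }
    std::size_t count = 0;
    std::size_t pos = 0;
    // find returns std::string::npos once no further match exists.
    while ((pos = haystack.find(needle, pos)) != std::string::npos) {
        ++count;
        pos += needle.length(); // skip past this match so matches do not overlap
    }
    return count;
}
```

Because the search resumes after each match, a haystack of "aaaa" contains two occurrences of "aa" under this definition, not three.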
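Approach
- Avoiding memory leaks: use smart pointers (std::unique_ptr or std::shared_ptr) to manage dynamically allocated resources. This task mainly involves reading a file and operating on strings, and each std::string that holds a line manages its own memory, so the resource that needs attention is the file handle. A std::unique_ptr<FILE, int(*)(FILE*)> with fclose as its deleter owns the FILE* and guarantees the file is closed when the pointer goes out of scope, including on early return.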
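- Improving runtime efficiency:
  - Batch processing: instead of handling each line in isolation, collect lines into a std::vector<std::string> whose capacity is reserved up front, then search the whole batch at once. Reserving avoids repeated reallocation of the vector. Note that stdio already buffers the underlying reads, so the gain over line-by-line processing is usually modest and should be measured; enlarging the stream buffer (for example with setvbuf) is the more direct way to reduce the number of read calls.
  - Faster search: for long strings, an algorithm with a guaranteed linear bound, such as KMP, can be considered. In most cases, however, the standard library's std::string::find is already fast enough, and replacing it is only worthwhile if profiling shows the search to be the bottleneck.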
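
The KMP idea mentioned above can be sketched as follows. This is a minimal textbook-style version, not a tuned implementation; the function names buildFailureTable and kmpFind are hypothetical, and returning 0 for an empty pattern is an assumption chosen to mirror std::string::find.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Builds the KMP failure table: fail[i] is the length of the longest proper
// prefix of pattern[0..i] that is also a suffix of pattern[0..i].
std::vector<std::size_t> buildFailureTable(const std::string& pattern) {
    std::vector<std::size_t> fail(pattern.size(), 0);
    std::size_t k = 0; // length of the currently matched prefix
    for (std::size_t i = 1; i < pattern.size(); ++i) {
        while (k > 0 && pattern[i] != pattern[k]) {
            k = fail[k - 1]; // fall back to the next shorter border
        }
        if (pattern[i] == pattern[k]) {
            ++k;
        }
        fail[i] = k;
    }
    return fail;
}

// Returns the index of the first occurrence of `pattern` in `text`,
// or std::string::npos if there is none. Runs in O(text + pattern) time.
std::size_t kmpFind(const std::string& text, const std::string& pattern) {
    if (pattern.empty()) {
        return 0; // assumption: same result as std::string::find("")
    }
    const std::vector<std::size_t> fail = buildFailureTable(pattern);
    std::size_t k = 0; // number of pattern characters matched so far
    for (std::size_t i = 0; i < text.size(); ++i) {
        while (k > 0 && text[i] != pattern[k]) {
            k = fail[k - 1]; // reuse the partial match instead of rescanning text
        }
        if (text[i] == pattern[k]) {
            ++k;
        }
        if (k == pattern.size()) {
            return i + 1 - k; // start index of the match
        }
    }
    return std::string::npos;
}
```

The failure table costs one extra allocation per pattern, so for short patterns and typical text the library's find may well be faster; the benefit of KMP is the worst-case guarantee on adversarial inputs.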
Key code snippet

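#include <cstddef>
#include <cstdio>
#include <iostream>
#include <memory>
#include <string>
#include <vector>

// Counts non-overlapping occurrences of target in every line of the batch.
static std::size_t countInBatch(const std::vector<std::string>& lines, const std::string& target) {
    std::size_t count = 0;
    for (const auto& line : lines) {
        std::size_t pos = 0;
        while ((pos = line.find(target, pos)) != std::string::npos) {
            ++count;
            pos += target.length();
        }
    }
    return count;
}

int main() {
    // The smart pointer owns the FILE* and calls fclose when it goes out of scope.
    std::unique_ptr<FILE, int(*)(FILE*)> file(fopen("large_text_file.txt", "r"), fclose);
    if (!file) {
        std::cerr << "Failed to open file" << std::endl;
        return 1;
    }

    const std::string targetSubstring = "specific substring"; // placeholder; must be non-empty
    constexpr std::size_t kBatchSize = 1000;
    std::size_t count = 0;
    std::vector<std::string> lines;
    lines.reserve(kBatchSize); // allocate once to avoid repeated vector growth
    char buffer[1024];
    // fgets stores at most sizeof(buffer) - 1 characters per call, so a longer line
    // arrives in several chunks and a match straddling two chunks is not counted.
    while (fgets(buffer, sizeof(buffer), file.get())) {
        lines.emplace_back(buffer);
        if (lines.size() >= kBatchSize) {
            count += countInBatch(lines, targetSubstring);
            lines.clear(); // keeps the reserved capacity for the next batch
        }
    }
    // Process the lines left over in the final, partially filled batch.
    count += countInBatch(lines, targetSubstring);

    std::cout << "Occurrences of the target substring: " << count << std::endl;
    return 0;
}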
