Interview question: handling Rust strings and byte strings across encodings

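use std::char::TryFromCharError;
use std::convert::TryFrom;

// Decode ISO-8859-1 bytes into a UTF-8 `String`.
// Each ISO-8859-1 byte 0xNN is the Unicode code point U+00NN, so this cannot fail.
fn iso8859_1_to_utf8(iso8859_1_bytes: &[u8]) -> String {
    // `b as char` maps the byte to the code point with the same value;
    // collecting into a `String` performs the UTF-8 encoding.
    iso8859_1_bytes.iter().map(|&b| b as char).collect()
}

// Encode a UTF-8 string as ISO-8859-1 bytes.
// Fails on the first character above U+00FF, which ISO-8859-1 cannot represent.
fn utf8_to_iso8859_1(utf8_str: &str) -> Result<Vec<u8>, TryFromCharError> {
    let mut iso8859_1_bytes = Vec::with_capacity(utf8_str.len());
    for c in utf8_str.chars() {
        // `u8::try_from(char)` (Rust 1.59+) succeeds only for U+0000 to U+00FF.
        iso8859_1_bytes.push(u8::try_from(c)?);
    }
    Ok(iso8859_1_bytes)
}

Code explanation

The iso8859_1_to_utf8 function:
- ISO-8859-1 is a single-byte encoding in which byte values 0x00 to 0xFF correspond one-to-one to Unicode code points U+0000 to U+00FF, so each byte is converted with `b as char`.
- str::from_utf8 must not be used for this step. Only bytes 0x00 to 0x7F are encoded the same way in both encodings; bytes 0x80 to 0xFF are not valid UTF-8 on their own, so from_utf8 returns an error for most non-ASCII ISO-8859-1 input, and can silently misread the input if the bytes happen to form a valid UTF-8 sequence.
- Collecting the characters into a String performs the UTF-8 encoding: code points below U+0080 stay one byte, while U+0080 to U+00FF become two bytes each.
- Because every possible input byte has a valid mapping, the function returns String directly rather than a Result.

The utf8_to_iso8859_1 function:
- Iterates over the characters (Unicode scalar values) of the UTF-8 string.
- Converts each character with u8::try_from, which succeeds only when the code point is in the range 0 to 255 (the ISO-8859-1 range), and appends the resulting byte to the Vec<u8>.
- If a character's code point is above 255, it cannot be represented in ISO-8859-1; the `?` operator returns the error immediately.
- The function returns Result<Vec<u8>, std::char::TryFromCharError>. std::string::FromUtf8Error is not suitable here: it has no public constructor and describes a different failure (invalid UTF-8 bytes).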
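The round trip described above can be checked with a minimal sketch. The helper names `latin1_decode` and `latin1_encode` are hypothetical and defined only for this example; they assume Rust 1.59 or later, where `u8::try_from(char)` is available. This illustrates one way to do the conversion, not the only one.

```rust
use std::char::TryFromCharError;
use std::convert::TryFrom;

/// Decode ISO-8859-1 bytes: byte 0xNN becomes code point U+00NN.
/// (`latin1_decode` is a hypothetical name used only in this sketch.)
pub fn latin1_decode(bytes: &[u8]) -> String {
    bytes.iter().map(|&b| char::from(b)).collect()
}

/// Encode a string as ISO-8859-1, failing on the first char above U+00FF.
/// (`latin1_encode` is a hypothetical name used only in this sketch.)
pub fn latin1_encode(text: &str) -> Result<Vec<u8>, TryFromCharError> {
    text.chars().map(|c| u8::try_from(c)).collect()
}
```

For example, the ISO-8859-1 bytes `[0x63, 0x61, 0x66, 0xE9]` decode to `"caf\u{e9}"`, and encoding that string yields the same four bytes. Passing the lone byte `0xE9` to `std::str::from_utf8` returns an error, which is why the decoding step cannot rely on it.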
