面试题：SQLite 数据导入导出性能优化及异常处理

性能优化方案

批量操作：

避免逐条插入或导出数据。在SQLite中，可以使用事务将多个插入操作包装起来，这样可以减少I/O操作次数。例如，在Python中使用sqlite3库，代码如下：

import sqlite3
source_conn = sqlite3.connect('source.db')
target_conn = sqlite3.connect('target.db')
source_cursor = source_conn.cursor()
target_cursor = target_conn.cursor()
try:
    target_conn.execute('BEGIN')
    rows = source_cursor.execute('SELECT * FROM your_table').fetchmany(1000)
    while rows:
        target_cursor.executemany('INSERT INTO your_table VALUES (?,?,?)', rows)
        rows = source_cursor.execute('SELECT * FROM your_table').fetchmany(1000)
    target_conn.execute('COMMIT')
except Exception as e:
    target_conn.execute('ROLLBACK')
    raise e
finally:
    source_conn.close()
    target_conn.close()

索引处理：
- 在源数据库查询时，确保相关表上有合适的索引，以加快数据读取速度。对于目标数据库，在数据导入前先删除不必要的索引，数据导入完成后再重新创建索引。例如，删除索引：
```
DROP INDEX index_name;
```
- 重新创建索引：
```
CREATE INDEX index_name ON your_table(column1, column2);
```

并行处理：

如果系统资源允许，可以考虑并行处理多个表的数据迁移。例如在Python中使用multiprocessing库，为每个表创建一个单独的进程来进行数据迁移。示例代码如下：

import multiprocessing
import sqlite3

def migrate_table(table_name):
    source_conn = sqlite3.connect('source.db')
    target_conn = sqlite3.connect('target.db')
    source_cursor = source_conn.cursor()
    target_cursor = target_conn.cursor()
    try:
        target_conn.execute('BEGIN')
        rows = source_cursor.execute(f'SELECT * FROM {table_name}').fetchmany(1000)
        while rows:
            target_cursor.executemany(f'INSERT INTO {table_name} VALUES (?,?,?)', rows)
            rows = source_cursor.execute(f'SELECT * FROM {table_name}').fetchmany(1000)
        target_conn.execute('COMMIT')
    except Exception as e:
        target_conn.execute('ROLLBACK')
        raise e
    finally:
        source_conn.close()
        target_conn.close()

if __name__ == '__main__':
    tables = ['table1', 'table2', 'table3']
    processes = []
    for table in tables:
        p = multiprocessing.Process(target = migrate_table, args=(table,))
        p.start()
        processes.append(p)
    for p in processes:
        p.join()

异常处理方案

数据类型不匹配：
- 在迁移前，仔细分析源数据库和目标数据库的表结构，确保数据类型尽可能匹配。如果无法避免数据类型转换，在SQLite中，可以使用SQL函数进行类型转换。例如，将源表中的TEXT类型的数字转换为目标表中的INTEGER类型：
```
INSERT INTO target_table (int_column) SELECT CAST(text_column AS INTEGER) FROM source_table;
```
- 在程序代码层面，可以使用try - except块捕获数据类型转换异常，并记录错误日志。例如在Python中：
```
try:
    target_cursor.execute('INSERT INTO target_table (int_column) VALUES (?)', (int(source_row[0]),))
except ValueError as ve:
    with open('error.log', 'a') as f:
        f.write(f'Data type mismatch error: {ve} for row {source_row}\n')
```

主键冲突：

可以使用INSERT OR REPLACE语句代替INSERT语句。这样当遇到主键冲突时，新数据会替换旧数据。例如：

INSERT OR REPLACE INTO target_table (primary_key_column, other_column) VALUES ('value1', 'value2');

另一种方式是在插入前先进行查询，如果主键已存在则跳过插入操作。例如在Python中：

source_row = ('value1', 'value2')
target_cursor.execute('SELECT primary_key_column FROM target_table WHERE primary_key_column =?', (source_row[0],))
existing_row = target_cursor.fetchone()
if not existing_row:
    target_cursor.execute('INSERT INTO target_table VALUES (?,?)', source_row)

同时，记录主键冲突的日志，以便后续分析。

if existing_row:
    with open('error.log', 'a') as f:
        f.write(f'Primary key conflict for row {source_row}\n')

通过上述性能优化和异常处理方案，可以在大数据量且表间存在复杂关联关系的情况下，高效且准确地完成SQLite数据库间的数据迁移。

面试题：SQLite 数据导入导出性能优化及异常处理

知识考点

面试题答案

性能优化方案

异常处理方案