Interview Question Answer
Adapter Pattern Solution Design
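- Define a unified interface:
- First, define a single data-reading interface that every data source adapter implements, for example:
    import java.util.List;

    public interface DataReader {
        // Returns the records read from the underlying source, one string per record.
        List<String> readData();
    }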
- Create an adapter for each data source:
- Kafka data source adapter:
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;

    import java.time.Duration;
    import java.util.ArrayList;
    import java.util.List;
    import java.util.Properties;

    public class KafkaDataAdapter implements DataReader {
        private final KafkaConsumer<String, String> consumer;
        private final String topic;

        public KafkaDataAdapter(String bootstrapServers, String groupId, String topic) {
            Properties props = new Properties();
            props.put("bootstrap.servers", bootstrapServers);
            props.put("group.id", groupId);
            props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
            this.consumer = new KafkaConsumer<>(props);
            this.topic = topic;
            this.consumer.subscribe(List.of(topic));
        }

        @Override
        public List<String> readData() {
            List<String> data = new ArrayList<>();
            // Wait up to 100 ms for new records.
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(100));
            for (ConsumerRecord<String, String> record : records) {
                data.add(record.value());
            }
            return data;
        }
    }
- CSV file data source adapter:
    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.List;

    public class CSVDataAdapter implements DataReader {
        private final String filePath;

        public CSVDataAdapter(String filePath) {
            this.filePath = filePath;
        }

        @Override
        public List<String> readData() {
            List<String> data = new ArrayList<>();
            // try-with-resources closes the reader even if reading fails.
            try (BufferedReader br = new BufferedReader(new FileReader(filePath))) {
                String line;
                while ((line = br.readLine()) != null) {
                    data.add(line);
                }
            } catch (IOException e) {
                e.printStackTrace();
            }
            return data;
        }
    }
- Remote API data source adapter:
    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.net.HttpURLConnection;
    import java.net.URL;
    import java.util.ArrayList;
    import java.util.List;

    public class APIAdapter implements DataReader {
        private final String apiUrl;

        public APIAdapter(String apiUrl) {
            this.apiUrl = apiUrl;
        }

        @Override
        public List<String> readData() {
            List<String> data = new ArrayList<>();
            try {
                URL url = new URL(apiUrl);
                HttpURLConnection conn = (HttpURLConnection) url.openConnection();
                conn.setRequestMethod("GET");
                conn.connect();
                if (conn.getResponseCode() == HttpURLConnection.HTTP_OK) {
                    // try-with-resources closes the stream even if reading fails.
                    try (BufferedReader in = new BufferedReader(
                            new InputStreamReader(conn.getInputStream()))) {
                        String inputLine;
                        while ((inputLine = in.readLine()) != null) {
                            data.add(inputLine);
                        }
                    }
                }
            } catch (IOException e) {
                e.printStackTrace();
            }
            return data;
        }
    }
- Use the adapters:
    import java.util.List;

    public class Main {
        public static void main(String[] args) {
            // Kafka data source example
            DataReader kafkaAdapter = new KafkaDataAdapter("localhost:9092", "group1", "testTopic");
            List<String> kafkaData = kafkaAdapter.readData();

            // CSV file data source example
            DataReader csvAdapter = new CSVDataAdapter("path/to/your/file.csv");
            List<String> csvData = csvAdapter.readData();

            // Remote API data source example
            DataReader apiAdapter = new APIAdapter("https://example.com/api/data");
            List<String> apiData = apiAdapter.readData();
        }
    }
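The benefit of the pattern is that calling code can depend on DataReader alone and treat every source the same way. The following is a minimal sketch of that idea, not part of the original answer: `InMemoryReader` and `ReaderAggregator` are hypothetical names introduced for illustration, and the interface is repeated only so the example compiles on its own.

```java
import java.util.ArrayList;
import java.util.List;

// Same contract as the DataReader interface defined earlier, repeated for self-containment.
interface DataReader {
    List<String> readData();
}

// Hypothetical stand-in for a real adapter; it returns fixed records.
class InMemoryReader implements DataReader {
    private final List<String> records;

    InMemoryReader(List<String> records) {
        this.records = List.copyOf(records);
    }

    @Override
    public List<String> readData() {
        return records;
    }
}

// Hypothetical helper: it sees only DataReader, never a concrete source type.
class ReaderAggregator {
    static List<String> readAll(List<DataReader> readers) {
        List<String> all = new ArrayList<>();
        for (DataReader reader : readers) {
            all.addAll(reader.readData());
        }
        return all;
    }
}
```

Any of the Kafka, CSV, or API adapters could be passed to `ReaderAggregator.readAll` in place of `InMemoryReader` without changing the helper.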
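Performance Optimization Measures
- Reduce resource overhead:
- Connection reuse:
- For Kafka, creating a KafkaConsumer is relatively expensive, so create it once and reuse it instead of creating and destroying consumers on every read. A singleton is one way to do this. Note that KafkaConsumer is not thread-safe, so under high concurrency the shared instance must only be used by one thread at a time, for example a single dedicated polling thread. For example: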
    import org.apache.kafka.clients.consumer.KafkaConsumer;

    import java.util.Properties;

    public class KafkaConsumerSingleton {
        private static KafkaConsumer<String, String> instance;
        private static final String BOOTSTRAP_SERVERS = "localhost:9092";
        private static final String GROUP_ID = "group1";

        private KafkaConsumerSingleton() {}

        // synchronized so that concurrent first calls cannot create two consumers.
        public static synchronized KafkaConsumer<String, String> getInstance() {
            if (instance == null) {
                Properties props = new Properties();
                props.put("bootstrap.servers", BOOTSTRAP_SERVERS);
                props.put("group.id", GROUP_ID);
                props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
                props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
                instance = new KafkaConsumer<>(props);
            }
            return instance;
        }
    }
- For HTTP connections (remote API), use a connection pool, such as Apache HttpClient's PoolingHttpClientConnectionManager, to avoid opening a new connection on every API call.
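As an alternative sketch that needs no third-party dependency, the JDK's own `java.net.http.HttpClient` (Java 11 and later) keeps connections alive and reuses them when a single client instance is shared. This is a swapped-in technique, not the Apache HttpClient pool named above. `PooledApiAdapter` is a hypothetical class name and the timeouts are assumed values.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical adapter: all instances share one HttpClient so that
// keep-alive connections can be reused between calls.
class PooledApiAdapter {
    private static final HttpClient SHARED_CLIENT = HttpClient.newBuilder()
            .connectTimeout(Duration.ofSeconds(5)) // assumed value
            .build();

    private final URI apiUri;

    PooledApiAdapter(String apiUrl) {
        this.apiUri = URI.create(apiUrl);
    }

    public List<String> readData() throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(apiUri)
                .timeout(Duration.ofSeconds(10)) // assumed value
                .GET()
                .build();
        HttpResponse<String> response =
                SHARED_CLIENT.send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IOException("Unexpected HTTP status " + response.statusCode());
        }
        // Split the body into lines to match the List<String> shape used by the adapters.
        return response.body().lines().collect(Collectors.toList());
    }
}
```

Unlike the APIAdapter above, this sketch reports failures as exceptions instead of returning an empty list, so it would need a small wrapper to implement DataReader directly.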
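- Cache data:
- For CSV files that change infrequently, cache the data in memory. For example, with Guava Cache, the file is read once on first access and later reads are served from the cache.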
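    import com.google.common.cache.Cache;
    import com.google.common.cache.CacheBuilder;

    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.ExecutionException;

    public class CSVDataCache {
        private static final Cache<String, List<String>> cache = CacheBuilder.newBuilder()
                .maximumSize(100)
                .build();

        public static List<String> getCSVData(String filePath) {
            try {
                // The loader runs only on a cache miss; later calls return the cached list.
                return cache.get(filePath, () -> {
                    List<String> data = new ArrayList<>();
                    try (BufferedReader br = new BufferedReader(new FileReader(filePath))) {
                        String line;
                        while ((line = br.readLine()) != null) {
                            data.add(line);
                        }
                    }
                    return data;
                });
            } catch (ExecutionException e) {
                // Cache.get wraps loader failures; surface the original cause
                // instead of caching an empty result.
                throw new IllegalStateException("Failed to load " + filePath, e.getCause());
            }
        }
    }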
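A cache of file contents goes stale if the file is edited. One possible way to handle this, sketched here with JDK classes only, is to store the file's last-modified time next to the cached lines and reload when it changes. `ReloadingCsvCache` is a hypothetical class, and the choice of modification time as the staleness signal is an assumption, not something the original answer prescribes.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical cache that re-reads a file only when its modification time changes.
class ReloadingCsvCache {
    // Cached lines together with the modification time they were read at.
    private static final class Entry {
        final FileTime modifiedAt;
        final List<String> lines;

        Entry(FileTime modifiedAt, List<String> lines) {
            this.modifiedAt = modifiedAt;
            this.lines = lines;
        }
    }

    private final ConcurrentMap<Path, Entry> entries = new ConcurrentHashMap<>();

    List<String> get(Path file) {
        final FileTime current;
        try {
            current = Files.getLastModifiedTime(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // compute() is atomic per key, so concurrent callers do not load the same file twice.
        Entry entry = entries.compute(file, (path, cached) -> {
            if (cached != null && cached.modifiedAt.equals(current)) {
                return cached; // file unchanged: reuse the cached lines
            }
            return new Entry(current, load(path));
        });
        return entry.lines;
    }

    private static List<String> load(Path path) {
        try {
            return List.copyOf(Files.readAllLines(path, StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The sketch does file I/O inside `compute()`, which is acceptable for small files but blocks other updates to the same map bin while loading. Unlike the Guava example, it has no size limit or eviction.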
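- Improve data reading and transformation efficiency:
- Asynchronous processing:
- For Kafka reads, offsets can be committed asynchronously. KafkaConsumer provides commitAsync(), which returns without waiting for the broker to acknowledge the commit, so the consumer spends less time blocked than with a synchronous commit and consumption throughput improves.
- For remote API calls, use Java's CompletableFuture to make the call asynchronously. For example: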
    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.net.HttpURLConnection;
    import java.net.URL;
    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.CompletableFuture;

    public class AsyncAPIAdapter {
        private final String apiUrl;

        public AsyncAPIAdapter(String apiUrl) {
            this.apiUrl = apiUrl;
        }

        public CompletableFuture<List<String>> readDataAsync() {
            // Runs the blocking HTTP call on a background thread and returns immediately.
            return CompletableFuture.supplyAsync(() -> {
                List<String> data = new ArrayList<>();
                try {
                    URL url = new URL(apiUrl);
                    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
                    conn.setRequestMethod("GET");
                    conn.connect();
                    if (conn.getResponseCode() == HttpURLConnection.HTTP_OK) {
                        try (BufferedReader in = new BufferedReader(
                                new InputStreamReader(conn.getInputStream()))) {
                            String inputLine;
                            while ((inputLine = in.readLine()) != null) {
                                data.add(inputLine);
                            }
                        }
                    }
                } catch (IOException e) {
                    e.printStackTrace();
                }
                return data;
            });
        }
    }
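When several sources are read asynchronously, the futures can be started together and merged once all of them finish. The sketch below is one possible way to do that, assuming a caller-supplied executor so that blocking I/O does not occupy the common ForkJoinPool that `supplyAsync` uses by default. `ParallelReads` is a hypothetical helper name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.function.Supplier;
import java.util.stream.Collectors;

// Hypothetical helper that runs several blocking reads in parallel.
class ParallelReads {
    static CompletableFuture<List<String>> readAll(
            List<Supplier<List<String>>> sources, Executor executor) {
        // Start every read on the supplied executor.
        List<CompletableFuture<List<String>>> futures = sources.stream()
                .map(source -> CompletableFuture.supplyAsync(source, executor))
                .collect(Collectors.toList());

        // Complete once all reads have finished, then merge results in source order.
        return CompletableFuture.allOf(futures.toArray(new CompletableFuture[0]))
                .thenApply(ignored -> {
                    List<String> merged = new ArrayList<>();
                    for (CompletableFuture<List<String>> future : futures) {
                        merged.addAll(future.join()); // already complete, so this does not block
                    }
                    return merged;
                });
    }
}
```

A caller could pass method references such as `csvAdapter::readData` and `apiAdapter::readData` as the suppliers, together with a fixed-size thread pool sized for the expected number of concurrent reads.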
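- Batch processing:
- For Kafka consumption, tune the max.poll.records parameter, which caps the number of records returned by a single poll. Raising it lets each poll return more records for batch processing and reduces the number of polls.
- For CSV files, read many lines at once instead of handling them one by one. For example, Files.readAllLines() (Java 7 and later) reads the whole file into a List in one call, and BufferedReader.lines() (Java 8 and later) exposes the lines as a Stream.
- For remote APIs, if the API supports batch requests, combine several requests into one batch request to reduce network overhead.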
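For large CSV files, loading everything into one list may use too much memory. A middle ground, sketched below under that assumption, is to stream the file and hand the lines to a handler in fixed-size batches. `BatchedCsvReader` and its `readInBatches` method are hypothetical names, and the file is assumed to be UTF-8.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper: streams a file and delivers its lines in fixed-size batches.
class BatchedCsvReader {
    static void readInBatches(Path file, int batchSize, Consumer<List<String>> handler)
            throws IOException {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            List<String> batch = new ArrayList<>(batchSize);
            String line;
            while ((line = reader.readLine()) != null) {
                batch.add(line);
                if (batch.size() == batchSize) {
                    handler.accept(batch);
                    // Start a fresh list so the handler may keep the previous batch.
                    batch = new ArrayList<>(batchSize);
                }
            }
            // Deliver the final, possibly smaller, batch.
            if (!batch.isEmpty()) {
                handler.accept(batch);
            }
        }
    }
}
```

The handler could, for instance, parse each batch and write it to a database in one round trip, which keeps memory use bounded by the batch size.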