The duplicates are most likely caused by the BulkProcessor writing the same data more than once. You can troubleshoot this as follows:
Check whether the BulkProcessor is writing data repeatedly.
While debugging, open an ES monitoring tool (such as elasticsearch-head) to watch how data is being written. You can also add log output in your code to see how many documents the BulkProcessor submits on each flush.
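Independently of the client API, a common way to make repeated bulk writes harmless is to derive each document's `_id` deterministically from its content (or a business key), so that a re-sent document overwrites itself instead of creating a duplicate. A minimal JDK-only sketch (the class and method names here are made up for illustration):

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;

public class DeterministicId {
    // Derive a stable document id from the document body: the same content
    // always maps to the same id, so a duplicate bulk write becomes an
    // overwrite of the existing document instead of a second document.
    static String idFor(String docJson) throws Exception {
        byte[] hash = MessageDigest.getInstance("SHA-256")
                .digest(docJson.getBytes(StandardCharsets.UTF_8));
        StringBuilder hex = new StringBuilder(hash.length * 2);
        for (byte b : hash) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    public static void main(String[] args) throws Exception {
        String doc = "{\"name\":\"Joe Smith\",\"age\":25}";
        // Re-sending the same document yields the same id.
        System.out.println(idFor(doc).equals(idFor(doc))); // prints "true"
    }
}
```

In the Elasticsearch code below you would then pass this value via `new IndexRequest("index1").id(idFor(json))`, so repeated writes update the same document rather than append a new one.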
Use the newer BulkProcessor API.
In Elasticsearch 7.x the BulkProcessor API has changed: you build requests with BulkRequest and create the processor through BulkProcessor.builder(). The builder's Listener hooks let you see exactly what is sent in each batch, which helps track down duplicate writes (note that the API itself does not deduplicate documents for you).
Example code:
// Assumes an initialized RestHighLevelClient named `client` and an SLF4J `logger`.
// XContentFactory.jsonBuilder() throws IOException, so handle or declare it.
BulkRequest request = new BulkRequest();
// In 7.x, mapping types are removed: pass only the index name and the id.
request.add(new IndexRequest("index1").id("1").source(XContentFactory.jsonBuilder()
        .startObject()
        .field("name", "Joe Smith")
        .field("age", 25)
        .endObject()));
request.add(new DeleteRequest("index2", "2"));
request.add(new UpdateRequest("index3", "3")
        .doc(XContentFactory.jsonBuilder()
                .startObject()
                .field("gender", "male")
                .endObject()));
BulkProcessor.Listener listener = new BulkProcessor.Listener() {
    @Override
    public void beforeBulk(long executionId, BulkRequest request) {
        logger.debug("Executing bulk [{}] with {} requests", executionId, request.numberOfActions());
    }

    @Override
    public void afterBulk(long executionId, BulkRequest request, BulkResponse response) {
        logger.debug("Executed bulk [{}] with {} requests", executionId, request.numberOfActions());
    }

    @Override
    public void afterBulk(long executionId, BulkRequest request, Throwable failure) {
        logger.warn("Error executing bulk [{}]", executionId, failure);
    }
};
// In 7.x, bulkAsync takes RequestOptions, so wrap the call in a lambda
// instead of using the client::bulkAsync method reference.
BulkProcessor bulkProcessor = BulkProcessor.builder(
        (req, bulkListener) -> client.bulkAsync(req, RequestOptions.DEFAULT, bulkListener),
        listener).build();
// BulkProcessor.add() accepts individual requests, not a whole BulkRequest.
request.requests().forEach(bulkProcessor::add);
While using the BulkProcessor, you can also rely on its Listener interface (as in the code above) to monitor each bulk execution, which makes debugging easier. Also remember to call bulkProcessor.flush() or bulkProcessor.awaitClose(...) when you are done, so that any buffered requests are actually sent.