
Scroll API is missing some documents

How to fix the Scroll API missing some documents

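I am trying to use the Scroll API to fetch all documents from multiple indices, but it does not return all of them. I found a similar question, but in that case the OP was evidently only missing the first batch of documents. Link to that question: Elasticsearch Search Scroll API doesn't retrieve all the documents from an index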

Here is my code:

//Code to get indexes

for (String indexName : indexNames) {
   final Scroll scroll = new Scroll(TimeValue.timeValueSeconds(45L));
   SearchRequest searchRequest = new SearchRequest(indexName);
   searchRequest.scroll(scroll);
   SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();

   QueryBuilder query = QueryBuilders.boolQuery()
      .filter(QueryBuilders.termQuery("sourceId", 2))
      .filter(QueryBuilders.rangeQuery("date").gte("01-05-2021").lte("31-05-2021"));
   searchSourceBuilder.query(query);
   searchSourceBuilder.size(10000);
   searchRequest.source(searchSourceBuilder);

   SearchResponse searchResponse = client.search(searchRequest,RequestOptions.DEFAULT);
   String scrollId = searchResponse.getScrollId();
   SearchHit[] searchHits = searchResponse.getHits().getHits();

   List<Model> model = new ArrayList<>();      

   while(searchHits != null && searchHits.length > 0) {
      for (SearchHit document : searchHits){
         //add document to model list created above
      } //end of for loop

      // insert model list to database

      SearchScrollRequest searchScrollRequest = new SearchScrollRequest(scrollId);
      searchScrollRequest.scroll(scroll);
      searchResponse = client.scroll(searchScrollRequest,RequestOptions.DEFAULT);
      scrollId = searchResponse.getScrollId();
      searchHits = searchResponse.getHits().getHits();

   } //end of while loop

   ClearScrollRequest clear = new ClearScrollRequest();
   clear.addScrollId(scrollId);

} //end of for loop at the top
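
One detail of the query is easy to rule out: the range bounds have to reach Elasticsearch as date strings. As a minimal plain-Java sketch (the class name `BareDateBounds` is only illustrative and not part of the code above), this is what the same tokens evaluate to when they are written without quotes:

```java
final class BareDateBounds {
    // Without quotes each bound is integer subtraction; 01 and 05 are octal literals.
    static final int LOWER = 01 - 05 - 2021; // 1 - 5 - 2021
    static final int UPPER = 31 - 05 - 2021; // 31 - 5 - 2021

    public static void main(String[] args) {
        System.out.println(LOWER); // prints -2025
        System.out.println(UPPER); // prints -1995
    }
}
```

If the bounds are passed as strings, the field mapping still has to accept a day-month-year format. `RangeQueryBuilder.format("dd-MM-yyyy")` can state the format explicitly; whether that is needed here depends on the mapping, which the question does not show.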

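The total number of documents I should get is 115 million, but I am missing more than 2 million. I have gone over my code repeatedly and cannot figure out what I am missing.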
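
One way to narrow down where the gap appears is to tally, per index, how many hits the loop actually receives and compare that with the total reported for the same query. The sketch below shows only the accounting pattern. It assumes in-memory pages as a stand-in for scroll responses, and the names `ScrollTally`, `drain`, and `shortfall` are hypothetical; it does not call the Elasticsearch client.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

final class ScrollTally {

    /**
     * Drains every page, hands each page to the sink as its own batch,
     * and returns how many hits were seen in total.
     */
    static long drain(Iterator<List<String>> pages, Consumer<List<String>> sink) {
        long seen = 0;
        while (pages.hasNext()) {
            List<String> hits = pages.next();
            if (hits.isEmpty()) {
                break; // an empty page ends the scroll
            }
            // Fresh list per page, so a batch never carries hits over from the previous page.
            List<String> batch = new ArrayList<>(hits);
            sink.accept(batch);
            seen += batch.size();
        }
        return seen;
    }

    /** Difference between the reported total and the hits actually delivered. */
    static long shortfall(long reportedTotal, long seen) {
        return reportedTotal - seen;
    }

    public static void main(String[] args) {
        // Hypothetical in-memory pages standing in for scroll responses.
        List<List<String>> pages = Arrays.asList(
            Arrays.asList("a", "b", "c"),
            Arrays.asList("d", "e"),
            new ArrayList<String>());
        List<String> stored = new ArrayList<>();
        long seen = drain(pages.iterator(), stored::addAll);
        System.out.println("seen=" + seen + " stored=" + stored.size()
            + " shortfall=" + shortfall(5, seen));
    }
}
```

A nonzero shortfall for a particular index points at the read side, while a zero shortfall with rows still missing points at the database insert. With the real client, `SearchResponse.getFailedShards()` and `SearchResponse.isTimedOut()` are also worth logging for every page, since a partial response would otherwise pass unnoticed.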
