Test environment: single-node cluster on my local laptop, 8 cores, JVM heap -Xms8g -Xmx8g
Indexing performance (single index):
10 million payments of about 5 KB each were indexed with a batch size of 10,000. Each batch took roughly 2.5 s to 4 s, and the total time to index all 10 million payments was around 50 min.
Indexing performance (multiple indices):
10 million payments in total were spread across 20 separate indices. Indexing was slightly faster than in the single-index case: each batch took roughly 1.7 s to 3.8 s, and the total time to index all 10 million payments was around 38 min.
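To compare the two runs, the reported totals can be turned into average figures. The sketch below only redoes the arithmetic from the numbers above (10 million documents, batches of 10,000, 50 min versus 38 min); the function names are illustrative. It gives 1,000 bulk requests per run, about 3.0 s per batch and roughly 3,300 documents per second for the single index, against about 2.28 s per batch and roughly 4,400 documents per second for 20 indices, which sits inside the per-batch ranges observed.

```kotlin
// Figures taken from the two indexing runs; names are illustrative only.
val TOTAL_DOCS = 10_000_000L
val BATCH_SIZE = 10_000L

// Number of bulk requests needed to index every document (rounded up).
fun batchCount(totalDocs: Long, batchSize: Long): Long =
    (totalDocs + batchSize - 1) / batchSize

// Average wall-clock seconds spent per bulk request.
fun avgBatchSeconds(totalMinutes: Double, batches: Long): Double =
    totalMinutes * 60 / batches

// Average number of documents indexed per second over the whole run.
fun docsPerSecond(totalDocs: Long, totalMinutes: Double): Double =
    totalDocs / (totalMinutes * 60)

fun main() {
    val batches = batchCount(TOTAL_DOCS, BATCH_SIZE)
    val runs = listOf("single index" to 50.0, "20 indices" to 38.0)
    for ((label, minutes) in runs) {
        val perBatch = avgBatchSeconds(minutes, batches)
        val rate = docsPerSecond(TOTAL_DOCS, minutes)
        println("$label: $batches batches, " + "%.2f s/batch, %.0f docs/s".format(perBatch, rate))
    }
}
```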
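Parameters required for the bulk load operation
Elasticsearch config (elasticsearch.yml): http.max_content_length: 500mb
Client timeout adjustment:

    import org.apache.http.HttpHost
    import org.elasticsearch.client.RestClient

    // Low-level REST client builder with timeouts raised for large bulk requests
    val builder = RestClient.builder(HttpHost("localhost", 9200))
        .setRequestConfigCallback {
            it.apply {
                setConnectTimeout(5000)   // 5 s to establish the connection
                setSocketTimeout(60000)   // 60 s waiting for response data
            }
        }
        .setMaxRetryTimeoutMillis(60000)  // kept in line with the socket timeout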
Initially the batch size was set to 100,000. The Elasticsearch server became unstable: garbage collection ran very frequently and consumed a large share of CPU time. A larger batch size therefore does not always mean higher performance.
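A rough size estimate helps explain this. At about 5 KB per payment, a batch of 10,000 is around 51 MB per bulk request, while a batch of 100,000 is around 512 MB, which is close to the 500mb `http.max_content_length` limit and a large fraction of the 8 GB heap once the request is parsed. The sketch below shows that estimate and one possible way to cap batches by both document count and payload bytes. It assumes 5 KB means 5 × 1024 bytes and that `500mb` means 500 × 1024 × 1024 bytes; `bulkPayloadBytes` and `batchBySize` are hypothetical helpers, and the estimate ignores the bulk action lines.

```kotlin
// Assumptions: ~5 KB per payment, limit of 500mb read as 500 * 1024 * 1024 bytes.
val DOC_BYTES = 5L * 1024
val MAX_CONTENT_BYTES = 500L * 1024 * 1024

// Approximate body size of one bulk request (document sources only).
fun bulkPayloadBytes(batchSize: Int, docBytes: Long = DOC_BYTES): Long = batchSize * docBytes

// Hypothetical helper: group serialized documents into batches that stay under
// both a document-count cap and a byte cap. A single document larger than
// maxBytes still gets a batch of its own.
fun batchBySize(docs: List<String>, maxDocs: Int, maxBytes: Long): List<List<String>> {
    val batches = mutableListOf<List<String>>()
    var current = mutableListOf<String>()
    var currentBytes = 0L
    for (doc in docs) {
        val size = doc.toByteArray(Charsets.UTF_8).size.toLong()
        val full = current.size >= maxDocs || currentBytes + size > maxBytes
        if (full && current.isNotEmpty()) {
            batches.add(current)
            current = mutableListOf()
            currentBytes = 0L
        }
        current.add(doc)
        currentBytes += size
    }
    if (current.isNotEmpty()) batches.add(current)
    return batches
}

fun main() {
    for (batchSize in listOf(10_000, 100_000)) {
        val bytes = bulkPayloadBytes(batchSize)
        println("batch of $batchSize: $bytes bytes, limit $MAX_CONTENT_BYTES bytes")
    }
    // Small demonstration with 25 ten-byte documents and a cap of 10 per batch.
    val demo = List(25) { "x".repeat(10) }
    println(batchBySize(demo, maxDocs = 10, maxBytes = 1000L).map { it.size })
}
```

Capping by bytes as well as by count keeps the request size predictable when document sizes vary, which a fixed document count alone does not.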
Query aggregation performance:
Test query: a real aggregation query used by the rule engine

    {
      "aggregations": {
        "date_range": {
          "range": {
            "field": "createdAt",
            "ranges": [ { "key": "LAST_7_DAYS", "from": 1544400968485, "to": 1545005768486 } ],
            "keyed": false
          },
          "aggregations": {
            "filter_aggregator": {
              "filters": {
                "filters": {
                  "602c7d66-e990-4dfb-b6e2-72b62ff159d5": { "terms": { "beneficiaryId.keyword": [ "602c7d66-e990-4dfb-b6e2-72b62ff159d5" ], "boost": 1 } },
                  "67cab0c8-2510-443d-8f00-bce19c04815e": { "terms": { "bankAccountUserId.keyword": [ "67cab0c8-2510-443d-8f00-bce19c04815e" ], "boost": 1 } },
                  "8da52e51-eabf-4f6c-b9f0-e222933c1cb7": { "terms": { "payerId.keyword": [ "8da52e51-eabf-4f6c-b9f0-e222933c1cb7" ], "boost": 1 } },
                  "8da52e51-eabf-4f6c-b9f0-e222933c1cb7_602c7d66-e990-4dfb-b6e2-72b62ff159d5": { "bool": { "filter": [
                    { "terms": { "payerId.keyword": [ "8da52e51-eabf-4f6c-b9f0-e222933c1cb7" ], "boost": 1 } },
                    { "terms": { "beneficiaryId.keyword": [ "602c7d66-e990-4dfb-b6e2-72b62ff159d5" ], "boost": 1 } }
                  ], "adjust_pure_negative": true, "boost": 1 } },
                  "9a1b4bad-ccf5-4c67-8718-02696cb351e4": { "terms": { "clientId.keyword": [ "9a1b4bad-ccf5-4c67-8718-02696cb351e4" ], "boost": 1 } },
                  "9a1b4bad-ccf5-4c67-8718-02696cb351e4_602c7d66-e990-4dfb-b6e2-72b62ff159d5": { "bool": { "filter": [
                    { "terms": { "clientId.keyword": [ "9a1b4bad-ccf5-4c67-8718-02696cb351e4" ], "boost": 1 } },
                    { "terms": { "beneficiaryId.keyword": [ "602c7d66-e990-4dfb-b6e2-72b62ff159d5" ], "boost": 1 } }
                  ], "adjust_pure_negative": true, "boost": 1 } },
                  "9a1b4bad-ccf5-4c67-8718-02696cb351e4_8da52e51-eabf-4f6c-b9f0-e222933c1cb7": { "bool": { "filter": [
                    { "terms": { "clientId.keyword": [ "9a1b4bad-ccf5-4c67-8718-02696cb351e4" ], "boost": 1 } },
                    { "terms": { "payerId.keyword": [ "8da52e51-eabf-4f6c-b9f0-e222933c1cb7" ], "boost": 1 } }
                  ], "adjust_pure_negative": true, "boost": 1 } },
                  "9a1b4bad-ccf5-4c67-8718-02696cb351e4_8da52e51-eabf-4f6c-b9f0-e222933c1cb7_602c7d66-e990-4dfb-b6e2-72b62ff159d5": { "bool": { "filter": [
                    { "terms": { "clientId.keyword": [ "9a1b4bad-ccf5-4c67-8718-02696cb351e4" ], "boost": 1 } },
                    { "terms": { "payerId.keyword": [ "8da52e51-eabf-4f6c-b9f0-e222933c1cb7" ], "boost": 1 } },
                    { "terms": { "beneficiaryId.keyword": [ "602c7d66-e990-4dfb-b6e2-72b62ff159d5" ], "boost": 1 } }
                  ], "adjust_pure_negative": true, "boost": 1 } }
                },
                "other_bucket": false,
                "other_bucket_key": "_other_"
              },
              "aggregations": {
                "beneficiary_amount": { "stats": { "field": "beneficiaryAmountUsd" } },
                "payer_amount": { "stats": { "field": "payerAmountUsd" } },
                "distinct_count_beneficiary": { "cardinality": { "field": "beneficiaryId.keyword" } },
                "distinct_count_payer": { "cardinality": { "field": "payerId.keyword" } },
                "distinct_count_client": { "cardinality": { "field": "clientId.keyword" } },
                "distinct_count_bank_acc": { "cardinality": { "field": "bankAccountUserId.keyword" } },
                "distinct_count_bene_country": { "cardinality": { "field": "beneficiaryCountry.keyword" } },
                "distinct_count_payer_country": { "cardinality": { "field": "payerCountry.keyword" } },
                "distinct_count_bene_currency": { "cardinality": { "field": "beneficiaryCurrency.keyword" } },
                "distinct_count_payer_currency": { "cardinality": { "field": "payerCurrency.keyword" } },
                "structured_payment_amount_personal": { "range": { "field": "payerAmountUsd", "ranges": [ { "from": 9000, "to": 9999.999 } ], "keyed": false } },
                "structured_payment_amount_company": { "range": { "field": "payerAmountUsd", "ranges": [ { "from": 112500000, "to": 124999999.999 } ], "keyed": false } }
              }
            }
          }
        }
      }
    }
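The bucket keys in the `filters` aggregation look like every non-empty combination of the client, payer, and beneficiary IDs joined with underscores, plus a separate bucket for the bank account user. A minimal sketch of how such keys could be generated is shown below. `filterKeys` is a hypothetical helper and not the rule engine's actual code, and the order it produces differs from the order in the query.

```kotlin
// Hypothetical helper: every non-empty combination of the given IDs,
// keeping the input order inside each combination and joining with "_".
fun filterKeys(ids: List<String>): List<String> {
    val keys = mutableListOf<String>()
    // Each bit of mask decides whether the ID at that position is included.
    for (mask in 1 until (1 shl ids.size)) {
        val combo = ids.filterIndexed { i, _ -> (mask and (1 shl i)) != 0 }
        keys.add(combo.joinToString("_"))
    }
    return keys
}

fun main() {
    // IDs from the test query, in client, payer, beneficiary order.
    val ids = listOf(
        "9a1b4bad-ccf5-4c67-8718-02696cb351e4",
        "8da52e51-eabf-4f6c-b9f0-e222933c1cb7",
        "602c7d66-e990-4dfb-b6e2-72b62ff159d5"
    )
    filterKeys(ids).forEach(::println)
}
```

With three IDs this yields seven keys, so the number of filter buckets grows as 2^n - 1 with the number of dimensions combined.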
Test result (single index):
Scenario | Number of runs | Execution times | Min | Max | Average |
---|---|---|---|---|---|
Single thread<br/>Search result hit<br/>Result size unset<br/> | 10 | Text |
Conclusions:
Aggregation performance appears to hinge on the number of documents that match the aggregation.
The result size parameter has a significant impact on aggregation performance. Setting size to 0 not only skips returning the hit documents, it also makes the result eligible for the shard request cache. With a non-zero size, result caching has to be forced by explicitly setting ?request_cache=true on the request.
https://www.elastic.co/guide/en/elasticsearch/reference/6.6/shard-request-cache.html
Executing queries concurrently can also have a negative impact on performance.
Increasing the number of indices has a positive impact on indexing speed but a large negative impact on aggregation performance when the aggregation runs across indices.
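The point about result size and caching can be made concrete with a small sketch of how a search request might be put together. The helpers below are hypothetical and only build strings; the index name `payments` is an assumption, and sending the request (for example through the low-level REST client) is left out.

```kotlin
// Hypothetical helper: build the _search path for an aggregation request.
// With size = 0 the result is cacheable by default; with any other size,
// caching is requested explicitly via the request_cache query parameter.
fun searchPath(index: String, size: Int): String =
    if (size == 0) "/$index/_search" else "/$index/_search?request_cache=true"

// Wrap an aggregation definition in a request body with an explicit size.
fun searchBody(size: Int, aggregationsJson: String): String =
    """{ "size": $size, "aggregations": $aggregationsJson }"""

fun main() {
    val aggs = """{ "payer_amount": { "stats": { "field": "payerAmountUsd" } } }"""
    // Aggregation-only request: no hits returned, cacheable by default.
    println(searchPath("payments", 0))
    println(searchBody(0, aggs))
    // Request that also returns hits: caching has to be asked for.
    println(searchPath("payments", 10))
    println(searchBody(10, aggs))
}
```

When only the aggregation values are needed, the size 0 variant is the simpler choice, since it avoids both the hit payload and the extra query parameter.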