Show all Elasticsearch aggregation results/buckets and not just 10

The size parameter belongs inside the terms aggregation, for example:

curl -XPOST "http://localhost:9200/imoveis/_search?pretty=1" -H 'Content-Type: application/json' -d '
{
   "size": 0,
   "aggregations": {
      "bairro_count": {
         "terms": {
            "field": "bairro.raw",
            "size": 10000
         }
      }
   }
}'

For ES version 2 and earlier, use size: 0 inside the terms aggregation to return all buckets.

Setting size: 0 is deprecated from 2.x onwards because of the memory problems it can cause on your cluster when a field has high-cardinality values. You can read more about it in the related GitHub issue.

It is recommended to explicitly set a reasonable value for size, a number between 1 and 2147483647.

How to show all buckets?

{
  "size": 0,
  "aggs": {
    "aggregation_name": {
      "terms": {
        "field": "your_field",
        "size": 10000
      }
    }
  }
}

Note

  • size: 10000 (inside terms): return at most 10000 buckets. The default is 10.
  • size: 0 (top level): by default the result's hits contain 10 documents. We don't need them.
  • By default, the buckets are ordered by the doc_count in decreasing order.
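
The request shape above can be wrapped in two small helpers. This is a minimal sketch, not an official client API: the function names buildTermsAggBody and extractBuckets are hypothetical, and the response is assumed to be the plain search response body. The truncated flag relies on sum_other_doc_count, which the terms aggregation reports for documents that fell outside the returned buckets.

```javascript
// Hypothetical helper: build a search body for a terms aggregation
// with an explicit bucket limit and no hits.
function buildTermsAggBody(aggName, field, maxBuckets = 10000) {
  return {
    size: 0, // top-level size: skip the document hits
    aggs: {
      [aggName]: { terms: { field, size: maxBuckets } },
    },
  };
}

// Hypothetical helper: read the buckets back out of a search response body.
// `truncated` is true when some documents were left out of the returned
// buckets, which suggests the bucket limit was too small.
function extractBuckets(responseBody, aggName) {
  const agg = responseBody.aggregations && responseBody.aggregations[aggName];
  const buckets = agg ? agg.buckets : [];
  return {
    buckets: buckets.map(b => ({ key: b.key, count: b.doc_count })),
    truncated: Boolean(agg && agg.sum_other_doc_count > 0),
  };
}
```

If truncated comes back true, either raise the limit or switch to a paginated approach.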

Why do I get the "Fielddata is disabled on text fields by default" error?

Because fielddata is disabled on text fields by default. If you have not explicitly chosen a field type mapping, the field gets the default dynamic mapping for strings: a text field with a keyword sub-field.

So, instead of writing field: your_field you need to have field: your_field.keyword.
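
The choice between your_field and your_field.keyword can be made from the index mapping. The sketch below assumes you already have the "properties" object of the mapping in hand and that the field is a flat, top-level one; the function name aggregatableField is hypothetical.

```javascript
// Hypothetical helper: given the `properties` object of an index mapping,
// return the field name that is safe to use in a terms aggregation,
// or null if no suitable field is found.
function aggregatableField(properties, field) {
  const def = properties[field];
  if (!def) return null;                    // field not in the mapping
  if (def.type !== 'text') return field;    // not a text field: use it directly
  const sub = def.fields || {};
  // Look for a keyword sub-field (named "keyword" under default dynamic mapping).
  const name = Object.keys(sub).find(k => sub[k].type === 'keyword');
  return name ? `${field}.${name}` : null;  // text field without a keyword sub-field
}
```

A null result for a text field means the mapping has to change (for example by adding a keyword sub-field and reindexing) before the field can be aggregated this way.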

If you want to get all unique values without setting a magic number (size: 10000), use the composite aggregation (ES 6.5+).

From official documentation:

If you want to retrieve all terms or all combinations of terms in a nested terms aggregation you should use the COMPOSITE AGGREGATION which allows to paginate over all possible terms rather than setting a size greater than the cardinality of the field in the terms aggregation. The terms aggregation is meant to return the top terms and does not allow pagination.

Implementation example in JavaScript:

const ITEMS_PER_PAGE = 1000;

const body = {
  // Return only aggregation results, no hits:
  // https://www.elastic.co/guide/en/elasticsearch/reference/current/returning-only-agg-results.html
  size: 0,
  aggs: {
    langs: {
      composite: {
        size: ITEMS_PER_PAGE,
        sources: [
          { language: { terms: { field: 'language' } } }
        ]
      }
    }
  }
};

const uniqueLanguages = [];

// `es` is assumed to be a client whose search() resolves to the response body;
// adjust the call to your client's signature. Because of `await`, this loop
// must run inside an async function or an ES module.
while (true) {
  const result = await es.search(body);

  // Composite bucket keys are objects keyed by source name, e.g. { language: 'en' }.
  const currentUniqueLangs = result.aggregations.langs.buckets.map(bucket => bucket.key.language);

  uniqueLanguages.push(...currentUniqueLangs);

  const after = result.aggregations.langs.after_key;

  if (after) {
    // Continue paginating: ask for the buckets that follow the last key seen.
    body.aggs.langs.composite.after = after;
  } else {
    break;
  }
}

console.log(uniqueLanguages);
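
The pagination step itself can be isolated into a pure function, which makes it easy to check without a running cluster. This is a sketch under the same assumptions as the loop above (the response is the plain search response body); the name nextPageBody is hypothetical. It returns a new request body instead of mutating the previous one, and null when there is nothing left to fetch.

```javascript
// Hypothetical helper: derive the request body for the next composite page
// from the previous body and the response it produced.
// Returns null when the response signals that pagination is finished.
function nextPageBody(body, aggName, responseBody) {
  const agg = responseBody.aggregations[aggName];
  if (!agg.after_key || agg.buckets.length === 0) return null;
  const current = body.aggs[aggName];
  return {
    ...body,
    aggs: {
      ...body.aggs,
      [aggName]: {
        ...current,
        // Copy the composite settings and add the resume point.
        composite: { ...current.composite, after: agg.after_key },
      },
    },
  };
}
```

A driving loop would call the search, collect the buckets, then replace its body with the result of nextPageBody until that result is null.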
