add example code for aggregate count and buckets #74

Open
priamai opened this issue Jul 5, 2021 · 3 comments
Labels
documentation Improvements or additions to documentation

Comments

@priamai

priamai commented Jul 5, 2021

Hi there,
It would be nice to add examples for the following:

And for bucket aggregation:
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html
Can this be combined with a cardinality aggregation for each bucket?

Cheers.

@alk-lbinet
Contributor

Hi @priamai,

Here are some examples:

from pandagg import Search

# value_count aggregation
> Search().agg('test_value_count', 'value_count', field='rootFields')
{
  "aggs": {
    "test_value_count": {
      "value_count": {
        "field": "rootFields"
      }
    }
  }
}

# cardinality aggregation
> Search().agg('test_value_count', 'cardinality', field='rootFields')
{
  "aggs": {
    "test_value_count": {
      "cardinality": {
        "field": "rootFields"
      }
    }
  }
}

# group by terms aggregation, with cardinality per bucket
> Search().groupby('entity type', 'terms', field='ruleEntityType').agg('cardinality count', 'cardinality', field='rootFields')
{
  "aggs": {
    "entity type": {
      "terms": {
        "field": "ruleEntityType"
      },
      "aggs": {
        "cardinality count": {
          "cardinality": {
            "field": "rootFields"
          }
        }
      }
    }
  }
}
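
For readers who want to see the request shape without pandagg, here is a minimal sketch in plain Python that builds the same kind of body: a terms aggregation with a cardinality sub-aggregation in each bucket. The helper name `terms_with_cardinality` and its parameters are hypothetical and not part of pandagg. Only the field names `ruleEntityType` and `rootFields` come from the examples above.

```python
import json


def terms_with_cardinality(bucket_name, bucket_field, metric_name, metric_field):
    """Build an Elasticsearch request body (hypothetical helper, not pandagg API).

    The outer "terms" aggregation creates one bucket per distinct value of
    bucket_field; the nested "aggs" entry is evaluated once per bucket.
    """
    return {
        "aggs": {
            bucket_name: {
                "terms": {"field": bucket_field},
                "aggs": {
                    metric_name: {"cardinality": {"field": metric_field}}
                },
            }
        }
    }


body = terms_with_cardinality(
    "entity type", "ruleEntityType", "cardinality count", "rootFields"
)
print(json.dumps(body, indent=2))
```

Nesting the metric under the bucket's own "aggs" key is what makes Elasticsearch compute it per bucket rather than once over all documents.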

Using the discover helper (on a testing index):

> from elasticsearch import Elasticsearch
> from pandagg.discovery import discover

> client = Elasticsearch(hosts=[<your_host>])
> indices = discover(client)
> vr = indices.validation_rule_02

> vr.search().agg('test_value_count', 'value_count', field='rootFields')

     test_value_count
NaN              8674

> vr.search().agg('test_value_count', 'cardinality', field='rootFields')

     test_value_count
NaN               795

> vr.search().groupby('entity type', 'terms', field='ruleEntityType').agg('cardinality count', 'cardinality', field='rootFields')

            doc_count  cardinality count
entity type
0                6353                722
2                 813                 64
4                 238                 51
1                 222                 54
10                 71                 13
13                 56                 26
6                  40                  7
3                  35                 10
7                  29                  2
11                 25                  8
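
pandagg does the conversion from the raw response to a table for you. As a rough sketch of what such a flattening involves (this is not pandagg's implementation), the snippet below turns a terms aggregation response into one row per bucket. The function name `flatten_terms_buckets` is hypothetical, and `sample_response` is hand-written to mirror the first two rows of the table above, following the standard Elasticsearch response layout.

```python
def flatten_terms_buckets(response, bucket_name, metric_name):
    """Return one dict per bucket: bucket key, doc_count, and the metric value.

    Hypothetical helper: assumes a single-level terms aggregation whose
    buckets each carry a single-value metric under metric_name.
    """
    buckets = response["aggregations"][bucket_name]["buckets"]
    return [
        {
            bucket_name: bucket["key"],
            "doc_count": bucket["doc_count"],
            metric_name: bucket[metric_name]["value"],
        }
        for bucket in buckets
    ]


# Hand-written sample in the shape Elasticsearch returns for a terms
# aggregation with a cardinality sub-aggregation (illustrative only).
sample_response = {
    "aggregations": {
        "entity type": {
            "buckets": [
                {"key": 0, "doc_count": 6353, "cardinality count": {"value": 722}},
                {"key": 2, "doc_count": 813, "cardinality count": {"value": 64}},
            ]
        }
    }
}

rows = flatten_terms_buckets(sample_response, "entity type", "cardinality count")
for row in rows:
    print(row)
```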

Hope it helps

Cheers

@alk-lbinet
Contributor

@priamai FYI, you might have to upgrade to the latest version, v0.2.1 (https://github.com/alkemics/pandagg/releases/tag/v0.2.1).

@priamai
Author

priamai commented Jul 6, 2021

It was already working, but yes, I will upgrade for sure. Thanks!

@alk-lbinet alk-lbinet added the documentation Improvements or additions to documentation label Sep 7, 2021
@leonardbinet leonardbinet added this to the Solid documentation milestone Mar 15, 2022