user-guide on aggregations declaration
leonardbinet committed Jun 22, 2020
1 parent c71fd43 commit 1f970cf
Showing 3 changed files with 119 additions and 10 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -1,6 +1,7 @@
.*
!.github
!.pre-commit-config.yaml
_private/
*.py[co]
*.egg
*.egg-info
126 changes: 117 additions & 9 deletions docs/source/user-guide.rst
@@ -20,8 +20,8 @@ The :class:`~pandagg.tree.query.abstract.Query` class provides :
- ability to insert clauses at specific points
- tree-like visual representation

Instantiation
=============
Declaration
===========

From native "dict" query
------------------------
@@ -112,8 +112,8 @@ All these classes inherit from :class:`~pandagg.tree.query.abstract.Query` and t
>>> isinstance(q, Query)
True

With single clause as flattened syntax
--------------------------------------
With flattened syntax
---------------------

In the flattened syntax, the query clause type is used as first argument:

@@ -288,17 +288,125 @@ Aggregation
The :class:`~pandagg.tree.aggs.aggs.Aggs` class provides :

- multiple syntaxes to declare and update an aggregation
- clause validation (with nested clauses validation)
- ability to insert clauses at specific points
- aggregation clause validation
- ability to insert clauses at specific locations (and not just below last manipulated clause)


Declaration
===========

From native "dict" query
------------------------

Given the following aggregation:

>>> expected_aggs = {
>>> "decade": {
>>> "histogram": {"field": "year", "interval": 10},
>>> "aggs": {
>>> "genres": {
>>> "terms": {"field": "genres", "size": 3},
>>> "aggs": {
>>> "max_nb_roles": {
>>> "max": {"field": "nb_roles"}
>>> },
>>> "avg_rank": {
>>> "avg": {"field": "rank"}
>>> }
>>> }
>>> }
>>> }
>>> }
>>> }

To declare :class:`~pandagg.tree.aggs.aggs.Aggs`, pass the native dict as argument:

>>> from pandagg.aggs import Aggs
>>> a = Aggs(expected_aggs)

A visual representation of the aggregation is available with :func:`~pandagg.tree.aggs.aggs.Aggs.show`:

>>> a.show()
<Aggregations>
decade <histogram, field="year", interval=10>
└── genres <terms, field="genres", size=3>
├── max_nb_roles <max, field="nb_roles">
└── avg_rank <avg, field="rank">


Call :func:`~pandagg.tree.aggs.aggs.Aggs.to_dict` to convert it to native dict:

>>> a.to_dict() == expected_aggs
True
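
To make the structure of the native dict concrete, here is a minimal pure-Python sketch that walks such a dict and lists each clause with its depth. It is independent of pandagg and is not how ``show`` is implemented; the helper name ``iter_aggs`` and the set of reserved keys are assumptions made for illustration.

```python
# Hypothetical helper, not part of pandagg: walks a native aggregation dict.
RESERVED_KEYS = {"aggs", "aggregations", "meta"}  # assumed non-type keys


def iter_aggs(aggs, depth=0):
    """Yield (depth, name, agg_type, body) for each clause, parents first."""
    for name, clause in aggs.items():
        # The single non-reserved key of a clause is its aggregation type.
        agg_type = next(k for k in clause if k not in RESERVED_KEYS)
        yield depth, name, agg_type, clause[agg_type]
        # Sub-aggregations are nested under "aggs" (or "aggregations").
        children = clause.get("aggs", clause.get("aggregations", {}))
        yield from iter_aggs(children, depth + 1)


expected_aggs = {
    "decade": {
        "histogram": {"field": "year", "interval": 10},
        "aggs": {
            "genres": {
                "terms": {"field": "genres", "size": 3},
                "aggs": {
                    "max_nb_roles": {"max": {"field": "nb_roles"}},
                    "avg_rank": {"avg": {"field": "rank"}},
                },
            }
        },
    }
}

# One indented line per clause, in declaration order.
lines = [
    "  " * depth + f"{name} <{agg_type}>"
    for depth, name, agg_type, _ in iter_aggs(expected_aggs)
]
print("\n".join(lines))
```

Each nesting level of ``aggs`` in the dict becomes one level of indentation, which mirrors the tree shown by ``show``.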

With DSL classes
----------------

Pandagg provides a DSL to declare this aggregation in a similar fashion:

>>> from pandagg.aggs import Histogram, Terms, Max, Avg
>>>
>>> a = Histogram("decade", field='year', interval=10, aggs=[
>>> Terms("genres", field="genres", size=3, aggs=[
>>> Max("max_nb_roles", field="nb_roles"),
>>> Avg("avg_rank", field="rank")
>>> ]),
>>> ])

All these classes inherit from :class:`~pandagg.tree.aggs.aggs.Aggs` and thus provide the same interface.

Aggregation declaration
>>> from pandagg.aggs import Aggs
>>> isinstance(a, Aggs)
True

With flattened syntax
---------------------

In the flattened syntax, the first argument is the aggregation name, the second is the aggregation type, and the
remaining keyword arguments define the aggregation body:

>>> from pandagg.aggs import Aggs
>>> a = Aggs('genres', 'terms', field='genres', size=3)
>>> a.to_dict()
{'genres': {'terms': {'field': 'genres', 'size': 3}}}
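
The mapping from flattened arguments to the native dict can be sketched as follows. ``flat_to_dict`` is a hypothetical helper, not part of pandagg; it only illustrates how the name, the type and the keyword arguments end up nested.

```python
def flat_to_dict(name, agg_type, **body):
    """Hypothetical helper: nest flattened arguments as {name: {type: body}}."""
    return {name: {agg_type: body}}


flat = flat_to_dict("genres", "terms", field="genres", size=3)
print(flat)
```

The keyword arguments are passed through untouched, so any option accepted by the aggregation type can be given this way.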


Aggregations enrichment
=======================

Aggregations can be enriched using two methods:

- :func:`~pandagg.tree.aggs.aggs.Aggs.aggs`
- :func:`~pandagg.tree.aggs.aggs.Aggs.groupby`

Both methods return a new :class:`~pandagg.tree.aggs.aggs.Aggs` instance and leave the initial aggregation unchanged.

For instance:

>>> from pandagg.aggs import Aggs
>>> initial_a = Aggs()
>>> enriched_a = initial_a.aggs('genres_agg', 'terms', field='genres')

>>> initial_a.to_dict() is None
True

>>> enriched_a.to_dict()
{'genres_agg': {'terms': {'field': 'genres'}}}

.. note::

Calling :func:`~pandagg.tree.aggs.aggs.Aggs.to_dict` on an empty aggregation returns ``None``:

>>> from pandagg.aggs import Aggs
>>> Aggs().to_dict() is None
True
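
The copy-on-enrichment behaviour described above can be illustrated with a small stand-in class. ``MiniAggs`` is hypothetical and far simpler than the real ``Aggs``; it only shows the pattern of returning a new instance, and it assumes the empty-aggregation behaviour documented here (``to_dict`` returning ``None``).

```python
import copy


class MiniAggs:
    """Hypothetical stand-in for Aggs, limited to top-level clauses."""

    def __init__(self, clauses=None):
        self._clauses = clauses or {}

    def aggs(self, name, agg_type, **body):
        # Work on a deep copy so the instance being enriched is never mutated.
        new_clauses = copy.deepcopy(self._clauses)
        new_clauses[name] = {agg_type: body}
        return MiniAggs(new_clauses)

    def to_dict(self):
        # Mirrors the documented behaviour: an empty aggregation yields None.
        return copy.deepcopy(self._clauses) if self._clauses else None


initial = MiniAggs()
enriched = initial.aggs("genres_agg", "terms", field="genres")

# Callers that need a dict can fall back explicitly when the result is None.
body = initial.to_dict() or {}
```

Because an empty aggregation yields ``None`` rather than ``{}``, code that merges the result into a request body may want an explicit fallback such as the ``or {}`` shown above.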


TODO

Aggregation response
====================
********
Response
********

TODO

2 changes: 1 addition & 1 deletion pandagg/tree/aggs/aggs.py
@@ -449,7 +449,7 @@ def aggs(self, *args, **kwargs):

def to_dict(self, from_=None, depth=None, with_name=True):
if self.root is None:
return {}
return None
from_ = self.root if from_ is None else from_
node = self.get(from_)
children_queries = {}