Merge pull request #41 from alk-lbinet/doc-d

Docs update
alkemics · Jun 29, 2020 · da42e1e
2 parents 8ca896c + 64b0c1d
commit da42e1e

Showing 15 changed files with 1,222 additions and 991 deletions.
diff --git a/docs/source/advanced-usage.rst b/docs/source/advanced-usage.rst
diff --git a/docs/source/index.rst b/docs/source/index.rst
@@ -13,7 +13,6 @@ pandagg
 
    introduction
    user-guide
-   advanced-usage
    Tutorial dataset <IMDB>
    API reference <reference/pandagg>
    Contributing <CONTRIBUTING>
@@ -43,8 +42,7 @@ Alternatively, you can grab the latest source code from `GitHub <https://github.
 Usage
 *****
 
-The :doc:`user-guide` is the place to go to learn how to use the library and
-accomplish common tasks. The more in-depth :doc:`advanced-usage` guide is the place to go for deeply nested queries.
+The :doc:`user-guide` is the place to go to learn how to use the library.
 
 An example based on publicly available IMDB data is documented in the repository's `examples/imdb` directory, with
 a Jupyter notebook showcasing some of `pandagg`'s functionality: `here it is <https://gistpreview.github.io/?4cedcfe49660cd6757b94ba491abb95a>`_.

diff --git a/docs/source/introduction.rst b/docs/source/introduction.rst
@@ -2,10 +2,6 @@
 Principles
 ##########
 
-.. note::
-
-    This is a work in progress. Some sections still need to be furnished.
-
 
 This library focuses on two principles:
 

diff --git a/docs/source/user-guide.aggs.rst b/docs/source/user-guide.aggs.rst
@@ -0,0 +1,122 @@
+***********
+Aggregation
+***********
+
+The :class:`~pandagg.tree.aggs.aggs.Aggs` class provides:
+
+- multiple syntaxes to declare and update an aggregation
+- aggregation clause validation
+- the ability to insert clauses at specific locations (and not just below the last manipulated clause)
+
+
+Declaration
+===========
+
+From native "dict" query
+------------------------
+
+Given the following aggregation:
+
+    >>> expected_aggs = {
+    ...   "decade": {
+    ...     "histogram": {"field": "year", "interval": 10},
+    ...     "aggs": {
+    ...       "genres": {
+    ...         "terms": {"field": "genres", "size": 3},
+    ...         "aggs": {
+    ...           "max_nb_roles": {
+    ...             "max": {"field": "nb_roles"}
+    ...           },
+    ...           "avg_rank": {
+    ...             "avg": {"field": "rank"}
+    ...           }
+    ...         }
+    ...       }
+    ...     }
+    ...   }
+    ... }
+
+To declare :class:`~pandagg.tree.aggs.aggs.Aggs`, simply pass the "dict" query as an argument:
+
+    >>> from pandagg.aggs import Aggs
+    >>> a = Aggs(expected_aggs)
+
+A visual representation of the query is available with :func:`~pandagg.tree.aggs.aggs.Aggs.show`:
+
+    >>> a.show()
+    <Aggregations>
+    decade                                         <histogram, field="year", interval=10>
+    └── genres                                            <terms, field="genres", size=3>
+        ├── max_nb_roles                                          <max, field="nb_roles">
+        └── avg_rank                                                  <avg, field="rank">
+
+
+Call :func:`~pandagg.tree.aggs.aggs.Aggs.to_dict` to convert it to a native dict:
+
+    >>> a.to_dict() == expected_aggs
+    True
+
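To make the structure of such a native "dict" query concrete, the sketch below walks it in plain Python and lists each clause with its type and parameters, indented by nesting depth. It is only an illustration of how the nested ``aggs`` keys map to a tree. It does not use pandagg and is not how :func:`~pandagg.tree.aggs.aggs.Aggs.show` is implemented; the helper names ``iter_clauses`` and ``render`` are hypothetical.

```python
# Plain-Python sketch (no pandagg involved): list the clauses of a native
# "dict" aggregation as an indented tree. `iter_clauses` and `render` are
# hypothetical helper names chosen for this illustration.

RESERVED_KEYS = ("aggs", "aggregations", "meta")


def iter_clauses(aggs, depth=0):
    """Yield (depth, name, agg_type, body) for each clause, parents first."""
    for name, clause in aggs.items():
        # Sub-aggregations live under "aggs" (or the long form "aggregations").
        children = clause.get("aggs") or clause.get("aggregations") or {}
        # The remaining key is the aggregation type, mapped to its body.
        agg_type, body = next(
            (key, value) for key, value in clause.items() if key not in RESERVED_KEYS
        )
        yield depth, name, agg_type, body
        yield from iter_clauses(children, depth + 1)


def render(aggs):
    """Return one line per clause, indented by four spaces per nesting level."""
    lines = []
    for depth, name, agg_type, body in iter_clauses(aggs):
        params = ", ".join(f"{key}={value!r}" for key, value in body.items())
        lines.append(f"{'    ' * depth}{name} <{agg_type}, {params}>")
    return "\n".join(lines)


expected_aggs = {
    "decade": {
        "histogram": {"field": "year", "interval": 10},
        "aggs": {
            "genres": {
                "terms": {"field": "genres", "size": 3},
                "aggs": {
                    "max_nb_roles": {"max": {"field": "nb_roles"}},
                    "avg_rank": {"avg": {"field": "rank"}},
                },
            }
        },
    }
}

print(render(expected_aggs))
```

The traversal order (parent first, then children) is the same order in which the clauses appear in the tree representation shown above.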
+With DSL classes
+----------------
+
+Pandagg provides a DSL to declare this query in a similar fashion:
+
+    >>> from pandagg.aggs import Histogram, Terms, Max, Avg
+    >>>
+    >>> a = Histogram("decade", field="year", interval=10, aggs=[
+    ...     Terms("genres", field="genres", size=3, aggs=[
+    ...         Max("max_nb_roles", field="nb_roles"),
+    ...         Avg("avg_rank", field="rank")
+    ...     ]),
+    ... ])
+
+All these classes inherit from :class:`~pandagg.tree.aggs.aggs.Aggs` and thus provide the same interface:
+
+    >>> from pandagg.aggs import Aggs
+    >>> isinstance(a, Aggs)
+    True
+
+With flattened syntax
+---------------------
+
+In the flattened syntax, the first argument is the aggregation name, the second is the aggregation type, and the
+remaining keyword arguments define the aggregation body:
+
+    >>> from pandagg.aggs import Aggs
+    >>> a = Aggs('genres', 'terms', field='genres', size=3)
+    >>> a.to_dict()
+    {'genres': {'terms': {'field': 'genres', 'size': 3}}}
+
+
+Aggregations enrichment
+=======================
+
+Aggregations can be enriched using two methods:
+
+- :func:`~pandagg.tree.aggs.aggs.Aggs.aggs`
+- :func:`~pandagg.tree.aggs.aggs.Aggs.groupby`
+
+Both methods return a new :class:`~pandagg.tree.aggs.aggs.Aggs` instance and leave the initial aggregation unchanged.
+
+For instance:
+
+    >>> from pandagg.aggs import Aggs
+    >>> initial_a = Aggs()
+    >>> enriched_a = initial_a.aggs('genres_agg', 'terms', field='genres')
+
+    >>> initial_a.to_dict()
+    None
+
+    >>> enriched_a.to_dict()
+    {'genres_agg': {'terms': {'field': 'genres'}}}
+
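The "return a new instance, leave the original untouched" behaviour described above can be sketched with a small stand-in class. ``MiniAggs`` is hypothetical and is not pandagg's implementation: it stores clauses in a plain dict, only models a flat ``aggs`` call (no ``groupby``, no nesting, no validation), and simply copies its state before modifying it.

```python
import copy


class MiniAggs:
    """Hypothetical stand-in for illustration only, not pandagg's Aggs."""

    def __init__(self, clauses=None):
        # Copy the input so instances never share mutable state.
        self._clauses = copy.deepcopy(clauses) if clauses else {}

    def aggs(self, name, agg_type, **body):
        # Build the enriched aggregation on a copy; `self` is never modified.
        new = MiniAggs(self._clauses)
        new._clauses[name] = {agg_type: body}
        return new

    def to_dict(self):
        # Mirror the documented behaviour: an empty aggregation gives None.
        return copy.deepcopy(self._clauses) or None


initial = MiniAggs()
enriched = initial.aggs("genres_agg", "terms", field="genres")

print(initial.to_dict())   # the initial instance is still empty
print(enriched.to_dict())  # only the returned instance holds the new clause
```

Because every call returns a fresh object, intermediate aggregations can be kept and reused as starting points for several variants without one affecting the other.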
+.. note::
+
+    Calling :func:`~pandagg.tree.aggs.aggs.Aggs.to_dict` on an empty aggregation returns `None`:
+
+        >>> from pandagg.aggs import Aggs
+        >>> Aggs().to_dict()
+        None
+
+
+TODO