Fix: Parsing performance and filter bugs (#14)

* Added more local nox sessions for docs * jupyter is now a private module * Fix & clarify required filters in brandscape_data * add docs and release badges to readme * remove old jupyter module * reduce dict access when determining rate limits * speed up parsing by 4x * formatting * update release notes and roadmap * bump version * formatting * release date
wppbav · Sep 29, 2023 · ebe0f1d · ebe0f1d
1 parent 94927b9
commit ebe0f1d
Show file tree

Hide file tree

Showing 18 changed files with 154 additions and 101 deletions.
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
@@ -58,7 +58,7 @@ Please familiarize yourself with using these libraries in order to get started w
 
 It is highly recommended that you use [`mamba`](https://mamba.readthedocs.io/en/latest/) to manage Python environments in Windows. It is a faster implementation of `conda` and testing of `bavapi` on multiple versions of Python is set up to use `mamba` on Windows.
 
-Once you have `mamba` installed in your system, you should be able to run `nox` commands.
+Once you have `mamba` installed in your system, and you have set up an environment with `nox`, you should be able to run the `nox` commands for `bavapi`.
 
 ## Nox commands
 
@@ -78,14 +78,16 @@ To see a list of all available `nox` commands, run `nox -l` in your terminal. He
 - `coverage`: combine and report coverage results.
 - `build`: build distributable files for `bavapi`. Suitable for CI/CD pipeline.
 - `docs_deploy`: publish `bavapi` documentation to GitHub Pages. Suitable for local development and CI/CD pipelines.
+- `docs_serve`: serve `bavapi` documentation with `mike`. Suitable for local development.
+- `docs_build_and_serve`: run both `docs_deploy` and `docs_serve`. Suitable for local development.
 
 ## Code Style Guidelines
 
 `bavapi` supports Python 3.8 and above, so your code should be able to run in all the latest versions of python.
 
 1. Run `black bavapi` after writing or modifying code. This way the code style of the whole project will remain consistent.
 2. Run `mypy bavapi` after writing or modifying code to make sure type hints are correctly defined.
-3. Fully test your code using `pytest` and make sure you covered all your changes in the repository by running `nox -s tests_mamba`, `nox -s tests_mamba_e2e` and `nox -s coverage`.
+3. Fully test your code using `pytest` and make sure you covered all your changes in the repository by running all `tests_mamba`, `tests_mamba_e2e` and `coverage` sessions from `nox`.
 
 ## Documentation
 

diff --git a/README.md b/README.md
@@ -2,6 +2,8 @@
 
 [![CI status](https://github.com/wppbav/bavapi-sdk-python/actions/workflows/ci.yml/badge.svg)](https://github.com/wppbav/bavapi-sdk-python/actions/workflows/ci.yml)
 [![coverage](https://img.shields.io/endpoint?url=https://gist.githubusercontent.com/nachomaiz/32196acdc05431cd2bc7a8c73a587a8d/raw/covbadge.json)](https://github.com/wppbav/bavapi-sdk-python/actions/workflows/ci.yml)
+[![release status](https://github.com/wppbav/bavapi-sdk-python/actions/workflows/release.yml/badge.svg)](https://github.com/wppbav/bavapi-sdk-python/actions/workflows/release.yml)
+[![docs status](https://github.com/wppbav/bavapi-sdk-python/actions/workflows/docs.yml/badge.svg)](https://github.com/wppbav/bavapi-sdk-python/actions/workflows/docs.yml)
 [![PyPI](https://img.shields.io/pypi/v/wpp-bavapi)](https://pypi.org/project/wpp-bavapi/)
 [![PyPI - Python Version](https://img.shields.io/pypi/pyversions/wpp-bavapi)](https://pypi.org/project/wpp-bavapi/)
 

diff --git a/bavapi/jupyter.py → bavapi/_jupyter.py b/bavapi/jupyter.py → bavapi/_jupyter.py
@@ -12,8 +12,6 @@
 if TYPE_CHECKING:
     from asyncio import AbstractEventLoop
 
-__all__ = ("running_in_jupyter", "patch_loop")
-
 
 def running_in_jupyter() -> bool:
     """

diff --git a/bavapi/client.py b/bavapi/client.py
@@ -392,15 +392,18 @@ async def brandscape_data(
 
         This endpoint requires at least one of the following combinations of filters:
 
-        - `studies`
-        - `brand_name`
-        - `brands`
-        - `year_number`/`years` and `brands`/`brand_name`
-        - `country_code`/`countries` and `brands`/`brand_name`
-        - `year_number`/`years` and `country_code`/`countries`
+        - Study + Audience + Brand + Category
+        - Country + Year + Audience
+        - Brand + Audience + Country + Year
+
+        You should read these from left to right. A combination of "Study + Audience"
+        worksjust as well as "Study + Audience + Brand".
+        However, "Category + Audience" will not.
+
+        If you use Country or Year filters, you must use both filters together.
 
         An audience filter is also highly recommended, as otherwise the API will return
-        data for all audiences (there are more than 30 standard audiences).
+        data for all audiences (there are more than 100 standard audiences).
 
         The `Audiences` class is provided to make it easier to filter audiences.
 

diff --git a/bavapi/filters.py b/bavapi/filters.py
@@ -181,10 +181,13 @@ class BrandscapeFilters(FountFilters):
 
     The `brandscape-data` endpoint requires the use of, at minimum, these filters:
 
-    - `studies`
-    - `brand_name` or `brands`
-    - `country_code` or `countries` and `brands` or `brand_name`
-    - `year_number` or `years` and `country_code` or `countries`
+    - Study + Audience + Brand + Category
+    - Country + Year + Audience
+    - Brand + Audience + Country + Year
+
+    You should read these from left to right. A combination of "Study + Audience"
+    worksjust as well as "Study + Audience + Brand".
+    However, "Category + Audience" will not.
 
     An audience filter is also highly recommended, as otherwise the API will return
     data for all audiences (there are more than 30 standard audiences).
@@ -236,21 +239,34 @@ class BrandscapeFilters(FountFilters):
     @model_validator(mode="before")
     @classmethod
     def _check_params(cls, values: Dict[str, object]) -> Dict[str, object]:
+        using_country_year = bool(
+            set(values).intersection(
+                ("country_code", "countries", "year_number", "years")
+            )
+        )
+
         if not (
             "brands" in values
             or "brand_name" in values
             or "studies" in values
-            or (
-                ("country_code" in values or "countries" in values)
-                and ("year_number" in values or "years" in values)
-            )
+            or using_country_year
         ):
             raise ValueError(
                 "You need to apply either the `brands`, or `studies`, or `brand_name` "
                 "filters, or the `country_code`/`countries` "
                 "and `year_number`/`years` filters together."
             )
 
+        if using_country_year:
+            if not (
+                ("country_code" in values or "countries" in values)
+                and ("year_number" in values or "years" in values)
+            ):
+                raise ValueError(
+                    "`country_code`/`countries` and `year_number`/`years` "
+                    "filters must be used together."
+                )
+
         return values
 
 

diff --git a/bavapi/http.py b/bavapi/http.py
@@ -242,10 +242,10 @@ async def query(self, endpoint: str, params: Query) -> Iterator[JSONDict]:
             (total) / (params.per_page or self.per_page)
         )
 
-        if n_pages > int(resp.headers["x-ratelimit-remaining"]):
+        if n_pages > (limit_remaining := int(resp.headers["x-ratelimit-remaining"])):
             raise RateLimitExceededError(
                 f"Number of pages ({n_pages}) for this request "
-                f"exceeds the rate limit ({resp.headers['x-ratelimit-remaining']}, "
+                f"exceeds the rate limit ({limit_remaining}, "
                 f"total={resp.headers['x-ratelimit-limit']})."
             )
 

diff --git a/bavapi/parsing/responses.py b/bavapi/parsing/responses.py
@@ -16,20 +16,19 @@
 
 T = TypeVar("T")
 
+Entry = Mapping[str, Union[T, Dict[str, T]]]
+
 
 def flatten_mapping(
-    mapping: Mapping[str, Union[T, Mapping[str, T]]],
-    parent: str = "",
-    sep: str = "_",
-    prefix: str = "",
+    mapping: Entry[T], parent: str = "", sep: str = "_", prefix: str = ""
 ) -> Dict[str, T]:
     """Recursively flattens all nested dictionaries into top level key-value pairs.
 
     Include prefixes or suffixes if nested keys clash with top-level keys.
 
     Parameters
     ----------
-    mapping : dict[str, Any] or mapping
+    mapping : dict[str, Any]
         Dictionary with potential nested dictionaries
     parent : str
         Parent key for generating children keys, by default ""
@@ -45,29 +44,36 @@ def flatten_mapping(
     dict[str, Any]
         Flattened dictionary.
     """
-    new: Dict[str, T] = {}
-
+    res: Dict[str, T] = {}
+    to_expand: Dict[str, Dict[str, T]] = {}
     for key, value in mapping.items():
         if parent:
             key = f"{parent}{sep}{key}"
 
-        if isinstance(value, Mapping):
-            if prefix and any(
-                k.startswith(key)
-                for k, v in mapping.items()
-                if not isinstance(v, Mapping)
-            ):
-                key = f"{prefix}{sep}{key}"
-
-            new = {**new, **flatten_mapping(value, key, sep, prefix)}
+        if isinstance(value, dict):
+            to_expand[key] = value
         else:
-            new[key] = value
+            res[key] = value
+
+    with_prefix: set[str] = (
+        {k for k in to_expand if any(_k.startswith(k) for _k in res)}
+        if prefix
+        else set()
+    )
+
+    expanded: Dict[str, T] = {}
+    for key, value in to_expand.items():
+        if key in with_prefix:
+            key = f"{prefix}{sep}{key}"
+
+        expanded.update(flatten_mapping(value, key, sep, prefix))
 
-    return new
+    res.update(expanded)
+    return res
 
 
 def flatten(
-    mapping: Mapping[str, Union[T, Mapping[str, T]]],
+    mapping: Entry[T],
     parent: str = "",
     sep: str = "_",
     prefix: str = "",
@@ -88,7 +94,7 @@ def flatten(
 
     Parameters
     ----------
-    mapping : dict[str, Any] or mapping
+    mapping : dict[str, Any]
         Dictionary with potential nested dictionaries and lists of dictionaries
     parent : str
         Parent key for generating children keys, by default ""
@@ -118,15 +124,14 @@ def flatten(
         yield new
     else:
         for key in expand_keys:
-            values = cast(List[Mapping[str, T]], new[key])
-            print(key)
+            values = cast(List[Dict[str, T]], new[key])
             for val in values:
                 for i in flatten(val, key, sep, prefix, expand):
                     yield {**{k: v for k, v in new.items() if k != key}, **i}
 
 
 def parse_response(
-    page: Iterable[Mapping[str, object]],
+    page: Iterable[Entry[T]],
     prefix: str = "",
     index: Optional[str] = None,
     expand: bool = False,
@@ -135,7 +140,7 @@ def parse_response(
 
     Parameters
     ----------
-    page : iterable of dicts with keys of type str
+    page : Iterable[dict[str, Any]]
         Page from API response.
     prefix : str, optional
         Prefix to prepend to columns with clashing names, by default `""`

diff --git a/bavapi/query.py b/bavapi/query.py
@@ -123,7 +123,7 @@ def with_page(self, page: int, per_page: int) -> "Query[F]":
             return self
 
         return self.__class__.model_construct(
-            self.model_fields_set.union(   # pylint: disable=no-member
+            self.model_fields_set.union(  # pylint: disable=no-member
                 {"page", "per_page"}
             ),
             page=self.page or page,

diff --git a/bavapi/sync.py b/bavapi/sync.py
@@ -13,20 +13,11 @@
 import asyncio
 import functools
 import sys
-from typing import (
-    TYPE_CHECKING,
-    Any,
-    Callable,
-    Coroutine,
-    List,
-    Literal,
-    Optional,
-    TypeVar,
-)
+from typing import TYPE_CHECKING, Awaitable, Callable, List, Literal, Optional, TypeVar
 
 from bavapi import filters as _filters
+from bavapi._jupyter import patch_loop, running_in_jupyter
 from bavapi.client import Client, OptionalFiltersOrMapping
-from bavapi.jupyter import patch_loop, running_in_jupyter
 from bavapi.query import Query
 from bavapi.typing import BaseListOrValues, JSONDict, OptionalListOr
 
@@ -45,7 +36,7 @@
 F = TypeVar("F", bound=_filters.FountFilters)
 
 
-def _coro(func: Callable[P, Coroutine[Any, Any, T]]) -> Callable[P, T]:
+def _coro(func: Callable[P, Awaitable[T]]) -> Callable[P, T]:
     @functools.wraps(func)
     def wrapper(*args: P.args, **kwargs: P.kwargs):
         try:
@@ -298,13 +289,18 @@ async def brandscape_data(
 
     This endpoint requires at least one of the following combinations of parameters:
 
-    - `studies`
-    - `brand_name` or `brands`
-    - `country_code` or `countries` and `brands` or `brand_name`
-    - `year_number` or `years` and `country_code` or `countries`
+    - Study + Audience + Brand + Category
+    - Country + Year + Audience
+    - Brand + Audience + Country + Year
+
+    You should read these from left to right. A combination of "Study + Audience"
+    worksjust as well as "Study + Audience + Brand".
+    However, "Category + Audience" will not.
+
+    If you use Country or Year filters, you must use both filters together.
 
     An audience filter is also highly recommended, as otherwise the API will return
-    data for all audiences (there are more than 30 standard audiences).
+    data for all audiences (there are more than 100 standard audiences).
 
     The `Audiences` class is provided to make it easier to filter audiences.
 

diff --git a/docs/endpoints/brandscape-data.md b/docs/endpoints/brandscape-data.md
@@ -43,11 +43,13 @@ For more information on available filters and functionality, see the Fount docum
 
 Thus, the `brandscape-data` endpoint has been restricted to require at least one of these specific set of filters:
 
-- `studies`
-- `brand_name`/`brands`
-- `year_number`/`years` and `brands`/`brand_name`
-- `country_code`/`countries` and `brands`/`brand_name`
-- `year_number`/`years` and `country_code`/`countries`
+- Study + Audience + Brand + Category
+- Country + Year + Audience
+- Brand + Audience + Country + Year
+
+You should read these from left to right. A combination of "Study + Audience" works just as well as "Study + Audience + Brand". However, "Category + Audience" will not.
+
+If you use Country or Year filters, you must use both filters together.
 
 If a query does not have any of these combinations of filters, it will raise a `ValidationError`:
 
@@ -58,7 +60,7 @@ bavapi.brandscape_data("TOKEN", year_number=2022) # Error, not enough filters
 
 bavapi.brandscape_data("TOKEN", brand_name="Facebook") # OK
 
-bavapi.brandscape_data("TOKEN", country_code="UK", brands=123)  # OK
+bavapi.brandscape_data("TOKEN", audience=22, brands=123)  # OK
 ```
 
 ## Default includes
@@ -70,15 +72,14 @@ If you add any of these values in the `include` field by themselves, the default
 If, on the other hand, you request an `include` that is *not* part of the default values, `bavapi` will append that new value to the default `include` values.
 
 ```py
-# All default includes will be requested
+# All default (study, brand, category and audience) includes will be requested
 bavapi.brandscape_data("TOKEN", brand_name="Facebook")
 
 # Only the "brand" include will be requested
 bavapi.brandscape_data("TOKEN", brand_name="Facebook", include="brand")
 
-# The "company" include will be appended to the default "include" values
+# The "company" include will be appended to the default `include` values
 bavapi.brandscape_data("TOKEN", brand_name="Facebook", include="company")
-
 ```
 
 ## Clashing column names
@@ -106,7 +107,7 @@ You can specify the metrics that your response should contain, and the API will
     - `relevance_c`
     - `relevance_rank`
     - Brand information such as `id`, `brand_name`, and `category_name`
-    - Any additional columns from the `include` parameter
+    - Any additional columns from the `include` parameter (see default behavior [above](#default-includes))
 
     ```py
     bavapi.brandscape_data(studies=111, metric_keys="differentiation") # (1)

diff --git a/docs/index.md b/docs/index.md
@@ -2,6 +2,8 @@
 
 [![CI status](https://github.com/wppbav/bavapi-sdk-python/actions/workflows/ci.yml/badge.svg)](https://github.com/wppbav/bavapi-sdk-python/actions/workflows/ci.yml)
 [![coverage](https://img.shields.io/endpoint?url=https://gist.githubusercontent.com/nachomaiz/32196acdc05431cd2bc7a8c73a587a8d/raw/covbadge.json)](https://github.com/wppbav/bavapi-sdk-python/actions/workflows/ci.yml)
+[![release status](https://github.com/wppbav/bavapi-sdk-python/actions/workflows/release.yml/badge.svg)](https://github.com/wppbav/bavapi-sdk-python/actions/workflows/release.yml)
+[![docs status](https://github.com/wppbav/bavapi-sdk-python/actions/workflows/docs.yml/badge.svg)](https://github.com/wppbav/bavapi-sdk-python/actions/workflows/docs.yml)
 [![PyPI](https://img.shields.io/pypi/v/wpp-bavapi)](https://pypi.org/project/wpp-bavapi/)
 [![PyPI - Python Version](https://img.shields.io/pypi/pyversions/wpp-bavapi)
 ](https://pypi.org/project/wpp-bavapi/)