Skip to content

Commit

Permalink
Fix: Parsing performance and filter bugs (#14)
Browse files Browse the repository at this point in the history
* Added more local nox sessions for docs

* jupyter is now a private module

* Fix & clarify required filters in brandscape_data

* add docs and release badges to readme

* remove old jupyter module

* reduce dict access when determining rate limits

* speed up parsing by 4x

* formatting

* update release notes and roadmap

* bump version

* formatting

* release date
  • Loading branch information
Nacho Maiz authored Sep 29, 2023
1 parent 94927b9 commit ebe0f1d
Show file tree
Hide file tree
Showing 18 changed files with 154 additions and 101 deletions.
6 changes: 4 additions & 2 deletions CONTRIBUTING.md
Original file line number Diff line number Diff line change
Expand Up @@ -58,7 +58,7 @@ Please familiarize yourself with using these libraries in order to get started w

It is highly recommended that you use [`mamba`](https://mamba.readthedocs.io/en/latest/) to manage Python environments in Windows. It is a faster implementation of `conda` and testing of `bavapi` on multiple versions of Python is set up to use `mamba` on Windows.

Once you have `mamba` installed in your system, you should be able to run `nox` commands.
Once you have `mamba` installed in your system, and you have set up an environment with `nox`, you should be able to run the `nox` commands for `bavapi`.

## Nox commands

Expand All @@ -78,14 +78,16 @@ To see a list of all available `nox` commands, run `nox -l` in your terminal. He
- `coverage`: combine and report coverage results.
- `build`: build distributable files for `bavapi`. Suitable for CI/CD pipeline.
- `docs_deploy`: publish `bavapi` documentation to GitHub Pages. Suitable for local development and CI/CD pipelines.
- `docs_serve`: serve `bavapi` documentation with `mike`. Suitable for local development.
- `docs_build_and_serve`: run both `docs_deploy` and `docs_serve`. Suitable for local development.

## Code Style Guidelines

`bavapi` supports Python 3.8 and above, so your code should be able to run in all the latest versions of python.

1. Run `black bavapi` after writing or modifying code. This way the code style of the whole project will remain consistent.
2. Run `mypy bavapi` after writing or modifying code to make sure type hints are correctly defined.
3. Fully test your code using `pytest` and make sure you covered all your changes in the repository by running `nox -s tests_mamba`, `nox -s tests_mamba_e2e` and `nox -s coverage`.
3. Fully test your code using `pytest` and make sure you covered all your changes in the repository by running all `tests_mamba`, `tests_mamba_e2e` and `coverage` sessions from `nox`.

## Documentation

Expand Down
2 changes: 2 additions & 0 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,8 @@

[![CI status](https://github.com/wppbav/bavapi-sdk-python/actions/workflows/ci.yml/badge.svg)](https://github.com/wppbav/bavapi-sdk-python/actions/workflows/ci.yml)
[![coverage](https://img.shields.io/endpoint?url=https://gist.githubusercontent.com/nachomaiz/32196acdc05431cd2bc7a8c73a587a8d/raw/covbadge.json)](https://github.com/wppbav/bavapi-sdk-python/actions/workflows/ci.yml)
[![release status](https://github.com/wppbav/bavapi-sdk-python/actions/workflows/release.yml/badge.svg)](https://github.com/wppbav/bavapi-sdk-python/actions/workflows/release.yml)
[![docs status](https://github.com/wppbav/bavapi-sdk-python/actions/workflows/docs.yml/badge.svg)](https://github.com/wppbav/bavapi-sdk-python/actions/workflows/docs.yml)
[![PyPI](https://img.shields.io/pypi/v/wpp-bavapi)](https://pypi.org/project/wpp-bavapi/)
[![PyPI - Python Version](https://img.shields.io/pypi/pyversions/wpp-bavapi)](https://pypi.org/project/wpp-bavapi/)

Expand Down
2 changes: 0 additions & 2 deletions bavapi/jupyter.py → bavapi/_jupyter.py
Original file line number Diff line number Diff line change
Expand Up @@ -12,8 +12,6 @@
if TYPE_CHECKING:
from asyncio import AbstractEventLoop

__all__ = ("running_in_jupyter", "patch_loop")


def running_in_jupyter() -> bool:
"""
Expand Down
17 changes: 10 additions & 7 deletions bavapi/client.py
Original file line number Diff line number Diff line change
Expand Up @@ -392,15 +392,18 @@ async def brandscape_data(
This endpoint requires at least one of the following combinations of filters:
- `studies`
- `brand_name`
- `brands`
- `year_number`/`years` and `brands`/`brand_name`
- `country_code`/`countries` and `brands`/`brand_name`
- `year_number`/`years` and `country_code`/`countries`
- Study + Audience + Brand + Category
- Country + Year + Audience
- Brand + Audience + Country + Year
You should read these from left to right. A combination of "Study + Audience"
worksjust as well as "Study + Audience + Brand".
However, "Category + Audience" will not.
If you use Country or Year filters, you must use both filters together.
An audience filter is also highly recommended, as otherwise the API will return
data for all audiences (there are more than 30 standard audiences).
data for all audiences (there are more than 100 standard audiences).
The `Audiences` class is provided to make it easier to filter audiences.
Expand Down
32 changes: 24 additions & 8 deletions bavapi/filters.py
Original file line number Diff line number Diff line change
Expand Up @@ -181,10 +181,13 @@ class BrandscapeFilters(FountFilters):
The `brandscape-data` endpoint requires the use of, at minimum, these filters:
- `studies`
- `brand_name` or `brands`
- `country_code` or `countries` and `brands` or `brand_name`
- `year_number` or `years` and `country_code` or `countries`
- Study + Audience + Brand + Category
- Country + Year + Audience
- Brand + Audience + Country + Year
You should read these from left to right. A combination of "Study + Audience"
worksjust as well as "Study + Audience + Brand".
However, "Category + Audience" will not.
An audience filter is also highly recommended, as otherwise the API will return
data for all audiences (there are more than 30 standard audiences).
Expand Down Expand Up @@ -236,21 +239,34 @@ class BrandscapeFilters(FountFilters):
@model_validator(mode="before")
@classmethod
def _check_params(cls, values: Dict[str, object]) -> Dict[str, object]:
using_country_year = bool(
set(values).intersection(
("country_code", "countries", "year_number", "years")
)
)

if not (
"brands" in values
or "brand_name" in values
or "studies" in values
or (
("country_code" in values or "countries" in values)
and ("year_number" in values or "years" in values)
)
or using_country_year
):
raise ValueError(
"You need to apply either the `brands`, or `studies`, or `brand_name` "
"filters, or the `country_code`/`countries` "
"and `year_number`/`years` filters together."
)

if using_country_year:
if not (
("country_code" in values or "countries" in values)
and ("year_number" in values or "years" in values)
):
raise ValueError(
"`country_code`/`countries` and `year_number`/`years` "
"filters must be used together."
)

return values


Expand Down
4 changes: 2 additions & 2 deletions bavapi/http.py
Original file line number Diff line number Diff line change
Expand Up @@ -242,10 +242,10 @@ async def query(self, endpoint: str, params: Query) -> Iterator[JSONDict]:
(total) / (params.per_page or self.per_page)
)

if n_pages > int(resp.headers["x-ratelimit-remaining"]):
if n_pages > (limit_remaining := int(resp.headers["x-ratelimit-remaining"])):
raise RateLimitExceededError(
f"Number of pages ({n_pages}) for this request "
f"exceeds the rate limit ({resp.headers['x-ratelimit-remaining']}, "
f"exceeds the rate limit ({limit_remaining}, "
f"total={resp.headers['x-ratelimit-limit']})."
)

Expand Down
53 changes: 29 additions & 24 deletions bavapi/parsing/responses.py
Original file line number Diff line number Diff line change
Expand Up @@ -16,20 +16,19 @@

T = TypeVar("T")

Entry = Mapping[str, Union[T, Dict[str, T]]]


def flatten_mapping(
mapping: Mapping[str, Union[T, Mapping[str, T]]],
parent: str = "",
sep: str = "_",
prefix: str = "",
mapping: Entry[T], parent: str = "", sep: str = "_", prefix: str = ""
) -> Dict[str, T]:
"""Recursively flattens all nested dictionaries into top level key-value pairs.
Include prefixes or suffixes if nested keys clash with top-level keys.
Parameters
----------
mapping : dict[str, Any] or mapping
mapping : dict[str, Any]
Dictionary with potential nested dictionaries
parent : str
Parent key for generating children keys, by default ""
Expand All @@ -45,29 +44,36 @@ def flatten_mapping(
dict[str, Any]
Flattened dictionary.
"""
new: Dict[str, T] = {}

res: Dict[str, T] = {}
to_expand: Dict[str, Dict[str, T]] = {}
for key, value in mapping.items():
if parent:
key = f"{parent}{sep}{key}"

if isinstance(value, Mapping):
if prefix and any(
k.startswith(key)
for k, v in mapping.items()
if not isinstance(v, Mapping)
):
key = f"{prefix}{sep}{key}"

new = {**new, **flatten_mapping(value, key, sep, prefix)}
if isinstance(value, dict):
to_expand[key] = value
else:
new[key] = value
res[key] = value

with_prefix: set[str] = (
{k for k in to_expand if any(_k.startswith(k) for _k in res)}
if prefix
else set()
)

expanded: Dict[str, T] = {}
for key, value in to_expand.items():
if key in with_prefix:
key = f"{prefix}{sep}{key}"

expanded.update(flatten_mapping(value, key, sep, prefix))

return new
res.update(expanded)
return res


def flatten(
mapping: Mapping[str, Union[T, Mapping[str, T]]],
mapping: Entry[T],
parent: str = "",
sep: str = "_",
prefix: str = "",
Expand All @@ -88,7 +94,7 @@ def flatten(
Parameters
----------
mapping : dict[str, Any] or mapping
mapping : dict[str, Any]
Dictionary with potential nested dictionaries and lists of dictionaries
parent : str
Parent key for generating children keys, by default ""
Expand Down Expand Up @@ -118,15 +124,14 @@ def flatten(
yield new
else:
for key in expand_keys:
values = cast(List[Mapping[str, T]], new[key])
print(key)
values = cast(List[Dict[str, T]], new[key])
for val in values:
for i in flatten(val, key, sep, prefix, expand):
yield {**{k: v for k, v in new.items() if k != key}, **i}


def parse_response(
page: Iterable[Mapping[str, object]],
page: Iterable[Entry[T]],
prefix: str = "",
index: Optional[str] = None,
expand: bool = False,
Expand All @@ -135,7 +140,7 @@ def parse_response(
Parameters
----------
page : iterable of dicts with keys of type str
page : Iterable[dict[str, Any]]
Page from API response.
prefix : str, optional
Prefix to prepend to columns with clashing names, by default `""`
Expand Down
2 changes: 1 addition & 1 deletion bavapi/query.py
Original file line number Diff line number Diff line change
Expand Up @@ -123,7 +123,7 @@ def with_page(self, page: int, per_page: int) -> "Query[F]":
return self

return self.__class__.model_construct(
self.model_fields_set.union( # pylint: disable=no-member
self.model_fields_set.union( # pylint: disable=no-member
{"page", "per_page"}
),
page=self.page or page,
Expand Down
30 changes: 13 additions & 17 deletions bavapi/sync.py
Original file line number Diff line number Diff line change
Expand Up @@ -13,20 +13,11 @@
import asyncio
import functools
import sys
from typing import (
TYPE_CHECKING,
Any,
Callable,
Coroutine,
List,
Literal,
Optional,
TypeVar,
)
from typing import TYPE_CHECKING, Awaitable, Callable, List, Literal, Optional, TypeVar

from bavapi import filters as _filters
from bavapi._jupyter import patch_loop, running_in_jupyter
from bavapi.client import Client, OptionalFiltersOrMapping
from bavapi.jupyter import patch_loop, running_in_jupyter
from bavapi.query import Query
from bavapi.typing import BaseListOrValues, JSONDict, OptionalListOr

Expand All @@ -45,7 +36,7 @@
F = TypeVar("F", bound=_filters.FountFilters)


def _coro(func: Callable[P, Coroutine[Any, Any, T]]) -> Callable[P, T]:
def _coro(func: Callable[P, Awaitable[T]]) -> Callable[P, T]:
@functools.wraps(func)
def wrapper(*args: P.args, **kwargs: P.kwargs):
try:
Expand Down Expand Up @@ -298,13 +289,18 @@ async def brandscape_data(
This endpoint requires at least one of the following combinations of parameters:
- `studies`
- `brand_name` or `brands`
- `country_code` or `countries` and `brands` or `brand_name`
- `year_number` or `years` and `country_code` or `countries`
- Study + Audience + Brand + Category
- Country + Year + Audience
- Brand + Audience + Country + Year
You should read these from left to right. A combination of "Study + Audience"
worksjust as well as "Study + Audience + Brand".
However, "Category + Audience" will not.
If you use Country or Year filters, you must use both filters together.
An audience filter is also highly recommended, as otherwise the API will return
data for all audiences (there are more than 30 standard audiences).
data for all audiences (there are more than 100 standard audiences).
The `Audiences` class is provided to make it easier to filter audiences.
Expand Down
21 changes: 11 additions & 10 deletions docs/endpoints/brandscape-data.md
Original file line number Diff line number Diff line change
Expand Up @@ -43,11 +43,13 @@ For more information on available filters and functionality, see the Fount docum

Thus, the `brandscape-data` endpoint has been restricted to require at least one of these specific set of filters:

- `studies`
- `brand_name`/`brands`
- `year_number`/`years` and `brands`/`brand_name`
- `country_code`/`countries` and `brands`/`brand_name`
- `year_number`/`years` and `country_code`/`countries`
- Study + Audience + Brand + Category
- Country + Year + Audience
- Brand + Audience + Country + Year

You should read these from left to right. A combination of "Study + Audience" works just as well as "Study + Audience + Brand". However, "Category + Audience" will not.

If you use Country or Year filters, you must use both filters together.

If a query does not have any of these combinations of filters, it will raise a `ValidationError`:

Expand All @@ -58,7 +60,7 @@ bavapi.brandscape_data("TOKEN", year_number=2022) # Error, not enough filters

bavapi.brandscape_data("TOKEN", brand_name="Facebook") # OK

bavapi.brandscape_data("TOKEN", country_code="UK", brands=123) # OK
bavapi.brandscape_data("TOKEN", audience=22, brands=123) # OK
```

## Default includes
Expand All @@ -70,15 +72,14 @@ If you add any of these values in the `include` field by themselves, the default
If, on the other hand, you request an `include` that is *not* part of the default values, `bavapi` will append that new value to the default `include` values.

```py
# All default includes will be requested
# All default (study, brand, category and audience) includes will be requested
bavapi.brandscape_data("TOKEN", brand_name="Facebook")

# Only the "brand" include will be requested
bavapi.brandscape_data("TOKEN", brand_name="Facebook", include="brand")

# The "company" include will be appended to the default "include" values
# The "company" include will be appended to the default `include` values
bavapi.brandscape_data("TOKEN", brand_name="Facebook", include="company")

```

## Clashing column names
Expand Down Expand Up @@ -106,7 +107,7 @@ You can specify the metrics that your response should contain, and the API will
- `relevance_c`
- `relevance_rank`
- Brand information such as `id`, `brand_name`, and `category_name`
- Any additional columns from the `include` parameter
- Any additional columns from the `include` parameter (see default behavior [above](#default-includes))

```py
bavapi.brandscape_data(studies=111, metric_keys="differentiation") # (1)
Expand Down
2 changes: 2 additions & 0 deletions docs/index.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,8 @@

[![CI status](https://github.com/wppbav/bavapi-sdk-python/actions/workflows/ci.yml/badge.svg)](https://github.com/wppbav/bavapi-sdk-python/actions/workflows/ci.yml)
[![coverage](https://img.shields.io/endpoint?url=https://gist.githubusercontent.com/nachomaiz/32196acdc05431cd2bc7a8c73a587a8d/raw/covbadge.json)](https://github.com/wppbav/bavapi-sdk-python/actions/workflows/ci.yml)
[![release status](https://github.com/wppbav/bavapi-sdk-python/actions/workflows/release.yml/badge.svg)](https://github.com/wppbav/bavapi-sdk-python/actions/workflows/release.yml)
[![docs status](https://github.com/wppbav/bavapi-sdk-python/actions/workflows/docs.yml/badge.svg)](https://github.com/wppbav/bavapi-sdk-python/actions/workflows/docs.yml)
[![PyPI](https://img.shields.io/pypi/v/wpp-bavapi)](https://pypi.org/project/wpp-bavapi/)
[![PyPI - Python Version](https://img.shields.io/pypi/pyversions/wpp-bavapi)
](https://pypi.org/project/wpp-bavapi/)
Expand Down
Loading

0 comments on commit ebe0f1d

Please sign in to comment.