Feat/dau reporting/add active users #6483

Draft
wants to merge 39 commits into
base: main
aadd4a6
feat: add dau_reporting template
kik-kik Oct 24, 2024
e18106e
feat: add dau_reporting template
kik-kik Oct 24, 2024
f17379b
feat: remove submission_date from the dau_reporting_first_seen template
kik-kik Nov 5, 2024
0d6869c
chore: final code cleanup
kik-kik Nov 5, 2024
9cb82c5
Merge branch 'feat/add_dau_reporting-to-glean_usage' of https://githu…
gkatre Nov 5, 2024
ca0ac4c
bigeye
gkatre Nov 5, 2024
3e7f3c1
feat: add dau_reporting template
kik-kik Oct 24, 2024
ae6198f
feat: remove submission_date from the dau_reporting_first_seen template
kik-kik Nov 5, 2024
fcb75fd
chore: final code cleanup
kik-kik Nov 5, 2024
f078dd1
feat: improve comments documenting why duration based fields are comm…
kik-kik Nov 6, 2024
8329101
feat: add field descriptions to schema.yaml, and remove sample_id as …
kik-kik Nov 6, 2024
bd53007
fix: correct clusting for the dau_reporting_daily to use locale inste…
kik-kik Nov 6, 2024
94a7459
fix: invalid query dau_reporting_clients_daily_v1
kik-kik Nov 6, 2024
7d2157d
feat: add dau_reporting template
kik-kik Oct 24, 2024
0060528
feat: remove submission_date from the dau_reporting_first_seen template
kik-kik Nov 5, 2024
da5431b
chore: final code cleanup
kik-kik Nov 5, 2024
81cacee
feat: improve comments documenting why duration based fields are comm…
kik-kik Nov 6, 2024
4a1ee6c
feat: add field descriptions to schema.yaml, and remove sample_id as …
kik-kik Nov 6, 2024
1818cf6
fix: correct clusting for the dau_reporting_daily to use locale inste…
kik-kik Nov 6, 2024
4e6f7fa
fix: invalid query dau_reporting_clients_daily_v1
kik-kik Nov 6, 2024
2dfd57a
feat: final tweaks based on the PR reviews
kik-kik Nov 7, 2024
b2a29ba
feat: remove client_id from the templates
kik-kik Nov 7, 2024
ed703da
Merge branch 'feat/add_dau_reporting-to-glean_usage' of https://githu…
gkatre Nov 7, 2024
2372513
Merge branch 'feat/add_dau_reporting-to-glean_usage' of https://githu…
gkatre Nov 7, 2024
db8c455
feat: add dau_reporting template
kik-kik Oct 24, 2024
9a53962
feat: remove submission_date from the dau_reporting_first_seen template
kik-kik Nov 5, 2024
c0186c9
chore: final code cleanup
kik-kik Nov 5, 2024
a81abb5
feat: improve comments documenting why duration based fields are comm…
kik-kik Nov 6, 2024
209c6b7
feat: add field descriptions to schema.yaml, and remove sample_id as …
kik-kik Nov 6, 2024
6226bdc
fix: correct clusting for the dau_reporting_daily to use locale inste…
kik-kik Nov 6, 2024
9040b07
fix: invalid query dau_reporting_clients_daily_v1
kik-kik Nov 6, 2024
ce6b038
feat: final tweaks based on the PR reviews
kik-kik Nov 7, 2024
35e13c4
feat: remove client_id from the templates
kik-kik Nov 7, 2024
ea87ed8
feat: update usage_profile_id reference as now the field existsin the…
kik-kik Nov 13, 2024
25e8ada
no message
gkatre Nov 13, 2024
24f16d7
feat: remove unused code from dau_reporting_clients_daily_v1.query.sql
kik-kik Nov 13, 2024
97fca04
Merge branch 'main' into feat/add_dau_reporting-to-glean_usage
kik-kik Nov 13, 2024
b102ae5
feat: remove unused definition inside dau_reporting_clients_daily_v1.…
kik-kik Nov 13, 2024
4808a07
Creating active_users and active_users_aggregates for dau_reporting ping
gkatre Nov 5, 2024
14 changes: 11 additions & 3 deletions sql_generators/glean_usage/__init__.py
@@ -18,6 +18,10 @@
baseline_clients_first_seen,
baseline_clients_last_seen,
clients_last_seen_joined,
dau_reporting_clients_daily,
dau_reporting_clients_first_seen,
dau_reporting_clients_last_seen,
dau_reporting_active_users_aggregates,
event_error_monitoring,
event_flow_monitoring,
event_monitoring_live,
@@ -43,6 +47,10 @@
event_error_monitoring.EventErrorMonitoring(),
event_flow_monitoring.EventFlowMonitoring(),
events_stream.EventsStreamTable(),
dau_reporting_clients_daily.DauReportingClientsDailyTable(),
dau_reporting_clients_first_seen.DauReportingClientsFirstSeenTable(),
dau_reporting_clients_last_seen.DauReportingClientsLastSeenTable(),
dau_reporting_active_users_aggregates.DauReportingActiveUsersAggregatesTable(),
]


@@ -136,7 +144,7 @@ def get_tables(table_name="baseline_v1"):
not in ConfigLoader.get("generate", "glean_usage", "skip_apps", fallback=[])
]

id_token=get_id_token()
id_token = get_id_token()

# Prepare parameters so that generation of all Glean datasets can be done in parallel

@@ -151,7 +159,7 @@ def get_tables(table_name="baseline_v1"):
use_cloud_function=use_cloud_function,
app_info=app_info,
parallelism=parallelism,
id_token=id_token
id_token=id_token,
),
baseline_table,
)
@@ -169,7 +177,7 @@ def get_tables(table_name="baseline_v1"):
output_dir=output_dir,
use_cloud_function=use_cloud_function,
parallelism=parallelism,
id_token=id_token
id_token=id_token,
),
info,
)
16 changes: 13 additions & 3 deletions sql_generators/glean_usage/common.py
@@ -151,6 +151,16 @@ def table_names_from_baseline(baseline_table, include_project_id=True):
events_view=f"{prefix}.events",
events_stream_table=f"{prefix}_derived.events_stream_v1",
events_stream_view=f"{prefix}.events_stream",
dau_reporting_stable_table=f"{prefix}_stable.dau_reporting_v1",
dau_reporting_clients_daily_table=f"{prefix}_derived.dau_reporting_clients_daily_v1",
dau_reporting_clients_first_seen_table=f"{prefix}_derived.dau_reporting_clients_first_seen_v1",
dau_reporting_clients_last_seen_table=f"{prefix}_derived.dau_reporting_clients_last_seen_v1",
dau_reporting_active_users_aggregates_table=f"{prefix}_derived.dau_reporting_active_users_aggregates_v1",
dau_reporting_clients_daily_view=f"{prefix}.dau_reporting_clients_daily",
dau_reporting_clients_first_seen_view=f"{prefix}.dau_reporting_clients_first_seen",
dau_reporting_clients_last_seen_view=f"{prefix}.dau_reporting_clients_last_seen",
dau_reporting_active_users_view=f"{prefix}.dau_reporting_active_users",
dau_reporting_active_users_aggregates_view=f"{prefix}.dau_reporting_active_users_aggregates",
)
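
The new entries follow the naming pattern already used in this function: tables live in `{prefix}_derived` (or `{prefix}_stable` for the ping table) with a `_v1` suffix, and views live directly under `{prefix}`. A minimal Python sketch of that pattern follows. The helper name `dau_reporting_names` and the example prefix are hypothetical; the code only mirrors the f-strings in the hunk above and is not the repository's `table_names_from_baseline`.

```python
def dau_reporting_names(prefix):
    """Build dau_reporting table and view names for a dataset prefix.

    Hypothetical helper for illustration; it copies the f-string patterns
    from the diff above and nothing else.
    """
    names = {"dau_reporting_stable_table": f"{prefix}_stable.dau_reporting_v1"}
    for name in (
        "clients_daily",
        "clients_first_seen",
        "clients_last_seen",
        "active_users_aggregates",
    ):
        names[f"dau_reporting_{name}_table"] = (
            f"{prefix}_derived.dau_reporting_{name}_v1"
        )
        names[f"dau_reporting_{name}_view"] = f"{prefix}.dau_reporting_{name}"
    # active_users appears only as a view in the diff; no derived table is listed.
    names["dau_reporting_active_users_view"] = f"{prefix}.dau_reporting_active_users"
    return names


# "my-project.my_app" is a made-up prefix, not a real dataset.
names = dau_reporting_names("my-project.my_app")
print(names["dau_reporting_clients_daily_table"])
```

Note that the diff registers `dau_reporting_active_users_view` without a matching `_table` entry, which the sketch reproduces.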


@@ -234,7 +244,7 @@ def generate_per_app_id(
use_cloud_function=True,
app_info=[],
parallelism=8,
id_token=None
id_token=None,
):
"""Generate the baseline table query per app_id."""
if not self.per_app_id_enabled:
@@ -268,7 +278,7 @@ def generate_per_app_id(
derived_dataset=derived_dataset,
app_name=app_name,
has_distribution_id=app_name in APPS_WITH_DISTRIBUTION_ID,
has_profile_group_id= app_name in APPS_WITH_PROFILE_GROUP_ID,
has_profile_group_id=app_name in APPS_WITH_PROFILE_GROUP_ID,
)

render_kwargs.update(self.custom_render_kwargs)
@@ -364,7 +374,7 @@ def generate_per_app(
output_dir=None,
use_cloud_function=True,
parallelism=8,
id_token=None
id_token=None,
):
"""Generate the baseline table query per app_name."""
if not self.per_app_enabled:
Original file line number Diff line number Diff line change
@@ -0,0 +1,17 @@
"""Generate and run dau_reporting_active_users_aggregates queries for Glean apps."""

from sql_generators.glean_usage.common import GleanTable

TARGET_TABLE_ID = "dau_reporting_active_users_aggregates_v1"
PREFIX = "dau_reporting_active_users_aggregates"


class DauReportingActiveUsersAggregatesTable(GleanTable):
"""Represents generated dau_reporting_active_users_aggregates table."""

def __init__(self):
"""Initialize dau_reporting_active_users_aggregates table."""
GleanTable.__init__(self)
self.target_table_id = TARGET_TABLE_ID
self.prefix = PREFIX
self.base_table_name = "dau_reporting_v1"
17 changes: 17 additions & 0 deletions sql_generators/glean_usage/dau_reporting_clients_daily.py
@@ -0,0 +1,17 @@
"""Generate and run dau_reporting_clients_daily queries for Glean apps."""

from sql_generators.glean_usage.common import GleanTable

TARGET_TABLE_ID = "dau_reporting_clients_daily_v1"
PREFIX = "dau_reporting_clients_daily"


class DauReportingClientsDailyTable(GleanTable):
"""Represents generated dau_reporting_clients_daily table."""

def __init__(self):
"""Initialize dau_reporting_clients_daily table."""
GleanTable.__init__(self)
self.target_table_id = TARGET_TABLE_ID
self.prefix = PREFIX
self.base_table_name = "dau_reporting_v1"
17 changes: 17 additions & 0 deletions sql_generators/glean_usage/dau_reporting_clients_first_seen.py
@@ -0,0 +1,17 @@
"""Generate and run dau_reporting_clients_first_seen queries for Glean apps."""

from sql_generators.glean_usage.common import GleanTable

TARGET_TABLE_ID = "dau_reporting_clients_first_seen_v1"
PREFIX = "dau_reporting_clients_first_seen"


class DauReportingClientsFirstSeenTable(GleanTable):
"""Represents generated dau_reporting_clients_first_seen table."""

def __init__(self):
"""Initialize dau_reporting_clients_first_seen table."""
GleanTable.__init__(self)
self.target_table_id = TARGET_TABLE_ID
self.prefix = PREFIX
self.base_table_name = "dau_reporting_v1"
17 changes: 17 additions & 0 deletions sql_generators/glean_usage/dau_reporting_clients_last_seen.py
@@ -0,0 +1,17 @@
"""Generate and run dau_reporting_clients_last_seen queries for Glean apps."""

from sql_generators.glean_usage.common import GleanTable

TARGET_TABLE_ID = "dau_reporting_clients_last_seen_v1"
PREFIX = "dau_reporting_clients_last_seen"


class DauReportingClientsLastSeenTable(GleanTable):
"""Represents generated dau_reporting_clients_last_seen table."""

def __init__(self):
"""Initialize dau_reporting_clients_last_seen table."""
GleanTable.__init__(self)
self.target_table_id = TARGET_TABLE_ID
self.prefix = PREFIX
self.base_table_name = "dau_reporting_v1"
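
The four generator modules added above differ only in `TARGET_TABLE_ID` and `PREFIX`; each sets `base_table_name` to `dau_reporting_v1`, and each table id is the prefix plus `_v1`. A minimal sketch of that shared shape follows. It uses a stand-in `GleanTable` (an assumption: the real base class in `sql_generators.glean_usage.common` holds more state and the generation logic) and a hypothetical `make_dau_reporting_table` factory that is not part of this PR.

```python
class GleanTable:
    """Stand-in base class; only the three attributes set in the diff are modeled."""

    def __init__(self):
        self.target_table_id = ""
        self.prefix = ""
        self.base_table_name = ""


def make_dau_reporting_table(prefix):
    """Return a GleanTable subclass for one dau_reporting table (hypothetical)."""

    class DauReportingTable(GleanTable):
        def __init__(self):
            super().__init__()
            # Same convention as the modules above: table id is the prefix plus "_v1".
            self.target_table_id = f"{prefix}_v1"
            self.prefix = prefix
            self.base_table_name = "dau_reporting_v1"

    return DauReportingTable


PREFIXES = [
    "dau_reporting_clients_daily",
    "dau_reporting_clients_first_seen",
    "dau_reporting_clients_last_seen",
    "dau_reporting_active_users_aggregates",
]

tables = [make_dau_reporting_table(prefix)() for prefix in PREFIXES]
for table in tables:
    print(table.prefix, "->", table.target_table_id)
```

Whether such a factory is preferable to four explicit modules is a judgment call; the explicit modules match how the other tables in `glean_usage` are registered in `__init__.py`.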
4 changes: 2 additions & 2 deletions sql_generators/glean_usage/templates/cross_channel.view.sql
@@ -7,11 +7,11 @@ AS
UNION ALL
{% endif -%}
{% if app_name == "fenix" -%}
SELECT
SELECT
"{{ dataset }}" AS normalized_app_id,
* REPLACE(mozfun.norm.fenix_app_info("{{ dataset }}", app_build).channel AS normalized_channel),
{% else -%}
SELECT
SELECT
"{{ dataset }}" AS normalized_app_id,
* REPLACE("{{ channel }}" AS normalized_channel)
{% endif -%}
Original file line number Diff line number Diff line change
@@ -0,0 +1,15 @@
{{ header_yaml }}
friendly_name: DAU Reporting Clients Last Seen
description: |-
Daily client-level aggregation of metrics for the dau_reporting ping. Merges the computations for the
client first seen and last seen metrics.

owners:
- [email protected]
labels: {}
bigquery: null
workgroup_access:
- role: roles/bigquery.dataViewer
members:
- workgroup:dataops-managed/taar
- workgroup:mozilla-confidential
Original file line number Diff line number Diff line change
@@ -0,0 +1,137 @@
fields:
- mode: NULLABLE
name: submission_date
type: DATE
description: |
Logical date used for processing and partitioning.

- mode: NULLABLE
name: usage_profile_id
type: STRING
description:

- mode: NULLABLE
name: first_run_date
type: DATE
description: |
The date of the first run of the application.

- mode: NULLABLE
name: normalized_channel
type: STRING
description: |
The channel the application is being distributed on.

- mode: NULLABLE
name: normalized_os
type: STRING
description: |
The name of the operating system.

- mode: NULLABLE
name: normalized_os_version
type: STRING
description: |
The user-visible version of the operating system (e.g. "1.2.3").
If the version detection fails, this metric gets set to Unknown.

- mode: NULLABLE
name: locale
type: STRING
description: |
The locale of the application during initialization (e.g. "es-ES").
If the locale can't be determined on the system, the value is "und", to indicate "undetermined".

- mode: NULLABLE
name: app_build
type: STRING
description: |
The build identifier generated by the CI system (e.g. "1234/A").
If the value was not provided through configuration, this metric gets set to Unknown.

- mode: NULLABLE
name: app_display_version
type: STRING
description: |
The user visible version string (e.g. "1.0.3").
If the value was not provided through configuration, this metric gets set to Unknown.

- mode: NULLABLE
name: distribution_id
type: STRING
description: |
A string containing the distribution identifier. This was used to identify installs
from Mozilla Online, but now also identifies partnership deal distributions.

- mode: NULLABLE
name: is_active
type: BOOLEAN
description: |
A flag field indicating whether the specific client was active.

- mode: NULLABLE
name: first_seen_date
type: DATE
description: |
Logical date of when we observed the client for the first time in our warehouse.

- mode: NULLABLE
name: days_seen_bits
type: INTEGER
description: |
Bit field showing on which of the last 28 days a client sent us the dau_reporting ping.

- mode: NULLABLE
name: days_active_bits
type: INTEGER
description: |
Bit field showing on which of the last 28 days a client fulfilled the active criteria.

- mode: NULLABLE
name: days_created_profile_bits
type: INTEGER
description: |
Bit field indicating how many days have elapsed since profile creation.

- mode: NULLABLE
name: activity_segment
type: STRING
description: |
Categorizes the client's active days into segments.

- mode: NULLABLE
name: is_dau
type: BOOLEAN
description: |
A flag field indicating whether the specific client was active on the submission_date.

- mode: NULLABLE
name: is_wau
type: BOOLEAN
description: |
A flag field indicating whether the specific client was active on any of the 7 days prior to the submission_date.

- mode: NULLABLE
name: is_mau
type: BOOLEAN
description: |
A flag field indicating whether the specific client was active on any of the 28 days prior to the submission_date.

- mode: NULLABLE
name: is_daily_user
type: BOOLEAN
description: |
A flag field indicating whether the specific client sent the dau_reporting ping on the submission_date.

- mode: NULLABLE
name: is_weekly_user
type: BOOLEAN
description: |
A flag field indicating whether the specific client sent the dau_reporting ping on any of the 7 days prior to the submission_date.

- mode: NULLABLE
name: is_monthly_user
type: BOOLEAN
description: |
A flag field indicating whether the specific client sent the dau_reporting ping on any of the 28 days prior to the
submission_date.
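
The `days_seen_bits` and `days_active_bits` fields, and the daily/weekly/monthly flags derived from them, can be illustrated with a small Python sketch. It assumes the lowest bit stands for `submission_date` and that each window includes `submission_date` itself; the function names are hypothetical and this is not the SQL generated by the PR.

```python
MASK_28 = (1 << 28) - 1  # keep only the most recent 28 days


def shift_bits(previous_bits, seen_today):
    """Age yesterday's pattern by one day and record today's flag in bit 0."""
    return ((previous_bits << 1) & MASK_28) | int(seen_today)


def seen_in_last_n_days(bits, n):
    """True if any of the n lowest bits (today and the n - 1 days before) is set."""
    return (bits & ((1 << n) - 1)) != 0


# A client that sent a ping two days ago and nothing since.
bits = 0
for seen in (True, False, False):
    bits = shift_bits(bits, seen)

is_daily_user = seen_in_last_n_days(bits, 1)
is_weekly_user = seen_in_last_n_days(bits, 7)
is_monthly_user = seen_in_last_n_days(bits, 28)
print(bin(bits), is_daily_user, is_weekly_user, is_monthly_user)
```

Applying the same helpers to `days_active_bits` would give `is_dau`, `is_wau`, and `is_mau` under the same assumptions.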