-
Notifications
You must be signed in to change notification settings - Fork 96
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversion Metrics #916
Conversion Metrics #916
Conversation
Thank you for your pull request! We could not find a changelog entry for this change. For details on how to document a change, see the contributing guide. |
1 similar comment
Thank you for your pull request! We could not find a changelog entry for this change. For details on how to document a change, see the contributing guide. |
3026a15
to
8f9977a
Compare
48d660f
to
42e1848
Compare
Have remaining snapshots from other engines to update, but I'll regenerate those once we get an initial review |
b74d600
to
a38834f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Almost done reviewing (just have dataflow to SQL left) - leaving my initial comments here and will finish up tonight or tomorrow morning!
under the condition of being within the time range and other conditions (such as constant properties). | ||
This builds the join description to satisfy all those conditions. | ||
""" | ||
window_condition = SqlQueryPlanJoinBuilder._make_time_range_window_join_condition( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does this return what we expect if there is no window for the metric ? I'm not sure we have a test case for that yet 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I basically just consolidated _make_time_range_window_join_condition
from the cumulative join logic since they were the exact same with some differences in the beginning. It has some logic around None
windows. So the logic is essentially this
a.ds <= b.ds AND a.ds > b.ds - <window>
If no window is present we join across all time -> "a.ds <= b.ds"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right I think the part that confused me was - why do we use a.ds <= b.ds
for an all time join? 🤔 (I know that's not new logic here though)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not all time join, we use a join with a INF backwards looking window. So instead of looking for any base events that converted within an X day window, we just say any base event that converted. Thus, we have a.ds <= b.ds
which would render out to base.ds <= conversion.ds
...st_dataflow_to_sql_plan.py/SqlQueryPlan/Postgres/test_conversion_metric__plan0_optimized.sql
Outdated
Show resolved
Hide resolved
...st_dataflow_to_sql_plan.py/SqlQueryPlan/Postgres/test_conversion_metric__plan0_optimized.sql
Outdated
Show resolved
Hide resolved
ce85cb5
to
50ea3fe
Compare
2be3d4a
to
8eec5a0
Compare
8eec5a0
to
d14eed6
Compare
Issue
Context in #252
Changes
Added the ability to define Conversion Metrics
Config skeleton:
Added
AddGenerateUuidNode
For conversion metrics, we need the ability to uniquely identify rows in order to deduplicate events properly. This node simply adds a randomly generated uuid column to enable that.
Added
JoinConversionEventsNode
Created a node to handle retrieving all successful conversions. This node takes the conversion and base data set and joins them against an entity and a valid time range to get successful conversions. It then deduplicates opportunities via the window function
first_value
to take the closest opportunity event to the corresponding conversion event. Then it returns a data set with each row representing a successful conversion. Because an opportunity event can lead to multiple conversions, there would be duplication in the final dataset from this node.Example of this node,
Conversion Metrics
After getting successful conversions from
JoinConversionEventsNode
, we join that count with the base opportunity count which gives us the total number of opportunities and total number of conversions that can be used to calculate theconversion_rate
orconversion_count
.