Add analytics table for binary model performance analysis across multiple scores/targets #110
Conversation
4c1305c to d40d9fd (Compare)
self._metric_values_slider = FloatRangeSlider(
    min=0.01,
    max=1.00,
    step=0.01,
    value=metric_values or [0.2, 0.8],
    description="Metric Values",
    style=WIDE_LABEL_STYLE,
)
self.decimals = table_config.decimals
self.metric = metric
self.metric_values = metric_values
Suggested change:
- self.metric_values = metric_values
+ self.metric_values = list(set(metric_values))
This fixes the issue where both range slider handles can land on the same value - see https://github.com/epic-open-source/seismometer/pull/110/files#r1904553861
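The suggested deduplication can be sketched as below. Note that `list(set(...))` does not guarantee ordering, so this sketch uses `sorted()` for a deterministic result; the helper name is hypothetical, not part of the PR.

```python
def dedupe_metric_values(metric_values):
    """Drop duplicate metric values (e.g. when both slider handles sit on
    the same value), returning them in ascending order."""
    return sorted(set(metric_values))

# Both handles on the same value -> only one value to analyze.
print(dedupe_metric_values((0.8, 0.8)))  # [0.8]
print(dedupe_metric_values((0.2, 0.8)))  # [0.2, 0.8]
```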
# If the polars package is not installed, overwrite the is_na function in the
# great_tables package to treat Agnostic as a pandas DataFrame.
try:
    import polars as pl

    # Use 'pl' to avoid the F401 error
    _ = pl.DataFrame()
except ImportError:
    from typing import Any

    from great_tables._tbl_data import Agnostic, PdDataFrame, is_na

    @is_na.register(Agnostic)
    def _(df: PdDataFrame, x: Any) -> bool:
        return pd.isna(x)
Can we put this outside of this module, since it's fiddly and probably needed by both this class and others?
Also, this will get called multiple times in the notebook, rather than only the one time we need it, right?
gt = (
    gt.tab_stub(rowname_col=self._get_second_level[self.top_level], groupname_col=self.top_level)
    .fmt_number(
        columns=[
            col
            for col in data.columns
            if is_numeric_dtype(data[col].dtype) and not is_integer_dtype(data[col].dtype)
        ],
        decimals=self.decimals,
    )
some minor visual things
Overview
Add model comparison tools to compare model performance across multiple binary model scores/targets.
Closes #62
Description of changes
We create an analytics table using the great_tables package to present model performance data across multiple model scores/targets, centered around specified values for a performance metric ('Flagged', 'Sensitivity', 'Specificity', 'Threshold'). Users can also choose two values of the specified metric to consider, which columns should be displayed, and whether the columns should be grouped by scores or targets.
For instance, if Sensitivity values of [0.7, 0.8] are specified, the performance metrics ('PPV', 'Flagged', 'Specificity', 'Threshold', etc.) across model scores/targets are provided for Sensitivity = 0.7 and Sensitivity = 0.8.
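The lookup described above can be sketched as follows. This is a hypothetical helper, not the PR's API: given a per-threshold stats table, it picks, for each requested metric value, the row whose observed metric is closest, so the remaining columns can be read off at e.g. Sensitivity = 0.7.

```python
import pandas as pd


def rows_at_metric(stats: pd.DataFrame, metric: str, values) -> pd.DataFrame:
    """For each requested metric value, return the stats row whose observed
    metric is closest, tagged with the requested target value."""
    picks = [(stats[metric] - v).abs().idxmin() for v in values]
    return stats.loc[picks].assign(**{f"target_{metric}": list(values)})


# Toy per-threshold performance table (illustrative numbers only).
stats = pd.DataFrame({
    "Threshold": [0.9, 0.5, 0.1],
    "Sensitivity": [0.35, 0.72, 0.95],
    "PPV": [0.80, 0.55, 0.30],
})
print(rows_at_metric(stats, "Sensitivity", [0.7, 0.8]))
```

A real implementation would interpolate or use exact operating points; nearest-row lookup just keeps the sketch short.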
Author Checklist
changelog/ISSUE.TYPE.rst files; see changelog/README.md.