This repository has been archived by the owner on Jan 21, 2025. It is now read-only.

More informative error messages #36

Open
smlmbrt opened this issue Jan 5, 2023 · 1 comment
Labels
enhancement New feature or request

Comments

@smlmbrt
Member

smlmbrt commented Jan 5, 2023

Some error messages are hard for users to debug because the assertions don't list the problematic variants (e.g. duplicate IDs, PGScatalog/pgsc_calc#72 (comment)). We should consider adding a list of the scoring files/variants that cause the failure.

Relevant code snippet:

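import polars as pl

def _check_duplicate_vars(matches: pl.LazyFrame):
    # Count how often each (accession, ID) pair occurs among matched variants,
    # then keep only the largest count.
    max_occurrence: list[int] = (matches.filter(pl.col('match_status') == 'matched')
                                 .groupby(['accession', 'ID'])
                                 .agg(pl.count())
                                 .select('count')
                                 .max()
                                 .collect()
                                 .get_column('count')
                                 .to_list())
    # Fails without saying which accession/ID pairs are duplicated.
    assert max_occurrence == [1], "Duplicate IDs in final matches"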

@smlmbrt smlmbrt added the enhancement New feature or request label Jan 5, 2023
@smlmbrt
Member Author

smlmbrt commented Jan 9, 2023

I think here we need to output ['accession', 'ID', [list of affected scoring file row_nr]]. That way we can grep the relevant rows of the pvar and scoring file to see what's wrong.
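A rough sketch of what that output could look like, written in plain Python on a list of dicts so it doesn't depend on any particular polars version. The function names `find_duplicate_vars` and `check_duplicate_vars`, the choice of `ValueError`, and the example rows are all assumptions for illustration, not code from the repository. The real check would do the same grouping on the LazyFrame, and it assumes a `row_nr` column is available in `matches`.

```python
from collections import defaultdict


def find_duplicate_vars(rows):
    """Return [accession, ID, [row_nr, ...]] for each matched ID seen more than once.

    `rows` is an iterable of dicts with the keys 'accession', 'ID', 'row_nr'
    and 'match_status' (a stand-in for the matches LazyFrame).
    """
    grouped = defaultdict(list)
    for row in rows:
        if row["match_status"] == "matched":
            grouped[(row["accession"], row["ID"])].append(row["row_nr"])
    return [
        [accession, var_id, sorted(row_nrs)]
        for (accession, var_id), row_nrs in sorted(grouped.items())
        if len(row_nrs) > 1
    ]


def check_duplicate_vars(rows):
    """Raise an error that names the offending accession / ID / row_nr values."""
    duplicates = find_duplicate_vars(rows)
    if duplicates:
        raise ValueError(f"Duplicate IDs in final matches: {duplicates}")


# Made-up rows for illustration only (hypothetical accessions and IDs).
example = [
    {"accession": "score_A", "ID": "1:100:A:G", "row_nr": 0, "match_status": "matched"},
    {"accession": "score_A", "ID": "1:100:A:G", "row_nr": 7, "match_status": "matched"},
    {"accession": "score_A", "ID": "1:200:C:T", "row_nr": 1, "match_status": "matched"},
    {"accession": "score_B", "ID": "1:100:A:G", "row_nr": 0, "match_status": "matched"},
    {"accession": "score_B", "ID": "2:300:G:A", "row_nr": 3, "match_status": "unmatched"},
]

if __name__ == "__main__":
    print(find_duplicate_vars(example))
```

Raising an exception with the list in the message (rather than a bare assert) would put the accession, ID and row numbers straight into the user's log, which is what's needed to grep the pvar and scoring file.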
