parse-results: misaligned columns in output csv #54

nikyk · 2021-03-19T10:16:06Z

The "parse-results" utility doesn't work as expected in some cases.
When I open the CSV file in Excel/LibreOffice Calc, I see that many records end up in the wrong columns.
Expected behavior: each column corresponds to its own gene/pseudogene (HLA-A, DPB1, etc.).
E.g., I see a DQA1 record in the column named DQB1, and so on. Rows for some samples have more records than others.
For instance, the DPB2 gene is present in the output for some samples but absent for the rest.
Starting from the column for this gene, the column order is broken.

Proposed solution:

1. Scan all source *.report files a first time and build the complete list of genes present in them.
2. Use that list of genes to name the columns.
3. Scan the report files a second time and fill the table. If a gene is absent for a sample, leave its column blank.
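
A minimal sketch of the two-pass idea, in Python. This is not the actual parse-results code, and it does not parse *.report files (their format is not described here). It assumes the reports have already been read into a dict of sample name to {gene: call}; the function name `build_table`, the `sample` column, and the allele strings in the example are made up for illustration.

```python
import csv
import io


def build_table(per_sample_calls):
    """Return CSV text with one fixed column per gene.

    per_sample_calls: dict mapping sample name -> {gene: call string}.
    This input shape is an assumption, not the real report format.
    """
    # Pass 1: collect every gene seen in any sample, in a stable sorted order.
    genes = sorted({gene for calls in per_sample_calls.values() for gene in calls})

    # Pass 2: write one row per sample; restval="" leaves absent genes blank,
    # so every value stays under the header of its own gene.
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=["sample"] + genes, restval="", lineterminator="\n"
    )
    writer.writeheader()
    for sample, calls in per_sample_calls.items():
        writer.writerow({"sample": sample, **calls})
    return out.getvalue()


# Illustrative data only: S2 has no DPB2 call, so that cell stays empty.
example = {
    "S1": {"A": "A*01:01", "DQA1": "DQA1*01:02", "DPB2": "DPB2*01:01"},
    "S2": {"A": "A*02:01", "DQA1": "DQA1*05:01"},
}
print(build_table(example))
```

Because the header is fixed before any row is written, a gene missing from one sample produces an empty cell instead of shifting the remaining values to the left.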

tsoi2018_hla_L1.xlsx

chbe-helix self-assigned this Mar 19, 2021
