Misformatted metadata causes incorrect output #8

Open
arisp99 opened this issue Sep 14, 2021 · 0 comments
Labels
bug unexpected problem or unintended behavior


arisp99 commented Sep 14, 2021

Bug Description

If the metadata in a file is misformatted (i.e., does not contain six lines), read_file() and read() return incorrect output. Currently, we lose information: some samples are treated as columns instead of rows.

# Metadata contains only four lines
misformatted <- tibble::tribble(
                         ~Gene,            ~atp6,           ~mdr1,
               "Mutation Name", "atp6-Ala623Glu", "mdr1-Asn86Tyr",
                   "AA Change",      "Ala623Glu",      "Asn86Tyr",
                    "Targeted",            "Yes",           "Yes",
                  "D10-JJJ-23",              "0",            "13",
                  "D10-JJJ-43",              "0",             "0",
                  "D10-JJJ-50",             "15",             "0"
               )

path <- tempfile()
readr::write_csv(misformatted, path)
MIPr::read_file(path)
#> # A tibble: 2 × 8
#>   sample     gene  mutation_name  aa_change targeted d10_jjj_23 d10_jjj_43 value
#>   <chr>      <chr> <chr>          <chr>     <chr>    <chr>      <chr>      <chr>
#> 1 D10-JJJ-50 atp6  atp6-Ala623Glu Ala623Glu Yes      0          0          15   
#> 2 D10-JJJ-50 mdr1  mdr1-Asn86Tyr  Asn86Tyr  Yes      13         0          0
unlink(path)

Created on 2021-09-14 by the reprex package (v2.0.1)

Expected Behavior

We would expect to see six rows in our final dataset with all the metadata represented as columns, as shown below:

MIPr::read_file(path)
#> # A tibble: 6 × 6
#>   sample     gene  mutation_name aa_change targeted value
#>   <chr>      <chr> <chr>         <chr>     <chr>    <chr>
#> 1 D10-JJJ-23 atp6  atp6-Ala623G… Ala623Glu Yes      0    
#> 2 D10-JJJ-43 atp6  atp6-Ala623G… Ala623Glu Yes      0    
#> 3 D10-JJJ-50 atp6  atp6-Ala623G… Ala623Glu Yes      15   
#> 4 D10-JJJ-23 mdr1  mdr1-Asn86Tyr Asn86Tyr  Yes      13   
#> 5 D10-JJJ-43 mdr1  mdr1-Asn86Tyr Asn86Tyr  Yes      0    
#> 6 D10-JJJ-50 mdr1  mdr1-Asn86Tyr Asn86Tyr  Yes      0
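
For reference, here is a minimal base R sketch of the reshaping that would give the expected output above when the number of metadata rows is known up front. It is not how MIPr works internally: `reshape_with_metadata()` and its `n_meta_rows` argument are hypothetical names made up for this illustration, and `n_meta_rows` counts the metadata rows below the header (three in the example, since the `Gene` header is the fourth metadata line).

```r
# Hypothetical helper, not part of MIPr. Assumes the first column holds row
# labels, the header row plus the next `n_meta_rows` rows hold metadata, and
# every remaining row holds one sample.
reshape_with_metadata <- function(df, n_meta_rows) {
  stopifnot(n_meta_rows >= 0, n_meta_rows < nrow(df))

  labels <- df[[1]]
  meta_idx <- seq_len(n_meta_rows)
  sample_idx <- setdiff(seq_len(nrow(df)), meta_idx)

  # "Mutation Name" -> "mutation_name", "AA Change" -> "aa_change", ...
  clean <- function(x) gsub("[^a-z0-9]+", "_", tolower(x))

  # Build one long block per data column, then stack the blocks.
  pieces <- lapply(names(df)[-1], function(col) {
    out <- data.frame(sample = labels[sample_idx], stringsAsFactors = FALSE)
    out[[clean(names(df)[1])]] <- col            # header label, e.g. gene
    for (i in meta_idx) {
      out[[clean(labels[i])]] <- df[[col]][i]    # one column per metadata row
    }
    out$value <- df[[col]][sample_idx]
    out
  })
  do.call(rbind, pieces)
}

# Same data as the reprex: header plus three metadata rows, then three samples
misformatted <- data.frame(
  Gene = c("Mutation Name", "AA Change", "Targeted",
           "D10-JJJ-23", "D10-JJJ-43", "D10-JJJ-50"),
  atp6 = c("atp6-Ala623Glu", "Ala623Glu", "Yes", "0", "0", "15"),
  mdr1 = c("mdr1-Asn86Tyr", "Asn86Tyr", "Yes", "13", "0", "0"),
  stringsAsFactors = FALSE
)

long <- reshape_with_metadata(misformatted, n_meta_rows = 3)
```

The sketch takes the metadata row count as an argument rather than guessing it; whether read_file() should infer that count, accept it as a parameter, or raise an error on anything other than six lines is a separate design question.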

arisp99 added the bug label Sep 14, 2021