Gdr 2682 #132

darsoo · 2024-09-17T09:17:32Z

Description

What changed?

Related JIRA issue:

Why was it changed?

Checklist for sustainable code base

I added tests for any code changed/added
I added documentation for any code changed/added
I made sure naming of any new functions is self-explanatory and consistent

Logistic checklist

Package version bumped
Changelog updated

Screenshots (optional)

bczech · 2024-09-19T09:00:16Z

Please regenerate the roxygen documentation

j-smola · 2024-09-25T12:36:14Z

@darsoo Please regenerate the roxygen documentation

j-smola

Please regenerate the roxygen documentation

j-smola · 2024-09-25T13:08:37Z

R/standardize_MAE.R

+      duplicated_ids <- setdiff(duplicated_ids, duplicated_ids_with_clid)
+    } else {
+      duplicated_ids <- col_data[[cellline_name]][duplicated(col_data[[cellline_name]])]
+    }


Why is there a different procedure for colData for SE and for data.table?
(I understand that colData has no information about Drug, but it is misleading: the same input content gives different output depending on the format when cell line names are duplicated.)

> dt <- data.table::data.table(
    DrugName = c("DrugA", "DrugB", "DrugC", "DrugD", "DrugC", "DrugD"),
    Gnumber = c("G1", "G2", "G3", "G4", "G3", "G4"),
    CellLineName = c("ID1", "ID1", "ID2", "ID2", "ID2", "ID2"),
    clid = c("C1", "C2", "C3", "C4", "C5", "C6")
  )
> res_dt <- set_unique_cl_names_dt(dt)
> res_dt
   DrugName Gnumber CellLineName   clid
     <char>  <char>       <char> <char>
1:    DrugA      G1          ID1     C1
2:    DrugB      G2          ID1     C2
3:    DrugC      G3     ID2 (C3)     C3
4:    DrugD      G4     ID2 (C4)     C4
5:    DrugC      G3     ID2 (C5)     C5
6:    DrugD      G4     ID2 (C6)     C6

> dt <- S4Vectors::DataFrame(
    DrugName = c("DrugA", "DrugB", "DrugC", "DrugD", "DrugC", "DrugD"),
    Gnumber = c("G1", "G2", "G3", "G4", "G3", "G4"),
    CellLineName = c("ID1", "ID1", "ID2", "ID2", "ID2", "ID2"),
    clid = c("C1", "C2", "C3", "C4", "C5", "C6")
  )
> res_S4 <- set_unique_cl_names_dt(dt)
> res_S4
DataFrame with 6 rows and 4 columns
     DrugName     Gnumber CellLineName        clid
  <character> <character>  <character> <character>
1       DrugA          G1     ID1 (C1)          C1
2       DrugB          G2     ID1 (C2)          C2
3       DrugC          G3     ID2 (C3)          C3
4       DrugD          G4     ID2 (C4)          C4
5       DrugC          G3     ID2 (C5)          C5
6       DrugD          G4     ID2 (C6)          C6

I see your point, but in my opinion it's a bit pointless. From a practical point of view, DataFrame is only used for colData or rowData in SE, where such a situation cannot occur. I don't see the need to complicate this logic.
The more complex logic for data.table objects is required due to data specificity.

IMO it is now complicated and inconsistent.
I would vote for the function always returning the same result, regardless of format. The user may want to use this feature in a context other than within the application.

The only thing to change: instead of checking the format of the col_data input, just check whether unique_col_names is "CellLineName".
If true, just add the suffix; if not, check the other columns and add the suffix accordingly.
You already have that code written.
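
For illustration only, the "just add the suffix" branch described above could look roughly like the sketch below. The helper name add_clid_suffix and its arguments are hypothetical and are not the package's implementation. Converting the input with as.data.frame() is a simplification so that the same code path runs for any tabular input, which means this sketch returns a plain data.frame rather than preserving the input class.

```r
# Hypothetical helper (not the package code): append " (<clid>)" to every
# cell line name that occurs more than once, regardless of input class.
add_clid_suffix <- function(col_data,
                            name_col = "CellLineName",
                            id_col = "clid") {
  # Assumption: coercing to data.frame is acceptable for this illustration.
  col_data <- as.data.frame(col_data, stringsAsFactors = FALSE)
  nms <- col_data[[name_col]]
  # Flag all occurrences of a repeated name, including the first one.
  is_dup <- nms %in% nms[duplicated(nms)]
  col_data[[name_col]][is_dup] <-
    paste0(nms[is_dup], " (", col_data[[id_col]][is_dup], ")")
  col_data
}

example_input <- data.frame(
  CellLineName = c("ID1", "ID1", "ID2", "ID3"),
  clid = c("C1", "C2", "C3", "C4"),
  stringsAsFactors = FALSE
)
example_result <- add_clid_suffix(example_input)
print(example_result$CellLineName)
```

Only the two "ID1" rows receive a suffix here; names that occur once are left untouched. The other branch of the proposal (checking further columns before adding the suffix) is not covered by this sketch.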

Yes, but duplicated() works differently for data.table and DataFrame.
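
As a side note for readers, the sketch below uses base R only and does not reproduce the data.table versus DataFrame difference mentioned in this comment. It only shows, under that stated limitation, that the result of duplicated() depends on what it is applied to: a single column flags repeated values, while a whole table compares complete rows. The variable names are illustrative.

```r
# Base R only; illustrative names, not taken from the package.
cell_names <- c("ID1", "ID1", "ID2", "ID2")
clids <- c("C1", "C2", "C3", "C4")

# On a vector, only the second and later occurrences are flagged.
dup_in_column <- duplicated(cell_names)

# On a data.frame, whole rows are compared; these rows differ by clid,
# so none of them counts as a duplicate.
tbl <- data.frame(CellLineName = cell_names, clid = clids,
                  stringsAsFactors = FALSE)
dup_in_rows <- duplicated(tbl)

print(dup_in_column)
print(dup_in_rows)
```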

I'm on the same page as @j-smola. It would be great to have consistent logic regardless of the data format.

OK. Added a fix and a test.

R/standardize_MAE.R

Co-authored-by: j-smola <[email protected]>

j-smola · 2024-09-26T07:32:53Z

Please regenerate the roxygen documentation (run devtools::document("./")). You changed a param description.

darsoo · 2024-09-26T10:34:38Z

Please regenerate the roxygen documentation (run devtools::document("./")). You changed a param description.

done

darsoo added 4 commits September 16, 2024 15:57

added functions set_unique_cl_names_dt and set_unique_drug_names_dt

d651eff

added tests for set_unique_cl_names_dt and set_unique_drug_names_dt

788aca1

bump version

a8c1832

added param to set_unique_cl_names_dt and set_unique_drug_names_dt

6d68c02

darsoo requested a review from a team as a code owner September 17, 2024 09:17

darsoo requested review from j-smola and bczech and removed request for a team September 17, 2024 09:17

bczech approved these changes Sep 19, 2024

darsoo added 5 commits September 24, 2024 00:04

added draft for tests for set_unique_drug_names_dt

7aa8414

updated tests for set_unique_drug_names_dt and set_unique_cl_names_dt

6f1d0a7

added fix for set_unique_cl_names_dt

d6ea4fd

typo

5f7d6db

fix in test

4d68305

updated documentation

81914a3

j-smola reviewed Sep 25, 2024

Update R/standardize_MAE.R

e93ea86

Co-authored-by: j-smola <[email protected]>

updated documentation

4592d46

the function works the same regardless of the input class

3750ac6

gladkia approved these changes Sep 30, 2024

j-smola approved these changes Sep 30, 2024

darsoo merged commit 2875659 into main Sep 30, 2024
3 of 4 checks passed

darsoo deleted the GDR-2682 branch September 30, 2024 12:40
