Reduce computational time of AWS megatest #158

LouisLeNezet · 2024-11-10T18:51:55Z

This PR changes the features tested in the fulltest config on AWS to reduce computational cost.
The snapshot of VCF_PHASE_SHAPEIT5 is also updated, as this was not done in the previous PR (my bad).

PR checklist

LouisLeNezet · 2024-11-10T18:56:19Z

Once the chunk_model update is merged, I will launch the updated fulltest on my cluster.

atrigila

Always impressed by the amount of work you put into these contributions - much appreciated! :)

I noticed several distinct topics being addressed here. To make reviews faster and keep the changelog and file history cleaner, it would help if each of these topics were split into separate PRs. This also makes it easier to track changes and ensure each update is well-documented. Here are the main topics I see in this PR and that I tried to provide feedback on:

Reducing computational cost in fulltest configuration on AWS: This is the primary focus, and it's a great enhancement. You said previously that you have evidence that these new changes make the pipeline full test work, so I believe this is ready.
VCF_SPLIT_BCFTOOLS edge case fix: Since this is a specific edge case fix, a separate PR would be ideal. In this particular case I see a great benefit of having this particular subworkflow in nf-core/modules, as it allows the community to review and provide feedback on these changes, speeding up the review process and then only making this a simple update in the pipeline :)
Snapshot file updates and output file prefix changes: I understand that there have been changes in how files are named and therefore the snapshots, but it would be nice if this could have been something separate to provide context on why these prefixes were adjusted.
"Lenient mode" addition: This appears to be a new feature but I couldn't find any reference in your PR description talking about this, could you explain its purpose and the scenarios where it would be useful?
Minor corrections in the changelog and punctuation fixes: These small adjustments (like punctuation or formatting) would be best suited for a separate "cleanup" PR but they are OK.

atrigila · 2024-11-17T16:22:12Z

assets/multiqc_config.yml

+extra_fn_clean_exts:
+  - type: regex
+    pattern: "\\.batch[0-9]+"


If we now remove the batch number from the statistics, aren't all batches going to have the same name? How will we differentiate between batches? I haven't been able to process a test with >1 batch to see what the report will look like.

There is batch file processed in the statistic. The aim of removing this pattern was to clean up the sample names in the MultiQC HTML output.
One option would be to group the samples by batch in the MultiQC output (enabled by the latest MultiQC version), but I think this should be done outside this PR.
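
For readers following the naming question above, here is a minimal Python sketch of what the `extra_fn_clean_exts` regex from the diff does to a sample name. The sample names are hypothetical and are not taken from the pipeline output; the sketch only reproduces the pattern, not MultiQC's own cleaning code.

```python
import re

# Same pattern as in the diff, once YAML has unescaped "\\.batch[0-9]+".
BATCH_SUFFIX = re.compile(r"\.batch[0-9]+")


def clean_sample_name(name: str) -> str:
    """Remove every '.batch<N>' fragment from a sample name."""
    return BATCH_SUFFIX.sub("", name)


# Hypothetical sample names, used for illustration only.
names = ["sampleA.batch0", "sampleA.batch1", "sampleB.batch0"]
cleaned = [clean_sample_name(name) for name in names]
print(cleaned)
```

If two inputs differ only by their batch number, they end up with the same cleaned name, which is the situation the review question is asking about.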

conf/modules.config

conf/steps/imputation_stitch.config

conf/steps/panel_prep.config

conf/steps/validation.config

conf/test.config

docs/output.md

nextflow.config

workflows/phaseimpute/tests/main.nf.test.snap

LouisLeNezet · 2024-11-17T17:27:33Z

I'm sorry that this PR has too many files changed.
It's just that when I do some debugging, there are small changes that need to be addressed, and they pile up quite quickly.
I'll try to split it.

Co-authored-by: Anabella Trigila <[email protected]>

LouisLeNezet

Ready for changes

LouisLeNezet · 2024-11-23T20:20:11Z

Hi @atrigila, this PR is now ready!

atrigila · 2024-11-24T18:48:19Z

conf/test_all.config

-        cpus: 2,
-        memory: '10.GB',
+        cpus: 4,
+        memory: '4.GB',


I think it can handle up to 15GB

This limit was also set for my WSL virtual machine, which only goes up to 7 GB.
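
As a rough illustration of why these values matter, the sketch below models a resource limit as a per-key cap on what a task requests. This is a simplified assumption about how such limits behave, not Nextflow's implementation; the function name `cap_request` and the key `memory_gb` are hypothetical.

```python
def cap_request(requested: dict, limits: dict) -> dict:
    """Return the request with each value capped at its limit, if one is set."""
    return {
        key: min(value, limits.get(key, value))
        for key, value in requested.items()
    }


# Limits matching the new values in the diff (4 CPUs, 4 GB of memory).
limits = {"cpus": 4, "memory_gb": 4}

# Hypothetical task request that exceeds both limits.
request = {"cpus": 8, "memory_gb": 10}
print(cap_request(request, limits))
```

Under this model, lowering the memory limit from 10 GB to 4 GB lets the test run on a smaller machine, at the price of giving memory-hungry tasks less than they ask for.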

LouisLeNezet self-assigned this Nov 10, 2024

LouisLeNezet added bug Something isn't working enhancement New feature or request labels Nov 10, 2024

LouisLeNezet added this to the v0.99.0 milestone Nov 10, 2024

LouisLeNezet mentioned this pull request Nov 10, 2024

Update nf-tests and language server fixes #153

Merged

11 tasks

LouisLeNezet added 2 commits November 11, 2024 12:34

Reduce features of fulltest

f575591

Update changelog

b5e7d91

LouisLeNezet force-pushed the fulltest_only branch from d105ead to b5e7d91 Compare November 11, 2024 11:43

Check for multiple sample before splitting

01e0927

LouisLeNezet marked this pull request as draft November 11, 2024 20:41

LouisLeNezet added 7 commits November 12, 2024 14:48

Fix case if empty

976f2ff

Update snapshot phasing sbwf

2aed82f

Update CHANGELOG

7222146

Update config

826ca63

Add query version

c600724

Remove unecessary sbwf inclusion

2ca561d

Update changelog and add dependencies

1512e82

LouisLeNezet changed the title ~~Reduce computational time of AWS megatest~~ Reduce computational time of AWS megatest and Fix VCF_SPLIT_BCFTOOLS Nov 12, 2024

LouisLeNezet and others added 11 commits November 12, 2024 16:56

Fix window size

eeb8825

Fix name output suffix

add70ee

Update config

50344fe

Fix small naming

70a3aaf

Clean multiqc

4ff1ed0

Update output.md

e0b8d1b

Update nf-test and snapshot

2002404

Fix config import order

96258d3

Reset changes

b65bfa0

Update pluginsplit

320ae35

Update test

028f097

Update nf-test sbwf

0a5cb05

LouisLeNezet marked this pull request as ready for review November 13, 2024 16:24

LouisLeNezet requested a review from atrigila November 13, 2024 16:24

atrigila requested changes Nov 17, 2024

LouisLeNezet and others added 9 commits November 17, 2024 18:30

Update docs/output.md

9802306

Co-authored-by: Anabella Trigila <[email protected]>

Merge branch 'dev' into fulltest_only

ae061f6

Reset changes

c2a0d0d

Reset pluginsplit

42c5b68

Add back lenient mode in PR

ad643d1

Change resourceLimits

9d95bad

Set back testquality

3cc9f7a

Change resourceLimits

24b6321

Reset VCF_SPLIT_BCFTOOLS

d299471

LouisLeNezet commented Nov 17, 2024

Add chunks size config

e63304e

LouisLeNezet requested a review from atrigila November 17, 2024 18:53

LouisLeNezet changed the title ~~Reduce computational time of AWS megatest and Fix VCF_SPLIT_BCFTOOLS~~ Reduce computational time of AWS megatest Nov 17, 2024

LouisLeNezet and others added 8 commits November 17, 2024 20:01

Update snapshot

ae4e4fb

Reset change

6c2efb4

Reset change

e05b8ef

Update snapshot

861a0ee

Merge branch 'dev' into fulltest_only

0e47798

Merge branch 'dev' into fulltest_only

6ba6eba

Update snapshot

f8c01ac

Update snapshot

10d95fe

LouisLeNezet linked an issue Nov 24, 2024 that may be closed by this pull request

Fix -resume with fai usage #137

Closed

atrigila approved these changes Nov 24, 2024

LouisLeNezet merged commit 69badc3 into nf-core:dev Nov 24, 2024
13 checks passed
