Phaseimpute template update and pass first test #5

Merged
merged 72 commits into from
Mar 19, 2024
Changes from 42 commits
Commits
0225ffd
Update pipeline
LouisLeNezet Mar 5, 2024
19ce9bd
Update simple test
LouisLeNezet Mar 5, 2024
9484b09
Add bcftools mpileup
LouisLeNezet Mar 5, 2024
c408e1f
Ignore nf-test folder
LouisLeNezet Mar 6, 2024
98ec135
Create subworkflow to get regions with nf-test
LouisLeNezet Mar 6, 2024
26c554a
Delete extra spaces
LouisLeNezet Mar 6, 2024
449ca9f
Set subworkflows in own folder
LouisLeNezet Mar 6, 2024
f35e632
Add panel schema and csv
LouisLeNezet Mar 6, 2024
45dcee6
Delete lib folder
LouisLeNezet Mar 6, 2024
46a4cee
Put main workflow to dedicated folder
LouisLeNezet Mar 6, 2024
33a62ad
Set depth as integer
LouisLeNezet Mar 6, 2024
bb0855d
Update files
LouisLeNezet Mar 6, 2024
63d2c34
Fix unchanged files
LouisLeNezet Mar 6, 2024
794e155
Updates all modules
LouisLeNezet Mar 6, 2024
6868489
NF-core linting pass
LouisLeNezet Mar 6, 2024
be48cf1
New module to create the annotation file to rename the chromosome for…
LouisLeNezet Mar 8, 2024
ddbf5c1
Update the csv files for testing
LouisLeNezet Mar 8, 2024
7148a23
Fix link and parameters names
LouisLeNezet Mar 12, 2024
e170633
Relove unused view() statement
LouisLeNezet Mar 12, 2024
c733562
Add config file
LouisLeNezet Mar 12, 2024
ba412eb
Update parameters
LouisLeNezet Mar 12, 2024
3d61e99
Add test
LouisLeNezet Mar 12, 2024
e2c9989
Update data config file
LouisLeNezet Mar 12, 2024
930a2f0
Add sbwf
LouisLeNezet Mar 12, 2024
4d233a1
Merge branch 'devel' of github.com:LouisLeNezet/phaseimpute into dev
LouisLeNezet Mar 13, 2024
1290052
Make vcf_chr_rename nf-test works
LouisLeNezet Mar 13, 2024
11c24ac
Update config test
LouisLeNezet Mar 13, 2024
6cfb6a5
Add test function
LouisLeNezet Mar 13, 2024
e3d9e01
Update get panel
LouisLeNezet Mar 13, 2024
2293abb
Update get region and fasta as channel
LouisLeNezet Mar 13, 2024
d3a2e3b
Make get_region works
LouisLeNezet Mar 13, 2024
799a1e6
Delete unecessary view
LouisLeNezet Mar 13, 2024
fc6ecab
Update map
LouisLeNezet Mar 15, 2024
45d86e7
Add environment for development
LouisLeNezet Mar 15, 2024
5dc2521
Add missing steps with errors
LouisLeNezet Mar 15, 2024
861677f
update metromap
LouisLeNezet Mar 15, 2024
34b0535
Bcftools view: Change default to compressed format
LouisLeNezet Mar 17, 2024
e66b0ea
Mpileup change input files channel
LouisLeNezet Mar 17, 2024
fb171d6
Update genotype likelihood computation and channel workflow
LouisLeNezet Mar 17, 2024
2efff60
Rearrange channel creation in initialisation
LouisLeNezet Mar 17, 2024
9667949
Delete fasta index creation in get region, should be done beforehand
LouisLeNezet Mar 17, 2024
d2b10ea
Test file change
LouisLeNezet Mar 17, 2024
632e6cf
Update modules
LouisLeNezet Mar 18, 2024
17217e4
Update all tools
LouisLeNezet Mar 18, 2024
364d1ca
Reset tools
LouisLeNezet Mar 18, 2024
a2746ef
Update channel bcftools norm
LouisLeNezet Mar 18, 2024
e361d55
[automated] Fix code linting
nf-core-bot Mar 18, 2024
1b82c50
Update tools
LouisLeNezet Mar 18, 2024
c107a23
Merge branch 'dev' of github.com:LouisLeNezet/phaseimpute into dev
LouisLeNezet Mar 18, 2024
4876750
Reset tools file
LouisLeNezet Mar 18, 2024
180acf7
Update missing files
LouisLeNezet Mar 18, 2024
70e3a42
Unchanged files ifx
LouisLeNezet Mar 18, 2024
7e21199
Ignore ci.yml
LouisLeNezet Mar 18, 2024
c944137
Fix linting with eclint
LouisLeNezet Mar 18, 2024
a1df5f7
Fix eclint
LouisLeNezet Mar 18, 2024
76c9736
remove yml from documentation from indent linting
LouisLeNezet Mar 18, 2024
e3a4c32
Update editor config
LouisLeNezet Mar 18, 2024
cd9211b
Update to xml
LouisLeNezet Mar 18, 2024
711bb52
Edit editorconfig
LouisLeNezet Mar 18, 2024
a898e4f
edit editorconfig
LouisLeNezet Mar 18, 2024
72600f8
Update logo
LouisLeNezet Mar 18, 2024
6904cab
Update all logo to match template
LouisLeNezet Mar 18, 2024
0218313
Update test and make it work
LouisLeNezet Mar 18, 2024
9e303e0
Update conf/test.config
LouisLeNezet Mar 19, 2024
8f23e8f
Update conf/test_panelprep.config
LouisLeNezet Mar 19, 2024
e961d81
Update conf/test_panelprep.config
LouisLeNezet Mar 19, 2024
af41fb0
Update conf/test_panelprep.config
LouisLeNezet Mar 19, 2024
3b8d3d9
Undo non useful modification
LouisLeNezet Mar 19, 2024
201635c
Delete ch_multiqc as not used in initialisation
LouisLeNezet Mar 19, 2024
b2ac7e6
Update ci.yml duplicate
LouisLeNezet Mar 19, 2024
95ee3dc
Delete duplicate keys
LouisLeNezet Mar 19, 2024
bec515d
Set output to compress format
LouisLeNezet Mar 19, 2024
7 changes: 0 additions & 7 deletions .github/CONTRIBUTING.md
@@ -16,15 +16,11 @@ Contributions to the code are even more welcome ;)

If you'd like to write some code for nf-core/phaseimpute, the standard workflow is as follows:

1. Check that there isn't already an issue about your idea in the [nf-core/phaseimpute issues](https://github.com/nf-core/phaseimpute/issues) to avoid duplicating work. If there isn't one already, please create one so that others know you're working on this
1. Check that there isn't already an issue about your idea in the [nf-core/phaseimpute issues](https://github.com/nf-core/phaseimpute/issues) to avoid duplicating work. If there isn't one already, please create one so that others know you're working on this
2. [Fork](https://help.github.com/en/github/getting-started-with-github/fork-a-repo) the [nf-core/phaseimpute repository](https://github.com/nf-core/phaseimpute) to your GitHub account
3. Make the necessary changes / additions within your forked repository following [Pipeline conventions](#pipeline-contribution-conventions)
4. Use `nf-core schema build` and add any new parameters to the pipeline JSON schema (requires [nf-core tools](https://github.com/nf-core/tools) >= 1.10).
5. Submit a Pull Request against the `dev` branch and wait for the code to be reviewed and merged
3. Make the necessary changes / additions within your forked repository following [Pipeline conventions](#pipeline-contribution-conventions)
4. Use `nf-core schema build` and add any new parameters to the pipeline JSON schema (requires [nf-core tools](https://github.com/nf-core/tools) >= 1.10).
5. Submit a Pull Request against the `dev` branch and wait for the code to be reviewed and merged

If you're not used to this workflow with git, you can start with some [docs from GitHub](https://help.github.com/en/github/collaborating-with-issues-and-pull-requests) or even their [excellent `git` resources](https://try.github.io/).

@@ -41,15 +37,13 @@ Typically, pull-requests are only fully reviewed when these tests are passing, t

There are typically two types of tests that run:

### Lint tests
### Lint tests

`nf-core` has a [set of guidelines](https://nf-co.re/developers/guidelines) which all pipelines must adhere to.
To enforce these and ensure that all pipelines stay in sync, we have developed a helper tool which runs checks on the pipeline code. This is in the [nf-core/tools repository](https://github.com/nf-core/tools) and once installed can be run locally with the `nf-core lint <pipeline-directory>` command.

If any failures or warnings are encountered, please follow the listed URL for more documentation.

### Pipeline tests
### Pipeline tests

Each `nf-core` pipeline should be set up with a minimal set of test-data.
@@ -59,7 +53,6 @@ These tests are run both with the latest available version of `Nextflow` and als

## Patch

:warning: Only in the unlikely and regretful event of a release happening with a bug.
:warning: Only in the unlikely and regretful event of a release happening with a bug.

- On your own fork, make a new branch `patch` based on `upstream/master`.
7 changes: 0 additions & 7 deletions .github/PULL_REQUEST_TEMPLATE.md
@@ -1,5 +1,4 @@
<!--
<!--
# nf-core/phaseimpute pull request

Many thanks for contributing to nf-core/phaseimpute!
@@ -12,14 +11,8 @@ Remember that PRs should be made against the dev branch, unless you're preparing
Learn more about contributing: [CONTRIBUTING.md](https://github.com/nf-core/phaseimpute/tree/master/.github/CONTRIBUTING.md)
-->

Remember that PRs should be made against the dev branch, unless you're preparing a pipeline release.

Learn more about contributing: [CONTRIBUTING.md](https://github.com/nf-core/phaseimpute/tree/master/.github/CONTRIBUTING.md)
-->

## PR checklist

- [ ] This comment contains a description of changes (with reason).
- [ ] This comment contains a description of changes (with reason).
- [ ] If you've fixed a bug or added code that should be tested, add tests!
- [ ] If you've added a new tool - have you followed the pipeline conventions in the [contribution docs](https://github.com/nf-core/phaseimpute/tree/master/.github/CONTRIBUTING.md)
5 changes: 0 additions & 5 deletions .github/workflows/branch.yml
@@ -4,18 +4,13 @@ name: nf-core branch protection
on:
pull_request_target:
branches: [master]
pull_request_target:
branches: [master]

jobs:
test:
runs-on: ubuntu-latest
runs-on: ubuntu-latest
steps:
# PRs to the nf-core repo master branch are only ok if coming from the nf-core repo `dev` or any `patch` branches
# PRs to the nf-core repo master branch are only ok if coming from the nf-core repo `dev` or any `patch` branches
- name: Check PRs
if: github.repository == 'nf-core/phaseimpute'
if: github.repository == 'nf-core/phaseimpute'
run: |
{ [[ ${{github.event.pull_request.head.repo.full_name }} == nf-core/phaseimpute ]] && [[ $GITHUB_HEAD_REF == "dev" ]]; } || [[ $GITHUB_HEAD_REF == "patch" ]]
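
The shell test above groups its conditions as `{ A && B; } || C`. The following Python sketch restates that logic so the precedence is easier to read. The function name `pr_to_master_allowed` and the example repository `someone/phaseimpute` are hypothetical and are not part of the workflow.

```python
def pr_to_master_allowed(head_repo: str, head_ref: str,
                         repo: str = "nf-core/phaseimpute") -> bool:
    """Restate the shell test: (same repo AND branch 'dev') OR branch 'patch'."""
    return (head_repo == repo and head_ref == "dev") or head_ref == "patch"


# Hypothetical pull requests targeting master.
checks = {
    "dev from nf-core": pr_to_master_allowed("nf-core/phaseimpute", "dev"),
    "dev from a fork": pr_to_master_allowed("someone/phaseimpute", "dev"),
    "patch from a fork": pr_to_master_allowed("someone/phaseimpute", "patch"),
    "feature from nf-core": pr_to_master_allowed("nf-core/phaseimpute", "feature"),
}
for name, allowed in checks.items():
    print(f"{name}: {'allowed' if allowed else 'rejected'}")
```

Under this reading, a `patch` branch passes whichever repository it comes from, whereas `dev` passes only when it comes from the nf-core repository itself.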
5 changes: 0 additions & 5 deletions .github/workflows/linting.yml
@@ -2,14 +2,10 @@ name: nf-core linting
# This workflow is triggered on pushes and PRs to the repository.
# It runs the `nf-core lint` and markdown lint tests to ensure
# that the code meets the nf-core guidelines.
# It runs the `nf-core lint` and markdown lint tests to ensure
# that the code meets the nf-core guidelines.
on:
push:
branches:
- dev
branches:
- dev
pull_request:
release:
types: [published]
@@ -51,7 +47,6 @@ jobs:
python -m pip install --upgrade pip
pip install nf-core


- name: Run nf-core lint
env:
GITHUB_COMMENTS_URL: ${{ github.event.pull_request.comments_url }}
3 changes: 2 additions & 1 deletion .gitignore
@@ -6,4 +6,5 @@ results/
testing/
testing*
*.pyc
*.code-workspace
*.code-workspace
.nf-test*
2 changes: 0 additions & 2 deletions CODE_OF_CONDUCT.md
@@ -45,10 +45,8 @@ Questions, concerns, or ideas on what we can include? Contact members of the Saf
## Our Responsibilities

Members of the Safety Team (the Safety Officers) are responsible for clarifying the standards of acceptable behavior and are expected to take appropriate and fair corrective action in response to any instances of unacceptable behaviour.
The safety officer is responsible for clarifying the standards of acceptable behavior and are expected to take appropriate and fair corrective action in response to any instances of unacceptable behaviour.

The Safety Team, in consultation with the nf-core core team, have the right and responsibility to remove, edit, or reject comments, commits, code, wiki edits, issues, and other contributions that are not aligned to this CoC, or to ban temporarily or permanently any contributor for other behaviors that they deem inappropriate, threatening, offensive, or harmful.
The safety officer in consultation with the nf-core core team have the right and responsibility to remove, edit, or reject comments, commits, code, wiki edits, issues, and other contributions that are not aligned to this Code of Conduct, or to ban temporarily or permanently any contributor for other behaviors that they deem inappropriate, threatening, offensive, or harmful.

Members of the core team or the Safety Team who violate the CoC will be required to recuse themselves pending investigation. They will not have access to any reports of the violations and will be subject to the same actions as others in violation of the CoC.

36 changes: 12 additions & 24 deletions README.md
@@ -19,17 +19,7 @@

## Introduction

**nf-core/phaseimpute** is a bioinformatics pipeline that ...

<!-- TODO nf-core:
Complete this sentence with a 2-3 sentence summary of what types of data the pipeline ingests, a brief overview of the
major pipeline sections and the types of output it produces. You're giving an overview to someone new
to nf-core here, in 15-20 seconds. For an example, see https://github.com/nf-core/rnaseq/blob/master/README.md#introduction
-->

<!-- TODO nf-core: Include a figure that guides the user through the major workflow steps. Many nf-core
workflows use the "tube map" design for that. See https://nf-co.re/docs/contributing/design_guidelines#examples for examples. -->
<!-- TODO nf-core: Fill in short bullet-pointed list of the default steps in the pipeline -->
**nf-core/phaseimpute** is a bioinformatics pipeline to phase and impute genetic data. Different steps are available, each corresponding to a dedicated mode.

### Main steps of the pipeline

@@ -43,30 +33,28 @@ The **phaseimpute** pipeline is constituted of 5 main steps:
> [!NOTE]
> If you are new to Nextflow and nf-core, please refer to [this page](https://nf-co.re/docs/usage/installation) on how to set-up Nextflow. Make sure to [test your setup](https://nf-co.re/docs/usage/introduction#how-to-run-a-pipeline) with `-profile test` before running the workflow on actual data.

<!-- TODO nf-core: Describe the minimum required steps to execute the pipeline, e.g. how to prepare samplesheets.
Explain what rows and columns represent. For instance (please edit as appropriate):

The basic usage of this pipeline is to impute a target dataset based on a phased panel.
First, prepare a samplesheet with your input data that looks as follows:

`samplesheet.csv`:

```csv
sample,fastq_1,fastq_2
CONTROL_REP1,AEG588A1_S1_L002_R1_001.fastq.gz,AEG588A1_S1_L002_R2_001.fastq.gz
sample,bam,bai
1_BAM_1X,/path/to/.bam,/path/to/.bai
```

Each row represents a fastq file (single-end) or a pair of fastq files (paired end).

-->
Each row represents a bam file with its index file.

Now, you can run the pipeline using:

<!-- TODO nf-core: update the following command to include all required parameters for a minimal example -->

```bash
nextflow run nf-core/phaseimpute \
-profile <docker/singularity/.../institute> \
--input samplesheet.csv \
--genome "GRCh38" \
--panel <phased_reference_panel.vcf.gz> \
--steps "impute" \
--tools "glimpse1" \
--outdir <OUTDIR>
```

@@ -97,11 +85,12 @@ For more details about the output files and reports, please refer to the

## Credits

nf-core/phaseimpute was originally written by LouisLeNezet.
nf-core/phaseimpute was originally written by Louis Le Nézet.

We thank the following people for their extensive assistance in the development of this pipeline:

<!-- TODO nf-core: If applicable, make list of people who have also contributed -->
- Anabella Trigilla
- Saul Pierotti

## Contributions and Support

@@ -110,7 +99,6 @@ If you would like to contribute to this pipeline, please see the [contributing g
For further information or help, don't hesitate to get in touch on the [Slack `#phaseimpute` channel](https://nfcore.slack.com/channels/phaseimpute) (you can join with [this invite](https://nf-co.re/join/slack)).
For further information or help, don't hesitate to get in touch on the [Slack `#phaseimpute` channel](https://nfcore.slack.com/channels/phaseimpute) (you can join with [this invite](https://nf-co.re/join/slack)).

## Citations
## Citations

<!-- TODO nf-core: Add citation for pipeline after first release. Uncomment lines below and update Zenodo doi and badge at the top of this file. -->
39 changes: 39 additions & 0 deletions assets/chr_rename_add.txt
Member

Looks like this file just adds `chr` to the chromosome names. Is there another way to do so that doesn't rely on such a list?

Collaborator Author

That could be a module using a `.fai` index?
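
As a rough illustration of the idea raised in this thread, the sketch below derives the same kind of `old new` rename lines from a FASTA index instead of a hard-coded list. It assumes the usual tab-separated `.fai` layout, in which the first column is the sequence name. The function name `chr_rename_map` and the sample index content are hypothetical; this is not the module that was eventually written.

```python
import io


def chr_rename_map(fai_lines, prefix="chr"):
    """Build 'old new' rename lines that add `prefix` to each contig in a .fai index."""
    lines = []
    for line in fai_lines:
        line = line.strip()
        if not line:
            continue
        contig = line.split("\t")[0]  # first .fai column holds the sequence name
        if not contig.startswith(prefix):  # skip contigs that already carry the prefix
            lines.append(f"{contig} {prefix}{contig}")
    return lines


# Hypothetical .fai content: name, length, offset, linebases, linewidth.
fai = io.StringIO(
    "1\t248956422\t3\t60\t61\n"
    "2\t242193529\t253105699\t60\t61\n"
    "X\t156040895\t100\t60\t61\n"
)
rename_lines = chr_rename_map(fai)
print("\n".join(rename_lines))
```

A map built this way follows whatever contigs the reference actually contains, so it would not need extending for species with a different chromosome count.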

@@ -0,0 +1,39 @@
1 chr1
2 chr2
3 chr3
4 chr4
5 chr5
6 chr6
7 chr7
8 chr8
9 chr9
10 chr10
11 chr11
12 chr12
13 chr13
14 chr14
15 chr15
16 chr16
17 chr17
18 chr18
19 chr19
20 chr20
21 chr21
22 chr22
23 chr23
24 chr24
25 chr25
26 chr26
27 chr27
28 chr28
29 chr29
30 chr30
31 chr31
32 chr32
33 chr33
34 chr34
35 chr35
36 chr36
37 chr37
38 chr38
X chrX
Member

Similar question for this one.

File renamed without changes.
Binary file modified assets/nf-core-phaseimpute_logo_light.png
3 changes: 3 additions & 0 deletions assets/panel.csv
@@ -0,0 +1,3 @@
panel,vcf,index,sites,tsv,legend,phased
1000GP,1000GP.phased.vcf,1000GP.phased.vcf.csi,1000GP.sites,1000GP.tsv,,TRUE
1000GP_RePhase,1000GP.vcf,1000GP.vcf.csi,,,,FALSE
2 changes: 1 addition & 1 deletion assets/samplesheet.csv
@@ -1,3 +1,3 @@
sample,BAM,BAI
sample,bam,bai
1_BAM_1X,/path/to/.bam,/path/to/.bai
1_BAM_SNP,/path/to/.bam,/path/to/.bai
6 changes: 3 additions & 3 deletions assets/schema_input.json
@@ -13,17 +13,17 @@
"errorMessage": "Sample name must be provided and cannot contain spaces",
"meta": ["id"]
},
"BAM": {
"bam": {
"type": "string",
"pattern": "^\\S+\\.bam$",
"errorMessage": "BAM file must be provided, cannot contain spaces and must have extension '.bam'"
},
"BAI": {
"bai": {
"errorMessage": "BAI file must be provided, cannot contain spaces and must have extension '.bai'",
"type": "string",
"pattern": "^\\S+\\.bai$"
}
},
"required": ["sample", "BAM", "BAI"]
"required": ["sample", "bam", "bai"]
}
}
48 changes: 48 additions & 0 deletions assets/schema_input_panel.json
@@ -0,0 +1,48 @@
{
"$schema": "http://json-schema.org/draft-07/schema",
"$id": "https://raw.githubusercontent.com/nf-core/phaseimpute/master/assets/schema_input_panel.json",
"title": "nf-core/phaseimpute pipeline - params.input_region schema",
"description": "Schema for the file provided with params.input_region",
"type": "array",
"items": {
"type": "object",
"properties": {
"panel": {
"type": "string",
"pattern": "^\\S+$",
"errorMessage": "Panel name must be provided and cannot contain spaces",
"meta": ["panel"]
},
"vcf": {
"type": "string",
"pattern": "^\\S+\\.vcf$",
"errorMessage": "Panel vcf file must be provided, cannot contain spaces and must have extension '.vcf'"
},
"index":{
"type": "string",
"pattern": "^\\S+\\.vcf\\.(tbi|csi)$",
"errorMessage": "Panel vcf index file must be provided, cannot contain spaces and must have extension '.vcf.tbi' or '.vcf.csi'"
},
"sites": {
"type": "string",
"pattern": "^\\S+\\.sites$",
"errorMessage": "Panel sites file must be provided, cannot contain spaces and must have extension '.sites'"
},
"tsv": {
"type": "string",
"pattern": "^\\S+\\.tsv$",
"errorMessage": "Panel tsv file must be provided, cannot contain spaces and must have extension '.tsv'"
},
"legend":{
"type": "string",
"pattern": "^\\S+\\.legend$",
"errorMessage": "Panel legend file must be provided, cannot contain spaces and must have extension '.legend'"
},
"phased": {
"type": "boolean",
"errorMessage": "Is the vcf given phased? Must be a boolean"
}
},
"required": ["panel", "vcf", "index", "phased"]
}
}
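
To make the schema's constraints concrete, here is a minimal sketch that applies the same regular expressions and required-column list to rows shaped like `assets/panel.csv`. It uses only the Python standard library and is not how the pipeline itself validates its input. The `check_row` helper, the `bad panel` row, and the assumption that booleans are written as `TRUE`/`FALSE` are illustrative.

```python
import csv
import io
import re

# Patterns copied from the schema above; optional columns may be left empty.
PATTERNS = {
    "panel": r"^\S+$",
    "vcf": r"^\S+\.vcf$",
    "index": r"^\S+\.vcf\.(tbi|csi)$",
    "sites": r"^\S+\.sites$",
    "tsv": r"^\S+\.tsv$",
    "legend": r"^\S+\.legend$",
}
REQUIRED = ["panel", "vcf", "index", "phased"]


def check_row(row):
    """Return a list of human-readable problems found in one panel row."""
    errors = []
    for key in REQUIRED:
        if not (row.get(key) or ""):
            errors.append(f"missing required column '{key}'")
    for key, pattern in PATTERNS.items():
        value = row.get(key) or ""
        if value and not re.search(pattern, value):
            errors.append(f"'{key}' value '{value}' does not match {pattern}")
    # Assumption: booleans are spelled TRUE/FALSE in the CSV.
    if (row.get("phased") or "").lower() not in ("true", "false", ""):
        errors.append("'phased' must be a boolean")
    return errors


panel_csv = """panel,vcf,index,sites,tsv,legend,phased
1000GP,1000GP.phased.vcf,1000GP.phased.vcf.csi,1000GP.sites,1000GP.tsv,,TRUE
1000GP_RePhase,1000GP.vcf,1000GP.vcf.csi,,,,FALSE
bad panel,panel.bcf,panel.bcf.csi,,,,TRUE
"""
results = {row["panel"]: check_row(row) for row in csv.DictReader(io.StringIO(panel_csv))}
for name, errors in results.items():
    print(name, "->", errors or "ok")
```

Note that the `vcf` pattern accepts only names ending in `.vcf`, so a `.bcf` or `.vcf.gz` panel listed in such a sheet would be rejected by these rules.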
10 changes: 8 additions & 2 deletions conf/test.config
@@ -22,8 +22,14 @@ params {
// Input data
// TODO nf-core: Specify the paths to your test data on nf-core/test-datasets
// TODO nf-core: Give any required params for the test so that command line flags are not needed
input = 'https://raw.githubusercontent.com/nf-core/test-datasets/viralrecon/samplesheet/samplesheet_test_illumina_amplicon.csv'
input = "../test-datasets/data/bam.csv"

// Genome references
genome = 'R64-1-1'
fasta = "../test-datasets/data/reference_genome/21_22/hs38DH.chr21_22.fa"
panel = "https://raw.githubusercontent.com/LouisLeNezet/test-datasets/phaseimpute/data/panel/21_22/1000GP.chr21_22.s.norel.bcf"
phased = true

// Impute parameters
step = "impute"
tools = "glimpse1"
}
2 changes: 1 addition & 1 deletion conf/test_panelprep.config
@@ -21,7 +21,7 @@ params {

// Input data
input = "tests/csv/panel.csv"
input_region_file = "tests/csv/regionsheet.csv"
input_region = "tests/csv/regionsheet.csv"
outdir = "results/test_panelprep"
genome = "GRCh38"

24 changes: 24 additions & 0 deletions docs/development.md
@@ -14,6 +14,30 @@ conda activate nf-core-phaseimpute-1.0dev
nf-core modules install
```

## Run tests

```bash
nextflow run main.nf -profile singularity,test --outdir results -resume
```

## Problematic

### Channel management and combination

If only one species is processed at a time, then there is only one fasta file and only one map file (normally?).
Do we want to be able to process multiple panels at the same time?
If so, we need to correctly combine the different channels depending on their meta map.

All channels need to be identified by a meta map as follows:

- I : individual id
- P : panel id
- R : region used
- M : map used
- T : tool used
- G : reference genome used (is it needed?)
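
The combination rule described above can be sketched outside Nextflow. The Python example below treats each channel item as a `(meta map, files)` pair and pairs items only when the keys they share hold equal values. The helper `combine_by_meta` and all sample, panel and map names are hypothetical; in the pipeline this would presumably be expressed with Nextflow channel operators rather than Python.

```python
from itertools import product


def combine_by_meta(left, right):
    """Pair items from two channels whose shared meta keys hold equal values."""
    combined = []
    for (meta_l, data_l), (meta_r, data_r) in product(left, right):
        shared = meta_l.keys() & meta_r.keys()
        # With no shared key this is a plain cartesian product.
        if all(meta_l[k] == meta_r[k] for k in shared):
            combined.append(({**meta_l, **meta_r}, data_l + data_r))
    return combined


# Hypothetical channel contents: (meta map, list of files).
samples = [({"I": "sample1"}, ["sample1.bam"]), ({"I": "sample2"}, ["sample2.bam"])]
panels = [
    ({"P": "1000GP", "R": "chr21"}, ["1000GP.chr21.vcf"]),
    ({"P": "1000GP", "R": "chr22"}, ["1000GP.chr22.vcf"]),
]
maps = [({"R": "chr21", "M": "mapA"}, ["chr21.map"])]

panel_with_map = combine_by_meta(panels, maps)    # matched on the shared key R
jobs = combine_by_meta(samples, panel_with_map)   # no shared key: every sample x panel
for meta, files in jobs:
    print(meta, files)
```

Keeping every identifier in the meta map means a later step can always tell which individual, panel, region and map produced a given file, which is the point of the I/P/R/M/T/G convention listed above.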


## Open questions

How to use different schemas?