Add gffread EXTRACT_CDNA and EXTRACT_CDS feature to outputs #119

Merged
merged 25 commits into from
Dec 5, 2024
ee702d7
Merge pull request #112 from Plant-Food-Research-Open/dev
GallVp Nov 21, 2024
c8667c9
Add EXTRACT_CDS feature to GFF_STORE workflow
liamlelievre Dec 4, 2024
dbebd7a
Update gff_store.nf
liamlelievre Dec 4, 2024
203af9b
Update gff_store.nf
liamlelievre Dec 4, 2024
13fe4e9
add EXTRACT_CDNA to gff_store.nf
liamlelievre Dec 4, 2024
c7bf40b
add GFF_STORE:EXTRACT_CDNA to modules.config
liamlelievre Dec 4, 2024
f0699a6
Update gff_store.nf
liamlelievre Dec 4, 2024
563bd13
Add cdna and cds outputs to output.md
liamlelievre Dec 4, 2024
79befa5
Add notes about cdna and cds update to CHANGELOG.md
liamlelievre Dec 4, 2024
47e2cf3
Added liamlelievre to contributors - README.md
liamlelievre Dec 4, 2024
2a118be
Update output.md
liamlelievre Dec 4, 2024
216225c
Added v0.6.0 notes to CHANGELOG.md
liamlelievre Dec 4, 2024
74cd2b2
removed trailing whitespace gff_store.nf
liamlelievre Dec 4, 2024
81871ff
Removed trailing whitespace - modules.config
liamlelievre Dec 4, 2024
3f898b0
rename params - modules.config
liamlelievre Dec 4, 2024
5af18c8
Rename params - nextflow.config
liamlelievre Dec 4, 2024
841ea02
Added code contributors
GallVp Dec 4, 2024
e05a469
Run nf-test successfully in minimal and stub
Dec 5, 2024
d2ff47e
Run nf-test successfully in minimal and stub, renamed attr, updated docs
Dec 5, 2024
2bcab04
Merge branch 'main' into add-gffread-feature
Dec 5, 2024
d21d70e
Add attributes option for -F -D to cds and cdna
Dec 5, 2024
9be84b2
Fixed linting issues
GallVp Dec 5, 2024
767239a
Updated snapshot
GallVp Dec 5, 2024
e042615
Fixed nextflow-setup version
GallVp Dec 5, 2024
43742fb
Fixed indent
GallVp Dec 5, 2024
2 changes: 1 addition & 1 deletion .github/workflows/ci.yml
@@ -55,7 +55,7 @@ jobs:
uses: actions/[email protected]

- name: Install Nextflow
uses: nf-core/setup-nextflow@v2
uses: nf-core/setup-nextflow@v2.0.0
with:
version: "${{ matrix.NXF_VER }}"

2 changes: 1 addition & 1 deletion .github/workflows/download_pipeline.yml
@@ -30,7 +30,7 @@ jobs:
runs-on: ubuntu-latest
steps:
- name: Install Nextflow
uses: nf-core/setup-nextflow@v2
uses: nf-core/setup-nextflow@v2.0.0

- name: Disk space cleanup
uses: jlumbroso/free-disk-space@54081f138730dfa15788a46383842cd2f914a1be # v1.3.1
2 changes: 1 addition & 1 deletion .github/workflows/linting.yml
@@ -34,7 +34,7 @@ jobs:
uses: actions/checkout@0ad4b8fadaa221de15dcec353f45205ec38ea70b # v4

- name: Install Nextflow
uses: nf-core/setup-nextflow@v2
uses: nf-core/setup-nextflow@v2.0.0

- uses: actions/setup-python@82c7e631bb3cdc910f68e0081d67478d79c6982d # v5
with:
2 changes: 1 addition & 1 deletion .nf-core.yml
@@ -30,5 +30,5 @@ template:
outdir: .
skip_features:
- igenomes
version: 0.5.0
version: 0.6.0
update: null
6 changes: 6 additions & 0 deletions CHANGELOG.md
@@ -3,6 +3,12 @@
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/)
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## v0.6.0 - [4-Dec-2024]

### `Added`

1. Added cDNA and CDS outputs to the `<OUTPUT_DIR>/annotations/<SAMPLE>` directory [#118](https://github.com/Plant-Food-Research-Open/genepal/issues/118)

## v0.5.0 - [21-Nov-2024]

### `Added`
2 changes: 1 addition & 1 deletion CITATION.cff
@@ -31,7 +31,7 @@ authors:
- family-names: "Thomson"
given-names: "Susan"
title: "genepal: A Nextflow pipeline for genome and pan-genome annotation"
version: 0.5.0
version: 0.6.0
date-released: 2024-11-21
url: "https://github.com/Plant-Food-Research-Open/genepal"
doi: 10.5281/zenodo.14195006
7 changes: 6 additions & 1 deletion README.md
@@ -97,7 +97,7 @@ sbatch ./pfr_genepal

plant-food-research-open/genepal workflows were originally scripted by Jason Shiller ([@jasonshiller](https://github.com/jasonshiller)). Usman Rashid ([@gallvp](https://github.com/gallvp)) wrote the Nextflow pipeline.

We thank the following people for their extensive assistance in the development of this pipeline:
We thank the following people for extensive assistance in the development of the pipeline,

- Cecilia Deng [@CeciliaDeng](https://github.com/CeciliaDeng)
- Charles David [@charlesdavid](https://github.com/charlesdavid)
@@ -107,6 +107,10 @@ We thank the following people for their extensive assistance in the development
- Susan Thomson [@cflsjt](https://github.com/cflsjt)
- Ting-Hsuan Chen [@ting-hsuan-chen](https://github.com/ting-hsuan-chen)

and for contributions to the codebase,

- Liam Le Lievre [@liamlelievre](https://github.com/liamlelievre)

The pipeline uses nf-core modules contributed by the following authors:

<a href="https://github.com/gallvp"><img src="https://github.com/gallvp.png" width="50" height="50"></a>
@@ -139,6 +143,7 @@ The pipeline uses nf-core modules contributed by following authors:
<a href="https://github.com/charles-plessy"><img src="https://github.com/charles-plessy.png" width="50" height="50"></a>
<a href="https://github.com/bunop"><img src="https://github.com/bunop.png" width="50" height="50"></a>
<a href="https://github.com/abhi18av"><img src="https://github.com/abhi18av.png" width="50" height="50"></a>
<a href="https://github.com/liamlelievre"><img src="https://github.com/liamlelievre.png" width="50" height="50"></a>

## Contributions and Support

2 changes: 1 addition & 1 deletion assets/multiqc_config.yml
@@ -1,7 +1,7 @@
report_comment: >
This report has been generated by the <a href="https://github.com/plant-food-research-open/genepal" target="_blank">plant-food-research-open/genepal</a>
analysis pipeline. For information about how to interpret these results, please see the
<a href="https://github.com/plant-food-research-open/genepal/blob/0.5.0/docs/usage.md" target="_blank">documentation</a>.
<a href="https://github.com/plant-food-research-open/genepal/blob/0.6.0/docs/usage.md" target="_blank">documentation</a>.

report_section_order:
"plant-food-research-open-genepal-methods-description":
23 changes: 22 additions & 1 deletion conf/modules.config
@@ -286,7 +286,7 @@ process { // SUBWORKFLOW: GFF_STORE
}

withName: '.*:GFF_STORE:EXTRACT_PROTEINS' {
ext.args = params.add_attrs_to_proteins_fasta ? '-F -D -y' : '-y'
ext.args = params.add_attrs_to_proteins_cds_fastas ? '-F -D -y' : '-y'
ext.prefix = { "${meta.id}.pep" }

publishDir = [
@@ -295,6 +295,27 @@ process { // SUBWORKFLOW: GFF_STORE
saveAs: { filename -> filename.equals('versions.yml') ? null : filename },
]
}

withName: '.*:GFF_STORE:EXTRACT_CDS' {
ext.args = params.add_attrs_to_proteins_cds_fastas ? '-F -D -x' : '-x'
ext.prefix = { "${meta.id}.cds" }

publishDir = [
path: { "${params.outdir}/annotations/$meta.id" },
mode: params.publish_dir_mode,
saveAs: { filename -> filename.equals('versions.yml') ? null : filename },
]
}

withName: '.*:GFF_STORE:EXTRACT_CDNA' {
ext.args = params.add_attrs_to_proteins_cds_fastas ? '-F -D -w' : '-w'
ext.prefix = { "${meta.id}.cdna" }

publishDir = [
path: { "${params.outdir}/annotations/$meta.id" },
mode: params.publish_dir_mode,
saveAs: { filename -> filename.equals('versions.yml') ? null : filename },
]
}
}

process { // SUBWORKFLOW: FASTA_ORTHOFINDER
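The three `withName` selectors in this hunk differ only in the gffread output flag; the optional `-F -D` prefix is shared and controlled by `params.add_attrs_to_proteins_cds_fastas`. The following is a minimal Python sketch of that selection logic, not pipeline code. The names `GFFREAD_OUTPUT_FLAG` and `gffread_ext_args` are hypothetical, and the flag descriptions in the comments are a reading of gffread's usage text that should be checked against the installed version.

```python
# Hypothetical lookup mirroring the ext.args ternaries in conf/modules.config.
# Flag meanings (assumed from gffread's usage text):
#   -y  protein FASTA translated from each transcript's CDS
#   -x  FASTA of the spliced CDS of each transcript
#   -w  FASTA of the spliced exons of each transcript
GFFREAD_OUTPUT_FLAG = {
    "EXTRACT_PROTEINS": "-y",
    "EXTRACT_CDS": "-x",
    "EXTRACT_CDNA": "-w",
}


def gffread_ext_args(process_name: str, add_attrs: bool) -> str:
    """Return the ext.args string a given GFF_STORE gffread alias would receive."""
    flag = GFFREAD_OUTPUT_FLAG[process_name]
    # '-F -D' is prepended only when the shared boolean parameter is true.
    return f"-F -D {flag}" if add_attrs else flag


if __name__ == "__main__":
    for name in GFFREAD_OUTPUT_FLAG:
        print(name, "->", gffread_ext_args(name, add_attrs=True))
```

Because one boolean drives all three aliases, the protein, CDS and cDNA FASTA headers stay consistent with each other, which appears to be the motivation for renaming the parameter rather than adding two new ones.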
2 changes: 2 additions & 0 deletions docs/output.md
@@ -169,6 +169,8 @@ If more than one genome is included in the pipeline, [ORTHOFINDER](https://githu
- `Y/`
- `Y.gt.gff3`: Final annotation file for genome `Y` which contains gene models and their functional annotations
- `Y.pep.fasta`: Protein sequences for the gene models
- `Y.cdna.fasta`: cDNA sequences for the gene models
- `Y.cds.fasta`: Coding sequences for the gene models

</details>
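To illustrate how the two new FASTA files differ, here is a toy Python sketch under simplifying assumptions: a plus-strand transcript, 1-based inclusive coordinates, and made-up sequence and intervals. The helper `splice` is hypothetical and is not how gffread is implemented; it only shows that the cDNA joins whole exons (including UTRs) while the CDS joins only the coding segments.

```python
def splice(genome: str, intervals: list[tuple[int, int]]) -> str:
    """Concatenate 1-based, inclusive intervals in ascending order (plus strand only)."""
    return "".join(genome[start - 1:end] for start, end in sorted(intervals))


# Made-up 20 bp contig: exon 1 (1-8), intron (9-11), exon 2 (12-20).
genome = "GGATGAAACTCGTTTAGCCC"
exons = [(1, 8), (12, 20)]         # transcript structure, UTRs included
cds_segments = [(3, 8), (12, 17)]  # coding part only, start codon to stop codon

cdna = splice(genome, exons)
cds = splice(genome, cds_segments)

if __name__ == "__main__":
    print("cDNA:", cdna)
    print("CDS: ", cds)
```

In this toy case the CDS is a substring of the cDNA and has a length divisible by three, which is a quick sanity check that can also be applied to real `Y.cds.fasta` records.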

8 changes: 4 additions & 4 deletions docs/parameters.md
@@ -68,10 +68,10 @@ A Nextflow pipeline for consensus, phased and pan-genome annotation.

## Annotation output options

| Parameter | Description | Type | Default | Required | Hidden |
| ----------------------------- | ------------------------------------ | --------- | ------- | -------- | ------ |
| `braker_save_outputs` | Save BRAKER files | `boolean` | | | |
| `add_attrs_to_proteins_fasta` | Add gff attributes to proteins fasta | `boolean` | | | |
| Parameter | Description | Type | Default | Required | Hidden |
| ---------------------------------- | --------------------------------------------- | --------- | ------- | -------- | ------ |
| `braker_save_outputs` | Save BRAKER files | `boolean` | | | |
| `add_attrs_to_proteins_cds_fastas` | Add gff attributes to proteins/cDNA/CDS fasta | `boolean` | | | |

## Evaluation options

4 changes: 2 additions & 2 deletions nextflow.config
@@ -57,7 +57,7 @@ params {

// Annotation output options
braker_save_outputs = false
add_attrs_to_proteins_fasta = false
add_attrs_to_proteins_cds_fastas = false

// Evaluation options
busco_skip = false
@@ -261,7 +261,7 @@ manifest {
description = """A Nextflow pipeline for consensus, phased and pan-genome annotation."""
mainScript = 'main.nf'
nextflowVersion = '!>=24.04.2'
version = '0.5.0'
version = '0.6.0'
doi = 'https://doi.org/10.5281/zenodo.14195006'
}

6 changes: 3 additions & 3 deletions nextflow_schema.json
@@ -287,10 +287,10 @@
"description": "Save BRAKER files",
"fa_icon": "fas fa-question-circle"
},
"add_attrs_to_proteins_fasta": {
"add_attrs_to_proteins_cds_fastas": {
"type": "boolean",
"fa_icon": "fas fa-question-circle",
"description": "Add gff attributes to proteins fasta"
"description": "Add gff attributes to proteins/cDNA/CDS fasta",
"fa_icon": "fas fa-question-circle"
}
}
},
27 changes: 27 additions & 0 deletions subworkflows/local/gff_store.nf
@@ -2,6 +2,8 @@ import java.net.URLEncoder

include { GT_GFF3 as FINAL_GFF_CHECK } from '../../modules/nf-core/gt/gff3/main'
include { GFFREAD as EXTRACT_PROTEINS } from '../../modules/nf-core/gffread/main'
include { GFFREAD as EXTRACT_CDS } from '../../modules/nf-core/gffread/main'
include { GFFREAD as EXTRACT_CDNA } from '../../modules/nf-core/gffread/main'

workflow GFF_STORE {
take:
@@ -133,9 +135,34 @@ workflow GFF_STORE {
ch_final_proteins = EXTRACT_PROTEINS.out.gffread_fasta
ch_versions = ch_versions.mix(EXTRACT_PROTEINS.out.versions.first())

// MODULE: GFFREAD as EXTRACT_CDS
ch_cds_extraction_inputs = ch_final_gff
| join(ch_fasta)

EXTRACT_CDS(
ch_cds_extraction_inputs.map { meta, gff, fasta -> [ meta, gff ] },
ch_cds_extraction_inputs.map { meta, gff, fasta -> fasta }
)

ch_final_cds = EXTRACT_CDS.out.gffread_fasta
ch_versions = ch_versions.mix(EXTRACT_CDS.out.versions.first())

// MODULE: GFFREAD as EXTRACT_CDNA
ch_cdna_extraction_inputs = ch_final_gff
| join(ch_fasta)

EXTRACT_CDNA(
ch_cdna_extraction_inputs.map { meta, gff, fasta -> [ meta, gff ] },
ch_cdna_extraction_inputs.map { meta, gff, fasta -> fasta }
)

ch_final_cdna = EXTRACT_CDNA.out.gffread_fasta
ch_versions = ch_versions.mix(EXTRACT_CDNA.out.versions.first())

emit:
final_gff = ch_final_gff // [ meta, gff ]
final_proteins = ch_final_proteins // [ meta, fasta ]
final_cds = ch_final_cds // [ meta, fasta ]
final_cdna = ch_final_cdna // [ meta, fasta ]
versions = ch_versions // [ versions.yml ]
}
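Both new blocks follow the same pattern as `EXTRACT_PROTEINS`: join the final GFF channel with the FASTA channel on `meta`, then split the joined tuples into the two inputs the gffread module expects. A rough Python sketch of that data flow follows. It assumes the default behaviour of Nextflow's `join` (match on the first element, drop unmatched items), uses plain strings in place of `meta` maps, and all file names are hypothetical.

```python
def join_by_key(left, right):
    """Pair items from two channels whose first element (the key) matches.

    Items without a partner are dropped, as with Nextflow's `join` when
    `remainder` is not set.
    """
    right_by_key = {key: value for key, value in right}
    return [
        (key, value, right_by_key[key])
        for key, value in left
        if key in right_by_key
    ]


# Stand-ins for ch_final_gff and ch_fasta; strings replace the meta maps.
ch_final_gff = [("a_thaliana", "a_thaliana.gt.gff3")]
ch_fasta = [
    ("a_thaliana", "a_thaliana.fasta"),
    ("unannotated", "unannotated.fasta"),  # no GFF, so it is dropped by the join
]

joined = join_by_key(ch_final_gff, ch_fasta)

# Equivalent of the two .map { meta, gff, fasta -> ... } closures.
gff_input = [(meta, gff) for meta, gff, _fasta in joined]
fasta_input = [fasta for _meta, _gff, fasta in joined]

if __name__ == "__main__":
    print(gff_input)
    print(fasta_input)
```

Deriving both inputs from the same joined channel keeps the i-th GFF aligned with the i-th FASTA. Since `ch_cds_extraction_inputs` and `ch_cdna_extraction_inputs` are built identically, a single joined channel could in principle feed all three gffread aliases; that would be a refactor beyond this PR.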
22 changes: 16 additions & 6 deletions tests/minimal/main.nf.test.snap
@@ -2,7 +2,7 @@
"profile - test": {
"content": [
{
"successful tasks": 18,
"successful tasks": 20,
"versions": {
"AGAT_CONVERTSPGFF2GTF": {
"agat": "v1.4.0"
@@ -25,6 +25,12 @@
"CAT_PROTEIN_FASTAS": {
"pigz": "2.3.4"
},
"EXTRACT_CDNA": {
"gffread": "0.12.7"
},
"EXTRACT_CDS": {
"gffread": "0.12.7"
},
"EXTRACT_PROTEINS": {
"gffread": "0.12.7"
},
@@ -55,10 +61,12 @@
"tsebra": "1.1.2.5"
},
"Workflow": {
"plant-food-research-open/genepal": "v0.5.0"
"plant-food-research-open/genepal": "v0.6.0"
}
},
"stable paths": [
"a_thaliana.cdna.fasta:md5,12b9bef973e488640aec8c04ba3882fe",
"a_thaliana.cds.fasta:md5,b81060419355a590560f92aec8536281",
"a_thaliana.gt.gff3:md5,8ab16549095f605ff8715ac4a3de58ed",
"a_thaliana.pep.fasta:md5,4994c0393ca0245a1c57966d846d101e",
"a_thaliana.gff3:md5,d23d16cd86499d48a30ffb981ed27891",
@@ -67,6 +75,8 @@
"stable names": [
"annotations",
"annotations/a_thaliana",
"annotations/a_thaliana/a_thaliana.cdna.fasta",
"annotations/a_thaliana/a_thaliana.cds.fasta",
"annotations/a_thaliana/a_thaliana.gt.gff3",
"annotations/a_thaliana/a_thaliana.pep.fasta",
"etc",
@@ -81,9 +91,9 @@
}
],
"meta": {
"nf-test": "0.9.0",
"nextflow": "24.04.4"
"nf-test": "0.9.2",
"nextflow": "24.04.2"
},
"timestamp": "2024-11-19T11:35:02.477202"
"timestamp": "2024-12-05T07:51:43.818374"
}
}
}
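The change in `successful tasks` can be cross-checked against the number of genomes: each genome now triggers two extra gffread tasks (`EXTRACT_CDS` and `EXTRACT_CDNA`). A small Python sketch of that arithmetic is below. The helper name is hypothetical, and the genome counts are assumptions read off the snapshots: one genome (`a_thaliana`) in the minimal test and four genome names visible in the stub test's stable paths.

```python
NEW_TASKS_PER_GENOME = 2  # EXTRACT_CDS and EXTRACT_CDNA, one task each per genome


def expected_successful_tasks(previous_total: int, genomes: int) -> int:
    """Previous snapshot total plus the tasks added by this PR."""
    return previous_total + NEW_TASKS_PER_GENOME * genomes


# Minimal test: one genome, 18 tasks before this PR.
minimal_total = expected_successful_tasks(18, genomes=1)

# Stub test: four genomes assumed from the snapshot, 154 tasks before this PR.
stub_total = expected_successful_tasks(154, genomes=4)

if __name__ == "__main__":
    print(minimal_total, stub_total)
```

Both results match the updated snapshots (20 and 162), which suggests the snapshot changes come only from the two new processes.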
8 changes: 4 additions & 4 deletions tests/short/main.nf.test.snap
@@ -5,18 +5,18 @@
"successful tasks": 0,
"versions": {
"Workflow": {
"plant-food-research-open/genepal": "v0.5.0"
"plant-food-research-open/genepal": "v0.6.0"
}
},
"stable paths": [

]
}
],
"meta": {
"nf-test": "0.9.0",
"nextflow": "24.04.4"
},
"timestamp": "2024-10-22T11:39:43.110621"
"timestamp": "2024-12-05T16:37:07.37961"
}
}
}
26 changes: 20 additions & 6 deletions tests/stub/main.nf.test.snap
@@ -2,7 +2,7 @@
"full - stub": {
"content": [
{
"successful tasks": 154,
"successful tasks": 162,
"versions": {
"AGAT_CONVERTSPGFF2GTF": {
"agat": "v1.4.0"
@@ -55,6 +55,12 @@
"EGGNOGMAPPER": {
"eggnog-mapper": "2.1.12"
},
"EXTRACT_CDNA": {
"gffread": "0.12.7"
},
"EXTRACT_CDS": {
"gffread": "0.12.7"
},
"EXTRACT_PROTEINS": {
"gffread": "0.12.7"
},
@@ -143,25 +149,33 @@
"tsebra": "1.1.2.5"
},
"Workflow": {
"plant-food-research-open/genepal": "v0.5.0"
"plant-food-research-open/genepal": "v0.6.0"
}
},
"stable paths": [
"donghong.cdna.fasta:md5,d41d8cd98f00b204e9800998ecf8427e",
"donghong.cds.fasta:md5,d41d8cd98f00b204e9800998ecf8427e",
"donghong.emapper.annotations:md5,d41d8cd98f00b204e9800998ecf8427e",
"donghong.emapper.hits:md5,d41d8cd98f00b204e9800998ecf8427e",
"donghong.emapper.seed_orthologs:md5,d41d8cd98f00b204e9800998ecf8427e",
"donghong.gt.gff3:md5,d41d8cd98f00b204e9800998ecf8427e",
"donghong.pep.fasta:md5,d41d8cd98f00b204e9800998ecf8427e",
"red5_v2p1.cdna.fasta:md5,d41d8cd98f00b204e9800998ecf8427e",
"red5_v2p1.cds.fasta:md5,d41d8cd98f00b204e9800998ecf8427e",
"red5_v2p1.emapper.annotations:md5,d41d8cd98f00b204e9800998ecf8427e",
"red5_v2p1.emapper.hits:md5,d41d8cd98f00b204e9800998ecf8427e",
"red5_v2p1.emapper.seed_orthologs:md5,d41d8cd98f00b204e9800998ecf8427e",
"red5_v2p1.gt.gff3:md5,d41d8cd98f00b204e9800998ecf8427e",
"red5_v2p1.pep.fasta:md5,d41d8cd98f00b204e9800998ecf8427e",
"red5_v3.cdna.fasta:md5,d41d8cd98f00b204e9800998ecf8427e",
"red5_v3.cds.fasta:md5,d41d8cd98f00b204e9800998ecf8427e",
"red5_v3.emapper.annotations:md5,d41d8cd98f00b204e9800998ecf8427e",
"red5_v3.emapper.hits:md5,d41d8cd98f00b204e9800998ecf8427e",
"red5_v3.emapper.seed_orthologs:md5,d41d8cd98f00b204e9800998ecf8427e",
"red5_v3.gt.gff3:md5,d41d8cd98f00b204e9800998ecf8427e",
"red5_v3.pep.fasta:md5,d41d8cd98f00b204e9800998ecf8427e",
"red7_v5.cdna.fasta:md5,d41d8cd98f00b204e9800998ecf8427e",
"red7_v5.cds.fasta:md5,d41d8cd98f00b204e9800998ecf8427e",
"red7_v5.emapper.annotations:md5,d41d8cd98f00b204e9800998ecf8427e",
"red7_v5.emapper.hits:md5,d41d8cd98f00b204e9800998ecf8427e",
"red7_v5.emapper.seed_orthologs:md5,d41d8cd98f00b204e9800998ecf8427e",
@@ -188,9 +202,9 @@
}
],
"meta": {
"nf-test": "0.9.0",
"nextflow": "24.04.4"
"nf-test": "0.9.2",
"nextflow": "24.04.2"
},
"timestamp": "2024-11-21T12:34:14.056074"
"timestamp": "2024-12-05T07:56:38.915238"
}
}
}
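Every entry under `stable paths` in the stub snapshot carries the same checksum, `d41d8cd98f00b204e9800998ecf8427e`. That value is the MD5 digest of zero bytes, so the new `*.cdna.fasta` and `*.cds.fasta` stub outputs are empty placeholder files, as expected for a stub run. A short Python check, using only the standard library:

```python
import hashlib

# MD5 digest of empty input; stub outputs with no content hash to this value.
EMPTY_MD5 = hashlib.md5(b"").hexdigest()

# Checksum recorded for the stub cDNA and CDS FASTA files in the snapshot.
SNAPSHOT_MD5 = "d41d8cd98f00b204e9800998ecf8427e"

if __name__ == "__main__":
    print(EMPTY_MD5 == SNAPSHOT_MD5)
```

The stub snapshot therefore verifies only that the files are created and published under the expected names; the content checksums in the minimal test snapshot are what exercise the actual gffread output.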