Require snvboxGenes.fa file #25

Open
vivekruhela opened this issue Jul 14, 2023 · 1 comment

Comments

@vivekruhela

Hello,

I am using the 2020+ tool to identify potential candidate driver genes. I was able to download the required files, such as snvboxGenes.bed and scores.tar.gz, but I cannot find the exact file required for genes.fa in the following command:

mut_annotate --summary -i genes.fa -b genes.bed -s score_dir -m mutations.txt -o summary.txt

I tried various FASTA files generated from the UCSC Table Browser, but none of them worked. Can you share the exact FASTA file you used in your published work? Thanks.

@ctokheim
Collaborator

ctokheim commented Jan 7, 2024

Hi. The snvboxGenes.fa file (i.e., the input to -i in mut_annotate) is generated with the extract_gene_seq command (see https://probabilistic2020.readthedocs.io/en/latest/tutorial.html#gene-fasta). Download the hg19 genome from UCSC in 2bit format (http://hgdownload.soe.ucsc.edu/goldenPath/hg19/bigZips/hg19.2bit), convert it from 2bit to FASTA format with UCSC's twoBitToFa command-line tool, and then run extract_gene_seq with hg19.fa and snvboxGenes.bed as input.

I've also attached the snvboxGenes.fa file below.
snvboxGenes.fa.gz
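
For anyone who would rather regenerate the file than use the attachment, the three steps described above can be sketched roughly as the shell function below. This is only a sketch under assumptions: the function name build_gene_fasta is made up for illustration, wget, twoBitToFa, and extract_gene_seq are assumed to be on PATH, and the -i/-b/-o flags for extract_gene_seq are assumed from the linked tutorial, so check `extract_gene_seq --help` before relying on them.

```shell
# Sketch: rebuild snvboxGenes.fa from the UCSC hg19 2bit file.
# build_gene_fasta is a hypothetical helper name, not part of any tool.
# Assumes wget, twoBitToFa (UCSC) and extract_gene_seq (probabilistic2020)
# are installed and on PATH.
build_gene_fasta() {
    bed="${1:-snvboxGenes.bed}"   # gene annotation in BED format
    out="${2:-snvboxGenes.fa}"    # gene FASTA file to create

    # 1. Download the hg19 genome in 2bit format (URL from the comment above).
    wget http://hgdownload.soe.ucsc.edu/goldenPath/hg19/bigZips/hg19.2bit || return 1

    # 2. Convert 2bit to FASTA. Usage: twoBitToFa input.2bit output.fa
    twoBitToFa hg19.2bit hg19.fa || return 1

    # 3. Extract the gene sequences listed in the BED file.
    #    The -i/-b/-o flags are an assumption taken from the tutorial.
    extract_gene_seq -i hg19.fa -b "$bed" -o "$out" || return 1
}
```

Calling `build_gene_fasta snvboxGenes.bed snvboxGenes.fa` should then leave a snvboxGenes.fa in the current directory that can be passed to `mut_annotate` via -i. The hg19.2bit download and the converted hg19.fa are both large, so make sure there is enough disk space.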
