Error index f8 #8

Open
bioinfogit opened this issue Aug 16, 2023 · 4 comments

Comments

@bioinfogit

Hi,
I am getting the following error:
pysam.utils.SamtoolsError: 'samtools returned with error 1: stdout=, stderr=[faidx] Could not build fai index ../path/tmp/f8_ref.fa.fai\n'
Removing f8 from the gene list works. I am using the latest version:
paraphase --version
2.2.3

@xiao-chen-xc
Collaborator

Hi @bioinfogit, I'm not able to reproduce this error. Could you delete that tmp folder and try again?

@themkdemiiir

themkdemiiir commented Nov 22, 2023

Hello,
The version is 2.2.3.
Docker image: quay.io/pacbio/paraphase:2.2.3_build2

I received an error message when I executed the command in the Docker environment. However, after checking the outdir, I could not locate the tmp folder.

root@a9f9e6fb2c28:/# paraphase --threads 8 --bam /longread/NA17282.HomoSapiens.aligned.haplotagged.bam -o /longread/ --reference /genomes/Homo_sapiens.GRCh38.dna.primary_assembly.fa              
ERROR:root:Error running the program...See error message below
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/paraphase/paraphase.py", line 472, in run
    configs = self.update_config(gene_list, tmpdir, args.reference)
  File "/usr/local/lib/python3.8/dist-packages/paraphase/paraphase.py", line 325, in update_config
    self.make_ref_fasta(ref_file, realign_region, genome)
  File "/usr/local/lib/python3.8/dist-packages/paraphase/paraphase.py", line 352, in make_ref_fasta
    pysam.faidx(ref_file)
  File "/usr/local/lib/python3.8/dist-packages/pysam/utils.py", line 83, in __call__
    raise SamtoolsError(
pysam.utils.SamtoolsError: 'samtools returned with error 1: stdout=, stderr=[faidx] Could not build fai index /longread/tmp_2023-11-22-12-15-40-154747/smn1_ref.fa.fai\n'
INFO:root:Completed Paraphase analysis at 2023-11-22 12:15:40.277212...

@themkdemiiir

themkdemiiir commented Nov 22, 2023

I now understand the issue. My Ensembl reference file lacks the "chr" prefix in its chromosome names. Could you fix the problem here?

Error

kaan@biyoinfo1:~$ samtools faidx levopt/hg38/genomes/ensembl_p13_primary/Homo_sapiens.GRCh38.dna.primary_assembly.fa chr5:70890000-71100000 | sed -e "s/-/_/" | sed -e "s/:/_/" > kaan.txt
[W::fai_get_val] Reference chr5:70890000-71100000 not found in FASTA file, returning empty sequence
[faidx] Failed to fetch sequence in chr5:70890000-71100000
kaan@biyoinfo1:~$ samtools faidx kaan.txt
[faidx] Could not build fai index kaan.txt.fai

No Error

kaan@biyoinfo1:~$ samtools faidx levopt/hg38/genomes/ensembl_p13_primary/Homo_sapiens.GRCh38.dna.primary_assembly.fa 5:70890000-71100000 | sed -e "s/-/_/" | sed -e "s/:/_/" > kaan.txt
kaan@biyoinfo1:~$ samtools faidx kaan.txt
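
In the failing run above, the region fetch returns no sequence, and indexing the resulting file then fails with "Could not build fai index". A minimal Python sketch of a guard that catches this earlier is shown here. The function name `fasta_has_sequence` is hypothetical and not part of Paraphase or pysam; it only checks that an extracted FASTA file holds at least one sequence line before any indexing is attempted.

```python
from pathlib import Path


def fasta_has_sequence(fasta_path):
    """Return True if the file contains at least one non-empty sequence line.

    Hypothetical helper for illustration; not part of Paraphase or pysam.
    """
    path = Path(fasta_path)
    if not path.exists():
        return False
    with path.open() as handle:
        for line in handle:
            stripped = line.strip()
            # Skip blank lines and FASTA header lines (those starting with ">").
            if stripped and not stripped.startswith(">"):
                return True
    return False
```

If this returns False for a file such as kaan.txt, the likely cause is that the requested contig name (for example chr5 versus 5) does not exist in the reference, so that is worth checking before looking at the index step itself.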

@xiao-chen-xc
Collaborator

xiao-chen-xc commented Nov 28, 2023

Hi @themkdemiiir, Paraphase assumes GRCh38 has "chr" in chromosome names. Could you realign to the UCSC/NCBI version and rerun Paraphase? For best performance with HiFi data, please remove ALT contigs from the reference genome before alignment. We do have a recommended version of the reference genome (with download links) documented here.
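
Both requirements above (a "chr" prefix and no ALT contigs) can be checked from the reference's `.fai` index before alignment. The following is a rough Python sketch, not something Paraphase provides: the function name `summarize_contigs` is hypothetical, and treating names that end in `_alt` as ALT contigs is an assumption based on UCSC-style naming, so it may miss ALT contigs that are named differently.

```python
from pathlib import Path


def summarize_contigs(fai_path):
    """Summarize contig naming in a samtools .fai index.

    Hypothetical helper for illustration. The contig name is the first
    tab-separated column of each .fai line.
    """
    names = [
        line.split("\t")[0]
        for line in Path(fai_path).read_text().splitlines()
        if line.strip()
    ]
    return {
        # Heuristic: True if any contig name starts with "chr".
        "uses_chr_prefix": any(name.startswith("chr") for name in names),
        # Assumption: ALT contigs follow the UCSC-style "_alt" suffix.
        "alt_contigs": [name for name in names if name.endswith("_alt")],
    }
```

For an Ensembl-style reference like the one in this thread, `uses_chr_prefix` would be expected to come back False, which matches the failure seen with chr5.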
