
RepEnrich2 gets stuck without error message #24

Open
mmisak opened this issue Nov 8, 2021 · 10 comments

Comments

mmisak commented Nov 8, 2021

Hello,
I hope this tool is still actively maintained.

In the past, I ran RepEnrich2 successfully on our HPC with an older dataset, using the whole, unfiltered RepeatMasker repeat set. Recently, however, with other datasets, running the RepEnrich2/RepEnrich2.py command leads to the program getting stuck without any error message. Strangely, this only happens for some samples of a sequencing run; others do finish. When I check the pair1_ folder of an affected sample, I always see that after .txt files have been created for some thousands of repeats, at some point no new .txt files are created for several days (even though the run is not done yet), until the job times out.

I'm running it on 130 CPUs with 1TB of RAM.

I checked stdout and stderr, but both are empty. Do you have any suggestions?

nskvir (Collaborator) commented Nov 8, 2021

Hi, thanks for your interest in our project,

While RepEnrich2 can sometimes take a long time on larger files, I've never observed it completely hanging before. Is there a simple way to reproduce this? For example, does it always hang on the exact same data files if you attempt a second run, or is it random to some degree? How large are the datasets, if you need to use 130 CPUs and 1 TB of RAM?

The only things I can think of off the top of my head (at least without any error messages) are either an update or change to one of the other software dependencies causing some unintended behavior, or, if you are running on unusually large datasets, some other kind of limitation or edge case there.

I would first check the versions of the other dependencies (bowtie, bedtools, samtools, biopython) and see whether any of them were updated around the time you started having the issue.
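For recording those dependency versions in one place, something like the following sketch could help (this is not part of RepEnrich2 itself; it assumes the tools are on PATH, so adjust the command names for your module/environment system):

```python
import subprocess

def tool_version(cmd):
    """Return the first line a tool prints for --version, or None if it isn't found."""
    try:
        proc = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
        return proc.stdout.decode().splitlines()[0]
    except (OSError, IndexError):
        return None

if __name__ == "__main__":
    # Query each dependency the same way and print a small report.
    for cmd in (["bowtie2", "--version"],
                ["samtools", "--version"],
                ["bedtools", "--version"],
                ["python", "--version"]):
        print("%-10s %s" % (cmd[0], tool_version(cmd) or "not found"))
```

Saving this report per run makes it easy to compare the environment of a successful sample against a hanging one.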

mmisak (Author) commented Nov 8, 2021

Thanks for your reply.

I'm not sure whether it's always stuck at the same file or not. I'll rerun my most recent run and note down where it got stuck; then I'll be able to answer whether it hangs there again or not (it'll probably take two days or so to get there).

My .fastq files are around 18–22 GB and I'm using the whole, unfiltered mm10 RepeatMasker index. Even my successful runs in the past took about 2–3 days per sample. I thought this was normal and probably caused by the many repeats to test(?)

I'm using the following tool versions:
python (2.7.14) + biopython 1.66
bowtie2 (2.2.9)
samtools (1.10)
bedtools (2.27.1)

Which versions do you use/usually recommend?

mmisak (Author) commented Nov 11, 2021

Update: Upon running it again on the same data with the same settings, it hangs again at exactly the same repeat.

So do you think the tool versions I mentioned at the end of my last post are fine? Which versions do you recommend?

nskvir (Collaborator) commented Nov 11, 2021

The only version I can see that might have some kind of issue is Samtools (I believe we originally tested with v1.3.1). The .fastqs do seem fairly large, so long runtimes like you described are pretty typical. If it were due to this though, I wouldn't expect other samples to run without issue. Do you remember which repeat it hangs on? Without being able to reproduce the error on my end it is difficult to diagnose.

mmisak (Author) commented Nov 16, 2021

Sorry for the late reply, I got back to it just now.

> The .fastqs do seem fairly large, so long runtimes like you described are pretty typical. If it were due to this though, I wouldn't expect other samples to run without issue. Do you remember which repeat it hangs on?

The last repeats that were written to pair1_ are _CACGGT_n.txt and _CACAG_n.txt, but these files look fine. Is there any way to find out which repeat would have been next?
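One way to narrow that down, as a sketch: assuming RepEnrich2 writes one `<repeat>.txt` per repeat into the pair1_ folder and walks the repeats in the order they appear in the annotation (both assumptions worth checking against RepEnrich2.py itself), the repeat names with no file yet can be listed like this (`repeat_names` would be the names parsed from your RepeatMasker setup, in order):

```python
import os

def unwritten_repeats(repeat_names, outdir):
    """Return the repeat names, in their original order, that have no .txt
    file in outdir yet. Under the ordering assumption above, the first
    element would be the repeat processed next when the run stalled."""
    written = set(os.path.splitext(f)[0] for f in os.listdir(outdir)
                  if f.endswith(".txt"))
    return [name for name in repeat_names if name not in written]
```

Running this against the pair1_ folder of a hung sample and printing the first few entries would point at the candidate repeat.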

> Without being able to reproduce the error on my end it is difficult to diagnose.

If we can't solve it this way and you would like to have a look, I'd ask my supervisor whether I can share one of the files that hang for me. For now, I will re-run everything with samtools 1.3.1 and see if that makes a difference.

mmisak (Author) commented Nov 24, 2021

Update: after running this multiple times now (also with samtools 1.3.1): for the sample I am testing, the last file written is not always exactly the same file, but approximately the same one. And weirdly, it always seems to stop after 5460 files have been written to the pair1_ folder.

If it didn't work for other samples either, I'd almost suspect some odd restriction of our HPC/filesystem. But since other samples work, it's probably really RepEnrich2 getting stuck at the same repeat, having been able to write this number of files before hanging(?)

If you'd like to have a look, I can ask my supervisor whether I can share one of these files with you.

nskvir (Collaborator) commented Nov 24, 2021

Sure, I'd be interested to take a look if you get permission to share one of the files!

mmisak (Author) commented Dec 7, 2021

Again sorry for the late reply, I am currently doing this more on the side.

I uploaded my data on our university's file sharing service: https://seafile.rlp.net/d/8c325fc2eefd495faa71/
The upload contains two samples that failed for me: one paired-end mouse sample and one single-end human sample. Additionally, I added the RepeatMasker files and bowtie indexes I used.

Please let me know as soon as you have copied the data, so that I know when I can delete it again.

nskvir (Collaborator) commented Dec 9, 2021

All finished copying; I'll try to run it when I have some time.

mmisak (Author) commented Dec 10, 2021

Thanks a lot, I have deleted the upload again. I hope all files are fine, since there were some issues while uploading. In case they're not, just let me know.
