rename fasta headers with regex #77

Open
wants to merge 1 commit into
base: master

CSynodinos

I am proposing a script for renaming FASTA headers with regex. Header IDs and descriptions can be changed either together or individually.
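To illustrate the general idea (this is a minimal sketch, not the script in this PR), a header can be split into its ID and description, with an optional regex rule applied to each part. The function name `rename_header` and the `(pattern, replacement)` rule format are assumptions made for this example only; they do not describe the CSV format or internals of the proposed script.

```python
import re


def rename_header(header, id_rule=None, desc_rule=None):
    """Apply optional (pattern, replacement) regex rules to one FASTA header.

    Hypothetical helper for illustration; the rule format is an assumption.
    """
    body = header.rstrip("\n")
    if body.startswith(">"):
        body = body[1:]
    # By FASTA convention, the ID is the text up to the first space;
    # everything after it is the description.
    seq_id, _, desc = body.partition(" ")
    if id_rule is not None:
        seq_id = re.sub(id_rule[0], id_rule[1], seq_id)
    if desc_rule is not None:
        desc = re.sub(desc_rule[0], desc_rule[1], desc)
    return ">" + seq_id + (" " + desc if desc else "")


# Change the ID and the description together, or only one of them.
both = rename_header(
    ">seq_001 hypothetical protein",
    id_rule=(r"^seq_", "gene"),
    desc_rule=(r"hypothetical", "putative"),
)
id_only = rename_header(">seq_001 hypothetical protein", id_rule=(r"^seq_", "gene"))
```

Passing only `id_rule` or only `desc_rule` leaves the other part of the header untouched, which mirrors the "together or individually" behaviour described above.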

@CSynodinos CSynodinos changed the title from "Master" to "rename fasta headers with regex" Mar 15, 2022
@jorvis
Owner

jorvis commented Apr 4, 2022

Not sure how I missed this! OK, I see the header pattern CSV file is documented, but could you give an example of using it? Some example input headers, a CSV file, and the exported headers? I was trying to see from the documentation how you'd handle the possibility of duplicates in the output.

@jorvis jorvis self-assigned this Apr 4, 2022
@CSynodinos
Author

Hello Jorvis, sorry for the late response. I have attached some example files and the output. The input FASTA is the Covid-19 Wuhan variant sequence copy-pasted several times, with every header altered except the first one.

The command I used was:
python3 header_renamer.py -i header_test_file.fasta -cv patterns.csv -cnt true

As for duplicates, my solution is the -cnt argument, which adds a counter to each header as the file is processed. It is not the best solution, but it makes every header different whether or not the output would otherwise contain duplicates. Its default value is False.
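As a rough sketch of the counter idea (not the code in this PR), a running number can be attached to every header ID. The function name `add_counters`, the `_N` suffix, and its placement at the end of the ID are all assumptions for this example; the actual output format of -cnt may differ.

```python
def add_counters(headers):
    """Append a running 1-based counter to the ID of each FASTA header.

    Hypothetical helper; the "_N" suffix format is an assumption.
    """
    numbered = []
    for n, header in enumerate(headers, start=1):
        # Split off the description so the counter lands on the ID only.
        seq_id, sep, desc = header.partition(" ")
        numbered.append(f"{seq_id}_{n}{sep}{desc}")
    return numbered


renamed = add_counters([">sample virus", ">sample virus", ">other"])
```

Because each header receives a different number after the final underscore, the resulting IDs are distinct even when the renamed headers were identical, at the cost of also changing headers that were already unique.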

example_files.zip
