-
Notifications
You must be signed in to change notification settings - Fork 445
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Mummer4 tools #1953
Add Mummer4 tools #1953
Conversation
For reference, I wrapped a sub-set of the suite here: What you are doing here will be far more flexible and powerful - good plan 👍 |
Wonderful @nickeener! Many of us are currently at the GCC ... so sorry if we take a little bit longer to review it. Thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- please add a .shed.yml file
tools/mummer4/annotate.xml
Outdated
<command detect_errors="exit_code"> | ||
<![CDATA[ | ||
annotate | ||
$mgaps |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
single quotes for all data-params and text-params
tools/mummer4/combineMUMs.xml
Outdated
<option value="no">NO</option> | ||
</param> | ||
<when value="yes" > | ||
<param name="qlabel" type="text" argument="-q" value="query" label="Query match label" /> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you could remove this conditional and simply make the text parameter optional="True"
tools/mummer4/combineMUMs.xml
Outdated
<param name="strings" type="boolean" argument="-s" truevalue="-s" falsevalue="" label="Output all differences in strings" /> | ||
<param name="qheader" type="boolean" argument="-t" truevalue="-t" falsevalue="" label="Label query matches with query fasta header" /> | ||
<param name="verbose" type="select" label="Set verbose level for extra output" > | ||
<option value="0">0</option> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please use indentation of 4 spaces everywhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just used a single tab for all my indentation, does that mean I need to go through and replace every tab with 4 spaces?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this would be cool. please also see here: http://galaxy-iuc-standards.readthedocs.io/en/latest/best_practices/tool_xml.html
tools/mummer4/combineMUMs.xml
Outdated
</outputs> | ||
<tests> | ||
<test> | ||
<param name="reference_sequence" ftype="fasta" value="K12.fasta" /> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
indentation
tools/mummer4/dnadiff.xml
Outdated
<param name="query_sequence" type="data" format="fasta" multiple="True" label="Query Sequence(s)" /> | ||
</when> | ||
<when value="delta"> | ||
<param name="delta_data" type="data" format="txt" label="Alignment file from nucmer" /> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
general question to all of these txt
formats. Are they really undefined, or are these maybe tabular etc...?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How would I tell if they were tabular? The original file in this case had the suffix ".delta" which wasn't an accepted format so I changed it to .txt.
I have a weird problem with the mgaps command. I initially installed mummer4 on my laptop several months ago and mgaps worked fine. But then just recently I got a new laptop and installed mummer4 on it and now the mgaps command does not exist. Not sure why it didn't get installed the 2nd time or how to get it now. Also, the nucmer command is now failing even though it passed before making this pull request and I have not updated it since. |
How did you installed it? With |
I used: conda install -c bioconda mummer4 |
Great job! Some notes:
|
@bgruening Could you please take a look at my modified code? Travis-CI tests do not pass yet due to some problem with git but we are working on that. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@nickeener thanks, this looks very good!
- please check all filetypes if you can not use tabular instead of txt
- please check your requirements, ideally you have just mummer4 as the requirement
- you could save a lot of lines if you use macros ... e.g. requirements, citations, version numbers and so on can all be moved into a macro file and shared across the tools
tools/mummer4/delta-filter.xml
Outdated
<tool id="delta-filter" name="Delta-Filter" version="0.1.0"> | ||
<requirements> | ||
<requirement type="package" version="4.0.0beta2">mummer4</requirement> | ||
<requirement type="package" version="1.12.14">sh</requirement> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@nickeener why are all the other dependency needed? You just need mummer4 isn't it? And what is sh
?
tools/mummer4/mummer.xml
Outdated
@@ -0,0 +1,176 @@ | |||
<tool id="mummer" name="Mummer" version="0.1.0"> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would change all tool versions to the mummer version.
tools/mummer4/mummer.xml
Outdated
</conditional> | ||
</inputs> | ||
<outputs> | ||
<data name="output" format="txt" from_work_dir="mummer.txt" label="mummer" /> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would remove all label from the outputs ... or use something like label="${tool.name} on ${on_string}: mumplot"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For all tools :)
tools/mummer4/nucmer.xml
Outdated
</requirements> | ||
<command detect_errors="exit_code"> | ||
<![CDATA[ | ||
#set $query=""+"".join(str($query_sequence).split(','))+"" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you also use "'+'"
here? Is the leading +""
needed?
tools/mummer4/show-coords.xml
Outdated
<tests> | ||
<test> | ||
<param name="delta" ftype="txt" value="nucmer.txt" /> | ||
<output name="output" ftype="txt" compare="diff" value="show-coords.txt" /> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could, if you like, attach the column names actually as metadata.
See here: https://github.com/galaxyproject/tools-iuc/blob/master/tools/deseq2/deseq2.xml#L170
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And it should probably be tabular isn't it?
@bgruening Thanks for the feedback. I believe I've fixed everything you mentioned. Are there any other changes I should make? |
… plus some other minor fixes
… 'from_work_dir' flag to reflect their correct extension, added nucmerplot and promerplot (combo tools of nucmer and mummerplot and promer and mummerplot), completely re-did the test data using smaller sequences, added labels to outputs, plus several other minor fixes
…axis option for mummerplot, and some other minor changes
…quirements, added macro for citation and mummerplot input, fixed version number, fixed output labels, added column names for certain outputs, and a few other fixes
…ti-fasta is much simpler)
@nickeener I rebased against master for you, but a few tests are still failing. Sorry. |
…names to some dnadiff outputs
…m tools can find original input files
@nekrut can you have a final look at this? |
@nickeener add help section as we discussed. Once done, @bgruening I think we are ready! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Anyone else?
Thanks @nickeener!
Thanks for all the help @bgruening! |
Hey my name's Nick and this is my first project for Dr. Nekrutenko. It's not quite done and there are still a few problems that I will list, but Nekrutenko suggested making a pull request so more people could offer help.
This is my attempt at wrapping the 16 commands of the mummer4 package, which contains sequence alignment and visualization tools. Of the 16 commands, I was able to able to get 10 of them to pass the test run by planemo (nucmer, delta-filter, exact-tandems, mgaps, mummer, repeat-match, show-aligns, show-coords, show-diff, show-snps). Another 3 fail the test but that problem should be fixed by the following pull request by Dave B: mummer4/mummer#68 (promer, dnadiff, mummerplot).
Problems:
mummer - needs to incorporate data manager created by Dave B (https://github.com/davebx/tools-iuc/tree/mummer_dm/data_managers/data_manager_mummer4_index_builder). Not sure how to go about this.
dnadiff - First test gives error: "Syntax error: Unbound variable input". Not sure what's going on here. Second test fails due to foundation.pm error that's fixed by aforementioned pull request.
show-tiling - When running the command, the output is blank. I have have the dependencies and optional utilities installed so I don't know why it's not working.
combineMUMs - When I run this command I get a seg fault. Also no idea why.
annotate - Confused about the input for this command. Man page says it's the output from the gaps command except there is no gaps command and the mgaps output doesn't seem to work.
I'd appreciate any suggestions on fixing these problems or any concerning any of the other commands. I look forward to working with you all.
Nick Keener