Is this a feature request for FCS-adaptor or FCS-GX?
FCS-adaptor (I am using v0.4.0)
Describe the problem you'd like to be solved
Don't fail on sequences <10 bp
Describe the solution you'd like
Please add a CLI switch to simply skip over/ignore sequences that are shorter than 10bp
Describe alternatives you've considered
Checking/filtering all input sequences beforehand, which implies that each sequence file is processed at least twice (checking and then adaptor scanning)
Thanks
Can I ask in what context you are working with sequences <10 bp? NCBI GenBank submissions have length requirements (200 bp for genome sequences, 10 bp for others), hence the validation check included here. If these sequences aren't intended for submission to NCBI archives, that's fine, and we can consider adding this as an optional flag in a future release. For now, my best suggestion is a workaround: extract the short sequences, set them aside while running FCS-adaptor on the longer sequences, then add them back in.
I am not "working" with these sequences; I strongly assume they are garbage contained in a handful of the genome assemblies I am analyzing. If the length requirement exists because of strict filter criteria for submissions, maybe a cleaner solution would be a strict setting, on by default, that simply labels too-short sequences for removal, analogous to how remainders of adaptor sequences are flagged.
In my case, of course, I am now pre-filtering the assembly FASTAs.
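For anyone landing here with the same problem, the pre-filtering workaround discussed above could look roughly like the sketch below. It is not part of FCS-adaptor; the function names `read_fasta` and `split_by_length` are made up for illustration, and the only value taken from this thread is the 10 bp threshold. It assumes plain (uncompressed) FASTA text and writes each sequence on a single line rather than preserving the original line wrapping.

```python
import io

MIN_LEN = 10  # threshold from this thread; assumed to mean "keep length >= 10"


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA text stream."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            # A new record starts: emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None and line:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def split_by_length(handle, keep_out, short_out, min_len=MIN_LEN):
    """Write records with len >= min_len to keep_out and the rest to short_out.

    Returns (number kept, number set aside).
    """
    kept = short = 0
    for header, seq in read_fasta(handle):
        if len(seq) >= min_len:
            keep_out.write(f">{header}\n{seq}\n")
            kept += 1
        else:
            short_out.write(f">{header}\n{seq}\n")
            short += 1
    return kept, short
```

To use it on real files, open the assembly and the two outputs with `open(...)` and pass the handles. Only the "keep" file is then given to FCS-adaptor, and the short records can be appended back afterwards if they are wanted.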