Results_Interpretation #10

Open
Oelsakha opened this issue Jan 16, 2023 · 3 comments
Comments

@Oelsakha

Hi,
Thank you so much for your nice tool and your hard work.
I have two questions:

  1. I used antiSMASH version 7. Does the tool work well with it?
  2. How should I choose the cutoff value for the tree classifier, the logistic regression classifier, and the SVM classifier?

Thank you
@allie-walker
Owner

It will run with versions of antiSMASH that it was not trained on, but the predictions may be slightly less accurate because the gene annotations differ slightly between versions. We are almost done adding support for antiSMASH 6 and will start working on support for 7 once it is out of beta. So far we have seen that a mismatch between the training set and the input antiSMASH version doesn't affect the predictions much, but we haven't tested this very rigorously.

For the cutoffs, it depends on the application and how tolerant you are of false positives. The values represent a probability of activity, so anything >50% means that the ML classifier thinks the cluster is more likely to be active than not. That does not mean that everything <50% will be inactive. If you are limited in how much you can screen, you can use a higher cutoff for a better chance of success. Also, if all three classifiers give similar probabilities, that likely indicates a more reliable prediction; if they disagree (e.g. one says 20% active, another says 60%), it could indicate a cluster that is difficult to predict because it is too dissimilar to the training set. We are still testing how it works on novel gene clusters, but generally we look for all three classifiers to give probabilities >50%.
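
As an illustration of the guidance above, here is a minimal Python sketch that checks whether all three classifier probabilities exceed a chosen cutoff and how far apart they are. The function name, the dictionary keys, and the 0.2 spread threshold are hypothetical choices for this example and are not part of the tool or its output format; only the 50% cutoff comes from the comment above.

```python
def summarize_predictions(probs, cutoff=0.5, max_spread=0.2):
    """Summarize activity probabilities from several classifiers.

    probs      -- dict mapping a classifier name to its probability (0 to 1)
    cutoff     -- probability every classifier must exceed (0.5 = "more likely
                  active than not"; raise it if screening capacity is limited)
    max_spread -- assumed, arbitrary limit on max - min probability below which
                  the classifiers are treated as agreeing
    """
    values = list(probs.values())
    spread = max(values) - min(values)
    return {
        "all_above_cutoff": all(p > cutoff for p in values),
        "spread": spread,
        "classifiers_agree": spread <= max_spread,
    }


# Hypothetical probabilities for one cluster, copied by hand from the output.
example = {"tree": 0.20, "logistic_regression": 0.60, "svm": 0.55}
result = summarize_predictions(example)

if not result["classifiers_agree"]:
    print("Classifiers disagree; the cluster may be unlike the training set.")
elif result["all_above_cutoff"]:
    print("All classifiers are above the cutoff.")
else:
    print("Classifiers agree, but not all are above the cutoff.")
```

With the example values, the tree classifier falls below 50% and the spread is about 0.4, so the cluster would be flagged as one the classifiers disagree on. Raising `cutoff` trades fewer candidates for a better chance that each one is active.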

@Oelsakha
Author

Thank you so much!

@liangly1

Hi,
Thank you so much for your nice tool and your hard work.
I have two questions:
Can this method currently make predictions from antiSMASH 7.0 results?
If not, approximately when will it be updated?


3 participants