Excellent work! I am reading ProtSSN and trying to use it, but I have a few questions:
The input in downstream tasks is the entire protein, while the training set uses CATH domains. How does this difference affect the model's performance?
The model's inputs during training are crystal structures, but AF2 or ESM2 predicted structures are used for inference. How much bias does this introduce?
If I want to use ProtSSN for downstream tasks, do I just use the code provided in the README to extract embeddings?
Congratulations again on your work!
Best regards
Good question! From a biological perspective, CATH domains already cover a sufficient range of protein structural patterns. From a computational perspective, however, this is indeed a gap, and we plan to run additional experiments to test it.
We don't know how much error this introduces in computational (dry-lab) experiments, because crystal structures are not available for most proteins. We are currently running wet-lab validation, and so far both predicted and crystal structures appear to work well. We may be able to answer this question in future iterations of this work.
I have added new code for fine-tuning ProtSSN on arbitrary downstream tasks; you can see it here. You provide a CSV file with labels plus the PDB files for training; the dataset format is described here.
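Before fine-tuning on a CSV of labels plus a folder of PDB files, it can help to check that every labeled entry actually has a structure file. The sketch below is not taken from the ProtSSN repository: the function `find_missing_structures`, the column name `name`, and the `<name>.pdb` file naming are all assumptions for illustration, so adjust them to the dataset format the repository documents.

```python
import csv
from pathlib import Path


def find_missing_structures(csv_path, pdb_dir, name_column="name"):
    """Return CSV entries that have no matching .pdb file in pdb_dir.

    Assumption (hypothetical layout): each row has a `name_column` value
    and the structure is stored as `<pdb_dir>/<name>.pdb`.
    """
    pdb_dir = Path(pdb_dir)
    missing = []
    with open(csv_path, newline="") as handle:
        for row in csv.DictReader(handle):
            name = row[name_column].strip()
            # Record the entry if its structure file is absent.
            if not (pdb_dir / f"{name}.pdb").is_file():
                missing.append(name)
    return missing
```

Calling `find_missing_structures("labels.csv", "pdb/")` (both paths hypothetical) returns the names to fix or drop before training starts.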
Thank you for your interest, and feel free to follow our latest work, ProSST. 😊