Inputs during training and inference #5

Open
zchwang opened this issue Jun 27, 2024 · 1 comment


@zchwang

zchwang commented Jun 27, 2024

Hi,

Excellent work! I am reading ProtSSN and trying to use it, but I have a few questions:

  1. The input in downstream tasks is the entire protein, while the training set uses CATH domains. How does this difference affect the model's performance?
  2. The model's inputs during training are crystal structures, but AF2- or ESM2-predicted structures are used for inference. How much bias does this introduce?
  3. If I want to use ProtSSN for downstream tasks, do I just use the code provided in the README to extract embeddings?

Congratulations again on your work!

Best regards

@tyang816
Owner

Hi, Wang,

  1. Good question! From a biological perspective, the CATH domains already cover a sufficient range of protein structural patterns. From a computational perspective, however, this is indeed a gap, and we will run additional experiments to test it in the future.
  2. We don't know how much error this causes in computational (dry-lab) experiments, because crystal structures are not available for most proteins. We are currently running wet-lab validation, and so far both predicted structures and crystal structures seem to work well. We may be able to answer this question in future iterations of this work.
  3. I have added new code for fine-tuning ProtSSN on any downstream task; you can see it here. You can provide a CSV file with labels, together with PDB files, for training; the dataset format can be found here.
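
As a rough illustration of the "CSV with labels plus PDB files" layout mentioned in point 3, here is a minimal sketch that checks each labeled entry has a matching structure file before fine-tuning. The column names `name` and `label`, the `<name>.pdb` naming scheme, and the helper `find_missing_structures` are all assumptions made for this example, not the repository's actual format; please follow the linked dataset description for the real layout.

```python
import csv
from pathlib import Path


def find_missing_structures(csv_path, pdb_dir, name_col="name", label_col="label"):
    """Read (name, label) pairs from a CSV and report entries without a PDB file.

    Assumed (hypothetical) layout: one row per protein, and one file
    named <name>.pdb per protein inside pdb_dir.
    """
    pdb_dir = Path(pdb_dir)
    rows, missing = [], []
    with open(csv_path, newline="") as handle:
        for record in csv.DictReader(handle):
            name = record[name_col]
            label = float(record[label_col])  # assumes a numeric label
            rows.append((name, label))
            if not (pdb_dir / f"{name}.pdb").is_file():
                missing.append(name)
    return rows, missing
```

Calling `find_missing_structures("labels.csv", "pdbs/")` returns the parsed rows and the names that have a label but no structure file, which makes it easy to catch mismatches before starting a training run.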

Thank you for your interest, and you are welcome to follow our latest work, ProSST. 😊
