Support extraction of secondary structure elements from PDBx files #710

Open
wants to merge 6 commits into
base: main


ceziegler

Added a get_sse function. Each residue is annotated as 'a' for alpha helix, 'b' for beta sheet, or 'c' for coil.

@ceziegler
Author

I did not add tests for this function, but I did step through it with the debugger. Would you like me to go ahead and write a test for it?

@padix-key
Member

Resolves #534

padix-key changed the title from "Issue 534" to "Support extraction of secondary structure elements from PDBx files" on Nov 30, 2024
Member

padix-key left a comment

Hi, thanks for adding the function! It would be good to have one or two tests in tests/structure/io/test_pdbx.py. I can think of the following ones:

  • Apply get_sse() to a short structure from tests/structure/data (e.g. 1aki.bcif, as it contains both sheets and helices) and assert against a 'correct' reference SSE array, maybe crafted by hand from Mol* (e.g. https://www.rcsb.org/3d-view/1aki).
  • Apply get_sse() to all structures in tests/structure/data via test parametrization and check that the length of the SSE array is equal to struc.get_residue_count() (peptide chains only).
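
The shape of these two tests could look roughly like the sketch below. It is deliberately self-contained: toy_get_sse() and TOY_STRUCTURES are hypothetical stand-ins written in plain Python, not the real get_sse() or the files in tests/structure/data, and the real tests would presumably use pytest parametrization instead of a loop.

```python
def toy_get_sse(n_residues, helices, sheets):
    """Hypothetical stand-in for get_sse() on a single chain.

    Ranges are (first, last) residue indices, 0-based and inclusive.
    """
    sse = ["c"] * n_residues
    for letter, ranges in (("a", helices), ("b", sheets)):
        for first, last in ranges:
            for i in range(first, last + 1):
                sse[i] = letter
    return sse


# Made-up structures: (residue count, helix ranges, sheet ranges)
TOY_STRUCTURES = {
    "helix_only": (6, [(1, 3)], []),
    "mixed": (8, [(0, 2)], [(5, 6)]),
    "coil_only": (4, [], []),
}


def test_against_reference():
    # Test 1: compare with a hand-crafted reference annotation
    reference = list("aaaccbbc")
    assert toy_get_sse(*TOY_STRUCTURES["mixed"]) == reference


def test_length_matches_residue_count():
    # Test 2: one SSE letter per residue, for every available structure
    for name, (n_residues, helices, sheets) in TOY_STRUCTURES.items():
        sse = toy_get_sse(n_residues, helices, sheets)
        assert len(sse) == n_residues, name


test_against_reference()
test_length_matches_residue_count()
```

In the actual test module, the reference for 1aki would be written out by hand and the loop in the second test would become one parametrized test case per structure file.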

In addition, you would need to run the Ruff code formatter to make the CI pass, and the merge conflicts need to be resolved.

Let me know if you have any questions!

src/biotite/structure/io/pdbx/convert.py (outdated)

Returns
----------
sec_struct_dic: keys are the different chains from the pdbx file
Member

Similar to the description of a parameter, Numpydoc uses the following form for return values:

some_name : some_type
    Some description.
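
As an illustration only, a complete docstring in this form might look like the sketch below. The function coil_annotation() and its parameter are hypothetical and not part of this pull request; they only exist to show a "Returns" section with a name, a type, and an indented description. Note that the underline has the same length as the section title.

```python
def coil_annotation(chain_lengths):
    """
    Create an all-coil annotation for each chain.

    Parameters
    ----------
    chain_lengths : dict of (str -> int)
        Maps each chain ID to its number of residues.

    Returns
    -------
    sse_dict : dict of (str -> list of str)
        Maps each chain ID to one letter per residue.
        Every residue is initialized as ``'c'`` (coil).
    """
    # Hypothetical helper: one 'c' per residue of each chain
    return {chain: ["c"] * n for chain, n in chain_lengths.items()}
```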

src/biotite/structure/io/pdbx/convert.py (outdated)
Comment on lines +1660 to +1669
# Get beta sheets
if "struct_sheet" in cif_feats:
    beta = block["struct_sheet_range"]
    pdb_chain = beta['beg_label_asym_id'].as_array(str)
    start_pos = beta['beg_label_seq_id'].as_array(int)
    end_pos = beta['end_label_seq_id'].as_array(int)

    # set alpha helix positions
    for idx in range(len(pdb_chain)):
        sec_struct_dic[pdb_chain[idx]][start_pos[idx]:(end_pos[idx]+1)] = 'b'
Member

This is almost a duplicate of the code above; only the category and the assigned letter are different. Maybe it would be clearer to refactor it into a function that takes sec_struct_dic, the category, and the letter to fill in?
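
A minimal sketch of such a helper, under a few assumptions: fill_ranges() is a hypothetical name, the category is represented as a plain dict of column lists instead of a PDBx category object, and each chain's annotation is a plain list rather than a NumPy array. The index bounds mirror the quoted snippet (end position inclusive) and are not checked against the PDBx numbering here.

```python
def fill_ranges(sec_struct_dic, category, letter):
    """Write `letter` into every residue range listed in `category`.

    `category` is assumed to map column names to lists of values,
    a plain-Python stand-in for a PDBx category.
    """
    chains = category["beg_label_asym_id"]
    starts = category["beg_label_seq_id"]
    ends = category["end_label_seq_id"]
    for chain, start, end in zip(chains, starts, ends):
        annotation = sec_struct_dic[chain]
        # Same bounds as the snippet under review: `end` is inclusive
        for i in range(start, end + 1):
            annotation[i] = letter


# Toy data: one chain with ten positions, all initialized as coil
sse = {"A": ["c"] * 10}
helices = {
    "beg_label_asym_id": ["A"],
    "beg_label_seq_id": [1],
    "end_label_seq_id": [3],
}
sheets = {
    "beg_label_asym_id": ["A"],
    "beg_label_seq_id": [6],
    "end_label_seq_id": [8],
}

# The same helper serves both categories; only the letter changes
fill_ranges(sse, helices, "a")
fill_ranges(sse, sheets, "b")
print("".join(sse["A"]))
```

With a helper like this, the helix and sheet branches shrink to one call each, and a stale comment such as "set alpha helix positions" in the sheet branch has nowhere to hide.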

src/biotite/structure/io/pdbx/convert.py (outdated)