Support extraction of secondary structure elements from PDBx files #710
I did not add tests for this function, but I did step through it with the debugger. Would you like me to go ahead and write a test for this?
Resolves #534
Hi, thanks for adding the function! It would be good to have one or two tests in tests/structure/io/test_pdbx.py.
I can think of the following ones:
- Apply get_sse() to a short structure from tests/structure/data (e.g. 1aki.bcif, as it contains both sheets and helices) and assert against a 'correct' reference SSE array, maybe crafted by hand from MOL* (e.g. https://www.rcsb.org/3d-view/1aki).
- Apply get_sse() to all structures in tests/structure/data via test parametrization and check that the length of the SSE array is equal to struc.get_residue_count() (peptide chains only).
In addition, you would need to run the Ruff code formatter to make the CI pass, and the conflicts need to be resolved.
Let me know if you have any questions!
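The two checks suggested above can be sketched independently of Biotite. The helper name `check_sse`, its arguments, and the sample data are hypothetical and only illustrate the assertions; a real test would obtain the SSE arrays from get_sse() and the residue counts from the structures in tests/structure/data.

```python
def check_sse(sse_by_chain, residue_counts, reference=None):
    """
    Return a list of problems found in a per-chain SSE annotation.

    An empty list means every check passed.
    """
    problems = []
    for chain, sse in sse_by_chain.items():
        # Check 2 from the review: one SSE entry per residue
        if len(sse) != residue_counts[chain]:
            problems.append(
                f"chain {chain}: {len(sse)} SSE entries "
                f"for {residue_counts[chain]} residues"
            )
        # Check 1 from the review: compare against a hand-crafted reference
        if reference is not None and chain in reference:
            if list(sse) != list(reference[chain]):
                problems.append(f"chain {chain}: SSE differs from reference")
    return problems


# Made-up data, not taken from 1aki
sse = {"A": list("ccaaaccbbc"), "B": list("cbbc")}
counts = {"A": 10, "B": 4}
reference = {"A": list("ccaaaccbbc")}
print(check_sse(sse, counts, reference))
```

In an actual test module, each of the two checks would likely become its own pytest function, with the second one parametrized over the structure files.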
Returns
----------
sec_struct_dic: keys are the different chains from the pdbx file
Similar to the description of a parameter, Numpydoc uses the form
some_name : some_type
    Some description.
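As a rough illustration of that form, a docstring for the function might look like the sketch below. The parameter name `pdbx_file`, the listed types, and the wording are assumptions made for the example, not the PR's actual signature. Note that Numpydoc section underlines match the length of the section title.

```python
def get_sse(pdbx_file):
    """
    Get the secondary structure annotation stored in a PDBx file.

    Parameters
    ----------
    pdbx_file : CIFFile or BinaryCIFFile
        The file to read the annotation from.

    Returns
    -------
    sec_struct_dic : dict of str to ndarray
        Maps each chain ID from the PDBx file to its per-residue
        secondary structure annotation.
    """
    # Docstring-only sketch: the signature and types above are assumptions
    raise NotImplementedError("illustrative docstring only")
```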
# Get beta sheets
if "struct_sheet" in cif_feats:
    beta = block["struct_sheet_range"]
    pdb_chain = beta['beg_label_asym_id'].as_array(str)
    start_pos = beta['beg_label_seq_id'].as_array(int)
    end_pos = beta['end_label_seq_id'].as_array(int)

    # set alpha helix positions
    for idx in range(len(pdb_chain)):
        sec_struct_dic[pdb_chain[idx]][start_pos[idx]:(end_pos[idx]+1)] = 'b'
This is almost a duplication of the above code; only the category and the assigned letter are different. Maybe it would be clearer to refactor it as a function that takes sec_struct_dic, the category and the letter to fill in?
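One possible shape for such a helper is sketched below. The name `fill_sse` is hypothetical, and plain dicts of lists stand in for the Biotite category objects so the example runs on its own; in the PR, the columns would come from calls like `beta['beg_label_asym_id'].as_array(str)`. Positions are used as indices exactly as in the diff, so any offset between sequence IDs and array indices is not addressed here.

```python
def fill_sse(sec_struct_dic, category, letter):
    """
    Mark every residue range listed in `category` with `letter`.

    `category` is assumed to map column names to equally long lists
    (a stand-in for a PDBx category such as the helix or sheet ranges).
    """
    chains = category["beg_label_asym_id"]
    starts = category["beg_label_seq_id"]
    ends = category["end_label_seq_id"]
    for chain, start, end in zip(chains, starts, ends):
        # Inclusive end, mirroring the `end_pos[idx] + 1` slice in the diff
        for pos in range(int(start), int(end) + 1):
            sec_struct_dic[chain][pos] = letter


# Every residue starts as coil; helices and sheets are filled in afterwards
sec_struct_dic = {"A": ["c"] * 10}
helices = {
    "beg_label_asym_id": ["A"],
    "beg_label_seq_id": [1],
    "end_label_seq_id": [3],
}
sheets = {
    "beg_label_asym_id": ["A"],
    "beg_label_seq_id": [6],
    "end_label_seq_id": [7],
}
fill_sse(sec_struct_dic, helices, "a")
fill_sse(sec_struct_dic, sheets, "b")
print("".join(sec_struct_dic["A"]))  # caaaccbbcc
```

With a helper like this, the helix and sheet branches would each reduce to a single call that differs only in the category and the letter.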
Co-authored-by: Patrick Kunzmann <[email protected]>
Added get_sse function. Each residue is annotated as 'a' for alpha, 'b' for beta, or 'c' for coil.