Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

A notebook that creates an example of an SBOL2 Generic Location #28

Open
wants to merge 10 commits into
base: main
Choose a base branch
from
Open
174 changes: 174 additions & 0 deletions examples/sbol2/CreLoxRecombination.ipynb
Original file line number Diff line number Diff line change
@@ -0,0 +1,174 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Representing Cre-Lox Recombination with SBOL2\n",
"\n",
"Cre-Lox recombination is a site-specific recombination event, often used to modify sequences in synthetic biology by flipping or excising segments of DNA. This can be used to activate or deactivate genes or other elements.\n",
"\n",
"In this example, we will represent a situation where an unknown coding sequence (CDS) is initially in a reverse complement orientation (and thus not expressed) and flanked by two loxP recombination sites. Upon recombination, the CDS will be flipped into the forward orientation and become expressed. We'll use SBOL2 concepts like `Component`, `SequenceAnnotation`, `Range`, `GenericLocation`, and `SequenceConstraint` to model this event."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"import sbol2\n",
PrashantVaidyanathan marked this conversation as resolved.
Show resolved Hide resolved
"\n",
"# Create an SBOL document\n",
"doc = sbol2.Document()\n",
"\n",
"# Set a namespace for the document\n",
"sbol2.setHomespace('https://github.com/SynBioDex/SBOL-Notebooks')\n",
"\n",
"# Create a ComponentDefinition for the entire Cre-Lox recombination system\n",
"recombination_system = sbol2.ComponentDefinition('CreLox_Recombination_System', sbol2.BIOPAX_DNA)\n",
"doc.addComponentDefinition(recombination_system)\n",
"\n",
"# Create the Component for the Cre-Lox recombination site\n",
"loxP = sbol2.ComponentDefinition('loxP_site', sbol2.BIOPAX_DNA)\n",
"loxP_seq = sbol2.Sequence('loxP_seq', 'ATAACTTCGTATAATGTATGCTATACGAAGTTAT', sbol2.SBOL_ENCODING_IUPAC)\n",
"loxP.sequences = [loxP_seq]\n",
"doc.addComponentDefinition(loxP)\n",
"\n",
"# Create a ComponentDefinition for the CDS (Coding Sequence)\n",
"# The CDS is unknown or \"not yet selected,\" so we can use a generic placeholder sequence\n",
PrashantVaidyanathan marked this conversation as resolved.
Show resolved Hide resolved
"cds = sbol2.ComponentDefinition('unknown_CDS', sbol2.BIOPAX_DNA)\n",
"doc.addComponentDefinition(cds)"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [],
"source": [
"# Add the loxP sites and CDS as components to the CreLox_Recombination_System ComponentDefinition\n",
"loxP1_comp = recombination_system.components.create('loxP1_component')\n",
"loxP1_comp.definition = loxP.persistentIdentity\n",
"\n",
"cds_comp = recombination_system.components.create('cds_component')\n",
"cds_comp.definition = cds.persistentIdentity\n",
"\n",
"loxP2_comp = recombination_system.components.create('loxP2_component')\n",
"loxP2_comp.definition = loxP.persistentIdentity"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [],
"source": [
"# Create SequenceAnnotations to represent the loxP sites and the CDS\n",
"# Annotation for the first loxP site\n",
"loxP1_annotation = sbol2.SequenceAnnotation('loxP1_annotation')\n",
"loxP1_range = sbol2.Range('loxP1_range', 1, len(loxP_seq.elements))\n",
"loxP1_annotation.locations.add(loxP1_range)\n",
"loxP.sequenceAnnotations.add(loxP1_annotation)\n",
"\n",
"# GenericLocation for the CDS (unknown or not yet selected, so we leave boundaries flexible)\n",
"cds_annotation = sbol2.SequenceAnnotation('cds_annotation')\n",
"cds_location = sbol2.GenericLocation('cds_generic_location')\n",
"cds_location.orientation = sbol2.SBOL_ORIENTATION_REVERSE_COMPLEMENT # Initially reverse\n",
"cds_annotation.locations.add(cds_location)\n",
"cds.sequenceAnnotations.add(cds_annotation)\n",
"\n",
"# GenericLocation for the second loxP site (second recombination site) whose position is not known because it comes after the CDS\n",
"loxP2_annotation = sbol2.SequenceAnnotation('loxP2_annotation')\n",
"loxP2_location = sbol2.GenericLocation('loxP2_generic_location')\n",
"loxP2_location.orientation = sbol2.SBOL_ORIENTATION_INLINE\n",
"loxP2_annotation.locations.add(loxP2_location)\n",
"loxP.sequenceAnnotations.add(loxP2_annotation)"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [],
"source": [
"# Create SequenceConstraints to describe the relationships\n",
"# Constraint to ensure that loxP1 and CDS are adjacent\n",
"constraint1 = recombination_system.sequenceConstraints.create('loxP1_to_CDS')\n",
"constraint1.subject = loxP1_comp\n",
"constraint1.object = cds_comp\n",
"constraint1.restriction = sbol2.SBOL_RESTRICTION_PRECEDES\n",
"\n",
"# Constraint to ensure that CDS and loxP2 are adjacent\n",
"constraint2 = recombination_system.sequenceConstraints.create('CDS_to_loxP2')\n",
"constraint2.subject = cds_comp\n",
"constraint2.object = loxP2_comp\n",
"constraint2.restriction = sbol2.SBOL_RESTRICTION_PRECEDES"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Validate the document to ensure compliance with SBOL standards\n",
"doc.validate()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Save the document to an SBOL file\n",
"doc.write('cre_lox_recombination.xml')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Explanation\n",
"\n",
"In this example, we modeled a Cre-Lox recombination system:\n",
"\n",
"1. **Cre-Lox Recombination System**: We defined a `ComponentDefinition` to represent the entire recombination system. This component holds the loxP sites and the unknown CDS as its sub-components.\n",
" \n",
"2. **loxP Sites**: We created two `Component` objects representing the loxP recombination sites. Each loxP site is linked to a known DNA sequence and added as a component to the system.\n",
"\n",
"3. **Unknown CDS**: We defined an unknown or not-yet-selected coding sequence (CDS) using `GenericLocation`, which allows us to keep the boundaries flexible and represents the CDS in reverse complement orientation. The CDS is also added as a component to the system.\n",
"\n",
"4. **SequenceAnnotations**: We created `SequenceAnnotations` for each element (loxP sites and CDS), using a `Range` for the first loxP site and `GenericLocation` for the CDS and the second loxP site, in order to encode the orientations. If all of the `Component` objects were inline, the `GenericLocation` objects could instead be omitted.\n",
"\n",
"5. **SequenceConstraints**: Two `SequenceConstraint` objects were added to ensure that the CDS is flanked by the loxP sites in the correct order. These constraints enforce that:\n",
" - The first loxP site precedes the CDS.\n",
" - The CDS precedes the second loxP site.\n",
"\n",
"This setup simulates a Cre-Lox recombination device, where the recombination between the two loxP sites would flip the unknown CDS into the correct orientation, potentially activating it. By modeling the system this way, we maintain flexibility in defining the sequence, structure, and relationships between components."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "sbol_env",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.6"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
121 changes: 121 additions & 0 deletions examples/sbol2/CreatingSBOL2Objects/GenericLocation.ipynb
Original file line number Diff line number Diff line change
@@ -0,0 +1,121 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Using `GenericLocation` to Define Flexible Regions on a Sequence\n",
"\n",
"`GenericLocation` in SBOL 2 is useful when regions on a sequence have flexible or undefined boundaries. This is often the case in situations where exact sequence positions are not available.\n",
"\n",
"In this notebook, we will start with a simple example to understand the basic usage of `GenericLocation`, and then link it to more complex use cases such as **Cre-Lox recombination**. `GenericLocation` becomes much more meaningful when combined with other SBOL concepts like `SequenceAnnotation`, `SequenceConstraint`, and `ComponentDefinition`.\n",
"\n",
"**Minimal Example:**\n",
"- We will create a DNA sequence and use `GenericLocation` to annotate a region without defined boundaries.\n",
"\n",
"**Practical Example:**\n",
"- For a practical application of `GenericLocation` in modeling biological processes, see the [Cre-Lox Recombination Notebook](../CreLoxRecombination.ipynb), where `GenericLocation` is used to model flexible coding sequence boundaries in a recombination system."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"import sbol2\n",
"\n",
"# Create an SBOL document\n",
"doc = sbol2.Document()\n",
"\n",
"# Set a namespace for the document\n",
"sbol2.setHomespace('https://github.com/SynBioDex/SBOL-Notebooks')\n",
"\n",
"# Create a Sequence object with an arbitrary DNA sequence\n",
"sequence_elements = 'ATGCGTACGTAGCTAGTCTGATCGTAGCTAGTCGATGCA'\n",
"seq = sbol2.Sequence('example_sequence')\n",
"seq.elements = sequence_elements\n",
"seq.encoding = sbol2.SBOL_ENCODING_IUPAC\n",
"\n",
"# Add the sequence to the document\n",
"doc.addSequence(seq)\n",
"\n",
"# Create a ComponentDefinition for the sequence\n",
"comp_def = sbol2.ComponentDefinition('example_component', sbol2.BIOPAX_DNA)\n",
"comp_def.sequences = [seq.identity] # Link the sequence to the ComponentDefinition\n",
"\n",
"# Add the ComponentDefinition to the document\n",
"doc.addComponentDefinition(comp_def)\n",
"\n",
"# --- GenericLocation ---\n",
"# Create a SequenceAnnotation for a region on the sequence\n",
"annotation = sbol2.SequenceAnnotation('flexible_region_annotation')\n",
"\n",
"# Define a GenericLocation without specific boundaries\n",
"generic_location = sbol2.GenericLocation('generic_location')\n",
"generic_location.orientation = sbol2.SBOL_ORIENTATION_REVERSE_COMPLEMENT # By default the orientation is set to Inline.\n",
"\n",
"# Add the GenericLocation to the SequenceAnnotation\n",
"annotation.locations.add(generic_location)\n",
"\n",
"# Add the SequenceAnnotation to the ComponentDefinition\n",
"comp_def.sequenceAnnotations.add(annotation)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Validate the document to ensure compliance with SBOL standards\n",
"doc.validate()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Save the document to an SBOL file\n",
"doc.write('genericlocation_example.xml')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In this minimal example, we used a `GenericLocation` to annotate a region on a sequence without specifying exact start and end positions. This approach is useful when the boundaries of a region are not well-defined or are subject to change. The `orientation` property of `GenericLocation` can be used to indicate the strand direction (inline or reverse complement).\n",
"\n",
"While this example shows basic usage, `GenericLocation` becomes much more valuable in more complex systems where flexibility in sequence boundaries is essential.\n",
"\n",
"### Next Steps\n",
"\n",
"- This minimal example demonstrates the basics of using `GenericLocation`. To learn more about how this concept integrates with `SequenceConstraint` and `SequenceAnnotation`, explore the [Cre-Lox Recombination Notebook](../CreLoxRecombination.ipynb).\n",
"- In the Cre-Lox notebook, we demonstrate how `GenericLocation` allows us to model regions that undergo recombination, including sequence flipping and orientation changes."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "sbol_env",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.6"
}
},
"nbformat": 4,
"nbformat_minor": 2
}