Add friendzymes collection #238

jcahill · 2022-02-14T15:11:42Z

About

This PR is for inclusion of the Friendzymes Collection.

Description

This collection is aimed at expanding what people are able to do with FreeGenes collections and the iGEM distribution, both in terms of genetic assembly and in terms of biomanufacturing. Friendzymes' primary goals are to democratize strain engineering and recombinant protein manufacturing and purification.

For manufacturing, this collection contains an expansion of the FreeGenes Open Yeast Collection, including P. pastoris-optimized target enzymes for recombinant production (such as Eco31I, an IP-free BsaI isoschizomer, and its cognate methyltransferases), additional purification tags, an anti-His tag antibody for protein blotting and quantification, and additional yeast promoters. Further, this collection contains complements to the FreeGenes Bacillus subtilis Secretion Tag Library Plasmids, for recombinant protein production and secretion from B. subtilis. These include B. subtilis promoters, target proteins for production such as Pfu-Sso7d polymerase, and various B. subtilis regulatory elements.

For strain engineering, we include E. coli origins of replication; E. coli, B. subtilis, and P. pastoris selection markers; counterselection markers for E. coli; an origin of transfer for conjugation from E. coli to other bacterial species; homology arm pairs for genomic integration into B. subtilis and P. pastoris; and 5' and 3' recombinase site parts for insertion, deletion, or inversion of synthetic genetic elements. Many of these parts are not elements of a canonical transcription unit and do not have clearly defined part types in the MoClo/uLoop assembly standard. Moreover, inserting some of these parts into a transcription unit would require changing the overhangs on the core promoter, RBS, CDS, and/or terminator parts.

To address this challenge, we designed AllClo (https://docs.google.com/spreadsheets/d/1TICnbGYY96myM7TPXWwBsLvyadgSfmtbVTGsUN5iMI8/edit?usp=sharing), a high-fidelity, backwards-compatible expansion of the MoClo assembly standard. AllClo uses a single 26-overhang set that includes all uLoop overhangs and the vector assembly overhangs used in the Open Yeast Collection, and whose predicted ligation fidelity in a 26-part assembly is 96%.
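As a rough illustration of the kind of sanity check an overhang set has to pass before fidelity is even estimated, the sketch below flags duplicated, palindromic, and mutually reverse-complementary 4-nt overhangs. It does not reproduce the 96% fidelity prediction, which depends on empirical ligation data. The function name `check_overhang_set` and the example overhangs are illustrative assumptions, not the AllClo set from the spreadsheet.

```python
from itertools import combinations

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def check_overhang_set(overhangs):
    """Return a list of human-readable problems found in a set of 4-nt overhangs.

    Hypothetical helper: it only catches exact collisions, not the
    near-match mispairing that fidelity predictions account for.
    """
    problems = []
    seen = set()
    for oh in overhangs:
        if len(oh) != 4 or set(oh) - set("ACGT"):
            problems.append(f"{oh}: not a 4-nt ACGT string")
            continue
        if oh in seen:
            problems.append(f"{oh}: duplicated")
        seen.add(oh)
        if oh == reverse_complement(oh):
            problems.append(f"{oh}: palindromic, can ligate to itself")
    # Two junctions whose overhangs are reverse complements share a sticky end.
    for a, b in combinations(sorted(seen), 2):
        if a == reverse_complement(b):
            problems.append(f"{a}/{b}: reverse complements of each other")
    return problems


# Illustrative overhangs only (assumed for the example).
example_set = ["GGAG", "TACT", "AATG", "GCTT", "CGCT"]
print(check_overhang_set(example_set))
```

A set that passes this check can still have poor fidelity, so this is a necessary filter rather than a substitute for the predicted-fidelity calculation mentioned above.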

We further designed a set of part switching linkers that take canonical uLoop transcription unit components as input and output those parts with new 5' and 3' overhangs. These part switching reactions enable, for instance, insertion of recombination sites 5' to the promoter and/or 3' to the terminator in a TU, or ribozymes 3' to the promoter and 5' to the RBS/start site. In this way, standard uLoop parts can participate in assembly reactions that construct modular vector backbones, composite 5' and 3' UTRs, and multi-tagged CDSs.

The part switching linkers were designed to work in either of two ways: with an orthogonal, linker-specific Type IIS restriction site (BbsI), or with a conditionally methylatable, idempotent BsaI restriction site (mBsaI) that is suppressed when the linker is cloned inside an E. coli cell expressing HpaII and/or MspI and becomes active when the part is cloned into an MspI-/HpaII- strain or PCR amplified to remove the methylation sites. These parts and this expanded assembly standard have the potential to equip iGEM teams with tools and a framework to manufacture their own enzymatic reagents and perform their own sophisticated modification of strains' genomic background.

Technical Notes

SwitchClo linkers may cause some automated checks to fail because they contain Type IIS restriction sites by design.
Parts are all housed under the benchling.com/friendzymes namespace. These are available for individual attachment if the maintainers wish. Some Benchling items contain additional documentation in their Description fields.
We have not yet enumerated any items in the Libraries and Composites sub-sheet. We can amend the submission further if the maintainers wish for this tab to contain additional information.
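For readers who want to see why the linkers would trip a restriction-site check, here is a minimal sketch that scans a sequence for BsaI (GGTCTC) and BbsI (GAAGAC) recognition sites on both strands. The function `find_type_iis_sites` is a hypothetical helper, not the distribution's actual check, and the example sequence is the linker fragment visible in the FASTA diff later in this thread.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Recognition sequences for the two Type IIS enzymes named in this PR.
SITES = {"BsaI": "GGTCTC", "BbsI": "GAAGAC"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_type_iis_sites(seq: str):
    """Return (enzyme, 0-based start, strand) for every recognition site found."""
    seq = seq.upper()
    hits = []
    for enzyme, site in SITES.items():
        for strand, motif in (("+", site), ("-", reverse_complement(site))):
            start = seq.find(motif)
            while start != -1:
                hits.append((enzyme, start, strand))
                start = seq.find(motif, start + 1)
    return sorted(hits, key=lambda hit: hit[1])


# Fragment shown for 3_APOSTROPHE_Part_Switch_Linker in the diff.
linker = "GGTCTCATACTTGTGATGTCTTCGCCTACGGATTGTCTGTCAAGGCATGAGACC"
for enzyme, start, strand in find_type_iis_sites(linker):
    print(enzyme, start, strand)
```

On this fragment the scan reports flanking BsaI sites on both strands plus an internal BbsI site, which is consistent with the linkers containing Type IIS sites by design.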

Thanks,
Friendzymes Contributors

jakebeal · 2022-02-14T15:56:15Z

@jcahill Can you please run the workflows on your fork? The automation needs to run the build in order to validate whether this can be integrated.

jcahill · 2022-02-14T16:08:08Z

@jakebeal Running script regression testing now: https://github.com/friendzymes/iGEM-distribution/actions/runs/1842097864

jakebeal · 2022-02-14T16:09:55Z

The synchronize.yml workflow is needed too, since that's what validates the constructs (as opposed to the script code).

jcahill · 2022-02-14T18:23:25Z

After several rounds of trial-and-error with source prefix and ID columns, synchronize.yml continues to fail at SBOL export. We are requesting assistance on how to proceed.

Blocker 1

Build automation rejects non-unique data source IDs, but it's unclear how this value can be meaningful if required to be unique.

Blocker 2

If the workflow logs are to be trusted, URI expansions are not being generated correctly. No combination of the following in the two relevant columns has generated a correct expansion:

| Data Source Prefix | Data Source ID |
| --- | --- |
| `Prefix from dropdown menu` | `https?://explicit.url.tld/to/part/ID` |
| `Prefix from dropdown menu` | `PREFIX:ID` |
| `Prefix from dropdown menu` | `ID` |

That is, all of the following fail:

| Data Source Prefix | Data Source ID |
| --- | --- |
| `iGEM Registry` | `http://parts.igem.org/Part:BBa_K1074001` |
| `iGEM Registry` | `iGEM:BBa_K1074001` |
| `iGEM Registry` | `BBa_K1074001` |

Logs

Using the final example above, the wiki namespace path `/Part:` is not included in the URI expansion.

Could not export SBOL file for package Friendzymes: An entity with identity "http://parts.igem.org/BBa_K1074001" already exists in document

jakebeal · 2022-02-14T18:55:44Z

With respect to your blockers, there are two key pieces of information that I think will help you:

1. Data sources have a "Literal Part" column that distinguishes whether or not there is expected to be a 1:1 correspondence between identifier and sequence. NCBI and iGEM, for example, are both literal-part sources, because if I tell you "NCBI accession FJ859897.1" or "iGEM part BBa_K1074001", that should map to a particular sequence. PubMed, on the other hand, is non-literal. So when you say BBa_K1074001 is EcoOri_ColE1pMB1pBR32, it's a mismatch, because if we retrieve BBa_K1074001, the sequence we find won't be the one that's in your sheet. If you got the sequence by extracting it out of BBa_K1074001, then that would be better to go into the design notes. Right now, the build believes it's finding several conflicting definitions for BBa_K1074001 and is complaining accordingly.
2. The URI generated (http://parts.igem.org/BBa_K1074001) is the intended one. Since the source material in the iGEM repository isn't in SBOL, we need to convert it into an SBOL object, and this is the name for that object, not the literal URI used to access the SBOL object. (We are working towards an implementation of the packaging approach described in SEP 054.) Each import source currently has a special case for how to remap URIs in order to access the import, which is required because there is no standardization across the databases that we import from (lots of future work to be done in generalization of import approaches...).
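The literal-part conflict described above can be pictured with a small sketch. It is not the distribution's build code: the `SOURCE_BASE` mapping, the row layout, the second part name, and the placeholder sequences are all assumptions made for illustration. It expands a prefix and ID into an identity the way the logged URI suggests, then reports identities claimed by rows with differing sequences.

```python
from collections import defaultdict

# Assumed base URI, inferred from the identity shown in the export error.
SOURCE_BASE = {"iGEM Registry": "http://parts.igem.org/"}


def expand_identity(prefix: str, source_id: str) -> str:
    """Build the object identity from a data source prefix and a bare ID."""
    return SOURCE_BASE[prefix] + source_id


def find_conflicts(rows):
    """Map each identity to the part names that give it differing sequences.

    rows: iterable of (part_name, prefix, source_id, sequence) tuples.
    """
    by_identity = defaultdict(set)
    for name, prefix, source_id, sequence in rows:
        by_identity[expand_identity(prefix, source_id)].add((name, sequence.upper()))
    return {
        identity: sorted(name for name, _ in entries)
        for identity, entries in by_identity.items()
        if len({seq for _, seq in entries}) > 1
    }


# Placeholder sequences; only the first part name and the ID come from the thread.
rows = [
    ("EcoOri_ColE1pMB1pBR32", "iGEM Registry", "BBa_K1074001", "ACGTACGT"),
    ("Hypothetical_Other_Part", "iGEM Registry", "BBa_K1074001", "TTGACAAT"),
]
print(find_conflicts(rows))
```

Under this reading, a sequence merely derived from a registry entry would be recorded in the design notes rather than claiming the registry ID as a literal source, so the identity is never defined twice.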

On a separate note, I would also ask you to consider whether it would be a good idea to split this collection up into more than one package. I see inside of it a number of sub-collections that seem like they might stand on their own, such as the linker subcollection. Most other packages in the distribution are organized around function rather than around source: is that possible to do here, or is this something that needs to be monolithic like the current OpenYeast import from FreeGenes?

jcahill · 2022-02-14T20:34:26Z

Re: 1 and 2, thanks. We'll revise based on these notes.

Re: the size/scope of the package: We have had some discussion around handling this. I've re-raised the topic with the team in light of your suggestion.

So far, the working model has been to handle the whole collection as a single package. We judged that the confusion and fragmentation likely to result from spreading an assembly standard of considerable complexity across multiple packages outweigh the downsides of concentrating too much material in one place.

Would grouping the natural classes of parts into libraries nested within the package be a suitable middle-ground?

jakebeal · 2022-02-14T20:51:45Z

Ah, if it's got an alternate assembly standard, then it probably does want to be isolated in a single package right now (and that will be a discussion necessary with iGEM HQ). If we had a full implementation of SEP 054, then sub-packages would be the right answer, but at the moment that's not an option.

eyesmo · 2022-02-14T21:44:58Z

To clarify, in this collection, all parts that are defined with specified overhangs in uLoop--all promoters, RBSs, CDSs, and terminators--have uLoop overhangs. It is the part types that are not explicitly defined in uLoop--vector backbone subcomponents, recombination sites, ribozymes--where the new overhangs and part definitions come in. So at least for level 0/level 1 assemblies, it's not so much intended to be an alternate assembly standard, as an expansion and extension of the existing iGEM/uLoop assembly standard. Happy to talk more about this on this thread, in Wednesday's meeting or on a call.

jakebeal · 2022-02-14T21:52:33Z

@eyesmo Yes, I think a discussion on the Wednesday distribution call would likely be a good thing.

jcahill · 2022-02-15T15:29:53Z

Workflows have run successfully on the fork.

jakebeal · 2022-02-15T17:13:22Z

Friendzymes/README.md

+- Rec3_Lox66 (recombination_signal_sequence) in 
+- Rec3_LoxP (recombination_signal_sequence) in 
+- Rec5_Lox71 (recombination_signal_sequence) _<span style="color:red">not included in distribution</span>_
+- Rec5_LoxP (recombination_signal_sequence) _<span style="color:red">not included in distribution</span>_


Are these two intentionally not included, or do you want to update the build plan?

jcahill · 2022-02-15

I believe this is simply an unnoticed error in porting parts to the spreadsheet.

jakebeal · 2022-02-15T17:16:56Z

Friendzymes/views/Libraries and Composites.csv

+Libraries and Composite Parts,Blue text column headers are optional,,,,,,,,,,,,,,,,,,,,,,,,,,,
+,,,,,,,,,,,,,,,,,,,,,,,,,,,,
+Part/Library Name,Design Notes,Part Description,Final Product,Backbone/locus,Constraints,Part 1,Part 2,Part 3,Part 4,Part 5,Part 6,Part 7,Part 8,Part 9,,,,,,,,,,,,,,
+,,,False,,,,,,,,,,,,,,,,,,,,,,,,,


I see there's nothing in the composites sheet. Everything else is being ordered in a vector (generally pSB1C5). Right now, your sheet says that you want to have things delivered as just linear DNA fragments, which may not be compatible with FreeGenes processes. Is that the intention (in which case a discussion with @vinoo-igem is likely needed)? If it's not the intention, then the build plans should be expressed on this sheet and the "final product" markers on the parts sheet set to false.

eyesmo · 2022-02-15

To make sure I understand, you're saying that the Libraries/Composites tab is where we should put information about the cloning/holding vector these parts should be stored in, correct? So currently the sheet implies no cloning vector, just raw DNA?

I think it would be useful to discuss the relative merits of pSB1C5 vs pOpen_v3 vs pOpen_v4. Is there a thread where the engineering committee has covered this? We'd be open to pSB1C5 and would ideally like to use the same standard vector as the new iGEM Distribution; my main (possibly unfounded) concern here is about compatibility with the existing FreeGenes libraries that are going into the Distro (e.g. Open Yeast Collection and the Protein Expression Toolkit), which to my knowledge use pOpen_v3.

vinoo-igem · 2022-02-15

started an issue #244 to open up the discussion

jakebeal · 2022-02-15

That's correct: you want to use the "Backbone/locus" column to indicate the vector holding the part. You should also consider whether you need to add flanking sequences, depending on whether the vector comes with them built in already.

Which vectors can be used is a separate discussion with @vinoo-igem

openbiofabber · 2022-02-16

All FreeGenes parts, including the Open Yeast Collection and Open Enzyme Collection, are cloned by Twist in pOpen_v3. pOpen_v3 is ampR. All our parts should be cloned by Twist in this vector to make them useful as AllClo/OYC parts for GGA.

jakebeal · 2022-02-15T17:20:12Z

distribution_synthesis_inserts.fasta

+GACCAGGTAGCATAACTTCGTATAATGTATGCTATACGAACGGTAATGATGAGACCGTGC
+AC
+>3_APOSTROPHE_Part_Switch_Linker
+GGTCTCATACTTGTGATGTCTTCGCCTACGGATTGTCTGTCAAGGCATGAGACC


These short parts cannot be built by Twist, whose minimum synthesis length is 300bp. They need to have padding and flanking sequences added to them. See the Anderson Promoters package for an example of how this has been done.

jcahill · 2022-02-15

By my count, we have 36 parts under 300bp in length. This is just to note my intention to pad all of those parts.

openbiofabber · 2022-02-16

Yes, I added ~50% GC, hairpin-free random sequence to all of my OYC parts that were under 300 bp. I made those parts 310 bp total.
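The padding step discussed in this thread could be sketched roughly as follows. This is only an assumption-laden illustration: it appends filler to the 3' end, targets 310 bp as mentioned above, draws a filler with 50% GC, and rejects fillers that introduce new BsaI or BbsI sites. It does not screen for hairpins and does not add the flanking sequences that the Anderson Promoters package uses, and the names `pad_to_length` and `random_filler` are hypothetical.

```python
import random

# BsaI and BbsI recognition sites on both strands; padding must not add any.
TYPE_IIS_MOTIFS = ("GGTCTC", "GAGACC", "GAAGAC", "GTCTTC")


def count_motifs(seq: str) -> int:
    """Count Type IIS recognition motifs present in the sequence."""
    return sum(seq.count(motif) for motif in TYPE_IIS_MOTIFS)


def random_filler(length: int, rng: random.Random) -> str:
    """Return a random sequence with half of its bases (rounded down) G or C."""
    gc = length // 2
    bases = [rng.choice("GC") for _ in range(gc)]
    bases += [rng.choice("AT") for _ in range(length - gc)]
    rng.shuffle(bases)
    return "".join(bases)


def pad_to_length(seq: str, target: int = 310, seed: int = 0, max_tries: int = 1000) -> str:
    """Append filler so the result is `target` long without adding new sites."""
    seq = seq.upper()
    if len(seq) >= target:
        return seq
    rng = random.Random(seed)
    baseline = count_motifs(seq)
    for _ in range(max_tries):
        candidate = seq + random_filler(target - len(seq), rng)
        # Also catches motifs created across the part/filler junction.
        if count_motifs(candidate) == baseline:
            return candidate
    raise RuntimeError("no site-free filler found; try another seed")


linker = "GGTCTCATACTTGTGATGTCTTCGCCTACGGATTGTCTGTCAAGGCATGAGACC"
padded = pad_to_length(linker)
print(len(linker), "->", len(padded))
```

Fixing the seed keeps the padded sequence reproducible between spreadsheet revisions, which matters if the padded inserts are re-exported by the build.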

vinoo-igem · 2022-02-15T17:50:57Z

I think discussing this on our next call will be good! I do want to surface that this is ambitious and will take some effort for review, as this will feed directly into a number of different topics that we need to address this year, primarily what iGEM will be defining as the assembly standard beyond L0 basic parts (which also clearly needs work #236 #214) and vector construction and whether this would constitute testing and/or adoption.

eyesmo · 2022-02-15T19:19:27Z

> I do want to surface that this is ambitious and will take some effort for review, as this will feed directly into a number of different topics that we need to address this year, primarily what iGEM will be defining as the assembly standard beyond L0 basic parts (which also clearly needs work #236 #214) and vector construction and whether this would constitute testing and/or adoption.

Very much looking forward to this review/discussion! A core desired outcome of mine is to help move the ball forward on these topics for iGEM.

jcahill added 6 commits February 12, 2022 05:10

Create .gitkeep

f752eb8

Friendzymes: add friendzymes.xlsx pre-release draft 1

e4687fb

This is an initial draft of friendzymes.xlsx for upload. Its contents are near-final but subject to further revision prior to release.

Merge branch 'iGEM-Engineering:develop' into add-friendzymes-collection

1f087f2

Friendzymes: add friendzymes.xlsx release candidate 1

5f8be83

Friendzymes: add friendzymes.xlsx release candidate 2

6e9f6fc

Friendzymes: delete temp file .gitkeep

ef9a4ed

jcahill added 4 commits February 14, 2022 11:37

Friendzymes: add .tmp to trigger workflows

40ed1a9

Friendzymes: add friendzymes.xlsx release candidate 3

c821281

Attempts to conform source prefixes and IDs.

Friendzymes: add friendzymes.xlsx release candidate 4

0f89dea

Friendzymes: add friendzymes.xlsx release candidate 5

9bf15af

jcahill mentioned this pull request Feb 15, 2022

iGEM Distribution 2022 Finalization friendzymes/community#25

jcahill and others added 6 commits February 15, 2022 09:33

Friendzymes: delete .tmp

434e784

Friendzymes: add friendzymes.xlsx release candidate 6

9de4293

Automatically update exports

da1c10e

Collate and summarize packages

9b72408

Merge branch 'iGEM-Engineering:develop' into add-friendzymes-collection

5ddd5f4

Build distribution

f698f26

jakebeal reviewed Feb 15, 2022

vinoo-igem assigned traci-igem Feb 15, 2022

vinoo-igem unassigned traci-igem Feb 15, 2022

vinoo-igem requested a review from traci-igem February 15, 2022 17:58
