AllClo discussion #15
Having a description of each standard will be useful for defining the unifying components.
First line of discussion: It is unclear to me without being provided with some reading that the theoretical grounding to do this exists as of now. Second line of discussion: I believe generalization of tooling for formal proofs regarding various properties of assembly standards is in order, given the explosion of assembly standard variants and modular cloning flavors over the past few years. This would be a far more substantial undertaking than what I currently believe the AllClo plan to be, but I think it's worth pursuing. |
Isaac L on feasibility:
|
Hey people, what do you think about this strategy of work:
After having this list, we could try to find intersections and inconsistencies between the standards' overhangs using the SplitSet website, and also our reverse-engineered SplitSet tool in case we need to test many different sets of overhangs (so we could do this programmatically). With this information in hand, it will be easier to develop a design for AllClo.
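As a starting point for the programmatic comparison suggested above, here is a minimal Python sketch. It only performs exact-match checks (duplicates, palindromic overhangs, reverse-complement pairs, and overhangs shared between two standards); it does not model ligation fidelity and makes no attempt to reproduce what SplitSet computes. The function names are hypothetical, and the example overhangs are simply the ones quoted elsewhere in this thread.

```python
from itertools import combinations

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def audit_overhangs(overhangs):
    """List exact-match problems inside one overhang set.

    Returns (kind, overhang_a, overhang_b) tuples. This is a sanity check,
    not a fidelity prediction.
    """
    seqs = [o.upper() for o in overhangs]
    unique = sorted(set(seqs))
    problems = []
    for s in unique:
        if seqs.count(s) > 1:
            problems.append(("duplicate", s, s))
        if s == reverse_complement(s):
            # A palindromic overhang can ligate to itself.
            problems.append(("palindrome", s, s))
    for a, b in combinations(unique, 2):
        if a == reverse_complement(b):
            # Two junctions whose sticky ends can anneal to each other.
            problems.append(("reverse_complement_pair", a, b))
    return problems


def shared_overhangs(standard_a, standard_b):
    """Overhangs that two standards have in common (intersections)."""
    return sorted({o.upper() for o in standard_a} & {o.upper() for o in standard_b})


# Overhangs quoted in this thread; the grouping is for illustration only.
thread_overhangs = ["AATG", "GCTT", "TACT", "AGGT", "GGAG", "TCTC", "TGAG"]
print(audit_overhangs(thread_overhangs))
print(shared_overhangs(["AATG", "GCTT"], ["AATG", "AGGT"]))
```

A real comparison would still need empirical ligation data for near-match overhangs, which exact matching cannot capture.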
My notes on AllClo: ---Note up front: it will be rare for a genetic device to use all of these part types at once. Rather than requiring every assembly reaction to include a bunch of neutral 'spanner' parts as placeholders for unused part types, we should handle this with Part Type Switching (PTS) linkers. For example, by default the core promoter part will have the same overhangs as in uLoop. To build a more complex 5' UTR, an assembly reaction will be performed between the promoter and two PTS linkers, that can convert the promoter's overhang definition to a pair of FiveClo overhangs. In parallel, the set of FiveClo parts that one wishes to assemble and use will also undergo a PTS reaction with linkers to convert their overhangs to close any unused part slots. This way, while each promoter starts out with FiveClo-incompatible overhangs and each FiveClo part listed below starts with its own unique combination of two overhangs, a single round of PTS reactions later (followed by either transformation/cloning or PCR amplification of PTSed parts), all parts have the overhangs required for the desired assembly; and moreover these overhangs are compatible with any combination of VecClo, ProClo, and ThreeClo parts as well, up to and including one-pot full vector assembly reactions. This PTS design strategy can be applied to VecClo, FiveClo, ProClo, and ThreeClo, ensuring simple and backwards-compatible uLoop assembly by default, while enabling access to the full array of complex/composite/high fidelity AllClo genetic design and assembly after a single round of PTS reactions. -- VecClo parts: -- FiveClo parts: -- ProClo2.0 parts: -- ThreeClo parts: 1. There must be a balance between what one can do in one step (flatness), and simplicity. |
A summary of the action steps from the notes above, in no particular order:
|
For posterity, here is an email exchange in which Isaac and I discuss ProClo:
Isaac -> Scott, Isaac, Keoni
Hi Scott, Isaac and Keoni, I’ve worked up a draft of a ProClo 2.0 assembly standard, that uses the optimized MoClo-compatible 24-overhang set Keoni calculated, that has exclusively small/hydrophilic amino acid dyads at the in-frame overhangs, and that preserves the MoClo part definition of CDSs (AATG-GCTT). Would love to get your thoughts and feedback on it. All the best,
Keoni -> Isaac^2, Scott
I'm mainly thinking that recursive BsaI + SapI is a better solution. ProClo looks like it requires C-terminal modification, which can affect the usability of proteins (I think T7 RNAp had that problem, if I remember correctly). If we imagine that we want to eventually do genome-scale engineering, that requirement could be quite detrimental, or at least cause a shift in the usage. SapI should solve that with seamless fusions. Recursive BsaI could also help solve the issue of N-terminal tagging. If there is a need for single-reaction, investing in the goldengate->pcr->goldengate technology could be a good idea, since that will generally be applicable (anyone can use it). It also doesn't require changing the RBS definition of TACT-AATG. I am, however, considering changing a few definitions and rebuilding my set. For more on that, check out https://github.com/trilobio/recursive_bsai . Basically, you should be able to overlap
In general, I think that the issues of protein tagging and such have technical solutions that just need to be tested. I'm starting all those (that are relevant to our cloning system) in earnest now. To me honestly, I think that the sets are less interesting than the overall system. DNA synthesis is cheap itself so long as cloning is implemented right, so I'm focusing a bit on that (https://keonigandall.com/posts/affordable_dna_2.html under "Innovation in synthesis is not far away"). Cheers, Keoni
Isaac -> Keoni, Isaac, Scott
A couple of things: (1) Is SapI off-patent, or does it have an off-patent isoschizomer? If not, it’s sub-optimal for designing a high quality, unambiguously public domain assembly standard. (2) Even if SapI is now or will soon be in the public domain, adding additional enzymes to the assembly standard kind of impairs one of the things that makes recursive BsaI/AllClo so nice—the fact that you can do basically everything with one restriction enzyme. If we add more, the question arises of why we don’t just use an existing multi-enzyme assembly standard, like GoldenBraid or uLoop.
A couple of things: (1) the primary use case for ProClo is specifically in contexts where you want to tag proteins or build fusions; if you want to do genome-scale engineering or do anything working just with native, un-modified protein sequences, you can just add a stop codon before the GCTT overhang (as is already standard practice for MoClo) and define your terminators with the standard MoClo 5’ overhang. (2) T7 RNA polymerase notwithstanding, the set of proteins that don’t tolerate any C-terminal tags is tiny compared to the set that do tolerate them, so it seems sub-optimal to focus design of the assembly standard on the intolerant subset. If you don’t want C-terminal mods for a particular CDS, you can just add a stop codon before the GCTT overhang.
I agree that allowing the RBS to keep its MoClo part type definition would be nice. I can see the argument for making ProClo 2.0 a recursive ‘level 0’ assembly standard that’s fully MoClo-compatible, even if that breaks the flatness/one-pot-ness of AllClo. Interested to hear others’ thoughts on this.
Interested to see how this experiment goes. Four questions about it: (1) if this strategy requires you to change/remove GGAG, doesn’t that mean it’s not backwards-compatible with MoClo? (Unless you’re planning to use these overhangs only in level 0 or level 2+ recursive assemblies, with no MoClo promoter parts). (2) how well do the TCTC,TGAG overhangs play with the 24 overhangs in your original set? (3) It looks from your description in that repo like you’re testing how methylation at different locations impacts T4 ligase activity/fidelity. Are there papers that suggest this might be an issue? If so, that could be a pretty big problem for the recursive BsaI strategy. And (4) the repo doc mentioned you’re still getting some cutting on a methylated restriction site. What exactly is the sequence that was being methylated and cut in that experiment? Was it an HpaII-BsaI site?
Making synthesis cheaper is super important and I’m rooting for you/TriloBio to achieve this; but in the meantime, high-quality libraries of free, public domain DNA parts running on a powerful assembly standard will still be very useful to people all around the world! Finally, do we want to share this discussion in this issue thread? Could be useful to others to make it publicly viewable. All the best,
Continuing conversation here:
https://patents.google.com/patent/EP0818537A2/en
Because those standards don't allow for recursion in complex assembly. For all intents except C-terminal fusions, SapI does not add any complexity to the current method, only 3 base pairs.
Or you could use SapI, which has the stop codon built in. Then you don't need divergent protein coding sequences for both assembly types.
As an alternative, you could simply use SapI and get the best of both worlds. Complete compatibility with un-modified protein sequences and compatibility with larger tagged constructs. The resistance seems to stem from the fact that SapI adds a restriction enzyme to the mix in the particular case of C-terminal fusions. But there are clear advantages as well - no set linker sequences and complete compatibility with all other types of cloning. What if you wish to build a genome with an entire pathway being his-tagged? I can agree that it may be superior for single protein fusions, but I think SapI is superior when you want to do many fusions or mix between different kinds of fusions and non-fusions. For the questions on the new sets:
Yes, it does. I don't really care though since we're resynthesizing everything from scratch anyway.
Will probably generate a new set.
Not testing different spots, I linked a paper that already did that. We're using objectively the worst spot available because it is less risky and has methylation on the right strand for doing complete re-shuffle (If only BtgZI was more available...)
HpaII uses B2, which is also still kind of crap at blocking BsaI activity. We're using B1 so that the sequence doesn't interact with ligase. T1 and T2 are much better, but are on the wrong strand. Much sad.
I think you may have missed the point of that essay :) DNA synthesis is cheap and none of our tech is aimed at making it cheaper. Cloning is not cheap, so all of our tech focus is on making cloning cheap, which is relevant to a powerful assembly standard (and our parts will be free/public domain anyway). Essentially what we're doing is focusing on process optimizations that make the assembly system easier, more powerful, and cheaper. For this reason, I'm not concerned about parts themselves, but I am concerned about the simplicity of the system because that directly translates to easier process optimizations. ProClo makes things more complex by adding in a whole new standard to keep track of. SapI keeps things simple, because although you can use it, it for the most part stays completely out of the user and executor's way. Recursive BsaI is also similar - it flattens the vector and insert landscape, simplifying overall use. Both can be augmented with complex parts, but the underlying system is relatively simple, allowing for those additions on top. |
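The BsaI versus SapI trade-off discussed above comes down to cut geometry: BsaI (GGTCTC, cutting 1 nt downstream on the top strand and 5 nt on the bottom) leaves a 4-nt overhang, while SapI (GCTCTTC, cutting 1 and 4 nt downstream) leaves a 3-nt overhang, i.e. one codon. The following Python sketch locates sites on both strands and reports the overhang each would expose. The function name and return format are hypothetical, and the example sequences are made up for illustration.

```python
# Recognition site, top-strand cut offset, bottom-strand cut offset,
# with offsets counted downstream from the end of the recognition site.
ENZYMES = {
    "BsaI": ("GGTCTC", 1, 5),   # 4-nt 5' overhang
    "SapI": ("GCTCTTC", 1, 4),  # 3-nt 5' overhang
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_overhangs(seq, enzyme):
    """Find Type IIS sites on both strands of a linear sequence.

    Returns (strand, site_start, overhang) tuples. Positions and overhangs
    are reported 5'->3' on the strand that carries the recognition site,
    so '-' hits are in reverse-complement coordinates.
    """
    site, top_cut, bottom_cut = ENZYMES[enzyme]
    hits = []
    for strand, s in (("+", seq.upper()), ("-", reverse_complement(seq))):
        start = s.find(site)
        while start != -1:
            lo = start + len(site) + top_cut
            hi = start + len(site) + bottom_cut
            if hi <= len(s):  # skip sites that would cut past the end
                hits.append((strand, start, s[lo:hi]))
            start = s.find(site, start + 1)
    return hits


# Made-up example fragments: a BsaI site exposing AATG, a SapI site exposing ATG.
print(find_overhangs("AAGGTCTCAAATGCC", "BsaI"))
print(find_overhangs("TTGCTCTTCAATGCC", "SapI"))
```

The sketch ignores methylation entirely, so it cannot distinguish a regular BsaI site from a methylated mBsaI site; that distinction would have to be tracked as separate annotation.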
Isaac N -> Keoni, Isaac, Scott
Hi! About SapI, as far as I know, its patent has expired. Overall, although it is a bit slower than BsaI, it works really nicely (it's stored at -20°C, is stable, cheap, etc.). Another thing: I don't recommend using the "MoClo part definition of CDSs (AATG-GCTT)"; instead, I think it is more valuable to use the CIDAR/uLoop/ProClo CDS definition (AATG-AGGT). Best!
Scott -> Keoni, Isaac**2
Hi all, Yes, SapI and its methylases are IP free and indeed their CDSs are included in the Open Enzyme Collection! You can make your own. I agree with Isaac Núñez in that we should conform to existing standards as best as possible. I noticed in the ProClo draft that quite a few overhangs differ from the “Freegenes” standard used in the E. coli Protein Expression Toolkit that we all contributed to creating. Any changes should be justified with a strong rationale. I made a 2nd tab in the g-sheet that compares the two. Cheers,
Questions and responses for @Prosimio:
Responses for Scott:
Responses/questions for @Koeng101:
This suggests that recursive mBsaI linkers that use enzymatic methylation with HpaII or MspI, both of which methylate the bottom strand, are probably going to suffer from relatively high rates of undesired digestion of methylated mBsaI sites during assembly, correct? Though this data seems to contradict the results from the Great Lakes Biotech researcher you reference, who appears to have achieved very low rates of cleavage of bottom-strand-methylated BsaI sites. I'm guessing there's no way to enzymatically methylate the top strand cytosines (GGT C T C, T1 and T2 in this paper's nomenclature) only in mBsaI sites, but not in regular BsaI sites; and this is why you're looking into chemically synthesized oligos (correct?).
One takeaway I'm getting from this conversation is that there's a bit more technological uncertainty/risk around at least the enzymatic/oligo-free version of recursive mBsaI assembly than I'd been aware of. Given that, I think it might be worthwhile for Friendzymes to also design linker sets that perform part type switching with a second, orthogonal Type IIS restriction enzyme, as the assembly linkers/connectors Scott designed for the Open Yeast Collection do with BbsI. Do you all agree? |
Yep. It'll also depend on the results of an experiment I'll be running with SapI. If it turns out that it is impractical in reality, I'd be happy to switch over. But I think it is something that is hard to get consensus on until there are real experiments run.
I agree, we should talk it over.
A little diagram from the recursive bsai repo might help:
Basically, that particular set (and essentially only that set) allows for efficient part switching while keeping the methylation site away from the ligation site.
It is only able to do those sequences, but the real reason I'm looking into chemical synthesis is purely expediency. Gotta get it working ASAP. B1 is conservative for when you need seamless overhangs (like doing DNA assembly), while T1 or T2 is what we'd probably use in linkers. Really not a thing about efficiency though, it's pretty much all expediency.
You just become dependent on other things, like the ability to manufacture chemicals for minipreps and such. Still, I agree with you, because I think that supply chain independence is fun!
I have considered, but these experiments are rather expensive (would be about $500 to test that). Overall, you're making a basic assumption that might not be true - that the inefficiency practically matters. Our evidence right now from our friends at Great Lakes would point towards it not mattering.
To be honest, I didn't have the right keywords (ligase binding footprint) to find that info beforehand. Looks like it is 11 base pairs roughly. So if methylation breaks things, it breaks things. I suspect it'll be fine cause T4 is known to methylate itself, but still might break things. |
A work session seems like it would help here. |
Wrote this up above in an edit to my AllClo notes, reposting here for thoughts/feedback. Here's a proposed architecture/library of part types for AllClo: AllClo should combine the following into a single unified assembly standard, using a single high fidelity 4bp BsaI overhang set:
Important note: I propose that we handle this balance between optional complexity and default simplicity with Part Type Switching (PTS) linkers. For example, by default the core promoter part will have the same overhangs as in uLoop. To build a more complex 5' UTR, an assembly reaction will be performed between the promoter and two PTS linkers, that can expose a new pair of 4 bp overhangs, changing the promoter's overhang definition to a pair of FiveClo overhangs (BsaI/BbsI uLoop promoter-->FiveClo promoter PTS linker example here; BsaI/mBsaI uLoop promoter-->FiveClo promoter PTS linker example here). In parallel, the set of FiveClo parts that one wishes to assemble and use will also undergo PTS reactions with linkers to convert their overhangs to close any gaps in the assembly caused by unused part slots. This way, while each promoter starts out with FiveClo-incompatible overhangs and each FiveClo part listed below starts with its own unique combination of two overhangs, a single round of PTS reactions later (followed by either transformation/cloning or PCR amplification of PTSed parts), all parts have the overhangs required for the desired assembly; and moreover these overhangs are compatible with any combination of VecClo, ProClo, and ThreeClo parts as well, up to and including one-pot full vector assembly reactions. This PTS design strategy can be applied to VecClo, FiveClo, ProClo, and ThreeClo, ensuring simple and backwards-compatible uLoop assembly by default, while enabling access to the full array of complex/composite AllClo genetic design and high fidelity/one-pot assembly after a single round of PTS reactions. Proposed AllClo part types (overhang sequences TBD)
|
^In total, the above defines 26-27 distinct part types, requiring a high fidelity, uLoop-compatible set of ~36 4bp overhangs. |
Generally my thoughts are:
But:
Standards are... cool?
From my first look over, I think AllClo may be trying to do too many things at once. Basically the classic XKCD.
One thing I have been considering lately is the impact of automation. In fact, I think the field would be far further along if there was a switch to TK's BB-2 (oww has good diagram, basically allows fusions with biobricks type cloning) and the entire focus was instead on making it super easy to actually use those formats. I.e., say JCVI-Syn3 is ~512 genes. Well, if you include inter-gene regions, that's about ~2000 cloning reactions, which is actually doable by a single person in a single week, given proper setup.
BioBricks and doing the thing
I recall asking Drew about how BioBricks started. Well, turns out biobricks started because Drew and Tom's labs just started using it. I think this is a key distinction - are the people who are designing AllClo the people actually using it? Or would it be thrust upon the iGEM community? (I'd ask the same question to SBOL as well.) For an easy example, will any of you, in this discussion, actually be using FiveClo or ThreeClo for a project of your own within the next year? If not, why define it now? The DO definition is cool, but I do have to wonder - can you actually assert that it is "simpler to define a specific DO part type with a unique, AllClo-compatible pair of overhangs"? There is a very good case that people really wouldn't give a shit in most cases, and would just want to be able to switch out the CDS. In that case, it is simpler to have a single dropout part, for example.
Value of the assembly vs of the parts
Still, I think there is value in discussion of assembly, but I do think that it needs to be grounded, in absolute terms, in the parts themselves. What are people going to actually use?
Wetware jam session for finalizing AllClo: at the latest, February 20. |
This is fair for the structure I outlined above, which is more like an attempt to list all the potential level-0 part types I could imagine and what a logical order for them would be.
This year, of the assemblies and part types outlined above, Friendzymes will be using all the VecClo part types for building B. subtilis and P. pastoris shuttle/expression vectors (in fact, these part types have already been built for yeast and most have been built for B. subtilis). Of FiveClo and ThreeClo, we will use the 5' and 3' homology arms, assembly linkers, and recombinase sites for building genomic integration cassettes that can be selected for with a selection marker, then deleted by expression of an inducible recombinase, and then selected against with a counterselection marker--basically, the core components required for strain engineering. We will also use the ribozyme part type to add insulators to our B. subtilis constructs. We will use the ProClo part types to test secretion tags, track expression with reporters, to try purifying proteins with different affinity tags, and to compare the behavior of the enzymes we make with and without the tags attached (these are already built with ProClo1.0 assembly standard; to reduce the re-design and re-formatting overhead, we're now leaning toward keeping most of these as-is, and only changing the overhangs that cause fidelity issues with the VecClo overhangs in the Open Yeast Collection). We will be using dropout cassettes, possibly with DOclo linkers, not just for CDSs but for secretion tags and possibly homology arms as well. The part types I don't foresee us using this year, which therefore may not need to be defined yet, include distal promoter elements, operators, polyA-tails, and microRNA binding/cleavage sites. So, simplifying down, the core components of AllClo that we plan to use, and that therefore require definition, are:
Of these, the only ones that are really 'new' (i.e. not already present in FreeGenes toolkits like the Open Yeast Collection or Protein Expression Toolkit) are the 5' & 3' recombinase sites and the ribozymes. Abstracting a bit, the only things that need to be added are optional part types between the 5' linker and promoter, between the promoter and the RBS, and between the terminator and the 3' linker. Like so:
In this way, we can say that newPart1 will require overhangs that match the end of the 5' linker and the start of the Part-Type-Switched, FiveClo-defined promoter; newPart2 will require overhangs that match the end of the PTSed FiveClo promoter and the start of the standard RBS; and newPart3 will require overhangs that match the end of a PTSed ThreeClo terminator and the start of a PTSed ThreeClo 3' linker. So, we only need to define 2 new overhangs for the FiveClo promoter, and 1 new overhang each for the ThreeClo terminator and 3' linker. On the Part Type Switching front, we'll want PTS linkers for converting the 5' and 3' end of a promoter from MoClo/uLoop to FiveClo overhangs. I think we'll also want 'null' linkers that can participate in a PTS reaction (i.e. attach to the promoter and then be cloned/transformed or PCR amplified), but that don't change the overhang in the secondary Golden Gate reaction. These null PTS linkers are useful in cases where you only want to change one overhang on a part, for instance if you wanted to add a ribozyme insulator, but didn't want to add a 5' recombinase site, next to a promoter. Null linkers will actually be required for PTSing terminators and 3' linkers, since only one of their overhangs changes in the ThreeClo schema above. |
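To make the overhang bookkeeping above concrete, here is a minimal Python sketch that models parts as pairs of overhangs, applies a PTS step (with a null linker leaving a side unchanged), and checks that adjacent parts in an assembly share an overhang. The class and function names are hypothetical. The RBS (TACT-AATG) and CDS (AATG-GCTT) overhangs are the MoClo definitions quoted in this thread; the promoter's GGAG-TACT definition is assumed for illustration, and `FIVE_B` is a placeholder label, since the actual FiveClo overhang sequences are still TBD.

```python
from dataclasses import dataclass, replace
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Part:
    name: str
    left: str   # 5' overhang (sequence or placeholder label)
    right: str  # 3' overhang (sequence or placeholder label)


def apply_pts(part: Part, new_left: Optional[str] = None,
              new_right: Optional[str] = None) -> Part:
    """Return the part after a PTS reaction.

    Passing None for a side models a 'null' linker: that overhang is unchanged.
    """
    return replace(part,
                   left=new_left if new_left is not None else part.left,
                   right=new_right if new_right is not None else part.right)


def chain_breaks(parts: List[Part]) -> List[Tuple[str, str]]:
    """Return (upstream, downstream) name pairs whose overhangs do not match."""
    return [(a.name, b.name) for a, b in zip(parts, parts[1:]) if a.right != b.left]


promoter = Part("promoter", "GGAG", "TACT")   # assumed MoClo-style definition
rbs = Part("RBS", "TACT", "AATG")
cds = Part("CDS", "AATG", "GCTT")
# newPart2 from the text: sits between the PTSed promoter and the standard RBS.
ribozyme = Part("ribozyme", "FIVE_B", "TACT")  # FIVE_B is a placeholder label

# Default assembly works without any PTS step.
print(chain_breaks([promoter, rbs, cds]))
# Adding the ribozyme without PTS breaks the promoter/ribozyme junction.
print(chain_breaks([promoter, ribozyme, rbs, cds]))
# Null linker on the 5' side, FiveClo conversion on the 3' side.
pts_promoter = apply_pts(promoter, new_left=None, new_right="FIVE_B")
print(chain_breaks([pts_promoter, ribozyme, rbs, cds]))
```

Keeping the PTS step as a pure function over immutable parts means the default (un-switched) library stays intact, which mirrors the goal of backwards compatibility by default.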
AllClo
Unification, harmonizations and validations of high fidelity MoClo genetic assembly standard expansions
Our objective is to create a standard expansion unifying all the previous ones (or at least harmonizing them as much as possible). List of standards:
Obs: We need to validate each expansion of the assembly standard, and also BtgZI/BsaI-based part type switching (and/or methylation-based part type switching!) to go from a simplified assembly to a more complex assembly.