WO2013169867A1 - Methods and compositions for rewritable digital data storage in live cells - Google Patents
Methods and compositions for rewritable digital data storage in live cells
- Publication number
- WO2013169867A1 (PCT/US2013/040089)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- integrase
- excisionase
- dna
- storage system
- data storage
- Prior art date
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/12—Computing arrangements based on biological models using genetic models
- G06N3/123—DNA computing
- B—PERFORMING OPERATIONS; TRANSPORTING
- B82—NANOTECHNOLOGY
- B82Y—SPECIFIC USES OR APPLICATIONS OF NANOSTRUCTURES; MEASUREMENT OR ANALYSIS OF NANOSTRUCTURES; MANUFACTURE OR TREATMENT OF NANOSTRUCTURES
- B82Y10/00—Nanotechnology for information processing, storage or transmission, e.g. quantum computing or single electron logic
Definitions
- Such epigenetic storage systems can be subject to evolutionary counter selection due to resource burdens placed on the host cell or spontaneous switching due to putatively stochastic fluctuations in cellular processes including gene expression.
- engineered transmission of DNA molecules could support data exchange between organisms as needed to implement higher-order multicellular behaviors within programmed consortia (Ham TS, Lee SK, Keasling JD, Arkin AP (2008) Design and construction of a double inversion recombination switch for heritable sequential genetic memory. PLoS ONE 3:e2815; Abelson H et al. (2000) Amorphous computing. Communications of the ACM 43:74-82). Practically, researchers have begun to use enzymes that modify DNA, typically site-specific recombinases, to study and control engineered genetic systems.
- recombinases can catalyze strand exchange between specific DNA sequences and enable precise manipulation of DNA in vitro and in vivo (Grindley NDF, Whiteson KL, Rice PA (2006) Mechanisms of site-specific recombination. Annu Rev Biochem 75:567-605). Depending on the relative location or orientation of recombination sites, three distinct recombination outcomes (integration, excision, or inversion) can be realized.
- Single-write architectures are limiting if many of the uses for genetic data storage are considered in detail.
- studies of replicative aging in yeast or human fibroblasts typically track at least 25 or 45 cell division events prior to the onset of senescence, respectively (Steinkraus KA, Kaeberlein M, Kennedy BK (2008) Replicative aging in yeast: the means to the end. Annu Rev Cell Dev Biol 24:29-54).
- Lineage mapping during worm development frequently tracks at least 10 differentiation events (Sulston JE, Schierenberg E, White JG, Thomson JN (1983), The embryonic cell lineage of the nematode Caenorhabditis elegans. Dev Biol 100:64-119), while research with mouse and human systems considers up to several hundred cell divisions (Frumkin D, Wasserstrom A, Kaplan S, Feige U, Shapiro E (2005)
- aspects of the present invention relate to methods and systems for rewritable digital data storage in live cells.
- binary digits can be stored in chromosomes, which enables combinatorial data storage.
- a DNA element can be flipped within the chromosome of a live cell, and then be flipped back to its original state. These two steps can be repeated an infinite number of times, thereby creating a binary digit (bit) data register.
- bit: binary digit
- Such rewritability of a passive data storage system (one requiring no gene expression or active cellular process) is achieved for the first time by the present invention via a DNA-encoded state register.
- a combinatorial data storage having up to 2^N bits (if N registers) can be built, which is a drastic increase from the previous storage size of N bits.
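As a rough illustration of the combinatorial storage described above, the sketch below enumerates the joint states of N independent one-bit registers: each register contributes one binary digit, so N registers can distinguish 2^N combined states. The function name `register_states` is hypothetical and is not taken from the patent.

```python
from itertools import product


def register_states(n_registers: int) -> list[tuple[int, ...]]:
    """Enumerate every joint state of n independent one-bit registers.

    Each register is either in reset state 0 or set state 1, so the
    result has 2 ** n_registers entries.
    """
    if n_registers < 0:
        raise ValueError("n_registers must be non-negative")
    return list(product((0, 1), repeat=n_registers))


states = register_states(3)
print(len(states))  # 8
print(states[:3])   # [(0, 0, 0), (0, 0, 1), (0, 1, 0)]
```

With eight registers the same enumeration yields 256 states, which corresponds to the one-byte example mentioned in the text.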
- it is also possible to support multiplexing of data storage (e.g., re-use of the same recombinase enzymes to store >1 bit, such as 1 byte (8 bits)). This is useful for storing information about events inside cells (e.g., cell division or
- environmental information that impacts cells (e.g., cytokine levels, environmental pollutants, etc.).
- methods of the present invention enable the development of recombinase addressable data (RAD) modules that can be used to write and store binary digits within the chromosome of live cells.
- RAD: recombinase addressable data
- So produced RAD systems are capable of passive and stable information storage over at least 100 cell divisions and can be switched repeatedly without performance degradation, as is required to support combinatorial data structures. Additionally, by varying the synthesis and degradation rates of recombinase functions programmed
- serine recombinase can be used.
- the serine recombinase functions used here do not require cell-specific co-factors and can be used to extend computing and control methods to the study and engineering of many biological systems.
- the invention provides an in vivo data storage system and methods for storing data using such a system.
- a system includes a recombinase addressable data module comprising an invertible DNA data register.
- the DNA data register comprises a DNA register sequence flanked by oppositional attachment sites.
- the directionality of the DNA register sequence is invertible to a set state 1, and optionally reversibly invertible from said set state 1 to a reset state 0.
- the system further comprises a set generator comprising a first gene encoding an integrase; a reset generator comprising a second gene encoding an excisionase.
- one pair of set generator and reset generator can represent one binary digit (1 bit); multiple unique pairs (N) of set generators and reset generators can be present in one system which is capable of storing up to 2^N bits of data.
- the reset generator further comprises a third gene encoding an integrase.
- the attachment sites of the invertible DNA data register are recognized and recombined by the integrase of the set generator when the DNA register sequence is in reset state 0.
- the attachment sites of the invertible DNA data register are recognized and recombined by the integrase-excisionase complex of the reset generator when the DNA register sequence is in set state 1.
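The set and reset behavior described above can be sketched as a small state machine. This is an idealized model written for illustration, not the patent's implementation: the class `DataRegister` and its method names are hypothetical, the example sequence is arbitrary, and the assumption that a set pulse leaves a register already in state 1 unchanged (and a reset pulse leaves state 0 unchanged) is a simplification. Inversion of the register sequence is modeled as taking its reverse complement.

```python
# Map each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement, i.e. the sequence read after inversion."""
    return seq.translate(COMPLEMENT)[::-1]


class DataRegister:
    """Idealized invertible DNA data register (illustrative only)."""

    def __init__(self, sequence: str):
        self.sequence = sequence
        self.state = 0  # start in reset state 0

    def _invert(self) -> None:
        self.sequence = reverse_complement(self.sequence)
        self.state ^= 1

    def set_pulse(self) -> None:
        """Integrase alone: recombines the sites only in reset state 0."""
        if self.state == 0:
            self._invert()

    def reset_pulse(self) -> None:
        """Integrase plus excisionase: recombines the sites only in set state 1."""
        if self.state == 1:
            self._invert()


reg = DataRegister("ATGGCC")
reg.set_pulse()
print(reg.state, reg.sequence)  # 1 GGCCAT
reg.reset_pulse()
print(reg.state, reg.sequence)  # 0 ATGGCC
```

Inverting twice restores the original sequence, which is why one set pulse followed by one reset pulse returns the register to state 0.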
- the integrase and excisionase can be derived from bacteriophage Bxb1, TP901-1, PhiRv1 and/or PhiC31, or other sources.
- aspects of the invention may include a reset generator in which the second and third genes together encode an excisionase-integrase complex or a fusion protein comprising the excisionase and the integrase.
- at least one of the first, second, or third gene is inducible.
- the inducible gene may be directly or indirectly induced by an inducer.
- an inducer may directly or indirectly activate transcription, or indirectly or directly inactivate a repressor of transcription.
- an inducer may require one or more co-factors.
- aspects include an inducer that is a small molecule, a chemical, a protein, an enzyme, a nucleic acid, or a metal ion.
- the inducer is an endogenous inducer; and in other embodiments, the inducer may be exogenous.
- at least one of the first, second or third genes is inducible by a transcription factor; or by two or more inducers functioning together or in the alternate. Aspects also include at least one of the first, second or third genes being autoinducible.
- Another embodiment provides a vector comprising an in vivo data storage system
- a recombinant cell comprising an in vivo data storage system.
- the in vivo data storage system may be present, in whole or in part, in a chromosome of the recombinant cell.
- the DNA data register is present in a chromosome of the recombinant cell.
- at least one of the first, second or third genes is present in a chromosome of the recombinant cell.
- aspects of the invention also include a method for storing data in a cell.
- the method comprises providing a cell comprising an in vivo data storage system having a recombinase addressable data module including (i) a set generator comprising a first gene encoding an integrase; and (ii) an invertible DNA data register comprising a DNA register sequence flanked by oppositional attachment sites, wherein the directionality of the DNA register sequence is invertible to a set state 1, and optionally reversibly invertible from said set state 1 to a reset state 0.
- methods according to the invention comprise inducing the first gene or allowing the induction of the first gene to express the integrase so as to allow the DNA register sequence to invert, thereby generating a set state of directionality for the DNA register sequence, the set state represented by binary digit 1, thereby storing data represented by the binary digit 1 in the cell.
- the recombinase addressable data module further comprises a reset generator comprising a second gene encoding an excisionase and, optionally, a third gene encoding an integrase
- the method further comprises: inducing the second gene or allowing the induction of the second gene to express the excisionase so as to allow the DNA register sequence to invert back, thereby generating a reset state of directionality for the DNA register sequence, the reset state represented by binary digit 0, thereby storing data represented by the binary digit 0 in the cell.
- methods of the invention comprise optionally repeating the inversion step, thereby storing data represented by the binary digit 1 or 0 in the cell.
- aspects of the invention may relate to controlling expression of the excisionase to provide a stoichiometric amount of the excisionase in relation to one or both of an amount of an integrase and a copy number of the DNA register sequence, so as to favor generation of the reset state 0 of the DNA register sequence.
- Methods according to the invention may further comprise tunably controlling the reversible inversion of the DNA register sequence between a state 0 and a set state 1.
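- Some embodiments comprise the method comprising one or more of: minimizing spontaneous inversion to set state 1; minimizing, during the inversion step, interference by excisionase to favor generation of the set state 1; and minimizing, during the reverse inversion step, stoichiometry mismatch to favor generation of state 0.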
- methods of the invention comprise minimizing spontaneous inversion to set state 1 by controlling basal expression of the integrase below a threshold level for spontaneous inversion.
- methods of the invention include minimizing, during the inversion step, interference by excisionase to favor generation of the set state 1 by increasing degradation of the excisionase.
- methods of the invention comprising minimizing during the reverse inversion step stoichiometry mismatch to favor generation of state 0 by one or more of: increasing expression of the excisionase, decreasing expression of the integrase, increasing degradation of the integrase, and reducing a copy number of the DNA register sequence.
- aspects of the invention also relate to an in vivo data storage system which is a nonvolatile data storage system, and methods of storing data using a nonvolatile in vivo data storage system.
- the DNA data register stores data as a nonvolatile memory.
- the stored data is retained until the DNA data register is rewritten or recoded with different data, erased (returned to its original state), or otherwise rendered unreadable.
- the stored data is retained in the absence of expression of the first gene.
- the stored data is retained in the absence of expression of the first gene, the second gene, or both the first and second genes.
- Other embodiments may provide an in vivo data storage system which is a volatile data storage system and methods of recording data using volatile in vivo data storage systems.
- Figures 1A-1D Architecture, mechanisms, and operation of a recombinase addressable data (RAD) module.
- Figures 2A-2G Independent set and reset operations plus long-term data storage and switching in vivo.
- Figures 3A-3C Functional composition, expected operable ranges, and RESET failure modes for a RAD module.
- Figures 4A-4C Optimized genetic elements and reliable multi-cycle operation of a DNA- inversion RAD module.
- Figures 5A-5I Maps of certain constructs used.
- Fig. 5A The PBAD-Int set flipper where Bxb1 integrase was cloned downstream of the PBAD/AraC promoter (BBa_I0500) on pSB3K1 plasmid bearing a p15A origin of replication (15-20 copies).
- Fig. 5B The PBAD-Xis/Int reset flipper circuit.
- Fig. 5C The screening vector and Fig. 5D, The RAD module depicted in Fig. 4A.
- Figures 6A-6B Alternate architecture for a reset circuit.
- Fig. 6A Schematic diagram of the decoupled reset circuit where integrase is expressed from a low-copy plasmid while excisionase is expressed from a medium-copy plasmid.
- Fig. 6B Cells bearing the chromosomal LR DNA register were transformed with both plasmids encoding integrase and excisionase, pulsed with arabinose and analyzed by flow cytometry. Cells relaxed to the BP state after induction with approximately 85% efficiency.
- Figures 7A-7C Influence of register copy number on recombination efficiency and consequences for integrase-excisionase mechanism.
- Fig. 7A Influence of copy number of the DNA register on the efficiency of integrase-excisionase mediated recombination.
- the bidirectional reset generator from Fig. 2C was transformed in cells containing the DNA data register in the LR state on a pSC101 plasmid (upper panel) or integrated in the chromosome (lower panel), and cells were pulsed with arabinose. When the register was on the chromosome, cells were driven toward the BP state more efficiently during induction, and the recombination efficiency of the Int/Xis reaction for the LR to BP reaction was higher after inducer removal.
- Figure 8 Screening of different RBS designed using the RBS calculator. Constructs were tested using a DNA register that expresses gemini when flipped. Cells were co-transformed with the target and the different RBS variants constructs. Cells were grown with or without
- Figures 9A-9D Effect of the down-regulation of excisionase on set and reset functions.
- Fig. 9A schematic representation of the RAD module used for this particular experiment.
- the DNA data register has only one output, GFP, in the LR state.
- the PLtet-O-1 promoter controls the set integrase while the PBAD promoter controls a polycistron expressing integrase and excisionase.
- Both set and reset circuits are cloned on pSB3K1 plasmid (p15A origin, 15-20 copies).
- Fig. 9B control experiments.
- Left panel BP and LR Target constructs on pSB4A5 low-copy plasmid (5-10 copies), showing the 2 states of the system (low or high GFP for BP and LR, respectively).
- Right panel a RAD module with no copy of the excisionase was transformed in cells containing the DNA data register in the BP state on pSB4A5 and the set generator was induced with aTc. Cells flipped to the LR state as monitored by GFP expression.
- Fig. 9C reduction of interference by down regulation of excisionase basal levels.
- a reset generator in which the excisionase is down-regulated with an AAK ssrA tag acts as a set generator and can flip from BP to LR due to stoichiometry mismatch and higher integrase levels.
- Figures 10A-10C Example of an efficient reset generator setting-back after the end of a pulse.
- Fig. 10A Kinetic model based simulation of the reset efficiencies during and after the pulse for different expression scaling.
- a gray dot marks an integrase-excisionase expression scale that keeps the latch in an intermediate state both during and after a RESET pulse; a black dot (in white circle) marks an expression scale that causes the latch to revert back to LR state after a reset pulse.
- Fig. 10B Time-course simulation showing resetting failures during and after a RESET pulse. Gray and black lines are simulated using integrase and excisionase expression scaling parameters as marked with gray and black dots in Fig. 10A.
- Fig. 10C Schematic representation of the particular construct used in Fig.
- excisionase has a strong RBS and AAK degradation tag while integrase has a very weak RBS (BBa_B0033) followed by a GTG start codon. Therefore, even if the stoichiometry between the two proteins is correct during the pulse, the degradation rate of excisionase is higher than for integrase, resulting in entry into the set regime after the pulse.
- Figures 11A-11B Example of a set circuit resetting back after the end of a pulse.
- Fig. 11A Kinetic model based simulation of the reset efficiencies during and after the pulse for different expression scaling.
- a gray dot (in white circle) marks an integrase-excisionase
- Figure 12 Detailed parameter sensitivity analysis of DNA inversion RAD module operable range, (i), Kinetic model based simulation of the set, reset and SR-latch efficiencies after the pulse for different expression scaling and kinetic parameters.
- the integrase lower bound of the operable range scales with integrase-flipper dissociation constant (ii), the excisionase upper bound of the set operable range and the excisionase lower bound of the reset operable range scale with integrase-excisionase dissociation constant (iii), the excisionase lower bound of the reset operable range and the integrase lower bound of the set and the reset operable range scale with the fold change between induced and basal expression (iv) increasing the amount of DNA register changes the excisionase upper bound of set operable range and the excisionase lower bound of reset operable range.
- Figures 13A-13F Detailed data for the RAD module operation cycles.
- Fig. 13C Quantification of switching plus storage efficiency of the RAD module for long input cycles.
- Fig. 13D Quantification of switching plus storage efficiency of the RAD module for short input cycles.
- Figs. 13C and 13D error bars are
- Fig. 13E long term storage of the BP state in the context of the RAD module.
- Cells in the LR state were RESET with arabinose, switched and stored the BP state. After 40 generations, cells were SET back with aTc (blue/open circle line) or grown without inducer up to 100 generations of storage (red/filled square line).
- Fig. 13F long term storage of the LR state in the context of the RAD module. After one cycle of RESET with arabinose and SET with aTc, cells were grown for 40 generations, RESET with arabinose or grown without inducer up to 100 generations. In both Figs. 13E and 13F, switching plus storage efficiency was comparable with the initial efficiencies.
- Figures 14A-14C Comparison of operable ranges of S/R latches using alternative mechanisms.
- Fig. 14A Schematic diagram of a hypothetical S/R latch based on a DNA inversion RAD module whose DNA register can be inverted and reverted by two different recombinases, Rec1 and Rec2 respectively.
- Fig. 14B Schematic diagram of a mutual inhibition S/R latch. A pair of mutually repressed genes functions as a bistable switch; an additional copy of each gene, driven by SET or RESET inputs, is necessary to couple arbitrary transcriptional signals to the state of the bistable switch.
- Fig. 14C Kinetic model based simulation of the set, reset and SR-latch. Note that the operable ranges for set and reset circuits are symmetric and there is no efficiency loss due to stoichiometry mismatch as in the case of the integrase-excisionase based DNA inversion latch.
- Figures 15A-15B Tuning the reset-specific integrase degradation rate allows for different bias in the outcome of integrase-excisionase mediated recombination.
- Fig. 15A Schematic representation of the system used in this experiment. Only the integrase ssrA degradation tag changes. These constructs are alternate reset elements obtained during the screen for a functional reset using a destabilized excisionase.
- Fig. 15B Three reset elements using different ssrA sequences for the integrase display different proportions of cells in the BP state after pulse (pink: gate used to measure the percentage of cells in BP along with the actual value).
- Figures 16A-16Q Maps of various genetic elements used herein. Fig. 16A,
- Fig. 16F PhiC31PbadXisPtetIntj64100.
- Fig. 16G Phirv1 constIntj64100.
- Fig. 16H Phirv1 PbadXisPtetInt integrationVector.
- Fig. 16I PhiC31PbadXisPtetInt_integrationVector.
- TP901-1, Phirv1 and PhiC31. Cells bearing chromosomal BP data register built from attB/attP recombination sites of bacteriophage Bxb1, TP901-1, Phirv1 or PhiC31 were transformed with a medium copy plasmid expressing their cognate integrases. Expression of the integrase drove state switching of the data register from BP state to LR state (arrows indicate switching from gray/before switching to black/after switching).
- Figure 18 Independent reset operations using serine integrases and excisionases from bacteriophage Bxb1, TP901-1, Phirv1 and PhiC31.
- Cells bearing chromosomal LR data register built from attL/attR recombination sites of bacteriophage Bxb1, TP901-1, Phirv1 or PhiC31 were transformed with a medium copy plasmid expressing their cognate integrases and excisionases.
- Coexpressing integrase and excisionase drove state switching of the data register from LR state toward BP state (arrows indicate switching from gray/before switching to black/after switching).
- Figure 19 Specificity of integrase-DNA data register from bacteriophage Bxb1, TP901-1, Phirv1 and PhiC31.
- Cells bearing chromosomal BP data register from bacteriophage Bxb1, TP901-1, Phirv1 or PhiC31 were transformed with a medium copy plasmid expressing an integrase from bacteriophage Bxb1, TP901-1, Phirv1 or PhiC31 (represented by black dots; negative control is represented by gray dots where there is no integrase expression). Only when the integrase and the data register are from the same bacteriophage can the integrase efficiently switch the data register from BP state to LR state (i.e., black dots that do not overlap with gray dots).
- Figure 20 Resetting by RAD modules with both integrase and excisionase expressed from chromosome.
- the PBAD-Xis / Ptet-Int flipper circuit was integrated into the Escherichia coli chromosome at the phage HK022 integration site; its cognate LR data register was integrated at the phage phi80 site (Bxb1 and PhiC31) or at the phage p21 site (Phirv1). Dual induction with 0.1% arabinose and 200 ng/ml anhydrotetracycline drove state switching from LR to BP state.
- phage integrases are unique in that the directionality of the recombination reaction can be influenced by an excisionase cofactor (Groth AC, Calos MP (2004) Phage integrases: biology and applications. Journal of Molecular Biology 335:667-678).
- a phage integrase alone typically catalyzes site-specific recombination between an attP site on the infecting phage chromosome and an attB site encoded within the host chromosome.
- the resulting integration reaction inserts the phage genome within the host chromosome bracketed by newly formed attL and attR (LR) sites.
- Phage integrases are thought to represent two evolutionarily and mechanistically distinct recombinase families (Groth AC, Calos MP (2004) Phage integrases: biology and applications. Journal of Molecular Biology 335:667-678).
- Tyrosine integrases, such as the bacteriophage lambda integrase, often have relatively long attachment sites (~200 bp), use a Holliday junction mechanism during strand exchange, and require host-specific cofactors.
- serine integrases use a double-strand break mechanism during recombination and can have shorter attachment sites (~50 bp).
- some serine integrases do not require host cofactors, a feature that has led to their successful reuse across a range of organisms (Keravala A et al.
- bacteriophage serine integrase was used in the examples provided herein.
- suitable bacteriophage serine integrases include, but are not limited to, Hin, Gin, Cin, φC31, φRv1, R4, TP901, A118, U153, Bxb1 and φFC1.
- Bacteriophage Bxb1 now provides the best characterized serine integrase-excisionase system (Kim AI et al. (2003) Mycobacteriophage Bxb1 integrates into the Mycobacterium smegmatis groEL1 gene. Molecular Microbiology 50:463-473; Ghosh P, Kim AI, Hatfull GF (2003) The orientation of mycobacteriophage Bxb1 integration is solely dependent on the central dinucleotide of attP and attB. Molecular Cell 12:1101-1111; Ghosh P, Wasil LR, Hatfull GF
- Bxb1 gp35 is a serine integrase that catalyzes integration of the Bxb1 genome into the groEL1 gene of M. smegmatis (Kim AI et al. (2003) Mycobacteriophage Bxb1 integrates into the Mycobacterium smegmatis groEL1 gene. Molecular Microbiology 50:463-473).
- Bxb1 gp47 is an excisionase that mediates excision in vivo and has been shown to control recombination directionality in vitro with high efficiency (Ghosh P, Wasil LR, Hatfull GF (2006) Control of phage Bxb1 excision by a novel recombination directionality factor. PLoS Biol 4:e186).
- Minimal attB, attP, attL, and attR sites have been defined for the Bxb1 system (Kim AI et al. (2003) Mycobacteriophage Bxb1 integrates into the Mycobacterium smegmatis groEL1 gene.
- the Bxb1 excisionase does not bind DNA independently and, from in vitro studies, is thought to control integrase directionality in a stoichiometric manner (Ghosh P, Wasil LR, Hatfull GF (2006) Control of phage Bxb1 excision by a novel recombination directionality factor. PLoS Biol 4:e186).
- In Mycobacterium tuberculosis H37Rv, another serine integrase, excisionase and att sites were identified (Bibb LA, Hatfull GF (2002) Integration and excision of the Mycobacterium tuberculosis prophage-like element, phiRv1. Mol Microbiol 45(6):1515-26).
- PhiC31 excisionase can bind to its cognate integrase in the absence of recombination sites.
- the set element for Bxb1 (PBAD-driven-integrase generator, Figs. 2B, 2D, 2E) was cloned in pSB3K1 plasmid (p15A origin; 15-20 copies).
- the reset element for Bxb1 (PBAD-driven-excisionase+integrase generator, Fig. 2C, 2D, 2E) was cloned on J64100 plasmid.
- the full RAD module (PLtet-O-1 driven integrase generator and PBAD-driven-excisionase+integrase generator, Fig. 4, Fig. 14) was cloned in J64100.
- PBAD-driven excisionase / Ptet-driven integrase generator (Fig. 17 Bxb1, TP901-1 and PhiC31, Fig. 18) was cloned in J64100 plasmid.
- Constitutive promoter-driven integrase generator (Fig. 17 Phirv1, Fig. 19) was cloned in J64100 plasmid.
- L-arabinose (Calbiochem) was used at a final 0.5% w/v concentration; anhydrotetracycline (Sigma) was used at a final concentration of 20 ng/ml.
- Figs. 17-20 cells were grown in Hi-Def Azure medium (Teknova) supplemented with 0.66 % v/v glycerol.
- L-arabinose was used at a final 0.2% (Fig. 17 and 18) or 0.1% (Fig. 20) w/v concentration; anhydrotetracycline was used at a final concentration of 200 ng/ml.
- a saturated culture was diluted 1:2000 in media with inducer.
- Cells were centrifuged and washed before each dilution step.
- overnight grown cultures were diluted 1:100 in media with inducer, grown for 4 hours, at which point cells were washed, diluted 1:2000, and grown for an additional 16 h. For Figs. 17-20 induction was done overnight.
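As a rough check on the growth arithmetic behind these dilutions, the number of doublings a culture needs to regain its starting density after a dilution is the base-2 logarithm of the dilution factor. The sketch below assumes ideal exponential growth with no lag phase; the function names are illustrative and not from the source, and the ~90 min doubling time is the value quoted for growing cells elsewhere in this document.

```python
import math


def doublings_to_regrow(dilution_factor: float) -> float:
    """Doublings needed to return to the pre-dilution density (ideal exponential growth)."""
    if dilution_factor < 1:
        raise ValueError("dilution_factor must be >= 1")
    return math.log2(dilution_factor)


def doublings_in(hours: float, doubling_time_min: float) -> float:
    """Doublings completed in a given time at a fixed doubling time."""
    if doubling_time_min <= 0:
        raise ValueError("doubling_time_min must be positive")
    return hours * 60.0 / doubling_time_min


print(round(doublings_to_regrow(2000), 2))  # 1:2000 dilution -> 10.97
print(round(doublings_in(16, 90), 2))       # 16 h at an assumed ~90 min doubling time -> 10.67
```

Under these assumptions both numbers land near 11, so one 1:2000 dilution followed by 16 h of growth corresponds to roughly ten to eleven generations.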
- the RAD module consists of an inducible "set" generator producing integrase, an inducible "reset" generator producing integrase and excisionase, and a DNA data register (Fig. 1A).
- An alternative RAD module architecture used in Fig. 17 (Bxb1, TP901-1 and PhiC31) and Fig. 18 has integrase and excisionase under two different inducible promoters (Ptet and PBAD, respectively) so that integrase and
- the DNA inversion RAD module is driven by two generic transcription input signals, set and reset.
- a set signal drives expression of integrase that inverts a DNA element serving as a genetic data register. Flipping the register converts flanking attB and attP sites to attL and attR sites, respectively.
- a reset signal drives expression of integrase and excisionase and restores both register orientation and the original flanking attB and attP sites.
- the register itself encodes a constitutive promoter which initiates strand-specific transcription. Following successful set or reset operations, mutually exclusive transcription outputs "1" or "0" are activated, respectively. For the RAD module developed here a "1" or "0" register state produces red or green fluorescent protein, respectively.
- integrase alone should set a DNA register sequence flanked by oppositional attB and attP sites thereby producing an inverted sequence flanked by attL and attR sites (State "1").
- a second independent transcriptional input drives the simultaneous production of integrase and excisionase and should reset the register sequence to its original orientation and flanking sequences (State "0").
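The set and reset behavior described above can be illustrated with a small, idealized model of the register at the DNA level. This is a minimal sketch, not the source's implementation: the class and method names are hypothetical, recombination is treated as deterministic and complete, and the failure modes discussed elsewhere (spontaneous flipping, stoichiometry mismatch) are ignored.

```python
from dataclasses import dataclass

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement, i.e. the sequence as read after inversion."""
    return seq.translate(_COMPLEMENT)[::-1]


@dataclass
class DataRegister:
    """Idealized invertible register (hypothetical model, not from the source)."""
    sequence: str                       # register sequence between the attachment sites
    sites: tuple = ("attB", "attP")     # flanking sites; BP is state 0, LR is state 1

    @property
    def state(self) -> int:
        return 0 if self.sites == ("attB", "attP") else 1

    def apply(self, integrase: bool, excisionase: bool = False) -> None:
        if not integrase:
            return  # assumption: no inversion without integrase
        if not excisionase and self.state == 0:
            # set: integrase alone inverts BP -> LR
            self.sequence = reverse_complement(self.sequence)
            self.sites = ("attL", "attR")
        elif excisionase and self.state == 1:
            # reset: integrase plus excisionase inverts LR -> BP
            self.sequence = reverse_complement(self.sequence)
            self.sites = ("attB", "attP")


reg = DataRegister("ATGCCC")
reg.apply(integrase=True)                    # set
print(reg.state, reg.sites, reg.sequence)    # 1 ('attL', 'attR') GGGCAT
reg.apply(integrase=True, excisionase=True)  # reset
print(reg.state, reg.sites, reg.sequence)    # 0 ('attB', 'attP') ATGCCC
```

Inverting twice restores the original sequence, which is why a reset returns both the orientation and the original flanking sites.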
- Fig. IB shows the elementary chemical reactions, molecular species, and kinetic parameters used to model the RAD module.
- Molecular concentrations are normalized to the integrase dimer dissociation constant (K_i).
- Kinetic rates are normalized to the integrase-mediated recombination rate (k_c^-1).
- the model reflects available
- Fig. 1C is a simulated phase diagram detailing pseudo equilibrium operating regimes for a RAD module experiencing sustained integrase and excisionase expression levels for 200/k_c.
- the red (left corner curves), green (right corner curves), and gray (bottom curves) lines represent, with decreasing intensity, 95, 75, and 55% switching (or hold) efficiencies.
- Three distinct latch operating regions were found as a function of integrase and excisionase expression levels, corresponding to expected "set," "reset," or "hold" operations (Fig. 1C).
- One complete latch cycle requires the dynamic adjustment of integrase and excisionase expression through a "set, hold, reset, hold" pattern. These operations are realized in practice by cycling the transcription signals that define latch set and reset inputs and by tuning the specific genetic elements that provide fine control over integrase and excisionase synthesis and degradation.
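The "set, hold, reset, hold" cycle can be traced with a simple state table. The sketch below is an idealization under stated assumptions: each input maps to an all-or-nothing expression pattern (set gives integrase only, reset gives integrase plus excisionase, hold gives neither), and every operation succeeds completely. The names `EXPRESSION`, `next_state`, and `run_latch` are illustrative and do not appear in the source.

```python
from typing import Iterable, List

# Idealized protein expression assumed for each latch input.
EXPRESSION = {
    "set":   {"integrase": True,  "excisionase": False},
    "reset": {"integrase": True,  "excisionase": True},
    "hold":  {"integrase": False, "excisionase": False},
}


def next_state(state: int, op: str) -> int:
    """Return the register state after one idealized operation."""
    expr = EXPRESSION[op]
    if expr["integrase"] and not expr["excisionase"]:
        return 1          # integrase alone drives the register to state 1
    if expr["integrase"] and expr["excisionase"]:
        return 0          # integrase plus excisionase drives it back to state 0
    return state          # no recombinase expressed: the state is held


def run_latch(ops: Iterable[str], initial: int = 0) -> List[int]:
    """Apply a sequence of inputs and record the state after each step."""
    states = []
    state = initial
    for op in ops:
        state = next_state(state, op)
        states.append(state)
    return states


print(run_latch(["set", "hold", "reset", "hold"]))  # [1, 1, 0, 0]
```

The hold steps are where nonvolatility shows: the state persists with neither protein expressed.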
- a data storage register was first implemented via a DNA fragment encoding fluorescent reporter proteins and Bxb1 recombinase recognition sites flanking a constitutive promoter on the chromosome of E. coli DH5αZ1 (Lutz R, Bujard H (1997) Independent and tight regulation of transcriptional units in Escherichia coli via the LacR/O, the TetR/O and AraC/I1-I2 regulatory elements. Nucleic Acids Research 25:1203-1210) (Fig. 1A). That the state of the register could be assayed reliably was confirmed via microscopy and cytometry (Fig. 2A). In Fig. 2A, microscopy and flow cytometry data show two distinguishable states for an invertible data register integrated in the E. coli chromosome driving red (RFP) or green (GFP) fluorescent proteins, and also a control sample in which cells express neither reporter. It was next established that the register
- the set-encoding vectors were transformed into cells containing the chromosomal BP register, and cells that only switched when induced were isolated; many variants switch spontaneously in the absence of an input signal or do not switch when induced (Fig. 8, Table 2).
- Set functions were isolated that switch with greater than 95% efficiency at the single cell level and that hold state following inducer removal (Fig. 2B).
- Fig. 2B shows that the data register inverts via expression of integrase. Growing cells (doubling time ~90 min) start in state "0" expressing GFP and, following a 16 hour set input pulse, switch to and hold state "1" expressing RFP.
- Fig. 2C shows bidirectionality of the integrase-excisionase reaction.
- Cells were transformed with plasmids containing the LR DNA data register and a bi-directional reset element, and pulsed with arabinose. During a pulse, cells entered an intermediate state where both GFP and RFP are expressed. After inducer removal, cells split into two major populations corresponding to BP and LR states. Split BP and LR populations were sorted by FACS and the sorted cells were pulsed with arabinose again. The same behavior was observed regardless of the initial register state.
- Fig. 2D shows the stochastic simulation of bidirectional DNA inversion for a single copy DNA register (top row) before, during (blue shaded area) and after a reset pulse.
- BP to LR and LR to BP recombination propensities are assumed to be equal.
- Two independent time-course stochastic simulations (middle and bottom rows) of expected GFP and RFP expression levels given the depicted (top row) BP and LR states.
- Fluorescent reporter degradation propensities are modeled as ten-fold slower than recombination propensities. From this framing, reset controllers were then engineered that produce a range of weighted outcomes in the final register state by tuning the reset-specific integrase degradation rate (25, 50, 75% BP:LR distributions; Fig. 15).
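The equal-propensity assumption can be illustrated with a minimal Gillespie-style simulation of a single-copy register that flips back and forth while the pulse is on and holds its state afterwards. This is a sketch under stated assumptions rather than the kinetic model used for the figures: reporter expression is omitted, and `k_flip`, `pulse_duration`, and the function names are hypothetical.

```python
import random


def simulate_reset_pulse(initial: str, pulse_duration: float,
                         k_flip: float, rng: random.Random) -> str:
    """Simulate one register copy; return its state when the pulse ends."""
    state, t = initial, 0.0
    while True:
        # Exponential waiting time to the next inversion; BP->LR and LR->BP
        # are assumed to share the same propensity k_flip.
        t += rng.expovariate(k_flip)
        if t > pulse_duration:
            return state  # pulse over: no further inversion, the state is held
        state = "LR" if state == "BP" else "BP"


def fraction_bp(n_cells: int, initial: str = "LR", pulse_duration: float = 20.0,
                k_flip: float = 1.0, seed: int = 0) -> float:
    """Fraction of simulated cells ending in the BP state."""
    rng = random.Random(seed)
    finals = [simulate_reset_pulse(initial, pulse_duration, k_flip, rng)
              for _ in range(n_cells)]
    return finals.count("BP") / n_cells


print(fraction_bp(2000))
```

With a pulse much longer than 1/`k_flip`, the final state becomes nearly independent of the starting state and the population splits close to evenly. Biasing that split is what tuning the reset-specific integrase degradation rate is described as achieving.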
- the fraction of integrase available for excisionase binding may be decreased and therefore the effective excisionase-to-integrase ratio may increase.
- on reducing the DNA register copy number from 5-10 per cell (Fig. 2C) to 1 per cell (Fig. 7A, bottom), an increase in excisionase-mediated recombination directionality from ~40% to ~65% was observed.
- Such observations are consistent with a model that accounts for relative DNA copy number and whether excisionase can interact with cytoplasmic integrase (Figs. 7B-7C).
- E. coli cells were repetitively grown and diluted every day for 10 days, and data storage was monitored at the single cell level by measuring the continuous expression of fluorescent reporters. It was established that starting from either state the register could switch and hold state for 100+ cell doublings (Fig. 2F) or could hold state and then switch reliably following 90+ cell doublings (Fig. 2G). Fig. 2F shows stable long term data storage. Cells were serially propagated without input signals for 100 generations following data register set (orange) or reset (blue).
- Fig. 2G shows long term functionality of data register. Cells were serially propagated without input signals for 90 generations and then exposed to set (orange) or reset (blue) input signals. The fraction of individual cells switching state was assayed by cytometry. Taken together these data demonstrate the practical stability and long-term
- Fig. 3A shows simulated phase diagrams detailing the expected operation of RAD module functions in response to dynamic pulses of integrase and excisionase across a range of expression levels.
- Top row switching efficiencies (colormap) for set and reset functions when combined in a RAD module.
- Fig. 3B shows reset failure during a reset pulse due to stoichiometric mismatch-mediated bidirectional register switching.
- Fig. 3C shows reset failure immediately following a reset pulse due to stoichiometric mismatch-mediated setting to state "1".
- Fig. 4A shows details of an integrated DNA inversion RAD module optimized for reliable set, reset and storage functions. Specific genetic regulatory elements controlling protein synthesis and degradation were obtained from standard biological parts collections (filled shapes with black outline such as AAK and B0031) (e.g., those from the Registry of Standard Biological Parts at MIT, as well as the International Open Facility
- Fig. 4B shows experimental RAD module operation over multiple duty cycles. Growing cells (doubling time ~90 minutes) starting in state "1" were cycled through a "reset (marked by "R"), hold (marked by "St"), set (marked by "S"), hold" input pattern with each step lasting ~10
- Fig. 4C shows multi-cycle RAD module operation driven by shorter SET and RESET input pulses. As in Fig. 4B but with set and reset pulses lasting for ~2 cell doublings (~3 hours).
- Three additional exemplary data storage registers were implemented via a DNA fragment encoding fluorescent reporter protein and TP901-1, Phirv1 and PhiC31 recombinase recognition sites flanking a constitutive promoter on the chromosome of E. coli DH5αZ1.
- these three data storage registers can be switched from a BP state to an LR state using their cognate integrase (Fig. 17) and from an LR state to a BP state using their cognate integrase-excisionase (Fig. 18) expressed from J64100 plasmids.
- the RAD device design can have the input module (the transcriptional sources, integrase and excisionase generator) on a medium copy plasmid in order to produce enough integrase and excisionase for efficiently switching the state of the output module (recombination site and output promoter) on a genomic DNA, especially during reset.
- functional composition of multiple RAD units may require an ability to connect the output module of one RAD unit to the input module of the other RAD unit.
- integrase and excisionase generators were integrated at the HK022 site on the E. coli chromosome. Integrase was expressed under Ptet while excisionase was expressed under PBAD.
- Table 2 summarizes certain failure modes and engineering solutions for set and reset operations. Putative failure causes as noted.
- Expression cassette schematics highlight (orange) regulatory regions targeted for reengineering with element-specific redesign goals.
- 6N is a full six nucleotide library within a Shine-Dalgarno ribosome binding site (RBS) core.
- RBS-1 and RBS-2 are collections of "standard" or computationally designed RBSs.
- AXX is a peptide library sampling 12 biochemically representative amino acids. The number of independent clonal constructs tested in each case plus corresponding figures are provided.
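For a sense of scale, the theoretical size of each combinatorial library is the alphabet size raised to the number of variable positions. The sketch below is a back-of-the-envelope calculation only; reading "AXX" as two variable residues, each drawn from the 12 sampled amino acids, is an assumption made here, and `library_size` is an illustrative helper.

```python
def library_size(alphabet_size: int, variable_positions: int) -> int:
    """Number of distinct sequences in a fully randomized library."""
    if alphabet_size < 1 or variable_positions < 0:
        raise ValueError("alphabet_size must be >= 1 and variable_positions >= 0")
    return alphabet_size ** variable_positions


print(library_size(4, 6))   # 6N: 4 nucleotides at 6 positions -> 4096
print(library_size(12, 2))  # AXX, assuming 2 variable residues from 12 amino acids -> 144
```

These are upper bounds on diversity; the number of clones actually tested in each case is what the table reports.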
- conditional control over recombination directionality to implement a repeatedly rewritable DNA data storage element likely only partially aligns with the natural contexts in which integrase and excisionase performance have been selected.
- integrase alone naturally mediates integration of a phage genome into a host chromosome under circumstances in which the phage will not destructively lyse the host cell.
- Such integration reactions are likely under positive selection to be fast and efficient, given that failure to integrate prior to host chromosome replication and cell division could result in loss of the phage from a daughter lineage. Integration reactions are also likely under negative selection to be irreversible, since integration followed by immediate excision could result in an abortive infection.
- the DNA inversion RAD module developed here should be translatable to applications requiring stable long-term data storage (for example, replicative aging) or under challenging conditions (for example, clinical or environmental contexts requiring in situ diagnosis or ex post facto reporting via PCR or DNA sequencing). Given the natural phage recombination functions from which the latch is implemented (Ringrose L et al. (1998) Comparative kinetic analysis of FLP and Cre recombinases: mathematical models for DNA binding and recombination.
- RDFs recombination directionality factors
- Plasmids were constructed using standard BioBrick™ (Knight T (2003) Idempotent Vector Design for Standard Assembly of Biobricks) or Gibson assembly (Gibson DG et al. (2009)
- RG (1981) Mechanism of araC autoregulation and the domains of two overlapping promoters, Pc and PBAD, in the L-arabinose regulatory region of Escherichia coli. Proc Natl Acad Sci USA 78:752-756) (BBa_I0500), Superfolder GFP (Pedelacq J-D, Cabantous S, Tran T, Terwilliger TC, Waldo GS (2005) Engineering and characterization of a superfolder green fluorescent protein.
- Bxb1 integrase was cloned on pSB4A5 plasmid (pSC101 origin, 5-10 copies) (Shetty RP, Endy D, Knight TF (2008) Engineering BioBrick vectors from BioBrick parts. J Biol Eng 2:5) while the excisionase was cloned on J64100 plasmid (regulated ColE1; 50-70 copies). Sequences are available via GENBANK accession numbers JQ929581 to JQ929585 and via the MIT Registry of Standard Biological Parts.
- the DNA data register in BP and LR states consists of a constitutive promoter (BBa_J23119) flanked by BP or LR recombination sites positioned in opposite orientation, resulting in DNA inversion when recombined (Figs 1 & 2 and Figs 5E and 5F).
- a Rrnp T1 terminator (BBa_J61048) was added in reverse orientation upstream of the promoter to prevent transcriptional read-through in the opposite orientation, so that in each state, only one fluorescent protein is visibly expressed.
- superfolder GFP and mKate2 were cloned (Shcherbo, D. et al. Far-red fluorescent tags for protein imaging in living tissues. Biochem. J.
- Bxb1 integrase was cloned downstream of the PBAD/AraC promoter (BBa_I0500) on the pSB3K1 plasmid bearing a p15A origin of replication (15-20 copies).
- This version of the integrase has a 6-His-tag which was found to stabilize the protein. Therefore, a weak RBS (BBa_B0031) and an LAA ssrA tag were added to reduce the basal expression of the enzyme.
- PBAD controls expression of a polycistron encoding excisionase with a strong RBS designed using the RBS Calculator (Salis HM, Mirsky EA, Voigt CA (2009) Automated design of synthetic ribosome binding sites to control protein expression. Nat Biotechnol 27:946-950) to have a target Translation Initiation Rate (TIR) of 50000, followed by Bxb1 integrase with a GTG start codon to decrease its TIR (Barrick, D., Villanueba, K., Childs, J. & Kalil, R. Quantitative analysis of ribosome binding sites in E. coli.
- RBS Calculator Salis HM, Mirsky EA, Voigt CA (2009) Automated design of synthetic ribosome binding sites to control protein expression. Nat Biotechnol 27:946-950
- TIR Target Translation Initiation Rate
- PBAD-Bxb1 integrase was cloned (with no 6-His tag and no ssrA tag) on the pSC101 plasmid pSB4A5 (5-10 copies) and transformed into cells containing the DNA data register in LR state along with
- Plasmids were transformed into chemically competent E. coli DH5alphaZ1 and plated on LB agar plates containing the appropriate antibiotics.
- the set circuit presented in Fig. 2B was transformed into cells containing the HK022-integrated BP DNA register bearing a chloramphenicol resistance cassette.
- the polycistronic reset circuit, the decoupled reset circuit, and all S/R RAD modules were transformed into cells containing the Phi80-integrated BP or LR DNA register bearing a kanamycin resistance cassette.
- all DNA registers bear a kanamycin resistance cassette.
- Bxb1, TP901-1 and PhiC31 data registers were integrated at the Phi80 site; the PhiRv1 data register was integrated at the p21 site.
- Cells containing the chromosomal DNA data register were grown with an additional 5 µg/ml of kanamycin or chloramphenicol depending on the integrated cassette.
- a screening vector (Fig. 5C) was built containing the excisionase fused to an AAK ssrA tag under the control of the arabinose promoter.
- a PLtetO-1 promoter followed by HindIII and NsiI sites allows for cloning and screening of aTc-controlled set circuits.
- the excisionase gene is followed by AscI and the BioBrick™ suffix SpeI and PstI sites, allowing for cloning of a reset integrase.
- the library was built by amplifying the Bxb1 integrase gene with forward primers containing B0031 RBS and a GTG start codon and reverse primers containing the randomized ssrA tag, and cloning the PCR library into the screening vector between the AscI and SpeI sites. Ligations were transformed into DH5alphaZ1 cells containing the chromosomal LR DNA register. 384 clones, corresponding to almost 95% coverage of the library (Reetz, M.T., Kahakeaw, D. & Lohmer, R. Addressing the numbers problem in directed evolution.
- Gemini a bifunctional enzymatic and fluorescent reporter of gene expression.
- PLoS ONE 4, e7569 (2009) a bifunctional reporter containing the alpha-fragment and GFP, downstream of the invertible promoter. Therefore, recombinase activity can be monitored by beta-galactosidase activity.
- X-Gal was used at a final concentration of 70 µg/mL and IPTG at a final concentration of 80 µM.
- the Bxb1 integrase gene was PCR amplified using primers containing an RBS with a randomized Shine-Dalgarno sequence and a GTG start codon (SEQ ID NO.:23:
- ODE Ordinary differential equation
- the model consists of three components: integrase (I), excisionase (X) and DNA register (D).
- BP to LR recombination is catalyzed by an integrase tetramer (a complex of integrase dimers bound to the attB and attP sites).
- LR to BP recombination is catalyzed by an integrase-excisionase complex. The integrase-excisionase stoichiometry in the complex for Bxb1 is unknown but is assumed to be 1:1.
- Fig. 1C: the dynamics of the total register in the LR state, D_LRtot, can be written as:
- concentration of each complex can be written as:
- Ki, Kj, and Kdix are the dissociation equilibrium constants of the integrase-integrase dimer, of the integrase dimer-recombination site complex, and of the integrase-excisionase complex on a
- the dual recombinase RAD module model (Fig. 14A) consists of three components: recombinase-1 (R1), recombinase-2 (R2) and DNA register (D).
- R1 recombinase-1
- R2 recombinase-2
- D DNA register
- K c is an inversion rate constant and K
- K d i are dissociation equilibrium constants of recombinase dimer and recombinase dimer-recombination site, respectively.
- Binary state could be stored epigenetically using a bistable gene regulatory network, for example, a system of two mutually repressing genes (Gardner TS, Cantor CR, Collins JJ (2000)
- the network can be set and reset simply by adding external inducers (IPTG, aTc, heat shock, etc.) that can inactivate one of the two repressors, allowing the other repressor to express.
- IPTG external inducers
- aTc heat shock, etc.
- the mutual inhibition S/R latch model presented here (Fig. 14B) consists of repressors R1 and R2 mutually repressing each other's expression and an extra copy of R1 and R2 driven by the set input and the reset input, respectively. Assuming that repressors bind to their cognate operator sites as tetramers (as was assumed for recombinases), the dynamics of R1 and R2 concentrations can be written as:
- Total DNA register concentration is 1 (in Kiunit).
- the state storage element allows the latch to maintain the state; this includes a DNA register of the DNA inversion RAD module or a bistable mutual inhibition circuit of a mutual inhibition S/R latch.
- a DNA register encodes a state in a DNA sequence which is naturally maintained and replicated inside living cells; a mutual inhibition circuit encodes a state as repressor concentration which can be maintained through a feedback loop.
- the input interface element allows external inputs (transcriptional signals for examples presented here) to perturb and change the state of the state storage element. This includes the integrase-excisionase genes (or dual recombinases) for the DNA inversion RAD module or extra copies of repressor genes driven by inputs, for a mutual inhibition S/R latch.
- a challenge in implementing a RAD module is to properly "map" the dynamic ranges of the external input of interest, via the input interface element, to the state phase of the state storage element.
- mapping is represented as the expression scaling parameter.
- the expression scaling parameter is proportional to the translation rate and inversely proportional to the protein degradation rate.
- Another challenge for implementing a RAD module is to optimize two antagonizing mechanisms, the set and the reset mechanisms, within the same chassis.
- Conditions that are optimal for resetting the integrase-excisionase based S/R latch (for example, a stable and efficiently translated excisionase) are likely to produce basal excisionase expression high enough to interfere with the setting mechanism.
- the size of the operable range with respect to the scaling parameter will be proportional to the fold change between the basal input level and the input pulse. If the fold change is small, one needs to precisely match the basal input level to the state storage regime and the
- Both set and reset operable ranges have a lower bound of integrase expression level corresponding to the induced integrase level that is "enough" for efficient recombination.
- the rate of BP to LR recombination is governed by the level of [DI4], which can be switched to the LR state, relative to [D] and [DI2], which cannot:
- Excisionase determines the directionality of state switching. Basal excisionase expression that is too high will break a set; induced excisionase expression that is too low will break a reset. Consider how much excisionase expression scaling will allow both efficient set and reset.
- the net BP to LR and LR to BP recombination rates depend on the relative amounts of the active recombination complexes, [DI4] and [DI4X4], which, in turn, depend on the excisionase level:
- basal excisionase level is 0.1 ⁇ ⁇ so the upper bound of
- Reset operable range has an additional bound: the upper bound of integrase production which is not enough for causing spontaneous BP to LR recombination at the end of the reset pulse (Fig. 10).
- Reset becomes inefficient if [Itot] > [Xtot] during the reset pulse because, at the end of the pulse, excisionase will disappear first and the leftover integrase will drive the BP state register back to the LR state.
- operable ranges of the dual recombinase RAD module based S/R latch or the mutual inhibition S/R latch with respect to input element expression scaling are constrained by (Fig. 14): 1) an expression scaling lower bound that is large enough to allow state change during a pulse, and 2) an expression scaling upper bound that is small enough to not allow spontaneous state switching in the absence of an input pulse.
- Operable range size, i.e., the distance between the lower and the upper expression scaling bounds, is approximately the fold change between the induced and the basal input levels.
- Rectangular operable range shape results from the fact that the set and the reset mechanisms for the dual recombinase RAD module (or for the mutual inhibition S/R latch) do not directly interact with each other and that there is no loss of operable regime due to stoichiometry mismatch. Note that for the mutual inhibition S/R latch, there is a sharp transition between efficient and inefficient set or reset operable range due to bistability of the system.
- Simulated RAD operable range can recapitulate the experimentally observed dependence between the copy number of the DNA register, the copy number of the integrase-excisionase genes and S/R latch efficiency (Figs. 7A and 7B). Specifically, increasing the copy number of the DNA register relative to the copy number of the integrase-excisionase genes decreases resetting efficiency.
- Parameter setting used in the simulation shown here is the same as the default parameter setting except that the excisionase production rate during reset is reduced to only half of the integrase production rate.
- the amount of integrase-DNA register complexes approaches the total amount of integrase. If total integrase outnumbers total excisionase, there are too many integrase-DNA register complexes for excisionase to bind to and thus BP to LR recombination cannot be suppressed completely.
- the amount of integrase-DNA register is limited by the
- Fig. 3B the failure mode of a RAD module is presented in which each cell in the population expresses both GFP and RFP during an input pulse and the population then splits into two populations of cells expressing either GFP or RFP after the pulse.
- the model consists of a single copy DNA register which can be in either state 0, expressing GFP, or state 1, expressing RFP.
- the scenario was simulated in which the net propensities for inverting from state 0 to state 1 and from state 1 to state 0 are equal. This scenario corresponds to the region between the set and the reset regime in Fig. 1D. It was also assumed that the degradation propensity of both reporters is ten times lower than the inversion propensity (it was expected that in the
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Theoretical Computer Science (AREA)
- Biophysics (AREA)
- Health & Medical Sciences (AREA)
- Mathematical Physics (AREA)
- Nanotechnology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Chemical & Material Sciences (AREA)
- Evolutionary Biology (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Biomedical Technology (AREA)
- Crystallography & Structural Chemistry (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Genetics & Genomics (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Software Systems (AREA)
- Micro-Organisms Or Cultivation Processes Thereof (AREA)
- Enzymes And Modification Thereof (AREA)
Abstract
A rewriteable recombinase addressable data (RAD) module that reliably stores digital information within a chromosome is provided. The disclosed RAD modules use serine integrase and excisionase functions to invert and restore specific DNA sequences. The RAD memory element is capable of passive information storage in the absence of heterologous gene expression over multiple generations, and can be switched repeatedly without performance degradation to support combinatorial data storage.
Description
PCT Application
Atty Docket No. 062602-021402/PCT
Methods and Compositions for Rewritable Digital Data Storage in Live Cells
CROSS REFERENCE TO RELATED APPLICATIONS
This application claims priority to and the benefit of U.S. Provisional Application No. 61/688,092 filed May 8, 2012 and U.S. Provisional Application No. 61/852,002 filed March 14, 2013, both entitled "Methods and Compositions for Rewritable Digital Data Storage in Live Cells," the entire disclosures of both of which are incorporated herein by reference.
STATEMENT OF GOVERNMENT LICENSE RIGHTS
This invention was made with government support under Grant No. 0540879 awarded by Synthetic Biology Engineering Research Center through U.S. National Science Foundation. The government has certain rights in the invention.
SEQUENCE LISTING
The material in the text file named "062602_021402_ST25.txt" created May 3, 2013, being 121,280 bytes in size and submitted herewith, is hereby incorporated by reference in its entirety.
BACKGROUND
Most engineered genetic data storage systems use auto- or cross-regulating bistable systems of transcription repressors or activators to define and hold state via continuous gene expression (Toman Z, Dambly-Chaudiere C, Tenenbaum L, Radman M (1985) A system for detection of genetic and epigenetic alterations in Escherichia coli induced by DNA-damaging agents. Journal of Molecular Biology 186:97-105; Gardner TS, Cantor CR, Collins JJ (2000) Construction of a genetic toggle switch in Escherichia coli. Nature 403:339-342; Ajo-Franklin CM et al. (2007) Rational design of memory in eukaryotic cells. Genes & Development 21:2271-2276; Burrill DR, Silver PA (2010) Making cellular memories. Cell 140:13-18). Such epigenetic storage systems can be subject to evolutionary counter selection due to resource burdens placed on the host cell or spontaneous switching due to putatively stochastic fluctuations in cellular processes including gene expression. Moreover, heterologous expression-based systems are difficult to redeploy given differences in gene regulatory mechanisms across organisms.
Another approach for storing data inside organisms is to code extrinsic information within genetic material (Bancroft C (2001) Long-term storage of information in DNA. Science 293:1763c-1765). Nucleic acids have undergone natural selection to serve as heritable data storage material in organismal lineages. Moreover, DNA provides attractive features in terms of data storage robustness, scalability, and stability (Ham TS, Lee SK, Keasling JD, Arkin AP (2008) Design and construction of a double inversion recombination switch for heritable sequential genetic memory. PLoS ONE 3:e2815). In addition, engineered transmission of DNA molecules could support data exchange between organisms as needed to implement higher-order multicellular behaviors within programmed consortia (Ham TS, Lee SK, Keasling JD, Arkin AP (2008) Design and construction of a double inversion recombination switch for heritable sequential genetic memory. PLoS ONE 3:e2815; Abelson H et al. (2000) Amorphous computing. Communications of the ACM 43:74-82). Practically, researchers have begun to use enzymes that modify DNA, typically site-specific recombinases, to study and control engineered genetic systems. For example, recombinases can catalyze strand exchange between specific DNA sequences and enable precise manipulation of DNA in vitro and in vivo (Grindley NDF, Whiteson KL, Rice PA (2006) Mechanisms of site-specific recombination. Annu Rev Biochem 75:567-605). Depending on the relative location or orientation of recombination sites, three distinct recombination outcomes (integration, excision, or inversion) can be realized.
From such knowledge, several natural recombination systems have been reapplied to support research in cell and developmental biology (Sauer B (1994) Site-specific recombination: developments and applications. Current Opinion in Biotechnology 5(5):521-527; Branda CS, Dymecki SM (2004) Talking about a revolution: the impact of site-specific recombinases on genetic analyses in mice. Developmental Cell 6:7-28). However, all in vivo DNA-based control or data storage systems implemented to date are "single write" systems (Podhajska AJ, Hasan N, Szybalski W (1985) Control of cloned gene expression by promoter inversion in vivo: construction of the heat-pulse-activated att-nutL-p-att-N module. Gene 40:163-168; Ham TS, Lee SK, Keasling JD, Arkin AP (2006) A tightly regulated inducible expression system utilizing the fim inversion recombination switch. Biotechnol Bioeng 94:1-4; Friedland AE et al. (2009) Synthetic gene networks that count. Science 324:1199-1202). Consequently, the amount of information such systems are able to store is linearly proportional to the number of implemented elements (for example, a "thermometer-code" counter capable of recording N events given N data storage elements) (Friedland AE et al. (2009) Synthetic gene networks that count. Science 324:1199-1202).
Single-write architectures are limiting if many of the uses for genetic data storage are considered in detail. For example, studies of replicative aging in yeast or human fibroblasts typically track at least 25 or 45 cell division events prior to the onset of senescence, respectively (Steinkraus KA, Kaeberlein M, Kennedy BK (2008) Replicative aging in yeast: the means to the end. Annu Rev Cell Dev Biol 24:29-54). Lineage mapping during worm development frequently tracks at least 10 differentiation events (Sulston JE, Schierenberg E, White JG, Thomson JN (1983), The embryonic cell lineage of the nematode Caenorhabditis elegans. Dev Biol 100:64-119), while research with mouse and human systems considers up to several hundred cell divisions (Frumkin D, Wasserstrom A, Kaplan S, Feige U, Shapiro E (2005)
Genomic variability within an organism exposes its cell lineage tree. PLoS Comput Biol 1:e50). In situations where the same signal is being recorded over multiple occurrences (for example, a series of cell division events), reliably rewritable elements are needed to realize geometric increases in data storage capacity (for example, combinatorial counters capable of recording 2^N events given N storage elements).
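The capacity argument above can be made concrete with a short sketch. The Python below is illustrative only: the function names are hypothetical and do not come from the application. It compares a single-write thermometer-code counter, which needs one element per recorded event, with N rewritable binary elements, which can distinguish 2^N states, using the division counts cited above (25 for yeast, 45 for human fibroblasts).

```python
def thermometer_capacity(n_elements: int) -> int:
    """Events countable when each single-write element records one event."""
    return n_elements


def combinatorial_capacity(n_elements: int) -> int:
    """Distinct states reachable with N rewritable binary elements (2**N)."""
    return 2 ** n_elements


def elements_needed(n_events: int) -> int:
    """Smallest N such that 2**N states cover every count from 0 to n_events."""
    # For a non-negative integer, bit_length() is the number of binary digits
    # required to write it, which is exactly the register count needed.
    return n_events.bit_length()


# Division counts taken from the paragraph above; labels are for display only.
for label, divisions in [("yeast", 25), ("human fibroblast", 45)]:
    print(label,
          "| single-write elements:", divisions,
          "| rewritable elements:", elements_needed(divisions))
```

Under these assumptions, a counter for a few dozen divisions needs only a handful of rewritable elements, whereas a single-write design needs one element per division.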
Thus, the use of synthetic biological systems in research, healthcare, and manufacturing often requires autonomous history-dependent behavior and therefore some form of engineered biological memory. For example, the study or reprogramming of aging, cancer, or development would benefit from genetically encoded counters capable of recording up to several hundred cell division or differentiation events. While genetic material itself provides a natural data storage medium, tools that allow researchers to reliably and reversibly write information to DNA in vivo are lacking.
SUMMARY OF THE INVENTION
Aspects of the present invention relate to methods and systems for rewritable digital data storage in live cells. Specifically, via engineered control of recombination directionality, binary digits can be stored in chromosomes, which enables combinatorial data storage. In some embodiments, a DNA element can be flipped within the chromosome of a live cell, and then be flipped back to its original state. These two steps can be repeated an infinite number of times, thereby creating a binary digit (bit) data register. This way, the present invention enables rewritable and passive binary data storage within the chromosome of live cells, as needed for general purpose biological bits and in a manner compatible with combinatorial data architectures. Such rewritability of a passive (requiring no gene expression or active cellular process) data storage system is achieved for the first time by the present invention via a DNA-encoded state register. A combinatorial data storage having up to 2^N bits (given N registers) can be built, which is a drastic increase from the previous storage size of N bits. Furthermore, using methods and systems of the present invention, it is also possible to support multiplexing of data storage (e.g., re-use of the same recombinase enzymes to store >1 bit, such as 1 byte (8 bits)). This is useful for storing information about events inside cells (e.g., cell division or differentiation), storing information about molecular levels or activities inside cells (e.g., protein or nucleic acid levels above or below thresholds), and/or storing information about environmental information that impacts cells (e.g., cytokine levels, environmental pollutants, etc.).
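As a rough illustration of the flip and flip-back steps described above, the sketch below models inversion as replacing a segment of a sequence with its reverse complement. It is a toy model under stated assumptions: the sequence, the coordinates, and the helper names are invented for illustration, and the change in attachment-site identity that accompanies real recombination is ignored.

```python
# Complement table for the four DNA bases (upper case only, by assumption).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def invert_segment(dna: str, start: int, end: int) -> str:
    """Return dna with dna[start:end] replaced by its reverse complement."""
    return dna[:start] + reverse_complement(dna[start:end]) + dna[end:]


# Hypothetical toy locus: left flank + invertible element + right flank.
locus = "AAAA" + "GATTACA" + "CCCC"
flipped = invert_segment(locus, 4, 11)     # first flip of the element
restored = invert_segment(flipped, 4, 11)  # second flip of the same element

print(locus, flipped, restored)
```

Because inversion is its own inverse in this model, two flips return the original sequence, which is the property that lets a single element serve as a rewritable bit.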
In some embodiments, methods of the present invention enable the development of recombinase addressable data (RAD) modules that can be used to write and store binary digits within the chromosome of live cells. RAD systems so produced are capable of passive and stable information storage over at least 100 cell divisions and can be switched repeatedly without performance degradation, as is required to support combinatorial data structures. Additionally, by varying the synthesis and degradation rates of recombinase functions, programmed stochasticity in RAD system performance across a range of weighted outcomes can be realized. In some examples, a serine recombinase can be used. The serine recombinase functions used here do not require cell-specific co-factors and can be used to extend computing and control methods to the study and engineering of many biological systems.
In one embodiment, the invention provides an in vivo data storage system and methods for storing data using such a system. Such a system includes a recombinase addressable data module comprising an invertible DNA data register. The DNA data register comprises a DNA register sequence flanked by oppositional attachment sites. According to one embodiment, the directionality of the DNA register sequence is invertible to a set state 1, and optionally reversibly invertible from said set state 1 to a reset state 0. In some embodiments, the system further comprises a set generator comprising a first gene encoding an integrase and a reset generator comprising a second gene encoding an excisionase. In various embodiments, one pair of set generator and reset generator can represent one binary digit (1 bit); multiple unique pairs (N) of set generators and reset generators can be present in one system, which is capable of storing up to 2^N bits of data. In one embodiment, the reset generator further comprises a third gene encoding an integrase. In one aspect, the attachment sites of the invertible DNA data register are recognized and recombined by the integrase of the set generator when the DNA register sequence is in reset state 0. In another aspect, the attachment sites of the invertible DNA data register are recognized and recombined by the integrase-excisionase complex of the reset generator when the DNA register sequence is in set state 1. The integrase and excisionase can be derived from bacteriophage Bxb1, TP901-1, PhiRv1 and/or PhiC31, or other sources.
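The set and reset behavior described above can be summarized as a two-state machine. The sketch below is an idealized, deterministic reading of that paragraph and not the application's kinetic model: the class name, method name, and boolean inputs are hypothetical, recombination is treated as all-or-nothing, and excisionase without integrase is assumed to have no effect.

```python
from dataclasses import dataclass


@dataclass
class RadRegister:
    """Idealized one-bit DNA register: 0 = reset state, 1 = set state."""
    state: int = 0

    def pulse(self, integrase: bool, excisionase: bool) -> int:
        """Apply one input pulse and return the resulting state."""
        if integrase and not excisionase and self.state == 0:
            # Set generator: integrase alone recombines the sites in state 0.
            self.state = 1
        elif integrase and excisionase and self.state == 1:
            # Reset generator: integrase plus excisionase acts in state 1.
            self.state = 0
        # Any other input leaves the stored bit unchanged (passive storage).
        return self.state


register = RadRegister()
history = [
    register.pulse(integrase=True, excisionase=False),   # set
    register.pulse(integrase=False, excisionase=False),  # hold, no expression
    register.pulse(integrase=True, excisionase=True),    # reset
    register.pulse(integrase=True, excisionase=True),    # reset again, no change
]
print(history)
```

In this simplified picture the stored bit changes only when the matching generator is expressed, and it persists when neither generator is expressed.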
Aspects of the invention may include a reset generator in which the second and third genes together encode an excisionase-integrase complex or a fusion protein comprising the excisionase and the integrase. In another embodiment, at least one of the first, second, or third gene is inducible. The inducible gene may be directly or indirectly induced by an inducer. In some embodiments, an inducer may directly or indirectly activate transcription, or indirectly or directly inactivate a repressor of transcription. In other embodiments, an inducer may require one or more co-factors. Aspects include an inducer that is a small molecule, a chemical, a protein, an enzyme, a nucleic acid, or a metal ion. In some embodiments, the inducer is an endogenous inducer; and in other embodiments, the inducer may be exogenous. In yet another embodiment, at least one of the first, second or third genes is inducible by a transcription factor; or by two or more inducers functioning together or in the alternate. Aspects also include at least one of the first, second or third genes being autoinducible. Another embodiment provides a vector comprising an in vivo data storage system according to the invention. Aspects also include a recombinant cell comprising an in vivo data storage system. The in vivo data storage system may be present, in whole or in part, in a chromosome of the recombinant cell. In one embodiment, the DNA data register is present in a chromosome of the recombinant cell. In another embodiment, at least one of the first, second or third genes is present in a chromosome of the recombinant cell.
Aspects of the invention also include a method for storing data in a cell. In one embodiment, the method comprises providing a cell comprising an in vivo data storage system having a recombinase addressable data module including (i) a set generator comprising a first gene encoding an integrase; and (ii) an invertible DNA data register comprising a DNA register sequence flanked by oppositional attachment sites, wherein the directionality of the DNA register sequence is invertible to a set state 1, and optionally reversibly invertible from said set state 1 to a reset state 0. In some embodiments, methods according to the invention comprise inducing the first gene or allowing the induction of the first gene to express the integrase so as to allow the DNA register sequence to invert, thereby generating a set state of directionality for the DNA register sequence, the set state represented by binary digit 1, thereby storing data represented by the binary digit 1 in the cell. In one embodiment, the recombinase addressable data module further comprises a reset generator comprising a second gene encoding an excisionase and, optionally, a third gene encoding an integrase, and the method further comprises: inducing the second gene or allowing the induction of the second gene to express the excisionase so as to allow the DNA register sequence to invert back, thereby generating a reset state of directionality for the DNA register sequence, the reset state represented by binary digit 0, thereby storing data represented by the binary digit 0 in the cell. In yet another embodiment, methods of the invention comprise optionally repeating the inversion step, thereby storing data represented by the binary digit 1 or 0 in the cell.
Aspects of the invention may relate to controlling expression of the excisionase to provide a stoichiometric amount of the excisionase in relation to one or both of an amount of an integrase and a copy number of the DNA register sequence, so as to favor generation of the reset state 0 of the DNA register sequence. Methods according to the invention may further comprise tunably controlling the reversible inversion of the DNA register sequence between a state 0 and a set state 1. Some embodiments comprise one or more of: minimizing spontaneous inversion to set state 1; minimizing, during the inversion step, interference by excisionase to favor generation of the set state 1; or minimizing, during the reverse inversion step, stoichiometry mismatch to favor generation of state 0.
In some embodiments, methods of the invention comprise minimizing spontaneous inversion to set state 1 by controlling basal expression of the integrase below a threshold level for spontaneous inversion. In other embodiments, methods of the invention include minimizing, during the inversion step, interference by excisionase to favor generation of the set state 1 by increasing degradation of the excisionase. In yet another embodiment, methods of the invention comprise minimizing, during the reverse inversion step, stoichiometry mismatch to favor generation of state 0 by one or more of: increasing expression of the excisionase, decreasing expression of the integrase, increasing degradation of the integrase, and reducing a copy number of the DNA register sequence.
Aspects of the invention also relate to an in vivo data storage system which is a nonvolatile data storage system, and methods of storing data using a nonvolatile in vivo data storage system. In one embodiment, the DNA data register stores data as a nonvolatile memory. According to some embodiments, once data is encoded in the in vivo data storage system, the stored data is retained until the DNA data register is rewritten or recoded with different data, erased (returned to its original state), or otherwise rendered unreadable. In some embodiments, the stored data is retained in the absence of expression of the first gene. In other embodiments, the stored data is retained in the absence of expression of the first gene, the second gene, or both the first and second genes. Other embodiments may provide an in vivo data storage system which is a volatile data storage system and methods of recording data using volatile in vivo data storage systems.
BRIEF DESCRIPTION OF THE FIGURES
The provisional application files to which this application claims priority contain at least one drawing executed in color. Copies of this application publication with color drawing(s) will be provided by the U.S. Patent Office upon request and payment of the necessary fee.
Figures 1A-1D: Architecture, mechanisms, and operation of a recombinase addressable data (RAD) module.
Figures 2A-2G: Independent set and reset operations plus long-term data storage and switching in vivo.
Figures 3A-3C: Functional composition, expected operable ranges, and RESET failure modes for a RAD module.
Figures 4A-4C: Optimized genetic elements and reliable multi-cycle operation of a DNA-inversion RAD module.
Figures 5A-5I: Maps of certain constructs used. Fig. 5A, The PBAD-Int set flipper where Bxb1 integrase was cloned downstream of the PBAD/AraC promoter (BBa_I0500) on the pSB3K1 plasmid bearing a p15A origin of replication (15-20 copies). Fig. 5B, The PBAD-Xis/Int reset flipper circuit. Fig. 5C, The screening vector. Fig. 5D, The RAD module depicted in Fig. 4A. Figs. 5E (Bxb1registerLRstate) and 5F (Bxb1registerBPstate), Detailed architecture of the DNA data register in state 0 (BP) and state 1 (LR). Maps were generated using the JBEI (Joint Bio-Energy Institute, Emeryville, CA) online vector editor (http://j5.jbei.org/VectorEditor/VectorEditor.html). Fig. 5G, The PBAD-Xis / Ptet-Int flipper circuit on the J64100 plasmid bearing a ColE1 origin of replication (~50 copies) used in the experiments of Figs. 18 and 19. Fig. 5H, Constitutively expressed Int circuit on the J64100 plasmid used in the experiment of Fig. 20. Fig. 5I, The PBAD-Xis / Ptet-Int flipper circuit on a CRIM plasmid integration vector used in the experiment of Fig. 20.
Figures 6A-6B: Alternate architecture for a reset circuit. Fig. 6A, Schematic diagram of the decoupled reset circuit where integrase is expressed from a low-copy plasmid while excisionase is expressed from a medium-copy plasmid. Fig. 6B, Cells bearing the chromosomal LR DNA register were transformed with both plasmids encoding integrase and excisionase, pulsed with arabinose and analyzed by flow cytometry. Cells relaxed to the BP state after induction with approximately 85% efficiency.
Figures 7A-7C: Influence of register copy number on recombination efficiency and consequences for integrase-excisionase mechanism. Fig. 7A, Influence of copy number of the DNA register on the efficiency of integrase-excisionase mediated recombination. The bidirectional reset generator from Fig. 2C was transformed in cells containing the DNA data register in the LR state on a pSC101 plasmid (upper panel) or integrated in the chromosome (lower panel), and cells were pulsed with arabinose. When the register was on the chromosome, cells were driven toward the BP state more efficiently during induction, and the efficiency of the Int/Xis-mediated LR to BP recombination was higher after inducer removal. A fraction of the cells with a DNA register on a plasmid stayed in the intermediate state after inducer removal, due to the presence of multiple copies of the DNA register in different states within the same cell. These data support the biochemical studies suggesting that excisionase interacts preferentially with DNA-bound integrase (Ghosh P, Wasil LR, Hatfull GF (2006) Control of phage Bxb1 excision by a novel recombination directionality factor. PLoS Biol 4:e186) as was assumed in the kinetic model. Figs. 7B, 7C, Simulated operable range of set, reset, and set/reset function, during and after a pulse, with respect to variations in register copy number and flipper copy number, using two different mechanistic assumptions for integrase-excisionase recombination. In Fig. 7B, the assumption is that excisionase interacts with DNA-bound integrase only. In Fig. 7C, the assumption is that integrase and excisionase can interact in the cytosol.
Figure 8: Screening of different RBSs designed using the RBS calculator. Constructs were tested using a DNA register that expresses gemini when flipped. Cells were co-transformed with the target and the different RBS variant constructs. Cells were grown without or with anhydrotetracycline (aTc) to test for leakiness and capacity to be induced, respectively. Cultures were spotted after induction on plates supplemented with X-Gal plus IPTG and beta-galactosidase activity was observed. Target TIR (Translation Initiation Rate) values entered in the forward engineering mode of the RBS calculator are indicated on top rows. Constructs displaying the required characteristics (non-leaky and inducible) are underlined. Many designs expected to exhibit the same quantitative performance produced qualitatively different results. Such differences are likely due to three coupled factors: i) because the recombinase enzyme is very active, a small level of spontaneous basal expression can switch the target DNA, ii) the RBS calculator was used to target extremely low TIRs, likely operating near a limit of the algorithm in terms of its design range, iii) regardless, as published, the calculator is expected to have only a ~47% chance of producing a specific designed RBS that produces a TIR to within a factor-of-two target range.
Figures 9A-9D: Effect of the down-regulation of excisionase on set and reset functions. Fig. 9A, schematic representation of the RAD module used for this particular experiment. Here the DNA data register has only one output, GFP, in the LR state. The PLtet-O-1 promoter controls the set integrase while the PBAD promoter controls a polycistron expressing integrase and excisionase. Both set and reset circuits are cloned on pSB3K1 plasmid (p15A origin, 15-20 copies). Fig. 9B, control experiments. Left panel: BP and LR target constructs on pSB4A5 low-copy plasmid (5-10 copies), showing the two states of the system (low or high GFP for BP and LR, respectively). Right panel: a RAD module with no copy of the excisionase was transformed in cells containing the DNA data register in the BP state on pSB4A5 and the set generator was induced with aTc. Cells flipped to the LR state as monitored by GFP expression. Fig. 9C, reduction of interference by down-regulation of excisionase basal levels. While inducing the set led to entry into an intermediate state in the presence of the wild type excisionase (wt-Xis, left panel), addition of an AAK ssrA tag to the excisionase restored the set function (Xis-AAK, right panel). Fig. 9D, down-regulation of excisionase breaks the reset circuit. Arabinose induction of a reset generator containing wt-excisionase (wt-Xis, left panel) shows little flipping from BP to LR, due to excisionase-mediated inhibition of integrase activity toward the BP DNA register. However, a reset generator in which the excisionase is down-regulated with an AAK ssrA tag (Xis-AAK, right panel) acts as a set generator and can flip from BP to LR due to stoichiometry mismatch and higher integrase levels.
Figures 10A-10C: Example of an efficient reset generator setting back after the end of a pulse. Fig. 10A, Kinetic model based simulation of the reset efficiencies during and after the pulse for different expression scaling. A gray dot (in white circle) marks an integrase-excisionase expression scale that keeps the latch in an intermediate state both during and after a RESET pulse; a black dot (in white circle) marks an expression scale that causes the latch to revert back to the LR state after a RESET pulse. Fig. 10B, Time-course simulation showing resetting failures during and after a RESET pulse. Gray and black lines are simulated using integrase and excisionase expression scaling parameters as marked with gray and black dots in Fig. 10A. Fig. 10C, Schematic representation of the particular construct used in Fig. 3C, in which excisionase has a strong RBS and AAK degradation tag while integrase has a very weak RBS (BBa_B0033) followed by a GTG start codon. Therefore, even if the stoichiometry between the two proteins is correct during the pulse, the degradation rate of excisionase is higher than that of integrase, resulting in entry into the set regime after the pulse.
Figures 11A-11B: Example of a set circuit resetting back after the end of a pulse. Fig. 11A, Kinetic model based simulation of the reset efficiencies during and after the pulse for different expression scaling. A gray dot (in white circle) marks an integrase-excisionase expression scale that keeps the latch in an intermediate state both during and after a SET pulse; a black dot (in white circle) marks an expression scale that causes the latch to revert back to the BP state after a SET pulse. Fig. 11B, Time-course simulation showing setting failures during and after a RESET pulse. Gray and black lines are simulated using integrase and excisionase expression scaling parameters as marked with gray and black dots in Fig. 11A. Note that for the black line simulated condition, the basal expression levels of integrase and excisionase are both high enough so that the DNA register quickly reaches an intermediate BP-LR state even before an input pulse arrives.
Figure 12: Detailed parameter sensitivity analysis of DNA inversion RAD module operable range. (i) Kinetic model based simulation of the set, reset and SR-latch efficiencies after the pulse for different expression scaling and kinetic parameters. The integrase lower bound of the operable range scales with integrase-flipper dissociation constant (ii); the excisionase upper bound of the set operable range and the excisionase lower bound of the reset operable range scale with integrase-excisionase dissociation constant (iii); the excisionase lower bound of the reset operable range and the integrase lower bound of the set and the reset operable range scale with the fold change between induced and basal expression (iv); increasing the amount of DNA register changes the excisionase upper bound of set operable range and the excisionase lower bound of reset operable range. With higher DNA register concentration, these bounds become more sensitive to integrase level even at high excisionase level (v); allowing incomplete integrase-excisionase complex to catalyze bidirectional recombination broadens the parameter ranges for partially switching latch (vi); allowing integrase-excisionase to form complexes in the cytosol changes the set and the reset circuits operable range in a similar way to increasing DNA register concentration (vii); under this assumption, the set operable range can be extended to arbitrarily high excisionase level by increasing integrase level. (viii) Changing the scaling factor β by altering protein translation rates instead of protein degradation rates does not give a qualitatively different operable range in the default scenario.
Figures 13A-13F: Detailed data for the RAD module operation cycles. Figs. 13A and 13B: Typical RFP/GFP plots for switching and storage cycle for long and short pulses, respectively (In Fig. 13B, R=reset; S=set). Fig. 13C, Quantification of switching plus storage efficiency of the RAD module for long input cycles. Fig. 13D, Quantification of switching plus storage efficiency of the RAD module for short input cycles. Figs. 13C and 13D: error bars are the SD of 2 experiments performed in triplicate. Fig. 13E, long term storage of the BP state in the context of the RAD module. Cells in the LR state were RESET with arabinose, switched and stored the BP state. After 40 generations, cells were SET back with aTc (blue/open circle line) or grown without inducer up to 100 generations of storage (red/filled square line). Fig. 13F, long term storage of the LR state in the context of the RAD module. After one cycle of RESET with arabinose and SET with aTc, cells were grown for 40 generations, then RESET with arabinose or grown without inducer up to 100 generations. In both Figs. 13E and 13F, switching plus storage efficiency was comparable with the initial efficiencies.
Figures 14A-14C: Comparison of operable ranges of S/R latches using alternative mechanisms. Fig. 14A, Schematic diagram of a hypothetical S/R latch based on a DNA inversion RAD module whose DNA register can be inverted and reverted by two different recombinases, Rec1 and Rec2, respectively. Fig. 14B, Schematic diagram of a mutual inhibition S/R latch. A pair of mutually repressed genes functions as a bistable switch; an additional copy of each gene, driven by SET or RESET inputs, is necessary to couple arbitrary transcriptional signals to the state of the bistable switch. Fig. 14C, Kinetic model based simulation of the set, reset and S/R latch. Note that the operable ranges for set and reset circuits are symmetric and there is no efficiency loss due to stoichiometry mismatch as in the case of the integrase-excisionase based DNA inversion latch.
Figures 15A-15B: Tuning the reset-specific integrase degradation rate allows for different bias in the outcome of integrase-excisionase mediated recombination. Fig. 15A, Schematic representation of the system used in this experiment. Only the integrase ssrA degradation tag changes. These constructs are alternate reset elements obtained during the screen for a functional reset using a destabilized excisionase. Fig. 15B, Three reset elements using different ssrA sequences for the integrase display different proportions of cells in the BP state after pulse (pink: gate used to measure the percentage of cells in BP along with the actual value). Note also that: i) the intermediate state is observable during induction even at single copy number of the DNA register, as simulated by the stochastic model in Fig. 2D; ii) differences in bias toward BP during induction are also observed. In front of each row is depicted a schematic representation of the actual reset element with relevant sequence information.
Figures 16A-16Q: Maps of various genetic elements used herein. Fig. 16A, Bxb1constInt_j64100. Fig. 16B, Bxb1PbadXisPtetInt_integrationVector. Fig. 16C, Bxb1PbadXisPtetInt_j64100. Fig. 16D, PhiC31constInt_j64100. Fig. 16E, PhiC31PbadXisPtetInt_integrationVector. Fig. 16F, PhiC31PbadXisPtetInt_j64100. Fig. 16G, Phirv1constInt_j64100. Fig. 16H, Phirv1PbadXisPtetInt_integrationVector. Fig. 16I, Phirv1PbadXisPtetInt_j64100. Fig. 16J, TP901constInt_j64100. Fig. 16K, TP901PbadXisPtetInt_j64100. Fig. 16L, PhiC31registerBPstate. Fig. 16M, PhiC31registerLRstate. Fig. 16N, Phirv1registerBPstate. Fig. 16O, Phirv1registerLRstate. Fig. 16P, TP901registerBPstate. Fig. 16Q, TP901registerLRstate.
Figure 17: Independent set operations using serine integrases from bacteriophage Bxb1, TP901-1, Phirv1 and PhiC31. Cells bearing a chromosomal BP data register built from attB/attP recombination sites of bacteriophage Bxb1, TP901-1, Phirv1 or PhiC31 were transformed with a medium copy plasmid expressing their cognate integrases. Expression of the integrase drove state switching of the data register from BP state to LR state (arrows indicate switching from gray/before switching to black/after switching).
Figure 18: Independent reset operations using serine integrases and excisionases from bacteriophage Bxb1, TP901-1, Phirv1 and PhiC31. Cells bearing a chromosomal LR data register built from attL/attR recombination sites of bacteriophage Bxb1, TP901-1, Phirv1 or PhiC31 were transformed with a medium copy plasmid expressing their cognate integrases and excisionases. Coexpressing integrase and excisionase drove state switching of the data register from LR state toward BP state (arrows indicate switching from gray/before switching to black/after switching).
Figure 19: Specificity of integrase-DNA data register pairs from bacteriophage Bxb1, TP901-1, Phirv1 and PhiC31. Cells bearing a chromosomal BP data register from bacteriophage Bxb1, TP901-1, Phirv1 or PhiC31 were transformed with a medium copy plasmid expressing an integrase from bacteriophage Bxb1, TP901-1, Phirv1 or PhiC31 (represented by black dots; negative control is represented by gray dots where there is no integrase expression). Only when the integrase and the data register are from the same bacteriophage can the integrase efficiently switch the data register from BP state to LR state (i.e., black dots that do not overlap with gray dots).
Figure 20: Resetting by RAD modules with both integrase and excisionase expressed from the chromosome. The PBAD-Xis / Ptet-Int flipper circuit was integrated into the Escherichia coli chromosome at the phage HK022 integration site; its cognate LR data register was integrated at the phage phi80 site (Bxb1 and PhiC31) or at the phage p21 site (Phirv1). Dual induction with 0.1% arabinose and 200 ng/ml anhydrotetracycline drove state switching from the LR to the BP state.
DETAILED DESCRIPTION
Demonstrated herein is a rewritable recombinase addressable data (RAD) module that reliably stores digital information within a chromosome. RAD modules use serine integrase and excisionase functions adapted from bacteriophage to invert and restore specific DNA sequences. The core RAD memory element is capable of passive information storage in the absence of heterologous gene expression for over 100 cell divisions and can be switched repeatedly without performance degradation, as is required to support combinatorial data storage. Also demonstrated herein is how programmed stochasticity in RAD system performance arising from bidirectional recombination can be achieved and tuned by varying the synthesis and degradation rates of recombinase proteins. The serine recombinase functions used here do not require cell-specific co-factors and should be useful in extending computing and control methods to the study and engineering of many biological systems.
Among the recombinase family of DNA modifying enzymes, phage integrases are unique in that the directionality of the recombination reaction can be influenced by an excisionase co-factor (Groth AC, Calos MP (2004) Phage integrases: biology and applications. Journal of Molecular Biology 335:667-678). In natural systems, a phage integrase alone typically catalyzes site-specific recombination between an attP site on the infecting phage chromosome and an attB site encoded within the host chromosome. The resulting integration reaction inserts the phage genome within the host chromosome, bracketed by newly formed attL and attR (LR) sites. Upon induction leading to lytic growth, the prophage co-expresses integrase and excisionase, which together restore an independent phage genome and the original attB and attP (BP) sites (Ptashne M (2004) A genetic switch (Cold Spring Harbor Laboratory Pr)).
Early work with the r32 polar mutations of bacteriophage lambda revealed that integrase-mediated recombination of antiparallel BP sites could also lead to the inversion of the intervening DNA (Fiandt M, Szybalski W, Malamy MH (1972) Polar mutations in lac, gal and phage lambda consist of a few IS-DNA sequences inserted with either orientation. Mol Gen Genet 119:223-231; Reyes O, Gottesman M, Adhya S (1979) Formation of lambda lysogens by IS2 recombination: gal operon-lambda pR promoter fusions. Virology 94:400-408). Subsequent studies on DNA supercoiling used phage integrases to invert recombinant DNA sequences flanked by opposing BP sites (Podhajska AJ, Hasan N, Szybalski W (1985) Control of cloned gene expression by promoter inversion in vivo: construction of the heat-pulse-activated att-nutL-p-att-N module. Gene 40:163-168; Mizuuchi K, Fisher LM, O'Dea MH, Gellert M (1980) DNA gyrase action involves the introduction of transient double-strand breaks into DNA. Proc Natl Acad Sci USA 77:1847-1851). Further in vitro work has since demonstrated that an integrase-excisionase complex can revert a DNA sequence flanked by opposing LR sites (Pollock TJ, Nash HA (1983) Knotting of DNA caused by a genetic rearrangement. Evidence for a nucleosome-like structure in site-specific recombination of bacteriophage lambda. Journal of Molecular Biology 170:1-18).
Phage integrases are thought to represent two evolutionarily and mechanistically distinct recombinase families (Groth AC, Calos MP (2004) Phage integrases: biology and applications. Journal of Molecular Biology 335:667-678). Tyrosine integrases, such as the bacteriophage lambda integrase, often have relatively long attachment sites (~200 bp), use a Holliday junction mechanism during strand exchange, and require host-specific co-factors. By contrast, serine integrases use a double-strand break mechanism during recombination and can have shorter attachment sites (~50 bp). In addition, some serine integrases do not require host cofactors, a feature that has led to their successful reuse across a range of organisms (Keravala A et al. (2006) A diversity of serine phage integrases mediate site-specific recombination in mammalian cells. Mol Genet Genomics 276:135-146). For exemplary purposes only and without being limited thereto, a bacteriophage serine integrase was used in the examples provided herein. By way of example, suitable bacteriophage serine integrases include, but are not limited to, Hin, Gin, Cin, φC31, φRv1, R4, TP901, A118, U153, Bxb1 and φFC1.
Bacteriophage Bxb1 now provides the best characterized serine integrase-excisionase system (Kim AI et al. (2003) Mycobacteriophage Bxb1 integrates into the Mycobacterium smegmatis groEL1 gene. Molecular Microbiology 50:463-473; Ghosh P, Kim AI, Hatfull GF (2003) The orientation of mycobacteriophage Bxb1 integration is solely dependent on the central dinucleotide of attP and attB. Molecular Cell 12:1101-1111; Ghosh P, Wasil LR, Hatfull GF (2006) Control of phage Bxb1 excision by a novel recombination directionality factor. PLoS Biol 4:e186; Mediavilla J et al. (2000) Genome organization and characterization of mycobacteriophage Bxb1. Molecular Microbiology 38:955-970). Bxb1 gp35 is a serine integrase that catalyzes integration of the Bxb1 genome into the groEL1 gene of M. smegmatis (Kim AI et al. (2003) Mycobacteriophage Bxb1 integrates into the Mycobacterium smegmatis groEL1 gene. Molecular Microbiology 50:463-473). Bxb1 gp47 is an excisionase that mediates excision in vivo and has been shown to control recombination directionality in vitro with high efficiency (Ghosh P, Wasil LR, Hatfull GF (2006) Control of phage Bxb1 excision by a novel recombination directionality factor. PLoS Biol 4:e186). Minimal attB, attP, attL, and attR sites have been defined for the Bxb1 system (Kim AI et al. (2003) Mycobacteriophage Bxb1 integrates into the Mycobacterium smegmatis groEL1 gene. Molecular Microbiology 50:463-473; Ghosh P, Kim AI, Hatfull GF (2003) The orientation of mycobacteriophage Bxb1 integration is solely dependent on the central dinucleotide of attP and attB. Molecular Cell 12:1101-1111; Ghosh P, Wasil LR, Hatfull GF (2006) Control of phage Bxb1 excision by a novel recombination directionality factor. PLoS Biol 4:e186). The Bxb1 excisionase does not bind DNA independently and, from in vitro studies, is thought to control integrase directionality in a stoichiometric manner (Ghosh P, Wasil LR, Hatfull GF (2006) Control of phage Bxb1 excision by a novel recombination directionality factor. PLoS Biol 4:e186). From these and other studies several models have been proposed for how Bxb1 excisionase controls integrase directionality, but it is not yet clear how excisionase-mediated recombination proceeds or is regulated in vivo (Ghosh P, Wasil LR, Hatfull GF (2006) Control of phage Bxb1 excision by a novel recombination directionality factor. PLoS Biol 4:e186; Savinov A, Pan J, Ghosh P, Hatfull GF (2011) The Bxb1 gp47 recombination directionality factor is required not only for prophage excision, but also for phage DNA replication. Gene).
Besides Bxb1, at least three other serine integrase-excisionase pairs have been discovered. On TP901-1 bacteriophage from Lactococcus lactis subsp. cremoris 901-1, a serine integrase gene, att sites (Christiansen B, Johnsen MG, Stenby E, Vogensen FK, Hammer K (1994) Characterization of the lactococcal temperate phage TP901-1 and its site-specific integration. J Bacteriol 176(4):1069-1076) and its cognate excisionase (Breuner A, Brondsted L, Hammer K (1999) Novel organization of genes involved in prophage excision identified in the temperate lactococcal bacteriophage TP901-1. J Bacteriol 181(23):7291-7) were discovered. Minimal att sites and in vitro kinetic assays for the TP901-1 integrase-excisionase system were later established (Breuner A, Brondsted L, Hammer K (2001) Resolvase-like recombination performed by the TP901-1 integrase. Microbiology 147(Pt8):2051-63). On the Phirv1 prophage of Mycobacterium tuberculosis H37Rv, another serine integrase, excisionase and att sites were identified (Bibb LA, Hatfull GF (2002) Integration and excision of the Mycobacterium tuberculosis prophage-like element, phiRv1. Mol Microbiol 45(6):1515-26). In vitro recombination assays for the phiRv1 integrase-excisionase system were established. Excisionase not only enables excisive recombination but also inhibits integrative recombination. Neither integration nor excision requires DNA supercoiling, host factors or high energy cofactors (Bibb LA, Hancox MI, Hatfull GF (2005) Integration and excision by the large serine recombinase phiRv1 integrase. Mol Microbiol 55(6):1896-910). Another serine integrase gene was discovered on the phiC31 bacteriophage of Streptomyces (Kuhstoss A, Rao RN (1991) Analysis of the integration function of the streptomycete bacteriophage phiC31. J Mol Biol 222(4):897-908). Integrative recombination assays were established in Escherichia coli, in vitro (Thorpe HM, Smith MC (1998) In vitro site-specific integration of bacteriophage DNA catalyzed by a recombinase of the resolvase/invertase family. Proc Natl Acad Sci USA 95(10):5505-10) and in higher eukaryotes (Chavez CL, Calos MP (2011) Therapeutic applications of the phiC31 integrase system. Curr Gene Ther 11(5):375-81). PhiC31 excisionase was discovered and characterized in vitro (Khaleel T, Younger E, McEwan AR, Varghese AS, Smith MC (2011) A phage protein that binds to phiC31 integrase to switch its directionality. Mol Microbiol 80(6):1450-63). Unlike Bxb1 excisionase, PhiC31 excisionase can bind to its cognate integrase in the absence of recombination sites.
The invention may be further appreciated with reference to the following examples without the invention being limited thereto.
EXAMPLES
A. Plasmids
The set element for Bxb1 (PBAD-driven integrase generator, Figs. 2B, 2D, 2E) was cloned in pSB3K1 plasmid (p15A origin; 15-20 copies). The reset element for Bxb1 (PBAD-driven excisionase+integrase generator, Figs. 2C, 2D, 2E) was cloned on J64100 plasmid. The full RAD module (PLtet-O1-driven integrase generator and PBAD-driven excisionase+integrase generator, Fig. 4, Fig. 14) was cloned in J64100. PBAD-driven excisionase / Ptet-driven integrase generator (Fig. 17 Bxb1, TP901-1 and PhiC31, Fig. 18) was cloned in J64100 plasmid. Constitutive promoter-driven integrase generator (Fig. 17 Phirv1, Fig. 19) was cloned in J64100 plasmid.
The set experiment (Fig. 17) for Bxb1, TP901-1, Phirv1 and PhiC31 was done using plasmids Bxb1PbadXisPtetInt_j64100, TP901PbadXisPtetInt_j64100, Phirv1constInt_j64100 and PhiC31PbadXisPtetInt_j64100 in cells with Bxb1registerBPstate, TP901registerBPstate, Phirv1registerBPstate and PhiC31registerBPstate, respectively, on the chromosome.
The reset experiment (Fig. 18) for Bxb1, TP901-1, Phirv1 and PhiC31 was done using plasmids Bxb1PbadXisPtetInt_j64100, TP901PbadXisPtetInt_j64100, Phirv1PbadXisPtetInt_j64100 and PhiC31PbadXisPtetInt_j64100 in cells with Bxb1registerLRstate, TP901registerLRstate, Phirv1registerLRstate and PhiC31registerLRstate, respectively, on the chromosome.
The integrase-data register specificity test (Fig. 19) was done using plasmids Bxb1constInt_j64100, TP901constInt_j64100, Phirv1constInt_j64100 and PhiC31constInt_j64100 in cells with Bxb1registerBPstate, TP901registerBPstate, Phirv1registerBPstate and PhiC31registerBPstate.
The chromosomal reset experiment (Fig. 20) was done using Bxb1PbadXisPtetInt_integrationVector, Phirv1PbadXisPtetInt_integrationVector and PhiC31PbadXisPtetInt_integrationVector integrated on the chromosome with Bxb1registerLRstate, Phirv1registerLRstate and PhiC31registerLRstate, respectively, also on the chromosome.
Table 1 below summarizes various genetic elements and plasmids and their corresponding SEQ ID NOS.
B. Cell culture and experimental conditions
All experiments were performed in E. coli DH5alphaZ1. For each experiment, three colonies were inoculated in supplemented M9 medium (M9 salts (Sigma), 1 mM thiamine hydrochloride (Sigma), 0.2% casamino acids (Acros Organics), 0.1 M MgSO4 (EMD reagents), 0.5 M CaCl2 (Sigma)) with glycerol (0.4%, from Fisher Scientific) added as a carbon source and appropriate antibiotics, and grown for approximately 18 hours at 37°C. Antibiotics used were carbenicillin (25 μg/ml), kanamycin (30 μg/ml) and chloramphenicol (25 μg/ml) (Sigma). L-arabinose (Calbiochem) was used at a final concentration of 0.5% w/v; anhydrotetracycline (Sigma) was used at a final concentration of 20 ng/ml. For Figs. 17-20, cells were grown in Hi-Def Azure medium (Teknova) supplemented with 0.66% v/v glycerol. L-arabinose was used at a final concentration of 0.2% (Figs. 17 and 18) or 0.1% (Fig. 20) w/v, and anhydrotetracycline was used at a final concentration of 200 ng/ml.
For long inputs, a saturated culture was diluted 1:2000 in media with inducer. For evolutionary stability experiments, cultures were diluted 1:2000 every day in media without inducer to achieve ~10 generations per day (log2 2000 ≈ 10.97; Canton B, Labno A, Endy D (2008) Refinement and standardization of synthetic biological parts and devices. Nat Biotechnol 26:787-793). Cells were centrifuged and washed before each dilution step. For short inputs, overnight cultures were diluted 1:100 in media with inducer and grown for 4 hours, at which point cells were washed, diluted 1:2000, and grown for an additional 16 hours. For Figs. 17-20, induction was done overnight.
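The generation count implied by a serial dilution can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes that the culture regrows to its pre-dilution density between passages, and the function name is hypothetical rather than part of the methods described herein.

```python
import math


def generations_per_dilution(dilution_factor: float) -> float:
    """Doublings needed to regrow to the pre-dilution density.

    Assumption (not stated in the text): the culture returns to the same
    density before each passage, so each passage contributes
    log2(dilution_factor) generations.
    """
    if dilution_factor <= 1:
        raise ValueError("dilution factor must be greater than 1")
    return math.log2(dilution_factor)


# 1:2000 daily passage, as used for the evolutionary stability experiments.
daily = generations_per_dilution(2000)
passages_for_100 = math.ceil(100 / daily)

print(f"{daily:.2f} generations per daily passage")
print(f"{passages_for_100} daily passages to exceed 100 generations")
```

Under this assumption, 100 generations of storage correspond to roughly ten consecutive daily 1:2000 passages.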
C. Measurement and data analysis
Flow cytometry was performed on an LSRII cytometer (BD Biosciences). For each data point, GFP and RFP fluorescence was measured for 30,000 cells. For Figs. 17-20, fluorescence was measured for 10,000 cells. Data were analyzed using the FlowJo software (Treestar, Inc.).
State distributions were quantified by plotting GFP and RFP intensities against each other and gating according to control cells containing only the chromosomal LR or BP DNA registers. No gate was applied to the cell population before quantification.
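By way of illustration only, the quantification step can be sketched as a threshold gate applied to each cell's GFP and RFP intensities. The rectangular gates, the threshold values, the state labels for double-positive and double-negative cells, and the function names below are assumptions made for this sketch; in practice the gates were drawn from the control populations, and the toy events are not measurements.

```python
from collections import Counter


def classify(gfp, rfp, gfp_gate, rfp_gate):
    """Assign one cell to a register state from its two fluorescence values."""
    green = gfp >= gfp_gate
    red = rfp >= rfp_gate
    if green and red:
        return "intermediate"  # assumed label for cells showing both reporters
    if red:
        return "LR"            # state "1": RFP
    if green:
        return "BP"            # state "0": GFP
    return "unlabeled"         # neither reporter above its gate


def state_fractions(events, gfp_gate, rfp_gate):
    """Fraction of cells in each state; no pre-gating of the population."""
    events = list(events)
    if not events:
        raise ValueError("no events to quantify")
    counts = Counter(classify(g, r, gfp_gate, rfp_gate) for g, r in events)
    return {state: n / len(events) for state, n in counts.items()}


# Illustrative (GFP, RFP) values only, not data from the experiments.
events = [(1000, 10), (900, 20), (15, 1200), (800, 900)]
fractions = state_fractions(events, gfp_gate=100, rfp_gate=100)
print(fractions)
```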
D. Architecture and model for a RAD module
A Recombinase Addressable Data (RAD) module was developed based on a two-state latch architecture that switches between states in response to distinct inputs and stores the last state recorded in the absence of either input signal. Here, the RAD module consists of an inducible "set" generator producing integrase, an inducible "reset" generator producing integrase and excisionase, and a DNA data register (Fig. 1A). An alternative RAD module architecture used in Fig. 17 (Bxb1, TP901-1 and PhiC31) and Fig. 18 has integrase and excisionase under two different inducible promoters (Ptet and PBAD, respectively) so that integrase and excisionase expression can be controlled independently. In Fig. 1A, the DNA inversion RAD module is driven by two generic transcription input signals, set and reset. A set signal drives expression of integrase that inverts a DNA element serving as a genetic data register. Flipping the register converts flanking attB and attP sites to attL and attR sites, respectively. A reset signal drives expression of integrase and excisionase and restores both register orientation and the original flanking attB and attP sites. The register itself encodes a constitutive promoter which initiates strand-specific transcription. Following successful set or reset operations, mutually exclusive transcription outputs "1" or "0" are activated, respectively. For the RAD module developed here a "1" or "0" register state produces red or green fluorescent protein, respectively.
Briefly, production of integrase alone should set a DNA register sequence flanked by oppositional attB and attP sites thereby producing an inverted sequence flanked by attL and attR sites (State "1"). A second independent transcriptional input drives the simultaneous production of integrase and excisionase and should reset the register sequence to its original orientation and flanking sequences (State "0").
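The intended latch logic can be summarized as a small state machine. The following is a deliberately idealized sketch: it assumes complete, deterministic switching with no intermediate states or stoichiometry effects, and the class and function names are hypothetical and do not describe an implementation used herein.

```python
from dataclasses import dataclass

# Output associated with each register state for the module described here.
REPORTER = {"BP": "GFP", "LR": "RFP"}
BIT = {"BP": 0, "LR": 1}

# Proteins produced by each input: (integrase, excisionase).
SIGNALS = {"set": (True, False), "reset": (True, True), "hold": (False, False)}


@dataclass
class DataRegister:
    state: str = "BP"  # attB/attP-flanked orientation, state "0"

    def apply(self, integrase: bool, excisionase: bool) -> str:
        """Idealized outcome of one input pulse (assumes 100% efficiency)."""
        if integrase and not excisionase and self.state == "BP":
            self.state = "LR"  # set: attB/attP converted to attL/attR
        elif integrase and excisionase and self.state == "LR":
            self.state = "BP"  # reset: original orientation and sites restored
        return self.state      # otherwise the stored state is held


def run(register, inputs):
    """Apply a sequence of named inputs and record (input, bit, reporter)."""
    trace = []
    for name in inputs:
        state = register.apply(*SIGNALS[name])
        trace.append((name, BIT[state], REPORTER[state]))
    return trace


trace = run(DataRegister(), ["set", "hold", "reset", "hold"])
for step in trace:
    print(step)
```

In this idealization a repeated set on a register already in the LR state, or excisionase expressed without integrase, leaves the stored state unchanged.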
A chemical kinetic model was built to better understand the potential behavior and failure modes of a DNA inversion RAD module (Fig. 1B). Fig. 1B shows the elementary chemical reactions, molecular species, and kinetic parameters used to model the RAD module. Molecular concentrations are normalized to the integrase dimer dissociation constant (Ki). Kinetic rates are normalized to the integrase-mediated recombination rate (kc). The model reflects available knowledge of the mechanics and kinetics of the Bxb1 recombinase system, specifically (Ghosh P, Wasil LR, Hatfull GF (2006) Control of phage Bxb1 excision by a novel recombination directionality factor. PLoS Biol 4:e186; Ghosh P, Pannunzio NR, Hatfull GF (2005) Synapsis in phage Bxb1 integration: selection mechanism for the correct pair of recombination sites. Journal of Molecular Biology 349:331-348; Ghosh P, Bibb LA, Hatfull GF (2008) Two-step site selection for serine-integrase-mediated excision: DNA-directed integrase conformation and central dinucleotide proofreading. Proceedings of the National Academy of Sciences 105:3238-3243).
The operational phase diagram of the latch at pseudo equilibrium was estimated using the model (Supplementary Methods). Fig. 1C is a simulated phase diagram detailing pseudo equilibrium operating regimes for a RAD module experiencing sustained integrase and excisionase expression levels for 200/kc. The red (left corner curves), green (right corner curves), and gray (bottom curves) lines represent, with decreasing intensity, 95, 75, and 55% switching (or hold) efficiencies. Three distinct latch operating regions were found as a function of integrase and excisionase expression levels, corresponding to expected "set," "reset," or "hold" operations (Fig. 1C). One complete latch cycle requires the dynamic adjustment of integrase and excisionase expression through a "set, hold, reset, hold" pattern. These operations are realized in practice by cycling the transcription signals that define latch set and reset inputs and by tuning the specific genetic elements that provide fine control over integrase and excisionase synthesis and degradation.
E. Unidirectional DNA inversion and data storage
A data storage register was first implemented via a DNA fragment encoding fluorescent reporter proteins and Bxb1 recombinase recognition sites flanking a constitutive promoter on the chromosome of E. coli DH5αZ1 (Lutz R, Bujard H (1997) Independent and tight regulation of transcriptional units in Escherichia coli via the LacR/O, the TetR/O and AraC/I1-I2 regulatory elements. Nucleic Acids Research 25:1203-1210) (Fig. 1A). That the state of the register could be assayed reliably was confirmed via microscopy and cytometry (Fig. 2A). In Fig. 2A, microscopy and flow cytometry data show two distinguishable states for an invertible data register integrated in the E. coli chromosome driving red (RFP) or green (GFP) fluorescent proteins, and also a control sample in which cells express neither reporter. It was next established that the register could set and hold state via a pulse of integrase expression within cells containing a single coding sequence for integrase. To do this, an integrase-driven "set" switch was built by cloning Bxb1 integrase under the control of an inducible PBAD promoter (Lee NL, Gielow WO, Wallace RG (1981) Mechanism of araC autoregulation and the domains of two overlapping promoters, Pc and PBAD, in the L-arabinose regulatory region of Escherichia coli. Proc Natl Acad Sci USA 78:752-756) and a ribosome binding site library within a p15A vector (described above). The set-encoding vectors were transformed into cells containing the chromosomal BP register, and cells that switched only when induced were isolated; many variants switch spontaneously in the absence of an input signal or do not switch when induced (Fig. 8, Table 2). Set functions were isolated that switch with greater than 95% efficiency at the single cell level and that hold state following inducer removal (Fig. 2B). As shown in Fig. 2B, the data register inverts upon expression of integrase. Growing cells (doubling time ~90 min) start in state "0" expressing GFP and, following a 16 hour set input pulse, switch to and hold state "1" expressing RFP.
F. Bidirectionality of excisionase-mediated DNA inversion in vivo
It was next determined if Bxb1 integrase and excisionase could mediate DNA inversion from an LR to BP state efficiently and unidirectionally in vivo. Previous in vitro experiments show that Bxb1 integrase and excisionase can catalyze LR to BP recombination to near completion (Ghosh P, Wasil LR, Hatfull GF (2006) Control of phage Bxb1 excision by a novel recombination directionality factor. PLoS Biol 4:e186). It was found that a reset function mediated by integrase plus excisionase is reversible in vivo. For example, using constructs (15-20 copies per cell) expressing both integrase and excisionase, it was observed that upon induction using a reset signal (arabinose) both DNA register states are sampled across a mixed population and then a split population arises following reset signaling (Fig. 2C).
Specifically, Fig. 2C shows bidirectionality of the integrase-excisionase reaction. Cells were transformed with plasmids carrying the LR DNA data register and a bidirectional reset element and were pulsed with arabinose. During a pulse, cells entered an intermediate state where both GFP and RFP are expressed. After inducer removal, cells split into two major populations corresponding to BP and LR states. Split BP and LR populations were sorted by FACS and the sorted cells were pulsed with arabinose again. The same behavior was observed regardless of the initial register state.
Bidirectional behavior starting from either initial register state was observed, suggesting that, in the context of the system, expression of integrase plus excisionase results in repeated cycles of register inversion between BP and LR states (Fig. 2C, right). Without being limited to theory, it was postulated that the system enters a bidirectional regime if the concentration of excisionase is too low relative to integrase as might be needed to completely reverse recombination directionality (Ghosh P, Wasil LR, Hatfull GF (2006) Control of phage Bxb1 excision by a novel recombination directionality factor. PLoS Biol 4:e186). This type of behavior was therefore termed a "stoichiometry mismatch" failure. Since register flipping should occur faster than fluorescent reporter degradation, cells display an intermediate state in which both reporter proteins are expressed. Following the reset pulse, cells resolve to one of two register states randomly and express a single reporter protein. A stochastic simulation of register "coin flipping" recapitulated the observed behavior (Fig. 2D, Supplementary Methods). Fig. 2D shows the stochastic simulation of bidirectional DNA inversion for a single copy DNA register (top row) before, during (blue shaded area) and after a reset pulse. BP to LR and LR to BP recombination propensities are assumed to be equal. The middle and bottom rows show two independent time-course stochastic simulations of expected GFP and RFP expression levels given the depicted (top row) BP and LR states. Fluorescent reporter degradation propensities are modeled as ten-fold slower than recombination propensities. Building on this framing, reset controllers were then engineered that produce a range of weighted outcomes in the final register state by tuning the reset-specific integrase degradation rate (25, 50, 75% BP:LR distributions; Fig. 15).
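The register "coin flipping" described above can be illustrated with a minimal stochastic sketch. It is not the simulation used to produce Fig. 2D: the function names, the unit rate `k_flip`, and the pulse length are illustrative assumptions, and reporter synthesis and degradation are omitted. The only element taken from the text is that BP to LR and LR to BP inversions have equal propensities during the reset pulse and that no inversion occurs afterwards.

```python
import random


def simulate_register(initial, pulse_length, k_flip=1.0, rng=None):
    """Simulate one single-copy register during a reset pulse.

    Inversion events are drawn with exponential waiting times (rate k_flip,
    equal in both directions, an assumption matching the text). Returns the
    final state and the list of (time, state) transitions.
    """
    rng = rng or random.Random()
    state, t = initial, 0.0
    history = [(0.0, state)]
    while True:
        t += rng.expovariate(k_flip)       # waiting time to the next inversion
        if t >= pulse_length:              # pulse ended: the register holds state
            break
        state = "LR" if state == "BP" else "BP"
        history.append((t, state))
    return state, history


def fraction_bp(n_cells, pulse_length, initial="LR", seed=0):
    """Fraction of simulated cells that end the pulse in the BP state."""
    rng = random.Random(seed)
    finals = [simulate_register(initial, pulse_length, rng=rng)[0]
              for _ in range(n_cells)]
    return finals.count("BP") / n_cells


if __name__ == "__main__":
    # With a pulse much longer than 1/k_flip, the population splits roughly evenly.
    print(fraction_bp(n_cells=2000, pulse_length=50.0))
```

Under these assumptions the final split does not depend on the initial state once the pulse is long relative to 1/k_flip, which mirrors the observation that sorted BP and LR populations behave the same when pulsed again.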
G. Engineering excisionase-mediated inversion towards unidirectionality
Stoichiometry mismatch failures were overcome by engineering reset generators that should increase the expressed ratio of excisionase to integrase (Table 2). First, integrase was expressed from a lower copy plasmid (5-10 per cell) while excisionase was expressed from a higher copy plasmid (50-70 per cell). Such changes were sufficient to drive excisionase-directed recombination to a BP state at ~85% efficiency on a single cell basis (Fig. 6). Next, excisionase and integrase were expressed together from a bicistronic operon on the same higher copy plasmid (50-70 per cell) but with a weaker GTG start codon for initiating translation of the integrase. These changes drove excisionase-directed recombination to a BP state at greater than 90% efficiency on a single cell basis (Fig. 2E). Fig. 2E shows data register restoration via expression of integrase and excisionase. Growing cells (doubling time ~90 min) start in state "1" expressing RFP and, following a 16 hour reset input pulse, return to and hold state "0" expressing GFP.
The DNA copy number of the data register was tuned in order to optimize the effective stoichiometry of excisionase associated with DNA-bound integrase (Table 2). Hatfull and coworkers proposed that Bxb1 excisionase only interacts with integrase that is complexed with DNA in either the LR or, more strongly, BP state (Ghosh P, Wasil LR, Hatfull GF (2006) Control of phage Bxb1 excision by a novel recombination directionality factor. PLoS Biol 4:e186). Thus, by reducing the number of DNA binding sites for integrase, the fraction of integrase available for excisionase binding may be decreased and therefore the effective excisionase-to-integrase ratio may increase. For example, by only reducing the DNA register copy number from 5-10 per cell (Fig. 2C) to 1 per cell (Fig. 7A, bottom) an increase in excisionase-mediated recombination directionality from ~40% to ~65% was observed. Such observations are consistent with a model that accounts for relative DNA copy number and whether excisionase can interact with cytoplasmic integrase (Figs. 7B-7C).
H. Stable in vivo data storage over many generations
For in vivo data storage systems to be most useful they must be able to store state over an extended period of time or many cell doublings. Thus, to study the temporal and evolutionary stability of the RAD module, E. coli cells were repetitively grown and diluted every day for 10 days, and data storage was monitored at the single cell level by measuring the continuous expression of fluorescent reporters. It was established that starting from either state the register could switch and hold state for 100+ cell doublings (Fig. 2F) or could hold state and then switch reliably following 90+ cell doublings (Fig. 2G). Fig. 2F shows stable long term data storage. Cells were serially propagated without input signals for 100 generations following data register set (orange) or reset (blue). The fraction of individual cells maintaining register state was assayed by cytometry (see Supplementary Methods). Fig. 2G shows long term functionality of the data register. Cells were serially propagated without input signals for 90 generations and then exposed to set (orange) or reset (blue) input signals. The fraction of individual cells switching state was assayed by cytometry. Taken together these data demonstrate the practical stability and long-term operational reliability of a RAD register in the absence of feedback-mediated latch bistability.
I. Composing opposing recombinase functions
A full set-reset cycle must operate within a single cell. However, it was found that most combinations of set and reset functions that work independently fail when used in combination within the same cell (Fig. 9). For example, the stand-alone reset functions, which encode high levels of excisionase synthesis, result in spontaneous accumulation of excisionase sufficient to corrupt set functions (Fig. 9). Such behavior is referred to as set/reset "interference" failure.
Without being limited to theory, it appeared that the ranges of integrase and excisionase synthesis and degradation rates within which set and reset functions operate reliably together and for which a RAD module holds state are increasingly distinct with decreasing input pulse lengths, leading to increasingly stringent expression requirements (Fig. 3A, top to bottom). Fig. 3A shows simulated phase diagrams detailing the expected operation of RAD module functions in response to dynamic pulses of integrase and excisionase across a range of expression levels. Top row: switching efficiencies (colormap) for set and reset functions when combined in a RAD module. Second, third, and fourth rows: combined switching and storage efficiencies (colormap) for set and reset alone and for the integrated RAD module across decreasing input pulse widths (200/kc, 20/kc, and 2/kc, respectively).
For example, when the reset excisionase peptide was destabilized in an attempt to overcome interference failures (Table 2) it was found that reset generators could be corrupted in two ways. First, stoichiometry mismatch failures recurred during reset induction (Fig. 3B). Second, it was found that if insufficient excisionase is maintained during the relaxation period following an apparently reliable reset then split state populations can still emerge (Fig. 3C and Fig. 10). More specifically, Fig. 3B shows reset failure during a reset pulse due to stoichiometric mismatch-mediated bidirectional register switching. Cells containing a chromosomal DNA data register starting in either state "1" (top row) or "0" (bottom row) were exposed to a 16 hour reset pulse, producing a mixed state population during signal input and a split population thereafter (see also Fig. 10). Fig. 3C shows reset failure immediately following a reset pulse due to stoichiometric mismatch-mediated setting to state "1". Cells containing a chromosomal DNA data register starting in either state "1" (top row) or "0" (bottom row) were exposed to a 16 hour reset pulse, producing a single state "0" population during signal input but a split population thereafter.
Approximately 400 clones encoding a library of destabilized reset-encoded integrase peptides were screened in order to obtain a reset cassette that functions reliably without also corrupting set functionality (Table 2). The working RAD module that resulted from this overall process uses weaker GTG start codons for both set and reset integrase coding sequences and encodes distinct non-consensus ssrA proteolysis tags (Andersen JB et al. (1998) New unstable variants of green fluorescent protein for studies of transient gene expression in bacteria. Applied and Environmental Microbiology 64:2240-2246) on the reset integrase and excisionase peptides (Fig. 4A, Fig. 8, and Supplementary Methods). Once assembled, the RAD module can be cycled repeatedly and reliably in response to transition and hold inputs lasting ~10 cell doublings (~900 minutes) (Fig. 4B); and the module will function in response to shorter transition inputs (Fig. 4C). Specifically, Fig. 4A shows details of an integrated DNA inversion RAD module optimized for reliable set, reset and storage functions. Specific genetic regulatory elements controlling protein synthesis and degradation were obtained from standard biological parts collections (filled shapes with black outline such as AAK and B0031) (e.g., those from the Registry of Standard Biological Parts at MIT, as well as the International Open Facility Advancing Biotechnology (BIOFAB) at UC Berkeley and Stanford University), via computational design (filled shapes without outline such as RBS-1 (SEQ ID NO.:24)) (e.g., using a ribosomal binding site (RBS) calculator), or via random mutagenesis and screening (filled shapes with gray outline such as RBS-2 (SEQ ID NO.:25) and ARH), as noted. Fig. 4B shows experimental RAD module operation over multiple duty cycles. Growing cells (doubling time ~90 minutes) starting in state "1" were cycled through a "reset (marked by "R"), hold (marked by "St"), set (marked by "S"), hold" input pattern with each step lasting ~10 generations. Cell state was assayed via multicolor cytometry. Population distribution GFP (y axis) and forward scatter (x axis) levels are shown for samples taken from the generation number following the input step given directly above each scatter plot (for example, the first scatter plot shows the population distribution at generation 10). Fig. 4C shows multi-cycle RAD module operation driven by shorter SET and RESET input pulses. As in Fig. 4B but with set and reset pulses lasting for ~2 cell doublings (~3 hours).
J. Expanding rewritable data storage capacity
Three additional exemplary data storage registers were implemented via a DNA fragment encoding fluorescent reporter protein and TP901-1, PhiRv1 and PhiC31 recombinase recognition sites flanking a constitutive promoter on the chromosome of E. coli DH5αZ1. Like Bxb1-based data storage registers, these three data storage registers can be switched from a BP state to an LR state using their cognate integrase (Fig. 17) and from an LR state to a BP state using their cognate integrase-excisionase (Fig. 18) expressed from J64100 plasmids. Integrases from Bxb1, TP901-1, PhiRv1 and PhiC31 were shown to efficiently invert only their corresponding data storage register (Fig. 19). Together, these results suggest the possibility of engineering multiple RAD modules, each module independently operated under different recombinases, in the same cell.
K. Rewritable data storage devices with integrase and excisionase generators on chromosome
In some embodiments, the RAD device design can have the input module (the transcriptional sources, integrase and excisionase generator) on a medium copy plasmid in order to produce enough integrase and excisionase for efficiently switching the state of the output module (recombination site and output promoter) on genomic DNA, especially during reset. In some instances, functional composition of multiple RAD units may require an ability to connect the output module of one RAD unit to the input module of the other RAD unit. Thus, it is important to be able to have both the input and the output modules operate on the same piece of DNA. Therefore, functional chromosomal reset for Bxb1, PhiRv1 and PhiC31 was demonstrated (Fig. 20). For all three cases, integrase and excisionase generators were integrated at the HK022 site on the E. coli chromosome. Integrase was expressed under Ptet while excisionase was expressed under PBAD.
L. Discussion
Table 2 summarizes certain failure modes and engineering solutions for set and reset operations. Putative failure causes are noted. Expression cassette schematics highlight (orange) regulatory regions targeted for reengineering with element-specific redesign goals. 6N is a full six nucleotide library within a Shine-Dalgarno ribosome binding site (RBS) core. RBS-1 and RBS-2 are collections of "standard" or computationally designed RBSs. AXX is a peptide library sampling 12 biochemically representative amino acids. The number of independent clonal constructs tested in each case plus corresponding figures are provided.
The experimental results are not inconsistent with a model in which inversion of DNA by Bxb1 integrase alone is unidirectional but inversion of DNA by integrase plus excisionase is bidirectional. However, biased directionality of Bxb1 excisionase-mediated recombination can be realized at a system-level by controlling the excisionase-to-integrase ratio and dynamics, and also the integrase-to-DNA target site ratio. By carefully tuning and integrating the expression of competing recombinase functions, a first reliable and rewritable DNA inversion-based data storage system that works in vivo was developed. The process by which a working RAD module was developed warrants consideration.
Practically, several challenges associated with enacting control over the relative levels and timing of expression for three proteins within E. coli had to be surmounted. Initially, it was unknown what specific quantitative levels or expression timing would drive DNA inversion in vivo. Moreover, a need remained for tools, whether working standard biological parts or computational design methods, sufficient to rationally engineer gene expression cassettes that provide qualitative control over switchable gene expression. For example, a series of computationally designed translation control elements produced qualitatively distinct phenotypes during set operations across a range of expected protein synthesis rates, and also for repeated design attempts at the same target translation rate (Fig. 8). The lack of genetic control elements that can be reliably composed so as to enable precise expression control with novel heterologous coding sequences, and the precision limits of computational tools that optimize control elements for specific genetic contexts, made it necessary to empirically validate gene expression on a gene-by-gene and gene combination-by-combination basis.
The use of conditional control over recombination directionality to implement a repeatedly rewritable DNA data storage element likely only partially aligns with the natural contexts in which integrase and excisionase performance have been selected. For example, integrase alone naturally mediates integration of a phage genome into a host chromosome under circumstances in which the phage will not destructively lyse the host cell. Such integration reactions are likely under positive selection to be fast and efficient, given that failure to integrate prior to host chromosome replication and cell division could result in loss of the phage from a daughter lineage. Integration reactions are also likely under negative selection to be irreversible, since integration followed by immediate excision could result in an abortive infection. Both selective pressures would align well with the performance requirements for integrase during a set operation. However, in nature, when integrase plus excisionase excise a prophage it may be that there are not similarly strong selective pressures against recombination bidirectionality. For example, prophage induction via integrase plus excisionase is typically associated with the expression of phage factors leading to irreversible lytic growth. Such an evolutionary context is different from the performance requirements of a reset operation for a genetic data storage system in which both the recombination products and the host cell must continue to exist.
Searching for phage-host systems in which prophage induction is followed by a period of delayed lysis or even host cell reproduction may help to identify natural excisionase-mediated recombination systems that are fully unidirectional. The DNA inversion RAD module developed here should be translatable to applications requiring stable long-term data storage (for example, replicative aging) or under challenging conditions (for example, clinical or environmental contexts requiring in situ diagnosis or ex post facto reporting via PCR or DNA sequencing). Given the natural phage recombination functions from which the latch is implemented (Ringrose L et al. (1998) Comparative kinetic analysis of FLP and Cre recombinases: mathematical models for DNA binding and recombination. Journal of Molecular Biology 284:363-384) a reliable operation with less than 30 minute switching times may be obtainable via continued optimization of integrase and excisionase synthesis and degradation rates. Further improvements to latch speed or reliability might also be realized by thresholding- or closed-loop control architectures that produce system-level bistability.
A typical architecture for an 8-bit synchronous counter capable of recording a series of 256 input pulses (Horowitz P, Hill W (1989) The art of electronics. (Cambridge University Press)) would require 16 recombinases recognizing distinct DNA sequences or the multiplexing of recombinase activity across repeating DNA recognition sites. Additional biochemically independent RAD modules could likely be identified from the increasing set of known natural recombinases (Hendrix RW (1999) Evolutionary relationships among diverse bacteriophages and prophages: All the world's a phage. Proceedings of the National Academy of Sciences 96:2192-2197; Lewis JA, Hatfull GF (2001) Control of directionality in integrase-mediated recombination: examination of recombination directionality factors (RDFs) including Xis and Cox proteins. Nucleic Acids Research 29:2205-2216; Smith MCM, Thorpe HM (2002) Diversity in the serine recombinases. Molecular Microbiology 44:299-307) or perhaps by engineering synthetic integrase-excisionase pairs with altered DNA recognition specificity (Buchholz F, Stewart AF (2001) Alteration of Cre recombinase site specificity by substrate-linked protein evolution. Nat Biotechnol 19:1047-1052; Ashworth J et al. (2006) Computational redesign of endonuclease DNA binding and cleavage specificity. Nature 441:656-659; Gaj T, Mercer AC, Gersbach CA, Gordley RM, Barbas CF (2011) Structure-guided reprogramming of serine recombinase DNA sequence specificity. Proc Natl Acad Sci USA 108:498-503). Such work along with puzzles of integrating dozens of competing biochemical functions suggest that engineering increased capacity in vivo data storage systems will help define and challenge the limits of synthetic biology.
K. Supplementary Methods
Molecular Biology
Plasmids were constructed using standard BioBrick™ (Knight T (2003) Idempotent Vector Design for Standard Assembly of Biobricks) or Gibson assembly (Gibson DG et al. (2009) Enzymatic assembly of DNA molecules up to several hundred kilobases. Nature Methods 6:343-345). Coding sequences for BioBrick™ versions of Bxb1, TP901-1 and PhiRv1 integrase and excisionase and for the LR and BP DNA registers were synthesized by DNA 2.0 (Menlo Park, CA, USA). The coding sequence for PhiC31 integrase and excisionase was a gift from the Calos lab (Stanford, CA, USA). Plasmids and parts encoding PBAD/AraC (Lee NL, Gielow WO, Wallace RG (1981) Mechanism of araC autoregulation and the domains of two overlapping promoters, Pc and PBAD, in the L-arabinose regulatory region of Escherichia coli. Proc Natl Acad Sci USA 78:752-756) (BBa_I0500), Superfolder GFP (Pedelacq J-D, Cabantous S, Tran T, Terwilliger TC, Waldo GS (2005) Engineering and characterization of a superfolder green fluorescent protein. Nat Biotechnol 24:79-88) (BBa_I746916), and the PLtetO-1 promoter (Lutz R, Bujard H (1997) Independent and tight regulation of transcriptional units in Escherichia coli via the LacR/O, the TetR/O and AraC/I1-I2 regulatory elements. Nucleic Acids Research 25:1203-1210) (BBa_R0040), were obtained from the MIT Registry of Standard Biological Parts (http://partsregistry.org). For the decoupled reset circuit, Bxb1 integrase was cloned on the pSB4A5 plasmid (pSC101 origin, 5-10 copies) (Shetty RP, Endy D, Knight TF (2008) Engineering BioBrick vectors from BioBrick parts. J Biol Eng 2:5) while the excisionase was cloned on the J64100 plasmid (regulated ColE1; 50-70 copies). Sequences are available via GENBANK accession numbers JQ929581 to JQ929585 and via the MIT Registry of Standard Biological Parts. Strong translational elements on both integrase and excisionase in the chromosomal Bxb1, PhiRv1 and PhiC31 reset experiments were obtained from the BIOFAB bicistron design collection (Mutalik V. et al., Precise and reliable gene expression via standard transcription and translation initiation elements. Nat Methods 2013 Apr;10(4):354-60).
Site specific chromosomal integration
The DNA registers in BP and LR states were integrated into the E. coli DH5alphaZ1 chromosome (Lutz R, Bujard H (1997) Independent and tight regulation of transcriptional units in Escherichia coli via the LacR/O, the TetR/O and AraC/I1-I2 regulatory elements. Nucleic Acids Research 25:1203-1210) using a modified version of the CRIM system (St-Pierre F., Cui L., Endy D., and Shearwin K., manuscript in preparation), at the phage HK022 (Fig. 2B, Fig. 7), Phi80 (Figs. 2A, 2E, 2F, 2G, 3, 4, 6, 13, 15, and 17-20 Bxb1 and PhiC31) or p21 (Figs. 17-20 PhiRv1) integration sites. Both integration sites are at around the same chromosomal location (25 minutes (Haldimann A (2001) Conditional-replication, integration, excision, and retrieval plasmid-host systems for gene structure-function studies of bacteria. Journal of Bacteriology 183 (21): 6384-6393)).
Architecture of the DNA data register
The DNA data register in BP and LR states consists of a constitutive promoter (BBa_J23119) flanked by BP or LR recombination sites positioned in opposite orientation, resulting in DNA inversion when recombined (Figs. 1 and 2 and Figs. 5E and 5F). A Rrnp T1 terminator (BBa_J61048) was added in reverse orientation upstream of the promoter to prevent transcriptional read-through in the opposite orientation, so that in each state, only one fluorescent protein is visibly expressed. On each side of the recombination target, superfolder GFP and mKate2 were cloned (Shcherbo, D. et al. Far-red fluorescent tags for protein imaging in living tissues. Biochem. J. 418, 567 (2009)) under translational control of measured strong RBSs (BIOFAB pilot C-dog project http://biofab.org/data). DNA data registers were first cloned on the pSB4A5 low copy plasmid and then integrated on the chromosome as described in the materials and methods section.
Details on the set and reset circuits and alternate reset architectures
To build the set circuit presented in Fig. 2A, Bxb1 integrase was cloned downstream of the PBAD/AraC promoter (BBa_I0500) on the pSB3K1 plasmid bearing a p15A origin of replication (15-20 copies). This version of the integrase has a 6-His-tag which was found to stabilize the protein. Therefore, a weak RBS (BBa_B0031) and an LAA ssrA tag were added to reduce the basal expression of the enzyme. For the reset circuit, PBAD controls expression of a polycistron encoding excisionase with a strong RBS designed using the RBS Calculator (Salis HM, Mirsky EA, Voigt CA (2009) Automated design of synthetic ribosome binding sites to control protein expression. Nat Biotechnol 27:946-950) to have a target Translation Initiation Rate (TIR) of 50000, followed by Bxb1 integrase with a GTG start codon to decrease its TIR (Barrick, D., Villanueba, K., Childs, J. & Kalil, R. Quantitative analysis of ribosome binding sites in E. coli. Nucleic Acids Research 22, 1287-1295 (1994)) and increase the excisionase to integrase ratio. This construct was cloned in a plasmid with a regulated ColE1 origin of replication (J64100, 50-70 copies). It was found that another way to match the correct excisionase to integrase ratio was to have a different copy number of each gene, by cloning them on different plasmids with different compatible origins of replication (Fig. 6A). PBAD-Bxb1 integrase was cloned (with no 6-His tag and no ssrA tag) on the pSC101 plasmid pSB4A5 (5-10 copies) and transformed into cells containing the DNA data register in the LR state along with the pBad-RBS50000-Xis construct cloned in J64100 (excisionase only). It was found that cells were able to flip from LR to BP with approximately 85% efficiency (Fig. 6B).
Cell culture and circuits operating conditions
Plasmids were transformed into chemically competent E. coli DH5alphaZ1 and plated on LB agar plates containing the appropriate antibiotics. The set circuit presented in Fig. 2B was transformed into cells containing the HK022-integrated BP DNA register bearing a chloramphenicol resistance cassette. The polycistronic reset circuit, the decoupled reset circuit, and all S/R RAD modules were transformed into cells containing the Phi80-integrated BP or LR DNA register bearing a kanamycin resistance cassette. For Figs. 17-20, all DNA registers bear a kanamycin resistance cassette. The Bxb1, TP901-1 and PhiC31 data registers were integrated at the Phi80 site; the PhiRv1 data register was integrated at the p21 site. Cells containing the chromosomal DNA data register were grown with an additional 5 μg/ml of kanamycin or chloramphenicol depending on the integrated cassette.
Data collection and analysis
Flow cytometry analysis was performed at the Stanford Shared FACS Facility (SSFF). The use of 2 channels increased the resolution of the measurements and was important to analyze the intermediate state resulting from bidirectional flipping. For Figures 4B and 4C, data were gated by forward and side scatter, and forward scatter vs GFP was plotted.
Generation of a degenerated library of ssrA tagged reset integrases and library screening
A screening vector (Fig. 5C) was built containing the excisionase fused to an AAK ssrA tag under the control of the arabinose promoter. A PLtetO-1 promoter followed by HindIII and NsiI sites allows for cloning and screening of aTc-controlled set circuits. The excisionase gene is followed by AscI and the BioBrick™ suffix SpeI and PstI sites, allowing for cloning of a reset integrase.
The last residues of the ssrA tag of the reset integrase were randomized using a reduced 12-amino-acid alphabet (Reetz, M.T. & Wu, S. Greatly reduced amino acid alphabets in directed evolution: making the right choice for saturation mutagenesis at homologous enzyme positions. Chem. Commun. 5499 (2008)) (NDT codons: Phe, Leu, Ile, Val, Tyr, His, Asn, Asp, Cys, Arg, Ser, Gly), thereby reducing the library size to 144 variants with no stop codons while conserving an equal representation of each type of amino acid. The library was built by amplifying the Bxb1 integrase gene with forward primers containing the B0031 RBS and a GTG start codon and reverse primers containing the randomized ssrA tag, and cloning the PCR library into the screening vector between the AscI and SpeI sites. Ligations were transformed into DH5alphaZ1 cells containing the chromosomal LR DNA register. 384 clones, corresponding to almost 95% coverage of the library (Reetz, M.T., Kahakeaw, D. & Lohmer, R. Addressing the numbers problem in directed evolution. ChemBioChem 9, 1797-1804 (2008)), were inoculated in supplemented M9 in 96 deep-well plates and grown for 18 h at 37°C. Cells were then diluted 1:200 in M9 + 0.5% arabinose and grown for another 18 h, then analyzed by flow cytometry on an LSRII cytometer equipped with a plate reader. Clones that showed higher flipping toward the BP state were diluted 1:2000 in M9 without inducer and grown again for 18 h, after which they were analyzed by flow cytometry. One of the working reset circuits isolated, termed "C1", was chosen and combined with the "G8" set generator to obtain the G8-C1 RAD module.
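The library arithmetic above can be checked with a short sketch. It assumes the standard genetic code, that exactly two tag positions were randomized (which is what 144 = 12 x 12 implies), and that screened clones are drawn uniformly at random from the library; the function names are illustrative and are not part of the original protocol.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... (T, C, A, G).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def ndt_amino_acids():
    """Translate every NDT codon (N = A/C/G/T, D = A/G/T, third base T)."""
    codons = [n + d + "T" for n in "ACGT" for d in "AGT"]
    return sorted(CODON_TABLE[c] for c in codons)


def expected_coverage(library_size, clones_screened):
    """Expected fraction of distinct variants seen under uniform random sampling."""
    return 1.0 - (1.0 - 1.0 / library_size) ** clones_screened


residues = ndt_amino_acids()
library_size = len(set(residues)) ** 2  # assumption: two randomized positions
coverage = expected_coverage(library_size, 384)
print(residues, library_size, round(coverage, 3))
```

Under these assumptions the twelve NDT codons encode twelve distinct residues with no stop codon, and 384 clones sample roughly 93% of the 144 variants, in line with the "almost 95%" quoted above.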
Computational design of RBSs using the RBS calculator and screening
Fourteen different RBSs were designed using the RBS Calculator (Salis HM, Mirsky EA, Voigt CA (2009) Automated design of synthetic ribosome binding sites to control protein expression. Nat Biotechnol 27:946-950) (https://salis.psu.edu/software), with Translation Initiation Rate (TIR) values of 1, 5, 10, 20, 50, 100 and 200. Two different RBSs were designed for each TIR value. Cells were grown without or with anhydrotetracycline to test for leakiness and inducibility, respectively. The results are presented in Figure 8. DNA inversion was monitored by cloning Gemini (Martin, L., Che, A. & Endy, D. Gemini, a bifunctional enzymatic and fluorescent reporter of gene expression. PLoS ONE 4, e7569 (2009)), a bifunctional reporter containing the alpha-fragment and GFP, downstream of the invertible promoter. Recombinase activity can therefore be monitored by beta-galactosidase activity. For the beta-galactosidase plate assay, X-Gal was used at a final concentration of 70 μg/mL and IPTG at a final concentration of 80 μM.
RBS randomization and set circuits screening
The Bxb1 integrase gene was PCR amplified using primers containing an RBS with a randomized Shine-Dalgarno sequence and a GTG start codon (SEQ ID NO.:23: cggctttcacacNNNNNNgctagcGTG) (Martin, L., Che, A. & Endy, D. Gemini, a bifunctional enzymatic and fluorescent reporter of gene expression. PLoS ONE 4, e7569 (2009)) and cloned into the screening vector downstream of the PLtetO-1 promoter (HindIII/NsiI). Clones were inoculated in supplemented M9 in 96 deep-well plates and grown for 18 h at 37°C, diluted 1:200 in SM9 containing 20 ng/ml aTc, and analyzed by flow cytometry. Screening 288 clones was sufficient to obtain 5 different working set circuits. The circuit termed "G8" was picked as a good compromise between low basal expression (and thus low spontaneous flipping) and efficient switching.
Mathematical modeling
Ordinary differential equations (ODEs) were used to describe the dynamical behavior of RAD modules (Figs. 1D, 3A, 10, 11, 12) and S/R latches (Fig. 14). The ODEs keep track of the levels of recombinases and of the total DNA register in each state. The ODEs were solved numerically using the MATLAB ode23 solver. It was assumed that all binding-unbinding reactions are at quasi-steady state. The quasi-steady states are updated at every time step of the ODEs using the MATLAB fsolve function. For the stochastic simulation (Fig. 2D), only the number of DNA registers in each state and the number of fluorescent reporters were tracked. The stochastic simulation was implemented in MATLAB using the Gillespie algorithm.
Kinetic model of RAD modules
Integrase-excisionase based RAD module model
The model consists of three components: integrase (I), excisionase (X) and DNA register (D). BP to LR recombination is catalyzed by an integrase tetramer (a complex of integrase dimers bound to the attB and attP sites). LR to BP recombination is catalyzed by an integrase-excisionase complex; the integrase-excisionase stoichiometry in the complex for Bxb1 is unknown but is assumed to be 1:1. As shown schematically in Fig. 1C, the dynamics of the total register in the LR state, DLRtot, can be written as:
$$\frac{d[D_{LRtot}]}{dt} = k_c\left([D_{BP}I_4] - [D_{LR}I_4X_4]\right)$$
where kc is an inversion rate constant.
Previous biochemical studies also indicated that: 1) Bxb1 integrase can form a dimer in solution, 2) integrase and excisionase cannot form a complex in solution, and 3) the integrase dimer and the integrase-excisionase complex can bind to all four possible recombination sites (attB, attP, attL and attR). It was assumed that all complexes are at quasi-steady state relative to integrase-excisionase production, degradation and DNA recombination. The quasi-steady-state
concentration of each complex can be written as:
$$[I_2] = \frac{[I]^2}{K_i}, \qquad [DI_4X_m] \propto \frac{[D]\,[I_2]^2\,[X]^m}{K_{di}^2\,K_{dix}^m} \quad (m = 0, \dots, 4)$$
where Ki, Kdi and Kdix are the dissociation equilibrium constants of the integrase-integrase dimer, of the integrase dimer-recombination site complex and of the integrase-excisionase complex on a recombination site, respectively.
Note that in the default model it was assumed that [DI4X], [DI4X2] and [DI4X3] are inactive and that integrase and excisionase cannot form a complex without an att site. For the model used for generating Fig. 12-vi, it was assumed that [DI4X], [DI4X2] and [DI4X3] can undergo both BP to LR and LR to BP recombination.
For the model used for generating Fig. 12-vii, it was assumed that integrase-excisionase complexes also exist in the cytosol.
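To make the default model more concrete, the following is a minimal sketch and not the authors' MATLAB implementation. It assumes that free integrase and excisionase concentrations are held fixed (no depletion by binding), that the two att sites bind integrase dimers independently, that excisionase binds only to the register-bound integrase tetramer with simple statistical factors, and that only DI4 (BP to LR) and DI4X4 (LR to BP) are active. All function names and the numerical levels (0.1 for basal, 10 for induced, in Ki units) are illustrative.

```python
from math import comb


def complex_fractions(i_free, x_free, k_i=1.0, k_di=1.0, k_dix=1.0):
    """Occupancy of one register at quasi-steady state (assumed independent binding)."""
    dimer = i_free ** 2 / k_i  # [I2] from the dimer dissociation constant
    a = dimer / k_di           # weight of one dimer bound to one att site
    x = x_free / k_dix         # weight of one excisionase bound to DI4
    weights = {"D": 1.0, "DI2": 2.0 * a, "DI4": a * a}
    for m in range(1, 5):
        weights[f"DI4X{m}"] = a * a * comb(4, m) * x ** m
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


def simulate_lr_fraction(i_free, x_free, lr0, t_end, k_c=1.0, dt=0.01):
    """Euler integration of d(LR)/dt = k_c * (BP * p(DI4) - LR * p(DI4X4))."""
    p = complex_fractions(i_free, x_free)
    lr = lr0
    for _ in range(int(round(t_end / dt))):
        lr += dt * k_c * ((1.0 - lr) * p["DI4"] - lr * p["DI4X4"])
    return lr


# Set pulse: high integrase, basal excisionase. Reset pulse: both high.
after_set = simulate_lr_fraction(i_free=10.0, x_free=0.1, lr0=0.0, t_end=20.0)
after_reset = simulate_lr_fraction(i_free=10.0, x_free=10.0, lr0=1.0, t_end=20.0)
# Hold: both enzymes at basal level, starting from the BP state.
after_hold = simulate_lr_fraction(i_free=0.1, x_free=0.1, lr0=0.0, t_end=200.0)
print(after_set, after_reset, after_hold)
```

With these illustrative values the register is driven almost entirely to LR during set, almost entirely back to BP during reset, and drifts only slightly during a long hold, which is the qualitative behavior the model is meant to capture.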
Dual recombinase RAD module model
The dual recombinase RAD module model (Fig. 14A) consists of three components: recombinase-1 (R1), recombinase-2 (R2) and DNA register (D). BP to LR and LR to BP recombination are catalyzed by an R1 tetramer and an R2 tetramer, respectively. It was assumed that R1 and R2 bind to an att site as dimers, that only one dimer can bind to each att site at a time, and that the R1 and R2 dimers can bind to both states of the register. Taken together, the system is governed by the following system of equations.
where kc is an inversion rate constant, and Ki and Kdi are the dissociation equilibrium constants of the recombinase dimer and of the recombinase dimer-recombination site complex, respectively.
Mutual-inhibition S/R latch model
Binary state could be stored epigenetically using a bistable gene regulatory network, for example, a system of two mutually repressing genes (Gardner TS, Cantor CR, Collins JJ (2000) Construction of a genetic toggle switch in Escherichia coli. Nature 403:339-342). The network can be set and reset simply by adding external inducers (IPTG, aTc, heat shock, etc.) that can inactivate one of the two repressors, allowing the other repressor to be expressed.
In order to make this system modular with respect to inputs, i.e., capable of storing an arbitrary transcriptional input signal, one could have an extra copy of each repressor gene driven by an external transcriptional input signal (Fig. 14B). This way, a pulse of a generic transcriptional signal can switch the system state by temporarily increasing the production of one repressor over the other. Such input modularization has been demonstrated, albeit for only a set circuit (Tashiro, Y., Fukutomi, H., Terakubo, K., Saito, K. & Umeno, D. A nucleoside kinase as a dual selector for genetic switches and circuits. Nucleic Acids Research 39, e12 (2011)).
The mutual inhibition S/R latch model presented here (Fig. 14B) consists of repressors R1 and R2 mutually repressing each other's expression and an extra copy of R1 and R2 driven by the set input and the reset input, respectively. Assuming that repressors bind to their cognate operator sites as tetramers (as was assumed for recombinases), the dynamics of the R1 and R2 concentrations can be written as:
$$\frac{d[R1]}{dt} = \alpha_{set} + \frac{\alpha K_a^4}{K_a^4 + [R2]^4} - \gamma [R1]$$
$$\frac{d[R2]}{dt} = \alpha_{reset} + \frac{\alpha K_a^4}{K_a^4 + [R1]^4} - \gamma [R2]$$
where α_set and α_reset are the production rates of R1 and R2 from the set input and the reset input, respectively; α is the maximal production rate of R1 and R2 from their repressible promoters; Ka is the dissociation equilibrium constant between each repressor and its cognate repressible promoter; and γ is the degradation rate of each repressor.
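The latch behavior described here can be illustrated with a minimal sketch, assuming a Hill coefficient of 4 (tetramer binding), a simple explicit Euler integrator in place of the MATLAB solver, and dimensionless illustrative values (Ka = 1, γ = 1, maximal repressible production of 10, basal input of 0.1, induced input of 10). The function names and the pulse and hold durations are hypothetical.

```python
def latch_step(r1, r2, a_set, a_reset, dt, alpha=10.0, k_a=1.0, gamma=1.0, n=4):
    """One explicit Euler step of the two-repressor mutual-inhibition model."""
    dr1 = a_set + alpha * k_a ** n / (k_a ** n + r2 ** n) - gamma * r1
    dr2 = a_reset + alpha * k_a ** n / (k_a ** n + r1 ** n) - gamma * r2
    return r1 + dt * dr1, r2 + dt * dr2


def run_phase(r1, r2, a_set, a_reset, duration, dt=0.01):
    """Integrate the latch for a fixed duration with constant inputs."""
    for _ in range(int(round(duration / dt))):
        r1, r2 = latch_step(r1, r2, a_set, a_reset, dt)
    return r1, r2


BASAL, INDUCED = 0.1, 10.0
state = (0.1, 10.1)  # start with R2 high (reset state)

# Set pulse followed by a hold period without input.
state = run_phase(*state, a_set=INDUCED, a_reset=BASAL, duration=20.0)
state = run_phase(*state, a_set=BASAL, a_reset=BASAL, duration=20.0)
set_fraction = state[0] / (state[0] + state[1])

# Reset pulse followed by a hold period without input.
state = run_phase(*state, a_set=BASAL, a_reset=INDUCED, duration=20.0)
state = run_phase(*state, a_set=BASAL, a_reset=BASAL, duration=20.0)
reset_fraction = state[0] / (state[0] + state[1])
print(set_fraction, reset_fraction)
```

The ratio [R1]/([R1]+[R2]) is used as the readout of the latch state; with these values it should stay close to 1 after the set pulse and close to 0 after the reset pulse, including during the hold periods.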
Parameter non-dimensionalization and efficiency measurements
While the exact kinetic parameters are unknown, the general features of the latch behavior can be understood by nondimensionalizing all concentration and time units, here in terms of Ki and kc^-1, respectively. The default parameter set has Kdi = Kdix = 1 Ki; the degradation rate constants of integrase and excisionase, γi and γx, are equal to 1 kc; and the default expression scalings for integrase and excisionase, βi and βe, are equal to 1 Ki kc. Basal integrase and excisionase production rates are 0.1 βi and 0.1 βe, respectively; the induced excisionase production rate during reset and the induced integrase production rate during set or reset are 10 βe and 10 βi, respectively (i.e., fold change, Fc, from basal production = 100). The total DNA register concentration is 1 (in Ki units).
For a mutual inhibition S/R latch, a repressor binding dissociation equilibrium constant Ka = 1 was used, with basal and induced production of each repressor from the set and reset promoters of 0.1 Ki2kc and 10 Ki2kc, respectively. The maximal production of each repressor from the repressible promoter is 10 Ki2kc. "Set (reset) efficiency" is defined as the fraction of the register that can be switched from BP to LR (LR to BP) state after inducing with a set (reset) pulse for 200 kc^-1 and that can also hold state for at least 200 kc^-1 after the pulse ends. "Set-reset efficiency" is defined as the product of the set and reset efficiencies. For a mutual inhibition S/R latch, the ratio between [R1] and [R1]+[R2] was used instead of the [BP] fraction.
Changes in latch efficiency were studied as basal integrase and excisionase (or R1 and R2) concentrations were tuned while keeping the fold change constant. In Fig. 3, steady-state concentrations of Int and Xis were scaled by tuning γi and γx. For the default modeling parameters and conditions, proportionally scaling basal and induced production also gives similar results.
General features of RAD module operable range and modes of failure
One could think of a RAD module as a device with two elements: state storage and input interface elements. The state storage element allows the latch to maintain the state; this includes a DNA register of the DNA inversion RAD module or a bistable mutual inhibition circuit of a mutual inhibition S/R latch. A DNA register encodes a state in a DNA sequence which is naturally maintained and replicated inside living cells; a mutual inhibition circuit encodes a state as repressor concentration which can be maintained through a feedback loop. The input interface element allows external inputs (transcriptional signals for examples presented here) to perturb and change the state of the state storage element. This includes the integrase-excisionase genes (or dual recombinases) for the DNA inversion RAD module or extra copies of repressor genes driven by inputs, for a mutual inhibition S/R latch.
A challenge in implementing a RAD module is to properly "map" the dynamic ranges of the external input of interest, via the input interface element, to the state phase of the state storage element. In this study, such mapping is represented as the expression scaling parameter β. Physically, β is proportional to the translation rate and inversely proportional to the protein degradation rate. When the scaling parameter for integrase, βi, is large, even a low transcriptional set or reset signal is mapped to a high integrase concentration (the upper part of Fig. 1D). If βi is too large, even a basal transcriptional set or reset signal results in too much integrase for the RAD module to hold the state. Conversely, if βi is too small, the transcriptional signal during a set or reset will not give enough integrase for the RAD module to set or reset. The same idea holds true for a mutual inhibition S/R latch: the scaling parameter for the repressor from the input interface element must be low enough to allow the state storage element to remain bistable in the absence of an input pulse and high enough to cause the loss of bistability in the presence of an input pulse.
Another challenge in implementing a RAD module is to optimize two antagonizing mechanisms, the set and the reset mechanisms, within the same chassis. For example, the optimal conditions for resetting the integrase-excisionase based S/R latch, such as a stable and efficiently translated excisionase, are likely to give a basal excisionase expression high enough to interfere with the setting mechanism.
In general, the size of the operable range with respect to the scaling parameter will be proportional to the fold change between the basal input level and the input pulse. If the fold change is small, one needs to precisely match the basal input level to the state storage regime and the induced input level to the state switching regime. Thus, the operable range will be narrow (Fig. 12-iv).
Features of integrase-excisionase based RAD modules operable range
First consider switching efficiencies from BP to LR state during a long set pulse and from LR to BP state during a long reset pulse (Fig. 3A, top row). Both set and reset operable ranges have a lower bound of integrase expression level corresponding to the induced integrase level that is "enough" for efficient recombination. For a set, at the limit of low excisionase expression, the rate of BP to LR recombination is governed by the level of [DI4], which can be switched to LR state, relative to [D] and [DI2], which cannot:
$$[I_{tot}] - 4[D_{tot}] < [I] < [I_{tot}]$$
*All concentrations are normalized with respect to Ki.
If [Itot] during set is small compared to the square root of Kdi, then the ratio between [DI4] and [D] + [DI2] becomes small and thus setting is inefficient. Therefore, the lower bound of integrase expression for set can be expected to scale with the square root of Kdi. For a reset, interestingly, the lower bound of integrase expression is somewhat lower than that for set. This is because excisionase binding could help push the complex equilibrium toward the [DI4X4] complex.
Excisionase determines the directionality of state switching. Too high a basal excisionase expression will break a set. Conversely, too low an induced excisionase expression will break a reset. Consider how much excisionase expression scaling will allow both efficient set and reset. The net BP to LR and LR to BP recombination rates depend on the relative amounts of the active recombination complexes, [DI4] and [DI4X4], which, in turn, depend on the excisionase level:
$$\frac{[DI_4]}{[DI_4X_4]} = \left(\frac{K_{dix}}{[X]}\right)^4$$
If basal [X] during set is significantly larger than Kdix, [DI4]/[DI4X4] will become very small and set will not work. In Fig. 3A, the basal excisionase level is 0.1 βe, so the upper bound of the excisionase level for the set operable range would be expected at 0.1 βe ~ Kdix. Conversely, if induced [X] during reset is significantly smaller than Kdix, [DI4]/[DI4X4] will become very large and reset will not work. In Fig. 3A, the induced excisionase level is 10 βe, so the lower bound of the excisionase level for the reset operable range would be expected at 10 βe ~ Kdix. Note that since excisionase can only bind to integrase on the recombination site but not to free integrase, no matter how much integrase is expressed, [X] will not go below [Xtot] - 4*[Dtot]. Therefore, the upper bound of excisionase expression for the set operable range and the lower bound of excisionase expression for the reset operable range will be almost independent of the integrase level. However, if [Dtot] is large or if integrase-excisionase binding is allowed in the cytosol, [X] will also depend on the integrase level, and this lower bound will increase as the integrase level increases (Fig. 12, v and vii).
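The two excisionase bounds discussed above can be put into numbers with a small sketch. It assumes that the free excisionase level equals the stated multiple of the scaling parameter (0.1 βe basal, 10 βe induced), that the bounds sit exactly where [X] = Kdix, and that the four excisionases bind with the same Kdix so that the ratio of active complexes follows a fourth power. The function names are illustrative.

```python
def excisionase_scaling_window(k_dix=1.0, basal_factor=0.1, induced_factor=10.0):
    """Range of the excisionase scaling for which set and reset can both work.

    Set needs basal [X] = basal_factor * beta_e to stay at or below K_dix;
    reset needs induced [X] = induced_factor * beta_e to reach at least K_dix.
    """
    lower = k_dix / induced_factor
    upper = k_dix / basal_factor
    return lower, upper


def active_complex_ratio(x, k_dix=1.0):
    """[DI4]/[DI4X4], assuming four excisionases bind with the same K_dix."""
    return (k_dix / x) ** 4


lower, upper = excisionase_scaling_window()
print(lower, upper, upper / lower)
print(active_complex_ratio(0.1), active_complex_ratio(10.0))
```

Under these assumptions the width of the window, upper / lower, equals the fold change between induced and basal production (100 for the default parameters), and the fourth-power dependence explains why the transition between the set-favoring and reset-favoring regimes is steep.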
Now, consider switching efficiency after the input pulse is gone (Fig. 3A, 2nd to 4th rows). The reset operable range has an additional bound: the upper bound of integrase production, below which there is not enough integrase to cause spontaneous BP to LR recombination at the end of the reset pulse (Fig. 10). Reset becomes inefficient if [Itot] > [Xtot] during the reset pulse because, at the end of the pulse, excisionase will disappear first and thus leftover integrase will drive the BP state register back to the LR state. Note that this upper bound is not simply a straight line [Itot] = [Xtot]; once there is enough basal excisionase to bind to all integrases on att sites, BP to LR switching becomes very inefficient, regardless of how much integrase there is in the system. Therefore, this upper bound rises sharply and asymptotically toward the excisionase upper bound of the set operable range described above. When [Dtot] is large or Int-Xis can form a complex in the cytosol, this upper bound becomes the diagonal line [Itot] = [Xtot].
The set operable range after an input pulse remains almost the same as during an input pulse. The only difference is a small notch located approximately where the [Itot] = [Xtot] line crosses the excisionase upper bound of the set operable range (Fig. 12). At this point, there is enough basal integrase expression and basal excisionase expression to cause LR to BP switching in the absence of a pulse.
The sharp boundary between operable and non-operable regions of the set or the reset arises from the fact that four integrase monomers act cooperatively to catalyze BP to LR recombination and four excisionase monomers act cooperatively to enable LR to BP recombination. The cooperative effect is particularly strong because the intermediate complexes were assumed to be inactive (a DNA register with an integrase tetramer and 1-3 excisionase monomers can undergo neither BP to LR nor LR to BP recombination). If it is instead assumed that these intermediate complexes can undergo both BP to LR and LR to BP recombination, then the boundary of the operable range becomes smoother (Fig. 12-vi).
Operable range features of S/R latches with alternative mechanisms
Similar to the integrase-excisionase RAD module based S/R latch, the operable ranges of the dual recombinase RAD module based S/R latch or the mutual inhibition S/R latch with respect to input element expression scaling are constrained by (Fig. 14): 1) the expression scaling lower bound, which must be large enough to allow a state change during a pulse, and 2) the expression scaling upper bound, which must be small enough not to allow spontaneous state switching in the absence of an input pulse. The operable range size, i.e., the distance between the lower and the upper expression scaling bounds, is approximately the fold change between the induced and the basal input levels.
The rectangular shape of the operable range results from the fact that the set and the reset mechanisms of the dual recombinase RAD module (or of the mutual inhibition S/R latch) do not directly interact with each other and that there is no loss of operable regime due to stoichiometry mismatch. Note that for the mutual inhibition S/R latch, there is a sharp transition between efficient and inefficient set or reset operable ranges due to the bistability of the system.
Simulated operable range with respect to DNA register and integrase-excisionase gene copy numbers
The simulated RAD operable range can recapitulate the experimentally observed dependence between the copy number of the DNA register, the copy number of the integrase-excisionase genes and S/R latch efficiency (Fig. 7A and 7B). Specifically, increasing the copy number of the DNA register relative to the copy number of the integrase-excisionase genes decreases resetting efficiency.
The parameter setting used in the simulation shown here is the same as the default parameter setting, except that the excisionase production rate during reset is reduced to only half of the integrase production rate. At the limit of high DNA register copy number, the amount of integrase-DNA register complexes approaches the total amount of integrase. If total integrase outnumbers total excisionase, there are too many integrase-DNA register complexes for excisionase to bind to and thus BP to LR recombination cannot be suppressed completely. At the limit of low DNA register copy number, the amount of integrase-DNA register complexes is limited by the number of total DNA registers. Thus, although total excisionase is less than total integrase, there could still be enough excisionase to bind to all integrase-DNA register complexes, allowing for complete suppression of BP to LR recombination and thus efficient resetting. If the model assumption is changed to allow for integrase-excisionase complex formation in the absence of the DNA register, reset efficiency is no longer sensitive to the copy number of the DNA register (Fig. 7C). Therefore, the fact that reset efficiency is sensitive to DNA register copy number supports a prior finding that excisionase only binds to the integrase-recombination site complex but not to free integrase (Ghosh P, Wasil LR, Hatfull GF (2006) Control of phage Bxb1 excision by a novel recombination directionality factor. PLoS Biol 4:e186).
Note that varying the DNA register and integrase-excisionase gene copy numbers also has other side effects. Lowering the integrase-excisionase copy number relative to the DNA register copy number means that the amount of enzyme (integrase or integrase-excisionase complex) is reduced relative to the amount of substrate (DNA register). Thus, state switching, both set and reset, will need more time to complete, which could explain why both set and reset become inefficient (Fig. 7B and 7C, lower right corner). Conversely, increasing the integrase-excisionase gene copy number results in higher basal expression levels of integrase and excisionase, leading to spontaneous state switching. Thus, state storage becomes inefficient (Fig. 7B and 7C, bottom row, upper left corner).
Stochastic simulation of bidirectional DNA inversion
In Fig. 3B, a failure mode of a RAD module is presented in which each cell in the population expresses both GFP and RFP during an input pulse and the population then splits into two populations of cells expressing either GFP or RFP after the pulse. This observation may be explained using a stochastic simulation of bidirectional DNA inversion. The model consists of a single-copy DNA register which can be in either state 0, expressing GFP, or state 1, expressing RFP. A scenario was simulated in which the net propensities for inverting from state 0 to state 1 and from state 1 to state 0 are equal. This scenario corresponds to the region between the set and the reset regimes in Fig. 1D. It was also assumed that the degradation propensity of both reporters is ten times slower than the inversion propensity (reporter kinetics in the experimental system were expected to be slow because GFP and RFP are stable). During an input pulse, bidirectional inversion occurs so fast relative to reporter kinetics that cells appear to express both reporters, although their DNA register can be in either state 0 or state 1 at any given moment. After an input pulse, DNA inversion stops and half of the population will stochastically end up in state 0 and the other half in state 1.
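The scenario above can be reproduced with a minimal Gillespie sketch in Python (the original simulation was written in MATLAB and is not reproduced here). The rate values, pulse length, hold length, population size and function name are illustrative assumptions; only the ratio between inversion and reporter degradation (ten to one) and the equal inversion propensities are taken from the text.

```python
import random


def simulate_cell(rng, k_inv=1.0, k_prod=10.0, k_deg=0.1,
                  t_pulse=100.0, t_hold=200.0):
    """Gillespie run for one register.

    Returns (snapshot at pulse end, final snapshot); each is (state, gfp, rfp).
    """
    state, gfp, rfp, t = 0, 0, 0, 0.0
    t_end = t_pulse + t_hold
    pulse_snapshot = None
    while True:
        inversion = k_inv if t < t_pulse else 0.0  # inversion only during the pulse
        rates = (inversion, k_prod, k_deg * gfp, k_deg * rfp)
        total = sum(rates)
        t_next = t + rng.expovariate(total)
        if t < t_pulse <= t_next:
            # The pulse ends before the next event: record the state, move to the
            # boundary and redraw (valid because waiting times are memoryless).
            t = t_pulse
            pulse_snapshot = (state, gfp, rfp)
            continue
        if t_next >= t_end:
            return pulse_snapshot, (state, gfp, rfp)
        t = t_next
        u = rng.random() * total
        if u < rates[0]:
            state = 1 - state          # DNA inversion
        elif u < rates[0] + rates[1]:
            if state == 0:
                gfp += 1               # state 0 expresses GFP
            else:
                rfp += 1               # state 1 expresses RFP
        elif u < rates[0] + rates[1] + rates[2]:
            gfp -= 1                   # GFP degradation
        elif rfp > 0:
            rfp -= 1                   # RFP degradation


rng = random.Random(1)
cells = [simulate_cell(rng) for _ in range(40)]
both_during_pulse = sum(1 for pulse, _ in cells if pulse[1] > 0 and pulse[2] > 0)
final_state1 = sum(final[0] for _, final in cells)
print(both_during_pulse, final_state1)
```

At the end of the pulse essentially every simulated cell carries both reporters, while after the hold period each cell retains only the reporter of the state in which its register happened to stop, so the population splits into GFP-only and RFP-only cells.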
All references, patents, patent applications, and other documents cited herein are incorporated by reference in their entirety herein.
Claims
1. An in vivo data storage system comprising a recombinase addressable data module, the module comprising: an invertible DNA data register comprising a DNA register sequence flanked by oppositional attachment sites, wherein the directionality of the DNA register sequence is invertible to a set state 1, and optionally reversibly invertible from said set state 1 to a reset state 0.
2. The in vivo data storage system of claim 1, further comprising:
(a) a set generator comprising a first gene encoding an integrase; and
(b) a reset generator comprising a second gene encoding an excisionase.
3. The in vivo data storage system of claim 1 or 2 wherein the attachment sites of the invertible DNA data register are recognized and recombined by the integrase of the set generator when the DNA register sequence is in reset state 0.
4. The in vivo data storage system of claim 1 or 2 wherein the attachment sites of the invertible DNA data register are recognized and recombined by an integrase-excisionase complex of the reset generator when the DNA register sequence is in set state 1.
5. The in vivo data storage system of claim 2, wherein the reset generator further comprises a third gene encoding an integrase.
6. The in vivo data storage system of claim 5 wherein the attachment sites of the invertible DNA data register are recognized and recombined by an integrase-excisionase complex of the reset generator when the DNA register sequence is in set state 1.
7. The in vivo data storage system of claim 5, wherein the second and third genes together encode proteins that form an excisionase-integrase complex.
8. The in vivo data storage system of claim 5, wherein the second and third genes together encode a fusion protein comprising the excisionase and the integrase.
9. The in vivo data storage system of claim 2, wherein at least one of the first and second genes are inducible.
10. The in vivo data storage system of any one of claims 5-8, wherein at least one of the first, second and third genes are inducible.
11. The in vivo data storage system of claim 9 or 10, wherein at least one inducible gene is inducible by an inducer.
12. The in vivo data storage system of claim 11, wherein the inducer is a small molecule, a chemical, a protein, an enzyme, a nucleic acid, and a metal ion.
13. The in vivo data storage system of claim 11 or 12, wherein the inducer is an endogenous inducer.
14. The in vivo data storage system of claim 11 or 12, wherein the inducer is an exogenous inducer.
15. The in vivo data storage system of claim 2, wherein the at least one of the first and second genes is autoinducible.
16. The in vivo data storage system of any one of claims 11-14, wherein the inducer induces gene expression indirectly.
17. The in vivo data storage system of any one of claims 2-16, wherein the stored data is retained in the absence of expression of the first gene, the second gene, or both the first and second genes.
18. The in vivo data storage system of any one of claims 2-17, wherein each pair of set generator and reset generator represents one binary digit (bit).
19. The in vivo data storage system of claim 18, comprising N unique pairs of set generators and reset generators, such that the system is capable of storing up to 2^N bits of data.
20. The in vivo data storage system of any one of claims 2-19, wherein the integrase and excisionase are derived from bacteriophage Bxb1, TP901-1, PhiRv1 and/or PhiC31.
21. The in vivo data storage system of any one of claims 1-20, wherein the in vivo data storage system is a nonvolatile storage system.
22. A vector comprising the in vivo data storage system of any one of claims 1-21.
23. A recombinant cell comprising the in vivo data storage system of any one of claims 1-21.
24. The recombinant cell of claim 23, wherein the in vivo data storage system is present in a chromosome of the recombinant cell.
25. The recombinant cell of claim 23, wherein the DNA data register is present in a chromosome of the recombinant cell.
26. A recombinant cell comprising the in vivo storage system of any one of claims 2-21 wherein at least one of the first and second genes is present in a chromosome of the recombinant cell.
27. A recombinant cell comprising the in vivo storage system of any one of claims 5-8 and 10, wherein at least one of the first, second and third genes is present in a chromosome of the recombinant cell.
28. A method for storing data in a cell, the method comprising:
(a) providing a cell comprising an in vivo data storage system having a recombinase addressable data module including
(i) a set generator comprising a first gene encoding an integrase; and
(ii) an invertible DNA data register comprising a DNA register sequence flanked by oppositional attachment sites, wherein the directionality of the DNA register sequence is invertible to a set state 1, and optionally reversibly invertible from said set state 1 to a reset state 0, and
(b) inducing the first gene or allowing the induction of the first gene to express the integrase so as to allow the DNA register sequence to invert, thereby generating a set state of directionality for the DNA register sequence, the set state represented by binary digit 1, thereby storing data represented by the binary digit 1 in the cell.
29. The method of claim 28, wherein the recombinase addressable data module further comprises a reset generator comprising a second gene encoding an excisionase and, optionally, a third gene encoding an integrase, and wherein the method further comprises:
(c) inducing the second gene or allowing the induction of the second gene to express the excisionase so as to allow the DNA register sequence to invert back, thereby generating a reset state of directionality for the DNA register sequence, the reset state represented by binary digit 0, thereby storing data represented by the binary digit 0 in the cell.
30. The method of claim 29 further comprising:
(d) optionally repeating step (b), thereby storing data represented by the binary digit 1 or 0 in the cell.
31. The method of claim 29 or 30, further comprising controlling expression of the excisionase to provide a stoichiometric amount of the excisionase in relation to one or both of an amount of an integrase and a copy number of the DNA register sequence, so as to favor generation of the reset state 0 of the DNA register sequence.
32. The method of any one of claims 29-31, further comprising tunably controlling the reversible inversion of the DNA register sequence between a state 0 and a set state 1, the controlling comprising one or more of:
(i) minimizing spontaneous inversion to set state 1;
(ii) minimizing during step (b) interference by excisionase to favor generation of the set state 1; and
(iii) minimizing during step (c) stoichiometry mismatch to favor generation of state 0.
33. The method of claim 32, wherein (i) comprises controlling basal expression of the integrase below a threshold level for spontaneous inversion.
34. The method of claim 32, wherein (ii) comprises increasing degradation of the
excisionase.
35. The method of claim 32, wherein said step (iii) includes one or more of: increasing expression of the excisionase, decreasing expression of the integrase, increasing degradation of the integrase, and reducing a copy number of the DNA register sequence.
36. The method of any one of claims 28-35, wherein the in vivo data storage system is a nonvolatile storage system.
37. The method of any one of claims 28-35, wherein the in vivo data storage system is a nonvolatile storage system, wherein the stored data is retained in the absence of expression of the first gene.
38. The method of any one of claims 29-35, wherein the in vivo data storage system is a nonvolatile storage system, wherein the stored data is retained in the absence of expression of the first gene, the second gene, or both the first and second genes.
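The set, reset, and hold behavior recited in the claims above can be illustrated with a toy state model. This is a minimal sketch under stated assumptions, not the claimed method itself: the class `DnaRegister`, the method `pulse`, and the helper `write_bits` are hypothetical names, expression is reduced to on/off flags, and the model assumes that integrase alone sets the register, integrase together with excisionase resets it, and no inversion occurs without integrase. Stoichiometry, degradation, and copy number (claims 31-35) are deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class DnaRegister:
    """Toy model of one invertible DNA register holding a single bit."""

    state: int = 0  # 0 = reset orientation, 1 = set (inverted) orientation

    def pulse(self, integrase: bool, excisionase: bool) -> int:
        """Apply one induction pulse and return the resulting state.

        Assumed rules (illustrative only):
          integrase alone         -> set state 1
          integrase + excisionase -> reset state 0
          no integrase            -> state is held (nonvolatile)
        """
        if integrase and not excisionase:
            self.state = 1
        elif integrase and excisionase:
            self.state = 0
        # Without integrase nothing inverts, so the stored bit is kept.
        return self.state


def write_bits(register: DnaRegister, bits: list[int]) -> list[int]:
    """Write each bit, then remove induction, recording the state after each."""
    history = []
    for bit in bits:
        # Writing 1 uses integrase alone; writing 0 adds excisionase.
        history.append(register.pulse(integrase=True, excisionase=(bit == 0)))
        # No expression of either gene: the bit should persist.
        history.append(register.pulse(integrase=False, excisionase=False))
    return history
```

Calling `write_bits(DnaRegister(), [1, 0, 1])` alternates write pulses with induction-free intervals, so every second entry of the returned history shows the state being held without expression of either gene, as in claims 36-38.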
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201261688092P | 2012-05-08 | 2012-05-08 | |
US61/688,092 | 2012-05-08 | ||
US201361852002P | 2013-03-14 | 2013-03-14 | |
US61/852,002 | 2013-03-14 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2013169867A1 (en) | 2013-11-14 |
Family
ID=49551234
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US2013/040089 WO2013169867A1 (en) | 2012-05-08 | 2013-05-08 | Methods and compositions for rewritable digital data storage in live cells |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2013169867A1 (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP1205490A1 (en) * | 2000-11-10 | 2002-05-15 | ARTEMIS Pharmaceuticals GmbH | Fusion protein comprising integrase (phiC31) and a signal peptide (NLS) |
WO2010075441A1 (en) * | 2008-12-22 | 2010-07-01 | Trustees Of Boston University | Modular nucleic acid-based circuits for counters, binary operations, memory, and logic |
2013-05-08: WO application PCT/US2013/040089 (patent WO2013169867A1), active, Application Filing
Non-Patent Citations (3)
Title |
---|
CHO ET AL.: "Interactions between Integrase and Excisionase in the Phage Lambda Excisive Nucleoprotein Complex", JOURNAL OF BACTERIOLOGY, vol. 184, no. 16, September 2002 (2002-09-01), pages 5200 - 5203 * |
GHOSH ET AL.: "Control of phage Bxb1 excision by a novel recombination directionality factor", PLOS BIOL, vol. 4, no. 6, 30 May 2006 (2006-05-30), pages 0964 - 0974 * |
STUDIER: "Protein production by auto-induction in high-density shaking cultures", PROTEIN EXPRESSION AND PURIFICATION, vol. 41, 12 March 2005 (2005-03-12), pages 207 - 234 * |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2014093852A1 (en) * | 2012-12-13 | 2014-06-19 | Massachusetts Institute Of Technology | Recombinase-based logic and memory systems |
US9691017B2 (en) | 2012-12-13 | 2017-06-27 | Massachusetts Institute Of Technology | Recombinase-based logic and memory systems |
US10669558B2 (en) | 2016-07-01 | 2020-06-02 | Microsoft Technology Licensing, Llc | Storage through iterative DNA editing |
US10892034B2 (en) | 2016-07-01 | 2021-01-12 | Microsoft Technology Licensing, Llc | Use of homology direct repair to record timing of a molecular event |
US11359234B2 (en) | 2016-07-01 | 2022-06-14 | Microsoft Technology Licensing, Llc | Barcoding sequences for identification of gene expression |
CN113462710A (en) * | 2021-06-30 | 2021-10-01 | 清华大学 | Random rewriting DNA information storage method |
CN113462710B (en) * | 2021-06-30 | 2023-07-11 | 清华大学 | A Randomly Rewritable DNA Information Storage Method |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
AU2013358998B2 (en) | Recombinase-based logic and memory systems | |
Bonnet et al. | Rewritable digital data storage in live cells via engineered control of recombination directionality | |
US8645115B2 (en) | Modular nucleic acid-based circuits for counters, binary operations, memory and logic | |
Bradley et al. | Tools and principles for microbial gene circuit engineering | |
US10607716B2 (en) | Synthetic biology tools | |
Stark | Making serine integrases work for us | |
US9718858B2 (en) | Tunable control of protein degradation in synthetic and endogenous bacterial systems | |
Fernandez-Rodriguez et al. | Memory and combinatorial logic based on DNA inversions: dynamics and evolutionary stability | |
US9697460B2 (en) | Biological analog-to-digital and digital-to-analog converters | |
WO2013169867A1 (en) | Methods and compositions for rewritable digital data storage in live cells | |
US10480009B2 (en) | Biological state machines | |
Jayaram et al. | An overview of tyrosine site‐specific recombination: From an FLP perspective | |
An et al. | Synthetic ratio computation for programming population composition and multicellular morphology | |
Chen et al. | Rational construction of a cellular memory inverter | |
Zhao | Engineering serine integrase-based synthetic gene circuits for cellular memory and counting | |
Guharajan | The Role of Binding Sequence, Position, and Promoter Strength on the Regulatory Modes of E. coli Transcription | |
Bowyer et al. | Development and experimental validation of a mechanistic model of a recombinase-based temporal logic gate | |
WO2024163909A2 (en) | Performance prediction of fundamental transcriptional programs | |
Jayaram et al. | An Overview of Tyrosine Site-specific Recombination: From an Flp Perspective |
Roquet | Synthetic Recombinase-Based State Machines | |
Gander | Rational design and implementation of synthetic genetic digital logic circuits in Saccharomyces cerevisiae | |
Sun | Biological Programmable Logic Device in Escherichia coli | |
Ham | Artificial genetic inversion recombination switch | |
Lange et al. | Tunable synthetic transcription factors assembled from TALE-like repeat elements | |
Jakimo | Genomic nucleic acid memory storage with directed endonucleases |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | 121 | Ep: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 13788470; Country of ref document: EP; Kind code of ref document: A1 |
 | NENP | Non-entry into the national phase | Ref country code: DE |
 | 122 | Ep: PCT application non-entry in European phase | Ref document number: 13788470; Country of ref document: EP; Kind code of ref document: A1 |