METHODS AND NUCLEIC ACID VECTORS FOR RAPID AND PARALLEL
ASSAY DEVELOPMENT, FOR CHARACTERIZATION OF THE ACTIVITIES
OF BIOLOGICAL RESPONSE MODIFIERS
Field of the invention
The present invention relates to the fields of molecular biology and drug discovery. More particularly, the invention relates to methods and constructs useful for identifying genes that are modulated by the activity of other gene products provided heterologously to the engineered cell or by other response modifiers provided to the engineered cells.
Background of the Invention
The promoter trap or gene trap is a useful tool for studying the distribution and activation of promoters in a cellular system. In its simplest form, a promoter trap consists of a reporter gene, lacking a promoter of its own, randomly integrated downstream of an endogenous promoter. The endogenous promoter drives transcription of the reporter gene, allowing one to study factors that influence the promoter, and to identify tissues in which the promoter is active, by detecting the transcription and/or translation of the reporter gene. Previous workers have used promoter trap reporter vectors to study gene regulation in several settings. H. von Melchner et al., Proc Natl Acad Sci USA (1990) 87:3733-37; and H. von Melchner and H.E. Ruley, Environ. Health Persp. (1990) 88:141-48 disclosed the construction of retroviral-based promoter traps employing neoR (under control of a tk promoter) and either histidinol dehydrogenase (hisR) or β-gal, all located within the 3' LTR U3 region, to indicate promoters in 3T3 cells. Only 1 out of 2500 integrations resulted in hisR expression.
In other experiments G. Friedrich and P. Soriano, Genes & Development (1991) 5:1513-23 inserted promoter trap constructs into murine embryonic stem cells to study gene function. The constructs contained a splice acceptor site, β-galactosidase (the
reporter gene), and neoR under the control of a phosphoglycerate-kinase-1 (PGK) promoter, or a fusion protein of β-galactosidase to neoR ("βgeo"), positioned in the retroviral vector between the 3' and 5' LTR in the 3' to 5' orientation. The constructs were inserted by electroporation (as linear DNA), or by transfection using vectors having retroviral LTRs. P. Natarajan and C.A. Boulter, Nuc Acids Res (1995) 23:4003-04, prepared a similar construct using a hygromycin resistance-TK fusion protein instead of neomycin resistance.
Another variation on the use of promoter traps was revealed by L.M. Forrester et al., Proc Natl Acad Sci USA (1996) 93:1677-82, who disclosed a promoter trap construct comprising a splice acceptor, start codon, lacZ reporter, and neoR under control of a PGK1 promoter. The construct was used to study genes induced or repressed by addition of retinoic acid.
An interesting use of FACS allowed promoter trapping in studies by J.A. Gogos et al., J Virol (1997) 71:1644-50. These studies used promoter trap vectors arranged according to Soriano et al., containing β-gal and neoR, and isolated β-gal-expressing cells by FACS.
Via use of panning and complement mediated negative selection, T. Andreu et al., J Biol Chem (1998) 273:13848-54, disclosed a retroviral promoter trap construct arranged within the U3 region of the 3' LTR. This construct used human cell surface receptor CD2 and human placental alkaline phosphatase (SEAP), where SEAP was separated from the CD2 gene by an internal ribosomal entry site (IRES) to permit translation of both genes from a bicistronic transcript. The vector further included neoR under control of a TK promoter, which was used to select for high-titer virus producer cells. HER1KA1 cells were infected and panned for CD2 expression using a IgGrCD2 ligand fusion protein. Recovered cells were then treated with a transforming concentration of EGF, followed by lysis with anti-CD2 and complement, to eliminate cells having the promoter trap integrated behind a constitutively expressed promoter and select for promoters repressed
by EGF. The surviving cells were analyzed for EGF regulation by detecting CD2 presence or SEAP activity.
Genes regulated by cell stimulation by extracellular effectors were trapped in the studies of M. Whitney et al., Nature Biotech (1998) 16:1329-33. This paper disclosed an experiment in which cells electroporated with a vector comprising β-lactamase flanked by a splice acceptor and a polyadenylation sequence were sorted by FACS based on their response to two stimuli, alone or in combination.
Summary of the Invention
One aspect of the invention is a reporter construct, comprising polynucleotides encoding a reporter, a positive selectable marker, and a negative selectable marker.
Another aspect of the invention is a host cell comprising a reporter construct of the invention integrated within the host cell genome.
Another aspect of the invention is a panel of host cells, each comprising a reporter construct of the invention integrated within the host cell genome, where each cell contains approximately one reporter construct, and a plurality of different host promoters are trapped.
Another aspect of the invention is a method for elucidating intracellular signal pathways, by providing a plurality of host cells, each host cell comprising a reporter construct of the invention and in addition a regulated heterologous gene, or cDNA, or cellular response modifier of interest; alteration of the expression of the heterologous gene in response to an activity of the response modifier will allow determination and isolation of those host cells which express the reporter gene. Determination of the gene flanking the reporter construct integration site will provide identification of genes regulated by the heterologous gene or cDNA.
Another aspect of the invention is a method for preparing a panel of reporter cells for use in assaying compounds for activity against a selected gene or cDNA or cellular response modifier, by providing a plurality of cells comprising an integrated reporter construct of the invention and, in addition, said selected gene or cellular response modifier, inducing said selected gene or cellular response modifier, and selecting those cells which alter reporter activity in response to chemical manipulation of the cellular response modifier.
Another aspect of the invention is a method for identifying a promoter modulated by a particular selected response modifier by integrating a construct of the invention into a plurality of cells, selecting for only those cells that integrated the vector, selecting against cells wherein the construct has integrated next to a constitutive promoter, and selecting for cells that express the construct-encoded protein when the level of the selected response modifier is altered. Alternatively, one may select against cells that express the construct-encoded protein when the level of the selected response modifier is altered.
Another aspect of the invention is a method for identifying a promoter modulated by a particular selected gene, cDNA or cellular response modifier by integrating a reporter construct of the invention into a plurality of cells, selecting reporter-nonexpressing cells in the absence of the response modifier, and selecting for reporter-expressing cells in the presence of the response modifier. Alternatively, in selection step 1, one could select against cells that express the reporter and a linked selectable marker in the absence of the gene, cDNA or response modifier.
Brief Description of the Figures
FIG. 1 is a diagram of a retroviral embodiment of the invention having a 3' LTR (lacking an effective promoter), a splice acceptor site (SA), a GUS-TK fusion protein, an internal ribosomal entry site (IRES), a zeomycin resistance (Zeor) marker, an extended packaging signal (Ψ+), and a 5' LTR.
FIG. 2 is a diagram of another retroviral embodiment of the invention, having a 3' LTR (lacking an effective promoter), a splice acceptor site (SA), a TK-BSD fusion protein for selection against constitutively active promoters and for selection of successful
promoter trap events, respectively, an IRES, a GUS gene for reporter assay and/or FACS sorting, an extended packaging signal (Ψ+), and a 5' LTR.
FIG. 3 is a diagram of another embodiment of the invention, having a 3' LTR (lacking an effective promoter), a splice acceptor site, a hygromycin resistance/TK fusion protein coding sequence, an IRES sequence, a GUS coding region, an extended packaging signal, and a 5' LTR.
FIG. 4 is a diagram of another embodiment, having a 3' LTR (lacking an effective promoter), a splice acceptor site, a BSD coding sequence, an IRES sequence, a GUS coding region, an extended packaging signal, and a 5' LTR.
FIG. 5 is a diagram of another embodiment, having a 3' LTR (lacking an effective promoter), a splice acceptor site, a GUS coding sequence, an IRES sequence, a CD2 coding region, an extended packaging signal, and a 5' LTR.
FIG. 6 is a diagram of another embodiment, a version designed for application via transfection. This vector, pTTrap-Puro, contains a splice acceptor site, a Kozak consensus sequence, a GUS coding sequence, an IRES sequence, and a puromycin resistance coding sequence.
FIG. 7 is a diagram of another embodiment, a version designed for application via transfection. This vector, pTTrap-TK/BSD, contains a splice acceptor site, a Kozak consensus sequence, a GUS coding sequence, an IRES sequence, and a fusion protein encoding Herpes-virus thymidine kinase (TK) and blasticidin S resistance (BSD) sequences.
FIG. 8 is a diagram of another embodiment, a version designed for application via retroviral infection. This vector, pRTrapSin-TK/BSD 3'5', contains a 3' LTR (lacking an effective promoter, known as SIN for self-inactivating), a splice acceptor site, a GUS coding sequence, an IRES sequence, a fusion protein encoding Herpes-virus thymidine kinase (TK) and blasticidin S resistance (BSD) sequences, an extended packaging signal, and a 5' LTR.
FIG. 9 is a diagram of another embodiment, a version designed for application via retroviral infection. This vector, pRTrap-TK/BSD 3'5', contains a 3' LTR, a splice acceptor site, a GUS coding sequence, an IRES sequence, a fusion protein encoding Herpes-virus thymidine kinase (TK) and blasticidin S resistance (BSD) sequences, an extended packaging signal, and a 5' LTR.
FIG. 10 is a diagram of another embodiment, a version designed for application via retroviral infection. This vector, pRTrapSin-Puro 3'5', contains a 3' LTR (lacking an effective promoter, known as SIN for self-inactivating), a splice acceptor site, a GUS coding sequence, an IRES sequence, a puromycin resistance coding sequence, an extended packaging signal, and a 5' LTR.
FIG. 11 is a diagram of another embodiment, a version designed for application via retroviral infection. This vector, pRTrap-Puro 3'5', contains a 3' LTR, a splice acceptor site, a GUS coding sequence, an IRES sequence, a puromycin resistance coding sequence, an extended packaging signal, and a 5' LTR.
Detailed Description
Definitions:
The term "reporter construct" as used herein refers to a polynucleotide that comprises a sequence encoding three elements: (a) a reporter, and/or (b) a positive selectable marker, and/or (c) a negative selectable marker. Two or more of the elements can be fused together in a fusion protein.
The term "reporter" refers to a polynucleotide sequence that is capable of providing a detectable signal upon transcription and/or translation. Translational reporters typically encode an enzyme or other protein that is easily detected, for example by its coloration or by its catalysis of a reaction that produces chromogenic products. Suitable translational reporters include, without limitation, green fluorescent proteins, enhanced yellow fluorescent protein (EYFP), β-galactosidase, β-D-glucuronidase (GUS), SEAP, proteases, nucleases, esterases, lipases, transferases (e.g. acetyl, phosphoryl or glycosyl transferases), oxidases, peroxidases, hydroxylases, cell surface antigens, and the like. Transcriptional reporters can also provide a polynucleotide that can be easily detected and/or quantified, and include without limitation polynucleotides of specific sequence (tags) and ribozymes.
The term "positive selectable marker" refers to a polynucleotide sequence which confers on the host cell a characteristic that permits one to easily identify and separate cells containing the marker (a "sorting marker"), or provides for survival of the host cell under selection conditions that kill essentially all cells not possessing the marker (a "metabolic marker"). Exemplary metabolic markers include, without limitation, neoR and resistance genes for other antibiotics, such as for example zeocin, blasticidin S (e.g., BSD), kanamycin, puromycin, hygromycin, and the like. Alternatively, the positive selectable marker can overcome a metabolic deficiency in an auxotrophic host strain. Sorting markers typically provide an easily detectable signal, such as a fluorophore or chromophore, and enable FACS separation of cells. Alternatively, a sorting marker can encode a cell-surface protein that can be labeled with an appropriately labeled antibody (or other binding partner) for use in FACS, or permit one to pan for host cells using an immobilized binding partner. A variety of different markers can be employed, to provide a panel of transformed host cells that can be analyzed simultaneously.
The term "negative selectable marker" refers to a polynucleotide sequence which confers on the host cell a competitive disadvantage, or permits one to selectively kill host cells that possess and/or express the marker. Exemplary negative selectable markers include, without limitation, thymidine kinase (TK), and antigens that can be recognized with antibodies and complement. "Negative metabolic markers" are those negative selectable markers that can selectively kill or inhibit a host cell through use or interaction with a small molecule or antibiotic, such as, for example, TK and BSD.
The term "host cell" as used herein refers to a eukaryotic cell that is capable of integrating the constructs of the invention, and expressing the proteins encoded therein. Host cells may be altered or mutated to render them more amenable to transformation,
transfection, or infection, and/or to make them more sensitive to a particular signal of interest. For example, if a particular metabolic pathway is under study, one may choose or mutate a host cell that lacks alternate pathways, or in which alternate pathways are disabled. Suitable host cells include, without limitation, fungal cells such as yeasts, plant cells, mammalian cells such as human cells, murine cells, and the like, insect cells such as Drosophila cells and the like, etc.
The term "cellular response modifier" or "response modifier" refers to any manipulation of the host cell that results in a change in activity of any of the plurality of trapped promoters. Response modifiers include but are not limited to temperature, pressure, radiation environment, pH, osmolarity, ionic strength, hormones, antigens, antibodies, chemicals, helper cells, modifier cells, cell density, virus infection, prion infection, bacterial infection, and parasite infection.
The terms "heterologous DNA" or "heterologous gene" refer to all genes or cDNA or functional subfragments that are either foreign to the host cell, or are expressed, transcribed or modulated in a manner other than that found in the wild type host cell.
The term "target gene" refers to any heterologous DNA or gene that confers upon the host cell a change in expression of any of the plurality of trapped promoters. The presence of the cDNA or gene causes these changes in expression either by being translated into proteins which then drive the change in gene expression or by acting directly as the RNA species, for example as an antisense RNA or as a biologically active intracellular signaling RNA.
The term "regulatable promoter" refers to a promoter element and co-expressed promoter-binding factor contained within a cell that responds to changes in the cell's environment by up- or down-regulating the expression of a gene operatively attached to the promoter element. The changes in the cell environment that might be used to regulate the regulatable promoter could be the addition or removal of chemical agents, changes in pH, temperature, osmolarity, radiation or other cell perturbagens. Example regulatable promoter/promoter-binding elements include, without limitation, the tetracycline regulated system, the ecdysone regulated system, the heavy metal regulated metallothionein system, the heat shock regulated system, systems regulated by FK506 derivatives, and various reconstituted Lac regulated systems.
The term "IRES" refers to an internal ribosomal entry sequence, a polynucleotide sequence capable of recognition by a host cell ribosome. Use of an IRES permits one to obtain expression of two or more independent polypeptides from a single mRNA.
The term "retroviral LTR" refers to a polynucleotide sequence comprising an integrase recognition sequence sufficient for a retroviral integrase to integrate a polynucleotide attached to the LTR into a host cell genome. A retroviral LTR facilitates integration if it results in integration more often than a similar vector lacking LTRs.
General Method:
One aspect of the invention is a reporter construct, comprising a polynucleotide encoding a reporter, a positive selectable marker, and a negative selectable marker. At least one of the sequences lacks a functional promoter, and must rely on integration next to a host cell promoter for expression. Constructs preferably comprise a splice acceptor or artificial intron site, a reporter gene, an IRES, a negative selectable marker, a promoter, and a positive selectable marker under control of the promoter; any order of the positive selectable marker, the negative selectable marker, and the reporter is acceptable. Further, the reporter and markers can be situated to be transcribed in either direction. The construct can be used as a linear polynucleotide, in which case it can be provided with restriction sites at each end so that it can be cleaved from a plasmid easily. Alternatively, the construct can be embedded in a retroviral vector, and further comprise those viral components necessary for packaging into the viral capsid (for example, with the help of a packaging cell line) and integration into the host genome. The construct can also comprise additional markers and reporter genes, and/or a cDNA under control of a different promoter.
Another aspect of the invention is a host cell comprising a reporter construct of the invention randomly integrated within the host cell genome. The host cell can be further engineered to provide a heterologous cDNA or gene, which can be provided either on a plasmid or other extra-genomic vehicle, or can be integrated into the host genome. The heterologous cDNA or gene is preferably provided with a regulatable promoter; however, use of a collection of variously efficient promoters, infection by a virus at various multiplicities of infection, or transient transfection is also considered satisfactory.
Another aspect of the invention is a panel of host cells, each comprising a reporter construct of the invention integrated within the host cell genome, where each cell contains approximately one active reporter construct, and a plurality of different host promoters are trapped. The panel of host cells can be further engineered to provide a heterologous cDNA or gene, under the control of a promoter, to enable one to study the effect of the cDNA or gene and its product on a plurality of promoters simultaneously. The panel of host cells can be provided with a plurality of different reporter genes, such that one can distinguish host cells having constructs integrated at different positions on the basis of their reporter genes, thus enabling one to assay a plurality of different host cells (and thus, a plurality of different promoters) simultaneously, in the same vessel.
Another aspect of the invention is a method for identifying genes which are regulated by cDNAs provided to the host cells, by providing a plurality of host cells, each host cell comprising a reporter construct of the invention and a regulated heterologous gene of interest, inducing expression of the heterologous gene, and determining which host cells express the reporter gene. This method further allows one to determine the genes flanking the reporter construct integration sites, and thus suggests genes that are regulated by the cDNA or gene or cellular response modifier. Identifying genes regulated by the cDNA or gene or cellular response modifier is useful since it can provide information about cell pathways and cellular targets of the cDNA or gene or response modifier; this latter information is particularly useful in the context of cDNAs or genes uncovered during genomics projects. The subset of host cells that respond to induction of the heterologous gene or cDNA can further be used to study compounds and genes that modulate the effect of the heterologous gene or cDNA.
Another aspect of the invention is a method for preparing a panel of reporter cells for use in assaying compounds for activity against a selected gene, by providing a plurality of cells comprising an integrated reporter construct of the invention and said selected gene, inducing said selected gene, and selecting those cells which display reporter activity in response to induction of the selected gene.
Another aspect of the invention is a method for identifying a promoter modulated by a particular selected signal, by integrating a reporter construct of the invention into a plurality of cells, selecting for only those cells that integrated the vector, selecting against cells wherein the construct has integrated next to a constitutive promoter, and selecting for cells that express the reporter in the presence of the selected signal. Alternatively, one may select against cells that express the reporter in the presence of the selected signal.
The promoter trap-reporter vectors of the invention differ from others previously described by von Melchner et al., in that they provide the following advantageous features:
a) The placement of all of the reporter and selectable markers inside the body of a retrovirus vector should provide for the production of higher titer virus than other arrangements, such as placement within the U3 region of the 3' LTR.
b) This arrangement of the reporter and selectable markers also allows for much larger pieces of DNA to be placed in the retroviral vector than other arrangements, such as placement within the U3 region of the 3' LTR.
c) The arrangement of the reporter within the body of the retrovirus prevents the creation of a direct duplication of the reporter-selectable marker piece of the vector in the final integrated viral configuration. Such direct duplication sequences in the human genome are known to create unstable genetic arrangements; unstable arrangements are disadvantageous for chemical screening and envisioned uses of the promoter-trapped engineered cells.
d) In addition, the direct duplication of the reporter in U3-3' LTR based retroviral vectors creates a situation wherein the reporter can be transcribed and translated using the positive selectable marker promoter present in those vectors; this reporter activity would create excessive undesirable background over which a trapped-regulated promoter would have to elevate activity. This would create an undesirably low signal-to-noise situation.
e) The selection of herpes TK as the negative selectable marker and other potentially positive selectable markers (neo, hygro, blasticidin, and the like) allows parallel selection of many promoter-trapped engineered cells simultaneously due to the ease of these selections. These easy selections are much more robust and less labor demanding than antibody and complement mediated negative selections, and could avoid the costly equipment and highly trained personnel needed for flow-cytometer mediated negative selection.
Constructs of the invention are prepared using standard laboratory techniques. In general, a construct of the invention comprises a reporter gene, a first positive selectable marker, and a second negative selectable marker. The construct may be provided as a retroviral vector, or as a linear polynucleotide. The positive selectable marker is provided with its own promoter, so that one can eliminate all cells which were not transformed or failed to integrate the construct, or otherwise fail to express proteins from the construct. The positive selectable marker is preferably one that permits survival of the transformed host cells under selection conditions, or that provides a surface antigen suitable for panning or affinity selection, preferably one that confers survival. Typical positive selectable markers include genes for antibiotic resistance or metabolism, and enzymes that permit the host cell to live on media lacking otherwise essential nutrients. The positive selectable marker is provided with a promoter that may be either constitutive or regulated, and may
be heterologous to the host. Alternatively, the promoter for the positive selectable marker may be the "trapped" host promoter.
The reporter gene and negative selectable marker can be arranged to form a bicistronic mRNA upon transcription; another possible arrangement of the reporter and selectable marker is as a fusion protein. The reporter gene is preferably (but not necessarily) positioned near the 5' end of the construct, preceded by a splice acceptor site, so that it will be as close as possible to a host cell promoter following integration into the host cell genome. The negative selectable marker can be positioned downstream from the reporter gene, and may have an intervening IRES sequence to permit efficient expression of both genes.
The design of the vectors allows many variations upon the selection markers, reporter, and their relative configuration. For example, the positive selection markers can include, without limitation, blastR, BleoR, Neo, hygromycinR, gpt, TK, and the like. The negative selection markers suitable in the invention include, without limitation, cell surface markers, gpt, TK, and the like. The GUS reporter can be substituted by lacZ, SEAP, β-lactamase, luciferase, fluorescent proteins, and the like. It is also possible to design a cell in which a single cell surface protein can serve all three functions: positive selection can be achieved by panning or cell sorting, negative selection can be accomplished by antibody-mediated complement lysis, and reporter assay can be performed via ELISA. The markers and reporters can be in a wide variety of configurations using strategies similar to those of vector I and II construction (FIG. 1 and FIG. 2, respectively). For example, the three elements can be provided as a single fusion protein, a bifunctional fusion protein plus a separate protein directed by an IRES sequence, three single open reading frames separated by IRES elements, and the like. The fusion proteins can be generated with or without various linker sequences, such as Ile-Ser-Gly-Ala-Asn-Gly-Ala, Gly-Gly-Ile-Pro-Arg-Gly-Ser-Ala-Thr-Met-Ala, Glyu, and the like.
Constructs of the invention can be prepared in the form of circular plasmids or linear polynucleotides, or can be embedded in retroviral transfection vectors or other viral
vectors. If prepared as a plasmid, the plasmid is preferably linearized prior to transfection. Retroviral vectors can be prepared using standard techniques, for example as described by A.D. Miller and G.J. Rosman, BioTechniques (1989) 7:980-90, incorporated herein by reference. Retroviral packaging, selection, cell sorting, panning, transfection or infection, and other techniques are conducted using standard laboratory procedures.
The host cells employed in the methods of the invention are preferably eukaryotic, and can be cells derived from multicellular organisms, such as mammals, insects, birds, or other vertebrates, or can be a unicellular organism such as a yeast; finally, single and multicellular plants are also acceptable hosts. Cells derived from multicellular organisms can be permanent cell cultures or primary cultures, preferably immortalized permanent cell cultures. The host cells are transformed, transfected, or infected by standard procedures, depending on the cell type and the form of the vector used. The host cell can additionally comprise a heterologous gene or cDNA, preferably under the control of a regulatable promoter. Alternatively, the host cell can have a native gene that has been placed under the control of a regulatable promoter, permitting manipulation of the expression level. A vector of the invention can be introduced into the host cell either before or after introduction of other heterologous sequences.
Once the vector is constructed, it can be tested as follows: the purified plasmid DNA can be transiently transfected into a packaging cell line, such as for example PT67 (Clontech, Palo Alto, CA), Retropack-293 (Clontech), BOSC23 (W.S. Pear et al., Proc Natl Acad Sci USA (1993) 90(18):8392-6), or Bing (ATCC CRL11270). The culture supernatant is harvested as the retroviral stock. To determine the titer of the virus, a serial dilution of the supernatant is used to infect a target cell line, such as 3T3, and the percentage of reporter-positive (e.g., GUS+) cells is determined by FACS analysis. A successful retroviral vector will have an unconcentrated titer of 10° pfu/ml or higher (G. Friedrich and P. Soriano, Genes and Dev. (1991) 5:1513-23), preferably 103 pfu/ml or higher. To obtain larger quantities of stock, the construct can be stably introduced into the packaging cells, and clones producing high-titer virus are identified in the same manner. These clones can then be expanded to produce the desired amount of supernatant. The resulting viral stock can be concentrated and stored at -80°C for later use.
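The titer determination described above can be sketched as a short calculation. This is a minimal illustration, not a procedure stated in this specification: it assumes the commonly used estimate in which the titer equals the fraction of reporter-positive cells, multiplied by the number of target cells at infection and the dilution factor, divided by the inoculum volume. The function name, parameter names, and example numbers are hypothetical. The estimate is only meaningful at dilutions where the positive fraction is low, so that most positive cells carry a single integration.

```python
def titer_per_ml(fraction_positive, cells_at_infection, dilution_factor, inoculum_ml):
    """Estimate infectious units per ml of undiluted supernatant.

    fraction_positive  -- fraction of reporter-positive (e.g. GUS+) cells by FACS, 0..1
    cells_at_infection -- number of target cells present when virus was added
    dilution_factor    -- e.g. 100 for a 1:100 dilution of the supernatant
    inoculum_ml        -- volume of diluted supernatant applied, in ml
    """
    if not 0.0 <= fraction_positive <= 1.0:
        raise ValueError("fraction_positive must be between 0 and 1")
    if cells_at_infection <= 0 or dilution_factor < 1 or inoculum_ml <= 0:
        raise ValueError("cells, dilution factor, and volume must be positive")
    return fraction_positive * cells_at_infection * dilution_factor / inoculum_ml


# Hypothetical example: 5% GUS+ cells among 1e5 target cells infected
# with 1 ml of a 1:100 dilution of the supernatant.
example_titer = titer_per_ml(0.05, 1e5, 100, 1.0)
print(f"estimated titer: {example_titer:.1e} per ml")
```

In practice one would compute this for each dilution in the series and rely on the dilutions that give a low positive fraction.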
In one method of the invention, host cells are infected with a retroviral vector as depicted in FIG. 1, comprising a promoter-disabled 3' LTR, a splice acceptor site, a GUS sequence (providing a detectable label) fused to a TK sequence (providing a negative selectable marker), an IRES followed by a Zeor (providing a positive selectable marker), a packaging signal, and a 5' LTR. After introduction into the host cell, the vector integrates at a random location. Since no promoter is provided in the vector, only vectors that integrate downstream from an active promoter will express the sequences encoded in the vector. Transformed host cells are selected using the positive selectable marker, which can be provided with its own promoter if desired. For example, in cases in which the positive selectable marker is Zeor, transformed cells are grown in the presence of sufficient concentrations of the antibiotic to distinguish between transformed and non-transformed cells. Alternatively, the host cells can first be selected for integration behind a non-constitutive promoter, using the negative selectable marker. For example, where the negative selectable marker is TK, the host cells are cultured in media containing a sufficient amount of TK-activated microbicide (for example ganciclovir) to distinguish between cells expressing TK and those that are not expressing TK. The amount of compound used can be titrated to select those promoters having the degree of control desired. For example, if a relatively low concentration of compound is employed, one would eliminate only those cells that express TK to a large degree. Conversely, if a high concentration of compound is used, only those promoters that result in little or no expression of TK will survive selection. At concentrations intermediate between those extremes, one can find promoters that are partially activated, that provide a constitutive but low level of expression.
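The interplay of negative and positive selection, and the effect of titrating the negative-selection compound, can be illustrated with a toy model. Everything in this sketch is an assumption made for illustration: the class and function names, the arbitrary expression units, and the simple rule that a cell dies when the product of its TK expression and the compound concentration reaches a fixed limit. It is not a quantitative model of ganciclovir killing.

```python
from dataclasses import dataclass


@dataclass
class Integrant:
    name: str
    basal: float    # trapped-promoter activity without the response modifier
    induced: float  # trapped-promoter activity with the response modifier


def survives_negative(expression, drug_conc, kill_limit=1.0):
    # Toy rule (assumption): more TK or more compound means more killing.
    return expression * drug_conc < kill_limit


def survives_positive(expression, min_expression=1.0):
    # Toy rule (assumption): the resistance marker must reach a minimum level.
    return expression >= min_expression


def select_inducible(cells, drug_conc):
    """Negative selection without the modifier, then positive selection with it."""
    return [
        c.name
        for c in cells
        if survives_negative(c.basal, drug_conc) and survives_positive(c.induced)
    ]


# Hypothetical integration events.
cells = [
    Integrant("constitutive", basal=10.0, induced=10.0),
    Integrant("silent", basal=0.0, induced=0.0),
    Integrant("tight", basal=0.05, induced=8.0),
    Integrant("leaky", basal=0.5, induced=8.0),
]

print(select_inducible(cells, drug_conc=1.0))  # lower concentration
print(select_inducible(cells, drug_conc=5.0))  # higher concentration
```

Under these assumed numbers, the lower concentration removes only the constitutive integrant, while the higher concentration also removes the leaky one, mirroring the titration described in the paragraph above.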
After positive and negative selection, the remaining host cells will all have a vector construct of the invention operatively associated with regulated promoters. It is presently preferred to transform a sufficient number of host cells to provide a panel of test cells in which essentially every regulatable promoter is represented by at least one host cell having a construct of the invention in operative association therewith (a "complete" panel of host cells). The panel can be arranged in an array, if desired, such that cells having known characteristics are identified by position within the array. Cells can be characterized by inverse PCR to identify the promoter associated with each reporter construct. Alternatively, one can identify and arrange cells simply based on their response to various response modifiers. For example, cells that exhibit a response to reduced levels of growth factors can be grouped in one section on the basis of which factor induces a response, while cells that respond to changes in temperature are grouped in a second section. The array can take any convenient form; for example, the array can consist of a series of 96-well microtiter plates.
The panel is then exposed to a variety of response modifiers, for example by culturing under varying conditions of temperature, media composition (including the presence or absence of growth factors, concentration of nutrients and gasses, and the like), osmotic pressure, test compounds, heterologous gene products, and the like, and is then examined for the presence of a detectable signal from the reporter. For example, a panel of test cells can be transfected with one or more heterologous genes or cDNAs (under control of a promoter that allows controlled overexpression) to form a panel of surrogate host cells. Cells that exhibit a signal are then identified, and the promoter responsible for the reporter expression can be identified (e.g., by inverse PCR, if not already identified, or by position in the array if already identified). It is expected that each cDNA or gene will affect a number of different promoters, and that cDNAs or genes that affect the same signal pathways within the cell will affect many of the same promoters. The particular promoters affected thus provide a "profile" of activity, consisting of a list of the affected promoters, and optionally the degree of activation or inhibition. Thus, one can identify the function and activity of new cDNAs and genes by comparing their profiles.
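One way to compare such profiles can be sketched as follows. This is a minimal illustration only: the use of the Jaccard index as the similarity measure, the cDNA labels, and the promoter identifiers are assumptions introduced here and are not specified in the description above.

```python
# Sketch: compare activity "profiles" (sets of affected promoters) between cDNAs.
# The Jaccard index is an assumed similarity measure; the labels are hypothetical.


def jaccard(profile_a, profile_b):
    """Fraction of promoters shared between two profiles (1.0 = identical)."""
    a, b = set(profile_a), set(profile_b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def rank_by_similarity(query, profiles):
    """Rank all other profiles by similarity to the query, best match first."""
    scored = [
        (name, jaccard(profiles[query], affected))
        for name, affected in profiles.items()
        if name != query
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical profiles: promoters whose reporter signal changed for each cDNA.
profiles = {
    "cDNA_A": {"P1", "P2", "P3", "P4"},
    "cDNA_B": {"P2", "P3", "P4", "P5"},
    "cDNA_C": {"P7", "P8"},
}

ranking = rank_by_similarity("cDNA_A", profiles)
print(ranking)
```

In this sketch, a cDNA of unknown function that ranks close to a characterized gene would be a candidate member of the same signal pathway. A weighted measure could be substituted if the degree of activation or inhibition is recorded.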
A panel of surrogate host cells, having one or more heterologous genes or cDNAs of interest ("target genes"), can then be used to screen compounds for the ability to modulate the function of the target genes. In cells that change a detectable signal when the target gene is induced, an active compound (an inhibitor of the target gene or its product) is indicated by a change of the signal. It is possible that a compound will inhibit less than all of a target gene's activities: this situation is indicated by reversal of some, but not all, of the signals in cells having reporters under the control of different promoters. Compounds can also be tested against a panel comprising a plurality of different target genes, to identify relationships between target genes in terms of the affected pathways. The panel of cells can be used as sensors for the stimulation of different pathways. To identify additional genes within the pathway, one can introduce a cDNA library into the reporter cells, and identify those cDNA clones that affect the expression pattern of the reporter gene similarly to the index gene of the pathway.
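The distinction between complete and partial inhibition can be sketched in Python as follows. The fold-change cutoff, the tolerance for calling a signal "reversed", the clone names, and the signal values are hypothetical assumptions for illustration, and baseline signals are assumed to be positive.

```python
# Sketch: flag reporter clones whose target-gene-induced signal change is
# reversed by a test compound. Cutoffs and data are hypothetical assumptions.


def classify_reversal(baseline, induced, treated, min_fold=2.0, tolerance=0.25):
    """Return (responsive, reverted) clone lists.

    responsive: clones whose signal changes at least min_fold (up or down)
                when the target gene is induced.
    reverted:   responsive clones whose signal, with compound present, returns
                to within `tolerance` (as a fraction) of the baseline signal.
    """
    responsive, reverted = [], []
    for clone in sorted(baseline):
        base, ind, trt = baseline[clone], induced[clone], treated[clone]
        fold = ind / base  # assumes base > 0
        if fold >= min_fold or fold <= 1.0 / min_fold:
            responsive.append(clone)
            if abs(trt - base) <= tolerance * base:
                reverted.append(clone)
    return responsive, reverted


# Hypothetical reporter readings (arbitrary units) for four reporter clones.
baseline = {"c1": 100.0, "c2": 100.0, "c3": 100.0, "c4": 100.0}
induced = {"c1": 400.0, "c2": 300.0, "c3": 25.0, "c4": 110.0}
treated = {"c1": 110.0, "c2": 290.0, "c3": 95.0, "c4": 105.0}

responsive, reverted = classify_reversal(baseline, induced, treated)
partial_inhibitor = 0 < len(reverted) < len(responsive)
print(responsive, reverted, partial_inhibitor)
```

In this hypothetical data set the compound reverses two of the three responsive signals, which corresponds to the situation described above in which a compound inhibits fewer than all of the target gene's activities.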
One of the utilities of the invention is to establish a cell-based reporter system that responds to the expression of a particular gene or heterologous polynucleotide. For example, one can first establish a cell line that expresses a target gene such as PARP, phosphatase X, or an NF-κB subunit in a regulated fashion. The cells are then infected with a retrovirus of the invention, followed by selection for successful gene trapping events using the positive selectable marker (e.g., blastR or ZeoR) in the presence of the inducer for the target gene. Cells in which a constitutively expressed gene is trapped by the construct are selected against using the negative selectable marker (e.g., ganciclovir for TK expression) after turning off the target gene. Additionally, the reporter gene expression can be used in combination with the above selection scheme to isolate and characterize those cells that express the reporter gene only in response to the expression of the target gene. Such artificial reporter cells can then serve as a cell-based compound screening reagent against the target gene. Alternatively, the cells can be used to study the unknown biological function of the target gene if one recovers and characterizes the trapped host genes or promoters.
One can also use the gene trap system of the invention in the analysis of biological pathways in various cell types (see, e.g., M. Whitney et al., Nature Biotechnology (1998) 16:1329-33). For example, one can perform the gene trap experiments using the retroviral vectors of the invention in the B cell line CL-01, which undergoes isotype switching in response to IL-4 and CD40L stimulation. Through both positive and negative selections, one can quickly identify cell clones and their trapped genes that are responsive to IL-4, an important mediator of IgE production. Such material and information will be highly valuable not only in understanding the mechanism of isotype switching, but also in the discovery of potential allergy drugs that may interfere with IgE production.
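The two-step selection scheme for target-responsive clones can be expressed as simple boolean logic. The sketch below is an illustration under stated assumptions: each clone is reduced to two hypothetical flags (whether the trapped promoter is active with the target gene on, and with it off), and selection is treated as all-or-nothing.

```python
# Sketch of the positive/negative selection logic for target-responsive clones.
# Clone names and the all-or-nothing selection model are assumptions.
from dataclasses import dataclass


@dataclass(frozen=True)
class Clone:
    name: str
    active_with_target_on: bool   # trapped promoter active when target is induced
    active_with_target_off: bool  # trapped promoter active when target is off


def passes_selection(clone):
    # Positive selection (inducer present): the resistance marker must be expressed.
    survives_positive = clone.active_with_target_on
    # Negative selection (inducer removed): TK must be silent for the cell to
    # survive the TK-activated compound.
    survives_negative = not clone.active_with_target_off
    return survives_positive and survives_negative


candidates = [
    Clone("constitutive", True, True),
    Clone("target_induced", True, False),
    Clone("silent", False, False),
    Clone("target_repressed", False, True),
]

selected = [c.name for c in candidates if passes_selection(c)]
print(selected)
```

Only the clone whose trapped promoter follows the target gene survives both steps. Reversing the order of the two conditions (positive selection with the target off, negative selection with it on) would in the same way enrich for promoters repressed by the target gene.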
Examples
The following examples are provided as a guide for the practitioner of ordinary skill in the art. Nothing in the examples is intended to limit the claimed invention. Unless otherwise specified, all reagents are used in accordance with the manufacturer's recommendations, and all reactions are performed at standard temperature and pressure.
Example 1
(A) G1 vector:
A plasmid of the invention was constructed as follows: the backbone plasmid used was pMC1NeoPolyA, which contains a neo selection marker driven by the HSV thymidine kinase promoter (Ptk). The transcription blocker (TB) sequence was obtained from the pSEAPbasic2 plasmid by PCR using a 5' primer containing a Xho I site and a 3' primer containing a Sal I site. The TB PCR product was digested with Xho I and Sal I and ligated to pMC1NeoPolyA that had been previously digested with Xho I and treated with calf intestinal alkaline phosphatase (CIP). The TB-containing clones were isolated and characterized for the correct orientation with the 3' end next to Ptk, to provide pMC1NeoPolyA-TB.
The synthetic intron/internal ribosomal entry site/enhanced yellow fluorescent protein (IVS-IRES-EYFP) sequence was obtained from pIRES-EYFP (Clontech) by PCR using a 5' primer containing a Xho I site and a 3' primer containing a Sal I site. The PCR product was digested with Xho I and Sal I and ligated to pMC1NeoPolyA-TB that was previously treated with Xho I and CIP. The IVS-IRES-EYFP containing clones were isolated and checked for the correct orientation with the 3' end next to the TB to provide the G1 vector.
(B) G2 vector:
Plasmid pIRES-EYFP was digested with Age I and treated with CIP. HSV thymidine kinase (TK) was obtained from HSV DNA by PCR using primers both containing the Age I site. The TK PCR product was digested with Age I and ligated to the pIRES-EYFP treated with Age I and CIP. TK containing clones were isolated and characterized for the correct orientation with the 3' end fused to the 5' end of EYFP.
The IVS-IRES-TK-EYFP sequence was amplified using a 5' primer containing a Sal I site and a 3' primer containing a Xho I site. The PCR product was digested with Xho I and Sal I and ligated to pMC1NeoPolyA that was previously digested with Xho I and treated with CIP. Insert-containing clones were isolated and characterized for the correct orientation with the 3' end next to the TB to provide the G2 vector.
(C) G3 vector:
The G3 vector has an additional element added to the G2 vector between the EYFP and the TB sequences. This element contains an IRES sequence followed by a reporter gene, EGFP, fused to a positive selection marker, Zeocin. The Zeocin sequence is obtained from the pTracer-CMV2 vector (Invitrogen) by PCR using in-frame linker encoding primers. The 3' primer contains the Xho I site. The IRES-EGFP-Zeocin fusion cassette is obtained by PCR using overlapping templates of IRES-EGFP and Zeocin sequences. The 5' primer of IRES-EGFP is designed to contain a Xho I site. The IRES-EGFP-Zeocin sequence is digested with Xho I and cloned into Xho I digested G2 vector. The G3 vector is obtained by isolating clones containing the right orientation of the IRES-EGFP-Zeocin sequence.
(D) G4 vector:
The G4 vector contains a TK/BSD fusion (see description in Vector II) for negative and positive selection, and a cell surface marker as reporter. An example of the cell surface marker can be obtained from pDisplay (Invitrogen), in which the reporter protein has a unique Ig kappa-chain secretion signal at the N-terminus and a transmembrane anchoring domain of the platelet-derived growth factor receptor (PDGFR) at the C-terminus. The mouse IgG sequence can be cloned into pDisplay. The cell surface anchored IgG sequence from pDisplayIgG is amplified and cloned directly behind an IRES sequence. The G4 cassette sequence is obtained by PCR using two overlapping templates, a: the TK/BSD fusion sequence preceded by a splice acceptor sequence; b: the IRES-IgG sequence. The cassette is then cloned into pMC1NeoPolyA-TB to give rise to the G4 vector.
(E) G5 vector: pTTrap-Puro
The G5 vector uses β-glucuronidase (eGUS) as the reporter protein. The eGUS gene with a Kozak sequence was amplified from E. coli and cloned into the pcDNA3.1 vector (Invitrogen), creating pcDNA3.1eGUS. pIRESPuro (Clontech) was digested with BamHI and PstI. The ends of the digested vector were subsequently blunted using Klenow and then ligated to create pIRESPuro*, which lacks the IVS. A splice acceptor site was created using the annealed oligos SA-Bgl 2.1 and SA-Bgl 2.2.
SA-Bgl 2.1 = gatctgtttaaacgaattcATCTCAGTTCGGTGTAGGTCGTTCGCTCCAAGCTGGGCTGTGTGCaagcttagatctatgcatg
SA-Bgl 2.2 = gatccATGCATAGATCTAAGCTTGCACACAGCCCAGCTTGGAGCGAACGACCTACACCGAACTGAGATgaattcGTTTAAACa
The annealed oligos were digested with HindIII and ligated with the 6.46Kb BglII and HindIII fragment of pcDNA3.1eGUS, creating pcDNA3.1-CMV+SA+eGUS. This process removes the CMV promoter upstream of the eGUS gene. The 1.99Kb BglII-NotI fragment of pcDNA3.1-CMV+SA+eGUS containing the splice acceptor and eGUS gene was ligated into pIRESPuro* digested with BglII and NotI to create pTTrap-Puro. Thus the G5 vector contains the SA-eGUS-IRES-Puro sequence in the pIRESPuro backbone.
SA oligonucleotide sequences:
1) 5'-GAT CTG TTT AAA CGA ATT CAT CTC AGT TCG GTG TAG GTC GTT CGC TCC AAG CTG GGC TGT GTG CAA GCT TAG ATC TAT GCA TG-3'
2) 5'-GAT CCA TGC ATA GAT CTA AGC TTG CAC ACA GCC CAG CTT GGA GCG AAC GAC CTA CAC CGA ACT GAG ATG AAT TCG TTT AAA CA-3'
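The complementarity of the two splice acceptor oligonucleotides listed above can be checked with a short Python sketch. The function names are introduced here for illustration, and the sketch assumes that the two oligos anneal with equal-length 5' overhangs; it only inspects the sequences and does not model the ligation itself.

```python
# Sketch: verify that SA-Bgl 2.1 and SA-Bgl 2.2 anneal to a duplex and report
# the single-stranded 5' overhangs. Helper names are illustrative assumptions.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def five_prime_overhangs(top, bottom):
    """Return (top overhang, bottom overhang), assuming equal-length 5' overhangs."""
    top, bottom = top.upper(), bottom.upper()
    for n in range(min(len(top), len(bottom))):
        if revcomp(top[n:]) == bottom[n:]:
            return top[:n], bottom[:n]
    raise ValueError("oligos are not complementary")


SA_BGL_2_1 = ("gatctgtttaaacgaattcATCTCAGTTCGGTGTAGGTCGTTCGCTCCAA"
              "GCTGGGCTGTGTGCaagcttagatctatgcatg")
SA_BGL_2_2 = ("gatccATGCATAGATCTAAGCTTGCACACAGCCCAGCTTGGAG"
              "CGAACGACCTACACCGAACTGAGATgaattcGTTTAAACa")

overhangs = five_prime_overhangs(SA_BGL_2_1, SA_BGL_2_2)

# Recognition sequences of enzymes whose sites appear in the oligo.
SITES = {"PmeI": "GTTTAAAC", "EcoRI": "GAATTC", "HindIII": "AAGCTT",
         "BglII": "AGATCT", "NsiI": "ATGCAT"}
site_counts = {name: SA_BGL_2_1.upper().count(site) for name, site in SITES.items()}

print(overhangs, site_counts)
```

Both strands carry a four-nucleotide GATC 5' overhang, which is compatible with ends generated by BglII or BamHI, and the top strand contains a single HindIII site.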
(F) G6 vector: pTTrap-TK/BSD
The G6 vector contains the eGUS reporter protein, an IRES and a TK/BSD fusion selection marker (see Vector II in Example 2 for details). Briefly, the IRES sequence with partial TK sequence at the 3' end was PCR amplified from pLXIN (Clontech) and cloned into pcRII vector. The TK/BSD sequence was PCR amplified and cloned into pcRII vector. The TK/BSD sequence was obtained by MluI and NotI digestion and cloned into pcRII-IRES digested with MluI and NotI, generating pcRII-IRES-TK/BSD. The IRES-TK/BSD sequence was obtained by XhoI digestion and cloned into the XhoI site of the pcDNA3.1-eGUS vector, resulting in the G6 vector with SA-eGUS-IRES-TK/BSD in the pcDNA3.1 backbone.
Example 2
(A) Vector I:
The backbone retroviral vector (pSIRΔ) was derived from the pSIR plasmid (Clontech) by deleting the PH4/Neo fragment and substituting the EcoRI site in the polylinker region with a NotI site. The GUS/TK fusion gene with splice acceptor site and Kozak sequence was created by PCR using the following primers:
1) GUS5: 5'-agc ttg cgg ccg ctg act ctc tct gtc gac gga tcc cct ttt ttc tag gcc gcc acc atg gtc cgt cct gta gaa acc cc-3'
2) GUS3: 5'-cga agc acc ggc gcc att agc tcc gga gcc ttg ttt gcc tcc ctg ctg cg-3'
3) TK5: 5'-ggc tcc gga gct aat ggc gcc ggt gct tcg tac ccc tgc cat caa-3'
4) TK3: 5'-gcg gga tcc tca gtt agc ctc ccc cat ctc cc-3'
The first round of PCR used GUS5/GUS3 and TK5/TK3 primer pairs in combination with plasmid templates pBACgus-1 (Novagen) and pTK173, respectively. The second round of PCR used the GUS5/TK3 primer pair in combination with the products from the previous round of PCR. The final fusion PCR product was cloned into pIRESBelo vector (Clontech) to generate the entire marker/reporter cassette. This cassette is in turn cloned into the retroviral vector pSIRΔ to provide the final construct.
(B) Vector II:
The backbone retroviral vector (pSIRΔ) was derived from the pSIR vector as described in part (A) above. The TK/BSD fusion gene (C. Karreman, Nuc Acids Res (1998) 26:2508-10) with an upstream splice acceptor site and Kozak sequence was generated by PCR using the following primers:
1) TK(BSD)5: 5'-c ggg atc ctg act ctc tct gtc gac gga tcc cct ttt ttc tag gcc gcc acc atg gct tcg tac ccc tgc cat caa-3'
2) TK(BSD)3: 5'-gga ttc ttc ttg aga caa agg act agt gtt agc ctc ccc cat ctc ccg-3'
3) BSD5: 5'-atg ggg gag gct aac act agt cct ttg tct caa gaa gaa tcc acc-3'
4) BSD3: 5'-cgg aag ctt tta gcc ctc cca cac ata acc aga g-3'
The first round of PCR used TK(BSD)5/TK(BSD)3 and BSD5/BSD3 primer pairs in combination with plasmid templates pTK173 and pcDNA6/V5-His (Invitrogen), respectively. The second round of PCR used the TK(BSD)5/BSD3 primer pair in combination with the products from the first round of PCR. The final fusion PCR product was first cloned into pcDNA3/V5-His TOPO vector (Invitrogen) for functional testing. The insert was then cloned into the retroviral vector pSIRΔ.
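The two-round fusion PCR relies on the inner primers sharing a complementary junction, so that the first-round products can prime one another in the second round. The sketch below checks this for the TK(BSD)3 and BSD5 primers. It is an illustration under stated assumptions: the primer sequences are transcribed with every letter "e" read as "c", and the helper names are introduced here.

```python
# Sketch: find the junction shared by the inner primers of a fusion
# (overlap-extension) PCR. Primer transcription and helper names are assumptions.

COMPLEMENT = str.maketrans("acgt", "tgca")


def clean(primer):
    """Remove triplet spacing and normalize to lower case."""
    return primer.replace(" ", "").lower()


def revcomp(seq):
    """Reverse complement of a lower-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def junction_overlap(upstream_reverse, downstream_forward):
    """Longest stretch at the 3' end of the upstream fragment's top strand
    that is also the 5' start of the downstream forward primer."""
    top_end = revcomp(clean(upstream_reverse))
    forward = clean(downstream_forward)
    for k in range(min(len(top_end), len(forward)), 0, -1):
        if top_end.endswith(forward[:k]):
            return forward[:k]
    return ""


TK_BSD_3 = "gga ttc ttc ttg aga caa agg act agt gtt agc ctc ccc cat ctc ccg"
BSD_5 = "atg ggg gag gct aac act agt cct ttg tct caa gaa gaa tcc acc"

overlap = junction_overlap(TK_BSD_3, BSD_5)
print(len(overlap), overlap)
```

Under these assumptions the two primers share a 42-nucleotide junction that includes an actagt (SpeI) sequence, which is what allows the TK and BSD products of the first round to anneal and be extended into a single fusion template.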
The IRES/GUS fragment was created by PCR using the following primers:
1) IRES5: 5'-ggg aag ctt gcc cct ctc cct ccc cc-3'
2) IRES3: 5'-ggt ttc tac agg acg gac cat ggt tgt ggc cat att atc atc g-3'
3) GUS(IRES)5: 5'-g gcc aca acc atg gtc cgt cct gta gaa acc cc-3'
4) GUS(IRES)3: 5'-gc gga ctc gag tca ttg ttt gcc tcc ctg ctg-3'
The first round of PCR used primer pairs IRES5/IRES3 and GUS(IRES)5/GUS(IRES)3 in combination with plasmid templates pIRES2-EGFP (Clontech) and pBACgus-1 (Novagen), respectively. The fusion PCR was carried out using the primer pair IRES5/GUS(IRES)3 and the PCR products from the previous PCR as templates. The final IRES/GUS fusion was first cloned into the pcDNA3/V5-His TOPO vector (Invitrogen) for testing GUS activity. It is then cloned into pSIRΔ-TK/BSD to obtain the final retroviral construct.
The configuration of Vector II permits the facile substitution of the reporter, the markers, or both, with other genes. For example, TK/BSD can be easily replaced with HyTK (S.D. Lupton et al., Mol Cell Biol (1991) 11:3374-78), and GUS can be replaced by other reporter genes such as lacZ, SEAP, or cell surface proteins.
(C) Vector III:
The backbone retroviral vector (pSIRΔ) was constructed as described in part (A) above. The GUS gene with the splice acceptor site and start codon is generated using the pBACgus-1 vector (Novagen) as the template and the following oligos as the primers:
GUS (SA)5: 5'-gct ctg act ctc tct gtg gac gga tcc cct ttt ttc tag gca taa art aca cct ata aat atg gtc cgt cct gta gaa-3'
GUS3: 5'-gc gga ctc gag tca ttg ttt gcc tcc ctg ctg-3'
The PCR product is first cloned into pcDNA3/V5-His TOPO vector and later transferred to the retroviral vector pSIRΔ.
The IRES/CD2 fusion is generated using polymerase chain reactions with the following primers:
IRES5: 5'-ggg aag ctt gcc cct ctc cct ccc cc-3'
IRES (CD2)3: 5'-gct acc cag gaa ttt aca ttt cat ggt tgt ggc cat att atc atc g-3'
CD2 (IRES)5: 5'-gat gat aat atg gcc aca acc atg aaa tgt aaa ttc ctg ggt agc ttc ttt ctg-3'
CD2 (IRES)3: 5'-ctt aat tag ggg gcg gca g-3'
The first round of PCR used IRES5/IRES (CD2)3 and CD2 (IRES)5/CD2 (IRES)3 primer pairs in combination with plasmids containing the IRES element and mouse CD2, respectively. The second round of PCR used the IRES5/CD2 (IRES)3 primer pair in combination with the products from the previous round of PCR.
The final fusion PCR product was first cloned into pcDNA3/V5-His TOPO vector (Invitrogen) for functional testing. This insert is in turn cloned into the vector pSIRΔ-SA/GUS to obtain the intermediate plasmid pSIRΔ-TK/BSD.
Construction of pRTrap-Puro-3'5' - The base vector for pRTrapPuro is pLNCX (Genbank M28247; published in Miller & Rosman, Biotechniques 7, 980, 1989; commercially available from Clontech). The 4.62Kb BclI-HpaI fragment of pLNCX (containing both LTRs and the Psi sequence) was ligated to the 1.96Kb PmeI-PstI fragment of pTTrapPuro (containing the splice acceptor and eGUS gene) as well as the 1.4Kb PstI-BclI fragment of pTTrapPuro (containing the IRES and Puromycin gene). This three-piece ligation produced the vector pRTrap-Puro-3'5', which has the eGUS-IRES-Puro sequences in the opposite transcriptional orientation to that of the 5'LTR. The eGUS-IRES-Puro sequences may also be placed between the two LTRs in the same transcriptional orientation as the 5'LTR. pLNCX was digested with BclI and HpaI to remove the sequences between the LTRs. The eGUS-IRES-Puro fragment was isolated from pTTrap-Puro-3'5' by digesting with XbaI, then blunting these ends with Klenow and finally digesting the vector with BglII. The 4.62Kb BclI-HpaI pLNCX fragment is then ligated with the 3.36Kb BglII-XbaI/blunted pTTrapPuro fragment to create pRTrap-Puro-5'3'.
Construction of pRTrapPuroSIN-3'5' - The NheI-XbaI fragment contains the core enhancer of the Moloney Murine Leukemia virus LTR in pLNCX, and its removal has been shown to eliminate enhancer activity (Suhr et al (1998), PNAS 95, 7999). We removed the NheI-XbaI fragment from the 3'LTR of pRTrapPuro to create pRTrapPuroSIN in the following manner. pRTrapPuro was digested with NheI and both the 4.9Kb and 3.1Kb fragments were isolated. The 3.1Kb fragment containing the 3'LTR was subsequently digested with XbaI to remove the enhancer sequences. The 4.9Kb NheI-NheI fragment was then ligated to the 2.83Kb XbaI-NheI fragment to create pRTrapSin-Puro-3'5'. The same set of manipulations can be done to make pRTrapSin-Puro5'-3' from pRTrapPuro5'-3'.
Construction of pRTrap-TK/BSD-3'5' - The base vector of pRTraptkBsd is pLNCX (Clontech). The 4.04Kb PmeI-PmeI fragment from pTTrap-TK/BSD containing the splice acceptor, eGUS and tkBsd sequences was ligated to the 4.64Kb BsaBI-HpaI fragment of pLNCX containing the vector backbone and the two LTRs. This ligation produces the vector pRTrap-TK/BSD-3'5'; insertion in the opposite orientation, with the eGUS in the same transcriptional orientation as the 5'LTR, produces the vector pRTrap-TK/BSD-5'3'.
Construction of pRTrapSin-TK/BSD-3'5' - The 3'SIN LTR was isolated from pRTrapSin-Puro and ligated to the vector pRTrap-TK/BSD in the following manner. A 4.02Kb ClaI-BstEII fragment from pRTrapSin-Puro containing the 3'LTR and vector backbone was ligated to a 4.39Kb ClaI-BstEII fragment from pRTrap-TK/BSD containing the eGUS, IRES and TK/BSD sequences to produce pRTrapSin-TK/BSD.
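The fragment sizes quoted in the constructions above can be cross-checked with simple bookkeeping. The sketch below only sums the stated fragment sizes (converted to base pairs to avoid rounding in floating point); the construct labels are taken from the text, and the sizes are the rounded values given there, so small discrepancies are expected.

```python
# Sketch: expected construct sizes from the fragment sizes stated in the text.
# Sizes are the rounded values quoted above, expressed in base pairs.

LIGATIONS_BP = {
    "pRTrap-Puro-3'5'": [4620, 1960, 1400],  # BclI-HpaI + PmeI-PstI + PstI-BclI
    "pRTrap-Puro-5'3'": [4620, 3360],        # BclI-HpaI + BglII-XbaI/blunted
    "pRTrapSin-Puro-3'5'": [4900, 2830],     # NheI-NheI + XbaI-NheI
    "pRTrap-TK/BSD-3'5'": [4040, 4640],      # PmeI-PmeI + BsaBI-HpaI
    "pRTrapSin-TK/BSD": [4020, 4390],        # ClaI-BstEII + ClaI-BstEII
}

expected_bp = {name: sum(parts) for name, parts in LIGATIONS_BP.items()}

# Size of the enhancer-containing NheI-XbaI segment removed to make the SIN LTR,
# estimated in two independent ways from the quoted numbers.
sin_deletion_from_fragments = 3100 - 2830
sin_deletion_from_tkbsd = (expected_bp["pRTrap-TK/BSD-3'5'"]
                           - expected_bp["pRTrapSin-TK/BSD"])

for name, size in expected_bp.items():
    print(f"{name}: {size / 1000:.2f} Kb")
print(sin_deletion_from_fragments, sin_deletion_from_tkbsd)
```

The two orientations of the Puro construct come out at the same size, and both estimates of the SIN deletion agree at about 0.27 Kb, so the quoted fragment sizes are internally consistent. A restriction digest of a candidate clone could be compared against these expected totals.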
1. Andreu, T., Beckers, T., Thoenes, E., Hilgard, P., von Melchner, H. Gene trapping identifies inhibitors of oncogenic transformation. The tissue inhibitor of metalloproteinases-3 (TIMP3) and collagen type 1 alpha2 (COL1A2) are epidermal growth factor-regulated growth repressors. 1998. J Biol Chem. 273 (22):13848-54
2. Aran, J.M., Gottesman, M.M., Pastan, I. 1998. Cancer Gene Ther. 5:195-206
3. Bhat, K., McBurney, M.W., Hamada, H. Functional cloning of mouse chromosomal loci specifically active in embryonal carcinoma stem cells. 1988. Mol Cell Biol. 8 (8):3251-9
4. Chen, B.F., Hwang, L.H., Chen, D.S. Characterization of a bicistronic retroviral vector composed of the swine vesicular disease virus internal ribosome entry site. 1993. J Virol. 67 (4):2142-8
5. Di Ianni, M., Casciari, C., Ciurnelli, R., Fulvi, A., Bagnis, C., Lucheroni, F., Falzetti, F., Mannoni, P., Martelli, M.F., Tabilio, A. beta-galactosidase transduced T lymphocytes: a comparison between stimulation by either PHA and IL-2 or a mixed lymphocyte reaction. 1996. Haematologica. 81 (5):410-7
6. Di Ianni, M., Casciari, C., Ciurnelli, R., Fulvi, A., Bagnis, C., Sadelain, M., Lucheroni, F., Mannoni, P., Stella, C.C., Martelli, M.F., Tabilio, A. 1997. Leuk Res. 21:951-959
7. Forrester, L.M., Nagy, A., Sam, M., Watt, A., Stevenson, L., Bernstein, A., Joyner, A.L., Wurst, W. An induction gene trap screen in embryonic stem cells: Identification of genes that respond to retinoic acid in vitro. 1996. Proc Natl Acad Sci USA. 93 (4):1677-82
8. Friedrich, G., Soriano, P. Promoter traps in embryonic stem cells: a genetic screen to identify and mutate developmental genes in mice. 1991. Genes Dev. 5 (9):1513-23
9. Gallardo, H.F., Tan, C., Sadelain, M. The internal ribosomal entry site of the encephalomyocarditis virus enables reliable coexpression of two transgenes in human primary T lymphocytes. 1997. Gene Ther. 4 (10):1115-9
10. Ghattas, I.R., Sanes, J.R., Majors, J.E. The encephalomyocarditis virus internal ribosome entry site allows efficient coexpression of two genes from a recombinant provirus in cultured cells and in embryos. 1991. Mol Cell Biol. 11 (12):5848-59
11. Gogos, J.A., Lowry, W., Karayiorgou, M. Selection for retroviral insertions into regulated genes. 1997. J Virol. 71 (2):1644-50
12. Gossler, A., Joyner, A.L., Rossant, J., Skarnes, W.C. Mouse embryonic stem cells and reporter constructs to detect developmentally regulated genes. 1989. Science. 244 (4903):463-5
13. Hicks, G.G., Shi, E.G., Chen, J., Roshon, M., Williamson, D., Scherer, C., Ruley, H.E. Retrovirus gene traps. 1995. Methods Enzymol. 254:263-75
14. Hicks, G.G., Shi, E.G., Li, X.M., Li, C.H., Pawlak, M., Ruley, H.E. Functional genomics in mice by tagged sequence mutagenesis. 1997. Nat Genet. 16 (4):338-44
15. Hill, D.P., Wurst, W. Screening for novel pattern formation genes using gene trap approaches. 1993. Methods Enzymol. 225:664-81
16. Karreman, C. New positive/negative selectable markers for mammalian cells on the basis of Blasticidin deaminase-thymidine kinase fusions. 1998. Nucleic Acids Res. 26 (10):2508-10
17. Karreman, C. A new set of positive/negative selectable markers for mammalian cells. 1998. Gene. 218 (1-2):57-61
18. Lupton, S.D., Brunton, L.L., Kalberg, V.A., Overell, R.W. Dominant positive and negative selection using a hygromycin phosphotransferase-thymidine kinase fusion gene. 1991. Mol Cell Biol. 11 (6):3374-8
19. Miller, A.D., Rosman, G.J. Improved retroviral vectors for gene transfer and expression. 1989. Biotechniques. 7 (9):980-90
20. Muth, K., Bruyns, R., Thorey, I.S., von Melchner, H. Disruption of genes regulated during hematopoietic differentiation of mouse embryonic stem cells. 1998. Dev Dyn. 212 (2):277-83
21. Nakajima, K., Ikenaka, K., Nakahira, K., Morita, N., Mikoshiba, K. 1993. FEBS Lett. 315:129-133
22. Natarajan, D., Boulter, C.A. A lacZ-hygromycin fusion gene and its use in a gene trap vector for marking embryonic stem cells. 1995. Nucleic Acids Res. 23 (19):4003-4
23. Niwa, H., Araki, K., Kimura, S., Taniguchi, S., Wakasugi, S., Yamamura, K. An efficient gene-trap method using poly A trap vectors and characterization of gene-trap events. 1993. J Biochem. 113 (3):343-9
24. Overell, R.W., Weisser, K.E., Cosman, D. Stably transmitted triple-promoter retroviral vectors and their use in transformation of primary mammalian cells. 1988. Mol Cell Biol. 8 (4):1803-8
25. Pear, W.S., Nolan, G.P., Scott, M.L., Baltimore, D. Production of high-titer helper-free retroviruses by transient transfection. 1993. Proc Natl Acad Sci USA. 90 (18):8392-6
26. Reddy, S., DeGregori, J.V., von Melchner, H., Ruley, H.E. Retrovirus promoter-trap vector to induce lacZ gene fusions in mammalian cells. 1991. J Virol. 65 (3):1507-15
27. Reddy, S., Rayburn, H., von Melchner, H., Ruley, H.E. Fluorescence-activated sorting of totipotent embryonic stem cells expressing developmentally regulated lacZ fusion genes. 1992. Proc Natl Acad Sci USA. 89 (15):6721-5
28. Skarnes, W.C., Auerbach, B.A., Joyner, A.L. A gene trap approach in mouse embryonic stem cells: the lacZ reporter is activated by splicing, reflects endogenous gene expression, and is mutagenic in mice. 1992. Genes Dev. 6 (6):903-18
29. Stockschlaeder, M.A., Storb, R., Osborne, W.R., Miller, A.D. 1991. Hum Gene Ther. 2:33-39
30. Stone, J.C.D.N.A.S.L. 1986. Somatic Cell and Molecular Genetics. 6:575-583 (Abstract)
31. Thorey, I.S., Muth, K., Russ, A.P., Otte, J., Reffelmann, A., von Melchner, H. Selective disruption of genes transiently induced in differentiating mouse embryonic stem cells by using gene trap mutagenesis and site-specific recombination [published erratum appears in Mol Cell Biol 1998 Oct;18(10):6164]. 1998. Mol Cell Biol. 18 (5):3081-8
32. Toyoda, A., Kusuda, J., Maeda, H., Hashimoto, K. 1995. Mamm Genome. 6:426-428
33. von Melchner, H., DeGregori, J.V., Rayburn, H., Reddy, S., Friedel, C., Ruley, H.E. Selective disruption of genes expressed in totipotent embryonal stem cells. 1992. Genes Dev. 6 (6):919-27
34. von Melchner, H., Reddy, S., Ruley, H.E. Isolation of cellular promoters by using a retrovirus promoter trap. 1990. Proc Natl Acad Sci USA. 87 (10):3733-7
35. von Melchner, H., Ruley, H.E. Retroviruses as genetic tools to isolate transcriptionally active chromosomal regions. 1990. Environ Health Perspect. 88:141-8
36. Whitney, M., Rockenstein, E., Cantin, G., Knapp, T., Zlokarnik, G., Sanders, P., Durick, K., Craig, F.F., Negulescu, P.A. A genome-wide functional assay of signal transduction in living mammalian cells [see comments]. 1998. Nat Biotechnol. 16 (13):1329-33
37. Xiong, J.W., Battaglino, R., Leahy, A., Stuhlmann, H. Large-scale screening for developmental genes in embryonic stem cells and embryoid bodies using retroviral entrapment vectors. 1998. Dev Dyn. 212 (2):181-97
38. Zambrowicz, B.P., Imamoto, A., Fiering, S., Herzenberg, L.A., Kerr, W.G., Soriano, P. Disruption of overlapping transcripts in the ROSA beta geo 26 gene trap strain leads to widespread expression of beta-galactosidase in mouse embryos and hematopoietic cells. 1997. Proc Natl Acad Sci USA. 94 (8):3789-94
39. Zwiebel, J.A., Freeman, S.M., Kantoff, P.W., Cornetta, K., Ryan, U.S., Anderson, W.F. 1989. Science. 243:220-222
Figure XX: Nucleotide Sequence of vector pTTrap-Puro shown in Figure 6
GACGGATCGGGAGATCTCCCGATCCCCTATGGTCGACTCTCAGTACAATCTGCTCTGATGCCGCATAGTTAA GCCAGTATCTGCTCCCTGCTTGTGTGTTGGAGGTCGCTGAGTAGTGCGCGAGCAAAATTTAAGCTACAACAA GGCAAGGCTTGACCGACAATTGCATGAAGAATCTGCTTAGGGTTAGGCGTTTTGCGCTGCTTCGCGATGTAC GGGCCAGATATACGCGTTGACATTGATTATTGACTAGTTATTAATAGTAATCAATTACGGGGTCATTAGTTC ATAGCCCATATATGGAGTTCCGCGTTACATAACTTACGGTAAATGGCCCGCCTGGCTGACCGCCCAACGACC CCCGCCCATTGACGTCAATAATGACGTATGTTCCCATAGTAACGCCAATAGGGACTTTCCATTGACGTCAAT GGGTGGACTATTTACGGTAAACTGCCCACTTGGCAGTACATCAAGTGTATCATATGCCAAGTACGCCCCCTA TTGACGTCAATGACGGTAAATGGCCCGCCTGGCATTATGCCCAGTACATGACCTTATGGGACTTTCCTACTT GGCAGTACATCTACGTATTAGTCATCGCTATTACCATGGTGATGCGGTTTTGGCAGTACATCAATGGGCGTG GATAGCGGTTTGACTCACGGGGATTTCCAAGTCTCCACCCCATTGACGTCAATGGGAGTTTGTTTTGGCACC AAAATCAACGGGACTTTCCAAAATGTCGTAACAACTCCGCCCCATTGACGCAAATGGGCGGTAGGCGTGTAC GGTGGGAGGTCTATATAAGCAGAGCTCTCTGGCTAACTAGAGAACCCACTGCTTACTGGCTTATCGAAATTA ATACGACTCACTATAGGGAGACCCAAGCTTGGTACCGAGCTCGGATCGATAAACgaattcATCTCAGTTCGG TGTAGGTCGTTCGCTCCAAGCTGGGCTGTGTGCAAGCTTGGTACCGAGCTCGGATCCACTAGTCCAGTGTGG TGGAATTGCCCTTgccgccacgATGGTCCGTCCTGTAGAAACCCCAACCCGTGAAATCAAAAAACTCGACGG CCTGTGGGCATTCAGTCTGGATCGCGAAAACTGTGGAATTGATCAGCGTTGGTGGGAAAGCGCGTTACAAGA AAGCCGGGCAATTGCTGTGCCAGGCAGTTTTAACGATCAGTTCGCCGATGCAGATATTCGTAATTATGCGGG CAACGTCTGGTATCAGCGCGAAGTCTTTATACCGAAAGGTTGGGCAGGCCAGCGTATCGTGCTGCGTTTCGA TGCGGTCACTCATTACGGCAAAGTGTGGGTCAATAATCAGGAAGTGATGGAGCATCAGGGCGGCTATACGCC ATTTGAAGCCGATGTCACGCCGTATGTTATTGCCGGGAAAAGTGTACGTATCACCGTTTGTGTGAACAACGA ACTGAACTGGCAGACTATCCCGCCGGGAATGGTGATTACCGACGAAAACGGCAAGAAAAAGCAGTCTTACTT CCATGATTTCTTTAACTATGCCGGAATCCATCGCAGCGTAATGCTCTACACCACGCCGAACACCTGGGTGGA CGATATCACCGTGGTGACGCATGTCGCGCAAGACTGTAACCACGCGTCTGTTGACTGGCAGGTGGTGGCCAA TGGTGATGTCAGCGTTGAACTGCGTGATGCGGATCAACAGGTGGTTGCAACTGGACAAGGCACTAGCGGGAC TTTGCAAGTGGTGAATCCGCACCTCTGGCAACCGGGTGAAGGTTATCTCTATGAACTGTGCGTCACAGCCAA AAGCCAGACAGAGTGTGATATCTACCCGCTTCGCGTCGGCATCCGGTCAGTGGCAGTGAAGGGCCAACAGTT CCTGATTAACCACAAACCGTTCTACTTTACTGGCTTTGGTCGTCATGAAGATGCGGACTTACGTGGCAAAGG 
ATTCGATAACGTGCTGATGGTGCACGACCACGCATTAATGGACTGGATTGGGGCCAACTCCTACCGTACCTC GCATTACCCTTACGCTGAAGAGATGCTCGACTGGGCAGATGAACATGGCATCGTGGTGATTGATGAAACTGC TGCTGTCGGCTTTTCGCTCTCTTTAGGCATTGGTTTCGAAGCGGGCAACAAGCCGAAAGAACTGTACAGCGA AGAGGCAGTCAACGGGGAAACTCAGCAAGCGCACTTACAGGCGATTAAAGAGCTGATAGCGCGTGACAAAAA CCACCCAAGCGTGGTGATGTGGAGTATTGCCAACGAACCGGATACCCGTCCGCAAGGTGCACGGGAATATTT CGCGCCACTGGCGGAAGCAACGCGTAAACTCGACCCGACGCGTCCGATCACCTGCGTCAATGTAATGTTCTG CGACGCTCACACCGATACCATCAGCGATCTCTTTGATGTGCTGTGCCTGAACCGTTATTACGGATGGTATGT CCAAAGCGGCGATTTGGAAACGGCAGAGAAGGTACTGGAAAAAGAACTTCTGGCCTGGCAGGAGAAACTGCA TCAGCCGATTATCATCACCGAATACGGCGTGGATACGTTAGCCGGGCTGCACTCAATGTACACCGACATGTG GAGTGAAGAGTATCAGTGTGCATGGCTGGATATGTATCACCGCGTCTTTGATCGCGTCAGCGCCGTCGTCGG TGAACAGGTATGGAATTTCGCCGATTTTGCGACCTCGCAAGGCATATTGCGCGTTGGCGGTAACAAGAAAGG GATCTTCACTCGCGACCGCAAACCGAAGTCGGCGGCTTTTCTGCTGCAAAAACGCTGGACTGGCATGAACTT CGGTGAAAAACCGCAGCAGGGAGGCAAACAATGActcgaGAAGGGCAATTCTGCAGATATCCAGCACAGTGG CGGCCGCGTCGACGGAATTCAGTGGATCGGTCGAGCATGCATCTAGGGCGGCCAATTCCGCCCCTCTCCCTC CCCCCCCCCTAACGTTACTGGCCGAAGCCGCTTGGAATAAGGCCGGTGTGCGTTTGTCTATATGTGATTTTC CACCATATTGCCGTCTTTTGGCAATGTGAGGGCCCGGAAACCTGGCCCTGTCTTCTTGACGAGCATTCCTAG GGGTCTTTCCCCTCTCGCCAAAGGAATGCAAGGTCTGTTGAATGTCGTGAAGGAAGCAGTTCCTCTGGAAGC TTCTTGAAGACAAACAACGTCTGTAGCGACCCTTTGCAGGCAGCGGAACCCCCCACCTGGCGACAGGTGCCT CTGCGGCCAAAAGCCACGTGTATAAGATACACCTGCAAAGGCGGCACAACCCCAGTGCCACGTTGTGAGTTG GATAGTTGTGGAAAGAGTCAAATGGCTCTCCTCAAGCGTATTCAACAAGGGGCTGAAGGATGCCCAGAAGGT ACCCCATTGTATGGGATCTGATCTGGGGCCTCGGTGCACATGCTTTACATGTGTTTAGTCGAGGTTAAAAAA ACGTCTAGGCCCCCCGAACCACGGGGACGTGGTTTTCCTTTGAAAAACACGATGATAAGCTTGCCACAACCC ACAAGGAGACGACCTTCCATGACCGAGTACAAGCCCACGGTGCGCCTCGCCACCCGCGACGACGTCCCCCGG GCCGTACGCACCCTCGCCGCCGCGTTCGCCGACTACCCCGCCACGCGCCACACCGTCGACCCGGACCGCCAC ATCGAGCGGGTCACCGAGCTGCAAGAACTCTTCCTCACGCGCGTCGGGCTCGACATCGGCAAGGTGTGGGTC GCGGACGACGGCGCCGCGGTGGCGGTCTGGACCACGCCGGAGAGCGTCGAAGCGGGGGCGGTGTTCGCCGAG ATCGGCCCGCGCATGGCCGAGTTGAGCGGTTCCCGGCTGGCCGCGCAGCAACAGATGGAAGGCCTCCTGGCG 
CCGCACCGGCCCAAGGAGCCCGCGTGGTTCCTGGCCACCGTCGGCGTCTCGCCCGACCACCAGGGCAAGGGT CTGGGCAGCGCCGTCGTGCTCCCCGGAGTGGAGGCGGCCGAGCGCGCCGGGGTGCCCGCCTTCCTGGAGACC
TCCGCGCCCCGCAACCTCCCCTTCTACGAGCGGCTCGGCTTCACCGTCACCGCCGACGTCGAGTGCCCGAAG GACCGCGCGACCTGGTGCATGACCCGCAAGCCCGGTGCCTGACGCCCGCCCCACGACCCGCAGCGCCCGACC GAAAGGAGCGCACGACCCCATGGCTCCGACCGAAGCCGACCCGGGCGGCCCCGCCGACCCCGCACCCGCCCC CGAGGCCCACCGACTCTAGAGCTCGCTGATCAGCCTCGACTGTGCCTTCTAGTTGCCAGCCATCTGTTGTTT GCCCCTCCCCCGTGCCTTCCTTGACCCTGGAAGGTGCCACTCCCACTGTCCTTTCCTAATAAAATGAGGAAA TTGCATCGCATTGTCTGAGTAGGTGTCATTCTATTCTGGGGGGTGGGGTGGGGCAGGACAGCAAGGGGGAGG ATTGGGAAGACAATAGCAGGCATGCTGGGGATGCGGTGGGCTCTATGGCTTCTGAGGCGGAAAGAACCAGCT GGGGCTCGAGTGCATTCTAGTTGTGGTTTGTCCAAACTCATCAATGTATCTTATCATGTCTGTATACCGTCG ACCTCTAGCTAGAGCTTGGCGTAATCATGGTCATAGCTGTTTCCTGTGTGAAATTGTTATCCGCTCACAATT CCACACAACATACGAGCCGGAAGCATAAAGTGTAAAGCCTGGGGTGCCTAATGAGTGAGCTAACTCACATTA ATTGCGTTGCGCTCACTGCCCGCTTTCCAGTCGGGAAACCTGTCGTGCCAGCTGCATTAATGAATCGGCCAA CGCGCGGGGAGAGGCGGTTTGCGTATTGGGCGCTCTTCCGCTTCCTCGCTCACTGACTCGCTGCGCTCGGTC GTTCGGCTGCGGCGAGCGGTATCAGCTCACTCAAAGGCGGTAATACGGTTATCCACAGAATCAGGGGATAAC GCAGGAAAGAACATGTGAGCAAAAGGCCAGCAAAAGGCCAGGAACCGTAAAAAGGCCGCGTTGCTGGCGTTT TTCCATAGGCTCCGCCCCCCTGACGAGCATCACAAAAATCGACGCTCAAGTCAGAGGTGGCGAAACCCGACA GGACTATAAAGATACCAGGCGTTTCCCCCTGGAAGCTCCCTCGTGCGCTCTCCTGTTCCGACCCTGCCGCTT ACCGGATACCTGTCCGCCTTTCTCCCTTCGGGAAGCGTGGCGCTTTCTCAATGCTCACGCTGTAGGTATCTC AGTTCGGTGTAGGTCGTTCGCTCCAAGCTGGGCTGTGTGCACGAACCCCCCGTTCAGCCCGACCGCTGCGCC TTATCCGGTAACTATCGTCTTGAGTCCAACCCGGTAAGACACGACTTATCGCCACTGGCAGCAGCCACTGGT AACAGGATTAGCAGAGCGAGGTATGTAGGCGGTGCTACAGAGTTCTTGAAGTGGTGGCCTAACTACGGCTAC ACTAGAAGGACAGTATTTGGTATCTGCGCTCTGCTGAAGCCAGTTACCTTCGGAAAAAGAGTTGGTAGCTCT TGATCCGGCAAACAAACCACCGCTGGTAGCGGTGGTTTTTTTGTTTGCAAGCAGCAGATTACGCGCAGAAAA AAAGGATCTCAAGAAGATCCTTTGATCTTTTCTACGGGGTCTGACGCTCAGTGGAACGAAAACTCACGTTAA GGGATTTTGGTCATGAGATTATCAAAAAGGATCTTCACCTAGATCCTTTTAAATTAAAAATGAAGTTTTAAA TCAATCTAAAGTATATATGAGTAAACTTGGTCTGACAGTTACCAATGCTTAATCAGTGAGGCACCTATCTCA GCGATCTGTCTATTTCGTTCATCCATAGTTGCCTGACTCCCCGTCGTGTAGATAACTACGATACGGGAGGGC TTACCATCTGGCCCCAGTGCTGCAATGATACCGCGAGACCCACGCTCACCGGCTCCAGATTTATCAGCAATA 
AACCAGCCAGCCGGAAGGGCCGAGCGCAGAAGTGGTCCTGCAACTTTATCCGCCTCCATCCAGTCTATTAAT TGTTGCCGGGAAGCTAGAGTAAGTAGTTCGCCAGTTAATAGTTTGCGCAACGTTGTTGCCATTGCTACAGGC ATCGTGGTGTCACGCTCGTCGTTTGGTATGGCTTCATTCAGCTCCGGTTCCCAACGATCAAGGCGAGTTACA TGATCCCCCATGTTGTGCAAAAAAGCGGTTAGCTCCTTCGGTCCTCCGATCGTTGTCAGAAGTAAGTTGGCC GCAGTGTTATCACTCATGGTTATGGCAGCACTGCATAATTCTCTTACTGTCATGCCATCCGTAAGATGCTTT TCTGTGACTGGTGAGTACTCAACCAAGTCATTCTGAGAATAGTGTATGCGGCGACCGAGTTGCTCTTGCCCG GCGTCAATACGGGATAATACCGCGCCACATAGCAGAACTTTAAAAGTGCTCATCATTGGAAAACGTTCTTCG GGGCGAAAACTCTCAAGGATCTTACCGCTGTTGAGATCCAGTTCGATGTAACCCACTCGTGCACCCAACTGA TCTTCAGCATCTTTTACTTTCACCAGCGTTTCTGGGTGAGCAAAAACAGGAAGGCAAAATGCCGCAAAAAAG GGAATAAGGGCGACACGGAAATGTTGAATACTCATACTCTTCCTTTTTCAATATTATTGAAGCATTTATCAG GGTTATTGTCTCATGAGCGGATACATATTTGAATGTATTTAGAAAAATAAACAAATAGGGGTTCCGCGCACA TTTCCCCGAAAAGTGCCACCTGACGTC
Figure XX: Nucleotide Sequence of pTTrap-TK/BSD as shown in Figure 7
AGCTTGGTACCGAGCTCGGATCCACTAGTCCAGTGTGGTGGAATTGCCCTTgccgccacgATGGTCCGTCCT GTAGAAACCCCAACCCGTGAAATCAAAAAACTCGACGGCCTGTGGGCATTCAGTCTGGATCGCGAAAACTGT GGAATTGATCAGCGTTGGTGGGAAAGCGCGTTACAAGAAAGCCGGGCAATTGCTGTGCCAGGCAGTTTTAAC GATCAGTTCGCCGATGCAGATATTCGTAATTATGCGGGCAACGTCTGGTATCAGCGCGAAGTCTTTATACCG AAAGGTTGGGCAGGCCAGCGTATCGTGCTGCGTTTCGATGCGGTCACTCATTACGGCAAAGTGTGGGTCAAT AATCAGGAAGTGATGGAGCATCAGGGCGGCTATACGCCATTTGAAGCCGATGTCACGCCGTATGTTATTGCC GGGAAAAGTGTACGTATCACCGTTTGTGTGAACAACGAACTGAACTGGCAGACTATCCCGCCGGGAATGGTG ATTACCGACGAAAACGGCAAGAAAAAGCAGTCTTACTTCCATGATTTCTTTAACTATGCCGGAATCCATCGC AGCGTAATGCTCTACACCACGCCGAACACCTGGGTGGACGATATCACCGTGGTGACGCATGTCGCGCAAGAC TGTAACCACGCGTCTGTTGACTGGCAGGTGGTGGCCAATGGTGATGTCAGCGTTGAACTGCGTGATGCGGAT CAACAGGTGGTTGCAACTGGACAAGGCACTAGCGGGACTTTGCAAGTGGTGAATCCGCACCTCTGGCAACCG GGTGAAGGTTATCTCTATGAACTGTGCGTCACAGCCAAAAGCCAGACAGAGTGTGATATCTACCCGCTTCGC GTCGGCATCCGGTCAGTGGCAGTGAAGGGCCAACAGTTCCTGATTAACCACAAACCGTTCTACTTTACTGGC TTTGGTCGTCATGAAGATGCGGACTTACGTGGCAAAGGATTCGATAACGTGCTGATGGTGCACGACCACGCA TTAATGGACTGGATTGGGGCCAACTCCTACCGTACCTCGCATTACCCTTACGCTGAAGAGATGCTCGACTGG GCAGATGAACATGGCATCGTGGTGATTGATGAAACTGCTGCTGTCGGCTTTTCGCTCTCTTTAGGCATTGGT TTCGAAGCGGGCAACAAGCCGAAAGAACTGTACAGCGAAGAGGCAGTCAACGGGGAAACTCAGCAAGCGCAC TTACAGGCGATTAAAGAGCTGATAGCGCGTGACAAAAACCACCCAAGCGTGGTGATGTGGAGTATTGCCAAC GAACCGGATACCCGTCCGCAAGGTGCACGGGAATATTTCGCGCCACTGGCGGAAGCAACGCGTAAACTCGAC CCGACGCGTCCGATCACCTGCGTCAATGTAATGTTCTGCGACGCTCACACCGATACCATCAGCGATCTCTTT GATGTGCTGTGCCTGAACCGTTATTACGGATGGTATGTCCAAAGCGGCGATTTGGAAACGGCAGAGAAGGTA CTGGAAAAAGAACTTCTGGCCTGGCAGGAGAAACTGCATCAGCCGATTATCATCACCGAATACGGCGTGGAT ACGTTAGCCGGGCTGCACTCAATGTACACCGACATGTGGAGTGAAGAGTATCAGTGTGCATGGCTGGATATG TATCACCGCGTCTTTGATCGCGTCAGCGCCGTCGTCGGTGAACAGGTATGGAATTTCGCCGATTTTGCGACC TCGCAAGGCATATTGCGCGTTGGCGGTAACAAGAAAGGGATCTTCACTCGCGACCGCAAACCGAAGTCGGCG GCTTTTCTGCTGCAAAAACGCTGGACTGGCATGAACTTCGGTGAAAAACCGCAGCAGGGAGGCAAACAATGA ctcgagCGCCCCTCTCCCTCCCCCCCCCCTAACGTTACTGGCCGAAGCCGCTTGGAATAAGGCCGGTGTGCG 
TTTGTCTATATGTGATTTTCCACCATATTGCCGTCTTTTGGCAATGTGAGGGCCCGGAAACCTGGCCCTGTC TTCTTGACGAGCATTCCTAGGGGTCTTTCCCCTCTCGCCAAAGGAATGCAAGGTCTGTTGAATGTCGTGAAG GAAGCAGTTCCTCTGGAAGCTTCTTGAAGACAAACAACGTCTGTAGCGACCCTTTGCAGGCAGCGGAACCCC CCACCTGGCGACAGGTGCCTCTGCGGCCAAAAGCCACGTGTATAAGATACACCTGCAAAGGCGGCACAACCC CAGTGCCACGTTGTGAGTTGGATAGTTGTGGAAAGAGTCAAATGGCTCTCCTCAAGCGTATTCAACAAGGGG CTGAAGGATGCCCAGAAGGTACCCCATTGTATGGGATCTGATCTGGGGCCTCGGTGCACATGCTTTACATGT GTTTAGTCGAGGTTAAAAAAGCTCTAGGCCCCCCGAACCACGGGGACGTGGTTTTCCTTTGAAAAACACGAT GATAATATGGCTTCGTACCCCTGCCATCAACACGCGTCTGCGTTCGACCAGGCTGCGCGTTCTCGCGGCCAT AACAACCGACGTACGGCGTTGCGCCCTCGCCGGCAACAAAAAGCCACGGAAGTCCGCCTGGAGCAGAAAATG CCCACGCTACTGCGGGTTTATATAGACGGTCCCCACGGGATGGGGAAAACCACCACCACGCAACTGCTGGTG GCCCTGGGTTCGCGCGACGATATCGTCTACGTACCCGAGCCGATGACTTACTGGCGGGTGTTGGGGGCTTCC GAGACAATCGCGAACATCTACACCACACAACACCGCCTCGACCAGGGTGAGATATCGGCCGGGGACGCGGCG GTGGTAATGACAAGCGCCCAGATAACAATGGGCATGCCTTATGCCGTGACCGACGCCGTTCTGGCTCCTCAT ATCGGGGGGGAGGCTGGGAGCTCACATGCCCCGCCCCCGGCCCTCACCCTCATCTTCGACCGCCATCCCATC GCCGCCCTCCTGTGCTACCCGGCCGCGCGATACCTTATGGGCAGCATGACCCCCCAGGCCGTGCTGGCGTTC GTGGCCCTCATCCCGCCGACCTTGCCCGGCACAAACATCGTGTTGGGGGCCCTTCCGGAGGACAGACACATC GACCGCCTGGCCAAACGCCAGCGCCCCGGCGAGCGGCTTGACCTGGCTATGCTGGCCGCGATTCGCCGCGTT TATGGGCTGCTTGCCAATACGGTGCGGTATCTGCAGGGCGGCGGGTCGTGGCGGGAGGATTGGGGACAGCTT TCGGGGGCGGCCGTGCCGCCCCAGGGTGCCGAGCCCCAGAGCAACGCGGGCCCACGACCCCATATCGGGGAC ACGTTATTTACCCTGTTTCGGGCCCCCGAGTTGCTGGCCCCCAACGGCGACCTGTATAACGTGTTTGCCTGG GCTTTGGACGTCTTGGCCAAACGCCTCCGTCCCATGCATGTCTTTATCCTGGATTACGACCAATCGCCCGCC GGCTGCCGGGACGCCCTGCTGCAACTTACCTCCGGGATGGTCCAGACCCACGTCACCACCCCAGGCTCCATA CCGACGATCTGCGACCTGGCGCGCACGTTTGCCCGGGAGATGGGGGAGGCTAACactAgtCCTTTGTCTCAA GAAGAATCCACCCTCATTGAAAGAGCAACGGCTACAATCAACAGCATCCCCATCTCTGAAGACTACAGCGTC GCCAGCGCAGCTCTCTCTAGCGACGGCCGCATCTTCACTGGTGTCAATGTATATCATTTTACTGGGGGACCT TGTGCAGAACTCGTGGTGCTGGGCACTGCTGCTGCTGCGGCAGCTGGCAACCTGACTTGTATCGTCGCGATC GGAAATGAGAACAGGGGCATCTTGAGCCCCTGCGGACGGTGCCGACAGGTGCTTCTCGATCTGCATCCTGGG 
ATCAAAGCCATAGTGAAGGACAGTGATGGACAGCCGACGGCAGTTGGGATTCGTGAATTGCTGCCCTCTGGT TATGTGTGGGAGGGCTAAgtttaaacAAGGGCAATTCTGCAGATATCCAGCACAGTGGCGGCCGCTCGAGTC
TAGAGGGCCCGCGGTTCGAAGGTAAGCCTATCCCTAACCCTCTCCTCGGTCTCGATTCTACGCGTACCGGTC ATCATCACCATCACCATTGAGTTTAAACCCGCTGATCAGCCTCGACTGTGCCTTCTAGTTGCCAGCCATCTG TTGTTTGCCCCTCCCCCGTGCCTTCCTTGACCCTGGAAGGTGCCACTCCCACTGTCCTTTCCTAATAAAATG AGGAAATTGCATCGCATTGTCTGAGTAGGTGTCATTCTATTCTGGGGGGTGGGGTGGGGCAGGACAGCAAGG GGGAGGATTGGGAAGACAATAGCAGGCATGCTGGGGATGCGGTGGGCTCTATGGCTTCTGAGGCGGAAAGAA CCAGCTGGGGCTCTAGGGGGTATCCCCACGCGCCCTGTAGCGGCGCATTAAGCGCGGCGGGTGTGGTGGTTA CGCGCAGCGTGACCGCTACACTTGCCAGCGCCCTAGCGCCCGCTCCTTTCGCTTTCTTCCCTTCCTTTCTCG CCACGTTCGCCGGCTTTCCCCGTCAAGCTCTAAATCGGGGCATCCCTTTAGGGTTCCGATTTAGTGCTTTAC GGCACCTCGACCCCAAAAAACTTGATTAGGGTGATGGTTCACGTAGTGGGCCATCGCCCTGATAGACGGTTT TTCGCCCTTTGACGTTGGAGTCCACGTTCTTTAATAGTGGACTCTTGTTCCAAACTGGAACAACACTCAACC CTATCTCGGTCTATTCTTTTGATTTATAAGGGATTTTGGGGATTTCGGCCTATTGGTTAAAAAATGAGCTGA TTTAACAAAAATTTAACGCGAATTAATTCTGTGGAATGTGTGTCAGTTAGGGTGTGGAAAGTCCCCAGGCTC CCCAGGCAGGCAGAAGTATGCAAAGCATGCATCTCAATTAGTCAGCAACCAGGTGTGGAAAGTCCCCAGGCT CCCCAGCAGGCAGAAGTATGCAAAGCATGCATCTCAATTAGTCAGCAACCATAGTCCCGCCCCTAACTCCGC CCATCCCGCCCCTAACTCCGCCCAGTTCCGCCCATTCTCCGCCCCATGGCTGACTAATTTTTTTTATTTATG CAGAGGCCGAGGCCGCCTCTGCCTCTGAGCTATTCCAGAAGTAGTGAGGAGGCTTTTTTGGAGGCCTAGGCT TTTGCAAAAAGCTCCCGGGAGCTTGTATATCCATTTTCGGATCTGATCAAGAGACAGGATGAGGATCGTTTC GCATGATTGAACAAGATGGATTGCACGCAGGTTCTCCGGCCGCTTGGGTGGAGAGGCTATTCGGCTATGACT GGGCACAACAGACAATCGGCTGCTCTGATGCCGCCGTGTTCCGGCTGTCAGCGCAGGGGCGCCCGGTTCTTT TTGTCAAGACCGACCTGTCCGGTGCCCTGAATGAACTGCAGGACGAGGCAGCGCGGCTATCGTGGCTGGCCA CGACGGGCGTTCCTTGCGCAGCTGTGCTCGACGTTGTCACTGAAGCGGGAAGGGACTGGCTGCTATTGGGCG AAGTGCCGGGGCAGGATCTCCTGTCATCTCACCTTGCTCCTGCCGAGAAAGTATCCATCATGGCTGATGCAA TGCGGCGGCTGCATACGCTTGATCCGGCTACCTGCCCATTCGACCACCAAGCGAAACATCGCATCGAGCGAG CACGTACTCGGATGGAAGCCGGTCTTGTCGATCAGGATGATCTGGACGAAGAGCATCAGGGGCTCGCGCCAG CCGAACTGTTCGCCAGGCTCAAGGCGCGCATGCCCGACGGCGAGGATCTCGTCGTGACCCATGGCGATGCCT GCTTGCCGAATATCATGGTGGAAAATGGCCGCTTTTCTGGATTCATCGACTGTGGCCGGCTGGGTGTGGCGG ACCGCTATCAGGACATAGCGTTGGCTACCCGTGATATTGCTGAAGAGCTTGGCGGCGAATGGGCTGACCGCT 
TCCTCGTGCTTTACGGTATCGCCGCTCCCGATTCGCAGCGCATCGCCTTCTATCGCCTTCTTGACGAGTTCT TCTGAGCGGGACTCTGGGGTTCGCGAAATGACCGACCAAGCGACGCCCAACCTGCCATCACGAGATTTCGAT TCCACCGCCGCCTTCTATGAAAGGTTGGGCTTCGGAATCGTTTTCCGGGACGCCGGCTGGATGATCCTCCAG CGCGGGGATCTCATGCTGGAGTTCTTCGCCCACCCCAACTTGTTTATTGCAGCTTATAATGGTTACAAATAA AGCAATAGCATCACAAATTTCACAAATAAAGCATTTTTTTCACTGCATTCTAGTTGTGGTTTGTCCAAACTC ATCAATGTATCTTATCATGTCTGTATACCGTCGACCTCTAGCTAGAGCTTGGCGTAATCATGGTCATAGCTG TTTCCTGTGTGAAATTGTTATCCGCTCACAATTCCACACAACATACGAGCCGGAAGCATAAAGTGTAAAGCC TGGGGTGCCTAATGAGTGAGCTAACTCACATTAATTGCGTTGCGCTCACTGCCCGCTTTCCAGTCGGGAAAC CTGTCGTGCCAGCTGCATTAATGAATCGGCCAACGCGCGGGGAGAGGCGGTTTGCGTATTGGGCGCTCTTCC GCTTCCTCGCTCACTGACTCGCTGCGCTCGGTCGTTCGGCTGCGGCGAGCGGTATCAGCTCACTCAAAGGCG GTAATACGGTTATCCACAGAATCAGGGGATAACGCAGGAAAGAACATGTGAGCAAAAGGCCAGCAAAAGGCC AGGAACCGTAAAAAGGCCGCGTTGCTGGCGTTTTTCCATAGGCTCCGCCCCCCTGACGAGCATCACAAAAAT CGACGCTCAAGTCAGAGGTGGCGAAACCCGACAGGACTATAAAGATACCAGGCGTTTCCCCCTGGAAGCTCC CTCGTGCGCTCTCCTGTTCCGACCCTGCCGCTTACCGGATACCTGTCCGCCTTTCTCCCTTCGGGAAGCGTG GCGCTTTCTCAATGCTCACGCTGTAGGTATCTCAGTTCGGTGTAGGTCGTTCGCTCCAAGCTGGGCTGTGTG CACGAACCCCCCGTTCAGCCCGACCGCTGCGCCTTATCCGGTAACTATCGTCTTGAGTCCAACCCGGTAAGA CACGACTTATCGCCACTGGCAGCAGCCACTGGTAACAGGATTAGCAGAGCGAGGTATGTAGGCGGTGCTACA GAGTTCTTGAAGTGGTGGCCTAACTACGGCTACACTAGAAGGACAGTATTTGGTATCTGCGCTCTGCTGAAG CCAGTTACCTTCGGAAAAAGAGTTGGTAGCTCTTGATCCGGCAAACAAACCACCGCTGGTAGCGGTGGTTTT TTTGTTTGCAAGCAGCAGATTACGCGCAGAAAAAAAGGATCTCAAGAAGATCCTTTGATCTTTTCTACGGGG TCTGACGCTCAGTGGAACGAAAACTCACGTTAAGGGATTTTGGTCATGAGATTATCAAAAAGGATCTTCACC TAGATCCTTTTAAATTAAAAATGAAGTTTTAAATCAATCTAAAGTATATATGAGTAAACTTGGTCTGACAGT TACCAATGCTTAATCAGTGAGGCACCTATCTCAGCGATCTGTCTATTTCGTTCATCCATAGTTGCCTGACTC CCCGTCGTGTAGATAACTACGATACGGGAGGGCTTACCATCTGGCCCCAGTGCTGCAATGATACCGCGAGAC CCACGCTCACCGGCTCCAGATTTATCAGCAATAAACCAGCCAGCCGGAAGGGCCGAGCGCAGAAGTGGTCCT GCAACTTTATCCGCCTCCATCCAGTCTATTAATTGTTGCCGGGAAGCTAGAGTAAGTAGTTCGCCAGTTAAT AGTTTGCGCAACGTTGTTGCCATTGCTACAGGCATCGTGGTGTCACGCTCGTCGTTTGGTATGGCTTCATTC 
AGCTCCGGTTCCCAACGATCAAGGCGAGTTACATGATCCCCCATGTTGTGCAAAAAAGCGGTTAGCTCCTTC GGTCCTCCGATCGTTGTCAGAAGTAAGTTGGCCGCAGTGTTATCACTCATGGTTATGGCAGCACTGCATAAT TCTCTTACTGTCATGCCATCCGTAAGATGCTTTTCTGTGACTGGTGAGTACTCAACCAAGTCATTCTGAGAA TAGTGTATGCGGCGACCGAGTTGCTCTTGCCCGGCGTCAATACGGGATAATACCGCGCCACATAGCAGAACT
TTAAAAGTGCTCATCATTGGAAAACGTTCTTCGGGGCGAAAACTCTCAAGGATCTTACCGCTGTTGAGATCC AGTTCGATGTAACCCACTCGTGCACCCAACTGATCTTCAGCATCTTTTACTTTCACCAGCGTTTCTGGGTGA GCAAAAACAGGAAGGCAAAATGCCGCAAAAAAGGGAATAAGGGCGACACGGAAATGTTGAATACTCATACTC TTCCTTTTTCAATATTATTGAAGCATTTATCAGGGTTATTGTCTCATGAGCGGATACATATTTGAATGTATT TAGAAAAATAAACAAATAGGGGTTCCGCGCACATTTCCCCGAAAAGTGCCACCTGACGTCGACGGATCGGGa gatctGTTTAAACgaattcATCTCAGTTCGGTGTAGGTCGTTCGCTCCAAGCTGGGCTGTGTGCAAGCT
Figure XX: Nucleotide Sequence of pRTrapSin-TK/BSD 3'5' as shown in Figure 8
gaattcataccagatcaccgaaaactgtcctccaaatgtgtccccctcacactcccaaattcgcgggcttct gcctcttagaccactctaccctattccccacactcaccggagccaaagccgcggcccttccgtttctttgct tttgaaagaccccacccgtaggtggcaagctagcttaagtaacgccactttgcaaggcatggaaaaatacat aactgagaatagaaaagttcagatcaaggtcaggaacaaagaaacagctgaataccaaacaggatatctgtg gtaagcggttcctgccccggctcagggccaagaacagatgagacagctgagtgatgggccaaacaggatatc tgtggtaagcagttcctgccccggctcggggccaagaacagatggtccccagatgcggtccagccctcagca gtttctagtgaatcatcagatgtttccagggtgccccaaggacctgaaaatgaccctgtaccttatttgaac taaccaatcagttcgcttctcgcttctgttcgcgcgcttccgctctccgagctcaataaaagagcccacaac ccctcactcggcgcgccagtcttccgatagactgcgtcgcccgggtacccgtattcccaataaagcctcttg ctgtttgcatccgaatcgtggtctcgctgttccttgggagggtctcctctgagtgattgactacccacgacg ggggtctttcatttgggggctcgtccgggatttggagacccctgcccagggaccaccgacccaccaccggga ggtaagctggccagcaacttatctgtgtctgtccgattgtctagtgtctatgtttgatgttatgcgcctgcg tctgtactagttagctaactagctctgtatctggcggacccgtggtggaactgacgagttctgaacacccgg ccgcaaccctgggagacgtcccagggactttgggggccgtttttgtggcccgacctgaggaagggagtcgat gtggaatccgaccccgtcaggatatgtggttctggtaggagacgagaacctaaaacagttcccgcctccgtc tgaatttttgctttcggtttggaaccgaagccgcgcgtcttgtctgctgcagcgctgcagcatcgttctgtg ttgtctctgtctgactgtgtttctgtatttgtctgaaaattagggccagactgttaccactcccttaagttt gaccttaggtcactggaaagatgtcgagcggatcgctcacaaccagtcggtagatgtcaagaagagacgttg ggttaccttctgctctgcagaatggccaacctttaacgtcggatggccgcgagacggcacctttaaccgaga cctcatcacccaggttaagatcaaggtcttttcacctggcccgcatggacacccagaccaggtcccctacat cgtgacctgggaagccttggcttttgacccccctccctgggtcaagccctttgtacaccctaagcctccgcc tcctcttcctccatccgccccgtctctcccccttgaacctcctcgttcgaccccgcctcgatcctcccttta tccagccctcactccttctctaggcgccggaattccgatctgatcaagagacaggatgaaaacTTAGCCCTC CCACACATAACCAGAGGGCAGCAATTCACGAATCCCAACTGCCGTCGGCTGTCCATCACTGTCCTTCACTAT GGCTTTGATCCCAGGATGCAGATCGAGAAGCACCTGTCGGCACCGTCCGCAGGGGCTCAAGATGCCCCTGTT CTCATTTCCGATCGCGACGATACAAGTCAGGTTGCCAGCTGCCGCAGCAGCAGCAGTGCCCAGCACCACGAG 
TTCTGCACAAGGTCCCCCAGTAAAATGATATACATTGACACCAGTGAAGATGCGGCCGTCGCTAGAGAGAGC TGCGCTGGCGACGCTGTAGTCTTCAGAGATGGGGATGCTGTTGATTGTAGCCGTTGCTCTTTCAATGAGGGT GGATTCTTCTTGAGACAAAGGacTagtGTTAGCCTCCCCCATCTCCCGGGCAAACGTGCGCGCCAGGTCGCA GATCGTCGGTATGGAGCCTGGGGTGGTGACGTGGGTCTGGACCATCCCGGAGGTAAGTTGCAGCAGGGCGTC CCGGCAGCCGGCGGGCGATTGGTCGTAATCCAGGATAAAGACATGCATGGGACGGAGGCGTTTGGCCAAGAC GTCCAAAGCCCAGGCAAACACGTTATACAGGTCGCCGTTGGGGGCCAGCAACTCGGGGGCCCGAAACAGGGT AAATAACGTGTCCCCGATATGGGGTCGTGGGCCCGCGTTGCTCTGGGGCTCGGCACCCTGGGGCGGCACGGC CGCCCCCGAAAGCTGTCCCCAATCCTCCCGCCACGACCCGCCGCCCTGCAGATACCGCACCGTATTGGCAAG CAGCCCATAAACGCGGCGAATCGCGGCCAGCATAGCCAGGTCAAGCCGCTCGCCGGGGCGCTGGCGTTTGGC CAGGCGGTCGATGTGTCTGTCCTCCGGAAGGGCCCCCAACACGATGTTTGTGCCGGGCAAGGTCGGCGGGAT GAGGGCCACGAACGCCAGCACGGCCTGGGGGGTCATGCTGCCCATAAGGTATCGCGCGGCCGGGTAGCACAG GAGGGCGGCGATGGGATGGCGGTCGAAGATGAGGGTGAGGGCCGGGGGCGGGGCATGTGAGCTCCCAGCCTC CCCCCCGATATGAGGAGCCAGAACGGCGTCGGTCACGGCATAAGGCATGCCCATTGTTATCTGGGCGCTTGT CATTACCACCGCCGCGTCCCCGGCCGATATCTCACCCTGGTCGAGGCGGTGTTGTGTGGTGTAGATGTTCGC GATTGTCTCGGAAGCCCCCAACACCCGCCAGTAAGTCATCGGCTCGGGTACGTAGACGATATCGTCGCGCGA ACCCAGGGCCACCAGCAGTTGCGTGGTGGTGGTTTTCCCCATCCCGTGGGGACCGTCTATATAAACCCGCAG TAGCGTGGGCATTTTCTGCTCCAGGCGGACTTCCGTGGCTTTTTGTTGCCGGCGAGGGCGCAACGCCGTACG TCGGTTGTTATGGCCGCGAGAACGCGCAGCCTGGTCGAACGCAGACGCGTGTTGATGGCAGGGGTACGAAGC CATATTATCATCGTGTTTTTCAAAGGAAAACCACGTCCCCGTGGTTCGGGGGGCCTAGAGCTTTTTTAACCT CGACTAAACACATGTAAAGCATGTGCACCGAGGCCCCAGATCAGATCCCATACAATGGGGTACCTTCTGGGC ATCCTTCAGCCCCTTGTTGAATACGCTTGAGGAGAGCCATTTGACTCTTTCCACAACTATCCAACTCACAAC GTGGCACTGGGGTTGTGCCGCCTTTGCAGGTGTATCTTATACACGTGGCTTTTGGCCGCAGAGGCACCTGTC GCCAGGTGGGGGGTTCCGCTGCCTGCAAAGGGTCGCTACAGACGTTGTTTGTCTTCAAGAAGCTTCCAGAGG AACTGCTTCCTTCACGACATTCAACAGACCTTGCATTCCTTTGGCGAGAGGGGAAAGACCCCTAGGAATGCT CGTCAAGAAGACAGGGCCAGGTTTCCGGGCCCTCACATTGCCAAAAGACGGCAATATGGTGGAAAATCACAT ATAGACAAACGCACACCGGCCTTATTCCAAGCGGCTTCGGCCAGTAACGTTAGGGGGGGGGGAGGGAGAGGG GCGctcgagTCATTGTTTGCCTCCCTGCTGCGGTTTTTCACCGAAGTTCATGCCAGTCCAGCGTTTTTGCAG 
CAGAAAAGCCGCCGACTTCGGTTTGCGGTCGCGAGTGAAGATCCCTTTCTTGTTACCGCCAACGCGCAATAT GCCTTGCGAGGTCGCAAAATCGGCGAAATTCCATACCTGTTCACCGACGACGGCGCTGACGCGATCAAAGAC GCGGTGATACATATCCAGCCATGCACACTGATACTCTTCACTCCACATGTCGGTGTACATTGAGTGCAGCCC
GGCTAACGTATCCACGCCGTATTCGGTGATGATAATCGGCTGATGCAGTTTCTCCTGCCAGGCCAGAAGTTC TTTTTCCAGTACCTTCTCTGCCGTTTCCAAATCGCCGCTTTGGACATACCATCCGTAATAACGGTTCAGGCA CAGCACATCAAAGAGATCGCTGATGGTATCGGTGTGAGCGTCGCAGAACATTACATTGACGCAGGTGATCGG ACGCGTCGGGTCGAGTTTACGCGTTGCTTCCGCCAGTGGCGCGAAATATTCCCGTGCACCTTGCGGACGGGT ATCCGGTTCGTTGGCAATACTCCACATCACCACGCTTGGGTGGTTTTTGTCACGCGCTATCAGCTCTTTAAT CGCCTGTAAGTGCGCTTGCTGAGTTTCCCCGTTGACTGCCTCTTCGCTGTACAGTTCTTTCGGCTTGTTGCC CGCTTCGAAACCAATGCCTAAAGAGAGCGAAAAGCCGACAGCAGCAGTTTCATCAATCACCACGATGCCATG TTCATCTGCCCAGTCGAGCATCTCTTCAGCGTAAGGGTAATGCGAGGTACGGTAGGAGTTGGCCCCAATCCA GTCCATTAATGCGTGGTCGTGCACCATCAGCACGTTATCGAATCCTTTGCCACGTAAGTCCGCATCTTCATG ACGACCAAAGCCAGTAAAGTAGAACGGTTTGTGGTTAATCAGGAACTGTTGGCCCTTCACTGCCACTGACCG GATGCCGACGCGAAGCGGGTAGATATCACACTCTGTCTGGCTTTTGGCTGTGACGCACAGTTCATAGAGATA ACCTTCACCCGGTTGCCAGAGGTGCGGATTCACCACTTGCAAAGTCCCGCTAGTGCCTTGTCCAGTTGCAAC CACCTGTTGATCCGCATCACGCAGTTCAACGCTGACATCACCATTGGCCACCACCTGCCAGTCAACAGACGC GTGGTTACAGTCTTGCGCGACATGCGTCACCACGGTGATATCGTCCACCCAGGTGTTCGGCGTGGTGTAGAG CATTACGCTGCGATGGATTCCGGCATAGTTAAAGAAATCATGGAAGTAAGACTGCTTTTTCTTGCCGTTTTC GTCGGTAATCACCATTCCCGGCGGGATAGTCTGCCAGTTCAGTTCGTTGTTCACACAAACGGTGATACGTAC ACTTTTCCCGGCAATAACATACGGCGTGACATCGGCTTCAAATGGCGTATAGCCGCCCTGATGCTCCATCAC TTCCTGATTATTGACCCACACTTTGCCGTAATGAGTGACCGCATCGAAACGCAGCACGATACGCTGGCCTGC CCAACCTTTCGGTATAAAGACTTCGCGCTGATACCAGACGTTGCCCGCATAATTACGAATATCTGCATCGGC GAACTGATCGTTAAAACTGCCTGGCACAGCAATTGCCCGGCTTTCTTGTAACGCGCTTTCCCACCAACGCTG ATCAATTCCACAGTTTTCGCGATCCAGACTGAATGCCCACAGGCCGTCGAGTTTTTTGATTTCACGGGTTGG GGTTTCTACAGGACGGACCATcgtggcggcAAGGGCAATTCCACCACACTGGACTAGTGGATCCGAGCTCGG TACCAAGCTTGCACACAGCCCAGCTTGGAGCGAACGACCTACACCGAACTGAGATgaattcGTTTaacatcg ataaaataaaagattttatttagtctccagaaaaaggggggaatgaaagaccccacctgtaggtttggcaag ctagagaaccatcagatgtttccagggtgccccaaggacctgaaatgaccctgtgccttatttgaactaacc aatcagttcgcttctcgcttctgttcgcgcgcttctgctccccgagctcaataaaagagcccacaacccctc actcggggcgccagtcctccgattgactgagtcgcccgggtacccgtgtatccaataaaccctcttgcagtt 
gcatccgacttgtggtctcgctgttccttgggagggtctcctctgagtgattgactacccgtcagcgggggt ctttcatttgggggctcgtccgggatcgggagacccctgcccagggaccaccgacccaccaccgggaggtaa gctggctgcctcgcgcgtttcggtgatgacggtgaaaacctctgacacatgcagctcccggagacggtcaca gcttgtctgtaagcggatgccgggagcagacaagcccgtcagggcgcgtcagcgggtgttggcgggtgtcgg ggcgcagccatgacccagtcacgtagcgatagcggagtgtatactggcttaactatgcggcatcagagcaga ttgtactgagagtgcaccatatgcggtgtgaaataccgcacagatgcgtaaggagaaaataccgcatcaggc gctcttccgcttcctcgctcactgactcgctgcgctcggtcgttcggctgcggcgagcggtatcagctcact caaaggcggtaatacggttatccacagaatcaggggataacgcaggaaagaacatgtgagcaaaaggccagc aaaaggccaggaaccgtaaaaaggccgcgttgctggcgtttttccataggctccgcccccctgacgagcatc acaaaaatcgacgctcaagtcagaggtggcgaaacccgacaggactataaagataccaggcgtttccccctg gaagctccctcgtgcgctctcctgttccgaccctgccgcttaccggatacctgtccgcctttctcccttcgg gaagcgtggcgctttctcatagctcacgctgtaggtatctcagttcggtgtaggtcgttcgctccaagctgg gctgtgtgcacgaaccccccgttcagcccgaccgctgcgccttatccggtaactatcgtcttgagtccaacc cggtaagacacgacttatcgccactggcagcagccactggtaacaggattagcagagcgaggtatgtaggcg gtgctacagagttcttgaagtggtggcctaactacggctacactagaaggacagtatttggtatctgcgctc tgctgaagccagttaccttcggaaaaagagttggtagctcttgatccggcaaacaaaccaccgctggtagcg gtggtttttttgtttgcaagcagcagattacgcgcagaaaaaaaggatctcaagaagatcctttgatctttt ctacggggtctgacgctcagtggaacgaaaactcacgttaagggattttggtcatgagattatcaaaaagga tcttcacctagatccttttaaattaaaaatgaagttttaaatcaatctaaagtatatatgagtaaacttggt ctgacagttaccaatgcttaatcagtgaggcacctatctcagcgatctgtctatttcgttcatccatagttg cctgactccccgtcgtgtagataactacgatacgggagggcttaccatctggccccagtgctgcaatgatac cgcgagacccacgctcaccggctccagatttatcagcaataaaccagccagccggaagggccgagcgcagaa gtggtcctgcaactttatccgcctccatccagtctattaattgttgccgggaagctagagtaagtagttcgc cagttaatagtttgcgcaacgttgttgccattgctgcaggcatcgtggtgtcacgctcgtcgtttggtatgg cttcattcagctccggttcccaacgatcaaggcgagttacatgatcccccatgttgtgcaaaaaagcggtta gctccttcggtcctccgatcgttgtcagaagtaagttggccgcagtgttatcactcatggttatggcagcac tgcataattctcttactgtcatgccatccgtaagatgcttttctgtgactggtgagtactcaaccaagtcat 
tctgagaatagtgtatgcggcgaccgagttgctcttgcccggcgtcaacacgggataataccgcgccacata gcagaactttaaaagtgctcatcattggaaaacgttcttcggggcgaaaactctcaaggatcttaccgctgt tgagatccagttcgatgtaacccactcgtgcacccaactgatcttcagcatcttttactttcaccagcgttt ctgggtgagcaaaaacaggaaggcaaaatgccgcaaaaaagggaataagggcgacacggaaatgttgaatac
tcatactcttcctttttcaatattattgaagcatttatcagggttattgtctcatgagcggatacatatttg aatgtatttagaaaaataaacaaataggggttccgcgcacatttccccgaaaagtgccacctgacgtctaag aaaccattattatcatgacattaacctataaaaataggcgtatcacgaggccctttcgtcttcaa
Figure XX: Nucleotide Sequence of pRTrap-TK/BSD 3'5' as shown in Figure 9
gaattcataccagatcaccgaaaactgtcctccaaatgtgtccccctcacactcccaaattcgcgggcttct gcctcttagaccactctaccctattccccacactcaccggagccaaagccgcggcccttccgtttctttgct tttgaaagaccccacccgtaggtggcaagctagcttaagtaacgccactttgcaaggcatggaaaaatacat aactgagaatagaaaagttcagatcaaggtcaggaacaaagaaacagctgaataccaaacaggatatctgtg gtaagcggttcctgccccggctcagggccaagaacagatgagacagctgagtgatgggccaaacaggatatc tgtggtaagcagttcctgccccggctcggggccaagaacagatggtccccagatgcggtccagccctcagca gtttctagtgaatcatcagatgtttccagggtgccccaaggacctgaaaatgaccctgtaccttatttgaac taaccaatcagttcgcttctcgcttctgttcgcgcgcttccgctctccgagctcaataaaagagcccacaac ccctcactcggcgcgccagtcttccgatagactgcgtcgcccgggtacccgtattcccaataaagcctcttg ctgtttgcatccgaatcgtggtctcgctgttccttgggagggtctcctctgagtgattgactacccacgacg ggggtctttcatttgggggctcgtccgggatttggagacccctgcccagggaccaccgacccaccaccggga ggtaagctggccagcaacttatctgtgtctgtccgattgtctagtgtctatgtttgatgttatgcgcctgcg tctgtactagttagctaactagctctgtatctggcggacccgtggtggaactgacgagttctgaacacccgg ccgcaaccctgggagacgtcccagggactttgggggccgtttttgtggcccgacctgaggaagggagtcgat gtggaatccgaccccgtcaggatatgtggttctggtaggagacgagaacctaaaacagttcccgcctccgtc tgaatttttgctttcggtttggaaccgaagccgcgcgtcttgtctgctgcagcgctgcagcatcgttctgtg ttgtctctgtctgactgtgtttctgtatttgtctgaaaattagggccagactgttaccactcccttaagttt gaccttaggtcactggaaagatgtcgagcggatcgctcacaaccagtcggtagatgtcaagaagagacgttg ggttaccttctgctctgcagaatggccaacctttaacgtcggatggccgcgagacggcacctttaaccgaga cctcatcacccaggttaagatcaaggtcttttcacctggcccgcatggacacccagaccaggtcccctacat cgtgacctgggaagccttggcttttgacccccctccctgggtcaagccctttgtacaccctaagcctccgcc tcctcttcctccatccgccccgtctctcccccttgaacctcctcgttcgaccccgcctcgatcctcccttta tccagccctcactccttctctaggcgccggaattccgatctgatcaagagacaggatgaaaacTTAGCCCTC CCACACATAACCAGAGGGCAGCAATTCACGAATCCCAACTGCCGTCGGCTGTCCATCACTGTCCTTCACTAT GGCTTTGATCCCAGGATGCAGATCGAGAAGCACCTGTCGGCACCGTCCGCAGGGGCTCAAGATGCCCCTGTT CTCATTTCCGATCGCGACGATACAAGTCAGGTTGCCAGCTGCCGCAGCAGCAGCAGTGCCCAGCACCACGAG 
TTCTGCACAAGGTCCCCCAGTAAAATGATATACATTGACACCAGTGAAGATGCGGCCGTCGCTAGAGAGAGC TGCGCTGGCGACGCTGTAGTCTTCAGAGATGGGGATGCTGTTGATTGTAGCCGTTGCTCTTTCAATGAGGGT GGATTCTTCTTGAGACAAAGGacTagtGTTAGCCTCCCCCATCTCCCGGGCAAACGTGCGCGCCAGGTCGCA GATCGTCGGTATGGAGCCTGGGGTGGTGACGTGGGTCTGGACCATCCCGGAGGTAAGTTGCAGCAGGGCGTC CCGGCAGCCGGCGGGCGATTGGTCGTAATCCAGGATAAAGACATGCATGGGACGGAGGCGTTTGGCCAAGAC GTCCAAAGCCCAGGCAAACACGTTATACAGGTCGCCGTTGGGGGCCAGCAACTCGGGGGCCCGAAACAGGGT AAATAACGTGTCCCCGATATGGGGTCGTGGGCCCGCGTTGCTCTGGGGCTCGGCACCCTGGGGCGGCACGGC CGCCCCCGAAAGCTGTCCCCAATCCTCCCGCCACGACCCGCCGCCCTGCAGATACCGCACCGTATTGGCAAG CAGCCCATAAACGCGGCGAATCGCGGCCAGCATAGCCAGGTCAAGCCGCTCGCCGGGGCGCTGGCGTTTGGC CAGGCGGTCGATGTGTCTGTCCTCCGGAAGGGCCCCCAACACGATGTTTGTGCCGGGCAAGGTCGGCGGGAT GAGGGCCACGAACGCCAGCACGGCCTGGGGGGTCATGCTGCCCATAAGGTATCGCGCGGCCGGGTAGCACAG GAGGGCGGCGATGGGATGGCGGTCGAAGATGAGGGTGAGGGCCGGGGGCGGGGCATGTGAGCTCCCAGCCTC CCCCCCGATATGAGGAGCCAGAACGGCGTCGGTCACGGCATAAGGCATGCCCATTGTTATCTGGGCGCTTGT CATTACCACCGCCGCGTCCCCGGCCGATATCTCACCCTGGTCGAGGCGGTGTTGTGTGGTGTAGATGTTCGC GATTGTCTCGGAAGCCCCCAACACCCGCCAGTAAGTCATCGGCTCGGGTACGTAGACGATATCGTCGCGCGA ACCCAGGGCCACCAGCAGTTGCGTGGTGGTGGTTTTCCCCATCCCGTGGGGACCGTCTATATAAACCCGCAG TAGCGTGGGCATTTTCTGCTCCAGGCGGACTTCCGTGGCTTTTTGTTGCCGGCGAGGGCGCAACGCCGTACG TCGGTTGTTATGGCCGCGAGAACGCGCAGCCTGGTCGAACGCAGACGCGTGTTGATGGCAGGGGTACGAAGC CATATTATCATCGTGTTTTTCAAAGGAAAACCACGTCCCCGTGGTTCGGGGGGCCTAGAGCTTTTTTAACCT CGACTAAACACATGTAAAGCATGTGCACCGAGGCCCCAGATCAGATCCCATACAATGGGGTACCTTCTGGGC ATCCTTCAGCCCCTTGTTGAATACGCTTGAGGAGAGCCATTTGACTCTTTCCACAACTATCCAACTCACAAC GTGGCACTGGGGTTGTGCCGCCTTTGCAGGTGTATCTTATACACGTGGCTTTTGGCCGCAGAGGCACCTGTC GCCAGGTGGGGGGTTCCGCTGCCTGCAAAGGGTCGCTACAGACGTTGTTTGTCTTCAAGAAGCTTCCAGAGG AACTGCTTCCTTCACGACATTCAACAGACCTTGCATTCCTTTGGCGAGAGGGGAAAGACCCCTAGGAATGCT CGTCAAGAAGACAGGGCCAGGTTTCCGGGCCCTCACATTGCCAAAAGACGGCAATATGGTGGAAAATCACAT ATAGACAAACGCACACCGGCCTTATTCCAAGCGGCTTCGGCCAGTAACGTTAGGGGGGGGGGAGGGAGAGGG GCGctcgagTCATTGTTTGCCTCCCTGCTGCGGTTTTTCACCGAAGTTCATGCCAGTCCAGCGTTTTTGCAG 
CAGAAAAGCCGCCGACTTCGGTTTGCGGTCGCGAGTGAAGATCCCTTTCTTGTTACCGCCAACGCGCAATAT GCCTTGCGAGGTCGCAAAATCGGCGAAATTCCATACCTGTTCACCGACGACGGCGCTGACGCGATCAAAGAC GCGGTGATACATATCCAGCCATGCACACTGATACTCTTCACTCCACATGTCGGTGTACATTGAGTGCAGCCC
GGCTAACGTATCCACGCCGTATTCGGTGATGATAATCGGCTGATGCAGTTTCTCCTGCCAGGCCAGAAGTTC
TTTTTCCAGTACCTTCTCTGCCGTTTCCAAATCGCCGCTTTGGACATACCATCCGTAATAACGGTTCAGGCA
CAGCACATCAAAGAGATCGCTGATGGTATCGGTGTGAGCGTCGCAGAACATTACATTGACGCAGGTGATCGG
ACGCGTCGGGTCGAGTTTACGCGTTGCTTCCGCCAGTGGCGCGAAATATTCCCGTGCACCTTGCGGACGGGT ATCCGGTTCGTTGGCAATACTCCACATCACCACGCTTGGGTGGTTTTTGTCACGCGCTATCAGCTCTTTAAT
CGCCTGTAAGTGCGCTTGCTGAGTTTCCCCGTTGACTGCCTCTTCGCTGTACAGTTCTTTCGGCTTGTTGCC
CGCTTCGAAACCAATGCCTAAAGAGAGCGAAAAGCCGACAGCAGCAGTTTCATCAATCACCACGATGCCATG
TTCATCTGCCCAGTCGAGCATCTCTTCAGCGTAAGGGTAATGCGAGGTACGGTAGGAGTTGGCCCCAATCCA
GTCCATTAATGCGTGGTCGTGCACCATCAGCACGTTATCGAATCCTTTGCCACGTAAGTCCGCATCTTCATG ACGACCAAAGCCAGTAAAGTAGAACGGTTTGTGGTTAATCAGGAACTGTTGGCCCTTCACTGCCACTGACCG
GATGCCGACGCGAAGCGGGTAGATATCACACTCTGTCTGGCTTTTGGCTGTGACGCACAGTTCATAGAGATA
ACCTTCACCCGGTTGCCAGAGGTGCGGATTCACCACTTGCAAAGTCCCGCTAGTGCCTTGTCCAGTTGCAAC
CACCTGTTGATCCGCATCACGCAGTTCAACGCTGACATCACCATTGGCCACCACCTGCCAGTCAACAGACGC
GTGGTTACAGTCTTGCGCGACATGCGTCACCACGGTGATATCGTCCACCCAGGTGTTCGGCGTGGTGTAGAG CATTACGCTGCGATGGATTCCGGCATAGTTAAAGAAATCATGGAAGTAAGACTGCTTTTTCTTGCCGTTTTC
GTCGGTAATCACCATTCCCGGCGGGATAGTCTGCCAGTTCAGTTCGTTGTTCACACAAACGGTGATACGTAC
ACTTTTCCCGGCAATAACATACGGCGTGACATCGGCTTCAAATGGCGTATAGCCGCCCTGATGCTCCATCAC
TTCCTGATTATTGACCCACACTTTGCCGTAATGAGTGACCGCATCGAAACGCAGCACGATACGCTGGCCTGC
CCAACCTTTCGGTATAAAGACTTCGCGCTGATACCAGACGTTGCCCGCATAATTACGAATATCTGCATCGGC GAACTGATCGTTAAAACTGCCTGGCACAGCAATTGCCCGGCTTTCTTGTAACGCGCTTTCCCACCAACGCTG
ATCAATTCCACAGTTTTCGCGATCCAGACTGAATGCCCACAGGCCGTCGAGTTTTTTGATTTCACGGGTTGG
GGTTTCTACAGGACGGACCATcgtggcggcAAGGGCAATTCCACCACACTGGACTAGTGGATCCGAGCTCGG
TACCAAGCTTGCACACAGCCCAGCTTGGAGCGAACGACCTACACCGAACTGAGATgaattcGTTTaacatcg ataaaataaaagattttatttagtctccagaaaaaggggggaatgaaagaccccacctgtaggtttggcaag ctagcttaagtaacgccattttgcaaggcatggaaaaatacataactgagaatagagaagttcagatcaagg tcaggaacagatggaacagctgaatatgggccaaacaggatatctgtggtaagcagttcctgccccggctca gggccaagaacagatggaacagctgaatatgggccaaacaggatatctgtggtaagcagttcctgccccggc tcagggccaagaacagatggtccccagatgcggtccagccctcagcagtttctagagaaccatcagatgttt ccagggtgccccaaggacctgaaatgaccctgtgccttatttgaactaaccaatcagttcgcttctcgcttc tgttcgcgcgcttctgctccccgagctcaataaaagagcccacaacccctcactcggggcgccagtcctccg attgactgagtcgcccgggtacccgtgtatccaataaaccctcttgcagttgcatccgacttgtggtctcgc tgttccttgggagggtctcctctgagtgattgactacccgtcagcgggggtctttcatttgggggctcgtcc gggatcgggagacccctgcccagggaccaccgacccaccaccgggaggtaagctggctgcctcgcgcgtttc ggtgatgacggtgaaaacctctgacacatgcagctcccggagacggtcacagcttgtctgtaagcggatgcc gggagcagacaagcccgtcagggcgcgtcagcgggtgttggcgggtgtcggggcgcagccatgacccagtca cgtagcgatagcggagtgtatactggcttaactatgcggcatcagagcagattgtactgagagtgcaccata tgcggtgtgaaataccgcacagatgcgtaaggagaaaataccgcatcaggcgctcttccgcttcctcgctca ctgactcgctgcgctcggtcgttcggctgcggcgagcggtatcagctcactcaaaggcggtaatacggttat ccacagaatcaggggataacgcaggaaagaacatgtgagcaaaaggccagcaaaaggccaggaaccgtaaaa aggccgcgttgctggcgtttttccataggctccgcccccctgacgagcatcacaaaaatcgacgctcaagtc agaggtggcgaaacccgacaggactataaagataccaggcgtttccccctggaagctccctcgtgcgctctc ctgttccgaccctgccgcttaccggatacctgtccgcctttctcccttcgggaagcgtggcgctttctcata gctcacgctgtaggtatctcagttcggtgtaggtcgttcgctccaagctgggctgtgtgcacgaaccccccg ttcagcccgaccgctgcgccttatccggtaactatcgtcttgagtccaacccggtaagacacgacttatcgc cactggcagcagccactggtaacaggattagcagagcgaggtatgtaggcggtgctacagagttcttgaagt ggtggcctaactacggctacactagaaggacagtatttggtatctgcgctctgctgaagccagttaccttcg gaaaaagagttggtagctcttgatccggcaaacaaaccaccgctggtagcggtggtttttttgtttgcaagc agcagattacgcgcagaaaaaaaggatctcaagaagatcctttgatcttttctacggggtctgacgctcagt ggaacgaaaactcacgttaagggattttggtcatgagattatcaaaaaggatcttcacctagatccttttaa 
attaaaaatgaagttttaaatcaatctaaagtatatatgagtaaacttggtctgacagttaccaatgcttaa tcagtgaggcacctatctcagcgatctgtctatttcgttcatccatagttgcctgactccccgtcgtgtaga taactacgatacgggagggcttaccatctggccccagtgctgcaatgataccgcgagacccacgctcaccgg ctccagatttatcagcaataaaccagccagccggaagggccgagcgcagaagtggtcctgcaactttatccg cctccatccagtctattaattgttgccgggaagctagagtaagtagttcgccagttaatagtttgcgcaacg ttgttgccattgctgcaggcatcgtggtgtcacgctcgtcgtttggtatggcttcattcagctccggttccc aacgatcaaggcgagttacatgatcccccatgttgtgcaaaaaagcggttagctccttcggtcctccgatcg ttgtcagaagtaagttggccgcagtgttatcactcatggttatggcagcactgcataattctcttactgtca tgccatccgtaagatgcttttctgtgactggtgagtactcaaccaagtcattctgagaatagtgtatgcggc
gaccgagttgctcttgcccggcgtcaacacgggataataccgcgccacatagcagaactttaaaagtgctca tcattggaaaacgttcttcggggcgaaaactctcaaggatcttaccgctgttgagatccagttcgatgtaac ccactcgtgcacccaactgatcttcagcatcttttactttcaccagcgtttctgggtgagcaaaaacaggaa ggcaaaatgccgcaaaaaagggaataagggcgacacggaaatgttgaatactcatactcttcctttttcaat attattgaagcatttatcagggttattgtctcatgagcggatacatatttgaatgtatttagaaaaataaac aaataggggttccgcgcacatttccccgaaaagtgccacctgacgtctaagaaaccattattatcatgacat taacctataaaaataggcgtatcacgaggccctttcgtcttcaa
Figure XX: Nucleotide Sequence of pRTrapSin-Puro 3'5' as shown in Figure 10
GACGGATCGGGAGATCTCCCGATCCCCTATGGTCGACTCTCAGTACAATCTGCTCTGATGCCGCATAGTTAA GCCAGTATCTGCTCCCTGCTTGTGTGTTGGAGGTCGCTGAGTAGTGCGCGAGCAAAATTTAAGCTACAACAA GGCAAGGCTTGACCGACAATTGCATGAAGAATCTGCTTAGGGTTAGGCGTTTTGCGCTGCTTCGCGATGTAC GGGCCAGATATACGCGTTGACATTGATTATTGACTAGTTATTAATAGTAATCAATTACGGGGTCATTAGTTC ATAGCCCATATATGGAGTTCCGCGTTACATAACTTACGGTAAATGGCCCGCCTGGCTGACCGCCCAACGACC CCCGCCCATTGACGTCAATAATGACGTATGTTCCCATAGTAACGCCAATAGGGACTTTCCATTGACGTCAAT GGGTGGACTATTTACGGTAAACTGCCCACTTGGCAGTACATCAAGTGTATCATATGCCAAGTACGCCCCCTA TTGACGTCAATGACGGTAAATGGCCCGCCTGGCATTATGCCCAGTACATGACCTTATGGGACTTTCCTACTT GGCAGTACATCTACGTATTAGTCATCGCTATTACCATGGTGATGCGGTTTTGGCAGTACATCAATGGGCGTG GATAGCGGTTTGACTCACGGGGATTTCCAAGTCTCCACCCCATTGACGTCAATGGGAGTTTGTTTTGGCACC AAAATCAACGGGACTTTCCAAAATGTCGTAACAACTCCGCCCCATTGACGCAAATGGGCGGTAGGCGTGTAC GGTGGGAGGTCTATATAAGCAGAGCTCTCTGGCTAACTAGAGAACCCACTGCTTACTGGCTTATCGAAATTA ATACGACTCACTATAGGGAGACCCAAGCTTGGTACCGAGCTCGGATCGATAAACgaattcATCTCAGTTCGG TGTAGGTCGTTCGCTCCAAGCTGGGCTGTGTGCAAGCTTGGTACCGAGCTCGGATCCACTAGTCCAGTGTGG TGGAATTGCCCTTgccgccacgATGGTCCGTCCTGTAGAAACCCCAACCCGTGAAATCAAAAAACTCGACGG CCTGTGGGCATTCAGTCTGGATCGCGAAAACTGTGGAATTGATCAGCGTTGGTGGGAAAGCGCGTTACAAGA AAGCCGGGCAATTGCTGTGCCAGGCAGTTTTAACGATCAGTTCGCCGATGCAGATATTCGTAATTATGCGGG CAACGTCTGGTATCAGCGCGAAGTCTTTATACCGAAAGGTTGGGCAGGCCAGCGTATCGTGCTGCGTTTCGA TGCGGTCACTCATTACGGCAAAGTGTGGGTCAATAATCAGGAAGTGATGGAGCATCAGGGCGGCTATACGCC ATTTGAAGCCGATGTCACGCCGTATGTTATTGCCGGGAAAAGTGTACGTATCACCGTTTGTGTGAACAACGA ACTGAACTGGCAGACTATCCCGCCGGGAATGGTGATTACCGACGAAAACGGCAAGAAAAAGCAGTCTTACTT CCATGATTTCTTTAACTATGCCGGAATCCATCGCAGCGTAATGCTCTACACCACGCCGAACACCTGGGTGGA CGATATCACCGTGGTGACGCATGTCGCGCAAGACTGTAACCACGCGTCTGTTGACTGGCAGGTGGTGGCCAA TGGTGATGTCAGCGTTGAACTGCGTGATGCGGATCAACAGGTGGTTGCAACTGGACAAGGCACTAGCGGGAC TTTGCAAGTGGTGAATCCGCACCTCTGGCAACCGGGTGAAGGTTATCTCTATGAACTGTGCGTCACAGCCAA AAGCCAGACAGAGTGTGATATCTACCCGCTTCGCGTCGGCATCCGGTCAGTGGCAGTGAAGGGCCAACAGTT CCTGATTAACCACAAACCGTTCTACTTTACTGGCTTTGGTCGTCATGAAGATGCGGACTTACGTGGCAAAGG 
ATTCGATAACGTGCTGATGGTGCACGACCACGCATTAATGGACTGGATTGGGGCCAACTCCTACCGTACCTC GCATTACCCTTACGCTGAAGAGATGCTCGACTGGGCAGATGAACATGGCATCGTGGTGATTGATGAAACTGC TGCTGTCGGCTTTTCGCTCTCTTTAGGCATTGGTTTCGAAGCGGGCAACAAGCCGAAAGAACTGTACAGCGA AGAGGCAGTCAACGGGGAAACTCAGCAAGCGCACTTACAGGCGATTAAAGAGCTGATAGCGCGTGACAAAAA CCACCCAAGCGTGGTGATGTGGAGTATTGCCAACGAACCGGATACCCGTCCGCAAGGTGCACGGGAATATTT CGCGCCACTGGCGGAAGCAACGCGTAAACTCGACCCGACGCGTCCGATCACCTGCGTCAATGTAATGTTCTG CGACGCTCACACCGATACCATCAGCGATCTCTTTGATGTGCTGTGCCTGAACCGTTATTACGGATGGTATGT CCAAAGCGGCGATTTGGAAACGGCAGAGAAGGTACTGGAAAAAGAACTTCTGGCCTGGCAGGAGAAACTGCA TCAGCCGATTATCATCACCGAATACGGCGTGGATACGTTAGCCGGGCTGCACTCAATGTACACCGACATGTG GAGTGAAGAGTATCAGTGTGCATGGCTGGATATGTATCACCGCGTCTTTGATCGCGTCAGCGCCGTCGTCGG TGAACAGGTATGGAATTTCGCCGATTTTGCGACCTCGCAAGGCATATTGCGCGTTGGCGGTAACAAGAAAGG GATCTTCACTCGCGACCGCAAACCGAAGTCGGCGGCTTTTCTGCTGCAAAAACGCTGGACTGGCATGAACTT CGGTGAAAAACCGCAGCAGGGAGGCAAACAATGActcgaGAAGGGCAATTCTGCAGATATCCAGCACAGTGG CGGCCGCGTCGACGGAATTCAGTGGATCGGTCGAGCATGCATCTAGGGCGGCCAATTCCGCCCCTCTCCCTC CCCCCCCCCTAACGTTACTGGCCGAAGCCGCTTGGAATAAGGCCGGTGTGCGTTTGTCTATATGTGATTTTC CACCATATTGCCGTCTTTTGGCAATGTGAGGGCCCGGAAACCTGGCCCTGTCTTCTTGACGAGCATTCCTAG GGGTCTTTCCCCTCTCGCCAAAGGAATGCAAGGTCTGTTGAATGTCGTGAAGGAAGCAGTTCCTCTGGAAGC TTCTTGAAGACAAACAACGTCTGTAGCGACCCTTTGCAGGCAGCGGAACCCCCCACCTGGCGACAGGTGCCT CTGCGGCCAAAAGCCACGTGTATAAGATACACCTGCAAAGGCGGCACAACCCCAGTGCCACGTTGTGAGTTG GATAGTTGTGGAAAGAGTCAAATGGCTCTCCTCAAGCGTATTCAACAAGGGGCTGAAGGATGCCCAGAAGGT ACCCCATTGTATGGGATCTGATCTGGGGCCTCGGTGCACATGCTTTACATGTGTTTAGTCGAGGTTAAAAAA ACGTCTAGGCCCCCCGAACCACGGGGACGTGGTTTTCCTTTGAAAAACACGATGATAAGCTTGCCACAACCC ACAAGGAGACGACCTTCCATGACCGAGTACAAGCCCACGGTGCGCCTCGCCACCCGCGACGACGTCCCCCGG GCCGTACGCACCCTCGCCGCCGCGTTCGCCGACTACCCCGCCACGCGCCACACCGTCGACCCGGACCGCCAC ATCGAGCGGGTCACCGAGCTGCAAGAACTCTTCCTCACGCGCGTCGGGCTCGACATCGGCAAGGTGTGGGTC GCGGACGACGGCGCCGCGGTGGCGGTCTGGACCACGCCGGAGAGCGTCGAAGCGGGGGCGGTGTTCGCCGAG ATCGGCCCGCGCATGGCCGAGTTGAGCGGTTCCCGGCTGGCCGCGCAGCAACAGATGGAAGGCCTCCTGGCG 
CCGCACCGGCCCAAGGAGCCCGCGTGGTTCCTGGCCACCGTCGGCGTCTCGCCCGACCACCAGGGCAAGGGT CTGGGCAGCGCCGTCGTGCTCCCCGGAGTGGAGGCGGCCGAGCGCGCCGGGGTGCCCGCCTTCCTGGAGACC
TCCGCGCCCCGCAACCTCCCCTTCTACGAGCGGCTCGGCTTCACCGTCACCGCCGACGTCGAGTGCCCGAAG GACCGCGCGACCTGGTGCATGACCCGCAAGCCCGGTGCCTGACGCCCGCCCCACGACCCGCAGCGCCCGACC GAAAGGAGCGCACGACCCCATGGCTCCGACCGAAGCCGACCCGGGCGGCCCCGCCGACCCCGCACCCGCCCC CGAGGCCCACCGACTCTAGAGCTCGCTGATCAGCCTCGACTGTGCCTTCTAGTTGCCAGCCATCTGTTGTTT GCCCCTCCCCCGTGCCTTCCTTGACCCTGGAAGGTGCCACTCCCACTGTCCTTTCCTAATAAAATGAGGAAA TTGCATCGCATTGTCTGAGTAGGTGTCATTCTATTCTGGGGGGTGGGGTGGGGCAGGACAGCAAGGGGGAGG ATTGGGAAGACAATAGCAGGCATGCTGGGGATGCGGTGGGCTCTATGGCTTCTGAGGCGGAAAGAACCAGCT GGGGCTCGAGTGCATTCTAGTTGTGGTTTGTCCAAACTCATCAATGTATCTTATCATGTCTGTATACCGTCG ACCTCTAGCTAGAGCTTGGCGTAATCATGGTCATAGCTGTTTCCTGTGTGAAATTGTTATCCGCTCACAATT CCACACAACATACGAGCCGGAAGCATAAAGTGTAAAGCCTGGGGTGCCTAATGAGTGAGCTAACTCACATTA ATTGCGTTGCGCTCACTGCCCGCTTTCCAGTCGGGAAACCTGTCGTGCCAGCTGCATTAATGAATCGGCCAA CGCGCGGGGAGAGGCGGTTTGCGTATTGGGCGCTCTTCCGCTTCCTCGCTCACTGACTCGCTGCGCTCGGTC GTTCGGCTGCGGCGAGCGGTATCAGCTCACTCAAAGGCGGTAATACGGTTATCCACAGAATCAGGGGATAAC GCAGGAAAGAACATGTGAGCAAAAGGCCAGCAAAAGGCCAGGAACCGTAAAAAGGCCGCGTTGCTGGCGTTT TTCCATAGGCTCCGCCCCCCTGACGAGCATCACAAAAATCGACGCTCAAGTCAGAGGTGGCGAAACCCGACA GGACTATAAAGATACCAGGCGTTTCCCCCTGGAAGCTCCCTCGTGCGCTCTCCTGTTCCGACCCTGCCGCTT ACCGGATACCTGTCCGCCTTTCTCCCTTCGGGAAGCGTGGCGCTTTCTCAATGCTCACGCTGTAGGTATCTC AGTTCGGTGTAGGTCGTTCGCTCCAAGCTGGGCTGTGTGCACGAACCCCCCGTTCAGCCCGACCGCTGCGCC TTATCCGGTAACTATCGTCTTGAGTCCAACCCGGTAAGACACGACTTATCGCCACTGGCAGCAGCCACTGGT AACAGGATTAGCAGAGCGAGGTATGTAGGCGGTGCTACAGAGTTCTTGAAGTGGTGGCCTAACTACGGCTAC ACTAGAAGGACAGTATTTGGTATCTGCGCTCTGCTGAAGCCAGTTACCTTCGGAAAAAGAGTTGGTAGCTCT TGATCCGGCAAACAAACCACCGCTGGTAGCGGTGGTTTTTTTGTTTGCAAGCAGCAGATTACGCGCAGAAAA AAAGGATCTCAAGAAGATCCTTTGATCTTTTCTACGGGGTCTGACGCTCAGTGGAACGAAAACTCACGTTAA GGGATTTTGGTCATGAGATTATCAAAAAGGATCTTCACCTAGATCCTTTTAAATTAAAAATGAAGTTTTAAA TCAATCTAAAGTATATATGAGTAAACTTGGTCTGACAGTTACCAATGCTTAATCAGTGAGGCACCTATCTCA GCGATCTGTCTATTTCGTTCATCCATAGTTGCCTGACTCCCCGTCGTGTAGATAACTACGATACGGGAGGGC TTACCATCTGGCCCCAGTGCTGCAATGATACCGCGAGACCCACGCTCACCGGCTCCAGATTTATCAGCAATA 
AACCAGCCAGCCGGAAGGGCCGAGCGCAGAAGTGGTCCTGCAACTTTATCCGCCTCCATCCAGTCTATTAAT TGTTGCCGGGAAGCTAGAGTAAGTAGTTCGCCAGTTAATAGTTTGCGCAACGTTGTTGCCATTGCTACAGGC ATCGTGGTGTCACGCTCGTCGTTTGGTATGGCTTCATTCAGCTCCGGTTCCCAACGATCAAGGCGAGTTACA TGATCCCCCATGTTGTGCAAAAAAGCGGTTAGCTCCTTCGGTCCTCCGATCGTTGTCAGAAGTAAGTTGGCC GCAGTGTTATCACTCATGGTTATGGCAGCACTGCATAATTCTCTTACTGTCATGCCATCCGTAAGATGCTTT TCTGTGACTGGTGAGTACTCAACCAAGTCATTCTGAGAATAGTGTATGCGGCGACCGAGTTGCTCTTGCCCG GCGTCAATACGGGATAATACCGCGCCACATAGCAGAACTTTAAAAGTGCTCATCATTGGAAAACGTTCTTCG GGGCGAAAACTCTCAAGGATCTTACCGCTGTTGAGATCCAGTTCGATGTAACCCACTCGTGCACCCAACTGA TCTTCAGCATCTTTTACTTTCACCAGCGTTTCTGGGTGAGCAAAAACAGGAAGGCAAAATGCCGCAAAAAAG GGAATAAGGGCGACACGGAAATGTTGAATACTCATACTCTTCCTTTTTCAATATTATTGAAGCATTTATCAG GGTTATTGTCTCATGAGCGGATACATATTTGAATGTATTTAGAAAAATAAACAAATAGGGGTTCCGCGCACA TTTCCCCGAAAAGTGCCACCTGACGTC
Figure XX: Nucleotide Sequence of pRTrap-Puro 3'5' as Shown in Figure 11
GACGGATCGGGAGATCTCCCGATCCCCTATGGTCGACTCTCAGTACAATCTGCTCTGATGCCGCATAGTTAA GCCAGTATCTGCTCCCTGCTTGTGTGTTGGAGGTCGCTGAGTAGTGCGCGAGCAAAATTTAAGCTACAACAA GGCAAGGCTTGACCGACAATTGCATGAAGAATCTGCTTAGGGTTAGGCGTTTTGCGCTGCTTCGCGATGTAC GGGCCAGATATACGCGTTGACATTGATTATTGACTAGTTATTAATAGTAATCAATTACGGGGTCATTAGTTC ATAGCCCATATATGGAGTTCCGCGTTACATAACTTACGGTAAATGGCCCGCCTGGCTGACCGCCCAACGACC CCCGCCCATTGACGTCAATAATGACGTATGTTCCCATAGTAACGCCAATAGGGACTTTCCATTGACGTCAAT GGGTGGACTATTTACGGTAAACTGCCCACTTGGCAGTACATCAAGTGTATCATATGCCAAGTACGCCCCCTA TTGACGTCAATGACGGTAAATGGCCCGCCTGGCATTATGCCCAGTACATGACCTTATGGGACTTTCCTACTT GGCAGTACATCTACGTATTAGTCATCGCTATTACCATGGTGATGCGGTTTTGGCAGTACATCAATGGGCGTG GATAGCGGTTTGACTCACGGGGATTTCCAAGTCTCCACCCCATTGACGTCAATGGGAGTTTGTTTTGGCACC AAAATCAACGGGACTTTCCAAAATGTCGTAACAACTCCGCCCCATTGACGCAAATGGGCGGTAGGCGTGTAC GGTGGGAGGTCTATATAAGCAGAGCTCTCTGGCTAACTAGAGAACCCACTGCTTACTGGCTTATCGAAATTA ATACGACTCACTATAGGGAGACCCAAGCTTGGTACCGAGCTCGGATCGATAAACgaattcATCTCAGTTCGG TGTAGGTCGTTCGCTCCAAGCTGGGCTGTGTGCAAGCTTGGTACCGAGCTCGGATCCACTAGTCCAGTGTGG TGGAATTGCCCTTgccgccacgATGGTCCGTCCTGTAGAAACCCCAACCCGTGAAATCAAAAAACTCGACGG CCTGTGGGCATTCAGTCTGGATCGCGAAAACTGTGGAATTGATCAGCGTTGGTGGGAAAGCGCGTTACAAGA AAGCCGGGCAATTGCTGTGCCAGGCAGTTTTAACGATCAGTTCGCCGATGCAGATATTCGTAATTATGCGGG CAACGTCTGGTATCAGCGCGAAGTCTTTATACCGAAAGGTTGGGCAGGCCAGCGTATCGTGCTGCGTTTCGA TGCGGTCACTCATTACGGCAAAGTGTGGGTCAATAATCAGGAAGTGATGGAGCATCAGGGCGGCTATACGCC ATTTGAAGCCGATGTCACGCCGTATGTTATTGCCGGGAAAAGTGTACGTATCACCGTTTGTGTGAACAACGA ACTGAACTGGCAGACTATCCCGCCGGGAATGGTGATTACCGACGAAAACGGCAAGAAAAAGCAGTCTTACTT CCATGATTTCTTTAACTATGCCGGAATCCATCGCAGCGTAATGCTCTACACCACGCCGAACACCTGGGTGGA CGATATCACCGTGGTGACGCATGTCGCGCAAGACTGTAACCACGCGTCTGTTGACTGGCAGGTGGTGGCCAA TGGTGATGTCAGCGTTGAACTGCGTGATGCGGATCAACAGGTGGTTGCAACTGGACAAGGCACTAGCGGGAC TTTGCAAGTGGTGAATCCGCACCTCTGGCAACCGGGTGAAGGTTATCTCTATGAACTGTGCGTCACAGCCAA AAGCCAGACAGAGTGTGATATCTACCCGCTTCGCGTCGGCATCCGGTCAGTGGCAGTGAAGGGCCAACAGTT CCTGATTAACCACAAACCGTTCTACTTTACTGGCTTTGGTCGTCATGAAGATGCGGACTTACGTGGCAAAGG 
ATTCGATAACGTGCTGATGGTGCACGACCACGCATTAATGGACTGGATTGGGGCCAACTCCTACCGTACCTC GCATTACCCTTACGCTGAAGAGATGCTCGACTGGGCAGATGAACATGGCATCGTGGTGATTGATGAAACTGC TGCTGTCGGCTTTTCGCTCTCTTTAGGCATTGGTTTCGAAGCGGGCAACAAGCCGAAAGAACTGTACAGCGA AGAGGCAGTCAACGGGGAAACTCAGCAAGCGCACTTACAGGCGATTAAAGAGCTGATAGCGCGTGACAAAAA CCACCCAAGCGTGGTGATGTGGAGTATTGCCAACGAACCGGATACCCGTCCGCAAGGTGCACGGGAATATTT CGCGCCACTGGCGGAAGCAACGCGTAAACTCGACCCGACGCGTCCGATCACCTGCGTCAATGTAATGTTCTG CGACGCTCACACCGATACCATCAGCGATCTCTTTGATGTGCTGTGCCTGAACCGTTATTACGGATGGTATGT CCAAAGCGGCGATTTGGAAACGGCAGAGAAGGTACTGGAAAAAGAACTTCTGGCCTGGCAGGAGAAACTGCA TCAGCCGATTATCATCACCGAATACGGCGTGGATACGTTAGCCGGGCTGCACTCAATGTACACCGACATGTG GAGTGAAGAGTATCAGTGTGCATGGCTGGATATGTATCACCGCGTCTTTGATCGCGTCAGCGCCGTCGTCGG TGAACAGGTATGGAATTTCGCCGATTTTGCGACCTCGCAAGGCATATTGCGCGTTGGCGGTAACAAGAAAGG GATCTTCACTCGCGACCGCAAACCGAAGTCGGCGGCTTTTCTGCTGCAAAAACGCTGGACTGGCATGAACTT CGGTGAAAAACCGCAGCAGGGAGGCAAACAATGActcgaGAAGGGCAATTCTGCAGATATCCAGCACAGTGG CGGCCGCGTCGACGGAATTCAGTGGATCGGTCGAGCATGCATCTAGGGCGGCCAATTCCGCCCCTCTCCCTC CCCCCCCCCTAACGTTACTGGCCGAAGCCGCTTGGAATAAGGCCGGTGTGCGTTTGTCTATATGTGATTTTC CACCATATTGCCGTCTTTTGGCAATGTGAGGGCCCGGAAACCTGGCCCTGTCTTCTTGACGAGCATTCCTAG GGGTCTTTCCCCTCTCGCCAAAGGAATGCAAGGTCTGTTGAATGTCGTGAAGGAAGCAGTTCCTCTGGAAGC TTCTTGAAGACAAACAACGTCTGTAGCGACCCTTTGCAGGCAGCGGAACCCCCCACCTGGCGACAGGTGCCT CTGCGGCCAAAAGCCACGTGTATAAGATACACCTGCAAAGGCGGCACAACCCCAGTGCCACGTTGTGAGTTG GATAGTTGTGGAAAGAGTCAAATGGCTCTCCTCAAGCGTATTCAACAAGGGGCTGAAGGATGCCCAGAAGGT ACCCCATTGTATGGGATCTGATCTGGGGCCTCGGTGCACATGCTTTACATGTGTTTAGTCGAGGTTAAAAAA ACGTCTAGGCCCCCCGAACCACGGGGACGTGGTTTTCCTTTGAAAAACACGATGATAAGCTTGCCACAACCC ACAAGGAGACGACCTTCCATGACCGAGTACAAGCCCACGGTGCGCCTCGCCACCCGCGACGACGTCCCCCGG GCCGTACGCACCCTCGCCGCCGCGTTCGCCGACTACCCCGCCACGCGCCACACCGTCGACCCGGACCGCCAC ATCGAGCGGGTCACCGAGCTGCAAGAACTCTTCCTCACGCGCGTCGGGCTCGACATCGGCAAGGTGTGGGTC GCGGACGACGGCGCCGCGGTGGCGGTCTGGACCACGCCGGAGAGCGTCGAAGCGGGGGCGGTGTTCGCCGAG ATCGGCCCGCGCATGGCCGAGTTGAGCGGTTCCCGGCTGGCCGCGCAGCAACAGATGGAAGGCCTCCTGGCG 
CCGCACCGGCCCAAGGAGCCCGCGTGGTTCCTGGCCACCGTCGGCGTCTCGCCCGACCACCAGGGCAAGGGT CTGGGCAGCGCCGTCGTGCTCCCCGGAGTGGAGGCGGCCGAGCGCGCCGGGGTGCCCGCCTTCCTGGAGACC
TCCGCGCCCCGCAACCTCCCCTTCTACGAGCGGCTCGGCTTCACCGTCACCGCCGACGTCGAGTGCCCGAAG GACCGCGCGACCTGGTGCATGACCCGCAAGCCCGGTGCCTGACGCCCGCCCCACGACCCGCAGCGCCCGACC GAAAGGAGCGCACGACCCCATGGCTCCGACCGAAGCCGACCCGGGCGGCCCCGCCGACCCCGCACCCGCCCC CGAGGCCCACCGACTCTAGAGCTCGCTGATCAGCCTCGACTGTGCCTTCTAGTTGCCAGCCATCTGTTGTTT GCCCCTCCCCCGTGCCTTCCTTGACCCTGGAAGGTGCCACTCCCACTGTCCTTTCCTAATAAAATGAGGAAA TTGCATCGCATTGTCTGAGTAGGTGTCATTCTATTCTGGGGGGTGGGGTGGGGCAGGACAGCAAGGGGGAGG ATTGGGAAGACAATAGCAGGCATGCTGGGGATGCGGTGGGCTCTATGGCTTCTGAGGCGGAAAGAACCAGCT GGGGCTCGAGTGCATTCTAGTTGTGGTTTGTCCAAACTCATCAATGTATCTTATCATGTCTGTATACCGTCG ACCTCTAGCTAGAGCTTGGCGTAATCATGGTCATAGCTGTTTCCTGTGTGAAATTGTTATCCGCTCACAATT CCACACAACATACGAGCCGGAAGCATAAAGTGTAAAGCCTGGGGTGCCTAATGAGTGAGCTAACTCACATTA ATTGCGTTGCGCTCACTGCCCGCTTTCCAGTCGGGAAACCTGTCGTGCCAGCTGCATTAATGAATCGGCCAA CGCGCGGGGAGAGGCGGTTTGCGTATTGGGCGCTCTTCCGCTTCCTCGCTCACTGACTCGCTGCGCTCGGTC GTTCGGCTGCGGCGAGCGGTATCAGCTCACTCAAAGGCGGTAATACGGTTATCCACAGAATCAGGGGATAAC GCAGGAAAGAACATGTGAGCAAAAGGCCAGCAAAAGGCCAGGAACCGTAAAAAGGCCGCGTTGCTGGCGTTT TTCCATAGGCTCCGCCCCCCTGACGAGCATCACAAAAATCGACGCTCAAGTCAGAGGTGGCGAAACCCGACA GGACTATAAAGATACCAGGCGTTTCCCCCTGGAAGCTCCCTCGTGCGCTCTCCTGTTCCGACCCTGCCGCTT ACCGGATACCTGTCCGCCTTTCTCCCTTCGGGAAGCGTGGCGCTTTCTCAATGCTCACGCTGTAGGTATCTC AGTTCGGTGTAGGTCGTTCGCTCCAAGCTGGGCTGTGTGCACGAACCCCCCGTTCAGCCCGACCGCTGCGCC TTATCCGGTAACTATCGTCTTGAGTCCAACCCGGTAAGACACGACTTATCGCCACTGGCAGCAGCCACTGGT AACAGGATTAGCAGAGCGAGGTATGTAGGCGGTGCTACAGAGTTCTTGAAGTGGTGGCCTAACTACGGCTAC ACTAGAAGGACAGTATTTGGTATCTGCGCTCTGCTGAAGCCAGTTACCTTCGGAAAAAGAGTTGGTAGCTCT TGATCCGGCAAACAAACCACCGCTGGTAGCGGTGGTTTTTTTGTTTGCAAGCAGCAGATTACGCGCAGAAAA AAAGGATCTCAAGAAGATCCTTTGATCTTTTCTACGGGGTCTGACGCTCAGTGGAACGAAAACTCACGTTAA GGGATTTTGGTCATGAGATTATCAAAAAGGATCTTCACCTAGATCCTTTTAAATTAAAAATGAAGTTTTAAA TCAATCTAAAGTATATATGAGTAAACTTGGTCTGACAGTTACCAATGCTTAATCAGTGAGGCACCTATCTCA GCGATCTGTCTATTTCGTTCATCCATAGTTGCCTGACTCCCCGTCGTGTAGATAACTACGATACGGGAGGGC TTACCATCTGGCCCCAGTGCTGCAATGATACCGCGAGACCCACGCTCACCGGCTCCAGATTTATCAGCAATA 
AACCAGCCAGCCGGAAGGGCCGAGCGCAGAAGTGGTCCTGCAACTTTATCCGCCTCCATCCAGTCTATTAAT TGTTGCCGGGAAGCTAGAGTAAGTAGTTCGCCAGTTAATAGTTTGCGCAACGTTGTTGCCATTGCTACAGGC ATCGTGGTGTCACGCTCGTCGTTTGGTATGGCTTCATTCAGCTCCGGTTCCCAACGATCAAGGCGAGTTACA TGATCCCCCATGTTGTGCAAAAAAGCGGTTAGCTCCTTCGGTCCTCCGATCGTTGTCAGAAGTAAGTTGGCC GCAGTGTTATCACTCATGGTTATGGCAGCACTGCATAATTCTCTTACTGTCATGCCATCCGTAAGATGCTTT TCTGTGACTGGTGAGTACTCAACCAAGTCATTCTGAGAATAGTGTATGCGGCGACCGAGTTGCTCTTGCCCG GCGTCAATACGGGATAATACCGCGCCACATAGCAGAACTTTAAAAGTGCTCATCATTGGAAAACGTTCTTCG GGGCGAAAACTCTCAAGGATCTTACCGCTGTTGAGATCCAGTTCGATGTAACCCACTCGTGCACCCAACTGA TCTTCAGCATCTTTTACTTTCACCAGCGTTTCTGGGTGAGCAAAAACAGGAAGGCAAAATGCCGCAAAAAAG GGAATAAGGGCGACACGGAAATGTTGAATACTCATACTCTTCCTTTTTCAATATTATTGAAGCATTTATCAG GGTTATTGTCTCATGAGCGGATACATATTTGAATGTATTTAGAAAAATAAACAAATAGGGGTTCCGCGCACA TTTCCCCGAAAAGTGCCACCTGACGTC