WO1993017123A1 - Mutagenicity testing using reporter genes with modified methylation frequencies - Google Patents
- Publication number
- WO1993017123A1 (PCT/US1993/001676)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- gene
- cpg
- marker
- ala
- animal
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/79—Vectors or expression systems specially adapted for eukaryotic hosts
- C12N15/85—Vectors or expression systems specially adapted for eukaryotic hosts for animal cells
- C12N15/8509—Vectors or expression systems specially adapted for eukaryotic hosts for animal cells for producing genetically modified animals, e.g. transgenic
-
- A—HUMAN NECESSITIES
- A01—AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
- A01K—ANIMAL HUSBANDRY; AVICULTURE; APICULTURE; PISCICULTURE; FISHING; REARING OR BREEDING ANIMALS, NOT OTHERWISE PROVIDED FOR; NEW BREEDS OF ANIMALS
- A01K67/00—Rearing or breeding animals, not otherwise provided for; New or modified breeds of animals
- A01K67/027—New or modified breeds of vertebrates
- A01K67/0275—Genetically modified vertebrates, e.g. transgenic
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K14/00—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
- C07K14/195—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria
- C07K14/24—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria from Enterobacteriaceae (F), e.g. Citrobacter, Serratia, Proteus, Providencia, Morganella, Yersinia
- C07K14/245—Escherichia (G)
-
- A—HUMAN NECESSITIES
- A01—AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
- A01K—ANIMAL HUSBANDRY; AVICULTURE; APICULTURE; PISCICULTURE; FISHING; REARING OR BREEDING ANIMALS, NOT OTHERWISE PROVIDED FOR; NEW BREEDS OF ANIMALS
- A01K2217/00—Genetically modified animals
- A01K2217/05—Animals comprising random inserted nucleic acids (transgenic)
-
- A—HUMAN NECESSITIES
- A01—AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
- A01K—ANIMAL HUSBANDRY; AVICULTURE; APICULTURE; PISCICULTURE; FISHING; REARING OR BREEDING ANIMALS, NOT OTHERWISE PROVIDED FOR; NEW BREEDS OF ANIMALS
- A01K2227/00—Animals characterised by species
- A01K2227/10—Mammal
- A01K2227/105—Murine
-
- A—HUMAN NECESSITIES
- A01—AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
- A01K—ANIMAL HUSBANDRY; AVICULTURE; APICULTURE; PISCICULTURE; FISHING; REARING OR BREEDING ANIMALS, NOT OTHERWISE PROVIDED FOR; NEW BREEDS OF ANIMALS
- A01K2227/00—Animals characterised by species
- A01K2227/10—Mammal
- A01K2227/108—Swine
-
- A—HUMAN NECESSITIES
- A01—AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
- A01K—ANIMAL HUSBANDRY; AVICULTURE; APICULTURE; PISCICULTURE; FISHING; REARING OR BREEDING ANIMALS, NOT OTHERWISE PROVIDED FOR; NEW BREEDS OF ANIMALS
- A01K2227/00—Animals characterised by species
- A01K2227/40—Fish
-
- A—HUMAN NECESSITIES
- A01—AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
- A01K—ANIMAL HUSBANDRY; AVICULTURE; APICULTURE; PISCICULTURE; FISHING; REARING OR BREEDING ANIMALS, NOT OTHERWISE PROVIDED FOR; NEW BREEDS OF ANIMALS
- A01K2267/00—Animals characterised by purpose
- A01K2267/03—Animal model, e.g. for test or diseases
- A01K2267/0393—Animal model comprising a reporter system for screening tests
Definitions
- This invention relates to testing for mutagens and carcinogens.
- Carcinogens are chemical (or physical) agents which are capable of causing cancer in a susceptible subject.
- a chemical is considered a carcinogen if, in a well designed and conducted bioassay, it produces a statistically significant increase in the incidence of neoplasms in one or more target organs.
- Some carcinogens are themselves mutagens, i.e., agents which directly cause the mutation of DNA. Others are metabolized by cells to form powerful mutagens, which in turn act on the host's DNA.
- aflatoxin B1 is converted by a hepatic aryl hydroxylase into a 2,3-epoxide derivative, a mutagen, and nitrates are a starting material in the formation of the highly carcinogenic (mutagenic) nitrosamines.
- Still others act as tumor initiators or promoters, i.e., they induce or accelerate the transformation of normal cells into malignant cells without necessarily directly modifying the genetic material.
- carcinogens and mutagens were identified by epidemiological means. After several decades, segments of the population having elevated exposures to the agent would exhibit a higher frequency of incidence of a particular cancer or other disorder. Case studies would reveal the common factor. Of course, the problem with epidemiological detection is that it is retrospective; society has already been damaged. This prompted the development of a variety of screening tests.
- the Ames test is based on the assumption that carcinogens (or their metabolites) will cause the genetic reversion of certain mutant strains of bacteria. These strains lack the ability to produce histidine, an essential amino acid, and therefore are unable to multiply unless this nutrient is in their growth medium. In the presence of a mutagen, these mutants are more likely to revert to their "wild" phenotype, i.e., they regain the ability to manufacture histidine from other materials and therefore can grow in a histidine-free medium.
- a chemical may be converted into a carcinogen or mutagen through a metabolic activity specific to a particular type of cell, e.g., periportal liver cells. Unless, fortuitously, cells of this type are exposed to the chemical, this mode of carcinogenesis or mutagenesis will not be discovered.
- the production of the mutagen requires processing by several different types of cells, or that the necessary metabolic activity in one type of cell must be activated by a product of a different type of cell.
- a single cell type bioassay is incompetent for risk assessment of chemicals which are metabolized in this manner.
- transgenic animals were prepared whose cells carried a foreign (typically bacterial) "marker" gene. These animals were exposed to normal doses of the suspect chemical. DNA was then extracted from the various tissues and organs of the animal, and the foreign "marker" gene was "rescued" and transferred to the genome of a host by means of a bacteriophage. The lytic plaques were then screened for the phenotype characteristically imparted by the original "marker" gene. The absence of this phenotype was indicative of mutation.
- a phage lambda vector capable of lysing E. coli, was engineered to carry the E. coli beta galactosidase (lacZ) gene.
- Lambda DNA was microinjected into mouse embryos, and transgenic mice were produced by standard techniques. Genomic DNA was purified from a tissue of the transgenic mouse, and the test DNA was excised by means of a lambda phage packaging extract. The packaged phage were incubated with beta-galactosidase deficient E. coli. Bacteria infected by the phage particles were lysed, resulting in the formation of lytic plaques on a lawn of beta-galactosidase deficient E. coli.
- the transgenic animal-based test, unlike an in vitro mammalian cell bioassay, can detect mutagenesis by metabolites of the chemical of interest, even though the metabolites are produced at the appropriate concentrations only by differentiated cells or the tissue of live animals.
- Cells of any tissue or organ of interest may be screened for mutagenic damage, merely by extracting their DNA and recovering and characterizing the transgene.
- a single animal may yield a multitude of cells for testing and analysis. It should be noted, to avoid confusion, that this cellular level analysis cannot be performed with genes endogenous to the test animal. They cannot be isolated from the DNA of an organ or tissue with sufficient efficiency. That is why this more sophisticated analytical approach was not possible until it became feasible to make transgenic animals.
- the present invention overcomes the deficiencies of the test methods described above.
- the present Applicant recognized that bacterial genes exhibit a much higher frequency of occurrence of the "CpG" doublet than do vertebrate genes. As a result, a bacterial gene incorporated into a mammalian genome will exhibit a much higher degree of methylation than is typical for a mammalian gene.
- For example, of the 1081 dinucleotides in the lacI gene, 95 (about 9%) are CpG, the primary nucleic acid methylation substrate in mammals. (55 of these CpGs lie within a single codon of the gene's 360 codons, and 40 are at intercodon boundaries.)
- Mammalian genes are underrepresented in CpG dinucleotides. In random DNA having a 50% GC content, about 6% of the dinucleotides would be expected to be CpG. In mammalian DNA, CpG occurs with a frequency of about 2%.
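The CpG frequencies quoted above are simple ratios of CpG doublets to all adjacent-base doublets on one strand. The following is a minimal sketch of that count, not the method used in the specification; the function name `cpg_frequency` and the toy sequence are illustrative assumptions.

```python
def cpg_frequency(seq: str) -> float:
    """Fraction of same-strand dinucleotides in `seq` that are CpG (5'-CG-3')."""
    seq = seq.upper()
    n_dinucleotides = len(seq) - 1  # a sequence of n bases has n - 1 doublets
    if n_dinucleotides < 1:
        raise ValueError("need at least two bases")
    # Slide a window of width 2 along the strand and count the CG doublets.
    n_cpg = sum(1 for i in range(n_dinucleotides) if seq[i:i + 2] == "CG")
    return n_cpg / n_dinucleotides


# Figure quoted in the text: 95 CpG among 1081 dinucleotides.
print(f"{95 / 1081:.1%}")  # 8.8%

# Hypothetical toy sequence (not taken from the patent): 2 CpG in 7 doublets.
print(cpg_frequency("ACGTCGGA"))
```

Note that the count is strand-specific: it looks for a C followed by a G on the same strand, not for C:G base pairs.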
- This problem may be overcome by the use of a wholly or partially synthetic bacterial marker gene having a reduced (vertebrate cell like) number of CpGs, and hence, presumably, a lower overall level of methylation.
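One way to picture a marker gene with a reduced number of CpGs is synonymous recoding: each codon that creates a CpG is swapped for another codon encoding the same amino acid. The sketch below is an illustration under stated assumptions, not the substitution scheme of the figures. It assumes an in-frame, uppercase coding sequence, ignores codon usage preferences and splice sites, and the names `deplete_cpg` and `translate` are hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(coding_seq: str) -> str:
    """Translate an in-frame DNA coding sequence ('*' marks a stop codon)."""
    return "".join(CODON_TO_AA[coding_seq[i:i + 3]]
                   for i in range(0, len(coding_seq), 3))


def deplete_cpg(coding_seq: str) -> str:
    """Replace CpG-forming codons with synonymous codons that form no CpG.

    Assumption: the replacement is the synonym with the fewest base changes;
    a real design would also weigh the codon usage of the host species.
    """
    if len(coding_seq) % 3:
        raise ValueError("sequence length must be a multiple of three")
    codons = [coding_seq[i:i + 3] for i in range(0, len(coding_seq), 3)]
    recoded = []
    for idx, codon in enumerate(codons):
        # Codons beginning with G (Val, Ala, Asp, Glu, Gly) have no synonym
        # with a different first base, so an intercodon CpG has to be removed
        # by changing the last base of the preceding codon.
        next_starts_with_g = idx + 1 < len(codons) and codons[idx + 1][0] == "G"

        def makes_cpg(c: str) -> bool:
            return "CG" in c or (next_starts_with_g and c[-1] == "C")

        if makes_cpg(codon):
            synonyms = [c for c, aa in CODON_TO_AA.items()
                        if aa == CODON_TO_AA[codon] and not makes_cpg(c)]
            if synonyms:
                codon = min(synonyms, key=lambda c: (
                    sum(a != b for a, b in zip(c, codon)), c))
        recoded.append(codon)
    return "".join(recoded)


# Hypothetical toy sequence with internal and intercodon CpGs.
original = "ATGCGTACGGCCGATTGCGCATAA"
recoded = deplete_cpg(original)
print(original.count("CG"), recoded.count("CG"))
print(translate(original) == translate(recoded))
```

This greedy pass removes every CpG from the toy sequence while leaving the encoded protein unchanged. The specification itself speaks of a reduced, vertebrate-like number of CpGs rather than complete removal.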
- Vijg had been troubled by the very low number of plaques obtained in practice. He postulated that the methylation pattern of the lambda vectors was rendering them susceptible to restriction by the E. coli host, i.e., the lambda vectors were being cut to pieces by the bacterium's defensive enzymes before they could integrate into the host genome (Vijg, page 3). His solution, however, was not to modify the vector, but rather to employ a "host restriction"-negative strain for plating.
- Figure 1 sets forth the sequence of the coding strand of the wild-type lacI gene, 5' to 3' (SEQ ID NO:1). CpG dinucleotides are marked. Above the nucleotide sequence are alternative nucleotides for eliminating most of the CpG dinucleotides and for eliminating splicing donor-acceptor sites. Below the nucleotide sequence is the sequence of the corresponding wild-type LacI repressor protein (SEQ ID NO:2), given according to the single letter amino acid code.
- Figure 2 sets forth the sequence (LACIMIRNL) (SEQ ID NO:3) of a Kozak consensus RBS (TCACC...), a CpG-depleted lacI gene, a three codon linker (encoding AAL), and a seven codon sequence encoding the SV40 large antigen nuclear localization site. These features are marked, as are all MspI/HpaII (CCGG) sites.
- the transgene also includes a beta actin promoter, but, since this sequence is lengthy and has been published, it was not reprinted. The corresponding amino acid sequence is presented as SEQ ID NO:4.
- Figure 3 shows the wild-type E. coli gpt gene (SEQ ID NO:5) and the suggested base substitutions for reducing its CpG content.
- the corresponding amino acid sequence is presented as SEQ ID NO:6.
- Figure 4 shows the modified gpt gene (SEQ ID NO:7) and a synthesis strategy therefor.
- the complementary DNA sequence is provided as SEQ ID NO:8.
- Figure 5 is a schematic depiction of (A) a HindIII/BamHI fragment comprising the lacZ gene and SV40 processing signals, and (B) plasmid pL26.6.
- Figure 6 shows the construction of plasmid PAP lacOZneo.
- Figure 7 depicts the organization of the mouse aprt promoter region with the site of one lacO insert marked.
- the aprt promoter and upstream sequences that will be used in the proposed experiments are schematically displayed. In the genome, this fragment is bounded by FnuDII sites. The numbering in base pairs begins at the translation start codon.
- the 4 boxes represent sites of Sp1 binding.
- the horizontal arrows indicate major sites of transcription initiation, and the vertical arrow indicates the extent of the deletion that permits full aprt expression.
- the position of the E. coli lac operator (lacO) is indicated, as are potential TaqI and XmaI sites for alternative operator insertion.
- the present invention is directed to a test for mutagenicity and carcinogenicity, in which mutations in a marker gene are used to predict the effect of a chemical agent on genes of a target species of interest.
- the target species will be human beings; however, it must be noted that both wild and domesticated animals are also chronically exposed to potential mutagens and carcinogens and that environmental policy may call for minimizing such exposure.
- the present invention may readily be adapted to screening for the mutagenic potential of a chemical vis-a-vis the genes of a nonhuman animal species, including other mammals, birds, fish, amphibia, reptiles, and even lower life forms (such as bees, silkworms, earthworms, and so forth).
- Nucleic acids are the basic genetic material in cells. They are formed by a chemically linked sequence of nucleotides. Each nucleotide contains a heterocyclic ring of carbon and nitrogen atoms (the nitrogenous base), a five carbon sugar in ring form (a pentose), and a phosphate group. Two types of pentoses are found in nucleic acids, the 2-deoxyribose in DNA (deoxyribonucleic acid) and the ribose in RNA (ribonucleic acid).
- In DNA, there are four normal nitrogenous bases: two pyrimidines, cytosine (C) and thymine (T), and two purines, adenine (A) and guanine (G).
- C cytosine
- T thymine
- U uracil
- a base-sugar moiety is called a nucleoside
- a base-sugar-phosphate moiety is a nucleotide. Genetic information is conveyed by the sequence of bases in a polynucleotide chain, not by the phosphodiester-sugar backbone.
- the 5' position of one pentose ring is connected to the 3' position of the next pentose ring via a phosphate group, thus forming a series of 5' to 3' linkages.
- the terminal nucleotide at one end of the chain has a free 5' group, and the other terminal nucleotide has a free 3' group.
- It is conventional to write nucleic acid sequences by setting forth the sequence of bases in the 5' to 3' direction.
- the chromosome is a double helix formed by two very long, interweaving polynucleotide chains, or strands. These strands are held together in the double helical structure as a result of hydrogen bonds between so-called complementary bases on the two strands.
- One such "base pair", adenine-thymine (A:T) (or, in RNA, adenine-uracil, A:U), provides two hydrogen bonds; the other, guanine-cytosine (G:C), provides three.
- When a gene in the chromosome is expressed, a single strand of messenger RNA is transcribed from a template strand of the chromosome by complementary base pairing.
- the template strand is called the coding or anti-sense strand; the other DNA strand, the anti-coding or sense strand, is identical in sequence (except for the T/U distinction) to the messenger RNA transcript.
- the messenger RNA acts as a template for the assembly of amino acids into the protein encoded by the gene; the assembly process is known as translation.
- the messenger RNA transcript is read in nonoverlapping units of three nucleotides, known as codons, from a fixed starting point; there are three possible ways of translating any messenger RNA, depending on the starting point. These are known as reading frames.
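The three reading frames can be made concrete by splitting one transcript into nonoverlapping triplets from each of the three possible starting offsets. This is a minimal sketch; the function name `codons_in_frame` and the nine-base example transcript are illustrative assumptions.

```python
def codons_in_frame(mrna: str, frame: int) -> list[str]:
    """Split `mrna` into nonoverlapping codons, starting at offset 0, 1 or 2.

    Trailing bases that do not fill a complete triplet are dropped.
    """
    if frame not in (0, 1, 2):
        raise ValueError("frame must be 0, 1 or 2")
    usable = mrna[frame:]
    complete = len(usable) - len(usable) % 3
    return [usable[i:i + 3] for i in range(0, complete, 3)]


transcript = "AUGGCCUAA"  # hypothetical toy transcript
for frame in range(3):
    print(frame, codons_in_frame(transcript, frame))
# 0 ['AUG', 'GCC', 'UAA']
# 1 ['UGG', 'CCU']
# 2 ['GGC', 'CUA']
```

Only the frame fixed by the start codon yields the encoded protein; the other two offsets read the same bases as entirely different codons.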
- the bases of DNA may be modified by enzymes endogenous to the cell, especially methylases. In vertebrates, the only methylated base is 5-methylcytosine. Between 2% and 7% of the C residues of animal cell DNA are methylated. Most of the methyl groups are found in CpG "doublets" (dinucleotides); in birds and mammals, 50-70% of all such dinucleotides are modified by methylation.
- the C and G are adjacent bases on the same strand, joined by a covalent 5' to 3' phosphodiester (p)-sugar linkage; the CpG dinucleotide should not be confused with the C:G base pair formed by hydrogen bonding between a C on one strand and a G on another.
- a doublet that is instead methylated on only one of the two strands is said to be hemimethylated.
- the distribution of methyl groups may be examined by taking advantage of pairs of restriction enzymes, known as isoschizomers, that cleave the same target sequence in DNA, but have a different sensitivity to its methylation pattern.
- the enzymes HpaII and MspI both recognize the sequence CCGG, which includes a CG dinucleotide.
- MspI is indifferent to the presence of methylation at this C.
- MspI can be used to identify all the CCGG sites, and HpaII, to determine whether or not they are methylated.
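The isoschizomer comparison can be sketched as a lookup over CCGG sites. This is a simplified model, not a laboratory protocol. It assumes that methylation of the internal cytosine of CCGG blocks HpaII but not MspI, and it ignores hemimethylation and methylation of the outer cytosine. The function names and the toy sequence are hypothetical.

```python
def ccgg_sites(seq: str) -> list[int]:
    """Return the 0-based start index of every CCGG site on the given strand."""
    return [i for i in range(len(seq) - 3) if seq[i:i + 4] == "CCGG"]


def digest_pattern(seq: str, methylated_c: set[int]) -> list[dict]:
    """Report, for each CCGG site, whether MspI and HpaII would cut.

    `methylated_c` holds 0-based indices of 5-methylcytosines in `seq`.
    Simplifying assumption: only the internal C (the C of the CpG) matters.
    """
    report = []
    for start in ccgg_sites(seq):
        internal_c = start + 1
        report.append({
            "site": start,
            "MspI_cuts": True,  # indifferent to methylation at this C
            "HpaII_cuts": internal_c not in methylated_c,
        })
    return report


# Hypothetical toy sequence with two CCGG sites; the second one is methylated.
for row in digest_pattern("AACCGGTTCCGGAA", methylated_c={9}):
    print(row)
```

A site cut by MspI but not by HpaII is read as methylated; a site cut by both is read as unmethylated.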
- some methylation sites are methylated when the gene is inactive but unmethylated (at least in some cells) when the gene is expressed.
- Palmiter and Brinster, Ann. Rev. Genet., 20:465-99 (1986)
- the methylation of DNA microinjected into mouse embryos, in the course of transgenic mouse production, may in some cases be responsible for problems observed in expressing transgenes.
- a mutation is a change in the nucleotide sequence. An alteration that alters only a single base pair is termed a point mutation.
- a point mutation may take the form of a substitution, an insertion, or a deletion.
- a point mutation in a gene will not necessarily have an effect on the sequence of the encoded polypeptide. This is because the genetic code is redundant: 61 different codons encode only twenty different amino acids. A mutation which does not affect which amino acid is encoded is termed a "silent" mutation. Even if the amino acid is changed, it is possible that the mutant polypeptide will retain activity. In this case the mutation is said to be "neutral".
- a mutation which inactivates a gene is termed a forward mutation.
- the effects of forward mutations may be reversed by back mutations in the same genes, or through suppression of the mutated gene through mutation of a different gene.
- Back mutations fall into two categories: true reversions, which restore the wild-type sequence, and second-site reversions, which simply compensate for the forward mutation by restoring activity.
- a nucleotide in a gene will cause a shift of the reading frame. In general, this will result in the expression of a radically different and probably nonfunctional polypeptide. However, a second frameshift mutation, close enough to the first one, may restore activity.
- Mutations may also involve the insertion, deletion, or inversion of larger chunks of DNA.
- Nitrous acid deaminates adenine to hypoxanthine, which bonds to cytosine instead of to thymine.
- the new strand features a cytosine, rather than a thymine, at the position complementary to the hypoxanthine.
- the polymerase inserts a guanine, which is complementary to the aforementioned cytosine.
- nitrous acid causes an A->G transition. Cytosine is deaminated to uracil, which hydrogen bonds to adenine instead of to guanine.
- nitrous acid also causes a C->T transition.
- Guanine is deaminated to xanthine, which continues to hydrogen bond to cytosine, though with only two hydrogen bonds.
- Thymine and uracil are not altered by nitrous acid.
- Hydroxylamine reacts with cytosine to form N-hydrocytosine, which preferentially pairs with adenine.
- hydroxylamine produces a C->T transition.
- the alkylating agents are the largest group of mutagens, which introduce alkyl groups into nucleotides at various positions.
- the alkylating agents include mustard gas, epoxides, dimethyl- and diethylsulfonate, methyl- and ethylmethanesulfonate, and N-methyl-N'-nitroso-N-nitroguanidine.
- G->A (as a result of formation of O6-alkylguanine)
- T->C attributable to O6-alkylthymine
- the present invention is not limited to the evaluation of any particular chemical class of potential mutagen, or to the detection of any particular kind of mutation.
- Since methylated cytosines also act as hotspots for mutation in the genes of vertebrate cells, it follows that the placement of a heavily methylated marker gene in a vertebrate cell for mutagenesis assay purposes will result in overestimation of the mutagenic potential of the assayed chemical.
- transgenic animal is defined, for the purpose of the appended claims, as an animal at least some of whose germ cells contain genetic material, originally derived from another animal, other than an ancestor of said animal, as a result of human intervention. So defined, it includes progeny of a transgenic animal which retain the transgenic genotype. It is not necessary that all cells of the animal contain the transgene.
- the reference to human intervention is intended to exclude genetic modification as a result of unintentional infection with a virus.
- chimeric animal is defined, for the purpose of the appended claims, as an animal which is not necessarily a transgenic animal, but at least some of whose somatic cells contain genetic information, originally derived from another animal other than an ancestor of said animal, as a result of human intervention.
- genetically engineered animal refers to an animal which is either a transgenic animal or a chimeric animal.
- animals produced by conventional artificial insemination techniques are not considered to be genetically engineered, the donors of sperm and egg being considered parents of the animal, unless one or more ancestors of the animal was genetically engineered and the descendant animal retains the engineered genotype.
- the transplantation of cells from one animal to another is not considered genetic engineering.
- transgenic animals may utilize both transgenic animals and chimeric animals, though transgenic animals are preferred. References to transgenic animals in this specification should be deemed to include, mutatis mutandis, chimeric animals as well.
- the "marker gene" may be any gene which confers a selectable or screenable phenotype on cells of the transgenic animal, or, if the assay is not applied directly to those cells, on the assay cells subsequently transformed by the rescued marker gene.
- the marker gene is one which is not substantially homologous with any gene endogenous to the host animal from which the transgenic animal is produced. This facilitates the identification of the marker gene.
- the marker gene will be a bacterial gene in order to maximize the taxonomic distance between the marker gene and the genes of the host animal. However, it may also be a nonbacterial, nonmammalian gene, such as a viral, fungal, plant, invertebrate or lower vertebrate gene.
- the marker gene may be a wild type nonvertebrate gene chosen because it has a CpG level in the vertebrate range; more often, it is a nonvertebrate gene mutated to reduce the CpG level.
- the detected mutation in the marker gene may be any of the numerous types of mutation known to occur. It may be in the coding sequence of the DNA, or in an associated regulatory sequence such as a promoter or a stop codon. It may be a point mutation or a frameshift mutation. It may involve a base substitution, insertion or deletion.
- the marker gene is a functional gene, and the assay is for forward mutations which inactivate the gene.
- the marker gene is one mutated to render it nonfunctional, and the assay is for back mutations which restore activity.
- the second of these embodiments has the advantage of lower background, since back mutation is a rare event.
- the marker gene may be a lacZ gene with the codon for Glu-461 (GAA) mutated to HBA; it has been shown in E. coli that only same site reversion will restore activity.
- the phenotypic change (marker) associated with mutation of the marker gene may be one detectable in a mammalian cell, or it may be one detectable only after rescue of the mutated DNA and expression of that DNA in a non- mammalian, e.g., bacterial system.
- the mutagenesis assay may detect a direct or an indirect product of the marker gene.
- the detection may occur in vivo or in vitro, through any means known in the diagnostic art.
- the phenotypic change may be the death of the animal or the affected cells, or a change in cell morphology or metabolism. It may be the presence or absence of a characteristic luminescence or radioactivity. All that matters is that there be a detectable change if mutation occurs. It is not necessary that all mutations cause a detectable change, as long as some mutations in the marker gene will do so.
- the detection could be through in vitro examination of the blood, urine, milk, or other expendable product of the animal. This has the advantage that the animal is not harmed, so that the animal can continue to be monitored for further genotoxic damage from the agent. However, there is no way of knowing how the detected label enters the examined product, and hence it is not known whether the mutation occurs only in certain tissues or organs.
- the detection could be through histochemical examination of one or more tissues of the animal. See, e.g., Wei, EP Appl 370,813.
- the marker gene may be, but is not limited to, a tumorigenic, toxin, hormonal, enzymatic or antigenic marker gene. (It should be noted that these categories are not mutually exclusive.)
- Tumorigenic marker genes are those which, when expressed in a transgenic animal, result in production of a transforming gene product and therefore induce tumors.
- the marker gene may be a functional oncogene, in which case the assay is for mutations which render the oncogene nonfunctional and therefore protect the animal, or it may be an oncogene mutated to be nonfunctional, in which case the assay is for back mutations which render the oncogene functional once more and therefore result in tumor formation in the animal.
- the oncogene may be a viral or a cellular oncogene.
- the marker gene may be a naturally occurring proto-oncogene, or it may be an oncogene mutated in the laboratory to render it nonfunctional.
- the oncogene When the oncogene is a viral oncogene, it may be derived from a DNA tumor virus or from a retrovirus.
- Suitable retroviral oncogenes include v-abl, v-fes, v-fps, v-fgr, v-src, v-erbA, v-erbB, v-fms, v-ros, v-yes, v-mos, v-ras, v-fos, v-myb, v-myc, v-ski, v-sis, v-rel, v-kit, v-jun, and v-ets.
- Suitable DNA tumor virus genes include the T antigen genes from SV40 or polyoma viruses and the E1A and E1B genes from adenoviruses.
- Toxin marker genes encode a toxic protein or an enzyme which participates in the enzymatic production of a toxic metabolite.
- the toxin may be, but is not limited to, a bacterial toxin (e.g., diphtheria toxin, tetanus toxin, and botulinum toxin), a plant toxin (e.g., ricin or abrin), an invertebrate toxin (e.g., a scorpion or sea anemone toxin), or a snake venom toxin (e.g., a cobra or rattlesnake toxin).
- Toxins include cardiotoxins, neurotoxins, and protease inhibitors. Nonfunctional mutants of toxin genes may be used in back mutation assays.
- Hormonal marker genes encode protein or peptide hormones (or prohormones, or pre-prohormones) which are detectable either directly or through their biological effect. These hormones may be identical to natural counterparts secreted by, e.g., the endocrine glands (such as the pituitary, thyroid, or gonads) , or they may be muteins. Suitable hormones include growth hormone, prolactin, chorionic gonadotropin, luteinizing hormone, follicle stimulating hormone, insulin, parathyroid hormone, somatostatin, and gonadotropin releasing hormone, and homologues thereof. While most mammalian hormonal marker genes will exhibit CpG frequencies typical of mammalian DNA, exceptions may exist. Also, nonmammalian hormonal marker genes may be of interest as their proteins may be more readily differentiated from their mammalian cognates in a transgenic mammalian host.
- Enzymatic marker genes encode enzymes which, in the presence of a suitable substrate, convert the substrate into a directly or indirectly detectable product. Suitable enzymes include beta-galactosidase (lacZ), alkaline phosphatase, luciferase and horseradish peroxidase.
- Enzymatic marker genes are particularly appropriate where the marker gene is engineered so that the enzyme is secreted into an assayable biological fluid, such as blood.
- the substrate can then be supplied when the blood is assayed in vitro. They may also be used when the marker is to be detected by histochemical analysis. In any event, the substrate may be provided in vivo or in vitro.
- Antigenic marker genes encode a detectable antigen. The antigen is then detected with a specific antibody. Antigenic marker genes are particularly suitable for detection of mutagenic activity by in vivo imaging, as the antibody may be labeled with an imageable label such as a radioactive label. Regulatory marker genes are genes which encode regulatory proteins. Such proteins control the expression of other genes.
- Examples include the lacI repressor gene and the lambda repressor gene, and the lacI activator protein LAP267; see Baim et al., PNAS (USA), 88:5072-5076 (June 1991). With regard to lac repressor, see Wyborski and Short, Nucleic Acids Res., 19:17 (Sep. 1991).
- Suppressor tRNA genes encode tRNAs which suppress the effect of a chain termination mutation.
- the supF gene suppresses the amber mutation and the supE gene the ochre mutation. If there is, for example, an amber mutation in a required or selectable function, the mutation can be suppressed by a functional supF gene. Thus, if there is an amber mutation in a
- Antibiotic resistance genes include the ampicillin, chloramphenicol, neomycin, bleomycin, and puromycin resistance genes. The rescue approach is preferred when the marker is an antibiotic
- the lacI and lacZ genes are of particular interest, and it is therefore appropriate to discuss their function in nature.
- the polycistronic lac operon comprises the lac promoter, the lac operator (lacO) , the lacZ, lacY and lacA genes, and a terminator.
- Immediately 5' of the lac operon is the monocistronic lacI operon, which comprises the lacI promoter, the lacI gene, and a terminator.
- the lacZ gene encodes the enzyme beta-galactosidase
- lacY and lacA encode the enzymes beta-galactoside permease and transacetylase, respectively.
- The gene cluster is normally repressed by the LacI repressor protein, which binds to the lacO operator site and thereby prevents the binding of DNA-directed RNA polymerase to the operator. Transcription is activated if an inducer, such as IPTG, is present; IPTG releases the LacI repressor from the lacO site.
- the vector used to introduce the marker gene may contain one copy of a particular marker gene, multiple copies of a single marker gene or several different marker genes. Use of multiple marker genes, whether the same or different, alters the sensitivity of the assay.
- marker genes of nonvertebrate origin will exhibit a higher frequency of the CpG dinucleotide than do the genes of a vertebrate host animal.
- the expected CpG frequency in DNA of random sequence is (GC%)^2/4.
- the expected CpG frequency is 4%
- the expected CpG frequency is 9%.
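The expected-frequency formula follows if C and G are assumed equally common and neighboring bases independent: each of C and G then has frequency GC/2, so the doublet CpG has frequency (GC/2)^2 = GC^2/4. The sketch below evaluates it; the function name is an illustrative assumption. Under this formula, the 4% and 9% figures correspond to GC contents of 40% and 60%, and the 50% case gives the "about 6%" figure quoted earlier.

```python
def expected_cpg_frequency(gc_fraction: float) -> float:
    """Expected CpG doublet frequency in random-sequence DNA.

    Assumes P(C) = P(G) = gc_fraction / 2 and independent neighboring bases.
    """
    if not 0.0 <= gc_fraction <= 1.0:
        raise ValueError("gc_fraction must lie between 0 and 1")
    return gc_fraction ** 2 / 4


for gc in (0.40, 0.50, 0.60):
    print(f"GC {gc:.0%}: expected CpG {expected_cpg_frequency(gc):.2%}")
# GC 40%: expected CpG 4.00%
# GC 50%: expected CpG 6.25%
# GC 60%: expected CpG 9.00%
```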
- Bacteria exhibit CpG frequencies in keeping with statistical predictions. However, for vertebrates, especially mammals, the CpG frequency is depressed overall, though so-called HTF regions are marked by higher-than-expected CpG frequencies and are usually hypomethylated.
- the E. coli gpt gene, for example, has a CpG frequency of about 8.5%, while for lacI, it is about 9%.
- while this invention may be applied to any marker gene, it is especially suitable for marker genes where the wild-type gene has a CpG frequency substantially higher than is typical of genes of the target species, e.g., at least about twice the frequency (thus, >4% for mammalian target species), and more preferably at least about four times the frequency (>8% for mammalian target species).
- the marker gene in which the CpG dinucleotide frequency is reduced, preferably to the point that it is not substantially greater than the CpG dinucleotide frequency in genes of the target species (e.g., not greater than twice, better yet, 1½ times).
- the marker gene preferably is engineered so that its CpG dinucleotide frequency does not substantially exceed the frequency in mammalian genes, which is 2%.
- the CpG dinucleotide frequency is 1-3%.
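The frequency thresholds above can be applied to a candidate sequence as follows. This is a minimal sketch that assumes the CpG frequency is the number of CpG dinucleotides divided by the number of dinucleotide positions on one strand, which is one reasonable reading of the percentages in the text; the function names and the default reference value are illustrative.

```python
def cpg_frequency(seq):
    """Fraction of dinucleotide positions occupied by CpG (5'-CG-3')."""
    seq = seq.upper()
    if len(seq) < 2:
        raise ValueError("sequence must contain at least two bases")
    # "CG" cannot overlap itself, so str.count finds every occurrence.
    return seq.count("CG") / (len(seq) - 1)


def fold_over_target(seq, target_frequency=0.02):
    """CpG frequency of seq relative to the target-species figure.

    The default of 0.02 is the mammalian gene frequency quoted in the text.
    """
    return cpg_frequency(seq) / target_frequency
```

Under these assumptions, a wild-type marker gene with a ratio of about 2 or more falls within the range for which the approach is described as especially suitable, and a ratio of about 4 or more within the more preferred range.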
- Each amino acid of a polypeptide is encoded by a DNA triplet, or codon. Since there are four bases (A,T,C,G) in DNA, there are 4³ = 64 possible triplets. Three -- the stop codons -- direct the termination of the polypeptide chain. The remaining 61 possible codons encode the twenty proteinogenic amino acids. Each amino acid is encoded by one (Met, Trp) to six (Arg, Ser, Leu) different codons.
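The codon counts stated here can be reproduced by enumerating the standard genetic code. The sketch below writes the code in the conventional compact form (bases in TCAG order, one-letter amino acid codes, "*" for stop); the variable names are illustrative.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per codon in TCAG order; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

stop_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "*")
sense_codons = [c for c, aa in CODON_TABLE.items() if aa != "*"]
codons_per_amino_acid = Counter(aa for aa in CODON_TABLE.values() if aa != "*")

print(len(CODON_TABLE), len(stop_codons), len(sense_codons))
print(codons_per_amino_acid)
```

The counter confirms that Met (M) and Trp (W) have a single codon each, while Arg (R), Ser (S) and Leu (L) have six.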
- a CpG dinucleotide pair may be formed by the first and second bases of a codon, as in the Arg codon CGT; by the second and third bases of a codon, as in the Thr codon ACG; or by the last base of one codon and the first base of the next one, as in the Cys-Ala encoding sequence TGC.GCA. The last situation is called an "intercodon CpG".
- the Met (ATG), Trp (TGG), Lys (AAA, AAG) and Gln (CAA, CAG) codons are incapable of forming a CpG dinucleotide.
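The three ways a CpG can arise relative to the reading frame can be made concrete with a short Python sketch. It assumes the input is an in-frame coding sequence starting at the first base of a codon; the function names and labels are hypothetical.

```python
def classify_cpgs(coding_seq):
    """Locate each CpG in an in-frame coding sequence and say how it is formed.

    Returns (index_of_C, kind) tuples, where the index is 0-based and kind is
    "bases 1-2", "bases 2-3" or "intercodon".
    """
    kinds = {0: "bases 1-2", 1: "bases 2-3", 2: "intercodon"}
    seq = coding_seq.upper()
    return [(i, kinds[i % 3]) for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]


def can_form_cpg(codon):
    """True if a codon can take part in a CpG, internally or with a neighbor."""
    codon = codon.upper()
    return "CG" in codon or codon.endswith("C") or codon.startswith("G")
```

For example, `classify_cpgs("TGCGCA")` reports a single intercodon CpG, and `can_form_cpg` returns `False` for each of the Met, Trp, Lys and Gln codons listed above.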
- Table A sets forth five amino acids for which there is at least one CpG-containing codon, and lists the alternative codons, with the percentage of usage of that codon of all codons encoding the same amino acid, in mammals, given in parentheses. (Codon preferences for non-mammalian vertebrates are also available.)
- the next table (B) refers to other amino acids having codons ending with a "C". These form a CpG dinucleotide if followed by an Ala, Val or Glu codon (all of which begin with "G").
- TTA for Leu is disfavored, but not prohibited.
- a gene can be altered, without affecting the sequence of the encoded polypeptide, to reduce the number of CpGs to zero.
- the gene is more preferably modified so that 1% to 3% of the dinucleotides are CpG.
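The synonymous-substitution idea can be sketched in Python. This is a minimal illustration under stated assumptions, not the procedure used to design the genes described here: it drives the CpG count to zero, ignores the mammalian codon-usage preferences of Tables A and B, makes no attempt to retain diagnostic CCGG sites, and uses fixed replacement rules (Arg CGN to AGA; NCG to NCT; NNC to NNT before a codon beginning with G) chosen only because they are always synonymous in the standard genetic code. All names are hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(cds):
    """Translate an in-frame coding sequence into one-letter amino acid codes."""
    cds = cds.upper()
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def deplete_cpg(cds):
    """Remove every CpG from a coding sequence without changing the protein."""
    cds = cds.upper()
    if len(cds) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    # Pass 1: remove CpGs lying inside a codon.
    for i, codon in enumerate(codons):
        if codon.startswith("CG"):        # Arg CGN -> Arg AGA
            codons[i] = "AGA"
        elif codon.endswith("CG"):        # Ser/Pro/Thr/Ala NCG -> NCT
            codons[i] = codon[:2] + "T"
    # Pass 2: remove intercodon CpGs; NNC and NNT always encode the same residue.
    for i in range(len(codons) - 1):
        if codons[i].endswith("C") and codons[i + 1].startswith("G"):
            codons[i] = codons[i][:2] + "T"
    return "".join(codons)
```

A practical design would stop short of zero where the 1% to 3% range is preferred, and would weigh each substitution against codon usage and against the creation of unwanted sequence motifs.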
- CCGG MspI/HpaII sites
- a further consideration in designing the CpG-depleted marker gene is that one preferably should avoid creation of RNA splice sites. Consensus sequences for splicing donor and acceptor sites are given in Padgett, et al., Ann. Rev. Biochem., 55:1119-50 (1986). Otherwise, some mRNAs will be incorrectly spliced and may therefore be translated into a nonfunctional protein or a protein of different antigenic characteristics. The sequence AGGT is particularly undesirable as it is the predominant splice donor site; AGGC (a splice donor site) and AGG (a splice acceptor) should also be avoided if possible.
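A candidate sequence can be screened for the donor-site motifs named above with a simple scan. The sketch below is illustrative only: it looks for exact matches on one strand and does not model the full consensus sequences; the function and constant names are hypothetical.

```python
import re

# Motifs the text advises avoiding (AGGT is described as the predominant donor site).
SPLICE_DONOR_MOTIFS = ("AGGT", "AGGC")


def find_motifs(seq, motifs=SPLICE_DONOR_MOTIFS):
    """Return {motif: [0-based start positions]} for every occurrence, overlaps included."""
    seq = seq.upper()
    return {m: [hit.start() for hit in re.finditer(f"(?={m})", seq)] for m in motifs}
```

Running the scan before and after each codon substitution shows whether a proposed change has introduced one of these motifs.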
- the desired CpG-depleted marker gene may be prepared entirely synthetically, i.e., using DNA synthesizer apparatus. See Worall and Connolly, J. Biol. Chem., 265:21889-95 (1990). (Typically, the double stranded DNA will be subdivided into overlapping single stranded oligonucleotide segments. These will be synthesized separately, then ligated and annealed to form the desired DNA duplex.) However, if the marker gene is very large, it may be more desirable to eliminate unwanted CpGs through mutagenesis, e.g., cassette mutagenesis, of the wild-type gene. The individual cassettes may, of course, be prepared synthetically as described above.
- the invention is not limited to any particular method of preparing the CpG-depleted gene.
- mutant is not intended to indicate that the wild-type gene is obtained first, and then altered. It includes even a wholly synthetic gene, provided that gene differs by at least one base pair from the naturally occurring gene which is closest in sequence to the mutant marker gene.
- a CpG-depleted mutant gene is one having at least one fewer CpG than the naturally occurring gene which has the greatest sequence similarity to the CpG-depleted gene.
- the engineered structural sequence of the marker gene will be operably linked to regulatory sequences which are functional in the cells in which the selectable or screenable phenotype conferred by the marker gene is to be looked for.
- the most important of these regulatory sequences are the promoters.
- the transcription of the coding strand of the gene is accomplished by DNA-directed RNA polymerase, which binds to the promoter region.
- Promoters may contain regulatory elements which render transcription tissue- or developmentally-specific, or which make transcription regulatable by inducer or repressor molecules.
- the promoter may be constitutive, inducible or repressible; the choice will depend on the character of the polypeptide encoded by the marker gene.
- A great variety of promoters have been used to drive expression of unrelated genes in transgenic animals.
- mouse metallothionein (MT) human MT
- mouse serum amyloid (SAA) mouse myc
- mouse alpha2 mouse alpha2
- mouse H-2K (class I MHC), viral thymidine kinase, Rous sarcoma virus LTR, mouse mammary tumor virus LTR, rat elastase, mouse albumin, mouse transferrin, human growth hormone releasing factor, mouse alphaA-crystallin, mouse beta-globin, mouse IgH and mouse amylase promoters. See Palmiter and
- the promoter used should be one which is not tissue-specific.
- a preferred promoter for driving expression of a marker gene is the beta-actin promoter, which, unlike the alpha-actin promoter, drives a gene whose expression is believed to be ubiquitous.
- For the sequence of the beta-actin promoter from -2011, see Miyamoto, Nucleic Acids Res., 15:9095 (1987).
- Other preferred ubiquitous promoters include the various tRNA promoters, the ribosomal RNA promoter, the ribosomal protein promoter, and the histone promoter.
- for the methionyl tRNA promoter, see Nucleic Acids Res., 12:1101-15 (1984).
- the present invention extends, however, to the use of tissue-specific promoters as well.
- the terminator (polyA addition site) sequence may be the endogenous terminator sequence of the marker gene, or it may be a foreign terminator, such as the terminator of the SV40 early gene or of the bovine growth hormone gene.
- the ribosomal binding site may be the endogenous ribosomal binding site, or one which provides increased translational efficiency, such as the Kozak sequence.
- Enhancer sequences may be used to increase expression, or to limit it to particular tissues, developmental stages, etc.
- a regulatory element of interest appears in the first intron of the beta-actin gene. It is believed to act as an up-regulator of transcription in a non-tissue-specific manner.
- the marker gene and its associated regulatory sequences, hereinafter referred to as the transgene, must be introduced into the cells of a host animal.
- the target species is a vertebrate
- the host animal is also, preferably, a vertebrate.
- the choice of a suitable host animal depends on (a) its genetic and metabolic similarity to the target animal, and (b) the time and expense involved in producing and maintaining the transgenic animals.
- Preferred host animals include, among the mammals, mice, rats, rabbits, hamsters and pigs, and among other vertebrates, transgenic fish.
- Pigs are of interest since the anatomy of the pig (including the skin) is very similar to the human. Directing transgenic expression to the skin of pigs would create a useful model for the testing of cosmetics.
- Fish may have an advantage in that various species of fish exhibit desirable characteristics relating to their use as laboratory animals. In fact, fish have a long history of performing in this capacity. They have played a critical role in the development of environmental biology, embryology, endocrinology, neurobiology and other areas. Research in fish has established much of our basic knowledge of membrane transport systems at the molecular level. Transgenic examples of at least 10 different species of fish have been produced. Several mammalian promoters have been shown to function in fish (1). See Chen, Thomas T. and Powers, Dennis A., Transgenic fish, Trends in Biotechnology, Vol. 8, No. 8, 1990, pp. 209-215.
- SV40 early promoter include the Rous sarcoma virus LTR promoter, the mouse metallothionein promoter, the flounder luciferase promoter and the flounder alpha fetoprotein promoter. It is further believed that the cytomegalovirus promoter and phosphoenolpyruvate carboxykinase (PEPCK) promoters would be functional in fish. In general, promoters of piscine genes, genes of viruses which infect fish, and genes which are strongly conserved among the vertebrates are likely to be functional in fish.
- Useful marker genes for fish models include the chloramphenicol acetyltransferase gene and the luciferase gene.
- Stuart, et al., Development, 109:577-584 (1990) describes an assay for expression of a CAT transgene.
- Assays for other genes transferred to fish, including various growth hormones, the E. coli beta-galactosidase (lacZ) gene and the E. coli hygromycin resistance gene, have been reported in the references cited in Table 1 of Chen, et al., and of course assays for expression of still more genes may be adapted from piscine systems.
- the zebrafish (Brachydanio rerio) has been used to produce stable lines that exhibit reproducible patterns of transgene expression. See Stuart, Gary W., et al., Stable lines of transgenic zebrafish exhibit reproducible patterns of transgene expression, Development 109, 577-584, 1990. They are much less expensive to buy and raise than any mammalian species. They are extremely fecund, oviparous and are externally fertilized. Because of these factors, it is much less expensive and technically less complicated to perform gene transfer procedures on them. Their eggs are transparent and embryonic development occurs at a much faster rate than in the mouse. Large scale production of homozygous diploid zebrafish can be obtained in a reproducible and relatively simple manner. See Streisinger, G., Walker, C., Dower, N., Knauber, D., Singer, F., Nature 291, 293, 1981.
- Certain enzymes are known to play a role in the conversion of promutagens into mutagens.
- Host animals may be selected, on a species and/or individual level, to provide a level of activity of these enzymes which is comparable to (or if desired to increase the margin of safety, higher than) that of the target animal of interest. If a particular species of animal, such as a mouse, is deficient in a particular enzyme of this type, it may be modified, by crossbreeding or genetic engineering, to provide
- a transgenic animal may be produced that features a P450 enzyme missing in the mouse, or homologous recombination may be used to replace it with a human counterpart or to insert a stronger promoter upstream of a gene encoding such an enzyme.
- this background level of mutation is low, e.g., less than 10⁻⁵ to 10⁻⁶.
- the natural environment of an animal may make it better suited for testing certain scenarios of chemical exposure.
- waterborne chemicals are preferably tested using transgenic fish (or amphibia or aquatic mammals).
- if an animal is particularly sensitive to mutagens, it may be useful in detecting less potent mutagens.
- a final issue is the economic importance of the animal.
- a chemical which has a detrimental effect on an economically important animal may be rejected even if it does not have a serious adverse effect on humans. This could be the case with, for example, honey bees, or with fish.
- the laboratory mouse has been the most popular host animal for use in the development of transgenic animals, as there are numerous strains available. Mice are, of course, the most widely available laboratory animal, and many strains are available. See Genetic Variants and Strains of the Laboratory Mouse (Gustav Fischer Verlag, 1981) . However, there are no substantial restrictions on the use of other laboratory or livestock species in such work. Among the higher mammals, pigs are preferred, and fish offer an interesting alternative to mammalian subjects.
- DNA may be introduced into host cells by microinjection, electroporation, infection, and other mechanisms such as lipofection and cell receptor-mediated transfer. While the DNA may plainly contain bacterial genes, procaryotic vector DNA (more particularly any procaryotic replicon) should be removed before the transgene is introduced into the host cell(s) to be developed into a transgenic animal.
- The most common technique for the production of transgenic animals involves the microinjection of the transgene into the pronucleus of fertilized eggs. Because integration usually accompanies DNA replication, about 70% of the transgenic mice carry the transgenes in all of their cells, including the germ cells. In the remaining 30%, integration apparently occurs after one or more rounds of replication; hence, the transgene is found in only a fraction of the cells. These mice usually exhibit the same degree of mosaicism in somatic and germ cells, but in some mice the germ cells may totally lack the transgene. In the latter case, the mice will be unable to transmit the transgene to their progeny.
- One of the requirements for successful pronuclear microinjection is the ability to locate the pronucleus.
- Transgenes may also be incorporated into the host cell genome by microinjection of DNA into the cytoplasm of fertilized or unfertilized eggs, into the nuclei of two-cell embryos, or into the blastocoel cavity. Mosaicism is more prevalent with these approaches.
- microinjection alternatives include electroporation, liposome-mediated entry, and particle gun bombardment.
- Preimplantation embryos may also be infected with retroviruses engineered to carry the transgene. This method has found particular favor for the production of transgenic birds.
- Still another method for the production of transgenic animals is to introduce the transgene, on a suitable vector, into totipotent teratocarcinoma or embryonic stem cells and then incorporate these cells into embryos.
- once transgenic animals are produced, they (or their transgenic progeny) are exposed to the suspect chemical.
- the exposure may be by ingestion, inhalation, injection, or skin contact.
- the dosage employed may be one comparable to that experienced by the target species in the environment of interest, or it may be a higher dose, in order to provide a margin of safety.
- the animals are examined to determine whether the marker gene has been mutated.
- the marker gene confers a phenotype which can be detected without killing the animal, e.g., one which may be detected by in vivo imaging means.
- In vivo imaging means known in medicine include CAT, PET, NMR and MRI.
- the transgene or its expression product must have a characterizing feature which is recognizable by a detectably labeled homing agent.
- monoclonal antibodies may be prepared which bind a wild-type polypeptide, in preference to the mutant polypeptide encoded by the marker gene. These antibodies may be detectably labeled and injected into the animal. If the epitopes for these antibodies are reestablished by a specific reverse mutation, these antibodies may be localized by scintigraphic means known in the art. (A forward assay can also be envisioned, but is less desirable because of the increased background.)
- Certain cells may, of course, be removed without killing the transgenic animal. These include blood cells, skin cells, mucosal cells, etc. Such cells may be removed and examined as described below.
- this method, while permitting the monitoring of the development of the mutagenic effect of the chemical in certain tissues over time, does not provide information as to mutagenesis of the marker gene in all tissues and organs. Therefore, in another and preferred embodiment, the animal is sacrificed so that all of its tissues and organs of interest may be examined for mutation of the transgene. In this embodiment, it is not strictly necessary that the animal have expressed the marker gene. However, it is preferable that the animal express the marker gene, since mutation rates may be different for expressed and unexpressed genes. Mellon, et al., PNAS, 83:8878-82 (1986).
- transgenic mice which are homozygous for lacI are mated with transgenic mice which are homozygous for lacZ under lacO control.
- the progeny are hemizygous for a single copy of lacI and for one or two copies of lacZ.
- the progeny animals are exposed to the potential mutagen. If the lacI gene is mutated, the cells of the progeny animal will stain blue since the lacZ gene is then derepressed. (Having more than one copy of the lacI gene is undesirable, since then both copies must be mutated in order to derepress the lacZ gene.)
- a variety of other phenotypic characteristics could be used to identify cells containing a mutagenized form of the marker gene. These include antibiotic sensitivity or resistance, antigenicity, etc. (See discussion of marker genes above.)
- the marker gene may be recovered from the genomic DNA of the transgenic animal.
- a variety of techniques are known for rescue of a foreign gene from genomic DNA. These include rescue of lambda proviruses, plasmid rescue, and rescue of filamentous phage DNA.
- One method is the use, as previously discussed, of a lambda packaging extract.
- mutations may be detected in any of several ways. First, as the marker gene confers a selectable or screenable phenotype, the recovered marker gene may be cloned into a suitable "assay" cell, such as a bacterial cell, and the transformed cells may then be exposed to selection or screening conditions.
- the marker gene must be expressible in the assay cells, and therefore must be operably linked (either originally, or as a result of further manipulation) to a promoter functional in those cells. This procedure will detect forward and back mutations, but not silent or neutral mutations. Silent and neutral mutations may be screened for by extracting genomic DNA and hybridizing it to a panel of oligonucleotide probes, each directed against a different locus of the marker gene, under stringent conditions. The failure of one of these probes to hybridize is then indicative of the presence of a mutation.
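The probe-panel logic can be illustrated in silico. The sketch below is a simplification under a stated assumption: stringent hybridization is approximated by requiring an exact match between each probe's wild-type target and the recovered marker sequence, which real hybridization conditions only approach. The function name, probe names and data layout are hypothetical.

```python
def failed_probes(marker_seq, probe_targets):
    """Names of probes whose target sequence is no longer present in marker_seq.

    probe_targets maps a probe name to the wild-type sense-strand sequence it
    is directed against. Stringent hybridization is approximated here by
    requiring an exact match.
    """
    marker_seq = marker_seq.upper()
    return [name for name, target in probe_targets.items()
            if target.upper() not in marker_seq]
```

A panel whose targets tile the marker gene localizes a mutation to the locus of whichever probe fails, including a silent change that would not alter the selectable or screenable phenotype.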
- the use of transgenic animals in mutagenic assays also allows one to determine whether a promutagen or its metabolic products can cross the placenta or the blood-brain barrier.
- Plasmid pCMVlacI (5.5 Kb) (Brown, et al., Cell, 49:603-12, 1987; Figge, et al., Cell, 52:713-22, 1988), a source of the lacI gene, was digested with EcoRI, and the resulting 1.1 Kb fragment was cloned into the EcoRI site of plasmid pBSK+ (2.9 Kb) (Stratagene), creating the plasmid pBSKlacI (4.0 Kb). No promoter is operably linked to the lacI gene in pBSKlacI.
- the β-actin promoter was excised from the plasmid pHβApr-1 (6.6 Kb) (Gunning, et al., Proc. Nat. Acad. Sci. USA, 84:4831-35, 1987) by restriction with HindIII and BamHI, and this 4.3 Kb fragment was ligated with a 1.1 Kb fragment obtained by BamHI/HindIII digestion of pBSKlacI to obtain pHβlacI (7.7 Kb), in which the lacI gene is under the transcriptional control of the β-actin promoter.
- the hybrid gene was modified to further encode the heptapeptide (PKKKRKV) nuclear location signal from SV40 large T antigen.
- Plasmid pHβlacI (7.7 Kb) was cut with HindIII and BamHI, thereby excising the 3' untranslated flanking region of the lacI gene.
- the remaining 6.6 Kb fragment was ligated with a 1.1 Kb fragment obtained by digestion of plasmid pSZN5 (a.k.a. pMTlacINLS, a derivative of pMTlacI (Brown, et al., Cell, 49:603-12, 1987)) with HindIII and BglII, thereby producing the new plasmid pHβlacINLS (7.7 Kb).
- Plasmid tkneo was cut with BamHI and HindIII, releasing a 2 Kb fragment. Both ends of this fragment were then blunt-ended. Plasmid HSL mutants pSAM was cut with EcoRI, yielding a 2.7 Kb fragment. This, too, was blunt-ended. The two blunt-ended fragments were then ligated to obtain plasmid HSLmutants-neo. This was cut with ScaI to obtain a 2.6 Kb fragment.
- Plasmid pHβlacINLS was linearized with SspI, and ligated to the neo-bearing ScaI fragment to obtain pHβlacINLSneo (10.7 Kb).
- Plasmid pHblNB was prepared by cutting pHblacI with HindIII/BamHI, and ligating it with a 1.1 Kb HindIII/BamHI fragment obtained by cutting pSZN5 with BglII, blunt ending the BglII ends, cutting again with HindIII, and attaching a BamHI linker.
- Transgenic mice were produced substantially according to the following standard protocol. (For further details, see Chandrashekar, et al., Neuroendocrine Research Methods and Functions in Transgenic Mice, in Greenstein, D.B., ed., Vol. 1, Chap. 15, Neuroendocrine Research Methods, 315-336 (Howard Academic Pub., London: 1991).)
- Embryos from B6SJL F1 female mice bred to males of the same strain are used in our laboratory because they culture well in our hands and are favorable for microinjection because they have little cytoplasmic pigmentation.
- the embryo donor B6SJL females are superovulated with 5 I.U. of pregnant mares' serum gonadotropin (PMS) at 12:00 noon three days prior to embryo collection. Forty-eight hours later at 12:00 noon, the ovulation of these mice is synchronized by injection of 5 I.U. of human chorionic gonadotropin (HCG), and embryos are collected at 9:00 am the following morning from the ampulla of the oviducts of the embryo donors following sacrifice by cervical fracture. The collected embryos are treated with bovine testis hyaluronidase to remove cumulus cells, washed five times, and incubated under 90% N2, 5% O2, 5% CO2 at 37°C in Brinster's medium until further use.
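The timing of the superovulation steps can be laid out with a small scheduling helper. This Python sketch simply encodes the intervals stated in the protocol (HCG 48 hours after PMS, embryo collection at 9:00 am the morning after HCG); the function name and dictionary keys are hypothetical.

```python
from datetime import datetime, time, timedelta


def superovulation_schedule(pms_injection):
    """Derive the HCG and embryo-collection times from the PMS injection time."""
    hcg_injection = pms_injection + timedelta(hours=48)
    next_morning = hcg_injection.date() + timedelta(days=1)
    embryo_collection = datetime.combine(next_morning, time(9, 0))
    return {"PMS": pms_injection, "HCG": hcg_injection, "collection": embryo_collection}
```

With a PMS injection at 12:00 noon, the collection date falls three calendar days later, in agreement with the protocol.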
- Plasmid pHbLacI was digested with SspI and BamHI to remove all procaryotic vector sequences, and 25 µl of fragments (concentration 25 ng/µl) were microinjected into the male pronucleus of the collected embryos. Microinjection is carried out using two Leitz micromanipulators controlling a suction holding pipette and a 1 µm injection pipette.
- the holding pipette is connected via tubing to a 500 µl threaded plunger Hamilton syringe; the injection pipette is connected via tubing to a microsyringe. In both cases the syringe and tubing are filled with light paraffin oil.
- the injected embryos are transferred into the oviducts of white CD-1 female mice previously bred to vasectomized males. These recipient females are selected by the presence of vaginal plugs on the morning the embryo microsurgery is performed. Ten injected embryos are transferred to each oviduct of each recipient. Approximately 20 days later, pups are born. When uninjected embryos are transferred as controls, the average litter size is 14. In our experience, 90 to 95% of recipients will give birth with an average litter size of 7 to 8 pups. Mice produced from microinjected eggs are weaned a month after birth. Segments of tails are analyzed by DNA hybridization analysis for the presence of the injected gene construct. In the instant experiment, the microinjected embryos were transplanted into the preimplantation uteri of nine pseudopregnant females. These females produced 63 pups, in eight litters.
- Genomic DNA from the tails of several transgenic mice was digested with a restriction enzyme (BamHI, BglII, EcoRI, HindIII, HpaII, MspI, NotI, PstI or RsaI) and characterized by Southern blotting.
- the probe was prepared by digesting plasmid pCMVlacI with EcoRV, which linearizes the plasmid, and then labeling the linearized pCMVlacI with [32P]dATP. The probe was hybridized to the blotted fragments at 42 deg. C.
- lacI cell line DNA (mouse NIH 3T3 or human fibrosarcoma HTD114 derivatives) was subjected to a similar analysis.
- the HpaII and MspI patterns were the same, indicating that lacI was not methylated in cell line DNA.
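The reasoning behind the HpaII/MspI comparison can be sketched as a simulated digest. Both enzymes recognize CCGG, but HpaII does not cut when the internal cytosine of the site is methylated, whereas MspI does; identical fragment patterns therefore indicate that the CCGG sites are unmethylated. The sketch assumes linear DNA, models only methylation of the internal cytosine, and uses hypothetical function names.

```python
def ccgg_sites(seq):
    """0-based start positions of every CCGG site."""
    seq = seq.upper()
    return [i for i in range(len(seq) - 3) if seq[i:i + 4] == "CCGG"]


def fragment_lengths(seq, enzyme, methylated_sites=frozenset()):
    """Fragment lengths from digesting linear DNA at CCGG sites.

    MspI cuts every CCGG site; HpaII is blocked when the internal cytosine
    of the site is methylated. methylated_sites holds the start positions
    of the methylated sites.
    """
    if enzyme not in ("MspI", "HpaII"):
        raise ValueError("enzyme must be 'MspI' or 'HpaII'")
    sites = ccgg_sites(seq)
    if enzyme == "HpaII":
        sites = [s for s in sites if s not in methylated_sites]
    cuts = [0] + [s + 1 for s in sites] + [len(seq)]  # both enzymes cut C^CGG
    return [b - a for a, b in zip(cuts, cuts[1:])]
```

When a site is methylated, the HpaII digest yields a longer fragment where MspI yields two shorter ones, which is the difference a Southern blot would reveal.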
- mice 02.11.03 (female) (BCF1 background), 09.07 (female) (DBA/2J bkgd) and 09.03.03 (male) (129/SV bkgd) were sacrificed, and their liver, spleen, heart, testis/ovary (09.03.03 and 02.11.03 only), uterus, kidney and muscle (02.11.03 and 09.03.03 only) tissues were removed and frozen in liquid nitrogen. Liver, testis/kidney and heart RNA was extracted by the acid phenol method.
- a sensitive lacI probe was made by PCR amplification of a lacI template. This was hybridized to the aforementioned RNAs.
- Southern blots were prepared of lacI09 lineage DNAs to compare the degree of methylation of the lacI gene in DBA/2J, 129/Sv and BALB/c X C3H backgrounds. There were no apparent differences in the first generation. In later generations, the lacI gene was demethylated to some degree in the DBA/2J line but remained hypermethylated in the other two lines.
- Figure 2 depicts a lacI gene modified to reduce the number of CpG dinucleotides from 95 (~18%) to four (~0.8%).
- the gene of Figure 2 is prepared by chemical synthesis of component oligonucleotides and their subsequent ligation and annealing to form the desired lacI mutant.
- the lacI gene was synthesized in segments by annealing double stranded oligonucleotides with overhanging, complementary ends of 10 nucleotides.
- the following single stranded oligonucleotides were synthesized for use in assembling the lacImlRNL gene, including the HindIII site, the Kozak RBS, the modified lacI sequence, the three codon linker- and the seven codon NLS-encoding sequence. The numbering begins with the first base on the sense strand in Figure 2 (SEQ ID NO:3).
- the oligonucleotides a through t and a' through t' were synthesized on an Applied Biosystems 391 DNA synthesizer (PCR-MATE) per the vendor's instructions.
- the oligonucleotide a is complementary to a', b to b', c to c', etc.
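The layout of complementary oligonucleotide pairs with 10-nucleotide overhangs can be illustrated with a short design sketch. It is not the actual oligonucleotide set of Figure 2: the oligonucleotide length is an arbitrary illustrative value, the termini are not adapted to restriction sites, and all names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def design_duplex_oligos(sense, oligo_len=60, overhang=10):
    """Split a sense strand into top/bottom oligo pairs with staggered ends.

    Top oligos tile the sense strand; each bottom oligo is the reverse
    complement of the region shifted `overhang` bases downstream, so every
    annealed pair carries a single-stranded end complementary to the start
    of the next top oligo. oligo_len is an arbitrary illustrative value.
    """
    if not 0 < overhang < oligo_len:
        raise ValueError("overhang must be positive and shorter than oligo_len")
    sense = sense.upper()
    pairs = []
    for start in range(0, len(sense), oligo_len):
        top = sense[start:start + oligo_len]
        covered = sense[start + overhang:start + oligo_len + overhang]
        # An empty bottom oligo means the previous overhang already covers this top fragment.
        pairs.append((top, reverse_complement(covered)))
    return pairs
```

Each returned pair corresponds to one double stranded unit (such as a/a'); adjacent units anneal through the overhang before ligation.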
- the lacI gene was modified to remove all but 4 CpGs, 3 of which (positions 48, 606 and 1027) are part of MspI/HpaII methylation diagnostic sites.
- the modified lacI gene was synthesized in 2 halves.
- the 5' half is bounded by a HindIII site and an ApaI site; the 3' half by an ApaI site and a BamHI site.
- the double stranded oligos a/a', b/b' and c/c' were incubated together, allowed to anneal, and ligated with T4 DNA ligase.
- the trimeric product was separated by agarose gel electrophoresis and recovered from the gel. The same procedure was followed with double stranded oligonucleotides d/d', e/e', f/f' and g/g', and with h/h', j/j' and k/k'.
- the three trimeric products were incubated together, allowed to anneal via complementary overhanging ends, and ligated with T4 DNA ligase in the presence of Bluescript SK plasmid DNA (Stratagene) that had been digested with HindIII and ApaI.
- the DNAs were used to transform E. coli XL1 cells, and transformed cells with plasmids containing inserts were identified by the absence of blue color development after staining with the chromogenic agent X-gal (Sigma) and IPTG.
- Plasmid DNAs from white colonies were isolated and tested for a 570bp insert representing the 5' half of the modified lacI gene by digestion with HindIII and ApaI and size fractionation. Inserts of the correct size were sequenced to ensure that no unwanted mutations were inadvertently introduced. The same procedure was used to synthesize, ligate and clone the oligonucleotides encoding the 3' end of the modified lacI gene.
- the double stranded oligonucleotides were annealed, ligated and recovered in the following groupings: l/l', m/m' and n/n'; o/o', p/p', q/q' and r/r'; and s/s' and t/t'.
- the three oligonucleotide multimers were annealed together and ligated in the presence of Bluescript SK plasmid DNA (Stratagene), and the DNA used to transform E. coli XL1 cells. Plasmids with inserts were identified as above, and correct insert size determined by ApaI/BamHI digestion. Inserts were sequenced to ensure the absence of unwanted mutations.
- the complete gene was assembled by digesting the plasmid containing the 5' half with HindIII and ApaI and the plasmid containing the 3' half with ApaI and BamHI.
- the inserts were recovered and annealed and ligated in the presence of plasmid Bluescript SK digested with HindIII and BamHI in a 3 way ligation.
- white colonies were picked after staining with X-gal and IPTG induction.
- Proper insert size (1.13kb) was established by digesting plasmids with HindIII and BamHI, and the entire modified lacI gene was sequenced to ensure that it was correct.
- the gene had a 5' HindIII site; the beta actin promoter has a 3' HindIII site and can readily be linked at that site.
- the final construct is cloned in Bluescript SK+ (Stratagene).
- the resulting expression vector is then digested with SspI and BamHI to linearize the vector and remove procaryotic sequences, and the lacI-bearing fragment is then microinjected in the male pronuclei of fertilized mouse eggs as previously described. Production of transgenic animals is then by the method set forth above. It is expected that, as a result of the CpG depletion, the engineered lacI gene will be only weakly methylated, and therefore will be better expressed.
- the E. coli gpt gene may be used.
- 40 are CpGs.
- the nucleotide sequence and position of CpGs are shown in Figure 3 (SEQ ID NO:5) .
- Oligonucleotides containing the modified DNA sequences with absent or reduced CpGs are synthesized using an Applied Biosystems 391 DNA synthesizer.
- the modified sequences and oligonucleotides and endpoints are denoted in Figure 4 (SEQ ID NO:7).
- the strategy is to hybridize complementary oligonucleotides (e.g., a and a'; b and b', etc.) to form double stranded oligos with 10 nucleotide overhangs.
- the double stranded oligos a/a', b/b' and c/c' are incubated together allowing ends to anneal, ligated with T4 ligase, and the trimeric product is purified from an agarose gel.
- the same procedure is followed for oligonucleotides d/d' and e/e', and for oligonucleotides f/f' and g/g' and h/h'.
- the gel purified trimeric and dimeric products are incubated together to allow annealing in the presence of EcoRV-digested Bluescript SK.
- E. coli XL1 cells (Stratagene) are transformed with the product and unstained colonies are picked for analysis after chromogenic staining with X-gal and IPTG. Plasmids from white colonies are checked for proper sized inserts by cleavage with EcoRI and HindIII, and plasmids with inserts of about 460bp are subjected to DNA sequencing to ensure that the sequence is correct and no unwanted mutations are present.
- Pigs are one of the standard experimental models for humans in clinical studies.
- Transgenic pigs may be produced by the following procedure, which uses commercial cross-bred sows and boars.
- Parental stock may come from the Landrace, Dodge, Duroc and Hampshire breeds. About 24 h after previous litters are weaned, sows used as zygote donors are induced to ovulate with 400 i.u. PMSG (i.m.)
- Donor sows are artificially inseminated with 120 ml of fresh, extended semen 24 and 30 h after the onset of oestrus. (Recipient sows are synchronized in oestrus with the donor sows, but are not inseminated.) A mid-ventral laparotomy is performed on the donor animals and zygotes are flushed from the oviducts with warm (37 deg. C.) modified BMOC-3 medium containing HES.
- the zonae pellucidae of the recovered eggs are examined for the presence of spermatozoa.
- One- and two-cell zygotes are centrifuged at 10,000 x g to facilitate visualization of pronuclear and nuclear structures.
- the zygotes are placed in cover-slip chambers in microdrops of modified BMOC-3 covered in silicone oil.
- Microinjection of about 10 pl of DNA-containing solution follows the procedure previously described for the mouse experiments.
- A midventral laparotomy is performed on the recipient sows and zygotes are inserted into the oviduct of animals identified by ovarian morphology as having ovulated.
- Example 9 Production of Transgenic Fish
- the zebrafish, Brachydanio rerio, is a simple vertebrate with a number of desirable characteristics. Hundreds of eggs can be produced daily on a year round basis from a small number of adult fish. Eggs can be fertilized in vitro; as in the frog, fertilization is external. Zebrafish embryos are optically transparent, so embryonic development can be monitored and cell types within the embryo identified. The fish develop rapidly, hatching from their chorions at 2 to 3 days post fertilization. The generation time is only 3 to 4 months.
- Zebrafish are maintained in aquaria under conditions conducive to rearing, mating and spawning, e.g., 12-16 fish per tank, 28.5 ± 1°C, 14 h light/10 h dark cycle.
- zebrafish care and maintenance see Streisinger, Nat. Cancer Inst. Monogr. 65:53-58 (1984).
- the embryos in embryo medium are placed on a depression slide and injected with the aid of a dissecting microscope and
- the DNA solution is injected cytoplasmically through a continuously flowing micropipette; the flow rate may be controlled with pressurized air. Phenol red may be added to the solution to aid in estimating the volume injected.
- DNA may be extracted from whole fish at 1-3 weeks of age by
- the zebrafish is a small, laboratory-adapted vertebrate species which can be cared for more easily than most mammalian subjects.
- a variety of mutations may be detected by studying their effects on a CpG-depleted marker gene in a zebrafish model.
- the marker, or mutational target, gene need not itself express a detectable product. It may instead express a product that serves as a substrate for the product of a second (reporter) gene, or as a cofactor for the action of that product, or, as in this example, as a means of regulating the expression of the second gene.
- the Lac repressor, expressed by a lacI target gene, may be used to extinguish expression of beta-galactosidase by the lacZ indicator gene.
- One method of obtaining a lacI/lacZ transgenic animal is by mating (a) a transgenic mouse homozygous for a single copy of the target transgene, lacI, and (b) a transgenic mouse homozygous for the indicator transgene, lacZ, which may be present in one or two copies.
- Production of mouse line (a) is described in Example 2.
- Production of mouse line (b) is set forth in Example 11.
- the modified lacI gene with reduced CpG and (preferably) mammalian codon usage is directed by the normal bacterial lacI promoter, or an alternative prokaryotic promoter such as trp, and introduced into a lambda phage shuttle vector that has been previously described with the bacterial lacI by Kohler, et al., Proc. Natl. Acad. Sci. USA 88:7958-62, 1991, and that includes the α subunit of lacZ, a β-lactamase gene, and a ColE1 replication origin, all flanked by the initiator and terminator halves of the f1 filamentous phage origin.
- the vector is introduced into mice by pronuclear injection, and the transgenic mice are bred to homozygosity for the transgene.
- the mice are exposed to mutagen and genomic DNA is subsequently prepared from selected tissues as previously described: Kohler, S.W. et al., Nucleic Acids Res. 18:3007-13, 1990; Kohler, S.W. et al., PNAS 88:7958-62, 1991.
- the shuttle vector is rescued from genomic DNA by packaging the shuttle vector DNA into infective λ virions using an in vitro λ packaging extract (Transpack from Stratagene Cloning Systems), preadsorbing to E. coli SCS-8 (Stratagene Cloning Systems), mixing with top agar containing 2 mg of X-gal per ml of top agar and pouring onto assay plates with a bottom agar layer.
- Rescued phage containing wild type lacI will produce colorless plaques while rescued phage with mutant lacI will produce blue plaques.
- the ratio of blue to colorless plaques is indicative of the mutagenicity of the compound.
- the DNA containing mutant lacI can be excised from the lambda phage in vivo (Kohler, S.W., Nucleic Acids Res. 16:7583-7600, 1988), and the mutant lacI gene sequenced.
- the modified lacI gene with prokaryotic promoter is linked to a DNA sequence that is recognized by and binds to a specific protein or other substance.
- one such DNA sequence is the lac operator (lacO), which specifically binds the lac repressor with high affinity.
- lacO lac operator
- the lac operator is placed close to the lacI gene, so as not to interfere with expression, and mice are rendered transgenic for this construct by pronuclear injection. After breeding mice to homozygosity for the transgene, the animals are exposed to the mutagenic environment.
- the DNAs are isolated from selected organs and tissues of the exposed animal, and the DNA is digested with an enzyme that cleaves outside of both the operator sequence and the lacI gene, leaving intact DNA fragments containing both sequences.
- lac repressor protein attached to magnetic beads.
- the repressor binds the operator sequence, and the complex is separated from the remainder of the DNA by use of a magnet.
- the separated fragments are cloned into a plasmid with an ampicillin-resistance marker and used to transform E. coli that constitutively express lacZ due to mutant or absent lacI. Ampicillin resistant colonies containing mutant lacI will stain blue with X-gal while colonies with wild-type lacI will not.
- bipartite detection system which is not limited to lacI/lacZ
- an "inhibitory" gene is the target gene
- a reporter gene as the target gene (i.e., lacZ alone)
- mutation in the target transgene is manifested as stained cells on an unstained background, whereas if lacZ were the target, mutation would appear as unstained cells on a stained background.
- Example 11 Production of Transgenic Mice Expressing a Bacterial lacZ Gene
- the constructs will be generated as described in Figure 6.
- the promoterless β-galactosidase gene with 3' SV40 processing signals is bounded by HindIII (5') and XhoI (3') sites in the Bluescript-based plasmid pLZ6.6 (Figures 5B and 6A).
- the core of this sequence is an 18 bp palindrome that behaves as a mutant operator which binds lac repressor about 8 times more tightly than the wild-type operator sequence (Brown, et al., Cell, 49:603-612, 1987).
- the oligonucleotides will be hybridized and directionally cloned into the unique BamHI/HindIII sites upstream of the lacZ gene (Figure 6B).
- a 396 bp fragment extending from 6 nucleotides upstream of the APRT translation start codon (position -6) to position -402, encompassing the entire aprt promoter (Dush, et al., Nucleic Acids Res., 16:8509-8524, 1988) and flanked by BamHI linkers, will be cloned immediately upstream of the operator sequence, and its orientation determined by the position of an asymmetrically located SmaI site (Figure 6C).
- a neo marker driven by an HSV tk promoter and flanked by XhoI sites will be inserted at the unique XhoI site in either orientation (Figure 6D) for selection purposes, and the resultant plasmid designated pAPlacOZneo.
- aprt promoter constructs complementary oligonucleotides encoding operator sequence with appropriate cohesive ends will be synthesized, and inserted at the sites indicated in Figure 7. Following conversion of the 3' end of the modified aprt promoter to a HindIII site, the fragment will replace the aprt/lacO promoter construct in Figure 6D and will be inserted into the resulting BamHI/HindIII site immediately preceding the lacZ gene.
- the system may be validated and optimized by testing various known and suspected mutagens and/or carcinogens.
- N-nitrosoethyl urea (NEU) is a transplacental mutagen of the N-nitroso family of carcinogenic agents. It causes neurogenic tumors in a variety of species, including mice (Rice, et al., Ann. N.Y. Acad. Sci., 381:274-289, 1982), and papillary lung tumors in the progeny of pregnant females exposed to the agent (Rehm, et al., Cancer Res., 48:148-160, 1988). Since there is no real precedent to follow, we will first expose fetal mice to NEU via i.p. injection of the mother with 0.1 mmol to 0.5 mmol NEU per kg on gestational days 14, 16 and 18.
- mice from different litters at 1 wk, 6 wk, 20 wk and 1 yr of age for each time and dose of NEU administration, which is equivalent to the number used to unequivocally detect an association with lung tumors.
- Analysis of stained sections should define which organ(s) and tissue(s) in progeny mice are most susceptible to mutation following i.p. administration of NEU to the mother. It will also define the gestational age at which the fetus is most susceptible to this agent, and will establish the presence or absence of a dose response relationship between the amount of NEU administered and the number of stained foci per organ that one detects.
- a second class of carcinogen which may be tested is the aromatic amines, whose carcinogenic characteristics have been recognized since the turn of the century. These compounds have been widely used in industry, see Haley, et al., Handbook of Carcinogens and Hazardous Substances, eds. M.C. Bowman, Marcel Dekker, Inc., New York, Basel, 1982. For example, the DuPont Company has screened workers exposed to β-naphthylamine for the occurrence of bladder cancer (Mason, et al., J. Occup. Med., 28:1011-1016, 1986). Other studies (e.g.
- target organs such as liver lack dividing cells so that a mutation in the lacI gene will be manifested as only a single stained cell.
- immature animals at an age when mitotic division is still active in most organs like liver, larger foci will be evident due to a lacI mutation in a progenitor cell and all of its progeny.
- nursing mothers will be injected i.p. or i.v. immediately after birth of a litter of tester mice and offspring will be analyzed at 6 to 10 weeks of age.
- mice will be exposed to increasing amounts of one of the above aromatic amines (each will be separately tested) .
- Administration will be by single or multiple doses dispensed i.p., i.v. or orally (gavage) to again determine the target organ(s) and to ask whether they differ according to route of administration.
- Mice will be sacrificed at times up to 8 weeks after the last administration and individual organs sectioned, stained and analyzed as above. We will begin with about 6 to 10 mice for each regimen of mutagen administration. This number may be increased if so required.
- PCBs congenital poisoning in Taiwan in 1978 by cooking oil contaminated with thermally degraded PCBs.
- Affected offspring of mothers who had ingested the contaminated rice-bran oil manifested a spectrum of congenital defects. It may be too early to tell whether or not these children will exhibit a higher than normal incidence of tumors.
- PCB thermal derivatives are mutagenic to fetuses of pregnant mice that ingest these agents, and if so, whether or not 1) the affected tissue(s) is of ectodermal origin, as appears to be the case in man, and 2) whether or not tissues that incur mutations in the lacI transgene later selectively give rise to tumors.
- PCBs will be dissolved in cooking oil at about 100 ppm. Chen, et al., Am. J. Ind. Med., 5:133-145, 1984.
- PCDF polychlorinated dibenzofuran
- the oil (after cooling) will be orally administered to pregnant mice carrying tester fetuses.
- the offspring will be analyzed for teratological abnormalities and for organs and tissues that manifest lacI mutations, as described before.
- GTT TCT GCC AAA ACC AGG GAA AAA GTG GAA GCA GCC ATG GCA GAG CTG Val Ser Ala Lys Thr Arg Glu Lys Val Glu Ala Ala Met Ala Glu Leu 30 35 40 45
Abstract
A transgenic animal-based assay for mutagens or carcinogens is described in which the CpG dinucleotide frequency of the marker transgene is adjusted to resemble that of the genes native to the host animal.
Description
MUTAGENICITY TESTING USING REPORTER GENES WITH MODIFIED METHYLATION FREQUENCIES
Mention of Government Support
This invention was made with support from the National Institutes of Health, Grant No. NIH ES05204, and the Government may have certain rights in the invention.
BACKGROUND OF THE INVENTION
Field of the Invention
This invention relates to testing for mutagens and carcinogens.
Information Disclosure Statement
More than 50,000 new chemicals are introduced every year into our environment--our homes, our factories and offices, and our recreational areas. They may be in our food or our drink, our shampoos or our furniture waxes. Some of these chemicals--like the nitrates found in pastrami, frankfurters, and bacon--can directly or indirectly cause mutations in DNA, the genetic material of all cells.
Carcinogens are chemical (or physical) agents which are capable of causing cancer in a susceptible subject. A chemical is considered a carcinogen if, in a well designed and conducted bioassay, it produces a statistically significant increase in the incidence of neoplasms in one or more target organs. Some carcinogens are themselves mutagens, i.e., agents which directly cause the mutation of DNA. Others are metabolized by cells to form powerful mutagens, which in turn act on the host's DNA. For example, aflatoxin B1 is converted by a hepatic aryl hydroxylase into a 2,3-epoxide derivative, a mutagen, and nitrates are a starting material in the formation of the highly carcinogenic (mutagenic) nitrosamines. Still others act as tumor initiators or promoters, i.e., they induce or accelerate the transformation of normal cells into malignant cells without necessarily directly modifying the genetic material.
At first, carcinogens and mutagens were identified by epidemiological means. After several decades, segments of the population having elevated exposures to the agent would exhibit a higher frequency of incidence of a particular cancer or other disorder. Case studies would reveal the common factor. Of course, the problem with epidemiological detection is that it is retrospective; society has already been damaged. This prompted the development of a variety of screening tests.
One was the use of animals. See, e.g., Lew, U.S. 4,345,026. Since it was impractical to observe the effects of a new product on animals for several decades in order to determine whether it was a carcinogen, the animals were given extremely high doses of the product, but for a shorter period of time (e.g., two years). The animals were then examined for the development of tumors, or other disorders of genetic origin. Such testing had, however, a number of disadvantages. First, some products were undoubtedly stigmatized as carcinogenic which would not have had an adverse effect, even over the course of a lifetime, had a more normal dosage been employed. Second, the testing period was still significant, particularly if the product were a drug for which there was a substantial societal need. Also, the data provided was very crude and limited.
The time and expense associated with animal testing led to the development of various in vitro tests, of which the Ames test is the most prominent. The Ames test is based on the assumption that carcinogens (or their metabolites) will cause the genetic reversion of certain mutant strains of bacteria. These strains lack the ability to produce histidine, an essential amino acid, and therefore are unable to multiply unless this nutrient is in their growth medium. In the presence of a mutagen, these mutants are more likely to revert to their "wild" phenotype, i.e., they regain the ability to manufacture histidine from other materials and therefore can grow in a histidine-free medium. Note, however, that reversion can also occur spontaneously, i.e., even in the absence of a mutagen. This natural reversion acts as "noise" that limits the overall sensitivity of the test. A more fundamental problem with the Ames test, and other testing in bacteria, is that bacteria have a different metabolic apparatus than do vertebrate cells. As a result, a chemical that causes mutation in bacteria may not do so in vertebrate cells, and vice versa.
This problem is only partially alleviated by the use of mammalian cells or cell lines, as in Calos, U.S. 4,753,874; Thilly, U.S. 4,066,510; Crespi, U.S. 4,532,204; Dolbeare, U.S. 4,345,027; Skopek, U.S. 4,302,535; Grosveld, EP Appl 258,899. A chemical may be converted into a carcinogen or mutagen through a metabolic activity specific to a particular type of cell, e.g., periportal liver cells. Unless, fortuitously, cells of this type are exposed to the chemical, this mode of carcinogenesis or mutagenesis will not be discovered. Indeed, it may be the case that the production of the mutagen requires processing by several different types of cells, or that the necessary metabolic activity in one type of cell must be activated by a product of a different type of cell. A single cell type bioassay is incompetent for risk assessment of chemicals which are metabolized in this manner.
Developments in genetics, molecular biology and embryology made it possible to make a more sophisticated use of test animals, in which they were examined for mutagenic damage on a cellular level, rather than for gross abnormalities such as tumors or organ failures. Specifically, transgenic animals were prepared whose cells carried a foreign (typically bacterial) "marker" gene. These animals were exposed to normal doses of the suspect chemical. DNA was then extracted from the various tissues and organs of the animal, and the foreign "marker" gene was "rescued" and transferred to the genome of a host by means of a bacteriophage. The lytic plaques were then screened for the phenotype characteristically imparted by the original "marker" gene. The absence of this phenotype was indicative of mutation. See Gossen, et al., Proc. Nat. Acad. Sci. (USA), 86:7971-75 (Oct. 1989); Gossen and Vijg, in Mutation and the Environment, Part A, 347-354 (1990); Short, et al., in Idem., 355-367; Kohler, et al., Nucleic Acids Res., 18:3007-3013 (1990); Hazleton, "MutaMouse" product brochure; Sorge, EP Appl 289,121; Vijg, EP Appl 353,812; Shenk, WO89/05864; Wei, EP Appl 370,813.
Thus, in Sorge, EP Appl 289,121, a phage lambda vector, capable of lysing E. coli, was engineered to carry the E. coli beta-galactosidase (lacZ) gene. Lambda DNA was microinjected into mouse embryos, and transgenic mice were produced by standard techniques. Genomic DNA was purified from a tissue of the transgenic mouse, and the test DNA was excised by means of a lambda phage packaging extract. The packaged phage were incubated with beta-galactosidase deficient E. coli. Bacteria infected by the phage particles were lysed, resulting in the formation of lytic plaques on a lawn of beta-galactosidase deficient E. coli. In the presence of X-gal (5-bromo-4-chloro-3-indolyl-beta-D-galactoside) and in the absence of IPTG (isopropyl-beta-D-thiogalactopyranoside), the phage plaques will turn blue if the beta-galactosidase sequence within the lambda genome has not mutated. A white plaque, on the other hand, is evidence of a mutation that rendered the beta-galactosidase nonfunctional.
In the transgenic animal, the chemical is available for processing by a variety of tissues and organs. Thus, the transgenic animal-based test, unlike an in vitro mammalian cell bioassay, can detect mutagenesis by metabolites of the chemical of interest, even though the metabolites are produced at the appropriate concentrations only by differentiated cells or the tissue of live animals. Cells of any tissue or organ of interest may be screened for mutagenic damage, merely by extracting their DNA and recovering and characterizing the transgene. A single animal may yield a multitude of cells for testing and analysis.
It should be noted, to avoid confusion, that this cellular level analysis cannot be performed with genes endogenous to the test animal. They cannot be isolated from the DNA of an organ or tissue with sufficient efficiency. That is why this more sophisticated analytical approach was not possible until it became feasible to make transgenic animals.
Unfortunately, persons of ordinary skill in the art have not heretofore recognized a flaw in the aforementioned scheme. They assume that the rate of mutation in the bacterial gene, once incorporated into the mammalian cell genome, will be equivalent to that of an endogenous mammalian gene of similar length. For the reasons described hereafter, this is an unwarranted assumption.
SUMMARY OF THE INVENTION
The present invention overcomes the deficiencies of the test methods described above. The present Applicant recognized that bacterial genes exhibit a much higher frequency of occurrence of the "CpG" doublet than do vertebrate genes. As a result, a bacterial gene incorporated into a mammalian genome will exhibit a much higher degree of methylation than is typical for a mammalian gene.
For example, of the 1081 dinucleotides in the lacI gene, 95 (about 9%) are CpG, the primary nucleic acid methylation substrate in mammals. (55 of these CpGs lie within individual codons among the 360 codons of the gene, and 40 span intercodon boundaries.) CpG dinucleotides are underrepresented in mammalian genes. In random DNA having a 50% GC content, about 6% of the dinucleotides would be expected to be CpG. In mammalian DNA, CpG occurs with a frequency of about 2%.
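The arithmetic behind these percentages can be checked with a short Python sketch. The function name `cpg_stats` and the toy sequence are hypothetical illustrations and are not part of the patent; only the figures 95, 1081 and the 50% GC assumption come from the text above.

```python
def cpg_stats(seq):
    """Return observed and composition-expected CpG counts for one strand.

    The expected count assumes bases occur independently, so the chance of
    a CpG at any dinucleotide position is freq(C) * freq(G).
    """
    seq = seq.upper()
    if len(seq) < 2:
        raise ValueError("need at least two bases to form a dinucleotide")
    n_dinucleotides = len(seq) - 1
    observed = sum(1 for i in range(n_dinucleotides) if seq[i:i + 2] == "CG")
    freq_c = seq.count("C") / len(seq)
    freq_g = seq.count("G") / len(seq)
    expected = freq_c * freq_g * n_dinucleotides
    return {
        "dinucleotides": n_dinucleotides,
        "observed_cpg": observed,
        "expected_cpg": expected,
        "observed_fraction": observed / n_dinucleotides,
    }


# Figures quoted in the text: 95 CpG among 1081 dinucleotides in lacI.
LACI_CPG_FRACTION = 95 / 1081
# Random DNA with 50% GC content: C and G each have frequency 0.25.
RANDOM_CPG_FRACTION = 0.25 * 0.25

if __name__ == "__main__":
    toy = "ACGTCGCG"  # hypothetical sequence, for illustration only
    print(cpg_stats(toy))
    print(f"lacI: {LACI_CPG_FRACTION:.1%}; random 50% GC: {RANDOM_CPG_FRACTION:.2%}")
```

Applied to a full gene sequence, the same function would give a direct before-and-after comparison for a CpG-depleted construct.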
It is submitted that the elevated methylation of the bacterially derived marker gene will affect the mutagenic susceptibility of the gene in an unpredictable manner, thus weakening any conclusions which may be drawn from a positive or negative finding in an assay employing transgenic animals featuring a "wild type" bacterial marker gene.
This problem may be overcome by the use of a wholly or partially synthetic bacterial marker gene having a reduced (vertebrate cell like) number of CpGs, and hence, presumably, a lower overall level of methylation.
It has already been recognized that the integrated vectors used in the previously reported work are heavily methylated. However, no concern has been expressed with regard to the methylation of the marker gene.
Thus, in Vijg, EP Appl 353,812, page 8, it is reported that the lambda vectors integrated into the mouse genome were highly, perhaps even completely, methylated. This was determined through comparison of restriction digests made using various methylation-sensitive and insensitive restriction enzymes.
Vijg had been troubled by the very low number of plaques obtained in practice. He postulated that the methylation pattern of the lambda vectors was rendering them susceptible to restriction by the E. coli host, i.e., the lambda vectors were being cut to pieces by the bacterium's defensive enzymes before they could integrate into the host genome (Vijg, page 3). His solution, however, was not to modify the vector, but rather to employ a "host restriction"-negative strain for plating.
Sorge, EP Appl 289,121 (page 8) was also concerned with methylation of the vector. However, his concern was with test DNA rescue efficiency. He suggested that the methylation of the vector inhibited cleavage at the cos sites, and thereby interfered with its excision from the genome. He advised either placing enhancers, promoters or other genetic elements which inhibit methylation near the cos site to reduce CpG methylation, or pretreating the host cells with the drug 5-azacytidine to reduce their level of methylation. In other words, neither Sorge nor Vijg recognized that overmethylation of the marker gene could result in misstatement of the mutagenic damage to the mammalian genome, and neither advised reducing methylation by alteration of the DNA sequence of their vectors.
Methylation of marker genes is also mentioned by Shenk, WO89/05864. However, Shenk proposed that the marker gene be engineered so that it is heavily methylated, so that expression would be inhibited unless the marker gene was activated by a carcinogen with demethylating activity. This, in effect, teaches retention of CpGs, which are methylation sites, and thus teaches against the CpG-depleted genes of the present invention.
BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 sets forth the sequence of the coding strand of the wild-type lacI gene, 5' to 3' (SEQ ID NO:1). CpG dinucleotides are marked. Above the nucleotide sequence are alternative nucleotides for eliminating most of the CpG dinucleotides and for eliminating splicing donor-acceptor sites. Below the nucleotide sequence is the sequence of the corresponding wild-type LacI repressor protein (SEQ ID NO:2), given according to the single letter amino acid code.
Figure 2 sets forth the sequence (LACIMIRNL) (SEQ ID NO:3) of a Kozak consensus RBS (TCACC...), a CpG-depleted lacI gene, a three codon linker (encoding AAL), and a seven codon sequence encoding the SV40 large antigen nuclear localization site. These features are marked, as are all MspI/HpaII (CCGG) sites. The transgene also includes a beta actin promoter, but, since this sequence is lengthy and has been published, it was not reprinted. The corresponding amino acid sequence is presented as SEQ ID NO:4.
Figure 3 shows the wild-type E. coli gpt gene (SEQ ID NO:5) and the suggested base substitutions for reducing its CpG content. The corresponding amino acid sequence is presented as SEQ ID NO:6.
Figure 4 shows the modified gpt gene (SEQ ID NO:7) and a synthesis strategy therefor. The complementary DNA sequence is provided as SEQ ID NO:8.
Figure 5 is a schematic depiction of (A) a HindIII/BamHI fragment comprising the lacZ gene and SV40 processing signals, and (B) plasmid pLZ6.6.
Figure 6 shows the construction of plasmid pAPlacOZneo.
Figure 7 depicts the organization of the mouse aprt promoter region with the site of one lacO insert marked. The aprt promoter and upstream sequences that will be used in the proposed experiments are schematically displayed. In the genome, this fragment is bounded by FnuDII sites. The numbering in base pairs begins at the translation start codon. The 4 boxes represent sites of Sp1 binding. The horizontal arrows indicate major sites of transcription initiation, and the vertical arrow indicates the extent of deletion that permits full aprt expression. The position of the E. coli lac operator (lacO) is indicated, as are potential TaqI and XmaI sites for alternative operator insertion.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
The present invention is directed to a test for mutagenicity and carcinogenicity, in which mutations in a marker gene are used to predict the effect of a chemical agent on genes of a target species of interest. Most often, the target species will be human beings; however, it must be noted that both wild and domesticated animals are also chronically exposed to potential mutagens and carcinogens and that environmental policy may call for minimizing such exposure. The present invention may readily be adapted to screening for the mutagenic potential of a chemical vis-a-vis the genes of a nonhuman animal species, including other mammals, birds, fish, amphibia, reptiles, and even lower life forms (such as bees, silkworms, earthworms, and so forth).
Nucleic acids are the basic genetic material in cells. They are formed by a chemically linked sequence of nucleotides. Each nucleotide contains a heterocyclic ring of carbon and nitrogen atoms (the nitrogenous base), a five carbon sugar in ring form (a pentose), and a phosphate group. Two types of pentoses are found in nucleic acids, the 2-deoxyribose in DNA (deoxyribonucleic acid) and the ribose in RNA (ribonucleic acid). In DNA, there are four normal nitrogenous bases: two pyrimidines, cytosine (C) and thymine (T), and two purines, adenine (A) and guanine (G). In RNA, uracil (U) is found in place of thymine. A base-sugar moiety is called a nucleoside, and a base-sugar-phosphate moiety is a nucleotide. Genetic information is conveyed by the sequence of bases in a polynucleotide chain, not by the phosphodiester-sugar backbone. The 5' position of one pentose ring is connected to the 3' position of the next pentose ring via a phosphate group, thus forming a series of 5' to 3' linkages. The terminal nucleotide at one end of the chain has a free 5' group, and the other terminal nucleotide has a free 3' group. It is conventional to write nucleic acid sequences by setting forth the sequence of bases in the 5' to 3' direction.
The chromosome is a double helix formed by two very long, interweaving polynucleotide chains, or strands. These strands are held together in the double helical structure as a result of hydrogen bonds between so-called complementary bases on the two strands. One such "base pair", adenine-thymine (A:T) (or, in RNA, adenine-uracil, A:U), provides two hydrogen bonds; the other, guanine-cytosine (G:C), provides three.
When a gene in the chromosome is expressed, a single strand of messenger RNA is transcribed from a template strand of the chromosome by complementary base pairing. The template strand is called the coding or anti-sense strand; the other DNA strand, the anti-coding or sense strand, is identical in sequence (except for the T/U distinction) to the messenger RNA transcript. The messenger RNA, in turn, acts as a template for the assembly of amino acids into the protein encoded by the gene; the assembly process is known as translation. The messenger RNA transcript is read in nonoverlapping units of three nucleotides, known as codons, from a fixed starting point; there are three possible ways of translating any messenger RNA, depending on the starting point. These are known as reading frames.
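The three reading frames can be illustrated with a minimal Python sketch. The helper name `codons_in_frame` is hypothetical, and the example transcript is simply the first twelve bases of the sequence fragment listed earlier (GTT TCT GCC AAA) with T replaced by U.

```python
def codons_in_frame(mrna, frame):
    """Split an mRNA into nonoverlapping codons starting at offset `frame`.

    `frame` is 0, 1 or 2; trailing bases that do not fill a complete codon
    are dropped.
    """
    if frame not in (0, 1, 2):
        raise ValueError("frame must be 0, 1 or 2")
    usable = mrna[frame:]
    complete = len(usable) - len(usable) % 3
    return [usable[i:i + 3] for i in range(0, complete, 3)]


if __name__ == "__main__":
    transcript = "GUUUCUGCCAAA"  # Val Ser Ala Lys when read in frame 0
    for frame in range(3):
        print(frame, codons_in_frame(transcript, frame))
```

Only one of the three frames reproduces the codons of the intended protein; the other two read entirely different triplets from the same bases.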
When a chromosome replicates, the double helix is unzipped, and both original strands act as templates for the synthesis of two new DNA strands by complementary base pairing.
The bases of DNA may be modified by enzymes endogenous to the cell, especially methylases. In vertebrates, the only methylated base is 5-methylcytosine. Between 2% and 7% of the C residues of animal cell DNA are methylated. Most of the methyl groups are found in CpG "doublets" (dinucleotides); in birds and mammals, 50-70% of all such dinucleotides are modified by methylation. Note that in these doublets, the C and G are adjacent bases on the same strand, joined by a covalent 5' to 3' phosphodiester (p)-sugar linkage; the CpG dinucleotide should not be confused with the C:G base pair formed by hydrogen bonding between a C on one strand and a G on another.
In double-stranded DNA, the following structure is common:
5' mCpG 3'
3' GpCm 5'
A doublet that is instead methylated on only one of the two strands is said to be hemimethylated.
The distribution of methyl groups may be examined by taking advantage of pairs of restriction enzymes, known as isoschizomers, that cleave the same target sequence in DNA, but have a different sensitivity to its methylation pattern. For example, the enzymes HpaII and MspI both recognize the sequence CCGG, which includes a CG dinucleotide. However, if the second C is methylated, this sequence cannot be cleaved by HpaII. MspI, however, is indifferent to the presence of methylation at this C. Thus, MspI can be used to identify all the CCGG sites, and HpaII, to determine whether or not they are methylated.
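The logic of the isoschizomer comparison may be sketched as follows. This is a simplified illustration only: the function names are hypothetical, methylation is supplied as a set of 0-based positions of methylated cytosines, and only methylation of the second (inner) C of CCGG is modeled, as described above. Hemimethylation and other modifications are ignored.

```python
SITE = "CCGG"


def ccgg_sites(seq):
    """Return the 0-based start position of every CCGG site in the sequence."""
    seq = seq.upper()
    return [i for i in range(len(seq) - 3) if seq[i:i + 4] == SITE]


def predict_cleavage(seq, methylated_c_positions):
    """Predict, for each CCGG site, whether MspI and HpaII would cleave it."""
    result = {}
    for start in ccgg_sites(seq):
        inner_c = start + 1  # the second C of CCGG, i.e. the C of the CpG
        methylated = inner_c in methylated_c_positions
        result[start] = {
            "MspI": True,             # assumed indifferent to methylation at this C
            "HpaII": not methylated,  # blocked when the second C is methylated
        }
    return result


if __name__ == "__main__":
    print(predict_cleavage("AACCGGTTCCGGAA", {3}))
```

A site cut by MspI but not by HpaII is thus inferred to be methylated at its CpG.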
In some genes, some methylation sites are methylated when the gene is inactive but unmethylated (at least in some cells) when the gene is expressed. Frank, et al., Nature, 351:239-41 (1991); Paroush, et al., Cell, 63:1229-39 (1990). According to Palmiter and Brinster, Ann. Rev. Genet., 20:465-99 (1986), the methylation of DNA microinjected into mouse embryos, in the course of transgenic mouse production, may in some cases be responsible for problems observed in expressing transgenes.
A mutation is a change in the nucleotide sequence. A change that alters only a single base pair is termed a point mutation. A point mutation may take the form of a substitution, an insertion, or a deletion. A point mutation in a gene will not necessarily have an effect on the sequence of the encoded polypeptide. This is because the genetic code is redundant: 61 different codons encode only twenty different amino acids. A mutation which does not affect which amino acid is encoded is termed a "silent" mutation. Even if the amino acid is changed, it is possible that the mutant polypeptide will retain activity. In this case the mutation is said to be "neutral".
A mutation which inactivates a gene is termed a forward mutation. The effects of forward mutations may be reversed by back mutations in the same gene, or by suppression of the mutated gene through mutation of a different gene. Back mutations fall into two categories: true reversions, which restore the wild-type sequence, and second-site reversions, which simply compensate for the forward mutation by restoring activity.
An insertion or deletion of a nucleotide in a gene will cause a shift of the reading frame. In general, this will result in the expression of a radically different and probably nonfunctional polypeptide. However, a second frameshift mutation, close enough to the first one, may restore activity.
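The effect of a frameshift, and of a nearby compensating frameshift, may be seen in a small worked example. The sequence and helper names are hypothetical and chosen only for illustration.

```python
def codons(seq):
    """Split a sequence into complete, non-overlapping codons from position 0."""
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def insert_base(seq, pos, base):
    """Insert one base before 0-based position pos."""
    return seq[:pos] + base + seq[pos:]


def delete_base(seq, pos):
    """Delete the base at 0-based position pos."""
    return seq[:pos] + seq[pos + 1:]


if __name__ == "__main__":
    original = "ATGGCTGAACTGTAA"
    shifted = insert_base(original, 4, "T")  # first frameshift (+1)
    restored = delete_base(shifted, 6)       # nearby second frameshift (-1)
    print(codons(original))
    print(codons(shifted))
    print(codons(restored))
```

After the single insertion, every downstream codon is changed. After the nearby deletion, only the codon lying between the two mutations differs from the original, which is why activity may be restored.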
Mutations may also involve the insertion, deletion, or inversion of larger chunks of DNA.
The mechanism by which some chemical mutagens produce mutations is known.
Nitrous acid deaminates adenine to hypoxanthine, which bonds to cytosine instead of to thymine. As a result, in the first replication cycle after deamination, the new strand features a cytosine, rather than a thymine, at the position complementary to the hypoxanthine. Then, in the second replication cycle after the deamination, instead of regenerating the original adenine, the polymerase inserts a guanine, which is complementary to the aforementioned cytosine. Thus, nitrous acid causes an A->G transition. Cytosine is deaminated to uracil, which hydrogen bonds to adenine instead of to guanine. In other words, the nitrous acid also causes a C->T transition. Guanine is deaminated to xanthine, which continues to hydrogen bond to cytosine, though with only two hydrogen bonds. Thymine and uracil are not altered by nitrous acid.
Hydroxylamine reacts with cytosine to form N-hydroxycytosine, which preferentially pairs with adenine. Thus, hydroxylamine produces a C->T transition.
The alkylating agents, which introduce alkyl groups into nucleotides at various positions, are the largest group of mutagens. The alkylating agents include mustard gas, epoxides, dimethyl- and diethylsulfonate, methyl- and ethylmethanesulfonate, and N-methyl-N'-nitro-N-nitrosoguanidine. G->A (as a result of formation of O6-alkylguanine) and T->C (attributable to O6-alkylthymine) transitions are typical.
The present invention is not limited to the evaluation of any particular chemical class of potential mutagen, or to the detection of any particular kind of mutation.
It is known that some sites in genes are mutated far more often (10X-100X) than would be expected by random hit kinetics. Within the same gene, one site may be a "hotspot" for spontaneous mutation, a second for mutation by an alkylating agent, and a third for mutation by hydroxylamine. In E. coli, the development of hotspots for spontaneous point mutations has been correlated, to some degree, with the presence of 5-methylcytosine. In the lacI gene of E. coli, the hotspots for spontaneous point mutations all occur at sites at which the wild-type sequence contains a 5-methylcytosine. In each case the mutation takes the form of a G:C to A:T transition. In strains of E. coli that are unable to methylate DNA, these hotspots do not exist. Lewin, Genes 45 (1983). See also Lebkowski, et al., PNAS, 82:8606-10 (1985) and Schaaper, et al., J. Mol. Biol. 189:273-284 (1986).
It has been postulated that these mutations occur through spontaneous deamination of 5-methylcytosine to form thymine. This creates a mismatched G:T pair, which separates, in the next replication cycle, to generate a wild-type G:C pair and a mutant A:T pair. While wild type cytosine is also deaminated, yielding uracil, E. coli contains an enzyme, uracil-DNA-glycosidase, that removes uracil residues from DNA. This leaves an unpaired G residue, and repair enzymes then insert a complementary C residue, thus repairing the damage caused by the deamination.
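The postulated pathway may be traced step by step in a minimal sketch. A base pair is represented here as a tuple (base on strand 1, base on strand 2); this representation and the function names are assumptions made for illustration, and repair is deliberately not modeled.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def deaminate_5mc(pair):
    """Deaminate a 5-methylcytosine on strand 1; the product is thymine."""
    top, bottom = pair
    if top != "C":
        raise ValueError("strand 1 must carry the (methylated) cytosine")
    return ("T", bottom)


def replicate(pair):
    """Separate the strands and pair each one with its newly synthesized complement."""
    top, bottom = pair
    return [(top, COMPLEMENT[top]), (COMPLEMENT[bottom], bottom)]


if __name__ == "__main__":
    mismatch = deaminate_5mc(("C", "G"))  # 5mC:G becomes the T:G mismatch
    print(mismatch, replicate(mismatch))
```

One daughter duplex carries the mutant T:A pair and the other the wild-type C:G pair, which corresponds to the G:C to A:T transition observed at the hotspots.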
If the presence of methylated cytosines also acts as a hotspot for mutation in the genes of vertebrate cells, it follows that the placement of a heavily methylated marker gene in a vertebrate cell for mutagenesis assay purposes will result in overestimation of the mutagenic potential of the assayed chemical.
Cells repair DNA damage at a faster rate in genes which are transcribed than in those which are transcriptionally inactive. Heavy methylation interferes with transcription, and therefore mutations in heavily methylated transcriptionally active genes are less quickly repaired than they would be otherwise. This means that such heavily methylated genes will tend to accumulate mutations more rapidly since the repair mechanisms are less efficient.
The term "transgenic animal" is defined, for the purpose of the appended claims, as an animal at least some of whose germ cells contain genetic material, originally derived from another animal, other than an ancestor of said animal, as a result of human intervention. So defined, it includes progeny of a transgenic animal which retain the transgenic genotype. It is not necessary that all cells of the animal contain the transgene. The reference to human intervention is intended to exclude genetic modification as a result of unintentional infection with a virus.
The term "chimeric animal" is defined, for the purpose of the appended claims, as an animal which is not necessarily a transgenic animal, but at least some of whose somatic cells contain genetic information, originally derived from another animal other than an ancestor of said animal, as a result of human intervention. The term "genetically engineered animal" refers to an animal which is either a transgenic animal or a chimeric animal.
Note that animals produced by conventional artificial insemination techniques are not considered to be genetically engineered, the donors of sperm and egg being considered parents of the animal, unless one or more ancestors of the animal was genetically engineered and the descendant animal retains the engineered genotype. Moreover, the transplantation of cells from one animal to another is not considered genetic engineering.
The present invention may utilize both transgenic animals and chimeric animals, though transgenic animals are preferred. References to transgenic animals in this specification should be deemed to include, mutatis mutandis, chimeric animals as well.
The "marker gene" may be any gene which confers a selectable or screenable phenotype on cells of the transgenic animal, or, if the assay is not applied directly to those cells, on the assay cells subsequently transformed by the rescued marker gene. Preferably, the marker gene is one which is not substantially homologous with any gene endogenous to the host animal from which the transgenic animal is produced. This facilitates the identification of the marker gene. Usually, the marker gene will be a bacterial gene in order to maximize the taxonomic distance between the marker gene and the genes of the host animal. However, it may also be a nonbacterial, nonmammalian gene, such as a viral, fungal, plant, invertebrate or lower vertebrate gene. The marker gene may be a wild type nonvertebrate gene chosen because it has a CpG level in the vertebrate range; more often, it is a nonvertebrate gene mutated to reduce the CpG level.
The detected mutation in the marker gene may be any of the numerous types of mutation known to occur. It may be in the coding sequence of the DNA, or in an associated regulatory sequence such as a promoter or a stop codon. It may be a point mutation or a frameshift mutation. It may involve a base substitution, insertion or deletion.
In one embodiment, the marker gene is a functional gene, and the assay is for forward mutations which inactivate the gene. In another embodiment, the marker gene is one mutated to render it nonfunctional, and the assay is for back mutations which restore activity. The second of these embodiments has the advantage of lower background, since back mutation is a rare event. For example, the marker gene may be a lacZ gene with the codon for Glu-461 (GAA) mutated to HBA; it has been shown in E. coli that only same site reversion will restore activity. Cupples, et al., Proc. Nat. Acad. Sci. (USA), 86:5345-49 (1989).
The phenotypic change (marker) associated with mutation of the marker gene may be one detectable in a mammalian cell, or it may be one detectable only after rescue of the mutated DNA and expression of that DNA in a non-mammalian, e.g., bacterial, system.
The mutagenesis assay may detect a direct or an indirect product of the marker gene. The detection may occur in vivo or in vitro, through any means known in the diagnostic art. The phenotypic change may be the death of the animal or the affected cells, or a change in cell morphology or metabolism. It may be the presence or absence of a characteristic luminescence or radioactivity. All that matters is that there be a detectable change if mutation occurs. It is not necessary that all mutations cause a detectable change, as long as some mutations in the marker gene will do so.
There are several ways in which a mutation in a marker gene may be detected, and the choice of marker gene will be influenced by the method of detection which is contemplated. First, the detection could be through in vitro examination of the blood, urine, milk, or other expendable product of the animal. This has the advantage that the animal is not harmed, so that the animal can continue to be monitored for further genotoxic damage from the agent. However, there is no way of knowing how the detected label enters the examined product, and hence it is not known whether the mutation occurs only in certain tissues or organs. Second, the detection could be through histochemical examination of one or more tissues of the animal. See, e.g., Wei, EP Appl 370,813. This permits identification of the particular tissues or organs affected by a mutagen. However, unless the histochemical examination is limited to material removable by a biopsy, the animal must be sacrificed in order to assess mutagenicity. Third, the detection may be through in vivo imaging. This has the advantage that the animal is kept alive, while the affected tissues and organs may still be determined. However, the spatial data are perhaps somewhat less precise than those which can be obtained by histochemical analysis.
The marker gene may be, but is not limited to, a tumorigenic, toxin, hormonal, enzymatic or antigenic marker gene. (It should be noted that these categories are not mutually exclusive.)
Tumorigenic marker genes are those which, when expressed in a transgenic animal, result in production of a transforming gene product and therefore induce tumors. The marker gene may be a functional oncogene, in which case the assay is for mutations which render the oncogene nonfunctional and therefore protect the animal, or it may be an oncogene mutated to be nonfunctional, in which case the assay is for back mutations which render the oncogene functional once more and therefore result in tumor formation in the animal. The oncogene may be a viral or a cellular oncogene. When the assay is a back mutation assay, the marker gene may be a naturally occurring proto-oncogene, or it may be an oncogene mutated in the laboratory to render it nonfunctional. When the oncogene is a viral oncogene, it may be derived from a DNA tumor virus or from a retrovirus. Suitable retroviral oncogenes include v-abl, v-fes, v-fps, v-fgr, v-src, v-erbA, v-erbB, v-fms, v-ros, v-yes, v-mos, v-ras, v-fos, v-myb, v-myc, v-ski, v-sis, v-rel, v-kit, v-jun, and v-ets. Suitable DNA tumor virus genes include the T antigen genes from SV40 or polyoma viruses and the E1A and E1B genes from adenoviruses.
Toxin marker genes encode a toxic protein or an enzyme which participates in the enzymatic production of a toxic metabolite. The toxin may be, but is not limited to, a bacterial toxin (e.g., diphtheria toxin, tetanus toxin, and botulinum toxin), a plant toxin (e.g., ricin or abrin), an invertebrate toxin (e.g., a scorpion or sea anemone toxin), or a snake venom toxin (e.g., a cobra or rattlesnake toxin). Toxins include cardiotoxins, neurotoxins, and protease inhibitors. Nonfunctional mutants of toxin genes may be used in back mutation assays.
Hormonal marker genes encode protein or peptide hormones (or prohormones, or pre-prohormones) which are detectable either directly or through their biological effect. These hormones may be identical to natural counterparts secreted by, e.g., the endocrine glands (such as the pituitary, thyroid, or gonads), or they may be muteins. Suitable hormones include growth hormone, prolactin, chorionic gonadotropin, luteinizing hormone, follicle stimulating hormone, insulin, parathyroid hormone, somatostatin, and gonadotropin releasing hormone, and homologues thereof. While most mammalian hormonal marker genes will exhibit CpG frequencies typical of mammalian DNA, exceptions may exist. Also, nonmammalian hormonal marker genes may be of interest as their proteins may be more readily differentiated from their mammalian cognates in a transgenic mammalian host.
Enzymatic marker genes encode enzymes which, in the presence of a suitable substrate, convert the substrate into a directly or indirectly detectable product. Suitable enzymes include beta-galactosidase (lacZ), alkaline phosphatase, luciferase and horseradish peroxidase.
Enzymatic marker genes are particularly appropriate where the marker gene is engineered so that the enzyme is secreted into an assayable biological fluid, such as blood. The substrate can then be supplied when the blood is assayed in vitro. They may also be used when the marker is to be detected by histochemical analysis. In any event, the substrate may be provided in vivo or in vitro.
Antigenic marker genes encode a detectable antigen. The antigen is then detected with a specific antibody. Antigenic marker genes are particularly suitable for detection of mutagenic activity by in vivo imaging, as the antibody may be labeled with an imageable label such as a radioactive label.
Regulatory marker genes are genes which encode regulatory proteins. Such proteins control the expression of other genes. Examples include the lacI repressor gene and the lambda repressor gene, and the lacI activator protein LAP267, see Baim, et al., PNAS (USA), 88:5072-5076 (June 1991). With regard to lac repressor, see Wyborski and Short, Nucleic Acids Res., 19:17 (Sep. 1991).
Suppressor tRNA genes encode tRNAs which suppress the effect of a chain termination mutation. Thus, the supF gene suppresses the amber mutation and the supE gene the ochre mutation. If there is, for example, an amber mutation in a required or selectable function, the mutation can be suppressed by a functional supF gene. Thus, if there is an amber mutation in a lambda phage packaging protein, no phage will be packaged or plaques formed, unless functional supF is present.
Antibiotic resistance genes include the ampicillin, chloramphenicol, neomycin, bleomycin, and puromycin resistance genes. The rescue approach is preferred when the marker is an antibiotic resistance gene.
The lacI and lacZ genes are of particular interest, and it is therefore appropriate to discuss their function in nature. The polycistronic lac operon comprises the lac promoter, the lac operator (lacO), the lacZ, lacY and lacA genes, and a terminator. Immediately 5' of the lac operon is the monocistronic lacI operon, which comprises the lacI promoter, the lacI gene, and a terminator. The lacZ gene encodes the enzyme beta-galactosidase, and lacY and lacA encode the enzymes beta-galactoside permease and transacetylase, respectively. Transcription of the lacZYA gene cluster is normally repressed by the LacI repressor protein, which binds to the lacO operator site and thereby prevents the binding of DNA-directed RNA polymerase to the operator. Transcription is activated if an inducer, such as IPTG, is present; IPTG releases the LacI repressor from the lacO site. Mutations in either the lacZ or lacY genes create the lac⁻ genotype, in which the cells cannot utilize lactose. Mutations in the lacI gene derepress the lac operon.
The vector used to introduce the marker gene may contain one copy of a particular marker gene, multiple copies of a single marker gene, or several different marker genes. Use of multiple marker genes, whether the same or different, alters the sensitivity of the assay.
It is expected that marker genes of nonvertebrate origin will exhibit a higher frequency of the CpG dinucleotide than do the genes of a vertebrate host animal. The expected CpG frequency in DNA of random sequence is (GC%)²/4. Thus, when the GC% is 40%, the expected CpG frequency is 4%, while if the GC% is 60%, the expected CpG frequency is 9%. Bacteria exhibit CpG frequencies in keeping with statistical predictions. However, for vertebrates, especially mammals, the CpG frequency is depressed overall, though so-called HTF regions are marked by higher-than-expected CpG frequencies and are usually hypomethylated. For bacteria, whose genomes and mRNA complement vary considerably in GC content, the usual CpG frequencies are believed to be about 5-15%. The E. coli gpt gene, for example, has a CpG frequency of about 8.5%, while for lacI, it is about 9%. In general, while this invention may be applied to any marker gene, it is especially suitable for marker genes where the wild-type gene has a CpG frequency substantially higher than is typical of genes of the target species, e.g., at least about twice the frequency (thus, >4% for mammalian target species), and more preferably at least about four times the frequency (>8% for mammalian target species).
It is a basic teaching of the invention to provide a wholly or partially synthetic marker gene in which the CpG dinucleotide frequency is reduced, preferably to the point that it is not substantially greater than the CpG dinucleotide frequency in genes of the target species (e.g., not greater than twice, or better yet, 1½ times, that frequency). Thus, when the target species is a mammal, the marker gene preferably is engineered so that its CpG dinucleotide frequency does not substantially exceed the frequency in mammalian genes, which is 2%. Preferably, the CpG dinucleotide frequency is 1-3%. While it is within the scope of this invention to reduce the CpG frequency below 1%, and indeed to eliminate CpGs altogether, such depression of the CpG frequency is not preferred.
Since the CpG frequency in bacterial genes is 5-15%, it is evident that the degree of CpG depletion envisioned herein is preferably at least two-fold and is more preferably 5-20 fold. It may, as noted, be still higher or even total.
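The statistical expectation stated above, and the observed CpG frequency of a given sequence, may be computed as in the following sketch. The function names are hypothetical, and the observed frequency is taken, by assumption, as the fraction of all overlapping dinucleotides on one strand that are CpG.

```python
def gc_fraction(seq):
    """Fraction of bases in the sequence that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def expected_cpg_frequency(gc):
    """Expected CpG frequency in random sequence: (gc/2) * (gc/2) = gc**2 / 4."""
    return gc ** 2 / 4


def observed_cpg_frequency(seq):
    """Fraction of overlapping dinucleotides in the sequence that are CpG."""
    seq = seq.upper()
    n_dinucleotides = len(seq) - 1
    n_cpg = sum(1 for i in range(n_dinucleotides) if seq[i:i + 2] == "CG")
    return n_cpg / n_dinucleotides


if __name__ == "__main__":
    for gc in (0.40, 0.60):
        print(f"GC {gc:.0%}: expected CpG {expected_cpg_frequency(gc):.1%}")
```

Comparing `observed_cpg_frequency` of a candidate marker gene with the roughly 2% typical of mammalian genes gives a quick indication of how much depletion would be needed.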
The key to the reduction of the prevalence of CpG in genes of non-mammalian (esp. bacterial) origin is the degeneracy of the genetic code. Each amino acid of a polypeptide is encoded by a DNA triplet, or codon. Since there are four bases (A, T, C, G) in DNA, there are 4³ (64) possible triplets. Three, the stop codons, direct the termination of the polypeptide chain. The remaining 61 possible codons encode the twenty proteinogenic amino acids. Each amino acid is encoded by one (Met, Trp) to six (Arg, Ser, Leu) different codons.
A CpG dinucleotide pair may be formed by the first and second bases of a codon, as in the Arg codon CGT, by the second and third bases of a codon, as in the Thr codon ACG, or by the last base of one codon and the first base of the next one, as in the Cys-Ala encoding sequence TGC.GCA. The last situation is called an "intercodon CpG".
The Met (ATG), Trp (TGG), Lys (AAA, AAG) and Gln (CAA, CAG) codons are incapable of forming a CpG dinucleotide.
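The three ways in which a CpG may arise in a coding sequence can be distinguished by the position of the C within its codon. The sketch below assumes an in-frame coding sequence beginning at position 0; the function name and labels are hypothetical.

```python
KINDS = {
    0: "codon positions 1-2",  # e.g. the Arg codon CGT
    1: "codon positions 2-3",  # e.g. the Thr codon ACG
    2: "intercodon",           # e.g. the Cys-Ala sequence TGC.GCA
}


def classify_cpgs(coding_seq):
    """Return (position of the C, kind) for every CpG in an in-frame coding sequence."""
    seq = coding_seq.upper()
    return [(i, KINDS[i % 3]) for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]


if __name__ == "__main__":
    # CGT ACG TGC GCA: one CpG of each kind
    for position, kind in classify_cpgs("CGTACGTGCGCA"):
        print(position, kind)
```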
Table A sets forth five amino acids for which there is at least one CpG-containing codon, and lists the alternative codons. The percentage of usage of each codon, among all codons encoding the same amino acid in mammals, is given in parentheses. (Codon preferences for non-mammalian vertebrates are also available.)
Table A: Amino Acids With CpG-Containing Codons

Amino Acid   CpG-Containing Codons                                   Alternative Codons
Thr          ACG (11.8%)                                             ACA (26.4%), ACT (23.4%), ACC* (38.5%)
Ala          GCG (9.9%)                                              GCA (21.0%), GCT (28.8%), GCC* (40.2%)
Pro          CCG (11.2%)                                             CCA (27.3%), CCT (28.8%), CCC* (32.7%)
Arg          CGA (10.7%), CGC (19.5%), CGG (18.2%), CGT (9.1%)       AGA (21.0%), AGG (21.5%)
Ser          TCG (4.9%)                                              TCA (14.2%), TCT (18.3%), AGT (13.4%), TCC* (23.5%), AGC* (24.8%)

*if not followed by an Ala (GCN), Val (GTN) or Glu (GAR) codon (N=A,C,T,G; R=A,G).
The next table (B) refers to other amino acids having codons ending with a "C". These form a CpG dinucleotide if followed by an Ala, Val or Glu codon (all of which begin with "G").
Table B: Amino Acids With Codons Having "C" in the Last Position

Amino Acid   Potential CpG-Forming Codons    Alternative Codons
Cys          TGC (57.3%)                     TGT (42.7%)
Val          GTC (25.6%)                     GTA (9.9%), GTG (48.1%), GTT (16.4%)
Ile          ATC (53.6%)                     ATA (13.1%), ATT (33.3%)
Leu          CTC (20.8%)                     CTA (6.8%), CTG (42.8%), CTT (12.1%), TTA (5.4%), TTG (12.2%)
Phe          TTC (59.3%)                     TTT (40.7%)
His          CAC (61.2%)                     CAT (38.8%)
Tyr          TAC (59.8%)                     TAT (40.2%)
Asn          AAC (58.6%)                     AAT (41.4%)
Asp          GAC (57.4%)                     GAT (42.6%)
Gly          GGC (34.1%)                     GGG (22.8%), GGA (25.5%), GGT (17.6%)
Thr          ACC (38.5%)                     ACA (26.4%), ACT (23.4%)
Ala          GCC (40.2%)                     GCA (21.0%), GCT (28.8%)
Pro          CCC (32.7%)                     CCA (27.3%), CCT (28.8%)
Ser          TCC (23.5%), AGC (24.8%)        TCA (14.2%), TCT (18.3%), AGT (13.4%)
It is noted that the use of TTA for Leu is disfavored, but not prohibited.
It is apparent from the foregoing tables that a gene can be altered, without affecting the sequence of the encoded polypeptide, to reduce the number of CpGs to zero. However, the gene is more preferably modified so that 1% to 3% of the dinucleotides are CpG.
With regard to the selection of which CpG dinucleotides in the wild-type marker gene to eliminate, there are no stringent requirements. It is believed to be preferable to reduce CpGs proportionately throughout the gene; however, this rule is not ironclad. It is expected that, in order to achieve the desired CpG frequency in at least some genes of interest, it will be necessary to eliminate one or more intercodon CpGs. In at least some cases, this will mean replacing a higher preference CpG-forming codon with a lower preference non-CpG codon (as in the case of Ile). When there is a choice of substitute codons, the higher preference codon is normally preferred.
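The codon substitution strategy may be sketched as follows. This is an illustrative sketch under stated assumptions and not a prescribed procedure: the function names are hypothetical; the usage percentages are those of Tables A and B, with the four Arg percentages assigned to CGA, CGC, CGG and CGT in the order listed and the Cys alternative taken to be TGT; every removable CpG is removed, whereas a 1-3% residual frequency is actually preferred; and no attempt is made to avoid splice-site-like sequences. Because no synonymous substitution changes whether a codon begins with G, each codon can be chosen by looking only at the first base of the following codon.

```python
# Mammalian codon usage (%) for amino acids having a CpG-forming codon (Tables A and B).
USAGE = {
    "Thr": {"ACG": 11.8, "ACA": 26.4, "ACT": 23.4, "ACC": 38.5},
    "Ala": {"GCG": 9.9, "GCA": 21.0, "GCT": 28.8, "GCC": 40.2},
    "Pro": {"CCG": 11.2, "CCA": 27.3, "CCT": 28.8, "CCC": 32.7},
    "Arg": {"CGA": 10.7, "CGC": 19.5, "CGG": 18.2, "CGT": 9.1,
            "AGA": 21.0, "AGG": 21.5},
    "Ser": {"TCG": 4.9, "TCA": 14.2, "TCT": 18.3, "AGT": 13.4,
            "TCC": 23.5, "AGC": 24.8},
    "Cys": {"TGC": 57.3, "TGT": 42.7},
    "Val": {"GTC": 25.6, "GTA": 9.9, "GTG": 48.1, "GTT": 16.4},
    "Ile": {"ATC": 53.6, "ATA": 13.1, "ATT": 33.3},
    "Leu": {"CTC": 20.8, "CTA": 6.8, "CTG": 42.8, "CTT": 12.1,
            "TTA": 5.4, "TTG": 12.2},
    "Phe": {"TTC": 59.3, "TTT": 40.7},
    "His": {"CAC": 61.2, "CAT": 38.8},
    "Tyr": {"TAC": 59.8, "TAT": 40.2},
    "Asn": {"AAC": 58.6, "AAT": 41.4},
    "Asp": {"GAC": 57.4, "GAT": 42.6},
    "Gly": {"GGC": 34.1, "GGG": 22.8, "GGA": 25.5, "GGT": 17.6},
}
CODON_TO_AA = {codon: aa for aa, table in USAGE.items() for codon in table}


def is_acceptable(codon, next_starts_with_g):
    """True if the codon has no internal CpG and forms no intercodon CpG."""
    return "CG" not in codon and not (next_starts_with_g and codon.endswith("C"))


def deplete_cpg(coding_seq):
    """Replace CpG-forming codons with the most used synonymous codon that forms no CpG."""
    seq = coding_seq.upper()
    if len(seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    out = []
    for idx, codon in enumerate(codons):
        next_g = idx + 1 < len(codons) and codons[idx + 1][0] == "G"
        amino_acid = CODON_TO_AA.get(codon)
        # Codons outside the tables (Met, Trp, Lys, Gln, Glu, stops) cannot form a CpG.
        if amino_acid is None or is_acceptable(codon, next_g):
            out.append(codon)
            continue
        usage = USAGE[amino_acid]
        candidates = [c for c in usage if is_acceptable(c, next_g)]
        out.append(max(candidates, key=usage.get))
    return "".join(out)


if __name__ == "__main__":
    print(deplete_cpg("TGCGCA"))        # Cys-Ala with an intercodon CpG
    print(deplete_cpg("CGTACGATCGAA"))  # Arg-Thr-Ile-Glu
```

In the second example, the Ile codon ATC is replaced by the lower preference ATT because it is followed by a Glu codon, which illustrates the intercodon case discussed above. A practical design would stop substituting once the target CpG frequency is reached and would distribute the remaining CpGs throughout the gene.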
It may be useful, though it is not necessary, to include two or three CCGG (MspI/HpaII) sites, prudently positioned, as a "diagnostic" for methylation.
A further consideration in designing the CpG-depleted marker gene is that one preferably should avoid creation of RNA splice sites. Consensus sequences for splicing donor and acceptor sites are given in Padgett, et al., Ann. Rev. Biochem., 55:1119-50 (1986). Otherwise, some mRNAs will be incorrectly spliced and may therefore be translated into a nonfunctional protein or a protein of different antigenic characteristics. The sequence AGGT is particularly undesirable as it is the predominant splice donor site; AGGC (a splice donor site) and AGG (a splice acceptor) should also be avoided if possible.
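A candidate sequence may be screened for the sequences named above with a simple scan. The sketch below is hypothetical and checks only the three literal motifs AGGT, AGGC and AGG on the given strand; it does not evaluate the full consensus sequences.

```python
MOTIFS = {
    "AGGT": "predominant splice donor",
    "AGGC": "splice donor",
    "AGG": "splice acceptor",
}


def find_splice_like_motifs(seq):
    """Return (position, motif, description) for each splice-like motif found."""
    seq = seq.upper()
    hits = []
    for i in range(len(seq)):
        # Test the longer motifs first so each position reports its most specific match.
        for motif in ("AGGT", "AGGC", "AGG"):
            if seq.startswith(motif, i):
                hits.append((i, motif, MOTIFS[motif]))
                break
    return hits


if __name__ == "__main__":
    for hit in find_splice_like_motifs("TTAGGTCCAGGCAAAGGA"):
        print(hit)
```

Such a scan could be run after each round of codon substitution, so that a substitute codon which creates one of these motifs can be exchanged for another synonymous codon.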
The desired CpG-depleted marker gene may be prepared entirely synthetically, i.e., using DNA synthesizer apparatus. See Worrall and Connolly, J. Biol. Chem., 265:21889-95 (1990). (Typically, the double stranded DNA will be subdivided into overlapping single stranded oligonucleotide segments. These will be synthesized separately, then ligated and annealed to form the desired DNA duplex.) However, if the marker gene is very large, it may be more desirable to eliminate unwanted CpGs through mutagenesis, e.g., cassette mutagenesis, of the wild-type gene. The individual cassettes may, of course, be prepared synthetically as described above. The invention is not limited to any particular method of preparing the CpG-depleted gene.
The term "mutant" is not intended to indicate that the wild-type gene is obtained first, and then altered. It includes even a wholly synthetic gene, provided that gene differs by at least one base pair from the naturally occurring gene which is closest in sequence to the mutant marker gene. A CpG-depleted mutant gene is one having at least one fewer CpG than the naturally occurring gene which has the greatest sequence similarity to the CpG-depleted gene.
The engineered structural sequence of the marker gene will be operably linked to regulatory sequences which are functional in the cells in which the selectable or screenable phenotype conferred by the marker gene is to be looked for. The most important of these regulatory sequences are the promoters. The transcription of the coding strand of the gene is accomplished by DNA-directed RNA polymerase, which binds to the promoter region. Promoters may contain regulatory elements which render transcription tissue- or developmentally-specific, or which make transcription regulatable by inducer or repressor molecules. For the purpose of the present invention, the promoter may be constitutive, inducible or repressible; the choice will depend on the character of the polypeptide encoded by the marker gene.
A great variety of promoters have been used to drive expression of unrelated genes in transgenic animals. In transgenic mice, one may list the mouse metallothionein (MT), human MT, mouse serum amyloid (SAA), mouse myc, mouse alpha2, chicken transferrin, mouse H-2K (class I MHC), viral thymidine kinase, Rous sarcoma virus LTR, mouse mammary tumor virus LTR, rat elastase, mouse albumin, mouse transferrin, human growth hormone releasing factor, mouse alphaA-crystallin, mouse beta-globin, mouse IgH and mouse amylase promoters. See Palmiter and Brinster, Ann. Rev. Genet., 20:465-99 (1986), Table 2. Additionally, several other promoters have been used to direct expression of their natively associated gene in a transgenic animal. These include the alpha-actin, alpha-fetoprotein, growth hormone, Hepatitis B surface antigen, insulin, myosin light chain-2, alpha1(I) collagen, and ovalbumin promoters. Id., Table 1.
If it is desirable to determine the mutagenic potential of the suspect chemical in all major tissues and organs, the promoter used should be one which is not tissue-specific. A preferred promoter for driving expression of a marker gene is the beta-actin promoter, which, unlike the alpha-actin promoter, drives a gene whose expression is believed to be ubiquitous. For the sequence of the beta-actin promoter from -2011, see Miyamoto, Nucleic Acids Res., 15:9095 (1987). Other preferred ubiquitous promoters include the various tRNA promoters, the ribosomal RNA promoter, the ribosomal protein promoter, and the histone promoter. For the methionyl tRNA promoter, see Nucleic Acids Res., 12:1101-15 (1984). The present invention extends, however, to the use of tissue-specific promoters as well.
The terminator (polyA addition site) sequence may be the endogenous terminator sequence of the marker gene, or it may be a foreign terminator, such as the terminator of the SV40 early gene or of the bovine growth hormone gene. The ribosomal binding site may be the endogenous ribosomal binding site, or one which provides increased translational efficiency, such as the Kozak sequence. Enhancer sequences may be used to increase expression, or to limit it to particular tissues, developmental stages, etc. A regulatory element of interest appears in the first intron of the beta-actin gene. It is believed to act as an up-regulator of transcription in a non-tissue-specific manner.
The marker gene and its associated regulatory sequences, hereinafter referred to as the transgene, must be introduced into the cells of a host animal. When the target species is a vertebrate, the host animal is also, preferably, a vertebrate.
The following references discuss the preparation of various transgenic vertebrates:
Mammals: Hammer, et al., J. Anim. Sci., 63:269-78 (1986); Hammer, et al., Nature, 315:680-683 (1985); Simons, BIO/TECHNOLOGY, 6:179-183 (1988); Murray, et al., Reprod. Fertil. Dev., 1:147-55 (1989); Rexroad, et al., Molec. Reprod. Dev., 1:164-69 (1989); Vize, et al., J. Cell Sci., 90:295-300 (1988); Wieghart, et al., J. Reprod. Fertil. (Suppl. 41) 89-96 (1990); Oren, et al., Proc. Nat. Acad. Sci. USA 87:5061-65 (1990); Brinster, et al., Proc. Nat. Acad. Sci. USA 82:4438-42 (1985).
Birds: Salter, et al., Virology, 157:236-40 (1987); Bosselman, et al., Science, 243:533-35 (1989); Bosselman, et al., J. Virol., 63:2680-89 (1989); Crittenden, et al., Theor. Appl. Genet. 77:505-15 (1989).
Amphibians: Rusconi and Schaffner, PNAS, 78:5051-55 (1981).
Fish: Zuoyan, et al., Kexue Tongbao, 31:988-90 (1986); Maclean, et al., BIO/TECHNOLOGY, 5:257-61 (1987).
The choice of a suitable host animal is dependent on (a) its genetic and metabolic similarity to the target animal, and (b) the time and expense involved in producing and maintaining the transgenic animals. Preferred host animals include, among the mammals, mice, rats, rabbits, hamsters and pigs, and among other vertebrates, transgenic fish.
Pigs are of interest since the anatomy of the pig (including the skin) is very similar to the human. Directing transgenic expression to the skin of pigs would create a useful model for the testing of cosmetics.
Fish may have an advantage in that various species of fish exhibit desirable characteristics relating to their use as laboratory animals. In fact, fish have a long history of performing in this capacity. They have played a critical role in the development of environmental biology, embryology, endocrinology, neurobiology and other areas. Research in fish has established much of our basic knowledge of membrane transport systems at the molecular level. Transgenic examples of at least 10 different species of fish have been produced. Several mammalian promoters have been shown to function in fish. See Chen, Thomas T. and Powers, Dennis A., Transgenic fish, Trends in Biotechnology, Vol. 8, No. 8, 1990, pp. 209-215. These include the SV40 early promoter, the Rous sarcoma virus LTR promoter, the mouse metallothionein promoter, the flounder luciferase promoter and the flounder alpha fetoprotein promoter. It is further believed that the cytomegalovirus promoter and phosphoenolpyruvate carboxykinase (PEPCK) promoters would be functional in fish. In general, promoters of piscine genes, genes of viruses which infect fish, and genes which are strongly conserved among the vertebrates are likely to be functional in fish.
Useful marker genes for fish models include the chloramphenicol acetyltransferase gene and the luciferase gene. Stuart, et al., Development, 109:577-584 (1990) describes an assay for expression of a CAT transgene. Assays for other genes transferred to fish, including various growth hormones, the E. coli beta-galactosidase (lacZ) gene and the E. coli hygromycin resistance gene, have been reported in the references cited in Table 1 of Chen, et al., and of course assays for expression of still more genes may be adapted from piscine systems.
Since fish undergo external fertilization, the injected embryos do not require the complex manipulations (in vitro culturing followed by implantation into pseudopregnant foster mothers) required in mammalian systems. It has been found that when DNA is injected cytoplasmically into fish eggs (rather than into the pronucleus) , embryo survival rate is high (35-80%) as is the rate of DNA integration (10-70%) . While pronuclear injection is usually impractical, as fish nuclei are difficult to locate, there are exceptions (e.g., edaka) . Where the chorion poses a significant obstacle to microinjection, it may be removed mechanically or chemically, its hardening may be prevented, the DNA may be injected through the micropile, or a pilot hole may be made by microsurgery.
The zebrafish (Brachydanio rerio) has been used to produce stable lines that exhibit reproducible patterns of transgene expression. See Stuart, Gary W., et al., Stable lines of transgenic zebrafish exhibit reproducible patterns of transgene expression, Development 109, 577-584, 1990. They are much less expensive to buy and raise than any mammalian species. They are extremely fecund, oviparous and are externally fertilized. Because of these factors, it is much less expensive and
technically less complicated to perform gene transfer procedures on them. Their eggs are transparent and embryonic development occurs at a much faster rate than in the mouse. Large scale production of homozygous diploid zebrafish can be obtained in a reproducible and relatively simple manner. See Streisinger, G., Walker, C., Dower, N., Knauber, D., Singer, F., Nature 291, 293, 1981.
The fact that fish are aquatic organisms cannot be overlooked. This has important implications with respect to simplifying both experimental design and implementation. Fish present clear advantages for the evaluation of the impact of water-soluble chemicals.
Certain enzymes are known to play a role in the conversion of promutagens into mutagens. Host animals may be selected, on a species and/or individual level, to provide a level of activity of these enzymes which is comparable to (or if desired to increase the margin of safety, higher than) that of the target animal of interest. If a particular species of animal, such as a mouse, is deficient in a particular enzyme of this type, it may be modified, by crossbreeding or genetic engineering, to provide
(or enhance the activity of) the desired enzyme systems. For example, a transgenic animal may be produced that features a P450 enzyme missing in the mouse, or homologous recombination may be used to replace it with a human counterpart or to insert a stronger promoter upstream of a gene encoding such an enzyme.
A further consideration is the spontaneous mutation rate in the host animal. Preferably, this background level of mutation is low, e.g., less than 10^-5 to 10^-6.
The natural environment of an animal may make it better suited for testing certain scenarios of chemical exposure. For example, waterborne chemicals are preferably tested using transgenic fish (or amphibia or aquatic mammals).
If an animal is particularly sensitive to mutagens, it may be useful in detecting less potent mutagens. A final issue is the economic importance of the animal. A chemical which has a detrimental effect on an economically important animal may be rejected even if it does not have a serious adverse effect on humans. This could be the case with,
for example, honey bees, or with fish.
The laboratory mouse has been the most popular host animal for use in the development of transgenic animals. Mice are the most widely available laboratory animal, and numerous strains are available. See Genetic Variants and Strains of the Laboratory Mouse (Gustav Fischer Verlag, 1981). However, there are no substantial restrictions on the use of other laboratory or livestock species in such work. Among the higher mammals, pigs are preferred, and fish offer an interesting alternative to mammalian subjects.
Techniques for the production of transgenic animals are described in Gordon and Ruddle, "Gene Transfer into Mouse Embryos: Production of Transgenic Mice by Pronuclear Injection," Meth. Enzymol. 101:411 (1983); Brinster, et al., "Factors Affecting the Efficiency of Introducing Foreign DNA into Mice by Microinjecting Eggs," Proc. Nat. Acad. Sci. (USA), 82:4438-42 (1985); Palmiter and Brinster, "Germ-Line Transformation of Mice," Ann. Rev. Genet. 20:465 (1986); Brinster and Palmiter, "Introduction of Genes into the Germ Line of Animals," The Harvey Lectures, Series 80, 1-38 (1986); Scangos and Bieberich, "Gene Transfer into Mice," Adv. Genet., 24:285 (1987); Cuthbertson and Klintworth, "Transgenic Mice--A Gold Mine for Furthering Knowledge in Pathobiology," Lab. Investig. 58:484 (1988); Camper, "Research Applications of Transgenic Mice," BioTechniques, 5:638 (1987); Hogan, et al., Manipulating the Mouse Embryo: A Laboratory Manual (Cold Spring Harbor Lab. 1986); Levine and Tilghman, "Gene Transfer into the Germline," in Kucherlapati, ed., Gene Transfer (Plenum Press 1986); Palmiter and Brinster, "Transgenic Mice," Cell, 41:343-45 (1985). DNA may be introduced into host cells by microinjection, electroporation, infection, and other mechanisms such as lipofection and cell receptor-mediated transfer. While the DNA may plainly contain bacterial genes, procaryotic vector DNA (more particularly any prokaryotic replicon) should be removed before the transgene is introduced into the host cell(s) to be developed into a transgenic animal. For example, the inclusion of large flanking sequences of lambda DNA in early beta-globin transgene constructs apparently inhibited expression of the transgene in transgenic mice. See Wagner, et al., in Molecular and Cellular Aspects of Reproduction, 319-349 (1986).
The most common technique for the production of transgenic animals involves the microinjection of the transgene into the pronucleus of fertilized eggs. Because integration usually accompanies DNA replication, about 70% of the transgenic mice carry the transgenes in all of their cells, including the germ cells. In the remaining 30%, integration apparently occurs after one or more rounds of replication, hence, the transgene is found in only a fraction of the cells. These mice usually exhibit the same degree of mosaicism in somatic and germ cells, but in some mice the germ cells may totally lack the transgene. In the latter case, the mice will be unable to transmit the transgene to their progeny. One of the requirements for successful pronuclear microinjection is the ability to locate the pronucleus. Eggs of some species are more opaque than others. However, the visibility of the pronucleus may be enhanced by centrifugation and/or differential interference contrast microscopy (Nomarski optics) . The employment of art-recognized techniques of facilitating microinjection is within the invention as contemplated.
Transgenes may also be incorporated into the host cell genome by microinjection of DNA into the cytoplasm of fertilized or unfertilized eggs, into the nuclei of two-cell embryos, or into the blastocoel cavity. Mosaicism is more prevalent with these approaches.
Alternatives to microinjection include electroporation, liposome-mediated entry, and particle gun bombardment. Preimplantation embryos may also be infected with retroviruses engineered to carry the transgene. This method has found particular favor for the production of transgenic birds.
Still another method for the production of transgenic animals is to introduce the transgene, on a suitable vector, into totipotent teratocarcinoma or embryonic stem cells and then incorporate these cells into embryos.
Once transgenic animals are produced, they (or their transgenic progeny) are exposed to the suspect chemical. The
exposure may be by ingestion, inhalation, injection, or skin contact. The dosage employed may be one comparable to that experienced by the target species in the environment of interest, or it may be a higher dose, in order to provide a margin of safety. After an appropriate exposure period, the animals are examined to determine whether the marker gene has been mutated.
In one embodiment, the marker gene confers a phenotype which can be detected without killing the animal, e.g., one which may be detected by in vivo imaging means. In vivo imaging means known in medicine include CAT, PET, NMR and MRI. For this to work, the transgene or its expression product must have a characterizing feature which is recognizable by a detectably labeled homing agent. For example, monoclonal antibodies may be prepared which bind a wild-type polypeptide, in preference to the mutant polypeptide encoded by the marker gene. These antibodies may be detectably labeled and injected into the animal. If the epitopes for these antibodies are reestablished by specific reverse mutation, these antibodies may be localized by scintigraphic means known in the art. (A forward assay can also be envisioned, but is less desirable because of the increased background.)
Certain cells may, of course, be removed without killing the transgenic animal. These include blood cells, skin cells, mucosal cells, etc. Such cells may be removed and examined as described below. However, this method, while permitting the monitoring of the development of the mutagenic effect of the chemical in certain tissues over time, does not provide information as to mutagenesis of the marker gene in all tissues and organs. Therefore, in another and preferred embodiment, the animal is sacrificed so that all of its tissues and organs of interest may be examined for mutation of the transgene. In this embodiment, it is not strictly necessary that the animal have expressed the marker gene. However, it is preferable that the animal express the marker gene, since mutation rates may be different for expressed and unexpressed genes. Mellon, et al., PNAS, 83:8878-82 (1986).
In an especially preferred embodiment, transgenic mice which are homozygous for lacI are mated with transgenic mice which are homozygous for lacZ under lacO control. The progeny are hemizygous for a single copy of lacI and for one or two copies of lacZ. The progeny animals are exposed to the potential mutagen. If the lacI gene is mutated, the cells of the progeny animal will stain blue since the lacZ gene is then derepressed. (Having more than one copy of the lacI gene is undesirable, since then both copies must be mutated in order to derepress the lacZ gene.) To avoid the problem of a mutation in lacZ that prevents its expression hiding the status of the lacI marker gene, it is preferable to use two copies of the lacZ gene. However, more than two copies is undesirable as this would titrate out the repressor.
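The copy-number reasoning of this embodiment may be illustrated by a minimal sketch. The function name and the boolean-list representation are hypothetical, and the model is deliberately simplified: a cell is scored blue only when every lacI copy has been inactivated and at least one lacZ copy remains functional. Titration of the repressor by excess lacZ copies is not modelled.

```python
def stains_blue(laci_copies_mutated, lacz_copies_functional):
    """Simplified, hypothetical model of the lacI/lacZ reporter cell.

    laci_copies_mutated    -- list of booleans, one per lacI copy
                              (True = that copy no longer makes repressor)
    lacz_copies_functional -- list of booleans, one per lacZ copy
                              (True = that copy can still make beta-galactosidase)
    """
    # A single intact lacI copy is assumed sufficient to repress lacZ.
    repressor_present = not all(laci_copies_mutated)
    # A single intact lacZ copy is assumed sufficient for blue staining.
    reporter_available = any(lacz_copies_functional)
    return (not repressor_present) and reporter_available


# One lacI copy, two lacZ copies: a lacI mutation is reported even if
# one of the lacZ copies has itself been knocked out.
print(stains_blue([True], [False, True]))   # blue
# Two lacI copies: mutating only one of them is not reported.
print(stains_blue([True, False], [True]))   # not blue
```

Under these assumptions, a second lacI copy masks single mutations, while a second lacZ copy protects against a false negative caused by loss of the indicator.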
A variety of other phenotypic characteristics could be used to identify cells containing a mutagenized form of the marker gene. These include antibody sensitivity or resistance, antigenicity, etc. (See discussion of marker genes above.)
Alternatively, the marker gene may be recovered from the genomic DNA of the transgenic animal. A variety of techniques are known for rescue of a foreign gene from genomic DNA. These include rescue of lambda proviruses, plasmid rescue, and rescue of filamentous phage DNA. One method is the use, as previously discussed, of a lambda packaging extract. Once the marker gene has been recovered, mutations may be detected in any of several ways. First, as the marker gene confers a selectable or screenable phenotype, the recovered marker gene may be cloned into a suitable "assay" cell, such as a bacterial cell, and the transformed cells may then be exposed to selection or screening conditions. If this technique is used, the marker gene must be expressible in the assay cells, and therefore must be operably linked (either originally, or as a result of further manipulation) to a promoter functional in those cells. This procedure will detect forward and back mutations, but not silent or neutral mutations. Silent and neutral mutations may be screened for by extracting genomic DNA and hybridizing it to a panel of oligonucleotide probes, each directed against a different locus of the marker gene, under stringent conditions. The failure of one of these probes to hybridize is then indicative of the presence of a mutation.
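The probe-panel screen may be sketched in silico as follows. This is an illustration only: the function names, probe length, and the 60-base sequence are arbitrary assumptions, and exact string matching is used as a stand-in for hybridization under stringent conditions.

```python
def probe_panel(reference, probe_len=20, step=20):
    """Tile the reference marker sequence into (start, probe) pairs."""
    return [(i, reference[i:i + probe_len])
            for i in range(0, len(reference) - probe_len + 1, step)]


def failed_probes(sample, panel):
    """Return start positions of probes with no exact match in the sample.

    Exact matching is a crude proxy: a real probe may tolerate or reject
    a mismatch depending on its position and the wash conditions.
    """
    return [start for start, probe in panel if probe not in sample]


# Arbitrary 60-base example sequence (an assumption, not taken from the figures).
reference = ("ATGAAACCAGTAACGTTATA"
             "CGATGTCGCAGAGTATGCCG"
             "GTGTCTCTTATCAGACCGTT")
# Introduce a single T->A substitution at position 25.
sample = reference[:25] + "A" + reference[26:]

panel = probe_panel(reference)
print(failed_probes(sample, panel))  # the probe starting at 20 fails
```

The failing probe localizes the mutation to a window of the marker gene, which can then be sequenced.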
Use of transgenic animals in mutagenic assays also allows one to determine whether a promutagen or its metabolic products can cross the placenta or the blood-brain barriers.
Example 1
Construction of lacI Expression Vectors
All plasmids were cloned in E. coli strains HB101 or DH5. Plasmid pCMVlacI (5.5 Kb) (Brown, et al., Cell, 49:603-12, 1987; Figge, et al., Cell, 52:713-22, 1988), a source of the lacI gene, was digested with EcoRI, and the resulting 1.1 Kb fragment was cloned into the EcoRI site of plasmid pBSK+ (2.9 Kb) (Stratagene), creating the plasmid pBSK lacI (4.0 Kb). No promoter is operably linked to the lacI gene in pBSK lacI. The β-actin promoter was excised from the plasmid pHβAPr-1 (6.6 Kb) (Gunning, et al., Proc. Nat. Acad. Sci. USA, 84:4831-35, 1987) by restriction with HindIII and BamHI, and this 4.3 Kb fragment was ligated with a 1.1 Kb fragment obtained by BamHI/HindIII digestion of pBSK lacI to obtain pHβlacI (7.7 Kb), in which the lacI gene is under the transcriptional control of the β-actin promoter.
In order to obtain targeting of the LacI repressor protein to the nucleus, the hybrid gene was modified to further encode the heptapeptide (PKKKRKV) nuclear location signal from SV40 large T antigen. Plasmid pHβlacI (7.7 Kb) was cut with HindIII and BamHI, thereby excising the 3' untranslated flanking region of the lacI gene. The remaining 6.6 Kb fragment was ligated with a 1.1 Kb fragment obtained by digestion of plasmid pSZN5 (a.k.a. pMTlacINLS), a derivative of pMTlacI (Brown, et al., Cell, 49:603-12, 1987), with HindIII and BglII, thereby producing the new plasmid pHβlacINLS (7.7 Kb).
For selective purposes, a neomycin resistance gene was cloned into pHβlacINLS. Plasmid tkneo was cut with BamHI and HindIII, releasing a 2 Kb fragment. Both ends of this fragment were then blunt-ended. Plasmid HSL mutants pSAM was cut with EcoRI, yielding a 2.7 Kb fragment. This, too, was blunt-ended. The two blunt-ended fragments were then ligated to obtain plasmid HSLmutants-neo. This was cut with ScaI to obtain a 2.6 Kb fragment. Plasmid pHβlacINLS was linearized with SspI, and ligated to the neo-bearing ScaI fragment to obtain pHβlacINLSneo (10.7 Kb). Plasmid pHblNB was prepared by cutting pHblacI with HindIII/BamHI, and ligating it with a 1.1 Kb HindIII/BamHI fragment obtained by cutting pSZN5 with BglII, blunt ending the BglII ends, cutting again with HindIII, and attaching a BamHI linker.
Example 2 Production of Transgenic Mice with CpG-depleted lacI gene
Transgenic mice were produced substantially according to the following standard protocol. (For further details, see Chandrashekar, et al., Neuroendocrine Research Methods and Functions in Transgenic Mice, in Greenstein, D.B., ed., Vol. 1, Chap 15, Neuroendocrine Research Methods, 315-336 (Howard Academic Pub., London: 1991).) Embryos from B6SJL F1 female mice bred to males of the same strain are used in our laboratory because they culture well in our hands and are favorable for microinjection because they have little cytoplasmic pigmentation. The embryo donor B6SJL females are superovulated with 5 I.U. of pregnant mares' serum gonadotropin (PMS) at 12:00 noon three days prior to embryo collection. Forty-eight hours later at 12:00 noon, the ovulation of these mice is synchronized by injection of 5 I.U. of human chorionic gonadotropin (HCG) and embryos are collected at 9:00 am the following morning from the ampulla of the oviducts of the embryo donors following sacrifice by cervical fracture. The collected embryos are treated with bovine testis hyaluronidase to remove cumulus cells, washed five times, and incubated under 90% N2, 5% O2, 5% CO2 at 37°C in Brinster's medium until further use.
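The hormone timing of the superovulation protocol may be laid out as a small scheduling sketch. The function name and the example date are hypothetical; only the clock times and the 48 hour interval are taken from the protocol.

```python
from datetime import date, datetime, time, timedelta


def superovulation_schedule(collection_date):
    """Back-calculate injection times from the embryo collection date."""
    # Embryos are collected at 9:00 am on the collection day.
    collection = datetime.combine(collection_date, time(9, 0))
    # HCG is given at 12:00 noon on the day before collection.
    hcg = datetime.combine(collection_date - timedelta(days=1), time(12, 0))
    # PMS is given 48 hours before HCG, i.e. at noon three days before collection.
    pms = hcg - timedelta(hours=48)
    return {"PMS": pms, "HCG": hcg, "embryo_collection": collection}


# Arbitrary example date chosen for illustration.
for step, when in superovulation_schedule(date(1992, 3, 5)).items():
    print(step, when.strftime("%Y-%m-%d %H:%M"))
```

Working backward from the collection date keeps the noon-to-noon 48 hour interval and the 9:00 am collection consistent with one another.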
Plasmid pHbLacI was digested with SspI and BamHI to remove all procaryotic vector sequences, and 25 μl of fragments (concentration 25 ng/μl) were microinjected into the male pronucleus of the collected embryos. Microinjection is carried out using two Leitz micromanipulators controlling a suction holding pipette and a 1 μm injection pipette. The holding pipette is connected via tubing to a 500 μl threaded plunger Hamilton syringe; the injection pipette is connected via tubing to a microsyringe. In both cases the syringe and tubing are filled with light paraffin oil. 30 to 40 eggs are placed in a depression slide within a microdrop of Brinster's microinjection medium (a phosphate buffered medium not requiring continuous exposure to the N2/O2/CO2 gas mixture), adjacent to a microdrop of DNA solution, under silicone oil. Micromanipulation involves loading the injection pipette with DNA solution, returning to the microdrop, focusing on the male pronuclear membrane, introducing the injection pipette through the pronuclear membrane into the pronucleus, and expelling about 1 pl of DNA solution, swelling the pronucleus to approximately 200% of its normal volume. These manipulations are carried out under 160 to 320X magnification using a Zeiss inverted microscope. After each group of embryos is injected, they are washed and then returned to Brinster's medium and maintained under the gas mixture until transfer to recipient females.
The injected embryos are transferred into the oviducts of white CD-1 female mice previously bred to vasectomized males. These recipient females are selected by the presence of vaginal plugs on the morning the embryo microsurgery is performed. Ten injected embryos are transferred to each oviduct of each recipient. Approximately 20 days later, pups are born. When uninjected embryos are transferred as controls, the average litter size is 14. In our experience, 90 to 95% of recipients will give birth with an average litter size of 7 to 8 pups. Mice produced from microinjected eggs are weaned a month after birth. Segments of tails are analyzed by DNA hybridization analysis for the presence of the injected gene construct. In the instant experiment, the microinjected embryos were transplanted into the preimplantation uteri of nine pseudopregnant females. These females produced 63 pups, in eight litters.
Animals which were pHβlacI positive at 20-30 days after birth were detected by means of a Southern blot using a β-actin/lacI probe, an SspI/BamHI fragment of pHβlacI. Positive mice were selected for further testing.
Example 3
Study of Methylation of Transgene in Transgenic Mice and in Transfected Cell Line
Genomic DNA from the tails of several transgenic mice (lacI02 and its daughter, lacI02.11) was digested with a restriction enzyme (BamHI, BglII, EcoRI, HindIII, HpaII, MspI, NotI, PstI or RsaI) and characterized by Southern blotting. The probe was prepared by digesting plasmid pCMVlacI with EcoRV, which linearizes the plasmid, and then labeling the linearized pCMVlacI with [32P] dATP. The probe was hybridized to the blotted fragments at 42 deg. C.
A single large fragment was detected in the BamHI, BglII and NotI digests. The fragment migrated to approximately the same position as the 23 Kb HindIII fragment of lambda. This indicated the absence of BglII sites and the loss of the last BamHI site. Complex patterns of 3-5 bands of over 5 Kb size were observed in the EcoRI, HindIII and HpaII digests.
MspI and HpaII generated very different patterns. This suggested that the lacI transgene was heavily methylated in the tail DNA of the source animals. Similarly, heavy methylation patterns were observed in progeny.
For comparative purposes, lacI cell line DNA (mouse NIH 3T3 or human fibrosarcoma HTD114 derivatives) was subjected to a similar analysis. The HpaII and MspI patterns were the same, indicating that lacI was not methylated in cell line DNA.
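The logic of the HpaII/MspI comparison may be illustrated by a minimal in silico digest. Both enzymes recognize CCGG and cut after the first C; HpaII is blocked when the internal cytosine is methylated, whereas MspI is not. The function names, the toy sequence, and the representation of methylation as a set of cytosine positions are assumptions made for illustration.

```python
import re


def digest(seq, methylated_c, enzyme):
    """Return fragment lengths after a simulated HpaII or MspI digest.

    methylated_c -- set of 0-based positions of methylated cytosines
    enzyme       -- "HpaII" (blocked by internal-C methylation) or "MspI"
    """
    cuts = []
    for m in re.finditer("CCGG", seq):
        internal_c = m.start() + 1          # the C of the CpG within CCGG
        if enzyme == "HpaII" and internal_c in methylated_c:
            continue                        # methylated site is left uncut
        cuts.append(m.start() + 1)          # both enzymes cut C^CGG
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


def looks_methylated(seq, methylated_c):
    """Differing HpaII and MspI patterns suggest CpG methylation."""
    return digest(seq, methylated_c, "HpaII") != digest(seq, methylated_c, "MspI")


# Toy 20-base sequence with CCGG sites at positions 4 and 14.
toy = "AAAACCGGTTTTTTCCGGAA"
print(digest(toy, {5}, "MspI"))     # cuts at both sites
print(digest(toy, {5}, "HpaII"))    # first site protected by methylation
print(looks_methylated(toy, set())) # identical patterns when unmethylated
```

In this sketch, as in the blots above, identical HpaII and MspI patterns are read as absence of methylation at the CCGG sites, and divergent patterns as its presence.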
Example 4
Expression of Transgene in Transgenic Mice and in Transfected Cell Line
Transgenic mice 02.11.03 (female) (BCF1 background), 09.07 (female) (DBA/2J bkgd) and 09.03.03 (male) (129/SV bkgd) were sacrificed, and their liver, spleen, heart, testis/ovary (09.03.03 and 02.11.03 only), uterus, kidney and muscle (02.11.03 and 09.03.03 only) tissues were removed and frozen in liquid nitrogen. Liver, testis/kidney and heart RNA was extracted by the acid phenol method. A Northern blot was run on total RNA from (1) the lacI cell line (NIH 3T3 or HTD114), (2) rat embryo fibroblast (REF) cells, (3,4,5) liver, testis and heart RNA from 09.03.03, (6,7) liver and heart from 09.07, (8,9,10) liver, ovary and heart from 02.11.03, (11,12) the tails of 04.04.05 and 04.04.01, (13) control lines (NIH 3T3), and on DNA from (14,15) 04.04.05 and 04.04.01. A sensitive lacI probe was made by PCR amplification of a lacI template. This was hybridized to the aforementioned RNAs. A band hybridized to lacI in the transfected cell line RNA at the appropriate position for the expected 1.1 Kb message. No such bands were detected in any transgenic animal tissues. This suggests that the lacI gene was expressed in the cell line, wherein it was unmethylated, but not in the transgenic mice, wherein it was heavily methylated.
Example 5
Comparative Study of Methylation of Transgene in Mice of Different Backgrounds
Southern blots were prepared of lacI09 lineage DNAs to compare the degree of methylation of the lacI gene in DBA/2J, 129/Sv and BALB/c X C3H backgrounds. There were no apparent differences in the first generation. In later generations, the lacI gene was demethylated to some degree in the DBA/2J line but remained hypermethylated in the other two lines.
Example 6
Preparation and Use of CpG-Depleted LacI Gene (LacIMlRNL)
Figure 2 depicts a lacI gene modified to reduce the number of CpG dinucleotides from 95 (~18%) to four (~0.8%).
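The principle of CpG depletion by synonymous codon substitution may be sketched as follows. This is not the procedure by which the sequence of Figure 2 was designed: the function names are hypothetical, the standard genetic code is assumed, no account is taken of mammalian codon usage, and the sketch removes every CpG rather than retaining selected diagnostic sites.

```python
from itertools import product

# Standard genetic code in TCAG order (assumed applicable to the marker gene).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
SYNONYMS = {}
for _codon, _aa in CODON_TABLE.items():
    SYNONYMS.setdefault(_aa, []).append(_codon)


def count_cpg(seq):
    """Number of CpG dinucleotides on the given strand."""
    return seq.upper().count("CG")


def translate(seq):
    """Translate complete codons of an in-frame coding sequence."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


def deplete_cpg(seq):
    """Replace codons with synonyms so as to minimize CpG dinucleotides."""
    codons = [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]
    out = []
    for idx, codon in enumerate(codons):
        prev_last = out[-1][-1] if out else ""
        # Only the first base of the next codon matters; synonymous codons
        # never change whether a codon starts with G.
        next_first = codons[idx + 1][0] if idx + 1 < len(codons) else ""
        candidates = [codon] + SYNONYMS[CODON_TABLE[codon]]
        # min() keeps the original codon when it is already CpG-free.
        best = min(candidates,
                   key=lambda c: (prev_last + c + next_first).count("CG"))
        out.append(best)
    return "".join(out)


example = "ATGGCGCGTACGGACGAA"   # arbitrary coding fragment, 4 CpGs
depleted = deplete_cpg(example)
print(count_cpg(example), count_cpg(depleted))
print(translate(example) == translate(depleted))
```

Because each choice affects only the codon itself and its junction with the following codon, the left-to-right pass suffices in this simplified setting. A practical design would add constraints of the kind discussed in this Example, such as retaining diagnostic MspI/HpaII sites and avoiding potential splice sites.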
The sequence of Figure 2 unfortunately includes potential RNA splice sites. The following base substitutions would eliminate these sites: 87 A→G 180 A→T
402 G→A 594 G→A 882 G→A 891 G→A 936 G→A
942 G→A
1083 G→A 111613-»A
The gene of Figure 2 is prepared by chemical synthesis of component oligonucleotides and their subsequent annealing and ligation to form the desired lacI mutant. The LacI gene was synthesized in segments by annealing double stranded oligonucleotides with overhanging, complementary ends of 10 nucleotides. The following single stranded oligonucleotides were synthesized for use in assembling the lacImlRNL gene, including the HindIII site, the Kozak RBS, the modified lacI sequence, the three codon linker and the seven codon NLS-encoding sequence. The numbering begins with the first base on the sense strand in Figure 2 (SEQ ID NO:3).
Sense Strand Anti-Sense Strand
Note that there is a four base protrusion at both 5' ends of the final double stranded gene. The oligonucleotides a through t and a' through t' were synthesized on an Applied Biosystems 391 DNA synthesizer (PCR-MATE) per the vendor's instructions. The oligonucleotide a is complementary to a', b to b', c to c', etc. The LacI gene was modified to remove all but 4 CpGs, 3 of which (positions 48, 606 and 1027) are part of MspI/HpaII methylation diagnostic sites. The modified LacI gene was synthesized in 2 halves. The 5' half is bounded by a HindIII site and an ApaI site; the 3' half by an ApaI site and a BamHI site. The double stranded oligos a/a', b/b' and c/c' were incubated together, allowed to anneal, and ligated with T4 DNA ligase. The trimeric product was separated by agarose gel electrophoresis and recovered from the gel. The same procedure was followed with double stranded oligonucleotides d/d', e/e', f/f' and g/g', and with h/h', j/j' and k/k'. The three trimeric products were incubated together, allowed to anneal via complementary overhanging ends, and ligated with T4 DNA ligase in the presence of Bluescript SK plasmid DNA (Stratagene) that had been digested with HindIII and ApaI. The DNAs were used to transform E. coli XL1 cells, and transformed cells with plasmids containing inserts were identified by the absence of blue color development after staining with the chromogenic agent X-gal (Sigma) and IPTG (Sigma). Plasmid DNAs from white colonies were isolated and tested for a 570 bp insert representing the 5' half of the modified LacI gene by digestion with HindIII and ApaI and size fractionation. Inserts of the correct size were sequenced to ensure that no unwanted mutations were inadvertently introduced. The same procedure was used to synthesize, ligate and clone the oligonucleotides encoding the 3' end of the modified LacI gene. The double stranded oligonucleotides were annealed, ligated and recovered in the following groupings: l/l', m/m' and n/n'; o/o', p/p', q/q' and r/r'; s/s' and t/t'. The three oligonucleotide multimers were annealed together and ligated in the presence of Bluescript SK plasmid DNA (Stratagene), and the DNA used to transform E. coli XL1 cells. Plasmids with inserts were identified as above, and correct insert size determined by ApaI/BamHI digestion. Inserts were sequenced to ensure the absence of unwanted mutations. The complete gene was assembled by digesting the plasmid containing the 5' half with HindIII and ApaI and the plasmid containing the 3' half with ApaI and BamHI. The inserts were recovered and annealed and ligated in the presence of plasmid Bluescript SK digested with HindIII and BamHI in a 3 way ligation. Following transformation of E. coli XL1 cells, white colonies were picked after staining with X-gal and IPTG induction. Proper insert size (1.13 Kb) was established by digesting plasmids with HindIII and BamHI, and the entire modified LacI gene was sequenced to ensure that it was correct. As synthesized, the gene had a 5' HindIII site; the beta actin promoter has a 3' HindIII site and can readily be linked at that site. The final construct is cloned in Bluescript SK+ (Stratagene).
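The assembly of a gene from double stranded oligonucleotides with 10 nucleotide complementary overhangs may be sketched as below. The sketch is illustrative only: the function names, the 60 base segment length, and the test sequence are assumptions, the actual oligonucleotides a through t are not reproduced, and the four base terminal protrusions for the restriction sites are not modelled.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def design_oligos(gene, seg_len=60, overhang=10):
    """Split a gene into (sense, antisense) oligo pairs.

    Each antisense oligo is shifted downstream by `overhang` bases, so the
    annealed pair carries single-stranded ends that match its neighbours.
    """
    pairs = []
    for start in range(0, len(gene), seg_len):
        end = min(start + seg_len, len(gene))
        sense = gene[start:end]
        antisense = revcomp(gene[start + overhang:min(end + overhang, len(gene))])
        pairs.append((sense, antisense))
    return pairs


def assemble(pairs, overhang=10):
    """Join neighbouring duplexes, checking that each overhang anneals."""
    gene = pairs[0][0]
    for (sense, antisense), (next_sense, _) in zip(pairs, pairs[1:]):
        # Portion of the antisense oligo protruding beyond the sense strand,
        # written in sense-strand orientation.
        tail = revcomp(antisense)[len(sense) - overhang:]
        if next_sense[:len(tail)] != tail:
            raise ValueError("overhangs are not complementary")
        gene += next_sense
    return gene


# Arbitrary deterministic 150-base test sequence (not the gene of Figure 2).
gene = "".join("ACGT"[(i * i + i // 3) % 4] for i in range(150))
pairs = design_oligos(gene)
print(len(pairs), assemble(pairs) == gene)
```

In practice the overhang sequences would also need to be unique within each ligation group so that the duplexes can anneal in only one order.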
The resulting expression vector is then digested with SspI and BamHI to linearize the vector and remove procaryotic sequences, and the lacI-bearing fragment is then microinjected in the male pronuclei of fertilized mouse eggs as previously described. Production of transgenic animals is then by the method set forth above. It is expected that, as a result of the CpG depletion, the engineered lacI gene will be only weakly methylated, and therefore will be better expressed.
Example 7
Preparation of CpG-Depleted gpt Gene
In another example, the E. coli gpt gene may be used. Of the 458 dinucleotide pairs, 40 are CpGs. The nucleotide sequence and position of CpGs are shown in Figure 3 (SEQ ID NO:5). Oligonucleotides containing the modified DNA sequences with absent or reduced CpGs are synthesized using an Applied Biosystems 391 DNA synthesizer. The modified sequences and oligonucleotides and endpoints are denoted in Figure 4 (SEQ ID NO:7). The strategy is to hybridize complementary oligonucleotides (e.g., a and a'; b and b' etc.), to form double stranded oligos with 10 nucleotide overhangs. The double stranded oligos a/a', b/b' and c/c' are incubated together allowing ends to anneal, ligated with T4 ligase, and the trimeric product is purified from an agarose gel. The same procedure is followed for oligonucleotides d/d' and e/e', and for oligonucleotides f/f' and g/g' and h/h'. The gel purified trimeric and dimeric products are incubated together to allow annealing in the presence of EcoRV-digested Bluescript SK (Stratagene). E. coli XL1 cells (Stratagene) are transformed with the product and unstained colonies are picked for analysis after chromogenic staining with X-gal and IPTG. Plasmids from white colonies are checked for proper sized inserts by cleavage with EcoRI and HindIII, and plasmids with inserts of about 460 bp are subjected to DNA sequencing to ensure that the sequence is correct and no unwanted mutations are present.
Example 8: Production of Transgenic Pigs
Pigs are one of the standard experimental models for humans in clinical studies.
Transgenic pigs may be produced by the following procedure, which uses commercial cross-bred sows and boars. Parental stock may come from the Landrace, Yorkshire, Duroc and Hampshire breeds. About 24 h after previous litters are weaned, sows used as zygote donors are induced to ovulate with 400 i.u. PMSG (i.m.) (Ayerst Labs, Montreal, Canada) and 200 i.u. hCG (i.m.) (Sigma) or allowed to ovulate naturally. Thereafter, boars are used to check for oestrus daily at 10:00 h and 16:00 h. Donor sows are artificially inseminated with 120 ml of fresh, extended semen 24 and 30 h after the onset of oestrus. (Recipient sows are synchronized in oestrus with the donor sows, but are not inseminated.) A mid-ventral laparotomy is performed on the donor animals and zygotes are flushed from the oviducts with warm (37 deg. C.) modified BMOC-3 medium containing HES. The zonae pellucidae of the recovered eggs are examined for the presence of spermatozoa. One- and two-cell zygotes are centrifuged at 10,000 x g to facilitate visualization of pronuclear and nuclear structures. The zygotes are placed in cover-slip chambers in microdrops of modified BMOC-3 covered in silicone oil. Microinjection of about 10 pl of DNA-containing solution follows the procedure previously described for the mouse experiments. A mid-ventral laparotomy is performed on the recipient sows and zygotes are inserted into the oviduct of animals identified by ovarian morphology as having ovulated.
Example 9: Production of Transgenic Fish
The zebrafish, Brachydanio rerio, is a simple vertebrate with a number of desirable characteristics. Hundreds of eggs can be produced daily on a year round basis from a small number of adult fish. Eggs can be fertilized in vitro; as in the frog, fertilization is external. Zebrafish embryos are optically transparent, so embryonic development can be monitored and cell types within the embryo identified. The fish develop rapidly, hatching from their chorions at 2 to 3 days post fertilization. The generation time is only 3 to 4 months.
Zebrafish are maintained in aquaria under conditions conducive to rearing, mating and spawning, e.g., 12-16 fish per tank, 28.5°C, 14h light/10h dark cycle. For general zebrafish care and maintenance, see Streisinger, Nat. Cancer Inst. Monogr. 65:53-58 (1984).
Males and females are kept separate until the evening prior to egg collection. At this time 1-3 females and 1-4 males are placed in a spawning tank. The spawned eggs should be protected from the adults, e.g., by covering the floor of the tank with marbles. Recently fertilized zebrafish eggs are collected with a siphon at various intervals during the 1-2h spontaneous spawning period initiated by the onset of light. Immediately prior to microinjection, eggs are dechorionated manually or by digestion with pronase and placed in embryo medium (a modified 10% Hanks containing 1.2mM CaCl2, 1mM MgCl2 and 4mM NaHCO3). Injected or control embryos are reared in embryo medium for 2-4 days and then transferred to normal tank water and maintained.
The embryos in embryo medium are placed on a depression slide and injected with the aid of a dissecting microscope and a micromanipulator. The DNA solution is injected cytoplasmically through a continuously flowing micropipette; the flow rate may be controlled with pressurized air. Phenol red may be added to the solution to aid in estimating the volume injected.
DNA may be extracted from whole fish at 1-3 weeks of age by dissolving the tissue in 0.2ml 1xSET (10mM Tris pH 2.75, 5mM EDTA, 1% SDS) and then digesting the samples with 0.2mg/ml proteinase K for 2h at 37°C and then overnight at room temperature. The DNA is then precipitated with several volumes of 95% ethanol in the presence of 0.15M NaCl.
It may also be extracted from fins of adult fish by dissolving the tissue in IxSET, removing undissolved material by microcentrifugation, extracting with phenyl/chloroform/ isoamyl alcohol (50:50:1) and then precipitating with ethanol.
See Stuart, et al.. Development, 103:403-12 (1988) and Development, 109:577-584 (1990).
In conclusion, the zebrafish is a small, laboratory-adapted vertebrate species which can be cared for more easily than most mammalian subjects. A variety of mutations may be detected by studying their effects on a CpG-depleted marker gene in a zebrafish model.
Example 10: LacI/LacZ Reporter System
It is not necessary that the marker, or mutational target, gene express a detectable product. It may instead express a product that serves as a substrate for the product of a second (reporter) gene, or as a cofactor for the action of that product, or, as in this example, as a means of regulating the expression of the second gene. For example, the Lac repressor, expressed by a lacI target gene, may be used to extinguish expression of beta-galactosidase by the lacZ indicator gene. One method of obtaining a lacI/lacZ transgenic animal is by mating (a) a transgenic mouse homozygous for a single copy of the target transgene lacI, and (b) a transgenic mouse homozygous for the indicator transgene, lacZ, which may be present in one or two copies. Production of mouse line (a) is described in Example 2. Production of mouse line (b) is set forth in Example 11.
In a second method, the modified lacI gene with reduced CpG and (preferably) mammalian codon usage is directed by the normal bacterial lacI promoter, or an alternative prokaryotic promoter such as trp, and introduced into a lambda phage shuttle vector that has been previously described with the bacterial lacI by Kohler, et al., Proc. Natl. Acad. Sci. USA 88:7958-62, 1991, and that includes the α subunit of lacZ, a β-lactamase gene, and a ColE1 replication origin, all flanked by the initiator and terminator halves of the f1 filamentous phage origin. The vector is introduced into mice by pronuclear injection, and the transgenic mice are bred to homozygosity for the transgene. The mice are exposed to mutagen and genomic DNA is subsequently prepared from selected tissues as previously described: Kohler, S.W. et al., Nucleic Acids Res. 18:3007-13, 1990; Kohler, S.W. et al., PNAS 88:7958-62, 1991. The shuttle vector is rescued from genomic DNA by packaging the shuttle vector DNA into infective virions using an in vitro λ packaging extract (Transpack from Stratagene Cloning Systems), preadsorbing to E. coli SCS-8 (Stratagene Cloning Systems), mixing with top agar containing 2 mg of X-gal per ml of top agar and pouring onto assay plates with a bottom agar layer. Rescued phage containing wild-type lacI will produce colorless plaques while rescued phage with mutant lacI will produce blue plaques. The ratio of blue to colorless plaques is indicative of the mutagenicity of the compound. The DNA containing mutant lacI can be excised from the lambda phage in vivo (Kohler, S.W., Nucleic Acids Res. 16:7583-7600, 1988), and the mutant lacI gene sequenced.
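The readout of this second method reduces to simple arithmetic on plaque counts. The following is a minimal sketch, not part of the described protocol: the function name, the example counts, and the comparison against an untreated control are illustrative assumptions. It reports blue plaques as a fraction of all plaques scored, which for rare mutants is nearly identical to the blue-to-colorless ratio mentioned above.

```python
def mutant_frequency(blue_plaques: int, colorless_plaques: int) -> float:
    """Return the fraction of scored plaques that are blue (mutant lacI)."""
    total = blue_plaques + colorless_plaques
    if total == 0:
        raise ValueError("no plaques were scored")
    return blue_plaques / total


# Hypothetical plaque counts, chosen only to illustrate the arithmetic.
treated = mutant_frequency(blue_plaques=12, colorless_plaques=199_988)
control = mutant_frequency(blue_plaques=3, colorless_plaques=199_997)
fold_increase = treated / control

print(f"treated: {treated:.2e}")
print(f"control: {control:.2e}")
print(f"fold increase over control: {fold_increase:.1f}")
```

A large number of plaques must be scored per tissue because spontaneous mutants are rare; the counts above are placeholders rather than expected values.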
In a third method, the modified lacI gene with prokaryotic promoter is linked to a DNA sequence that is recognized by and binds to a specific protein or other substance. One example is the lac operator (lacO), which specifically binds the lac repressor with high affinity. The lac operator is placed close to the lacI gene, but positioned so as not to interfere with its expression, and mice are rendered transgenic for this construct by pronuclear injection. After breeding mice to homozygosity for the transgene, the animals are exposed to the mutagenic environment. At a subsequent time, the DNAs are isolated from selected organs and tissues of the exposed animal, and the DNA is digested with an enzyme that cleaves outside of both the operator sequence and the lacI gene, leaving intact DNA fragments containing both sequences. To the digested DNA is added purified lac repressor protein attached to magnetic beads. The repressor binds the operator sequence, and the complex is separated from the remainder of the DNA by use of a magnet. The separated fragments are cloned into a plasmid with an ampicillin-resistance marker and used to transform E. coli that constitutively express lacZ due to mutant or absent lacI. Ampicillin-resistant colonies containing mutant lacI will stain blue with X-gal while colonies with wild-type lacI will not.
The advantage of a bipartite detection system (which is not limited to lacI/lacZ) in which an "inhibitory" gene is the target gene, over use of a reporter gene as the target gene (i.e., lacZ alone), is that mutation in the target transgene is manifested as stained cells on an unstained background, whereas if lacZ were the target, mutation would appear as unstained cells on a stained background.
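The staining logic of the two designs can be written out as a small truth table. This is only an illustrative sketch: it assumes each gene is either fully functional or fully inactivated, and the function and dictionary names are hypothetical.

```python
def stains_blue(laci_functional: bool, lacz_functional: bool) -> bool:
    """A cell stains with X-gal only if lacZ works and no repressor blocks it."""
    return lacz_functional and not laci_functional


# Bipartite system: lacI is the mutational target, lacZ is the intact indicator.
bipartite = {
    "unmutated cell": stains_blue(laci_functional=True, lacz_functional=True),
    "lacI-mutant cell": stains_blue(laci_functional=False, lacz_functional=True),
}

# Single-gene system: lacZ itself is the target and no repressor is present.
lacz_only = {
    "unmutated cell": stains_blue(laci_functional=False, lacz_functional=True),
    "lacZ-mutant cell": stains_blue(laci_functional=False, lacz_functional=False),
}

for name, system in (("lacI/lacZ", bipartite), ("lacZ alone", lacz_only)):
    for cell, blue in system.items():
        print(f"{name:10s} {cell:17s} {'stained' if blue else 'unstained'}")
```

In the bipartite case the rare event (mutation) produces the positive signal, which is the practical advantage the paragraph above describes.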
Ideally we will generate one or more mouse lines that express a β-galactosidase transgene with an associated lac operator sequence. The distribution of bacterial β-galactosidase expression in the whole organism will be evaluated histochemically. Although many mammalian cell types produce an endogenous β-galactosidase, it can be distinguished from the bacterial enzyme by its pH 4.2 optimum. In contrast, the bacterial enzyme has an optimum at pH 7.0 (Goring, et al., Science 235:456-458, 1987). As previously described, the gross distribution of β-galactosidase activity in whole embryos (e.g., days 11 to 13 of gestation) can be rapidly assessed by fixing in 1% formaldehyde, 0.2% glutaraldehyde and 0.02% NP40 in PBS at 4°C for 30 minutes, and staining overnight at 30°C in 5mM potassium ferricyanide, 5mM potassium ferrocyanide and 1 mg/ml X-gal (5-bromo-4-chloro-3-indolyl-β-D-galactoside). In this manner it is possible to scan an entire fetus for color by examination under a dissecting microscope. The specimen can be further analyzed by embedding in an acrylic resin and sectioning. Absence of stain in internal regions of the embryo or of individual organs should be viewed with caution, since this may reflect a problem with X-gal permeability rather than an absence of β-galactosidase expression. This difficulty can be resolved as described below.
To assay for β-galactosidase at a finer level than as above, whole embryos or individual organs of adult, lacZ-containing mice will be surgically removed and quick-frozen by submersion in liquid nitrogen. Specimens can be stored at -70°C until ready for further processing. Serial 50 µm sections will be cut using a digital Leitz cryostat 1720. Alternatively, tissue will be fixed and cut into 150 µm sections using a vibratome (Ted Pella, Inc.). When solely determining the distribution of β-galactosidase expression or the number of β-galactosidase positive foci present in an organ or tissue, the thicker sections will be used. About 100 to 150 such sections, for example, will encompass an entire adult liver, rendering analysis quick and easy. When requiring finer resolution or when determining the molecular nature of a mutation, thinner sections will be used.
Example 11: Production of Transgenic Mice Expressing a Bacterial lacZ Gene
We will generate a construct that contains the bacterial lacZ gene directed by a murine aprt promoter associated with lac operator sequences. For in vitro selection purposes, a neo gene (G418 resistance) will also be incorporated into the construct.
The constructs will be generated as described in Figure 6. The promoterless β-galactosidase gene with 3' SV40 processing signals is bounded by HindIII (5') and XhoI (3') sites in the Bluescript-based plasmid pLZ6.6 (Figures 5B and 6A). We have synthesized two complementary 28-mer oligonucleotides with overhangs cohesive with BamHI on the 5' side and with HindIII on the 3' side. The core of this sequence is an 18 bp palindrome that behaves as a mutant operator which binds lac repressor about 8 times more tightly than the wild-type operator sequence (Brown, et al., Cell, 49:603-612, 1987). The oligonucleotides will be hybridized and directionally cloned into the unique BamHI/HindIII sites upstream of the lacZ gene (Figure 6B). A 396 bp fragment extending from 6 nucleotides upstream of the APRT translation start codon (position -6) to position -402, encompassing the entire aprt promoter (Dush, et al., Nucleic Acids Res., 16:8509-8524, 1988) and flanked by BamHI linkers, will be cloned immediately upstream of the operator sequence, and its orientation determined by the position of an asymmetrically located SmaI site (Figure 6C). As a last step, a neo marker driven by an HSV tk promoter and flanked by XhoI sites will be inserted at the unique XhoI site in either orientation (Figure 6D) for selection purposes, and the resultant plasmid designated pAPlacOZneo.
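The oligonucleotide design described above (a palindromic core, a 24 bp duplex, and 4-nucleotide cohesive ends giving two 28-mers) can be checked with a few lines of code. This is a sketch under stated assumptions: the 18 bp core shown is an illustrative palindrome and the 3-nucleotide flanks are placeholders, since the specification does not list the sequences actually synthesized. The overhangs used are the standard 5'-GATC (BamHI-compatible) and 5'-AGCT (HindIII-compatible) ends.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_palindromic(seq: str) -> bool:
    """True if the sequence reads the same 5'->3' on both strands."""
    return seq.upper() == reverse_complement(seq)


# Illustrative 18-bp palindrome (an assumption; the specification does not
# list the operator sequence that was synthesized).
core = "ATTGTGAGCGCTCACAAT"

# Hypothetical 3-nt flanks bring the double-stranded region to 24 bp.
duplex = "CTG" + core + "AAC"

# Top strand carries the 5'-GATC overhang (BamHI-compatible); the bottom
# strand carries the 5'-AGCT overhang (HindIII-compatible).
top = "GATC" + duplex
bottom = "AGCT" + reverse_complement(duplex)

print(is_palindromic(core), len(top), len(bottom))
```

Because the two overhangs differ, the annealed duplex can ligate into BamHI/HindIII-cut vector in only one orientation, which is what "directionally cloned" refers to.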
Since the operator can be positioned at several sites within the SV40 promoter region and retain its regulatory capacity at each position (Hu, et al., Cell, 48:555-566, 1987; Brown, et al., Cell, 49:603-612, 1987), we anticipate that the above construct will serve our purposes and that the operator will extinguish β-galactosidase expression in the presence of repressor. The ability of the construct to express β-galactosidase and to suppress its expression in the presence of repressor will be tested in cultured mammalian cells, as described in a later section. However, alternative strategies can be exploited should, for example, repression due to the position of the operator be leaky. The construct described above places the operator adjacent to the translation start codon and bp and 122 bp downstream of the two major transcription initiation sites (Figure 7) (Dush, et al., Nucleic Acids Res., 16:8509-8524, 1988). Although the subcloned promoter fragment used in these experiments is 396 bp long, we have shown by deletion analysis that only the proximal-most 160 bp is needed for full promoter activity. Within this sequence there are four sites that bind transcription factor Sp1 in vitro and two major transcription start sites (Dush, et al., Nucleic Acids Res., 16:8509-8524, 1988). There are also unique restriction sites within the promoter fragment that should serve as useful insertion sites for the lac operator sequences (Figure 7). These sites are within and immediately adjacent to the Sp1 binding domain and, in the presence of repressor, should interfere with transcription factor binding. To generate these "operator-modified" aprt promoter constructs, complementary oligonucleotides encoding operator sequence with appropriate cohesive ends will be synthesized and inserted at the sites indicated in Figure 7. Following conversion of the 3' end of the modified aprt promoter to a HindIII site, the fragment will replace the aprt/lacO promoter construct in Figure 6D and will be inserted into the resulting BamHI/HindIII site immediately preceding the lacZ gene.
Example 12: Proposed Assay Protocol
The system may be validated and optimized by testing various known and suspected mutagens and/or carcinogens.
One example is N-nitrosoethylurea (NEU), a transplacental mutagen of the N-nitroso family of carcinogenic agents. It causes neurogenic tumors in a variety of species, including mice (Rice, et al., Ann. N.Y. Acad. Sci., 381:274-289, 1982), and papillary lung tumors in the progeny of pregnant females exposed to the agent (Rehm, et al., Cancer Res., 48:148-160, 1988). Since there is no real precedent to follow, we will first expose fetal mice to NEU via i.p. injection of the mother with 0.1 mmol to 0.5 mmol NEU per kg on gestational days 14, 16 and 18. These parameters were selected since they were most effective in producing lung tumors (Rehm, et al., Cancer Res., 48:148-160, 1988). Although the half-life of NEU in vivo may be no more than 10 minutes, a single dose appeared sufficient. However, if necessary, multiple injections may be administered. Pregnant mice mock injected with vehicle alone (i.e., trioctanoin, in which the NEU is dissolved) will serve as controls. Organs and tissues from mice at ages ranging from neonates to one year old will be removed, fixed, sectioned and stained for β-galactosidase as described earlier. Initially, we will use 6 mice from different litters at 1 wk, 6 wk, 20 wk and 1 yr of age for each time and dose of NEU administration, which is equivalent to the number used to unequivocally detect an association with lung tumors. Analysis of stained sections should define which organ(s) and tissue(s) in progeny mice are most susceptible to mutation following i.p. administration of NEU to the mother. It will also define the gestational age at which the fetus is most susceptible to this agent, and will establish the presence or absence of a dose response relationship between the amount of NEU administered and the number of stained foci per organ that one detects.
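The dosing and group-size figures in this protocol can be worked through numerically. The sketch below rests on several assumptions that are not in the specification: a 40 g body mass for the pregnant mouse, simple first-order decay using the 10-minute half-life as an upper bound, and a reading of "each time and dose" in which "time" means the gestational day of injection. The molar mass is that of N-ethyl-N-nitrosourea (C3H7N3O2).

```python
NEU_MOLAR_MASS_G_PER_MOL = 117.11  # N-ethyl-N-nitrosourea, C3H7N3O2


def neu_dose_mg(dose_mmol_per_kg: float, body_mass_g: float) -> float:
    """Mass of NEU (mg) for one injection at the given dose and body mass."""
    body_mass_kg = body_mass_g / 1000.0
    # mmol/kg * kg = mmol; mmol * g/mol = mg
    return dose_mmol_per_kg * body_mass_kg * NEU_MOLAR_MASS_G_PER_MOL


def fraction_remaining(minutes: float, half_life_min: float = 10.0) -> float:
    """Fraction of compound left after first-order decay (assumed model)."""
    return 0.5 ** (minutes / half_life_min)


# Hypothetical 40 g pregnant mouse; the dose range is taken from the text.
for dose in (0.1, 0.5):
    print(f"{dose} mmol/kg -> {neu_dose_mg(dose, 40.0):.2f} mg NEU")

# Offspring needed per dose level: 6 mice x 4 ages x 3 gestational days.
ages = ("1 wk", "6 wk", "20 wk", "1 yr")
gestational_days = (14, 16, 18)
mice_per_dose_level = 6 * len(ages) * len(gestational_days)
print(mice_per_dose_level, "offspring per dose level")
print(f"{fraction_remaining(60):.3f} of the dose left after 60 min")
```

Under this reading, each additional dose level adds the same number of offspring again, plus the vehicle-only controls.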
A second class of carcinogen which may be tested is the aromatic amines, whose carcinogenic characteristics have been recognized since the turn of the century. These compounds have been widely used in industry; see Haley, et al., Handbook of Carcinogens and Hazardous Substances, eds. M.C. Bowman, Marcel Dekker, Inc., New York, Basel, 1982. For example, the DuPont Company has screened workers exposed to β-naphthylamine for the occurrence of bladder cancer (Mason, et al., J. Occup. Med., 28:1011-1016, 1986). Other studies (e.g., Schulte, et al., Cancer, 58:2156-2162, 1986) have also indicated an association between exposure to β-naphthylamine and bladder cancer. In the proposed experiments, tester mice will be individually exposed to varying amounts of aromatic amines, including β-naphthylamine, 2-acetylaminofluorene or benzidine, all of which, following metabolic activation, are carcinogenic in vivo (Haley, et al., Handbook of Carcinogens and Hazardous Substances, eds. M.C. Bowman, Marcel Dekker, Inc., New York, Basel, 1982). In initial studies, young mice will be used to facilitate identification of stained foci in target organs. In adults, target organs such as liver lack dividing cells, so that a mutation in the lacI gene will be manifested as only a single stained cell. By using immature animals at an age when mitotic division is still active in most organs, such as liver, larger foci will be evident due to a lacI mutation in a progenitor cell and all of its progeny. To ascertain whether or not mutagenic activity may be transmitted via milk, and whether the route of carcinogen administration affects which organ(s) serve as a target, nursing mothers will be injected i.p. or i.v. immediately after birth of a litter of tester mice, and offspring will be analyzed at 6 to 10 weeks of age. Likewise, newly weaned mice will be exposed to increasing amounts of one of the above aromatic amines (each will be separately tested). Administration will be as single or multiple doses given i.p., i.v. or orally (gavage), again to determine the target organ(s) and to ask whether they differ according to route of administration. Mice will be sacrificed at times up to 8 weeks after the last administration, and individual organs sectioned, stained and analyzed as above. We will begin with about 6 to 10 mice for each regimen of mutagen administration. This number may be increased if so required. Lastly, we wish to ascertain the transplacental mutagenic potential of thermal derivatives of polychlorinated biphenyls
(PCBs). This objective is prompted by a recent report of congenital poisoning in Taiwan in 1978 by cooking oil contaminated with thermally degraded PCBs (Rogan, et al., Science, 241:334-336, 1988). Affected offspring of mothers who had ingested the contaminated rice-bran oil manifested a spectrum of congenital defects. It may be too early to tell whether or not these children will exhibit a higher than normal incidence of tumors. It will be instructive to establish whether or not PCB thermal derivatives are mutagenic to fetuses of pregnant mice that ingest these agents, and if so, whether or not 1) the affected tissue(s) is of ectodermal origin, as appears to be the case in man, and 2) tissues that incur mutations in the lacI transgene later selectively give rise to tumors. To this end, PCBs will be dissolved in cooking oil at about 100 ppm (Chen, et al., Am. J. Ind. Med., 5:133-145, 1984). Also, polychlorinated dibenzofuran (PCDF), which may have been the most active ingredient in the tainted oil associated with the Taiwan outbreak (Rogan, et al., Science, 241:334-336, 1988; Chen, et al., Am. J. Ind. Med., 5:133-145, 1984), will be dissolved in cooking oil at 0.1 ppm. The oils will be heated to 180°C and (after cooling) will be orally administered to pregnant mice carrying tester fetuses. The offspring will be analyzed for teratological abnormalities and for organs and tissues that manifest lacI mutations, as described before.
Miscellaneous
For molecular biology or immunology techniques not already described, see Sambrook, et al., Molecular Cloning: A Laboratory Manual (2d ed. 1989), and Harlow, et al., Antibodies: A Laboratory Manual (1988).
All references cited in this specification are hereby incorporated by reference, to the extent pertinent.
Table 1: Codon Usage Table for Wild-Type LacI Gene (SEQ ID NO:1)
(CpG codons are underlined.)
Table 2: Codon Usage Table for CpG-depleted LacIMlR Gene (SEQ ID NO:3)
(CpG codons are underlined.)
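The quantities behind these tables (how many CpG dinucleotides a gene carries, and which codons contain one) can be computed directly from a sequence. The following is a minimal sketch, not the procedure used to prepare the tables: the function names are hypothetical, and the observed/expected ratio shown is the commonly used (#CpG x length) / (#C x #G) measure, which the specification does not itself prescribe. The two test strings are the first 16 codons of SEQ ID NO:1 and the corresponding codons of SEQ ID NO:3 from the sequence listing.

```python
from collections import Counter


def cpg_count(seq: str) -> int:
    """Number of CpG dinucleotides (5'-CG-3') on the given strand."""
    return seq.upper().count("CG")


def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio: (#CG * length) / (#C * #G)."""
    s = seq.upper()
    n_c, n_g = s.count("C"), s.count("G")
    if n_c == 0 or n_g == 0:
        return 0.0
    return cpg_count(s) * len(s) / (n_c * n_g)


def cpg_codons(cds: str) -> Counter:
    """Tally in-frame codons that contain CG within the codon itself."""
    s = cds.upper()
    codons = (s[i:i + 3] for i in range(0, len(s) - len(s) % 3, 3))
    return Counter(c for c in codons if "CG" in c)


# First 16 codons of SEQ ID NO:1 and the corresponding codons of SEQ ID NO:3.
WT_FRAGMENT = "GTGAAACCAGTAACGTTATACGATGTCGCAGAGTATGCCGGTGTCTCT"
DEPLETED_FRAGMENT = "ATGAAACCAGTAACATTGTATGATGTTGCAGAGTATGCCGGTGTCTCT"

for name, seq in (("wild-type", WT_FRAGMENT), ("CpG-depleted", DEPLETED_FRAGMENT)):
    print(name, cpg_count(seq), round(cpg_obs_exp(seq), 2), dict(cpg_codons(seq)))
```

Note that a per-codon tally only captures CpGs lying inside a codon; a CpG can also straddle two adjacent codons (one ending in C, the next beginning with G), which is why the whole-sequence count is reported separately.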
SEQUENCE LISTING
(1) GENERAL INFORMATION:
(i) APPLICANT: Stambrook, Peter J.
(ii) TITLE OF INVENTION: MUTAGENICITY TESTING USING REPORTER GENES WITH MODIFIED METHYLATION FREQUENCIES
(iii) NUMBER OF SEQUENCES: 8
(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: Browdy and Neimark
(B) STREET: 419 Seventh Street, N.W., Suite 300
(C) CITY: Washington
(D) STATE: D.C.
(E) COUNTRY: USA
(F) ZIP: 20004
(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.25
(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER:
(B) FILING DATE:
(C) CLASSIFICATION:
(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: 07/842,644
(B) FILING DATE: 02-FEB-1992
(C) CLASSIFICATION:
(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: Cooper, Iver P.
(B) REGISTRATION NUMBER: 28,005
(C) REFERENCE/DOCKET NUMBER: STAMBROOK 1
(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: 202-628-5197
(B) TELEFAX: 202-737-3528
(C) TELEX: 248633
(2) INFORMATION FOR SEQ ID NO:1:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1082 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(ix) FEATURE :
(A) NAME/KEY: CDS
(B) LOCATION: 1..1082
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:
GTG AAA CCA GTA ACG TTA TAC GAT GTC GCA GAG TAT GCC GGT GTC TCT 48
Val Lys Pro Val Thr Leu Tyr Asp Val Ala Glu Tyr Ala Gly Val Ser
1 5 10 15
TAT CAG ACC GTT TCC CGC GTG GTG AAC CAG GCC AGC CAC GTT TCT GCG 96
Tyr Gln Thr Val Ser Arg Val Val Asn Gln Ala Ser His Val Ser Ala
20 25 30
AAA ACG CGG GAA AAA GTG GAA GCG GCG ATG GCG GAG CTG AAT TAC ATT 144
Lys Thr Arg Glu Lys Val Glu Ala Ala Met Ala Glu Leu Asn Tyr Ile
35 40 45
CCC AAC CGC GTG GCA CAA CAA CTG GCG GGC AAA CAG TCG TTG CTG ATT 192
Pro Asn Arg Val Ala Gln Gln Leu Ala Gly Lys Gln Ser Leu Leu Ile
50 55 60
GGC GTT GCC ACC TCC AGT CTG GCC CTG CAC GCG CCG TCG CAA ATT GTC 240
Gly Val Ala Thr Ser Ser Leu Ala Leu His Ala Pro Ser Gln Ile Val
65 70 75 80
GCG GCG ATT AAA TCT CGC GCC GAT CAA CTG GGT GCC AGC GTG GTG GTG 288
Ala Ala Ile Lys Ser Arg Ala Asp Gln Leu Gly Ala Ser Val Val Val
85 90 95
TCG ATG GTA GAA CGA AGC GGC GTC GAA GCC TGT AAA GCG GCG GTG CAC 336
Ser Met Val Glu Arg Ser Gly Val Glu Ala Cys Lys Ala Ala Val His
100 105 110
AAT CTT CTC GCG CAA CGC GTC AGT GGG CTG ATC ATT AAC TAT CCG CTG 384
Asn Leu Leu Ala Gln Arg Val Ser Gly Leu Ile Ile Asn Tyr Pro Leu
115 120 125
GAT GAC CAG GAT GCC ATT GCT GTG GAA GCT GCC TGC ACT AAT GTT CCG 432
Asp Asp Gln Asp Ala Ile Ala Val Glu Ala Ala Cys Thr Asn Val Pro
130 135 140
GCG TTA TTT CTT GAT GTC TCT GAC CAG ACA CCC ATC AAC AGT ATT ATT 480
Ala Leu Phe Leu Asp Val Ser Asp Gln Thr Pro Ile Asn Ser Ile Ile
145 150 155 160
TTC TCC CAT GAA GAC GGT ACG CGA CTG GGC GTG GAG CAT CTG GTC GCA 528
Phe Ser His Glu Asp Gly Thr Arg Leu Gly Val Glu His Leu Val Ala
165 170 175
TTG GGT CAC CAG CAA ATC GCG CTG TTA GCG GGC CCA TTA AGT TCT GTC 576
Leu Gly His Gln Gln Ile Ala Leu Leu Ala Gly Pro Leu Ser Ser Val
180 185 190
TCG GCG CGT CTG CGT CTG GCT GGC TGG CAT AAA TAT CTC ACT CGC AAT 624
Ser Ala Arg Leu Arg Leu Ala Gly Trp His Lys Tyr Leu Thr Arg Asn
195 200 205
CAA ATT CAG CCG ATA GCG GAA CGG GAA GGC GAC TGG AGT GCC ATG TCC 672
Gln Ile Gln Pro Ile Ala Glu Arg Glu Gly Asp Trp Ser Ala Met Ser
210 215 220
GGT TTT CAA CAA ACC ATG CAA ATG CTG AAT GAG GGC ATC GTT CCC ACT 720
Gly Phe Gln Gln Thr Met Gln Met Leu Asn Glu Gly Ile Val Pro Thr
225 230 235 240
GCG ATG CTG GTT GCC AAC GAT CAG ATG GCG CTG GGC GCA ATG CGC GCC 768
Ala Met Leu Val Ala Asn Asp Gln Met Ala Leu Gly Ala Met Arg Ala
245 250 255
ATT ACC GAG TCC GGG CTG CGC GTT GGT GCG GAT ATC TCG GTA GTG GGA 816
Ile Thr Glu Ser Gly Leu Arg Val Gly Ala Asp Ile Ser Val Val Gly
260 265 270
TAC GAC GAT ACC GAA GAC AGC TCA TGT TAT ATC CCG CCG TCA ACC ACC 864
Tyr Asp Asp Thr Glu Asp Ser Ser Cys Tyr Ile Pro Pro Ser Thr Thr
275 280 285
ATC AAA CAG GAT TTT CGC CTG CTG GGG CAA ACC AGC GTG GAC CGC TTG 912
Ile Lys Gln Asp Phe Arg Leu Leu Gly Gln Thr Ser Val Asp Arg Leu
290 295 300
CTG CAA CTC TCT CAG GGC CAG GCG GTG AAG GGC AAT CAG CTG TTG CCC 960
Leu Gln Leu Ser Gln Gly Gln Ala Val Lys Gly Asn Gln Leu Leu Pro
305 310 315 320
GTC TCA CTG GTG AAA AGA AAA ACC ACC CTG GCG CCC AAT ACG CAA ACC 1008
Val Ser Leu Val Lys Arg Lys Thr Thr Leu Ala Pro Asn Thr Gln Thr
325 330 335
GCC TCT CCC CGC GCG TTG GCC GAT TCA TTA ATG CAG CTG GCA CGA CAG 1056
Ala Ser Pro Arg Ala Leu Ala Asp Ser Leu Met Gln Leu Ala Arg Gln
340 345 350
GTT TCC CGA CTG GAA AGC GGG CAG TG 1082
Val Ser Arg Leu Glu Ser Gly Gln
355 360
(2) INFORMATION FOR SEQ ID NO:2:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 360 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:
Val Lys Pro Val Thr Leu Tyr Asp Val Ala Glu Tyr Ala Gly Val Ser
1 5 10 15
Tyr Gln Thr Val Ser Arg Val Val Asn Gln Ala Ser His Val Ser Ala
20 25 30
Lys Thr Arg Glu Lys Val Glu Ala Ala Met Ala Glu Leu Asn Tyr Ile
35 40 45
Pro Asn Arg Val Ala Gln Gln Leu Ala Gly Lys Gln Ser Leu Leu Ile
50 55 60
Gly Val Ala Thr Ser Ser Leu Ala Leu His Ala Pro Ser Gln Ile Val
65 70 75 80
Ala Ala Ile Lys Ser Arg Ala Asp Gln Leu Gly Ala Ser Val Val Val
85 90 95
Ser Met Val Glu Arg Ser Gly Val Glu Ala Cys Lys Ala Ala Val His
100 105 110
Asn Leu Leu Ala Gln Arg Val Ser Gly Leu Ile Ile Asn Tyr Pro Leu
115 120 125
Asp Asp Gln Asp Ala Ile Ala Val Glu Ala Ala Cys Thr Asn Val Pro
130 135 140
Ala Leu Phe Leu Asp Val Ser Asp Gln Thr Pro Ile Asn Ser Ile Ile
145 150 155 160
Phe Ser His Glu Asp Gly Thr Arg Leu Gly Val Glu His Leu Val Ala
165 170 175
Leu Gly His Gln Gln Ile Ala Leu Leu Ala Gly Pro Leu Ser Ser Val
180 185 190
Ser Ala Arg Leu Arg Leu Ala Gly Trp His Lys Tyr Leu Thr Arg Asn
195 200 205
Gln Ile Gln Pro Ile Ala Glu Arg Glu Gly Asp Trp Ser Ala Met Ser
210 215 220
Gly Phe Gln Gln Thr Met Gln Met Leu Asn Glu Gly Ile Val Pro Thr
225 230 235 240
Ala Met Leu Val Ala Asn Asp Gln Met Ala Leu Gly Ala Met Arg Ala
245 250 255
Ile Thr Glu Ser Gly Leu Arg Val Gly Ala Asp Ile Ser Val Val Gly
260 265 270
Tyr Asp Asp Thr Glu Asp Ser Ser Cys Tyr Ile Pro Pro Ser Thr Thr
275 280 285
Ile Lys Gln Asp Phe Arg Leu Leu Gly Gln Thr Ser Val Asp Arg Leu
290 295 300
Leu Gln Leu Ser Gln Gly Gln Ala Val Lys Gly Asn Gln Leu Leu Pro
305 310 315 320
Val Ser Leu Val Lys Arg Lys Thr Thr Leu Ala Pro Asn Thr Gln Thr
325 330 335
Ala Ser Pro Arg Ala Leu Ala Asp Ser Leu Met Gln Leu Ala Arg Gln
340 345 350
Val Ser Arg Leu Glu Ser Gly Gln
355 360
(2) INFORMATION FOR SEQ ID NO:3:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1129 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 10..1122
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:
AGCTTCACC ATG AAA CCA GTA ACA TTG TAT GAT GTT GCA GAG TAT GCC 48
Met Lys Pro Val Thr Leu Tyr Asp Val Ala Glu Tyr Ala
1 5 10
GGT GTC TCT TAT CAG ACT GTT TCC AGA GTG GTG AAC CAG GCC AGC CAT 96
Gly Val Ser Tyr Gln Thr Val Ser Arg Val Val Asn Gln Ala Ser His
15 20 25
GTT TCT GCC AAA ACC AGG GAA AAA GTG GAA GCA GCC ATG GCA GAG CTG 144
Val Ser Ala Lys Thr Arg Glu Lys Val Glu Ala Ala Met Ala Glu Leu
30 35 40 45
AAT TAC ATT CCC AAC AGA GTG GCA CAA CAA CTG GCA GGC AAA CAG AGC 192
Asn Tyr Ile Pro Asn Arg Val Ala Gln Gln Leu Ala Gly Lys Gln Ser
50 55 60
TTG CTG ATT GGA GTT GCC ACC TCC AGT CTG GCC CTG CAT GCA CCA TCT 240
Leu Leu Ile Gly Val Ala Thr Ser Ser Leu Ala Leu His Ala Pro Ser
65 70 75
CAA ATT GTG GCA GCC ATT AAA TCT AGA GCT GAT CAA CTG GGA GCC TCT 288
Gln Ile Val Ala Ala Ile Lys Ser Arg Ala Asp Gln Leu Gly Ala Ser
80 85 90
GTG GTG GTG TCA ATG GTA GAA AGA AGT GGA GTT GAA GCC TGT AAA GCT 336
Val Val Val Ser Met Val Glu Arg Ser Gly Val Glu Ala Cys Lys Ala
95 100 105
GCA GTG CAC AAT CTT CTG GCA CAA AGA GTC AGT GGG CTG ATC ATT AAC 384
Ala Val His Asn Leu Leu Ala Gln Arg Val Ser Gly Leu Ile Ile Asn
110 115 120 125
TAT CCA CTG GAT GAC CAG GAT GCC ATT GCT GTG GAA GCT GCC TGC ACT 432
Tyr Pro Leu Asp Asp Gln Asp Ala Ile Ala Val Glu Ala Ala Cys Thr
130 135 140
AAT GTT CCA GCA CTC TTT CTT GAT GTC TCT GAC CAG ACA CCC ATC AAC 480
Asn Val Pro Ala Leu Phe Leu Asp Val Ser Asp Gln Thr Pro Ile Asn
145 150 155
AGT ATT ATT TTC TCC CAT GAA GAT GGT ACA AGA CTG GGT GTG GAG CAT 528
Ser Ile Ile Phe Ser His Glu Asp Gly Thr Arg Leu Gly Val Glu His
160 165 170
CTG GTT GCA TTG GGA CAC CAG CAA ATT GCA CTG CTT GCG GGC CCA CTC 576
Leu Val Ala Leu Gly His Gln Gln Ile Ala Leu Leu Ala Gly Pro Leu
175 180 185
AGT TCT GTC TCA GCA AGG CTG AGA CTG GCC GGC TGG CAT AAA TAT CTC 624
Ser Ser Val Ser Ala Arg Leu Arg Leu Ala Gly Trp His Lys Tyr Leu
190 195 200 205
ACT AGG AAT CAA ATT CAG CCA ATA GCT GAA AGA GAA GGG GAC TGG AGT 672
Thr Arg Asn Gln Ile Gln Pro Ile Ala Glu Arg Glu Gly Asp Trp Ser
210 215 220
GCC ATG TCT GGG TTT CAA CAA ACC ATG CAA ATG CTG AAT GAG GGC ATT 720
Ala Met Ser Gly Phe Gln Gln Thr Met Gln Met Leu Asn Glu Gly Ile
225 230 235
GTT CCC ACT GCA ATG CTG GTT GCC AAT GAT CAG ATG GCA CTG GGT GCA 768
Val Pro Thr Ala Met Leu Val Ala Asn Asp Gln Met Ala Leu Gly Ala
240 245 250
ATG AGA GCC ATT ACT GAG TCT GGG CTG AGA GTT GGT GCA GAT ATC TCA 816
Met Arg Ala Ile Thr Glu Ser Gly Leu Arg Val Gly Ala Asp Ile Ser
255 260 265
GTA GTG GGA TAT GAT GAT ACT GAA GAC AGC TCA TGT TAT ATC CCA CCC 864
Val Val Gly Tyr Asp Asp Thr Glu Asp Ser Ser Cys Tyr Ile Pro Pro
270 275 280 285
TCA ACC ACC ATC AAA CAA GAT TTT AGA CTG CTG GGG CAA ACC AGT GTG 912
Ser Thr Thr Ile Lys Gln Asp Phe Arg Leu Leu Gly Gln Thr Ser Val
290 295 300
GAC AGA TTG CTG CAA CTC TCT CAA GGC CAA GCA GTG AAG GGC AAT CAG 960
Asp Arg Leu Leu Gln Leu Ser Gln Gly Gln Ala Val Lys Gly Asn Gln
305 310 315
CTG TTG CCA GTC TCA CTG GTG AAG AGA AAA ACC ACC CTG GCA CCC AAT 1008
Leu Leu Pro Val Ser Leu Val Lys Arg Lys Thr Thr Leu Ala Pro Asn
320 325 330
ACA CAA ACT GCC TCT CCC CGG GCA TTG GCT GAT TCA CTC ATG CAG CTA 1056
Thr Gln Thr Ala Ser Pro Arg Ala Leu Ala Asp Ser Leu Met Gln Leu
335 340 345
GCA AGA CAG GTT TCC AGA CTG GAA AGT GGG CAG GCA GCT CTG CCC AAG 1104
Ala Arg Gln Val Ser Arg Leu Glu Ser Gly Gln Ala Ala Leu Pro Lys
350 355 360 365
AAG AAG CGA AAG GTG TGATAGGATC 1129
Lys Lys Arg Lys Val
370
(2) INFORMATION FOR SEQ ID NO:4:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 370 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:
Met Lys Pro Val Thr Leu Tyr Asp Val Ala Glu Tyr Ala Gly Val Ser
1 5 10 15
Tyr Gln Thr Val Ser Arg Val Val Asn Gln Ala Ser His Val Ser Ala
20 25 30
Lys Thr Arg Glu Lys Val Glu Ala Ala Met Ala Glu Leu Asn Tyr Ile
35 40 45
Pro Asn Arg Val Ala Gln Gln Leu Ala Gly Lys Gln Ser Leu Leu Ile
50 55 60
Gly Val Ala Thr Ser Ser Leu Ala Leu His Ala Pro Ser Gln Ile Val
65 70 75 80
Ala Ala Ile Lys Ser Arg Ala Asp Gln Leu Gly Ala Ser Val Val Val
85 90 95
Ser Met Val Glu Arg Ser Gly Val Glu Ala Cys Lys Ala Ala Val His
100 105 110
Asn Leu Leu Ala Gln Arg Val Ser Gly Leu Ile Ile Asn Tyr Pro Leu
115 120 125
Asp Asp Gln Asp Ala Ile Ala Val Glu Ala Ala Cys Thr Asn Val Pro
130 135 140
Ala Leu Phe Leu Asp Val Ser Asp Gln Thr Pro Ile Asn Ser Ile Ile
145 150 155 160
Phe Ser His Glu Asp Gly Thr Arg Leu Gly Val Glu His Leu Val Ala
165 170 175
Leu Gly His Gln Gln Ile Ala Leu Leu Ala Gly Pro Leu Ser Ser Val
180 185 190
Ser Ala Arg Leu Arg Leu Ala Gly Trp His Lys Tyr Leu Thr Arg Asn
195 200 205
Gln Ile Gln Pro Ile Ala Glu Arg Glu Gly Asp Trp Ser Ala Met Ser
210 215 220
Gly Phe Gln Gln Thr Met Gln Met Leu Asn Glu Gly Ile Val Pro Thr
225 230 235 240
Ala Met Leu Val Ala Asn Asp Gln Met Ala Leu Gly Ala Met Arg Ala
245 250 255
Ile Thr Glu Ser Gly Leu Arg Val Gly Ala Asp Ile Ser Val Val Gly
260 265 270
Tyr Asp Asp Thr Glu Asp Ser Ser Cys Tyr Ile Pro Pro Ser Thr Thr
275 280 285
Ile Lys Gln Asp Phe Arg Leu Leu Gly Gln Thr Ser Val Asp Arg Leu
290 295 300
Leu Gln Leu Ser Gln Gly Gln Ala Val Lys Gly Asn Gln Leu Leu Pro
305 310 315 320
Val Ser Leu Val Lys Arg Lys Thr Thr Leu Ala Pro Asn Thr Gln Thr
325 330 335
Ala Ser Pro Arg Ala Leu Ala Asp Ser Leu Met Gln Leu Ala Arg Gln
340 345 350
Val Ser Arg Leu Glu Ser Gly Gln Ala Ala Leu Pro Lys Lys Lys Arg
355 360 365
Lys Val
370
(2) INFORMATION FOR SEQ ID NO:5:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 459 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..456
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:
ATG AGC GAA AAA TAC ATC GTC ACC TGG GAC ATG TTG CAG ATC CAT GCA
Met Ser Glu Lys Tyr Ile Val Thr Trp Asp Met Leu Gln Ile His Ala
1 5 10 15
CGT AAA CTC GCA AGC CGA CTG ATG CCT TCT GAA CAA TGG AAA GGC ATT
Arg Lys Leu Ala Ser Arg Leu Met Pro Ser Glu Gln Trp Lys Gly Ile
20 25 30
ATT GCC GTA AGT CGT GGC GGT CTG GTA CCG GGT GCG TTA CTG GCG CGT
Ile Ala Val Ser Arg Gly Gly Leu Val Pro Gly Ala Leu Leu Ala Arg
35 40 45
GAA CTG GGT ATT CGT CAT GTC GAT ACC GTT TGT ATT TCC AGC TAC GAT
Glu Leu Gly Ile Arg His Val Asp Thr Val Cys Ile Ser Ser Tyr Asp
50 55 60
CAC GAC AAC CAG CGC GAG CTT AAA GTG CTG AAA CGC GCA GAA GGC GAT
His Asp Asn Gln Arg Glu Leu Lys Val Leu Lys Arg Ala Glu Gly Asp
65 70 75 80
GGC GAA GGC TTC ATC GTT ATT GAT GAC CTG GTG GAT ACC GGT GGT ACT
Gly Glu Gly Phe Ile Val Ile Asp Asp Leu Val Asp Thr Gly Gly Thr
85 90 95
GCG GTT GCG ATT CGT GAA ATG TAT CCA AAA GCG CAC TTT GTC ACC ATC
Ala Val Ala Ile Arg Glu Met Tyr Pro Lys Ala His Phe Val Thr Ile
100 105 110
TTC GCA AAA CCG GCT GGT CGT CCG CTG GTT GAT GAC TAT GTT GTT GAT
Phe Ala Lys Pro Ala Gly Arg Pro Leu Val Asp Asp Tyr Val Val Asp
115 120 125
ATC CCG CAA GAT ACC TGG ATT GAA CAG CCG TGG GAT ATG GGC GTC GTA
Ile Pro Gln Asp Thr Trp Ile Glu Gln Pro Trp Asp Met Gly Val Val
130 135 140
TTC GTC CCG CCA ATC TCC GGT CGC TAA
Phe Val Pro Pro Ile Ser Gly Arg
145 150
(2) INFORMATION FOR SEQ ID NO:6:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 152 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:
Met Ser Glu Lys Tyr Ile Val Thr Trp Asp Met Leu Gln Ile His Ala
1 5 10 15
Arg Lys Leu Ala Ser Arg Leu Met Pro Ser Glu Gln Trp Lys Gly Ile
20 25 30
Ile Ala Val Ser Arg Gly Gly Leu Val Pro Gly Ala Leu Leu Ala Arg
35 40 45
Glu Leu Gly Ile Arg His Val Asp Thr Val Cys Ile Ser Ser Tyr Asp
50 55 60
His Asp Asn Gln Arg Glu Leu Lys Val Leu Lys Arg Ala Glu Gly Asp
65 70 75 80
Gly Glu Gly Phe Ile Val Ile Asp Asp Leu Val Asp Thr Gly Gly Thr
85 90 95
Ala Val Ala Ile Arg Glu Met Tyr Pro Lys Ala His Phe Val Thr Ile
100 105 110
Phe Ala Lys Pro Ala Gly Arg Pro Leu Val Asp Asp Tyr Val Val Asp
115 120 125
Ile Pro Gln Asp Thr Trp Ile Glu Gln Pro Trp Asp Met Gly Val Val
130 135 140
Phe Val Pro Pro Ile Ser Gly Arg
145 150
(2) INFORMATION FOR SEQ ID NO:7:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 459 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:
ATGAGTGAAA AATACATAGT CACCTGGGAC ATGTTGCAGA TCCATGCAAG GAAACTGGCA
AGCAGACTGA TGCCTTCTGA ACAATGGAAA GGCATTATTG CAGTAAGCCG TGGAGGTCTG
GTACCGGGTG CATTACTGGC AAGAGAACTG GGTATTAGGC ATGTAGATAC TGTTTGTATT
TCCAGCTATG ATCATGACAA CCAGAGGGAG CTTAAAGTGC TGAAAAGAGC AGAAGGTGAT
GGTGAAGGCT TCATTGTTAT TGATGACCTG GTGGATACAG GTGGTACTGC AGTTGCAATT
AGGGAAATGT ATCCAAAAGC ACACTTTGTC ACCATCTTTG CAAAACCGGC TGGTAGACCC
CTGGTTGATG ACTATGTTGT TGATATCCCA CAAGATACCT GGATTGAACA GCCATGGGAT
ATGGGAGTGG TATTTGTCCC TCCAATCTCA GGTAGGTAA
(2) INFORMATION FOR SEQ ID NO:8:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 459 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:
TTACCTACCT GAGATTGGAG GGACAAATAC CACTCCCATA TCCCATGGCT GTTCAATCCA
GGTATCTTGT GGGATATCAA CAACATAGTC ATCAACCAGG GGTCTACCAG CCGGTTTTGC
AAAGATGGTG ACAAAGTGTG CTTTTGGATA CATTTCCCTA ATTGCAACTG CAGTACCACC
TGTATCCACC AGGTCATCAA TAACAATGAA GCCTTCACCA TCACCTTCTG CTCTTTTCAG
CACTTTAAGC TCCCTCTGGT TGTCATGATC ATAGCTGGAA ATACAAACAG TATCTACATG
CCTAATACCC AGTTCTCTTG CCAGTAATGC ACCCGGTACC AGACCTCCAC GGCTTACTGC
AATAACGCCT TTCCATTGTT CAGAAGGCAT CAGTCTGCTT CGCAGTTTCC TTGCATGGAT
CTGCAACATG TCCCAGGTGA CTATGTATTT TTCACTCAT
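As an illustrative cross-check (not part of the patent text), the minimal Python sketch below translates two 48-nucleotide fragments covering residues 81 to 96: one copied from the codon-by-codon listing that precedes SEQ ID NO:6, and one copied from the corresponding position of SEQ ID NO:7. It assumes the standard genetic code; the names `translate`, `REFERENCE_FRAGMENT`, and `SEQ7_FRAGMENT` are hypothetical helpers introduced here for illustration.

```python
# Standard genetic code, codons enumerated in T, C, A, G order
# (TTT, TTC, TTA, TTG, TCT, ...). "*" marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA string (whitespace ignored) to one-letter amino acids."""
    seq = "".join(dna.split()).upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


# Residues 81-96, copied from the listing that precedes SEQ ID NO:6.
REFERENCE_FRAGMENT = "GGC GAA GGC TTC ATC GTT ATT GAT GAC CTG GTG GAT ACC GGT GGT ACT"
# The same residues, copied from SEQ ID NO:7 (nucleotides 241-288).
SEQ7_FRAGMENT = "GGTGAAGGCT TCATTGTTAT TGATGACCTG GTGGATACAG GTGGTACT"

if __name__ == "__main__":
    print(translate(REFERENCE_FRAGMENT))
    print(translate(SEQ7_FRAGMENT))
    print(translate(REFERENCE_FRAGMENT) == translate(SEQ7_FRAGMENT))
```

Both fragments translate to the same sixteen residues (Gly Glu Gly Phe Ile Val Ile Asp Asp Leu Val Asp Thr Gly Gly Thr), so the nucleotide differences between them in this window are synonymous codon changes.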
Claims
1. In a method of assaying for the mutagenic or carcinogenic potential of a chemical, in which a heterologous marker gene is introduced into a nonhuman vertebrate animal, the animal is exposed to the chemical, and the presence or absence of a mutation in the marker gene caused by the chemical is assessed, the improvement comprising providing as the marker gene a wholly or partially synthetic gene engineered to have a frequency of occurrence of CpG dinucleotides which does not substantially exceed the frequency of occurrence of CpG dinucleotides in genes native to said vertebrate animal.
2. The method of claim 1 in which the marker gene is a wild-type gene of a heterologous organism, and the assay is for a forward mutation in said gene.
3. The method of any of claims 1 or 2 in which the marker is a mutated gene of a heterologous organism, and the assay is for a reversion mutation in said gene.
4. The method of any of claims 1-3 in which the presence or absence of the mutation is assessed by in vitro examination of a biological fluid or tissue of said animal for the presence or absence of a biochemical whose level therein is affected by the expression of said marker gene.
5. The method of claim 4 wherein the animal is sacrificed to permit such examination.
6. The method of any of claims 1-5 in which the presence or absence of the mutation is assessed by in vivo imaging of the animal to localize a biochemical whose level in a particular tissue or organ of the animal is affected by the expression of said marker gene.
7. The method of any of claims 1-6 in which the marker gene is a CpG-depleted wild-type or mutated bacterial gene.
8. The method of any of claims 1-6 in which the marker gene is a CpG-depleted wild-type or mutated plant gene.
9. The method of any of claims 1-6 in which the marker gene is a CpG-depleted wild-type or mutated invertebrate gene.
10. The method of any of claims 1-6 in which the marker gene is a CpG-depleted wild-type or mutated gene of a mammal of a species other than the animal into which the marker gene is introduced.
11. The method of any of claims 1-10 in which the animal is a mammal.
12. The method of any of claims 1-10 in which the animal is a fish.
13. The method of any of claims 1-12 in which the marker gene is a tumorigenic marker gene.
14. The method of any of claims 1-12 in which the marker gene is a toxin marker gene.
15. The method of any of claims 1-12 in which the marker gene is a hormonal marker gene.
16. The method of any of claims 1-12 in which the marker gene is an enzymatic marker gene.
17. The method of any of claims 1-12 in which the marker gene is an antigenic marker gene.
18. The method of any of claims 1-12 in which the marker gene is a regulatory marker gene.
19. The method of any of claims 1-12 in which the marker gene is an antibiotic resistance gene.
20. The method of any of claims 1-12 in which the marker gene is a CpG-depleted lacI or lacZ gene.
21. The method of any of claims 1-20 in which the CpG dinucleotide frequency in the marker gene does not substantially exceed 3%.
22. The method of any of claims 1-21 wherein the marker gene is controlled by a promoter whose expression in said animal is not tissue-specific.
23. The method of any of claims 1-21 wherein the promoter is selected from the group consisting of the beta actin gene, tRNA gene, ribosomal RNA gene and histone gene promoters.
24. The method of any of claims 1-21 wherein the marker gene is controlled by a tissue-specific promoter.
25. An expression vector comprising a CpG-depleted gene encoding a non-vertebrate peptide or protein, said CpG-depleted gene having a CpG dinucleotide frequency which is substantially less than that of a wild-type gene of the same non-vertebrate animal species that encodes said peptide or protein, said CpG-depleted gene being operably linked to a promoter functional in a vertebrate host.
26. The vector of claim 25 wherein the promoter is functional in a mammalian host.
27. The vector of any of claims 25-26 wherein the CpG frequency of the CpG-depleted gene is less than about 3%.
28. The vector of any of claims 25-27 wherein the CpG frequency of the CpG-depleted gene is less than half that of the wild-type gene.
29. The vector of any of claims 25-28 wherein said peptide or protein is a bacterial peptide or protein.
30. The vector of claim 29 wherein the gene is a lacI gene.
31. A nonhuman genetically engineered animal at least some of whose cells are transformed by an expression vector according to any of claims 25-30.
32. The method of any of claims 1-24 wherein the marker gene is a mutant of a naturally occurring gene whose frequency of occurrence of CpG dinucleotides does substantially exceed the frequency of occurrence of CpG dinucleotides in genes native to said vertebrate animal.
33. The method of claim 32 wherein at least one of the CpG dinucleotides of the naturally occurring gene which is mutated in the marker gene is an intercodon CpG dinucleotide.
34. The method of any of claims 1-12, 21-24 or 32-33 in which the marker gene is a gpt gene.
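The frequency limits recited in claims 21, 27 and 28 can be illustrated with a short Python sketch. It rests on an assumption that this section does not confirm: CpG frequency is taken here as the number of CG dinucleotides divided by the number of overlapping dinucleotide positions (sequence length minus one). The function names are hypothetical, and the two 48-nucleotide fragments (residues 81 to 96) are copied from the sequence listing above: one from the codon-by-codon listing preceding SEQ ID NO:6, the other from SEQ ID NO:7.

```python
def clean(dna: str) -> str:
    """Strip whitespace and upper-case a DNA string."""
    return "".join(dna.split()).upper()


def cpg_count(dna: str) -> int:
    """Count CG dinucleotides (occurrences of 'CG' cannot overlap each other)."""
    return clean(dna).count("CG")


def cpg_frequency(dna: str) -> float:
    """ASSUMED definition: CG dinucleotides / overlapping dinucleotide positions."""
    seq = clean(dna)
    return seq.count("CG") / (len(seq) - 1)


def intercodon_cpg_positions(dna: str) -> list[int]:
    """0-based positions of CpGs whose C is the third base of one codon and whose
    G is the first base of the next (sequence must start in frame)."""
    seq = clean(dna)
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG" and i % 3 == 2]


# Residues 81-96 from the listing preceding SEQ ID NO:6 (contains CpGs).
REFERENCE_FRAGMENT = "GGC GAA GGC TTC ATC GTT ATT GAT GAC CTG GTG GAT ACC GGT GGT ACT"
# The same residues from SEQ ID NO:7.
SEQ7_FRAGMENT = "GGTGAAGGCT TCATTGTTAT TGATGACCTG GTGGATACAG GTGGTACT"

if __name__ == "__main__":
    for name, frag in [("reference", REFERENCE_FRAGMENT), ("SEQ ID NO:7", SEQ7_FRAGMENT)]:
        print(name, cpg_count(frag), f"{100 * cpg_frequency(frag):.2f}%")
    # Hypothetical checks mirroring the 3% and "less than half" limits in the claims.
    print(cpg_frequency(SEQ7_FRAGMENT) <= 0.03)
    print(cpg_frequency(SEQ7_FRAGMENT) < 0.5 * cpg_frequency(REFERENCE_FRAGMENT))
```

In this window the reference fragment carries three CpGs, all of them intercodon (GGC|GAA, ATC|GTT, ACC|GGT), while the SEQ ID NO:7 fragment carries none. A whole-gene figure would require running the same functions over the full sequences, and a different definition of frequency would give different percentages.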
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US84266492A | 1992-02-27 | 1992-02-27 | |
US842,664 | 1992-02-27 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO1993017123A1 (en) | 1993-09-02 |
Family
ID=25287933
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US1993/001676 WO1993017123A1 (en) | 1992-02-27 | 1993-02-26 | Mutagenicity testing using reporter genes with modified methylation frequencies |
Country Status (2)
Country | Link |
---|---|
AU (1) | AU3777693A (en) |
WO (1) | WO1993017123A1 (en) |
1993
- 1993-02-26 AU AU37776/93A (AU3777693A), status: abandoned
- 1993-02-26 WO PCT/US1993/001676 (WO1993017123A1), status: application filing
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0289121A2 (en) * | 1987-05-01 | 1988-11-02 | Stratagene | Mutagenesis testing using transgenic non-human animals carrying test DNA sequences |
WO1989005864A1 (en) * | 1987-12-15 | 1989-06-29 | The Trustees Of Princeton University | Transgenic testing systems for mutagens and carcinogens |
EP0370813A2 (en) * | 1988-11-25 | 1990-05-30 | Exemplar Corporation | Rapid screening mutagenesis and teratogenesis assay |
Non-Patent Citations (2)
Title |
---|
NUC. ACIDS RES., Vol. 14, Suppl., issued 1986, MARUYAMA et al., "Codon Usage Tabulated from the GenBank Genetics Sequence Data", pages r151-r197. * |
TIBTECH, Vol. 6, issued August 1988, ERNST, J.B., "Codon Usage and Gene Expression", pages 196-199. * |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO1996012008A1 (en) * | 1994-10-13 | 1996-04-25 | Merck & Co., Inc. | Synthesis of methylase-resistant genes |
WO1999062333A1 (en) * | 1998-05-31 | 1999-12-09 | The University Of Georgia Research Foundation, Inc. | Bacteriophage-based transgenic fish for mutation detection |
US6307121B1 (en) | 1998-05-31 | 2001-10-23 | The University Of Georgia Research Foundation, Inc. | Bacteriophage-based transgenic fish for mutation detection |
US6472583B1 (en) | 1998-10-26 | 2002-10-29 | The University Of Georgia Research Foundation, Inc. | Plasmid-based mutation detection system in transgenic fish |
EP1373297A2 (en) * | 2001-03-05 | 2004-01-02 | University Of Virginia Patent Foundation | A lac operator-repressor system |
EP1373297A4 (en) * | 2001-03-05 | 2005-09-21 | Univ Virginia | A lac operator-repressor system |
Also Published As
Publication number | Publication date |
---|---|
AU3777693A (en) | 1993-09-13 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Wagner et al. | The human beta-globin gene and a functional viral thymidine kinase gene in developing mice. | |
US6025155A (en) | Artificial chromosomes, uses thereof and methods for preparing artificial chromosomes | |
US8288610B2 (en) | Artificial chromosomes, uses thereof and methods for preparing artificial chromosomes | |
US5510099A (en) | Mutagenesis testing using transgenic non-human animals carrying test DNA sequences | |
EP2314708A1 (en) | Artificial chromosomes, uses thereof and methods for preparing artificial chromosomes | |
US6100089A (en) | Rapid screening mutagenesis and teratogenesis assay | |
US20080289059A1 (en) | Methods for developing animal models | |
US8431768B2 (en) | Targeted and regional cellular ablation in zebrafish | |
Boyd et al. | Molecular biology of transgenic animals | |
JP2008523796A (en) | Method for in vitro production of oocytes or egg cells having a target genome modification | |
WO1993017123A1 (en) | Mutagenicity testing using reporter genes with modified methylation frequencies | |
US6307121B1 (en) | Bacteriophage-based transgenic fish for mutation detection | |
US6472583B1 (en) | Plasmid-based mutation detection system in transgenic fish | |
CA2130081A1 (en) | Mutagenesis testing using transgenic non-human animals carrying test dna sequences | |
EP1859677A1 (en) | Diabetes model animal | |
WO1991015579A1 (en) | Mutagenesis testing using transgenic non-human animals carrying test dna sequences | |
JP2010075065A (en) | Gene-modified animal for evaluating harmfulness of test substance | |
Pinkert | Genetic engineering of farm mammals | |
Ebert | Prospective Developments in Laboratory Animals | |
Neilan | Insertional mutagenesis in mice using gene trap vectors and embryonic stem cells | |
Wagner | Accessing novel developmental mechanisms in the mouse by gene trapping | |
JPH02145200A (en) | Inspection for rapidly screening mutagenicity and teratogenicity |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | AK | Designated states | Kind code of ref document: A1; Designated state(s): AU CA JP US |
 | AL | Designated countries for regional patents | Kind code of ref document: A1; Designated state(s): AT BE CH DE DK ES FR GB GR IE IT LU MC NL PT SE |
 | DFPE | Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101) | |
 | 122 | Ep: pct application non-entry in european phase | |
 | NENP | Non-entry into the national phase | Ref country code: CA |