
WO1997001634A2 - Polypeptide for the repair of genetic information, nucleotide sequence coding for this polypeptide, and process for its preparation (guanine-thymine binding protein - GTBP)

Publication number
WO1997001634A2
Authority
WO
WIPO (PCT)
Prior art keywords
gtbp
sequence
gene
seq
protein
Prior art date
Application number
PCT/IT1996/000131
Other languages
English (en)
Other versions
WO1997001634A3 (fr)
Inventor
Josef Jiricny
Fabio Palombo
Paola Gallinari
Original Assignee
Istituto Di Ricerche Di Biologia Molecolare P. Angeletti S.P.A.
Application filed by Istituto Di Ricerche Di Biologia Molecolare P. Angeletti S.P.A.
Priority to AU62412/96A (AU6241296A)
Publication of WO1997001634A2
Publication of WO1997001634A3

Classifications

    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K14/00 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/435 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans
    • C07K14/46 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates
    • C07K14/47 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates from mammals
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K14/00 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/82 Translation products from oncogenes
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K16/00 Immunoglobulins [IGs], e.g. monoclonal or polyclonal antibodies
    • C07K16/18 Immunoglobulins [IGs], e.g. monoclonal or polyclonal antibodies against material from animals or humans
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • C12Q1/6886 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
    • A HUMAN NECESSITIES
    • A01 AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
    • A01K ANIMAL HUSBANDRY; AVICULTURE; APICULTURE; PISCICULTURE; FISHING; REARING OR BREEDING ANIMALS, NOT OTHERWISE PROVIDED FOR; NEW BREEDS OF ANIMALS
    • A01K2217/00 Genetically modified animals
    • A01K2217/05 Animals comprising random inserted nucleic acids (transgenic)
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K38/00 Medicinal preparations containing peptides
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K48/00 Medicinal preparations containing genetic material which is inserted into cells of the living body to treat genetic diseases; Gene therapy
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/136 Screening for pharmacological compounds

Definitions

  • This invention relates to the area of cancer prevention, diagnosis and therapeutics.
  • The invention is concerned with methods for detection of a novel mismatch binding protein, termed GTBP (Guanine Thymine Binding Protein), which mediates the repair of genetic information, with the nucleic acid sequence encoding the protein, and with processes for obtaining the protein and producing it by recombinant genetic engineering techniques.
  • the present invention also relates to detection of mutated GTBP gene in tumour tissues and to prevention and early diagnosis of human colorectal cancers.
  • The link between the biological function of hMSH2 and the phenotype of the CRC tumors was forged when (i) the hMSH2 gene was shown to segregate with a known CRC locus on chromosome 2p (10,11), (ii) the hMSH2-deficient cell line LoVo was shown to be deficient in mismatch repair (12) as well as in mismatch-binding activity (13), and (iii) the genome of this cell line was shown to exhibit a marked instability of microsatellite sequences (14).
  • A mismatch-binding factor, GTBP (for G/T binding protein), originally identified in HeLa cells by the present inventors (15), was shown to bind preferentially to heteroduplexes containing G/T mispairs. Purification of this DNA binding activity by G/T mismatch affinity chromatography yielded a mixture of two polypeptides of apparent molecular weights of 100 and 160 kDa (16), indicating that the mismatch-specific complex was composed of two proteins. The 100 kDa constituent of the complex was demonstrated to be hMSH2 (17). The present discovery implies that hMSH2 acts as a complex with GTBP in the correction of base/base mispairs and one- or two-nucleotide loops.
  • GTBP may contribute to, but is not indispensable for, the correction of larger insertion/deletion loops.
  • a number of tumors have been shown to display mutator phenotypes which are consistent with the functional role of the hMSH2-GTBP complex (20-24) .
  • Prior to the current discovery and characterization of GTBP no specific role in the repair of genetic information and no hereditary defect had been associated with this protein or with the gene encoding it.
  • GTBP is used to indicate a compound polypeptide combining, in order, the amino acid sequences indicated in SEQ ID NO:15 (from amino acid 1 to 68) and SEQ ID NO:1 (from amino acid 1 to 1292).
  • the whole coding gene GTBP indicates a compound DNA sequence combining in order the nucleotide sequences indicated in SEQ ID NO:16 (from nucleotide 1 to 204) and SEQ ID NO:12 (from nucleotide 1 to 3980) .
  • A further object of the present invention is to provide a genetic construct capable of expressing a 1360-amino acid peptide of molecular mass 153 kDa referred to as GTBP.
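
The length bookkeeping behind the compound sequences described above can be checked with a few lines of arithmetic. The following is only an illustrative sketch: the segment lengths are taken from the text, whereas the average residue mass of 110 Da is a common rule of thumb and not a value from this specification, and all names are hypothetical.

```python
# Segment lengths as stated in the text.
N_TERMINAL_AA = 68      # SEQ ID NO:15, amino acids 1-68
MAIN_AA = 1292          # SEQ ID NO:1, amino acids 1-1292
N_TERMINAL_NT = 204     # SEQ ID NO:16, nucleotides 1-204
MAIN_NT = 3980          # SEQ ID NO:12, nucleotides 1-3980

# Assumption: rule-of-thumb average mass of one amino acid residue, in daltons.
AVERAGE_RESIDUE_MASS_DA = 110


def compound_length(*segment_lengths: int) -> int:
    """Length of a sequence built by joining the segments in order."""
    return sum(segment_lengths)


def rough_mass_kda(n_residues: int) -> float:
    """Very rough polypeptide mass estimate from the residue count alone."""
    return n_residues * AVERAGE_RESIDUE_MASS_DA / 1000


total_aa = compound_length(N_TERMINAL_AA, MAIN_AA)
total_nt = compound_length(N_TERMINAL_NT, MAIN_NT)

print(total_aa)                   # 1360 residues
print(N_TERMINAL_NT // 3)         # 68 codons in the 204-nucleotide segment
print(total_nt)                   # 4184 nucleotides
print(rough_mass_kda(total_aa))   # 149.6, close to the stated 153 kDa
```

The estimate is only a plausibility check; an exact mass would require the actual residue composition.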
  • sequence of a 1360-amino acid polypeptide is provided corresponding to the protein referred to as GTBP.
  • a cDNA molecule which comprises the coding sequence of the GTBP gene.
  • the sequence of said primers is internal to chromosome 2p16, said pairs of primers allowing the synthesis of the GTBP gene or of parts of it.
  • a nucleic acid probe is provided which is complementary to human wild-type GTBP gene coding sequence and which can form mismatches when annealed with mutant GTBP alleles, thereby making possible the detection of heteroduplex DNA as revealed by shifts in electrophoretic mobility either with or without prior enzymatic or chemical cleavage.
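
How a wild-type probe forms a mismatch when annealed with a mutant allele can be illustrated in code. This is a minimal sketch, assuming both strands are written 5'→3', have equal length and anneal antiparallel; the function name and the toy sequences are hypothetical and are not taken from the sequence listing.

```python
WATSON_CRICK = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G")}


def find_mispairs(probe: str, target: str) -> list[tuple[int, str, str]]:
    """Return (position, probe_base, target_base) for each non-Watson-Crick pair.

    Both strands are written 5'->3'; the target is reversed so that the
    strands are paired antiparallel. Positions are 0-based along the probe.
    """
    if len(probe) != len(target):
        raise ValueError("strands must have equal length")
    paired = zip(probe.upper(), reversed(target.upper()))
    return [
        (i, p, t) for i, (p, t) in enumerate(paired)
        if (p, t) not in WATSON_CRICK
    ]


probe = "ACGGTCA"       # hypothetical wild-type probe
wild_type = "TGACCGT"   # perfect complement of the probe
mutant = "TGATCGT"      # one C->T change relative to wild_type

print(find_mispairs(probe, wild_type))  # []
print(find_mispairs(probe, mutant))     # [(3, 'G', 'T')]
```

In this toy case the single base change produces a G/T mispair in the heteroduplex, which is the kind of structure the detection methods rely on.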
  • a procedure for the detection of wild-type or mutated GTBP protein in humans, comprising: isolating a human sample selected from the tissue or body fluid and detecting the wild-type or the altered GTBP protein itself or in any complex formed by the association of GTBP with other polypeptides.
  • a method for the assessment of the activity of (i) the wild-type GTBP protein or (ii) of derived peptides obtained by deletion or insertion of known amino acid sequences in GTBP protein or (iii) of the altered GTBP protein as the result of in vivo mutational events or (iv) of any complex formed by the association of the peptides mentioned in (i), (ii) and (iii) of the present embodiment with other polypeptides.
  • a method for the detection of cancer in humans comprising: isolating a human sample selected from the tissue or body fluid; detecting the alteration in the GTBP gene or in the expressed polypeptide (GTBP protein) itself or in any complex formed by the association of GTBP with other polypeptides, said alteration indicating the predisposition to neoplastic transformation or the presence of cancer.
  • a method of diagnosing or prognosing neoplastic tissue of a human comprising: detecting somatic alterations in wild-type GTBP alleles or their expression products in human colorectal cancers (CRC) , said alteration indicating neoplasia of the tissue.
  • a method for the detection of genetic predisposition to CRC comprising: isolating a human sample selected from the group consisting of blood, biopsy samples of tissues, exfoliated cells and any other generic human sample; detecting the alteration in the GTBP gene or in the expressed polypeptide (GTBP protein) itself or in any complex formed by the association of GTBP with other polypeptides, said alteration indicating genetic predisposition to cancer.
  • a method for supplying wild-type GTBP gene function to a cell which has lost said gene function by virtue of any mutation in the GTBP gene comprising: introducing wild type GTBP gene into a cell which has lost said gene function such that GTBP gene is then expressed at wild-type level in the cell.
  • GTBP protein can also be applied to cells or administered to animals to remediate defects in GTBP gene function.
  • a method is provided to supply a portion of wild-type GTBP gene to a cell which has lost the said gene such that the said portion is expressed in the cells and encodes part of the GTBP protein which is required for non-neoplastic growth of the said cell. Another embodiment of the present invention is the generation of transgenic animals carrying a mutated GTBP gene derived from a second species or a mutated GTBP gene generated in vitro by genetic engineering techniques. In another embodiment of the present invention a method of testing therapeutic agents for the ability to suppress a neoplastically transformed phenotype is provided.
  • the method comprises: applying a test substance to a cultured epithelial cell which carries a mutation of the GTBP gene and determining whether the substance suppresses the neoplastic phenotype of the cell or suppresses the growth of already developed tumors.
  • a method of testing therapeutic agents for the ability to suppress a neoplastically transformed phenotype comprises: applying a test substance to an animal which carries a mutation of the GTBP gene and determining whether the substance prevents neoplastic transformation of defined tissues or suppresses the growth of already developed tumors.
  • The present invention provides the art with the information that the GTBP gene, a heretofore unknown gene, encodes the GTBP protein, which acts as a specific mismatch-binding factor.
  • GTBP binds preferentially to heteroduplexes containing G/T mispairs and one- or two-nucleotide loops. Purification of this DNA binding activity made it possible to establish that the mismatch-specific factor is in fact a complex composed of two distinct proteins.
  • the smaller constituent of the complex (about 100 kDa) is the hMSH2 protein (17) whereas the larger component (about 160 kDa) is GTBP.
  • the present invention provides the technical tools for the detection and for the activity assessment of GTBP alone or as a complex with hMSH2.
  • the GTBP gene is a target of mutational events, these alterations being associated with tumorigenesis.
  • This discovery allows highly specific assays to be performed to determine the neoplastic status of a particular tissue or the predisposition to cancer of individuals.
  • a number of tumors have been shown to display mutator phenotypes with a similarly low degree of microsatellite instability (20-24) consistent with the functional role of the hMSH2-GTBP complex.
  • Prior to the current discovery and characterization of GTBP no specific role in the repair of genetic information and no hereditary defect had been associated with this protein.
  • Figure 1 a shows the commercial phagemid vector pBluescript SK" (Stratagene) used for cloning and sequencing the GTBP cDNA.
  • The DNA fragment shown in SEQ ID NO: 12 was cloned between the EcoRI and XhoI sites of the vector. Figure 1b shows the commercial pCITE-2b vector.
  • The insert described in SEQ ID NO: 12 was inserted between the EcoRI and XhoI sites of the vector.
  • Ampicillin: beta-lactamase gene for ampicillin resistance
  • ColE1 ori: origin of replication derived from plasmid ColE1
  • f1 origin: origin of replication of phage f1
  • lacZ: alpha peptide of beta-galactosidase used for genetic complementation
  • MCS: multiple cloning site containing the recognition sequences of the listed restriction enzymes
  • T3 and T7: promoter sequences from phages T3 and T7.
  • Figure 2 shows the commercial plasmid vector pGEX-3X (Pharmacia Biotech) that was used for cloning of the PCR fragments corresponding to amino acid residues 27 to 158 of hMSH2 and 750 to 928 of GTBP (SEQ ID NO:1).
  • Primers used for amplification were:
  • Figure 3 shows an alignment of the amino acid sequences of the conserved C-terminal regions of the four mismatch binding proteins, i.e. GTBP (H. sapiens), hMSH2
  • Figure 4 shows the sequence homology, at the protein level, between pairs of MSH family members.
  • Section a shows the matrix obtained from the alignment of GTBP (on the abscissa) with the yeast GTBP homolog (GenBank accession number Z47746, on the ordinate); the two proteins show comparable length and a significant homology is evident throughout their whole sequence.
  • Section b shows the matrix obtained from the alignment of yeast MSH2 (on the ordinate) with GTBP (on the abscissa); the proteins show different lengths and most of the homology is confined to the C-terminal regions of the two sequences.
  • Section c shows the matrix obtained from the alignment of human MSH2 protein (on the ordinate) with GTBP (on the abscissa); the proteins show different lengths and, in this case too, most of the homology is confined to the C-terminal regions of the two sequences.
  • Section d shows the matrix obtained from the alignment of human hMSH2 protein (on the ordinate) with the yeast MSH2 (on the abscissa) ; the two proteins show comparable length and the homology is evident throughout the entire sequence.
  • Figure 5 shows the effect of selective anti-hMSH2 and anti-GTBP antisera on the formation of the specific mismatch-binding complex.
  • Pre-incubation of HeLa nuclear extracts with either antiserum prior to addition of the G/T heteroduplex DNA probe results in a diminution of the specific band in the gel-shift assay, an effect not observed when the respective pre-immune sera were used.
  • This figure proves that both hMSH2 and GTBP are present in the mismatch-binding factor.
  • This gel-shift analysis was carried out as described in ref. 15, except that nuclear extracts were used (25).
  • the antisera were added to the reaction mixtures 20 min prior to the addition of the radioactively-labelled probe.
  • The figure is an autoradiogram of a native 6% polyacrylamide gel run in Tris-acetate/EDTA (TAE) buffer prepared according to Maniatis et al., Molecular cloning: a laboratory manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y., 1982.
  • Figure 6 shows that the mismatch-binding activity can be reconstituted using GTBP and hMSH2 obtained using an in vitro translation system.
  • The procedure followed to generate in vitro transcripts of the hMSH2, C1 and FLY5 coding sequences was as follows: the DNA region encoding hMSH2 was inserted into pCite-1; the C1 and FLY5 ORFs were introduced into pCite-2b (Novagen).
  • In vitro transcription and translation reactions were carried out as described in ref. 26, including a mock translation reaction in the absence of added DNA. 35S-labeled translation products were analysed on an SDS-polyacrylamide gel treated with Amplify (Amersham), dried and autoradiographed.
  • Section a is an autoradiogram of a denaturing 7.5% SDS-polyacrylamide gel showing that translation of hMSH2, GTBP (C1) and FLY5 mRNAs in a reticulocyte lysate system (Promega) gave rise to the expected polypeptides of 113, 142 and 122 kDa, respectively.
  • Section b shows the gel-shift analysis which demonstrates the binding of the in vitro-translated proteins to the G/T heteroduplex. The figure is an autoradiogram of a native 6% polyacrylamide gel run in TAE buffer.
  • Figure 7 shows that mismatch binding activity is absent from cell extracts lacking GTBP or hMSH2.
  • The experiment is based on the analysis of two cell lines derived from CRC: LoVo cells contain a homozygous deletion of hMSH2 alleles and do not exhibit G/T binding activity (13), while neither hMSH2 allele is mutated in DLD1 cells, even though this cell line also lacks G/T binding activity.
  • Section a shows a gel-shift assay showing that extracts of LoVo and DLD1 fail to make mismatch-specific complexes.
  • the G/C and G/T probes were obtained as described previously (15) . Experimental conditions were as in Figure 6.
  • the figure is an autoradiogram of a native 6% polyacrylamide gel run in TAE buffer.
  • Section b shows the Western blot analysis of extracts from HeLa, LoVo and DLD1 cells.
  • the protein bands were visualized using an alkaline phosphatase- conjugated anti-rabbit IgG system (Promega) as directed by the manufacturer.
  • the anti-GTBP and anti-hMSH2 antisera were used alone with the HeLa extract to demonstrate their selectivity for the 160 and 100 kDa proteins, respectively.
  • both antisera were used together. Control HeLa cells revealed the presence of both hMSH2 and GTBP.
  • The two CRC-derived tumor cell lines LoVo and DLD1 were completely devoid of full-length hMSH2 and GTBP, respectively.
  • The amounts of hMSH2 in DLD1 cells and GTBP in LoVo cells were considerably lower than in HeLa cells. Since hMSH2 and GTBP bind heteroduplex DNA as a complex, the lack of one of the two proteins may cause instability of the second component of the complex.
  • Figure 8 part a, shows the experimental approach followed to discover the amino-terminal region of GTBP (from amino acid 1 to 68 of SEQ ID NO:15) .
  • 5' RACE (Rapid Amplification of cDNA Ends) method, given in detail in the publications Nicolaides, N.C. et al., Genomics, 29: 229-234, 1995 and Nicolaides, N.C. et al., Genomics, 30: 195-206, 1995
  • oligonucleotides were used that pair with the sequence given in SEQ ID NO:12 from nucleotide 114 to 133 (primary oligonucleotide A) and from nucleotide 56 to 74 (secondary oligonucleotide B).
  • the PCR reaction products were sequenced and it was possible to determine that the amplification product was capable of encoding the polypeptide DAAWSEAGPGPR, corresponding to amino acids 46-58 of the amino-terminal domain of GTBP as indicated in SEQ ID NO:15.
  • Using oligonucleotides whose sequence was deduced from the initial RACE, complementary to the sequence given in SEQ ID NO:16 from nucleotide 188 to 204 (primary oligonucleotide C) and from nucleotide 169 to 185 (secondary oligonucleotide D), it was possible to amplify the GTBP-coding region 5', by-passing the methionine in position 1 of the amino acid sequence given in SEQ ID NO:15.
  • The amplified clone, termed KMN, contained the entire nucleotide sequence given in SEQ ID NO:16.
  • RACE analysis of leucocyte cDNA is shown in lanes 2 and 5, that of placenta cDNA in lanes 3 and 6.
  • The products of lanes 1 to 3 derive from sequential amplifications with oligonucleotides A and B, those in lanes 4 to 6 derive from sequential amplifications with oligonucleotides C and D.
  • Lanes 1 and 4 are the negative controls (absence of template) .
  • the molecular weight markers are indicated at the side.
  • Part b of Figure 8 shows expression of the transcript encoding the protein GTBP using RT-PCR (PCR preceded by reverse transcription on RNA templates).
  • The RT-PCR was carried out using a synthetic oligonucleotide which paired with the sequence given in SEQ ID NO:12 from nucleotide 114 to 133 in the reverse transcription reaction, followed by amplification with an oligonucleotide with a sequence equal to the 5' end of the GTBP transcript, that is 5'GGTGCTTTTAGGAGCCCCG3'.
  • The RNA used as template was taken from HeLa cells (lane 2), placenta (lane 3), leucocytes (lane 4) and cells from the colon (lane 5); these were incubated with
  • Lane 1 is the negative control without RNA.
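
Simple primer bookkeeping of the kind used in the RACE and RT-PCR steps above can be sketched as follows. The only sequence used is the 5'-end oligonucleotide quoted in the text; the helper names are hypothetical, and the coordinate convention (1-based, inclusive) is an assumption about how the nucleotide ranges are counted.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def span_length(first: int, last: int) -> int:
    """Length of a region given 1-based, inclusive coordinates (assumed)."""
    return last - first + 1


FORWARD_PRIMER = "GGTGCTTTTAGGAGCCCCG"  # 5'->3', as quoted in the text

print(len(FORWARD_PRIMER))                 # 19
print(reverse_complement(FORWARD_PRIMER))  # CGGGGCTCCTAAAAGCACC
print(span_length(114, 133))               # 20 (oligonucleotide A region)
print(span_length(56, 74))                 # 19 (oligonucleotide B region)
```

The reverse complement is the strand that the forward oligonucleotide would anneal to.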
  • The hMSH2/GTBP heterodimer is necessary for the correction of base/base mispairs and one- or two-nucleotide loops.
  • Genomic instability in tumor-derived cell lines lacking GTBP demonstrates itself mainly in the form of small differences (e.g. in runs of A) rather than large changes in CA repeats, characteristic of phenotypes associated with the four known CRC loci hMSH2, hMLH1, hPMS1 and hPMS2. Cancers displaying mutator phenotypes with a low degree of microsatellite instability (20-24) may be associated with a malfunction of GTBP. It is a discovery of the present invention that mutational events associated with tumorigenesis in CRC are due to defects in the GTBP gene.
  • Novel compositions comprising genetic sequences encoding the GTBP protein, as well as fragments derived therefrom, are provided, together with recombinant proteins produced using the genomic sequences and methods of using these compositions.
  • Exemplary amino acid and DNA sequences of the invention are set forth in SEQ ID NO: 1 - SEQ ID NO: 15 and in SEQ ID NO: 12 - SEQ ID NO: 16. Standard abbreviations for nucleotides and amino acids are used in the Figures and elsewhere in this specification.
  • GTBP-derived polypeptides are particularly preferred embodiments of the invention, although variations based on the specific sequences of these polypeptides are also part of the present invention.
  • the invention (as it pertains to polypeptides per se) includes any polypeptide selected from the group consisting of:
  • the genetic engineering aspects of the present invention include any recombinant DNA or RNA molecule comprising a DNA sequence encoding GTBP itself or GTBP-derived protein according to SEQ ID NO: 1 or a corresponding DNA or RNA sequence, or a subsequence thereof comprising at least 10 nucleotides.
  • The present invention also focuses on diagnostic methodologies aimed at detecting loss of GTBP function in humans and the consequent predisposition to neoplasia.
  • Definition of terms
  • Two nucleic acid fragments are "homologous" if they are capable of hybridizing to one another under hybridization conditions described in Maniatis et al. (1982), Molecular cloning: a laboratory manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y., pp. 320-325.
  • Using the following wash conditions (2 x SSC, 0.1% SDS, room temperature twice, 30 minutes each; then 2 x SSC, 0.1% SDS, 50 °C once, 30 minutes; then 2 x SSC, room temperature twice, 10 minutes each), homologous sequences can be identified that contain at most about 25-30% base pair mismatches.
  • More preferably, homologous nucleic acid strands contain 15-25% base pair mismatches, even more preferably 5-15% base pair mismatches. These degrees of homology can be selected by using more stringent wash conditions for identification of clones from gene libraries (or other sources of genetic material), as is well known in the art.
  • Two amino acid sequences are homologous if there is a partial or complete identity between their sequences. For example, 85% homology means that 85% of the amino acids are identical when the two sequences are aligned for maximum matching. Gaps (in either of the two sequences being matched) are allowed in maximizing matching; gap lengths of 5 or less are preferred, with 2 or less being more preferred.
  • two protein sequences are homologous, as this term is used herein, if they have an alignment score of more than 5 (in standard deviation units) using the program ALIGN with the mutation data matrix and a gap penalty of 6 or greater (Dayhoff, M.O., in Atlas of Protein Sequence and Structure, 1972, volume 5, National Biomedical Research Foundation, pp. 101-110, and Supplement 2 to this volume, pp. 1-10) .
  • the two sequences or parts thereof are more preferably homologous if their amino acids are greater than or equal to 50% identical when optimally aligned using the ALIGN program.
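
The percentage-identity criterion above can be made concrete with a short calculation on two sequences that have already been aligned. This is a minimal sketch under stated assumptions: the alignment itself is taken as given (it is not the ALIGN program), columns containing a gap are excluded from the count, and the second toy sequence is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of aligned, non-gap columns holding identical residues."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # assumption: gap columns are left out of the count
        compared += 1
        identical += a == b
    if compared == 0:
        raise ValueError("no aligned residue pairs to compare")
    return 100.0 * identical / compared


def longest_gap(aligned: str, gap: str = "-") -> int:
    """Length of the longest run of gap characters in an aligned sequence."""
    longest = current = 0
    for ch in aligned:
        current = current + 1 if ch == gap else 0
        longest = max(longest, current)
    return longest


seq_a = "DAAWSEAGPGPR"   # peptide quoted elsewhere in the text
seq_b = "DAAW-EAGPGKR"   # hypothetical variant with one gap and one change

identity = percent_identity(seq_a, seq_b)
print(round(identity, 2))      # 90.91
print(identity >= 85.0)        # True: meets the 85% criterion
print(longest_gap(seq_b) <= 2) # True: gap length within the preferred limit
```

Whether gap columns belong in the denominator is a design choice; counting them would lower the reported percentage.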
  • a DNA fragment is "derived from" a GTBP-encoding DNA sequence if it has the same or substantially the same base pair sequence as a region of the coding sequence for GTBP protein molecule.
  • "Substantially the same" means, when referring to biological activities, that the activities are of the same type although they may differ in degree.
  • When referring to amino acid sequences, "substantially the same" means that the molecules in question have similar biological properties and preferably have at least 85% homology in amino acid sequences. More preferably, the amino acid sequences are at least 90% identical. In other uses, "substantially the same" has its ordinary English language meaning.
  • a protein is "derived from" GTBP if it has the same or substantially the same amino acid sequence as a region of the GTBP protein molecule.
  • By "polypeptide derivatives of GTBP protein" is meant polypeptides differing in length from the natural protein and containing five or more amino acids in the same primary order as found in the protein as obtained from a natural source.
  • Polypeptide molecules having substantially the same amino acid sequence as the natural protein but possessing minor amino acid substitutions which do not significantly affect the ability of the protein or polypeptide to interact with protein-specific molecules, such as antibodies and nucleic acids, are within the definition of derived from GTBP.
  • Derivatives include glycosylated forms, aggregative conjugates with other protein molecules and covalent conjugates with unrelated chemical moieties. Covalent derivatives are prepared by linkage of functionalities to groups which are found in the amino acid chain or at the N- or C-terminal residue by means known in the art.
  • GTBP-specific molecules include polypeptides such as antibodies that are specific for the protein or polypeptide containing the naturally occurring GTBP amino acid sequence.
  • By "specific binding polypeptide" are intended polypeptides that bind with GTBP protein and its derivatives and which have a measurably higher binding affinity for the target polypeptide than for other polypeptides tested for binding. Higher affinity by a factor of 10 is preferred, more preferably by a factor of 100. Binding affinity for antibodies refers to a single binding event (i.e., monovalent binding of an antibody molecule). Specific binding by antibodies also means that binding takes place at the normal binding site of the antibody molecule (at the end of the arms in the variable region).
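
The factor-of-10 and factor-of-100 thresholds can be expressed as a small classification. The sketch below assumes that affinity is reported as a dissociation constant (Kd), where a lower value means tighter binding; the text does not prescribe a particular measure, and the function names, tier labels and example values are hypothetical.

```python
def affinity_ratio(kd_target: float, kd_other: float) -> float:
    """How many times tighter the target is bound than a comparison polypeptide.

    Both values are dissociation constants in the same unit (assumption);
    a lower Kd means a higher affinity.
    """
    if kd_target <= 0 or kd_other <= 0:
        raise ValueError("dissociation constants must be positive")
    return kd_other / kd_target


def specificity_tier(ratio: float) -> str:
    """Map an affinity ratio onto the preference tiers named in the text."""
    if ratio >= 100:
        return "more preferred"
    if ratio >= 10:
        return "preferred"
    if ratio > 1:
        return "specific"
    return "not specific"


# Hypothetical measurements in nM.
ratio = affinity_ratio(kd_target=2.0, kd_other=100.0)
print(ratio)                    # 50.0
print(specificity_tier(ratio))  # preferred
```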
  • Phenylalanine, tryptophan, and tyrosine are sometimes classified jointly as aromatic amino acids.
  • An isolated replacement of a leucine with an isoleucine or valine, an aspartate with a glutamate, a threonine with a serine, or a similar replacement of an amino acid with a structurally related amino acid will not have a major effect on the binding properties of the resulting molecule, especially if the replacement does not involve an amino acid at a binding site involved in the interaction of GTBP or its derivatives with an antibody or with a specific DNA recognition sequence.
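
The notion of a replacement by a structurally related amino acid can be sketched as a lookup over residue groups. Only the groups named in the text are encoded here (leucine/isoleucine/valine, aspartate/glutamate, threonine/serine, and the aromatic residues); a fuller grouping would be an extension beyond what is stated. The function names and the variant peptide are hypothetical.

```python
# One-letter codes; groups limited to those named in the text.
CONSERVATIVE_GROUPS = [
    frozenset("LIV"),   # leucine, isoleucine, valine
    frozenset("DE"),    # aspartate, glutamate
    frozenset("TS"),    # threonine, serine
    frozenset("FWY"),   # phenylalanine, tryptophan, tyrosine (aromatic)
]


def is_conservative(original: str, replacement: str) -> bool:
    """True if both residues are identical or fall in the same group."""
    o, r = original.upper(), replacement.upper()
    if o == r:
        return True
    return any(o in group and r in group for group in CONSERVATIVE_GROUPS)


def substitutions(wild_type: str, variant: str) -> list[tuple[int, str, str, bool]]:
    """List (1-based position, original, replacement, conservative?) per change."""
    if len(wild_type) != len(variant):
        raise ValueError("sequences must have equal length")
    return [
        (i + 1, a, b, is_conservative(a, b))
        for i, (a, b) in enumerate(zip(wild_type, variant))
        if a != b
    ]


print(substitutions("DAAWSEAGPGPR", "EAAWTEAGPGKR"))
# [(1, 'D', 'E', True), (5, 'S', 'T', True), (11, 'P', 'K', False)]
```

Such a lookup only flags candidates; whether a change preserves function still has to be settled by a binding assay.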
  • Whether an amino acid change results in a functional peptide can readily be determined by assaying the specific binding properties of the polypeptide derivative.
  • Isolation of cDNA encoding GTBP protein
  • Isolation of nucleotide sequences encoding GTBP protein involves creation of a cDNA library prepared from full-length mature messenger RNA extracted from cultured cells or tissues.
  • Evidence is provided that GTBP is conserved over a broad evolutionary range, thus allowing the isolation of GTBP homologs from the genomes of phylogenetically distant species, i.e. from mammals to yeasts to bacteria.
  • Genetic libraries can be made in either eukaryotic or prokaryotic host cells. Widely available cloning vectors such as plasmids, cosmids, phage, YACs and the like can be used to generate genomic libraries suitable for the isolation of nucleotide sequences encoding GTBP protein or portions thereof.
  • Useful methods for screening genetic libraries for the presence of GTBP protein nucleotide sequences include the preparation of oligonucleotide probes based on the sequence information provided in SEQ ID NO: 1 and SEQ ID NO: 15 (after decoding of the amino acid sequence) as well as in SEQ ID NO:12 and SEQ ID NO: 16 (directly derived from the encoding DNA) of this patent.
  • Oligonucleotide sequences of about 17 base pairs or longer can be prepared by conventional in vitro synthesis techniques.
  • the resultant nucleic acid sequences can be subsequently labeled with radionuclides, enzymes, biotin, fluorescers or the like, and used as probes for screening the libraries.
  • Additional methods of interest for isolating GTBP protein-encoding nucleic acid sequences include screening of genetic libraries for the expression of GTBP protein or fragments thereof by means of GTBP protein-specific antibodies, either polyclonal or monoclonal. Moreover, a selection method advisable for the screening of GTBP libraries cloned in conventional expression vectors is based on the specific binding of the protein (or of polypeptides contained therein) to heteroduplex DNA molecules containing G/T mismatches.
  • A particularly preferred technique for isolating homologous proteins from related species or strains involves the use of degenerate primers based on partial amino acid sequences of GTBP protein and the polymerase chain reaction (PCR) to amplify gene segments between the primers.
  • A similar approach can also be applied to generate double-stranded cDNA molecules after amplification of mRNA with appropriate primers and polymerases.
  • The gene can then be isolated using a specific hybridization probe based on the amplified gene segment, which is then analyzed for appropriate expression of the protein.
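The size of a degenerate primer pool follows directly from the genetic code. As a minimal sketch (the function name `primer_degeneracy` and the one-letter peptide notation are illustrative assumptions, not part of the disclosure), the number of distinct oligonucleotides needed to cover every possible coding of a peptide is the product of the codon counts of its residues in the standard genetic code:

```python
# Number of codons per amino acid in the standard genetic code
# (one-letter amino acid codes; the 61 sense codons in total).
CODONS_PER_RESIDUE = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def primer_degeneracy(peptide: str) -> int:
    """Return how many distinct DNA sequences can encode `peptide`.

    This is the pool size of a fully degenerate primer that is
    back-translated from the peptide (hypothetical helper).
    """
    total = 1
    for residue in peptide.upper():
        total *= CODONS_PER_RESIDUE[residue]
    return total
```

Peptide stretches rich in residues with few codons (Met, Trp, Asp, Glu and the like) give the smallest pools, which is one reason such stretches are usually favoured when degenerate primers are designed.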
  • The nucleotide sequence of the isolated genetic material which encodes GTBP protein can be obtained by sequencing the non-vector nucleotide sequences of these recombinant molecules. Nucleotide sequence information can be obtained by employing widely used DNA sequencing protocols, such as Maxam and Gilbert sequencing, dideoxy nucleotide sequencing according to Sanger, and the like. Examples of suitable nucleotide sequencing protocols can be found in Berger and Kimmel, Methods in Enzymology, Vol. 152: Guide to Molecular Cloning Techniques (1987), Academic Press.
  • Nucleotide sequence information from several recombinant DNA isolates may be combined so as to provide the entire amino acid coding sequence of GTBP, as well as the upstream and downstream nucleotide sequences.
  • Nucleotide sequences obtained from sequencing GTBP protein-specific genomic library isolates can be subjected to further analysis in order to identify regions of interest in the GTBP gene. These regions of interest include additional open reading frames, promoter sequences, termination sequences, and the like. Analysis of nucleotide sequence information is preferably performed by computer. Software suitable for analyzing nucleotide sequences for regions of interest is commercially available and includes, for example, DNASIS
  • Isolated nucleotide sequences encoding GTBP protein can be used to produce purified GTBP protein or fragments thereof by either recombinant DNA methodology or by in vitro polypeptide synthesis techniques.
  • By “purified and isolated” is meant, when referring to a polypeptide or nucleotide sequence, that the indicated molecule is present in the substantial absence of other biological macromolecules of the same type.
  • The term “purified” as used herein preferably means at least 95% by weight, more preferably at least 99% by weight, and most preferably at least 99.8% by weight, of biological macromolecules of the same type present (but water, buffers, and other small molecules, especially molecules having a molecular weight of less than 1000, can be present).
  • A significant advantage of producing GTBP protein by recombinant DNA techniques rather than by isolating it from natural sources is that equivalent quantities of GTBP protein can be produced using less starting material than would be required for isolating GTBP protein from a natural source.
  • Producing GTBP protein by recombinant techniques also permits GTBP protein to be isolated in the absence of some molecules normally present in cells that naturally produce GTBP protein. It is also apparent that recombinant DNA techniques can be used to produce GTBP protein polypeptide derivatives that are not found in nature, such as the variations described above.
  • GTBP protein and polypeptide derivatives of GTBP protein can be expressed by recombinant techniques when a DNA sequence encoding the relevant molecule is functionally inserted into a vector.
  • By “functionally inserted” is meant in proper reading frame and orientation, as is well understood by those skilled in the art.
  • The GTBP protein gene will be inserted downstream from a promoter and will be followed by a stop codon, although production as a hybrid protein followed by cleavage may be used, if desired.
  • Host-cell-specific sequences improving the production yield of GTBP protein and GTBP polypeptide derivatives will be used, and appropriate control sequences will be added to the expression vector, such as enhancer sequences, polyadenylation sequences, and ribosome binding sites.
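The requirement of a proper reading frame followed by a stop codon can be checked on a candidate insert before cloning. The following is a minimal sketch under stated assumptions: the function name `is_complete_orf` is hypothetical, the insert is given as a plain DNA string on the coding strand, and the standard stop codons are used.

```python
# Stop codons of the standard genetic code (DNA alphabet).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def is_complete_orf(dna: str) -> bool:
    """Return True if `dna` is a complete, in-frame coding sequence.

    The sequence must start with ATG, have a length that is a multiple
    of three, end with a stop codon, and contain no internal stop codon.
    """
    seq = dna.upper()
    if len(seq) < 6 or len(seq) % 3 != 0:
        return False
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    starts_ok = codons[0] == "ATG"
    ends_ok = codons[-1] in STOP_CODONS
    no_internal_stop = not any(c in STOP_CODONS for c in codons[1:-1])
    return starts_ok and ends_ok and no_internal_stop
```

A check of this kind only covers frame and termination; orientation relative to the promoter still has to be confirmed on the assembled vector, for example by sequencing across the junctions.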
  • Two basic types of expression are contemplated: (i) expression in mammalian cells so as to overcome a deficiency in an individual having insufficient GTBP, and
  • BK-SV40 hybrid vectors have been constructed. These vectors can be maintained in cultured human cells as multicopy double-stranded DNA extrachromosomal replicons.
  • One exemplary vector consists of the SV40 promoter controlling the expression of neomycin resistance gene (the selectable marker) and the MMTV promoter regulated by the DRE enhancer sequence which controls the expression of the cloned gene.
  • The foreign construct will usually include transcriptional and translational initiation and termination signals, with the initiation signals 5' to the gene and termination signals 3' to the gene of interest, although linear DNA can be delivered to a host where recombination occurs for insertion into the host genome.
  • Expression under the control of the native promoter can thus be achieved by replacing the defective gene with the linear DNA encoding GTBP by making use of cellular processes, e.g. homologous recombination.
  • the transcriptional initiation region which includes the RNA polymerase binding site (promoter) may be native to the host or may be derived from an alternative source, where the region is functional in the host.
  • the transcriptional initiation regions may not only include the RNA polymerase binding site, but also regions providing for the regulation of the transcription.
  • The 3' termination region may be derived from the same gene as the transcriptional initiation region or from a different gene. For example, where the gene of interest has a transcriptional termination region functional in the host species, that region may be retained within the gene.
  • An expression cassette can be constructed which will include the transcriptional initiation region, the GTBP protein gene under the transcriptional control of the transcription initiation region, the initiation codon, the coding sequence of the gene, with or without introns, and the translational stop codons, followed by the transcriptional termination region, which will include the terminator, and may include a polyadenylation signal sequence and other sequences associated with transcriptional termination.
  • The direction is 5' to 3', the same as the direction of transcription.
  • the cassette will usually be less than about 10 kb, frequently less than about 6 kb, usually being at least about 5 kb.
  • When the expression product of the gene is to be located other than in the cytoplasm, the gene will usually be constructed to include particular amino acid sequences which result in translocation of the product to a particular site, which may be an organelle, such as the nucleus, or in secretion of the product into the external environment of the cell.
  • Various secretory leaders, membrane integrator sequences, and translocation sequences for directing the peptide expression product to a particular site are described in the literature.
  • cassettes may be involved, where the cassettes may be employed in tandem for the expression of independent genes which may express products independently of each other or may be regulated concurrently, where the products may act independently or in conjunction, e.g. GTBP and hMSH2.
  • the expression cassette will normally be carried on a vector having at least one replication system.
  • a replication system functional in E. coli such as ColE1, pSC101, pACYC184, or the like. In this manner, at each stage after each manipulation, the resulting construct may be cloned, sequenced, and the correctness of the manipulation determined. In addition to, or in place of, the E. coli replication system, a broad host range replication system may be employed, such as the replication systems of the P-1 incompatibility plasmids, e.g. RK2, RP1, RP4 and R68.
  • In addition to the replication system, there will frequently be at least one marker present, which may be useful in one or more hosts, or different markers for individual hosts. That is, one marker may be employed for selection in a prokaryotic host, while another marker may be employed for selection in a eukaryotic host.
  • neo: neomycin-kanamycin resistance
  • cat: chloramphenicol acetyltransferase
  • bla: β-lactamase
  • β-galactosidase, etc.
  • The various fragments comprising the various constructs, expression cassettes, markers, and the like may be introduced consecutively by restriction enzyme cleavage of an appropriate replication system, and insertion of the particular construct or fragment into the available site. After ligation and cloning the vector may be isolated for further manipulation. All of these techniques are amply exemplified in the literature and find particular exemplification in Maniatis et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y., 1982.
  • Transformation of mammalian cells and gene therapy
  • Once the vector is completed, the vector may be introduced into mammalian cells. Techniques for transforming mammalian cells include transfection, microinjection, liposome-based delivery, etc.
  • Transfection of cultured human cells is the most commonly used method and can be achieved by standard protocols which involve either incubation of cells with DNA that has been co-precipitated with calcium phosphate or DEAE-dextran, or electroporation with purified transfecting DNA.
  • A genetically modified virus, a liposome or microinjection can also be used to deliver foreign DNA to human recipient cells.
  • Vectors derived from retroviruses are often used to stably maintain and persistently express the remedial gene in the corrected cell.
  • In vivo gene therapy entails the direct delivery of remedial gene into the cell of a particular tissue of a prospective patient.
  • the wild-type protein can be cloned into various benign viruses and delivered to target defective cells in an in vivo infection.
  • Vectors derived from adenovirus, herpes simplex virus and certain retroviruses are excellent candidates for in vivo gene therapy. Methods and prospects of gene therapy have been reviewed by Mulligan (1993), Science 260:926-932.
  • Diagnostic methods using antigens
  • methods for detecting analytes such as binding proteins of the invention are based on immunoassays.
  • Immunoassays can be conducted to determine the presence or absence of GTBP in host cells. Such techniques are well known and need not be described here in detail. Examples include both heterogeneous and homogeneous immunoassay techniques. Both techniques are based on the formation of an immunological complex between the binding protein and a corresponding specific antibody.
  • Heterogeneous assays for GTBP typically use a specific monoclonal or polyclonal antibody bound to a solid surface, e.g. in sandwich assays.
  • Homogeneous assays, which are carried out in solution without the presence of a solid phase, can be used, for example, by determining the difference in enzyme activity brought on by binding of free antibody to an enzyme-antigen conjugate.
  • A number of suitable assays are disclosed in U.S. Patent Nos. 3,817,837, 4,006,360, and 3,996,345.
  • The solid surface reagent in the above assay is prepared by known techniques for attaching protein material to solid support material, such as polymeric beads, dip sticks, or filter material. These attachment methods generally include non-specific adsorption of the protein to the support or covalent attachment of the protein, typically through a free amine group, to a chemically reactive group on the solid support, such as an activated carboxyl, hydroxyl, or aldehyde group.
  • In a second diagnostic configuration, known as a homogeneous assay, antibody binding to an analyte produces some change in the reaction medium which can be directly detected in the medium.
  • Known general types of homogeneous assays proposed heretofore include (a) spin-labeled reporters, where antibody binding to the antigen is detected by a change in reporter mobility (broadening of the spin splitting peaks), (b) fluorescent reporters, where binding is detected by a change in fluorescence emission, (c) enzyme reporters, where antibody binding affects enzyme/substrate interactions, and (d) liposome-bound reporters, where binding leads to liposome lysis and release of encapsulated reporter.
  • the assay method involves reacting the tissue extract from a test individual with an antibody and examining the sample for the presence of bound antigen.
  • the examination may involve attaching a labelled anti-GTBP antibody to the primary complex formed between GTBP and the immobilized antibody and measuring the amount of reporter bound to the solid support, as in the first method, or may involve observing the effect of antibody binding on a homogeneous assay reagent, as in the second method.
  • GTBP in its native or chemically modified form, or polypeptide derivatives thereof, or specific complexes with other polypeptides may be used for producing antibodies, either monoclonal or polyclonal, specific to GTBP or polypeptide derivatives thereof, or to GTBP complexes with other polypeptides.
  • Antibodies specific for GTBP protein are produced by immunizing an appropriate vertebrate host, e.g., rabbit or mouse, with purified GTBP protein or polypeptide derivatives of GTBP protein, by themselves or in conjunction with a conventional adjuvant. Usually, two or more immunizations will be involved, and blood or spleen will be harvested a few days after the last injection.
  • The immunoglobulins can be precipitated, isolated and purified by a variety of standard techniques, including affinity purification using GTBP protein attached to a solid surface, such as a gel or beads in an affinity column.
  • The splenocytes will normally be fused with an immortalized lymphocyte, e.g., a myeloma cell line, under selective conditions for hybridoma formation.
  • The hybridomas can then be cloned under limiting dilution conditions and their supernatants screened for antibodies having the desired specificity.
  • Techniques for producing antibodies are well known in the literature and are exemplified by the publication Antibodies: A Laboratory Manual (1988), eds. Harlow and Lane, Cold Spring Harbor Laboratories Press.
  • the genetic material of the invention can itself be used in numerous assays as probes for genetic material present in an individual.
  • the analyte can be a nucleotide sequence which hybridizes with a probe comprising a sequence of at least about 16 consecutive nucleotides, usually 30 to 200 nucleotides, up to substantially the full sequence of the gene as shown in SEQ ID NO: 12.
  • the analyte can be RNA or DNA.
  • The sample is typically a DNA or an RNA molecule extracted from the patient's tissue.
  • the probe may contain a detectable label.
  • Particularly preferred for use as a probe are sequences up to about 3200 consecutive nucleotides (for example from nucleotide 1 to nucleotide 3000 of SEQ ID NO: 12 and from nucleotide 1 to nucleotide 204 of SEQ ID NO:16) since these sequences appear to be particularly specific for GTBP.
  • One method for amplification of target nucleic acids, for later analysis by hybridization assays, is known as the polymerase chain reaction or PCR technique.
  • The PCR technique can be applied to detect sequences of the invention in suspected samples using oligonucleotide primers spaced apart from each other and based on the genetic sequence set forth in SEQ ID NO: 12 and SEQ ID NO: 16.
  • The primers are complementary to opposite strands of a double-stranded DNA molecule and are typically separated by from about 50 to 450 nt or more (usually not more than 2000 nt).
  • This method entails preparing the specific oligonucleotide primers and then repeated cycles of target DNA denaturation, primer binding, and extension with a DNA polymerase to obtain DNA fragments of the expected length based on the primer spacing. Extension products generated from one primer serve as additional target sequences for the other primer.
  • The degree of amplification of a target sequence is controlled by the number of cycles that are performed and is theoretically calculated by the simple formula 2^n, where n is the number of cycles. Given that the average efficiency per cycle ranges from about 65% to 85%, 25 cycles produce from 0.3 to 4.8 million copies of the target sequence.
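The copy numbers quoted for 25 cycles can be reproduced with a simple model. This sketch assumes, as is commonly done but is not stated explicitly in the text, that each cycle multiplies the target by (1 + E), where E is the per-cycle efficiency, so that E = 1 recovers the theoretical doubling per cycle; the function name `pcr_copies` is illustrative.

```python
def pcr_copies(cycles: int, efficiency: float = 1.0,
               start_copies: float = 1.0) -> float:
    """Estimate target copies after `cycles` rounds of PCR.

    Assumed model: copies = start_copies * (1 + efficiency) ** cycles,
    with efficiency expressed as a fraction between 0 and 1.
    """
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must lie between 0 and 1")
    return start_copies * (1.0 + efficiency) ** cycles


# Range discussed in the text: 65% to 85% efficiency over 25 cycles.
low_estimate = pcr_copies(25, 0.65)
high_estimate = pcr_copies(25, 0.85)
```

Under this model the two estimates come out at roughly 0.3 million and 4.8 million copies per starting molecule, in line with the figures given above, whereas perfect doubling over 25 cycles would give about 33.6 million.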
  • The PCR method is described in a number of publications, including Saiki et al., Science (1985) 230:1350-1354; Saiki et al., Nature (1986) 324:163-166; and Scharf et al., Science (1986) 233:1076-1078. Also see U.S. Patent Nos. 4,683,194; 4,683,195; and 4,68
  • the invention includes a specific diagnostic method for determination of GTBP, based on selective amplification of GTBP-protein-encoding DNA fragments.
  • This method employs a pair of single-stranded primers derived from non-homologous regions of opposite strands of GTBP DNA duplex fragment having a sequence as described by combining the sequences SEQ ID NO: 16 and SEQ ID NO:12. These "primer fragments" represent one aspect of the invention.
  • the method follows the process for amplifying selected nucleic acid sequences as disclosed in U.S. Patent No. 4,683,202, as discussed above.
  • Mutations in the GTBP gene can be detected by restriction enzyme analysis of the amplification product or by direct sequencing. Also, alterations in GTBP sequence can be revealed by Southern hybridization with probes encompassing part or the entire sequences of SEQ ID NO: 12 and SEQ ID NO:16.
  • Single-stranded DNA probes complementary to the wild-type GTBP-coding sequence can also be hybridized to RNA extracted from tissues or cells of human patients and used to detect mutations in the mature GTBP gene transcript by enzymatic digestion of heteroduplexes at the level of mismatches. These and other techniques aimed at identifying variations in gene sequences from wild-type GTBP are extensively reported in the literature and well established in the scientific community.
  • Binding assays involving GTBP
  • The presence of an altered GTBP protein can be detected by the use of binding assays based on the specific recognition of G/T mismatches by GTBP. A synthetic double-stranded 34-mer oligonucleotide containing a G/T mispair is prepared and labelled substantially as reported (15).
  • Cell extracts can be prepared as reported in current literature (e.g. ref. 25 and refs. therein).
  • the cell extract (1-10 micrograms of nuclear proteins) can be incubated with the heteroduplex oligonucleotide at room temperature for 30 minutes to allow GTBP binding to the G/T mismatch.
  • the mixture can then be loaded on a gel prepared as reported in Figure 6. Alterations in GTBP mass or affinity for the substrate can be evidenced by an altered electrophoretic mobility.
  • NCIMB 40742 accession numbers NCIMB 40742
  • NCIMB 40471 accession numbers NCIMB 40740 respectively.
  • A strain of E. coli TOP10, transformed using the plasmid pBluescript SK ' /GTBP coding for the whole amino acid sequence of GTBP from amino acid 1 to amino acid 1360 (SEQ ID NO: 15 and SEQ ID NO: 1), was deposited on 28/5/96 with the above depositary institution with accession number NCIMB 40805.
  • the present example shows that the GTBP protein sequence, as reported by combining the sequences SEQ ID NO:15 and SEQ ID NO: 1, contains seven subsequences which correspond to polypeptides obtained after proteolytic cleavage of the 160 kDa DNA-binding protein termed GTBP. These subsequences are indicated as SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7 and SEQ ID NO: 8.
  • the 160 kDa protein was purified as reported in ref. 16.
  • The fractions containing the G/T-specific mismatch binding activity were loaded onto a preparative SDS-PAGE gel and the 100 and 160 kDa bands were excised following staining with Coomassie Blue.
  • The proteins were digested in the gel matrix either with trypsin (100 kDa protein; Promega Corporation, UK) or with Achromobacter lyticus endopeptidase Lys-C (160 kDa protein; Wako Chemicals GmbH, Germany).
  • The proteolytic peptides were recovered by sequential extractions and separated by tandem HPLC on a Hewlett-Packard 1090M with diode array detection.
  • Example 1B
  • The present example shows that the protein GTBP contains an amino-terminal domain corresponding to SEQ ID NO: 15. This region can be determined by analysis of the coding nucleotide sequence. The amino-terminal domain is an integral part of the peptide GTBP itself, and therefore the GTBP sequence must be understood to be the sequential combination of SEQ ID NO: 15 and SEQ ID NO: 1, with a total length of 1360 amino acids. Part a of figure 8 shows the experimental approach followed to discover the amino-terminal region of GTBP (from amino acid 1 to 68 of SEQ ID NO: 15). Using the 5' RACE method (Rapid Amplification of cDNA Ends, given in detail in the publications Nicolaides, N.C. et al., Genomics 29: 229-234, 1995 and Nicolaides, N.C. et al., Genomics 30: 195-206, 1995), a pair of oligonucleotides was used that pairs with the sequence given in SEQ ID NO: 12 from nucleotide 114 to 133 (primary oligonucleotide A) and from nucleotide 56 to 74 (secondary oligonucleotide B).
  • The PCR reaction products were sequenced and it was possible to determine that the amplification product was capable of encoding the polypeptide DAAWSEAGPGPR, corresponding to amino acids 46-58 of the amino-terminal domain of GTBP as indicated in SEQ ID NO: 15.
  • Using oligonucleotides whose sequence was deduced from the initial RACE, complementary to the sequence given in SEQ ID NO: 16 from nucleotide 188 to 204 (primary oligonucleotide C) and from nucleotide 169 to 185 (secondary oligonucleotide D), it was possible to amplify the 5' end of the GTBP-coding region, passing beyond the methionine in position 1 of the amino acid sequence given in SEQ ID NO: 15.
  • The amplified clone, termed KMN, contained the entire nucleotide sequence given in SEQ ID NO: 16.
  • RACE analysis of leucocyte cDNA is shown in lanes 2 and 5, that of placenta cDNA in lanes 3 and 6.
  • the products of lanes 1 to 3 derive from sequenced amplifications with oligonucleotides A and B, those in lanes 4 to 6 derive from sequenced amplifications with oligonucleotides C and D.
  • Lanes 1 and 4 are the negative controls (absence of template) .
  • the molecular weight markers are indicated at the side.
  • Part b of figure 8 shows expression of the transcript encoding the protein GTBP using RT-PCR (PCR preceded by reverse transcription on RNA templates).
  • The RT-PCR was carried out using a synthetic oligonucleotide which paired with the sequence given in SEQ ID NO: 12 from nucleotide 114 to 133 in the reverse transcription reaction, followed by amplification with an oligonucleotide with a sequence equal to the 5' end of the GTBP transcript, that is 5'GGTGCTTTTAGGAGCCCCG3'.
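When an annealing temperature has to be chosen for a short primer such as the 19-mer above, a rough melting temperature is often estimated with the Wallace rule, Tm ≈ 2 °C × (A+T) + 4 °C × (G+C). The rule is not mentioned in the text and is only a rule of thumb for oligonucleotides of roughly 14 to 20 nt; the function name `wallace_tm` is illustrative.

```python
def wallace_tm(primer: str) -> int:
    """Estimate the melting temperature (in deg C) of a short primer.

    Wallace rule: 2 degrees per A/T base plus 4 degrees per G/C base.
    Intended only for short oligonucleotides (about 14-20 nt).
    """
    seq = primer.upper()
    at_count = seq.count("A") + seq.count("T")
    gc_count = seq.count("G") + seq.count("C")
    if at_count + gc_count != len(seq):
        raise ValueError("primer must contain only A, C, G and T")
    return 2 * at_count + 4 * gc_count


# 5'-end primer quoted in the text for the RT-PCR amplification step.
GTBP_5PRIME_PRIMER = "GGTGCTTTTAGGAGCCCCG"
estimated_tm = wallace_tm(GTBP_5PRIME_PRIMER)
```

For the quoted primer (12 G/C and 7 A/T bases) the estimate is 62 °C; actual annealing conditions would still need to be determined empirically.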
  • The RNA used as template was taken from HeLa cells (lane 2), placenta (lane 3), leucocytes (lane 4) and cells from the colon (lane 5); the samples were incubated with (+ symbol on the lane) or without (- symbol on the lane) reverse transcriptase and then subjected to PCR. Where no cDNA was produced because the reverse transcription reaction did not occur, no amplification was seen. Lane 1 is the negative control without RNA.
  • Example 2
  • the present example shows that DNA regions internal to GTBP gene can be obtained by amplification with primers designed on the basis of the sequence of peptides deriving from proteolytic cleavage of the 160 kDa G/T- binding factor (SEQ ID NO: 2 to 8) .
  • the inventors identified a unique DNA sequence encoding the central 8 amino acids of the peptide of SEQ ID NO: 6.
  • Example 3 shows that DNA regions internal to GTBP gene can be identified by hybridization with a DNA probe designed on the basis of the sequence of peptides obtained upon proteolytic cleavage of the 160 kDa G/T-binding factor.
  • The DNA sequence reported as SEQ ID NO: 11 was labeled with 32P by a standard kinase
  • the labelled probe of SEQ ID NO: 11 was then used in the screening of a commercial oligo dT-primed cDNA library in phage lambda (HeLa S3 Uni-ZAP XR, Stratagene) . Two positive clones were selected for further analysis.
  • Clone Cl contained an insert of 3980 bp corresponding to SEQ ID NO: 12, with a continuous open reading frame from amino acid residue 1 to 1292 encoding a polypeptide of 1292 amino acids (SEQ ID NO: 1) and a calculated molecular mass of 142 kDa; clone FLY5 contained sequences coding from amino acid residue 116 to 1292 (see comments to SEQ ID NO: 1 and 12).
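The calculated mass quoted for the 1292-residue polypeptide is consistent with a common back-of-the-envelope estimate. The sketch below assumes an average residue mass of about 110 Da, a rule of thumb that is not taken from the text; an exact value would require the actual amino acid composition. The helper names are illustrative.

```python
# Rule-of-thumb average mass of one amino acid residue (assumption).
AVERAGE_RESIDUE_MASS_DA = 110.0


def estimated_mass_kda(n_residues: int) -> float:
    """Approximate polypeptide mass in kDa from its residue count."""
    return n_residues * AVERAGE_RESIDUE_MASS_DA / 1000.0


def orf_length_nt(n_residues: int, include_stop: bool = True) -> int:
    """Length in nucleotides of an ORF encoding `n_residues` residues."""
    return 3 * (n_residues + (1 if include_stop else 0))
```

For 1292 residues this gives about 142 kDa, and the corresponding open reading frame (3879 nt including the stop codon) fits within the 3980 bp insert, leaving about 100 bp for untranslated sequence.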
  • GTBP protein can be used as an antigen to produce highly specific antibodies which recognize GTBP but not hMSH2.
  • PCR fragments corresponding to amino acid residues 27 to 158 of hMSH2 (SEQ ID NO: 13) and 750 to 928 of GTBP (SEQ ID NO: 14) were subcloned into the E. coli expression vector pGEX-3X (Pharmacia/LKB) and the recombinant proteins, in the form of fusion polypeptides with glutathione S-transferase, were induced and isolated as recommended by the manufacturer, except that the final concentration of IPTG was 0.25 mM and induced cultures were harvested after 6 hours at 20°C.
  • The fusion proteins were used for immunization of New Zealand White S.P.F. female rabbits (Charles River Co.) using standard protocols. Two polyclonal antisera specifically immunoreactive to GTBP and hMSH2, respectively, were obtained and assayed as reported in Antibodies: A Laboratory Manual (1988), eds. Harlow and Lane, Cold Spring Harbor Laboratories Press (see Figures 2 and 5 for more details).
  • Example 5
  • GTBP belongs to a class of DNA-repair proteins conserved over a wide evolutionary range.
  • Figure 3 shows the alignment of the amino acid sequences of the conserved C-terminal regions of the mismatch binding proteins GTBP (H. sapiens), hMSH2
  • GenBank entry HSU04045 (hMSH2) .
  • the alignment was carried out using the GCG Pileup option.
  • the figure was generated using Prettyplot.
  • the alignment reveals a high degree of conservation at the C-terminal domain among all the proteins. GTBP can thus be considered a new member of the
  • Figure 5 shows the effect of anti-hMSH2 and anti-GTBP antisera on the formation of the specific mismatch-binding complex.
  • This gel-shift analysis was carried out as described (15) , except that nuclear extracts were used (25) .
  • The antisera were added to the reaction mixtures 20 min prior to the radioactively-labelled probe.
  • the figure is an autoradiogram of a native 6% polyacrylamide gel run in TAE buffer.
  • the following example shows that GTBP and hMSH2 can be expressed separately in a cell-free translation system.
  • the inventors employed a hMSH2 cDNA clone (17) and the GTBP clones Cl and FLY5 as set forth in SEQ ID NO: 12.
  • the Cl and FLY5 ORFs were introduced into pCite- 2b.
  • the hMSH2 ORF was inserted into pCite-1 (Novagen) .
  • In vitro transcription and translation reactions were carried out as described previously (26), including a mock translation reaction in the absence of added DNA. 35S-labeled translation products were analyzed on an SDS-polyacrylamide gel treated with Amplify (Amersham), dried and autoradiographed.
  • the experiment was carried out using conditions recommended by the manufacturer.
  • The figure is an autoradiogram of a denaturing 7.5% SDS-polyacrylamide gel.
  • Fig. 6, section a: translation of hMSH2, GTBP (Cl) and FLY5 mRNAs in a reticulocyte lysate system (Promega) gave rise to polypeptides of 113, 142 and 122 kDa, respectively.
  • Mutations in mismatch repair genes such as hMSH2, hMLH1, hPMS1 and hPMS2 (1) are known to cause the hypermutability found in many forms of hereditary colorectal cancers (CRC).
  • the CRC-derived cell line HCT15 contains a full length hMSH2 protein but shows hypermutable phenotype (19) .
  • The RNA of this cell line was reverse transcribed with random hexamers and reverse transcriptase according to standard protocols (e.g., see Powell et al., New Engl. J. Med. 329, 1982, 1993).
  • the cDNA was then amplified with PCR using primers specific for the GTBP-coding sequence.
  • The oligonucleotides used were: primers 5'-PGAGGGTTACCCCTGG-3' and 5'-ACACTGTAAGTCTGTGTACC-3' for codons 32 to 458, primers 5'-PAGTGAAGGCCTGAACAGCC-3' and 5'-AAGTCCAGTCTTTCGAGCC-3' for codons 219 to 858, and primers 5'-PGAGAGGGTTGATACTTGCC-3' and 5'-AGAAGTCAACTCAAAGCTTCC-3' for codons 692 to 1292 (where P denotes a T7 promoter sequence and a ribosome-binding site for translation initiation (26) and codon numbers are those reported in SEQ ID NO: 1 and SEQ ID NO: 12).
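The three primer pairs define overlapping segments of the coding sequence. As a small illustrative check (the helper names are hypothetical, and the lengths refer to the coding span only, without the T7 promoter tag denoted P), the following sketch verifies that the segments tile codons 32 to 1292 without gaps and reports the coding length of each segment:

```python
# Codon ranges amplified by the three primer pairs, as listed in the text.
FRAGMENTS = [(32, 458), (219, 858), (692, 1292)]


def fragment_length_nt(first_codon: int, last_codon: int) -> int:
    """Coding length in nucleotides of an inclusive codon range."""
    return 3 * (last_codon - first_codon + 1)


def covers_without_gaps(fragments, start: int, end: int) -> bool:
    """Return True if the codon ranges cover start..end with no gap."""
    reach = start - 1  # last codon covered so far
    for first, last in sorted(fragments):
        if first > reach + 1:
            return False  # a gap precedes this fragment
        reach = max(reach, last)
    return reach >= end


lengths = [fragment_length_nt(first, last) for first, last in FRAGMENTS]
```

The overlaps (codons 219 to 458 and 692 to 858) mean that a truncating mutation near a segment boundary is still covered by at least one amplification product.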
  • The amplification products were first transcribed and translated in vitro using a commercial kit (Promega).
  • Analysis of translation products in an SDS-PAGE gel revealed truncated GTBP polypeptides from two PCR products, corresponding to regions located at codons 32-458 (5'-end of the gene) and 692-1292 (3'-end of the gene).
  • Sequencing of these PCR products using a commercial system (SequiTherm Polymerase, Epicentre Technologies) revealed that truncations were due to frameshift mutations.
  • nucleotide 664 (a C) at codon 222 changed a leucine to a termination codon and a substitution of nucleotides 3307-3312 (GATAGA) with T (see SEQ ID NO: 12) created a new termination codon several bp downstream.
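The positions given for these mutations can be related to codon numbers with simple coordinate arithmetic. The sketch below assumes that nucleotide 1 is the first base of codon 1, which is consistent with nucleotide 664 falling in codon 222 as stated; the function names are illustrative.

```python
def codon_of(nucleotide: int) -> int:
    """Return the 1-based codon number containing a 1-based nucleotide."""
    if nucleotide < 1:
        raise ValueError("nucleotide positions are 1-based")
    return (nucleotide - 1) // 3 + 1


def codon_span(codon: int) -> tuple:
    """Return (first, last) nucleotide positions of a 1-based codon."""
    if codon < 1:
        raise ValueError("codon numbers are 1-based")
    return (3 * codon - 2, 3 * codon)


def is_frameshift(ref_length: int, alt_length: int) -> bool:
    """True if replacing ref_length bases by alt_length bases shifts the frame."""
    return (alt_length - ref_length) % 3 != 0
```

Under this numbering, nucleotides 3307-3312 span codons 1103 and 1104, and replacing those six bases with a single T changes the length by five bases, which is not a multiple of three and therefore shifts the reading frame. The same arithmetic places codon 116, the first residue encoded by FLY5, at nucleotides 346-348, matching the start of the FLY5 insert.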
  • MT1 is an alkylation-resistant lymphoblastoid cell line with a biochemical deficiency similar to that of HCT15 (see Goldmacher et al., J. Biol. Chem. 261, 12462, 1986; Kat et al., Proc. Natl. Acad. Sci. USA 90, 6424, 1993).
  • the RNA of this cell line was reverse transcribed with random hexamers and reverse transcriptase and the cDNA was then amplified with PCR using primers specific for the GTBP- coding sequence as reported above.
  • The amplification products were cloned in the vector Bluescript SK and individual clones were sequenced using conventional protocols (Sequenase, USB).
  • the two mutations were not found to be associated in a single clone, deriving thus from separate alleles.
  • A tumor cell line termed 543X (from the patient's designation) was derived from CRC and displays a hypermutable phenotype and microsatellite instability but no mutation in the mismatch repair genes so far described, including hMSH2, hMLH1, hPMS1 and hPMS2 (Liu et al., Nature Genetics 9, 48, 1995).
  • RNA of this cell line was reverse transcribed with random hexamers and reverse transcriptase and the cDNA was then amplified with PCR using primers specific for the GTBP-coding sequence as reported above.
  • MOLECULE TYPE protein
  • HYPOTHETICAL No
  • ANTISENSE No
  • ORGANISM Homo sapiens
  • IMMEDIATE SOURCE cDNA clone pCITE2b- Cl
  • FEATURE SEQ ID NO: 1 shows the 1292 amino acid sequence (in three letter code) of GTBP encoded by clone Cl (see SEQ ID NO: 12).
  • The seven oligopeptides which were identified upon proteolytic cleavage of GTBP (see SEQ ID NO: 2 to 8) are underlined.
  • The first amino acid residue of the peptide encoded by the FLY5 cDNA is Asn at position 116.
  • NAME Cl
  • Lys Ile Leu Lys Gln Val Ile Ser Leu Gln Thr Lys Asn Pro Glu Gly
  • Phe Leu Tyr Lys Ile Lys Gly Ala Cys Pro Lys Ser Tyr Gly Phe 1220 1225 1230 Asn Ala Ala Arg Leu Ala Asn Leu Pro Glu Glu Val Ile Gln Lys Gly
  • ORGANISM Homo sapiens
  • FEATURE SEQ ID NO: 2 to 8 show seven oligopeptides derived from proteolytic cleavage of GTBP extracted from HeLa cells and purified as described in ref. 16.
  • the peptide corresponding to SEQ ID NO: 6 (18 amino acids) was selected to design two degenerate primers corresponding to the N- and C-terminal sequences of the peptide, as given in detail in SEQ ID NO: 9 and 10.
  • NAME FR44
  • C IDENTIFICATION METHOD: Experimentally
  • MOLECULE TYPE protein
  • HYPOTHETICAL No
  • ANTISENSE No
  • ORIGINAL SOURCE (A) ORGANISM: Homo sapiens
  • MOLECULE TYPE protein
  • HYPOTHETICAL No
  • ANTISENSE No
  • ORIGINAL SOURCE
  • ORGANISM Homo sapiens
  • ix FEATURE: see SEQ ID NO: 2
  • A NAME: FR69
  • SEQ ID NO: 9 shows the sequence of the degenerate single-stranded DNA primer deduced from the N-terminus of the oligopeptide shown in SEQ ID NO: 6. Together with SEQ ID NO: 10, the two primers were used to amplify poly-A RNA extracted from HeLa cells.
  • the expected 67 base pairs (bp) fragment was cloned in pBluescript SK ⁇ (Stratagene) and sequenced with a commercial T7-polymerase based kit (Pharmacia) .
  • the 54 bp sequence of the resulting fragment, obtained after subtraction of the engineered cloning sites, is shown as SEQ ID NO: 11.
  • (A) NAME: oligo 5' sense
  • C IDENTIFICATION METHOD: Polyacrylamide gel
  • IMMEDIATE SOURCE oligonucleotide synthesizer
  • FEATURE SEQ ID NO: 10 shows the sequence of the degenerate single-stranded DNA primer deduced from the C-terminus of the oligopeptide shown in SEQ
  • SEQ ID NO: 11 shows the double-stranded DNA sequence encoding the oligopeptide reported in SEQ ID NO: 6, as deduced by sequencing of the cloned amplification product. This fragment was derived from PCR amplification of HeLa cDNA, using the degenerate primers described in SEQ ID NO:
  • MOLECULE TYPE synthetic DNA
  • HYPOTHETICAL No
  • ANTISENSE No
  • IMMEDIATE SOURCE cDNA clone C1
  • FEATURE SEQ ID NO: 12 shows the 3980 bp cDNA sequence of clone C1.
  • the cDNA insert of clone FLY5 spanned from nucleotide 346 to 3980 of the C1 sequence as reported in SEQ ID NO: 12.
  • TATCCCCCAG TACAAGTTTT ATTTGAAAAA GGAAATCTCT CAAAGGAAAC TAAAACAATT 1620
  • CAGGTCATCT CTCTGCAGAC AAAAAATCCT GAAGGTCGTT TTCCTGATTT GACTGTAGAA 2520
  • AAAACTATTG AAAAGAAGTT GGCTAATCTC ATAAATGCTG AAGAACGGAG GGATGTATCA 2880
  • SEQ ID NO: 13 shows the double-stranded DNA sequence used to express an internal domain of hMSH2 (corresponding to amino acid residues 27 to 158) in the expression vector pGEX-3x (see also legend to Figure 2) .
  • FEATURE SEQ ID NO: 14 shows the double-stranded DNA sequence used to express an internal domain of GTBP (corresponding to amino acid residues 750 to 928) in the expression vector pGEX-3x (see also legend to Figure 2).
  • NAME GST/GTBP
  • C IDENTIFICATION METHOD: Polyacrylamide gel
  • SEQ ID NO: 15 shows the amino-terminal sequence of 68 amino acids of GTBP encoded by the clone TASNR2A1 (see SEQ ID NO: 16 for the corresponding nucleotide encoding sequence).
  • the amino acid sequence SEQ ID NO: 15 (corresponding to residues 1-68) must be placed in front of the amino acid in position 1 of the sequence given in SEQ ID NO: 1 (corresponding to 1292 residues) to obtain the complete GTBP sequence of 1360 amino acids.
  • the nucleotide sequence SEQ ID NO: 16 (corresponding to 204 residues) must be positioned in front of the nucleotide in position 1 of the sequence given in SEQ ID NO: 12 (corresponding to 3980 residues) in order to obtain the complete GTBP-encoding sequence of 4080 nucleotides.
  • NAME KMN
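
The lengths and coordinates quoted in the sequence-listing excerpts above can be cross-checked with simple codon arithmetic. The Python sketch below is illustrative only and is not part of the patent. It assumes that the open reading frame of clone C1 begins at nucleotide 1 of SEQ ID NO: 12 and that the standard genetic code applies. The helper names `translate` and `codon_start` are hypothetical.

```python
# Standard genetic code, built from the conventional TCAG base ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a DNA string (spaces allowed) into one-letter amino acids."""
    dna = dna.replace(" ", "").upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def codon_start(residue: int, orf_start: int = 1) -> int:
    """1-based nucleotide position of the first base of a residue's codon.

    Assumption: the reading frame starts at `orf_start` (nucleotide 1 here).
    """
    return orf_start + 3 * (residue - 1)


# First 36 nucleotides of the SEQ ID NO: 12 line that ends at position 2520.
fragment = "CAGGTCATCT CTCTGCAGAC AAAAAATCCT GAAGGT"
peptide = translate(fragment)

checks = {
    "FLY5 starts at Asn 116 (nt 346)": codon_start(116) == 346,
    "18-residue peptide -> 54 bp": 18 * 3 == 54,
    "68 + 1292 residues = 1360": 68 + 1292 == 1360,
    "68 residues -> 204 nt": 68 * 3 == 204,
    "1360 codons -> 4080 nt": 1360 * 3 == 4080,
}

for label, ok in checks.items():
    print(f"{label}: {'consistent' if ok else 'MISMATCH'}")
print(peptide)  # QVISLQTKNPEG
```

Under these assumptions the translated fragment reads Gln Val Ile Ser Leu Gln Thr Lys Asn Pro Glu Gly, which agrees with the amino acid excerpt of SEQ ID NO: 1 quoted above. The 4080-nucleotide figure corresponds to 1360 codons, that is, to the coding region rather than to the full cDNA length.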

Landscapes

  • Chemical & Material Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Organic Chemistry (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Genetics & Genomics (AREA)
  • Biochemistry (AREA)
  • Biophysics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Zoology (AREA)
  • Medicinal Chemistry (AREA)
  • Immunology (AREA)
  • Oncology (AREA)
  • Engineering & Computer Science (AREA)
  • Wood Science & Technology (AREA)
  • Gastroenterology & Hepatology (AREA)
  • Pathology (AREA)
  • Analytical Chemistry (AREA)
  • Physics & Mathematics (AREA)
  • Hospice & Palliative Care (AREA)
  • Biotechnology (AREA)
  • Microbiology (AREA)
  • Toxicology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • Peptides Or Proteins (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

A new protein termed GTBP (guanine-thymine binding protein), capable of binding G/T mismatches in DNA in order to repair genetic information; methods for detecting this protein; the nucleotide sequence coding for this protein; and methods for preparing said protein using genetic engineering techniques. Also described is the detection of the mutant GTBP gene in tumor tissues, with the aim of preventing colorectal tumors in humans and enabling their rapid diagnosis. The figure shows the absence of GTBP-specific activity in cells obtained from human colorectal tumors.
PCT/IT1996/000131 1995-06-27 1996-06-27 Polypeptide for repairing genetic information, nucleotide sequence coding for this polypeptide, and process for its preparation (guanine-thymine binding protein - GTBP) WO1997001634A2 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU62412/96A AU6241296A (en) 1995-06-27 1996-06-27 Polypeptide for repairing genetic information, nucleotidic sequence which codes for it and process for the preparation thereof (guanine thymine binding protein - gtbp)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
IT95RM000434A IT1278118B1 (it) 1995-06-27 1995-06-27 Polypeptide for the repair of genetic information, nucleotide sequence coding for it and process for its
ITRM95A000434 1995-06-27

Publications (2)

Publication Number Publication Date
WO1997001634A2 WO1997001634A2 (fr) 1997-01-16
WO1997001634A3 WO1997001634A3 (fr) 1997-04-03

Family

ID=11403448

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IT1996/000131 WO1997001634A2 (fr) 1995-06-27 1996-06-27 Polypeptide for repairing genetic information, nucleotide sequence coding for this polypeptide, and process for its preparation (guanine-thymine binding protein - GTBP)

Country Status (3)

Country Link
AU (1) AU6241296A (fr)
IT (1) IT1278118B1 (fr)
WO (1) WO1997001634A2 (fr)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1999019492A3 (fr) * 1997-10-10 1999-06-24 Rhone Poulenc Agrochimie Methods for obtaining plant varieties
WO2002020783A1 (fr) * 2000-06-07 2002-03-14 Shanghai Biowindow Gene Development Inc. Novel polypeptide, human base mismatch repair protein 13.2, and polynucleotide encoding this polypeptide
CN119285699A (zh) * 2024-11-13 2025-01-10 盐城工学院 Antibacterial polypeptide derived from walnut meal and application thereof

Non-Patent Citations (10)

* Cited by examiner, † Cited by third party
Title
GENOMICS, vol. 31, no. 3, 1 February 1996, pages 395-397, XP000613760 N.C. NICOLAIDES ET AL.: "Molecular cloning of the N-terminus of GTBP" *
JOURNAL OF BIOLOGICAL CHEMISTRY, vol. 267, no. 33, 25 November 1992, pages 23876-23882, XP000615520 M.J. HUGHES ET AL.: "The purification of a human mismatch-binding protein and identification of its associated ATPase and helicase activities." cited in the application *
NATURE MEDICINE, vol. 2, no. 2, February 1996, pages 169-174, XP000615507 B. LIU ET AL.: "Analysis of mismatch repair genes in hereditary non-polyposis colorectal cancer patients." *
NATURE, vol. 367, 3 February 1994, page 417 XP000615506 F. PALOMBO ET AL.: "Mismatch repair and cancer." cited in the application *
PROC. NATL. ACAD. SCI. USA, vol. 85, no. 23, December 1988, pages 8860-8864, XP000615500 J. JIRICNY: "A human 200-kDa protein binds selectively to DNA fragments containing G.T mismatches." cited in the application *
PROC. NATL. ACAD. SCI. USA., vol. 91, September 1994, pages 8905-8909, XP000615501 G. AQUILINA ET AL.: "A mismatch recognition defect in colon carcinoma confers DNA microsatellite instability and a mutator phenotype." cited in the application *
SCIENCE, vol. 268, 30 June 1995, pages 1912-1914, XP000615504 F. PALOMBO ET AL.: "GTBP, a 160-Kilodalton protein essential for mismatch-binding activity in human cells." *
SCIENCE, vol. 268, 30 June 1995, pages 1915-1917, XP000615505 N. PAPADOPOULOS ET AL.: "Mutations of GTBP in genetically unstable cells." *
THE JOURNAL OF BIOLOGICAL CHEMISTRY, vol. 264, no. 35, 15 December 1989, pages 21177-21182, XP000371702 C. STEPHENSON ET AL.: "Selective binding to DNA base pair mismatches by proteins from human cells." *
THE JOURNAL OF BIOLOGICAL CHEMISTRY, vol. 269, no. 20, 20 May 1994, pages 14367-14370, XP000615519 A. UMAR ET AL.: "Defective mismatch repair in extracts of colorectal and endometrial cancer cell lines exhibiting microsatellite instability." cited in the application *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1999019492A3 (fr) * 1997-10-10 1999-06-24 Rhone Poulenc Agrochimie Methods for obtaining plant varieties
US6734019B1 (en) 1997-10-10 2004-05-11 Aventis Cropscience S.A Isolated DNA that encodes an Arabidopsis thaliana MSH3 protein involved in DNA mismatch repair and a method of modifying the mismatch repair system in a plant transformed with the isolated DNA
WO2002020783A1 (fr) * 2000-06-07 2002-03-14 Shanghai Biowindow Gene Development Inc. Novel polypeptide, human base mismatch repair protein 13.2, and polynucleotide encoding this polypeptide
CN119285699A (zh) * 2024-11-13 2025-01-10 盐城工学院 Antibacterial polypeptide derived from walnut meal and application thereof

Also Published As

Publication number Publication date
ITRM950434A1 (it) 1996-12-27
WO1997001634A3 (fr) 1997-04-03
IT1278118B1 (it) 1997-11-17
AU6241296A (en) 1997-01-30
ITRM950434A0 (it) 1995-06-27

Similar Documents

Publication Publication Date Title
AU664834B2 (en) Gene mutated in colorectal cancer of humans
US5352775A (en) APC gene and nucleic acid probes derived therefrom
US6191268B1 (en) Compositions and methods relating to DNA mismatch repair genes
JP3944654B2 (ja) Neuronal apoptosis inhibitory protein and its gene sequence, and mutations of said gene that cause spinal muscular atrophy
JPH11502404A (ja) Development of DNA probes and immunological reagents specific for cell-surface-expressed molecules and transformation-associated genes
WO1993025713A9 (fr) Compositions and methods for detecting gene rearrangements and translocations
WO1993025713A1 (fr) Compositions and methods for detecting gene rearrangements and translocations
EP0756488B1 (fr) Detection of alterations in tumor suppressor genes for cancer diagnosis
US20030190639A1 (en) Genes involved in intestinal inflamatory diseases and use thereof
KR100828506B1 (ko) Mouse spermatogenesis gene and human male infertility-associated gene, and diagnostic system using the same
CA2176819A1 (fr) Method for detecting alterations in the DNA mismatch repair process
WO1995021943A1 (fr) DNA encoding ATP-dependent K+ channel proteins, and uses thereof
EP0633268B1 (fr) MDC proteins and DNAs encoding them
CA2554380C (fr) MECP2E1 gene
JP2003274978A (ja) Compositions and methods for tumor treatment
US8003764B2 (en) Folliculin-specific antibodies and methods of detection
WO1999001550A1 (fr) Method for detecting alterations in MSH5
US20030027300A1 (en) Human hairless gene and protein
AU710551B2 (en) Nucleic acid encoding a nervous tissue sodium channel
WO1997001634A2 (fr) Polypeptide for repairing genetic information, nucleotide sequence coding for this polypeptide, and process for its preparation (guanine-thymine binding protein - GTBP)
US6127128A (en) Diagnosis of primary congenital glaucoma
WO1997012973A1 (fr) Human cyclin I and gene encoding it
US20030138928A1 (en) Tumor suppressor gene and methods for detection of cancer, monitoring of tumor progression and cancer treatment
KR101093508B1 (ko) Composition for diagnosing colorectal cancer and use thereof
WO1998006871A1 (fr) Materials and methods relating to the diagnosis and prophylactic and therapeutic treatment of papillary renal carcinoma

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AU CA CN JP MX US

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): AT BE CH DE DK ES FI FR GB GR IE IT LU MC NL PT SE

DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
121 Ep: the epo has been informed by wipo that ep was designated in this application
122 Ep: pct application non-entry in european phase
NENP Non-entry into the national phase

Ref country code: CA
