WO1993008303A1

WO1993008303A1 - Gene identification

Info

Publication number: WO1993008303A1
Application number: PCT/GB1992/001952
Authority: WO
Inventors: Walter Fred Bodmer; William Ernest Pullman; Helga Durbin
Original assignee: Imperial Cancer Research Technology Limited
Priority date: 1991-10-23
Filing date: 1992-10-23
Publication date: 1993-04-29
Also published as: GB9122501D0

Abstract

A method of identifying genes conferring adhesive properties on cells. A particular gene (CAR) found thereby. The gene function is lost in at least some tumour cells, especially metastatic cells. The protein can be used to make antibodies useful in detecting the presence or absence of the protein; genetic assays can test for loss of the gene function. In both cases, clinically important information about the tumour is gained. Cell lines deficient in the protein function are useful to identify compounds capable of restoring the function. The function may alternatively be restored by administering the protein or by transfecting the cells with the gene.

Description

GENE IDENTIFICATION

Background

The present invention relates to the identification of useful genes, proteins produced thereby and uses of such proteins.

Some mammalian cells adhere to other cells and to tissue such as extracellular matrix (ECM) and stroma. When cancer cells metastasize, they become detached from the original tumour and spread to other parts of the body, where they can lodge and develop into secondary tumours. The adhesive properties of the cells are thought to be important in this process but quite how the cells lose or acquire adhesive properties or alter the specificity of their adhesive properties is poorly understood.

Summary of the Invention

We have devised a method by which genes involved in modulating such adhesion may be identified. Using this method, we have identified a gene, the loss of the function of which causes cells to lose their ability to adhere to substrates. We have called this gene Cell Adhesion Regulator (CAR). A very high proportion of breast tumours which have a low survival rate have lost this function and such loss appears to be important independently of the stage or size of the tumour. Thus, by determining whether the gene function has been lost, one can identify one factor in deciding the manner of treating the tumour. The present invention encompasses a method for finding the gene, the gene itself, the protein encoded thereby, methods of producing that protein, the preparation of antibodies specific for the protein and the use of those antibodies and genetic assays in identifying the characteristics of a tumour cell. The invention also provides, in another major aspect, an assay for identifying compounds which will restore the adhesive properties of cells which have lost the function of this gene. Such compounds are potentially useful in the amelioration of metastasis, as is the restoration to the cell of the gene function by gene therapy or administration of the polypeptide.

Description of the Invention and Preferred Embodiments thereof

A first aspect of the invention provides a method of identifying a gene useful in modulating the adhesion of cells to substrates, the method comprising: (i) selecting a cell population with a given level of adhesion to the substrate (termed hereinafter the "original cells"), (ii) transfecting at least one original cell with a gene to provide expression of the gene in the cell or with antisense DNA corresponding to a gene, (iii) cloning thus-transfected cells, and (iv) determining whether the transfected cells adhere to the substrate more or less than the original cells.

The substrate may be cells, such as epithelial or endothelial cells, or it may be an extracellular matrix component such as a collagen (for example Type I, III or IV collagen), laminin, fibronectin or a proteoglycan, or it may be a synthetic basement membrane such as "Matrigel".

Suitable cells in which to conduct the assay are those which enable high copy number plasmid replication whilst allowing selection in an adhesion or panning assay by virtue of acceptably low levels of constitutive binding. In other words, the cells should exhibit low enough binding to enable discrimination of specific effect in cells transfected with the desired cDNA from the background, without at best requiring a large number of selection and amplification cycles:

A second aspect of the invention provides an isolated gene identified by the method above. Preferably, the gene is the isolated Cell Adhesion Regulator (CAR) gene shown below and variations thereof:

atggggaaggttatcagtgcttcccgagtgagcatggaacacttcgagttccagggt tatagacagtcgttcccagtgtggctgaggccacccagaggcagcagagcattcaga ctccaaacagacccctgttcatgccgacgcttgcacgaccgccccagttcctgtggc tccctcggaatgctaaggggatcggacatgaaaggaccctgtgagccgattgtccta tctccagcggccctgtcatccagctcactcatcaatggggccagtcaggcccaggca ctgggctccggaggactcaccactgccccctgctgccatgtggactggtgcaagttg aggacttcttgctggtctagtcacgcatgcagtgttggggatgccttggtttttact gctctgagaattgttgagatactttactaataaactgtgtagttgg

(In the application (GB 9122501.1) from which this present application claims priority, there are three errors in the nucleotide sequence. These have been corrected in the present application. In the priority application, position 142 in the nucleotide sequence is a T but is a C in the present application; in the present application an additional G is inserted between positions 267 and 268 of the nucleotide sequence in the priority application; and the C at position 330 in the nucleotide sequence in the priority application has been deleted in the present application.)
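Purely by way of illustration, the coding potential of the nucleotide sequence given above can be checked by a simple in-frame translation. The following Python sketch is not part of the original disclosure: the names `CODON_TABLE`, `CAR_CDNA` and `translate` are hypothetical, the standard genetic code is assumed, and translation is assumed to start at the first base of the sequence as printed.

```python
from itertools import product

# Standard genetic code, built in TCAG order (assumed to apply here).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# CAR sequence as printed in this specification, kept in the same blocks.
CAR_CDNA = (
    "atggggaaggttatcagtgcttcccgagtgagcatggaacacttcgagttccagggt"
    "tatagacagtcgttcccagtgtggctgaggccacccagaggcagcagagcattcaga"
    "ctccaaacagacccctgttcatgccgacgcttgcacgaccgccccagttcctgtggc"
    "tccctcggaatgctaaggggatcggacatgaaaggaccctgtgagccgattgtccta"
    "tctccagcggccctgtcatccagctcactcatcaatggggccagtcaggcccaggca"
    "ctgggctccggaggactcaccactgccccctgctgccatgtggactggtgcaagttg"
    "aggacttcttgctggtctagtcacgcatgcagtgttggggatgccttggtttttact"
    "gctctgagaattgttgagatactttactaataaactgtgtagttgg"
).upper()


def translate(dna):
    """Translate from the first base and stop at the first in-frame stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


car_protein = translate(CAR_CDNA)
stop_index = 3 * len(car_protein)          # 0-based index of the stop codon
tail_length = len(CAR_CDNA) - stop_index   # nucleotides from the stop codon to the 3' end

print("cDNA length:", len(CAR_CDNA))
print("codons before first stop:", len(car_protein))
print("stop codon:", CAR_CDNA[stop_index:stop_index + 3])
print("nucleotides from stop codon to 3' end:", tail_length)
print(car_protein)
```

With the sequence as printed, the reading frame should run for 142 codons before the first stop codon, leaving 19 nucleotides (stop codon included) at the 3' end.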

As used herein, the term "isolated" means that the gene is in isolation from at least most of the human chromosome on which it is found (16q24 in the case of the CAR gene), in other words the gene is not claimed in the form in which it has previously existed. Thus, the gene of the invention includes the gene when that gene has been cloned into a bacterial vector, such as a plasmid, or into a viral vector that may be harboured by a bacteriophage provided that such clones are in isolation from clones constituting a DNA library of the relevant chromosome.

The "gene" may comprise the promoter and/or other expression-regulating sequences which normally govern its expression and it may comprise introns, or it may consist of the coding sequence only, for example a cDNA sequence.

A "variation" of the gene is one which is (i) usable to produce a protein or a fragment thereof which is in turn usable to prepare antibodies which specifically bind to the protein encoded by the said gene or (ii) an antisense sequence corresponding to the gene or to a variation of type (i) as just defined. For example, different codons can be substituted which code for the same amino acid(s) as the original codons. Alternatively, the substitute codons may code for a different amino acid that will not affect the activity or immunogenicity of the protein or which may improve its activity or immunogenicity. For example, site-directed mutagenesis or other techniques can be employed to create single or multiple mutations, such as replacements, insertions, deletions, and transpositions, as described in Botstein and Shortle, "Strategies and Applications of In Vitro Mutagenesis," Science, 229: 1193-1201 (1985), which is incorporated herein by reference. Since such modified genes can be obtained by the application of known techniques to the teachings contained herein, such modified genes are within the scope of the claimed invention. If a mutant form of the adhesion gene is positively responsible for the loss of adhesion (ie rather than just loss of the gene), then the antisense DNA may be administered to remove the mutant gene.

Moreover, it will be recognised by those skilled in the art that the gene sequence (or fragments thereof) of the invention can be used to obtain other DNA sequences that hybridise with it under conditions of high stringency. Such DNA includes any genomic DNA. Accordingly, the gene of the invention includes DNA that shows at least 55 per cent, preferably 60 per cent, and most preferably 70 per cent homology with the gene identified in the method of the invention, provided that such homologous DNA encodes a protein which is usable in the methods described below. For example, the 3' 19 nucleotides of the CAR gene may be omitted as they begin with a stop codon.

"Variations" of the gene include genes in which relatively short stretches (for example 20 to 50 nucleotides) have a high degree of homology (at least 50% and preferably at least 90 or 95%) with equivalent stretches of the gene of the invention even though the overall homology between the two genes may be much less. This is because important active or binding sites may be shared even when the general architecture of the protein is different. An insertion polymorphism has been found in the CAR gene. The 4-bp insertion, CACA, is between the A at position 271 and the G at position 272 and leads to the formation of a polypeptide, shorter than the wild-type CAR protein, which may or may not be functional. Of course, if the mutant gene product has the wild-type function, antisense DNA would not be administered to remove this mutant gene.

Hereinafter, the term "gene" will be used to embrace all such variations and fragments.
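The percentage homology figures referred to above may be illustrated by a simple identity count. The specification does not state how homology is to be computed, so the following Python sketch rests on assumptions: homology is taken as ungapped percentage identity, and the function names `percent_identity` and `best_window_identity` are hypothetical. A practical comparison would more usually rely on a gapped alignment.

```python
def percent_identity(a, b):
    """Percentage of matching positions between two equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def best_window_identity(query, reference, window=20):
    """Highest ungapped identity between any `window`-long stretch of the
    query and any `window`-long stretch of the reference (brute force)."""
    best = 0.0
    for i in range(len(query) - window + 1):
        for j in range(len(reference) - window + 1):
            score = percent_identity(query[i:i + window], reference[j:j + window])
            best = max(best, score)
    return best


if __name__ == "__main__":
    # Toy sequences for illustration only; not taken from any real gene.
    reference = "ATGGGGAAGGTTATCAGTGCTTCCCGAGTG"
    query = "CCCC" + reference[3:23] + "TTTT"
    print(best_window_identity(query, reference, window=20))
```

The brute-force scan is quadratic in sequence length, which is acceptable for short stretches of 20 to 50 nucleotides but not for whole-genome comparisons.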

The gene may be used, when included in a suitable expression sequence, to prepare the protein or fragment thereof.

Alternatively, the gene and the knowledge of its sequence may be used in the preparation of nucleotide probes and primers, useful in the analysis of human chromosomes and parts thereof, as is discussed further below.

A third aspect of the invention provides a protein encoded by the gene of the invention and variants and fragments of such proteins.

Preferably, the protein is the Cell Adhesion Regulator (CAR) protein having the following sequence and variants and fragments of that protein:

MGKVISASRVSMEHFEFQGYRQSFPVWLRPPRGSRAFRLQTDPCSCRRLHDRPSSCGSLGMLRGSDMKGPCEPIVLSPAALSSSSLINGASQAQALGSGGLTTAPCCHVDWCKLRTSCWSSHACSVGDALVFTALRIVEILY

(As described above, the nucleotide sequence of the CAR gene in this present application has been corrected. The amino acid sequence for the CAR protein deduced from the corrected nucleotide sequence, shown above, differs from the amino acid sequence given in the priority application. In the CAR amino acid sequence shown in the priority application, the residues from positions 90 to 110 are PVRPRHWAPEDSPLPPAAMWT.)

"Fragments" and "variants" are those which are useful to prepare antibodies which will specifically bind the said protein or mutant forms thereof lacking the function of the native protein. Such variants and fragments will usually include at least one region of at least five consecutive amino acids which has at least 90% homology with the most homologous five or more consecutive amino acids region of the said protein. A fragment is less than 100% of the whole protein. For example, as a fragment of the CAR protein, amino acids 61-72 may be prepared as a synthetic peptide having the sequence H₂N-Met-Leu-Arg-Gly-Ser-Asp-Met-Lys-Gly-Pro-Cys-Glu-COOH.

It will be recognised by those skilled in the art that the polypeptide of the invention may be modified by known protein modification techniques. These include the techniques disclosed in US Patent No 4,302,386 issued 24 November 1981 to Stevens, incorporated herein by reference. Such modifications may enhance the immunogenicity of the antigen, or they may have no effect on such immunogenicity. For example, a few amino acid residues may be changed. Alternatively, the antigen of the invention may contain one or more amino acid sequences that are not necessary to its immunogenicity. Unwanted sequences can be removed by techniques well known in the art. For example, the sequences can be removed via limited proteolytic digestion using enzymes such as trypsin or papain or related proteolytic enzymes.

Alternatively, polypeptides corresponding to antigenic parts of the protein may be chemically synthesised by methods well known in the art. These include the methods disclosed in US Patent No 4,290,944 issued 22 September 1981 to Goldberg, incorporated herein by reference.
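As an illustration of how such a synthetic peptide might be specified, the 61-72 fragment mentioned above can be cut out of the CAR amino acid sequence and written in three-letter notation. The Python sketch below is illustrative only: `CAR_PROTEIN` is the sequence obtained by translating the nucleotide sequence of this specification under the standard genetic code, and the helper names `fragment` and `as_peptide` are hypothetical.

```python
# One-letter to three-letter amino acid codes.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}

# CAR protein as deduced from the nucleotide sequence (142 residues).
CAR_PROTEIN = (
    "MGKVISASRVSMEHFEFQG"
    "YRQSFPVWLRPPRGSRAFR"
    "LQTDPCSCRRLHDRPSSCG"
    "SLGMLRGSDMKGPCEPIVL"
    "SPAALSSSSLINGASQAQA"
    "LGSGGLTTAPCCHVDWCKL"
    "RTSCWSSHACSVGDALVFT"
    "ALRIVEILY"
)


def fragment(protein, first, last):
    """Return residues first..last, numbered from 1 and inclusive."""
    if not 1 <= first <= last <= len(protein):
        raise ValueError("residue range lies outside the protein")
    return protein[first - 1:last]


def as_peptide(sequence):
    """Write a one-letter sequence as a free peptide in three-letter code."""
    return "H2N-" + "-".join(THREE_LETTER[aa] for aa in sequence) + "-COOH"


peptide = as_peptide(fragment(CAR_PROTEIN, 61, 72))
print(peptide)
```

The 1-based, inclusive numbering in `fragment` follows the residue numbering used in the text, so the call mirrors the wording "amino acids 61-72".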

Thus, the protein of the invention includes a class of modified polypeptides, including synthetically derived polypeptides or fragments of the original protein, having common elements of origin, structure, and immunogenicity that are within the scope of the present invention.

A fourth aspect of the invention provides a method of producing the said protein by expressing a corresponding nucleotide sequence in a suitable host cell or by amino acid synthesis.

Thus, the gene of the invention may be used in accordance with known techniques, appropriately modified in view of the teachings contained herein, to construct an expression vector, which is then used to transform an appropriate host cell for the expression and production of the polypeptide of the invention. Such techniques include those disclosed in US Patent Nos. 4,440,859 issued 3 April 1984 to Rutter et al, 4,530,901 issued 23 July 1985 to Weissman, 4,582,800 issued 15 April 1986 to Crowl, 4,677,063 issued 30 June 1987 to Mark et al, 4,678,751 issued 7 July 1987 to Goeddel, 4,704,362 issued 3 November 1987 to Itakura et al, 4,710,463 issued 1 December 1987 to Murray, 4,757,006 issued 12 July 1988 to Toole, Jr. et al, 4,766,075 issued 23 August 1988 to Goeddel et al and 4,810,648 issued 7 March 1989 to Stalker, all of which are incorporated herein by reference.

The gene of the invention may be joined to a wide variety of other DNA sequences for introduction into an appropriate host. The companion DNA will depend upon the nature of the host, the manner of the introduction of the DNA into the host, and whether episomal maintenance or integration is desired.

Generally, the gene, preferably as cDNA, is inserted into an expression vector, such as a plasmid, in proper orientation and correct reading frame for expression. If necessary, the DNA may be linked to the appropriate transcriptional and translational regulatory control nucleotide sequences recognised by the desired host, although such controls are generally available in the expression vector. The vector is then introduced into the host through standard techniques. Generally, not all of the hosts will be transformed by the vector. Therefore, it will be necessary to select for transformed host cells. One selection technique involves incorporating into the expression vector a DNA sequence, with any necessary control elements, that codes for a selectable trait in the transformed cell, such as antibiotic resistance. Alternatively, the gene for such selectable trait can be on another vector, which is used to co-transform the desired host cell.

Host cells that have been transformed by the recombinant DNA of the invention are then cultured for a sufficient time and under appropriate conditions known to those skilled in the art in view of the teachings disclosed herein to permit the expression of the polypeptide, which can then be recovered.

Many expression systems are known, including bacteria (for example E. coli and Bacillus subtilis), yeasts (for example Saccharomyces cerevisiae), filamentous fungi (for example Aspergillus), plant cells, animal cells and insect cells.

Those vectors that include a replicon such as a procaryotic replicon can also include an appropriate promoter such as a procaryotic promoter capable of directing the expression (transcription and translation) of the genes in a bacterial host cell, such as E. coli, transformed therewith.

A promoter is an expression control element formed by a DNA sequence that permits binding of RNA polymerase and transcription to occur. Promoter sequences compatible with exemplary bacterial hosts are typically provided in plasmid vectors containing convenient restriction sites for insertion of a DNA segment of the present invention.

Typical procaryotic vector plasmids are pUC8, pUC9, pBR322 and pBR329 available from Biorad Laboratories (Richmond, CA, USA) and pPL and pKK223 available from Pharmacia (Piscataway, NJ, USA).

A variety of methods have been developed to operatively link DNA to vectors via complementary cohesive termini. For instance, complementary homopolymer tracts can be added to the DNA segment to be inserted to the vector DNA. The vector and DNA segment are then joined by hydrogen bonding between the complementary homopolymeric tails to form recombinant DNA molecules.

Synthetic linkers containing one or more restriction sites provide an alternative method of joining the DNA segment to vectors. The DNA segment, generated by endonuclease restriction digestion as described earlier, is treated with bacteriophage T4 DNA polymerase or E. coli DNA polymerase I, enzymes that remove protruding, 3'-single-stranded termini with their 3'-5'-exonucleolytic activities, and fill in recessed 3'-ends with their polymerizing activities.

The combination of these activities therefore generates blunt-ended DNA segments. The blunt-ended segments are then incubated with a large molar excess of linker molecules in the presence of an enzyme that is able to catalyze the ligation of blunt-ended DNA molecules, such as bacteriophage T4 DNA ligase. Thus, the products of the reaction are DNA segments carrying polymeric linker sequences at their ends. These DNA segments are then cleaved with the appropriate restriction enzyme and ligated to an expression vector that has been cleaved with an enzyme that produces termini compatible with those of the DNA segment.

Synthetic linkers containing a variety of restriction endonuclease sites are commercially available from a number of sources including International Biotechnologies Inc, New Haven, CT, USA.

Exemplary genera of yeast contemplated to be useful in the practice of the present invention are Pichia, Saccharomyces, Kluyveromyces, Candida, Torulopsis, Hansenula, Schizosaccharomyces, Citeromyces, Pachysolen, Debaryomyces, Metschnikowia, Rhodosporidium, Leucosporidium, Botryoascus, Sporidiobolus, Endomycopsis, and the like. Preferred genera are those selected from the group consisting of Pichia, Saccharomyces, Kluyveromyces, Yarrowia and Hansenula, because the ability to manipulate the DNA of these yeasts has, at present, been more highly developed than for the other genera mentioned above. Examples of Saccharomyces are Saccharomyces cerevisiae, Saccharomyces italicus and Saccharomyces rouxii. Examples of Kluyveromyces are Kluyveromyces fragilis and Kluyveromyces lactis. Examples of Hansenula are Hansenula polymorpha, Hansenula anomala and Hansenula capsulata. Yarrowia lipolytica is an example of a suitable Yarrowia species.

Yeast cells can be transformed by: (a) digestion of the cell walls to produce spheroplasts; (b) mixing the spheroplasts with transforming DNA (derived from a variety of sources and containing both native and non-native DNA sequences); and (c) regenerating the transformed cells. The regenerated cells are then screened for the incorporation of the transforming DNA. It has been demonstrated that yeast cells of the genera Pichia, Saccharomyces, Kluyveromyces, Yarrowia and Hansenula can be transformed by enzymatic digestion of the cell walls to give spheroplasts; the spheroplasts are then mixed with the transforming DNA and incubated in the presence of calcium ions and polyethylene glycol, then transformed spheroplasts are regenerated in regeneration medium.

Methods for the transformation of S. cerevisiae are taught generally in EP 251 744, EP 258 067 and WO 90/01063, all of which are incorporated herein by reference.

Suitable promoters for S. cerevisiae include those associated with the PGK1 gene, GAL1 or GAL10 genes, CYC1, PHO5, TRP1, ADH1, ADH2, the genes for glyceraldehyde-3-phosphate dehydrogenase, hexokinase, pyruvate decarboxylase, phosphofructokinase, triose phosphate isomerase, phosphoglucose isomerase, glucokinase, α-mating factor pheromone, a-mating factor pheromone, the PRB1 promoter, the GUT2 promoter, and hybrid promoters involving hybrids of parts of 5' regulatory regions with parts of 5' regulatory regions of other promoters or with upstream activation sites (eg the promoter of EP-A-258 067).

The transcription termination signal is preferably the 3' flanking sequence of a eukaryotic gene which contains proper signals for transcription termination and polyadenylation. Suitable 3' flanking sequences may, for example, be those of the gene naturally linked to the expression control sequence used, ie may correspond to the promoter. Alternatively, they may be different in which case the termination signal of the S. cerevisiae ADH1 gene is preferred.

The CAR protein found in nature is probably not glycosylated and so it is preferred for the protein to be produced non-glycosylated in any of these expression systems. Since the protein does not appear to contain any glycosylation sites, it will normally be possible to express the protein in a host cell and for the protein to be secreted, without it becoming glycosylated in an undesirable manner. Suitable secretion leader sequences, if the molecule is to be secreted from yeast, include mammalian leader sequences, such as the human serum albumin (HSA) and pro-uPA (pro-urokinase-type plasminogen activator) leader sequences, S. cerevisiae leader sequences such as the α-mating factor pheromone pre- and prepro- sequence, the invertase (SUC2) leader sequence, the PHO5 leader sequence, or hybrid leader sequences such as the leader sequence of WO 90/01063.
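The observation that the protein does not appear to contain glycosylation sites can be illustrated, for N-linked glycosylation only, by scanning the sequence for the consensus sequon Asn-X-Ser/Thr in which X is any residue other than proline. The Python sketch below is an assumption-laden illustration: the specification does not say which kind of site was examined, the names `SEQUON` and `n_glycosylation_sites` are hypothetical, `CAR_PROTEIN` is the sequence obtained by translating the nucleotide sequence of this specification, and O-linked sites are not considered.

```python
import re

# CAR protein as deduced from the nucleotide sequence (142 residues).
CAR_PROTEIN = (
    "MGKVISASRVSMEHFEFQG"
    "YRQSFPVWLRPPRGSRAFR"
    "LQTDPCSCRRLHDRPSSCG"
    "SLGMLRGSDMKGPCEPIVL"
    "SPAALSSSSLINGASQAQA"
    "LGSGGLTTAPCCHVDWCKL"
    "RTSCWSSHACSVGDALVFT"
    "ALRIVEILY"
)

# N-linked consensus sequon: Asn, then any residue except Pro, then Ser or Thr.
# The lookahead lets overlapping sequons be reported separately.
SEQUON = re.compile(r"N(?=[^P][ST])")


def n_glycosylation_sites(protein):
    """Return 1-based positions of Asn residues lying in an N-X-S/T sequon."""
    return [match.start() + 1 for match in SEQUON.finditer(protein)]


print(n_glycosylation_sites(CAR_PROTEIN))
```

A sequon is necessary but not sufficient for glycosylation in vivo, so an empty result supports, but does not prove, the absence of N-linked glycosylation.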

The present invention also relates to a host cell transformed with a polynucleotide vector construct of the present invention. The host cell can be either procaryotic or eucaryotic. Bacterial cells are preferred procaryotic host cells and typically are a strain of E. coli such as, for example, the E. coli strains DH5 available from Bethesda Research Laboratories Inc., Bethesda, MD, USA, and RR1 available from the American Type Culture Collection (ATCC) of Rockville, MD, USA (No ATCC 31343). Preferred eucaryotic host cells include yeast and mammalian cells, preferably vertebrate cells such as those from a mouse, rat, monkey or human fibroblastic cell line. Preferred eucaryotic host cells include Chinese hamster ovary (CHO) cells available from the ATCC as CCL61 and NIH Swiss mouse embryo cells NIH/3T3 available from the ATCC as CRL 1658.

Transformation of appropriate cell hosts with a DNA construct of the present invention is accomplished by well known methods that typically depend on the type of vector used. With regard to transformation of procaryotic host cells, see, for example, Cohen et al, Proc. Natl. Acad. Sci. USA, 69: 2110 (1972); and Sambrook et al, Molecular Cloning, A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY (1989). Transformation of yeast cells is described in Sherman et al, Methods In Yeast Genetics, A Laboratory Manual, Cold Spring Harbor, NY (1986). The method of Beggs, Nature, 275: 104-109 (1978) is also useful. With regard to transformation of vertebrate cells with retroviral vectors containing rDNAs, see, for example, Sorge et al, Mol. Cell. Biol., 4: 1730-37 (1984); Graham et al, Virol., 52: 456 (1973); and Wigler et al, Proc. Natl. Acad. Sci. USA, 76: 1373-76 (1979).

Successfully transformed cells, ie cells that contain a DNA construct of the present invention, can be identified by well known techniques. For example, cells resulting from the introduction of an expression construct of the present invention can be grown to produce the protein of the invention. Cells can be harvested and lysed and their DNA content examined for the presence of the DNA using a method such as that described by Southern, J. Mol. Biol., 98: 503 (1975) or Berent et al, Biotech., 3: 208 (1985). Alternatively, the presence of the protein in the supernatant can be detected using antibodies as described below.

In addition to directly assaying for the presence of recombinant DNA, successful transformation can be confirmed by well known immunological methods when the recombinant DNA is capable of directing the expression of the protein. For example, cells successfully transformed with an expression vector produce proteins displaying appropriate antigenicity. Samples of cells suspected of being transformed are harvested and assayed for the protein using suitable antibodies.

Thus, in addition to the transformed host cells themselves, the present invention also contemplates a culture of those cells, preferably a monoclonal (clonally homogeneous) culture, or a culture derived from a monoclonal culture, in a nutrient medium. Preferably, the culture also contains the protein. Nutrient media useful for culturing transformed host cells are well known in the art and can be obtained from several commercial sources.

A fifth aspect of the invention provides an antibody specific for the said protein, preferably the CAR protein.

By "antibody", we include monoclonal and polyclonal antibodies and we include all molecules which bind specifically but reversibly to the protein, including antibody fragments. Antibodies to the protein may be prepared in any one of the many known ways.

Monoclonal antibodies may be prepared generally by the techniques of Zola, H. (1988) ("Monoclonal Antibodies - A manual of techniques" CRC Press) which is incorporated herein by reference. Antibody fragments such as Fab, (Fab)₂, Fv, scFv or dAb fragments may be prepared therefrom in known ways. The antibodies may be humanized in known ways. Antibody-like molecules may be prepared using the recombinant DNA techniques of WO 84/03712. The region specific for the protein may be expressed as part of a bacteriophage, using the technique of McCafferty et al (1990) Nature, 348: 552-554.

The art of "antibody engineering" is advancing rapidly, as is described in Tan, L.K. and Morrison, S.L. (1988) Adv. Drug Deliv. Rev. 2: 129-142, Williams, G. (1988) Tibtech 6: 36-42 and Neuberger, M.S. et al (1988) 8th International Biotechnology Symposium Part 2, 792-799 (all of which are incorporated herein by reference), and is well suited to preparing antibody-like molecules specific for the protein of the invention.

The antibody may be specific for an epitope of the wild-type protein which is associated with the cell adhesion phenotype, or it may be specific for a mutated epitope associated with loss of that phenotype. The antibodies may be raised against native protein or fusion proteins. The antibodies should be immunoreactive with epitopes on the protein of the invention, preferably epitopes not present on other human proteins. In a preferred embodiment of the invention the antibodies will immunoprecipitate the protein from solution as well as react with protein on Western blots of polyacrylamide gels.

A sixth aspect of the invention provides a process of identifying whether the substrate adhesion function is present in a sample. In one embodiment, the method comprises exposing the sample to an antibody prepared as above and determining whether the antibody binds the protein. If the antibody binds specifically to a wild-type epitope and is bound by protein in the sample, then it is presumed that the function is present. If the antibody binds to a mutant epitope, then the function may have been lost.

Loss of wild-type genes can also be detected by screening for loss of wild-type adhesion protein. For example, monoclonal antibodies immunoreactive with the protein can be used to screen a tissue. Lack of antigen would indicate a mutation. Antibodies specific for proteins expressed by mutant alleles could also be used to detect mutant adhesion gene product. Such immunological assays could be done in any convenient format known in the art. These include Western blots, immunohistochemical assays and ELISA assays. Any means for detecting an altered adhesion protein can be used to detect loss of wild-type genes. Finding a mutant gene product indicates loss of a wild-type gene.

The wild-type CAR protein appears to be largely intracellular, although with a portion thereof embedded in the lipid membrane of the cell, and therefore it will normally not be possible to use the antibody, at least not large immunoglobulins, for in vivo determinations. Instead, in the case of CAR and similar adhesion genes, a sample of the tissue, normally the tumour tissue, will be extracted and fixed in such a way that intracellular proteins are exposed and so that the antibodies can bind thereto. A further advantage of such a fixing and staining process is that one can then be sure that it is the tumour cells which one is assaying, rather than neighbouring normal cells which will normally express the protein in any case. Alternatively, normal and tumour cells may be separated from one another by other methods, for example flow cytometry. Polyclonal antibodies are sometimes preferred in such methods since the particular epitope to which a monoclonal antibody binds can sometimes be lost in the fixing and staining process. Such fixing and staining methods are standard in the art, as are methods of labelling the antibodies so that their location can be determined. Although we have referred above to a sample of the tissue, usually the tumour, being assayed, in fact the normal way in which the method of the invention will be used will be for the surgeon to excise as much of the tumour as possible and for representative portions of the tumour then to be assayed in accordance with the invention. Thus, one would not normally simply remove a small sample of the tumour and leave the remainder of the tumour in place.

The methods of the invention may be particularly useful in the context of tumours of the breast, prostate and colon.

Mutant adhesion genes or gene products can also be detected in other human body samples, such as serum, stool, urine and sputum. The same techniques discussed above for detection of mutant adhesion genes or gene products in tissues can be applied to other body samples. Cancer cells are sloughed off from tumours and appear in such body samples. In addition, the (mutant) adhesion gene product itself may be secreted into the extracellular space and found in these body samples even in the absence of cancer cells. By screening such body samples, a simple early diagnosis can be achieved for many types of cancers. In addition, the progress of chemotherapy or radiotherapy can be monitored more easily by testing such body samples for mutant adhesion genes or gene products.

Functional loss of the protein can also be detected by identifying loss of the gene encoding the protein or mutations in the protein coding sequence, or in associated DNA regions which govern the expression or processing of the coding region, leading to reduced or nil expression of the protein or expression of functionally inactive versions of the protein. Such genetic assay methods include the standard techniques of restriction fragment length polymorphism assays and PCR-based assays.

Certain types of mutations, such as frameshift mutations and nonsense mutations, may be recognised by someone skilled in the art as usually leading to a loss of function. Other types of mutations, for example missense mutations, may or may not lead to a loss of function. In this case the loss or retention of function may be assessed by transfecting the mutant CAR gene, carrying the said mutations, into a cell and identifying whether the substrate adhesion function is present in the transfected cell.
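The distinction drawn above between frameshift, nonsense and missense mutations can be sketched computationally. The following Python example is illustrative only and is not part of the disclosed assay: the function names `translate` and `classify` are hypothetical, the standard genetic code is assumed, both sequences are assumed to begin at the start codon, and the toy coding sequences are invented for the example.

```python
from itertools import product

# Standard genetic code, built in TCAG order (assumed to apply here).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(cds):
    """Translate from the first base and stop at the first in-frame stop codon."""
    cds = cds.upper()
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def classify(wild_cds, mutant_cds):
    """Give a coarse label for the effect of a coding-sequence change."""
    if (len(mutant_cds) - len(wild_cds)) % 3 != 0:
        return "frameshift"
    if len(mutant_cds) != len(wild_cds):
        return "in-frame insertion or deletion"
    wild, mutant = translate(wild_cds), translate(mutant_cds)
    if mutant == wild:
        return "silent"
    if len(mutant) < len(wild):
        return "nonsense"      # premature stop codon
    if len(mutant) > len(wild):
        return "stop-loss"     # original stop codon destroyed
    return "missense"


if __name__ == "__main__":
    wild = "ATGGGGAAGGTTTAA"   # toy sequence: Met-Gly-Lys-Val-stop
    for mutant in ("ATGGGAAAGGTTTAA", "ATGGGGAGGGTTTAA",
                   "ATGGGGTAGGTTTAA", "ATGGGGCACAAAGGTTTAA"):
        print(mutant, classify(wild, mutant))
```

Such a label only flags the likely class of change; as the text notes, whether a missense change abolishes function has to be established experimentally, for example by transfection and an adhesion assay.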

According to the diagnostic and prognostic method of the present invention, loss of the wild-type gene is detected. The loss may be due to either insertional, deletional or point mutational events. If only a single allele is mutated, an early neoplastic state may be indicated. However, if both alleles are mutated then a metastatic state is indicated. The finding of such mutations thus provides both diagnostic and prognostic information. An adhesion gene allele which is not deleted (eg that on the sister chromosome to a chromosome carrying a gene deletion) can be screened for other mutations, such as insertions, small deletions, and point mutations. It is believed that most mutations found in tumour tissues will be those leading to greatly decreased expression of the adhesion gene product. However, mutations leading to non-functional gene products would also lead to a metastatic state. Point mutational events may occur in regulatory regions, such as in the promoter of the gene, leading to loss or diminution of expression of the mRNA. Point mutations may also abolish proper RNA processing, leading to loss of expression of the adhesion gene product.

It may also be desirable to screen normal tissue for mutations in the CAR gene. The presence of a mutant allele of the CAR gene may serve as a useful indicator of pre-disposition to certain types of cancer.

An insertion polymorphism has been found in the CAR gene. The mutant allele is defined by an insertion of the 4-bp sequence CACA between the A at position 271, and the G at position 272 in the aforementioned wild-type CAR gene. Of 48 individuals analysed, 17 were found to be heterozygous for the 4-bp insertion. As a result of this 4-bp insertion, a frame shift mutation, the polypeptide encoded by this mutant allele is as follows:

Met Gly Lys Val Ile Ser Ala Ser Arg Val Ser Met Glu His Phe Glu Phe Gln Gly Tyr Arg Gln Ser Phe Pro Val Trp Leu Arg Pro Pro Arg Gly Ser Arg Ala Phe Arg Leu Gln Thr Asp Pro Cys Ser Cys Arg Arg Leu His Asp Arg Pro Ser Ser Cys Gly Ser Leu Gly Met Leu Arg Gly Ser Asp Met Lys Gly Pro Cys Glu Pro Ile Val Leu Ser Pro Ala Ala Leu Ser Ser Ser Ser Leu Ile Asn Gly Ala Thr Gln Ser Gly Pro Gly Thr Gly Leu Arg Arg Thr His His Cys Pro Leu Leu Pro Cys Gly Leu Val Gln Val Glu Asp Phe Leu Leu Val

This mutant CAR protein is shorter than the wild-type protein. Also, the C-terminal portion has a different amino acid sequence because of the reading frame shift and therefore the mutant CAR protein is unlikely to function in the same way as the wild-type CAR protein.

The assay for a mutant CAR gene may involve any suitable method for identifying such polymorphisms, such as: sequencing of the DNA at one or more of the relevant positions; differential hybridisation of an oligonucleotide probe designed to hybridise at the relevant positions of either the wild-type or mutant sequence; denaturing gel electrophoresis following digestion with an appropriate restriction enzyme, preferably following amplification of the relevant DNA regions; S1 nuclease sequence analysis; non-denaturing gel electrophoresis, preferably following amplification of the relevant DNA regions; conventional RFLP (restriction fragment length polymorphism) assays; selective DNA amplification using oligonucleotides which are matched for the wild-type sequence and unmatched for the mutant sequence or vice versa; or the selective introduction of a restriction site using a PCR (or similar) primer matched for the wild-type or mutant genotype, followed by a restriction digest. The assay may be indirect, ie capable of detecting a mutation at another position or gene which is known to be linked to one or more of the mutant positions. The probes and primers may be fragments of DNA isolated from nature or may be synthetic.
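As a cross-check on the frameshift described above, the following Python sketch inserts CACA after nucleotide 271 of the wild-type coding sequence and translates both alleles with the standard genetic code. The cDNA string is copied from SEQ ID NO:1 of this specification; the function names (`translate`, `insert_after`) and the compact codon-table construction are illustrative assumptions and form no part of the original disclosure.

```python
# Illustrative sketch only: helper names are hypothetical.
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Wild-type CAR cDNA as listed in SEQ ID NO:1 (445 nt, CDS at 1..426).
WILD_TYPE = "".join("""
ATG GGG AAG GTT ATC AGT GCT TCC CGA GTG AGC ATG GAA CAC TTC GAG
TTC CAG GGT TAT AGA CAG TCG TTC CCA GTG TGG CTG AGG CCA CCC AGA
GGC AGC AGA GCA TTC AGA CTC CAA ACA GAC CCC TGT TCA TGC CGA CGC
TTG CAC GAC CGC CCC AGT TCC TGT GGC TCC CTC GGA ATG CTA AGG GGA
TCG GAC ATG AAA GGA CCC TGT GAG CCG ATT GTC CTA TCT CCA GCG GCC
CTG TCA TCC AGC TCA CTC ATC AAT GGG GCC AGT CAG GCC CAG GCA CTG
GGC TCC GGA GGA CTC ACC ACT GCC CCC TGC TGC CAT GTG GAC TGG TGC
AAG TTG AGG ACT TCT TGC TGG TCT AGT CAC GCA TGC AGT GTT GGG GAT
GCC TTG GTT TTT ACT GCT CTG AGA ATT GTT GAG ATA CTT TAC
TAATAAACTG TGTAGTTGG
""".split())


def translate(dna):
    """Translate from the first nucleotide up to the first in-frame stop codon."""
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[pos:pos + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


def insert_after(dna, position, insertion):
    """Insert `insertion` immediately after the 1-based nucleotide `position`."""
    return dna[:position] + insertion + dna[position:]


# The text places the insertion between the A at 271 and the G at 272.
assert WILD_TYPE[270] == "A" and WILD_TYPE[271] == "G"

wild_protein = translate(WILD_TYPE)
mutant_protein = translate(insert_after(WILD_TYPE, 271, "CACA"))

print(len(wild_protein), len(mutant_protein))
print(mutant_protein[90:])  # residues that follow the frameshift
```

Under these assumptions the wild-type translation has 142 residues and the insertion allele 121, with the first 90 residues shared, which agrees with the lengths stated for SEQ ID NO:2 and SEQ ID NO:3.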

A non-denaturing gel may be used to detect differing lengths of fragments resulting from digestion with an appropriate restriction enzyme. The DNA is usually amplified before digestion, for example using the polymerase chain reaction (PCR) method and modifications thereof. Otherwise 10-100 times more DNA would need to be obtained in the first place, and even then the assay would work only if the restriction enzyme cuts DNA infrequently.

Amplification of DNA may be achieved by the established PCR method or by developments thereof or alternatives such as the ligase chain reaction, Qβ replicase and nucleic acid sequence-based amplification. An "appropriate restriction enzyme" is one which will recognise and cut the wild-type sequence and not the mutated sequence or vice versa. The sequence which is recognised and cut by the restriction enzyme (or not, as the case may be) can be present as a consequence of the mutation or it can be introduced into the normal or mutant allele using mismatched oligonucleotides in the PCR reaction. It is convenient if the enzyme cuts DNA only infrequently, in other words if it recognises a sequence which occurs only rarely.

In another method, a pair of PCR primers is used which matches (ie hybridises to) either the wild-type genotype or the mutant genotype but not both. Whether amplified DNA is produced will then indicate the wild-type or mutant genotype (and hence phenotype). However, this method relies partly on a negative result (ie the absence of amplified DNA) which could be due to a technical failure. It is therefore less reliable and/or requires additional control experiments.

A preferable method employs similar PCR primers but, as well as hybridising to only one of the wild-type or mutant sequences, they introduce a restriction site which is not otherwise there in either the wild-type or mutant sequences.

Kits and assay components comprising PCR primers and oligonucleotides for hybridisation as described above form further aspects of the invention.

The primer kit of the present invention is useful for determination of the nucleotide sequence of the adhesion gene using the polymerase chain reaction. The kit comprises a set of pairs of single stranded DNA primers which can be annealed to sequences within or surrounding the adhesion gene on the relevant chromosome in order to prime amplifying DNA synthesis of the gene itself. The complete set allows synthesis of all of the nucleotides of the adhesion gene coding sequences, ie the exons. The set of primers preferably allows synthesis of both intron and exon sequences, as adhesion gene mutations may be found in an adhesion gene intron. The kit can also contain DNA polymerase, preferably Taq polymerase, and suitable reaction buffers. Such components are known in the art.

In order to facilitate subsequent cloning of amplified sequences, primers may have restriction enzyme sites appended to their 5' ends. Thus, all nucleotides of the primers are derived from adhesion gene sequences or sequences adjacent to that gene except the few nucleotides necessary to form a restriction enzyme site. Such enzymes and sites are well known in the art. The primers themselves can be synthesized using techniques which are well known in the art. Generally, the primers can be made using synthesizing machines which are commercially available. Given the sequence of the CAR open reading frame shown above, design of particular primers is well within the skill of the art.

The nucleic acid probes provided by the present invention are useful for a number of purposes. They can be used in Southern hybridization to genomic DNA and in the RNase protection method for detecting point mutations already discussed above. The probes can be used to detect PCR amplification products. They may also be used to detect mismatches with the adhesion gene or mRNA using other techniques. Mismatches can be detected using either enzymes (eg S1 nuclease), chemicals (eg hydroxylamine or osmium tetroxide and piperidine), or changes in electrophoretic mobility of mismatched hybrids as compared to totally matched hybrids. These techniques are known in the art. See Cotton, supra; Shenk, supra; Myers, supra; Winter, supra; and Novack et al, Proc. Natl. Acad. Sci. USA, vol. 83, p. 586, 1986. Generally, the probes are complementary to adhesion gene coding sequences, although probes to certain introns are also contemplated. An entire battery of nucleic acid probes may be used to compose a kit for detecting loss of wild-type adhesion genes. The kit allows for hybridization to the entire adhesion gene. The probes may overlap with each other or be contiguous. If a riboprobe is used to detect mismatches with mRNA, it is complementary to the mRNA of the human wild-type adhesion gene. The riboprobe thus is an anti-sense probe in that it does not code for the adhesion protein because it is of the opposite polarity to the sense strand. The riboprobe generally will be radioactively labelled, which can be accomplished by any means known in the art. If the riboprobe is used to detect mismatches with DNA it can be of either polarity, sense or anti-sense. Similarly, DNA probes also may be used to detect mismatches.

Nucleic acid probes may also be complementary to mutant alleles of the adhesion gene. These are useful to detect similar mutations in other patients on the basis of hybridization rather than mismatches. These are discussed above and referred to as allele-specific probes. As mentioned above, the adhesion gene probes can also be used in Southern hybridizations to genomic DNA to detect gross chromosomal changes such as deletions and insertions. The probes can also be used to select cDNA clones of adhesion genes from tumour and normal tissues. In addition, the probes can be used to detect adhesion gene mRNA in tissue to determine if expression is diminished as a result of loss of wild-type adhesion genes. Provided with the adhesion gene coding sequence, design of particular probes is well within the skill of the ordinary artisan.

A seventh aspect provides a method of supplying the protein function to a cell which lacks that function. This may be achieved by transfecting the cell with a suitable expression sequence for the protein or an active part thereof. In this way, cells with metastatic potential may be made less metastatic.

The wild-type adhesion gene or a part of the gene may be introduced into the cell in a vector (or similar construct, such as a yeast artificial chromosome) such that the gene remains extrachromosomal. In such a situation the gene will be expressed by the cell from the extrachromosomal location. If a gene portion is introduced and expressed in a cell carrying a mutant adhesion gene allele, the gene portion should encode a part of the adhesion protein which is required for cell adhesion. More preferred is the situation where the wild-type adhesion gene or a part of it is introduced into the mutant cell in such a way that it recombines with the endogenous mutant adhesion gene present in the cell. Such recombination requires a double recombination event which results in the correction of the adhesion gene mutation. Vectors for introduction of genes both for recombination and for extrachromosomal maintenance are known in the art and any suitable vector may be used. Methods for introducing DNA into cells such as electroporation, calcium phosphate co-precipitation and viral transduction are known in the art and the choice of method is within the competence of those in the art. Methods of gene therapy using recombinant retroviruses are taught in, for example, Lemoine et al (1990) Oncogene, 5, 237-239. By "gene therapy" we include the provision of genes which are integrated into the host cell's chromosome and the provision of vectors which reproduce and express the protein in the host cells without such integration taking place. Such vectors include recombinant pox (eg vaccinia) virus (see Mackett et al (1982) Proc. Nat. Acad. Sci. USA 79, 7415-7419; Cremer et al (1985) Science 228, 1985), recombinant retroviral vectors (eg those of Apperley et al (1991) Blood 78(2), 310-317; Markowitz et al (1990) Ann. N. Y. Acad. Sci. 612, 407-414; Lothrop et al (1991) Blood 78(1), 237-245; Fauser (1991) J. Cell. Biochem. 45(4), 353-358; or WO 89/12109) and recombinant adenovirus, as is known in the art.

Polypeptides which have adhesion activity can be supplied to cells which carry mutant or missing adhesion gene alleles. The sequence of the adhesion protein is disclosed above. Protein can be produced by expression of the cDNA sequence in bacteria, etc as described above. Alternatively, adhesion protein can be extracted from adhesion-protein-producing mammalian cells. In addition, the techniques of synthetic chemistry can be employed to synthesize adhesion protein. Any of such techniques can provide the preparation of the present invention which comprises the adhesion gene product. The preparation is substantially free of other human proteins. This is most readily accomplished by synthesis in a microorganism or in vitro. Active adhesion protein molecules can be introduced into cells by microinjection or by use of liposomes, for example. Alternatively, some such active molecules may be taken up by cells, actively or by diffusion. Extracellular application of adhesion gene product may be sufficient to cause cell adhesion. Other molecules with adhesion activity may also be used to cause cell adhesion.

It may or may not be desirable to target the transfecting gene (or compound (including polypeptides) which restores adhesion function) to the defective cells, using substances specific for target-cell-specific cell markers.

An eighth aspect of the invention provides an assay method for determining whether a compound is potentially useful in the treatment of cancers, the test comprising exposing to the compound a cell line defective in the function of the adhesion protein and determining whether the compound causes the cell line to adhere more readily to a substrate.

In principle, any cell line defective in the function of this protein can be used as the basis of the assay, although normally a mammalian cell line and, more particularly, a mammalian tumour cell line, will be used. Such a cell line may have the said protein function removed by removal of the entire gene. Alternatively, and more readily, the gene may be altered, for example by means of site-directed mutagenesis, so that a non-functional protein is produced. One specific example which we discuss below in the context of the CAR gene is to mutate the C-terminal tyrosine residue, for example to form a stop codon. This appears to remove a tyrosine kinase phosphorylation site.

A compound identified in the assay of the invention (or other adhesion-function-restoring molecule) may be dissolved in a suitable delivery vehicle and administered to patients to inhibit or prevent the growth or spread, particularly metastasis, of an actual or suspected tumour. The patient is preferably a human but may be another animal, preferably a mammal, such as a pet (dogs, cats, etc) or an economically important animal (sheep, cattle, pigs, fowl, horses, etc), including transgenic animals expressing valuable polypeptides. The molecules are preferably administered parenterally, for example intravenously, intramuscularly or subcutaneously, by injection or infusion. Clinically qualified people will be able to determine suitable dosages, delivery vehicles and administrative routes.

Further aspects and preferred features of the invention will now be described by way of example and in the figures in which Figure 1 shows the pSVL expression vector containing the SV40 late promoter. Restriction sites suitable for cloning the CAR cDNA are XhoI, XbaI, SmaI, SacI and BamHI. Figure 2 shows the pNeo vector encoding a β-lactamase for ampicillin resistance (Amp^R) and an aminoglycoside 3'-phosphotransferase for neomycin resistance (Neo^R).

EXAMPLE 1: FUNCTIONAL EXPRESSION CLONING AND CHARACTERISATION OF CLONE 27

Summary

From a pCDM8 library derived from the SW 1222 colon cancer cell line, we isolated a cDNA by functional screening for adhesive phenotype in the transfected WOP cells (fibroblast origin, polyoma transformed NIH 3T3). SW1222 and other suitable colon cancer cell lines are generally available, for example from ATCC. WOP cells were selected, not only for their ability to support high copy replication of pCDM8 but because they have an acceptably low binding to collagen type 1 (% cell binding = 10±3 %). After 4 rounds of selection a number of clones were identified as inducing collagen binding activity in WOP cells and clone 27 in particular was confirmed to induce WOP cell-collagen 1 matrix adhesion by 2-3 fold.

Methods. A cDNA library was made using standard methods (Seed, B. (1987) Nature 329, 840-842) from mRNA obtained from 5x10⁷ SW1222 cells. Following ligation into the pCDM8 vector and expression in E. coli, an aliquot of the resulting CsCl-purified plasmid library was electroporated into WOP cells (25 μl into 10⁷ cells at 300 V). The approximate efficiency of transfection was 30-40%, as judged by control CD2 in pCDM8 WOP transfection detected by surface fluorescence with anti-CD2 and FITC-conjugated F(ab')₂ antibodies. After 48 h of growth, WOP cells were then selected for adhesion to collagen type 1 coated Petri dishes which had been prepared overnight with collagen type 1 solution (20 μg/ml in PBS) under sterile laminar flow conditions at room temperature. Dishes were then washed 3 times with PBS and casein blocking solution applied for a further 4 h before use. 1x10⁶ freshly trypsinised WOP cells washed twice in DMEM with BSA (2 mg/ml) were applied to each 10 cm dish and incubated for 2 h at room temperature. Non-adherent cells were then washed off by streaming PBS over the plate 3 times and the remaining adherent cells lysed in situ with SDS/EDTA to form a Hirt supernatant. Following recovery of plasmids and amplification in E. coli the selection process was repeated for a total of 4 rounds before individual clones were selected and transfected into WOP cells for assay of collagen binding activity as previously described (Pignatelli, M.P. & Bodmer, W.F. (1988) Proc. Natl. Acad. Sci. USA 85, 5561-5565). Briefly, 96 well microtitre plates were freshly coated overnight with collagen type 1 (100 μl of 20 μg/ml in PBS allowed to evaporate at room temperature). Control plates to assess non-specific binding were coated with BSA (20 μg/ml) in PBS. Plates were washed 3 times with PBS prior to use and 2x10⁴ of trypsinised cells which had been washed 3 times in DMEM with BSA (2 mg/ml) and 2.5 mM Hepes pH7.4, were applied to each well in a volume of 100 μl and allowed to adhere for 2 h at room temperature. Plates were then washed gently 3 times with PBS and the residual cell numbers determined by a fluorescent assay using 4-methyl umbelliferyl phosphate to detect cellular phosphatase activity (Huschtscha, L.I. et al (1989) In Vitro Cellular & Developmental Biol. 25: 105-108). After subtraction of non-specific background adhesion to BSA control wells, results were initially expressed as arbitrary fluorescence units but subsequently, to enable direct comparisons between experiments, as a % of the input number of cells (maximum fluorescence) and called % specific binding.
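The % specific binding calculation described above can be sketched as follows. This is a minimal illustration, assuming that replicate wells are averaged, that the BSA background is subtracted from the collagen signal, and that the result is normalised to the fluorescence of the full input number of cells; the function name, argument names and the numerical readings are hypothetical and are not taken from the experiments reported here.

```python
from statistics import mean


def percent_specific_binding(collagen_wells, bsa_wells, input_wells):
    """Return % specific binding from raw fluorescence readings.

    collagen_wells: readings from collagen type 1 coated wells
    bsa_wells:      readings from BSA-coated control wells (non-specific binding)
    input_wells:    readings for the full input number of cells (maximum fluorescence)
    """
    maximum = mean(input_wells)
    if maximum <= 0:
        raise ValueError("maximum fluorescence must be positive")
    specific = mean(collagen_wells) - mean(bsa_wells)
    return 100.0 * specific / maximum


# Hypothetical readings in arbitrary fluorescence units.
example = percent_specific_binding(
    collagen_wells=[480, 520],
    bsa_wells=[90, 110],
    input_wells=[1950, 2050],
)
print(f"{example:.1f} % specific binding")
```

Expressing the result relative to the input-cell signal, rather than in raw fluorescence units, is what allows values from different plates and experiments to be compared directly.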

EXAMPLE 2: SEQUENCING OF CLONE 27

Clone 27, selected as the most potent inducer of collagen binding activity, was sequenced using the chain-termination method of Sanger (Sanger, F. et al (1977) Proc. Natl. Acad. Sci. USA 74: 5463-5467).

The open reading frame encodes a protein of 142 amino acids:

MGKVISASRVSMEHFEFQGYRQSFPVWLRPPRGSRAFRLQTDPCSRRCLHD
RPSSCGSLGMLRGSDMKGPCEPIVLSPAALSSSSLINGASQAQALGSGGLT
TAPCCHVDWCKLRTSCWSSHACSVGDALVFTALRIVEILY

An N terminus myristoylation consensus sequence (Towler, D. A. et al (1988) Ann. Rev. Biochem. 57: 69-99) is underlined and at the COOH terminus the terminal tyrosine residue forming part of a putative tyrosine phosphorylation signature is shown in italics. The poly A tail occurs immediately following the two stop codons and consequently the polyadenylation signal AATAAA is found within the expressed sequence.

EXAMPLE 3: MODIFICATION OF PROTEIN

Methods. Site-directed mutagenesis by PCR with the following primers

forward 5'-AATAGTACATGGGGAAGGTTATCA-3'

reverse 5'-TAGTACTTATTAAAGTATCTCAACAA-3' (stop mutation)

was used to convert TYR 142 to an additional stop codon. After ligation into pCDM8 and sequencing, the mutant and control clone 27 plasmids were independently electroporated into WOP cells which were assayed for collagen binding ability after 48 h culture.

The terminal tyrosine residue (TYR 142) of clone 27 was found to be essential for function in the protein studied. Loss of the tyrosine residue by site-directed mutagenesis inhibits the collagen adhesion inducing effect of wild-type clone 27 with WOP cells displaying only background levels of adhesion compared to wild-type transfected cells.

EXAMPLE 4: MAPPING OF CLONE 27 TO HUMAN CHROMOSOME 16q24

Methods. DNA was obtained from human x rodent hybrids, EcoRI-digested overnight and applied to nitrocellulose filters which were hybridised with ³²P-labelled clone 27 following standard Southern blotting protocols and the autoradiographs exposed for 24 h. A cosmid containing clone 27, displaying the 4kb EcoRI fragment, was obtained by colony hybridisation of a cosmid library and was used to map by in situ hybridisation to human chromosomes following established protocols (Zabel, B.U. et al (1983) Proc. Nat. Acad. Sci. USA 80, 6932-6936; Harper & Saunders (1981) Chromosoma 83, 431-439).

Southern blotting on a panel of human x rodent hybrid DNAs localised clone 27 to 16q24 with a 4kb EcoRI band and showed homology to mouse and hamster.

EXAMPLE 5: SUPPRESSION OF CLONE 27 EXPRESSION TO ALTER CELL MORPHOLOGY AND GLANDULAR FORMATION IN SW1222

The parent cell line SW1222 was cotransfected with an antisense construct of clone 27 in pCDM8 together with pSV2neo by electroporation (400 V) and stable clones were obtained which displayed a high proportion of fusiform fibroblast-like cells and absence of tight glandular structures compared to the control clone 27/neo transfected SW1222. Collagen binding activity was also markedly reduced from 88% to 22% specific binding, confirming that reduced expression of clone 27 results in down-regulation of ECM binding.

EXAMPLE 6: USE OF CLONE 27 TO ALTER CELL MORPHOLOGY

It was of interest to determine how clone 27 upregulates cellular adhesion to collagen type 1, and whether this was due to upregulated integrin expression or to alteration of binding avidity of existing surface integrins. Consequently stable transfectants of clone 27 were made but this time in an epithelial cell line. The colon cancer line SW 620 was selected for these experiments for two reasons. Firstly it displays only low-grade binding to collagen 1 and secondly it was derived from a metastatic lesion from the same patient as the cell line SW 480 which was derived from the primary tumour and retains ability to bind to collagen.

Transfected SW 620 showed enhanced ability, by 3-4 fold, to bind to collagen compared to the wild-type line and % binding levels approached those of SW 1222, the library source, and those of SW 480. The surface expression of known alpha and beta integrin chains was checked by FACS analysis and by ELISA. No significant alteration in levels of expression was found for alpha 1, 2, 3, 4, 5 or 6 or for beta 1, 2, 3 or 4. Functional inhibition studies using anti-integrin antibodies and peptides containing the RGD sequence were also carried out. As expected, anti-beta 1 and anti-alpha 2 antibodies significantly inhibited binding to collagen type 1 for both the wild-type and transfected cells to the equivalent basal level.

EXAMPLE 7: CELLULAR DISTRIBUTION OF CLONE 27

In order to study the cellular distribution of clone 27 and investigate alterations in its levels in tumours of different degrees of invasiveness and correlation with collagen binding, rabbit polyclonal antibodies were raised to synthetic peptides predicted from the primary sequence.

Methods. Immunocytochemistry of acetone-fixed cytospins was performed using as first layer rabbit anti-human polyclonal antibody (60 min at room temperature) raised to a synthetic peptide predicted from clone 27 sequence (amino acids 61-72, MetLeuArgGlySerAspMetLysGlyProCysGlu). The second layer antibody was horseradish-peroxidase-conjugated swine anti-rabbit (60 min at room temperature) followed by peroxidase reaction with DAB (diaminobenzidine) and counterstaining with haematoxylin. SW1222 cells showed wild-type expression and wild-type SW 620 showed essentially no expression. Immunocytochemistry demonstrated staining for the clone 27 product in the cell cytoplasm of SW 1222, transfected SW 620 and SW 480 but not in wild-type SW 620. Repeated Northern analysis of cell line RNA has failed to detect a specific signal for clone 27 message. Thus it would appear that mRNA for clone 27 is in low abundance and this may reflect a fairly stable low turnover protein product.

EXAMPLE 8: SYNTHETIC ANTIGENIC PEPTIDES

Polyclonal rabbit anti-human antibodies which are able to stain cells and tissue sections have been raised to two synthetic peptides:

PEPTIDE 1: 15 AMINO ACIDS (89-103 OF PREDICTED SEQUENCE) H₂N-Gly-Ala-Ser-Gln-Ala-Gln-Ala-Leu-Gly-Ser-Gly-Gly-Leu-Thr-Thr-COOH

PEPTIDE 2: 12 AMINO ACIDS (61-72 OF PREDICTED SEQUENCE) H₂N-Met-Leu-Arg-Gly-Ser-Asp-Met-Lys-Gly-Pro-Cys-Glu-COOH

Two further suitable peptides to be used for raising functional antibodies are:

PEPTIDE 3: 11 AMINO ACIDS (46-56 OF PREDICTED SEQUENCE) H₂N-Cys-Arg-Cys-Leu-His-Asp-Arg-Pro-Ser-Ser-Cys-COOH

(PEPTIDE 3 in the priority application had a Val at position 3 instead of Cys, but was otherwise the same as the present PEPTIDE 3.)

PEPTIDE 4: 10 AMINO ACIDS (26-35 OF PREDICTED SEQUENCE) H₂N-Val-Trp-Leu-Arg-Pro-Pro-Arg-Gly-Ser-Arg-COOH

(PEPTIDE 4 in the priority application had Asp-Arg-Ala at positions 8 to 10 instead of Gly-Ser-Arg, but was otherwise the same as the present PEPTIDE 4.)

One further peptide to be used for raising functional antibodies which will identify the mutant CAR protein (derived from the mutant CAR gene containing the 4-bp insertion) is:

PEPTIDE 5: 10 AMINO ACIDS (92-101 OF PREDICTED SEQUENCE OF MUTANT CAR PROTEIN)

H₂N-Gln-Ser-Gly-Pro-Gly-Thr-Gly-Leu-Arg-Arg-COOH

EXAMPLE 9: IMMUNOCYTOCHEMISTRY METHOD FOR FORMALIN OR METHACARN FIXED PARAFFIN EMBEDDED SECTIONS

1. De-paraffin the section as follows.

a. Xylene for 2-3 min, repeat in fresh xylene
b. 100% EtOH for 2-3 min
c. Block endogenous peroxidase activity with 0.03% H₂O₂ in methanol for 10 min at room temperature (1:1000 of stock)
d. 100% EtOH for 2-3 min
e. 90% EtOH for 2-3 min
f. 70% EtOH for 2-3 min
g. H₂O for 10-15 min in several changes
h. 15-20 min in PBS, 4-5 washes

If necessary a Pronase (0.4% in 50 mM Tris/HCl pH8) or Trypsin incubation step (37°C for 15 min) can be included after step c.

2. Stain sections with first coat antibody for 60 min at room temperature and follow subsequent steps as outlined for cultured cells.

Recipe for wash solution between antibody and peroxidase substrate treatments:

4 g Casein in 2 litres PBSA (1x) with 2 ml Tween.

EXAMPLE 10: IMMUNOCYTOCHEMISTRY METHOD APPLICABLE FOR CELLS IN CULTURE DISHES, COVER SLIPS OR CYTOSPINS

1. Fix cells 5-10 mins in acetone at room temperature or in cold acetone (4°C).

2. Rehydrate in PBS 4 washes x 5 min each.

3. Add first layer antibody at appropriate concentration in PBS with 5% FCS for 30-60 min at room temperature. Cover cells (for cytospin, this needs about 100 μl).

4. Wash 3 times in PBS/casein/Tween solution (recipe given in Example 9).

5. Add second layer HRP-conjugated antibody. If first coat Ab is rabbit polyclonal then use HRP swine anti-rabbit at 1/100 in PBS + 5% FCS. If first coat is mouse Ab then use HRP rabbit anti-mouse 1/100. Incubate for 60 min at room temperature.

6a. Wash 3x.

6b. If increased sensitivity required, ie for nuclear staining, then peroxidase antiperoxidase step required.

7. Add peroxidase substrate (10 μl of H₂O₂ in XXX). Incubate for up to 10 mins at room temperature whilst assessing for brown colouration in test v control.

8. Wash in flowing water for 15-20 mins.

9. Stain cells with haematoxylin as follows:

a. 7 min in haematoxylin
b. wash for 15 min in flowing water
c. 7 sec in destain
d. 20 min in flowing water
e. dehydrate in each of 70% EtOH, 90% EtOH, 100% EtOH, and Xylene for 2-3 min each before drying carefully and applying coverslip if required.

EXAMPLE 11: USE OF CELLS WITHOUT CAR ACTIVITY TO IDENTIFY USEFUL COMPOUNDS

Cell lines which lack CAR activity (collagen-binding) are made by transfecting cell lines, preferably colon cancer cell lines, with the antisense construct as described in Example 5. Suitable colon cancer cell lines are SW1222, SW620 (ATCC CCL 227), COLO 320 DM (ATCC CCL 220) and COLO 205 (ATCC CCL 222), and are available from the American Type Culture Collection, 12301 Parklawn Drive, Rockville, MD, USA. The CAR cDNA is cloned into the plasmid pSVL as shown in Figure 1. (The plasmid pSVL is available from Pharmacia Biotech, Knowlhill, Milton Keynes, UK.) The cell lines lacking CAR activity are transfected with the CAR-containing construct, and pNEO, shown in Figure 2 (modified so as to contain a promoter, such as the SV40 promoter, to drive Neo^R expression) using neomycin selection to form stable transfectants. A natural cell line is used for comparison, and both cell lines are treated in culture with potential therapeutic agents added to the culture medium. A variety of doses and culture periods are used before assaying these cells for the ability to bind the substrate, as in Example 1 above. Controls comprise the natural cells and the abnormal cells treated with carrier only (ie no drug).

EXAMPLE 12: DETECTION OF 4-bp INSERTION MUTANT

Two oligonucleotide probes are made which will distinguish the wild-type CAR sequence and the mutant CAR sequence containing the 4-bp insertion.

OLIGO1 is 5'-AATGGGGCCAGTCAGGCCCA-3' and preferentially hybridises to the wild-type CAR sequence.

OLIGO2 is 5'-TGGGGCCACACAGTCAGGCC-3' and preferentially hybridises to the mutant CAR sequence containing the 4-bp insertion.

OLIGO1 and OLIGO2 are radio-labelled using [γ-³²P]ATP and T4 polynucleotide kinase to a high specific activity. Genomic DNA is extracted from suitable tissue, such as normal colon or colon tumour cells, using standard methods. The oligonucleotide probes are each separately hybridised to the test nucleic acid sample under stringent hybridisation conditions, such as in 6X SSC (SSC is saline sodium citrate) at a temperature of 70°C. OLIGO1 will only hybridise to the test nucleic acid if it contains the wild-type CAR sequence and not the mutant CAR sequence. OLIGO2 will only hybridise to the test nucleic acid if it contains the mutant CAR sequence and not the wild-type CAR sequence. Both OLIGO1 and OLIGO2 will hybridise if the test sample contains both wild-type and mutant sequences.
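The interpretation of the two hybridisation results can be illustrated in silico. The sketch below is a deliberately simplified model, assuming that stringent hybridisation behaves as an exact 20-nucleotide match to either strand; real hybridisation depends on temperature, salt and probe chemistry, which are not modelled. The probe sequences and the wild-type fragment (nucleotides 241 to 288 of SEQ ID NO:1) are taken from this specification, while the function names and genotype labels are hypothetical.

```python
# Illustrative model only: exact string matching stands in for stringent hybridisation.
OLIGO1 = "AATGGGGCCAGTCAGGCCCA"  # matched to the wild-type sequence
OLIGO2 = "TGGGGCCACACAGTCAGGCC"  # matched to the 4-bp insertion allele

# Nucleotides 241-288 of SEQ ID NO:1 (codons 81-96), spanning the insertion site.
WT_FRAGMENT = "".join(
    "CTG TCA TCC AGC TCA CTC ATC AAT GGG GCC AGT CAG GCC CAG GCA CTG".split()
)
# CACA inserted between nucleotides 271 and 272, ie after the 31st base of the fragment.
MUTANT_FRAGMENT = WT_FRAGMENT[:31] + "CACA" + WT_FRAGMENT[31:]


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def hybridises(probe, strand):
    """Model a positive signal as a perfect match to the strand or its complement."""
    return probe in strand or reverse_complement(probe) in strand


def call_genotype(alleles):
    """Classify a sample (a list of allele sequences) from the two probe signals."""
    wild_signal = any(hybridises(OLIGO1, allele) for allele in alleles)
    mutant_signal = any(hybridises(OLIGO2, allele) for allele in alleles)
    if wild_signal and mutant_signal:
        return "heterozygous"
    if wild_signal:
        return "wild-type"
    if mutant_signal:
        return "insertion mutant"
    return "no signal"


print(call_genotype([WT_FRAGMENT, WT_FRAGMENT]))
print(call_genotype([WT_FRAGMENT, MUTANT_FRAGMENT]))
print(call_genotype([MUTANT_FRAGMENT, MUTANT_FRAGMENT]))
```

A sample giving no signal with either probe is reported separately in this sketch, since in practice the absence of both signals would more likely indicate a failed hybridisation than a genotype.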

SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANT: Pullman, William E Bodmer, Walter F Durbin, Helga

(ii) TITLE OF INVENTION: Gene

(iii) NUMBER OF SEQUENCES: 10

(iv) CORRESPONDENCE ADDRESS:

(A) ADDRESSEE: Eric Potter Clarkson

(B) STREET: St Mary's Court, St Mary's Gate

(C) CITY: Nottingham

(D) STATE: Nottinghamshire

(E) COUNTRY: United Kingdom

(F) ZIP: NG1 1LE

(v) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Floppy disk

(B) COMPUTER: IBM PC compatible

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: PatentIn Release #1.0, Version #1.25

(vi) CURRENT APPLICATION DATA:

(A) APPLICATION NUMBER:

(B) FILING DATE:

(C) CLASSIFICATION:

(vii) PRIOR APPLICATION DATA:

(A) APPLICATION NUMBER: GB 9122501.1

(B) FILING DATE: 22-OCT-1991

(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: Bassett, Richard S

(C) REFERENCE/DOCKET NUMBER: IMPF/P11106PC

(ix) TELECOMMUNICATION INFORMATION:

(A) TELEPHONE: (0602) 585800

(B) TELEFAX: (0602) 588122

(C) TELEX: 37540 Potter G

(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 445 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..426

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 1..445

(D) OTHER INFORMATION: /function= "Wild type CAR gene"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

ATG GGG AAG GTT ATC AGT GCT TCC CGA GTG AGC ATG GAA CAC TTC GAG 48
Met Gly Lys Val Ile Ser Ala Ser Arg Val Ser Met Glu His Phe Glu
1 5 10 15

TTC CAG GGT TAT AGA CAG TCG TTC CCA GTG TGG CTG AGG CCA CCC AGA 96
Phe Gln Gly Tyr Arg Gln Ser Phe Pro Val Trp Leu Arg Pro Pro Arg
20 25 30

GGC AGC AGA GCA TTC AGA CTC CAA ACA GAC CCC TGT TCA TGC CGA CGC 144
Gly Ser Arg Ala Phe Arg Leu Gln Thr Asp Pro Cys Ser Cys Arg Arg
35 40 45

TTG CAC GAC CGC CCC AGT TCC TGT GGC TCC CTC GGA ATG CTA AGG GGA 192
Leu His Asp Arg Pro Ser Ser Cys Gly Ser Leu Gly Met Leu Arg Gly
50 55 60

TCG GAC ATG AAA GGA CCC TGT GAG CCG ATT GTC CTA TCT CCA GCG GCC 240
Ser Asp Met Lys Gly Pro Cys Glu Pro Ile Val Leu Ser Pro Ala Ala
65 70 75 80

CTG TCA TCC AGC TCA CTC ATC AAT GGG GCC AGT CAG GCC CAG GCA CTG 288
Leu Ser Ser Ser Ser Leu Ile Asn Gly Ala Ser Gln Ala Gln Ala Leu
85 90 95

GGC TCC GGA GGA CTC ACC ACT GCC CCC TGC TGC CAT GTG GAC TGG TGC 336
Gly Ser Gly Gly Leu Thr Thr Ala Pro Cys Cys His Val Asp Trp Cys
100 105 110

AAG TTG AGG ACT TCT TGC TGG TCT AGT CAC GCA TGC AGT GTT GGG GAT 384
Lys Leu Arg Thr Ser Cys Trp Ser Ser His Ala Cys Ser Val Gly Asp
115 120 125

GCC TTG GTT TTT ACT GCT CTG AGA ATT GTT GAG ATA CTT TAC 426
Ala Leu Val Phe Thr Ala Leu Arg Ile Val Glu Ile Leu Tyr
130 135 140

TAATAAACTG TGTAGTTGG 445

(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 142 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 1..142

(D) OTHER INFORMATION: /function= "Wild type CAR protein"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

Met Gly Lys Val Ile Ser Ala Ser Arg Val Ser Met Glu His Phe Glu
1               5                   10                  15

Phe Gln Gly Tyr Arg Gln Ser Phe Pro Val Trp Leu Arg Pro Pro Arg
            20                  25                  30

Gly Ser Arg Ala Phe Arg Leu Gln Thr Asp Pro Cys Ser Cys Arg Arg
        35                  40                  45

Leu His Asp Arg Pro Ser Ser Cys Gly Ser Leu Gly Met Leu Arg Gly
    50                  55                  60

Ser Asp Met Lys Gly Pro Cys Glu Pro Ile Val Leu Ser Pro Ala Ala
65                  70                  75                  80

Leu Ser Ser Ser Ser Leu Ile Asn Gly Ala Ser Gln Ala Gln Ala Leu
                85                  90                  95

Gly Ser Gly Gly Leu Thr Thr Ala Pro Cys Cys His Val Asp Trp Cys
            100                 105                 110

Lys Leu Arg Thr Ser Cys Trp Ser Ser His Ala Cys Ser Val Gly Asp
        115                 120                 125

Ala Leu Val Phe Thr Ala Leu Arg Ile Val Glu Ile Leu Tyr
    130                 135                 140
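As a consistency check on the two listings above, the coding region of SEQ ID NO:1 can be translated with the standard genetic code and compared with SEQ ID NO:2. The following is a minimal sketch only; the names `CODON_TABLE`, `CAR_CDNA` and `translate` are illustrative and form no part of the application, and the nucleotide string is a transcription of SEQ ID NO:1 as listed.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), built in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# SEQ ID NO:1 transcribed from the listing, 48 nucleotides per row.
CAR_CDNA = (
    "ATGGGGAAGGTTATCAGTGCTTCCCGAGTGAGCATGGAACACTTCGAG"
    "TTCCAGGGTTATAGACAGTCGTTCCCAGTGTGGCTGAGGCCACCCAGA"
    "GGCAGCAGAGCATTCAGACTCCAAACAGACCCCTGTTCATGCCGACGC"
    "TTGCACGACCGCCCCAGTTCCTGTGGCTCCCTCGGAATGCTAAGGGGA"
    "TCGGACATGAAAGGACCCTGTGAGCCGATTGTCCTATCTCCAGCGGCC"
    "CTGTCATCCAGCTCACTCATCAATGGGGCCAGTCAGGCCCAGGCACTG"
    "GGCTCCGGAGGACTCACCACTGCCCCCTGCTGCCATGTGGACTGGTGC"
    "AAGTTGAGGACTTCTTGCTGGTCTAGTCACGCATGCAGTGTTGGGGAT"
    "GCCTTGGTTTTTACTGCTCTGAGAATTGTTGAGATACTTTAC"
    "TAATAAACTGTGTAGTTGG"
)


def translate(dna: str) -> str:
    """Translate from the first nucleotide up to the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[i:i + 3]]
        if residue == "*":  # stop codon: end of the open reading frame
            break
        protein.append(residue)
    return "".join(protein)


car_protein = translate(CAR_CDNA)
print(len(CAR_CDNA), len(car_protein))
print(car_protein)
```

Translation stops at the TAA codon that begins at nucleotide 427, so the sketch yields 142 residues from a 445-nucleotide sequence, in agreement with the stated CDS location (1..426) and the stated length of SEQ ID NO:2.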

(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 121 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 1..121

(D) OTHER INFORMATION: /function= "mutant CAR protein"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

Met Gly Lys Val Ile Ser Ala Ser Arg Val Ser Met Glu His Phe Glu
1               5                   10                  15

Phe Gln Gly Tyr Arg Gln Ser Phe Pro Val Trp Leu Arg Pro Pro Arg
            20                  25                  30

Gly Ser Arg Ala Phe Arg Leu Gln Thr Asp Pro Cys Ser Cys Arg Arg
        35                  40                  45

Leu His Asp Arg Pro Ser Ser Cys Gly Ser Leu Gly Met Leu Arg Gly
    50                  55                  60

Ser Asp Met Lys Gly Pro Cys Glu Pro Ile Val Leu Ser Pro Ala Ala
65                  70                  75                  80

Leu Ser Ser Ser Ser Leu Ile Asn Gly Ala Thr Gln Ser Gly Pro Gly
                85                  90                  95

Thr Gly Leu Arg Arg Thr His His Cys Pro Leu Leu Pro Cys Gly Leu
            100                 105                 110

Val Gln Val Glu Asp Phe Leu Leu Val
        115                 120

(2) INFORMATION FOR SEQ ID NO:4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 15 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(ix) FEATURE:

(A) NAME/KEY: Peptide

(B) LOCATION: 1..15

(D) OTHER INFORMATION: /note= "peptide 1"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:

Gly Ala Ser Gln Ala Gln Ala Leu Gly Ser Gly Gly Leu Thr Thr
1               5                   10                  15

(2) INFORMATION FOR SEQ ID NO:5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 12 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(ix) FEATURE:

(A) NAME/KEY: Peptide

(B) LOCATION: 1..12

(D) OTHER INFORMATION: /note= "peptide 2"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:

Met Leu Arg Gly Ser Asp Met Lys Gly Pro Cys Glu
1               5                   10

(2) INFORMATION FOR SEQ ID NO:6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 11 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(ix) FEATURE:

(A) NAME/KEY: Peptide

(B) LOCATION: 1..11

(D) OTHER INFORMATION: /note= "peptide 3"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:

Cys Arg Cys Leu His Asp Arg Pro Ser Ser Cys
1               5                   10

(2) INFORMATION FOR SEQ ID NO:7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(ix) FEATURE:

(A) NAME/KEY: Peptide

(B) LOCATION: 1..10

(D) OTHER INFORMATION: /note= "peptide 4"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:

Val Trp Leu Arg Pro Pro Arg Gly Ser Arg
1               5                   10

(2) INFORMATION FOR SEQ ID NO:8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(ix) FEATURE:

(A) NAME/KEY: Peptide

(B) LOCATION: 1..10

(D) OTHER INFORMATION: /note= "peptide 5"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:

Gln Ser Gly Pro Gly Thr Gly Leu Arg Arg
1               5                   10
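The positions of peptides 1 to 5 (SEQ ID NOS:4 to 8) within the wild-type and mutant proteins (SEQ ID NOS:2 and 3) can be found by a simple substring search over one-letter codes. The sketch below is illustrative only: the one-letter strings are transcriptions of the listings above, and the helper names `locate` and `closest_window` are hypothetical, not part of the application.

```python
# Residues 1-80 are identical in SEQ ID NO:2 and SEQ ID NO:3.
COMMON = (
    "MGKVISASRVSMEHFEFQGYRQSFPVWLRPPRGSRAFRLQ"
    "TDPCSCRRLHDRPSSCGSLGMLRGSDMKGPCEPIVLSPAA"
)
WILD_TYPE = COMMON + "LSSSSLINGASQAQALGSGGLTTAPCCHVDWCKLRTSCWSSHACSVGDALVFTALRIVEILY"
MUTANT = COMMON + "LSSSSLINGATQSGPGTGLRRTHHCPLLPCGLVQVEDFLLV"

PEPTIDES = {
    "peptide 1": "GASQAQALGSGGLTT",  # SEQ ID NO:4
    "peptide 2": "MLRGSDMKGPCE",     # SEQ ID NO:5
    "peptide 3": "CRCLHDRPSSC",      # SEQ ID NO:6
    "peptide 4": "VWLRPPRGSR",       # SEQ ID NO:7
    "peptide 5": "QSGPGTGLRR",       # SEQ ID NO:8
}


def locate(peptide, protein):
    """Return the 1-based start of an exact match, or None if absent."""
    index = protein.find(peptide)
    return index + 1 if index >= 0 else None


def closest_window(peptide, protein):
    """Return (1-based start, mismatches) of the best ungapped alignment."""
    width = len(peptide)
    scored = [
        (sum(a != b for a, b in zip(peptide, protein[i:i + width])), i + 1)
        for i in range(len(protein) - width + 1)
    ]
    mismatches, start = min(scored)
    return start, mismatches


for name, peptide in PEPTIDES.items():
    print(name, locate(peptide, WILD_TYPE), locate(peptide, MUTANT))
print("peptide 3 closest wild-type window:", closest_window(PEPTIDES["peptide 3"], WILD_TYPE))
```

On these transcriptions, peptides 1, 2 and 4 match the wild-type protein exactly at residues 89, 61 and 26 respectively, and peptide 5 matches only the mutant protein, at residue 92. Peptide 3 has no exact match; its closest wild-type window starts at residue 46 and differs at a single position (Cys in the peptide, Arg in the protein). Whether that difference is intentional is not stated in this section.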

(2) INFORMATION FOR SEQ ID NO:9:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 1..20

(D) OTHER INFORMATION: /function= "oligo1"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:

AATGGGGCCA GTCAGGCCCA

(2) INFORMATION FOR SEQ ID NO:10:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 1..20

(D) OTHER INFORMATION: /function= "oligo2"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:

TGGGGCCACA CAGTCAGGCC
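This section does not state how oligo1 and oligo2 (SEQ ID NOS:9 and 10) relate to the other sequences. The sketch below is an inference from the listed sequences alone and should be read as such. It tests whether each oligonucleotide occurs in SEQ ID NO:1, and whether placing four extra bases into SEQ ID NO:1 so that it contains oligo2 gives a reading frame consistent with the 121-residue mutant protein of SEQ ID NO:3. The names `CAR_CDNA`, `translate` and `insertion_variant` are hypothetical, and the four-base insertion size is an assumption chosen for illustration.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), built in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# SEQ ID NO:1 transcribed from the listing, 48 nucleotides per row.
CAR_CDNA = (
    "ATGGGGAAGGTTATCAGTGCTTCCCGAGTGAGCATGGAACACTTCGAG"
    "TTCCAGGGTTATAGACAGTCGTTCCCAGTGTGGCTGAGGCCACCCAGA"
    "GGCAGCAGAGCATTCAGACTCCAAACAGACCCCTGTTCATGCCGACGC"
    "TTGCACGACCGCCCCAGTTCCTGTGGCTCCCTCGGAATGCTAAGGGGA"
    "TCGGACATGAAAGGACCCTGTGAGCCGATTGTCCTATCTCCAGCGGCC"
    "CTGTCATCCAGCTCACTCATCAATGGGGCCAGTCAGGCCCAGGCACTG"
    "GGCTCCGGAGGACTCACCACTGCCCCCTGCTGCCATGTGGACTGGTGC"
    "AAGTTGAGGACTTCTTGCTGGTCTAGTCACGCATGCAGTGTTGGGGAT"
    "GCCTTGGTTTTTACTGCTCTGAGAATTGTTGAGATACTTTAC"
    "TAATAAACTGTGTAGTTGG"
)
OLIGO1 = "AATGGGGCCAGTCAGGCCCA"  # SEQ ID NO:9
OLIGO2 = "TGGGGCCACACAGTCAGGCC"  # SEQ ID NO:10


def translate(dna):
    """Translate from the first nucleotide up to the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


def insertion_variant(reference, oligo, size):
    """Insert `size` bases into `reference` so that it contains `oligo`.

    Tries removing each run of `size` bases from the oligo; if the shortened
    oligo occurs in the reference, the removed bases are inserted there.
    Returns None when no such placement exists.
    """
    for i in range(len(oligo) - size + 1):
        trimmed = oligo[:i] + oligo[i + size:]
        pos = reference.find(trimmed)
        if pos >= 0:
            cut = pos + i
            return reference[:cut] + oligo[i:i + size] + reference[cut:]
    return None


print(OLIGO1 in CAR_CDNA, OLIGO2 in CAR_CDNA)
variant = insertion_variant(CAR_CDNA, OLIGO2, 4)
variant_protein = translate(variant)
print(len(variant_protein), variant_protein[-9:])
```

On these transcriptions, oligo1 occurs in SEQ ID NO:1 and oligo2 does not. The four-base variant containing oligo2 translates to 121 residues ending Val-Gln-Val-Glu-Asp-Phe-Leu-Leu-Val, the same length and C-terminus as SEQ ID NO:3.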

Claims

1. A method of identifying a gene useful in modulating the adhesion of cells to substrates, the method comprising:

(i) selecting a cell population with a given level of adhesion to the substrate (termed hereinafter the "original cells"),

(ii) transfecting at least one original cell with a gene to provide expression of the gene in the cell or with antisense DNA corresponding to a gene,

(iii) cloning thus-transfected cells, and

(iv) determining whether the transfected cells adhere to the substrate more or less than the original cells.

2. A method according to Claim 1 wherein the substrate contains collagen.

3. A method according to Claim 1 or 2 wherein the original cells do not adhere to the substrate; in step (ii), a gene is transfected into the cell; and step (iv) comprises determining whether the transfected cells adhere to the substrate more than the original cells.

4. An isolated gene identified as useful in the method of any one of Claims 1 to 3.

5. The isolated Cell Adhesion Regulator (CAR) gene shown below and variations thereof:

atggggaaggttatcagtgcttcccgagtgagcatggaacacttcgagttcca gggttatagacagtcgttcccagtgtggctgaggccacccagaggcagcagag cattcagactccaaacagacccctgttcatgccgacgcttgcacgaccgcccc agttcctgtggctccctcggaatgctaaggggatcggacatgaaaggaccctg tgagccgattgtcctatctccagcggccctgtcatccagctcactcatcaatg gggccagtcaggcccaggcactgggctccggaggactcaccactgccccctgc tgccatgtggactggtgcaagttgaggacttcttgctggtctagtcacgcatg cagtgttggggatgccttggtttttactgctctgagaattgttgagatacttt actaataaactgtgtagttgg

6. A polypeptide encoded by a gene according to Claim 4.

7. A polypeptide comprising the Cell Adhesion Regulator (CAR) protein having the following sequence or a variant or fragment of the said protein:

MGKVISASRVSMEHFEFQGYRQSFPVWLRPPRGSRAFRLQTDPCSCRRLHD RPSSCGSLGMLRGSDMKGPCEPIVLSPAALSSSSLINGASQAQALGSGGLT TAPCCHVDWCKLRTSCWSSHACSVGDALVFTALRIVEILY

8. A polypeptide according to Claim 7 and having the sequence:

H₂N-Gly-Ala-Ser-Gln-Ala-Gln-Ala-Leu-Gly-Ser-Gly-Gly-Leu-Thr-Thr-COOH, or

H₂N-Met-Leu-Arg-Gly-Ser-Asp-Met-Lys-Gly-Pro-Cys-Glu-COOH, or

H₂N-Cys-Arg-Cys-Leu-His-Asp-Arg-Pro-Ser-Ser-Cys-COOH, or

H₂N-Val-Trp-Leu-Arg-Pro-Pro-Arg-Gly-Ser-Arg-COOH.

9. A method of producing a polypeptide according to Claim 7 or 8 by expressing a corresponding nucleotide sequence in a suitable host cell or by amino acid synthesis or by proteolytic degradation of a larger polypeptide.

10. An antibody specific for the polypeptide of Claim 7 or 8.

11. A method of identifying whether a cell has adhesion properties, the method comprising determining whether the function of the protein encoded by a gene according to Claim 4 or 5 is present in a sample.

12. A method according to Claim 11, the method comprising exposing the sample to an antibody according to Claim 10 and determining whether the antibody binds the protein.

13. A method according to Claim 12 wherein the sample is fixed tumour tissue.

14. A method according to Claim 11 comprising identifying at least partial functional loss of the gene encoding the protein or identifying a mutation leading to the expression of a mutant version of the said protein lacking in at least some extent of normal function.

15. A method according to any one of Claims 11 to 14 wherein the sample is tissue from a tumour of the breast, prostate or colon.

16. A method of increasing the ability of a cell to adhere to a substrate, the method comprising supplying to the cell the function of the polypeptide of any one of Claims 6 to 8.

17. A method according to Claim 16 comprising administering the said polypeptide to the cell or transfecting the cell with a nucleotide sequence encoding the said polypeptide.

18. An assay method for determining whether a compound is potentially useful in the treatment of cancers, the assay method comprising exposing to the compound a cell line defective in the function of the polypeptide of any one of Claims 6 to 8 and determining whether the compound causes the cell line to adhere more readily to a substrate.

19. An assay method according to Claim 18 wherein the cell line comprises a coding sequence for the polypeptide having a mutation such that the polypeptide lacks an effective phosphorylation site.

20. A compound identifiable as useful in the assay method of Claim 18 or 19.

21. A method of treating cancer in a patient, the method comprising administering to the patient a compound according to Claim 20.