HK1129842A - Diagnosis and treatment of multiple sulfatase deficiency and other sulfatase deficiencies using a formylglycine generating enzyme (FGE)
- Publication number
- HK1129842A (application HK09107709.8A)
- Authority
- HK
- Hong Kong
- Prior art keywords
- seq
- fge
- sulfatase
- arylsulfatase
- nucleic acid
Description
This application is a divisional of the application of the same title, application No. 200480006490.0, filed February 10, 2004.
Technical Field
The present invention relates to methods and compositions for diagnosing and treating Multiple Sulfatase Deficiency (MSD) as well as other sulfatase deficiencies. More particularly, the invention relates to isolated molecules that modulate post-translational modification of sulfatases. Such modifications are necessary for the proper function of the sulfatase.
Background
Sulfatases are members of a highly conserved gene family, sharing extensive sequence homology (Franco, B. et al, Cell, 1995, 81: 15-25; Parenti, G. et al, Curr. Opin. Gen. Dev., 1997, 7: 386-391), high structural similarity (Bond, C.S. et al, Structure, 1997, 5: 277-289; Lukatela, G. et al, Biochemistry, 1998, 37: 3654-64), and a unique post-translational modification that is necessary for sulfate ester cleavage (Schmidt, B. et al, Cell, 1995, 82: 271-278; Selmer, T. et al, Eur. J. Biochem., 1996, 238: 341-345). The post-translational modification involves oxidation, at Cβ, of a conserved cysteine (in eukaryotes) or serine (in certain prokaryotes) residue to give L-Cα-formylglycine (also called FGly; 2-amino-3-oxopropanoic acid), in which an aldehyde group replaces the thiomethyl group of the side chain. The aldehyde group is an essential part of the catalytic site of the sulfatase and is likely to act as an aldehyde hydrate. One of the geminal hydroxyl groups accepts the sulfate group during sulfate ester cleavage, resulting in the formation of a covalently sulfated enzyme intermediate. The other hydroxyl group is required for the subsequent elimination of the sulfate group and regeneration of the aldehyde group. This modification occurs in the endoplasmic reticulum during, or shortly after, import of the nascent sulfatase polypeptide and is directed by a short linear sequence surrounding the cysteine (or serine) residue to be modified. This highly conserved sequence is the hexapeptide L/V-C(S)-X-P-S-R (SEQ ID NO: 32), which occurs in the N-terminal region of all eukaryotic sulfatases and most frequently carries a hydroxyl or thiol group at residue X (Dierks, T. et al, Proc. Natl. Acad. Sci. U.S.A., 1997, 94: 11963-11968).
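The consensus hexapeptide described above can be expressed as a simple pattern search. The following is a minimal sketch, assuming a protein sequence given in one-letter amino acid code; the function name `find_fgly_motifs` and the toy sequence are hypothetical illustrations, not part of the specification, and a match indicates only that the motif is present, not that the residue is actually modified.

```python
import re

# Consensus hexapeptide from the text: L/V-C(S)-X-P-S-R.
# The second residue (Cys, or Ser in some prokaryotes) is the one converted to FGly.
FGLY_MOTIF = re.compile(r"[LV][CS].PSR")


def find_fgly_motifs(protein_seq):
    """Return (1-based position of the candidate FGly residue, hexapeptide) pairs."""
    seq = protein_seq.upper()
    return [(m.start() + 2, m.group()) for m in FGLY_MOTIF.finditer(seq)]


if __name__ == "__main__":
    # Hypothetical toy sequence, not a real sulfatase.
    toy = "MAAGLCTPSRAALLTG"
    for position, hexapeptide in find_fgly_motifs(toy):
        print(position, hexapeptide)
```

Because the pattern ends in the fixed residues P-S-R, two matches cannot overlap, so a plain left-to-right scan finds every occurrence.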
To date, 13 sulfatase genes have been identified in humans. They encode enzymes with different substrate specificities and subcellular localizations, such as lysosomes, the Golgi apparatus and the endoplasmic reticulum. Four of these genes, ARSC, ARSD, ARSE, and ARSF (encoding arylsulfatase C, D, E, and F, respectively), are located in the same chromosomal region (Xp22.3). They share significant sequence similarity and a nearly identical genomic organization, suggesting that they arose from duplication events that occurred recently in evolution (Franco B et al, Cell, 1995, 81: 15-25; Meroni G et al, Hum Mol Genet, 1996, 5: 423-31).
The identification of at least eight human monogenic diseases caused by the deficiency of individual sulfatase activities underscores the importance of sulfatases in human metabolism. Most of these disorders are lysosomal storage diseases in which the phenotypic consequences derive from the type and tissue distribution of the stored material. Among them are five different types of mucopolysaccharidosis (MPS types II, IIIA, IIID, IVA and VI), caused by deficiencies of sulfatases that act in the catabolism of mucopolysaccharides (Neufeld and Muenzer, 2001, The mucopolysaccharidoses, In The Metabolic and Molecular Bases of Inherited Disease, C.R. Scriver, A.L. Beaudet, W.S. Sly, D. Valle, B. Childs, K.W. Kinzler and B. Vogelstein eds, New York: McGraw-Hill, pp. 3421-3452), and metachromatic leukodystrophy (MLD), which is characterized by the storage of sulfatides in the central and peripheral nervous systems and causes severe and progressive neurodegeneration. Two additional human diseases are caused by deficiencies of non-lysosomal sulfatases. These are X-linked ichthyosis, a skin disorder caused by steroid sulfatase (STS/ARSC) deficiency, and chondrodysplasia punctata, a disorder affecting bone and cartilage caused by arylsulfatase E (ARSE) deficiency. Sulfatases are also implicated in drug-induced human malformation syndromes, such as warfarin embryopathy, which is caused by inhibition of ARSE activity through in utero exposure to warfarin during pregnancy.
Among human monogenic diseases, multiple sulfatase deficiency (MSD) is notable in that all sulfatase activities are simultaneously deficient. Consequently, the phenotype of this severe multisystemic disease combines the features observed in the individual sulfatase deficiencies. Cells from patients with multiple sulfatase deficiency lack sulfatase activity even after transfection with cDNAs encoding human sulfatases, suggesting the existence of a common mechanism required for the activity of all sulfatases (Rommerskirch and von Figura, Proc. Natl. Acad. Sci., USA, 1992, 89: 2561-). The post-translational modification of sulfatases was found to be defective in patients with multiple sulfatase deficiency, suggesting that this disease is caused by mutation of a gene involved in the machinery that converts cysteine to formylglycine (Schmidt, B. et al, Cell, 1995, 82: 271-278). Despite strong biological and medical interest, efforts to identify this gene have been hampered by the rarity of multiple sulfatase deficiency patients and the consequent lack of familial cases suitable for genetic mapping.
Summary of the Invention
The invention provides methods and compositions for the diagnosis and treatment of multiple sulfatase deficiency (MIM 272200) and for the treatment of other sulfatase deficiencies. More specifically, a gene has been identified that encodes the Formylglycine Generating Enzyme (FGE), the enzyme responsible for the unique post-translational modification of sulfatases (formation of L-Cα-formylglycine; also called FGly and/or 2-amino-3-oxopropanoic acid) that is necessary for sulfatase function. It has been found, unexpectedly, that mutations in the FGE gene result in the development of Multiple Sulfatase Deficiency (MSD) in a subject. It has also been found, unexpectedly, that FGE enhances the activity of sulfatases, including, but not limited to, iduronate 2-sulfatase, sulfamidase, N-acetylgalactosamine 6-sulfatase, N-acetylglucosamine 6-sulfatase, arylsulfatase A, arylsulfatase B, arylsulfatase C, arylsulfatase D, arylsulfatase E, arylsulfatase F, arylsulfatase G, HSulf-1, HSulf-2, HSulf-3, HSulf-4, HSulf-5, and HSulf-6. In view of these findings, the molecules of the present invention are useful in the diagnosis and treatment of multiple sulfatase deficiency as well as other sulfatase deficiencies.
Methods of using the molecules of the invention in the diagnosis of multiple sulfatase deficiency are provided.
In addition, methods of using these molecules in vivo or in vitro for the purpose of modulating FGly formation on sulfatases, methods of treating conditions associated with such modifications, and compositions useful for preparing pharmaceutical formulations for the treatment of multiple and other sulfatase deficiencies are also provided.
The invention thus includes, in several aspects, polypeptides that modulate FGly formation on sulfatases, isolated nucleic acids encoding those polypeptides, functional modifications and variants thereof, useful fragments thereof, and therapeutic, diagnostic and research methods, compositions and tools related thereto.
According to one aspect of the invention, an isolated nucleic acid molecule is provided, selected from the group consisting of: (a) nucleic acid molecules which hybridize under stringent conditions to a molecule consisting of the nucleotide sequence set forth as SEQ ID NO: 1 and which encode a Formylglycine Generating Enzyme (FGE) polypeptide having Cα-formylglycine generating activity; (b) nucleic acid molecules which differ from the nucleic acid molecules of (a) in codon sequence due to the degeneracy of the genetic code; and (c) the complementary strands of (a) or (b). In certain embodiments, the isolated nucleic acid molecule comprises SEQ ID NO: 1. In certain embodiments, the isolated nucleic acid molecule consists of SEQ ID NO: 3 or a fragment thereof.
Another aspect of the invention provides an isolated nucleic acid molecule selected from the group consisting of: (a) a unique fragment of the nucleotide sequence set forth as SEQ ID NO: 1, and (b) the complementary strand of (a), with the proviso that the unique fragment of (a) comprises a contiguous nucleotide sequence that is not identical to any sequence selected from the sequence group consisting of (1) sequences identical to SEQ ID NO: 4 and/or to nucleotides 20-1141 of SEQ ID NO: 4, and (2) the complementary strands of the nucleic acid molecules of (1). In any of the preceding embodiments, the complementary strand refers to the full-length complementary strand.
In one embodiment, the contiguous nucleotide sequence is selected from the group consisting of (1) at least two contiguous nucleotides that are not identical to the sequence set, (2) at least three contiguous nucleotides that are not identical to the sequence set, (3) at least four contiguous nucleotides that are not identical to the sequence set, (4) at least five contiguous nucleotides that are not identical to the sequence set, (5) at least six contiguous nucleotides that are not identical to the sequence set, and (6) at least seven contiguous nucleotides that are not identical to the sequence set.
In another embodiment, the fragment size is selected from at least: 8 nucleotides, 10 nucleotides, 12 nucleotides, 14 nucleotides, 16 nucleotides, 18 nucleotides, 20 nucleotides, 22 nucleotides, 24 nucleotides, 26 nucleotides, 28 nucleotides, 30 nucleotides, 40 nucleotides, 50 nucleotides, 75 nucleotides, 100 nucleotides, 200 nucleotides, 1000 nucleotides and each integer length therebetween.
According to a further aspect, the present invention provides an expression vector comprising the nucleic acid molecule described above and a host cell transformed or transfected with the expression vector.
According to yet another aspect, the invention provides a cell expressing an activated form of an endogenous FGE gene. In one embodiment, activation of the endogenous FGE gene occurs by homologous recombination.
According to yet another aspect of the invention, isolated polypeptides are provided. The isolated polypeptide is encoded by a nucleic acid molecule of the invention as described above. In certain embodiments, the isolated polypeptide is encoded by SEQ ID NO: 1, giving rise to a polypeptide having the sequence of SEQ ID NO: 2 and having Cα-formylglycine generating activity. In other embodiments, the isolated polypeptide may be a fragment or variant of the foregoing that is long enough to represent a unique sequence in the human genome and that has Cα-formylglycine generating activity, with the proviso that the fragment comprises a sequence of consecutive amino acids which is not identical to any sequence encoded by the nucleic acid sequence having SEQ ID NO: 4. In another embodiment, immunogenic fragments of the above polypeptide molecules are provided. The immunogenic fragments may or may not have Cα-formylglycine generating activity.
According to a further aspect of the invention, an isolated binding polypeptide is provided which selectively binds to a polypeptide encoded by a nucleic acid molecule of the invention as described above. Preferably, the isolated binding polypeptide selectively binds to a polypeptide comprising SEQ ID NO: 2, to fragments thereof, or to a polypeptide belonging to the family of isolated polypeptides having Cα-formylglycine generating activity described elsewhere herein. In preferred embodiments, isolated binding polypeptides include antibodies and antibody fragments (e.g., Fab, F(ab)2, Fd, and antibody fragments comprising a CDR3 region that selectively binds to the FGE polypeptide). In certain embodiments, the antibody is a human antibody. In certain embodiments, the antibody is a monoclonal antibody. In one embodiment, the antibody is a polyclonal antiserum. In a further embodiment, the antibody is humanized. In still further embodiments, the antibody is chimeric.
According to a further aspect of the invention, a family of isolated polypeptides having Cα-formylglycine generating activity is provided. Each of said polypeptides comprises, from amino-terminus to carboxy-terminus: (a) an amino-terminal subdomain 1; (b) a subdomain 2; and (c) a carboxy-terminal subdomain 3 comprising from 35 to 45 amino acids, wherein subdomain 3 has at least about 75% homology to, and substantially the same length as, subdomain 3 of a polypeptide selected from the group consisting of SEQ ID NO: 2, 5, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, and 78. In important embodiments, subdomain 2 contains from 120 to 140 amino acids. In a further important embodiment, at least 5% of the amino acids in subdomain 2 are tryptophan. In certain embodiments, subdomain 2 has at least 50% homology to subdomain 2 of a polypeptide selected from SEQ ID NO: 2, 5, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, and 78. In certain embodiments, subdomain 3 of each polypeptide has between about 80% and about 100% homology to subdomain 3 of a polypeptide selected from the group consisting of SEQ ID NO: 2, 5, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, and 78.
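The length and tryptophan-content criteria stated for subdomains 2 and 3 are simple enough to check mechanically. The following is a minimal sketch, assuming the subdomain boundaries have already been determined and the subdomains are supplied as one-letter amino acid strings; the function names are hypothetical, and the homology criteria are deliberately left out because they require a sequence alignment that this sketch does not attempt.

```python
def tryptophan_fraction(subdomain_seq):
    """Fraction of residues in the sequence that are tryptophan (W)."""
    seq = subdomain_seq.upper()
    return seq.count("W") / len(seq) if seq else 0.0


def check_subdomain_criteria(subdomain2, subdomain3):
    """Evaluate the length and tryptophan criteria named in the text.

    Homology to the reference subdomains is NOT evaluated here.
    """
    return {
        "subdomain2_length_120_to_140": 120 <= len(subdomain2) <= 140,
        "subdomain2_trp_at_least_5_percent": tryptophan_fraction(subdomain2) >= 0.05,
        "subdomain3_length_35_to_45": 35 <= len(subdomain3) <= 45,
    }


if __name__ == "__main__":
    # Hypothetical placeholder sequences, used only to exercise the checks.
    sub2 = "W" * 7 + "A" * 123   # 130 residues, 7 of them tryptophan
    sub3 = "G" * 40              # 40 residues
    print(check_subdomain_criteria(sub2, sub3))
```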
According to a further aspect of the invention, a method for determining the level of FGE expression in a subject is provided. The method includes measuring FGE expression in a test sample from a subject to determine the level of FGE expression in the subject. In certain embodiments, the measured FGE expression in the test sample is compared to FGE expression in a control sample (containing a known level of FGE expression). Expression is defined as FGE mRNA expression, FGE polypeptide expression, or FGE Cα-formylglycine generating activity as defined elsewhere herein. A variety of methods can be used to measure expression. Preferred embodiments of the invention include PCR and northern blotting for measuring mRNA expression, FGE monoclonal antibodies or FGE polyclonal antisera as agents for measuring FGE polypeptide expression, and measurement of FGE Cα-formylglycine generating activity.
In certain embodiments, tissue samples, such as biopsy samples, and biological fluids, such as blood, are used as test samples. FGE expression in a test sample of a subject is compared to FGE expression in a control sample.
According to another aspect of the invention, a method for identifying a substance useful in modulating Cα-formylglycine generating activity is provided. The method comprises (a) contacting a molecule having Cα-formylglycine generating activity with a candidate substance, (b) measuring the Cα-formylglycine generating activity of the molecule, and (c) comparing the measured Cα-formylglycine generating activity of the molecule with a control to determine whether the candidate substance modulates the Cα-formylglycine generating activity of the molecule, wherein the molecule is a nucleic acid molecule having a nucleotide sequence selected from the group consisting of SEQ ID NO: 1, 3, 4, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, and 80-87, or an expression product thereof (e.g., a peptide having a sequence selected from SEQ ID NO: 2, 5, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, and 78). In certain embodiments, the control is the Cα-formylglycine generating activity of the molecule measured in the absence of the candidate substance.
According to yet another aspect of the invention, a method of diagnosing multiple sulfatase deficiency in a subject is provided. The method comprises contacting a biological sample from a subject suspected of having multiple sulfatase deficiency with an agent that specifically binds to a molecule selected from the group consisting of: (i) an FGE nucleic acid molecule having the nucleotide sequence of SEQ ID NO: 1, 3, or 4, (ii) an expression product of the nucleic acid molecule of (i), or (iii) a fragment of the expression product of (ii); and measuring the amount of bound agent and thereby determining whether expression of the nucleic acid molecule or expression product thereof is abnormal, abnormal expression indicating that the subject has multiple sulfatase deficiency.
According to yet another aspect of the invention, methods are provided for diagnosing a disorder characterized by aberrant expression of a nucleic acid molecule or an expression product thereof. The method comprises contacting a biological sample from a subject with an agent, wherein the agent specifically binds to the nucleic acid molecule, expression product thereof, or fragment of the expression product thereof; and measuring the amount of bound agent and thereby determining whether expression of said nucleic acid molecule or expression product thereof is abnormal, abnormal expression being indicative of having said condition, wherein said nucleic acid molecule has the nucleotide sequence of SEQ ID NO: 1 and the condition is multiple sulfatase deficiency.
According to yet another aspect of the invention, methods are provided for determining multiple sulfatase deficiency in a subject characterized by aberrant expression of a nucleic acid molecule or an expression product thereof. The method comprises monitoring a sample from the patient for a parameter selected from the group consisting of: (i) a nucleic acid molecule having the nucleotide sequence of SEQ ID NO: 1, 3, or 4, or a nucleic acid molecule having a sequence derived from the FGE genomic locus, (ii) a polypeptide encoded by said nucleic acid molecule, (iii) a peptide derived from said polypeptide, and (iv) an antibody that selectively binds to said polypeptide or peptide, as a determination of multiple sulfatase deficiency in the subject. In certain embodiments, the sample is a biological fluid or tissue as described in any of the preceding embodiments. In certain embodiments, the monitoring step comprises contacting the sample with a detectable agent selected from the group consisting of: (a) an isolated nucleic acid molecule that selectively hybridizes to the nucleic acid molecule of (i) under stringent conditions, (b) an antibody that selectively binds to the polypeptide of (ii) or the peptide of (iii), and (c) a polypeptide or peptide that binds to the antibody of (iv). The antibody, polypeptide, peptide or nucleic acid can be labeled with a radioactive label or an enzyme. In further embodiments, the method further comprises assaying the sample for the peptide.
According to yet another aspect of the invention, a kit is provided. The kit comprises a package containing an agent that selectively binds to any of the foregoing FGE isolated nucleic acids or expression products thereof, and a control for comparison to a measured value of binding of the agent to any of the foregoing FGE isolated nucleic acids or expression products thereof. In certain embodiments, the control is a predetermined value for comparison to the measured value. In certain embodiments, the control comprises an epitope of an expression product of any of the foregoing FGE isolated nucleic acids. In one embodiment, the kit further comprises a second agent that selectively binds to a polypeptide selected from the group consisting of iduronate 2-sulfatase, sulfamidase, N-acetylgalactosamine 6-sulfatase, N-acetylglucosamine 6-sulfatase, arylsulfatase A, arylsulfatase B, arylsulfatase C, arylsulfatase D, arylsulfatase E, arylsulfatase F, arylsulfatase G, HSulf-1, HSulf-2, HSulf-3, HSulf-4, HSulf-5, and HSulf-6, or a peptide fragment thereof, and a control for comparison to a measured value of binding of the second agent to the polypeptide or peptide fragment thereof.
According to a further aspect of the invention, a method of treating multiple sulfatase deficiency is provided. The method comprises administering to a subject in need of such treatment an agent that modulates Cα-formylglycine generating activity, in an amount effective to treat multiple sulfatase deficiency in the subject. In certain embodiments, the method further comprises co-administering an agent selected from a nucleic acid molecule encoding iduronate-2-sulfatase, sulfamidase, N-acetylgalactosamine-6-sulfatase, N-acetylglucosamine-6-sulfatase, arylsulfatase A, arylsulfatase B, arylsulfatase C, arylsulfatase D, arylsulfatase E, arylsulfatase F, arylsulfatase G, HSulf-1, HSulf-2, HSulf-3, HSulf-4, HSulf-5, or HSulf-6, an expression product of the nucleic acid molecule, and a fragment of the expression product of the nucleic acid molecule. In certain embodiments, the agent that modulates Cα-formylglycine generating activity is an isolated nucleic acid molecule of the invention (e.g., a nucleic acid molecule as claimed in claims 1 to 8, or a nucleic acid having a sequence selected from the group consisting of SEQ ID NO: 1, 3, 4, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, and 80 to 87). In important embodiments, the agent that modulates Cα-formylglycine generating activity is a peptide of the invention (e.g., a peptide as claimed in claims 11-15, 19, 20, or a peptide having an amino acid sequence selected from the group consisting of SEQ ID NO: 2, 5, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, and 78). The agent that modulates Cα-formylglycine generating activity may be produced by a cell expressing an endogenous and/or exogenous FGE nucleic acid molecule. In important embodiments, the endogenous FGE nucleic acid molecule may be activated.
According to one aspect of the invention, a method for increasing Cα-formylglycine generating activity in a subject is provided. The method comprises administering to the subject an isolated FGE nucleic acid molecule of the invention (e.g., the nucleic acid molecule of claims 1-8, or a nucleic acid having a sequence selected from SEQ ID NOs: 1, 3, 4, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, and 80-87), and/or an expression product thereof, in an amount effective to increase Cα-formylglycine generating activity in the subject.
According to one aspect of the invention, methods are provided for treating a subject suffering from multiple sulfatase deficiency. The method comprises administering to a subject in need of such treatment an agent that modulates Cα-formylglycine generating activity, in an amount effective to increase Cα-formylglycine generating activity in the subject. In certain embodiments, the agent that modulates Cα-formylglycine generating activity is a sense nucleic acid of the invention (e.g., a nucleic acid molecule of claims 1-8, or a nucleic acid having a sequence selected from the group consisting of SEQ ID NOs: 1, 3, 4, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, and 80-87). In certain embodiments, the agent that modulates Cα-formylglycine generating activity is an isolated polypeptide of the invention (e.g., a polypeptide of claims 11-15, 19, 20, or a peptide having a sequence selected from the group consisting of SEQ ID NO: 2, 5, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, and 78).
According to yet another aspect of the invention, a method for increasing Cα-formylglycine generating activity in a cell is provided. The method comprises contacting the cell with an isolated nucleic acid molecule of the invention (e.g., the nucleic acid molecule of claims 1-8, or a nucleic acid molecule having a sequence selected from SEQ ID NOs: 1, 3, 4, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, and 80-87), or an expression product thereof, in an amount effective to increase Cα-formylglycine generating activity in the cell. In important embodiments, the method comprises activating an endogenous FGE gene to increase Cα-formylglycine generating activity in the cell.
According to a further aspect of the invention, a pharmaceutical composition is provided. Compositions comprise a pharmaceutically effective amount of an agent comprising an isolated nucleic acid molecule of the invention (e.g., the isolated nucleic acid molecule of any of claims 1-8, a FGE nucleic acid molecule having a sequence selected from SEQ ID NOs 1, 3, 4, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, and 80-87), or an expression product thereof, for treating multiple sulfatase deficiency, and a pharmaceutically acceptable carrier.
According to one aspect of the invention, methods are provided for identifying candidate substances useful in the treatment of multiple sulfatase deficiency. The method comprises determining the expression of a set of nucleic acid molecules in a cell or tissue under conditions which, in the absence of the candidate substance, permit an initial amount of expression of the set of nucleic acid molecules, wherein the set of nucleic acid molecules comprises at least one nucleic acid molecule selected from the group consisting of: (a) nucleic acid molecules which hybridize under stringent conditions to a molecule consisting of the nucleotide sequence set forth as SEQ ID NO: 1 and which encode a polypeptide having Cα-formylglycine generating activity (FGE), (b) nucleic acid molecules differing in codon sequence from the nucleic acid molecules of (a) due to the degeneracy of the genetic code, (c) a nucleic acid molecule having a sequence selected from the group consisting of SEQ ID NO: 1, 3, 4, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, and 80-87, and (d) the complementary strands of (a) or (b) or (c); contacting the cell or tissue with the candidate substance; and detecting a test amount of expression of the set of nucleic acid molecules, wherein an increase in the test amount of expression in the presence of the candidate substance relative to the initial amount of expression indicates that the candidate substance is useful in the treatment of multiple sulfatase deficiency.
According to a further aspect of the invention, methods are provided for preparing pharmaceutical agents useful in the treatment of multiple sulfatase deficiency and/or other sulfatase deficiencies.
According to yet another aspect of the invention, an array of solid phase nucleic acid molecules is provided. The array essentially consists of a set of nucleic acid molecules, expression products thereof or fragments thereof (either nucleic acid or polypeptide molecules) immobilized to a solid matrix, each nucleic acid molecule encoding a polypeptide selected from the group consisting of: SEQ ID NO.2, 5, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, and 78, iduronate-2-sulfatase, sulfamidase, N-acetylgalactosamine-6-sulfatase, N-acetylglucosamine-6-sulfatase, arylsulfatase A, arylsulfatase B, arylsulfatase C, arylsulfatase D, arylsulfatase E, arylsulfatase F, arylsulfatase G, HSulf-1, HSulf-2, HSulf-3, HSulf-4, HSulf-5, and HSulf-6. In certain embodiments, the solid phase array further comprises at least one control nucleic acid molecule. In certain embodiments, the set of nucleic acid molecules comprises at least one, at least two, at least three, at least four, or even at least five nucleic acid molecules, each selected from the group consisting of SEQ ID NO.2, 5, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, and 78, iduronate-2-sulfatase, sulfamidase, N-acetylgalactosamine-6-sulfatase, N-acetylglucosamine-6-sulfatase, arylsulfatase A, arylsulfatase B, arylsulfatase C, arylsulfatase D, arylsulfatase E, arylsulfatase F, arylsulfatase G, HSulf-1, HSulf-2, HSulf-3, HSulf-4, HSulf-5, and HSulf-6.
According to a further aspect of the invention, a method for treating sulfatase deficiency in a subject is provided. The method comprises administering to a subject in need of such treatment a sulfatase produced according to the invention, in an amount effective to treat the sulfatase deficiency in the subject, wherein the sulfatase deficiency is not multiple sulfatase deficiency. In an important embodiment, the sulfatase is produced by a cell that has been contacted with an agent that modulates Cα-formylglycine generating activity. In certain embodiments, the sulfatase deficiency includes, but is not limited to, mucopolysaccharidosis II (MPS II; Hunter syndrome), mucopolysaccharidosis IIIA (MPS IIIA; Sanfilippo syndrome A), mucopolysaccharidosis VIII (MPS VIII), mucopolysaccharidosis IVA (MPS IVA; Morquio syndrome A), mucopolysaccharidosis VI (MPS VI; Maroteaux-Lamy syndrome), metachromatic leukodystrophy (MLD), X-linked recessive chondrodysplasia punctata 1, or X-linked ichthyosis (steroid sulfatase deficiency). In certain embodiments, the agent that modulates Cα-formylglycine generating activity may be a nucleic acid molecule or peptide of the present invention. In one embodiment, the sulfatase and the agent that modulates Cα-formylglycine generating activity are co-expressed in the same cell. The sulfatase and/or the agent that modulates Cα-formylglycine generating activity may be of endogenous or exogenous origin. If endogenous, it may be activated (e.g., by inserting a strong promoter and/or other elements at appropriate locations known in the art). If exogenous, its expression may be driven by elements on the expression vector, or it may be targeted to an appropriate location in the cell's genome to allow for its enhanced expression (e.g., downstream of a strong promoter).
According to yet another aspect of the invention, a pharmaceutical composition is provided. Compositions comprise a pharmaceutically effective amount of an agent comprising an isolated nucleic acid molecule of the invention, or an expression product thereof, and a pharmaceutically acceptable carrier, for treating sulfatase deficiency.
According to a further aspect of the invention, a method for increasing sulfatase activity in a cell is provided. The method comprises contacting a cell expressing a sulfatase with an isolated nucleic acid molecule of the invention (e.g., an isolated nucleic acid molecule of any one of claims 1-8, or an FGE nucleic acid molecule having a sequence selected from SEQ ID NOs: 1, 3, 4, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, and 80-87) or an expression product thereof (e.g., a polypeptide of claims 11-15, 19, 20, or a peptide having a sequence selected from SEQ ID NOs: 2, 5, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, and 78) in an amount effective to increase sulfatase activity in the cell. The cell may express endogenous and/or exogenous sulfatase. In important embodiments, the endogenous sulfatase is activated. In certain embodiments, the sulfatase is iduronate-2-sulfatase, sulfamidase, N-acetylgalactosamine-6-sulfatase, N-acetylglucosamine-6-sulfatase, arylsulfatase A, arylsulfatase B, arylsulfatase C, arylsulfatase D, arylsulfatase E, arylsulfatase F, arylsulfatase G, HSulf-1, HSulf-2, HSulf-3, HSulf-4, HSulf-5, and/or HSulf-6. In certain embodiments, the cell is a mammalian cell.
According to yet another aspect of the invention, a pharmaceutical composition is provided. The composition comprises a sulfatase produced by a cell, in a pharmaceutically effective amount for treating sulfatase deficiency, and a pharmaceutically acceptable carrier, wherein the cell has been contacted with an agent comprising an isolated nucleic acid molecule of the invention (e.g., a nucleic acid molecule of claims 1-8, or a nucleic acid molecule having a sequence selected from SEQ ID NOs: 1, 3, 4, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, and 80-87), or an expression product thereof (e.g., a peptide selected from SEQ ID NOs: 2, 5, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, and 78).
According to yet another aspect of the invention, an isolated variant allele of a human FGE gene (which encodes a variant FGE polypeptide) is provided. The isolated variant allele encodes an amino acid sequence with at least one variation relative to SEQ ID NO: 2, wherein the at least one variation comprises: Met1Arg; Met1Val; Leu20Phe; Ser155Pro; Ala177Pro; Cys218Tyr; Arg224Trp; Asn259Ile; Pro266Leu; Ala279Val; Arg327Stop; Cys336Arg; Arg345Cys; Ala348Pro; Arg349Gln; Arg349Trp; Ser359Stop; or a combination thereof.
According to yet another aspect of the invention, an isolated human variant FGE polypeptide is provided. The isolated human variant FGE polypeptide comprises an amino acid sequence with at least one variation relative to SEQ ID NO: 2, wherein the at least one variation comprises: Met1Arg; Met1Val; Leu20Phe; Ser155Pro; Ala177Pro; Cys218Tyr; Arg224Trp; Asn259Ile; Pro266Leu; Ala279Val; Arg327Stop; Cys336Arg; Arg345Cys; Ala348Pro; Arg349Gln; Arg349Trp; Ser359Stop; or a combination thereof.
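As an illustration only, the residue-position-residue notation used for the variations listed above (for example, Ser155Pro) can be split into its parts with a few lines of Python. The function name and the tolerance for stray spaces and lowercase letters are assumptions made for this sketch; they are not part of the disclosure.

```python
import re

# Three-letter reference residue, position, then a three-letter residue or "Stop".
# The pattern tolerates stray spaces and mixed case (an assumption of this sketch).
VARIATION_RE = re.compile(r"^([A-Za-z]{3})\s*(\d+)\s*([A-Za-z]{3,4})$")

def parse_variation(text):
    """Split a variation such as 'Ser155Pro' into (reference, position, variant)."""
    match = VARIATION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized variation: {text!r}")
    ref, pos, var = match.groups()
    return ref.capitalize(), int(pos), var.capitalize()

print(parse_variation("Ser155Pro"))
print(parse_variation("arg327 Stop"))
```

The position is relative to the amino acid sequence of SEQ ID NO: 2, so position 1 is the initiating methionine.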
Antibodies raised against any of the aforementioned human variant FGE polypeptides as immunogens are also provided. Such antibodies include polyclonal, monoclonal, and chimeric antibodies, and may also be detectably labeled. The detectable label may comprise a radioactive element, a fluorescent chemical, or an enzyme.
According to yet another aspect of the invention, a sulfatase-producing cell is provided, wherein the ratio of active sulfatase to total sulfatase produced by the cell is increased. The cell comprises: (i) a sulfatase with enhanced expression, and (ii) a formylglycine generating enzyme with enhanced expression, wherein the ratio of active sulfatase to total sulfatase produced by the cell (i.e., the specific activity of the sulfatase) is increased by at least 5% relative to the ratio of active sulfatase to total sulfatase produced by a cell lacking the formylglycine generating enzyme. In certain embodiments, the ratio of active sulfatase to total sulfatase produced by the cell is increased by at least 10%, 15%, 20%, 50%, 100%, 200%, 500%, or 1000% relative to the ratio of active sulfatase to total sulfatase produced by a cell lacking the formylglycine generating enzyme.
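The comparison described above amounts to simple arithmetic, sketched below in Python. The function names and the sample measurements are hypothetical and serve only to show how the ratio of active to total sulfatase and its relative increase over a control cell could be computed.

```python
def specific_activity(active, total):
    """Ratio of active sulfatase to total sulfatase (both in the same units)."""
    if total <= 0:
        raise ValueError("total sulfatase must be positive")
    return active / total

def percent_increase(ratio_with_fge, ratio_without_fge):
    """Relative increase of the ratio, in percent, versus the control cell."""
    if ratio_without_fge <= 0:
        raise ValueError("control ratio must be positive")
    return 100.0 * (ratio_with_fge - ratio_without_fge) / ratio_without_fge

# Hypothetical measurements in arbitrary units, not data from the examples.
control = specific_activity(active=25.0, total=100.0)
with_fge = specific_activity(active=50.0, total=100.0)
print(percent_increase(with_fge, control))
```

With these made-up numbers the ratio doubles, which corresponds to a 100% increase and therefore exceeds the 5% threshold stated above.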
According to a further aspect of the invention, an improved method of treating sulfatase deficiency in a subject is provided. The method comprises administering to a subject in need of such treatment a sulfatase in an amount effective to treat the sulfatase deficiency in the subject, wherein the sulfatase has been contacted with a formylglycine generating enzyme in an amount effective to increase the specific activity of the sulfatase. In an important embodiment, the sulfatase is selected from iduronate-2-sulfatase, sulfamidase, N-acetylgalactosamine-6-sulfatase, N-acetylglucosamine-6-sulfatase, arylsulfatase A, arylsulfatase B, arylsulfatase C, arylsulfatase D, arylsulfatase E, arylsulfatase F, arylsulfatase G, HSulf-1, HSulf-2, HSulf-3, HSulf-4, HSulf-5, and HSulf-6. In certain embodiments, the formylglycine generating enzyme is encoded by the nucleic acid molecule of claims 1-8 or by a nucleic acid molecule having a sequence selected from the group consisting of SEQ ID NOs: 1, 3, 4, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, and 80-87. In certain embodiments, the formylglycine generating enzyme is a peptide of claims 11-15, 19, or 20, or a peptide having a sequence selected from the group consisting of SEQ ID NOs: 2, 5, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, and 78.
These and other objects of the invention will be further elaborated in conjunction with the detailed description of the invention.
Brief description of the sequences
SEQ ID NO: 1 is the nucleotide sequence of human FGE cDNA.
SEQ ID NO: 2 is the expected amino acid sequence of the translation product of the human FGE cDNA (SEQ ID NO: 1).
SEQ ID NO: 3 is the nucleotide sequence encoding the polypeptide of SEQ ID NO: 2 (i.e., nucleotides 20-1141 of SEQ ID NO: 1).
SEQ ID NO: 4 is the nucleotide sequence of GenBank Acc.No. AK075459.
SEQ ID NO: 5 is the expected amino acid sequence of the translation product of SEQ ID NO: 4, an unnamed protein product having GenBank Acc. No. BAC11634.
SEQ ID NO: 6 is the nucleotide sequence of human iduronate-2-sulfatase cDNA (GenBank Acc. No. M58342).
SEQ ID NO: 7 is the expected amino acid sequence of the translation product of the human iduronate-2-sulfatase cDNA (SEQ ID NO: 6).
SEQ ID NO: 8 is the nucleotide sequence of human sulfamidase cDNA (GenBank Acc.No. U30894).
SEQ ID NO: 9 is the expected amino acid sequence of the translation product of human sulfamidase cDNA (SEQ ID NO: 8).
SEQ ID NO: 10 is the nucleotide sequence of human N-acetylgalactosamine-6-sulfatase cDNA (GenBank Acc. No. U06088).
SEQ ID NO: 11 is the expected amino acid sequence of the translation product of human N-acetylgalactosamine-6-sulfatase cDNA (SEQ ID NO: 10).
SEQ ID NO: 12 is the nucleotide sequence of human N-acetylglucosamine-6-sulfatase cDNA (GenBank Acc. No. Z12173).
SEQ ID NO: 13 is the expected amino acid sequence of the translation product of human N-acetylglucosamine-6-sulfatase cDNA (SEQ ID NO: 12).
SEQ ID NO: 14 is the nucleotide sequence of human arylsulfatase A cDNA (GenBank Acc. No. X52151).
SEQ ID NO: 15 is the expected amino acid sequence of the translation product of the human arylsulfatase A cDNA (SEQ ID NO: 14).
SEQ ID NO: 16 is the nucleotide sequence of human arylsulfatase B cDNA (GenBank Acc.No. J05225).
SEQ ID NO: 17 is the expected amino acid sequence of the translation product of the human arylsulfatase B cDNA (SEQ ID NO: 16).
SEQ ID NO: 18 is the nucleotide sequence of human arylsulfatase C cDNA (GenBank Acc. No. J04964).
SEQ ID NO: 19 is the expected amino acid sequence of the translation product of the human arylsulfatase C cDNA (SEQ ID NO: 18).
SEQ ID NO: 20 is the nucleotide sequence of human arylsulfatase D cDNA (GenBank Acc. No. X83572).
SEQ ID NO: 21 is the expected amino acid sequence of the translation product of the human arylsulfatase D cDNA (SEQ ID NO: 20).
SEQ ID NO: 22 is the nucleotide sequence of human arylsulfatase E cDNA (GenBank Acc. No. X83573).
SEQ ID NO: 23 is the expected amino acid sequence of the translation product of the human arylsulfatase E cDNA (SEQ ID NO: 22).
SEQ ID NO: 24 is the nucleotide sequence of human arylsulfatase F cDNA (GenBank Acc. No. X97868).
SEQ ID NO: 25 is the expected amino acid sequence of the translation product of the human arylsulfatase F cDNA (SEQ ID NO: 24).
SEQ ID NO: 26 is the nucleotide sequence of human arylsulfatase G cDNA (GenBank Acc. No. BC012375).
SEQ ID NO: 27 is the expected amino acid sequence of the translation product of human arylsulfatase G (SEQ ID NO: 26).
SEQ ID NO: 28 is the nucleotide sequence of HSulf-1 cDNA (GenBank Acc. No. AY101175).
SEQ ID NO: 29 is the expected amino acid sequence of the translation product of the HSulf-1 cDNA (SEQ ID NO: 28).
SEQ ID NO: 30 is the nucleotide sequence of HSulf-2 cDNA (GenBank Acc. No. AY101176).
SEQ ID NO: 31 is the expected amino acid sequence of the translation product of the HSulf-2 cDNA (SEQ ID NO: 30).
SEQ ID NO: 32 is the highly conserved hexapeptide L/V-FGly-X-P-S-R occurring in sulfatase.
SEQ ID NO: 33 is a synthetic FGly-forming substrate; the primary sequence is derived from human arylsulfatase A.
SEQ ID NO: 34 is the scrambled oligopeptide PVSLPTRSCAALLTGR.
SEQ ID NO: 35 is Ser69 oligopeptide PVSLSTPSRAALLTGR.
SEQ ID NO: 36 is the human FGE-specific primer 1199nc.
SEQ ID NO: 37 is the human FGE-specific forward primer 1c.
SEQ ID NO: 38 is the human FGE-specific reverse primer 1182c.
SEQ ID NO: 39 is a human 5' -FGE-specific primer containing an EcoRI site.
SEQ ID NO: 40 is an HA-specific primer.
SEQ ID NO: 41 is a c-myc-specific primer.
SEQ ID NO: 42 is an RGS-His6-specific primer.
SEQ ID NO: 43 is a tryptic oligopeptide SQNTPDSSASNLGFR from a human FGE preparation.
SEQ ID NO: 44 is a tryptic oligopeptide MVPIPAGVFTMGTDDPQIK from a human FGE preparation.
SEQ ID NO: 45 is the nucleotide sequence of the human FGE2 paralog (GenBank GI: 24308053).
SEQ ID NO: 46 is the expected amino acid sequence of the translation product of the human FGE2 paralog (SEQ ID NO: 45).
SEQ ID NO: 47 is the nucleotide sequence of mouse FGE paralog (GenBank GI: 26344956).
SEQ ID NO: 48 is the expected amino acid sequence of the translation product of the mouse FGE paralog (SEQ ID NO: 47).
SEQ ID NO: 49 is the nucleotide sequence of the mouse FGE ortholog (GenBank GI: 22122361).
SEQ ID NO: 50 is the expected amino acid sequence of the translation product of the mouse FGE ortholog (SEQ ID NO: 49).
SEQ ID NO: 51 is the nucleotide sequence of Drosophila FGE ortholog (GenBank GI: 20130397).
SEQ ID NO: 52 is the expected amino acid sequence of the translation product of the Drosophila FGE ortholog (SEQ ID NO: 51).
SEQ ID NO: 53 is the nucleotide sequence of mosquito FGE ortholog (GenBank GI: 21289310).
SEQ ID NO: 54 is the expected amino acid sequence of the translation product of a mosquito FGE ortholog (SEQ ID NO: 53).
SEQ ID NO: 55 is the nucleotide sequence of the closely related S. coelicolor FGE ortholog (GenBank GI: 21225812).
SEQ ID NO: 56 is the expected amino acid sequence of the translation product of the S. coelicolor FGE ortholog (SEQ ID NO: 55).
SEQ ID NO: 57 is the nucleotide sequence of the closely related C. efficiens FGE ortholog (GenBank GI: 25028125).
SEQ ID NO: 58 is the expected amino acid sequence of the translation product of the C. efficiens FGE ortholog (SEQ ID NO: 57).
SEQ ID NO: 59 is the nucleotide sequence of the N. aromaticivorans FGE ortholog (GenBank GI: 23108562).
SEQ ID NO: 60 is the expected amino acid sequence of the translation product of the N. aromaticivorans FGE ortholog (SEQ ID NO: 59).
SEQ ID NO: 61 is the nucleotide sequence of the M. loti FGE ortholog (GenBank GI: 13474559).
SEQ ID NO: 62 is the expected amino acid sequence of the translation product of the M. loti FGE ortholog (SEQ ID NO: 61).
SEQ ID NO: 63 is the nucleotide sequence of the B. fungorum FGE ortholog (GenBank GI: 22988809).
SEQ ID NO: 64 is the expected amino acid sequence of the translation product of the B. fungorum FGE ortholog (SEQ ID NO: 63).
SEQ ID NO: 65 is the nucleotide sequence of the S. meliloti FGE ortholog (GenBank GI: 16264068).
SEQ ID NO: 66 is the expected amino acid sequence of the translation product of the S. meliloti FGE ortholog (SEQ ID NO: 65).
SEQ ID NO: 67 is the nucleotide sequence of the Microscilla sp. FGE ortholog (GenBank GI: 14518334).
SEQ ID NO: 68 is the expected amino acid sequence of the translation product of the Microscilla sp. FGE ortholog (SEQ ID NO: 67).
SEQ ID NO: 69 is the nucleotide sequence of the P. putida KT2440 FGE ortholog (GenBank GI: 26990068).
SEQ ID NO: 70 is the expected amino acid sequence of the translation product of the P. putida KT2440 FGE ortholog (SEQ ID NO: 69).
SEQ ID NO: 71 is the nucleotide sequence of the R. metallidurans FGE ortholog (GenBank GI: 22975289).
SEQ ID NO: 72 is the expected amino acid sequence of the translation product of the R. metallidurans FGE ortholog (SEQ ID NO: 71).
SEQ ID NO: 73 is the nucleotide sequence of the P. marinus FGE ortholog (GenBank GI: 23132010).
SEQ ID NO: 74 is the expected amino acid sequence of the translation product of the P. marinus FGE ortholog (SEQ ID NO: 73).
SEQ ID NO: 75 is the nucleotide sequence of the C. crescentus CB15 FGE ortholog (GenBank GI: 16125425).
SEQ ID NO: 76 is the expected amino acid sequence of the translation product of the C. crescentus CB15 FGE ortholog (SEQ ID NO: 75).
SEQ ID NO: 77 is the nucleotide sequence of the M. tuberculosis H37Rv FGE ortholog (GenBank GI: 15607852).
SEQ ID NO: 78 is the expected amino acid sequence of the translation product of the M. tuberculosis H37Rv FGE ortholog (SEQ ID NO: 77).
SEQ ID NO: 79 is a highly conserved hexapeptide present in subdomain 3 of the FGE ortholog and paralogs.
SEQ ID NO: 80 is the nucleotide sequence of the FGE ortholog EST fragment having GenBank Acc. No. CA379852.
SEQ ID NO: 81 is the nucleotide sequence of the FGE ortholog EST fragment having GenBank Acc. No. AI721440.
SEQ ID NO: 82 is the nucleotide sequence of the FGE ortholog EST fragment having GenBank Acc. No. BJ505402.
SEQ ID NO: 83 is the nucleotide sequence of the FGE ortholog EST fragment having GenBank Acc. No. BJ054666.
SEQ ID NO: 84 is the nucleotide sequence of the FGE ortholog EST fragment having GenBank Acc. No. AL892419.
SEQ ID NO: 85 is the nucleotide sequence of the FGE ortholog EST fragment having GenBank Acc. No. CA064079.
SEQ ID NO: 86 is the nucleotide sequence of the FGE ortholog EST fragment having GenBank Acc. No. BF189614.
SEQ ID NO: 87 is the nucleotide sequence of the FGE ortholog EST fragment having GenBank Acc. No. AV609121.
SEQ ID NO: 88 is the HSulf-3 cDNA nucleotide sequence.
SEQ ID NO: 89 is the expected amino acid sequence of the translation product of the HSulf-3 cDNA (SEQ ID NO: 88).
SEQ ID NO: 90 is the HSulf-4 cDNA nucleotide sequence.
SEQ ID NO: 91 is the expected amino acid sequence of the translation product of the HSulf-4 cDNA (SEQ ID NO: 90).
SEQ ID NO: 92 is the HSulf-5 cDNA nucleotide sequence.
SEQ ID NO: 93 is the expected amino acid sequence of the translation product of the HSulf-5 cDNA (SEQ ID NO: 92).
SEQ ID NO: 94 is the HSulf-6 cDNA nucleotide sequence.
SEQ ID NO: 95 is the expected amino acid sequence of the translation product of the HSulf-6 cDNA (SEQ ID NO: 94).
Brief Description of Drawings
FIG. 1: MALDI-TOF mass spectrum representation of P23 after incubation in the absence (A) or presence (B) of a soluble extract from bovine testis microsomes.
FIG. 2: A phylogenetic tree derived from an alignment of human FGE and 21 proteins of the PFAM-DUF323 seed.
FIG. 3: Human and murine FGE gene loci. Exons are shown as dark boxes (human locus) and light boxes (murine locus). The numbers on the intron lines indicate the intron size in kb.
FIG. 4: a diagram showing the structure of the FGE expression plasmid pXMG.1.3.
FIG. 5: A histogram depicting N-acetylgalactosamine-6-sulfatase activity in 36F cells transiently transfected with the FGE expression plasmid.
FIG. 6: A histogram depicting the specific activity of N-acetylgalactosamine-6-sulfatase in 36F cells transiently transfected with the FGE expression plasmid.
FIG. 7: A histogram depicting N-acetylgalactosamine-6-sulfatase production in 36F cells transiently transfected with the FGE expression plasmid.
FIG. 8: Depicts iduronate-2-sulfatase activity in 30C6 cells transiently transfected with the FGE expression plasmid.
FIG. 9: Depicts a kit embodying features of the invention.
Detailed Description
The present invention involves the discovery of a gene encoding a Formylglycine Generating Enzyme (FGE), which is responsible for the unique post-translational modification occurring on sulfatases that is essential for sulfatase function: the formation of L-Cα-formylglycine (also known as FGly and/or 2-amino-3-oxopropanoic acid). It has been found that, surprisingly, mutations in the FGE gene cause the development of Multiple Sulfatase Deficiency (MSD) in a subject. It has also been found that, surprisingly, FGE enhances the activity of sulfatases, including, but not limited to, iduronate-2-sulfatase, sulfamidase, N-acetylgalactosamine-6-sulfatase, N-acetylglucosamine-6-sulfatase, arylsulfatase A, arylsulfatase B, arylsulfatase C, arylsulfatase D, arylsulfatase E, arylsulfatase F, arylsulfatase G, HSulf-1, HSulf-2, HSulf-3, HSulf-4, HSulf-5, and HSulf-6, and the sulfatases described in U.S. applications with publication numbers 20030073118, 20030147875, 20030148920, 20030162279, and 20030166283 (the contents of which are expressly incorporated herein). In view of these findings, the molecules of the invention may be used for the diagnosis and/or treatment of multiple sulfatase deficiency, as well as other sulfatase deficiencies.
Methods of using the molecules of the invention in diagnosing multiple sulfatase deficiency are provided.
In addition, methods of using these molecules in vivo or in vitro for the purpose of modulating FGly formation on sulfatases, methods of treating conditions associated with such modifications, and compositions useful in preparing therapeutic formulations for the treatment of multiple and other sulfatase deficiencies are also provided.
The invention thus includes, in several aspects, polypeptides that modulate FGly formation on sulfatases, isolated nucleic acids encoding these polypeptides, functional modifications and variants of the foregoing, useful fragments of the foregoing, as well as methods, compositions and tools related thereto for therapy, diagnosis and research.
"Cα-formylglycine generating activity" refers to the ability of a molecule to form FGly on a substrate or to enhance FGly formation. The substrate may be a sulfatase as described elsewhere herein, or a synthetic oligopeptide (see, e.g., SEQ ID NO: 33, and the Examples). The substrate preferably comprises the conserved hexapeptide of SEQ ID NO: 32 [L/V-C(S)-X-P-S-R]. Methods for assaying FGly formation are described in the art (see, e.g., Dierks, T. et al, Proc. Natl. Acad. Sci. U.S.A., 1997, 94: 11963-). As used herein, "molecule" includes "nucleic acids" and "polypeptides". FGE molecules are capable of forming FGly, or enhancing/increasing FGly formation, both in vivo and in vitro.
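For illustration, the conserved hexapeptide L/V-C(S)-X-P-S-R can be written as a regular expression and searched for in a candidate substrate. The sketch below is a plain text search in Python with a hypothetical function name; it says nothing about whether a peptide is actually modified by FGE, which the assays cited above determine experimentally. The two test peptides are the oligopeptides given as SEQ ID NO: 35 and SEQ ID NO: 34 in the sequence descriptions.

```python
import re

# L/V-C(S)-X-P-S-R: Leu or Val, then Cys (or Ser), any residue, Pro, Ser, Arg.
MOTIF = re.compile(r"[LV][CS].PSR")

def find_motif(peptide):
    """Return (0-based start index, matched hexapeptide), or None if absent."""
    match = MOTIF.search(peptide.upper())
    return (match.start(), match.group()) if match else None

print(find_motif("PVSLSTPSRAALLTGR"))  # SEQ ID NO: 35 (Ser69 oligopeptide)
print(find_motif("PVSLPTRSCAALLTGR"))  # SEQ ID NO: 34
```

The first peptide contains the linear motif (LSTPSR, with serine at the modifiable position), whereas the second contains the same residues in an order that no longer matches it.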
As used herein, "enhancing" (or "increasing") Cα-formylglycine generating activity typically refers to increased expression of FGE and/or the polypeptide it encodes. Increased expression refers to an increase (i.e., to a detectable extent) in replication, transcription, and/or translation of any of the nucleic acids of the invention (FGE nucleic acids as described elsewhere herein), since upregulation of any of these processes results in an increase in the concentration/amount of the polypeptide encoded by the gene (nucleic acid). Enhancing (or increasing) Cα-formylglycine generating activity also refers to preventing or inhibiting degradation (e.g., by increased ubiquitination), downregulation, etc., of FGE, resulting, for example, in an increased or stabilized t1/2 (half-life) of the FGE molecule relative to a control. Downregulation or decreased expression refers to decreased expression of a gene and/or the polypeptide it encodes. Upregulation or downregulation of gene expression can be measured directly by detecting an increase or decrease, respectively, in the level of mRNA of the gene (e.g., FGE) or in the level of protein expression of the polypeptide encoded by the gene, relative to a control, using any suitable method known in the art, such as nucleic acid hybridization or antibody detection methods. Upregulation or downregulation of FGE gene expression can also be detected by measuring a change in Cα-formylglycine generating activity.
As used herein, "expression" refers to nucleic acid and/or polypeptide expression, as well as to the activity of the polypeptide molecule (e.g., the Cα-formylglycine generating activity of the molecule).
One aspect of the invention involves the cloning of a cDNA encoding FGE. An FGE according to the invention is an isolated nucleic acid molecule that comprises SEQ ID NO: 1 and encodes a polypeptide having Cα-formylglycine generating activity. The sequence of the human FGE cDNA is presented as SEQ ID NO: 1, and the expected amino acid sequence of the protein product encoded by this cDNA is presented as SEQ ID NO: 2.
A subject as used herein is a mammal or a non-human mammal. In all embodiments, human FGE and human subjects are preferred.
The invention thus includes, in one aspect, isolated FGE polypeptides, cDNAs encoding such polypeptides, functional modifications and variants of the foregoing, useful fragments of the foregoing, and diagnostics and therapeutics related thereto.
The term "isolated" as used herein with respect to nucleic acids means: (i) amplified in vitro by, for example, polymerase chain reaction (PCR); (ii) recombinantly produced by cloning; (iii) purified, as by cleavage and gel separation; or (iv) synthesized by, for example, chemical synthesis. An isolated nucleic acid is one that is readily manipulated by recombinant DNA techniques well known in the art. Thus, a nucleotide sequence contained in a vector in which the 5' and 3' restriction sites are known, or for which PCR primer sequences have been disclosed, is considered isolated, but a nucleic acid that is present in its native state in its natural host is not. An isolated nucleic acid may be substantially purified, but need not be. For example, a nucleic acid that is isolated within a cloning or expression vector is not pure, in that it may make up only a small percentage of the material in the cell in which it resides. Such a nucleic acid is isolated, however, as that term is used herein, because it is readily manipulated by standard techniques known to those of ordinary skill in the art.
The term "isolated" as used herein with respect to a polypeptide means separated from its natural environment in a sufficiently pure form that it can be manipulated or used for any purpose of the invention. Thus, "isolated" means sufficiently pure to be used (i) for the production and/or isolation of antibodies, (ii) as a reagent in an assay, (iii) for sequencing, (iv) for therapy, and the like.
According to the invention, isolated nucleic acid molecules that encode an FGE polypeptide having Cα-formylglycine generating activity include: (a) nucleic acid molecules that hybridize under stringent conditions to a molecule consisting of SEQ ID NO: 1 and that encode an FGE polypeptide having Cα-formylglycine generating activity, (b) deletions, additions, and substitutions of (a) that encode a respective FGE polypeptide having Cα-formylglycine generating activity, (c) nucleic acid molecules that differ from the nucleic acid molecules of (a) or (b) in codon sequence due to the degeneracy of the genetic code, and (d) complementary strands of (a), (b), or (c). "Complementary strand" as used herein includes a full-length complementary strand or a 100% complementary strand of (a), (b), or (c).
Homologs and alleles of the FGE nucleic acids of the invention that also have Cα-formylglycine generating activity are encompassed by the invention. Homologs, as described herein, include the molecules identified elsewhere herein (see, e.g., SEQ ID NOs: 4, 5, 45-78, and 80-87), i.e., orthologs and paralogs. Further homologs can be identified according to the teachings of the present invention as well as by conventional techniques. Since the FGE homologs described herein all have Cα-formylglycine generating activity, they can be used interchangeably with the human FGE molecule in all aspects of the invention.
Accordingly, one aspect of the invention is those nucleic acid sequences that encode FGE polypeptides and that hybridize under stringent conditions to a nucleic acid molecule consisting of the coding region of SEQ ID NO: 1. In important embodiments, the term "stringent conditions" as used herein refers to parameters with which the art is familiar. For nucleic acids, hybridization conditions are said to be stringent typically at low ionic strength and at a temperature just below the melting temperature (Tm) of the DNA hybrid complex (typically, about 3 °C below the Tm of the hybrid). Higher stringency makes for a more specific correlation between the probe sequence and the target. The stringent conditions used in nucleic acid hybridization are well known in the art and can be found in references that compile such methods, e.g., Molecular Cloning: A Laboratory Manual, J. Sambrook et al., eds., Second Edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York, 1989, or Current Protocols in Molecular Biology, F.M. Ausubel et al., eds., John Wiley & Sons, Inc., New York. An example of "stringent conditions" is hybridization in 6× SSC at 65 °C. Another example of stringent conditions is hybridization at 65 °C in a hybridization buffer consisting of 3.5× SSC, 0.02% Ficoll, 0.02% polyvinylpyrrolidone, 0.02% bovine serum albumin, 2.5 mM NaH2PO4 [pH 7], 0.5% SDS, and 2 mM EDTA (SSC is 0.15 M sodium chloride/0.15 M sodium citrate, pH 7; SDS is sodium dodecyl sulfate; EDTA is ethylenediaminetetraacetic acid). After hybridization, the membrane to which the DNA has been transferred is washed in 2× SSC at room temperature and then in 0.1× SSC/0.1× SDS at temperatures up to 68 °C. In a further example, an alternative to an aqueous hybridization solution is a formamide hybridization solution. Stringent hybridization conditions can thus be achieved using, for example, a 50% formamide solution and 42 °C. There are other conditions, reagents, and so forth that can be used and that result in a similar degree of stringency.
The skilled artisan will be familiar with such conditions, and they are therefore not given here. It is to be understood, however, that the skilled artisan will be able to manipulate the conditions in a manner that permits the clear identification of homologs and alleles of the FGE nucleic acids of the invention. The skilled artisan is also familiar with methods for screening cells and libraries for expression of such molecules, which are then routinely isolated, followed by isolation of the relevant nucleic acid molecule and sequencing.
In general, homologs and alleles will in some cases share at least 50% nucleotide identity and/or at least 65% amino acid identity with SEQ ID NO: 1 and SEQ ID NO: 2, respectively, and in other cases will share at least 60% nucleotide identity and/or at least 75% amino acid identity. In further cases, homologs and alleles will share at least 90%, 95%, or even 99% nucleotide identity and/or at least 95%, 98%, or even 99% amino acid identity with SEQ ID NO: 1 and SEQ ID NO: 2, respectively. Identity can be calculated using a variety of publicly available software tools developed by NCBI (Bethesda, Maryland). Exemplary tools include the heuristic algorithm of Altschul SF et al. (J Mol Biol, 1990, 215: 403-). Pairwise and ClustalW alignments (BLOSUM30 matrix setting) as well as Kyte-Doolittle hydropathic analysis can be obtained using public (EMBL, Heidelberg, Germany) and commercial (e.g., the MacVector sequence analysis software from Oxford Molecular Group/Genetics Computer Group, Madison, WI) software packages. Watson-Crick complementary strands of the aforementioned nucleic acids are also encompassed by the present invention.
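As a rough illustration of what a percent-identity figure means, the Python sketch below counts matching positions in two sequences that have already been aligned. It is not the algorithm used by the software tools named above; the function name, the choice to ignore gap columns, and the sample sequences are all assumptions of this sketch, and different programs normalize identity differently.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gap columns (one common convention; others exist)
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared

# Hypothetical pre-aligned fragments, not taken from the sequence listing.
print(percent_identity("ATGGC-TTA", "ATGACGTTA"))
```

Here seven of the eight ungapped columns match, giving 87.5% identity under this convention.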
In screening for FGE-related genes, such as homologs and alleles of FGE, a Southern blot may be performed using the conditions described above together with a radioactive probe. After washing the membrane to which the DNA is finally transferred, the membrane can be placed against X-ray film or a phosphoimager plate to detect the radioactive signal.
Given the guidance given herein regarding full-length human FGE cDNA clones, other mammalian sequences corresponding to the human FGE gene, such as mouse cDNA clones, can be isolated from cDNA libraries using standard colony hybridization techniques.
The invention also includes degenerate nucleic acids that contain alternative codons to those present in the native materials. For example, serine residues are encoded by the codons TCA, AGT, TCC, TCG, TCT, and AGC. Each of the six codons is equivalent for the purpose of encoding a serine residue. Thus, it will be apparent to one of ordinary skill in the art that any of the serine-encoding nucleotide triplets may be used to direct the protein synthesis apparatus, in vitro or in vivo, to incorporate a serine residue into an elongating FGE polypeptide. Similarly, nucleotide sequence triplets that encode other amino acid residues include, but are not limited to: CCA, CCC, CCG, and CCT (proline codons); CGA, CGC, CGG, CGT, AGA, and AGG (arginine codons); ACA, ACC, ACG, and ACT (threonine codons); AAC and AAT (asparagine codons); and ATA, ATC, and ATT (isoleucine codons). Other amino acid residues may similarly be encoded by multiple nucleotide sequences. Thus, the invention includes degenerate nucleic acids that differ from the biologically isolated nucleic acids in codon sequence due to the degeneracy of the genetic code.
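The effect of codon degeneracy can be illustrated by enumerating every nucleotide sequence that encodes a short peptide. The Python sketch below uses only the codon sets listed in the preceding paragraph; the one-letter keys, the function name, and the sample tripeptide are assumptions made for the example.

```python
from itertools import product

# Codon sets exactly as listed in the paragraph above (DNA alphabet),
# keyed by one-letter amino acid code.
CODONS = {
    "S": ["TCA", "AGT", "TCC", "TCG", "TCT", "AGC"],  # serine
    "P": ["CCA", "CCC", "CCG", "CCT"],                # proline
    "R": ["CGA", "CGC", "CGG", "CGT", "AGA", "AGG"],  # arginine
    "T": ["ACA", "ACC", "ACG", "ACT"],                # threonine
    "N": ["AAC", "AAT"],                              # asparagine
    "I": ["ATA", "ATC", "ATT"],                       # isoleucine
}

def degenerate_encodings(peptide):
    """Yield every nucleotide sequence that encodes the given peptide."""
    for codons in product(*(CODONS[residue] for residue in peptide)):
        yield "".join(codons)

encodings = list(degenerate_encodings("SPR"))
print(len(encodings))  # 6 * 4 * 6 = 144 distinct coding sequences
```

Even a tripeptide such as Ser-Pro-Arg can therefore be encoded by 144 different nine-nucleotide sequences, which is why degenerate nucleic acids are addressed separately from the biologically isolated sequence.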
The invention also provides isolated unique fragments of SEQ ID NO: 1 or SEQ ID NO: 3, or of the complementary strands thereof. A unique fragment is one that serves as a "signature" for the larger nucleic acid. For example, a unique fragment is long enough to ensure that its exact sequence is not found in molecules within the human genome outside of the FGE nucleic acids (and human alleles) described above. Those of ordinary skill in the art need apply no more than routine procedures to determine whether a fragment is unique within the human genome. Unique fragments, however, exclude fragments consisting entirely of sequences selected from SEQ ID NO: 4 and/or other sequences published before the filing date of this application.
A fragment consisting entirely of the sequence described in the aforementioned GenBank deposits does not include any nucleotides unique to the sequences of the invention. Thus, a unique fragment according to the invention must contain a nucleotide sequence other than the exact sequence of those GenBank deposits or fragments thereof. The difference may be an addition, deletion, or substitution with respect to the GenBank sequence, or it may be a sequence wholly separate from the GenBank sequence.
Unique fragments can be used as probes in Southern and Northern blot analyses to identify such nucleic acids, or can be used in amplification assays such as those employing PCR. As known to those skilled in the art, large probes of 200, 250, 300, or more nucleotides are preferred for certain uses such as Southern and Northern blots, while smaller fragments are preferred for uses such as PCR. Unique fragments can also be used to produce fusion proteins for generating antibodies, for determining binding of polypeptide fragments as demonstrated in the Examples, or for generating immunoassay components. Likewise, unique fragments can be used to produce non-fused fragments of FGE polypeptides useful, for example, in the preparation of antibodies, in immunoassays, or in therapeutic applications. Unique fragments can further be used as antisense molecules to inhibit the expression of FGE nucleic acids and polypeptides.
As will be appreciated by those skilled in the art, the size of a unique fragment will depend on its conservation in the genetic code. Thus, some regions of SEQ ID NO: 1 or SEQ ID NO: 3 and their complements will require longer segments to be unique, while others will require only short segments, typically between 12 and 32 nucleotides long (e.g., 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, and 32 bases) or more, up to the entire length of the disclosed sequence. As mentioned above, this disclosure is intended to embrace each and every fragment of each sequence, beginning at the first nucleotide, the second nucleotide, and so on, up to 8 nucleotides short of the end, and ending anywhere from nucleotide number 8, 9, 10, and so on, up to the very last nucleotide (provided the sequence is unique as described above). Virtually any segment of the region of SEQ ID NO: 1 beginning at nucleotide 1 and ending at nucleotide 1180, or of the region of SEQ ID NO: 3 beginning at nucleotide 1 and ending at nucleotide 1122, or of the complements thereof, that is 20 or more nucleotides in length will be unique. Those skilled in the art are well versed in methods for selecting such sequences, typically on the basis of the ability of the unique fragment to selectively distinguish the sequence of interest from other sequences in the human genome, although in vitro confirmatory hybridization and sequence analysis may be performed.
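The idea of screening fixed-length segments for uniqueness can be sketched in Python as follows. This is a toy exact-match search, assuming the background sequences are supplied as a small hypothetical list of strings; a real screen against the human genome would use a genome-scale search tool rather than substring tests. All names are illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def unique_windows(query, background, length=20):
    """Yield (1-based start, window) for each window of `query` that occurs in
    none of the background sequences on either strand (exact matches only)."""
    query = query.upper()
    background = [b.upper() for b in background]
    for i in range(len(query) - length + 1):
        window = query[i:i + length]
        rc = reverse_complement(window)
        if not any(window in b or rc in b for b in background):
            yield i + 1, window

# Toy example with 6-nucleotide windows; the sequences are hypothetical.
hits = list(unique_windows("AAACCCGGGTTT", ["AAACCC"], length=6))
print([start for start, _ in hits])
```

In the toy example the first window matches the background directly and the last window matches it as a reverse complement, so only the windows starting at positions 2 through 6 are reported.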
As mentioned above, the invention comprises antisense oligonucleotides that selectively bind to nucleic acid molecules encoding FGE polypeptides to reduce FGE activity.
The term "antisense oligonucleotide" or "antisense" as used herein describes an oligoribonucleotide, oligodeoxyribonucleotide, modified oligoribonucleotide, or modified oligodeoxyribonucleotide that hybridizes under physiological conditions to DNA comprising a particular gene or to an mRNA transcript of that gene, thereby inhibiting the transcription of that gene and/or the translation of that mRNA. Antisense molecules are designed to interfere with transcription or translation of a target gene upon hybridization with the target gene or transcript. Those skilled in the art will recognize that the exact length of an antisense oligonucleotide and its degree of complementarity with its target will depend on the specific target selected, including the sequence of the target and the particular bases that make up that sequence. Preferably, the antisense oligonucleotide is constructed and arranged so as to bind selectively to the target under physiological conditions, i.e., to hybridize substantially more to the target sequence than to any other sequence in the target cell under physiological conditions. Based on SEQ ID NO: 1, or on allelic or homologous genomic and/or cDNA sequences, one of skill in the art can readily select and synthesize any of a number of appropriate antisense molecules for use in accordance with the present invention. In order to be sufficiently selective and potent for inhibition, such antisense oligonucleotides should comprise at least 10 and, more preferably, at least 15 consecutive bases that are complementary to the target, although in certain cases modified oligonucleotides as short as 7 bases in length have been used successfully as antisense oligonucleotides (Wagner et al, Nat. Med., 1995, 1 (11): 1116-1118; Nat. Biotech., 1996, 14: 840-844). Most preferably, the antisense oligonucleotide comprises a complementary sequence of 20 to 30 bases.
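In sequence terms, an antisense oligonucleotide is the reverse complement of the chosen target segment. The following Python sketch derives a DNA oligonucleotide of a given length from a target transcript; the function name, the default length of 20 bases (the lower end of the most preferred range above), and the sample transcript are assumptions of this illustration, and the sketch does not address selectivity, secondary structure, or chemical modification.

```python
DNA_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "U": "A"}

def antisense_oligo(target, start, length=20):
    """Return a DNA oligonucleotide (written 5' to 3') complementary to `length`
    bases of `target`, beginning at 1-based position `start`.
    The target may be given as DNA or RNA."""
    if start < 1:
        raise ValueError("start is a 1-based position")
    segment = target.upper()[start - 1:start - 1 + length]
    if len(segment) < length:
        raise ValueError("target segment runs past the end of the sequence")
    return "".join(DNA_COMPLEMENT[base] for base in reversed(segment))

# Hypothetical transcript fragment, not taken from SEQ ID NO: 1.
print(antisense_oligo("AUGGCUGCACCCGCACUAGGG", start=1, length=20))
```

Choosing `start` near the translation start site of the transcript corresponds to the 5' target sites that are generally favored for antisense design.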
While oligonucleotides that are antisense to any region of the gene or mRNA transcript may be chosen, in preferred embodiments the antisense oligonucleotide corresponds to an N-terminal or 5' upstream site such as a translation initiation, transcription initiation, or promoter site. In addition, 3'-untranslated regions may be targeted by antisense oligonucleotides. Targeting of mRNA splice sites has also been used in the art, but may be less preferred if alternative mRNA splicing occurs. Furthermore, the antisense is preferably targeted to sites at which mRNA secondary structure is not expected (see, e.g., Sainio et al., Cell Mol. Neurobiol. 14(5): 439-). Finally, although SEQ ID NO: 1 discloses a cDNA sequence, one of ordinary skill in the art can readily obtain the genomic DNA corresponding to this sequence. Thus, the invention also provides antisense oligonucleotides that are complementary to the genomic DNA corresponding to SEQ ID NO: 1. Similarly, antisense to allelic or homologous FGE cDNAs and genomic DNAs can also be used without undue experimentation.
In one set of embodiments, the antisense oligonucleotides of the invention may be composed of "natural" deoxyribonucleotides, ribonucleotides, or any combination thereof. That is, the 5 'end of one natural nucleotide and the 3' end of another natural nucleotide may be covalently linked by an internucleoside phosphodiester bond, as in natural systems. These oligonucleotides can be prepared by art-recognized methods, which can be performed manually or by automated synthesizers. They may also be produced recombinantly by means of vectors.
However, in preferred embodiments, antisense oligonucleotides of the invention may also include "modified" oligonucleotides. That is, oligonucleotides can be modified in a number of ways that do not prevent their hybridization to their target, but enhance their stability or targeting or enhance their therapeutic efficacy.
The term "modified oligonucleotide" as used herein describes an oligonucleotide in which (1) at least two of its nucleotides are covalently linked via a synthetic internucleoside linkage (i.e., a linkage other than a phosphodiester linkage between the 5' end of one nucleotide and the 3' end of another nucleotide) and/or (2) a chemical group not normally associated with nucleic acids has been covalently attached to the oligonucleotide. Preferred synthetic internucleoside linkages are phosphorothioates, alkylphosphonates, phosphorodithioates, phosphate esters, alkylphosphonothioates, phosphoramidates, carbamates, carbonates, phosphate triesters, acetamidates, carboxymethyl esters and peptides.
The term "modified oligonucleotide" also includes oligonucleotides having covalently modified bases and/or sugars. For example, modified oligonucleotides include oligonucleotides whose backbone sugars are covalently attached to low molecular weight organic groups other than a hydroxyl group at the 3' position and other than a phosphate group at the 5' position. The modified oligonucleotide may thus comprise a 2'-O-alkylated ribose group. In addition, the modified oligonucleotide may include a sugar such as arabinose instead of ribose. The invention thus relates to pharmaceutical formulations comprising a modified antisense molecule that is complementary to, and hybridizes under physiological conditions with, a nucleic acid encoding an FGE polypeptide, together with a pharmaceutically acceptable carrier.
The antisense oligonucleotide may be administered as part of a pharmaceutical composition. Such pharmaceutical compositions may include antisense oligonucleotides in combination with any standard physiologically and/or pharmaceutically acceptable carrier known in the art. The composition should be sterile and contain a therapeutically effective amount of the antisense oligonucleotide in a unit of weight or volume suitable for administration to a patient. The term "pharmaceutically acceptable" denotes a non-toxic substance that does not interfere with the effectiveness of the biological activity of the active ingredient. The term "physiologically acceptable" refers to a non-toxic substance that is compatible with a biological system, such as a cell, cell culture, tissue, or organ. The characteristics of the carrier will depend on the route of administration. Physiologically and pharmaceutically acceptable carriers include diluents, fillers, salts, buffers, stabilizers, solubilizers and other substances well known in the art.
The invention also includes increasing Cα-formylglycine generating activity in a cell. In important embodiments, this is accomplished by the use of vectors ("expression vectors" and/or "targeting vectors").
As used herein, a "vector" may be any of a variety of nucleic acids into which a desired sequence may be inserted by restriction and ligation for transport between different genetic environments or for expression in a host cell. Vectors are typically composed of DNA, although RNA vectors may also be used. Vectors include, but are not limited to, plasmids, phagemids, and viral genomes. A cloning vector is one that is able to replicate in a host cell and that is further characterized by one or more endonuclease restriction sites at which the vector may be cut in a determinable fashion and into which a desired DNA sequence may be ligated such that the new recombinant vector retains its ability to replicate in the host cell. In the case of plasmids, replication of the desired sequence may occur many times as the plasmid increases in copy number within the host bacterium, or just a single time per host before the host reproduces by mitosis. In the case of phage, replication may occur actively during a lytic phase or passively during a lysogenic phase. An "expression vector" is one into which a desired DNA sequence (e.g., the FGE cDNA of SEQ ID NO: 3) may be inserted by restriction and ligation such that it is operably linked to regulatory sequences and may be expressed as an RNA transcript. Vectors may further contain one or more marker sequences suitable for identifying cells that have or have not been transformed or transfected with the vector. Markers include, for example, genes encoding proteins that increase or decrease either resistance or sensitivity to antibiotics or other compounds, genes encoding enzymes whose activities are detectable by standard assays known in the art (e.g., beta-galactosidase or alkaline phosphatase), and genes that visibly affect the phenotype of transformed or transfected cells, hosts, colonies or plaques (e.g., green fluorescent protein).
A "targeting vector" is a type that typically contains certain targeting structures/sequences for inserting regulatory sequences, e.g., in an endogenous gene (e.g., within an exon and/or intron sequence), in the promoter sequence of an endogenous gene, or upstream of the promoter sequence of an endogenous gene. In another embodiment, the targeting vector can comprise the gene of interest (e.g., encoded by the cDNA of SEQ ID NO: 1) and other sequences necessary to target the gene to a preferred location in the genome (e.g., a transcriptionally active site such as downstream of an endogenous promoter for an unrelated gene). The construction of targeting constructs and vectors is described in detail in U.S. patents 5,641,670 and 6,270,989, which are expressly incorporated herein by reference.
Virtually any cell (prokaryotic or eukaryotic) that can be transformed with heterologous DNA or RNA and can be grown or maintained in culture can be used in the practice of the present invention. Examples include bacterial cells such as E. coli, insect cells, and cells of mammals such as humans, mice, hamsters, pigs, goats, primates, and the like. The cells may be primary or secondary cells (which undergo a limited number of population doublings in culture and are not permanent) or permanent cell lines (which show an apparently unlimited life span in culture). Primary and secondary cells include, for example, fibroblasts, keratinocytes, epithelial cells (e.g., mammary epithelial cells, intestinal epithelial cells), endothelial cells, glial cells, neural cells, formed elements of the blood (e.g., lymphocytes, bone marrow cells), muscle cells and precursors of these somatic cell types, including embryonic stem cells. Where the cells are to be used in gene therapy, the primary cells are preferably obtained from the individual to whom the manipulated cells will be administered. However, primary cells can also be obtained from a donor (other than the recipient) of the same species. Examples of permanent human cell lines that can be used with the DNA constructs and methods of the invention include, but are not limited to, HT-1080 cells (ATCC CCL 121), HeLa cells and HeLa cell derivatives (ATCC CCL 2, 2.1 and 2.2), MCF-7 breast cancer cells (ATCC HTB 22), K-562 leukemia cells (ATCC CCL 243), KB carcinoma cells (ATCC CCL 17), 2780AD ovarian cancer cells (Van der Blick, A.M. et al., Cancer Res., 48: 5927-), Bowes melanoma cells (ATCC CRL 9607), WI-38VA13 subline 2R4 cells (ATCC CCL 75.1), MOLT-4 cells (ATCC CRL 1582), CHO cells, and COS cells, as well as heterohybridoma cells produced by fusion of human cells and cells of another species. Secondary human fibroblast strains, such as WI-38 (ATCC CCL 75) and MRC-5 (ATCC CCL 171), may also be used.
Further discussion of cell types that can be employed in practicing the methods of the present invention is described in U.S. patents 5,641,670 and 6,270,989. Cell-free transcription systems may also be substituted for cells.
The cells of the invention are maintained under conditions known in the art that result in the expression of FGE protein or functional fragments thereof. Proteins expressed in this way can be purified from cell lysates or cell supernatants. The proteins produced according to this method can be formulated into pharmaceutically useful formulations and delivered to human or non-human animals by conventional pharmaceutical routes known in the art (e.g., orally, intravenously, intramuscularly, intranasally, intratracheally or subcutaneously). As described elsewhere herein, the recombinant cell may be an immortalized, primary or secondary cell, preferably a human cell. Cells from other species may be desirable where non-human cells are advantageous for protein production and the non-human FGE produced is pharmaceutically useful.
As used herein, a coding sequence and regulatory sequences are said to be "operably" linked when they are covalently linked in a manner that places expression or transcription of the coding sequence under the influence or control of the regulatory sequences. If the coding sequence is to be translated into a functional protein, two DNA sequences are said to be "operably" linked if induction of the promoter in the 5' regulatory sequence results in transcription of the coding sequence, and if the nature of the linkage between the two DNA sequences does not (1) result in the introduction of a frameshift mutation, (2) interfere with the ability of the promoter region to direct transcription of the coding sequence, or (3) interfere with the ability of the corresponding RNA transcript to be translated into a protein. Thus, a promoter region is operably linked to a coding sequence if it is capable of effecting transcription of that DNA sequence such that the resulting transcript can be translated into the desired protein or polypeptide.
The precise nature of the regulatory sequences required for gene expression may vary between species or cell types, but will generally include, as necessary, 5' non-transcribed and 5' non-translated sequences involved in the initiation of transcription and translation, respectively, such as a TATA box, capping sequence, CAAT sequence, and the like. In particular, such 5' non-transcribed regulatory sequences will include a promoter region comprising a promoter sequence for transcriptional control of the operably linked gene. The regulatory sequences may also include enhancer sequences or upstream activator sequences as desired. The vectors of the invention may optionally include a 5' leader or signal sequence. The selection and design of an appropriate vector is within the ability and discretion of one of ordinary skill in the art.
Expression vectors containing all the elements necessary for expression are commercially available and known to those skilled in the art. See, e.g., Sambrook et al., Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor Laboratory Press, 1989. Cells are genetically engineered by the introduction of heterologous DNA (RNA) encoding an FGE polypeptide or a fragment or variant thereof. That heterologous DNA (RNA) is placed under the operable control of transcriptional elements to permit its expression in the host cell.
Preferred systems for mRNA expression in mammalian cells include, for example, pRc/CMV (available from Invitrogen, Carlsbad, CA), which contains a selectable marker such as a gene conferring G418 resistance (facilitating the selection of stably transfected cell lines) and the human cytomegalovirus (CMV) enhancer-promoter sequences. Additionally, suitable for expression in primate or canine cell lines is the pCEP4 vector (Invitrogen, Carlsbad, CA), which contains an Epstein-Barr virus (EBV) origin of replication that facilitates maintenance of the plasmid as a multicopy extrachromosomal element. Another expression vector is the pEF-BOS plasmid, which contains the promoter of polypeptide elongation factor 1α and efficiently stimulates transcription in vitro. The plasmid is described by Mizushima and Nagata (Nuc. Acids Res. 18: 5322, 1990), and its use in transfection experiments has been disclosed, for example, by Demoulin (Mol. Cell. Biol. 16: 4710-). A further preferred expression vector is an adenovirus, described by Stratford-Perricaudet, which is defective for the E1 and E3 proteins (J. Clin. Invest. 90: 626-630, 1992). The use of the adenovirus as an Adeno.P1A recombinant was disclosed by Warnier et al. for intradermal injection in mice to achieve immunization against P1A (Int. J. Cancer, 67: 303-310, 1996).
The invention also encompasses so-called expression kits, which allow the skilled worker to prepare the desired expression vector or vectors. Such expression kits include at least separate portions of each of the previously discussed coding sequences. Other components may be added as needed so long as the desired sequence mentioned above is included.
It will also be appreciated that the invention encompasses the use of the above-described expression vectors containing FGE cDNA sequences to transfect host cells and cell lines, whether prokaryotic (e.g., E. coli) or eukaryotic (e.g., CHO cells, COS cells, yeast expression systems and recombinant baculovirus expression in insect cells). Particularly useful are mammalian cells such as those of humans, mice, hamsters, pigs, goats, primates, and the like. They may be of a variety of tissue types, and include primary cells and immortalized cell lines as described elsewhere herein. Specific examples include HT-1080 cells, CHO cells, dendritic cells, U293 cells, peripheral blood leukocytes, bone marrow stem cells, embryonic stem cells, and insect cells. The invention also permits the construction of FGE gene "knockouts" in cells and animals, providing material for studying certain aspects of FGE activity.
The invention also provides isolated polypeptides (including whole and partial proteins) encoded by the aforementioned FGE nucleic acids, also including SEQ ID NO: 2 and unique fragments thereof. Such polypeptides can be used, for example, alone or as part of a fusion protein to generate antibodies as components of an immunoassay. The polypeptides may be isolated from biological samples including tissue or cell homogenates, or may be recombinantly expressed in a variety of prokaryotic or eukaryotic expression systems by constructing an expression vector suitable for the expression system, introducing the expression vector into the expression system, and isolating the recombinantly expressed protein. Short polypeptides, including antigenic peptides (e.g., presented on the cell surface by MHC molecules for immunological recognition) can also be chemically synthesized using well-established peptide synthesis methods.
In general, unique fragments of FGE polypeptides have the features and characteristics of unique fragments discussed above in connection with nucleic acids. As will be appreciated by those skilled in the art, the size of a unique fragment will depend on factors such as whether the fragment forms part of a conserved protein domain. Thus, some regions of SEQ ID NO: 2 will require longer segments to be unique, while others will require only short segments, typically between 5 and 12 amino acids (e.g., 5, 6, 7, 8, 9, 10, 11 or 12 amino acids long, or longer, including each integer up to the full length of 287 amino acids).
Unique fragments of a polypeptide are preferably those that retain a distinct functional capability of the polypeptide. Functional capabilities that can be retained in a unique fragment of a polypeptide include interaction with antibodies, interaction with other polypeptides or fragments thereof, interaction with other molecules, and the like. One important activity is the ability to act as a signature for identifying the polypeptide. Those skilled in the art are well versed in methods of selecting unique amino acid sequences, typically on the basis of the ability of the unique fragment to selectively distinguish the sequence of interest from non-family members. It is desirable to compare the sequence of the fragment with sequences known in the databases.
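The database comparison mentioned above can be pictured with a deliberately simplified sketch. It only tests for exact substring matches against a small in-memory list, whereas a real comparison would use a sequence-similarity search against public databases; the function name `unique_fragments`, the default fragment length and the toy sequences in the usage line are assumptions for illustration and are not taken from SEQ ID NO: 2.

```python
# Minimal sketch: list fragments of a polypeptide that do not occur, as exact
# substrings, in any sequence of a comparison set. Names are hypothetical.

def unique_fragments(protein, others, length=8):
    """Return (start, fragment) pairs for windows of `protein` absent from `others`.

    `start` is 0-based; `others` is an iterable of comparison sequences.
    """
    others = [seq.upper() for seq in others]
    protein = protein.upper()
    result = []
    for start in range(len(protein) - length + 1):
        fragment = protein[start:start + length]
        # Keep the fragment only if no comparison sequence contains it.
        if not any(fragment in seq for seq in others):
            result.append((start, fragment))
    return result


# Usage with made-up sequences:
print(unique_fragments("MKLSTV", ["MKLSAA"], length=4))
```

Shorter windows are more likely to be shared with unrelated proteins, which mirrors the statement that conserved regions require longer segments to be unique.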
The present invention includes variants of the aforementioned FGE polypeptides. As used herein, a "variant" of an FGE polypeptide is a polypeptide comprising one or more modifications to the primary amino acid sequence of the FGE polypeptide. Modifications that create an FGE polypeptide variant are typically made to the nucleic acid encoding the FGE polypeptide and may include deletions, point mutations, truncations, amino acid substitutions, and additions of amino acids or non-amino acid moieties in order to: 1) reduce or eliminate an activity of the FGE polypeptide; 2) enhance a property of the FGE polypeptide, such as protein stability in an expression system or the stability of protein-ligand binding; 3) provide a novel activity or property to the FGE polypeptide, such as the addition of an antigenic epitope or of a detectable moiety; or 4) provide equivalent or better binding to an FGE polypeptide receptor or other molecule. Alternatively, modifications may be made directly to the polypeptide, e.g., by cleavage, addition of a linker molecule, addition of a detectable moiety such as biotin, addition of a fatty acid, and the like. Modifications also include fusion proteins comprising all or part of the FGE amino acid sequence. Those skilled in the art will be familiar with methods for predicting the effect of protein sequence changes on protein conformation, and can therefore "design" variants of FGE polypeptides according to known methods. One example of such a method is described by Dahiyat and Mayo in Science 278: 82-87, 1997, whereby proteins can be designed de novo. The method can also be applied to a known protein to vary only a portion of the polypeptide sequence. By applying the computational methods of Dahiyat and Mayo, specific variants of an FGE polypeptide can be proposed and tested to determine whether the variant retains the desired conformation.
Variants may include FGE polypeptides that are modified specifically to alter a feature of the polypeptide unrelated to its physiological activity. For example, cysteine residues can be substituted or deleted to prevent unwanted disulfide linkages. Similarly, certain amino acids can be changed to enhance expression of an FGE polypeptide by eliminating proteolysis by proteases in the expression system (e.g., dibasic amino acid residues in yeast expression systems in which KEX2 protease activity is present).
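As an illustration of how candidate positions for such changes might be located, the sketch below scans a polypeptide sequence for adjacent basic residues. Treating any lysine/arginine pair as "dibasic" is a simplifying assumption of this example (the actual specificity of KEX2 is narrower), and the function name `dibasic_sites` and the toy sequence are hypothetical.

```python
# Minimal sketch: report positions of adjacent basic residues (K or R), which
# could be reviewed as potential protease-sensitive sites. Names are hypothetical.
import re


def dibasic_sites(protein):
    """Return 0-based start positions of every K/R pair, including overlapping pairs."""
    # A zero-width lookahead lets overlapping pairs (e.g. "RRK") all be reported.
    return [m.start() for m in re.finditer(r"(?=[KR][KR])", protein.upper())]


# Usage with a made-up sequence:
print(dibasic_sites("MAKRLLRRKG"))
```

Each reported position would then be a candidate for a substitution that removes the dibasic pair while leaving the physiological activity of the polypeptide unchanged.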
Mutations in the nucleic acid encoding the FGE polypeptide preferably preserve the amino acid reading frame of the coding sequence and preferably do not create regions in the nucleic acid that are likely to hybridize to form secondary structures, such as hairpins or loop structures, that can be detrimental to expression of the variant polypeptide.
Mutations can occur by selection of amino acid substitutions or random mutations at selected positions in the nucleic acid encoding the polypeptide. The variant polypeptide is then expressed and tested for one or more activities to determine which mutation provides the variant polypeptide with the desired property. Further mutations that are silent with respect to the amino acid sequence of the polypeptide may be provided to the variant (or non-variant FGE polypeptide), but which provide translational codons preferred in a particular host, or alter the structure of the mRNA, for example, to enhance stability and/or expression. Preferred codons for translation of nucleic acids, e.g., in E.coli, mammalian cells, etc., are well known to those of ordinary skill in the art. Other mutations may also be provided to the non-coding sequence of the FGE gene or cDNA clone to enhance expression of the polypeptide.
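Whether a nucleotide change is silent with respect to the amino acid sequence can be checked by translating both coding sequences. The sketch below does this with the standard genetic code; the function names and the toy sequences are assumptions for illustration, and the sketch does not attempt to pick host-preferred codons.

```python
# Minimal sketch: test whether a mutated coding sequence encodes the same
# amino acid sequence as the original. Names are hypothetical.
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ('*' marks stop codons).
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(_BASES, repeat=3), _AMINO)}


def translate(cds):
    """Translate a coding sequence (DNA or RNA letters) into one-letter amino acids."""
    cds = cds.upper().replace("U", "T")
    if len(cds) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def is_silent(original, mutated):
    """True if both sequences have equal length and encode the same polypeptide."""
    return len(original) == len(mutated) and translate(original) == translate(mutated)


# Usage with made-up sequences: GCC and GCG both encode alanine.
print(is_silent("ATGGCC", "ATGGCG"))
```

Requiring equal lengths also rejects insertions and deletions, which is consistent with preserving the amino acid reading frame of the coding sequence.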
The skilled person will appreciate that conservative amino acid substitutions may be made in FGE polypeptides to provide functionally equivalent variants of the aforementioned polypeptides, i.e., variants that retain the functional capabilities of the FGE polypeptide. As used herein, a "conservative amino acid substitution" refers to an amino acid substitution that does not significantly alter the tertiary structure and/or activity of the polypeptide. Variants can be prepared according to methods for altering polypeptide sequences known to those of ordinary skill in the art, such as those found in references that compile such methods, e.g., Molecular Cloning: A Laboratory Manual, Sambrook et al., eds., Second Edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York, 1989, or Current Protocols in Molecular Biology, F.M. Ausubel et al., eds., John Wiley & Sons, Inc., New York. Exemplary functionally equivalent variants of FGE polypeptides include conservative amino acid substitutions of SEQ ID NO: 2. Conservative amino acid substitutions include substitutions made among amino acids within the following groups: (a) M, I, L, V; (b) F, Y, W; (c) K, R, H; (d) A, G; (e) S, T; (f) Q, N; and (g) E, D.
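The substitution groups (a) to (g) listed above lend themselves to a simple lookup. The following sketch classifies each difference between a reference and a variant sequence of equal length as conservative or not according to those groups only; the function names and the toy sequences are assumptions for illustration, and the result says nothing about whether a given variant actually retains function.

```python
# Minimal sketch: classify substitutions using groups (a)-(g) from the text.
# Function names and example sequences are hypothetical.

CONSERVATIVE_GROUPS = ["MILV", "FYW", "KRH", "AG", "ST", "QN", "ED"]


def is_conservative(a, b):
    """True if residues a and b differ but belong to the same group."""
    a, b = a.upper(), b.upper()
    return a != b and any(a in g and b in g for g in CONSERVATIVE_GROUPS)


def classify_substitutions(reference, variant):
    """Return (position, ref_residue, var_residue, conservative) for each difference.

    Positions are 1-based; both sequences must have the same length.
    """
    if len(reference) != len(variant):
        raise ValueError("sequences must be the same length")
    return [
        (i + 1, r, v, is_conservative(r, v))
        for i, (r, v) in enumerate(zip(reference, variant))
        if r != v
    ]


# Usage with made-up sequences: K->R is within group (c); S->D crosses groups.
print(classify_substitutions("MKLS", "MRLD"))
```

Residues that appear in none of the listed groups, such as C and P, are never reported as conservative by this lookup.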
Functionally equivalent variants of FGE polypeptides, i.e., variants of FGE polypeptides that retain the function of the native FGE polypeptide, are therefore contemplated by the present invention. Conservative amino acid substitutions in the amino acid sequence of FGE polypeptides that produce functionally equivalent variants are typically made by altering the nucleic acid encoding the FGE polypeptide (SEQ ID NOs: 1, 3). Such substitutions can be made by a variety of methods known to those of ordinary skill in the art. For example, amino acid substitutions may be made by PCR-directed mutagenesis, by site-directed mutagenesis according to the method of Kunkel (Kunkel, Proc. Nat. Acad. Sci. U.S.A. 82: 488-492, 1985), or by chemical synthesis of a gene encoding an FGE polypeptide. The activity of functionally equivalent fragments of FGE polypeptides can be determined by cloning the gene encoding the altered FGE polypeptide into a bacterial or mammalian expression vector, introducing the vector into an appropriate host cell, expressing the altered FGE polypeptide, and testing it for a functional capability of the FGE polypeptides disclosed herein (e.g., Cα-formylglycine generating activity).
The invention described herein has many applications, some of which are described elsewhere herein. First, the invention permits the isolation of FGE polypeptides. A variety of methods well known to the skilled practitioner can be used to obtain isolated FGE molecules. The polypeptide may be purified from cells that naturally produce it by chromatographic means or immunological recognition. Alternatively, an expression vector may be introduced into cells to cause production of the polypeptide. In another method, mRNA transcripts may be microinjected or otherwise introduced into cells to cause production of the encoded polypeptide. Translation of FGE mRNA in cell-free extracts, such as the reticulocyte lysate system, may also be used to produce FGE polypeptides. Those skilled in the art can also readily follow known methods for isolating FGE polypeptides. These include, but are not limited to, immunochromatography, HPLC, size-exclusion chromatography, ion-exchange chromatography, and immunoaffinity chromatography.
In certain embodiments, the invention also provides "dominant negative" polypeptides derived from FGE polypeptides. A dominant negative polypeptide is an inactive variant of a protein which, by interacting with the cellular machinery, displaces the active protein from its interaction with that machinery or competes with the active protein, thereby reducing the effect of the active protein. For example, a dominant negative receptor that binds a ligand but does not transmit a signal in response to ligand binding can reduce the biological effect of expression of the ligand. Likewise, a dominant negative catalytically inactive kinase that interacts normally with target proteins but does not phosphorylate them can reduce phosphorylation of the target proteins in response to a cellular signal. Similarly, a dominant negative transcription factor that binds to a promoter site in the control region of a gene but does not increase transcription of the gene can reduce the effect of a normal transcription factor by occupying promoter binding sites without increasing transcription.
The net result of expression of a dominant negative polypeptide in a cell is a reduction in the function of the active protein. One of ordinary skill in the art can assess the potential for a dominant negative variant of a protein and can create one or more dominant negative variant proteins using standard mutagenesis techniques. See, e.g., U.S. Patent No. 5,580,723 and Sambrook et al., Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor Laboratory Press, 1989. The skilled artisan can then test the population of mutagenized proteins for diminution of a selected activity and/or for retention of such an activity. Other similar methods for creating and testing dominant negative variants of a protein will be apparent to those of ordinary skill in the art.
The isolation of FGE cDNA also makes it possible for the skilled person to diagnose diseases characterized by aberrant expression of FGE. These methods include determining the expression of the FGE gene and/or FGE polypeptides derived therefrom. In the former case, such assays can be performed by any standard nucleic acid assay, including polymerase chain reaction, or as exemplified below, assays employing labeled hybridization probes. In the latter case, such assays can be performed by any standard immunoassay (using, for example, antibodies that bind to secreted FGE proteins). The preferred disease which can be diagnosed according to the invention is multiple sulfatase deficiency.
The invention also includes isolated peptide binding agents, which, for example, can be an antibody or antibody fragment ("binding polypeptide") having the ability to selectively bind to an FGE polypeptide. Antibodies include polyclonal and monoclonal antibodies prepared according to conventional methods. In certain embodiments, the invention excludes polypeptides that are identical to SEQ ID NO: 4 (e.g., an antibody).
Notably, as is well known in the art, only a small portion of an antibody molecule, the paratope, is involved in binding the antibody to its epitope (see, generally, Clark, W.R. (1986) The Experimental Foundations of Modern Immunology, Wiley & Sons, Inc., New York; Roitt, I. (1991) Essential Immunology, 7th Ed., Blackwell Scientific Publications, Oxford). For example, the pFc' and Fc regions are effectors of the complement cascade but are not involved in antigen binding. An antibody from which the pFc' region has been enzymatically cleaved, or which has been produced without the pFc' region, is referred to as an F(ab')2 fragment and retains both antigen binding sites of the intact antibody. Similarly, an antibody from which the Fc region has been enzymatically cleaved, or which has been produced without the Fc region, is referred to as a Fab fragment and retains one of the antigen binding sites of the intact antibody molecule. Further, Fab fragments consist of a covalently bound antibody light chain and a portion of the antibody heavy chain denoted Fd. The Fd fragment is the major determinant of antibody specificity (a single Fd fragment may be associated with up to ten different light chains without altering antibody specificity), and Fd fragments retain epitope-binding ability in isolation.
As is well known in the art, within the antigen-binding portion of an antibody there are Complementarity Determining Regions (CDRs) that interact directly with an epitope of an antigen, as well as Framework Regions (FRs) that maintain the tertiary structure of the antibody-binding site (see, generally, Clark, 1986; Roitt, 1991). In the heavy chain Fd fragment and light chain of IgG immunoglobulins, there are four framework regions (FR1 to FR4) separated by three complementarity determining regions (CDR1 to CDR3), respectively. The CDRs, particularly the CDR3 region, and more particularly the heavy chain CDR3, are largely responsible for antibody specificity.
It is now well recognized in the art that non-CDR regions of mammalian antibodies can be replaced by analogous regions of antibodies of the same or different specificity, while retaining the epitope specificity of the original antibody. This is most clearly demonstrated in the preparation and use of "humanized" antibodies, in which non-human CDRs are covalently linked to human FR and/or Fc/pFc' regions to produce functional antibodies. See, e.g., U.S. patents 4,816,567, 5,225,539, 5,585,089, 5,693,762, and 5,859,205. Thus, for example, PCT International publication No. WO 92/04381 teaches the production and use of humanized murine RSV antibodies in which at least a portion of the murine FR region is replaced by a human-derived FR region. Such antibodies, including fragments of intact antibodies having antigen binding ability, are often referred to as "chimeric" antibodies.
Thus, as will be apparent to one of ordinary skill in the art, the present invention also provides F(ab')2, Fab, Fv and Fd fragments; chimeric antibodies in which the Fc and/or FR and/or CDR1 and/or CDR2 and/or light chain CDR3 regions have been replaced by homologous human or non-human sequences; chimeric F(ab')2 fragment antibodies in which the FR and/or CDR1 and/or CDR2 and/or light chain CDR3 regions have been replaced by homologous human or non-human sequences; chimeric Fab fragment antibodies in which the FR and/or CDR1 and/or CDR2 and/or light chain CDR3 regions have been replaced by homologous human or non-human sequences; and chimeric Fd fragment antibodies in which the FR and/or CDR1 and/or CDR2 regions have been replaced by homologous human or non-human sequences. The invention also encompasses so-called single chain antibodies.
Thus, the present invention encompasses polypeptides of various sizes and types that specifically bind to FGE polypeptides, as well as complexes of FGE polypeptides with their binding partners. These polypeptides may also be derived from sources other than antibody technology. For example, such polypeptide binding reagents can be provided from a degenerate peptide library that can be readily prepared in solution, in immobilized form, as a bacterial flagellin peptide display library or a phage display library. Combinatorial libraries of peptides containing one or more amino acids can also be prepared. The library can be further synthesized from peptide and non-peptide synthetic moieties.
Phage display can be particularly effective in identifying binding peptides useful according to the invention. Briefly, a phage library displaying inserts of 4 to about 80 amino acid residues is prepared using conventional procedures (using, for example, M13, fd, or lambda phage). The inserts may represent, for example, a completely degenerate or a biased array. Phage bearing inserts that bind to the FGE polypeptide, or to a complex of FGE and a binding partner, can then be selected. This process can be repeated through several cycles of reselection of phage that bind to the FGE polypeptide or complex. Repeated rounds lead to enrichment of phage bearing particular sequences. DNA sequence analysis may be performed to identify the sequences of the expressed polypeptides. The minimal linear portion of the sequence that binds to the FGE polypeptide or complex can be determined. The procedure can be repeated using a biased library containing inserts that comprise part or all of the minimal linear portion plus one or more additional degenerate residues upstream or downstream thereof. Yeast two-hybrid screening methods can also be used to identify polypeptides that bind to FGE polypeptides. Thus, the FGE polypeptides of the invention, fragments thereof, or complexes of FGE with a binding partner can be used to screen peptide libraries, including phage display libraries, to identify and select peptide binding partners of the FGE polypeptides of the invention. Such molecules can be used, as described, in screening assays, in purification protocols, for interfering directly with the functioning of FGE, and for other purposes that will be apparent to those of ordinary skill in the art.
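To illustrate why repeated selection cycles and biased libraries are useful, the theoretical diversity of a completely degenerate insert can be estimated. The following is only a minimal sketch; it assumes that every position of the insert may be any of the 20 standard amino acids, and the function name is hypothetical rather than taken from any library.

```python
# Sketch: theoretical number of distinct peptides in a completely
# degenerate insert, ASSUMING 20 standard amino acids per position.
STANDARD_AMINO_ACIDS = 20


def degenerate_library_size(insert_length: int) -> int:
    """Return the number of possible sequences for an insert of this length."""
    if insert_length < 1:
        raise ValueError("insert_length must be at least 1")
    return STANDARD_AMINO_ACIDS ** insert_length


if __name__ == "__main__":
    # Short inserts taken from the low end of the 4 to about 80 residue range.
    for length in (4, 6, 8):
        print(length, degenerate_library_size(length))
```

Because the count grows as a power of the insert length, a physical library can sample only a small fraction of the longer inserts, which is one reason a second, biased library built around the minimal binding portion is helpful.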
FGE polypeptides or fragments thereof can also be used to isolate their natural binding partners. Isolation of the binding partner may be accomplished according to well known methods. For example, an isolated FGE polypeptide can be attached to a substrate and a solution suspected of containing FGE binding partners can then be applied to the substrate. If the binding partner of the FGE polypeptide is present in solution, it will bind to the matrix-immobilized FGE polypeptide. The binding partner may then be isolated. Other proteins that act as FGE binding partners can be isolated by similar methods without undue experimentation. A preferred binding partner is sulfatase.
The invention also provides methods for measuring the level of expression of FGE in a subject. This can be done by first obtaining a test sample from the subject. The test sample may be a tissue or a biological fluid. Tissues include brain, heart, serum, breast, colon, bladder, uterus, prostate, stomach, testis, ovary, pancreas, pituitary gland, adrenal gland, thyroid gland, salivary gland, mammary gland, kidney, liver, intestine, spleen, thymus, blood vessels, bone marrow, trachea, and lung. In certain embodiments, the test sample is derived from cardiac and vascular tissue, and the biological fluid includes blood, saliva, and urine. Both invasive and non-invasive techniques can be used to obtain such samples and are well documented in the art. At the molecular level, both PCR and Northern blotting can be used to determine the level of FGE mRNA, using the products of the invention described herein and protocols well known in the art that can be found in references that compile such methods. At the protein level, FGE expression can be determined using polyclonal or monoclonal anti-FGE sera in combination with standard immunoassays. A preferred method compares the measured FGE expression level of the test sample to a control. The control may comprise a known amount of nucleic acid probe, an FGE epitope (such as an FGE expression product), or a similar test sample from a subject having a control or "normal" level of FGE expression.
FGE polypeptides are preferably produced recombinantly, although such proteins may be isolated from biological extracts. Recombinantly produced FGE polypeptides include chimeric proteins comprising a fusion of the FGE protein with another polypeptide, e.g., capable of providing or enhancing protein-protein binding, sequence-specific nucleic acid binding (e.g., GAL4), enhancing the stability of the FGE polypeptide under assay conditions, or providing a detectable moiety, e.g., green fluorescent protein. Polypeptides fused to FGE polypeptides or fragments may also provide a means for easy detection of the fusion protein, for example by immunological recognition or by fluorescent labeling.
The invention is also useful in the production of non-human transgenic animals. As used herein, "non-human transgenic animal" includes non-human animals having one or more exogenous nucleic acid molecules integrated into germline and/or somatic cells. Thus, transgenic animals include "knockout" animals having a homozygous or heterozygous gene disruption by homologous recombination, animals having episomal or chromosomally integrated expression vectors, and the like. Knockout animals can be made by homologous recombination using embryonic stem cells, as is well known in the art. Recombination can be facilitated using, for example, the Cre/lox system or other recombinase systems known to those of ordinary skill in the art. In certain embodiments, the recombinase system itself is expressed conditionally, for example, in certain tissues or cell types, at certain embryonic or post-embryonic developmental stages, or upon induction by the addition of a compound that increases or decreases expression, and the like. Typically, the conditional expression vectors used in such systems employ a variety of promoters that confer the desired pattern of gene expression (e.g., temporal or spatial). Conditional promoters can also be operably linked to FGE nucleic acid molecules to increase expression of FGE in a regulated or conditional manner. Trans-acting negative regulators of FGE activity or expression can also be operably linked to a conditional promoter as described above. Such trans-acting regulators include antisense FGE nucleic acid molecules, nucleic acid molecules encoding dominant negative FGE molecules, ribozyme molecules specific for FGE nucleic acids, and the like. Non-human transgenic animals are useful in experiments for testing the biochemical or physiological effects of diagnostic or therapeutic methods directed to conditions characterized by increased or decreased FGE expression. Other applications will be apparent to those of ordinary skill in the art.
The invention also relates to gene therapy. Procedures for accomplishing ex vivo gene therapy are outlined in U.S. Patent 5,399,346 and in the exhibits submitted in the file history of that patent, all of which are publicly available documents. In general, the procedure involves introducing, in vitro, a functional copy of a gene into cells of a subject that contain a defective copy of the gene, and returning the genetically engineered cells to the subject. The functional copy of the gene is under the operable control of regulatory elements that permit expression of the gene in the genetically engineered cells. Numerous transfection and transduction techniques, as well as suitable expression vectors, are well known to those of ordinary skill in the art, some of which are described in PCT application WO 95/00654. In vivo gene therapy using vectors such as adenovirus, retrovirus, herpes virus, and targeted liposomes is also contemplated according to the present invention.
The invention further provides efficient methods of identifying agents, or lead compounds for agents, that are active at the level of an FGE or FGE fragment-dependent cellular function. Such functions include, in particular, interaction with other polypeptides or fragments. Generally, the screening methods involve assaying for compounds that interfere with FGE activity (e.g., Cα-formylglycine generating activity), although compounds that enhance the Cα-formylglycine generating activity of FGE can also be assayed using the screening methods. Such methods are adaptable to automated, high-throughput screening of compounds. Target indications include cellular processes modulated by FGE, such as Cα-formylglycine generating activity.
A variety of assays for candidate (pharmacological) agents are provided, including labeled in vitro protein-ligand binding assays, electrophoretic mobility shift assays, immunoassays, cell-based assays such as two-hybrid or three-hybrid screens, expression assays, and the like. The transfected nucleic acid can encode, for example, a combinatorial peptide library or a cDNA library. Reagents that facilitate such assays, such as GAL4 fusion proteins, are well known in the art. An exemplary cell-based assay involves transfecting a cell with a nucleic acid encoding an FGE polypeptide fused to a GAL4 DNA-binding domain and a nucleic acid encoding a reporter gene operably linked to a gene expression regulatory region (e.g., one or more GAL4 binding sites). Reporter gene transcription is activated when the FGE fusion polypeptide binds the regulatory region in a manner that enables transcription of the reporter gene. Agents that modulate an FGE polypeptide-mediated cellular function are then detected through a change in reporter gene expression. Methods for determining changes in reporter gene expression are well known in the art.
The FGE fragments used in the method, when not produced from the transfected nucleic acid, are added to the assay mixture as isolated polypeptides. The FGE polypeptide is preferably produced recombinantly, although the polypeptide may be isolated from a biological extract. Recombinantly produced FGE polypeptides include chimeric proteins comprising a fusion of the FGE protein with another polypeptide, e.g., capable of providing or enhancing protein-protein binding, sequence-specific nucleic acid binding (e.g., GAL4), enhancing the stability of the FGE polypeptide under assay conditions, or providing a detectable moiety, e.g., a green fluorescent protein or Flag epitope.
The assay mixture comprises a natural intracellular FGE binding target that is capable of interacting with FGE. While natural FGE binding targets may be used, it is frequently preferred to use a portion (e.g., a peptide, such as the peptide of SEQ ID NO: 33, or a nucleic acid fragment) or an analog (i.e., an agent that mimics the FGE binding properties of the natural binding target for the purposes of the assay) of the FGE binding target, so long as the portion or analog provides binding affinity and avidity to the FGE fragment that are measurable in the assay.
The assay mixture also contains a candidate substance. Typically, multiple assay mixtures are run in parallel at different concentrations of the candidate substance to obtain a different response at each concentration. Typically, one of these concentrations serves as a negative control, i.e., a zero concentration of the substance or a concentration below the detection limit of the assay. Candidate substances encompass many chemical classes, although they are typically organic compounds. Preferably, the candidate substances are small organic compounds, i.e., those having a molecular weight greater than 50 but less than about 2500, preferably less than about 1000, and more preferably less than about 500. Candidate substances contain the functional chemical groups necessary for structural interaction with polypeptides and/or nucleic acids, and typically include at least one amino, carbonyl, hydroxyl, or carboxyl group, preferably at least two functional chemical groups, and more preferably at least three functional chemical groups. Candidate substances may comprise cyclic or heterocyclic ring structures and/or aromatic or polyaromatic structures substituted with one or more of the above-identified functional groups. Candidate substances may also be biomolecules such as peptides, sugars, fatty acids, sterols, isoprenoids, purines, pyrimidines, derivatives or structural analogs of the above, or combinations thereof, and the like. Where the substance is a nucleic acid, it is typically a DNA or RNA molecule, although modified nucleic acids as defined herein are also contemplated.
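The molecular weight preferences stated above can be expressed as a simple pre-filter for a compound library. The sketch below is illustrative only: it treats the "about" limits as hard cutoffs, which is an assumption, and the function name and tier labels are hypothetical.

```python
# Sketch: sort a candidate substance into the molecular weight tiers named in
# the text (greater than 50, less than about 2500 / 1000 / 500).
# ASSUMPTION: the "about" limits are applied as hard cutoffs.
def molecular_weight_tier(molecular_weight: float) -> str:
    """Return the preference tier for a candidate of the given molecular weight."""
    if molecular_weight <= 50 or molecular_weight >= 2500:
        return "outside stated range"
    if molecular_weight < 500:
        return "more preferred"
    if molecular_weight < 1000:
        return "preferred"
    return "within stated range"


if __name__ == "__main__":
    # Hypothetical example weights, chosen only to exercise each tier.
    for weight in (180.2, 750.0, 1800.0, 3000.0):
        print(weight, molecular_weight_tier(weight))
```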
Candidate substances are obtained from a wide variety of sources, including libraries of synthetic or natural compounds. For example, numerous means are available for the random and directed synthesis of a wide variety of organic compounds and biomolecules, including expression of randomized oligonucleotides, synthetic organic combinatorial libraries, phage display libraries of random peptides, and the like. Alternatively, libraries of natural compounds in the form of bacterial, fungal, plant and animal extracts are available or readily produced. In addition, naturally and synthetically produced libraries and compounds can be modified by conventional chemical, physical and biochemical means. Further, known pharmacological agents may be subjected to directed or random chemical modifications such as acylation, alkylation, esterification, amidation, etc., to produce structural analogs of the agents.
A variety of other reagents can also be included in the mixture. These include reagents such as salts, buffers, neutral proteins (e.g., albumin), detergents, etc., which can be used to promote optimal protein-protein and/or protein-nucleic acid binding. Such reagents may also reduce non-specific or background interactions of the reaction components. Other reagents that improve the efficiency of the assay, such as protease inhibitors, nuclease inhibitors, antimicrobial agents and the like, may also be used.
The mixture of the foregoing assay materials is incubated under conditions whereby, but for the presence of the candidate substance, the FGE polypeptide specifically binds the cellular binding target, a portion thereof, or an analog thereof. The order of addition of the components, the incubation temperature, the incubation time and other parameters of the assay can be readily determined. Such experimentation merely involves optimization of the assay parameters, not the fundamental composition of the assay. The incubation temperature is typically between 4 °C and 40 °C. Incubation times are preferably minimized to facilitate rapid, high-throughput screening, and are typically between 0.1 and 10 hours.
After incubation, the presence or absence of specific binding between the FGE polypeptide and one or more binding targets is detected by any convenient method available to the user. For cell-free binding assays, a separation step is often used to separate bound from unbound components. The separation step can be accomplished in a variety of ways. Conveniently, at least one of the components is immobilized on a solid phase matrix from which the unbound components can be readily separated. The solid phase matrix can be made of a wide variety of materials and in a wide variety of shapes, e.g., microtiter plates, microbeads, dipsticks, resin particles, and the like. The matrix is preferably chosen to maximize the signal-to-noise ratio, primarily by minimizing background binding, and for ease of separation and cost.
Separation may be effected, for example, by removing a bead or dipstick from a reservoir, by emptying or diluting a reservoir such as a microtiter plate well, or by rinsing a bead, particle, chromatographic column or filter with a wash solution or solvent. The separation step preferably includes multiple rinses or washes. For example, when the solid phase matrix is a microtiter plate, the wells may be washed several times with a wash solution that typically includes those components of the incubation mixture that do not participate in specific binding, such as salts, buffers, detergents, non-specific proteins, and the like. Where the solid phase matrix is a magnetic bead, the beads may be washed one or more times with a wash solution and isolated using a magnet.
Detection may be effected in any convenient way for cell-based assays such as two-hybrid or three-hybrid screens. The transcript resulting from a reporter gene transcription assay of an FGE polypeptide interacting with a target molecule typically encodes a directly or indirectly detectable product, e.g., β-galactosidase activity, luciferase activity, and the like. For cell-free binding assays, one of the components usually comprises, or is coupled to, a detectable label. A wide variety of labels can be used, such as those that provide direct detection (e.g., radioactivity, luminescence, optical or electron density, etc.) or indirect detection (e.g., epitope tags such as the FLAG epitope, enzyme tags such as horseradish peroxidase, etc.). The label may be bound to the FGE binding partner, or incorporated into the structure of the binding partner.
A variety of methods can be used to detect the label, depending on the nature of the label and of the other assay components. For example, the label may be detected while bound to the solid phase matrix or after separation from it. Labels may be detected directly through optical or electron density, radioactive emissions, or nonradiative energy transfer, or indirectly with antibody conjugates, streptavidin-biotin conjugates, or the like. Methods of detecting labels are well known in the art.
The present invention provides FGE-specific binding reagents, methods of identifying and producing such reagents, and their use in diagnostics, therapeutics and drug development. For example, FGE-specific pharmacological agents are useful in a variety of diagnostic and therapeutic applications, particularly in conditions where disease or disease prognosis is associated with altered FGE binding characteristics, such as multiple sulfatase deficiency. Novel FGE-specific binding reagents include FGE-specific antibodies, cell surface receptors, other native intracellular and extracellular binding reagents identified by assays such as two-hybrid screens, and non-native intracellular and extracellular binding reagents identified by screens of chemical libraries, and the like.
In general, the specificity of FGE binding to a specific molecule is determined by binding equilibrium constants. Targets capable of selectively binding an FGE polypeptide preferably have binding equilibrium constants of at least about 10⁷ M⁻¹, more preferably at least about 10⁸ M⁻¹, and most preferably at least about 10⁹ M⁻¹. A variety of cell-based and cell-free assays can be used to demonstrate specific binding of FGE. Cell-based assays include one-, two- and three-hybrid screens, assays in which FGE-mediated transcription is inhibited or increased, and the like. Cell-free assays include FGE protein binding assays, immunoassays, and the like. Other assays useful for screening agents that bind FGE polypeptides include fluorescence resonance energy transfer (FRET) and electrophoretic mobility shift assays (EMSA).
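Because binding strength is often reported as a dissociation constant rather than an association constant, the thresholds above can be converted using the standard relation Kd = 1/Ka. The following is a minimal sketch under the assumption that the binding equilibrium constants in the text are association constants in M⁻¹ (as their units suggest); the function names and tier labels are hypothetical.

```python
# Sketch: relate the association-constant thresholds in the text
# (about 1e7, 1e8 and 1e9 per molar) to dissociation constants via Kd = 1 / Ka.
def dissociation_constant(ka_per_molar: float) -> float:
    """Convert an association constant (1/M) to a dissociation constant (M)."""
    if ka_per_molar <= 0:
        raise ValueError("association constant must be positive")
    return 1.0 / ka_per_molar


def binding_tier(ka_per_molar: float) -> str:
    """Classify a target by the preference tiers stated in the text."""
    if ka_per_molar >= 1e9:
        return "most preferred"
    if ka_per_molar >= 1e8:
        return "more preferred"
    if ka_per_molar >= 1e7:
        return "preferred"
    return "below stated threshold"


if __name__ == "__main__":
    for ka in (1e7, 1e8, 1e9):
        # Report Kd in nanomolar for readability.
        print(f"Ka = {ka:.0e} 1/M, Kd = {dissociation_constant(ka) * 1e9:.0f} nM")
```

On this reading, the three thresholds correspond to dissociation constants of roughly 100 nM, 10 nM and 1 nM.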
According to another aspect of the invention, a method is provided for identifying an agent useful in modulating the Cα-formylglycine generating activity of a molecule of the invention. The method comprises (a) contacting a molecule having Cα-formylglycine generating activity with a candidate substance, (b) measuring the Cα-formylglycine generating activity of the molecule, and (c) comparing the measured Cα-formylglycine generating activity of the molecule to a control to determine whether the candidate substance modulates the Cα-formylglycine generating activity of the molecule, wherein the molecule is an FGE nucleic acid molecule of the invention or an expression product thereof. "Contacting" refers to both direct and indirect contact of the molecule having Cα-formylglycine generating activity with the candidate substance. "Indirect" contact means that the candidate substance exerts its effect on the Cα-formylglycine generating activity of the molecule through a third agent (e.g., a messenger molecule, a receptor, etc.). In certain embodiments, the control is the Cα-formylglycine generating activity of the molecule measured in the absence of the candidate substance. The assays and candidate substances are as described in the preceding embodiments with respect to FGE.
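Step (c) of this method, the comparison to a control, can be sketched as a ratio test. This is only an illustration: the 20% tolerance band is an arbitrary assumption introduced here to absorb assay noise and is not specified in the text, and the function name and activity units are hypothetical.

```python
# Sketch of step (c): compare activity measured with the candidate substance
# to the control activity measured without it.
# ASSUMPTION: a +/-20% tolerance band, chosen arbitrarily for illustration.
def classify_modulation(measured: float, control: float,
                        tolerance: float = 0.2) -> str:
    """Label a candidate as an enhancer, an inhibitor, or neither."""
    if control <= 0:
        raise ValueError("control activity must be positive")
    ratio = measured / control
    if ratio > 1.0 + tolerance:
        return "enhancer"
    if ratio < 1.0 - tolerance:
        return "inhibitor"
    return "no modulation detected"


if __name__ == "__main__":
    # Hypothetical activity readings in arbitrary units.
    print(classify_modulation(measured=50.0, control=100.0))
    print(classify_modulation(measured=150.0, control=100.0))
    print(classify_modulation(measured=105.0, control=100.0))
```

In practice the decision threshold would be derived from replicate measurements of the control rather than fixed in advance.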
According to another aspect of the present invention, methods are provided for diagnosing a disease characterized by aberrant expression of a nucleic acid molecule, an expression product thereof, or a fragment of an expression product thereof. The method comprises contacting a biological sample isolated from a subject with an agent that specifically binds to the nucleic acid molecule, an expression product thereof, or a fragment of an expression product thereof, and determining the interaction between the agent and the nucleic acid molecule or expression product as a determination of the disease, wherein the nucleic acid molecule is an FGE molecule according to the present invention. The disease is Multiple Sulfatase Deficiency. Mutations in the FGE gene that cause aberrant expression of the FGE molecule result in the following amino acid changes in SEQ ID NO: 2: Met1Arg; Met1Val; Leu20Phe; Ser155Pro; Ala177Pro; Cys218Tyr; Arg224Trp; Asn259Ile; Pro266Leu; Ala279Val; Arg327Stop; Cys336Arg; Arg345Cys; Ala348Pro; Arg349Gln; Arg349Trp; Arg349Trp; Ser359Stop; or a combination thereof.
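For record keeping or for comparison against sequencing results, amino acid change labels of this kind can be parsed into their parts. The sketch below assumes normalized labels made of a three-letter reference residue, a position in SEQ ID NO: 2, and a three-letter replacement or "Stop" (for example, Arg349Trp); the class and function names are hypothetical.

```python
import re
from typing import NamedTuple


class AminoAcidChange(NamedTuple):
    reference: str    # three-letter code of the original residue
    position: int     # residue number in the reference polypeptide
    replacement: str  # three-letter code of the new residue, or "Stop"


# ASSUMPTION: labels are normalized to the form Arg349Trp or Ser359Stop.
_CHANGE_PATTERN = re.compile(r"^([A-Z][a-z]{2})(\d+)([A-Z][a-z]{2,3})$")


def parse_change(label: str) -> AminoAcidChange:
    """Split a label such as 'Arg349Trp' into residue, position, replacement."""
    match = _CHANGE_PATTERN.match(label.strip())
    if match is None:
        raise ValueError(f"unrecognized amino acid change: {label!r}")
    reference, position, replacement = match.groups()
    return AminoAcidChange(reference, int(position), replacement)


def is_truncating(change: AminoAcidChange) -> bool:
    """Return True if the change introduces a premature stop."""
    return change.replacement == "Stop"


if __name__ == "__main__":
    for label in ("Met1Arg", "Arg349Trp", "Ser359Stop"):
        change = parse_change(label)
        print(change, is_truncating(change))
```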
In the case where the molecule is a nucleic acid molecule, such assays can be carried out by any standard nucleic acid assay analysis, including polymerase chain reaction or the assays exemplified herein with labeled hybridization probes. In the case where the molecule is a nucleic acid molecule expression product or a fragment of a nucleic acid molecule expression product, such an assay can be performed by any standard immunoassay employing, for example, an antibody that binds to any polypeptide expression product.
"Aberrant expression" refers to decreased expression (underexpression) or increased expression (overexpression) of an FGE molecule (nucleic acid and/or polypeptide) relative to a control (i.e., expression of the same molecule in a healthy or "normal" subject). As used herein, "healthy subject" refers to a subject who, according to standard medical criteria, does not have and is not at risk of developing Multiple Sulfatase Deficiency. Healthy subjects also do not otherwise exhibit symptoms of the disease. In other words, such subjects, if examined by a medical professional, would be characterized as healthy and free of symptoms of Multiple Sulfatase Deficiency. These symptoms include features of metachromatic leukodystrophy and of a mucopolysaccharidosis, such as increased amounts of acid mucopolysaccharides in several tissues, mild "gargoylism", rapid neurological deterioration, excessive presence of mucopolysaccharides and sulfatides in the urine, increased cerebrospinal fluid protein, and metachromatic degeneration of myelin in peripheral nerves.
The invention also provides novel kits that will be used to measure the levels of the nucleic acids of the invention and the expression products of the invention.
In one embodiment, the kit comprises a package containing the following: an agent that selectively binds to any of the foregoing isolated FGE nucleic acids or an expression product thereof, and a control for comparison with a measured value of binding of the agent to any of the foregoing isolated FGE nucleic acids or an expression product thereof. In certain embodiments, the control is a predetermined value for comparison to the measured value. In certain embodiments, the control comprises an epitope of an expression product of any of the foregoing isolated FGE nucleic acids. In one embodiment, the kit further comprises a second agent that selectively binds to a polypeptide selected from the group consisting of: iduronate-2-sulfatase, sulfamidase, N-acetylgalactosamine-6-sulfatase, N-acetylglucosamine-6-sulfatase, arylsulfatase A, arylsulfatase B, arylsulfatase C, arylsulfatase D, arylsulfatase E, arylsulfatase F, arylsulfatase G, HSulf-1, HSulf-2, HSulf-3, HSulf-4, HSulf-5, and HSulf-6, or a peptide fragment thereof, and a control for comparison with a measured value of binding of the second agent to the polypeptide or peptide fragment thereof.
In the case of nucleic acid detection, primer pairs for amplifying the nucleic acid molecules of the invention may be included. Preferred kits will include controls, such as a known amount of a nucleic acid probe, an epitope (e.g., an iduronate-2-sulfatase, sulfamidase, N-acetylgalactosamine-6-sulfatase, N-acetylglucosamine-6-sulfatase, arylsulfatase A, arylsulfatase B, arylsulfatase C, arylsulfatase D, arylsulfatase E, arylsulfatase F, arylsulfatase G, HSulf-1, HSulf-2, HSulf-3, HSulf-4, HSulf-5, or HSulf-6 expression product) or an anti-epitope antibody, as well as instructions or other printed material. In certain embodiments, the printed material can characterize the risk of developing a sulfatase deficiency condition based on the outcome of the assay. The reagents may be packaged in containers and/or coated on wells in predetermined amounts, and the kit may include standard materials such as labeled immunological reagents (e.g., labeled anti-IgG antibodies) and the like. One kit is a packaged polystyrene microtiter plate coated with FGE protein, together with a container containing labeled anti-human IgG antibody. The wells of the plate are contacted with, for example, a biological fluid, washed, and then contacted with the anti-IgG antibody. The label is then detected. A kit embodying features of the invention, generally designated by the numeral 11, is shown in Figure 25. The kit 11 comprises the following main elements: package 15, reagent of the invention 17, control reagent 19, and instructions 21. The package 15 is a box-like structure for holding a vial (or vials) containing the reagent 17 of the invention, a vial (or vials) containing the control reagent 19, and the instructions 21. One skilled in the art can readily modify the package 15 to suit individual needs.
The invention also encompasses methods of treating Multiple Sulfatase Deficiency in a subject. The method comprises administering to a subject in need of such treatment an agent that modulates Cα-formylglycine generating activity, in an amount effective to increase Cα-formylglycine generating activity in the subject. In certain embodiments, the methods further comprise co-administration of an agent selected from the group consisting of: a nucleic acid molecule encoding iduronate-2-sulfatase, sulfamidase, N-acetylgalactosamine-6-sulfatase, N-acetylglucosamine-6-sulfatase, arylsulfatase A, arylsulfatase B, arylsulfatase C, arylsulfatase D, arylsulfatase E, arylsulfatase F, arylsulfatase G, HSulf-1, HSulf-2, HSulf-3, HSulf-4, HSulf-5, or HSulf-6; an expression product of the nucleic acid molecule; and/or a fragment of an expression product of the nucleic acid molecule.
As used herein, "an agent that modulates the expression of a nucleic acid or polypeptide" is known in the art and refers to sense and antisense nucleic acids, dominant negative nucleic acids, antibodies to the polypeptides, and the like. Any agent that modulates the expression of a molecule (and, as described herein, modulates its activity) is useful according to the present invention. In certain embodiments, the agent that modulates Cα-formylglycine generating activity is an isolated nucleic acid molecule of the invention (e.g., a nucleic acid of SEQ ID NO. 3). In important embodiments, the agent that modulates Cα-formylglycine generating activity is a peptide of the present invention (e.g., the peptide of SEQ ID NO. 2). In certain embodiments, the agent that modulates Cα-formylglycine generating activity is a sense nucleic acid of the invention.
According to one aspect of the invention, a method for increasing Cα-formylglycine generating activity in a subject is provided. The method comprises administering to the subject an isolated nucleic acid molecule of the invention and/or an expression product thereof, in an amount effective to increase Cα-formylglycine generating activity in the subject.
According to another aspect of the invention, a method for increasing Cα-formylglycine generating activity in a cell is provided. The method comprises contacting the cell with an isolated nucleic acid molecule of the invention (e.g., the nucleic acid of SEQ ID No. 1), or an expression product thereof (e.g., the peptide of SEQ ID No. 2), in an amount effective to increase Cα-formylglycine generating activity in the cell. In important embodiments, the method comprises activating the endogenous FGE gene to increase Cα-formylglycine generating activity in the cell.
In any of the preceding embodiments, the nucleic acid may be operatively coupled to a gene expression sequence that directs the expression of the nucleic acid molecule within a eukaryotic cell, such as an HT-1080 cell. A "gene expression sequence" is any regulatory nucleotide sequence, such as a promoter sequence or promoter-enhancer combination, that facilitates efficient transcription and translation of the nucleic acid to which it is operably linked. The gene expression sequence may be, for example, a mammalian or viral promoter, such as a constitutive or inducible promoter. Constitutive mammalian promoters include, but are not limited to, the promoters of the following genes: hypoxanthine phosphoribosyltransferase (HPRT), adenosine deaminase, pyruvate kinase, the alpha-actin promoter and other constitutive promoters. Exemplary viral promoters that function constitutively in eukaryotic cells include, for example, promoters from simian virus, papilloma virus, adenovirus, human immunodeficiency virus (HIV), Rous sarcoma virus, cytomegalovirus, the long terminal repeats (LTR) of Moloney leukemia virus and other retroviruses, and the thymidine kinase promoter of herpes simplex virus. Other constitutive promoters are known to those of ordinary skill in the art. Promoters useful as gene expression sequences of the invention also include inducible promoters. Inducible promoters are activated in the presence of an inducing agent. For example, the metallothionein promoter is activated in the presence of certain metal ions to increase transcription and translation. Other inducible promoters are known to those of ordinary skill in the art.
In general, gene expression sequences will include, as necessary, 5' non-transcribed and 5' non-translated sequences involved in the initiation of transcription and translation, respectively, such as a TATA box, capping sequence, CAAT sequence, and the like. In particular, such 5' non-transcribed sequences will include a promoter region comprising a promoter sequence for transcriptional control of the operably linked nucleic acid. The gene expression sequence optionally includes enhancer sequences or upstream activator sequences as desired.
Preferably, any of the FGE nucleic acid molecules of the invention is linked to a gene expression sequence that permits expression of the nucleic acid molecule in cells of a particular cell lineage, such as neural cells. A sequence that permits expression of a nucleic acid molecule in a given cell type, such as neural cells, is one that is selectively active in that cell type, thereby causing expression of the nucleic acid molecule in those cells. For example, the synapsin-1 promoter can be used to express any of the foregoing nucleic acid molecules of the invention in neural cells, and the von Willebrand factor gene promoter can be used to express a nucleic acid molecule in vascular endothelial cells. One of ordinary skill in the art will readily be able to identify alternative promoters capable of expressing a nucleic acid molecule in any of the preferred cells of the present invention.
Nucleic acid sequences and gene expression sequences are said to be "operably linked" when they are covalently linked in such a way that transcription and/or translation of the nucleic acid coding sequence (e.g., SEQ ID No.3, in the case of FGE) is placed under the influence or control of the gene expression sequence. Two DNA sequences are said to be operably linked if the desired nucleic acid sequence is translated into a functional protein, and if induction of the promoter in the 5' gene expression sequence results in transcription of the nucleic acid sequence, and if the nature of the linkage between the two DNA sequences does not (1) result in the introduction of a frameshift mutation, (2) interfere with the ability of the promoter region to direct transcription of the nucleic acid sequence, and/or (3) interfere with the ability of the corresponding RNA transcript to be translated into a protein. Thus, a gene expression sequence will be considered to be operably linked to a nucleic acid sequence if it affects the transcription of the nucleic acid sequence such that the resulting transcript can be translated into the desired protein or polypeptide.
The molecules of the invention can be delivered to the preferred cell types of the invention either alone or in association with a vector (see also the earlier discussion of vectors). In its broadest sense (and consistent with the description of expression and targeting vectors elsewhere herein), a "vector" is any vehicle capable of facilitating: (1) delivery of a molecule to a target cell, and/or (2) uptake of the molecule by a target cell. Preferably, the delivery vector transports the molecule into the target cell with reduced degradation relative to the extent of degradation that would occur in the absence of the vector. Optionally, a "targeting ligand" can be attached to the vector to selectively deliver the vector to a cell that expresses on its surface the cognate receptor for the targeting ligand. In this manner, the vector (containing a nucleic acid or a protein) can be selectively delivered to a neural cell. Methods of targeting include conjugation, such as those described in Priest, U.S. patent 5,391,723. Another well-known example of a targeting vehicle is a liposome. Liposomes are commercially available from Gibco BRL. Numerous methods for producing targeted liposomes have been published.
In general, vectors useful in the present invention include, but are not limited to, plasmids, phagemids, viruses, and other vectors derived from viral or bacterial sources that have been manipulated by the insertion or incorporation of the nucleic acid sequences of the invention, together with additional nucleic acid fragments (e.g., enhancers, promoters) that can be attached to those sequences. Viral vectors are a preferred type of vector and include, but are not limited to, nucleic acid sequences from the following viruses: adenovirus; adeno-associated virus; retroviruses, such as Moloney murine leukemia virus, murine sarcoma virus, murine mammary tumor virus, and Rous sarcoma virus; SV40-type viruses; polyoma viruses; Epstein-Barr virus; papilloma viruses; herpes virus; vaccinia virus; polio virus; and RNA viruses such as retroviruses. Other vectors not named here but known in the art can also be readily employed.
A particularly preferred virus for certain applications is adeno-associated virus, a double-stranded DNA virus. Adeno-associated virus can infect a wide range of cell types and species and can be engineered to be replication-deficient. It has further advantages, such as heat and lipid solvent stability, high transduction frequencies in cells of diverse lineages, including hematopoietic cells, and lack of superinfection inhibition, which allows multiple series of transductions. Adeno-associated virus is reported to integrate into human cellular DNA in a site-specific manner, thereby minimizing the possibility of insertional mutagenesis and variability of inserted gene expression. Furthermore, wild-type adeno-associated virus infections have been followed in tissue culture for more than 100 passages in the absence of selective pressure, implying that adeno-associated virus genomic integration is a relatively stable event. Adeno-associated virus can also function in an extrachromosomal fashion.
In general, other preferred viral vectors are based on non-cytopathic eukaryotic viruses in which non-essential genes have been replaced with the gene of interest. Non-cytopathic viruses include retroviruses, the life cycle of which involves reverse transcription of genomic viral RNA into DNA with subsequent proviral integration into host cellular DNA. Adenoviruses and retroviruses have been approved for human gene therapy trials. In general, the retroviruses are replication-deficient (i.e., capable of directing synthesis of the desired proteins, but incapable of manufacturing an infectious particle). Such genetically altered retroviral expression vectors have general utility for the high-efficiency transduction of genes in vivo. Standard protocols for producing replication-deficient retroviruses (including the steps of incorporating exogenous genetic material into a plasmid, transfecting a packaging cell line with the plasmid, producing recombinant retroviruses by the packaging cell line, collecting viral particles from tissue culture media, and infecting the target cells with the viral particles) are provided in Kriegler, M., "Gene Transfer and Expression, A Laboratory Manual," W.H. Freeman Co., New York (1990) and Murry, E.J., Ed., "Methods in Molecular Biology," volume 7, Humana Press Inc., Clifton, New Jersey (1991).
Another preferred retroviral vector is the vector derived from the Moloney murine leukemia virus, as described in Nabel, E.G., et al., Science, 1990, 249: 1285-1288. These vectors are reported to be effective for the delivery of genes to all three layers of the arterial wall, including the media. Other preferred vectors are described in Flugelman et al, Circulation, 1992, 85: 1110-. Additional vectors useful for delivery of the molecules of the invention are described by Mulligan et al in U.S. patent No.5,674,722.
In addition to the aforementioned vectors, other delivery methods can be used to deliver the molecules of the present invention to cells such as neural cells, liver cells, fibroblasts, and/or vascular endothelial cells, and to facilitate uptake by those cells.
A preferred delivery method of this type is a colloidal dispersion system. Colloidal dispersion systems include lipid-based systems, including oil-in-water emulsions, micelles, mixed micelles, and liposomes. The preferred colloidal system of the present invention is a liposome. Liposomes are artificial membrane vesicles that are useful as delivery vehicles in vivo or in vitro. Large unilamellar vesicles (LUV), which range in size from 0.2-4.0 μm, have been shown to encapsulate large macromolecules. RNA, DNA and intact virions can be encapsulated within the aqueous interior and delivered to cells in a biologically active form (Fraley et al, Trends Biochem. Sci., 1981, 6: 77). For a liposome to be an efficient gene transfer vehicle, one or more of the following characteristics should be present: (1) encapsulation of the gene of interest at high efficiency with retention of biological activity; (2) preferential and substantial binding to a target cell in comparison to non-target cells; (3) delivery of the aqueous contents of the vesicle to the target cell cytoplasm at high efficiency; and (4) accurate and effective expression of genetic information.
Liposomes can be targeted to a particular tissue, such as the myocardium or the vascular wall, by coupling the liposome to a specific ligand such as a monoclonal antibody, sugar, glycolipid, or protein. Ligands that may be useful for targeting a liposome to the vascular wall include, but are not limited to, the viral coat protein of Sendai virus. In addition, the vector may be coupled to a nuclear targeting peptide, which will direct the nucleic acid to the nucleus of the host cell.
Liposomes are commercially available from Gibco BRL, for example, as LIPOFECTIN™ and LIPOFECTACE™, which are formed of cationic lipids such as N-[1-(2,3-dioleyloxy)-propyl]-N,N,N-trimethylammonium chloride (DOTMA) and dimethyl dioctadecylammonium bromide (DDAB). Methods for producing liposomes are well known in the art and have been described in numerous publications. Liposomes have also been reviewed by Gregoriadis, G. in Trends in Biotechnology, Vol.3, 235-241 (1985). Novel liposomes for the intracellular delivery of macromolecules, including nucleic acids, are also described in PCT International application No. PCT/US96/07572 (publication No. WO 96/40060, entitled "Intracellular Delivery of Macromolecules").
In a particular embodiment, the preferred vector is a biocompatible microparticle or implant suitable for implantation into a mammalian recipient. Exemplary bioerodible implants useful according to this method are described in PCT International application PCT/US/03307 (publication No. WO 95/24929, entitled "Polymeric Gene Delivery System," claiming priority from U.S. patent application Ser. No. 213,668, filed March 15, 1994). PCT/US/03307 describes a biocompatible, preferably biodegradable, polymeric matrix for containing an exogenous gene under the control of an appropriate promoter. The polymeric matrix is used to achieve sustained release of the exogenous gene in the patient. According to the present invention, the nucleic acids described herein are encapsulated or dispersed within the biocompatible, preferably biodegradable, polymeric matrix disclosed in PCT/US/03307. The polymeric matrix is preferably in the form of a microparticle such as a microsphere (in which the nucleic acid is dispersed throughout a solid polymeric matrix) or a microcapsule (in which the nucleic acid is stored in the core of a polymeric shell). Other forms of the polymeric matrix for containing the nucleic acids of the invention include films, coatings, gels, implants, and stents. The size and composition of the polymeric matrix device are selected to give favorable release kinetics in the tissue into which the device is implanted. The size of the polymeric matrix device is further selected according to the method of delivery to be used, typically injection into a tissue or administration of a suspension by aerosol into the nasal and/or pulmonary areas. The polymeric matrix composition can be selected both to have a favorable degradation rate and to be formed of a bioadhesive material, which further increases the effectiveness of transfer when the device is applied to a vascular surface. The matrix composition can also be selected not to degrade, but rather to release its contents by diffusion over an extended period of time.
Both non-biodegradable and biodegradable polymeric matrices can be used to deliver the nucleic acids of the invention to a subject. A biodegradable matrix is preferred. Such polymers may be natural or synthetic polymers. Synthetic polymers are preferred. The polymer is selected according to the length of time required for release, typically on the order of a few hours to a year or more. Typically, release over a period ranging from a few hours to three to twelve months is most desirable. The polymer is optionally in the form of a hydrogel capable of absorbing up to about 90% of its weight in water, and further optionally crosslinked with a multivalent ion or other polymer.
In general, the nucleic acids of the invention are delivered from the bioerodible implant by diffusion or, more preferably, by degradation of the polymeric matrix. Exemplary synthetic polymers that can be used to form biodegradable delivery systems include: polyamides, polycarbonates, polyalkylenes, polyalkylene glycols, polyalkylene oxides, polyalkylene terephthalates, polyvinyl alcohols, polyvinyl ethers, polyvinyl esters, polyvinyl halides, polyvinylpyrrolidone, polyglycolide, polysiloxanes, polyurethanes and copolymers thereof, alkylcelluloses, hydroxyalkylcelluloses, cellulose ethers, cellulose esters, nitrocellulose, polymers of acrylic acid esters and methacrylic acid esters, methylcellulose, ethylcellulose, hydroxypropylcellulose, hydroxypropylmethylcellulose, hydroxybutylmethylcellulose, cellulose acetate, cellulose propionate, cellulose acetate butyrate, cellulose acetate phthalate, carboxyethylcellulose, cellulose triacetate, cellulose sulfate sodium salt, poly (methyl methacrylate), poly (ethyl methacrylate), poly (butyl methacrylate), poly (isobutyl methacrylate), poly (hexyl methacrylate), poly (isodecyl methacrylate), poly (dodecyl methacrylate), poly (phenyl methacrylate), poly (isopropyl acrylate), poly (isobutyl acrylate), poly (octadecyl acrylate), polyethylene, polypropylene, poly (ethylene glycol), poly (ethylene oxide), poly (ethylene terephthalate), poly (vinyl alcohol), polyvinyl acetate, polyvinyl chloride, polystyrene and polyvinyl pyrrolidone.
Examples of non-biodegradable polymers include ethylene vinyl acetate, poly (meth) acrylic acid, polyamides, copolymers and mixtures thereof.
Examples of biodegradable polymers include synthetic polymers such as polymers of lactic acid and glycolic acid, polyanhydrides, poly(ortho)esters, polyurethanes, poly (butyric acid), poly (valeric acid), and poly (lactide-co-caprolactone), and natural polymers such as alginate and other polysaccharides including dextran and cellulose, collagen, chemical derivatives thereof (substitutions and additions of chemical groups, for example alkyl or alkylene groups, hydroxylations, oxidations, and other modifications routinely made by those skilled in the art), albumin and other hydrophilic proteins, zein and other prolamines and hydrophobic proteins, and copolymers and mixtures thereof. In general, these materials degrade either by enzymatic hydrolysis or, upon exposure to water in vivo, by surface or bulk erosion.
Bioadhesive polymers of particular interest include the bioerodible hydrogels described by H.S. Sawhney, C.P. Pathak and J.A. Hubbell in Macromolecules, 1993, 26, 581-. Thus, the present invention provides compositions of the above-described molecules of the invention for use as medicaments, methods of preparing such medicaments, and methods for the sustained release of the medicaments in vivo.
Compaction agents can also be used in combination with the vectors of the present invention. As used herein, a "compaction agent" refers to an agent, such as a histone, that neutralizes the negative charges on a nucleic acid and thereby permits compaction of the nucleic acid into a fine granule. Compaction of the nucleic acid facilitates its uptake by the target cell. A compaction agent can be used alone, i.e., to deliver an isolated nucleic acid of the invention in a form that is more efficiently taken up by the cell, or, more preferably, in combination with one or more of the above-described vectors.
Other exemplary compositions that can be used to facilitate uptake of a nucleic acid of the invention by a target cell include calcium phosphate and other chemical modifiers of intracellular trafficking, microinjection compositions, and electroporation.
The invention encompasses methods of increasing sulfatase activity in a cell. Such methods comprise contacting an isolated nucleic acid molecule of the invention (e.g., an isolated nucleic acid molecule as claimed in any one of claims 1-8, a FGE nucleic acid molecule having a sequence selected from SEQ ID NOs 1, 3, 4, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, and 80-87) or an expression product thereof (e.g., a polypeptide as claimed in claims 11-15, 19, 20 or a peptide having a sequence selected from SEQ ID NOs 2, 5, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, and 78) with a cell expressing a sulfatase in an amount effective to increase sulfatase activity in the cell. As used herein, "increase" sulfatase activity refers to increased affinity for and/or conversion of a specific substrate of sulfatase, typically resulting in increased formation of FGly on the sulfatase molecule. In a certain embodiment, the cell expresses sulfatase at a higher level than a wild-type cell. "increasing sulfatase activity in a cell" also refers to increasing the activity of sulfatase secreted by the cell. The cells may express endogenous and/or exogenous sulfatase. Said contacting with an FGE molecule also refers to activating the endogenous FGE gene of the cell. In important embodiments, the endogenous sulfatase is activated. In certain embodiments, the sulfatase is iduronate-2-sulfatase, sulfamidase, N-acetylgalactosamine-6-sulfatase, N-acetylglucosamine-6-sulfatase, arylsulfatase A, arylsulfatase B, arylsulfatase C, arylsulfatase D, arylsulfatase E, arylsulfatase F, arylsulfatase G, HSulf-1, HSulf-2, HSulf-3, HSulf-4, HSulf-5, and/or HSulf-6. In certain embodiments, the cell is a mammalian cell.
According to another aspect of the present invention, a pharmaceutical composition is provided. The composition comprises a sulfatase enzyme produced by a cell in a pharmaceutically effective amount for treating sulfatase deficiency, and also comprises a pharmaceutically acceptable carrier, wherein the cell has been contacted with an agent comprising an isolated nucleic acid molecule of the invention (e.g., an isolated nucleic acid molecule as claimed in claims 1-8 or a nucleic acid molecule having a sequence selected from SEQ ID NOs: 1, 3, 4, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, and 80-87) or an expression product thereof (e.g., a peptide selected from SEQ ID nos. 2, 5, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, and 78). In important embodiments, the sulfatase is expressed at a level higher than that of normal/control cells.
The invention also encompasses sulfatase producing cells wherein the ratio of active sulfatase to total sulfatase produced by the cells is increased. The cell comprises: (i) a sulfatase enzyme having increased activity relative to a control, and (ii) a formylglycine generating enzyme having increased activity relative to a control, wherein the ratio of active sulfatase enzyme to total sulfatase enzyme produced by the cell is increased by at least 5% relative to the ratio of active sulfatase enzyme to total sulfatase enzyme produced by a cell lacking the formylglycine generating enzyme. It is known in the art that overexpression of sulfatase reduces the activity of endogenous sulfatase (Anson et al, biochem. J., 1993, 294: 657-662). Furthermore, only a portion of the recombinant sulfatase enzyme is active. Unexpectedly, we found that increased expression/activity of FGE results in the production of more active sulfatase in cells with increased expression/activity of sulfatase. Since the presence of FGly on sulfatase molecules correlates with sulfatase activity, the "active sulfatase" can be quantified by measuring the presence of FGly on sulfatase cell products using MALDI-TOF mass spectrometry (as described elsewhere herein). The ratio to total sulfatase can then be easily determined.
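The comparison described above (the ratio of active sulfatase to total sulfatase, with and without increased FGE) amounts to simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, the input quantities stand in for whatever units the FGly measurement yields, and it assumes that "increased by at least 5%" is read as a relative increase over the ratio in the cell lacking the formylglycine generating enzyme.

```python
# Illustrative sketch; function names and the relative-increase reading are assumptions.

def active_fraction(active_sulfatase: float, total_sulfatase: float) -> float:
    """Ratio of active (FGly-bearing) sulfatase to total sulfatase."""
    if total_sulfatase <= 0:
        raise ValueError("total sulfatase must be positive")
    if not 0 <= active_sulfatase <= total_sulfatase:
        raise ValueError("active sulfatase must lie between 0 and the total")
    return active_sulfatase / total_sulfatase


def ratio_increase_percent(ratio_with_fge: float, ratio_without_fge: float) -> float:
    """Relative increase of the active/total ratio, expressed in percent."""
    if ratio_without_fge <= 0:
        raise ValueError("reference ratio must be positive")
    return (ratio_with_fge - ratio_without_fge) / ratio_without_fge * 100.0


def meets_threshold(ratio_with_fge: float, ratio_without_fge: float,
                    threshold_percent: float = 5.0) -> bool:
    """True if the ratio rose by at least threshold_percent relative to the reference."""
    return ratio_increase_percent(ratio_with_fge, ratio_without_fge) >= threshold_percent


if __name__ == "__main__":
    reference = active_fraction(30.0, 100.0)   # hypothetical cell lacking added FGE
    with_fge = active_fraction(60.0, 100.0)    # hypothetical cell with increased FGE
    print(ratio_increase_percent(with_fge, reference))
    print(meets_threshold(with_fge, reference))
```

If the 5% figure were instead read as an absolute difference between the two ratios, the comparison would be a subtraction rather than a relative change; the text does not settle which reading is intended.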
The invention also provides methods for the diagnosis and treatment of sulfatase deficiencies. Such disorders include, but are not limited to, multiple sulfatase deficiency, mucopolysaccharidosis II (MPS II; Hunter syndrome), mucopolysaccharidosis IIIA (MPS IIIA; Sanfilippo syndrome A), mucopolysaccharidosis VIII (MPS VIII), mucopolysaccharidosis IVA (MPS IVA; Morquio syndrome A), mucopolysaccharidosis VI (MPS VI; Maroteaux-Lamy syndrome), Metachromatic Leukodystrophy (MLD), X-linked recessive chondrodysplasia punctata 1, and X-linked ichthyosis (steroid sulfatase deficiency).
The methods of the invention are useful in both the acute and the prophylactic treatment of any of the foregoing conditions. Acute treatment, as used herein, refers to the treatment of a subject who has a particular disorder. Prophylactic treatment refers to the treatment of a subject who is at risk of having the disorder but who does not currently have, or is not currently experiencing symptoms of, the disorder.
In its broadest sense, the term "treatment" means both acute and prophylactic treatment. If the subject in need of treatment is experiencing a disorder (or has or is having a particular disorder), treating the disorder refers to ameliorating, reducing, or eliminating the disorder or one or more symptoms arising from the disorder. In certain preferred embodiments, treating the disorder refers to ameliorating, reducing, or eliminating a specific symptom or a specific subset of symptoms associated with the disorder. If the subject in need of treatment is one who is at risk of having the disorder, then treating the subject means reducing the risk of the subject developing the disorder.
The mode of administration and dosage of the therapeutic agents of the invention will vary with the particular stage of the condition being treated, the age and physiological state of the subject being treated, the nature of concurrent therapy (if any), the particular route of administration, and like factors within the knowledge and expertise of the medical practitioner.
As described herein, the agents of the invention are administered in an amount effective to treat any of the foregoing sulfatase deficiencies. In general, an effective amount is any amount that can cause a beneficial change in a desired tissue of a subject. An effective amount is preferably an amount sufficient to cause a favorable phenotypic change in a particular disorder, such as a lessening, alleviation, or elimination of a symptom or of the disorder as a whole.
In general, an effective amount is that amount of a pharmaceutical preparation that, alone or together with further doses, produces the desired response. This may involve only slowing the progression of the condition temporarily, although more preferably it involves halting the progression of the condition permanently, delaying the onset of the condition, or preventing the condition from occurring. This can be monitored by routine methods. Generally, doses of active compounds will be from about 0.01 mg/kg per day to 1000 mg/kg per day. Doses in the range of 50 μg/kg to 500 mg/kg are expected to be suitable, preferably administered orally and in one or several administrations per day.
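The weight-based dose range stated above can be converted to an absolute daily quantity by multiplication with body weight. The following sketch shows only that arithmetic; the function names are hypothetical, the 70 kg body weight and 0.5 mg/kg value are arbitrary illustrations, and the bounds are simply the 0.01 to 1000 mg/kg per day figures from the text, not a dosing recommendation.

```python
# Illustrative arithmetic only; names and example values are hypothetical.
DAILY_RANGE_MG_PER_KG = (0.01, 1000.0)  # general range stated in the text


def daily_dose_mg(dose_mg_per_kg: float, body_weight_kg: float) -> float:
    """Convert a weight-based daily dose (mg/kg per day) to total mg per day."""
    low, high = DAILY_RANGE_MG_PER_KG
    if not low <= dose_mg_per_kg <= high:
        raise ValueError("dose lies outside the general range stated in the text")
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    return dose_mg_per_kg * body_weight_kg


def per_administration_mg(total_daily_mg: float, administrations_per_day: int) -> float:
    """Split a daily total evenly across one or several administrations."""
    if administrations_per_day < 1:
        raise ValueError("at least one administration per day is required")
    return total_daily_mg / administrations_per_day


if __name__ == "__main__":
    total = daily_dose_mg(0.5, 70.0)          # 0.5 mg/kg per day for a 70 kg subject
    print(total)                              # 35.0
    print(per_administration_mg(total, 2))    # 17.5
```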
Such amounts will, of course, depend on the particular condition being treated, the severity of the condition, the parameters of the individual patient including age, physical condition, size and weight, the duration of treatment, the nature of concurrent therapy (if any), the specific route of administration and similar factors within the knowledge and expertise of the medical practitioner. Lower doses will result from certain forms of administration, such as intravenous administration. In the event that the response in a subject is insufficient at the initial doses applied, higher doses (or effectively higher doses delivered by a different, more localized route) may be employed to the extent that patient tolerance permits. Multiple doses per day are contemplated to achieve appropriate systemic levels of the compounds. Generally, it is preferred to use a maximum dose, i.e., the highest safe dose according to sound medical judgment. However, it will be understood by those of ordinary skill in the art that a patient may insist on a lower dose or a tolerable dose for medical reasons, psychological reasons or virtually any other reason.
The agents of the present invention are optionally combined with a pharmaceutically acceptable carrier to form a pharmaceutical formulation. The term "pharmaceutically acceptable carrier" as used herein means one or more compatible solid or liquid fillers, diluents or encapsulating substances suitable for administration to a human. The term "carrier" denotes a natural or synthetic organic or inorganic ingredient with which the active ingredient is combined to facilitate application. The components of the pharmaceutical composition can also be co-mixed with the molecules of the present invention, and with each other, in a manner that ensures that no interaction exists that substantially destroys the desired pharmaceutical efficacy. In certain aspects, the pharmaceutical formulation comprises an agent of the invention in an amount effective to treat the disease.
The pharmaceutical formulation may contain suitable buffering agents including: acetic acid in salt form, citric acid in salt form, boric acid in salt form, or phosphoric acid in salt form. The pharmaceutical compositions may optionally also contain suitable preservatives such as benzalkonium chloride; chlorobutanol; parabens or thimerosal.
A variety of routes of administration are available. The particular mode selected will, of course, depend on the particular drug selected, the severity of the condition being treated and the dosage required for therapeutic efficacy. The methods of the invention, in general, may be practiced using any mode of administration that is medically acceptable, i.e., any mode that produces effective levels of the active compound without causing clinically unacceptable adverse effects. Such modes of administration include oral, rectal, topical, nasal, intradermal, transdermal, or parenteral routes. The term "parenteral" includes subcutaneous, intravenous, intraomental, intramuscular, or infusion. Intravenous or intramuscular routes are not particularly suitable for long-term treatment and prevention. As an example, pharmaceutical compositions for acute treatment of subjects with migraine headache can be formulated in a variety of different ways and for a variety of modes of administration, including tablets, capsules, powders, suppositories, injections and nasal sprays.
Pharmaceutical preparations may conveniently be presented in unit dosage form and may be prepared by any of the methods well known in the art of pharmacy. All methods include the step of bringing the active agent into association with a carrier that constitutes one or more accessory ingredients. In general, the compositions are prepared by uniformly and intimately bringing the active compound into association with a liquid carrier, a finely divided solid carrier, or both, and then, if necessary, shaping the product.
Compositions suitable for oral administration may be presented as discrete units, such as capsules, tablets, lozenges, each containing a predetermined amount of the active compound. Other compositions include aqueous or non-aqueous suspensions such as syrups, elixirs or emulsions.
Compositions suitable for parenteral administration conveniently comprise a sterile aqueous preparation of an agent of the invention, which is preferably isotonic with the blood of the recipient. The aqueous preparation can be formulated according to known methods using suitable dispersing or wetting agents and suspending agents. The sterile injectable preparation may also be a sterile injectable solution or suspension in a non-toxic parenterally acceptable diluent or solvent, for example, as a solution in 1,3-butanediol. Among the acceptable vehicles and solvents that may be employed are water, Ringer's solution and isotonic sodium chloride solution. In addition, sterile, fixed oils are conventionally employed as a solvent or suspending medium. For this purpose, any bland fixed oil may be employed, including synthetic mono- or diglycerides. In addition, fatty acids such as oleic acid find use in the preparation of injectables. Carrier formulations suitable for oral, subcutaneous, intravenous, intramuscular, and similar administration can be found in Remington's Pharmaceutical Sciences, Mack Publishing Co., Easton, PA.
According to one aspect of the invention, a method for increasing Cα-formylglycine generating activity in a cell is provided. The method comprises contacting the cell with an isolated nucleic acid molecule of the invention (e.g., the nucleic acid of SEQ ID NO. 1) or an expression product thereof (e.g., the peptide of SEQ ID NO. 2) in an amount effective to increase Cα-formylglycine generating activity in the cell. In important embodiments, the method comprises activating an endogenous FGE gene to increase Cα-formylglycine generating activity in the cell. In certain embodiments, the contacting is performed under conditions that permit entry of the molecule of the invention into the cell.
The term "permit entry" of a molecule into a cell according to the invention has the following meanings, depending on the nature of the molecule. For an isolated nucleic acid, it describes entry of the nucleic acid through the cell membrane and into the cell nucleus, whereupon the "nucleic acid transgene" can use the cellular machinery to produce the functional polypeptide encoded by the nucleic acid. "Nucleic acid transgene" is used to describe all nucleic acids of the invention, with or without associated vectors. For a polypeptide, the term describes entry of the polypeptide through the cell membrane and into the cytoplasm and, if necessary, use of the cytoplasmic machinery of the cell to functionally modify the polypeptide (e.g., into an active form).
A variety of techniques can be used to introduce the nucleic acids of the invention into a cell, depending on whether the nucleic acid is introduced into the host in vitro or in vivo. Such techniques include transfection of nucleic acid-CaPO4 precipitates, transfection of nucleic acids associated with DEAE, transfection with a retrovirus comprising the nucleic acid of interest, liposome-mediated transfection, and the like. For certain applications, it is preferred to target the nucleic acid to particular cells. In such instances, the vehicle (e.g., a retrovirus or other virus; a liposome) used to deliver the nucleic acid of the invention into the cell may have a targeting molecule attached to it. For example, a molecule such as an antibody specific for a surface membrane protein on the target cell, or a ligand for a receptor on the target cell, can be bound to or incorporated within the nucleic acid delivery vehicle. For example, where liposomes are used to deliver the nucleic acids of the invention, proteins that bind to a surface membrane protein associated with endocytosis may be incorporated into the liposome formulation for targeting and/or to facilitate uptake. Such proteins include capsid proteins or fragments thereof that are tropic for a particular cell type, antibodies to proteins that undergo internalization in cycling, proteins that target intracellular localization and enhance intracellular half-life, and the like. Polymeric delivery systems have also been used successfully to deliver nucleic acids into cells, as is known to those skilled in the art. Such systems even permit oral delivery of nucleic acids.
Other delivery systems can include timed release, delayed release, or sustained release delivery systems. Such systems can avoid repeated administration of the agents of the invention, increasing convenience to the subject and the physician. Many types of release delivery systems are available and known to those of ordinary skill in the art. They include polymer-based systems such as poly (lactide-co-glycolide), copolyoxalates, polycaprolactones, polyesteramides, polyorthoesters, polyhydroxybutyric acid and polyanhydrides. Microcapsules of the foregoing polymers containing drugs are described, for example, in U.S. patent 5,075,109. Delivery systems also include non-polymer systems: lipids including sterols such as cholesterol, cholesterol esters and fatty acids, or neutral fats such as mono-, di- and triglycerides; hydrogel release systems; silastic systems; peptide-based systems; wax coatings; compressed tablets using conventional binders and excipients; partially fused implants; and the like. Specific examples include, but are not limited to: (a) erosional systems in which the agent of the invention is contained in a form within a matrix, such as those described in U.S. patent Nos. 4,452,775, 4,675,189 and 5,736,152, and (b) diffusional systems in which the active component permeates from a polymer at a controlled rate, such as those described in U.S. patent Nos. 3,854,480, 5,133,974 and 5,407,686. In addition, pump-based hardware delivery systems can be used, some of which are adapted for implantation.
The use of a long-term sustained release implant may be desirable. Long-term release, as used herein, means that the implant is constructed and arranged to deliver therapeutic levels of the active ingredient for at least 30 days, and preferably 60 days. Long-term sustained release implants are well known to those of ordinary skill in the art and include some of the release systems described above. Specific examples include, but are not limited to, the long-term sustained release implants described in U.S. patent No.4,748,024 and Canadian patent No. 1330939.
The present invention also includes the administration, and in certain embodiments the co-administration, of agents other than the FGE molecules of the present invention that, when administered in effective amounts, can act cooperatively, additively, or synergistically with a molecule of the present invention to: (i) modulate Cα-formylglycine generating activity, and (ii) treat any condition in which the Cα-formylglycine generating activity of a molecule of the invention is involved (e.g., a sulfatase deficiency, including multiple sulfatase deficiency). Agents other than the molecules of the present invention include iduronate-2-sulfatase, sulfamidase, N-acetylgalactosamine-6-sulfatase, N-acetylglucosamine-6-sulfatase, arylsulfatase A, arylsulfatase B, arylsulfatase C, arylsulfatase D, arylsulfatase E, arylsulfatase F, arylsulfatase G, HSulf-1, HSulf-2, HSulf-3, HSulf-4, HSulf-5, or HSulf-6 (nucleic acids and polypeptides, and/or fragments thereof), and/or combinations thereof.
As used herein, "co-administration" refers to administering two or more compounds of the invention (e.g., an FGE nucleic acid and/or polypeptide and an agent known to be beneficial in the treatment of, for example, a sulfatase deficiency, such as iduronate-2-sulfatase in the treatment of MPS II) either simultaneously, as a mixture in a single composition, or sequentially, close enough in time that the compounds may exert an additive or even synergistic effect.
The invention also encompasses an array of solid phase nucleic acid molecules. The array consists essentially of a set of nucleic acid molecules, expression products thereof, or fragments thereof (or of the nucleic acid or polypeptide molecules), each nucleic acid molecule being selected from the group consisting of FGE, iduronate-2-sulfatase, sulfamidase, N-acetylgalactosamine-6-sulfatase, N-acetylglucosamine-6-sulfatase, arylsulfatase A, arylsulfatase B, arylsulfatase C, arylsulfatase D, arylsulfatase E, arylsulfatase F, arylsulfatase G, HSulf-1, HSulf-2, HSulf-3, HSulf-4, HSulf-5, and HSulf-6, immobilized on a solid substrate. In certain embodiments, the solid phase array further comprises at least one control nucleic acid molecule. In certain embodiments, the set of nucleic acid molecules comprises at least one, at least two, at least three, at least four, or even at least five nucleic acid molecules, each selected from the group consisting of FGE, iduronate-2-sulfatase, sulfamidase, N-acetylgalactosamine-6-sulfatase, N-acetylglucosamine-6-sulfatase, arylsulfatase A, arylsulfatase B, arylsulfatase C, arylsulfatase D, arylsulfatase E, arylsulfatase F, arylsulfatase G, HSulf-1, HSulf-2, HSulf-3, HSulf-4, HSulf-5, and HSulf-6. In a preferred embodiment, the set of nucleic acid molecules comprises up to 100 different nucleic acid molecules. In important embodiments, the set of nucleic acid molecules comprises up to 10 different nucleic acid molecules.
According to the invention, standard microarray hybridization techniques are used to assess patterns of nucleic acid expression and to identify nucleic acid expression. Microarray technology, which is also known by other names including DNA chip technology, gene chip technology, and solid phase nucleic acid array technology, is well known to those of ordinary skill in the art and is based on, but not limited to, obtaining an array of identified nucleic acid probes (e.g., molecules such as FGE, iduronate-2-sulfatase, sulfamidase, N-acetylgalactosamine-6-sulfatase, N-acetylglucosamine-6-sulfatase, arylsulfatase A, arylsulfatase B, arylsulfatase C, arylsulfatase D, arylsulfatase E, arylsulfatase F, arylsulfatase G, HSulf-1, HSulf-2, HSulf-3, HSulf-4, HSulf-5, and/or HSulf-6, as described elsewhere herein) on a fixed substrate, labeling the target molecules with reporter molecules (e.g., radioactive, chemiluminescent, or fluorescent labels such as fluorescein, Cye3-dUTP, or Cye5-dUTP), hybridizing the target nucleic acids to the probes, and evaluating target-probe hybridization. In general, a probe with a nucleic acid sequence that perfectly matches the target sequence will yield a stronger reporter signal than a probe with a less perfect match. Various components and techniques employed in nucleic acid microarray technology are presented in The Chipping Forecast, Nature Genetics, Vol. 21, Jan 1999, the entire contents of which are incorporated herein by reference.
According to the present invention, the microarray substrate may include, but is not limited to, glass, silica, aluminosilicates, borosilicates, metal oxides such as alumina and nickel oxide, various clays, nitrocellulose, or nylon. In all embodiments, a glass substrate is preferred. According to the invention, the probes are selected from nucleic acids including, but not limited to, DNA, genomic DNA, cDNA, and oligonucleotides, and may be natural or synthetic. Oligonucleotide probes are preferably 20- to 25-mer oligonucleotides, while DNA/cDNA probes are preferably 500 to 5000 bases in length, although other lengths may be used. Suitable probe lengths can be determined by one of ordinary skill in the art by procedures known in the art. In one embodiment, preferred probes are those set forth as SEQ ID NOs: 1, 3, 4, 6, 8, 10, and/or 12. The probes may be purified using standard methods known to those of ordinary skill in the art, such as gel filtration or precipitation, to remove contaminants.
In one embodiment, the microarray substrate may be coated with a compound to enhance the synthesis of probes on the substrate. Such compounds include, but are not limited to, oligoethylene glycols. In another embodiment, a coupling reagent or group on the substrate can be used to covalently attach the first nucleotide or oligonucleotide to the substrate. These agents or groups may include, but are not limited to: amino, hydroxyl, bromo and carboxyl. These reactive groups are preferably attached to the substrate through a hydrocarbyl group such as an alkylene or phenylene divalent group, one valence position being occupied by a chain link and the remaining one being attached to the reactive group. These hydrocarbyl groups may contain up to about 10 carbon atoms, preferably up to about 6 carbon atoms. The alkylene group generally preferably contains 2 to 4 carbon atoms in the main chain. These and further details of the present process are disclosed, for example, in U.S. patent 4,458,066, the entire contents of which are incorporated by reference.
In one embodiment, the probes are synthesized directly on the substrate in a predetermined grid pattern using methods such as light-directed chemical synthesis, photochemical deprotection, or transport of nucleotide precursors to the substrate and subsequent probe generation.
In another embodiment, the substrate may be coated with a compound to enhance binding of the probe to the substrate. Such compounds include, but are not limited to: polylysine, aminosilanes, amino-reactive silanes (Chipping Forecast, 1999), or chromium (Gwynne and Page, 2000). In this embodiment, presynthesized probes are applied to the substrate in a precise, predetermined volume and grid pattern by a computer-controlled robot, either in a contact-printing mode or in a non-contact mode such as ink-jet or piezoelectric delivery. The probes may be covalently attached to the substrate by methods including, but not limited to, UV irradiation. In another embodiment, the probes are attached to the substrate by heat.
The target is a nucleic acid selected from the group including, but not limited to: DNA, genomic DNA, cDNA, RNA, and mRNA, and may be natural or synthetic. In all embodiments, nucleic acid molecules from subjects suspected of developing or having a sulfatase deficiency are preferred. In certain embodiments of the invention, one or more control nucleic acid molecules are attached to the substrate. Preferably, the control nucleic acid molecules allow determination of factors including, but not limited to: nucleic acid quality and binding characteristics; reagent quality and effectiveness; hybridization success; and analysis thresholds and success. The control nucleic acids can include, but are not limited to, expression products of genes such as housekeeping genes, or fragments thereof.
To select a set of sulfatase deficiency disease markers, the expression data generated by, for example, microarray analysis of gene expression is preferably analyzed to determine which genes are significantly differentially expressed in different categories of patients (each category of patients having a different sulfatase deficiency disorder). The significance of gene expression can be determined using Permax computer software, although any standard statistical package that can discriminate significant differences in expression may be used. Permax performs permutation two-sample t-tests on large arrays of data. For high-dimensional vectors of observations, the Permax software computes t-statistics for each attribute and assesses significance using the permutation distribution of the maximum and minimum over all attributes. The main use is to determine the attributes (genes) that differ most between two groups (e.g., healthy control subjects and subjects with a specific sulfatase deficiency), measuring "most different" by the value of the t-statistics and their significance levels.
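The permutation approach described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the Permax implementation: the function names, the use of a Welch t-statistic, the exhaustive enumeration of group splits (practical only for small sample sizes), and the expression values are all hypothetical.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, variance


def t_stat(a, b):
    """Welch two-sample t-statistic for two lists of expression values."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se if se > 0 else 0.0


def maxmin_permutation_test(data, n_a):
    """data holds one row of values per gene; the first n_a columns are group A.

    Returns one (t_observed, p_upper, p_lower) tuple per gene, where the
    p-values come from the distribution of the maximum and the minimum
    t-statistic over all genes across every relabeling of the samples.
    """
    n = len(data[0])

    def all_t(idx_a):
        in_a = set(idx_a)
        return [
            t_stat([row[i] for i in idx_a],
                   [row[i] for i in range(n) if i not in in_a])
            for row in data
        ]

    observed = all_t(tuple(range(n_a)))
    splits = list(combinations(range(n), n_a))  # every way to pick group A
    perm_max, perm_min = [], []
    for idx in splits:
        t_values = all_t(idx)
        perm_max.append(max(t_values))
        perm_min.append(min(t_values))

    eps = 1e-9  # tolerance for floating-point ties
    results = []
    for t_obs in observed:
        p_upper = sum(m >= t_obs - eps for m in perm_max) / len(splits)
        p_lower = sum(m <= t_obs + eps for m in perm_min) / len(splits)
        results.append((t_obs, p_upper, p_lower))
    return results


# Hypothetical expression values: 5 control samples followed by 5 patient samples.
EXAMPLE = [
    [10.1, 10.3, 9.9, 10.2, 10.0, 12.0, 12.2, 11.9, 12.1, 12.3],  # gene 0: shifted
    [5.0, 5.2, 4.9, 5.1, 5.3, 5.1, 4.8, 5.2, 5.0, 5.3],           # gene 1: unchanged
]
RESULTS = maxmin_permutation_test(EXAMPLE, n_a=5)
```

Comparing each observed statistic against the maximum and minimum over all genes controls for the number of genes tested at once. For larger sample sizes, random relabelings would replace the exhaustive enumeration.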
Expression of a sulfatase deficiency disease-associated nucleic acid molecule can also be determined using protein measurement methods to determine expression of SEQ ID NO: 2, e.g., by determining the expression of the polypeptides encoded by SEQ ID NOs: 1 and/or 3. Preferred methods for specific and quantitative measurement of proteins include, but are not limited to: mass spectrometry based methods such as surface enhanced laser desorption ionization (SELDI; e.g., the Ciphergen Protein Chip System), non-mass spectrometry based methods, and immunohistochemistry based methods such as two-dimensional gel electrophoresis.
By procedures known to those of ordinary skill in the art, the SELDI method can be used to vaporize microscopic amounts of protein and create a "fingerprint" of individual proteins, allowing simultaneous measurement of the abundance of many proteins in a single sample. Preferably, SELDI-based assays can be used to characterize multiple sulfatase deficiency, as well as the stage of such conditions. Such assays preferably include, but are not limited to, the following examples. Gene products found by RNA microarrays can be selectively measured by specific (antibody-mediated) capture to a SELDI protein disc (e.g., selective SELDI). Gene products found by protein screening (e.g., in two-dimensional gels) can be resolved by "total protein SELDI" optimized to visualize those of particular interest from among SEQ ID NOs: 1, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, and/or 28. Predictive models of a specific sulfatase deficiency based on SELDI measurement of multiple markers from among SEQ ID NOs: 1, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, and/or 28 can be used for the SELDI strategies.
Use of any of the foregoing microarray methods to determine the expression of a sulfatase deficiency disease-associated nucleic acid can be accomplished in routine ways known to those of ordinary skill in the art, and the expression determined by protein measurement methods can be correlated with predetermined levels of a marker for use as a prognostic method for selecting treatment strategies for sulfatase deficiency disease patients.
The invention also encompasses a sulfatase-producing cell wherein the ratio of active sulfatase to total sulfatase produced by the cell (i.e., the specific activity) is increased. The cell comprises: (i) sulfatase having enhanced expression, and (ii) formylglycine generating enzyme having enhanced expression, wherein the ratio of active sulfatase to total sulfatase produced by the cell is increased by at least 5% relative to the ratio of active sulfatase to total sulfatase produced by the cell in the absence of the formylglycine generating enzyme.
As used herein, "sulfatase with enhanced expression" typically refers to increased expression of a sulfatase and/or its encoded polypeptide relative to a control. Increased expression refers to an increase (i.e., to a detectable degree) in the replication, transcription, and/or translation of any sulfatase nucleic acid (e.g., a sulfatase nucleic acid of the invention as described elsewhere herein), as upregulation of any of these processes results in an increase in the concentration/amount of the polypeptide encoded by the gene (nucleic acid). This can be accomplished using a number of methods known in the art and also described elsewhere herein, e.g., transfection of cells with sulfatase cDNA and/or genomic DNA containing a sulfatase locus, activation of an endogenous sulfatase gene by placing, e.g., a strong promoter element upstream of the genomic locus of the endogenous sulfatase gene using homologous recombination (see, e.g., the gene activation techniques described in detail in U.S. Patent Nos. 5,733,761, 6,270,989, and 6,565,844, all of which are incorporated herein by reference), and the like. A typical control will be the same cell transfected with the vector plasmid. Enhancing (or increasing) sulfatase activity also refers to preventing or inhibiting degradation (e.g., by enhanced ubiquitination), down-regulation, etc., of the sulfatase, which results in, for example, an increased or stabilized sulfatase molecule t1/2 (half-life) relative to a control. Downregulated or reduced expression refers to reduced expression of a gene and/or the polypeptide encoded thereby. The up- or down-regulation of gene expression can be determined directly by detecting an increase or decrease, respectively, in the mRNA level of a gene (e.g., sulfatase) or in the protein expression level of the polypeptide encoded by the gene, using any suitable means known in the art, such as nucleic acid hybridization or antibody detection methods, and comparing to a control.
Up-or down-regulation of the expression of a sulfatase gene can also be determined indirectly by detecting a change in sulfatase activity.
Similarly, as used herein, "formylglycine generating enzyme with enhanced expression" typically refers to increased expression of an FGE nucleic acid of the invention and/or a polypeptide encoded thereby relative to a control. Increased expression refers to an increase (i.e., to a detectable degree) in the replication, transcription, and/or translation of any FGE nucleic acid of the invention (as described elsewhere herein), as upregulation of any of these processes results in an increase in the concentration/amount of the polypeptide encoded by the gene (nucleic acid). This can be done using the methods described above (for sulfatase) and elsewhere herein.
In certain embodiments, the ratio of active sulfatase to total sulfatase produced by the cell is increased by at least 10%, 15%, 20%, 50%, 100%, 200%, 500%, 1000% relative to the ratio of active sulfatase to total sulfatase produced by a cell lacking the formylglycine generating enzyme.
The present invention further comprises an improved method of treating sulfatase deficiency in a subject. The method comprises administering to a subject in need of such treatment a sulfatase in an amount effective to treat the sulfatase deficiency in the subject, wherein the sulfatase has been contacted with a formylglycine generating enzyme in an amount effective to increase the specific activity of the sulfatase. As described elsewhere herein, "specific activity" refers to the ratio of active sulfatase to total sulfatase produced. As used herein, "contacting" refers to FGE post-translationally modifying the sulfatase, as described elsewhere herein. It will be apparent to the skilled person that FGE can contact and modify a sulfatase if nucleic acids encoding FGE and the sulfatase are co-expressed in a cell, or even if an isolated FGE polypeptide is contacted with an isolated sulfatase polypeptide in vivo or in vitro. Even though an isolated FGE polypeptide can be co-administered to the subject with an isolated sulfatase polypeptide to treat sulfatase deficiency in the subject, the contact between the FGE and the sulfatase preferably occurs in vitro prior to administration of the sulfatase to the subject. This improved treatment method is beneficial to the subject because less sulfatase needs to be administered, and/or it needs to be administered less frequently, since the sulfatase has a higher specific activity.
The present invention will be more fully understood with reference to the following examples. However, these examples are intended only to illustrate embodiments of the present invention and should not be construed as limiting the scope of the present invention.
Examples
Example 1:
Multiple sulfatase deficiency is caused by mutations in the gene encoding the Cα-formylglycine generating enzyme (FGE).
Experimental procedures
Materials and methods
In vitro analysis of FGE
To monitor the activity of FGE, the N-acetylated and C-amidated 23-mer peptide P23 (MTDFYVPVSLCTPSRAALLTGRS) (SEQ ID NO: 33) was used as the substrate. The conversion of the cysteine residue at position 11 to FGly was monitored by MALDI-TOF mass spectrometry. A 6 μM stock solution of P23 in 30% acetonitrile and 0.1% trifluoroacetic acid (TFA) was prepared. Under standard conditions, 6 pmol of P23 were incubated at 37 °C with up to 10 μl of enzyme in a final volume of 30 μl of 50 mM Tris/HCl, pH 9.0, containing 67 mM NaCl, 15 μM CaCl2, 2 mM DTT, and 0.33 mg/ml bovine serum albumin. To stop the enzyme reaction, 1.5 μl of 10% TFA was added. P23 was then bound to ZipTip C18 (Millipore), washed with 0.1% TFA, and eluted in 3 μl of 50% acetonitrile, 0.1% TFA. An aliquot of the eluate was mixed with 0.5 μl of matrix solution (5 mg/ml α-cyano-4-hydroxy-cinnamic acid (Bruker Daltonics, Billerica, MA) in 50% acetonitrile, 0.1% TFA) on a stainless steel target. MALDI-TOF mass spectrometry was performed on a Reflex III (Bruker Daltonics) using reflectron mode and laser energy just above the desorption/ionization threshold. All spectra were averages of 200-300 shots from several spots on the target. The mass axis was calibrated using peptides with molecular masses ranging from 1000 to 3000 Da as external standards. The monoisotopic MH+ of P23 is 2526.28 and that of the FGly-containing product is 2508.29. The activity (pmol product/hour) was calculated on the basis of the peak height of the product divided by the sum of the peak heights of P23 and product.
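As an illustrative check of the numbers above, the following sketch recomputes the monoisotopic MH+ values of P23 and its FGly product from standard monoisotopic masses and converts peak heights into an activity. The function names are hypothetical, and scaling the product fraction by the 6 pmol of substrate and the incubation time is an assumption about how the peak-height ratio was turned into pmol product/hour.

```python
# Standard monoisotopic residue masses (Da) for the amino acids found in P23.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "D": 115.02694, "M": 131.04049, "F": 147.06841, "R": 156.10111,
    "Y": 163.06333,
}
C, H, N, O, S = 12.0, 1.007825, 14.003074, 15.994915, 31.972071
PROTON = 1.007276

P23 = "MTDFYVPVSLCTPSRAALLTGRS"


def mh_plus_acetylated_amidated(seq):
    """Monoisotopic MH+ of an N-acetylated, C-amidated peptide."""
    acetyl = 2 * C + 3 * H + O   # CH3CO- on the N-terminus
    amide = N + 2 * H            # -NH2 on the C-terminus
    return sum(RESIDUE_MASS[r] for r in seq) + acetyl + amide + PROTON


# Cys -> FGly turns the -CH2-SH side chain into -CHO: lose S and 2 H, gain O.
FGLY_SHIFT = -(S + 2 * H - O)

MH_SUBSTRATE = mh_plus_acetylated_amidated(P23)
MH_PRODUCT = MH_SUBSTRATE + FGLY_SHIFT


def fgly_fraction(height_substrate, height_product):
    """Product peak height divided by the sum of substrate and product heights."""
    return height_product / (height_substrate + height_product)


def activity_pmol_per_hour(height_substrate, height_product,
                           substrate_pmol=6.0, hours=1.0):
    """Assumed conversion: product fraction times substrate amount per hour."""
    return fgly_fraction(height_substrate, height_product) * substrate_pmol / hours
```

With hypothetical peak heights of 300 (P23) and 100 (product) after a 30-minute incubation, `activity_pmol_per_hour(300, 100, hours=0.5)` evaluates to 3.0 pmol/hour.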
Purification of FGE from bovine testis
Bovine testes were obtained from a local slaughterhouse and stored on ice for up to 20 hours. The parenchyma was freed from connective tissue and homogenized in a Waring blender and by three rounds of motor pottering. Preparation of rough microsomes (RM) by cell fractionation of the resulting homogenate was performed as described (Meyer et al., J. Biol. Chem., 2000, 275: 14550-). Three differential centrifugation steps, 20 minutes each at 4 °C, were performed at 500 g (JA10 rotor), 3000 g (JA10), and 10000 g (JA20). From the last supernatant the RM membranes were sedimented (125000 g, Ti45 rotor, 45 min, 4 °C), homogenized by motor pottering, and layered onto a sucrose cushion (50 mM Hepes, pH 7.6, 50 mM KAc, 6 mM MgAc2, 1 mM EDTA, 1.3 M sucrose, 5 mM β-mercaptoethanol). RM were recovered from the pellet after centrifugation for 210 minutes at 45000 rpm in a Ti45 rotor at 4 °C. Usually 100000-150000 equivalents of RM, as defined by Walter and Blobel (Methods Enzymol., 1983, 96: 84-93), were obtained from 1 kg of testis tissue. The reticuloplasm, i.e., the luminal content of the RM, was obtained by differential extraction at low concentrations of deoxy Big Chap, as described (Fey et al., J. Biol. Chem., 2001, 276: 47021-47028). For FGE purification, 95 ml of reticuloplasm were dialyzed for 20 hours at 4 °C against 20 mM Tris/HCl, pH 8.0, 2.5 mM DTT, and cleared by centrifugation at 125000 g for 1 hour. 32 ml aliquots of the cleared reticuloplasm were loaded onto a MonoQ HR10/10 column (Amersham Biosciences, Piscataway, NJ) at room temperature, washed, and eluted at 2 ml/min with a linear gradient of 0 to 0.75 M NaCl in 80 ml of the Tris buffer. The fractions containing FGE activity (eluting at 50-165 mM NaCl) from three runs were pooled (42 ml) and mixed with 2 ml of concanavalin A-Sepharose (Amersham Biosciences) that had been washed with 50 mM Hepes buffer, pH 7.4, containing 0.5 M KCl, 1 mM MgCl2, 1 mM MnCl2, 1 mM CaCl2, and 2.5 mM DTT.
After incubation for 16 hours at 4 °C, the concanavalin A-Sepharose was collected in a column and washed with 6 ml of the same Hepes buffer. The bound material was eluted by incubating the column for 1 hour at room temperature with 6 ml of 0.5 M α-methylmannoside in 50 mM Hepes, pH 7.4, 2.5 mM DTT. The elution was repeated with 4 ml of the same eluent. The combined eluates (10 ml) from concanavalin A-Sepharose were adjusted to pH 8.0 with 0.5 M Tris/HCl, pH 9.0, and mixed with 2 ml of Affigel 10 (Bio-Rad Laboratories, Hercules, CA) that had been derivatized with 10 mg of the scrambled peptide (PVSLPTRSCAALLTGR) (SEQ ID NO: 34) and washed with buffer A (50 mM Hepes, pH 8.0, containing 0.15 M potassium acetate, 0.125 M sucrose, 1 mM MgCl2, and 2.5 mM DTT). After incubation for 3 hours at 4 °C, the affinity matrix was collected in a column. The flow-through and a wash fraction with 4 ml of buffer A were collected, combined, and mixed with 2 ml of Affigel 10 that had been substituted with 10 mg of the Ser69 peptide (PVSLSTPSRAALLTGR) (SEQ ID NO: 35) and washed with buffer A. After incubation overnight at 4 °C, the affinity matrix was collected in a column and washed 3 times with 6 ml of buffer B (buffer A containing 2 M NaCl and a mixture of the 20 proteinogenic amino acids, each at 50 mg/ml). The bound material was eluted from the affinity matrix by incubating the Affigel twice, for 90 minutes each, with 6 ml of buffer B containing 25 mM Ser69 peptide. An aliquot of the eluate was supplemented with 1 mg/ml bovine serum albumin, dialyzed against buffer A, and analyzed for activity. The remainder of the active eluate (11.8 ml) was concentrated in a Vivaspin 500 concentrator (Vivascience AG, Hannover, Germany) and solubilized at 95 °C in Laemmli SDS sample buffer. The polypeptide composition of the starting material and of the preparations obtained after the chromatographic steps was monitored by SDS-PAGE (15% acrylamide, 0.16% bisacrylamide) and staining with SYPRO Ruby (Bio-Rad Laboratories).
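The three peptides used in this procedure differ in whether they carry the FGly modification motif L/V-C(S)-X-P-S-R (SEQ ID NO: 32) given in the Background. The sketch below scans each peptide for that motif with a regular expression; the pattern and the helper name are illustrative assumptions, not part of the described method.

```python
import re

# Hexapeptide motif L/V-C(S)-X-P-S-R; the captured residue is the Cys (or Ser)
# that would be converted to FGly.
FGLY_MOTIF = re.compile(r"[LV]([CS]).PSR")

PEPTIDES = {
    "P23": "MTDFYVPVSLCTPSRAALLTGRS",   # SEQ ID NO: 33
    "scrambled": "PVSLPTRSCAALLTGR",    # SEQ ID NO: 34
    "Ser69": "PVSLSTPSRAALLTGR",        # SEQ ID NO: 35
}


def find_modification_site(sequence):
    """Return (1-based position, residue) of the motif's target residue, or None."""
    match = FGLY_MOTIF.search(sequence)
    if match is None:
        return None
    return match.start(1) + 1, match.group(1)


SITES = {name: find_modification_site(seq) for name, seq in PEPTIDES.items()}
```

Under these assumptions the scan finds the cysteine at position 11 of P23, no motif in the scrambled peptide, and a serine at position 5 of the Ser69 peptide, which corresponds to ASA residue 69 because the 16-mer starts at residue 65.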
Identification of FGE by Mass Spectrometry
For peptide mass fingerprinting, the purified polypeptides were digested in-gel with trypsin (Shevchenko et al., Anal. Chem., 1996, 68: 850-). For tandem mass spectrometry, selected peptides were analyzed by MALDI-TOF post-source decay mass spectrometry. Their corresponding doubly charged ions were isolated and fragmented by offline nano-ESI ion trap mass spectrometry (EsquireLC, Bruker Daltonics). The mass spectrometric data were used by the Mascot search algorithm for protein identification in the NCBInr protein database and the NCBI EST nucleotide database.
Bioinformatics
Signal peptides and cleavage sites were predicted with the method of von Heijne (von Heijne, Nucleic Acids Res., 1986, 14: 4683-90) as implemented in EMBOSS (Rice et al., Trends in Genetics, 2000, 16: 276-). N-glycosylation sites were predicted using the algorithm of Brunak (Gupta and Brunak, Pac. Symp. Biocomput., 2002, 310-22). Functional domains were detected by searching the PFAM Hidden Markov Models (version 7.8) (Sonnhammer et al., Nucleic Acids Res., 1998, 26: 320-322). To search for FGE homologs, the databases of the National Center for Biotechnology Information (Wheeler et al., Nucleic Acids Res., 2002, 20: 13-16) were queried with BLAST (Altschul et al., Nucleic Acids Res., 1997, 25: 3389-3402). Sequence similarities were computed using standard tools from EMBOSS. Genomic locus organization and synteny were determined using the NCBI human and mouse genome resources and the human-mouse homology map, also from NCBI, Bethesda, MD.
Cloning of human FGE cDNA
Total RNA, prepared from human fibroblasts using the RNEASY™ Mini kit (Qiagen, Inc., Valencia, CA), was reverse transcribed using the OMNISCRIPT RT™ kit (Qiagen, Inc., Valencia, CA) and either an oligo(dT) primer or the FGE-specific primer 1199nc (CCAATGTAGGTCAGACACG) (SEQ ID NO: 36). The first-strand cDNA was amplified by PCR using the forward primer 1c (ACATGGCCCGCGCGGGAC) (SEQ ID NO: 37) and, as the reverse primer, either 1199nc or 1182nc (CGACTGCTCCTTGGACTGG) (SEQ ID NO: 38). The PCR products were cloned directly into the pCR4-TOPO™ vector (Invitrogen Corporation, Carlsbad, CA). The coding sequence of the FGE cDNA (SEQ ID NOs: 1 and 3) was determined by sequencing multiple cloned PCR products obtained from different individuals and from independent RT-PCR reactions.
Mutation detection, genome sequencing, site-directed mutagenesis and northern blot analysis
The standard protocols used in this study were essentially as described in Lübke et al. (Nat. Gen., 2001, 28: 73-76) and Hansske et al. (J. Clin. Invest., 2002, 109: 725-733). Northern blots were hybridized with a cDNA probe covering the entire coding region and with a β-actin cDNA probe as a control for RNA loading.
Cell lines and cell cultures
Fibroblasts from patients 1-6 with multiple sulfatase deficiency were obtained from E. Christenson (Rigshospitalet Copenhagen), M. Beck (Universitätskinderklinik Mainz), A. Kohlschütter (Universitätskrankenhaus Eppendorf, Hamburg), E. Zammarchi (Meyer Hospital, University of Florence), K. Harzer (Institut für Hirnforschung, Universität Tübingen), and A. Fensom (Guy's Hospital, London), respectively. Human skin fibroblasts, HT-1080, BHK21, and CHO cells were maintained at 37 °C under 5% CO2 in Dulbecco's modified Eagle's medium containing 10% fetal bovine serum.
Transfection, Indirect immunofluorescence, Western blot analysis and detection of FGE Activity
Using add-on PCR with Pfu polymerase (Stratagene, La Jolla, CA) and the primers GGAATTCGGGACAACATGGCTGCG (EcoRI) (SEQ ID NO: 39), CCCAAGCTTATGCGTAGTCAGGCACATCATACGGATAGTCCATGGTGGGCAGGC (HA) (SEQ ID NO: 40), CCCAAGCTTACAGGTCTTCTTCAGAAATCAGCTTTTGTTCGTCCATGGTGGGCAGGC (c-Myc) (SEQ ID NO: 41), and CCCAAGCTTAGTGATGGTGATGGTGATGCGATCCTCTGTCCATGGTGGGCAGGC (RGS-His6) (SEQ ID NO: 42), the FGE cDNA was equipped with a 5' EcoRI site and a 3' HA, c-Myc, or RGS-His6 tag sequence, followed by a stop codon and a HindIII site. The resulting PCR products were cloned as EcoRI/HindIII fragments into pMPSVEH (Artelt et al., Gene, 1988, 68: 213-219). Using EFFECTENE™ (Qiagen) as the transfection reagent, the resulting plasmids were transiently transfected into HT-1080, BHK21, and CHO cells grown on coverslips. 48 hours after transfection, the cells were analyzed by indirect immunofluorescence as described previously (Lübke et al., Nat. Gen., 2001, 28: 73-76; Hansske et al., J. Clin. Invest., 2002, 109: 725-733), using monoclonal IgG1 antibodies against HA (Berkeley Antibody Company, Richmond, CA), c-Myc (Santa Cruz Biotechnology, Inc., Santa Cruz, CA), or RGS-His (Qiagen) as primary antibodies. The endoplasmic reticulum marker protein, protein disulphide isomerase (PDI), was detected with a monoclonal antibody of a different subtype (IgG2A, Stressgen Biotech., Victoria BC, Canada). The primary antibodies were detected with isotype-specific goat secondary antibodies (Molecular Probes, Inc., Eugene, OR) conjugated to CY2 or CY3, respectively. Immunofluorescence images were obtained on a Leica TCS Sp2 AOBS laser scanning microscope. For Western blot analysis, the same monoclonal antibodies were used, with HRP-conjugated anti-mouse IgG as the secondary antibody. For determination of FGE activity, the trypsinized cells were washed with phosphate-buffered saline containing a mixture of protease inhibitors (208 μM 4-(2-aminoethyl)benzenesulfonyl fluoride hydrochloride, 0.16 μM aprotinin, 4.2 μM leupeptin, 7.2 μM bestatin, 3 μM pepstatin A, 2.8 μM E-64), solubilized in 10 mM Tris, pH 8.0, containing 2.5 mM DTT, the protease inhibitors, and 1% Triton X-100, and cleared by centrifugation at 125,000 g for 1 hour. The supernatant was subjected to chromatography on a MonoQ PC 1.6/5 column using the conditions described above. Fractions eluting at 50-200 mM NaCl were pooled, lyophilized, and reconstituted in one tenth of the original pool volume before determination of FGE activity with peptide P23.
Retroviral transduction
The cDNAs of interest were cloned into the Moloney murine leukemia virus-based vectors pLPCX and pLNCX2 (BD Biosciences Clontech, Palo Alto, CA). Transfection of ecotropic FNX-Eco cells (ATCC, Manassas, VA) and transduction of amphotropic RETROPACK™ PT67 cells (BD Biosciences Clontech) and human fibroblasts were performed as described (Lübke et al., Nat. Gen., 2001, 28: 73-76; Thiel et al., Biochem. J., 2002, 376, 195-201). For some experiments, pLPCX-transduced PT67 cells were selected with puromycin prior to determination of sulfatase activities.
Sulfatase analysis
ASA, STS, and GalNAc6S activities were determined as described (Rommerskirch and von Figura, Proc. Natl. Acad. Sci. USA, 1992, 89: 2561-2565; Glössl and Kresse, Clin. Chim. Acta, 1978, 88: 111-119).
Results
Peptide-based rapid assay for FGE activity
We had developed an assay for determining FGE activity in microsome extracts using in vitro synthesized [35S]ASA fragments as substrate. The fragments were added to the assay mixture as ribosome-associated nascent chain complexes. Quantification of the product included tryptic digestion, separation of the peptides by RP-HPLC, and identification and quantification of the [35S]-labeled FGly-containing tryptic peptide by a combination of chemical derivatization to hydrazones, RP-HPLC separation, and scintillation counting (Fey et al., J. Biol. Chem., 2001, 276: 47021-47028). For monitoring the enzyme activity during purification, this cumbersome procedure needed to be modified. A synthetic 16-mer peptide corresponding to ASA residues 65-80 and containing the sequence motif required for FGly formation inhibited FGE activity in the in vitro assay. This suggested that peptides such as ASA65-80 may serve as substrates for FGE. We synthesized the 23-mer peptide P23 (SEQ ID NO: 33), which corresponds to ASA residues 60-80 with an additional N-acetylated methionine and a C-amidated serine residue to protect the N- and C-terminus, respectively. The cysteine- and FGly-containing forms of P23 could be identified and quantified by matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) mass spectrometry. The presence of the FGly residue at position 11 of P23 was verified by MALDI-TOF post-source decay mass spectrometry (see Peng et al., J. Mass Spec., 2003, 38: 80-86). Incubation of P23 with extracts from microsomes of bovine pancreas or bovine testis converted up to 95% of the peptide into the FGly-containing derivative (Fig. 1). Under standard conditions, the reaction was proportional to the amount of enzyme and the incubation time, as long as less than 50% of the substrate was consumed and the incubation time did not exceed 24 hours. The Km for P23 was 13 nM. The effects of reduced and oxidized glutathione, Ca2+, and pH were comparable to those seen in the assay using ribosome-associated nascent chain complexes as substrate (Fey et al., J. Biol. Chem., 2001, 276: 47021-47028).
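To put the reported Km in context, the sketch below estimates how close the standard assay (6 pmol of P23 in 30 μl) is to substrate saturation. It assumes simple Michaelis-Menten kinetics, which the text implies by reporting a Km but does not state explicitly; the function names are hypothetical.

```python
def concentration_nM(amount_pmol, volume_ul):
    """Convert pmol in a volume of microliters to nM (1 pmol/ul equals 1 uM)."""
    return amount_pmol / volume_ul * 1000.0


def fraction_of_vmax(substrate_nM, km_nM):
    """Michaelis-Menten rate relative to Vmax: [S] / (Km + [S])."""
    return substrate_nM / (km_nM + substrate_nM)


P23_START_NM = concentration_nM(6.0, 30.0)             # standard assay conditions
SATURATION = fraction_of_vmax(P23_START_NM, km_nM=13.0)
```

Under these assumptions the starting P23 concentration is 200 nM, about 15 times the Km, so the initial rate would be roughly 94% of Vmax. That is consistent with the reaction staying proportional to enzyme amount and incubation time while less than half of the substrate is consumed.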
Purification of FGE
For purification of FGE, the soluble fraction (reticuloplasm) of bovine testis microsomes was used as starting material. The specific activity of FGE was 10-20 times higher than in reticuloplasm from bovine pancreas microsomes (Fey et al, J. Biol. Chem., 2001, 276: 47021-47028). FGE was purified by a combination of four chromatography steps. The first two steps were chromatography on a MonoQ anion exchange column and on a concanavalin A-Sepharose column. At pH 8, FGE activity bound to MonoQ and eluted at 50-165 mM NaCl with 60-90% recovery. When this fraction was mixed with concanavalin A-Sepharose, FGE was bound; 30-40% of the starting activity could be eluted with 0.5 M α-methylmannoside. The last two purification steps were chromatography on affinity matrices derivatized with 16-mer peptides. The first affinity matrix was Affigel 10 substituted with a variant of the ASA65-80 peptide in which the residues Cys69, Pro71 and Arg73, critical for FGly formation, were scrambled (scrambled peptide PVSLPTRSCAALLTGR, SEQ ID NO: 34). This peptide did not inhibit FGE activity when added to the in vitro assay at a concentration of 10 mM and, when immobilized on Affigel 10, did not retain FGE activity. Chromatography on the scrambled peptide affinity matrix removed peptide-binding proteins, including chaperones of the endoplasmic reticulum. The second affinity matrix was Affigel 10 substituted with a variant of the ASA65-80 peptide in which Cys69 was replaced by serine (Ser69 peptide PVSLSTPSRAALLTGR, SEQ ID NO: 35). The Ser69 peptide affinity matrix bound FGE efficiently. FGE activity could be eluted with either 2 M KSCN or 25 mM Ser69 peptide with 20-40% recovery. Prior to activity determination, the KSCN or Ser69 peptide had to be removed by dialysis. The substitution of Cys69 by serine was critical for the elution of active FGE: Affigel 10 substituted with the wild-type ASA65-80 peptide bound FGE efficiently, but almost no activity could be recovered in eluates obtained with chaotropic salts (KSCN, MgCl2), peptides (ASA65-80 or Ser69 peptide), or buffers of low or high pH. The polypeptide patterns of the starting material and of the active fractions obtained after the four chromatography steps of a typical purification are shown in FIG. 2. In the final fraction, 5% of the starting FGE activity and 0.0006% of the starting protein were recovered (8333-fold purification).
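The purification factor quoted above follows directly from the two recovery figures, since fold purification is the gain in specific activity (activity per unit protein). A minimal sketch of that arithmetic; the function name is illustrative and not part of the original disclosure:

```python
def fold_purification(activity_recovered_pct: float, protein_recovered_pct: float) -> float:
    """Gain in specific activity: fraction of activity kept divided by fraction of protein kept."""
    if protein_recovered_pct <= 0:
        raise ValueError("protein recovery must be positive")
    return activity_recovered_pct / protein_recovered_pct


# Figures reported for the final fraction: 5% of the activity, 0.0006% of the protein.
fold = fold_purification(5.0, 0.0006)
print(f"{fold:.0f}-fold purification")  # about 8333-fold
```

The same function can be applied to the recoveries of an individual chromatography step if the protein recovery of that step is known.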
Purified 39.5 and 41.5kDa polypeptides are encoded by a single gene
The 39.5 and 41.5 kDa polypeptides in the purified FGE preparation were subjected to peptide mass fingerprinting. The mass spectra of the tryptic peptides of the two polypeptides obtained by MALDI-TOF mass spectrometry largely overlapped, suggesting that the two proteins originate from the same gene. Among the tryptic peptides of both polypeptides, two abundant peptides (MH+ 1580.73, SQNTPDSSASNLGFR, SEQ ID NO: 43; and MH+ 2049.91, MVPIPAGVFTMGTDDPQIK, SEQ ID NO: 44, plus two methionine oxidations) were found that match the protein encoded by a cDNA with GenBank Acc. No. AK075459 (SEQ ID NO: 4). The amino acid sequences of the two peptides were confirmed by MALDI-TOF post-source decay spectra and by MS/MS analysis using offline nano-electrospray ionization (ESI) ion trap mass spectrometry. An EST sequence of the bovine ortholog of the human cDNA, covering the C-terminal part of FGE and matching the sequences of both peptides, provided additional sequence information for bovine FGE.
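Peptide mass fingerprinting compares measured MH+ values with masses computed from candidate sequences. The sketch below computes the monoisotopic MH+ of an unmodified peptide from standard monoisotopic residue masses. It only illustrates the calculation and is not the software used for the analysis described here; the table and function names are assumptions.

```python
# Standard monoisotopic residue masses (Da) of the 20 common amino acids.
MONOISOTOPIC = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # H2O added for the free N- and C-termini
PROTON = 1.00728   # charge carrier of the singly protonated ion


def mh_plus(peptide: str) -> float:
    """Monoisotopic mass of the singly protonated, unmodified peptide."""
    return sum(MONOISOTOPIC[aa] for aa in peptide) + WATER + PROTON


print(f"{mh_plus('SQNTPDSSASNLGFR'):.2f}")  # 1580.73
```

For the first tryptic peptide the computed value agrees with the reported MH+ of 1580.73.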
Evolutionary conservation and domain structure of FGE
The human FGE gene is encoded by SEQ ID NOs: 1 and/or 3 and is located on chromosome 3p26. It spans approximately 105 kb, and the coding sequence is distributed over 9 exons. Three orthologs of the human FGE gene were found, in mouse (87% identity), Drosophila melanogaster (48% identity), and Anopheles gambiae (47% identity). EST sequences of orthologs were found in 8 further species, including cow, pig, Xenopus laevis, Silurana tropicalis, zebrafish, salmon and other fish species (for details, see Example 2). The exon-intron structure is conserved between the human and the mouse gene, and the mouse gene on chromosome 6E2 is located in a region syntenic to human chromosome 3p26. The genomes of S. cerevisiae and C. elegans lack FGE homologs. In prokaryotes, 12 homologs of human FGE were found. The cDNA of human FGE is predicted to encode a protein of 374 residues (FIG. 3 and SEQ ID NO: 2). The protein contains a cleavable signal sequence of 33 residues, which indicates translocation of FGE into the endoplasmic reticulum, and a single N-glycosylation site at Asn 141. The binding of FGE to concanavalin A suggests that this N-glycosylation site is used. Residues 87-367 of FGE are listed in the PFAM protein motif database as a domain of unknown function (PFAM: DUF323). Sequence comparison of human FGE and its eukaryotic orthologs identified in the databases suggests that this domain is composed of three distinct subdomains.
Among the four known eukaryotic FGE orthologs, the N-terminal subdomain (residues 91-154 in human FGE) has a sequence identity of 46% and a similarity of 79%. In human FGE, this subdomain carries the N-glycosylation site at Asn 141, which is conserved in the other orthologs. The middle part of FGE (residues 179-308 in human FGE) is a tryptophan-rich subdomain (12 tryptophans per 129 residues). Within this subdomain, the identity among the eukaryotic orthologs is 57% and the similarity 82%. The C-terminal subdomain (residues 327-366 in human FGE) is the most highly conserved sequence within the FGE family. The sequence identity of the human C-terminal subdomain with the eukaryotic orthologs (3 full-length sequences and 8 ESTs) is 85%, and the similarity 97%. Within the 40 residues of subdomain 3, four cysteine residues are fully conserved. Three of these cysteines are also conserved in the prokaryotic FGE orthologs. The 12 prokaryotic members of the FGE family (for details, see Example 2) share the subdomain structure with eukaryotic FGEs. The boundaries between the three subdomains are more evident in the prokaryotic FGE family because non-conserved sequences of variable length separate the subdomains from each other. The human and the mouse genome each encode a closely related homolog (paralog) of FGE (SEQ ID NOs: 43 and 44, GenBank Acc. No. NM_015411, in human; and SEQ ID NOs: 45 and 46, GenBank Acc. No. AK076022, in mouse). The two paralogs are 86% identical. Their genes are located in syntenic chromosomal regions (7q11 in human, 5G1 in mouse). Both paralogs share the subdomain structure with the FGE orthologs and are 35% identical and 47% similar to human FGE. In the third subdomain, which is 100% identical in the two paralogs, the cysteine-containing undecamer sequence of subdomain 3 is missing.
Expression, subcellular localization and molecular forms
A single transcript of 2.1 kb is detectable by Northern blot analysis of total RNA from fibroblasts and of poly(A)+ RNA from heart, brain, placenta, lung, liver, skeletal muscle, kidney and pancreas. Relative to β-actin RNA, its abundance varies by one order of magnitude and is highest in pancreas and kidney and lowest in brain. Various eukaryotic cell lines stably or transiently expressing the cDNA of human FGE, or cDNAs of FGE derivatives C-terminally extended by an HA-, Myc- or His6-tag, were used to analyze FGE activity and the subcellular localization of FGE. Transient expression of tagged or untagged FGE increased FGE activity 1.6- to 3.9-fold. Stable expression of FGE in PT67 cells increased FGE activity about 100-fold. Detection of the tagged FGE forms by indirect immunofluorescence in BHK 21, CHO and HT1080 cells showed colocalization of the various tagged FGE forms with protein disulfide isomerase, a luminal protein of the endoplasmic reticulum. Western blot analysis of extracts from BHK 21 cells transiently transfected with cDNA encoding tagged forms of FGE showed a single immunoreactive band with an apparent size between 42 and 44 kDa.
FGE gene carries mutations in multiple sulfatase deficiency
Multiple sulfatase deficiency is caused by a deficiency in generating FGly residues in sulfatases (Schmidt, B. et al, Cell, 1995, 82: 271-278). The FGE gene is therefore a candidate gene for multiple sulfatase deficiency. We amplified and sequenced the FGE-encoding cDNA of 7 multiple sulfatase deficiency patients and found ten different mutations, which were confirmed by sequencing the genomic DNA (Table 1).
Table 1: mutations in MSD patients
| Mutation | Effect on protein | Remarks | Patient |
| 1076C>A | S359X | Truncation of the C-terminal 16 residues | 1* |
| IVS3+5-8 del | Deletion of residues 149-173 | In-frame deletion of exon 3 | 1, 2 |
| 979C>T | R327X | Loss of subdomain 3 | 2 |
| 1045C>T | R349W | Substitution of a conserved residue in subdomain 3 | 3, 7 |
| 1046G>A | R349Q | Substitution of a conserved residue in subdomain 3 | 4 |
| 1006T>C | C336R | Substitution of a conserved residue in subdomain 3 | 4 |
| 836C>T | A279V | Substitution of a conserved residue in subdomain 2 | 5 |
| 243delC | Frameshift and truncation | Loss of all three subdomains | 5 |
| 661delG | Frameshift and truncation | Loss of the C-terminal third of FGE, including subdomain 3 | 6** |
| IVS6-1G>A | Deletion of residues 281-318 | In-frame deletion of exon 7 | 5 |
* Patient 1 is the multiple sulfatase deficiency patient Mo. described in Schmidt, B. et al, Cell, 1995, 82: 271-278 and Rommerskirch and von Figura, Proc. Natl. Acad. Sci. USA, 1992, 89: 2561-2565.
** Patient 6 is the multiple sulfatase deficiency patient reported by Burk et al, J. Pediatr., 1984, 104: 574-578.
The other patients represent previously unreported cases.
The first patient was heterozygous for a 1076C>A substitution converting the codon for serine 359 into a stop codon (S359X) and for a mutation causing the deletion of the 25 residues 149-173, which are encoded by exon 3 and separate the first and the second subdomain of the protein. Genomic sequencing revealed a deletion of nucleotides +5 to +8 of the third intron (IVS3+5-8 del), which destroys the splice donor site of intron 3. The second patient was heterozygous for the mutation causing the loss of exon 3 (IVS3+5-8 del) and for a 979C>T substitution converting the codon for arginine 327 into a stop codon (R327X). The truncated FGE encoded by the 979C>T allele lacks most of subdomain 3. The third patient was homozygous for a 1045C>T substitution replacing the conserved arginine 349 in subdomain 3 by tryptophan (R349W). The fourth patient was heterozygous for two missense mutations replacing conserved residues in the FGE domain: a 1046G>A substitution replacing arginine 349 by glutamine (R349Q) and a 1006T>C substitution replacing cysteine 336 by arginine (C336R). The fifth patient was heterozygous for an 836C>T substitution replacing the conserved alanine 279 by valine (A279V). The second mutation was a single nucleotide deletion (243delC) that changes the sequence after proline 81 and causes a translation stop after residue 139. The sixth patient was heterozygous for the deletion of a single nucleotide (661delG) that changes the amino acid sequence after residue 220 and introduces a stop codon after residue 266. The second mutation was a splice acceptor site mutation of intron 6 (IVS6-1G>A) causing an in-frame deletion of exon 7, which encodes residues 281-318. In the seventh patient, the same 1045C>T substitution was found as in the third patient. In addition, we detected two polymorphisms in the coding region of 18 FGE alleles from controls and multiple sulfatase deficiency patients: 22% carried a 188G>A substitution replacing serine 63 by asparagine (S63N), and 28% a silent 1116C>T substitution.
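The residue numbers given for the point mutations can be checked against their cDNA positions. The sketch below assumes that nucleotide 1 is the A of the ATG start codon, a numbering convention that the text does not state explicitly; the function names are illustrative.

```python
def codon_number(cdna_position: int) -> int:
    """1-based codon (residue) number containing a 1-based coding-sequence position."""
    return (cdna_position + 2) // 3


def position_in_codon(cdna_position: int) -> int:
    """Position (1, 2 or 3) of the nucleotide within its codon."""
    return (cdna_position - 1) % 3 + 1


# cDNA position -> residue number, as reported in the text and in Table 1.
reported = {1076: 359, 979: 327, 1045: 349, 1046: 349, 1006: 336, 836: 279, 188: 63}

for pos, residue in reported.items():
    assert codon_number(pos) == residue
    print(pos, "-> codon", codon_number(pos), "position", position_in_codon(pos))
```

Under this assumption all seven substitutions fall into the codons named in the text, and the silent 1116C>T polymorphism affects the third position of codon 372.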
Transduction of multiple sulfatase deficient fibroblasts with wild-type and mutant FGE cDNAs
To confirm that FGE deficiency is the cause of the inactivity of sulfatases synthesized in multiple sulfatase deficiency, we expressed the FGE cDNA in multiple sulfatase deficiency fibroblasts using retroviral gene transfer. As a control, we transduced the retroviral vector without a cDNA insert. To monitor the complementation of the metabolic defect, the activities of ASA, steroid sulfatase (STS) and N-acetylgalactosamine 6-sulfatase (GalNAc6S) were measured in the transduced fibroblasts before or after selection. Transduction of wild-type FGE partially restored the catalytic activity of the three sulfatases in two multiple sulfatase deficiency cell lines (Table 2), and that of STS in a third multiple sulfatase deficiency cell line. It should be noted that for ASA and GalNAc6S the restoration was only partial after selection of the fibroblasts, reaching 20 to 50% of normal activity. For STS, the activity was restored to the level of control fibroblasts after selection. Selection increased the activity of ASA and STS by 50 to 80%, which is consistent with the earlier observation that 15 to 50% of the fibroblasts become transduced (Lübke et al, Nat. Genet., 2001, 28: 73-76). The sulfatase activities in multiple sulfatase deficiency fibroblasts transduced with the retroviral vector alone (Table 2) were comparable to those in non-transduced multiple sulfatase deficiency fibroblasts (not shown). Transduction of FGE cDNA carrying the IVS3+5-8del mutation failed to restore the sulfatase activities (Table 2).
Table 2: Complementation of multiple sulfatase deficiency fibroblasts by transduction with wild-type or mutant FGE cDNA
1 The values give the ratio between the activity of ASA (mU/mg cell protein), STS (μU/mg cell protein) or GalNAc6S (μU/mg cell protein) and that of β-hexosaminidase (U/mg cell protein). For control fibroblasts, the mean and the variation of 6-11 cell lines are given. Where indicated, the range of two cultures transduced in parallel is given for multiple sulfatase deficiency fibroblasts.
The number of the multiple sulfatase deficiency fibroblasts refers to the number of the patient in Table 1.
+ Activity determination prior to selection.
++ Activity determination after selection.
n.d.: not determined.
Discussion
FGE is a highly conserved glycoprotein of the endoplasmic reticulum
Purification of FGE from bovine testis yielded two polypeptides of 39.5 and 41.5 kDa, which originate from the same gene. The expression of three differently tagged versions of FGE as a single form in three different eukaryotic cell lines suggests that one of the two forms observed in the FGE preparation purified from bovine testis may have been generated by limited proteolysis during purification. The substitution of Cys69 in the ASA65-80 peptide by serine was critical for the purification of FGE by affinity chromatography. FGE has a cleavable signal sequence that mediates translocation across the membrane of the endoplasmic reticulum. The greater part of the mature protein (275 of 340 residues) defines a unique domain, which is likely to be composed of three subdomains (see Example 2) and for which no homologs exist in proteins of known function. The recognition of the linear FGly modification motif in newly synthesized sulfatase polypeptides (Dierks et al, EMBO J., 1999, 18: 2084-2091) could be the function of one FGE subdomain. The catalytic domain could catalyze FGly formation in several ways. It has been proposed that FGE abstracts electrons from the thiol group of the cysteine and transfers them to an acceptor. The resulting thioaldehyde would spontaneously hydrolyze to FGly and H2S (Schmidt, B. et al, Cell, 1995, 82: 271-). Alternatively, FGE could act as a mixed-function oxygenase (monooxygenase), introducing one atom of O2 into the cysteine and the other into H2O with the help of an electron donor such as FADH2. The resulting thioaldehyde hydrate derivative of cysteine would spontaneously react to FGly and H2S. Preliminary experiments with a partially purified FGE preparation showed a critical dependence of FGly formation on molecular oxygen. This would suggest that FGE acts as a mixed-function oxygenase. The particularly high conservation of subdomain 3 and the presence of three fully conserved cysteine residues in it make this subdomain a likely candidate for the catalytic site. It will be of interest to see whether the structural elements mediating the recognition of the FGly motif and the binding of an electron acceptor or electron donor correlate with the domain structure of FGE.
Recombinant FGE is localized in the endoplasmic reticulum, which is consistent with the proposed site of its action. FGly residues are generated in newly synthesized sulfatases during or shortly after their translocation into the endoplasmic reticulum (Dierks et al, Proc. Natl. Acad. Sci. U.S.A., 1997, 94: 11963-11968; Dierks et al, FEBS Lett., 1998, 423: 61-65). FGE itself does not contain an ER-retention signal of the KDEL type. Its retention in the endoplasmic reticulum may therefore be mediated by interaction with other ER proteins. Components of the translocation/N-glycosylation machinery are attractive candidates for such interacting partners.
Mutations in FGE that result in multiple sulfatase deficiency
We have shown that mutations in the gene encoding FGE cause multiple sulfatase deficiency. FGE may also interact with other components, and defects in the genes encoding the latter could equally well cause multiple sulfatase deficiency. In seven multiple sulfatase deficiency patients we indeed found ten different mutations in the FGE gene. All mutations have severe effects on the FGE protein, by replacing highly conserved residues in subdomain 3 (three mutations) or subdomain 2 (one mutation), or by C-terminal truncations of various lengths (four mutations) or large in-frame deletions (two mutations). For two multiple sulfatase deficiency cell lines and one of the multiple sulfatase deficiency mutations, it was shown that transduction of the wild-type, but not of the mutant, FGE cDNA partially restores the sulfatase activities. This clearly identifies the FGE gene as the site of mutation and establishes the pathogenic nature of the mutation. Multiple sulfatase deficiency is both clinically and biochemically heterogeneous. A rare neonatal form, which presents at birth and develops hydrocephalus; a common form, which initially resembles infantile metachromatic leukodystrophy and subsequently develops ichthyosis- and mucopolysaccharidosis-like features; and a less frequent mild form, in which the clinical features of a mucopolysaccharidosis prevail, have been distinguished. Biochemically, the disease is characterized by residual sulfatase activities, which in cultured skin fibroblasts are in most cases below 10% of those of controls (Burch et al, Clin. Genet., 1986, 30: 409-15; Basner et al, Pediatr. Res., 1979, 13: 1316-). However, in some multiple sulfatase deficiency cell lines the activity of selected sulfatases can reach the normal range (Yutaka et al, Clin. Genet., 1981, 20: 296-303). In addition, the residual activities have been reported to vary with cell culture conditions and unknown factors. Biochemically, multiple sulfatase deficiency has been classified into two groups. In group I, the residual activity of the sulfatases is below 15%, including that of ASB. In group II, the residual activity of the sulfatases is higher, and that of ASB in particular may reach 50-100% of the control. All patients reported here fall into group I, except patient 5, who falls into group II of the biochemical phenotype (ASB activity within the control range). Based on clinical criteria, patients 1 and 6 are neonatal cases, patients 2-4 and 7 have the common form, and patient 5 has the mucopolysaccharidosis-like form of multiple sulfatase deficiency.
The phenotypic heterogeneity suggests that the different mutations in multiple sulfatase deficiency patients are associated with different residual activities of FGE. Preliminary data on PT67 cells stably expressing FGE-IVS3+5-8del indicate that the in-frame deletion of exon 3 abolishes FGE activity completely. Characterization of the mutations in multiple sulfatase deficiency, of the biochemical properties of the mutant FGE, and of the residual FGly content of sulfatases using a recently developed, highly sensitive mass spectrometric method (Peng et al, J. Mass Spec., 2003, 38: 80-86) will provide a better understanding of the genotype-phenotype correlation in multiple sulfatase deficiency.
Example 2:
The human FGE gene defines a new sulfatase-modifying gene family that is conserved from prokaryotes to eukaryotes
Bioinformatics
Signal peptides and cleavage sites were predicted with the method of von Heijne (Nucleic Acids Res., 1986, 14: 4683), as implemented in EMBOSS (Rice et al, Trends in Genetics, 2000, 16: 276-277), and with the method of Nielsen et al (Protein Engineering, 1997, 10: 1-6). N-glycosylation sites were predicted using the algorithm of Brunak (Gupta and Brunak, Pac. Symp. Biocomput., 2002, 310-22).
Functional domains were detected by searching the PFAM Hidden Markov Models (version 7.8) (Sonnhammer et al, Nucleic Acids Res., 1998, 26: 320-322). The sequences of the PFAM DUF323 seed were obtained from TrEMBL (Bairoch, A. and Apweiler, R., Nucleic Acids Res., 2000, 28: 45-48). Multiple alignments and phylogenetic trees were constructed with ClustalW (Thompson, J. et al, Nucleic Acids Res., 1994, 22: 4673-). For the calculation of phylogenetic trees, gap positions were excluded and multiple substitutions were corrected. Tree bootstrapping was performed to obtain significant results. Trees were visualized using Njplot (Perriere, G. and Gouy, M., Biochimie, 1996, 78: 364-). Alignments were plotted using the prettyplot command from EMBOSS.
To search for FGE homologs, the databases NR, NT and EST of the National Center for Biotechnology Information (NCBI) (Wheeler et al, Nucleic Acids Res., 2002, 20: 13-16) were queried with BLAST (Altschul et al, Nucleic Acids Res., 1997, 25: 3389-). For protein sequences, the search was performed using iterative converging Psi-Blast against the current version of the NR database with an expectation value cutoff of 10^-40 and default parameters. Convergence was reached after 5 iterations. For nucleotide sequences, the search was performed with Psi-TBlastn: using NR and the protein sequence of human FGE as input, a scoring matrix for hFGE was built with iterative converging Psi-Blast. This matrix was used as input for blastall to query the nucleotide databases NT and EST. For both steps, an expectation value cutoff of 10^-20 was used.
Protein secondary structure prediction was accomplished using Psipred (Jones, D., J Mol biol., 1999, 292: 1950-.
Similarity scores of the subdomains were computed from alignments using the cons algorithm from EMBOSS with default parameters. Meta-alignments were generated by aligning the consensus sequences of the FGE family subgroups. Genomic locus organization and synteny were determined using the human and mouse genome resources of NCBI (Bethesda, Md.) and the Human-Mouse-Rat Synteny resource of Softberry (Mount Kisco, NY). Bacterial genome sequences were downloaded from the NCBI FTP server. The NCBI microbial genome annotation was used to obtain an overview of the genomic loci of bacterial FGE genes.
Results and discussion
Basic features and motifs of human FGE and related proteins
The human FGE gene (SEQ ID NOs: 1, 3) encodes the FGE protein, which is predicted to have 374 residues (SEQ ID NO: 2). A cleavage signal between residues 22-33 (Heijne score of 15.29) and a hydropathy score (Kyte, J. and Doolittle, R., J Mol Biol., 1982, 157: 105-132) of residues 17-29 between 1.7 and 3.3 indicate that the 33 N-terminal residues are cleaved off after ER translocation. However, with the algorithm of Nielsen et al (Protein Engineering, 1997, 10: 1-6), cleavage of the signal sequence is predicted after residue 34. The protein has a single potential N-glycosylation site at Asn 141.
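Potential N-glycosylation sites are defined by the consensus sequon Asn-X-Ser/Thr, where X is any residue except proline. The sketch below scans a sequence for this sequon with a regular expression. It is a simplified stand-in for the neural-network predictor of Gupta and Brunak used in this example, and the test sequence is a made-up toy peptide, not the FGE sequence.

```python
import re

# Asn followed by any residue except Pro, then Ser or Thr.
# The lookahead keeps overlapping sequons from being skipped.
SEQUON = re.compile(r"N(?=[^P][ST])")


def n_glycosylation_sites(sequence: str) -> list[int]:
    """Return 1-based positions of Asn residues inside an N-X-S/T sequon (X != Pro)."""
    return [match.start() + 1 for match in SEQUON.finditer(sequence.upper())]


toy_peptide = "MANGSAPNPTQNLT"  # hypothetical sequence for illustration only
print(n_glycosylation_sites(toy_peptide))  # [3, 12]
```

In the toy peptide, Asn 8 is not reported because it is followed by proline. A sequon is necessary but not sufficient for glycosylation, which is why actual use of the site at Asn 141 is inferred from the binding of FGE to concanavalin A.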
A search of the protein motif database PFAM (Sonnhammer et al, Nucleic Acids Res., 1998, 26: 320-322) with the FGE protein sequence showed that residues 87-367 of human FGE can be classified as the protein domain DUF323 ("domain of unknown function", PF03781) with a highly significant expectation value of 7.9 × 10^-114. The PFAM seed defining DUF323 consists of 25 protein sequences, the majority of which are hypothetical proteins derived from sequencing data. To analyze the relationship between human FGE and DUF323, a multiple alignment of FGE with the sequences of the DUF323 seed was performed. Based on this alignment, a phylogenetic tree was constructed and bootstrapped. Four of the hypothetical sequences (TrEMBL-IDs Q9CK12, Q9I761, O94632, and Q9Y405) diverged so strongly from the other members of the seed that they prevented successful bootstrapping and had to be removed from the set. Fig. 2 shows the bootstrapped tree displaying the relationship between human FGE and the remaining 21 DUF323 seed proteins. The tree can be used to subdivide the seed members into two categories: homologs closely related to human FGE, and the remaining, less related genes.
The topmost 7 proteins have a phylogenetic distance to human FGE between 0.41 and 0.73. They contain only a single domain, DUF323. The homology within this group extends over the whole amino acid sequence, the greater part of which consists of the DUF323 domain. The DUF323 domain is strongly conserved within this group of homologs, whereas the other 15 proteins of the seed are less related to human FGE (phylogenetic distance between 1.14 and 1.93). Their DUF323 domain diverges considerably from the highly conserved DUF323 domain of the first group (see section "FGE subdomains and mutations in the FGE gene"). Most of these 15 proteins are hypothetical, and 6 of them have been further investigated. One of them, a serine/threonine kinase from C. trachomatis (TrEMBL: O84147), contains further domains in addition to DUF323: an ATP-binding domain and a kinase domain. The sequences from R. sphaeroides (TrEMBL: Q9ALV8) and Pseudomonas sp. (TrEMBL: O52577) encode the protein NirV, whose gene is cotranscribed with that of the copper-containing nitrite reductase nirK (Jain, R. and Shapleigh, J., Microbiology, 2001, 147: 2505-). CarC (TrEMBL: Q9XB56) is an oxygenase involved in the synthesis of a β-lactam antibiotic in E. carotovora (McGowan, S. et al, Mol Microbiol., 1996, 22: 415-426; Khaleeli, N. et al, Biochemistry, 2000, 39: 8666-8673). XylR (TrEMBL: O31397) and BH0900 (TrEMBL: Q9KEF2) are enhancer-binding proteins involved in the regulation of pentose utilization in Bacillaceae and Clostridiaceae (Rodionov, D. et al, FEMS Microbiol Lett., 2001, 205: 305-). The comparison of FGE and DUF323 led to the establishment of a homology threshold differentiating the FGE family from distant DUF323-containing homologs with different functions. The DUF323-containing proteins thus include a serine/threonine kinase and XylR (a transcription enhancer) as well as FGE (an FGly-generating enzyme) and CarC (an oxygenase). As discussed elsewhere herein, FGE might also exert its cysteine-modifying function as an oxygenase, suggesting that FGE and non-FGE members of the DUF323 seed may share an oxygenase function.
Homologs of FGE
The presence of closely related homologs of human FGE in the DUF323 seed led us to search the NCBI's NR database for homologs of human FGE (Wheeler et al, nucleic acids Res., 2002, 20: 13-16). The threshold for the search was chosen in such a way that all 6 homologs and other closely related homologs present in the DUF323 seed were obtained without finding other seed members. This search resulted in the identification of 3 FGE orthologs in eukaryotes, 12 orthologs in prokaryotes, and 2 paralogs in human and mouse (table 3).
Table 3: FGE gene family in eukaryotes and prokaryotes
| SEQ ID NOs: NA, AA [GI] | Species | Length [AA] | Subgroup |
| 1/3, 2 | Human (Homo sapiens) | 374 | E1 |
| 49, 50 [22122361] | Mouse (Mus musculus) | 372 f | E1 |
| 51, 52 [20130397] | Fruit fly (Drosophila melanogaster) | 336 | E1 |
| 53, 54 [21289310] | Anopheles gambiae | 290 | E1 |
| 47, 48 [26344956] | Mouse (Mus musculus) | 308 | E2 |
| 45, 46 [24308053] | Human (Homo sapiens) | 301 | E2 |
| 55, 56 [21225812] | Streptomyces coelicolor A3(2) | 314 | P1 |
| 57, 58 [25028125] | Corynebacterium efficiens YS-314 | 334 | P1 |
| 59, 60 [23108562] | Novosphingobium aromaticivorans | 338 | P2 |
| 61, 62 [13474559] | Mesorhizobium loti | 372 | P2 |
| 63, 64 [22988809] | Burkholderia fungorum | 416 | P2 |
| 65, 66 [16264068] | Sinorhizobium meliloti | 303 | P2 |
| 67, 68 [14518334] | Microscilla sp. | 354 | P2 |
| 69, 70 [26990068] | Pseudomonas putida KT2440 | 291 | P2 |
| 71, 72 [22975289] | Ralstonia metallidurans | 259 | P2 |
| 73, 74 [23132010] | Prochlorococcus marinus | 291 | P2 |
| 75, 76 [16125425] | Caulobacter crescentus CB15 | 338 | P2 |
| 77, 78 [15607852] | Mycobacterium tuberculosis H37Rv | 299 | P2 |
GI: GenBank protein identifier
NA: nucleic acid; AA: amino acid
E1: eukaryotic orthologs; E2: eukaryotic paralogs
P1: closely related prokaryotic orthologs; P2: other prokaryotic orthologs
f: protein sequence mispredicted in GenBank
Note that the mouse sequence GI 22122361 is predicted in GenBank to encode a protein of 284 amino acids, although the cDNA sequence NM_145937 encodes a protein of 372 residues. This misprediction is due to the omission of the first exon of the murine FGE gene. All sequences found in the NR database are from higher eukaryotes or prokaryotes. No FGE homologs were detected in archaebacteria or plants. Searches at even lower thresholds in the fully sequenced genomes of C. elegans and S. cerevisiae and the related ORF databases did not reveal any homologs. A search in the eukaryotic sequences of the NT and EST nucleotide databases led to the identification of 8 additional FGE ortholog ESTs with 3'-terminal cDNA sequence fragments that show high conservation at the protein level and are not listed in the NR database. These sequences do not comprise the full coding part of the mRNAs and are all from higher eukaryotes (Table 4).
Table 4: FGE ortholog EST fragments in eukaryotes
| SEQ ID NO: NA [GB] | Species |
| 80 [CA379852] | Rainbow trout (Oncorhynchus mykiss) |
| 81 [AI721440] | Zebrafish (Danio rerio) |
| 82 [BJ505402] | Medaka (Oryzias latipes) |
| 83 [BJ054666] | Xenopus laevis |
| 84 [AL892419] | Silurana tropicalis |
| 85 [CA064079] | Atlantic salmon (Salmo salar) |
| 86 [BF189614] | Pig (Sus scrofa) |
| 87 [AV609121] | Cattle (Bos taurus) |
GB: GenBank accession number; NA: nucleic acid
The construction of a multiple alignment and of a phylogenetic tree (using ClustalW) of the coding sequences from the NR database allowed four subgroups of homologs to be defined: eukaryotic orthologs (human, mouse, mosquito and fruit fly FGE), eukaryotic paralogs (human and mouse FGE paralogs), prokaryotic orthologs closely related to FGE (Streptomyces and Corynebacterium), and other prokaryotic orthologs (Caulobacter, Pseudomonas, Mycobacterium, Prochlorococcus, Mesorhizobium, Sinorhizobium, Novosphingobium, Ralstonia, Burkholderia, and Microscilla). The eukaryotic orthologs show an overall identity to human FGE of 87% (mouse), 48% (fruit fly) and 47% (Anopheles). While FGE orthologs are found in prokaryotes and higher eukaryotes, they are absent from the fully sequenced genomes of lower eukaryotes phylogenetically situated between S. cerevisiae and D. melanogaster. In addition, FGE homologs are absent from the fully sequenced genomes of E. coli and the pufferfish.
As discussed elsewhere herein, the FGE paralogs found in human and mouse may have minor FGly-generating activity and contribute to the residual sulfatase activities found in multiple sulfatase deficiency patients.
Subdomains of FGE
The members of the FGE gene family have three highly conserved parts/domains (as described elsewhere herein). In addition to the two non-conserved sequences separating these parts, they have non-conserved extensions at the N- and C-terminus. The three conserved parts are considered to represent subdomains of the DUF323 domain because they are separated by non-conserved parts of varying length. The length of the part separating subdomains 1 and 2 varies between 22 and 29 residues, and that of the part separating subdomains 2 and 3 between 7 and 38 amino acids. The N- and C-terminal non-conserved parts show an even stronger variation in length (N-terminal: 0-90 amino acids; C-terminal: 0-28 amino acids). The sequence of the FGE gene from Ralstonia metallidurans is probably incomplete, as it lacks the first subdomain.
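For human FGE, the lengths of the non-conserved parts can be derived from the subdomain boundaries given in Example 1 (residues 91-154, 179-308 and 327-366 of the 374-residue protein). The sketch below performs this bookkeeping; the function and variable names are illustrative.

```python
HUMAN_FGE_LENGTH = 374
# Subdomain boundaries in human FGE (1-based, inclusive), as given in Example 1.
HUMAN_SUBDOMAINS = [(91, 154), (179, 308), (327, 366)]


def spacer_lengths(protein_length, subdomains):
    """Return (N-terminal extension, [linker lengths], C-terminal extension) in residues."""
    n_terminal = subdomains[0][0] - 1
    linkers = [nxt[0] - cur[1] - 1 for cur, nxt in zip(subdomains, subdomains[1:])]
    c_terminal = protein_length - subdomains[-1][1]
    return n_terminal, linkers, c_terminal


n_term, linkers, c_term = spacer_lengths(HUMAN_FGE_LENGTH, HUMAN_SUBDOMAINS)
print(n_term, linkers, c_term)  # 90 [24, 18] 8
```

With these boundaries the human protein has linkers of 24 and 18 residues and terminal extensions of 90 and 8 residues, all within the ranges stated for the family.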
To verify the plausibility of defining subdomains of DUF323, we performed a secondary structure prediction of the human FGE protein using Psipred. The hydrophobic ER signal (residues 1-33) is predicted to contain helix structures, confirming the signal prediction of the von Heijne algorithm. The N-terminal non-conserved region (amino acids 34-89) and the spacer region between subdomains 2 and 3 (amino acids 308-327) contain coiled sections. The region separating subdomains 1 and 2 contains a coil. The alpha-helix at amino acids 65/66 has a low prediction confidence and is probably a prediction artifact. The subdomain boundaries are situated within coils and do not interrupt alpha-helices or beta-strands. The first subdomain consists of several beta-strands and one alpha-helix, and the second subdomain contains two beta-strands and four alpha-helices. The third subdomain has an alpha-helix flanked by a beta-sheet at the beginning and at the end of the subdomain. In summary, the secondary structure is in agreement with the proposed subdomain structure, as the subdomain boundaries are situated within coils and the subdomains contain the structural elements alpha-helix and beta-strand.
It should be noted that no subdomains exist as separate modules in the sequences listed in the database. Within each of the four subgroups of FGE family, subdomains are highly conserved, while the third subdomain shows the highest homology (table 5). This subdomain also shows the strongest homology between subgroups.
Table 5: Homology of the subdomains of the FGE family (% similarity)
E1: eukaryotic orthologs; E2: eukaryotic paralogs
P1: closely related prokaryotic orthologs; P2: other prokaryotic orthologs
The first subdomain of the FGE family shows the weakest homology between subgroups. In eukaryotic orthologs, it carries an N-glycosylation site: residue Asn 141 in human, Asn 139 in mouse and Asn 120 in Drosophila. In Anopheles, no asparagine is found at residue 130, which is homologous to Drosophila melanogaster Asn 120. However, a change of two nucleotides would create an N-glycosylation site at Asn 130 in Anopheles. The sequence comprising residue 130 therefore needs to be resequenced. The second subdomain is rich in tryptophan, with 12 Trp among the 129 residues of human FGE. Ten of these tryptophans are conserved in the FGE family.
Subdomain 3 is highly conserved: between eukaryotic orthologs it is 100% similar and 90% identical. The importance of the third subdomain for protein function is underscored by the observation that this subdomain is a hotspot for pathogenic mutations in multiple sulfatase deficiency patients. Seven of the nine mutations identified in the six multiple sulfatase deficiency patients described in Example 1 are located in the sequence encoding the 40 residues of subdomain 3. These residues contain four cysteines, three of which are conserved in prokaryotic and eukaryotic orthologs. The two eukaryotic paralogs show the lowest homology to the other members of the FGE family; for example, they lack two of the three conserved cysteines of subdomain 3. Features conserved between the subdomain 3 sequences of orthologs and paralogs are the initial RVXXGG(A)S motif (SEQ ID NO: 79), a heptapeptide containing three arginines (residues 19-25 of the subdomain consensus sequence), and the terminal GFR motif. Comparison with the DUF323 domains of 15 seed sequences that are not close homologs of FGE showed significant sequence differences: the 15 seed sequences have less conserved first and second subdomains, although the overall subdomain structure is still visible. Subdomain 3, which is highly conserved in the FGE family, is shorter in the seed sequences and has significantly weaker homology to eukaryotic subdomain 3 (about 20% similarity) than prokaryotic FGE family members have (about 60% similarity). The seed sequences lack all of the conserved cysteine residues of subdomain 3. The only conserved features are the initial RVXXGG(A)S motif (SEQ ID NO: 79) and the terminal GFR motif.
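The two motifs that remain conserved across subdomain 3 sequences can be written as simple patterns. The following is only an illustrative sketch: it assumes that the notation RVXXGG(A)S means R-V-X-X-G-(G or A)-S, with the parentheses marking an alternative residue, and the test peptides are invented for the example and are not FGE sequences.

```python
import re

# Assumed reading of RVXXGG(A)S: R, V, any two residues, G, then G or A, then S.
INITIAL_MOTIF = re.compile(r"RV..G[GA]S")
TERMINAL_MOTIF = re.compile(r"GFR")


def has_subdomain3_features(seq: str) -> bool:
    """Return True if the initial motif occurs and a GFR motif follows it."""
    start = INITIAL_MOTIF.search(seq)
    if start is None:
        return False
    # Only accept a GFR that lies downstream of the initial motif.
    return TERMINAL_MOTIF.search(seq, start.end()) is not None


# Hypothetical peptides for illustration only (not real FGE sequences).
print(has_subdomain3_features("MKRVLQGGSWACDEKLMGFRCA"))  # both motifs present
print(has_subdomain3_features("MKRVLQGASWACDEKLMGFRCA"))  # Ala variant of the motif
print(has_subdomain3_features("MKRVLQGGSWACDEKLM"))       # no terminal GFR
```

A scan of this kind only flags candidate sequences; it says nothing about the conserved cysteines or the overall similarity values discussed above.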
Genomic organization of human and murine FGE genes
The human FGE gene is located on chromosome 3p26. It spans 105 kb, and its coding sequence is distributed over 9 exons. The murine FGE gene is 80 kb long and is located on chromosome 6E2. The 9 exons of the murine FGE gene are almost the same size as the human exons (fig. 3). The main differences between the human and mouse genes are the lower conservation of the 3'-UTR in exon 9 and the length of exon 9, which is 461 bp longer in the mouse gene. Segment 6E2 of mouse chromosome 6 is highly syntenic with human chromosome segment 3p26. Towards the telomere, both the human and murine FGE loci are flanked by the genes encoding LMCD1, KIAA0212, ITPR1, AXCAM and IL5RA. Towards the centromere, both FGE loci are flanked by the CAV3 and OXTR loci.
Genomic organization of prokaryotic FGE genes
In prokaryotes, sulfatases are classified as either cysteine-type or serine-type sulfatases based on the residue that is converted to FGly in their active centers (Miech, C. et al, J Biol Chem., 1998, 273: 4835-4837; Dierks, T. et al, J Biol Chem., 1998, 273: 25560-25564). Serine-type sulfatases are part of an operon with AtsB, which encodes a cytoplasmic protein that contains iron-sulfur cluster motifs and is essential for the generation of FGly from serine residues in Klebsiella pneumoniae, Escherichia coli and Yersinia pestis (Marquordt, C. et al, J Biol Chem., 2003, 278: 2212-.
It is therefore of interest to examine whether prokaryotic FGE genes are located in the vicinity of cysteine-type sulfatases, which are the substrates of FGE. Of the prokaryotic FGE genes shown in Table 3, 7 are from fully sequenced genomes, allowing analysis of the genomic neighborhood of the FGE locus. Indeed, in 4 of the 7 genomes (C. efficiens: PID 25028125, Pseudomonas putida: PID 26990068, Caulobacter crescentus: PID 16125425 and Mycobacterium tuberculosis: PID 15607852), a cysteine-type sulfatase was found directly adjacent to FGE, which is consistent with co-transcription of FGE and the sulfatase. In two of them (C. efficiens and Pseudomonas putida), FGE and the sulfatase even have overlapping ORFs, strongly indicating their co-expression. Furthermore, the genomic neighborhood of the FGE and sulfatase genes in these four prokaryotes provides additional evidence for the hypothesis that the bacterial FGEs are functional orthologs.
The remaining three organisms do contain cysteine-type sulfatases (Streptomyces coelicolor: PID 24413927, Rhizobium: PID 13476324, Sinorhizobium meliloti: PIDs 16262963, 16263377, 15964702); however, the genes adjacent to FGE in these organisms do not contain the canonical sulfatase signature (Dierks, T. et al, J Biol Chem., 1998, 273: 25560-. Thus, in these organisms, the expression of FGE and the cysteine-type sulfatases is likely to be regulated in trans.
Conclusion
The identification of human FGE, whose deficiency causes the autosomal recessively transmitted lysosomal storage disease multiple sulfatase deficiency, allows the definition of a new gene family comprising FGE orthologs from prokaryotes and eukaryotes as well as FGE paralogs in mouse and human. FGE is not found in the fully sequenced genomes of E. coli, S. cerevisiae, C. elegans and Fugu rubripes. Furthermore, there is a phylogenetic gap between prokaryotes and higher eukaryotes: FGE is absent from all species located phylogenetically between prokaryotes and Drosophila. However, some of these lower eukaryotes, such as Caenorhabditis elegans, have cysteine-type sulfatase genes. This points to the existence of a second FGly-generating system acting on cysteine-type sulfatases. This hypothesis is supported by the observation that E. coli, which lacks FGE, is able to generate FGly in cysteine-type sulfatases (Dierks, T. et al, J Biol Chem., 1998, 273: 25560-.
Example 3:
FGE expression causes a significant increase in sulfatase activity in cell lines overexpressing sulfatases
We wanted to test the effect of FGE on cells expressing/overexpressing a sulfatase. To this end, HT-1080 cells expressing the human sulfatases iduronate-2-sulfatase (I2S) or N-acetylgalactosamine 6-sulfatase (GALNS) were transfected in duplicate with the FGE expression construct pXMG.1.3 (Table 7 and fig. 4) or with the control plasmid pXMG.1.2 (FGE in antisense orientation, unable to produce functional FGE; Table 7). Media samples were harvested 24, 48 and 72 hours after a medium change performed 24 hours after electroporation. The media samples were tested for the respective sulfatase activity by activity assay, and total sulfatase protein levels were estimated by ELISA specific for iduronate-2-sulfatase or N-acetylgalactosamine 6-sulfatase.
Table 6: Transfected cell lines expressing sulfatases, used as substrates for transfection
| Cell line | Plasmids | Expressed sulfatase |
| 36F | pXFM4A.1 | N-acetylgalactosamine 6-sulfatase |
| 30C6 | pXI2S6 | Iduronate-2-sulfatase |
Table 7: FGE and control plasmids used to transfect HT-1080 cells expressing iduronate-2-sulfatase and N-acetylgalactosamine 6-sulfatase
| Plasmids | Configuration of major DNA sequence elements* |
| pXMG.1.3(FGE expression) | >1.6kb CMV enhancer/promoter>1.1kb FGE cDNA>hGH 3' untranslated sequences<amp<DHFR box<Cdneo box (neomycin phosphotransferase) |
| pXMG.1.2 (control, FGE reverse direction) | >1.6kb CMV enhancer/promoter<1.1kb FGE cDNA<hGH 3' untranslated sequence<amp<DHFR box<Cdneo box (neomycin phosphotransferase) |
* > denotes 5' to 3' orientation
Experimental procedures
Materials and methods
Transfection of HT-1080 cells producing iduronate 2-sulfatase and N-acetylgalactosamine 6-sulfatase
HT-1080 cells were harvested to obtain 9-12 x 10^6 cells for each electroporation. Two plasmids were each transfected in duplicate: one to be tested (FGE) and one as a control; in this case, the control plasmid contained the FGE cDNA cloned in reverse orientation with respect to the CMV promoter. Cells were centrifuged at about 1000 RPM for 5 minutes and suspended in 1X PBS at 16 x 10^6 cells/mL. 100 μg of plasmid DNA was added to the bottom of an electroporation cuvette, and 750 μL of the cell suspension (12 x 10^6 cells) was added to the DNA solution in the cuvette. The cells and DNA were mixed gently with a plastic transfer pipette, taking care not to create bubbles. The cells were electroporated at 450 V, 250 μF (BioRad Gene Pulser). The time constant was recorded.
The electroporated cells were allowed to rest for 10-30 minutes. 1.25 mL of DMEM/10% calf serum was then added to each cuvette and mixed, and all the cells were transferred to a fresh T75 flask containing 20 mL DMEM/10. After 24 hours, the flasks were re-fed with 20 mL DMEM/10 to remove dead cells. At 48-72 hours post-transfection, media samples were collected and the cells were harvested from the duplicate T75 flasks.
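The cell number quoted for each cuvette follows directly from the suspension density and the volume loaded. A minimal sketch of that arithmetic is given below; the function names are hypothetical, and the DNA-per-cell figure is simply derived from the stated 100 μg and is not a value given in the text.

```python
def cells_in_cuvette(density_per_ml: float, volume_ul: float) -> float:
    """Number of cells delivered by a given volume of suspension."""
    return density_per_ml * volume_ul / 1000.0


def dna_ug_per_million_cells(dna_ug: float, cells: float) -> float:
    """Plasmid DNA (ug) available per 10^6 cells in the cuvette."""
    return dna_ug / (cells / 1e6)


# Values stated in the procedure: 16 x 10^6 cells/mL, 750 uL, 100 ug DNA.
cells = cells_in_cuvette(16e6, 750.0)
print(cells)                                   # 12000000.0, i.e. 12 x 10^6 cells
print(dna_ug_per_million_cells(100.0, cells))  # derived figure, about 8.3 ug
```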
Preparation of the Medium
1 L DMEM/10 (containing 23 mL of 2 mM L-glutamine, 115 mL calf serum)
Cells were transfected in medium without methotrexate (MTX). After 24 hours, the cells were re-fed with medium containing the appropriate amount of MTX (36F = 1.0 μM MTX, 30C6 = 0.1 μM MTX). Medium was harvested and cells were collected 24, 48 and 72 hours after the re-feed.
Activity assay
Iduronate-2-sulfatase (I2S)
A NAP5 desalting column (Amersham Pharmacia Biotech AB, Uppsala, Sweden) was equilibrated with dialysis buffer (5 mM sodium acetate, 5 mM Tris, pH 7.0). The sample containing I2S was applied to the column and allowed to enter the column bed. The sample was eluted with 1 mL of dialysis buffer. The desalted sample was further diluted with reaction buffer (5 mM sodium acetate, 0.5 mg/L BSA, 0.1% Triton X-100, pH 4.5) to about 100 ng/mL I2S. [...] μL of each I2S sample was added to the top row of a 96-well fluorometric plate (Perkin Elmer, Norwalk, CT) and pre-incubated at 37 °C for 15 minutes. Substrate was prepared by dissolving 4-methylumbelliferyl sulfate (Fluka, Buchs, Switzerland) in substrate buffer (5 mM sodium acetate, 0.5 mg/mL BSA, pH 4.5) to a final concentration of 1.5 mg/mL. 100 μL of substrate was added to each well containing an I2S sample, and the plate was incubated at 37 °C in the dark for 1 hour. After the incubation, 190 μL of stop buffer (332.5 mM glycine, 207.5 mM sodium carbonate, pH 10.7) was added to each well containing a sample. A stock solution of 4-methylumbelliferone (4-MUF, Sigma, St. Louis, MO) was prepared in reagent-grade water to a final concentration of 1 μM for use as the product standard. 150 μL of the 1 μM 4-MUF stock and 150 μL of stop buffer were added to the top row of the plate. 150 μL of stop buffer was added to each of the remaining wells in the 96-well plate. Two-fold serial dilutions were made from the top row to the last row of each column of the plate. The plate was read on a Fusion Universal Microplate Analyzer (Packard, Meriden, CT) with an excitation filter wavelength of 330 nm and an emission filter wavelength of 440 nm. A standard curve of micromolar 4-MUF versus fluorescence was generated, and the concentration of 4-MUF in the unknown samples was determined from their fluorescence using this curve. The results are reported as units/mL (one unit of activity equals 1 μmole of 4-MUF produced per minute at 37 °C).
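The conversion from fluorescence to units/mL described above can be sketched as a linear standard curve followed by a unit conversion. This is an illustrative outline only: the eight-row dilution column, the linear instrument response, the well volume and the sample volume used in the example call are all assumptions (the sample volume is not legible in the text), and the readings are synthetic.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = slope * xs + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def standard_series(stock_um=1.0, rows=8):
    """4-MUF concentrations (uM) down one column of the plate.

    Mixing 150 uL of stock with 150 uL of stop buffer halves the stock in
    the top row; each two-fold transfer halves it again. Eight rows is an
    assumption based on a standard 96-well layout.
    """
    top = stock_um / 2
    return [top / 2 ** i for i in range(rows)]


def units_per_ml(fluorescence, slope, intercept,
                 well_volume_ul, sample_volume_ul, minutes=60.0):
    """Activity in units/mL, with 1 unit = 1 umol 4-MUF per minute."""
    conc_um = (fluorescence - intercept) / slope   # umol/L in the well
    umol = conc_um * well_volume_ul * 1e-6         # umol of 4-MUF in the well
    return umol / minutes / (sample_volume_ul * 1e-3)


# Synthetic, noise-free readings (hypothetical instrument response).
conc = standard_series()
reads = [2000.0 * c + 50.0 for c in conc]
slope, intercept = fit_line(conc, reads)

# Hypothetical unknown: 10 uL sample in a 300 uL final well volume.
print(units_per_ml(250.0, slope, intercept,
                   well_volume_ul=300.0, sample_volume_ul=10.0))
```

In practice the dilution factor applied to the sample before the assay would also have to be multiplied back in; it is left out here to keep the sketch short.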
N-acetylgalactosamine 6-sulfatase (GALNS)
The GALNS activity assay uses the fluorogenic substrate 4-methylumbelliferyl-β-D-galactopyranoside-6-sulfate (Toronto Research Chemicals Inc., catalog No. M33448). The assay comprises two steps. In the first step, 75 μL of 1.3 mM substrate prepared in reaction buffer (0.1 M sodium acetate, 0.1 M sodium chloride, pH 4.3) is incubated with 10 μL of the media/protein sample, or a corresponding dilution thereof, for 4 hours at 37 °C. The reaction was terminated by adding 5 μL of 2 M sodium dihydrogen phosphate to inhibit GALNS activity. After addition of about 500 U of β-galactosidase from Aspergillus oryzae (Sigma, Cat. G5160), the reaction mixture was incubated at 37 °C for an additional hour to release the fluorescent moiety of the substrate. The second reaction was stopped by adding 910 μL of stop solution (1% glycine, 1% sodium carbonate, pH 10.7). The fluorescence of the resulting mixture was measured using a measurement wavelength of 359 nm and a reference wavelength of 445 nm, with 4-methylumbelliferone (sodium salt, Sigma, Cat. No. M1508) serving as the reference standard. One unit of activity corresponds to one nmole of 4-methylumbelliferone released per hour.
Immunoassay (ELISA)
Iduronate-2-sulfatase (I2S)
A 96-well flat-bottom plate was coated with mouse monoclonal anti-I2S antibody diluted to 10 μg/mL in 50 nM sodium bicarbonate (pH 9.6) for 1 hour at 37 °C. The mouse monoclonal anti-I2S antibody, raised against purified, recombinantly produced full-length human I2S polypeptide, was prepared under contract by Maine Biotechnology Services, Inc. (Portland, ME) using standard hybridoma production techniques. The plate was washed 3 times with 1X PBS containing 0.1% Tween-20 and blocked with 2% BSA in wash buffer for 1 hour at 37 °C. Wash buffer containing 2% BSA was used to dilute samples and standards. The I2S standard was diluted and used over the range 100 ng/mL to 1.56 ng/mL. After removal of the blocking buffer, samples and standards were applied to the plate and incubated at 37 °C for 1 hour. The detection antibody, a horseradish peroxidase-conjugated mouse anti-I2S antibody, was diluted to 0.15 μg/mL in wash buffer containing 2% BSA. The plate was washed 3 times, the detection antibody was added to the plate, and the plate was incubated at 37 °C for 30 minutes. To develop the plate, TMB substrate (Bio-Rad, Hercules, CA) was prepared. The plate was washed 3 times, 100 μL of substrate was added to each well, and the plate was incubated at 37 °C for 15 minutes. The reaction was stopped with 2 N sulfuric acid (100 μL/well), and the plate was read on a microtiter plate reader at 450 nm with 655 nm as the reference wavelength.
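The stated range of the I2S standard (100 ng/mL down to 1.56 ng/mL) is consistent with a two-fold dilution series of seven points. The text does not state the dilution step, so the two-fold factor and the number of points in this short sketch are assumptions.

```python
def twofold_series(top_ng_per_ml: float, points: int) -> list:
    """Concentrations of a two-fold serial dilution, highest first."""
    return [top_ng_per_ml / 2 ** i for i in range(points)]


# Assumed: seven two-fold steps starting at 100 ng/mL.
series = twofold_series(100.0, 7)
print(series)  # [100.0, 50.0, 25.0, 12.5, 6.25, 3.125, 1.5625]
```

The last value, 1.5625 ng/mL, rounds to the 1.56 ng/mL lower limit quoted for the standard.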
N-acetylgalactosamine 6-sulfatase (GALNS)
Two mouse monoclonal anti-GALNS antibodies form the basis of the GALNS ELISA. The mouse monoclonal anti-GALNS antibodies, raised against purified, recombinantly produced full-length human GALNS polypeptide, were also prepared under contract by Maine Biotechnology Services, Inc. (Portland, ME) using standard hybridoma production techniques. To capture GALNS, the first antibody was used to coat an F96 MaxiSorp Nunc-Immuno Plate (Nalge Nunc, Cat. No. 442404) in coating buffer (50 mM sodium bicarbonate, pH 9.6). After incubation for 1 hour at 37 °C and washing with wash buffer, the plate was blocked with blocking buffer (PBS, 0.05% Tween-20, 2% BSA) for 1 hour at 37 °C. The experimental and control samples, along with GALNS standards, were then loaded onto the plate and incubated at 37 °C for a further hour. After washing with wash buffer, the second antibody, an HRP-conjugated detection antibody, was applied in blocking buffer, followed by incubation at 37 °C for 30 minutes. After the plate was washed again, Bio-Rad TMB substrate reagent was added and incubated for 15 minutes. The reaction was then stopped by adding 2 N sulfuric acid, and the results were read spectrophotometrically at 450 nm using a Molecular Device plate reader.
Results and discussion
Effect of FGE on sulfatase Activity
GALNS
An approximately 50-fold increase in total GALNS activity was observed relative to control levels (figure 5). This level of increased activity was observed at all three media sampling time points. Furthermore, GALNS activity accumulated linearly over time, with a 4-fold increase between 24 and 48 hours and a 2-fold increase between the 48 and 72 hour time points.
I2S
A similar effect, albeit of a smaller absolute magnitude, was also observed for total I2S activity, with an approximately 5-fold increase in total I2S activity observed relative to control levels. The level of this increased activity remained constant for the duration of the experiment. I2S activity accumulated linearly over time in the medium, similar to the results seen with GALNS (2.3 fold between 24 and 48 hours and 1.8 fold between 48 and 72 hours).
Effect of FGE on the specific Activity of sulfatase
GALNS
FGE expression in 36F cells enhanced the apparent specific activity of GALNS (the ratio of enzyme activity to total enzyme as measured by ELISA) by 40-60 fold relative to control levels (figure 6). The increase in specific activity was sustained at all three time points in the study and appeared to grow over the three days of post-transfection accumulation.
I2S
Similar effects were seen with I2S, where a 6-7 fold increase in specific activity (3-5U/mg) was observed relative to the control value (0.5-0.7U/mg).
Neither GALNS (fig. 7) nor I2S ELISA values were significantly affected by transfection of FGE. This indicates that FGE expression does not affect the translational and secretory pathways involved in sulfatase production.
Taken together, all these results for both sulfatases indicate that FGE expression significantly increases the specific sulfatase activity in cell lines overexpressing GALNS and I2S.
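The apparent specific activity used above is the ratio of enzyme activity to total enzyme protein measured by ELISA, and the fold increase is the ratio of the FGE value to the control value. The sketch below applies this to the I2S ranges quoted in the text; pairing the lower bounds with each other and the upper bounds with each other is an assumption made only for illustration.

```python
def specific_activity(activity_units_per_ml: float,
                      protein_mg_per_ml: float) -> float:
    """Apparent specific activity in units per mg of ELISA-detected enzyme."""
    return activity_units_per_ml / protein_mg_per_ml


def fold_increase(treated: float, control: float) -> float:
    """Ratio of the FGE-transfected value to the control value."""
    return treated / control


# I2S ranges quoted in the text: 3-5 U/mg with FGE, 0.5-0.7 U/mg for control.
low = fold_increase(3.0, 0.5)
high = fold_increase(5.0, 0.7)
print(round(low, 1), round(high, 1))  # 6.0 7.1
```

These bounds agree with the 6-7 fold increase reported for I2S.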
Co-expression of FGE (SUMF1) and other sulfatase genes
To test the effect of FGE (SUMF1) on additional sulfatase activities in normal cells, we overexpressed the ARSA (SEQ ID NO: 14), ARSC (SEQ ID NO: 18) and ARSE (SEQ ID NO: 22) cDNAs in various cell lines, with or without co-transfection of the FGE (SUMF1) cDNA, and measured sulfatase activity. Overexpression of the sulfatase cDNAs in Cos-7 cells caused a modest increase in sulfatase activity, while a striking synergistic increase (20- to 50-fold) was observed when the sulfatase and FGE (SUMF1) genes were co-expressed. A similar, albeit weaker, effect was observed in three additional cell lines: HepG2, LE293 and U2OS. Simultaneous overexpression of multiple sulfatase cDNAs resulted in a smaller increase in each specific sulfatase activity than overexpression of a single sulfatase, indicating that the different sulfatases compete for the modification machinery.
To test the functional conservation of the FGE (SUMF1) gene during evolution, we overexpressed the ARSA, ARSC and ARSE cDNAs in various cell lines, with or without co-transfection of FGE (SUMF1) cDNAs from different species, and measured sulfatase activity. The murine and Drosophila FGE (SUMF1) genes were both active on all three human sulfatases, with the Drosophila FGE (SUMF1) being less efficient. These data demonstrate a high degree of functional conservation of FGE (SUMF1) during evolution, implying significant biological importance for cell function and survival. A similar and consistent, albeit much weaker, effect was observed with the FGE2 (SUMF2) gene, suggesting that the protein encoded by this gene also has sulfatase-modifying activity. These data demonstrate that the amount of the protein encoded by FGE (SUMF1) is a limiting factor for sulfatase activity, a finding with important implications for the large-scale production of active sulfatases for enzyme replacement therapy.
Example 4:
Identification of the gene mutated in multiple sulfatase deficiency by functional complementation using microcell-mediated chromosome transfer
In a separate experiment using functional complementation by microcell-mediated chromosome transfer, we demonstrated that the gene mutated in multiple sulfatase deficiency is FGE. Our findings provide further insight into a new biological mechanism that affects an entire family of proteins in distantly related organisms. In addition to identifying the molecular basis of a rare genetic disease, our data further demonstrate the potent enhancing effect of the FGE gene product on sulfatase activity. The latter finding has direct clinical implications for the treatment of at least 8 human diseases caused by sulfatase deficiencies.
The multiple sulfatase deficiency gene maps to chromosome 3p26
To identify the chromosomal location of the gene mutated in multiple sulfatase deficiency, we attempted to rescue the deficient sulfatase activity by functional complementation via microcell-mediated chromosome transfer. A panel of human/mouse hybrid cell lines, each containing a single normal human chromosome tagged with the dominant selectable marker HyTK, was used as the source of donor human chromosomes and fused with an immortalized cell line from a patient with multiple sulfatase deficiency. All 22 normal human chromosomes were transferred one by one to the patient cell line, and hybrids were selected with hygromycin. In each of the 22 transfer experiments, approximately 25 surviving clones were picked. These clones were grown individually and harvested for subsequent enzyme testing. For each of approximately 440 (20 x 22) clones, arylsulfatase A (ARSA) (SEQ ID NO: 15), arylsulfatase B (ARSB) (SEQ ID NO: 17) and arylsulfatase C (ARSC) (SEQ ID NO: 19) activities were tested. This analysis clearly showed that the sulfatase activities of several clones derived from the chromosome 3 transfer were significantly higher than those of all other clones. When the activities of the individual clones derived from the chromosome 3 transfer were analyzed, a striking variability was observed. To verify whether each clone carried an intact human chromosome 3 from the donor cell line, we used a panel of 23 chromosome 3 polymorphic genetic markers that were evenly distributed along the length of the chromosome and had been selected for having different alleles in the donor and patient cell lines. This allowed us to detect the presence of the donor chromosome and to identify possible loss of specific regions due to incidental chromosome breakage. Each clone with high enzymatic activity had retained the entire chromosome 3 from the donor cell line, whereas the clones with low activity appeared to have lost the intact chromosome, based on the absence of chromosome 3 alleles from the donor cell line.
The latter clones likely retained a small region of the donor chromosome containing the selectable marker gene, which enabled them to survive in hygromycin-containing medium. These data indicate that normal human chromosome 3 complements the defect observed in the multiple sulfatase deficiency patient cell line.
To determine the specific chromosomal region containing the gene responsible for the complementing activity, we used Neo-tagged chromosome 3 hybrids that were found to have lost various portions of the chromosome. In addition, we performed irradiated microcell-mediated chromosome transfer of HyTK-tagged human chromosome 3. A total of 115 chromosome 3 irradiated hybrids were tested for sulfatase activity and genotyped using a panel of 31 polymorphic microsatellite markers spanning the entire chromosome. All clones showing high enzymatic activity appeared to have retained chromosome 3p26. A higher resolution analysis using additional markers from this region mapped the putative location of the complementing gene between markers D3S3630 and D3S2397.
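The reasoning used to narrow the complementing region can be illustrated as set logic over the genotyped hybrids: a candidate marker should be retained by every hybrid with high sulfatase activity and should not be retained by hybrids with low activity. The sketch below is a simplified illustration under that assumption; the clone names, marker names (M1 to M5) and genotypes are hypothetical and are not data from the experiments described here.

```python
def candidate_markers(clones):
    """Markers retained in all high-activity clones and in no low-activity clone.

    Each clone is a dict with keys 'retained' (set of donor-allele markers)
    and 'high_activity' (bool).
    """
    high = [c["retained"] for c in clones if c["high_activity"]]
    low = [c["retained"] for c in clones if not c["high_activity"]]
    if not high:
        return set()
    shared = set.intersection(*high)   # present in every complementing clone
    excluded = set().union(*low)       # present in some non-complementing clone
    return shared - excluded


# Hypothetical genotypes for illustration only.
clones = [
    {"name": "A", "retained": {"M1", "M2", "M3"}, "high_activity": True},
    {"name": "B", "retained": {"M2", "M3", "M4"}, "high_activity": True},
    {"name": "C", "retained": {"M3", "M5"}, "high_activity": False},
]
print(sorted(candidate_markers(clones)))  # ['M2']
```

A real analysis would also use the physical order of the markers to define the flanking breakpoints, which this sketch does not attempt.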
Identification of the gene mutated in multiple sulfatase deficiency
We searched genes from the 3p26 genomic region for mutations in patients with multiple sulfatase deficiency. Each exon, including the splice junctions, was PCR amplified and analyzed by direct sequencing. Mutation analysis was performed on 12 unrelated affected individuals; 5 were previously described MSD patients, while 7 were unpublished cases. In our multiple sulfatase deficiency cohort, several mutations were identified in the expressed sequence tag (EST) AK075459 (SEQ ID NOs: 4, 5), which corresponds to a gene of unknown function, strongly suggesting that this is the gene involved in multiple sulfatase deficiency. Each mutation was found to be absent in 100 control individuals, thus ruling out that it is a sequence polymorphism. Additional confirmatory mutation analysis was performed on reverse-transcribed patient RNA, particularly in those cases in which genomic DNA analysis revealed a mutation at or near a splice site that could affect splicing. Frameshift, nonsense, splice and missense mutations were identified, suggesting that the disease is caused by a loss-of-function mechanism, as expected for a recessive disease. This is also consistent with the observation that almost all missense mutations affect amino acids that are highly conserved throughout evolution (see below).
Table 8: Additional multiple sulfatase deficiency mutations identified
| Case | Reference | Phenotype | Exon | Nucleotide change | Amino acid change |
| 1. BA426 | Conary et al, 1988 | Moderate | 3 | 463T>C | S155P |
| | | | 3 | 463T>C | S155P |
| 2. BA428 | Burch et al, 1986 | Severe neonatal | 5 | 661delG | frameshift |
| 3. BA431 | Zenger et al, 1989 | Moderate | 1 | 2T>G | M1R |
| | | | 2 | 276delC | frameshift |
| 4. BA799 | Burk et al, 1981 | Mild-moderate | 3 | 463T>C | S155P |
| | | | 3 | 463T>C | S155P |
| 5. BA806 | Unpublished | Severe neonatal | 9 | 1045T>C | R349W |
| 6. BA807 | Schmidt et al, 1995 | Unknown | 3 | c.519+4delGTAA | ex3 deletion |
| | | | 9 | 1076C>A | S359X |
| 7. BA809 | Couchot et al, 1974 | Mild-moderate | 1 | 1A>G | M1V |
| | | | 9 | 1042G>C | A348P |
| 8. BA810 | Unpublished | Severe | 8 | 1006T>C | C336R |
| | | | 9 | 1046G>A | R349Q |
| 9. BA811 | Unpublished | Severe neonatal | 3 | c.519+4delGTAA | ex3 deletion |
| | | | 8 | 979C>T | R327X |
| 10. BA815 | Unpublished | Moderate | 5 | c.603-6delC | ex6 deletion |
| | | | 6 | 836C>T | A279V |
| 11. BA919 | Unpublished | Mild-moderate | 9 | 1033C>T | R345C |
| | | | 9 | 1033C>T | R345C |
| 12. BA920 | Unpublished | Moderate | 5 | 653G>A | C218Y |
| | | | 9 | 1033C>T | R345C |
Mutations were identified in each of the multiple sulfatase deficiency patients tested, thus excluding locus heterogeneity. No obvious correlation was observed between the types of mutation identified and the severity of the phenotype reported in the patients, suggesting that clinical variability is not due to allelic heterogeneity. In three cases, different patients (cases 1 and 4, cases 6 and 9, and cases 11 and 12 in Table 8) were found to carry the same mutation. Two of these patients (cases 11 and 12) originate from the same town in Sicily, suggesting a founder effect, which was indeed confirmed by haplotype analysis. Surprisingly, most patients were found to be compound heterozygotes, carrying different allelic mutations, while only a few were homozygous. Although consistent with the absence of consanguinity reported by the parents, this is an unexpected finding for a very rare recessive disease such as multiple sulfatase deficiency.
FGE genes and proteins
The consensus cDNA sequence of human FGE (also referred to herein interchangeably as SUMF1) (SEQ ID NO: 1) was assembled from several expressed sequence tag (EST) clones and partly from the corresponding genomic sequence. The gene contains 9 exons and spans approximately 105 kb (see Example 1). Sequence comparison also identified an FGE paralog located on human chromosome 7, which we designated FGE2 (also referred to herein interchangeably as SUMF2) (SEQ ID NOs: 45, 46).
Functional complementation of sulfatase deficiency
Fibroblasts from two patients with multiple sulfatase deficiency in whom we had identified mutations in the FGE (SUMF1) gene (cases 1 and 12 in Table 8; cell lines BA426 and BA920) were infected with HSV viruses containing the wild-type and two mutant forms (R327X and Δex3) of the FGE (SUMF1) cDNA. ARSA, ARSB and ARSC activities were tested 72 hours after infection. Expression of the wild-type FGE (SUMF1) cDNA resulted in functional complementation of all three activities, whereas the mutant FGE (SUMF1) cDNAs did not (Table 9). These data provide conclusive evidence for the identity of FGE (SUMF1) as the multiple sulfatase deficiency gene, and they demonstrate the functional relevance of the mutations found in patients. The disease-associated mutations result in sulfatase deficiency, thus demonstrating that FGE (SUMF1) is an essential factor for sulfatase activity.
Table 9: Functional complementation of sulfatase deficiency
(1) All enzymatic activities are expressed as nmoles of 4-methylumbelliferone released per mg of protein per 3 hours.
Multiple sulfatase deficient cell lines BA426 and BA920 were infected with HSV amplicons alone and with constructs carrying mutant or wild-type SUMF1 cDNA. The increase in activity of a single arylsulfatase in fibroblasts infected with the wild-type SUMF1 gene relative to the activity of cells infected with vector alone is indicated in parentheses. The activity measured in uninfected control fibroblasts is indicated.
Molecular basis for multiple sulfatase deficiency
Based on the hypothesis that the disease gene should complement the enzymatic deficiency in a patient cell line, we performed microcell-mediated chromosome transfer to an immortalized cell line from a patient with multiple sulfatase deficiency. This technique has been used successfully to identify genes whose predicted function can be assessed in cell lines, for example by measuring enzyme activity or by detecting morphological features. To address the problem of random variability of enzyme activity, we measured the activities of three different sulfatases (ARSA, ARSB and ARSC) in the complementation assay. The results of the chromosome transfer clearly indicated that the complementing gene maps to chromosome 3. Subregional mapping was achieved by generating a radiation hybrid panel for chromosome 3. Individual hybrid clones were characterized both at the genomic level (by typing 31 microsatellite markers showing different alleles in the donor and recipient cell lines) and at the functional level (by testing sulfatase activities). Analysis of 130 such hybrids resulted in the mapping of the complementing region to chromosome 3p26.
Once the critical genomic region had been defined, the FGE (SUMF1) gene was identified by mutation analysis of patient DNA. Mutations were found in all patients tested, showing that a single gene is involved in multiple sulfatase deficiency. The mutations found are of different types, and most of them (e.g. splice site, start site, nonsense, frameshift) are predicted to result in loss of function of the encoded protein, as expected for a recessive disease. Most missense mutations affect codons corresponding to amino acids that have been highly conserved during evolution, suggesting that these mutations also cause loss of function. No correlation was found between the type of mutation and the severity of the phenotype, indicating that the latter is due to unrelated factors. Unexpectedly for a rare genetic disease, many patients were found to be compound heterozygotes, carrying two different mutations. Nevertheless, a founder effect was identified for one mutation originating from a small town in Sicily.
FGE (SUMF1) Gene function
The identification of the FGE (SUMF1) gene as a "complementing factor" was positively demonstrated by the expression of exogenous FGE (SUMF1) cDNA inserted into a viral vector in two different patient cell lines, which rescued the enzymatic absence of the four different sulfatases. In each case, consistent and partial (relative to control patient cell lines transfected with empty vector) recovery of all sulfatase activities was observed. On average, the increase in enzyme activity ranged from 1.7 to 4.9 fold, reaching approximately half the level observed in normal cell lines. The enzyme activity was correlated with the number of virus particles used in each experiment and also with the infection efficiency (tested by marker protein (GFP) analysis). In the same experiment, vectors containing FGE (SUMF1) cDNA carrying two mutations (R327X and Δ ex3) found in patients were applied without a significant increase in enzymatic activity being observed, thus demonstrating the functional relevance of these mutations.
As mentioned elsewhere herein, Schmidt et al. first discovered that sulfatases undergo a post-translational modification of a highly conserved cysteine, found at the active site of most sulfatases, to Cα-formylglycine. They also showed that this modification is defective in multiple sulfatase deficiency (Schmidt, B. et al, Cell, 1995, 82: 271-278). Our mutational and functional data provide strong evidence that FGE (SUMF1) is responsible for this modification.
The FGE (SUMF1) gene shows a very high degree of sequence conservation across all the distantly related species analyzed (from bacteria to humans). We provide evidence that the Drosophila homologue of the human FGE (SUMF1) gene is able to activate overexpressed human sulfatases, demonstrating that the high level of sequence similarity observed among FGE (SUMF1) genes of distantly related species is associated with striking functional conservation. The notable exception is yeast, which appears to lack the FGE (SUMF1) gene as well as any sulfatase-encoding genes, suggesting that sulfatase function is not required in this organism and that the FGE (SUMF1) and sulfatase genes have influenced each other's evolution.
Interestingly, there are two homologous genes, FGE (SUMF1) and FGE2 (SUMF2), in the genomes of all the vertebrates analyzed, including humans. As is evident from the phylogenetic tree, the FGE2 (SUMF2) gene appears to have evolved independently of the FGE (SUMF1) gene. In our analysis, the FGE2 (SUMF2) gene also activated sulfatases, although much less efficiently than the FGE (SUMF1) gene. This may explain the residual sulfatase activity found in patients with multiple sulfatase deficiency and suggests that a complete sulfatase deficiency would be lethal. At this time, we cannot exclude the possibility that the FGE2 (SUMF2) gene has another, as yet unknown, function.
Effects on the treatment of diseases caused by sulfatase deficiency
A strong increase (up to 50 fold) in sulfatase activity was observed in cells in which the FGE (SUMF1) cDNA was overexpressed along with ARSA, ARSC, or ARSE cDNAs, relative to cells overexpressing only a single sulfatase. A significant synergistic effect was found in all cell lines, indicating that FGE (SUMF1) is the limiting factor for sulfatase activity. However, variability was observed among the different sulfatases, probably due to different affinities of the FGE (SUMF1)-encoded protein for the various sulfatases. Variability was also observed between different cell lines, which may have different levels of endogenous formylglycine generating enzyme. Consistent with these observations, we found that expression of the multiple sulfatase deficiency gene varies between tissues, with significantly high levels in the kidney and liver. This may be of great significance, since tissues with low expression levels of the FGE (SUMF1) gene may be less efficient at modifying exogenously delivered sulfatase proteins (see below). Taken together, these data suggest that the function of the FGE (SUMF1) gene has evolved to achieve a dual regulatory system, in which each sulfatase is controlled simultaneously by two mechanisms: an individual mechanism responsible for the mRNA level of each structural sulfatase gene, and a common mechanism shared by all sulfatases. Furthermore, FGE2 (SUMF2) provides partial redundancy for sulfatase modification.
These data are of profound significance for the large-scale production of active sulfatases for use in enzyme replacement therapy. Enzyme replacement studies have been reported in animal models of sulfatase deficiency, e.g., a feline model of mucopolysaccharidosis VI, and have been shown to be effective in preventing and curing several symptoms. Therapeutic trials in humans for two congenital diseases caused by sulfatase deficiency, MPS II (Hunter syndrome) and MPS VI (Maroteaux-Lamy syndrome), are currently ongoing and will soon be extended to a large number of patients.
Example 5:
Enzyme replacement therapy for Morquio disease MPS IVA with FGE-activated GALNS
The major cause of bone pathology in Morquio patients is the accumulation of keratan sulfate (KS) in epiphyseal (growth plate) chondrocytes due to a deficiency of the lysosomal sulfatase GALNS. The primary goal of the in vivo studies was to determine whether intravenously (IV) administered FGE-activated GALNS can penetrate growth plate chondrocytes and other appropriate cell types in normal mice. Despite its general lack of bone deformity, a mouse model of GALNS deficiency (Morquio Knock-In, MKI; S. Tomatsu, St. Louis University, MO) was also used to demonstrate the in vivo biochemical activity of repeatedly administered FGE-activated GALNS. The lack of bone pathology in the mouse model reflects the fact that bone KS is either severely reduced or absent in rodents (Venn G, & Mason RM., Biochem J., 1985, 228: 443-450). However, these mice do show a detectable accumulation of GAGs and other cellular abnormalities in various organs and tissues. Thus, the overall goal of the studies was to demonstrate that FGE-activated GALNS penetrates the growth plate (biodistribution study) and that functional GALNS enzyme activity clears accumulated GAGs in affected tissues (pharmacokinetic study).
The results of these studies demonstrate that IV-injected FGE-activated GALNS is internalized by growth plate chondrocytes, albeit at low levels compared with other tissues. Furthermore, injection of FGE-activated GALNS into MKI mice for 16 weeks effectively cleared accumulated GAGs and also reduced lysosomal biomarker staining in all soft tissues tested. In conclusion, the experiments successfully demonstrated delivery of GALNS to growth plate chondrocytes, as well as biochemical activity in clearing GAGs from various tissues.
Biodistribution study
Four-week-old ICR (normal) mice were given a single IV injection of 5 mg/kg FGE-activated GALNS. The liver, femur, heart, kidney and spleen were collected two hours after injection and prepared for histological examination. Anti-human GALNS monoclonal antibodies were used to detect the presence of injected GALNS in the various tissues. GALNS was detected in all tissues tested, compared to vehicle controls. Furthermore, GALNS was readily observed in all tissues tested, with the exception of bone, using the horseradish peroxidase reporter system. Demonstrating uptake of GALNS in growth plates required the more sensitive fluorescein isothiocyanate (FITC) reporter system, indicating that while GALNS does penetrate the growth plate, it enters growth plate chondrocytes less readily than cells of soft tissues. Although a more sensitive fluorescence detection method was required, delivery of GALNS to growth plate chondrocytes was observed in all growth plate sections tested, compared to vehicle controls.
Pharmacokinetic studies in MKI mice
Four-week-old MKI or wild-type mice were given weekly IV injections (n = 8 per group) until 20 weeks of age. Each weekly injection consisted of 2 mg/kg of FGE-activated GALNS or vehicle control (wild-type mice were not injected). All mice were sacrificed at 20 weeks of age for histological examination and stained as follows: hematoxylin and eosin for cellular morphology, and Alcian blue for GAG detection.
Clearance of accumulated GAGs was demonstrated by reduced or absent Alcian blue staining in all soft tissues examined (liver, heart, kidney and spleen). This was observed only in GALNS-injected mice. Although the growth plates in MKI mice function normally, as evidenced by normal bone morphology, more subtle cellular abnormalities were observed (including vacuolization of chondrocytes with no apparent pathological effect). Vacuolated chondrocytes in the hypertrophic and proliferating zones of the growth plate were not affected by GALNS administration. This is in contrast to chondrocytes in the calcified zone of the growth plate, where a reduction in vacuolization was observed in GALNS-injected mice. In general, vacuolization of chondrocytes and accumulation of putative non-KS GAGs in the growth plates of MKI mice were surprising and unexpected, given the known lack of KS in the growth plates of mice. These observations most likely reflect the fact that knock-in mice have a high level of mutant GALNS (in contrast to knock-out mice, which have no residual mutant GALNS, no growth plate chondrocyte vacuolization, and no GAG accumulation; Tomatsu S. et al, Human Molecular Genetics, 2003, 12: 3349-). Vacuolization in growth plates may indicate a secondary effect on a subset of cells expressing mutant GALNS. Nevertheless, the 16 weeks of enzyme injections strongly demonstrated delivery of FGE-activated GALNS to multiple tissues and its enzymatic activity in vivo.
Detailed description of the drawings
FIG. 1: MALDI-TOF Mass Spectrometry of P23 after incubation in the absence (A) or presence (B) of soluble extracts from bovine testis microsomes
6 pmol of P23 were incubated at 37 °C for 10 minutes under standard conditions, in the absence or presence of 1 μl of microsomal extract. Samples were prepared for MALDI-TOF mass spectrometry as described in the experimental procedures. The monoisotopic masses MH+ of P23 (2526.28) and of its FGly derivative (2508.29) are indicated.
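The two masses quoted for FIG. 1 can be checked against the chemistry of the modification. The following is a minimal sketch, assuming that conversion of cysteine to FGly replaces the side chain -CH2-SH with -CHO (net loss of H2S and gain of one oxygen) and using standard monoisotopic atomic masses; the variable names are illustrative and not taken from the source.

```python
# Monoisotopic atomic masses in unified atomic mass units (standard values).
H = 1.00782503
O = 15.99491462
S = 31.97207117

# Assumed net change for Cys -> FGly: lose H2S, gain one O.
theoretical_shift = O - (2 * H + S)

# MH+ values quoted in the legend of FIG. 1.
mh_p23 = 2526.28   # unmodified P23 peptide
mh_fgly = 2508.29  # FGly derivative of P23
observed_shift = mh_fgly - mh_p23

print(f"theoretical shift: {theoretical_shift:.3f} Da")
print(f"observed shift:    {observed_shift:.2f} Da")
```

Under these assumptions the calculated and quoted mass differences agree to within a few thousandths of a dalton, which is consistent with a single cysteine in P23 being converted to FGly.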
FIG. 2: Phylogenetic tree derived from an alignment of human FGE and 21 proteins of the PFAM-DUF323 seed
The numbers at the branches indicate phylogenetic distance. Proteins are designated by their TrEMBL ID number and species name. hFGE: human FGE. Upper right: scale bar for phylogenetic distance. Asterisks indicate genes that have been further studied. The top 7 genes are part of the FGE gene family.
FIG. 3: organization of human and murine FGE Gene loci
Exons are shown to scale as dark boxes (human locus) and light boxes (murine locus). The bar in the lower right corner shows the scale. Lines between exons represent introns (not to scale). The numbers on the intron lines indicate the size of the intron in kb.
FIG. 4: Diagram showing a map of FGE expression plasmid pXMG.1.3
FIG. 5: histogram depicting N-acetylgalactosamine 6-sulfatase activity in 36F cells transiently transfected with FGE expression plasmid
Cells were transfected with the control plasmid pXMG.1.2, carrying the FGE cDNA in reverse orientation, or the FGE expression plasmid pXMG.1.3, in medium without methotrexate (MTX). After 24 hours, the cells were re-fed with medium containing 1.0 μM MTX. After a further 24, 48 and 72 hours, the medium was collected and the cells were harvested. N-acetylgalactosamine 6-sulfatase activity was determined by activity assay. Each value shown is the mean of two separate transfections, and the standard deviation is indicated by error bars.
FIG. 6: histogram depicting the specific activity of N-acetylgalactosamine 6-sulfatase in 36F cells transiently transfected with FGE expression plasmid
Cells were transfected with the control plasmid pXMG.1.2, carrying the FGE cDNA in reverse orientation, or the FGE expression plasmid pXMG.1.3, in medium without methotrexate (MTX). After 24 hours, the cells were re-fed with medium containing 1.0 μM MTX. After a further 24, 48 and 72 hours, the medium was collected and the cells were harvested. The specific activity of N-acetylgalactosamine 6-sulfatase was determined by activity assay and ELISA, and is expressed as N-acetylgalactosamine 6-sulfatase activity per mg of ELISA-reactive N-acetylgalactosamine 6-sulfatase. Each value shown is the mean of two separate transfections.
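The specific-activity calculation described for FIG. 6 can be sketched as follows. This is only an illustration of the stated ratio (activity per mg of ELISA-reactive enzyme, averaged over two transfections); the function name and all numeric values are hypothetical and are not data from the figure.

```python
from statistics import mean


def specific_activity(activity_units, elisa_mg):
    """Return enzyme activity per mg of ELISA-reactive enzyme."""
    if elisa_mg <= 0:
        raise ValueError("ELISA-reactive protein must be positive")
    return activity_units / elisa_mg


# Hypothetical duplicate transfections: (activity units, mg ELISA-reactive GALNS).
# These numbers are invented for illustration only.
control = [(10.0, 0.50), (12.0, 0.50)]   # e.g. reverse-orientation control plasmid
fge = [(40.0, 0.50), (44.0, 0.55)]       # e.g. FGE expression plasmid

control_sa = mean(specific_activity(a, m) for a, m in control)
fge_sa = mean(specific_activity(a, m) for a, m in fge)

# Fold change in specific activity attributable to FGE co-expression.
fold_change = fge_sa / control_sa
print(f"control: {control_sa:.1f}, FGE: {fge_sa:.1f}, fold: {fold_change:.2f}")
```

Normalizing activity to ELISA-reactive protein separates a change in the fraction of active enzyme from a change in the total amount of enzyme produced, which is why FIG. 6 and FIG. 7 are reported separately.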
FIG. 7: histogram depicting the production of N-acetylgalactosamine 6-sulfatase in 36F cells transiently transfected with FGE expression plasmid
Cells were transfected with the control plasmid pXMG.1.2, carrying the FGE cDNA in reverse orientation, or the FGE expression plasmid pXMG.1.3, in medium without methotrexate (MTX). After 24 hours, the cells were re-fed with medium containing 1.0 μM MTX. After a further 24, 48 and 72 hours, the medium was collected and the cells were harvested. Total N-acetylgalactosamine 6-sulfatase protein was measured by ELISA. Each value shown is the mean of two separate transfections, and the standard deviation is indicated by error bars.
FIG. 8: iduronate-2-sulfatase activity in 30C6 cells transiently transfected with FGE expression plasmid
Cells were transfected with the control plasmid pXMG.1.2, carrying the FGE cDNA in reverse orientation, or the FGE expression plasmid pXMG.1.3, in medium without methotrexate (MTX). After 24 hours, the cells were re-fed with medium containing 1.0 μM MTX. After a further 24, 48 and 72 hours, the medium was collected and the cells were harvested. Iduronate-2-sulfatase activity was determined by activity assay. Each value shown is the mean of two separate transfections.
FIG. 9: description of kits embodying features of the invention
All references disclosed herein are incorporated by reference in their entirety. Claims appear below and the sequence listing is given below.
Sequence listing
<110>Transkaryotic Therapies,Inc.
von Figura,Kurt
Schmidt,Bernhard
Dierks,Thomas
Heartlein,Michael W.
Cosma,Maria P.
Ballabio,Andrea
<120> Diagnosis and treatment of multiple sulfatase deficiency and other sulfatase deficiencies using a formylglycine generating enzyme (FGE)
<130>0403WO
<150>US 60/447,747
<151>2003-02-11
<160>95
<170>PatentIn version 3.2
<210>1
<211>1180
<212>DNA
<213> Homo sapiens
<220>
<221>CDS
<222>(20)..(1141)
<223>FGE cDNA
<400>1
<210>2
<211>374
<212>PRT
<213> Homo sapiens
<400>2
<210>3
<211>1122
<212>DNA
<213> Homo sapiens
<400>3
<210>4
<211>2130
<212>DNA
<213> Homo sapiens
<400>4
<210>5
<211>374
<212>PRT
<213> Homo sapiens
<400>5
<210>6
<211>2297
<212>DNA
<213> Homo sapiens
<400>6
<210>7
<211>550
<212>PRT
<213> Homo sapiens
<400>7
<210>8
<211>2657
<212>DNA
<213> Homo sapiens
<400>8
<210>9
<211>502
<212>PRT
<213> Homo sapiens
<400>9
<210>10
<211>1014
<212>DNA
<213> Homo sapiens
<400>10
<210>11
<211>522
<212>PRT
<213> Homo sapiens
<400>11
<210>12
<211>2379
<212>DNA
<213> Homo sapiens
<400>12
<210>13
<211>552
<212>PRT
<213> Homo sapiens
<400>13
<210>14
<211>2022
<212>DNA
<213> Homo sapiens
<400>14
<210>15
<211>507
<212>PRT
<213> Homo sapiens
<400>15
<210>16
<211>2228
<212>DNA
<213> Homo sapiens
<400>16
<210>17
<211>533
<212>PRT
<213> Homo sapiens
<400>17
<210>18
<211>2401
<212>DNA
<213> Homo sapiens
<400>18
<210>19
<211>583
<212>PRT
<213> Homo sapiens
<400>19
<210>20
<211>1945
<212>DNA
<213> Homo sapiens
<400>20
<210>21
<211>593
<212>PRT
<213> Homo sapiens
<400>21
<210>22
<211>1858
<212>DNA
<213> Homo sapiens
<400>22
<210>23
<211>589
<212>PRT
<213> Homo sapiens
<400>23
<210>24
<211>1996
<212>DNA
<213> Homo sapiens
<400>24
<210>25
<211>591
<212>PRT
<213> Homo sapiens
<400>25
<210>26
<211>1578
<212>DNA
<213> Homo sapiens
<400>26
<210>27
<211>525
<212>PRT
<213> Homo sapiens
<400>27
<210>28
<211>4669
<212>DNA
<213> Homo sapiens
<400>28
<210>29
<211>871
<212>PRT
<213> Homo sapiens
<400>29
<210>30
<211>4279
<212>DNA
<213> Homo sapiens
<400>30
<210>31
<211>870
<212>PRT
<213> Homo sapiens
<400>31
<210>32
<211>6
<212>PRT
<213> Homo sapiens
<220>
<221>VARIANT
<222>(1)..(1)
<223> Leu or Val
<220>
<221>misc_feature
<222>(1)..(3)
<223> Xaa can be any natural amino acid
<220>
<221>VARIANT
<222>(2)..(2)
<223> Cys or Ser
<220>
<221>VARIANT
<222>(3)..(3)
<223> any amino acid
<400>32
<210>33
<211>23
<212>PRT
<213> Artificial
<220>
<223> sequence derived from human arylsulfatase A
<220>
<221>PEPTIDE
<222>(1)..(23)
<223> synthetic substrate for FGly formation; primary sequence from human arylsulfatase A
<400>33
<210>34
<211>16
<212>PRT
<213> Artificial
<220>
<223> variant of the ASA65-80 peptide in which residues Cys69, Pro71 and Arg73, critical for FGly formation, were scrambled
<220>
<221>MISC_FEATURE
<222>(1)..(16)
<223> scrambled oligopeptide
<400>34
<210>35
<211>16
<212>PRT
<213> Artificial
<220>
<223> variant of the ASA65-80 peptide in which Cys69 is replaced by serine
<220>
<221>MISC_FEATURE
<222>(1)..(16)
<223> Ser69 oligopeptide
<400>35
<210>36
<211>19
<212>DNA
<213> Artificial
<220>
<223> human FGE-specific PCR primers
<220>
<221>misc_feature
<222>(1)..(19)
<223> human FGE specific PCR primer 1199nc
<400>36
<210>37
<211>16
<212>DNA
<213> Artificial
<220>
<223> human FGE-specific PCR primers
<220>
<221>misc_feature
<222>(1)..(16)
<223> human FGE-specific forward PCR primer 1c
<400>37
<210>38
<211>19
<212>DNA
<213> Artificial
<220>
<223> human FGE-specific PCR primers
<220>
<221>misc_feature
<222>(1)..(19)
<223> human FGE-specific reverse PCR primer 1182c
<400>38
<210>39
<211>24
<212>DNA
<213> Artificial
<220>
<223> human FGE-specific PCR primers
<220>
<221>misc_feature
<222>(1)..(24)
<223> EcoRI-containing human 5' -FGE-specific PCR primer
<400>39
<210>40
<211>54
<212>DNA
<213> Artificial
<220>
<223> HA-specific primer
<220>
<221>misc_feature
<222>(1)..(54)
<223> HA-specific primer
<400>40
<210>41
<211>57
<212>DNA
<213> Artificial
<220>
<223> c-myc-specific primers
<220>
<221>misc_feature
<222>(1)..(57)
<223> c-myc-specific primers
<400>41
<210>42
<211>54
<212>DNA
<213> Artificial
<220>
<223> RGS-His 6-specific primer
<220>
<221>misc_feature
<222>(1)..(54)
<223> RGS-His 6-specific primer
<400>42
<210>43
<211>15
<212>PRT
<213> Artificial
<220>
<223> tryptic oligopeptide from a human FGE preparation
<220>
<221>MISC_FEATURE
<222>(1)..(15)
<223> tryptic oligopeptide from a human FGE preparation
<400>43
<210>44
<211>19
<212>PRT
<213> Artificial
<220>
<223> tryptic oligopeptide from a human FGE preparation
<220>
<221>MISC_FEATURE
<222>(1)..(19)
<223> tryptic oligopeptide from a human FGE preparation
<400>44
<210>45
<211>906
<212>DNA
<213> Homo sapiens
<400>45
<210>46
<211>301
<212>PRT
<213> Homo sapiens
<400>46
<210>47
<211>927
<212>DNA
<213> Mus musculus
<400>47
<210>48
<211>308
<212>PRT
<213> Mus musculus
<400>48
<210>49
<211>855
<212>DNA
<213> Mus musculus
<400>49
<210>50
<211>284
<212>PRT
<213> Mus musculus
<400>50
<210>51
<211>1011
<212>DNA
<213> Drosophila melanogaster
<400>51
<210>52
<211>336
<212>PRT
<213> Drosophila melanogaster
<400>52
<210>53
<211>870
<212>DNA
<213>Anopheles gambiae
<400>53
<210>54
<211>290
<212>PRT
<213>Anopheles gambiae
<400>54
<210>55
<211>945
<212>DNA
<213> Streptomyces coelicolor
<400>55
<210>56
<211>314
<212>PRT
<213> Streptomyces coelicolor
<400>56
<210>57
<211>1005
<212>DNA
<213>Corynebacterium efficiens
<400>57
<210>58
<211>334
<212>PRT
<213>Corynebacterium efficiens
<400>58
<210>59
<211>1017
<212>DNA
<213>Novosphingobium aromaticivorans
<400>59
<210>60
<211>338
<212>PRT
<213>Novosphingobium aromaticivorans
<400>60
<210>61
<211>1119
<212>DNA
<213>Mesorhizobium loti
<400>61
<210>62
<211>372
<212>PRT
<213>Mesorhizobium loti
<400>62
<210>63
<211>1251
<212>DNA
<213>Burkholderia fungorum
<400>63
<210>64
<211>416
<212>PRT
<213>Burkholderia fungorum
<400>64
<210>65
<211>912
<212>DNA
<213>Sinorhizobium meliloti
<400>65
<210>66
<211>303
<212>PRT
<213>Sinorhizobium meliloti
<400>66
<210>67
<211>1065
<212>DNA
<213> Microscilla sp.
<400>67
<210>68
<211>354
<212>PRT
<213> Microscilla sp.
<400>68
<210>69
<211>876
<212>DNA
<213> Pseudomonas putida KT2440
<400>69
<210>70
<211>291
<212>PRT
<213> Pseudomonas putida KT2440
<400>70
<210>71
<211>780
<212>DNA
<213>Ralstonia metallidurans
<400>71
<210>72
<211>259
<212>PRT
<213>Ralstonia metallidurans
<400>72
<210>73
<211>876
<212>DNA
<213>Prochlorococcus marinus
<400>73
<210>74
<211>291
<212>PRT
<213>Prochlorococcus marinus
<400>74
<210>75
<211>1017
<212>DNA
<213> Caulobacter crescentus CB15
<400>75
<210>76
<211>338
<212>PRT
<213> Caulobacter crescentus CB15
<400>76
<210>77
<211>900
<212>DNA
<213> Mycobacterium tuberculosis H37Rv
<400>77
<210>78
<211>299
<212>PRT
<213> Mycobacterium tuberculosis H37Rv
<400>78
<210>79
<211>7
<212>PRT
<213> Artificial
<220>
<223> conserved domain in prokaryotes and eukaryotes
<220>
<221>DOMAIN
<222>(1)..(7)
<223> conserved Domain
<220>
<221>MISC_FEATURE
<222>(3)..(4)
<223> any amino acid
<220>
<221>MISC_FEATURE
<222>(6)..(6)
<223> any amino acid
<220>
<221>MISC_FEATURE
<222>(6)..(6)
<223> Gly or Ala
<400>79
<210>80
<211>630
<212>DNA
<213>Oncorhynchus mykiss
<400>80
<210>81
<211>655
<212>DNA
<213> Danio rerio
<220>
<221>misc_feature
<222>(590)..(590)
<223> n is a, c, g or t
<220>
<221>misc_feature
<222>(626)..(626)
<223> n is a, c, g or t
<400>81
<210>82
<211>773
<212>DNA
<213>Oryzias latipes
<220>
<221>misc_feature
<222>(690)..(690)
<223> n is a, c, g or t
<220>
<221>misc_feature
<222>(755)..(755)
<223> n is a, c, g or t
<400>82
<210>83
<211>566
<212>DNA
<213> Xenopus laevis
<220>
<221>misc_feature
<222>(6)..(6)
<223> n is a, c, g or t
<220>
<221>misc_feature
<222>(47)..(47)
<223> n is a, c, g or t
<220>
<221>misc_feature
<222>(81)..(81)
<223> n is a, c, g or t
<400>83
<210>84
<211>647
<212>DNA
<213>Silurana tropicalis
<400>84
<210>85
<211>636
<212>DNA
<213> Salmo salar
<400>85
<210>86
<211>415
<212>DNA
<213> Sus scrofa
<400>86
<210>87
<211>595
<212>DNA
<213> Bos taurus
<400>87
<210>88
<211>1611
<212>DNA
<213> Homo sapiens
<220>
<221>CDS
<222>(1)..(1608)
<223>hSULF3
<400>88
<210>89
<211>536
<212>PRT
<213> Homo sapiens
<400>89
<210>90
<211>1722
<212>DNA
<213> Homo sapiens
<220>
<221>CDS
<222>(1)..(1719)
<223>hSULF4
<400>90
<210>91
<211>573
<212>PRT
<213> Homo sapiens
<400>91
<210>92
<211>1710
<212>DNA
<213> Homo sapiens
<220>
<221>CDS
<222>(1)..(1707)
<223>hSULF5
<400>92
<210>93
<211>569
<212>PRT
<213> Homo sapiens
<400>93
<210>94
<211>2067
<212>DNA
<213> Homo sapiens
<220>
<221>CDS
<222>(1)..(2064)
<223>hSULF6
<400>94
<210>95
<211>688
<212>PRT
<213> Homo sapiens
<400>95
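As a rough consistency check on the listing above, the annotated CDS ranges (<222>) can be compared with the lengths (<211>) of the protein entries that follow them. The sketch below assumes that each CDS-annotated DNA entry is translated into the next-numbered PRT entry and that the CDS range excludes the stop codon; both are assumptions made for illustration, and the helper name is hypothetical.

```python
def cds_codons(start, end):
    """Return the number of codons in a 1-based, inclusive CDS range."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return length // 3


# (CDS start, CDS end, <211> of the assumed paired PRT entry), copied from the listing.
entries = {
    "SEQ ID NO: 1 -> 2": (20, 1141, 374),
    "SEQ ID NO: 88 -> 89": (1, 1608, 536),
    "SEQ ID NO: 90 -> 91": (1, 1719, 573),
    "SEQ ID NO: 92 -> 93": (1, 1707, 569),
    "SEQ ID NO: 94 -> 95": (1, 2064, 688),
}

# Map each pairing to True when codon count and protein length agree.
results = {
    name: cds_codons(start, end) == protein_len
    for name, (start, end, protein_len) in entries.items()
}

for name, ok in results.items():
    print(name, "consistent" if ok else "MISMATCH")
```

The same check can be extended to other DNA/PRT pairs only where a CDS range is annotated; entries without a <221>CDS feature carry no coordinates to compare.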
Claims (31)
1. A pharmaceutical composition comprising:
sulfatase produced by cells for the treatment of sulfatase deficiency, and
a pharmaceutically acceptable carrier,
wherein the cell has been contacted with an agent comprising an isolated polypeptide selected from the group consisting of:
has a sequence similar to SEQ ID NO: 2 and has Cα-formylglycine generating activity (FGE); and
comprises a sequence selected from SEQ ID NO: 2, 5, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76 and 78.
2. A pharmaceutical composition comprising:
sulfatase produced by cells for the treatment of sulfatase deficiency, and
a pharmaceutically acceptable carrier,
wherein the cell has been contacted with an agent comprising an isolated nucleic acid molecule selected from the group consisting of:
(a) as shown in SEQ ID NO: 1, wherein the unique fragment of the nucleotide sequence set forth in SEQ ID NO: 1 encodes the amino acid sequence of SEQ ID NO: 2, and
(b) the complement of (a).
3. Use of a sulfatase enzyme that has been contacted with a formylglycine generating enzyme in the manufacture of a medicament for increasing sulfatase specific activity in a subject, wherein the formylglycine generating enzyme is encoded by a nucleic acid molecule selected from the group consisting of:
(a) as shown in SEQ ID NO: 1, wherein the unique fragment of the nucleotide sequence set forth in SEQ ID NO: 1 encodes the amino acid sequence of SEQ ID NO: 2, and
(b) the complement of (a).
4. Use of a nucleic acid molecule in the manufacture of a medicament for increasing Cα-formylglycine generating activity in a cell, wherein the nucleic acid molecule is selected from the group consisting of:
(a) as shown in SEQ ID NO: 1, wherein the unique fragment of the nucleotide sequence set forth in SEQ ID NO: 1 encodes the amino acid sequence of SEQ ID NO: 2, and
(b) the complement of (a).
5. A pharmaceutical composition comprising:
an agent comprising an isolated nucleic acid molecule in a pharmaceutically effective amount for treating multiple sulfatase deficiency, and a pharmaceutically acceptable carrier, wherein the isolated nucleic acid molecule is selected from the group consisting of:
(a) as shown in SEQ ID NO: 1, wherein the unique fragment of the nucleotide sequence set forth in SEQ ID NO: 1 encodes the amino acid sequence of SEQ ID NO: 2, and
(b) the complement of (a).
6. A method of identifying an agent useful for modulating Cα-formylglycine generating activity, comprising:
(a) contacting a molecule having Cα-formylglycine generating activity with a candidate agent,
(b) measuring the Cα-formylglycine generating activity of said molecule, and
(c) comparing the measured Cα-formylglycine generating activity of the molecule to a control to determine whether the candidate agent modulates the Cα-formylglycine generating activity of the molecule,
wherein the molecule is a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO: 2, 5, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76 and 78.
7. Use of an agent that selectively binds to a polypeptide comprising SEQ ID NO: 2 in the manufacture of a medicament for diagnosing multiple sulfatase deficiency in a subject.
8. Use of an agent that selectively binds to a polypeptide comprising SEQ ID NO: 2 in the manufacture of a medicament for diagnosing a physiological condition characterized by aberrant expression of a nucleic acid molecule or expression product thereof, wherein said physiological condition is multiple sulfatase deficiency.
9. Comprises the amino acid sequence of SEQ ID NO: 2 in the manufacture of a medicament for determining multiple sulfatase deficiency in a sample from a subject, wherein the multiple sulfatase deficiency is characterized by abnormal expression of the polypeptide.
10. The use of claim 9, wherein the sample is a biological fluid or tissue.
11. The use of claim 9, further comprising contacting the sample with a detectable agent selected from the group consisting of:
(a) an antibody that selectively binds to the polypeptide, and
(b) a polypeptide that binds to the antibody of (a).
12. The use of claim 11, wherein the antibody or polypeptide is labeled with a radiolabel or an enzyme.
13. A kit comprising a package, the package comprising:
(i) a reagent: selected from the group consisting of SEQ ID NO: 2 and having Cα-formylglycine generating activity (FGE), and
(ii) a second reagent: which selectively binds to a polypeptide or peptide thereof selected from the group consisting of: iduronate-2-sulfatase, sulfamidase, N-acetylgalactosamine 6-sulfatase, N-acetylglucosamine 6-sulfatase, arylsulfatase A, arylsulfatase B, arylsulfatase C, arylsulfatase D, arylsulfatase E, arylsulfatase F, arylsulfatase G, HSulf-1, HSulf-2, HSulf-3, HSulf-4, HSulf-5 and HSulf-6,
and a control for comparing the measured value of binding of the second agent and the polypeptide or peptide thereof.
14. Use of an agent that modulates Cα-formylglycine generating activity in the manufacture of a medicament for treating multiple sulfatase deficiency in a subject, wherein the agent that modulates Cα-formylglycine generating activity is a polypeptide selected from the group consisting of:
has a sequence similar to SEQ ID NO: 2 and has Cα-formylglycine generating activity (FGE); and
comprises a sequence selected from SEQ ID NO: 2, 5, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76 and 78.
15. The use of claim 14, wherein the medicament further comprises an agent selected from the group consisting of: nucleic acid molecules encoding iduronate-2-sulfatase, sulfamidase, N-acetylgalactosamine 6-sulfatase, N-acetylglucosamine 6-sulfatase, arylsulfatase A, arylsulfatase B, arylsulfatase C, arylsulfatase D, arylsulfatase E, arylsulfatase F, arylsulfatase G, HSulf-1, HSulf-2, HSulf-3, HSulf-4, HSulf-5, HSulf-6, and expression products of said nucleic acid molecules.
16. Use of an isolated polypeptide in the manufacture of a medicament for increasing Cα-formylglycine generating activity in a subject, wherein the isolated polypeptide is selected from the group consisting of:
has a sequence similar to SEQ ID NO: 2 and has Cα-formylglycine generating activity (FGE); and
comprises a sequence selected from SEQ ID NO: 2, 5, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76 and 78.
17. Use of a polypeptide selected from the group consisting of
has a sequence similar to SEQ ID NO: 2 and has Cα-formylglycine generating activity (FGE); and
comprises a sequence selected from SEQ ID NO: 2, 5, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76 and 78.
18. A pharmaceutical composition comprising:
an agent comprising a pharmaceutically effective amount of an isolated polypeptide selected from the group consisting of
has a sequence similar to SEQ ID NO: 2 and has Cα-formylglycine generating activity (FGE); and
comprises a sequence selected from SEQ ID NO: 2, 5, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76 and 78.
19. Use of an isolated polypeptide selected from the group consisting of
has a sequence similar to SEQ ID NO: 2 and has Cα-formylglycine generating activity (FGE); and
comprises a sequence selected from SEQ ID NO: 2, 5, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76 and 78.
20. The use of claim 19, wherein the cell expresses an endogenous sulfatase.
21. The use of claim 20, wherein the endogenous sulfatase is activated.
22. The use of claim 19, wherein the cell expresses an exogenous sulfatase.
23. The use of any one of claims 19 to 22, wherein the sulfatase is selected from iduronate-2-sulfatase, sulfamidase, N-acetylgalactosamine 6-sulfatase, N-acetylglucosamine 6-sulfatase, arylsulfatase A, arylsulfatase B, arylsulfatase C, arylsulfatase D, arylsulfatase E, arylsulfatase F, arylsulfatase G, HSulf-1, HSulf-2, HSulf-3, HSulf-4, HSulf-5 and HSulf-6.
24. The use of claim 19, wherein the cell is a mammalian cell.
25. A kit comprising a package, the package comprising:
an agent that selectively binds to a nucleic acid molecule encoding FGE or an expression product thereof, and
a control for comparison to a measure of binding of said agent and said isolated nucleic acid molecule, which is selected from the group consisting of:
(a) as shown in SEQ ID NO: 1, wherein said unique fragment of the nucleotide sequence set forth in SEQ ID NO: 1 encodes the amino acid sequence of SEQ ID NO: 2, and (b) the complement of (a); and
a second agent that selectively binds to a polypeptide or peptide thereof selected from the group consisting of: iduronate-2-sulfatase, sulfamidase, N-acetylgalactosamine 6-sulfatase, N-acetylglucosamine 6-sulfatase, arylsulfatase A, arylsulfatase B, arylsulfatase C, arylsulfatase D, arylsulfatase E, arylsulfatase F, arylsulfatase G, HSulf-1, HSulf-2, HSulf-3, HSulf-4, HSulf-5, and HSulf-6, and a control for comparing a measure of binding of the second agent to the polypeptide or peptide thereof.
26. Use of an agent that modulates Cα-formylglycine generating activity in the manufacture of a medicament for treating multiple sulfatase deficiency in a subject, wherein the agent that modulates Cα-formylglycine generating activity is a nucleic acid molecule selected from the group consisting of:
(a) as shown in SEQ ID NO: 1, wherein said unique fragment of the nucleotide sequence set forth in SEQ ID NO: 1 encodes the amino acid sequence of SEQ ID NO: 2, and
(b) the complement of (a).
27. Use of an agent that modulates Cα-formylglycine generating activity in the manufacture of a medicament for treating multiple sulfatase deficiency in a subject, wherein the agent that modulates Cα-formylglycine generating activity is a peptide encoded by a nucleic acid molecule selected from the group consisting of:
(a) as shown in SEQ ID NO: 1, wherein said unique fragment of the nucleotide sequence set forth in SEQ ID NO: 1 encodes the amino acid sequence of SEQ ID NO: 2, and
(b) the complement of (a).
28. Use of an agent that modulates Cα-formylglycine generating activity in the manufacture of a medicament for treating multiple sulfatase deficiency in a subject, wherein the agent that modulates Cα-formylglycine generating activity is produced by a cell expressing an FGE nucleic acid molecule selected from the group consisting of:
(a) as shown in SEQ ID NO: 1, wherein said unique fragment of a nucleotide sequence set forth in SEQ ID NO: 1 encodes the nucleotide sequence of SEQ ID NO: 2, and
(b) the complement of (a).
29. Use of an isolated nucleic acid molecule selected from the group consisting of:
(a) as shown in SEQ ID NO: 1, wherein said unique fragment of a nucleotide sequence set forth in SEQ ID NO: 1 encodes the nucleotide sequence of SEQ ID NO: 2, and
(b) the complement of (a).
30. Use of an agent that modulates Cα-formylglycine generating activity in an effective amount to increase Cα-formylglycine generating activity in a subject in the manufacture of a medicament for treating multiple sulfatase deficiency in the subject, wherein the agent that modulates Cα-formylglycine generating activity is a sense nucleic acid molecule selected from the group consisting of:
(a) as shown in SEQ ID NO: 1, wherein said unique fragment of a nucleotide sequence set forth in SEQ ID NO: 1 encodes the nucleotide sequence of SEQ ID NO: 2, and
(b) the complement of (a).
31. Use of an agent that modulates Cα-formylglycine generating activity in an effective amount to increase Cα-formylglycine generating activity in a subject in the manufacture of a medicament for treating multiple sulfatase deficiency in the subject, wherein the agent that modulates Cα-formylglycine generating activity is an isolated polypeptide encoded by a nucleic acid molecule selected from the group consisting of:
(a) as shown in SEQ ID NO: 1, wherein said unique fragment of a nucleotide sequence set forth in SEQ ID NO: 1 encodes the nucleotide sequence of SEQ ID NO: 2, and
(b) the complement of (a).
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US60/447,747 | 2003-02-11 | | |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| HK1129842A (en) | 2009-12-11 |