US8013137B2 - Modified enteropeptidase protein - Google Patents
Modified enteropeptidase protein
- Publication number
- US8013137B2 (application US11/973,157)
- Authority
- US
- United States
- Prior art keywords
- seq
- enteropeptidase
- asp
- isolated
- amino acid
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related, expires
Concepts
- 108010013369 Enteropeptidase Proteins 0.000 title claims abstract description 236
- 102100029727 Enteropeptidase Human genes 0.000 title claims abstract description 175
- 108090000765 processed proteins & peptides Proteins 0.000 claims abstract description 247
- 102000004196 processed proteins & peptides Human genes 0.000 claims abstract description 212
- 229920001184 polypeptide Polymers 0.000 claims abstract description 210
- 108090000623 proteins and genes Proteins 0.000 claims abstract description 133
- 102000004169 proteins and genes Human genes 0.000 claims abstract description 100
- 238000003776 cleavage reaction Methods 0.000 claims abstract description 84
- 125000003729 nucleotide group Chemical group 0.000 claims abstract description 83
- 239000002773 nucleotide Substances 0.000 claims abstract description 82
- 238000000034 method Methods 0.000 claims abstract description 70
- 230000007017 scission Effects 0.000 claims abstract description 70
- 102000037865 fusion proteins Human genes 0.000 claims abstract description 57
- 108020001507 fusion proteins Proteins 0.000 claims abstract description 57
- 239000013598 vector Substances 0.000 claims abstract description 33
- 150000007523 nucleic acids Chemical class 0.000 claims description 148
- 102000039446 nucleic acids Human genes 0.000 claims description 127
- 108020004707 nucleic acids Proteins 0.000 claims description 127
- 125000003275 alpha amino acid group Chemical group 0.000 claims description 88
- 230000000694 effects Effects 0.000 claims description 65
- 239000000758 substrate Substances 0.000 claims description 62
- 210000004027 cell Anatomy 0.000 claims description 61
- 238000006467 substitution reaction Methods 0.000 claims description 57
- 102000004190 Enzymes Human genes 0.000 claims description 47
- 108090000790 Enzymes Proteins 0.000 claims description 47
- 108010060175 trypsinogen activation peptide Proteins 0.000 claims description 44
- 102000007056 Recombinant Fusion Proteins Human genes 0.000 claims description 42
- 108010008281 Recombinant Fusion Proteins Proteins 0.000 claims description 42
- 239000012634 fragment Substances 0.000 claims description 41
- 230000002797 proteolythic effect Effects 0.000 claims description 39
- 150000001413 amino acids Chemical class 0.000 claims description 31
- FAPWRFPIFSIZLT-UHFFFAOYSA-M Sodium chloride Chemical compound [Na+].[Cl-] FAPWRFPIFSIZLT-UHFFFAOYSA-M 0.000 claims description 28
- 125000000539 amino acid group Chemical group 0.000 claims description 24
- 108091028043 Nucleic acid sequence Proteins 0.000 claims description 21
- 230000000295 complement effect Effects 0.000 claims description 21
- 230000035772 mutation Effects 0.000 claims description 21
- 238000002360 preparation method Methods 0.000 claims description 15
- 241000588724 Escherichia coli Species 0.000 claims description 14
- 239000011780 sodium chloride Substances 0.000 claims description 14
- 230000001580 bacterial effect Effects 0.000 claims description 13
- 102220643808 39S ribosomal protein L9, mitochondrial_K63R_mutation Human genes 0.000 claims description 12
- 101000801481 Homo sapiens Tissue-type plasminogen activator Proteins 0.000 claims description 10
- 102100034870 Kallikrein-8 Human genes 0.000 claims description 10
- UXVMQQNJUSDDNG-UHFFFAOYSA-L Calcium chloride Chemical compound [Cl-].[Cl-].[Ca+2] UXVMQQNJUSDDNG-UHFFFAOYSA-L 0.000 claims description 7
- 239000001110 calcium chloride Substances 0.000 claims description 7
- 229910001628 calcium chloride Inorganic materials 0.000 claims description 7
- 238000012217 deletion Methods 0.000 claims description 7
- 230000037430 deletion Effects 0.000 claims description 7
- 238000004519 manufacturing process Methods 0.000 claims description 7
- 102220643818 39S ribosomal protein L9, mitochondrial_K63E_mutation Human genes 0.000 claims description 6
- 102000010631 Kininogens Human genes 0.000 claims description 6
- 108010077861 Kininogens Proteins 0.000 claims description 6
- 108010085895 Laminin Proteins 0.000 claims description 6
- 102220473356 Ubiquitin-conjugating enzyme E2 D2_K63E_mutation Human genes 0.000 claims description 6
- 238000007792 addition Methods 0.000 claims description 6
- 102220040586 rs61749695 Human genes 0.000 claims description 6
- URNKXHHGSVKUKV-HJOGWXRNSA-N (2s)-n-[(2s)-1-[[(2s)-5-(diaminomethylideneamino)-2-[(4-methyl-2-oxochromen-7-yl)amino]pentanoyl]amino]-1-oxo-3-phenylpropan-2-yl]pyrrolidine-2-carboxamide Chemical compound C([C@@H](C(=O)NC(=O)[C@H](CCCN=C(N)N)NC1=CC=2OC(=O)C=C(C=2C=C1)C)NC(=O)[C@H]1NCCC1)C1=CC=CC=C1 URNKXHHGSVKUKV-HJOGWXRNSA-N 0.000 claims description 5
- 108020004705 Codon Proteins 0.000 claims description 5
- 102100037362 Fibronectin Human genes 0.000 claims description 5
- 108010067306 Fibronectins Proteins 0.000 claims description 5
- 108010028105 prolyl-phenylalanyl-arginine-4-methylcoumaryl-7-amide Proteins 0.000 claims description 5
- 102000008946 Fibrinogen Human genes 0.000 claims description 4
- 108010049003 Fibrinogen Proteins 0.000 claims description 4
- 210000004102 animal cell Anatomy 0.000 claims description 4
- 229940012952 fibrinogen Drugs 0.000 claims description 4
- 230000002538 fungal effect Effects 0.000 claims description 4
- 108010010803 Gelatin Proteins 0.000 claims description 3
- 238000012258 culturing Methods 0.000 claims description 3
- 239000008273 gelatin Substances 0.000 claims description 3
- 229920000159 gelatin Polymers 0.000 claims description 3
- 235000019322 gelatine Nutrition 0.000 claims description 3
- 235000011852 gelatine desserts Nutrition 0.000 claims description 3
- 101001012262 Bos taurus Enteropeptidase Proteins 0.000 claims description 2
- 101000627872 Homo sapiens 72 kDa type IV collagenase Proteins 0.000 claims 1
- 101710176225 Kallikrein-8 Proteins 0.000 claims 1
- 102100033571 Tissue-type plasminogen activator Human genes 0.000 claims 1
- 102000053150 human MMP2 Human genes 0.000 claims 1
- 102000040430 polynucleotide Human genes 0.000 abstract description 38
- 108091033319 polynucleotide Proteins 0.000 abstract description 38
- 239000002157 polynucleotide Substances 0.000 abstract description 38
- 230000002255 enzymatic effect Effects 0.000 abstract description 5
- 241000276569 Oryzias latipes Species 0.000 description 145
- 235000018102 proteins Nutrition 0.000 description 86
- 239000013615 primer Substances 0.000 description 58
- 102000035195 Peptidases Human genes 0.000 description 51
- 108091005804 Peptidases Proteins 0.000 description 51
- 239000004365 Protease Substances 0.000 description 50
- 108020004414 DNA Proteins 0.000 description 46
- 229940088598 enzyme Drugs 0.000 description 46
- 108010027252 Trypsinogen Proteins 0.000 description 41
- 102000018690 Trypsinogen Human genes 0.000 description 41
- 230000014509 gene expression Effects 0.000 description 37
- 210000000936 intestine Anatomy 0.000 description 37
- 239000000047 product Substances 0.000 description 37
- FWMNVWWHGCHHJJ-SKKKGAJSSA-N 4-amino-1-[(2r)-6-amino-2-[[(2r)-2-[[(2r)-2-[[(2r)-2-amino-3-phenylpropanoyl]amino]-3-phenylpropanoyl]amino]-4-methylpentanoyl]amino]hexanoyl]piperidine-4-carboxylic acid Chemical compound C([C@H](C(=O)N[C@H](CC(C)C)C(=O)N[C@H](CCCCN)C(=O)N1CCC(N)(CC1)C(O)=O)NC(=O)[C@H](N)CC=1C=CC=CC=1)C1=CC=CC=C1 FWMNVWWHGCHHJJ-SKKKGAJSSA-N 0.000 description 32
- 239000002299 complementary DNA Substances 0.000 description 32
- 229940024606 amino acid Drugs 0.000 description 27
- 235000001014 amino acid Nutrition 0.000 description 27
- 230000000692 anti-sense effect Effects 0.000 description 24
- 239000013604 expression vector Substances 0.000 description 20
- 108020004999 messenger RNA Proteins 0.000 description 18
- 101001091371 Homo sapiens Kallikrein-8 Proteins 0.000 description 16
- 239000000523 sample Substances 0.000 description 16
- 101000929809 Bos taurus Acyl-CoA-binding protein Proteins 0.000 description 15
- 108091026890 Coding region Proteins 0.000 description 15
- 239000013612 plasmid Substances 0.000 description 15
- 102000012479 Serine Proteases Human genes 0.000 description 14
- 108010022999 Serine Proteases Proteins 0.000 description 14
- 238000004458 analytical method Methods 0.000 description 14
- 239000012528 membrane Substances 0.000 description 14
- 210000001672 ovary Anatomy 0.000 description 14
- 241000283690 Bos taurus Species 0.000 description 13
- 108091032973 (ribonucleotides)n+m Proteins 0.000 description 12
- QKNYBSVHEMOAJP-UHFFFAOYSA-N 2-amino-2-(hydroxymethyl)propane-1,3-diol;hydron;chloride Chemical compound Cl.OCC(N)(CO)CO QKNYBSVHEMOAJP-UHFFFAOYSA-N 0.000 description 12
- 210000004379 membrane Anatomy 0.000 description 12
- 108090000631 Trypsin Proteins 0.000 description 11
- 102000004142 Trypsin Human genes 0.000 description 11
- 238000009396 hybridization Methods 0.000 description 11
- 239000012588 trypsin Substances 0.000 description 11
- 230000003197 catalytic effect Effects 0.000 description 10
- 238000002415 sodium dodecyl sulfate polyacrylamide gel electrophoresis Methods 0.000 description 10
- 238000001262 western blot Methods 0.000 description 10
- 241000251468 Actinopterygii Species 0.000 description 9
- 229920002684 Sepharose Polymers 0.000 description 9
- 238000006243 chemical reaction Methods 0.000 description 9
- 235000019688 fish Nutrition 0.000 description 9
- 238000003757 reverse transcription PCR Methods 0.000 description 9
- 108091034117 Oligonucleotide Proteins 0.000 description 8
- 230000004071 biological effect Effects 0.000 description 8
- 239000000872 buffer Substances 0.000 description 8
- 238000010367 cloning Methods 0.000 description 8
- 238000001514 detection method Methods 0.000 description 8
- 239000000284 extract Substances 0.000 description 8
- 238000011534 incubation Methods 0.000 description 8
- 102000053602 DNA Human genes 0.000 description 7
- 230000004913 activation Effects 0.000 description 7
- 125000003178 carboxy group Chemical group [H]OC(*)=O 0.000 description 7
- 238000002474 experimental method Methods 0.000 description 7
- 102000057705 human KLK8 Human genes 0.000 description 7
- 238000000746 purification Methods 0.000 description 7
- 230000002829 reductive effect Effects 0.000 description 7
- 238000002741 site-directed mutagenesis Methods 0.000 description 7
- 239000000126 substance Substances 0.000 description 7
- 210000001519 tissue Anatomy 0.000 description 7
- 108020004635 Complementary DNA Proteins 0.000 description 6
- ZHNUHDYFZUAESO-UHFFFAOYSA-N Formamide Chemical compound NC=O ZHNUHDYFZUAESO-UHFFFAOYSA-N 0.000 description 6
- 102000000424 Matrix Metalloproteinase 2 Human genes 0.000 description 6
- 108010016165 Matrix Metalloproteinase 2 Proteins 0.000 description 6
- 238000002105 Southern blotting Methods 0.000 description 6
- 241000251539 Vertebrata <Metazoa> Species 0.000 description 6
- 238000000137 annealing Methods 0.000 description 6
- 210000004899 c-terminal region Anatomy 0.000 description 6
- 210000004978 chinese hamster ovary cell Anatomy 0.000 description 6
- 239000000463 material Substances 0.000 description 6
- 239000000203 mixture Substances 0.000 description 6
- 230000004048 modification Effects 0.000 description 6
- 238000012986 modification Methods 0.000 description 6
- 241000894007 species Species 0.000 description 6
- 238000003786 synthesis reaction Methods 0.000 description 6
- KCXVZYZYPLLWCC-UHFFFAOYSA-N EDTA Chemical compound OC(=O)CN(CC(O)=O)CCN(CC(O)=O)CC(O)=O KCXVZYZYPLLWCC-UHFFFAOYSA-N 0.000 description 5
- 102000000853 LDL receptors Human genes 0.000 description 5
- 108010001831 LDL receptors Proteins 0.000 description 5
- 230000003321 amplification Effects 0.000 description 5
- 230000015572 biosynthetic process Effects 0.000 description 5
- -1 deoxyribonucleotide triphosphates Chemical class 0.000 description 5
- 238000007901 in situ hybridization Methods 0.000 description 5
- 238000000338 in vitro Methods 0.000 description 5
- 239000003112 inhibitor Substances 0.000 description 5
- 238000003780 insertion Methods 0.000 description 5
- 230000037431 insertion Effects 0.000 description 5
- 238000002955 isolation Methods 0.000 description 5
- 108091005485 macrophage scavenger receptors Proteins 0.000 description 5
- 238000003199 nucleic acid amplification method Methods 0.000 description 5
- 102000014452 scavenger receptors Human genes 0.000 description 5
- 238000012216 screening Methods 0.000 description 5
- 239000006228 supernatant Substances 0.000 description 5
- 101000962056 Mytilus edulis Major extrapallial fluid protein Proteins 0.000 description 4
- 238000000636 Northern blotting Methods 0.000 description 4
- 108020004511 Recombinant DNA Proteins 0.000 description 4
- 208000037065 Subacute sclerosing leukoencephalitis Diseases 0.000 description 4
- 206010042297 Subacute sclerosing panencephalitis Diseases 0.000 description 4
- 230000001086 cytosolic effect Effects 0.000 description 4
- 238000005516 engineering process Methods 0.000 description 4
- 210000001035 gastrointestinal tract Anatomy 0.000 description 4
- 230000000670 limiting effect Effects 0.000 description 4
- 238000010369 molecular cloning Methods 0.000 description 4
- 238000002703 mutagenesis Methods 0.000 description 4
- 231100000350 mutagenesis Toxicity 0.000 description 4
- 239000000137 peptide hydrolase inhibitor Substances 0.000 description 4
- 238000003752 polymerase chain reaction Methods 0.000 description 4
- 108091008146 restriction endonucleases Proteins 0.000 description 4
- 239000000243 solution Substances 0.000 description 4
- 238000010186 staining Methods 0.000 description 4
- 210000001550 testis Anatomy 0.000 description 4
- 238000013518 transcription Methods 0.000 description 4
- 230000035897 transcription Effects 0.000 description 4
- 241000972773 Aulopiformes Species 0.000 description 3
- 241000894006 Bacteria Species 0.000 description 3
- 108010038061 Chymotrypsinogen Proteins 0.000 description 3
- 239000003155 DNA primer Substances 0.000 description 3
- 238000001712 DNA sequencing Methods 0.000 description 3
- 108010014303 DNA-directed DNA polymerase Proteins 0.000 description 3
- 102000016928 DNA-directed DNA polymerase Human genes 0.000 description 3
- SHIBSTMRCDJXLN-UHFFFAOYSA-N Digoxigenin Natural products C1CC(C2C(C3(C)CCC(O)CC3CC2)CC2O)(O)C2(C)C1C1=CC(=O)OC1 SHIBSTMRCDJXLN-UHFFFAOYSA-N 0.000 description 3
- 102000010911 Enzyme Precursors Human genes 0.000 description 3
- 108010062466 Enzyme Precursors Proteins 0.000 description 3
- 108700026244 Open Reading Frames Proteins 0.000 description 3
- 229940124158 Protease/peptidase inhibitor Drugs 0.000 description 3
- 238000010240 RT-PCR analysis Methods 0.000 description 3
- 240000004808 Saccharomyces cerevisiae Species 0.000 description 3
- 235000014680 Saccharomyces cerevisiae Nutrition 0.000 description 3
- MTCFGRXMJLQNBG-UHFFFAOYSA-N Serine Natural products OCC(N)C(O)=O MTCFGRXMJLQNBG-UHFFFAOYSA-N 0.000 description 3
- 108010048241 acetamidase Proteins 0.000 description 3
- 230000003024 amidolytic effect Effects 0.000 description 3
- 108091007433 antigens Proteins 0.000 description 3
- 102000036639 antigens Human genes 0.000 description 3
- 238000003556 assay Methods 0.000 description 3
- 230000001413 cellular effect Effects 0.000 description 3
- 239000012707 chemical precursor Substances 0.000 description 3
- 230000003247 decreasing effect Effects 0.000 description 3
- 238000004925 denaturation Methods 0.000 description 3
- 230000036425 denaturation Effects 0.000 description 3
- 238000000502 dialysis Methods 0.000 description 3
- 230000029087 digestion Effects 0.000 description 3
- QONQRTHLHBTMGP-UHFFFAOYSA-N digitoxigenin Natural products CC12CCC(C3(CCC(O)CC3CC3)C)C3C11OC1CC2C1=CC(=O)OC1 QONQRTHLHBTMGP-UHFFFAOYSA-N 0.000 description 3
- SHIBSTMRCDJXLN-KCZCNTNESA-N digoxigenin Chemical compound C1([C@@H]2[C@@]3([C@@](CC2)(O)[C@H]2[C@@H]([C@@]4(C)CC[C@H](O)C[C@H]4CC2)C[C@H]3O)C)=CC(=O)OC1 SHIBSTMRCDJXLN-KCZCNTNESA-N 0.000 description 3
- 238000001962 electrophoresis Methods 0.000 description 3
- 210000003527 eukaryotic cell Anatomy 0.000 description 3
- 230000005714 functional activity Effects 0.000 description 3
- 230000004927 fusion Effects 0.000 description 3
- 239000000499 gel Substances 0.000 description 3
- 238000002523 gelfiltration Methods 0.000 description 3
- 238000010353 genetic engineering Methods 0.000 description 3
- 239000001963 growth medium Substances 0.000 description 3
- HNDVDQJCIGZPNO-UHFFFAOYSA-N histidine Natural products OC(=O)C(N)CC1=CN=CN1 HNDVDQJCIGZPNO-UHFFFAOYSA-N 0.000 description 3
- 238000002991 immunohistochemical analysis Methods 0.000 description 3
- 230000001976 improved effect Effects 0.000 description 3
- 238000010348 incorporation Methods 0.000 description 3
- 230000001939 inductive effect Effects 0.000 description 3
- 210000004347 intestinal mucosa Anatomy 0.000 description 3
- 244000005700 microbiome Species 0.000 description 3
- 108091027963 non-coding RNA Proteins 0.000 description 3
- 102000042567 non-coding RNA Human genes 0.000 description 3
- 125000002924 primary amino group Chemical group [H]N([H])* 0.000 description 3
- 239000002987 primer (paints) Substances 0.000 description 3
- 238000010188 recombinant method Methods 0.000 description 3
- 230000009467 reduction Effects 0.000 description 3
- 230000000717 retained effect Effects 0.000 description 3
- 235000019515 salmon Nutrition 0.000 description 3
- 238000002864 sequence alignment Methods 0.000 description 3
- 239000001509 sodium citrate Substances 0.000 description 3
- NLJMYIDDQXHKNR-UHFFFAOYSA-K sodium citrate Chemical compound O.O.[Na+].[Na+].[Na+].[O-]C(=O)CC(O)(CC([O-])=O)C([O-])=O NLJMYIDDQXHKNR-UHFFFAOYSA-K 0.000 description 3
- 230000008685 targeting Effects 0.000 description 3
- 230000002103 transcriptional effect Effects 0.000 description 3
- 238000001890 transfection Methods 0.000 description 3
- 238000012546 transfer Methods 0.000 description 3
- 102000040650 (ribonucleotides)n+m Human genes 0.000 description 2
- KDCGOANMDULRCW-UHFFFAOYSA-N 7H-purine Chemical compound N1=CNC2=NC=NC2=C1 KDCGOANMDULRCW-UHFFFAOYSA-N 0.000 description 2
- 101710197633 Actin-1 Proteins 0.000 description 2
- 102000007469 Actins Human genes 0.000 description 2
- 108010085238 Actins Proteins 0.000 description 2
- 229920000936 Agarose Polymers 0.000 description 2
- 108091003079 Bovine Serum Albumin Proteins 0.000 description 2
- 241000193403 Clostridium Species 0.000 description 2
- 102100031780 Endonuclease Human genes 0.000 description 2
- 241000206602 Eukaryota Species 0.000 description 2
- 101150092780 GSP1 gene Proteins 0.000 description 2
- 101150035751 GSP2 gene Proteins 0.000 description 2
- WHUUTDBJXJRKMK-UHFFFAOYSA-N Glutamic acid Natural products OC(=O)C(N)CCC(O)=O WHUUTDBJXJRKMK-UHFFFAOYSA-N 0.000 description 2
- DHMQDGOQFOQNFH-UHFFFAOYSA-N Glycine Chemical compound NCC(O)=O DHMQDGOQFOQNFH-UHFFFAOYSA-N 0.000 description 2
- QNAYBMKLOCPYGJ-REOHCLBHSA-N L-alanine Chemical compound C[C@H](N)C(O)=O QNAYBMKLOCPYGJ-REOHCLBHSA-N 0.000 description 2
- CKLJMWTZIZZHCS-REOHCLBHSA-N L-aspartic acid Chemical compound OC(=O)[C@@H](N)CC(O)=O CKLJMWTZIZZHCS-REOHCLBHSA-N 0.000 description 2
- TWRXJAOTZQYOKJ-UHFFFAOYSA-L Magnesium chloride Chemical compound [Mg+2].[Cl-].[Cl-] TWRXJAOTZQYOKJ-UHFFFAOYSA-L 0.000 description 2
- 241000699666 Mus <mouse, genus> Species 0.000 description 2
- 239000004677 Nylon Substances 0.000 description 2
- 238000012408 PCR amplification Methods 0.000 description 2
- 108010076504 Protein Sorting Signals Proteins 0.000 description 2
- 108020004518 RNA Probes Proteins 0.000 description 2
- 239000003391 RNA probe Substances 0.000 description 2
- 108010092799 RNA-directed DNA polymerase Proteins 0.000 description 2
- 101000702488 Rattus norvegicus High affinity cationic amino acid transporter 1 Proteins 0.000 description 2
- 241000316848 Rhodococcus <scale insect> Species 0.000 description 2
- 239000012505 Superdex™ Substances 0.000 description 2
- 102000019400 Tissue-type plasminogen activator Human genes 0.000 description 2
- XSQUKJJJFZCRTK-UHFFFAOYSA-N Urea Chemical compound NC(N)=O XSQUKJJJFZCRTK-UHFFFAOYSA-N 0.000 description 2
- DRTQHJPVMGBUCF-XVFCMESISA-N Uridine Chemical compound O[C@@H]1[C@H](O)[C@@H](CO)O[C@H]1N1C(=O)NC(=O)C=C1 DRTQHJPVMGBUCF-XVFCMESISA-N 0.000 description 2
- 241000700605 Viruses Species 0.000 description 2
- 239000011543 agarose gel Substances 0.000 description 2
- 235000004279 alanine Nutrition 0.000 description 2
- 239000000427 antigen Substances 0.000 description 2
- 238000013459 approach Methods 0.000 description 2
- 230000003115 biocidal effect Effects 0.000 description 2
- 235000011148 calcium chloride Nutrition 0.000 description 2
- 238000012512 characterization method Methods 0.000 description 2
- 239000003795 chemical substances by application Substances 0.000 description 2
- 239000013611 chromosomal DNA Substances 0.000 description 2
- 210000000349 chromosome Anatomy 0.000 description 2
- 150000001875 compounds Chemical class 0.000 description 2
- 238000010276 construction Methods 0.000 description 2
- 210000000172 cytosol Anatomy 0.000 description 2
- 230000002183 duodenal effect Effects 0.000 description 2
- 210000001842 enterocyte Anatomy 0.000 description 2
- 239000012091 fetal bovine serum Substances 0.000 description 2
- 239000013505 freshwater Substances 0.000 description 2
- 238000001641 gel filtration chromatography Methods 0.000 description 2
- 238000007804 gelatin zymography Methods 0.000 description 2
- 235000013922 glutamic acid Nutrition 0.000 description 2
- 239000004220 glutamic acid Substances 0.000 description 2
- 238000010438 heat treatment Methods 0.000 description 2
- 229940106780 human fibrinogen Drugs 0.000 description 2
- 229910052739 hydrogen Inorganic materials 0.000 description 2
- 230000007062 hydrolysis Effects 0.000 description 2
- 238000006460 hydrolysis reaction Methods 0.000 description 2
- 230000005764 inhibitory process Effects 0.000 description 2
- 230000000968 intestinal effect Effects 0.000 description 2
- 229960000310 isoleucine Drugs 0.000 description 2
- 238000007834 ligase chain reaction Methods 0.000 description 2
- 239000003550 marker Substances 0.000 description 2
- 239000011159 matrix material Substances 0.000 description 2
- 239000002609 medium Substances 0.000 description 2
- 239000002777 nucleoside Substances 0.000 description 2
- 150000003833 nucleoside derivatives Chemical class 0.000 description 2
- 210000004940 nucleus Anatomy 0.000 description 2
- 229920001778 nylon Polymers 0.000 description 2
- 239000008188 pellet Substances 0.000 description 2
- 125000001805 pentosyl group Chemical group 0.000 description 2
- 230000008569 process Effects 0.000 description 2
- 210000001236 prokaryotic cell Anatomy 0.000 description 2
- 230000007026 protein scission Effects 0.000 description 2
- 101150054232 pyrG gene Proteins 0.000 description 2
- 230000001105 regulatory effect Effects 0.000 description 2
- 230000010076 replication Effects 0.000 description 2
- 230000003362 replicative effect Effects 0.000 description 2
- 239000007787 solid Substances 0.000 description 2
- 210000002784 stomach Anatomy 0.000 description 2
- RWQNBRDOKXIBIV-UHFFFAOYSA-N thymine Chemical compound CC1=CNC(=O)NC1=O RWQNBRDOKXIBIV-UHFFFAOYSA-N 0.000 description 2
- 230000009466 transformation Effects 0.000 description 2
- 239000001226 triphosphate Substances 0.000 description 2
- 235000011178 triphosphate Nutrition 0.000 description 2
- MTCFGRXMJLQNBG-REOHCLBHSA-N (2S)-2-Amino-3-hydroxypropansäure Chemical compound OC[C@H](N)C(O)=O MTCFGRXMJLQNBG-REOHCLBHSA-N 0.000 description 1
- MRXDGVXSWIXTQL-HYHFHBMOSA-N (2s)-2-[[(1s)-1-(2-amino-1,4,5,6-tetrahydropyrimidin-6-yl)-2-[[(2s)-4-methyl-1-oxo-1-[[(2s)-1-oxo-3-phenylpropan-2-yl]amino]pentan-2-yl]amino]-2-oxoethyl]carbamoylamino]-3-phenylpropanoic acid Chemical compound C([C@H](NC(=O)N[C@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC=1C=CC=CC=1)C=O)C1NC(N)=NCC1)C(O)=O)C1=CC=CC=C1 MRXDGVXSWIXTQL-HYHFHBMOSA-N 0.000 description 1
- CFOQGBUQTOGYKI-UHFFFAOYSA-N (4-nitrophenyl) 4-(diaminomethylideneamino)benzoate Chemical compound C1=CC(N=C(N)N)=CC=C1C(=O)OC1=CC=C([N+]([O-])=O)C=C1 CFOQGBUQTOGYKI-UHFFFAOYSA-N 0.000 description 1
- UHDGCWIWMRVCDJ-UHFFFAOYSA-N 1-beta-D-Xylofuranosyl-NH-Cytosine Natural products O=C1N=C(N)C=CN1C1C(O)C(O)C(CO)O1 UHDGCWIWMRVCDJ-UHFFFAOYSA-N 0.000 description 1
- RNAMYOYQYRYFQY-UHFFFAOYSA-N 2-(4,4-difluoropiperidin-1-yl)-6-methoxy-n-(1-propan-2-ylpiperidin-4-yl)-7-(3-pyrrolidin-1-ylpropoxy)quinazolin-4-amine Chemical compound N1=C(N2CCC(F)(F)CC2)N=C2C=C(OCCCN3CCCC3)C(OC)=CC2=C1NC1CCN(C(C)C)CC1 RNAMYOYQYRYFQY-UHFFFAOYSA-N 0.000 description 1
- GOJUJUVQIVIZAV-UHFFFAOYSA-N 2-amino-4,6-dichloropyrimidine-5-carbaldehyde Chemical group NC1=NC(Cl)=C(C=O)C(Cl)=N1 GOJUJUVQIVIZAV-UHFFFAOYSA-N 0.000 description 1
- 101710163881 5,6-dihydroxyindole-2-carboxylic acid oxidase Proteins 0.000 description 1
- 241000589220 Acetobacter Species 0.000 description 1
- HRPVXLWXLXDGHG-UHFFFAOYSA-N Acrylamide Chemical compound NC(=O)C=C HRPVXLWXLXDGHG-UHFFFAOYSA-N 0.000 description 1
- 241000186046 Actinomyces Species 0.000 description 1
- 229930024421 Adenine Natural products 0.000 description 1
- GFFGJBXGBJISGV-UHFFFAOYSA-N Adenine Chemical compound NC1=NC=NC2=C1N=CN2 GFFGJBXGBJISGV-UHFFFAOYSA-N 0.000 description 1
- 241000607534 Aeromonas Species 0.000 description 1
- 241000588986 Alcaligenes Species 0.000 description 1
- 108010037870 Anthranilate Synthase Proteins 0.000 description 1
- 108010087765 Antipain Proteins 0.000 description 1
- 108010039627 Aprotinin Proteins 0.000 description 1
- 239000004475 Arginine Substances 0.000 description 1
- DJHGAFSJWGLOIV-UHFFFAOYSA-K Arsenate3- Chemical compound [O-][As]([O-])([O-])=O DJHGAFSJWGLOIV-UHFFFAOYSA-K 0.000 description 1
- DCXYFEDJOCDNAF-UHFFFAOYSA-N Asparagine Natural products OC(=O)C(N)CC(N)=O DCXYFEDJOCDNAF-UHFFFAOYSA-N 0.000 description 1
- 241000228212 Aspergillus Species 0.000 description 1
- 241000351920 Aspergillus nidulans Species 0.000 description 1
- 240000006439 Aspergillus oryzae Species 0.000 description 1
- 235000002247 Aspergillus oryzae Nutrition 0.000 description 1
- 241000589941 Azospirillum Species 0.000 description 1
- 241000193830 Bacillus <bacterium> Species 0.000 description 1
- 241000194108 Bacillus licheniformis Species 0.000 description 1
- 244000063299 Bacillus subtilis Species 0.000 description 1
- 235000014469 Bacillus subtilis Nutrition 0.000 description 1
- 241000606125 Bacteroides Species 0.000 description 1
- 102100030981 Beta-alanine-activating enzyme Human genes 0.000 description 1
- 241000186000 Bifidobacterium Species 0.000 description 1
- 241000589173 Bradyrhizobium Species 0.000 description 1
- 241001453380 Burkholderia Species 0.000 description 1
- 241000186321 Cellulomonas Species 0.000 description 1
- OLVPQBGMUGIKIW-UHFFFAOYSA-N Chymostatin Natural products C=1C=CC=CC=1CC(C=O)NC(=O)C(C(C)CC)NC(=O)C(C1NC(N)=NCC1)NC(=O)NC(C(O)=O)CC1=CC=CC=C1 OLVPQBGMUGIKIW-UHFFFAOYSA-N 0.000 description 1
- 108020004638 Circular DNA Proteins 0.000 description 1
- 241000588923 Citrobacter Species 0.000 description 1
- 108700010070 Codon Usage Proteins 0.000 description 1
- 241001429175 Colitis phage Species 0.000 description 1
- 241000186216 Corynebacterium Species 0.000 description 1
- UHDGCWIWMRVCDJ-PSQAKQOGSA-N Cytidine Natural products O=C1N=C(N)C=CN1[C@@H]1[C@@H](O)[C@@H](O)[C@H](CO)O1 UHDGCWIWMRVCDJ-PSQAKQOGSA-N 0.000 description 1
- 230000007067 DNA methylation Effects 0.000 description 1
- 241000605716 Desulfovibrio Species 0.000 description 1
- 108010010256 Dietary Proteins Proteins 0.000 description 1
- 102000015781 Dietary Proteins Human genes 0.000 description 1
- 238000002965 ELISA Methods 0.000 description 1
- 241000194033 Enterococcus Species 0.000 description 1
- 241000588722 Escherichia Species 0.000 description 1
- 241000186394 Eubacterium Species 0.000 description 1
- 241000605898 Fibrobacter Species 0.000 description 1
- 241000192125 Firmicutes Species 0.000 description 1
- 108010028690 Fish Proteins Proteins 0.000 description 1
- 102000002464 Galactosidases Human genes 0.000 description 1
- 108010093031 Galactosidases Proteins 0.000 description 1
- 108700007698 Genetic Terminator Regions Proteins 0.000 description 1
- 241000626621 Geobacillus Species 0.000 description 1
- 241000589236 Gluconobacter Species 0.000 description 1
- 239000004471 Glycine Substances 0.000 description 1
- 108090000288 Glycoproteins Proteins 0.000 description 1
- 102000003886 Glycoproteins Human genes 0.000 description 1
- 101000636168 Grapevine leafroll-associated virus 3 (isolate United States/NY1) Movement protein p5 Proteins 0.000 description 1
- 108010043121 Green Fluorescent Proteins Proteins 0.000 description 1
- 102000004144 Green Fluorescent Proteins Human genes 0.000 description 1
- UYTPUPDQBNUYGX-UHFFFAOYSA-N Guanine Natural products O=C1NC(N)=NC2=C1N=CN2 UYTPUPDQBNUYGX-UHFFFAOYSA-N 0.000 description 1
- 101150009006 HIS3 gene Proteins 0.000 description 1
- 101100295959 Halobacterium salinarum (strain ATCC 700922 / JCM 11081 / NRC-1) arcB gene Proteins 0.000 description 1
- 101100246753 Halobacterium salinarum (strain ATCC 700922 / JCM 11081 / NRC-1) pyrF gene Proteins 0.000 description 1
- 241001655241 Halochromatium Species 0.000 description 1
- 241000238631 Hexapoda Species 0.000 description 1
- 241000282412 Homo Species 0.000 description 1
- 101000773364 Homo sapiens Beta-alanine-activating enzyme Proteins 0.000 description 1
- 101001027128 Homo sapiens Fibronectin Proteins 0.000 description 1
- 108091092195 Intron Proteins 0.000 description 1
- 241000588748 Klebsiella Species 0.000 description 1
- DCXYFEDJOCDNAF-REOHCLBHSA-N L-asparagine Chemical compound OC(=O)[C@@H](N)CC(N)=O DCXYFEDJOCDNAF-REOHCLBHSA-N 0.000 description 1
- WHUUTDBJXJRKMK-VKHMYHEASA-N L-glutamic acid Chemical compound OC(=O)[C@@H](N)CCC(O)=O WHUUTDBJXJRKMK-VKHMYHEASA-N 0.000 description 1
- ZDXPYRJPNDTMRX-VKHMYHEASA-N L-glutamine Chemical compound OC(=O)[C@@H](N)CCC(N)=O ZDXPYRJPNDTMRX-VKHMYHEASA-N 0.000 description 1
- AGPKZVBTJJNPAG-WHFBIAKZSA-N L-isoleucine Chemical compound CC[C@H](C)[C@H](N)C(O)=O AGPKZVBTJJNPAG-WHFBIAKZSA-N 0.000 description 1
- ROHFNLRQFUQHCH-YFKPBYRVSA-N L-leucine Chemical compound CC(C)C[C@H](N)C(O)=O ROHFNLRQFUQHCH-YFKPBYRVSA-N 0.000 description 1
- COLNVLDHVKWLRT-QMMMGPOBSA-N L-phenylalanine Chemical compound OC(=O)[C@@H](N)CC1=CC=CC=C1 COLNVLDHVKWLRT-QMMMGPOBSA-N 0.000 description 1
- AYFVYJQAPQTCCC-GBXIJSLDSA-N L-threonine Chemical compound C[C@@H](O)[C@H](N)C(O)=O AYFVYJQAPQTCCC-GBXIJSLDSA-N 0.000 description 1
- OUYCCCASQSFEME-QMMMGPOBSA-N L-tyrosine Chemical compound OC(=O)[C@@H](N)CC1=CC=C(O)C=C1 OUYCCCASQSFEME-QMMMGPOBSA-N 0.000 description 1
- KZSNJWFQEVHDMF-BYPYZUCNSA-N L-valine Chemical compound CC(C)[C@H](N)C(O)=O KZSNJWFQEVHDMF-BYPYZUCNSA-N 0.000 description 1
- 241000186660 Lactobacillus Species 0.000 description 1
- 241000194036 Lactococcus Species 0.000 description 1
- ROHFNLRQFUQHCH-UHFFFAOYSA-N Leucine Natural products CC(C)CC(N)C(O)=O ROHFNLRQFUQHCH-UHFFFAOYSA-N 0.000 description 1
- 241000192132 Leuconostoc Species 0.000 description 1
- GDBQQVLCIARPGH-UHFFFAOYSA-N Leupeptin Natural products CC(C)CC(NC(C)=O)C(=O)NC(CC(C)C)C(=O)NC(C=O)CCCN=C(N)N GDBQQVLCIARPGH-UHFFFAOYSA-N 0.000 description 1
- KDXKERNSBIXSRK-UHFFFAOYSA-N Lysine Natural products NCCCCC(N)C(O)=O KDXKERNSBIXSRK-UHFFFAOYSA-N 0.000 description 1
- 239000004472 Lysine Substances 0.000 description 1
- 101150068888 MET3 gene Proteins 0.000 description 1
- 241000124008 Mammalia Species 0.000 description 1
- NPPQSCRMBWNHMW-UHFFFAOYSA-N Meprobamate Chemical compound NC(=O)OCC(C)(CCC)COC(N)=O NPPQSCRMBWNHMW-UHFFFAOYSA-N 0.000 description 1
- 241000699670 Mus sp. Species 0.000 description 1
- 238000005481 NMR spectroscopy Methods 0.000 description 1
- 101100022915 Neurospora crassa (strain ATCC 24698 / 74-OR23-1A / CBS 708.71 / DSM 1257 / FGSC 987) cys-11 gene Proteins 0.000 description 1
- 108090000913 Nitrate Reductases Proteins 0.000 description 1
- 239000000020 Nitrocellulose Substances 0.000 description 1
- 241000187654 Nocardia Species 0.000 description 1
- 102000007981 Ornithine carbamoyltransferase Human genes 0.000 description 1
- 101710113020 Ornithine transcarbamylase, mitochondrial Proteins 0.000 description 1
- 102100037214 Orotidine 5'-phosphate decarboxylase Human genes 0.000 description 1
- 108010055012 Orotidine-5'-phosphate decarboxylase Proteins 0.000 description 1
- 241000283973 Oryctolagus cuniculus Species 0.000 description 1
- 241000179039 Paenibacillus Species 0.000 description 1
- 229930040373 Paraformaldehyde Natural products 0.000 description 1
- 206010034133 Pathogen resistance Diseases 0.000 description 1
- 101800003414 Pro-elastase Proteins 0.000 description 1
- 108010040806 Prolipase Proteins 0.000 description 1
- 241000186429 Propionibacterium Species 0.000 description 1
- 229940096437 Protein S Drugs 0.000 description 1
- CZPWVGJYEJSRLH-UHFFFAOYSA-N Pyrimidine Chemical compound C1=CN=CN=C1 CZPWVGJYEJSRLH-UHFFFAOYSA-N 0.000 description 1
- 108091034057 RNA (poly(A)) Proteins 0.000 description 1
- 238000012228 RNA interference-mediated gene silencing Methods 0.000 description 1
- 241000700159 Rattus Species 0.000 description 1
- 102100034090 Receptor-type tyrosine-protein phosphatase mu Human genes 0.000 description 1
- 101710168849 Receptor-type tyrosine-protein phosphatase mu Proteins 0.000 description 1
- 241000589180 Rhizobium Species 0.000 description 1
- 241000191025 Rhodobacter Species 0.000 description 1
- 101100394989 Rhodopseudomonas palustris (strain ATCC BAA-98 / CGA009) hisI gene Proteins 0.000 description 1
- 241000190967 Rhodospirillum Species 0.000 description 1
- 241000192023 Sarcina Species 0.000 description 1
- 101100022918 Schizosaccharomyces pombe (strain 972 / ATCC 24843) sua1 gene Proteins 0.000 description 1
- 241000863430 Shewanella Species 0.000 description 1
- BLRPTPMANUNPDV-UHFFFAOYSA-N Silane Chemical compound [SiH4] BLRPTPMANUNPDV-UHFFFAOYSA-N 0.000 description 1
- 108020004682 Single-Stranded DNA Proteins 0.000 description 1
- 241001136275 Sphingobacterium Species 0.000 description 1
- 241000736131 Sphingomonas Species 0.000 description 1
- 108091081024 Start codon Proteins 0.000 description 1
- 241000194017 Streptococcus Species 0.000 description 1
- 101100370749 Streptomyces coelicolor (strain ATCC BAA-471 / A3(2) / M145) trpC1 gene Proteins 0.000 description 1
- 241000187391 Streptomyces hygroscopicus Species 0.000 description 1
- 241000282887 Suidae Species 0.000 description 1
- QAOWNCQODCNURD-UHFFFAOYSA-L Sulfate Chemical compound [O-]S([O-])(=O)=O QAOWNCQODCNURD-UHFFFAOYSA-L 0.000 description 1
- 241001134777 Sulfobacillus Species 0.000 description 1
- 241000580834 Sulfurospirillum Species 0.000 description 1
- 101710137500 T7 RNA polymerase Proteins 0.000 description 1
- 108010006785 Taq Polymerase Proteins 0.000 description 1
- 239000004098 Tetracycline Substances 0.000 description 1
- 241001647802 Thermobifida Species 0.000 description 1
- AYFVYJQAPQTCCC-UHFFFAOYSA-N Threonine Natural products CC(O)C(N)C(O)=O AYFVYJQAPQTCCC-UHFFFAOYSA-N 0.000 description 1
- 239000004473 Threonine Substances 0.000 description 1
- 102000006601 Thymidine Kinase Human genes 0.000 description 1
- 108020004440 Thymidine kinase Proteins 0.000 description 1
- 108020004566 Transfer RNA Proteins 0.000 description 1
- 108700019146 Transgenes Proteins 0.000 description 1
- 101150050575 URA3 gene Proteins 0.000 description 1
- KZSNJWFQEVHDMF-UHFFFAOYSA-N Valine Natural products CC(C)C(N)C(O)=O KZSNJWFQEVHDMF-UHFFFAOYSA-N 0.000 description 1
- 108010031318 Vitronectin Proteins 0.000 description 1
- 102100035140 Vitronectin Human genes 0.000 description 1
- 101001124193 Xenopus laevis Neuropilin-1 Proteins 0.000 description 1
- 241001464778 Zymobacter Species 0.000 description 1
- 241000588901 Zymomonas Species 0.000 description 1
- 230000002378 acidificating effect Effects 0.000 description 1
- 230000009471 action Effects 0.000 description 1
- 230000003213 activating effect Effects 0.000 description 1
- 229960000643 adenine Drugs 0.000 description 1
- 238000001042 affinity chromatography Methods 0.000 description 1
- 238000012867 alanine scanning Methods 0.000 description 1
- 230000004075 alteration Effects 0.000 description 1
- 125000003277 amino group Chemical group 0.000 description 1
- 229960000723 ampicillin Drugs 0.000 description 1
- AVKUERGKIZMTKX-NJBDSQKTSA-N ampicillin Chemical compound C1([C@@H](N)C(=O)N[C@H]2[C@H]3SC([C@@H](N3C2=O)C(O)=O)(C)C)=CC=CC=C1 AVKUERGKIZMTKX-NJBDSQKTSA-N 0.000 description 1
- 230000000845 anti-microbial effect Effects 0.000 description 1
- SDNYTAYICBFYFH-TUFLPTIASA-N antipain Chemical compound NC(N)=NCCC[C@@H](C=O)NC(=O)[C@H](C(C)C)NC(=O)[C@H](CCCN=C(N)N)NC(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 SDNYTAYICBFYFH-TUFLPTIASA-N 0.000 description 1
- 210000000436 anus Anatomy 0.000 description 1
- 229960004405 aprotinin Drugs 0.000 description 1
- 101150008194 argB gene Proteins 0.000 description 1
- ODKSFYDXXFIFQN-UHFFFAOYSA-N arginine Natural products OC(=O)C(N)CCCNC(N)=N ODKSFYDXXFIFQN-UHFFFAOYSA-N 0.000 description 1
- 229940000489 arsenate Drugs 0.000 description 1
- 210000004507 artificial chromosome Anatomy 0.000 description 1
- 229960001230 asparagine Drugs 0.000 description 1
- 235000009582 asparagine Nutrition 0.000 description 1
- 229940009098 aspartate Drugs 0.000 description 1
- 235000003704 aspartic acid Nutrition 0.000 description 1
- 108010028263 bacteriophage T3 RNA polymerase Proteins 0.000 description 1
- 101150103518 bar gene Proteins 0.000 description 1
- 230000008901 benefit Effects 0.000 description 1
- PXXJHWLDUBFPOL-UHFFFAOYSA-N benzamidine Chemical compound NC(=N)C1=CC=CC=C1 PXXJHWLDUBFPOL-UHFFFAOYSA-N 0.000 description 1
- DRTQHJPVMGBUCF-PSQAKQOGSA-N beta-L-uridine Natural products O[C@H]1[C@@H](O)[C@H](CO)O[C@@H]1N1C(=O)NC(=O)C=C1 DRTQHJPVMGBUCF-PSQAKQOGSA-N 0.000 description 1
- OQFSQFPPLPISGP-UHFFFAOYSA-N beta-carboxyaspartic acid Natural products OC(=O)C(N)C(C(O)=O)C(O)=O OQFSQFPPLPISGP-UHFFFAOYSA-N 0.000 description 1
- 239000003139 biocide Substances 0.000 description 1
- 230000001851 biosynthetic effect Effects 0.000 description 1
- 238000009395 breeding Methods 0.000 description 1
- 230000001488 breeding effect Effects 0.000 description 1
- 239000004202 carbamide Substances 0.000 description 1
- 238000012219 cassette mutagenesis Methods 0.000 description 1
- 238000004113 cell culture Methods 0.000 description 1
- 230000008859 change Effects 0.000 description 1
- 239000003153 chemical reaction reagent Substances 0.000 description 1
- 229960005091 chloramphenicol Drugs 0.000 description 1
- WIIZWVCIJKGZOK-RKDXNWHRSA-N chloramphenicol Chemical compound ClC(Cl)C(=O)N[C@H](CO)[C@H](O)C1=CC=C([N+]([O-])=O)C=C1 WIIZWVCIJKGZOK-RKDXNWHRSA-N 0.000 description 1
- 230000002759 chromosomal effect Effects 0.000 description 1
- 108010086192 chymostatin Proteins 0.000 description 1
- 238000012411 cloning technique Methods 0.000 description 1
- 239000013599 cloning vector Substances 0.000 description 1
- 238000007796 conventional method Methods 0.000 description 1
- NKLPQNGYXWVELD-UHFFFAOYSA-M coomassie brilliant blue Chemical compound [Na+].C1=CC(OCC)=CC=C1NC1=CC=C(C(=C2C=CC(C=C2)=[N+](CC)CC=2C=C(C=CC=2)S([O-])(=O)=O)C=2C=CC(=CC=2)N(CC)CC=2C=C(C=CC=2)S([O-])(=O)=O)C=C1 NKLPQNGYXWVELD-UHFFFAOYSA-M 0.000 description 1
- UHDGCWIWMRVCDJ-ZAKLUEHWSA-N cytidine Chemical compound O=C1N=C(N)C=CN1[C@H]1[C@H](O)[C@@H](O)[C@H](CO)O1 UHDGCWIWMRVCDJ-ZAKLUEHWSA-N 0.000 description 1
- 210000000805 cytoplasm Anatomy 0.000 description 1
- SUYVUBYJARFZHO-RRKCRQDMSA-N dATP Chemical compound C1=NC=2C(N)=NC=NC=2N1[C@H]1C[C@H](O)[C@@H](COP(O)(=O)OP(O)(=O)OP(O)(O)=O)O1 SUYVUBYJARFZHO-RRKCRQDMSA-N 0.000 description 1
- SUYVUBYJARFZHO-UHFFFAOYSA-N dATP Natural products C1=NC=2C(N)=NC=NC=2N1C1CC(O)C(COP(O)(=O)OP(O)(=O)OP(O)(O)=O)O1 SUYVUBYJARFZHO-UHFFFAOYSA-N 0.000 description 1
- RGWHQCVHVJXOKC-SHYZEUOFSA-J dCTP(4-) Chemical compound O=C1N=C(N)C=CN1[C@@H]1O[C@H](COP([O-])(=O)OP([O-])(=O)OP([O-])([O-])=O)[C@@H](O)C1 RGWHQCVHVJXOKC-SHYZEUOFSA-J 0.000 description 1
- 230000007423 decrease Effects 0.000 description 1
- 239000005547 deoxyribonucleotide Substances 0.000 description 1
- 230000002542 deteriorative effect Effects 0.000 description 1
- 238000001784 detoxification Methods 0.000 description 1
- 229960000633 dextran sulfate Drugs 0.000 description 1
- 235000021245 dietary protein Nutrition 0.000 description 1
- 238000002050 diffraction method Methods 0.000 description 1
- 229940042399 direct acting antivirals protease inhibitors Drugs 0.000 description 1
- 238000009826 distribution Methods 0.000 description 1
- VHJLVAABSRFDPM-QWWZWVQMSA-N dithiothreitol Chemical compound SC[C@@H](O)[C@H](O)CS VHJLVAABSRFDPM-QWWZWVQMSA-N 0.000 description 1
- 210000003158 enteroendocrine cell Anatomy 0.000 description 1
- 238000001952 enzyme assay Methods 0.000 description 1
- 210000002919 epithelial cell Anatomy 0.000 description 1
- 230000001747 exhibiting effect Effects 0.000 description 1
- 238000001914 filtration Methods 0.000 description 1
- 239000007850 fluorescent dye Substances 0.000 description 1
- 238000005194 fractionation Methods 0.000 description 1
- 238000013467 fragmentation Methods 0.000 description 1
- 238000006062 fragmentation reaction Methods 0.000 description 1
- 230000005021 gait Effects 0.000 description 1
- 230000030279 gene silencing Effects 0.000 description 1
- 230000009368 gene silencing by RNA Effects 0.000 description 1
- 238000012226 gene silencing method Methods 0.000 description 1
- 230000002068 genetic effect Effects 0.000 description 1
- 125000000291 glutamic acid group Chemical class N[C@@H](CCC(O)=O)C(=O)* 0.000 description 1
- ZDXPYRJPNDTMRX-UHFFFAOYSA-N glutamine Natural products OC(=O)C(N)CCC(N)=O ZDXPYRJPNDTMRX-UHFFFAOYSA-N 0.000 description 1
- 235000004554 glutamine Nutrition 0.000 description 1
- 210000002175 goblet cell Anatomy 0.000 description 1
- 239000005090 green fluorescent protein Substances 0.000 description 1
- 230000012010 growth Effects 0.000 description 1
- IVSXFFJGASXYCL-UHFFFAOYSA-N guanine Chemical compound O=C1NC(N)=NC2=NC=N[C]21 IVSXFFJGASXYCL-UHFFFAOYSA-N 0.000 description 1
- 229910001385 heavy metal Inorganic materials 0.000 description 1
- 229940094991 herring sperm dna Drugs 0.000 description 1
- 125000000487 histidyl group Chemical group [H]N([H])C(C(=O)O*)C([H])([H])C1=C([H])N([H])C([H])=N1 0.000 description 1
- 230000001744 histochemical effect Effects 0.000 description 1
- 230000003301 hydrolyzing effect Effects 0.000 description 1
- 230000002209 hydrophobic effect Effects 0.000 description 1
- 108010002685 hygromycin-B kinase Proteins 0.000 description 1
- 238000003364 immunohistochemistry Methods 0.000 description 1
- 238000001727 in vivo Methods 0.000 description 1
- 238000011065 in-situ storage Methods 0.000 description 1
- 230000002779 inactivation Effects 0.000 description 1
- 230000006698 induction Effects 0.000 description 1
- ZPNFWUPYTFPOJU-LPYSRVMUSA-N iniprol Chemical compound C([C@H]1C(=O)NCC(=O)NCC(=O)N[C@H]2CSSC[C@H]3C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@H](C(N[C@H](C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC=4C=CC(O)=CC=4)C(=O)N[C@@H](CC=4C=CC=CC=4)C(=O)N[C@@H](CC=4C=CC(O)=CC=4)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(=O)NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CSSC[C@H](NC(=O)[C@H](CC(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@H](C)NC(=O)[C@H](CO)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CC=4C=CC=CC=4)NC(=O)[C@H](CC(N)=O)NC(=O)[C@H](CC(N)=O)NC(=O)[C@H](CCCNC(N)=N)NC(=O)[C@H](CCCCN)NC(=O)[C@H](C)NC(=O)[C@H](CCCNC(N)=N)NC2=O)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CSSC[C@H](NC(=O)[C@H](CC=2C=CC=CC=2)NC(=O)[C@H](CC(O)=O)NC(=O)[C@H]2N(CCC2)C(=O)[C@@H](N)CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N2[C@@H](CCC2)C(=O)N2[C@@H](CCC2)C(=O)N[C@@H](CC=2C=CC(O)=CC=2)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N2[C@@H](CCC2)C(=O)N3)C(=O)NCC(=O)NCC(=O)N[C@@H](C)C(O)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@H](C(=O)N[C@@H](CC=2C=CC=CC=2)C(=O)N[C@H](C(=O)N1)C(C)C)[C@@H](C)O)[C@@H](C)CC)=O)[C@@H](C)CC)C1=CC=C(O)C=C1 ZPNFWUPYTFPOJU-LPYSRVMUSA-N 0.000 description 1
- 230000000977 initiatory effect Effects 0.000 description 1
- 239000002198 insoluble material Substances 0.000 description 1
- 230000003993 interaction Effects 0.000 description 1
- 210000004966 intestinal stem cell Anatomy 0.000 description 1
- 230000003834 intracellular effect Effects 0.000 description 1
- 238000011835 investigation Methods 0.000 description 1
- AGPKZVBTJJNPAG-UHFFFAOYSA-N isoleucine Natural products CCC(C)C(N)C(O)=O AGPKZVBTJJNPAG-UHFFFAOYSA-N 0.000 description 1
- 229960000318 kanamycin Drugs 0.000 description 1
- 229930027917 kanamycin Natural products 0.000 description 1
- SBUJHOSQTJFQJX-NOAMYHISSA-N kanamycin Chemical compound O[C@@H]1[C@@H](O)[C@H](O)[C@@H](CN)O[C@@H]1O[C@H]1[C@H](O)[C@@H](O[C@@H]2[C@@H]([C@@H](N)[C@H](O)[C@@H](CO)O2)O)[C@H](N)C[C@@H]1N SBUJHOSQTJFQJX-NOAMYHISSA-N 0.000 description 1
- 229930182823 kanamycin A Natural products 0.000 description 1
- 238000002372 labelling Methods 0.000 description 1
- 229940039696 lactobacillus Drugs 0.000 description 1
- GDBQQVLCIARPGH-ULQDDVLXSA-N leupeptin Chemical compound CC(C)C[C@H](NC(C)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C=O)CCCN=C(N)N GDBQQVLCIARPGH-ULQDDVLXSA-N 0.000 description 1
- 108010052968 leupeptin Proteins 0.000 description 1
- 230000004807 localization Effects 0.000 description 1
- 101150039489 lysZ gene Proteins 0.000 description 1
- 229910001629 magnesium chloride Inorganic materials 0.000 description 1
- 210000004962 mammalian cell Anatomy 0.000 description 1
- 230000007246 mechanism Effects 0.000 description 1
- 108091007169 meprins Proteins 0.000 description 1
- 230000033607 mismatch repair Effects 0.000 description 1
- 238000001823 molecular biology technique Methods 0.000 description 1
- 235000019799 monosodium phosphate Nutrition 0.000 description 1
- 210000004877 mucosa Anatomy 0.000 description 1
- 210000004898 n-terminal fragment Anatomy 0.000 description 1
- 230000007935 neutral effect Effects 0.000 description 1
- 101150095344 niaD gene Proteins 0.000 description 1
- 229920001220 nitrocellulos Polymers 0.000 description 1
- 238000001668 nucleic acid synthesis Methods 0.000 description 1
- 235000015097 nutrients Nutrition 0.000 description 1
- 229940124276 oligodeoxyribonucleotide Drugs 0.000 description 1
- 238000002515 oligonucleotide synthesis Methods 0.000 description 1
- 210000000287 oocyte Anatomy 0.000 description 1
- 210000000056 organ Anatomy 0.000 description 1
- 230000008520 organization Effects 0.000 description 1
- 238000004806 packaging method and process Methods 0.000 description 1
- 210000003134 paneth cell Anatomy 0.000 description 1
- 229920002866 paraformaldehyde Polymers 0.000 description 1
- 230000037361 pathway Effects 0.000 description 1
- 150000002972 pentoses Chemical class 0.000 description 1
- 229950000964 pepstatin Drugs 0.000 description 1
- 108010091212 pepstatin Proteins 0.000 description 1
- FAXGPCHRFPCXOO-LXTPJMTPSA-N pepstatin A Chemical compound OC(=O)C[C@H](O)[C@H](CC(C)C)NC(=O)[C@H](C)NC(=O)C[C@H](O)[C@H](CC(C)C)NC(=O)[C@H](C(C)C)NC(=O)[C@H](C(C)C)NC(=O)CC(C)C FAXGPCHRFPCXOO-LXTPJMTPSA-N 0.000 description 1
- 125000001151 peptidyl group Chemical group 0.000 description 1
- COLNVLDHVKWLRT-UHFFFAOYSA-N phenylalanine Natural products OC(=O)C(N)CC1=CC=CC=C1 COLNVLDHVKWLRT-UHFFFAOYSA-N 0.000 description 1
- 108010082527 phosphinothricin N-acetyltransferase Proteins 0.000 description 1
- 150000004713 phosphodiesters Chemical group 0.000 description 1
- 230000026731 phosphorylation Effects 0.000 description 1
- 238000006366 phosphorylation reaction Methods 0.000 description 1
- 238000005222 photoaffinity labeling Methods 0.000 description 1
- 230000004481 post-translational protein modification Effects 0.000 description 1
- 230000001124 posttranscriptional effect Effects 0.000 description 1
- 238000002818 protein evolution Methods 0.000 description 1
- 230000007065 protein hydrolysis Effects 0.000 description 1
- 229940024999 proteolytic enzymes for treatment of wounds and ulcers Drugs 0.000 description 1
- 230000002285 radioactive effect Effects 0.000 description 1
- 238000002708 random mutagenesis Methods 0.000 description 1
- 239000011541 reaction mixture Substances 0.000 description 1
- 238000003259 recombinant expression Methods 0.000 description 1
- 230000006798 recombination Effects 0.000 description 1
- 238000005215 recombination Methods 0.000 description 1
- 230000022532 regulation of transcription, DNA-dependent Effects 0.000 description 1
- 238000011160 research Methods 0.000 description 1
- 238000010839 reverse transcription Methods 0.000 description 1
- 230000028327 secretion Effects 0.000 description 1
- 238000012163 sequencing technique Methods 0.000 description 1
- 125000003607 serino group Chemical group [H]N([H])[C@]([H])(C(=O)[*])C(O[H])([H])[H] 0.000 description 1
- 210000000813 small intestine Anatomy 0.000 description 1
- AJPJDKMHJJGVTQ-UHFFFAOYSA-M sodium dihydrogen phosphate Chemical compound [Na+].OP(O)([O-])=O AJPJDKMHJJGVTQ-UHFFFAOYSA-M 0.000 description 1
- 229910000162 sodium phosphate Inorganic materials 0.000 description 1
- 239000011537 solubilization buffer Substances 0.000 description 1
- 238000012409 standard PCR amplification Methods 0.000 description 1
- 238000010561 standard procedure Methods 0.000 description 1
- 239000007858 starting material Substances 0.000 description 1
- 239000012134 supernatant fraction Substances 0.000 description 1
- 230000008093 supporting effect Effects 0.000 description 1
- 238000012360 testing method Methods 0.000 description 1
- 229960002180 tetracycline Drugs 0.000 description 1
- 229930101283 tetracycline Natural products 0.000 description 1
- 235000019364 tetracycline Nutrition 0.000 description 1
- 150000003522 tetracyclines Chemical class 0.000 description 1
- 238000005382 thermal cycling Methods 0.000 description 1
- RYYWUUFWQRZTIU-UHFFFAOYSA-K thiophosphate Chemical compound [O-]P([O-])([O-])=S RYYWUUFWQRZTIU-UHFFFAOYSA-K 0.000 description 1
- 229940113082 thymine Drugs 0.000 description 1
- 238000010361 transduction Methods 0.000 description 1
- 230000026683 transduction Effects 0.000 description 1
- 238000013519 translation Methods 0.000 description 1
- 230000014616 translation Effects 0.000 description 1
- 125000002264 triphosphate group Chemical class [H]OP(=O)(O[H])OP(=O)(O[H])OP(=O)(O[H])O* 0.000 description 1
- 101150016309 trpC gene Proteins 0.000 description 1
- 230000029534 trypsinogen activation Effects 0.000 description 1
- OUYCCCASQSFEME-UHFFFAOYSA-N tyrosine Natural products OC(=O)C(N)CC1=CC=C(O)C=C1 OUYCCCASQSFEME-UHFFFAOYSA-N 0.000 description 1
- 238000011144 upstream manufacturing Methods 0.000 description 1
- DRTQHJPVMGBUCF-UHFFFAOYSA-N uracil arabinoside Natural products OC1C(O)C(CO)OC1N1C(=O)NC(=O)C=C1 DRTQHJPVMGBUCF-UHFFFAOYSA-N 0.000 description 1
- 229940045145 uridine Drugs 0.000 description 1
- 239000004474 valine Substances 0.000 description 1
- 238000005406 washing Methods 0.000 description 1
- 210000005253 yeast cell Anatomy 0.000 description 1
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/14—Hydrolases (3)
- C12N9/48—Hydrolases (3) acting on peptide bonds (3.4)
- C12N9/50—Proteinases, e.g. Endopeptidases (3.4.21-3.4.25)
- C12N9/64—Proteinases, e.g. Endopeptidases (3.4.21-3.4.25) derived from animal tissue
- C12N9/6421—Proteinases, e.g. Endopeptidases (3.4.21-3.4.25) derived from animal tissue from mammals
- C12N9/6424—Serine endopeptidases (3.4.21)
- C12N9/6445—Kallikreins (3.4.21.34; 3.4.21.35)
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/14—Hydrolases (3)
- C12N9/48—Hydrolases (3) acting on peptide bonds (3.4)
- C12N9/50—Proteinases, e.g. Endopeptidases (3.4.21-3.4.25)
- C12N9/64—Proteinases, e.g. Endopeptidases (3.4.21-3.4.25) derived from animal tissue
- C12N9/6402—Proteinases, e.g. Endopeptidases (3.4.21-3.4.25) derived from animal tissue from non-mammals
- C12N9/6405—Proteinases, e.g. Endopeptidases (3.4.21-3.4.25) derived from animal tissue from non-mammals not being snakes
- C12N9/6408—Serine endopeptidases (3.4.21)
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/14—Hydrolases (3)
- C12N9/48—Hydrolases (3) acting on peptide bonds (3.4)
- C12N9/50—Proteinases, e.g. Endopeptidases (3.4.21-3.4.25)
- C12N9/64—Proteinases, e.g. Endopeptidases (3.4.21-3.4.25) derived from animal tissue
- C12N9/6421—Proteinases, e.g. Endopeptidases (3.4.21-3.4.25) derived from animal tissue from mammals
- C12N9/6424—Serine endopeptidases (3.4.21)
- C12N9/6456—Plasminogen activators
- C12N9/6459—Plasminogen activators t-plasminogen activator (3.4.21.68), i.e. tPA
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/14—Hydrolases (3)
- C12N9/48—Hydrolases (3) acting on peptide bonds (3.4)
- C12N9/50—Proteinases, e.g. Endopeptidases (3.4.21-3.4.25)
- C12N9/64—Proteinases, e.g. Endopeptidases (3.4.21-3.4.25) derived from animal tissue
- C12N9/6421—Proteinases, e.g. Endopeptidases (3.4.21-3.4.25) derived from animal tissue from mammals
- C12N9/6489—Metalloendopeptidases (3.4.24)
- C12N9/6491—Matrix metalloproteases [MMP's], e.g. interstitial collagenase (3.4.24.7); Stromelysins (3.4.24.17; 3.2.1.22); Matrilysin (3.4.24.23)
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12P—FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
- C12P21/00—Preparation of peptides or proteins
- C12P21/06—Preparation of peptides or proteins produced by the hydrolysis of a peptide bond, e.g. hydrolysate products
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Y—ENZYMES
- C12Y304/00—Hydrolases acting on peptide bonds, i.e. peptidases (3.4)
- C12Y304/21—Serine endopeptidases (3.4.21)
- C12Y304/21009—Enteropeptidase (3.4.21.9), i.e. enterokinase
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Y—ENZYMES
- C12Y304/00—Hydrolases acting on peptide bonds, i.e. peptidases (3.4)
- C12Y304/21—Serine endopeptidases (3.4.21)
- C12Y304/21069—Protein C activated (3.4.21.69)
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y10—TECHNICAL SUBJECTS COVERED BY FORMER USPC
- Y10S—TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y10S435/00—Chemistry: molecular biology and microbiology
- Y10S435/81—Packaged device or kit
Definitions
- the present invention relates to novel enteropeptidase (EP) variant polypeptides derived from Japanese Medaka ( Oryzias latipes ). More particularly, the present invention relates to novel EP variant polypeptides with enhanced substrate specificity, polynucleotides encoding the EP polypeptides, nucleotide constructs, vectors and host cells comprising the polynucleotides, methods for producing the polypeptides and polynucleotides, and kits.
- EP enteropeptidase
- EP enterokinase, EC 3.4.21.9
- EP catalyzes the conversion, in the duodenal lumen, of trypsinogen into active trypsin via the cleavage of the acidic propeptide from trypsinogen (Light et al., Trends Biochem. Sci., 14:110-112 (1989)).
- trypsin initiates a cascade of proteolytic reactions leading to the activation of many pancreatic zymogens, including chymotrypsinogen, proelastase, procarboxypeptidases, and some prolipases (Grishan et al., Gastroenterology, 85:727-731 (1983)).
- EP is highly specific for the sequence Asp-Asp-Asp-Asp-Lys (D 4 K) (SEQ ID NO: 1) of trypsinogen (Bricteux-Gregoire et al., Comp. Biochem. Physiol., 42B: 23-39 (1972)). It is generally believed that EP (or enteropeptidase-like enzyme) is present in all vertebrates. This belief comes from the finding that in almost all vertebrate species a short peptide sequence of Asp-Asp-Asp-Asp-Lys (D 4 K) (SEQ ID NO: 1) is found in the presumed activation site of trypsinogens (14).
- Because of the high degree of D 4 K (SEQ ID NO: 1) specificity, EP has been used as a suitable reagent for cleaving substrate proteins. Indeed, bovine EP has been widely used for this purpose (Collins-Racie et al., Biotechnology, 13:982-987 (1995)).
- Although bovine EP protease cleaves at the EP-cleavage site of recombinant fusion proteins, it also simultaneously hydrolyzes other peptide bonds of the proteins to a considerable degree because of its nonspecific proteolytic activity. This results in a seriously low yield of the targeted protein.
- Such nonspecific activities of bovine EP also can be an obstacle in the preparation of active recombinant proteases where the EP is employed for cleavage of the inactive fusion protein. This is particularly serious when the proteases to be examined are ones with very low activity for synthetic and naturally occurring protein substrates.
- such nonspecific activities of bovine EP make it difficult to determine whether the target recombinant proteases have been successfully activated.
- the present inventors have now generated novel EP variant polypeptides from a non-mammalian source, Japanese Medaka, which demonstrate substantially reduced nonspecific proteolytic activity while retaining high specificity for the Asp-Asp-Asp-Asp-Lys (D 4 K) sequence (SEQ ID NO: 1).
- the present study also describes some enzymatic properties of the catalytic serine protease domain.
- the protease domain of medaka EP exhibits very limited amidolytic activity for any of the synthetic peptide substrates tested, indicating that the medaka protease itself is much more highly specific for the Asp-Asp-Asp-Asp-Lys (D 4 K) sequence (SEQ ID NO: 1) than its mammalian counterparts.
- mutant proteases of medaka EP were generated by site-directed mutagenesis. Some of the mutated proteases exhibited cleavage specificity that was stricter than that of the wild-type enzyme, and may prove to be more effective tools for recombinant protein technology.
- the invention provides an isolated nucleic acid molecule selected from the group consisting of a nucleic acid molecule comprising a nucleotide sequence which is at least 75% homologous to the nucleotide sequence SEQ ID NO: 3, or SEQ ID NO: 5, or a complement thereof, a nucleic acid molecule comprising a fragment of at least 15 nucleotides of a nucleic acid comprising the nucleotide sequence of SEQ ID NO: 3, SEQ ID NO: 5, or a complement thereof, a nucleic acid molecule which encodes a polypeptide comprising an amino acid sequence at least about 50% identical to the amino acid sequence of SEQ ID NO:2, or SEQ ID NO:4, a nucleic acid molecule which encodes a fragment of a polypeptide comprising the amino acid sequence of SEQ ID NO:2, or SEQ ID NO:4; wherein the fragment comprises at least 10 contiguous amino acid residues of the amino acid sequence of SEQ ID NO:2 or SEQ ID NO:
- the isolated nucleic acid molecule is selected from the group consisting of a nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO: 3, SEQ ID NO: 5, or a complement thereof, and a nucleic acid molecule which encodes a polypeptide comprising the amino acid sequence of SEQ ID NO: 2, or SEQ ID NO: 4.
- nucleic acid further comprises vector nucleic acid sequences.
- nucleic acid is operably linked to a surrogate promoter.
- nucleic acid further comprises nucleic acid sequences encoding a heterologous polypeptide.
- a host cell contains the nucleic acid molecule of claim 1 .
- the host cell is selected from the group consisting of: bacterial cells, fungal cells, and animal cells.
- the bacterial cell is Escherichia coli.
- the invention provides isolated polypeptides that are selected from the group consisting of a fragment of a polypeptide comprising the amino acid sequence of SEQ ID NO: 2, or SEQ ID NO: 4, wherein the fragment comprises at least 15 contiguous amino acids of SEQ ID NO: 2 or SEQ ID NO: 4, a variant of a polypeptide comprising the amino acid sequence of SEQ ID NO: 2, or SEQ ID NO: 4, wherein the polypeptide is encoded by a nucleic acid molecule which hybridizes to a complement of a nucleic acid molecule comprising SEQ ID NO:3, or SEQ ID NO:5, under stringent conditions, a polypeptide which is encoded by a nucleic acid molecule comprising a nucleotide sequence which is at least 50% identical to a nucleic acid comprising the nucleotide sequence SEQ ID NO:3, or SEQ ID NO:5, and a polypeptide comprising an amino acid sequence which is at least 30% homologous to the amino acid sequence of SEQ ID NO: 2,
- the isolated polypeptides comprise the amino acid sequence of SEQ ID NO: 2, or SEQ ID NO: 4.
- the polypeptide comprising the amino acid sequence of SEQ ID NO: 2 has at least one mutation.
- the mutation is selected from the group consisting of a substitution, deletion, and addition.
- the mutation is a substitution.
- the substitution occurs at an amino acid residue selected from the group consisting of: residue 93 through residue 193.
- the substitution comprises a substitution at one or more residues selected from position 63, 105, 144, 173 or 193.
- the substitution is at residue 63.
- the substitution at residue 63 is selected from the group consisting of: K63R, K63A, and K63E.
- the substitution is at residue 105.
- the substitution at residue 105 is selected from the group consisting of T105A, T105R, and T105E. In a particular embodiment, the substitution is at residue 144. In another embodiment, the substitution at residue 144 is F144S. In another embodiment, the substitution is at residue 173. In a particular embodiment, the substitution at residue 173 is E173A. In another embodiment, the substitution is at residue 193. In another embodiment, the substitution at residue 193 is selected from the group consisting of: P193E and P193A.
- the isolated polypeptide with E173A substitution consists of the amino acid sequence of SEQ ID NO: 4. In another further embodiment, the isolated polypeptide with E173A substitution comprises the amino acid sequence of SEQ ID NO: 4.
- any of the isolated polypeptides according to any of the aspects described herein is cleavage specific for Asp-Asp-Asp-Asp-Lys (SEQ ID NO: 1).
- any of the isolated polypeptides according to any of the aspects described herein has low non-specific proteolytic activity.
- the polypeptide has low non-specific proteolytic activity for a synthetic peptide substrate.
- the synthetic peptide substrate is a 4-methylcoumaryl-7-amide (MCA)-substrate.
- the synthetic peptide substrate is selected from the group consisting of: Boc-Glu (OBzl)-Ala-Arg-MCA, Z-Phe-Arg-MCA, and Pro-Phe-Arg-MCA.
- the synthetic peptide substrate consists of a fusion protein.
- the fusion protein comprises SEQ ID NO: 1 and another protein.
- the polypeptide has low non-specific proteolytic activity for a biological peptide substrate.
- the biological peptide substrate is selected from the group consisting of: kininogen, fibrinogen, fibronectin, gelatin and laminin.
- the biological peptide substrate consists of a recombinant fusion protein.
- the recombinant fusion protein comprises SEQ ID NO: 1 and another protein.
- the recombinant fusion protein is selected from the group consisting of: gelatinaseA, human kallikrein 8 and tissue type plasminogen activator (tPA).
- the invention provides an isolated polypeptide comprising the amino acid sequence of SEQ ID NO: 2 that has at least one mutation at one or more residues selected from position 63, 105, 144, 173 or 193, wherein the isolated polypeptide is cleavage specific for Asp-Asp-Asp-Asp-Lys (SEQ ID NO: 1), and has low non-specific proteolytic activity.
- the mutation is a substitution selected from the group consisting of: K63R, K63A, K63E, T105A, T105R, T105E, F144S, E173A, P193E, and P193A. In another particular embodiment of the aspect, the mutation is E173A.
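- The substitution codes above follow the usual convention of wild-type residue, position, replacement residue (E173A replaces Glu at position 173 with Ala). A minimal Python sketch of that notation follows. The function names and the six-residue toy sequence are hypothetical and are not taken from SEQ ID NO: 2; positions are assumed to be 1-based within whatever sequence is supplied.

```python
import re

# Substitution codes named in the text: wild-type residue, position, new residue.
SUBSTITUTIONS = ["K63R", "K63A", "K63E", "T105A", "T105R", "T105E",
                 "F144S", "E173A", "P193E", "P193A"]


def parse_substitution(code):
    """Split a code such as 'E173A' into ('E', 173, 'A')."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", code)
    if match is None:
        raise ValueError(f"not a substitution code: {code!r}")
    wild_type, position, mutant = match.groups()
    return wild_type, int(position), mutant


def apply_substitution(sequence, code):
    """Return a copy of `sequence` carrying the substitution (1-based position)."""
    wild_type, position, mutant = parse_substitution(code)
    if position < 1 or position > len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    if sequence[position - 1] != wild_type:
        raise ValueError(f"expected {wild_type} at position {position}")
    return sequence[:position - 1] + mutant + sequence[position:]


# Hypothetical toy sequence for illustration only (NOT SEQ ID NO: 2).
toy_sequence = "MKTEPF"
toy_mutant = apply_substitution(toy_sequence, "E4A")

# Distinct positions touched by the listed substitutions.
positions = sorted({parse_substitution(code)[1] for code in SUBSTITUTIONS})
```

- Checking the wild-type residue before substituting guards against numbering mismatches, which matters when a sequence may be numbered from a different start.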
- Another aspect of the invention provides an isolated polypeptide comprising the amino acid sequence of SEQ ID NO: 4, wherein the isolated polypeptide is cleavage specific for Asp-Asp-Asp-Asp-Lys (SEQ ID NO: 1), and has low non-specific proteolytic activity.
- the invention provides an isolated polypeptide as described herein, wherein the polypeptide is a recombinant polypeptide.
- the invention provides an isolated polypeptide as described herein, wherein the polypeptide has enhanced stability at -20° C., 4° C. and 32° C.
- the invention teaches a method for producing a polypeptide that is selected from the group consisting of a polypeptide comprising the amino acid sequence SEQ ID NO: 2, or SEQ ID NO: 4, a fragment of a polypeptide comprising the amino acid sequence of SEQ ID NO: 2, or SEQ ID NO: 4, wherein the fragment comprises at least 15 contiguous amino acids of SEQ ID NO: 2, or SEQ ID NO: 4, a naturally occurring allelic variant of a polypeptide comprising the amino acid sequence of SEQ ID NO:2, or SEQ ID NO:4, wherein the polypeptide is encoded by a nucleic acid molecule which hybridizes to a complement of a nucleic acid molecule comprising SEQ ID NO:3, or SEQ ID NO:5, under stringent conditions, and where the method comprises culturing the host cells of the invention under conditions in which the nucleic acid molecule is expressed.
- the polypeptides are produced in an E. coli expression system.
- Another particular aspect of the invention teaches a method for cleavage of a protein containing an Asp-Asp-Asp-Asp-Lys cleavage site (SEQ ID NO: 1) using any of the polypeptides of the invention described herein, the method comprising contacting the protein with any of the polypeptides of the invention, and wherein the contacting of the protein with the polypeptide results in specific cleavage.
- the protein is a fusion protein.
- the fusion protein is a recombinant fusion protein.
- the protein is bacterially produced.
- the protein is a synthetic protein.
- the invention teaches a method for the preparation of recombinant protein using any of the polypeptides according to the invention as described herein, the method comprising providing a recombinant fusion protein containing an Asp-Asp-Asp-Asp-Lys cleavage site (SEQ ID NO: 1), and contacting the fusion protein with any of the polypeptides of the invention, wherein contacting the recombinant fusion protein with the polypeptide results in Asp-Asp-Asp-Asp-Lys (SEQ ID NO: 1) specific cleavage and preparation of recombinant protein.
- the invention provides a kit comprising any of the polypeptides described herein for use in the cleavage of a protein containing an Asp-Asp-Asp-Asp-Lys cleavage site (SEQ ID NO: 1), and instructions for use.
- the protein is a fusion protein. In other embodiments, the fusion protein is a recombinant fusion protein. In further embodiments, the protein is a bacterially produced protein. In a particular embodiment, the protein is a synthetic protein.
- FIG. 1(A) is a schematic representation of the Medaka EP domain structures.
- Medaka EP consists of a putative signal anchor (SA), a mucin-like domain, a low-density-lipoprotein receptor (LDLR) domain, two complement component C1r or C1s (C1r/s) domains, a MAM domain (named for the motifs found in Meprin, Xenopus laevis A5 protein, and protein tyrosine phosphatase μ), a macrophage scavenger receptor (MSCR) domain, and a serine protease domain with active site residues of histidine (H), aspartate (D), and serine (S). The disulfide bond connecting the heavy and light-chain is shown.
- FIG. 1(B) shows amino acid sequence alignment of the EP serine protease domain. Amino acid residues are numbered based on the sequence of Medaka EP (top numbers). For comparison, the data for bovine chymotrypsinogen (Chymo) are included among the chymotrypsinogen residue numbers (in parentheses at the bottom of each block). The arrow indicates a putative activation site between the heavy and light chains. The active site residues (H, D and S) are boxed. The positions of mutations are indicated by asterisks. Figure discloses SEQ ID NOS 65-70, respectively, in order of appearance.
- FIG. 1(C) shows Northern blotting analysis of the expression of Medaka EP mRNA in various tissues. The sizes of the detected mRNAs are shown at the left. The lower panel shows the results for Medaka cytoplasmic actin mRNA as a control.
- FIG. 1(D) shows RT-PCR analysis of the expression of EP mRNA in the gastrointestinal tract.
- the Medaka gastrointestinal tract was divided into 8 pieces, from the stomach (lane 1) to the anus (lane 8), and the PCR products from each piece were electrophoresed.
- FIG. 1(E) shows in situ hybridization of EP mRNA in the Medaka intestine. Neighboring sections of Medaka intestine were hybridized with EP antisense (left panel) or sense RNA probe (right panel). Scale bars: 100 μm.
- FIG. 1(F) shows Western blotting analysis of the expression of the Medaka EP protein. Extracts of the intestine, testis, and ovary (left panel), and of nuclei, membrane and cytosol fractions of the Medaka intestine (right panel) were analyzed. The size of the EP protein detected is shown at the right.
- FIG. 1(G) shows immunohistochemical analysis of EP in the Medaka intestine using the Medaka anti-EP antibody (left panel). The control section was stained with the primary antibody previously treated with the antigen (right panel). Scale bars: 200 μm.
- FIG. 2 shows the specificity of Medaka EP-1 protease on peptide and protein substrates.
- A Active recombinant EP proteases were assayed using GD4K-βNA (SEQ ID NO: 6) as a substrate.
- B Active recombinant EP proteases were assayed using various synthetic peptide substrates.
- C Active EP proteases were analyzed by gelatin zymography.
- D Fibronectin (4 μg) was incubated with active EPs (100 ng) at 37° C. for 12 h.
- FIG. 3(A-C) shows the specificity of mutant Medaka EP proteases on peptide and protein substrates.
- A The specific activities of wild-type (EP-1) and mutant EP protease were determined using synthetic peptide substrates.
- B High-molecular-weight (HMW) kininogen (5 μg) was incubated with active EP proteases (100 ng) at 37° C. for 2 h and analyzed by SDS-PAGE.
- C Fibrinogen (10 μg) was incubated with active EP proteases (100 ng) at 37° C. for 12 h and analyzed by SDS-PAGE.
- FIG. 4 shows the effects of various EP proteases on protein substrates containing a D4K-cleavage site (SEQ ID NO: 1).
- A A recombinant fusion protein of Medaka gelatinase A (5 μg) was separately incubated with active EP proteases (100 ng) at 37° C. for 1 h, and analyzed by SDS-PAGE.
- B A recombinant fusion protein of human kallikrein 8 (hK8) (5 μg) was incubated with active EP proteases (100 ng) at 37° C. for 2 h.
- FIG. 5 shows the expression of two distinct EP transcripts in the Medaka intestine.
- A Amino acid sequence alignment of EP-1 (upper) (SEQ ID NO: 71) and EP-2 (lower) (SEQ ID NO: 72) is shown.
- B RT-PCR analysis of the EP-1 and EP-2 transcript was performed using specific primer pairs with total RNAs isolated from the Medaka intestine. A transcript of Medaka cytoplasmic actin-1 (OLCA-1) was amplified as a control. PCR cycle numbers are indicated at the top of the figure.
- C Southern blot analysis was performed using Medaka genomic DNA (20 μg/lane) digested with various restriction enzymes as indicated.
- FIG. 6(A-D) shows the in situ detection of EP mRNA in the Medaka ovary. Staining was performed with DIG-labeled antisense (A and C) and sense probes (B and D).
- C The follicles indicated by the box in (A) are shown at higher magnification.
- FIG. 7 shows gel filtration analysis of Medaka intestine extracts.
- the intestine extract was fractionated using a HiLoad 16/60 Superdex 200 pg column. Fractions having GD4K-βNA-hydrolyzing (SEQ ID NO: 6) activity (indicated by a bar) were pooled.
- the pooled active fraction was subjected to SDS-PAGE/Western blotting analysis under a reducing condition (left panel) or nonreducing condition (right panel) using anti-Medaka EP protease antibody.
- FIG. 8 shows some enzymatic properties of recombinant Medaka EP-1 and EP-2 protease.
- A The purity of purified recombinant Medaka EP-1 and EP-2 protease was assessed by SDS-PAGE. Lane 1, Medaka EP fusion protein; lane 2, Medaka EP protease treated with immobilized trypsin; lane 3, Medaka EP protease purified using a resource Q column.
- B The enzyme activities of EP proteases were determined at various pHs using GD4K-βNA (SEQ ID NO: 6) as a substrate.
- C Recombinant Medaka trypsinogen was incubated at 37° C. with EPs for 15, 30 and 45 min.
- FIG. 9 shows the cloning and expression of Medaka trypsinogen.
- A Amino acid sequence alignment of trypsinogen of the Medaka (SEQ ID NO: 73), human (BAA08257) (SEQ ID NO: 74), mouse (AAH61135) (SEQ ID NO: 75), and salmon (CAA49676) (SEQ ID NO: 76) is shown.
- a well-conserved D4K-cleavage site (SEQ ID NO: 1) for EP is indicated by a broken line. Active site residues (H, D, and S) are boxed.
- B The tissue distribution of Medaka trypsinogen mRNA was analyzed by Northern blotting (upper panel). The sizes of the detected mRNAs are shown at the right. The lower panel shows the detection of Medaka cytoplasmic actin-1 (OLCA-1) mRNA as a control.
- FIG. 10 shows the stability of EP protease.
- Medaka and mammalian EP proteases were incubated at 37° C. in 20 mM Tris-HCl (pH 7.4), 0.2 M NaCl and 2 mM CaCl2. Aliquots of the reaction mixtures were taken at the indicated times for an activity assay using GD4K-βNA (SEQ ID NO: 6) as a substrate. The enzyme activities relative to that at time 0 are shown.
- FIG. 11 shows the activation of Medaka trypsinogen by Medaka wild-type (EP-1) and mutant EP proteases.
- Medaka recombinant trypsinogen was separately incubated with EP proteases at 37° C. for 15, 30 and 45 min, and analyzed by SDS-PAGE followed by CBB staining (upper panel). The relative amount of the active form of Medaka trypsin at each time point was calculated based on the results shown in the upper panel (lower panel). The results are presented as the means (±SD) of three separate experiments.
- FIG. 12 shows the sequence listings.
- the present invention provides novel EP variant polypeptides with enhanced substrate specificity, polynucleotides encoding the polypeptides, nucleotide construct, vectors and host cells comprising the polynucleotides, and methods for producing the polypeptides and polynucleotides.
- Described herein is the cloning of cDNAs for enteropeptidase (EP) from the intestine of the medaka, Oryzias latipes, which is a small freshwater teleost.
- the mRNAs code for EP-1 (1043 residues) and EP-2 (1036 residues), both of which have a unique, conserved domain structure of the N-terminal heavy-chain and C-terminal catalytic serine protease light-chain.
- the medaka enzyme exhibits extremely low amidolytic activity for small synthetic peptide substrates.
- the present invention describes twelve mutated forms of the medaka EP protease that were produced by site-directed mutagenesis.
- the mutant protease E173A was found to have considerably reduced nonspecific hydrolytic activities both for synthetic and protein substrates without serious reduction of its Asp-Asp-Asp-Asp-Lys (D4K)-cleavage activity (SEQ ID NO: 1).
- the medaka EP proteases were shown to have advantages over their mammalian counterparts.
- the mutated forms of the EP protease described by the present invention represent improved proteases for use as restriction proteases to specifically cleave fusion proteins.
- amino acid sequence is recited herein to refer to an amino acid sequence of a protein molecule, “amino acid sequence” and like terms are not meant to limit the amino acid sequence to the complete, native amino acid sequence associated with the recited protein molecule; furthermore, an “amino acid sequence” can be deduced from the nucleic acid sequence encoding the protein.
- Gram-negative bacteria can include Gluconobacter, Rhizobium, Bradyrhizobium, Alcaligenes, Rhodobacter, Rhodococcus, Azospirillum, Rhodospirillum, Sphingomonas, Burkholderia, Desulfomonas, Geospirillum, Succinomonas, Aeromonas, Shewanella, Halochromatium, Citrobacter, Escherichia, Klebsiella, Zymomonas, Zymobacter, and Acetobacter.
- Gram-positive bacteria can include Fibrobacter, Acidobacter, Bacteroides, Sphingobacterium, Actinomyces, Corynebacterium, Nocardia, Rhodococcus, Propionibacterium, Bifidobacterium, Bacillus, Geobacillus, Paenibacillus, Sulfobacillus, Clostridium, Anaerobacter, Eubacterium, Streptococcus, Lactobacillus, Leuconostoc, Enterococcus, Lactococcus, Thermobifida, Cellulomonas, and Sarcina.
- coding sequence is defined herein as a polynucleotide sequence, which directly specifies the amino acid sequence of its protein product.
- fragment is meant a portion (e.g., at least 5, 10, 25, 50, 100, 125, 150, 200, 250, 300, 350, 400, or 500 amino acids or nucleic acids) of a protein or nucleic acid molecule that is substantially identical to a reference protein or nucleic acid and retains the biological activity of the reference. In some embodiments the portion retains at least 50%, 75%, or 80%, or more preferably 90%, 95%, or even 99% of the biological activity of the reference protein or nucleic acid described herein, and retains at least one biological activity of the reference protein.
- fusion protein as used herein is meant to refer to a protein created through genetic engineering from two or more proteins or peptides.
- a fusion protein can refer to a protein in which an Asp-Asp-Asp-Asp-Lys (D4K) sequence (SEQ ID NO: 1) has been intentionally introduced for specific cleavage. Generally, cleavage of the fusion protein generates two polypeptides.
- a fusion protein according to the invention can be a recombinant fusion protein.
- a fusion protein can be generated, for example, from the addition of a vector-derived residue peptide at one terminus, for example the N-terminus, in addition to the amino acid sequence of the native.
- a recombinant fusion protein can be constructed to have Asp-Asp-Asp-Asp-Lys (D4K) cleavage sites (SEQ ID NO: 1) in the vector and in the protein that contains Asp-Asp-Asp-Asp-Lys (D4K) sites (SEQ ID NO: 1) itself.
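- The cleavage of a fusion protein at a D4K site can be sketched in a few lines of Python. This is an illustrative model only: it assumes one-letter amino acid codes, assumes the cut falls immediately after the Lys of each Asp-Asp-Asp-Asp-Lys motif (the usual enteropeptidase cleavage position), and ignores any non-specific cleavage. The function name and the toy fusion sequence (a hypothetical tag, the D4K site, and a hypothetical target) are not from the source.

```python
D4K = "DDDDK"  # one-letter code for Asp-Asp-Asp-Asp-Lys (SEQ ID NO: 1)


def cleave_at_d4k(protein):
    """Split a one-letter protein sequence after every D4K site."""
    fragments = []
    start = 0
    while True:
        site = protein.find(D4K, start)
        if site == -1:
            break
        end = site + len(D4K)  # the cut falls immediately after the Lys
        fragments.append(protein[start:end])
        start = end
    if start < len(protein):
        fragments.append(protein[start:])  # remainder after the last cut
    return fragments


# Hypothetical fusion: N-terminal tag, D4K site, then the target protein.
toy_fusion = "MHHHHHH" + D4K + "IVGGTE"
toy_fragments = cleave_at_d4k(toy_fusion)
```

- With a single D4K site the sketch returns two fragments, the tag carrying the site and the released target, which matches the statement that cleavage generally generates two polypeptides. A target that itself contains D4K sites would be split further.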
- homologue refers to a protein or nucleic acid sharing a certain degree of sequence “identity” or sequence “similarity” with a given protein, or the nucleic acid encoding the given protein.
- percent identity refers to the percentage of residues in two sequences that are the same when aligned for maximum correspondence. Sequence “similarity” is related to sequence “identity”, but differs in that residues that are not exactly the same as each other, but that are functionally “similar” are taken into consideration.
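- As a worked illustration of percent identity, the following Python sketch counts identical residues over two pre-aligned sequences. It assumes the sequences have already been aligned for maximum correspondence by some other tool, that gaps are written as "-", and that the denominator is the number of alignment columns containing at least one residue. That denominator is one common convention among several and is an assumption here, as is the function name.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of alignment columns in which both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue  # a column that is a gap in both sequences carries no information
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


# Toy aligned sequences for illustration only.
example_identity = percent_identity("ACDE", "ACDQ")
```

- Three of the four columns match in the toy pair, so the sketch reports 75% identity. A different denominator (for example, the length of the shorter sequence) would give a different figure for gapped alignments.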
- host cell is meant to include any prokaryotic or eukaryotic cell that contains either a cloning vector or an expression vector. This term also includes those prokaryotic or eukaryotic cells that have been genetically engineered to contain the cloned gene(s) in the chromosome or genome of the host cell.
- hybridizes under stringent conditions is intended to describe conditions for hybridization and washing under which nucleotide sequences at least 60%, 70%, 75%, 80%, 85%, 90%, or 95% homologous to each other typically remain hybridized to each other.
- Hybridization conditions are known to those skilled in the art and can be found in Current Protocols in Molecular Biology, John Wiley & Sons, N.Y., 1991.
- Moderate hybridization conditions are defined as equivalent to hybridization in 2× sodium chloride/sodium citrate (SSC) at 30° C., followed by a wash in 1× SSC, 0.1% SDS at 50° C.
- Highly stringent conditions are defined as equivalent to hybridization in 6× sodium chloride/sodium citrate (SSC) at 45° C., followed by a wash in 0.2× SSC, 0.1% SDS at 65° C.
- amino acid or nucleotide sequence which contains a sufficient or minimum number of the same or equivalent amino acid residues or nucleotides, e.g., an amino acid residue which has a similar side chain, to a second amino acid or nucleotide sequence such that the first and second amino acid or nucleotide sequences share common structural domains and/or a common functional activity.
- a homologous or identical nucleic acid molecule of the invention is at least 10, 15, 20, 25, 30 or more nucleotides in length and hybridizes under stringent conditions to a nucleic acid molecule encoding the amino acid sequence of SEQ ID NO: 2 or to a nucleic acid molecule encoding the amino acid sequence of SEQ ID NO: 4.
- the molecule hybridizes under highly stringent conditions.
- the nucleic acid is at least 15-20 nucleotides in length.
- the terms “isolated,” “purified,” or “biologically pure” refer to material that is free to varying degrees from components which normally accompany it as found in its native state. Various levels of purity may be applied as needed according to this invention in the different methodologies set forth herein; the customary purity standards known in the art may be used if no standard is otherwise specified.
- the enteropeptidase polypeptides of the present invention can be in essentially or substantially pure form. For instance, they are essentially free of other polypeptide material with which they are natively associated.
- They can also be at least 20% pure, preferably at least 40% pure, more preferably at least 60% pure, even more preferably at least 80% pure, most preferably at least 90% pure, and even most preferably at least 95% pure, as determined by agarose electrophoresis. This can be accomplished by preparing the polypeptide by a variety of means of well-known recombinant methods or by classical purification methods.
- isolated nucleic acid molecule is meant a nucleic acid (e.g., a DNA, RNA, or analog thereof) that is free of the genes which, in the naturally occurring genome of the organism from which the nucleic acid molecule of the invention is derived, flank the gene.
- the term therefore includes, for example, a recombinant DNA that is incorporated into a vector; into an autonomously replicating plasmid or virus; or into the genomic DNA of a prokaryote or eukaryote; or that exists as a separate molecule (for example, a cDNA or a genomic or cDNA fragment produced by PCR or restriction endonuclease digestion) independent of other sequences.
- the term includes an RNA molecule which is transcribed from a DNA molecule, as well as a recombinant DNA which is part of a hybrid gene encoding additional polypeptide sequence.
- an “isolated polypeptide” (e.g., an isolated or purified biosynthetic enzyme) is substantially free of cellular material or other contaminating polypeptides from the microorganism from which the polypeptide is derived, or substantially free from chemical precursors or other chemicals when chemically synthesized.
- isolated polypeptide and isolated protein refer to compounds comprising amino acids joined via peptide bonds and are used interchangeably.
- Polypeptide molecules have an amino terminus (“N-terminus”) and a carboxy terminus (“C-terminus”). Peptide linkages occur between the backbone amino group of a first amino acid residue and the backbone carboxyl group of a second amino acid residue.
- the terminus of a polypeptide at which a new linkage would occur is the carboxy-terminus of the growing polypeptide chain, and polypeptide sequences are written from left to right beginning at the amino terminus.
- the term “low” means a reduced amount, or a decreased amount, relative to an unmutated or unaltered nucleotide or polypeptide. Unaltered can mean unmutated.
- an EP polypeptide of the invention that contains a mutation may have a low proteolytic activity as compared to an EP polypeptide that does not contain the same mutation.
- the polypeptide has low proteolytic activity, which may be 10%, 15%, 25%, 50%, 75% or even 90% lower than unmutated or unaltered polypeptide.
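- The percentage reductions above are simple ratios against the unmutated polypeptide. A minimal Python sketch of the arithmetic follows. The function names, the default 10% threshold, and the activity values are hypothetical, and the activities can be in any unit as long as both measurements share it.

```python
def percent_reduction(unmutated_activity, mutant_activity):
    """Reduction of the mutant activity relative to the unmutated polypeptide, in percent."""
    if unmutated_activity <= 0:
        raise ValueError("unmutated activity must be positive")
    return 100.0 * (unmutated_activity - mutant_activity) / unmutated_activity


def is_low(unmutated_activity, mutant_activity, threshold_percent=10.0):
    """True when the mutant activity is lower by at least `threshold_percent`."""
    return percent_reduction(unmutated_activity, mutant_activity) >= threshold_percent


# Hypothetical activities in arbitrary units, for illustration only.
example_reduction = percent_reduction(200.0, 50.0)
```

- In the toy numbers the mutant retains 50 of 200 units, a 75% reduction, which would satisfy any of the listed thresholds up to 75%.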
- mutant nucleic acid molecule or “mutant gene” is intended to include a nucleic acid molecule or gene having a nucleotide sequence which includes at least one alteration (e.g., substitution, insertion, deletion) such that the polypeptide that can be encoded by said mutant exhibits an activity that differs from that of the polypeptide encoded by the wild-type nucleic acid molecule or gene.
- nucleotide refers to a nucleoside phosphorylated at one of its pentose hydroxyl groups.
- nucleoside in turn refers to a compound consisting of a purine [guanine (G) or adenine (A)] or pyrimidine [thymine (T), uracil (U), or cytosine (C)] base covalently linked to a pentose.
- polynucleotide refers to a nucleic acid containing a sequence that is greater than about 100 nucleotides in length.
- nucleic acid refers to a covalently linked sequence of nucleotides in which the 3′ position of the pentose of one nucleotide is joined by a phosphodiester group to the 5′ position of the pentose of the next, and in which the nucleotide residues (bases) are linked in specific sequence; i.e., a linear order of nucleotides.
- nucleic acid is intended to include nucleic acid molecules, e.g., polynucleotides which include an open reading frame encoding a polypeptide, and can further include non-coding regulatory sequences, and introns.
- the terms are intended to include one or more genes that map to a functional locus.
- the terms are intended to include a specific gene for a selected purpose. The gene can be endogenous to the host cell or can be recombinantly introduced into the host cell, e.g., as a plasmid maintained episomally or a plasmid (or fragment thereof) that is stably integrated into the genome.
- operably linked denotes herein a configuration in which a control sequence is placed at an appropriate position relative to the coding sequence of the polynucleotide sequence such that the control sequence directs the expression of the coding sequence of a polypeptide.
- protease is intended to include any polypeptide/s, alone or in combination with other polypeptides, that break peptide bonds between amino acids of proteins.
- proteolytic activity is meant to refer to the cleavage activity of a substrate by an enzyme.
- the term refers to the enzymatic cleavage by enteropeptidases.
- the term is meant to refer to the specific activity of medaka EP for Asp-Asp-Asp-Asp-Lys cleavage sites (SEQ ID NO: 1).
- Non-specific proteolytic activity is meant to refer to cleavage activity that is not directed to a specific cleavage site.
- Specific proteolytic activity is meant to refer to cleavage activity that is directed to a specific cleavage site. Proteolytic activity can be
- recombinant nucleic acid molecule includes a nucleic acid molecule (e.g., a DNA molecule) that has been altered, modified or engineered such that it differs in nucleotide sequence from the native or natural nucleic acid molecule from which the recombinant nucleic acid molecule was derived (e.g., by addition, deletion or substitution of one or more nucleotides).
- a recombinant nucleic acid molecule (e.g., a recombinant DNA molecule) includes an isolated nucleic acid molecule or gene of the present invention (e.g., an isolated EP nucleic acid molecule encoding an EP polypeptide) operably linked to regulatory sequences.
- substantially identical is meant a protein or nucleic acid molecule exhibiting at least 50% identity to a reference amino acid sequence (for example, any one of the amino acid sequences described herein) or nucleic acid sequence (for example, any one of the nucleic acid sequences described herein).
- such a sequence is at least 50%, more preferably 60%, 70%, 75%, 80%, 85%, 90%, and most preferably 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% identical at the amino acid or nucleic acid level to the sequence used for comparison.
- Sequence identity is typically measured using sequence analysis software (for example, Sequence Analysis Software Package of the Genetics Computer Group, University of Wisconsin Biotechnology Center, 1710 University Avenue, Madison, Wis. 53705, BLAST, BESTFIT, GAP, or PILEUP/PRETTYBOX programs). Such software matches identical or similar sequences by assigning degrees of homology to various substitutions, deletions, and/or other modifications. Conservative substitutions typically include substitutions within the following groups: glycine, alanine; valine, isoleucine, leucine; aspartic acid, glutamic acid, asparagine, glutamine; serine, threonine; lysine, arginine; and phenylalanine, tyrosine. In an exemplary approach to determining the degree of identity, a BLAST program may be used, with a probability score between e^-3 and e^-100 indicating a closely related sequence.
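- The conservative substitution groups listed above can be encoded directly. The Python sketch below is an illustration only: it uses one-letter amino acid codes, treats a mismatch as conservative when both residues fall in the same listed group, and scores similarity over pre-aligned sequences. It is not how BLAST or the GCG programs score alignments, and the function names and toy sequences are hypothetical.

```python
# One-letter codes for the groups listed in the text:
# Gly/Ala; Val/Ile/Leu; Asp/Glu/Asn/Gln; Ser/Thr; Lys/Arg; Phe/Tyr.
CONSERVATIVE_GROUPS = ["GA", "VIL", "DENQ", "ST", "KR", "FY"]
GROUP_OF = {residue: index
            for index, group in enumerate(CONSERVATIVE_GROUPS)
            for residue in group}


def is_conservative(a, b):
    """True when a and b are different residues from the same listed group."""
    return a != b and a in GROUP_OF and GROUP_OF.get(a) == GROUP_OF.get(b)


def percent_similarity(aligned_a, aligned_b, gap="-"):
    """Percentage of alignment columns that are identical or conservative."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    similar = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue  # skip columns that are gaps in both sequences
        compared += 1
        if a != gap and (a == b or is_conservative(a, b)):
            similar += 1
    return 100.0 * similar / compared if compared else 0.0


# Toy aligned sequences: A/G and D/E are conservative, K/K identical, W/F neither.
example_similarity = percent_similarity("ADKW", "GEKF")
```

- The toy pair shares only one identical column but scores 75% similarity, which shows how similarity can exceed identity once conservative substitutions are counted.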
- variant when used in reference to a polypeptide refers to an amino acid sequence that differs by one or more amino acids from a reference polypeptide.
- Enteropeptidase is a serine protease enzyme that activates its substrates by cleavage.
- Enteropeptidase is an intestinal protease that removes an N-terminal fragment from trypsinogen. The remaining active fragment is trypsin. This cleavage initiates a cascade of proteolytic reactions leading to the activation of many pancreatic zymogens. See, for example, Matsushima et al., J. Biol. Chem. 269(31): 19976-19982 (1994), Kitamoto et al., Proc. Nat. Acad. Sci., 91(16): 7588-7592 (1994).
- the amino acid sequence of the fish EP is homologous to those of its mammalian counterparts, with all the structural features found in mammalian EPs being conserved, including various unique domains in the N-terminal heavy-chain.
- the extent of identity varies from domain to domain.
- LDLR domains 1 and 2, C1r/s domains 1 and 2, and the MAM domain are highly conserved between medaka and mammalian EP, with 45-57% identity, while the identity in the mucin-like and MSCR domains is as low as 22%. This suggests that the former five domains in the heavy-chain play important roles throughout vertebrate species, although these roles are not known at present.
- the heavy-chain of medaka EP has a hydrophobic segment near the N-terminus. This segment probably serves as a transmembrane anchor, as established for the mammalian EP. Consistent with this notion is the current observation that the 28-kDa immunoreactive protein was detected in the membrane fraction of medaka intestines by specific EP antibodies. The EP was also immunologically detected in the soluble fraction of the intestine. Therefore, as in the case of mammalian EPs, the medaka protease is synthesized as a single-chain zymogen in the intestine.
- bovine EP protease employed for cleavage of the inactive fusion protein presents an obstacle. This is particularly serious when the proteases to be examined are ones with very low activity for synthetic and protein substrates. Significant nonspecific activities of bovine EP protease often make it difficult to determine whether the target recombinant proteases have been successfully activated.
- the nucleic acid molecule can be single-stranded or double-stranded DNA.
- the isolated nucleic acid molecule of the invention can include a nucleic acid molecule which is free of sequences which naturally flank the nucleic acid molecule (i.e., sequences located at the 5′ and 3′ ends of the nucleic acid molecule) in the chromosomal DNA of the organism from which the nucleic acid is derived.
- an isolated nucleic acid molecule can contain less than about 10 kb, 5 kb, 4 kb, 3 kb, 2 kb, 1 kb, 0.5 kb, 0.1 kb, 50 bp, 25 bp or 10 bp of nucleotide sequences which naturally flank the nucleic acid molecule in chromosomal DNA of the microorganism from which the nucleic acid molecule is derived.
- an “isolated” nucleic acid molecule such as a cDNA molecule, can be substantially free of other cellular materials when produced by recombinant techniques, or substantially free of chemical precursors or other chemicals when chemically synthesized.
- the nucleic acid corresponds to enteropeptidase 1 (SEQ ID NO: 3):
- the nucleic acid corresponds to the enteropeptidase 1 with an E173A mutation (SEQ ID NO: 5):
- an isolated nucleic acid molecule of the invention comprises a nucleotide sequence which is at least about 50% identical, more preferably 60%, 65%, 70%, 75%, 80%, 85%, and most preferably 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% identical to the nucleotide sequence of SEQ ID NO: 3 or SEQ ID NO: 5, or a complement thereof.
- the nucleic acid molecule of the invention comprises a fragment of at least about 5-25, more preferably 10-15 nucleotides of a nucleic acid comprising the nucleotide sequence of SEQ ID NO: 3 or SEQ ID NO: 5, or a complement thereof, that retains the biological activity of SEQ ID NO: 3 or SEQ ID NO: 5, e.g. the fragments have proteolytic activity, and in more specific embodiments, the fragments can cleave at Asp-Asp-Asp-Asp-Lys cleavage sites (SEQ ID NO: 1), and have low non-specific proteolytic activity.
- an isolated nucleic acid molecule of the invention encodes a nucleic acid molecule which encodes a polypeptide comprising an amino acid sequence that is at least about 50% homologous to the amino acid sequence of SEQ ID NO: 2 or SEQ ID NO: 4, and retains the biological activity of SEQ ID NO: 2 or SEQ ID NO: 4, e.g.
- sequence identity or “homologue” include a nucleotide or polypeptide sharing at least about 30-35%, advantageously at least about 35-40%, more advantageously at least about 40-50%, and even more advantageously at least about 60%, 70%, 80%, 90% or more identity with the amino acid sequence of a wild-type polypeptide or polypeptide described herein and having a substantially equivalent functional or biological activity as the wild-type polypeptide or polypeptide.
- an enteropeptidase homologue shares at least about 30-35%, advantageously at least about 35-40%, more advantageously at least about 40-50%, and even more advantageously at least about 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% identity with the polypeptide having the amino acid sequence set forth as SEQ ID NO: 2 or SEQ ID NO: 4, and has substantially equivalent functional or biological activities (i.e., is a functional equivalent) of the polypeptide having the amino acid sequence set forth as SEQ ID NO: 2 or SEQ ID NO: 4 (e.g., has substantially equivalent enteropeptidase activity).
- an isolated nucleic acid molecule encodes a variant of a polypeptide comprising the amino acid sequence of SEQ ID NO: 2 or SEQ ID NO: 4, wherein the nucleic acid molecule hybridizes to a complement of a nucleic acid molecule comprising SEQ ID NO: 3 or SEQ ID NO: 5, under stringent conditions.
- stringent conditions are known to those skilled in the art and can be found in Current Protocols in Molecular Biology, John Wiley & Sons, N.Y. (1989), 6.3.1-6.3.6.
- a particular, non-limiting example of stringent e.g.
- an isolated nucleic acid molecule of the invention that hybridizes under stringent conditions to the sequence of SEQ ID NO: 3 or SEQ ID NO: 5 corresponding to a naturally-occurring nucleic acid molecule or a naturally occurring allelic variant.
- a naturally-occurring nucleic acid molecule includes an RNA or DNA molecule having a nucleotide sequence that occurs in nature.
- Modification of a nucleotide sequence encoding a polypeptide of the present invention may be necessary for the synthesis of polypeptides substantially identical or similar to the polypeptide.
- the terms “substantially identical” or “substantially similar” to the polypeptide can refer to non-naturally occurring forms of the polypeptide.
- These polypeptides may differ in some engineered way from the polypeptide isolated from its native source, e.g., artificial variants that differ in specific activity, thermostability, pH optimum, or the like.
- the variant sequence may be constructed on the basis of the nucleotide sequence presented as the polypeptide encoding region of SEQ ID NO: 5, e.g., a subsequence thereof, and/or by introduction of nucleotide substitutions which do not give rise to another amino acid sequence of the polypeptide encoded by the nucleotide sequence.
- nucleotide substitution see, e.g., Ford et al., Protein Expression and Purification, 2:95-107 (1991).
- a nucleic acid molecule of the present invention can be isolated using standard molecular biology techniques and the sequence information provided herein.
- nucleic acid molecules can be isolated using standard hybridization and cloning techniques (e.g., as described in Sambrook, J., Fritsch, E. F., and Maniatis, T. Molecular Cloning: A Laboratory Manual.
- a nucleic acid of the invention can be amplified using cDNA, mRNA or alternatively, genomic DNA, as a template and appropriate oligonucleotide primers according to standard PCR amplification techniques.
- an isolated nucleic acid molecule of the invention is selected from the group consisting of a nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO: 3 or SEQ ID NO: 5, or a complement thereof; and a nucleic acid molecule which encodes a polypeptide comprising the amino acid sequence of SEQ ID NO: 2 or SEQ ID NO: 4.
- the invention provides an isolated polynucleotide encoding a polypeptide, wherein the polynucleotide is a recombinant polynucleotide.
- a recombinant polynucleotide can be a fusion.
- a nucleic acid described herein e.g., an EP nucleic acid
- a transcriptional or translational fusion with a detectable reporter is expressed in an isolated cell (e.g., mammalian or insect cell) under the control of a heterologous promoter, such as an inducible promoter.
- the present invention provides a host cell.
- a host cell includes any cell type which is susceptible to transformation, transfection, or transduction with a nucleic acid construct or expression vector comprising a polynucleotide of the present invention.
- Host cells for use in expressing the EP polypeptides encoded by the expression vectors of the present invention include, but are not limited to, bacterial cells, such as E. coli; fungal cells, such as yeast cells (e.g., Saccharomyces cerevisiae); and animal cells such as CHO. Appropriate culture media and conditions for the above-described host cells are well known in the art.
- the techniques used to isolate or clone a polynucleotide encoding a polypeptide include isolation from genomic DNA, preparation from cDNA, or a combination thereof.
- the cloning of the polynucleotides of the present invention from such genomic DNA can be effected, e.g., by using the well-known polymerase chain reaction (“PCR”) or antibody screening of expression libraries to detect cloned DNA fragments with shared structural features.
- PCR polymerase chain reaction
- Other nucleic acid amplification procedures such as ligase chain reaction (LCR), ligation-activated transcription (LAT) and nucleic acid sequence-based amplification (NASBA) may be used.
- This process for amplifying the target sequence consists of introducing a large excess of two oligonucleotide primers to the DNA mixture containing the desired target sequence, followed by a precise sequence of thermal cycling in the presence of a DNA polymerase.
- the two primers are complementary to their respective strands of the double stranded target sequence.
- the mixture is denatured and the primers then annealed to their complementary sequences within the target molecule.
- the primers are extended with a polymerase so as to form a new pair of complementary strands.
- the steps of denaturation, primer annealing and polymerase extension can be repeated many times (i.e., denaturation, annealing and extension constitute one “cycle”; there can be numerous “cycles”) to obtain a high concentration of an amplified segment of the desired target sequence.
- the length of the amplified segment of the desired target sequence is determined by the relative positions of the primers with respect to each other, and therefore, this length is a controllable parameter.
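By way of illustration only, the dependence of the amplified-segment length on the primer positions can be sketched in Python. The function names `reverse_complement`, `amplicon` and `copies_after` are hypothetical and do not appear in the specification; primer binding is assumed to require an exact match, and doubling in every cycle is an idealization that real reactions approach only in early cycles.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def amplicon(template: str, forward: str, reverse: str):
    """Return the segment bounded by the two primers, or None if either fails to bind.

    `template` is the top strand written 5'->3'. `forward` matches the top
    strand directly. `reverse` is written 5'->3' and anneals to the top
    strand, so its reverse complement is what is searched for in `template`.
    Exact matching is an assumption made for simplicity.
    """
    template = template.upper()
    start = template.find(forward.upper())
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse)
    # The reverse primer site must lie downstream of the forward primer.
    reverse_pos = template.find(reverse_site, start + len(forward))
    if reverse_pos == -1:
        return None
    return template[start:reverse_pos + len(reverse_site)]


def copies_after(cycles: int, initial_copies: int = 1) -> int:
    """Idealized copy number, assuming perfect doubling in every cycle."""
    return initial_copies * 2 ** cycles
```

Moving either primer along the template changes the length of the returned segment, which is the sense in which the length is a controllable parameter.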
- With PCR, it is possible to amplify a single copy of a specific target sequence in genomic DNA to a level detectable by several different methodologies (e.g., hybridization with a labeled probe; incorporation of biotinylated primers followed by avidin-enzyme conjugate detection; and/or incorporation of 32P-labeled deoxyribonucleotide triphosphates, such as dCTP or dATP, into the amplified segment).
- any oligonucleotide sequence can be amplified with the appropriate set of primer molecules.
- the amplified segments created by the PCR process itself are, themselves, efficient templates for subsequent PCR amplifications.
- Amplified target sequences may be used to obtain segments of DNA (e.g., genes) for the construction of targeting vectors, transgenes, etc.
- a “primer” refers to an oligonucleotide, whether occurring naturally or produced synthetically, which is capable of acting as a point of initiation of nucleic acid synthesis when placed under conditions in which synthesis of a primer extension product which is complementary to a nucleic acid strand is induced (i.e., in the presence of nucleotides, an inducing agent such as DNA polymerase, and under suitable conditions of temperature and pH).
- the primer is preferably single-stranded for maximum efficiency in amplification, but may alternatively be double-stranded. If double-stranded, the primer is first treated to separate its strands before being used to prepare extension products.
- the primer is an oligodeoxyribonucleotide.
- the primer must be sufficiently long to prime the synthesis of extension products in the presence of the inducing agent. The exact lengths of the primers will depend on many factors, including temperature, source of primer and use of the method.
- a probe refers to an oligonucleotide (i.e., a sequence of nucleotides), whether occurring naturally or produced synthetically, recombinantly or by PCR amplification, which is capable of hybridizing to another oligonucleotide of interest.
- a probe may be single-stranded or double-stranded. Probes are useful in the detection, identification and isolation of particular gene sequences. It is contemplated that the probe used in the present invention is labeled with any “reporter molecule,” so that it is detectable in a detection system, including, but not limited to enzyme (i.e., ELISA, as well as enzyme-based histochemical assays), fluorescent, radioactive, and luminescent systems.
- reporter molecule and “label” are used herein interchangeably.
- primers and deoxynucleoside triphosphates may contain labels; these labels may comprise, but are not limited to, 32P, 33P, or fluorescent molecules (e.g., fluorescent dyes).
- Southern blot analysis and “Southern blot” and “Southern” refer to the analysis of DNA on agarose or acrylamide gels in which DNA is separated or fragmented according to size followed by transfer of the DNA from the gel to a solid support, such as nitrocellulose or a nylon membrane.
- the immobilized DNA is then exposed to a labeled probe to detect DNA species complementary to the probe used.
- the DNA may be cleaved with restriction enzymes prior to electrophoresis. Following electrophoresis, the DNA may be partially depurinated and denatured prior to or during transfer to the solid support.
- Southern blots are a standard tool of molecular biologists. J. Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Press, NY, 9.31-9.58. (1989)
- Northern blot analysis and “Northern blot” and “Northern” as used herein refer to the analysis of RNA by electrophoresis of RNA on agarose gels to fractionate the RNA according to size followed by transfer of the RNA from the gel to a solid support, such as nitrocellulose or a nylon membrane. The immobilized RNA is then probed with a labeled probe to detect RNA species complementary to the probe used.
- Northern blots are a standard tool of molecular biologists. J. Sambrook et al., supra, pp 7.39-7.52.
- the terms “Western blot analysis” and “Western blot” and “Western” refer to the analysis of protein(s) (or polypeptides) immobilized onto a support such as nitrocellulose or a membrane.
- a mixture comprising at least one protein is first separated on an acrylamide gel, and the separated proteins are then transferred from the gel to a solid support, such as nitrocellulose or a nylon membrane.
- the immobilized proteins are exposed to at least one antibody with reactivity against at least one antigen of interest.
- the bound antibodies may be detected by various methods, including the use of radiolabeled antibodies.
- Another aspect of the present invention features isolated enteropeptidase polypeptides (e.g., isolated enteropeptidase-1 polypeptides).
- An isolated or purified polypeptide (e.g., an isolated or purified EP-1) is substantially free of cellular material or other contaminating polypeptides from the microorganism from which the polypeptide is derived, or substantially free from chemical precursors or other chemicals when chemically synthesized.
- EP-1 polypeptides or gene products that are mammalian derived polypeptides or gene products.
- the EP-1 polypeptide or gene product is derived from the teleost Medaka.
- EP-1 polypeptides or gene products that can be non-mammalian or mammalian derived polypeptides or gene products which differ from naturally-occurring EP-1 genes or polypeptides, for example, genes which have nucleic acids that are mutated, inserted or deleted, but which encode polypeptides substantially similar to the naturally-occurring gene products of the present invention, e.g., are cleavage specific for Asp-Asp-Asp-Asp-Lys (SEQ ID NO: 1), and have low non-specific proteolytic activity.
- Low non-specific proteolytic activity is meant to refer to a reduced amount, or a decreased amount, relative to an unmutated or unaltered nucleotide or polypeptide. Unaltered can mean unmutated.
- the polypeptide has low proteolytic activity, which may be 10%, 15%, 25%, 50%, 75% or even 90% lower than unmutated or unaltered polypeptide.
- the isolated polypeptide is EP-1, having SEQ ID NO: 2:
- the isolated polypeptide is EP-1 with the E173A mutation, having SEQ ID NO: 4:
- nucleic acids which, due to the degeneracy of the genetic code, encode for an identical amino acid as that encoded by the naturally occurring gene. This may be desirable in order to improve the codon usage of a nucleic acid.
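As a non-limiting sketch of how degeneracy of the genetic code can be checked, the following Python fragment translates two nucleotide sequences and compares the encoded amino acids. The standard genetic code is assumed, and the names `CODON_TABLE`, `translate` and `is_synonymous` are hypothetical helpers introduced only for this example.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA sequence; '*' marks a stop codon."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def is_synonymous(dna_a: str, dna_b: str) -> bool:
    """True when both sequences encode the same amino acid sequence."""
    return translate(dna_a) == translate(dna_b)
```

For instance, `GATGATGATGATAAA` and `GACGACGACGACAAG` differ at five nucleotide positions yet both encode Asp-Asp-Asp-Asp-Lys, so a codon-usage change of this kind leaves the polypeptide unchanged.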
- mutate nucleic acids which encode for conservative amino acid substitutions.
- a cleavage specific activity for example cleavage specificity for Asp-Asp-Asp-Asp-Lys (SEQ ID NO: 1)
- the isolated nucleic acid molecule of the invention is selected from a nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO: 3 or SEQ ID NO: 5, or a complement thereof.
- the nucleic acid molecule encodes a polypeptide comprising the amino acid sequence of SEQ ID NO: 2 or SEQ ID NO: 4.
- isolated polypeptides e.g., an isolated EP polypeptide, more specifically an isolated EP-1 polypeptide that comprise a fragment of a polypeptide comprising the amino acid sequence of SEQ ID NO: 2 or SEQ ID NO: 4, wherein the fragment comprises at least 5-15 contiguous amino acids of SEQ ID NO: 2 or SEQ ID NO: 4 and retains at least one biological activity of the reference polypeptide that is cleavage specific for Asp-Asp-Asp-Asp-Lys (SEQ ID NO: 1), and has low non-specific proteolytic activity.
- Also included in the scope of the invention are a variant or naturally occurring allelic variant of a polypeptide comprising the amino acid sequence of SEQ ID NO: 2 or SEQ ID NO: 4, wherein the polypeptide is encoded by a nucleic acid molecule which hybridizes to a complement of a nucleic acid molecule comprising SEQ ID NO: 3 or SEQ ID NO: 5 under stringent conditions.
- amino acid residues essential to the activity of the polypeptide encoded by an isolated polynucleotide of the invention may be identified according to procedures known in the art, such as site-directed mutagenesis or alanine-scanning mutagenesis. See, e.g., Cunningham and Wells, Science, 244:1081-1085 (1989). In the latter technique, mutations are introduced at every positively charged residue in the molecule, and the resultant mutant molecules are tested for proteolytic activity to identify amino acid residues that are critical to the activity of the molecule.
- Sites of substrate-enzyme interaction can also be determined by analysis of the three-dimensional structure as determined by such techniques as nuclear magnetic resonance analysis, crystallography or photoaffinity labeling. See, e.g., de Vos et al., Science, 255:306-312 (1992); Smith et al., Journal of Molecular Biology, 224:899-904 (1992); Wlodawer et al., FEBS Letters, 309:59-64 (1992).
- an isolated polypeptide of the present invention comprises an amino acid sequence which is a homologue of at least one of the polypeptides set forth as SEQ ID NO: 2 or SEQ ID NO: 4 (e.g., comprises an amino acid sequence at least about 30-40% identical, advantageously about 40-50% identical, more advantageously about 50-60% identical, and even more advantageously about 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% identical to the amino acid sequence of SEQ ID NO: 2 or SEQ ID NO: 4), and has an activity that is substantially similar to that of the polypeptide having the amino acid sequence of SEQ ID NO: 2 or SEQ ID NO: 4, respectively, for example is cleavage specific for Asp-Asp-Asp-Asp-Lys (SEQ ID NO: 1), and has low non-specific proteolytic activity.
- the sequences are aligned for optimal comparison purposes.
- gaps can be introduced in the sequence of a first amino acid or nucleic acid sequence for optimal alignment with a second amino or nucleic acid sequence.
- % identity = (# of identical positions / total # of positions) × 100
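A minimal sketch of this calculation is given below in Python, assuming the two sequences have already been aligned to equal length with `-` marking any introduced gap. The function name `percent_identity` is hypothetical, and counting every alignment column (including gap columns) in the denominator is one possible reading of "total # of positions"; programs such as BLAST or GAP may count positions differently.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    identical positions / total positions x 100, where a column counts as
    identical only when both residues match and neither is a gap ('-').
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    identical = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    return 100.0 * identical / len(aligned_a)
```

Under these assumptions, two aligned four-residue sequences that agree at two positions score 50% identity.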
- the comparison of sequences and determination of percent identity between two sequences can be accomplished using a mathematical algorithm.
- a particular, non-limiting example of a mathematical algorithm utilized for the comparison of sequences is the algorithm of Karlin and Altschul (1990) Proc. Natl. Acad. Sci. USA 87:2264-68, modified as in Karlin and Altschul (1993) Proc. Natl. Acad. Sci. USA 90:5873-77.
- Such an algorithm is incorporated into the NBLAST and XBLAST programs (version 2.0) of Altschul et al. (1990) J. Mol. Biol. 215:403-10.
- Gapped BLAST can be utilized as described in Altschul et al. (1997) Nucleic Acids Research 25(17): 3389-3402.
- the default parameters of the respective programs e.g., XBLAST and NBLAST
- a mathematical algorithm utilized for the comparison of sequences is the algorithm of Myers and Miller (1988) Comput Appl Biosci. 4:11-17. Such an algorithm is incorporated into the ALIGN program available, for example, at the GENESTREAM network server, IGH Montpellier, FRANCE or at the ISREC server. When utilizing the ALIGN program for comparing amino acid sequences, a PAM120 weight residue table, a gap length penalty of 12, and a gap penalty of 4 can be used.
- the percent identity between two amino acid sequences can be determined using the GAP program in the GCG software package, using either a BLOSUM62 matrix or a PAM250 matrix, and a gap weight of 12, 10, 8, 6, or 4 and a length weight of 2, 3, or 4.
- the percent homology between two nucleic acid sequences can be accomplished using the GAP program in the GCG software package, using a gap weight of 50 and a length weight of 3.
- isolated polypeptides comprising a fragment of SEQ ID NO: 2 or SEQ ID NO: 4, wherein the amino acids of the fragment are arranged in any sequence such that the fragment is cleavage specific for Asp-Asp-Asp-Asp-Lys (SEQ ID NO: 1), and has the low non-specific proteolytic activity of SEQ ID NO:2 or SEQ ID NO:4
- isolated EP polypeptides comprising an amino acid sequence which is a variant of the polypeptide of SEQ ID NO: 2.
- variant when used in reference to a polypeptide refers to an amino acid sequence that differs by one or more amino acids from a reference polypeptide.
- the variant may have “conservative” changes, wherein a substituted amino acid has similar structural or chemical properties. More rarely, a variant may have “non-conservative” changes. Similar minor variations may also include amino acid deletions or insertions, or both.
- An EP variant polypeptide and polynucleotide encoding the same can be generated using any technique known in the art, including site-directed mutagenesis.
- EP variant polypeptides of the present invention can be prepared, for example, by using a wild-type EP polypeptide as a starting material to be improved.
- wild-type as applied to a polynucleotide means that the nucleic acid fragment does not comprise any mutations from the form isolated from nature.
- wild-type as applied to a polypeptide (or protein) means that the protein will be active at a level of activity found in nature and typically will comprise the amino acid sequence as found in nature.
- modified or “mutant”, when used in reference to a polynucleotide or polypeptide (or protein), refers, respectively, to a polynucleotide or to a polypeptide (or protein) which displays modifications in sequence and/or functional properties (i.e., altered characteristics) when compared to the wild-type polynucleotide or polypeptide.
- wild type indicates a starting or reference sequence prior to a manipulation of the invention.
- Suitable sources of wild-type EP can be identified by screening genomic libraries of organisms for the EP activities described herein.
- a parental amino acid or nucleic acid sequence encoding the wild-type Medaka EP polypeptide was constructed.
- the sequence designated EP-1 (SEQ ID NO: 3 or 4) was utilized as the starting point for all experiments and library construction.
- the isolated polypeptides described herein wherein the polypeptide comprising the amino acid sequence of SEQ ID NO: 2 has at least one mutation.
- the mutation can be a substitution, deletion, or an addition.
- the mutation is a substitution.
- the substitution can occur anywhere in SEQ ID NO: 2, but preferably the substitution occurs at amino acid residue selected from the group consisting of: residue 93 through residue 193.
- the substitution comprises a substitution at one or more residues selected from position 63, 105, 144, 173 or 193.
- the substitution is at residue 63, and consists of K63R, K63A or K63E.
- the substitution is at residue 105, and consists of T105A, T105R, or T105E. In other exemplary embodiments, the substitution is at residue 144, and consists of F144S. In other exemplary embodiments, the substitution is at residue 173, and consists of E173A. In other exemplary embodiments, the substitution is at residue 193, and consists of P193E or P193A.
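The substitution notation used above (wild-type residue, 1-based position, replacement residue, e.g. E173A) can be applied to a sequence programmatically. The following Python sketch is illustrative only: `apply_substitution` is a hypothetical helper, and the short sequence used with it is a toy example, not SEQ ID NO: 2.

```python
import re


def apply_substitution(sequence: str, mutation: str) -> str:
    """Apply a point substitution such as 'E173A' to a one-letter protein sequence.

    The notation is <wild-type residue><1-based position><new residue>. The
    wild-type residue is checked against the sequence so that a mis-numbered
    mutation is rejected rather than silently applied.
    """
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"unrecognized mutation format: {mutation!r}")
    wild_type, position, replacement = (
        match.group(1), int(match.group(2)), match.group(3)
    )
    if not 1 <= position <= len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    if sequence[position - 1] != wild_type:
        raise ValueError(
            f"expected {wild_type} at position {position}, "
            f"found {sequence[position - 1]}"
        )
    return sequence[:position - 1] + replacement + sequence[position:]
```

With the toy sequence `MKTEF`, the call `apply_substitution("MKTEF", "E4A")` replaces the glutamate at position 4 with alanine; multiple substitutions can be introduced by chaining calls.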
- immunospecific antibodies can be raised against an EP polypeptide, or portions thereof as described herein, using standard techniques known in the art.
- any of the polypeptides of the invention for example a polypeptide comprising the amino acid sequence SEQ ID NO: 2, or SEQ ID NO: 4, a fragment of a polypeptide comprising the amino acid sequence of SEQ ID NO: 2, or SEQ ID NO: 4; wherein the fragment comprises at least 15 contiguous amino acids of SEQ ID NO: 2, or SEQ ID NO: 4, a naturally occurring allelic variant of a polypeptide comprising the amino acid sequence of SEQ ID NO:2, or SEQ ID NO:4, wherein the polypeptide is encoded by a nucleic acid molecule which hybridizes to a complement of a nucleic acid molecule comprising SEQ ID NO:3, or SEQ ID NO:5, under stringent conditions.
- the method for producing the above-mentioned polypeptides comprises culturing the host cells of the invention under conditions in which the nucleic acid molecule is expressed.
- nucleotide construct refers to a nucleic acid molecule, either single- or double-stranded, which is isolated from a naturally occurring gene or which is modified to contain segments of nucleic acids in a manner that would not otherwise exist in nature.
- nucleic acid construct is inclusive of the term expression cassette or expression vector when the nucleic acid construct contains all the control sequences required for expression of a coding sequence (polynucleotide) of the present invention.
- coding sequence is defined herein as a polynucleotide sequence, which directly specifies the amino acid sequence of its protein product.
- the boundaries of a genomic coding sequence are generally determined by a ribosome binding site (prokaryotes) or by the ATG start codon (eukaryotes) located just upstream of the open reading frame at the 5′ end of the mRNA and a transcription terminator sequence located just downstream of the open reading frame at the 3′ end of the mRNA.
- a coding sequence can include, but is not limited to, DNA, cDNA, and recombinant nucleic acid sequences.
- control sequence includes all components, which are necessary or advantageous for the expression of a polynucleotide encoding a polypeptide of the present invention.
- Each control sequence may be native or foreign to the nucleotide sequence encoding the polypeptide or native or foreign to each other.
- Such control sequences may include, but are not limited to, a promoter, and transcriptional and translational stop signals.
- the control sequence may be an appropriate promoter sequence.
- the promoter sequence is a relatively short nucleic acid sequence that is recognized by a host cell for expression of the longer coding region that follows.
- the promoter sequence contains transcriptional control sequences, which mediate the expression of the polypeptide.
- the promoter may be any nucleic acid sequence which shows transcriptional activity in the host cell of choice including mutant, truncated, and hybrid promoters, and may be obtained from genes encoding extracellular or intracellular polypeptides either homologous or heterologous to the host cell.
- operably linked denotes herein a configuration in which a control sequence is placed at an appropriate position relative to the coding sequence of the polynucleotide sequence such that the control sequence directs the expression of the coding sequence of a polypeptide.
- the present invention provides an expression vector comprising the polynucleotide described above.
- expression includes any step involved in the production of the polypeptide including, but not limited to, transcription, post-transcriptional modification, translation, post-translational modification, and secretion.
- expression vector is defined herein as a linear or circular DNA molecule that comprises a polynucleotide encoding a polypeptide of the invention, and which is operably linked to additional nucleotides that provide for its expression.
- the polypeptide is produced in an E. coli expression system.
- the various nucleic acid and control sequences described above may be joined together to produce a recombinant expression vector which may include one or more convenient restriction sites to allow for insertion or substitution of the nucleic acid sequence encoding the polypeptide at such sites.
- the nucleic acid sequence of the present invention may be expressed by inserting the nucleic acid sequence or a nucleic acid construct comprising the sequence into an appropriate vector for expression.
- the coding sequence is located in the vector so that the coding sequence is operably linked with the appropriate control sequences for expression.
- the expression vector may be any vector (e.g., a plasmid or virus), which can be conveniently subjected to recombinant DNA procedures and can bring about the expression of the polynucleotide sequence.
- the choice of the vector will typically depend on the compatibility of the vector with the host cell into which the vector is to be introduced.
- the vectors may be linear or closed circular plasmids.
- the expression vector may be an autonomously replicating vector, i.e., a vector which exists as an extrachromosomal entity, the replication of which is independent of chromosomal replication, e.g., a plasmid, an extrachromosomal element, a minichromosome, or an artificial chromosome.
- the vector may contain any means for assuring self-replication.
- the vector may be one which, when introduced into the host cell, is integrated into the genome and replicated together with the chromosome(s) into which it has been integrated.
- a single vector or plasmid or two or more vectors or plasmids which together contain the total DNA to be introduced into the genome of the host cell, or a transposon may be used.
- the expression vector contains one or more selectable markers, which permit easy selection of transformed cells.
- a selectable marker is a gene the product of which provides for biocide or viral resistance, resistance to heavy metals, prototrophy to auxotrophs, and the like.
- Examples of bacterial selectable markers are the dal genes from Bacillus subtilis or Bacillus licheniformis, or markers which confer antibiotic resistance such as ampicillin, kanamycin, chloramphenicol or tetracycline resistance.
- Suitable markers for yeast host cells are ADE2, HIS3, LEU2, LYS2, MET3, TRP1, and URA3.
- Selectable markers for use in a filamentous fungal host cell include, but are not limited to, amdS (acetamidase), argB (ornithine carbamoyltransferase), bar (phosphinothricin acetyltransferase), hph (hygromycin phosphotransferase), niaD (nitrate reductase), pyrG (orotidine-5′-phosphate decarboxylase), sC (sulfate adenyltransferase), and trpC (anthranilate synthase), as well as equivalents thereof.
- Preferred for use in an Aspergillus cell are the amdS and pyrG genes of Aspergillus nidulans or Aspergillus oryzae and the bar gene of Streptomyces hygroscopicus.
- Manipulation of the isolated polynucleotide prior to its insertion into a vector may be desirable or necessary depending on the expression vector.
- An isolated polynucleotide encoding the EP polypeptides of the present invention may be manipulated in a variety of ways well known in the art to provide for expression of the polypeptide.
- the host cell of the invention contains any of the nucleic acid molecules as described herein.
- the host cell is a bacterial cell.
- the bacterial cell is Escherichia coli.
- Engineered host cells can be cultured in conventional nutrient media modified as appropriate for activating promoters, selecting transformants or amplifying the polynucleotides of the invention. Following transformation of a suitable host strain and growth of the host strain to an appropriate cell density, the selected promoter may be induced by appropriate means (e.g., temperature shift or chemical induction) and the cells may be cultured for an additional period to allow them to produce the desired polypeptide or fragment thereof.
- the protein that is the target for the EP-1 polypeptide e.g. the protein that contains an Asp-Asp-Asp-Asp-Lys cleavage site (SEQ ID NO: 1) can be a fusion protein, a recombinant fusion protein.
- a fusion protein is a protein created through genetic engineering from two or more proteins/peptides. This can be achieved by creating a fusion gene: removing the stop codon from the DNA sequence of the first protein, then appending the DNA sequence of the second protein in frame. That DNA sequence will then be expressed by a cell as a single protein.
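The fusion-gene construction just described (stop codon removed from the first coding sequence, second coding sequence appended in frame) can be sketched as follows. This Python fragment is a simplified illustration; `fuse_orfs` and its optional `linker` argument are hypothetical, and real constructs involve cloning considerations that are not modeled here.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def fuse_orfs(first_orf: str, second_orf: str, linker: str = "") -> str:
    """Join two open reading frames into one in-frame fusion gene.

    The stop codon of the first ORF, if present, is removed; an optional
    linker (for example, codons for a cleavage site) is placed between the
    two coding sequences. Every part must be a whole number of codons so
    that the second ORF stays in frame.
    """
    first, second, linker = first_orf.upper(), second_orf.upper(), linker.upper()
    for name, seq in (("first ORF", first), ("second ORF", second), ("linker", linker)):
        if len(seq) % 3 != 0:
            raise ValueError(f"{name} is not a whole number of codons")
    if first[-3:] in STOP_CODONS:
        first = first[:-3]
    return first + linker + second
```

Passing `GATGATGATGATAAA` as the linker would place codons for Asp-Asp-Asp-Asp-Lys between the two partners, although any particular codon choice is an assumption of this example.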
- a fusion protein can refer to a protein in which an Asp-Asp-Asp-Asp-Lys (D4K) sequence (SEQ ID NO: 1) has been intentionally introduced for specific cleavage. Generally, cleavage of the fusion protein generates two polypeptides.
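As an illustrative sketch, cleavage at a D4K site can be modeled on a one-letter amino acid string. The Python function `cleave_at_d4k` below is hypothetical, and it assumes, as background not stated in this passage, that cleavage occurs on the C-terminal side of the lysine, so that the recognition sequence remains on the N-terminal fragment. The example sequence in the note that follows is a toy construct.

```python
def cleave_at_d4k(protein: str, site: str = "DDDDK") -> list:
    """Split a one-letter protein sequence after each occurrence of `site`.

    Each returned fragment ends with the recognition sequence except the
    last, which is whatever follows the final site. Empty fragments (for
    example, when the site sits at the very C-terminus) are dropped.
    """
    fragments = []
    start = 0
    while True:
        index = protein.find(site, start)
        if index == -1:
            break
        end = index + len(site)  # cut immediately after the Lys
        fragments.append(protein[start:end])
        start = end
    fragments.append(protein[start:])
    return [fragment for fragment in fragments if fragment]
```

A construct with a single site, such as a tag followed by D4K and a target polypeptide, yields two fragments; a protein that also carries an internal D4K sequence yields three.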
- a fusion protein according to the invention can be a recombinant fusion protein.
- a fusion protein can be generated, for example, from the addition of a vector-derived residue peptide at one terminus, for example the N-terminus, in addition to the amino acid sequence of the native protein.
- a recombinant fusion protein can be constructed to have Asp-Asp-Asp-Asp-Lys (D4K) cleavage sites (SEQ ID NO: 1) in the vector and in the protein that contains Asp-Asp-Asp-Asp-Lys (D4K) sites (SEQ ID NO: 1) itself.
- the recombinant fusion protein can be selected from, but is not limited to, gelatinase A, human kallikrein 8 and tissue-type plasminogen activator (tPA).
- the protein can be bacterially produced. Also included in the scope of the invention are synthetic proteins.
- SEQ ID NO: 1 Asp-Asp-Asp-Asp-Lys
- kits comprising any of the polypeptides of the invention as described herein, e.g. enteropeptidase polypeptides that are cleavage specific for Asp-Asp-Asp-Asp-Lys (SEQ ID NO: 1), and have low non-specific proteolytic activity.
- the kits contain the polypeptides, which are used for cleavage of proteins containing an Asp-Asp-Asp-Asp-Lys cleavage site (SEQ ID NO: 1), and instructions for use.
- the kits can be used for cleavage of a fusion protein.
- kits can be used for the cleavage of a recombinant fusion protein.
- the kits can be used for the cleavage of a bacterially produced protein.
- the kits can also be used for the cleavage of a synthetic protein.
- the proteins suitable for cleavage by the polypeptides of the invention contain Asp-Asp-Asp-Asp-Lys cleavage sites (SEQ ID NO: 1).
- oligonucleotide PCR primers were synthesized based on the cDNA sequence for conserved regions in serine protease (sense primer: 5′-GT(G/T)(C/G)T(C/G/T)(A/T)C(A/T)GCTGC(C/T)CACTG-3′ (SEQ ID NO: 7), which corresponds to the amino acid sequence NH2-Val-Leu-Thr-Ala-Ala-His-Cys-COOH (SEQ ID NO: 8); and antisense primer: 5′-(A/T)GGGCC(A/T)CC(A/T/G)GAGTC(A/T)CC-3′ (SEQ ID NO: 9), which corresponds to the amino acid sequence NH2-Gly-Asp-Ser-Gly-Gly-Pro-COOH (SEQ ID NO: 10)).
- cDNAs were PCR-amplified under the conditions described for EP in the main text.
- a 435-bp fragment was subcloned into pBluescript (II) KS+ (Stratagene, La Jolla, Calif.) and sequenced.
- a 5′ portion of medaka trypsinogen was obtained by the 5′-RACE method (1) using the 5′-RACE system, Version 2.0 (Invitrogen, Carlsbad, Calif.).
- the antisense primers used were 5′-AGGAGGTGATGAACTG-3′ (SEQ ID NO: 11) (GSP-1; nucleotides 273 to 288, AB272106), 5′-CTCGGTTCCGTCATTGTTCCGGGAT-3′ (SEQ ID NO: 12) (GSP-2; nucleotides 249 to 272, AB272106) and 5′-CCAGACGCACCTCCACTCGGGACT-3′ (SEQ ID NO: 13) (nested GSP; nucleotides 214 to 237, AB272106).
- the two rounds of PCR reactions were performed under the conditions of 35 cycles of 0.5 min at 94° C., 0.5 min at 55° C., and 1 min at 72° C. for the first PCR and 35 cycles of 0.5 min at 94° C., 0.5 min at 60° C., and 1 min at 72° C. for the second PCR.
- the amplified products were then subcloned into pBluescript II plasmid (Stratagene) and sequenced.
- a 3′ portion of medaka trypsinogen was obtained by the 3′-RACE method (1) using the 3′-Full RACE Core Set (Takara, Tokyo, Japan).
- the sense primers used were 5′-CATGATCACCAACTCCATGTTCTG-3′ (SEQ ID NO: 14) (RACE1; nucleotides 545 to 568, AB272106) and 5′-TGGATACCTGGAGGGAGG-3′ (SEQ ID NO: 15) (RACE2; nucleotides 572 to 589, AB272106).
- the two rounds of PCR reactions were performed under the conditions of 35 cycles of 0.5 min at 94° C., 0.5 min at 55° C., and 1 min at 72° C.
- enteropeptidase-1 (EP-1) and enteropeptidase-2 (EP-2), expressed in the medaka intestine
- RT-PCR was conducted with KOD plus DNA polymerase (Toyobo, Osaka, Japan) using medaka intestine total RNA.
- the primers used were 5′-AGAACATCACAGGTGAACCGGTGA-3′ (SEQ ID NO: 16) (sense primer, nucleotides 1-24, AB272104) and 5′-TTCTGACATTCCTGAAGGGACAGC-3′ (SEQ ID NO: 17) (antisense primer, nucleotides 3930-3953, AB272104).
- PCR conditions were 2 min at 94° C.
- RT-PCR analyses were performed using specific primers: 5′-CAAGAACTACAACAGAAGAA-3′ (SEQ ID NO: 18) (sense) and 5′-GTGTATTGAGAAAAAGGTTGTTAA-3′ (SEQ ID NO: 19) (antisense) for EP-1 (nucleotides 2719-3415, AB272104) and 5′-CAAGAACTACAACAGAAGAA-3′ (SEQ ID NO: 18) (sense) and 5′-CTGTACTAAGAAAAAATTTGTCAT-3′ (SEQ ID NO: 20) (antisense) for EP-2 (nucleotides 2747-3443, AB272105).
- PCR conditions were 3 min at 94° C. for heating, followed by 20, 22, 24, 26 and 28 cycles of 30 sec at
- RACE methods (1) were used for ovary 1.5- and 1.3-kb EP transcripts.
- the sequence of the 5′-end was confirmed by the 5′-RACE using a 5′-RACE system (Invitrogen).
- the primers used were as follows: 5′-AGGTAACCAAGCAGAG-3′ (SEQ ID NO: 21) (nucleotides 3207-3222, AB272104) for the reverse transcriptase reaction, 5′-GAGAACGAGGAGCGCCTGGTCTCA-3′ (SEQ ID NO: 22) (nucleotides 3169-3192, AB272104) for the first PCR, and 5′-ATCCATGAAGTGAAAGCAGACACT-3′ (SEQ ID NO: 23) (nucleotides 3142-3165, AB272104) for the second PCR.
- the PCR was performed under the conditions of 35 cycles of 30 sec at 94° C., 30 sec at 55° C., and 2 min at 72° C.
- the 3′-end of the transcripts was determined by the 3′-RACE method (1).
- 3′-RACE was conducted using a 3′-Full RACE Core Set (Takara) as described above.
- the gastrointestinal tract was obtained from mature medaka (body sizes, 3-4 cm), and divided into 8 pieces (about 0.5 mm each). Specimens from five fish were combined for total RNA preparation. Aliquots of 2 μg of the total RNAs were used for reverse transcription. PCR was performed for 25 cycles using Ex Taq DNA polymerase (Takara) and the primers 5′-AGGACCAAACGGAACATTTC-3′ (SEQ ID NO: 24) (sense, nucleotides 802-821, AB272104) and 5′-GAGAGGGACGCAGGAGGA-3′ (SEQ ID NO: 25) (antisense, 1422-1439, AB272104).
- RNA from various tissues of the medaka was electrophoretically fractionated and transferred to a Nytran-plus membrane (Schleicher and Schuell, Dassel, Germany).
- the blots were hybridized with 32P-labelled cDNA fragments (nucleotides 3359-3953 in AB272104 for EP and 572-835 in AB272106 for trypsinogen) in buffer containing 50% formamide, 5× 0.15 M NaCl/8.65 mM NaH2PO4/1.25 mM EDTA (SSPE), 1% SDS, 5× Denhardt's solution, and 100 μg/ml denatured salmon sperm DNA.
- medaka cytoplasmic actin (OLCA1) mRNA was detected with a 32P-labeled 312-bp DNA fragment of the fish cDNA (2).
- Medaka genome DNA was extracted as described previously (3), with the exception that the whole-genome DNA was purified from the medaka whole body. Twenty μg of the genomic DNA was completely digested with various restriction enzymes. The digested DNA was fractionated on a 0.7% agarose gel and alkaline-transferred to a Nytran membrane (Schleicher & Schuell). The blot was hybridized at 60° C.
- RNA probes were prepared by in vitro transcription of reverse-transcribed fragments of cDNAs with T3 or T7 RNA polymerase using a digoxigenin (DIG) RNA-labeling kit (Boehringer-Mannheim, Mannheim, Germany). A 595-bp cDNA fragment (nucleotides 3359-3953, AB272104) was used as a specific probe.
- the hybridization was conducted at 50° C. for 18 h in 50% formamide, 5× Denhardt's solution, 6× SSPE, and 0.5 mg/ml yeast transfer RNA. The sections were washed once at 50° C.
- a trypsinogen cDNA fragment (nucleotides 72-755, AB272106) containing its coding sequence, but without the putative signal sequence, was amplified by PCR using the following primers: 5′-CCGGAATTCCTTGACGATGACAAG-3′ (SEQ ID NO: 26) and 5′-CCCAAGCTTTCAGTTGCTAGCCATGGT-3′ (SEQ ID NO: 27).
- the PCR product was digested with EcoR I and Hind III, gel-purified and ligated into the pET30a expression vector.
- a cDNA coding for human tPA (5) was first obtained by RT-PCR from a human ovary total RNA (Stratagene) using the primers 5′-CCCAAGCTTATGAAGAGAGGGCTCTGCTGT-3′ (SEQ ID NO: 28) (sense-1) and 5′-CTTATCGTCATCATGATGATGATGATGGTGTCTGGCTCCTCTTCT-3′ (SEQ ID NO: 29) (antisense-1) (BC007231).
- the resulting mutant was confirmed by DNA sequencing and transfected into CHO cells cultured in F-12 medium (Invitrogen) containing 10% fetal bovine serum (Biological Industries, Beit Haemek, Israel). Transfection was performed using Lipofectamin 2000 (GE Healthcare Biosciences, Uppsala, Sweden).
- the above procedure produced a fusion protein of human tPA having 11 extra amino acid residues (His-His-His-His-His-His-His-His-His-His-His-His-His-Asp-Asp-Asp-Lys (SEQ ID NO: 32): a His-tag sequence followed by an EP-cleavage site) at the N-terminus of mature tPA.
- This fusion protein secreted from transfected CHO cells was collected from the culture media using an Ni2+-Sepharose column.
- Treatment of the fusion protein with EP proteases generated mature tPA without the 11-residue N-terminal peptide.
- Recombinant human kallikrein 8 was prepared as described previously (6).
- Recombinant medaka gelatinase A was prepared as described previously (4).
- the protein antigen was produced using the bacterial expression system with pET30a as described above.
- the recombinant protein eluted from an Ni2+-Sepharose column was injected into rabbits.
- the specific antibody was affinity-purified using membranes onto which pure antigen was blotted (4).
- tissues were homogenized in 50 mM Tris.HCl (pH 7.4), 10 mM KCl, 10 mM MgCl2, 1 mM dithiothreitol, 5 mM EDTA and protease inhibitor cocktail (Wako), and centrifuged at 1,600×g for 8 min. The pellet was collected as crude nuclei. The supernatant was further centrifuged at 100,000×g for 30 min. The resulting supernatant and pellet were used as a cytosol and membrane fraction, respectively (7).
- the primary antibodies were affinity-purified EP protease antibodies as described above.
- Intestine sections (15 μm) were cut on a cryostat and thaw-mounted onto slides coated with silane. Sections on slides that were fixed with 4% paraformaldehyde in PBS for at least 15 min were treated with 3% H2O2 in PBS. After being blocked with BlockAce (Dainippon Seiyaku, Osaka, Japan) for 1 h at room temperature, each section was incubated with purified primary antibodies for 1 h at room temperature, and was then washed with PBS. Bound antibodies were detected using DakoCytomation EnVision+ System-labeled polymer-HRP anti-rabbit (Dako, Carpinteria, Calif.) according to the manufacturer's instructions. Immunocomplexes were detected using an AEC kit (Vector Laboratories, Burlingame, Calif.).
- Active medaka enteropeptidase was preincubated with various inhibitors at 37° C. in 20 mM Tris.HCl buffer (pH 7.4) containing 50 mM NaCl and 2 mM CaCl2. After incubation for 10 min, the enzyme activity was measured using GD4K-βNA (SEQ ID NO: 6) as a substrate.
- oligonucleotide PCR primers were synthesized based on the cDNA sequences for conserved C-terminal catalytic protease domains in mammalian EPs (sense primer: 5′-TCIGC(C/T)GC(A/C)CACTG(C/T)GT(C/G)TA(CM(A/G)G(A/G)-3′ (SEQ ID NO: 33), which corresponds to the sequence around the active site histidine, NH2-Ser-Ala-Ala-His-Cys-Val-Tyr-Gly-COOH (SEQ ID NO: 34); and antisense primer: 5′-(G/T)A(A/G)TGG(C/T)CC(G/T)CC(A/T)GAATC(A/C)CCCTG-3′ (SEQ ID NO: 35), which corresponds to the sequence around the active site serine, NH2-Gln-Gly-Asp
- the thus-obtained cDNAs were amplified under the following PCR conditions: 3 min at 94° C. for denaturation, 30 cycles of 0.5 min at 94° C., 0.5 min at 55° C. for annealing, and 0.5 min at 72° C. for extension, followed by 7 min final extension at 72° C. Fragments of about 0.5 kb in size were recovered from the PCR products by agarose gel purification and subcloned into pBluescript (II) KS+ (Stratagene, La Jolla, Calif.). A 461-bp clone was obtained and was used as a probe for further screening of a Medaka cDNA library.
- a Medaka intestine random cDNA library was constructed in λgt10 and was packaged using Gigapack III packaging extract (Stratagene). Approximately 6×10⁵ plaques from the library were transferred to nylon membranes (Schleicher and Schuell, Dassel, Germany) and hybridized at 65° C. in a buffer containing 5× SSPE, 0.5% SDS, 5× Denhardt's solution (Wako, Osaka, Japan), and 100 μg/ml denatured salmon sperm DNA with the 32P-labeled 461-bp PCR fragment described above. Filters were washed with increasing stringency, with a final wash of 0.1× SSC/0.1% SDS at 50° C.
- Phage DNA was subcloned into pBluescript (II) KS+ for sequencing. An EP clone containing 2689-bp cDNA (nucleotides 611-3298) was obtained. Further screening was conducted with the same library using an EP 477-bp probe (nucleotides 630-1101), and resulted in isolation of a 1364-bp cDNA containing the 5′ portion of the EP sequence.
- a 3′ portion of Medaka EP was obtained by the 3′-RACE method (Frohman et al., Proc. Natl. Acad. Sci. USA, 85:8998-9002 (1988)) using the 3′-Full RACE Core Set (Takara, Tokyo, Japan).
- the sense primers used were 5′-GACATTCTACAGGAGGCTGAGGTT-3′ (SEQ ID NO: 37) (RACE 1; nucleotides 2900 to 2923) and 5′-CGTCTCTTACCCGAGTACACCTTC-3′ (SEQ ID NO: 38) (RACE 2; nucleotides 2951 to 2974).
- the two rounds of PCR reactions were performed under the conditions of 35 cycles of 0.5 min at 94° C., 0.5 min at 55° C., and 1 min at 72° C. for the first PCR and 35 cycles of 0.5 min at 94° C., 0.5 min at 57° C., and 1 min at 72° C. for the second PCR.
- the amplified products were then subcloned into pBluescript II plasmid (Stratagene) and sequenced.
- a comparison of the entire amino acid sequences of EP-1 (1036 residues) and EP-2 (1043 residues) reveals a difference of only 22 amino acids, including an insertion of 7 residues in EP-2.
- the full-length EP-1 cDNA clone contained an ORF that codes for a protein of 1036 amino acids, while the EP-2 clone codes for a protein of 1043 amino acid residues ( FIG. 5 ).
- the deduced amino acid sequence of the Medaka EP was homologous with those of its mammalian counterparts. As in mammalian EPs, unique domain structures were found in the N-terminal heavy chain of the fish protein, as shown in FIG. 1A . However, the extent of sequence identity between the Medaka and mammalian EPs varies considerably from one domain to another: the identity is 21% in the mucin-like domain, 45% in LDLR domain 1, 41% in C1 r/s domain 1, 49% in the MAM domain, 57% in C1 r/s domain 2, 47% in LDLR domain 2, and 23% in the MSCR domain. The C-terminal serine protease domain of Medaka EP exhibited 53% identity with mammalian EP serine proteases.
- RT-PCR analyses using primer sets specific for the two Medaka EPs showed that the band intensities of amplified products were greater for EP-1 than for EP-2 at every PCR cycle ( FIG. 5B ).
- RT-PCR using primers common to the two EP transcripts was also performed.
- Amplified products (1235 bp for EP-1 and 1246 bp for EP-2) were gel-purified and subcloned into pBluescript (II) KS+, and the recombinant plasmids were transformed into E. coli strain JM109. Forty-four clones were randomly picked for the nucleotide sequence analyses; 26 clones were for EP-1 and 18 clones for EP-2.
- the result of Southern blot analysis supports the presence of at least two distinct copies of the EP gene in the Medaka ( FIG. 5C ).
- In situ hybridization analysis localized EP expression to the intestinal epithelium ( FIG. 1E ).
- a polypeptide band of the same molecular mass was detected in both soluble and membrane fractions of the intestine ( FIG. 1F , Right).
- Western blotting of the intestine extract under nonreducing conditions gave no clear band (data not shown).
- By immunohistochemical analysis using the antibody, the epithelial localization of EP in the intestine was demonstrated ( FIG. 1G ).
- the extract of Medaka intestines exhibited enzyme activity for the synthetic EP substrate GD4K-βNA (SEQ ID NO: 6). Using this activity as a marker, the apparent molecular mass of intact EP was estimated to be 440 kDa by gel filtration ( FIG. 7A ).
- the above fraction having GD4K-βNA-hydrolyzing activity (SEQ ID NO: 6) showed a 36-kDa polypeptide in Western blotting under reducing conditions ( FIG. 7B , Left). Again, the same fraction did not show any clear band with the current antibody when analyzed under non-reducing conditions ( FIG. 7B , Right).
- EP-1 and EP-2 mRNA are expressed at a ratio of approximately 6:4 in the intestine. It remains to be determined whether they are indeed translated at this ratio. Moreover, it is not known at present whether they have a discrete role in vivo.
- a DNA fragment including the coding sequence for the Medaka EP-1 or EP-2 catalytic domain was amplified by PCR using a pBluescript II plasmid containing cDNA of the catalytic domain as the template.
- the upper and lower primers were 5′-CGCGGATCCCAAGCTGGTGTGGTGGGTGG-3′ (SEQ ID NO: 39) and 5′-CCCAAGCTTTCAGTCTAGATCTGAGAA-3′ (SEQ ID NO: 40), respectively, which had BamHI and HindIII sites at the respective 5′ termini.
- the product was ligated into the cloning site of a pET30a expression vector (Novagen, Madison, Wis.).
- Solubilized proteins were subjected to affinity chromatography on Ni2+-Sepharose (GE Healthcare Biosciences, Piscataway, N.J.), and eluted with the same buffer containing 50 mM histidine. Eluted recombinant proteins were renatured by dialysis against 50 mM Tris.HCl (pH 8.0). The fusion protein was then incubated in 50 mM Tris.HCl (pH 8.0) containing 0.5 M NaCl with trypsin immobilized on Sepharose 4B at room temperature for 1 h. The immobilized trypsin was then removed by filtration.
- the resulting sample, which contained not only active EP protease but also inactive enzyme protein, was fractionated on a column of Resource Q in AKTA Purifier (GE Healthcare Biosciences, Uppsala, Sweden) to remove inactive enzyme. A trace amount of trypsin often present in the sample thus prepared was removed by passing it through an aprotinin-Sepharose 4B column (Sigma).
- Table 1 shows the effects of inhibitors on medaka EP-1 and EP-2 protease activity.
- the enzyme activities of medaka EP-1 and EP-2 protease were determined in the presence of various inhibitors using GD4K-βNA (SEQ ID NO: 6) as a substrate. Values are expressed as the percent inhibitions of the respective control activities. Results are the averages of triplicate determinations. From these results, together with the finding that EP-1 is the dominantly expressed form in the intestine, EP-1 was chosen to be used in the following experiments.
- the serine protease domain of Medaka EP-1 cleaved GD4K-βNA (SEQ ID NO: 6) at a rate comparable to those of the porcine and bovine enzymes ( FIG. 2A ).
- the amidolytic activities of Medaka EP-1 protease for the synthetic MCA-containing peptide substrates Boc-Glu(OBzl)-Ala-Arg-MCA, Z-Phe-Arg-MCA, and Pro-Phe-Arg-MCA were much lower than those of the EP proteases of mammalian origin ( FIG. 2B ).
- the kinetic parameters of the proteases for these substrates were determined, and shown in Table 2, below.
- the kcat/Km values of the Medaka enzyme were 1-2 orders of magnitude smaller than those of the mammalian proteases for all MCA-containing synthetic substrates.
- the proteolytic activity of the Medaka protease was examined using gelatin ( FIG. 2C ), fibronectin ( FIG. 2D ), and laminin ( FIG. 2E ).
- the mammalian proteases were also tested under the same conditions. Little or no hydrolysis was observed with the fish enzyme for the proteins, while these substrates were detectably hydrolyzed by the mammalian proteases.
- the fusion protein containing an EP-cleavage site available from Novagen was tested with various EP proteases.
- the Medaka protease specifically cleaved the fusion protein to generate two polypeptides having expected molecular masses of 16- and 32-kDa ( FIG. 2F ).
- mutant proteases were the same as for the wild-type protein described above.
- active recombinant protein concentrations were determined with the active site titrant p-nitrophenyl-p′-guanidinobenzoate HCl (Sigma) by the method described previously (Chase et al., Biochem. Biophys. Res. Commun., 29:508-514 (1967)).
- Amino acid residues that differed from those of mammalian EP proteases in the corresponding positions were the primary focus. Five such residues were mutated, as shown in the sequences in FIG. 1B and FIG. 3A . A total of 12 mutants could convert the recombinant Medaka trypsinogen to its active enzyme (data not shown).
- EP activity was routinely determined using the specific substrate Gly-Asp-Asp-Asp-Asp-Lys-β-naphthylamide (GD4K-βNA) (SEQ ID NO: 6) (Sigma) according to the method of Mikhailova and Rumsh (Mikhailova et al., FEBS Lett., 442:226-230 (1999)).
- Enzyme activity for various 4-methylcoumaryl-7-amide (MCA)-containing peptide substrates was determined by the method of Barrett (Barrett, Biochem. J., 187:909-912 (1980)).
- the mutant proteases had lower nonspecific proteolytic activity for human HMW kininogen ( FIG. 3B ) and human fibrinogen ( FIG. 3C ), both of which were degraded noticeably by mammalian EP proteases. Neither human fibronectin nor laminin was hydrolyzed by the mutants (data not shown).
- Gelatin zymography was conducted as described previously (Ogiwara et al., Proc. Natl. Acad. Sci. USA, 102:8442-8447 (2005)), except that the gel was incubated in 20 mM Tris.HCl buffer (pH 7.4) containing 50 mM NaCl and 2 mM CaCl2.
- a 5′ portion of Medaka trypsinogen was obtained by the 5′-RACE method (Frohman et al., Proc. Natl. Acad. Sci. USA, 85:8998-9002 (1988)) using the 5′-RACE system, Version 2.0 (Invitrogen, Carlsbad, Calif.).
- the antisense primers used were 5′-AGGAGGTGATGAACTG-3′ (SEQ ID NO: 11) (GSP-1; nucleotides 273 to 288, AB272106), 5′-CTCGGTTCCGTCATTGTTCCGGGAT-3′ (SEQ ID NO: 12) (GSP-2; nucleotides 249 to 272, AB272106) and 5′-CCAGACGCACCTCCACTCGGGACT-3′ (SEQ ID NO: 13) (nested GSP; nucleotides 214 to 237, AB272106).
- the two rounds of PCR reactions were performed under the conditions of 35 cycles of 0.5 min at 94° C., 0.5 min at 55° C., and 1 min at 72° C.
- a 3′ portion of Medaka trypsinogen was obtained by the 3′-RACE method (Frohman et al., Proc. Natl. Acad. Sci. USA, 85:8998-9002 (1988)) using the 3′-Full RACE Core Set (Takara, Tokyo, Japan).
- the sense primers used were 5′-CATGATCACCAACTCCATGTTCTG-3′ (SEQ ID NO: 14) (RACE1; nucleotides 545 to 568, AB272106) and 5′-TGGATACCTGGAGGGAGG-3′ (SEQ ID NO: 15) (RACE2; nucleotides 572 to 589, AB272106).
- the two rounds of PCR reactions were performed under the conditions of 35 cycles of 0.5 min at 94° C., 0.5 min at 55° C., and 1 min at 72° C. for the first PCR and 35 cycles of 0.5 min at 94° C., 0.5 min at 57° C., and 1 min at 72° C. for the second PCR.
- the amplified products were then subcloned into pBluescript II plasmid (Stratagene) and sequenced.
- a trypsinogen cDNA fragment (nucleotides 72-755, AB272106) containing its coding sequence, but without the putative signal sequence, was amplified by PCR using the following primers: 5′-CCGGAATTCCTTGACGATGACAAG-3′ (SEQ ID NO: 26) and 5′-CCCAAGCTTTCAGTTGCTAGCCATGGT-3′ (SEQ ID NO: 27).
- the PCR product was digested with EcoR I and Hind III, gel-purified and ligated into the pET30a expression vector.
- the expression of recombinant Medaka trypsinogen in the Escherichia coli expression system and its purification with an Ni2+-Sepharose column were the same as for the wild-type EP protein described above.
- the purified recombinant protein was renatured by dialysis against 50 mM Tris.HCl (pH 8.0) and further purified with a column of Resource Q.
- tPA tissue-type plasminogen activator
- Recombinant human kallikrein 8 was prepared as described previously (Rajapakse et al., FEBS Lett., 579:6879-6884 (2005)).
- Recombinant Medaka gelatinase A was prepared as described previously (Ogiwara et al., Proc. Natl. Acad. Sci. USA, 102:8442-8447 (2005)).
- a human single-chain tPA fusion protein containing an 11-residue sequence of a His-tag/EP-susceptible site at the N-terminus of mature tPA was generated by CHO cells, and used as a substrate for Medaka and mammalian EP proteases.
- the serine protease domain of medaka EP itself has a stricter specificity for almost all of the substrates tested when compared with the mammalian EP protease.
- Medaka wild-type EP protease would be adequate for the recombinant protein preparation of non-proteolytic enzymes.
- use of the mutant enzymes, in particular the E173A mutant enzyme, is preferred.
- the medaka wild-type EP protease and its mutant can be prepared in large quantity in the E. coli expression system. Using the medaka EP serine proteases as fusion protein cleavage enzymes, the desired recombinant proteins can be easily and effectively produced.
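The bracketed notation used for the degenerate primers above, e.g. (G/T) for a position that may be G or T, denotes a pool of concrete oligonucleotides rather than a single sequence. As a minimal illustrative sketch (Python; the function name `expand_degenerate` is hypothetical and not part of the disclosure), the sense primer of SEQ ID NO: 7 can be expanded into the individual sequences it represents:

```python
import itertools
import re


def expand_degenerate(primer: str) -> list[str]:
    """Expand a primer written with (A/T)-style alternatives into all concrete sequences."""
    # Drop any whitespace left over from typesetting.
    primer = "".join(primer.split())
    # Each token is either a parenthesised set of alternatives or a single base.
    tokens = re.findall(r"\(([ACGT](?:/[ACGT])*)\)|([ACGT])", primer)
    choices = [alts.split("/") if alts else [base] for alts, base in tokens]
    # The Cartesian product over all positions gives every member of the primer pool.
    return ["".join(combo) for combo in itertools.product(*choices)]


# Sense primer as written in the text (SEQ ID NO: 7), without the 5'/3' labels.
SENSE_PRIMER = "GT(G/T)(C/G)T(C/G/T)(A/T)C(A/T)GCTGC(C/T)CACTG"

variants = expand_degenerate(SENSE_PRIMER)
print(len(variants), "sequences of length", len(variants[0]))
```

With five two-fold positions and one three-fold position, the pool comprises 2×2×3×2×2×2 = 96 sequences of 20 nucleotides each.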
Abstract
Disclosed are novel enteropeptidase polypeptides, polynucleotides encoding the polypeptides, nucleotide constructs, vectors, host cells comprising the polynucleotides, and methods for producing the polypeptides and polynucleotides. Such polypeptides are useful as protein engineering tools for enzymatic cleavage of fusion proteins. Also provided are kits comprising the polypeptides of the invention.
Description
This application claims the benefit of U.S. Provisional Application No. 60/852,454.
Each of the applications and patents cited in this text, as well as each document or reference cited in each of the applications and patents (including during the prosecution of each issued patent; “application cited documents”), and each of the PCT and foreign applications or patents corresponding to and/or claiming priority from any of these applications and patents, and each of the documents cited or referenced in each of the application cited documents, are hereby expressly incorporated herein by reference. More generally, documents or references are cited in this text, either in a Reference List, or in the text itself; and, each of these documents or references (“herein-cited references”), as well as each document or reference cited in each of the herein-cited references (including any manufacturer's specifications, instructions, etc.), is hereby expressly incorporated herein by reference.
The present invention relates to novel enteropeptidase (EP) variant polypeptides derived from Japanese Medaka (Oryzias latipes). More particularly, the present invention relates to novel EP variant polypeptides with enhanced substrate specificity, polynucleotides encoding the EP polypeptides, nucleotide constructs, vectors and host cells comprising the polynucleotides, methods for producing the polypeptides and polynucleotides, and kits.
EP (enterokinase, EC 3.4.21.9) is a heterodimeric glycoprotein present in the duodenal and jejunal mucosa and is involved in the digestion of dietary proteins. Specifically, EP catalyzes the conversion, in the duodenal lumen, of trypsinogen into active trypsin via the cleavage of the acidic propeptide from trypsinogen (Light et al., Trends Biochem. Sci., 14:110-112 (1989)). The activation of trypsin initiates a cascade of proteolytic reactions leading to the activation of many pancreatic zymogens, including chymotrypsinogen, proelastase, procarboxypeptidases, and some prolipases (Grishan et al., Gastroenterology, 85:727-731 (1983)).
To date, studies have reported the molecular cloning of EP from several mammalian sources, including cattle (LaVallie et al., J. Biol. Chem., 268:23311-23317 (1993); Kitamoto et al., Proc. Natl. Acad. Sci. USA, 91:7588-7592 (1994)), humans (Kitamoto et al., Biochemistry, 34:4562-4568 (1995)), pigs (Matsushima et al., J. Biol. Chem., 269:19976-19982 (1994)), rats (Yahagi et al., Biochem. Biophys. Res. Commun., 219:806-812 (1996)), and mice (Yuan et al., Am. J. Physiol., 274:342-349 (1998)). These studies provided much information on the structural details and organization of EP, and opened a path to further investigation of the molecular properties of this protease. For example, it was reported that the N-terminal heavy chain is required for efficient activation of trypsinogen by the serine protease domain of the C-terminal light chain (Lu et al., J. Biol. Chem., 272:31293-31300 (1997); Mikhailova et al., FEBS Lett., 442:226-230 (1999)). In addition, a study by Lu et al. established the tertiary structure of the bovine EP catalytic domain, thereby demonstrating that Lys99, which is situated in a unique exosite on the enzyme surface, is involved in the specific cleavage of trypsinogen and similar peptidyl substrates (Lu et al., J. Mol. Biol., 292:361-373 (1999)). A more recent study reported that a mucin-like domain found in the heavy chain of EP can be a targeting signal for apical sorting of the protein (Zheng et al., J. Biol. Chem., 277:6858-6863 (2002)).
EP is highly specific for the sequence Asp-Asp-Asp-Asp-Lys (D4K) (SEQ ID NO: 1) of trypsinogen (Bricteux-Gregoire et al., Comp. Biochem. Physiol., 42B:23-39 (1972)). It is generally believed that EP (or an enteropeptidase-like enzyme) is present in all vertebrates. This belief comes from the finding that in almost all vertebrate species a short peptide sequence of Asp-Asp-Asp-Asp-Lys (D4K) (SEQ ID NO: 1) is found in the presumed activation site of trypsinogens (14). However, no information on EP in vertebrates other than mammals has been made available to date. Because of its high degree of D4K (SEQ ID NO: 1) specificity, EP has been used as a reagent for cleaving substrate proteins. Indeed, bovine EP has been widely used for this purpose (Collins-Racie et al., Biotechnology, 13:982-987 (1995)).
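By way of a non-limiting illustration of the site-specific cleavage discussed above, the following sketch (Python) locates each D4K motif in a one-letter amino acid sequence and splits the sequence on the C-terminal side of the lysine, which is where EP is understood to cut. The function name `cleave_at_d4k` and the example fusion sequence (a His-tag, a D4K site, and an arbitrary placeholder peptide) are assumptions made for illustration only; the sketch models an idealized enzyme with no nonspecific activity.

```python
D4K = "DDDDK"  # one-letter code for Asp-Asp-Asp-Asp-Lys (SEQ ID NO: 1)


def cleave_at_d4k(protein: str) -> list[str]:
    """Split a one-letter protein sequence after every D4K motif."""
    fragments = []
    start = 0
    pos = protein.find(D4K)
    while pos != -1:
        end = pos + len(D4K)  # cut immediately after the Lys of the motif
        fragments.append(protein[start:end])
        start = end
        pos = protein.find(D4K, end)
    if start < len(protein):
        fragments.append(protein[start:])  # remainder downstream of the last site
    return fragments


# Hypothetical fusion protein: His-tag + D4K site + placeholder target peptide.
fusion = "HHHHHH" + D4K + "MKTAYIAK"
print(cleave_at_d4k(fusion))  # prints ['HHHHHHDDDDK', 'MKTAYIAK']
```

A sequence with a single D4K site yields two fragments, the tag with the recognition sequence and the released target, whereas a real enzyme with nonspecific activity would produce additional fragments that this idealized model does not capture.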
Nonetheless, the conventional system utilizing bovine EP still has significant drawbacks for industrial application due to its nonspecific proteolytic activity. More particularly, while bovine EP protease cleaves at the EP-cleavage site of recombinant fusion proteins, it also simultaneously hydrolyzes other peptide bonds of the proteins to a considerable degree because of its nonspecific proteolytic activity. This causes a seriously low yield of the targeted protein. Such nonspecific activities of bovine EP also can be an obstacle in the preparation of active recombinant proteases where the EP is employed for cleavage of the inactive fusion protein. This is particularly serious when the proteases to be examined are ones with very low activity for synthetic and naturally occurring protein substrates. In addition, such nonspecific activities of bovine EP make it difficult to determine whether the target recombinant proteases have been successfully activated.
Hence there is a need to generate a novel EP variant polypeptide that substantially lacks nonspecific proteolytic activity while retaining its high specificity for D4K sequence (SEQ ID NO: 1).
The present inventors have now generated novel EP variant polypeptides from a non-mammalian source, Japanese medaka, which demonstrate substantially reduced nonspecific proteolytic activity while retaining their high specificity for the Asp-Asp-Asp-Asp-Lys (D4K) sequence (SEQ ID NO: 1).
The inventors here report on the isolation of cDNAs encoding EP of the medaka (Oryzias latipes), a freshwater teleost, and its expression in the tissues. The present study also describes some enzymatic properties of the catalytic serine protease domain. Surprisingly, the protease domain of medaka EP exhibits very limited amidolytic activity for any of the synthetic peptide substrates tested, indicating that the medaka protease itself is much more highly specific for the Asp-Asp-Asp-Asp-Lys (D4K) sequence (SEQ ID NO: 1) than its mammalian counterparts. Various mutant proteases of medaka EP were generated by site-directed mutagenesis. Some of the mutated proteases exhibited cleavage specificity that was stricter than that of the wild-type enzyme, and may prove to be more effective tools for recombinant protein technology.
In a first aspect, the invention provides an isolated nucleic acid molecule selected from the group consisting of a nucleic acid molecule comprising a nucleotide sequence which is at least 75% homologous to the nucleotide sequence SEQ ID NO: 3, or SEQ ID NO: 5, or a complement thereof, a nucleic acid molecule comprising a fragment of at least 15 nucleotides of a nucleic acid comprising the nucleotide sequence of SEQ ID NO: 3, SEQ ID NO: 5, or a complement thereof, a nucleic acid molecule which encodes a polypeptide comprising an amino acid sequence at least about 50% identical to the amino acid sequence of SEQ ID NO:2, or SEQ ID NO:4, a nucleic acid molecule which encodes a fragment of a polypeptide comprising the amino acid sequence of SEQ ID NO:2, or SEQ ID NO:4, wherein the fragment comprises at least 10 contiguous amino acid residues of the amino acid sequence of SEQ ID NO:2 or SEQ ID NO:4, and a nucleic acid molecule which encodes a variant of a polypeptide comprising the amino acid sequence of SEQ ID NO:2, or SEQ ID NO:4, wherein the nucleic acid molecule hybridizes to a complement of a nucleic acid molecule comprising SEQ ID NO:3 or SEQ ID NO:5, under stringent conditions.
In one embodiment of the first aspect, the isolated nucleic acid molecule is selected from the group consisting of a nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO: 3, SEQ ID NO: 5, or a complement thereof, and a nucleic acid molecule which encodes a polypeptide comprising the amino acid sequence of SEQ ID NO: 2, or SEQ ID NO: 4.
In another embodiment, the nucleic acid further comprises vector nucleic acid sequences. In a further embodiment, the nucleic acid is operably linked to a surrogate promoter. In another particular embodiment of the aspect, the nucleic acid further comprises nucleic acid sequences encoding a heterologous polypeptide.
In a particular embodiment of the aspect, a host cell contains the nucleic acid molecule of claim 1. In one embodiment, the host cell is selected from the group consisting of: bacterial cells, fungal cells, and animal cells. In a particular embodiment, the bacterial cell is Escherichia coli.
In another aspect, the invention provides isolated polypeptides that are selected from the group consisting of a fragment of a polypeptide comprising the amino acid sequence of SEQ ID NO: 2, or SEQ ID NO: 4, wherein the fragment comprises at least 15 contiguous amino acids of SEQ ID NO: 2 or SEQ ID NO: 4, a variant of a polypeptide comprising the amino acid sequence of SEQ ID NO: 2, or SEQ ID NO: 4, wherein the polypeptide is encoded by a nucleic acid molecule which hybridizes to a complement of a nucleic acid molecule comprising SEQ ID NO:3, or SEQ ID NO:5, under stringent conditions, a polypeptide which is encoded by a nucleic acid molecule comprising a nucleotide sequence which is at least 50% identical to a nucleic acid comprising the nucleotide sequence SEQ ID NO:3, or SEQ ID NO:5, and a polypeptide comprising an amino acid sequence which is at least 30% homologous to the amino acid sequence of SEQ ID NO:2, or SEQ ID NO:4.
In one embodiment of the aspect, the isolated polypeptides comprise the amino acid sequence of SEQ ID NO: 2, or SEQ ID NO: 4.
In another embodiment of the aspect, the polypeptide comprising the amino acid sequence of SEQ ID NO: 2 has at least one mutation. In a particular embodiment, the mutation is selected from the group consisting of a substitution, deletion, and addition. In a more particular embodiment, the mutation is a substitution. In a further embodiment, the substitution occurs at amino acid residue selected from the group consisting of: residue 93 through residue 193. In another embodiment, the substitution comprises a substitution at one or more residues selected from position 63, 105, 144, 173 or 193. In a particular embodiment, the substitution is at residue 63. In another embodiment, the substitution at residue 63 is selected from the group consisting of: K63R, K63A, and K63E. In a particular embodiment, the substitution is at residue 105. In another embodiment, the substitution at residue 105 is selected from the group consisting of T105A, T105R, and T105E. In a particular embodiment, the substitution is at residue 144. In another embodiment, the substitution at residue 144 is F144S. In another embodiment, the substitution is at residue 173. In a particular embodiment, the substitution at residue 173 is E173A. In another embodiment, the substitution is at residue 193. In another embodiment, the substitution at residue 193 is selected from the group consisting of: P193E and P193A.
In a further embodiment, the isolated polypeptide with E173A substitution consists of the amino acid sequence of SEQ ID NO: 4. In another further embodiment, the isolated polypeptide with E173A substitution comprises the amino acid sequence of SEQ ID NO: 4.
In one embodiment, any of the isolated polypeptides according to any of the aspects described herein is cleavage specific for Asp-Asp-Asp-Asp-Lys (SEQ ID NO: 1).
In another embodiment, any of the isolated polypeptides according to any of the aspects described herein has low non-specific proteolytic activity. In a further embodiment, the polypeptide has low non-specific proteolytic activity for a synthetic peptide substrate. In another further embodiment, the synthetic peptide substrate is a 4-methylcoumaryl-7-amide (MCA)-substrate. In a particular embodiment, the synthetic peptide substrate is selected from the group consisting of: Boc-Glu(OBzl)-Ala-Arg-MCA, Z-Phe-Arg-MCA, and Pro-Phe-Arg-MCA. In a further embodiment, the synthetic peptide substrate consists of a fusion protein. In a more particular embodiment, the fusion protein comprises SEQ ID NO: 1 and another protein.
In another embodiment, the polypeptide has low non-specific proteolytic activity for a biological peptide substrate. In a further embodiment, the biological peptide substrate is selected from the group consisting of: kininogen, fibrinogen, fibronectin, gelatin and laminin. In one embodiment, the biological peptide substrate consists of a recombinant fusion protein. In another embodiment, the recombinant fusion protein comprises SEQ ID NO: 1 and another protein. In a particular embodiment, the recombinant fusion protein is selected from the group consisting of: gelatinase A, human kallikrein 8 and tissue type plasminogen activator (tPA).
In one aspect, the invention provides an isolated polypeptide comprising the amino acid sequence of SEQ ID NO: 2 that has at least one mutation at one or more residues selected from position 63, 105, 144, 173 or 193, wherein the isolated polypeptide is cleavage specific for Asp-Asp-Asp-Asp-Lys (SEQ ID NO: 1), and has low non-specific proteolytic activity.
In one embodiment, the mutation is a substitution selected from the group consisting of: K63R, K63A, K63E, T105A, T105R, T105E, F144S, E173A, P193E, and P193A. In another particular embodiment of the aspect, the mutation is E173A.
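The substitution labels used above (for example, E173A) follow the usual convention of original residue, position, and replacement residue. As a minimal illustrative sketch only, and not part of the disclosed methods, such labels could be parsed and applied to a sequence as follows. The function names, the `offset` parameter, and the five-residue toy sequence are hypothetical; the actual residues of SEQ ID NO: 2 and its numbering scheme are not reproduced here.

```python
import re


def parse_substitution(label):
    """Split a label such as 'E173A' into (original, position, replacement)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", label)
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    original, position, replacement = match.groups()
    return original, int(position), replacement


def apply_substitution(sequence, label, offset=1):
    """Return a copy of `sequence` carrying the substitution named by `label`.

    `offset` is the residue number assigned to the first character of
    `sequence` (an assumption of this sketch; numbering schemes differ).
    """
    original, position, replacement = parse_substitution(label)
    index = position - offset
    if not 0 <= index < len(sequence):
        raise ValueError(f"position {position} lies outside the sequence")
    if sequence[index] != original:
        raise ValueError(
            f"expected {original} at position {position}, found {sequence[index]}"
        )
    return sequence[:index] + replacement + sequence[index + 1:]


# Hypothetical stretch of five residues, numbered 171 to 175.
toy = "GAEKT"
print(apply_substitution(toy, "E173A", offset=171))  # GAAKT
```

The check against the original residue guards against applying a label to a sequence that uses a different numbering.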
Another aspect of the invention provides an isolated polypeptide comprising the amino acid sequence of SEQ ID NO: 4, wherein the isolated polypeptide is cleavage specific for Asp-Asp-Asp-Asp-Lys (SEQ ID NO: 1), and has low non-specific proteolytic activity.
In another embodiment, the invention provides an isolated polypeptide as described herein, wherein the polypeptide is a recombinant polypeptide.
In still a further embodiment, the invention provides an isolated polypeptide as described herein, wherein the polypeptide has enhanced stability at −20° C., 4° C. and 32° C.
In a particular aspect, the invention teaches a method for producing a polypeptide that is selected from the group consisting of a polypeptide comprising the amino acid sequence SEQ ID NO: 2, or SEQ ID NO: 4, a fragment of a polypeptide comprising the amino acid sequence of SEQ ID NO: 2, or SEQ ID NO: 4, wherein the fragment comprises at least 15 contiguous amino acids of SEQ ID NO: 2, or SEQ ID NO: 4, a naturally occurring allelic variant of a polypeptide comprising the amino acid sequence of SEQ ID NO:2, or SEQ ID NO:4, wherein the polypeptide is encoded by a nucleic acid molecule which hybridizes to a complement of a nucleic acid molecule comprising SEQ ID NO:3, or SEQ ID NO:5, under stringent conditions, and where the method comprises culturing the host cells of the invention under conditions in which the nucleic acid molecule is expressed.
In certain embodiments, the polypeptides are produced in an E. coli expression system.
Another particular aspect of the invention teaches a method for cleavage of a protein containing an Asp-Asp-Asp-Asp-Lys cleavage site (SEQ ID NO: 1) using any of the polypeptides of the invention described herein, the method comprising contacting the protein with any of the polypeptides of the invention, and wherein the contacting of the protein with the polypeptide results in specific cleavage.
In one embodiment, the protein is a fusion protein. In another embodiment, the fusion protein is a recombinant fusion protein. In a further embodiment, the protein is bacterially produced. In a more particular embodiment, the protein is a synthetic protein.
In a further aspect, the invention teaches a method for the preparation of recombinant protein using any of the polypeptides according to the invention as described herein, the method comprising providing a recombinant fusion protein containing an Asp-Asp-Asp-Asp-Lys cleavage site (SEQ ID NO: 1), and contacting the fusion protein with any of the polypeptides of the invention, wherein contacting the recombinant fusion protein with the polypeptide results in Asp-Asp-Asp-Asp-Lys (SEQ ID NO: 1) specific cleavage and preparation of recombinant protein.
In another aspect, the invention provides a kit comprising any of the polypeptides described herein for use in the cleavage of a protein containing an Asp-Asp-Asp-Asp-Lys cleavage site (SEQ ID NO: 1), and instructions for use.
In one embodiment, the protein is a fusion protein. In another embodiment, the fusion protein is a recombinant fusion protein. In a further embodiment, the protein is a bacterially produced protein. In a particular embodiment, the protein is a synthetic protein.
SEQ ID NO:1—D4K Sequence
SEQ ID NO:2—Amino Acid Sequence of EP-1
SEQ ID NO:3—Nucleic Acid Sequence of EP-1
SEQ ID NO:4—Amino Acid Sequence of EP-173
SEQ ID NO:5—Nucleic Acid Sequence of EP-173
The present invention provides novel EP variant polypeptides with enhanced substrate specificity, polynucleotides encoding the polypeptides, nucleotide constructs, vectors and host cells comprising the polynucleotides, and methods for producing the polypeptides and polynucleotides.
Described herein is the cloning of cDNAs for enteropeptidase (EP) from the intestine of the medaka, Oryzias latipes, which is a small freshwater teleost. The mRNAs code for EP-1 (1043 residues) and EP-2 (1036 residues), both of which have a unique, conserved domain structure of the N-terminal heavy-chain and C-terminal catalytic serine protease light-chain. When compared with mammalian EP serine proteases, the medaka enzyme exhibits extremely low amidolytic activity for small synthetic peptide substrates.
The present invention describes twelve mutated forms of the medaka EP protease that were produced by site-directed mutagenesis. Among them, the mutant protease E173A was found to have considerably reduced nonspecific hydrolytic activities for both synthetic and protein substrates without serious reduction of its Asp-Asp-Asp-Asp-Lys (D4K)-cleavage activity (SEQ ID NO: 1). For the cleavage of fusion proteins containing an Asp-Asp-Asp-Asp-Lys (D4K)-cleavage site (SEQ ID NO: 1), the medaka EP proteases were shown to have advantages over their mammalian counterparts. Accordingly, the mutated forms of the EP protease described herein, including the E173A mutant EP protease, represent improved proteases for use as restriction proteases to specifically cleave fusion proteins.
Unless defined otherwise, all technical and scientific terms used herein have the meaning commonly understood by a person skilled in the art to which this invention belongs. The following references provide one of skill with a general definition of many of the terms used in this invention: Singleton et al., Dictionary of Microbiology and Molecular Biology (2nd ed. 1994); The Cambridge Dictionary of Science and Technology (Walker ed., 1988); The Glossary of Genetics, 5th Ed., R. Rieger et al. (eds.), Springer Verlag (1991); and Hale & Margham, The Harper Collins Dictionary of Biology (1991). As used herein, the following terms have the meanings ascribed to them below, unless specified otherwise.
In this disclosure, “comprises,” “comprising,” “containing” and “having” and the like can have the meaning ascribed to them in U.S. Patent law and can mean “includes,” “including,” and the like; “consisting essentially of” or “consists essentially” likewise has the meaning ascribed in U.S. Patent law and the term is open-ended, allowing for the presence of more than that which is recited so long as basic or novel characteristics of that which is recited are not changed by the presence of more than that which is recited, but excludes prior art embodiments.
The term “amino acid sequence” is recited herein to refer to an amino acid sequence of a protein molecule. “Amino acid sequence” and like terms are not meant to limit the amino acid sequence to the complete, native amino acid sequence associated with the recited protein molecule; furthermore, an “amino acid sequence” can be deduced from the nucleic acid sequence encoding the protein.
The term “bacterial cell” is meant to include any Gram negative or Gram positive bacterial cell. Typically, Gram-negative bacteria can include Gluconobacter, Rhizobium, Bradyrhizobium, Alcaligenes, Rhodobacter, Rhodococcus, Azospirillum, Rhodospirillum, Sphingomonas, Burkholderia, Desulfomonas, Geospirillum, Succinomonas, Aeromonas, Shewanella, Halochromatium, Citrobacter, Escherichia, Klebsiella, Zymomonas, Zymobacter, and Acetobacter. Typically, Gram-positive bacteria can include Fibrobacter, Acidobacter, Bacteroides, Sphingobacterium, Actinomyces, Corynebacterium, Nocardia, Rhodococcus, Propionibacterium, Bifidobacterium, Bacillus, Geobacillus, Paenibacillus, Sulfobacillus, Clostridium, Anaerobacter, Eubacterium, Streptococcus, Lactobacillus, Leuconostoc, Enterococcus, Lactococcus, Thermobifida, Cellulomonas, and Sarcina.
The term “coding sequence” is defined herein as a polynucleotide sequence, which directly specifies the amino acid sequence of its protein product. By “fragment” is meant a portion (e.g., at least 5, 10, 25, 50, 100, 125, 150, 200, 250, 300, 350, 400, or 500 amino acids or nucleic acids) of a protein or nucleic acid molecule that is substantially identical to a reference protein or nucleic acid and retains the biological activity of the reference. In some embodiments the portion retains at least 50%, 75%, or 80%, or more preferably 90%, 95%, or even 99% of the biological activity of the reference protein or nucleic acid described herein, and retains at least one biological activity of the reference protein.
The term “fusion protein” as used herein is meant to refer to a protein created through genetic engineering from two or more proteins or peptides. As used herein, a fusion protein can refer to a protein in which an Asp-Asp-Asp-Asp-Lys (D4K) sequence (SEQ ID NO: 1) has been intentionally introduced for specific cleavage. Generally, cleavage of the fusion protein generates two polypeptides. A fusion protein according to the invention can be a recombinant fusion protein. In particular embodiments, a fusion protein can be generated, for example, from the addition of a vector-derived residue peptide at one terminus, for example the N-terminus, in addition to the amino acid sequence of the native protein. In this way, for example, a recombinant fusion protein can be constructed to have Asp-Asp-Asp-Asp-Lys (D4K) cleavage sites (SEQ ID NO: 1) in the vector and in the protein that contains Asp-Asp-Asp-Asp-Lys (D4K) sites (SEQ ID NO: 1) itself.
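By way of illustration only, the expected products of an ideal, fully specific cleavage at D4K sites can be predicted from the one-letter sequence of a fusion protein. The sketch below assumes that cleavage occurs immediately after the lysine of each Asp-Asp-Asp-Asp-Lys motif and that no other bond is cut; the function name and the example fusion sequence are hypothetical and are not taken from the disclosure.

```python
D4K = "DDDDK"  # one-letter form of Asp-Asp-Asp-Asp-Lys (SEQ ID NO: 1)


def cleave_at_d4k(protein):
    """Return the fragments expected from ideal cleavage after every D4K motif.

    Assumes cleavage on the C-terminal side of the Lys residue and no
    nonspecific cleavage elsewhere in the protein.
    """
    fragments = []
    start = 0
    while True:
        site = protein.find(D4K, start)
        if site == -1:
            break
        end = site + len(D4K)
        fragments.append(protein[start:end])
        start = end
    fragments.append(protein[start:])
    # Drop the empty fragment produced when the motif ends the protein.
    return [fragment for fragment in fragments if fragment]


# Hypothetical fusion: His-tagged leader, D4K site, then the target protein.
fusion = "MHHHHHHDDDDKIVGGTARGET"
print(cleave_at_d4k(fusion))  # ['MHHHHHHDDDDK', 'IVGGTARGET']
```

A protein with a single D4K site yields two fragments, in line with the statement that cleavage generally generates two polypeptides.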
The term “homologue”, as used herein, refers to a protein or nucleic acid sharing a certain degree of sequence “identity” or sequence “similarity” with a given protein, or the nucleic acid encoding the given protein. The term “percent identity” refers to the percentage of residues in two sequences that are the same when aligned for maximum correspondence. Sequence “similarity” is related to sequence “identity”, but differs in that residues that are not exactly the same as each other, but that are functionally “similar” are taken into consideration.
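As a minimal sketch of the “percent identity” calculation, assuming the two sequences have already been aligned (with “-” marking gaps) by a separate alignment program, the value can be computed as follows. The choice of denominator (all alignment columns that are not gaps in both sequences) is an assumption of this sketch; different programs use different conventions.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of alignment columns in which both sequences carry the same residue.

    Both inputs must be pre-aligned strings of equal length. Columns that are
    gaps in both sequences are ignored; a gap opposite a residue counts as a
    mismatch (an assumption of this sketch).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a, aligned_b)
        if not (a == gap and b == gap)
    ]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


print(percent_identity("ACDEF", "ACDQF"))  # 80.0
```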
The term “host cell” is meant to include any prokaryotic or eukaryotic cell that contains either a cloning vector or an expression vector. This term also includes those prokaryotic or eukaryotic cells that have been genetically engineered to contain the cloned gene(s) in the chromosome or genome of the host cell.
The term “hybridizes under stringent conditions” is intended to describe conditions for hybridization and washing under which nucleotide sequences at least 60%, 70%, 75%, 80%, 85%, 90%, or 95% homologous to each other typically remain hybridized to each other. Hybridization conditions are known to those skilled in the art and can be found in Current Protocols in Molecular Biology, John Wiley & Sons, N.Y., 1991. Moderate hybridization conditions are defined as equivalent to hybridization in 2× sodium chloride/sodium citrate (SSC) at 30° C., followed by a wash in 1×SSC, 0.1% SDS at 50° C. Highly stringent conditions are defined as equivalent to hybridization in 6× sodium chloride/sodium citrate (SSC) at 45° C., followed by a wash in 0.2×SSC, 0.1% SDS at 65° C.
The term “identical” is intended to include a first amino acid or nucleotide sequence which contains a sufficient or minimum number of the same or equivalent amino acid residues or nucleotides, e.g., an amino acid residue which has a similar side chain, to a second amino acid or nucleotide sequence such that the first and second amino acid or nucleotide sequences share common structural domains and/or a common functional activity. Accordingly, a homologous or identical nucleic acid molecule of the invention is at least 10, 15, 20, 25, 30 or more nucleotides in length and hybridizes under stringent conditions to a nucleic acid molecule encoding the amino acid sequence of SEQ ID NO: 2 or to a nucleic acid molecule encoding the amino acid sequence of SEQ ID NO: 4. Preferably, the molecule hybridizes under highly stringent conditions. In other embodiments, the nucleic acid is at least 15-20 nucleotides in length.
The terms “isolated,” “purified,” or “biologically pure” refer to material that is free to varying degrees from components which normally accompany it as found in its native state. Various levels of purity may be applied as needed according to this invention in the different methodologies set forth herein; the customary purity standards known in the art may be used if no standard is otherwise specified. The enteropeptidase polypeptides of the present invention can be in essentially or substantially pure form. For instance, they can be essentially free of other polypeptide material with which they are natively associated. They can also be at least 20% pure, preferably at least 40% pure, more preferably at least 60% pure, even more preferably at least 80% pure, most preferably at least 90% pure, and even most preferably at least 95% pure, as determined by agarose electrophoresis. This can be accomplished by preparing the polypeptide by any of a variety of well-known recombinant methods or by classical purification methods.
By “isolated nucleic acid molecule” is meant a nucleic acid (e.g., a DNA, RNA, or analog thereof) that is free of the genes which, in the naturally occurring genome of the organism from which the nucleic acid molecule of the invention is derived, flank the gene. The term therefore includes, for example, a recombinant DNA that is incorporated into a vector; into an autonomously replicating plasmid or virus; or into the genomic DNA of a prokaryote or eukaryote; or that exists as a separate molecule (for example, a cDNA or a genomic or cDNA fragment produced by PCR or restriction endonuclease digestion) independent of other sequences. In addition, the term includes an RNA molecule which is transcribed from a DNA molecule, as well as a recombinant DNA which is part of a hybrid gene encoding additional polypeptide sequence.
The term “isolated polypeptide” (e.g., an isolated or purified biosynthetic enzyme) refers to a polypeptide that is substantially free of cellular material or other contaminating polypeptides from the microorganism from which the polypeptide is derived, or substantially free from chemical precursors or other chemicals when chemically synthesized. The terms “isolated polypeptide” and “isolated protein” refer to compounds comprising amino acids joined via peptide bonds and are used interchangeably. Polypeptide molecules have an amino terminus (“N-terminus”) and a carboxy terminus (“C-terminus”). Peptide linkages occur between the backbone amino group of a first amino acid residue and the backbone carboxyl group of a second amino acid residue. Typically, the terminus of a polypeptide at which a new linkage would occur is the carboxy-terminus of the growing polypeptide chain, and polypeptide sequences are written from left to right beginning at the amino terminus.
The term “low” means a reduced amount, or a decreased amount, relative to an unmutated or unaltered nucleotide or polypeptide. Unaltered can mean unmutated. For example, an EP polypeptide of the invention that contains a mutation may have a low proteolytic activity as compared to an EP polypeptide that does not contain the same mutation. In exemplary embodiments the polypeptide has low proteolytic activity, which may be 10%, 15%, 25%, 50%, 75% or even 90% lower than unmutated or unaltered polypeptide.
The phrase “mutant nucleic acid molecule” or “mutant gene” is intended to include a nucleic acid molecule or gene having a nucleotide sequence which includes at least one alteration (e.g., substitution, insertion, deletion) such that the polypeptide that can be encoded by said mutant exhibits an activity that differs from the polypeptide encoded by the wild-type nucleic acid molecule or gene.
As used herein, the term “nucleotide” refers to a nucleoside phosphorylated at one of its pentose hydroxyl groups. The term “nucleoside” in turn refers to a compound consisting of a purine [guanine (G) or adenine (A)] or pyrimidine [thymine (T), uracil (U), or cytosine (C)] base covalently linked to a pentose. The term “polynucleotide” refers to a nucleic acid containing a sequence that is greater than about 100 nucleotides in length. The term “nucleic acid” refers to a covalently linked sequence of nucleotides in which the 3′ position of the pentose of one nucleotide is joined by a phosphodiester group to the 5′ position of the pentose of the next, and in which the nucleotide residues (bases) are linked in specific sequence; i.e., a linear order of nucleotides.
The term “nucleic acid” is intended to include nucleic acid molecules, e.g., polynucleotides which include an open reading frame encoding a polypeptide, and can further include non-coding regulatory sequences, and introns. In addition, the terms are intended to include one or more genes that map to a functional locus. In addition, the terms are intended to include a specific gene for a selected purpose. The gene can be endogenous to the host cell or can be recombinantly introduced into the host cell, e.g., as a plasmid maintained episomally or a plasmid (or fragment thereof) that is stably integrated into the genome.
The term “operably linked” denotes herein a configuration in which a control sequence is placed at an appropriate position relative to the coding sequence of the polynucleotide sequence such that the control sequence directs the expression of the coding sequence of a polypeptide.
The term “protease” is intended to include any polypeptide/s, alone or in combination with other polypeptides, that break peptide bonds between amino acids of proteins.
The term “proteolytic activity” is meant to refer to the cleavage activity of a substrate by an enzyme. In particular embodiments, the term refers to the enzymatic cleavage by enteropeptidases. In exemplary embodiments, the term is meant to refer to the specific activity of medaka EP for Asp-Asp-Asp-Asp-Lys cleavage sites (SEQ ID NO: 1). “Non-specific proteolytic activity” is meant to refer to cleavage activity that is not directed to a specific cleavage site. “Specific proteolytic activity” is meant to refer to cleavage activity that is directed to a specific cleavage site. Proteolytic activity can be
By “recombinant” is meant the product of genetic engineering or chemical synthesis.
The term “recombinant nucleic acid molecule” includes a nucleic acid molecule (e.g., a DNA molecule) that has been altered, modified or engineered such that it differs in nucleotide sequence from the native or natural nucleic acid molecule from which the recombinant nucleic acid molecule was derived (e.g., by addition, deletion or substitution of one or more nucleotides). In some embodiments, a recombinant nucleic acid molecule (e.g., a recombinant DNA molecule) includes an isolated nucleic acid molecule or gene of the present invention (e.g., an isolated EP nucleic acid molecule encoding an EP polypeptide) operably linked to regulatory sequences.
By “substantially identical” is meant a protein or nucleic acid molecule exhibiting at least 50% identity to a reference amino acid sequence (for example, any one of the amino acid sequences described herein) or nucleic acid sequence (for example, any one of the nucleic acid sequences described herein). Preferably, such a sequence is at least 50%, more preferably 60%, 70%, 75%, 80%, 85%, 90%, and most preferably 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% identical at the amino acid or nucleic acid level to the sequence used for comparison.
Sequence identity is typically measured using sequence analysis software (for example, Sequence Analysis Software Package of the Genetics Computer Group, University of Wisconsin Biotechnology Center, 1710 University Avenue, Madison, Wis. 53705, BLAST, BESTFIT, GAP, or PILEUP/PRETTYBOX programs). Such software matches identical or similar sequences by assigning degrees of homology to various substitutions, deletions, and/or other modifications. Conservative substitutions typically include substitutions within the following groups: glycine, alanine; valine, isoleucine, leucine; aspartic acid, glutamic acid, asparagine, glutamine; serine, threonine; lysine, arginine; and phenylalanine, tyrosine. In an exemplary approach to determining the degree of identity, a BLAST program may be used, with a probability score between e−3 and e−100 indicating a closely related sequence.
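The conservative substitution groups listed above can be encoded directly. The following is a minimal sketch, with hypothetical names, that checks whether replacing one residue with another stays within one of those groups; it implements only the grouping stated in this paragraph and is not the scoring scheme of any particular software package.

```python
# One-letter codes for the groups named in the text:
# Gly/Ala; Val/Ile/Leu; Asp/Glu/Asn/Gln; Ser/Thr; Lys/Arg; Phe/Tyr.
CONSERVATIVE_GROUPS = ["GA", "VIL", "DENQ", "ST", "KR", "FY"]


def is_conservative(original, replacement):
    """True if the two different residues belong to the same listed group."""
    if original == replacement:
        return False  # identical residues are not a substitution
    return any(
        original in group and replacement in group
        for group in CONSERVATIVE_GROUPS
    )


for label in ("K63R", "E173A"):
    print(label, is_conservative(label[0], label[-1]))
# K63R True
# E173A False
```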
The term “variant” when used in reference to a polypeptide refers to an amino acid sequence that differs by one or more amino acids from a reference polypeptide.
Enteropeptidase (EP)
Enteropeptidase (EP) is a serine protease enzyme that activates its substrates by cleavage. Enteropeptidase is an intestinal protease that removes an N-terminal fragment from trypsinogen. The remaining active fragment is trypsin. This cleavage initiates a cascade of proteolytic reactions leading to the activation of many pancreatic zymogens. See, for example, Matsushima et al., J. Biol. Chem. 269(31): 19976-19982 (1994), Kitamoto et al., Proc. Nat. Acad. Sci., 91(16): 7588-7592 (1994). Almost all of the trypsinogen sequences known to date contain a highly conserved tetra-aspartate sequence preceding the lysine-isoleucine scissile peptide bond. Although EP is widely considered to play a role in trypsinogen activation in all vertebrate species, reports on EP from non-mammalian species have been lacking. Japanese Patent Publication No. 2005-253352, incorporated herein by reference, has described an enteropeptidase sequence from the lower vertebrate medaka; however, the present study is the first to report on the molecular and biochemical characterization of EP from medaka.
The amino acid sequence of the fish EP is homologous to those of its mammalian counterparts, with all the structural features found in mammalian EPs being conserved, including various unique domains in the N-terminal heavy-chain. However, the extent of identity varies from domain to domain. LDLR domains 1 and 2, C1r/s domains 1 and 2, and the MAM domain are highly conserved between medaka and mammalian EP with 45-57% identity, while the identity in the mucin-like and MSCR domains between them is as low as 22%. This fact suggests that the former five domains in the heavy-chain play important roles throughout vertebrate species, although these roles are not known at present. As for the mucin-like and MSCR domains, a remarkable sequence homology is found among mammalian EPs, suggesting a conserved role for their respective domains in the molecular event involving EP in mammalian species. Indeed, a previous study clearly established the importance of the O-glycosylated mucin-like domain of bovine EP in apical targeting of the protein (12). It is not known at present whether the corresponding domain of medaka EP may also play such a role.
The heavy-chain of medaka EP has a hydrophobic segment near the N-terminus. This segment probably serves as a transmembrane anchor, as established for the mammalian EP. Consistent with this notion is the current observation that the 28-kDa immunoreactive protein was detected in the membrane fraction of medaka intestines by specific EP antibodies. The EP was also immunologically detected in the soluble fraction of the intestine. Therefore, as in the case of mammalian EPs, the medaka protease is synthesized as a single-chain zymogen in the intestine. After migrating to the surface of the intestine as a membrane-bound protein, some EP molecules probably undergo proteolytic attack by a protease(s) to generate soluble EP. The adult medaka fish intestinal epithelium is demonstrated to contain most of the cell types (enterocytes, goblet cells, and enteroendocrine cells) observed in the small intestine of other vertebrates, but lacks crypts containing Paneth cells and intestinal stem cells (22). The data presented herein suggest that medaka EP is localized in the enterocytes in the proximal intestinal epithelium.
Since EP is highly specific for the Asp-Asp-Asp-Asp-Lys (D4K) sequence (SEQ ID NO: 1), this motif has been intentionally introduced for the specific cleavage of fusion proteins. Bovine EP serine protease is now widely used for this purpose. The current system utilizing the bovine enzyme works reasonably well in many cases, but requires handling with great care. Difficulties often encountered include the following. (1) Bovine EP protease primarily cleaves at the EP-cleavage site of recombinant fusion proteins. However, other peptide bonds of the proteins are also hydrolyzed to a considerable degree by its nonspecific proteolytic activity. This results in a low yield of the protein in question. (2) For preparing active recombinant proteases, the bovine EP protease employed for cleavage of the inactive fusion protein presents an obstacle. This is particularly serious when the proteases to be examined are ones with very low activity for synthetic and protein substrates. Significant nonspecific activities of bovine EP protease often make it difficult to determine whether the target recombinant proteases have been successfully activated.
Isolated Nucleic Acid Molecules
Included in the scope of the present invention are isolated nucleic acid molecules. The nucleic acid molecule can be single-stranded or double-stranded DNA. The isolated nucleic acid molecule of the invention can include a nucleic acid molecule which is free of sequences which naturally flank the nucleic acid molecule (i.e., sequences located at the 5′ and 3′ ends of the nucleic acid molecule) in the chromosomal DNA of the organism from which the nucleic acid is derived. For instance, an isolated nucleic acid molecule can contain less than about 10 kb, 5 kb, 4 kb, 3 kb, 2 kb, 1 kb, 0.5 kb, 0.1 kb, 50 bp, 25 bp or 10 bp of nucleotide sequences which naturally flank the nucleic acid molecule in chromosomal DNA of the microorganism from which the nucleic acid molecule is derived. Moreover, an “isolated” nucleic acid molecule, such as a cDNA molecule, can be substantially free of other cellular materials when produced by recombinant techniques, or substantially free of chemical precursors or other chemicals when chemically synthesized.
In certain embodiments of the invention, the nucleic acid corresponds to enteropeptidase 1 (SEQ ID NO: 3):
GTGGTGGGTGGGGTCAATGCTGAAAAGGGGGCGTGGCCATGGATGGTGTC
CCTACACTGGAGGGGGCGTCATGGCTGTGGTGCCTCACTGATCGGCAGAG
ACTGGTTGCTGACTGCTGCACACTGTGTCTATGGGAAGAACACACACCTG
CAGTACTGGTCAGCTGTTCTTGGCCTTCATGCTCAGAGCAGCATGAACTC
ACAGGAAGTTCAGATCCGGCAGGTGGACCGCATTATCATCAACAAGAACT
ACAACAGAAGAACCAAAGAGGCAGACATCGCCATGATGCACCTGCAGCAG
CCAGTCAACTTCACTGAGTGGGTTCTGCCTGTGTGTTTAGCATCAGAAGA
TCAACATTTTCCAGCTGGAAGAAGGTGTTTCATTGCAGGGTGGGGTCGGG
ACGCTGAAGGAGGATCTCTACCTGACATTCTACAGGAGGCTGAGGTTCCC
CTGGTGGACCAGGATGAGTGCCAGCGTCTCTTACCCGAGTACACCTTCAC
CTCCAGCATGCTATGTGCTGGATATCCTGAAGGCGGAGTTGACTCCTGTC
AGGGTGACTCTGGAGGACCTCTGATGTGCTTAGAAGATGCACGGTGGACT
CTGATTGGTGTGACATCATTTGGCGTTGGCTGTGGGCGTCCTGAGAGACC
TGGAGCTTATGCTCGAGTGTCTGCTTTCACTTCATGGATTGCTGAGACCA
GGCGCTCCTCGTTCTCAGATCTAGACTGA
In other embodiments of the invention, the nucleic acid corresponds to enteropeptidase 1 with an E173A mutation (SEQ ID NO: 5):
GTGGTGGGTGGGGTCAATGCTGAAAAGGGGGCGTGGCCATGGATGGTGTC
CCTACACTGGAGGGGGCGTCATGGCTGTGGTGCCTCACTGATCGGCAGAG
ACTGGTTGCTGACTGCTGCACACTGTGTCTATGGGAAGAACACACACCTG
CAGTACTGGTCAGCTGTTCTTGGCCTTCATGCTCAGAGCAGCATGAACTC
ACAGGAAGTTCAGATCCGGCAGGTGGACCGCATTATCATCAACAAGAACT
ACAACAGAAGAACCAAAGAGGCAGACATCGCCATGATGCACCTGCAGCAG
CCAGTCAACTTCACTGAGTGGGTTCTGCCTGTGTGTTTAGCATCAGAAGA
TCAACATTTTCCAGCTGGAAGAAGGTGTTTCATTGCAGGGTGGGGTCGGG
ACGCTGAAGGAGGATCTCTACCTGACATTCTACAGGAGGCTGAGGTTCCC
CTGGTGGACCAGGATGCGTGCCAGCGTCTCTTACCCGAGTACACCTTCAC
CTCCAGCATGCTATGTGCTGGATATCCTGAAGGCGGAGTTGACTCCTGTC
AGGGTGACTCTGGAGGACCTCTGATGTGCTTAGAAGATGCACGGTGGACT
CTGATTGGTGTGACATCATTTGGCGTTGGCTGTGGGCGTCCTGAGAGACC
TGGAGCTTATGCTCGAGTGTCTGCTTTCACTTCATGGATTGCTGAGACCA
GGCGCTCCTCGTTCTCAGATCTAGACTGA
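The listings of SEQ ID NO: 3 and SEQ ID NO: 5 differ at a single nucleotide. The Python sketch below compares the tenth 50-nucleotide row of each listing (nucleotides 451-500, which starts in frame because 450 is a multiple of three) and translates the affected codon with the standard genetic code. The helper names are illustrative only. Counted from the first residue of the listed sequence, the changed residue is number 156; the designation E173A presumably follows a different numbering convention, which is an inference from the listings and not something the text states.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

ROW_OFFSET = 450  # nucleotides preceding the tenth row of each listing
WT_ROW = "CTGGTGGACCAGGATGAGTGCCAGCGTCTCTTACCCGAGTACACCTTCAC"   # SEQ ID NO: 3
MUT_ROW = "CTGGTGGACCAGGATGCGTGCCAGCGTCTCTTACCCGAGTACACCTTCAC"  # SEQ ID NO: 5


def point_differences(wt: str, mut: str, offset: int = 0):
    """Yield (nucleotide number, residue number, wt codon, mutant codon).

    Numbers are 1-based and counted from the start of the listing.
    The offset must be a multiple of three so the row begins in frame.
    """
    assert offset % 3 == 0 and len(wt) == len(mut)
    for i, (a, b) in enumerate(zip(wt, mut)):
        if a != b:
            start = i - (i % 3)  # first base of the codon containing i
            yield (offset + i + 1, (offset + i) // 3 + 1,
                   wt[start:start + 3], mut[start:start + 3])


for nt, residue, wt_codon, mut_codon in point_differences(WT_ROW, MUT_ROW, ROW_OFFSET):
    print(nt, residue, wt_codon, CODON_TABLE[wt_codon],
          mut_codon, CODON_TABLE[mut_codon])
# 467 156 GAG E GCG A
```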
In one embodiment, an isolated nucleic acid molecule of the invention comprises a nucleotide sequence which is at least about 50% identical, preferably 60%, 65%, 70%, 75%, 80%, or 85% identical, and more preferably 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical, to the nucleotide sequence of SEQ ID NO: 3 or SEQ ID NO: 5, or a complement thereof. In another embodiment, the nucleic acid molecule of the invention comprises a fragment of at least about 5-25, more preferably 10-15, nucleotides of a nucleic acid comprising the nucleotide sequence of SEQ ID NO: 3 or SEQ ID NO: 5, or a complement thereof, that retains the biological activity of SEQ ID NO: 3 or SEQ ID NO: 5, e.g., the fragments have proteolytic activity, and in more specific embodiments, the fragments can cleave at Asp-Asp-Asp-Asp-Lys cleavage sites (SEQ ID NO: 1) and have low non-specific proteolytic activity. The term “low” means a reduced amount, or a decreased amount, relative to an unmutated or unaltered nucleotide or polypeptide. Unaltered can mean unmutated. In exemplary embodiments the polypeptide has low proteolytic activity, which may be 10%, 15%, 25%, 50%, 75% or even 90% lower than that of the unmutated or unaltered polypeptide. In yet another embodiment, an isolated nucleic acid molecule of the invention encodes a polypeptide comprising an amino acid sequence that is at least about 50% homologous to the amino acid sequence of SEQ ID NO: 2 or SEQ ID NO: 4 and that retains the biological activity of SEQ ID NO: 2 or SEQ ID NO: 4, for example, proteolytic activity; in more specific embodiments, the polypeptide can cleave at Asp-Asp-Asp-Asp-Lys cleavage sites (SEQ ID NO: 1) and has low non-specific proteolytic activity.
Typically, the terms “sequence identity” or “homologue” include a nucleotide or polypeptide sharing at least about 30-35%, advantageously at least about 35-40%, more advantageously at least about 40-50%, and even more advantageously at least about 60%, 70%, 80%, 90% or more identity with the amino acid sequence of a wild-type polypeptide or a polypeptide described herein and having a substantially equivalent functional or biological activity as the wild-type polypeptide or the polypeptide described herein. For example, an enteropeptidase homologue shares at least about 30-35%, advantageously at least about 35-40%, more advantageously at least about 40-50%, and even more advantageously at least about 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% identity with the polypeptide having the amino acid sequence set forth as SEQ ID NO: 2 or SEQ ID NO: 4, and has substantially equivalent functional or biological activities (i.e., is a functional equivalent) of the polypeptide having the amino acid sequence set forth as SEQ ID NO: 2 or SEQ ID NO: 4 (e.g., has substantially equivalent enteropeptidase activities).
In another embodiment, an isolated nucleic acid molecule encodes a variant of a polypeptide comprising the amino acid sequence of SEQ ID NO: 2 or SEQ ID NO: 4, wherein the nucleic acid molecule hybridizes to a complement of a nucleic acid molecule comprising SEQ ID NO: 3 or SEQ ID NO: 5, under stringent conditions. Such stringent conditions are known to those skilled in the art and can be found in Current Protocols in Molecular Biology, John Wiley & Sons, N.Y. (1989), 6.3.1-6.3.6. A particular, non-limiting example of stringent (e.g., high stringency) hybridization conditions is hybridization in 6× sodium chloride/sodium citrate (SSC) at about 45° C., followed by one or more washes in 0.2×SSC, 0.1% SDS at 50-65° C. Advantageously, an isolated nucleic acid molecule of the invention that hybridizes under stringent conditions to the sequence of SEQ ID NO: 3 or SEQ ID NO: 5 corresponds to a naturally-occurring nucleic acid molecule or a naturally occurring allelic variant. Typically, a naturally-occurring nucleic acid molecule includes an RNA or DNA molecule having a nucleotide sequence that occurs in nature.
Modification of a nucleotide sequence encoding a polypeptide of the present invention may be necessary for the synthesis of polypeptides substantially identical or similar to the polypeptide. The terms “substantially identical” or “substantially similar” to the polypeptide can refer to non-naturally occurring forms of the polypeptide. These polypeptides may differ in some engineered way from the polypeptide isolated from its native source, e.g., artificial variants that differ in specific activity, thermostability, pH optimum, or the like. The variant sequence may be constructed on the basis of the nucleotide sequence presented as the polypeptide encoding region of SEQ ID NO: 5, e.g., a subsequence thereof, and/or by introduction of nucleotide substitutions which do not give rise to another amino acid sequence of the polypeptide encoded by the nucleotide sequence. For a general description of nucleotide substitution, see, e.g., Ford et al., Protein Expression and Purification, 2:95-107 (1991).
A nucleic acid molecule of the present invention (e.g., a nucleic acid molecule having the nucleotide sequence of SEQ ID NO: 3 or SEQ ID NO: 5) can be isolated using standard molecular biology techniques and the sequence information provided herein. For example, nucleic acid molecules can be isolated using standard hybridization and cloning techniques (e.g., as described in Sambrook, J., Fritsch, E. F., and Maniatis, T. Molecular Cloning: A Laboratory Manual. 2nd ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1989) or can be isolated by the polymerase chain reaction using synthetic oligonucleotide primers designed based upon the sequence of SEQ ID NO: 3 or SEQ ID NO: 5. A nucleic acid of the invention can be amplified using cDNA, mRNA or, alternatively, genomic DNA as a template and appropriate oligonucleotide primers according to standard PCR amplification techniques.
In one embodiment, an isolated nucleic acid molecule of the invention is selected from the group consisting of a nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO: 3 or SEQ ID NO: 5, or a complement thereof; and a nucleic acid molecule which encodes a polypeptide comprising the amino acid sequence of SEQ ID NO: 2 or SEQ ID NO: 4.
In another embodiment, the invention provides an isolated polynucleotide encoding a polypeptide, wherein the polynucleotide is a recombinant polynucleotide.
A recombinant polynucleotide can be a fusion. For example, a nucleic acid described herein (e.g., an EP nucleic acid) is expressed as a transcriptional or translational fusion with a detectable reporter, and expressed in an isolated cell (e.g., mammalian or insect cell) under the control of a heterologous promoter, such as an inducible promoter.
Host Cells
In another embodiment, the present invention provides a host cell. A host cell includes any cell type which is susceptible to transformation, transfection, or transduction with a nucleic acid construct or expression vector comprising a polynucleotide of the present invention. Host cells for use in expressing the EP polypeptides encoded by the expression vectors of the present invention include, but are not limited to, bacterial cells, such as E. coli; fungal cells, such as yeast cells (e.g., Saccharomyces cerevisiae); and animal cells, such as CHO cells. Appropriate culture media and conditions for the above-described host cells are well known in the art.
Isolation and Cloning
The techniques used to isolate or clone a polynucleotide encoding a polypeptide are known in the art and include isolation from genomic DNA, preparation from cDNA, or a combination thereof. The cloning of the polynucleotides of the present invention from such genomic DNA can be effected, e.g., by using the well-known polymerase chain reaction (“PCR”) or antibody screening of expression libraries to detect cloned DNA fragments with shared structural features. See, e.g., Innis et al., 1990, PCR Protocols: A Guide to Methods and Applications, Academic Press, New York. Other nucleic acid amplification procedures such as ligase chain reaction (LCR), ligated activated transcription (LAT) and nucleic acid sequence-based amplification (NASBA) may be used.
Amplification is the production of additional copies of a nucleic acid sequence and is generally carried out using PCR technologies well known in the art (Dieffenbach and Dveksler, PCR Primer: A Laboratory Manual, Cold Spring Harbor Press, Plainview, N.Y. (1995)). Polymerase chain reaction (“PCR”) refers to the methods disclosed in U.S. Pat. Nos. 4,683,195, 4,683,202 and 4,965,188, all of which are incorporated herein by reference, which describe a method for increasing the concentration of a segment of a target sequence in a mixture of genomic DNA without cloning or purification. This process for amplifying the target sequence consists of introducing a large excess of two oligonucleotide primers to the DNA mixture containing the desired target sequence, followed by a precise sequence of thermal cycling in the presence of a DNA polymerase. The two primers are complementary to their respective strands of the double-stranded target sequence. To effect amplification, the mixture is denatured and the primers are then annealed to their complementary sequences within the target molecule. Following annealing, the primers are extended with a polymerase so as to form a new pair of complementary strands. The steps of denaturation, primer annealing and polymerase extension can be repeated many times (i.e., denaturation, annealing and extension constitute one “cycle”; there can be numerous “cycles”) to obtain a high concentration of an amplified segment of the desired target sequence. The length of the amplified segment of the desired target sequence is determined by the relative positions of the primers with respect to each other, and therefore, this length is a controllable parameter.
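The dependence of amplicon length on primer position can be sketched in a few lines of Python. The example below is a simplified, exact-match model under stated assumptions: the template is the first 100 nucleotides of SEQ ID NO: 3, and both primers are hypothetical 20-mers chosen from that window purely for illustration (they are not primers disclosed herein). Real primer design would also consider melting temperature, mismatches and secondary structure, none of which is modeled.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def in_silico_pcr(template: str, fwd_primer: str, rev_primer: str):
    """Return the top-strand segment bounded by the two primers, or None.

    The forward primer must match the top strand exactly; the reverse
    primer must match the bottom strand exactly, downstream of the
    forward primer.
    """
    start = template.find(fwd_primer)
    if start == -1:
        return None
    rev_site = reverse_complement(rev_primer)  # reverse primer footprint on the top strand
    site = template.find(rev_site, start + len(fwd_primer))
    if site == -1:
        return None
    return template[start:site + len(rev_site)]


# First 100 nucleotides of SEQ ID NO: 3
TEMPLATE = (
    "GTGGTGGGTGGGGTCAATGCTGAAAAGGGGGCGTGGCCATGGATGGTGTC"
    "CCTACACTGGAGGGGGCGTCATGGCTGTGGTGCCTCACTGATCGGCAGAG"
)
FWD = TEMPLATE[5:25]                       # hypothetical forward primer
REV = reverse_complement(TEMPLATE[70:90])  # hypothetical reverse primer

amplicon = in_silico_pcr(TEMPLATE, FWD, REV)
print(len(amplicon))  # 85
```

Moving either primer along the template changes the amplicon length accordingly, which is the sense in which the length is a controllable parameter.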
With PCR, it is possible to amplify a single copy of a specific target sequence in genomic DNA to a level detectable by several different methodologies (e.g., hybridization with a labeled probe; incorporation of biotinylated primers followed by avidin-enzyme conjugate detection; and/or incorporation of 32P-labeled deoxyribonucleotide triphosphates, such as dCTP or dATP, into the amplified segment). In addition to genomic DNA, any oligonucleotide sequence can be amplified with the appropriate set of primer molecules. In particular, the amplified segments created by the PCR process itself are, themselves, efficient templates for subsequent PCR amplifications. Amplified target sequences may be used to obtain segments of DNA (e.g., genes) for the construction of targeting vectors, transgenes, etc.
A “primer” refers to an oligonucleotide, whether occurring naturally or produced synthetically, which is capable of acting as a point of initiation of nucleic acid synthesis when placed under conditions in which synthesis of a primer extension product which is complementary to a nucleic acid strand is induced (i.e., in the presence of nucleotides, an inducing agent such as DNA polymerase, and under suitable conditions of temperature and pH). The primer is preferably single-stranded for maximum efficiency in amplification, but may alternatively be double-stranded. If double-stranded, the primer is first treated to separate its strands before being used to prepare extension products. Preferably, the primer is an oligodeoxyribonucleotide. The primer must be sufficiently long to prime the synthesis of extension products in the presence of the inducing agent. The exact lengths of the primers will depend on many factors, including temperature, source of primer and use of the method.
A probe refers to an oligonucleotide (i.e., a sequence of nucleotides), whether occurring naturally or produced synthetically, recombinantly or by PCR amplification, which is capable of hybridizing to another oligonucleotide of interest. A probe may be single-stranded or double-stranded. Probes are useful in the detection, identification and isolation of particular gene sequences. It is contemplated that the probe used in the present invention is labeled with any “reporter molecule,” so that it is detectable in a detection system, including, but not limited to enzyme (i.e., ELISA, as well as enzyme-based histochemical assays), fluorescent, radioactive, and luminescent systems. It is not intended that the present invention be limited to any particular detection system or label. The terms “reporter molecule” and “label” are used herein interchangeably. In addition to probes, primers and deoxynucleoside triphosphates may contain labels; these labels may comprise, but are not limited to, 32P, 33P, or fluorescent molecules (e.g., fluorescent dyes).
As used herein, the terms “Southern blot analysis” and “Southern blot” and “Southern” refer to the analysis of DNA on agarose or acrylamide gels in which DNA is separated or fragmented according to size, followed by transfer of the DNA from the gel to a solid support, such as nitrocellulose or a nylon membrane. The immobilized DNA is then exposed to a labeled probe to detect DNA species complementary to the probe used. The DNA may be cleaved with restriction enzymes prior to electrophoresis. Following electrophoresis, the DNA may be partially depurinated and denatured prior to or during transfer to the solid support. Southern blots are a standard tool of molecular biologists. J. Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Press, NY, pp. 9.31-9.58 (1989).
As used herein, the terms “Northern blot analysis” and “Northern blot” and “Northern” refer to the analysis of RNA by electrophoresis of RNA on agarose gels to fractionate the RNA according to size, followed by transfer of the RNA from the gel to a solid support, such as nitrocellulose or a nylon membrane. The immobilized RNA is then probed with a labeled probe to detect RNA species complementary to the probe used. Northern blots are a standard tool of molecular biologists. J. Sambrook et al., supra, pp. 7.39-7.52.
As used herein, the terms “Western blot analysis” and “Western blot” and “Western” refer to the analysis of protein(s) (or polypeptides) immobilized onto a support such as nitrocellulose or a nylon membrane. A mixture comprising at least one protein is first separated on an acrylamide gel, and the separated proteins are then transferred from the gel to a solid support, such as nitrocellulose or a nylon membrane. The immobilized proteins are exposed to at least one antibody with reactivity against at least one antigen of interest. The bound antibodies may be detected by various methods, including the use of radiolabeled antibodies.
Isolated Polypeptides
Another aspect of the present invention features isolated enteropeptidase polypeptides (e.g., isolated enteropeptidase-1 polypeptides).
An isolated or purified polypeptide (e.g., an isolated or purified EP-1) is substantially free of cellular material or other contaminating polypeptides from the microorganism from which the polypeptide is derived, or substantially free from chemical precursors or other chemicals when chemically synthesized.
Included within the scope of the present invention are EP-1 polypeptides or gene products that are mammalian derived polypeptides or gene products. In a preferred embodiment, the EP-1 polypeptide or gene product is derived from the teleost Medaka. Further included within the scope of the present invention are EP-1 polypeptides or gene products that can be non-mammalian or mammalian derived polypeptides or gene products which differ from naturally-occurring EP-1 genes or polypeptides, for example, genes which have nucleic acids that are mutated, inserted or deleted, but which encode polypeptides substantially similar to the naturally-occurring gene products of the present invention, e.g., polypeptides that are cleavage specific for Asp-Asp-Asp-Asp-Lys (SEQ ID NO: 1) and have low non-specific proteolytic activity. Low non-specific proteolytic activity is meant to refer to a reduced amount, or a decreased amount, relative to an unmutated or unaltered nucleotide or polypeptide. Unaltered can mean unmutated. In exemplary embodiments the polypeptide has low proteolytic activity, which may be 10%, 15%, 25%, 50%, 75% or even 90% lower than that of the unmutated or unaltered polypeptide.
In particular embodiments of the invention, the isolated polypeptide is EP-1, having SEQ ID NO: 2:
VVGGVNAEKGAWPWMVSLHWRGRHGCGASLIGRDWLLTAAHCVYGKNTHL
QYWSAVLGLHAQSSMNSQEVQIRQVDRIIINKNYNRRTKEADIAMMHLQQ
PVNFTEWVLPVCLASEDQHFPAGRRCFIAGWGRDAEGGSLPDILQEAEVP
LVDQDECQRLLPEYTFTSSMLCAGYPEGGVDSCQGDSGGPLMCLEDARWT
LIGVTSFGVGCGRPERPGAYARVSAFTSWIAETRRSSFSDLD*
In other particular embodiments of the invention, the isolated polypeptide is EP-1 with an E173A mutation, having SEQ ID NO: 4:
VVGGVNAEKGAWPWMVSLHWRGRHGCGASLIGRDWLLTAAHCVYGKNTHL
QYWSAVLGLHAQSSMNSQEVQIRQVDRIIINKNYNRRTKEADIAMMHLQQ
PVNFTEWVLPVCLASEDQHFPAGRRCFIAGWGRDAEGGSLPDILQEAEVP
LVDQDACQRLLPEYTFTSSMLCAGYPEGGVDSCQGDSGGPLMCLEDARWT
LIGVTSFGVGCGRPERPGAYARVSAFTSWIAETRRSSFSDLD*
It is well understood that one of skill in the art can mutate (e.g., substitute) nucleic acids which, due to the degeneracy of the genetic code, encode for an identical amino acid as that encoded by the naturally occurring gene. This may be desirable in order to improve the codon usage of a nucleic acid. Moreover, it is well understood that one of skill in the art can mutate (e.g., substitute) nucleic acids which encode for conservative amino acid substitutions. It is further well understood that one of skill in the art can substitute, add or delete amino acids to a certain degree without substantially affecting the function of a gene product (e.g., a cleavage specific activity, for example cleavage specificity for Asp-Asp-Asp-Asp-Lys (SEQ ID NO: 1)) as compared with a naturally-occurring gene product, each instance of which is intended to be included within the scope of the present invention.
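To illustrate the degeneracy referred to above, the Python sketch below lists the codons that encode the same amino acid under the standard genetic code and checks whether a given codon substitution is silent. It is only a sketch: choosing among synonymous codons for a particular host (codon usage) is outside its scope, and the function names are illustrative.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def synonymous_codons(codon: str) -> list[str]:
    """Return the other codons that encode the same amino acid, sorted."""
    residue = CODON_TABLE[codon]
    return sorted(c for c, aa in CODON_TABLE.items() if aa == residue and c != codon)


def is_silent(wt_codon: str, new_codon: str) -> bool:
    """True if replacing wt_codon with new_codon leaves the amino acid unchanged."""
    return CODON_TABLE[wt_codon] == CODON_TABLE[new_codon]


print(synonymous_codons("GAG"))  # ['GAA']
print(is_silent("GAG", "GAA"))   # True  (Glu -> Glu)
print(is_silent("GAG", "GCG"))   # False (Glu -> Ala)
```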
In an embodiment of the invention, the isolated nucleic acid molecule of the invention is selected from a nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO: 3 or SEQ ID NO: 5, or a complement thereof. In another embodiment of the invention the nucleic acid molecule encodes a polypeptide comprising the amino acid sequence of SEQ ID NO: 2 or SEQ ID NO: 4.
Included in the scope of the invention are isolated polypeptides (e.g., an isolated EP polypeptide, more specifically an isolated EP-1 polypeptide) that comprise a fragment of a polypeptide comprising the amino acid sequence of SEQ ID NO: 2 or SEQ ID NO: 4, wherein the fragment comprises at least 5-15 contiguous amino acids of SEQ ID NO: 2 or SEQ ID NO: 4 and retains at least one biological activity of the reference polypeptide, e.g., is cleavage specific for Asp-Asp-Asp-Asp-Lys (SEQ ID NO: 1) and has low non-specific proteolytic activity.
Also included in the scope of the invention is a variant or naturally occurring allelic variant of a polypeptide comprising the amino acid sequence of SEQ ID NO: 2 or SEQ ID NO: 4, wherein the polypeptide is encoded by a nucleic acid molecule which hybridizes to a complement of a nucleic acid molecule comprising SEQ ID NO: 3 or SEQ ID NO: 5 under stringent conditions.
Modification of a nucleotide sequence encoding a polypeptide of the present invention may be necessary for the synthesis of polypeptides substantially identical or similar to the polypeptide. The terms “substantially identical” or “substantially similar” to the polypeptide can refer to non-naturally occurring forms of the polypeptide. These polypeptides may differ in some engineered way from the polypeptide isolated from its native source, e.g., artificial variants that differ in specific activity, thermostability, pH optimum, or the like. The variant sequence may be constructed on the basis of the nucleotide sequence presented as the polypeptide encoding region of SEQ ID NO: 5, e.g., a subsequence thereof, and/or by introduction of nucleotide substitutions which do not give rise to another amino acid sequence of the polypeptide encoded by the nucleotide sequence. For a general description of nucleotide substitution, see, e.g., Ford et al., Protein Expression and Purification, 2:95-107 (1991).
It will be apparent to those skilled in the art that such substitutions can be made outside the regions critical to the function of the molecule and still result in an active polypeptide. Amino acid residues essential to the activity of the polypeptide encoded by an isolated polynucleotide of the invention, and therefore preferably not subject to substitution, may be identified according to procedures known in the art, such as site-directed mutagenesis or alanine-scanning mutagenesis. See, e.g., Cunningham and Wells, Science, 244:1081-1085 (1989). In the latter technique, mutations are introduced at every positively charged residue in the molecule, and the resultant mutant molecules are tested for biological activity to identify amino acid residues that are critical to the activity of the molecule. Sites of substrate-enzyme interaction can also be determined by analysis of the three-dimensional structure as determined by such techniques as nuclear magnetic resonance analysis, crystallography or photoaffinity labeling. See, e.g., de Vos et al., Science, 255:306-312 (1992); Smith et al., Journal of Molecular Biology, 224:899-904 (1992); Wlodawer et al., FEBS Letters, 309:59-64 (1992).
In other embodiments, an isolated polypeptide of the present invention comprises an amino acid sequence which is a homologue of at least one of the polypeptides set forth as SEQ ID NO: 2 or SEQ ID NO: 4 (e.g., comprises an amino acid sequence at least about 30-40% identical, advantageously about 40-50% identical, more advantageously about 50-60% identical, and even more advantageously about 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% identical to the amino acid sequence of SEQ ID NO: 2 or SEQ ID NO: 4), and has an activity that is substantially similar to that of the polypeptide having the amino acid sequence of SEQ ID NO: 2 or SEQ ID NO: 4, respectively, for example is cleavage specific for Asp-Asp-Asp-Asp-Lys (SEQ ID NO: 1) and has low non-specific proteolytic activity.
To determine the percent identity of two amino acid sequences or of two nucleic acids, the sequences are aligned for optimal comparison purposes (e.g., gaps can be introduced in the sequence of a first amino acid or nucleic acid sequence for optimal alignment with a second amino acid or nucleic acid sequence). When a position in the first sequence is occupied by the same amino acid residue or nucleotide as the corresponding position in the second sequence, then the molecules are identical at that position. The percent identity between the two sequences is a function of the number of identical positions shared by the sequences (i.e., % identity = (# of identical positions / total # of positions) × 100), advantageously taking into account the number of gaps and size of said gaps necessary to produce an optimal alignment.
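The formula above can be written out as a short Python sketch. It assumes the two sequences have already been aligned (for example by one of the programs discussed below) and are supplied as equal-length strings with "-" marking gaps. It also assumes that "total # of positions" means the number of alignment columns, gap columns included, which is one of several conventions in use. The function name is illustrative.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences ('-' marks a gap).

    Identical positions are columns where both sequences carry the same
    residue; gap columns count toward the total but never as identical.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    identical = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * identical / len(aligned_a)


# Residues 151-160 of the SEQ ID NO: 2 and SEQ ID NO: 4 listings
print(percent_identity("LVDQDECQRL", "LVDQDACQRL"))  # 90.0

# Toy nucleotide alignment with one gap column
print(percent_identity("AC-GT", "ACTGT"))  # 80.0
```

Over their full 242-residue length, SEQ ID NO: 2 and SEQ ID NO: 4 differ at a single position, so the same calculation gives 241/242, or roughly 99.6% identity.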
The comparison of sequences and determination of percent identity between two sequences can be accomplished using a mathematical algorithm. A particular, non-limiting example of a mathematical algorithm utilized for the comparison of sequences is the algorithm of Karlin and Altschul (1990) Proc. Natl. Acad. Sci. USA 87:2264-68, modified as in Karlin and Altschul (1993) Proc. Natl. Acad. Sci. USA 90:5873-77. Such an algorithm is incorporated into the NBLAST and XBLAST programs (version 2.0) of Altschul et al. (1990) J. Mol. Biol. 215:403-10. BLAST nucleotide searches can be performed with the NBLAST program, score=100, wordlength=12 to obtain nucleotide sequences homologous to nucleic acid molecules of the invention. BLAST polypeptide searches can be performed with the XBLAST program, score=50, wordlength=3 to obtain amino acid sequences homologous to polypeptide molecules of the invention. To obtain gapped alignments for comparison purposes, Gapped BLAST can be utilized as described in Altschul et al. (1997) Nucleic Acids Research 25(17): 3389-3402. When utilizing BLAST and Gapped BLAST programs, the default parameters of the respective programs (e.g., XBLAST and NBLAST) can be used. Another particular, non-limiting example of a mathematical algorithm utilized for the comparison of sequences is the algorithm of Myers and Miller (1988) Comput Appl Biosci. 4:11-17. Such an algorithm is incorporated into the ALIGN program available, for example, at the GENESTREAM network server, IGH Montpellier, FRANCE or at the ISREC server. When utilizing the ALIGN program for comparing amino acid sequences, a PAM120 weight residue table, a gap length penalty of 12, and a gap penalty of 4 can be used.
In another embodiment, the percent identity between two amino acid sequences can be determined using the GAP program in the GCG software package, using either a BLOSUM62 matrix or a PAM250 matrix, and a gap weight of 12, 10, 8, 6, or 4 and a length weight of 2, 3, or 4. In yet another embodiment, the percent homology between two nucleic acid sequences can be determined using the GAP program in the GCG software package, using a gap weight of 50 and a length weight of 3.
Also included in the scope of the invention are isolated polypeptides comprising a fragment of SEQ ID NO: 2 or SEQ ID NO: 4, wherein the amino acids of the fragment are arranged in any sequence such that the fragment is cleavage specific for Asp-Asp-Asp-Asp-Lys (SEQ ID NO: 1) and has the low non-specific proteolytic activity of SEQ ID NO: 2 or SEQ ID NO: 4.
It is well understood that also included in the scope of the invention are synthetic or recombinant polypeptides.
In another preferred embodiment of the invention are provided isolated EP polypeptides comprising an amino acid sequence which is a variant of the polypeptide of SEQ ID NO: 2. As used herein, the term “variant” when used in reference to a polypeptide refers to an amino acid sequence that differs by one or more amino acids from a reference polypeptide. The variant may have “conservative” changes, wherein a substituted amino acid has similar structural or chemical properties. More rarely, a variant may have “non-conservative” changes. Similar minor variations may also include amino acid deletions or insertions, or both. An EP variant polypeptide and polynucleotide encoding the same can be generated using any technique known in the art, including site-directed mutagenesis. See, e.g., Ling et al., “Approaches to DNA mutagenesis: an overview,” Anal. Biochem. 254(2):157-78 (1997); Dale et al., “Oligonucleotide-directed random mutagenesis using the phosphorothioate method,” Methods Mol. Biol., 57:369-74 (1996); Smith, “In vitro mutagenesis,” Ann. Rev. Genet., 19:423-462 (1985); Botstein et al., “Strategies and applications of in vitro mutagenesis,” Science, 229:1193-1201 (1985); Carter, “Site-directed mutagenesis,” Biochem. J., 237:1-7 (1986); Kramer et al., “Point Mismatch Repair,” Cell, 38:879-887 (1984); Wells et al., “Cassette mutagenesis: an efficient method for generation of multiple mutations at defined sites,” Gene, 34:315-323 (1985); Minshull et al., “Protein evolution by molecular breeding,” Current Opinion in Chemical Biology, 3:284-290 (1999); Christians et al., “Directed evolution of thymidine kinase for AZT phosphorylation using DNA family shuffling,” Nature Biotechnology, 17:259-264 (1999); Crameri et al., “DNA shuffling of a family of genes from diverse species accelerates directed evolution,” Nature, 391:288-291 (1998); Crameri et al., “Molecular evolution of an arsenate detoxification pathway by DNA shuffling,” Nature Biotechnology, 15:436-438 (1997); Zhang et al., “Directed evolution of an effective fructosidase from a galactosidase by DNA shuffling and screening,” Proceedings of the National Academy of Sciences, U.S.A., 94:4504-4509 (1997); Crameri et al., “Improved green fluorescent protein by molecular evolution using DNA shuffling,” Nature Biotechnology, 14:315-319 (1996); Stemmer, “Rapid evolution of a protein in vitro by DNA shuffling,” Nature, 370:389-391 (1994); Stemmer, “DNA shuffling by random fragmentation and reassembly: In vitro recombination for molecular evolution,” Proceedings of the National Academy of Sciences, U.S.A., 91:10747-10751 (1994); WO 95/22625; WO 97/0078; WO 97/35966; WO 98/27230; WO 00/42651; WO 01/75767 and U.S. Pat. No. 6,537,746 which issued to Arnold et al. on Mar. 25, 2003 and is entitled “Method for creating polynucleotide and polypeptide sequences.” To maximize any diversity, several of the above-described techniques can be used in combination.
EP variant polypeptides of the present invention can be prepared, for example, by using a wild-type EP polypeptide as a starting material to be improved. The term “wild-type” as applied to a polynucleotide means that the nucleic acid fragment does not comprise any mutations from the form isolated from nature. The term “wild-type” as applied to a polypeptide (or protein) means that the protein will be active at a level of activity found in nature and typically will comprise the amino acid sequence as found in nature. In contrast, the term “modified” or “mutant,” when made in reference to a polynucleotide or polypeptide (or protein), refers, respectively, to a polynucleotide or to a polypeptide (or protein) which displays modifications in sequence and/or functional properties (i.e., altered characteristics) when compared to the wild-type polynucleotide or polypeptide. Thus, the term “wild type” indicates a starting or reference sequence prior to a manipulation of the invention.
Suitable sources of wild-type EP can be identified by screening genomic libraries of organisms for the EP activities described herein. In the present invention, a parental amino acid or nucleic acid sequence encoding the wild-type Medaka EP polypeptide was constructed. The sequence designated EP-1 (SEQ ID NO: 3 or 4) was utilized as the starting point for all experiments and library construction.
Also included in the scope of the invention are the isolated polypeptides described herein, wherein the polypeptide comprising the amino acid sequence of SEQ ID NO: 2 has at least one mutation. In certain embodiments, the mutation can be a substitution, a deletion, or an addition. In exemplary embodiments, the mutation is a substitution. The substitution can occur anywhere in SEQ ID NO: 2, but preferably the substitution occurs at an amino acid residue selected from the group consisting of residue 93 through residue 193. In exemplary embodiments, the substitution comprises a substitution at one or more residues selected from position 63, 105, 144, 173 or 193. In exemplary embodiments, the substitution is at residue 63, and consists of K63R, K63A or K63E. In other exemplary embodiments, the substitution is at residue 105, and consists of T105A, T105R, or T105E. In other exemplary embodiments, the substitution is at residue 144, and consists of F144S. In other exemplary embodiments, the substitution is at residue 173, and consists of E173A. In other exemplary embodiments, the substitution is at residue 193, and consists of P193E or P193A.
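The substitution labels used above (for example K63R) name the wild-type residue, its 1-based position, and the replacement residue. The following is a minimal Python sketch of how such a label could be checked and applied to a one-letter amino acid sequence. The helper name apply_substitution and the 10-residue toy sequence are hypothetical and serve only as illustration; the toy sequence is not SEQ ID NO: 2.

```python
import re

# Label format assumed here: wild-type residue, 1-based position, replacement
# residue (e.g. "K63R"). This helper is illustrative, not part of the patent.
SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_substitution(sequence: str, label: str) -> str:
    """Return a copy of `sequence` carrying the substitution named by `label`."""
    match = SUBSTITUTION.match(label)
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    wild_type = match.group(1)
    position = int(match.group(2))
    replacement = match.group(3)
    index = position - 1  # labels are 1-based, Python strings are 0-based
    if not 0 <= index < len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    if sequence[index] != wild_type:
        raise ValueError(
            f"expected {wild_type} at position {position}, found {sequence[index]}"
        )
    return sequence[:index] + replacement + sequence[index + 1:]


# Hypothetical toy sequence (NOT SEQ ID NO: 2), with K at position 3.
toy_sequence = "MAKTFEPGLS"
print(apply_substitution(toy_sequence, "K3R"))  # MARTFEPGLS
```

Checking the wild-type residue before substituting guards against off-by-one errors in residue numbering, which are easy to make when a label is applied to a truncated or tagged construct.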
Based on the foregoing isolated enteropeptidase polypeptides, immunospecific antibodies can be raised against an EP polypeptide, or portions thereof as described herein, using standard techniques known in the art.
Methods of the Invention
In one embodiment of the present invention are methods for producing any of the polypeptides of the invention, for example a polypeptide comprising the amino acid sequence SEQ ID NO: 2, or SEQ ID NO: 4, a fragment of a polypeptide comprising the amino acid sequence of SEQ ID NO: 2, or SEQ ID NO: 4; wherein the fragment comprises at least 15 contiguous amino acids of SEQ ID NO: 2, or SEQ ID NO: 4, a naturally occurring allelic variant of a polypeptide comprising the amino acid sequence of SEQ ID NO:2, or SEQ ID NO:4, wherein the polypeptide is encoded by a nucleic acid molecule which hybridizes to a complement of a nucleic acid molecule comprising SEQ ID NO:3, or SEQ ID NO:5, under stringent conditions. The method for producing the above-mentioned polypeptides comprises culturing the host cells of the invention under conditions in which the nucleic acid molecule is expressed.
The term “nucleotide construct” as used herein refers to a nucleic acid molecule, either single- or double-stranded, which is isolated from a naturally occurring gene or which is modified to contain segments of nucleic acids in a manner that would not otherwise exist in nature. The term nucleic acid construct is inclusive of the term expression cassette or expression vector when the nucleic acid construct contains all the control sequences required for expression of a coding sequence (polynucleotide) of the present invention.
The term “coding sequence” is defined herein as a polynucleotide sequence, which directly specifies the amino acid sequence of its protein product. The boundaries of a genomic coding sequence are generally determined by a ribosome binding site (prokaryotes) or by the ATG start codon (eukaryotes) located just upstream of the open reading frame at the 5′ end of the mRNA and a transcription terminator sequence located just downstream of the open reading frame at the 3′ end of the mRNA. A coding sequence can include, but is not limited to, DNA, cDNA, and recombinant nucleic acid sequences.
The term “control sequence” includes all components that are necessary or advantageous for the expression of a polynucleotide encoding a polypeptide of the present invention. Each control sequence may be native or foreign to the nucleotide sequence encoding the polypeptide, or native or foreign to each other. Such control sequences may include, but are not limited to, a promoter, and transcriptional and translational stop signals. The control sequence may be an appropriate promoter sequence. The promoter sequence is a relatively short nucleic acid sequence that is recognized by a host cell for expression of the longer coding region that follows. The promoter sequence contains transcriptional control sequences, which mediate the expression of the polypeptide. The promoter may be any nucleic acid sequence which shows transcriptional activity in the host cell of choice, including mutant, truncated, and hybrid promoters, and may be obtained from genes encoding extracellular or intracellular polypeptides either homologous or heterologous to the host cell.
The term “operably linked” denotes herein a configuration in which a control sequence is placed at an appropriate position relative to the coding sequence of the polynucleotide sequence such that the control sequence directs the expression of the coding sequence of a polypeptide.
The present invention provides an expression vector comprising the polynucleotide described above. The term “expression” includes any step involved in the production of the polypeptide including, but not limited to, transcription, post-transcriptional modification, translation, post-translational modification, and secretion. The term “expression vector” is defined herein as a linear or circular DNA molecule that comprises a polynucleotide encoding a polypeptide of the invention, and which is operably linked to additional nucleotides that provide for its expression.
In particular embodiments of the methods for producing any of the polypeptides of the invention, the polypeptide is produced in an E. coli expression system.
In one embodiment, the various nucleic acid and control sequences described above may be joined together to produce a recombinant expression vector which may include one or more convenient restriction sites to allow for insertion or substitution of the nucleic acid sequence encoding the polypeptide at such sites. Alternatively, the nucleic acid sequence of the present invention may be expressed by inserting the nucleic acid sequence or a nucleic acid construct comprising the sequence into an appropriate vector for expression. In creating the expression vector, the coding sequence is located in the vector so that the coding sequence is operably linked with the appropriate control sequences for expression.
The expression vector may be any vector (e.g., a plasmid or virus), which can be conveniently subjected to recombinant DNA procedures and can bring about the expression of the polynucleotide sequence. The choice of the vector will typically depend on the compatibility of the vector with the host cell into which the vector is to be introduced. The vectors may be linear or closed circular plasmids.
The expression vector may be an autonomously replicating vector, i.e., a vector which exists as an extrachromosomal entity, the replication of which is independent of chromosomal replication, e.g., a plasmid, an extrachromosomal element, a minichromosome, or an artificial chromosome. The vector may contain any means for assuring self-replication. Alternatively, the vector may be one which, when introduced into the host cell, is integrated into the genome and replicated together with the chromosome(s) into which it has been integrated. Furthermore, a single vector or plasmid, or two or more vectors or plasmids which together contain the total DNA to be introduced into the genome of the host cell, or a transposon may be used.
Preferably, the expression vector contains one or more selectable markers, which permit easy selection of transformed cells. A selectable marker is a gene the product of which provides for biocide or viral resistance, resistance to heavy metals, prototrophy to auxotrophs, and the like. Examples of bacterial selectable markers are the dal genes from Bacillus subtilis or Bacillus licheniformis, or markers, which confer antibiotic resistance such as ampicillin, kanamycin, chloramphenicol or tetracycline resistance. Suitable markers for yeast host cells are ADE2, HIS3, LEU2, LYS2, MET3, TRP1, and URA3.
Selectable markers for use in a filamentous fungal host cell include, but are not limited to, amdS (acetamidase), argB (ornithine carbamoyltransferase), bar (phosphinothricin acetyltransferase), hph (hygromycin phosphotransferase), niaD (nitrate reductase), pyrG (orotidine-5′-phosphate decarboxylase), sC (sulfate adenyltransferase), and trpC (anthranilate synthase), as well as equivalents thereof. Preferred for use in an Aspergillus cell are the amdS and pyrG genes of Aspergillus nidulans or Aspergillus oryzae and the bar gene of Streptomyces hygroscopicus.
The procedures used to ligate the elements described above to construct the recombinant nucleic acid construct and expression vectors of the present invention are well known to one skilled in the art. See, e.g., J. Sambrook et al., Molecular Cloning, A Laboratory Manual, 2nd edition, Cold Spring Harbor, N.Y. (1989).
Manipulation of the isolated polynucleotide prior to its insertion into a vector may be desirable or necessary depending on the expression vector. An isolated polynucleotide encoding the EP polypeptides of the present invention may be manipulated in a variety of ways well known in the art to provide for expression of the polypeptide.
In certain embodiments, the host cell of the invention contains any of the nucleic acid molecules as described herein. In exemplary embodiments, the host cell is a bacterial cell. In certain embodiments, the bacterial cell is Escherichia coli.
Engineered host cells can be cultured in conventional nutrient media modified as appropriate for activating promoters, selecting transformants or amplifying the polynucleotides of the invention. Following transformation of a suitable host strain and growth of the host strain to an appropriate cell density, the selected promoter may be induced by appropriate means (e.g., temperature shift or chemical induction) and the cells may be cultured for an additional period to allow them to produce the desired polypeptide or fragment thereof.
Also included in the methods of the invention are methods for cleavage of a protein containing an Asp-Asp-Asp-Asp-Lys cleavage site (SEQ ID NO: 1) with any of the EP-1 polypeptides of the invention as described herein, the method comprising contacting the protein with any of the polypeptides of claims 1-44, wherein the contacting of the protein with the polypeptide results in specific cleavage. The protein that is the target for the EP-1 polypeptide, e.g., the protein that contains an Asp-Asp-Asp-Asp-Lys cleavage site (SEQ ID NO: 1), can be a fusion protein, for example a recombinant fusion protein. A fusion protein is a protein created through genetic engineering from two or more proteins/peptides. This can be achieved by creating a fusion gene: removing the stop codon from the DNA sequence of the first protein, then appending the DNA sequence of the second protein in frame. That DNA sequence will then be expressed by a cell as a single protein. A fusion protein can refer to a protein in which an Asp-Asp-Asp-Asp-Lys (D4K) sequence (SEQ ID NO: 1) has been intentionally introduced for specific cleavage. Generally, cleavage of the fusion protein generates two polypeptides. A fusion protein according to the invention can be a recombinant fusion protein. In particular embodiments, a fusion protein can be generated, for example, from the addition of a vector-derived peptide at one terminus, for example the N-terminus, in addition to the amino acid sequence of the native protein. In this way, for example, a recombinant fusion protein can be constructed to have Asp-Asp-Asp-Asp-Lys (D4K) cleavage sites (SEQ ID NO: 1) both in the vector-derived peptide and in the protein itself. In certain embodiments, the recombinant fusion protein can be selected from, but is not limited to, gelatinase A, human kallikrein 8 and tissue-type plasminogen activator (tPA). The protein can be bacterially produced.
Also included in the scope of the invention are synthetic proteins.
Also included in the methods of the invention are methods for the preparation of a recombinant protein using any of the polypeptides of the invention as described herein, the method comprising providing a recombinant fusion protein containing a Asp-Asp-Asp-Asp-Lys cleavage site (SEQ ID NO: 1), and then contacting the fusion protein with any of the polypeptides according to the invention, wherein contacting the recombinant fusion protein with the polypeptide results in Asp-Asp-Asp-Asp-Lys (SEQ ID NO: 1) specific cleavage and preparation of recombinant protein.
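As an illustration of the site-specific cleavage described above, the following Python sketch locates Asp-Asp-Asp-Asp-Lys motifs in a one-letter amino acid sequence and splits the sequence after the Lys residue, which is where enteropeptidase cuts. It is an idealized string model under stated assumptions: it ignores non-specific cleavage, site accessibility and reaction conditions. The function name cleave_at_d4k and the example fusion sequence are hypothetical.

```python
from typing import List

D4K = "DDDDK"  # Asp-Asp-Asp-Asp-Lys (SEQ ID NO: 1) in one-letter code


def cleave_at_d4k(protein: str) -> List[str]:
    """Split `protein` after the Lys of every DDDDK motif (idealized model)."""
    fragments = []
    start = 0
    position = protein.find(D4K)
    while position != -1:
        cut = position + len(D4K)  # cleavage occurs C-terminal to the Lys
        fragments.append(protein[start:cut])
        start = cut
        position = protein.find(D4K, cut)
    if start < len(protein):
        fragments.append(protein[start:])  # remainder after the last site
    return fragments


# Hypothetical fusion: N-terminal His-tag plus D4K site, then a toy "target".
fusion = "MHHHHHHDDDDKAPLE"
print(cleave_at_d4k(fusion))  # ['MHHHHHHDDDDK', 'APLE']
```

A fusion with a single site yields two fragments, in line with the statement that cleavage of the fusion protein generally generates two polypeptides; a protein with a second, internal D4K site would yield three fragments in this model.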
Kits
The present polypeptides may be assembled into kits. Included in the invention are kits comprising any of the polypeptides of the invention as described herein, e.g., enteropeptidase polypeptides that are cleavage-specific for Asp-Asp-Asp-Asp-Lys (SEQ ID NO: 1) and have low non-specific proteolytic activity. In exemplary embodiments, the kits contain the polypeptides for use in cleavage of proteins containing an Asp-Asp-Asp-Asp-Lys cleavage site (SEQ ID NO: 1), and instructions for use. The kits can be used for cleavage of a fusion protein. Alternatively, the kits can be used for the cleavage of a recombinant fusion protein. In other embodiments, the kits can be used for the cleavage of a bacterially produced protein. The kits can also be used for the cleavage of a synthetic protein. The proteins suitable for cleavage by the polypeptides of the invention contain Asp-Asp-Asp-Asp-Lys cleavage sites (SEQ ID NO: 1).
The practice of the present invention employs, unless otherwise indicated, conventional techniques of molecular biology (including recombinant techniques), microbiology, cell biology, biochemistry and immunology, which are well within the purview of the skilled artisan. Such techniques are explained fully in the literature, such as “Molecular Cloning: A Laboratory Manual”, second edition (Sambrook, 1989); “Oligonucleotide Synthesis” (Gait, 1984); “Animal Cell Culture” (Freshney, 1987); “Methods in Enzymology”; “Handbook of Experimental Immunology” (Weir, 1996); “Gene Transfer Vectors for Mammalian Cells” (Miller and Calos, 1987); “Current Protocols in Molecular Biology” (Ausubel, 1987); “PCR: The Polymerase Chain Reaction” (Mullis, 1994); “Current Protocols in Immunology” (Coligan, 1991). These techniques are applicable to the production of the polynucleotides and polypeptides of the invention, and, as such, may be considered in making and practicing the invention.
Having now generally described the invention, the same will be more readily understood through reference to the following Examples, which are provided by way of illustration, and are not intended to be limiting of the present invention, unless specified.
The results reported herein were obtained using the following Materials and Methods.
cDNA Cloning of Medaka Trypsinogen.
For medaka trypsinogen, two degenerate oligonucleotide PCR primers were synthesized based on the cDNA sequence for conserved regions in serine protease (sense primer: 5′-GT(G/T)(C/G) T(C/G/T)(A/T) C(A/T) GCTGC(C/T) CACTG-3′ (SEQ ID NO: 7), which corresponds to the amino acid sequence NH2-Val-Leu-Thr-Ala-Ala-His-Cys-COOH (SEQ ID NO: 8); and antisense primer: 5′-(A/T) GGGCC (A/T) CC (A/T/G) GAGTC (A/T) CC-3′ (SEQ ID NO: 9), which corresponds to the amino acid sequence NH2-Gly-Asp-Ser-Gly-Gly-Pro-COOH (SEQ ID NO: 10)). cDNAs were PCR-amplified under the conditions described for EP in the main text. A 435-bp fragment was subcloned into pBluescript (II) KS+ (Stratagene, La Jolla, Calif.) and sequenced.
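The degenerate primers above are written with parenthesized alternatives, e.g. (G/T) for a position that may be G or T. The following Python sketch, offered as an illustration only, parses that notation, counts how many distinct oligonucleotides each primer pool contains (its degeneracy), and can enumerate them. The function names are hypothetical; the primer strings are copied from the text.

```python
import re
from itertools import product
from typing import List

# One token is either a parenthesized set of alternatives, "(G/T)", or a base.
TOKEN = re.compile(r"\(([ACGT](?:/[ACGT])+)\)|([ACGT])")


def parse_degenerate(primer: str) -> List[List[str]]:
    """Return, for each primer position, the list of allowed bases."""
    compact = primer.replace(" ", "")
    positions = []
    index = 0
    while index < len(compact):
        match = TOKEN.match(compact, index)
        if match is None:
            raise ValueError(f"unexpected text at offset {index}: {compact[index:]!r}")
        if match.group(1):
            positions.append(match.group(1).split("/"))
        else:
            positions.append([match.group(2)])
        index = match.end()
    return positions


def degeneracy(primer: str) -> int:
    """Number of distinct sequences in the primer pool."""
    count = 1
    for options in parse_degenerate(primer):
        count *= len(options)
    return count


def expand(primer: str) -> List[str]:
    """Enumerate every concrete sequence encoded by the degenerate primer."""
    return ["".join(bases) for bases in product(*parse_degenerate(primer))]


sense = "GT(G/T)(C/G) T(C/G/T)(A/T) C(A/T) GCTGC(C/T) CACTG"    # SEQ ID NO: 7
antisense = "(A/T) GGGCC (A/T) CC (A/T/G) GAGTC (A/T) CC"       # SEQ ID NO: 9
print(len(parse_degenerate(sense)), degeneracy(sense))          # 20 96
print(len(parse_degenerate(antisense)), degeneracy(antisense))  # 18 24
```

In this reading the sense primer is a 20-mer pool of 96 sequences and the antisense primer an 18-mer pool of 24 sequences.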
A 5′ portion of medaka trypsinogen was obtained by the 5′-RACE method (1) using the 5′-RACE system, Version 2.0 (Invitrogen, Carlsbad, Calif.). The antisense primers used were 5′-AGGAGGTGATGAACTG-3′ (SEQ ID NO: 11) (GSP-1; nucleotides 273 to 288, AB272106), 5′-CTCGGTTCCGTCATTGTTCCGGGAT-3′ (SEQ ID NO: 12) (GSP-2; nucleotides 249 to 272, AB272106) and 5′-CCAGACGCACCTCCACTCGGGACT-3′ (SEQ ID NO: 13) (nested GSP; nucleotides 214 to 237, AB272106). The two rounds of PCR reactions were performed under the conditions of 35 cycles of 0.5 min at 94° C., 0.5 min at 55° C., and 1 min at 72° C. for the first PCR and 35 cycles of 0.5 min at 94° C., 0.5 min at 60° C., and 1 min at 72° C. for the second PCR. The amplified products were then subcloned into pBluescript II plasmid (Stratagene) and sequenced.
A 3′ portion of medaka trypsinogen was obtained by the 3′-RACE method (1) using the 3′-Full RACE Core Set (Takara, Tokyo, Japan). The sense primers used were 5′-CATGATCACCAACTCCATGTTCTG-3′ (SEQ ID NO: 14) (RACE1; nucleotides 545 to 568, AB272106) and 5′-TGGATACCTGGAGGGAGG-3′ (SEQ ID NO: 15) (RACE2; nucleotides 572 to 589, AB272106). The two rounds of PCR reactions were performed under the conditions of 35 cycles of 0.5 min at 94° C., 0.5 min at 55° C., and 1 min at 72° C. for the first PCR and 35 cycles of 0.5 min at 94° C., 0.5 min at 57° C., and 1 min at 72° C. for the second PCR. The amplified products were then subcloned into pBluescript II plasmid (Stratagene) and sequenced.
RT-PCR Analysis of EP Transcripts.
To identify two distinct EP transcripts, enteropeptidase-1 (EP-1) and enteropeptidase-2 (EP-2), expressed in the medaka intestine, RT-PCR was conducted with KOD plus DNA polymerase (Toyobo, Osaka, Japan) using medaka intestine total RNA. The primers used were 5′-AGAACATCACAGGTGAACCGGTGA-3′ (SEQ ID NO: 16) (sense primer, nucleotides 1-24, AB272104) and 5′-TTCTGACATTCCTGAAGGGACAGC-3′ (SEQ ID NO: 17) (antisense primer, nucleotides 3930-3953, AB272104). PCR conditions were 2 min at 94° C. for heating, followed by 30 cycles of 30 sec at 94° C. for denaturing, 15 sec at 60° C. for annealing and 6 min at 68° C. for extension. The products were sequenced as described above. In some experiments, RT-PCR analyses were performed using specific primers: 5′-CAAGAACTACAACAGAAGAA-3′ (SEQ ID NO: 18) (sense) and 5′-GTGTATTGAGAAAAAGGTTGTTAA-3′ (SEQ ID NO: 19) (antisense) for EP-1 (nucleotides 2719-3415, AB272104) and 5′-CAAGAACTACAACAGAAGAA-3′ (SEQ ID NO: 18) (sense) and 5′-CTGTACTAAGAAAAAATTTGTCAT-3′ (SEQ ID NO: 20) (antisense) for EP-2 (nucleotides 2747-3443, AB272105). PCR conditions were 3 min at 94° C. for heating, followed by 20, 22, 24, 26 and 28 cycles of 30 sec at 94° C. for denaturing, 30 sec at 60° C. for annealing and 30 sec at 72° C. for extension.
For ovary 1.5- and 1.3-kb EP transcripts, RACE methods (1) were used. The sequence of the 5′-end was confirmed by the 5′-RACE using a 5′-RACE system (Invitrogen). The primers used were as follows: 5′-AGGTAACCAAGCAGAG-3′ (SEQ ID NO: 21) (nucleotides 3207-3222, AB272104) for the reverse transcriptase reaction, 5′-GAGAACGAGGAGCGCCTGGTCTCA-3′ (SEQ ID NO: 22) (nucleotides 3169-3192, AB272104) for the first PCR, and 5′-ATCCATGAAGTGAAAGCAGACACT-3′ (SEQ ID NO: 23) (nucleotides 3142-3165, AB272104) for the second PCR. The PCR was performed under the conditions of 35 cycles of 30 sec at 94° C., 30 sec at 55° C., and 2 min at 72° C. The 3′-end of the transcripts was determined by the 3′-RACE method (1). 3′-RACE was conducted using a 3′-Full RACE Core Set (Takara) as described above.
RT-PCR Detection of EP mRNA in the Gastrointestinal Tract.
The gastrointestinal tract was obtained from mature medaka (body sizes, 3-4 cm), and divided into 8 pieces (about 0.5 mm each). Specimens from five fish were combined for total RNA preparation. Aliquots of 2 μg of the total RNAs were used for reverse transcription. PCR was performed for 25 cycles using Ex Taq DNA polymerase (Takara) and the primers 5′-AGGACCAAACGGAACATTTC-3′ (SEQ ID NO: 24) (sense, nucleotides 802-821, AB272104) and 5′-GAGAGGGACGCAGGAGGA-3′ (SEQ ID NO: 25) (antisense, 1422-1439, AB272104).
Northern Blotting.
Two μg of poly(A) RNA from various tissues of the medaka were electrophoretically fractionated and transferred to a Nytran-plus membrane (Schleicher and Schuell, Dassel, Germany). The blots were hybridized with 32P-labeled cDNA fragments (nucleotides 3359-3953 in AB272104 for EP and 572-835 in AB272106 for trypsinogen) in buffer containing 50% formamide, 5×SSPE (1×SSPE = 0.15 M NaCl/8.65 mM NaH2PO4/1.25 mM EDTA), 1% SDS, 5×Denhardt's solution, and 100 μg/ml denatured salmon sperm DNA. The membranes were washed twice in 2×SSC/0.05% SDS and then twice in 0.1×SSC/0.1% SDS at 50° C. As a control, medaka cytoplasmic actin (OLCA1) mRNA was detected with a 32P-labeled 312-bp DNA fragment of the fish cDNA (2).
Southern Blotting.
Medaka genome DNA was extracted as described previously (3), with the exception that the whole-genome DNA was purified from the medaka whole body. Twenty μg of the genomic DNA was completely digested with various restriction enzymes. The digested DNA was fractionated on a 0.7% agarose gel and alkaline-transferred to a Nytran membrane (Schleicher & Schuell). The blot was hybridized at 60° C. for 16 h in 6×SSPE, 5×Denhardt's solution, 1% SDS, 10% dextran sulfate, and 100 μg/ml denatured herring sperm DNA with a 32P-labeled 595-bp fragment of medaka EP cDNA (nucleotides 3359-3953, AB272104). The membrane was washed at 60° C. in 0.1×SSC/0.1% SDS and exposed to Kodak Biomax Film.
In Situ Hybridization.
In situ hybridization was performed using frozen intestine and ovary sections (15 μm) as described previously (4). RNA probes were prepared by in vitro transcription of reverse-transcriptase fragments of cDNAs with T3 or T7 RNA polymerase using a digoxigenin (DIG) RNA-labeling kit (Boehringer-Mannheim, Mannheim, Germany). A 595-bp cDNA fragment (nucleotides 3359-3953, AB272104) was used as a specific probe. The hybridization was conducted at 50° C. for 18 h in 50% formamide, 5×Denhardt's solution, 6×SSPE, and 0.5 mg/ml yeast transfer RNA. The sections were washed once at 50° C. in 50% formamide/2×SSC for 30 min, once at 50° C. in 2×SSC for 20 min, and twice at 50° C. in 0.2×SSC for 20 min. The hybridization probes were detected using a Dig Nucleic Acid Detection Kit (Roche Molecular Biochemicals, Mannheim, Germany).
Preparation of Recombinant Proteins.
For preparation of medaka recombinant trypsinogen, a trypsinogen cDNA fragment (nucleotides 72-755, AB272106) containing its coding sequence, but without the putative signal sequence, was amplified by PCR using the following primers: 5′-CCGGAATTCCTTGACGATGACAAG-3′ (SEQ ID NO: 26) and 5′-CCCAAGCTTTCAGTTGCTAGCCATGGT-3′ (SEQ ID NO: 27). The PCR product was digested with EcoR I and Hind III, gel-purified and ligated into the pET30a expression vector. The expression of recombinant medaka trypsinogen in the Escherichia coli expression system and its purification with an Ni2+-Sepharose column were the same as for the wild-type EP protein described above. The purified recombinant protein was renatured by dialysis against 50 mM Tris.HCl (pH 8.0) and further purified with a column of Resource Q. These procedures yielded a fusion protein of medaka trypsinogen that had a vector-derived 52-residue peptide at its N-terminus in addition to the 227-residue sequence of the fish trypsinogen. Thus, this recombinant fusion protein contained two EP-cleavage sites: one from the vector used and the other from trypsinogen itself.
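The cloning step above relies on the PCR primers carrying the recognition sequences of the enzymes used for digestion (EcoRI recognizes GAATTC and HindIII recognizes AAGCTT). The short Python sketch below, given only as an illustrative check, scans the two primers for these sites; XhoI (CTCGAG) is included because it is used for the tPA construct. The function name find_sites and the small enzyme table are assumptions of this sketch, not part of the described method.

```python
from typing import Dict

# Recognition sequences of the restriction enzymes named in the text.
RECOGNITION_SITES = {
    "EcoRI": "GAATTC",
    "HindIII": "AAGCTT",
    "XhoI": "CTCGAG",
}


def find_sites(primer: str) -> Dict[str, int]:
    """Map enzyme name to the 0-based offset of its first site in `primer`."""
    return {
        enzyme: primer.find(site)
        for enzyme, site in RECOGNITION_SITES.items()
        if site in primer
    }


forward = "CCGGAATTCCTTGACGATGACAAG"     # SEQ ID NO: 26
reverse = "CCCAAGCTTTCAGTTGCTAGCCATGGT"  # SEQ ID NO: 27
print(find_sites(forward))  # {'EcoRI': 3}
print(find_sites(reverse))  # {'HindIII': 3}
```

Each site sits three nucleotides from the 5′ end of its primer, so the amplified fragment can be cut with EcoRI and HindIII and ligated directionally into a vector opened with the same two enzymes.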
For preparation of the insertional mutant of the human tissue-type plasminogen activator (tPA), a cDNA coding for human tPA (5) was first obtained by RT-PCR from a human ovary total RNA (Stratagene) using the primers 5′-CCCAAGCTTATGAAGAGAGGGCTCTGCTGT-3′ (SEQ ID NO: 28) (sense-1) and 5′-CTTATCGTCATCATGATGATGATGATGGTGTCTGGCTCCTCTTCT-3′ (SEQ ID NO: 29) (antisense-1) (BC007231). Using the cDNA as a template, two PCR products were amplified with the following primer combinations: sense-1 and antisense-1; and 5′-CACCATCATCATCATCATGATGACGACGATAAGTCTTACCAAGTGATC-3′ (SEQ ID NO: 30) (sense-2) and 5′-CCGCTCGAGTCACGGTCGCATGTTGTCACGAAT-3′ (SEQ ID NO: 31) (antisense-2). Using a mixture of these amplified DNAs as templates, the second PCR was performed with the sense-1 and antisense-2 primers. The PCR products were digested with HindIII and XhoI, then gel-purified and ligated into the pCMV tag4 mammalian expression vector (Stratagene). The resulting mutant was confirmed by DNA sequencing and transfected into CHO cells cultured in F-12 medium (Invitrogen) containing 10% fetal bovine serum (Biological Industries, Beit Haemek, Israel). Transfection was performed using Lipofectamine 2000 (GE Healthcare Biosciences, Uppsala, Sweden). The above procedure produced a fusion protein of human tPA having 11 extra amino acid residues (His-His-His-His-His-His-Asp-Asp-Asp-Asp-Lys (SEQ ID NO: 32): a His-tag sequence followed by an EP-cleavage site) at the N-terminus of mature tPA. This fusion protein, secreted from transfected CHO cells, was collected from the culture media using an Ni2+-Sepharose column. Treatment of the fusion protein with EP proteases generated mature tPA without the 11-residue N-terminal peptide.
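The 11-residue insertion can be read directly from the sense-2 primer. As an illustrative check, the Python sketch below translates that primer with the standard genetic code, assuming the reading frame starts at its first nucleotide (an assumption of this sketch, chosen because it reproduces the tag stated in the text). The helper name translate is hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate `dna` in frame 1; a trailing partial codon is ignored."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


sense_2 = "CACCATCATCATCATCATGATGACGACGATAAGTCTTACCAAGTGATC"  # SEQ ID NO: 30
peptide = translate(sense_2)
print(peptide)       # HHHHHHDDDDKSYQVI
print(peptide[:11])  # HHHHHHDDDDK, i.e. SEQ ID NO: 32 in one-letter code
```

The first 11 codons encode six His residues followed by Asp-Asp-Asp-Asp-Lys, matching the His-tag and EP-cleavage site described above; the remaining five codons (Ser-Tyr-Gln-Val-Ile) are the part of the primer that anneals to the tPA coding sequence.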
Recombinant human kallikrein 8 was prepared as described previously (6).
Recombinant medaka gelatinase A was prepared as described previously (4).
Production of Anti-Medaka EP Protease Antisera.
The protein antigen was produced using the bacterial expression system with pET30a as described above. The recombinant protein eluted from an Ni2+-Sepharose column was injected into rabbits. The specific antibody was affinity-purified using membranes onto which pure antigen was blotted (4).
Western Blotting and Immunohistochemistry.
Whole tissues of medaka intestines, ovaries, and testes were separately homogenized in PBS containing 5 mM EDTA and protease inhibitor cocktail (Wako Chemicals, Osaka, Japan), and centrifuged at 18,000×g for 10 min to obtain supernatant fractions. The supernatants were analyzed by Western blot analysis (4). For fractionation of medaka intestines, tissues were homogenized in 50 mM Tris.HCl (pH 7.4), 10 mM KCl, 10 mM MgCl2, 1 mM dithiothreitol, 5 mM EDTA and protease inhibitor cocktail (Wako), and centrifuged at 1,600×g for 8 min. The pellet was collected as crude nuclei. The supernatant was further centrifuged at 100,000×g for 30 min. The resulting supernatant and pellet were used as a cytosol and membrane fraction, respectively (7). The primary antibodies were affinity-purified EP protease antibodies as described above. Intestine sections (15 μm) were cut on a cryostat and thaw-mounted onto slides coated with silane. Sections on slides that were fixed with 4% paraformaldehyde in PBS for at least 15 min were treated with 3% H2O2 in PBS. After being blocked with BlockAce (Dainippon Seiyaku, Osaka, Japan) for 1 h at room temperature, each section was incubated with purified primary antibodies for 1 h at room temperature, and was then washed with PBS. Bound antibodies were detected using DakoCytomation EnVision+ System-labeled polymer-HRP anti-rabbit (Dako, Carpinteria, Calif.) according to the manufacturer's instructions. Immunocomplexes were detected using an AEC kit (Vector Laboratories, Burlingame, Calif.).
Gel Filtration Chromatography.
Gel filtration chromatography was performed using a HiLoad 16/60 Superdex 200 pg column (GE Healthcare Biosciences) equilibrated with 50 mM Tris.HCl (pH 8.0) and 0.2 M NaCl. Medaka intestine was homogenized in the same buffer containing 5 mM EDTA and protease inhibitor cocktail (Wako) and centrifuged at 18,000×g for 10 min to obtain the supernatant. The resulting supernatant was applied to the column at a flow rate of 24 ml/h. Fractions of 1 ml were collected and assayed for EP protease activity using GD4K-βNA (SEQ ID NO: 6) as a substrate. The active fractions were pooled and used for Western blotting. Calibration of the column was conducted using an HMW gel filtration calibration kit (GE Healthcare Biosciences).
Enzyme Stability.
One hundred nanograms of medaka, porcine, and bovine enteropeptidase were separately incubated at 37° C. in 20 mM Tris.HCl buffer (pH 7.4) containing 50 mM NaCl and 2 mM CaCl2. The enzyme activity was measured at various time points (0 to 96 h) using GD4K-βNA (SEQ ID NO: 6) as a substrate.
Inhibitor Assay.
Active medaka enteropeptidase was preincubated with various inhibitors at 37° C. in 20 mM Tris.HCl buffer (pH 7.4) containing 50 mM NaCl and 2 mM CaCl2. After incubation for 10 min, the enzyme activity was measured using GD4K-βNA (SEQ ID NO: 6) as a substrate.
RNA was isolated from the intestine and ovary of Medaka using Isogen (Nippon Gene, Tokyo, Japan). From the thus-obtained total RNA of the Medaka intestine, the first strand of cDNA was synthesized using a SuperScript First-Strand Synthesis System for RT-PCR (Invitrogen, Carlsbad, Calif.). Two degenerate oligonucleotide PCR primers were synthesized based on the cDNA sequences for conserved C-terminal catalytic protease domains in mammalian EPs (sense primer: 5′-TCIGC(C/T)GC(A/C)CACTG(C/T)GT(C/G)TA(CM(A/G)G(A/G)-3′ (SEQ ID NO: 33), which corresponds to the sequence around the active site histidine, NH2-Ser-Ala-Ala-His-Cys-Val-Tyr-Gly-COOH (SEQ ID NO: 34); and antisense primer: 5′-(G/T)A(A/G)TGG(C/T)CC(G/T)CC(A/T)GAATC(A/C)CCCTG-3′ (SEQ ID NO: 35), which corresponds to the sequence around the active site serine, NH2-Gln-Gly-Asp-Ser-Gly-Gly-Pro-Leu-COOH (SEQ ID NO: 36)).
The thus-obtained cDNAs were amplified under the following PCR conditions: 3 min at 94° C. for denaturation, 30 cycles of 0.5 min at 94° C., 0.5 min at 55° C. for annealing, and 0.5 min at 72° C. for extension, followed by 7 min final extension at 72° C. Fragments of about 0.5 kb in size were recovered from the PCR products by agarose gel purification and subcloned into pBluescript (II) KS+ (Stratagene, La Jolla, Calif.). A 461-bp clone was obtained and was used as a probe for further screening of a Medaka cDNA library.
A Medaka intestine random cDNA library was constructed in λgt10 and was packaged using Gigapack III packaging extract (Stratagene). Approximately 6×10⁵ plaques from the library were transferred to nylon membranes (Schleicher and Schuell, Dassel, Germany) and hybridized at 65° C. in a buffer containing 5×SSPE, 0.5% SDS, 5×Denhardt's solution (Wako, Osaka, Japan), and 100 μg/ml denatured salmon sperm DNA with the 32P-labeled 461-bp PCR fragment described above. Filters were washed with increasing stringency, with a final wash of 0.1×SSC/0.1% SDS at 50° C. Phage DNA was subcloned into pBluescript (II) KS+ for sequencing. An EP clone containing 2689-bp cDNA (nucleotides 611-3298) was obtained. Further screening was conducted with the same library using an EP 477-bp probe (nucleotides 630-1101), and resulted in isolation of a 1364-bp cDNA containing the 5′ portion of the EP sequence.
A 3′ portion of Medaka EP was obtained by the 3′-RACE method (Frohman et al., Proc. Natl. Acad. Sci. USA, 85:8998-9002 (1988)) using the 3′-Full RACE Core Set (Takara, Tokyo, Japan). The sense primers used were 5′-GACATTCTACAGGAGGCTGAGGTT-3′ (SEQ ID NO: 37) (RACE 1; nucleotides 2900 to 2923) and 5′-CGTCTCTTACCCGAGTACACCTTC-3′ (SEQ ID NO: 38) (RACE 2; nucleotides 2951 to 2974). The two rounds of PCR reactions were performed under the conditions of 35 cycles of 0.5 min at 94° C., 0.5 min at 55° C., and 1 min at 72° C. for the first PCR and 35 cycles of 0.5 min at 94° C., 0.5 min at 57° C., and 1 min at 72° C. for the second PCR. The amplified products were then subcloned into pBluescript II plasmid (Stratagene) and sequenced.
Medaka EP mRNA exists in two distinct forms, EP-1 and EP-2, in the intestine. A comparison of the entire amino acid sequences of EP-1 (1043 residues) and EP-2 (1036 residues) reveals a difference of only 22 amino acids, including an insertion of 7 residues in EP-2. Here, two distinct Medaka EP cDNA clones, designated as EP-1 (3997 bp, deposited in the DDBJ database, Accession No. AB272104) and EP-2 (4036 bp, AB272105), were obtained. The full-length EP-1 cDNA clone contained an ORF that encodes a protein of 1043 amino acids, while the EP-2 clone encodes a protein of 1036 amino acid residues (FIG. 5). The deduced amino acid sequence of the Medaka EP was homologous with those of its mammalian counterparts. As in mammalian EPs, unique domain structures were found in the N-terminal heavy chain of the fish protein, as shown in FIG. 1A. However, the extent of sequence identity between the Medaka and mammalian EPs varies considerably from one domain to another: the identity is 21% in the mucin-like domain, 45% in LDLR domain 1, 41% in C1r/s domain 1, 49% in the MAM domain, 57% in C1r/s domain 2, 47% in LDLR domain 2, and 23% in the MSCR domain. The C-terminal serine protease domain of Medaka EP exhibited 53% identity with mammalian EP serine proteases.
RT-PCR analyses using primer sets specific for the two Medaka Eps observed that the band intensities of amplified products were greater in EP-1 than EP-2 at every PCR cycle (FIG. 5B ). RT-PCR using primers common to the two EP transcripts was also performed. Amplified products (1235 bp for EP-1 and 1246 bp for EP-2) were gel-purified and subcloned into pBluescript (II) KS+, and the recombinant plasmids were transformed into E. coli, strain JM109. Forty-four clones were randomly picked for the nucleotide sequence analyses; 26 clones were for EP-1 and 18 clones for EP-2. The results indicated that EP-1 is a dominant EP species expressed in the Medaka intestine. The result of Southern blot analysis supports the presence of at least two distinct copies of the EP gene in the Medaka (FIG. 5C ).
Northern blot analysis of EP using various fish tissues revealed that the intestine expresses an approximately 4 kb transcript, and this size is consistent with that of the full-length cDNA, as shown in FIG. 1C . Very strong signals at 1.3 kb and 1.5 kb were detected in the ovary and testis. Further analyses indicated that they were transcripts with 1090 bp (corresponding to 2908-3997 in AB272104) and 1241 bp (corresponding to 2757-3997 in AB272104). Both transcripts were found not to code for any functional protein. In situ hybridization analysis indicated that EP mRNA was localized in the cytoplasm of small growing follicles in the ovary of mature female Medaka, as shown in FIG. 6 . Neither Western blotting nor immunohistochemical analysis using specific antibodies for the Medaka EP protease detected corresponding proteins. Therefore, no further study was conducted with ovary EP transcripts.
Because no translated product of the transcripts was detected in the ovary, the biological meaning of their occurrence in this organ is not known. In this context, it is of interest to note the recent identification of non-coding RNAs in eukaryotic cells. Such studies indicate that non-coding RNAs regulate gene expression by novel mechanisms such as RNA interference, gene co-suppression, gene silencing, imprinting and DNA methylation (21). A possibility may be that EP transcripts expressed in the fish ovary play a role as non-coding RNAs in the oocytes of growing follicles.
In RT-PCR using primers common to the two species of Medaka EP, transcripts were detected in the intestinal segments proximal to the stomach, as shown in FIG. 1D . In situ hybridization analysis localized EP expression to the intestinal epithelium (FIG. 1E ). Western blot analysis under reducing conditions of the extract of Medaka intestine, but not ovary and testis extract, using specific anti-EP antibodies against the catalytic domain detected a 36-kDa immunoreactive band (FIG. 1F , Left). A polypeptide band of the same molecular mass was detected in both soluble and membrane fractions of the intestine (FIG. 1F , Right). Western blotting of the intestine extract under nonreducing conditions gave no clear band (data not shown). By immunohistochemical analysis using the antibody, the epithelial localization of EP in the intestine was demonstrated (FIG. 1G ).
The extract of Medaka intestines exhibited enzyme activity for the synthetic EP substrate GD4K-βNA (SEQ ID NO: 6). Using this activity as a marker, the apparent molecular mass of intact EP was estimated to be 440 kDa by gel filtration (FIG. 7A ). The above fraction having GD4K-βNA-hydrolyzing activity (SEQ ID NO: 6) showed a 36-kDa polypeptide in Western blotting under reducing conditions (FIG. 7B , Left). Again, the same fraction did not show any clear band with the current antibody when analyzed under non-reducing conditions (FIG. 7B , Right).
The data presented herein suggest that EP-1 and EP-2 mRNA are expressed at a ratio of approximately 6:4 in the intestine. It remains to be determined whether they are indeed translated at this ratio. Moreover, it is not known at present whether they have discrete roles in vivo.
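The approximate 6:4 ratio follows from the clone counts reported above (26 EP-1 clones and 18 EP-2 clones among 44 sequenced). A minimal sketch of that arithmetic is given below; the names `clone_counts` and `clone_fractions` are illustrative and do not appear in the original study.

```python
# Clone counts reported for the 44 randomly picked RT-PCR subclones.
clone_counts = {"EP-1": 26, "EP-2": 18}


def clone_fractions(counts):
    """Return each transcript's share of the sequenced clones (0..1)."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


fractions = clone_fractions(clone_counts)
for name, frac in fractions.items():
    # Prints the share of each transcript as a percentage.
    print(f"{name}: {frac:.1%} of clones")
```

Clone frequency is only a proxy for transcript abundance, which is consistent with the caution expressed above about the translated ratio.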
Taken together, the above results indicate that the fish intestine contains active, membrane-bound EP. Part of the molecule exists in the intestine in a soluble form that is probably detached from the epithelial cell membrane.
A DNA fragment including the coding sequence for the Medaka EP-1 or EP-2 catalytic domain was amplified by PCR using a pBluescript II plasmid containing cDNA of the catalytic domain as the template. The upper and lower primers were 5′-CGCGGATCCCAAGCTGGTGTGGTGGGTGG-3′ (SEQ ID NO: 39) and 5′-CCCAAGCTTTCAGTCTAGATCTGAGAA-3′ (SEQ ID NO: 40), respectively, which had BamHI and HindIII sites at the respective 5′ termini. The product was ligated into the cloning site of a pET30a expression vector (Novagen, Madison, Wis.). Expression of the recombinant Medaka EP catalytic domain in the Escherichia coli expression system was carried out as described previously (Ogiwara et al., Proc. Natl. Acad. Sci. USA, 102:8442-8447 (2005)). The Medaka EP catalytic domain was produced as a fusion protein with an extra amino acid sequence of 50 residues at its N-terminus; the vector-derived N-terminal stretch contained a His-tag and an S-protein sequence. Harvested cells were lysed and the insoluble materials were dissolved in a solubilization buffer containing 6 M urea, 50 mM Tris.HCl (pH 7.6), and 0.5 M NaCl. Solubilized proteins were subjected to affinity chromatography on Ni2+-Sepharose (GE Healthcare Biosciences, Piscataway, N.J.), and eluted with the same buffer containing 50 mM histidine. Eluted recombinant proteins were renatured by dialysis against 50 mM Tris.HCl (pH 8.0). The fusion protein was then incubated in 50 mM Tris.HCl (pH 8.0) containing 0.5 M NaCl with trypsin immobilized on Sepharose 4B at room temperature for 1 h. The immobilized trypsin was then removed by filtration. The resulting sample, which contained not only active EP protease but also inactive enzyme protein, was fractionated on a column of Resource Q in AKTA Purifier (GE Healthcare Biosciences, Uppsala, Sweden) to remove inactive enzyme. A trace amount of trypsin often contained in the sample thus prepared was removed by passing through an aprotinin-Sepharose 4B column (Sigma).
Active recombinant enzyme of the porcine EP serine protease domain (Ile800 to His1034) (Matsushima et al., J. Biol. Chem., 269:19976-19982 (1994)) was prepared basically according to the method described above. Bovine EP serine protease was obtained from Novagen and New England Biolabs (NEB) (Schwalbach, Germany).
The active 32-kDa carboxyl-terminal serine protease domains of both EP-1 and EP-2 were prepared to characterize their enzymatic properties, as shown in FIG. 8A . Both enzymes showed maximal activities for GD4K-βNA (SEQ ID NO: 6) at pH 8, but EP-1 was approximately three times more active than EP-2, as shown in FIG. 8B .
To examine the effects of EP-1 and EP-2 on the physiological substrate trypsinogen, an 866-bp Medaka trypsinogen cDNA (AB272106), which codes for a protein of 242 amino acids (FIG. 9, supporting information), was obtained from the intestine. Using the sequence, a recombinant fusion protein of Medaka trypsinogen was prepared. The trypsinogen was converted to active trypsin by EP-1 faster than by EP-2 (FIG. 8C). The behavior of the two proteases toward various protease inhibitors was indistinguishable, as illustrated in Table 1, below.
TABLE 1
Inhibitor | Concentration | EP-1 inhibition (%) | EP-2 inhibition (%)
EDTA | 5.0 | 5 | 10
DFP | 0.2 | 99 | 99
Benzamidine | 1.0 mM | 79 | 78
Antipain | 0.1 | 18 | 20
Leupeptin | 0.1 mM | 43 | 47
Chymostatin | 0.1 | 0 | 0
Aprotinin | 0.01 mg/ | 0 | 5
SBTI | 0.1 mg/ | 99 | 99
E-64 | 0.2 | 0 | 0
Pepstatin | 0.1 | 0 | 3
Table 1 shows the effects of inhibitors on medaka EP-1 and EP-2 protease activity. The enzyme activities of medaka EP-1 and EP-2 protease were determined in the presence of various inhibitors using GD4K-βNA (SEQ ID NO: 6) as a substrate. Values are expressed as the percent inhibitions of the respective control activities. Results are the averages of triplicate determinations. From these results, together with the finding that EP-1 is the dominantly expressed form in the intestine, EP-1 was chosen to be used in the following experiments.
The serine protease domain of Medaka EP-1 cleaved GD4K-βNA (SEQ ID NO: 6) at a rate comparable to those of the porcine and bovine enzymes (FIG. 2A ). Surprisingly, the amidolytic activities of Medaka EP-1 protease for the synthetic MCA-containing peptide substrates Boc-Glu(OBzl)-Ala-Arg-MCA, Z-Phe-Arg-MCA, and Pro-Phe-Arg-MCA were much lower than those of the EP proteases of mammalian origin (FIG. 2B ). The kinetic parameters of the proteases for these substrates were determined, and shown in Table 2, below. Generally, the kcat/Km values of the Medaka enzyme were 1-2 orders of magnitude smaller than those of the mammalian proteases for all MCA-containing synthetic substrates.
TABLE 2
 | GD4K-βNA (SEQ ID NO: 6) | | | Boc-E(OBzl)-AR-MCA | | |
Enzyme | Km (mM) | kcat (min⁻¹) | kcat/Km (mM⁻¹·min⁻¹) | Km (mM) | kcat (min⁻¹) | kcat/Km (mM⁻¹·min⁻¹) |
EP-1(WT) | 0.7 | 940 | 1300 | 0.2 | 6.7 | 34 |
K63R | 0.2 | 210 | 1100 | 1.2 | 12 | 10 |
T105E | 0.4 | 260 | 650 | 1.3 | 11 | 9 |
E173A | 0.3 | 320 | 1100 | 1.0 | 10 | 10 |
P193E | 0.4 | 290 | 730 | 0.2 | 2.3 | 12 |
Porcine | 0.4 | 530 | 1300 | 0.3 | 110 | 370 |
Bovine (Nvg) | 0.8 | 770 | 960 | 0.5 | 740 | 1500 |
Bovine (Neb) | 0.5 | 1500 | 3000 | 0.4 | 570 | 1400 |
 | Z-FR-MCA | | | PFR-MCA | | |
Enzyme | Km (mM) | kcat (min⁻¹) | kcat/Km (mM⁻¹·min⁻¹) | Km (mM) | kcat (min⁻¹) | kcat/Km (mM⁻¹·min⁻¹) |
EP-1(WT) | 0.1 | 2.9 | 29 | 10 | 140 | 14 |
K63R | 0.1 | 1.4 | 14 | 1.1 | 11 | 10 |
T105E | 0.1 | 2.0 | 20 | 1.3 | 16 | 12 |
E173A | 0.4 | 2.3 | 6 | 1.0 | 9.2 | 9 |
P193E | 0.2 | 1.7 | 9 | 1.0 | 27 | 27 |
Porcine | 0.2 | 55 | 280 | 3.9 | 300 | 77 |
Bovine (Nvg) | 0.5 | 720 | 1400 | 4.0 | 790 | 200 |
Bovine (Neb) | 0.4 | 600 | 1500 | 2.9 | 1300 | 450 |
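The kcat/Km column of Table 2 is the quotient of the kcat and Km columns. The following sketch, which copies four Boc-E(OBzl)-AR-MCA rows from Table 2, recomputes that quotient and the fold difference between a mammalian enzyme and the Medaka wild type. The dictionary and function names are illustrative only, and the table itself reports values rounded to two significant figures.

```python
# (Km in mM, kcat in min^-1) for Boc-E(OBzl)-AR-MCA, copied from Table 2.
BOC_EAR_MCA = {
    "EP-1(WT)": (0.2, 6.7),
    "E173A": (1.0, 10),
    "Porcine": (0.3, 110),
    "Bovine (Nvg)": (0.5, 740),
}


def catalytic_efficiency(km_mM, kcat_per_min):
    """Return kcat/Km in mM^-1 * min^-1."""
    return kcat_per_min / km_mM


efficiency = {
    name: catalytic_efficiency(km, kcat)
    for name, (km, kcat) in BOC_EAR_MCA.items()
}

# Fold difference between the bovine (Novagen) enzyme and Medaka wild type.
fold = efficiency["Bovine (Nvg)"] / efficiency["EP-1(WT)"]

for name, value in efficiency.items():
    print(f"{name}: kcat/Km = {value:.1f} mM^-1 min^-1")
print(f"Bovine (Nvg) vs EP-1(WT): {fold:.0f}-fold")
```

For this substrate the fold difference falls between 10 and 100, in line with the statement that the Medaka values are 1-2 orders of magnitude smaller.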
Next, the proteolytic activity of the Medaka protease was examined using gelatin (FIG. 2C ), fibronectin (FIG. 2D ), and laminin (FIG. 2E ). For comparison, the mammalian proteases were also tested under the same conditions. Little or no hydrolysis was observed with the fish enzyme for the proteins, while these substrates were detectably hydrolyzed by the mammalian proteases. Finally, the fusion protein containing an EP-cleavage site (available from Novagen) was tested with various EP proteases. Clearly, the Medaka protease specifically cleaved the fusion protein to generate two polypeptides having expected molecular masses of 16- and 32-kDa (FIG. 2F ). In contrast, the mammalian enzymes not only produced the two expected polypeptides but also further degraded the products, presumably due to their extensive nonspecific proteolytic activities. These results demonstrate that the Medaka EP-1 protease intrinsically has much more strict cleavage specificity than its mammalian counterparts.
Active recombinant Medaka EP-1 was stable at −20° C. and 4° C.; the initial enzyme activity was retained at both temperatures for at least six months with no detectable change in the electrophoretic pattern. When Medaka EP-1 alone was kept at 37° C. at neutral pH, about 30% loss of enzyme activity was observed after 4 days of incubation (FIG. 10 ). In a parallel experiment using bovine EP protease, a sharp decline in enzyme activity was seen after even just a few hours of incubation at 37° C.
Site-directed mutagenesis of Medaka EP-1 was carried out to produce various mutant proteases. For each mutant, two PCR products were first amplified with Medaka EP-1 cDNA as a template using the following primer combinations: one primer combination was the “upper” primer described above and the respective antisense primer, and another combination was the “lower” primer described above and the sense primer. These primers are shown in Table 3, below. Using a mixture of these amplified DNAs as templates, the second PCR was performed with the “upper” and “lower” primer. The PCR products were digested with BamHI and HindIII, gel-purified, and ligated into the pET30a expression vector. All mutants were confirmed by DNA sequencing. The subsequent procedures for preparation of mutant proteases were the same as for the wild-type protein described above. The active recombinant protein concentrations were determined using the active site titrant p-nitrophenyl-p′-guanidinobenzoate HCl (Sigma) using the method described previously (Chase et al., Biochem. Biophys. Res. Commun., 29:508-514 (1976)).
TABLE 3
Mutant | SEQ ID NO | Primer sequence
K63R | 41 | 5′-GTCTATGGGAGGAACACACAC-3′
 | 42 | 5′-GTGTGTGTTCCTCCCATAGAC-3′
K63A | 43 | 5′-GTCTATGGGGCGAACACACAC-3′
 | 44 | 5′-GTGTGTGTTCGCCCCATAGAC-3′
K63E | 45 | 5′-GTCTATGGGGAGAACACACAC-3′
 | 46 | 5′-GTGTGTGTTCTCCCCATAGAC-3′
T105R | 47 | 5′-AACAGAAGAAGGAAAGAGGCA-3′
 | 48 | 5′-TGCCTCTTTCCTTCTTCTGTT-3′
T105A | 49 | 5′-AACAGAAGAGCCAAAGAGGCA-3′
 | 50 | 5′-TGCCTCTTTGGCTCTTCTGTT-3′
T105E | 51 | 5′-AACAGAAGAGAAAAAGAGGCA-3′
 | 52 | 5′-TGCCTCTTTTTCTCTTCTGTT-3′
F144S | 53 | 5′-GGAAGAAGGTGTTCCATTGCAGGGTGG-3′
 | 54 | 5′-CCACCCTGCAATGGAACACCTTCTTCC-3′
F144A | 55 | 5′-GGAAGAAGGTGTGCCATTGCAGGGTGG-3′
 | 56 | 5′-CCACCCTGCAATGGCACACCTTCTTCC-3′
E173K | 57 | 5′-GTGGACCAGGATAAGTGCCAGCGTCTC-3′
 | 58 | 5′-GAGACGCTGGCACTTATCCTGGTCCAC-3′
E173A | 59 | 5′-GAGACGCTGGCACTTATCCTGGTCCAC-3′
 | 60 | 5′-GAGACGCTGGCACGCATCCTGGTCCAC-3′
P193E | 61 | 5′-TGTGCTGGATATGAAGAAGGCGGAGTT-3′
 | 62 | 5′-AACTCCGCCTTCTTCATATCCAGCACA-3′
P193A | 63 | 5′-TGTGCTGGATATGCTGAAGGCGGAGTT-3′
 | 64 | 5′-AACTCCGCCTTCAGCATATCCAGCACA-3′
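In the overlap-extension scheme described above, the sense and antisense mutagenic primers for a given mutant anneal to the same site on opposite strands, so each antisense primer is expected to be the reverse complement of its sense partner. The sketch below checks this for three primer pairs copied from Table 3; the helper name `reverse_complement` and the dictionary layout are illustrative and are not part of the original method.

```python
# Translation table mapping each base to its Watson-Crick complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


# (sense, antisense) primer pairs copied from Table 3.
PRIMER_PAIRS = {
    "K63R": ("GTCTATGGGAGGAACACACAC", "GTGTGTGTTCCTCCCATAGAC"),
    "T105E": ("AACAGAAGAGAAAAAGAGGCA", "TGCCTCTTTTTCTCTTCTGTT"),
    "P193E": ("TGTGCTGGATATGAAGAAGGCGGAGTT", "AACTCCGCCTTCTTCATATCCAGCACA"),
}

for mutant, (sense, antisense) in PRIMER_PAIRS.items():
    paired = reverse_complement(sense) == antisense
    print(f"{mutant}: antisense is reverse complement of sense -> {paired}")
```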
Amino acid residues that differed from those of mammalian EP proteases in the corresponding positions were the primary focus. Five such residues were mutated; they are indicated in the sequences shown in FIG. 1B and FIG. 3A. A total of 12 mutants could convert the recombinant Medaka trypsinogen to its active enzyme (data not shown).
EP activity was routinely determined using the specific substrate Gly-Asp-Asp-Asp-Asp-Lys-β-naphthylamide (GD4K-βNA) (SEQ ID NO: 6) (Sigma) according to the method of Mikhailova and Rumsh (Mikhailova et al., FEBS Lett., 442:226-230 (1999)). Enzyme activity for various 4-methylcoumaryl-7-amide (MCA)-containing peptide substrates was determined by the method of Barrett (Barrett, Biochem. J., 187:909-912 (1980)). For kinetic studies, initial velocities, extrapolated from the plot of product versus time, were transformed into double-reciprocal plots (Lineweaver et al., J. Am. Chem. Soc., 56:658-663 (1934)). The maximum velocities (Vmax) and Km and kcat values were obtained from the intercepts of these plots. For all experiments, the results of at least three separate determinations are shown.
Substitutions of residues to those conserved in the mammalian EP protease (namely, K63R, T105E, F144S, E173K, and P193E) consistently resulted in reduced enzyme activity for synthetic peptide substrates, as shown in FIG. 3A. The same held true for all the other mutants except for F144A, which hydrolyzed GD4K-βNA (SEQ ID NO: 6) as well as the three MCA-containing substrates at an elevated rate when compared with the wild-type enzyme. Among the 12 mutants, K63R, T105E, E173A, and P193E were chosen for further characterization. K63R converted the recombinant Medaka trypsinogen to trypsin as fast as the wild-type enzyme, while the other mutants activated trypsinogen at a reduced rate, as shown in FIG. 11. The mutant proteases were characterized by kinetic studies. Interestingly, E173A retained a kcat/Km value comparable to the wild-type enzyme for GD4K-βNA (SEQ ID NO: 6). However, the kcat/Km values for the MCA-containing substrates were lowered (see Table 2, above).
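The double-reciprocal (Lineweaver-Burk) analysis can be sketched as follows. In a plot of 1/v against 1/[S], the y-intercept equals 1/Vmax, the slope equals Km/Vmax, and the x-intercept equals -1/Km; kcat then follows as Vmax divided by the active enzyme concentration. The data points below are synthetic, generated from assumed values (Km = 0.7 mM, Vmax = 100 in arbitrary units), and are not measurements from this study; the function names are likewise illustrative.

```python
def michaelis_menten(s, vmax, km):
    """Initial velocity predicted by the Michaelis-Menten equation."""
    return vmax * s / (km + s)


def lineweaver_burk_fit(substrate, velocity):
    """Fit a least-squares line to (1/[S], 1/v) and return (Km, Vmax)."""
    x = [1.0 / s for s in substrate]
    y = [1.0 / v for v in velocity]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx                    # slope = Km / Vmax
    intercept = mean_y - slope * mean_x  # y-intercept = 1 / Vmax
    vmax = 1.0 / intercept
    km = slope * vmax
    return km, vmax


# Synthetic, noise-free data (assumed Km = 0.7 mM, Vmax = 100 arbitrary units).
substrate_mM = [0.1, 0.2, 0.5, 1.0, 2.0]
velocity = [michaelis_menten(s, vmax=100.0, km=0.7) for s in substrate_mM]

km_fit, vmax_fit = lineweaver_burk_fit(substrate_mM, velocity)
print(f"Km = {km_fit:.3f} mM, Vmax = {vmax_fit:.1f}")
```

With real, noisy measurements the reciprocal transform weights low-substrate points heavily, so the fitted values are more sensitive to error at low [S] than this noise-free illustration suggests.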
The mutant proteases had lower nonspecific proteolytic activity for human HMW kininogen (FIG. 3B ) and human fibrinogen (FIG. 3C ), both of which were degraded noticeably by mammalian EP proteases. Neither human fibronectin nor laminin was hydrolyzed by the mutants (data not shown).
These results indicate that the substitution of glutamic acid by alanine at 173 caused a significant reduction in unwanted, nonspecific enzyme activities for both the synthetic and protein substrates without seriously deteriorating the mutant's cleavage specificity for the GD4K sequence (SEQ ID NO: 6).
The effect of Medaka EP serine protease on various fusion proteins containing a D4K-cleavage site (SEQ ID NO: 1) was examined. Human plasma fibronectin (Chemicon, Temecula, Calif.), human fibrinogen (Merck Biosciences, Tokyo, Japan), human high-molecular-weight (HMW) kininogen (Calbiochem, La Jolla, Calif.), mouse laminin (Biomedical Technologies Inc., Stoughton, Mass.), D4K-cleavage site-containing (SEQ ID NO: 1) control protein (Novagen), Medaka gelatinase A (Ogiwara et al., Proc. Natl. Acad. Sci. USA, 102:8442-8447 (2005)) and trypsinogen (this study), human kallikrein 8 (hK8) (Rajapakse et al., FEBS Lett., 579:6879-6884 (2005)) and human tissue-type plasminogen activator (tPA) were incubated at 37° C. in 20 mM Tris.HCl buffer (pH 7.4) containing 50 mM NaCl and 2 mM CaCl2 with various EP serine proteases at ratios (w/w) ranging from 20:1 to 100:1. After incubation, samples were subjected to SDS-PAGE followed by Coomassie Brilliant Blue staining. Gelatin zymography was conducted as described previously (Ogiwara et al., Proc. Natl. Acad. Sci. USA, 102:8442-8447 (2005)), except that the gel was incubated in 20 mM Tris.HCl buffer (pH 7.4) containing 50 mM NaCl and 2 mM CaCl2.
To produce Medaka trypsinogen, two degenerate oligonucleotide PCR primers were synthesized based on the cDNA sequence for conserved regions in serine protease (sense primer: 5′-GT(G/T)(C/G)T(C/G/T)(A/T)C(A/T)GCTGC(C/T)CACTG-3′ (SEQ ID NO: 7), which corresponds to the amino acid sequence NH2-Val-Leu-Thr-Ala-Ala-His-Cys-COOH (SEQ ID NO: 8); and antisense primer: 5′-(A/T)GGGCC(A/T)CC(A/T/G)GAGTC(A/T)CC-3′ (SEQ ID NO: 9), which corresponds to the amino acid sequence NH2-Gly-Asp-Ser-Gly-Gly-Pro-COOH (SEQ ID NO: 10)). cDNAs were PCR-amplified under the conditions described for EP in the main text. A 435-bp fragment was subcloned into pBluescript (II) KS+ (Stratagene, La Jolla, Calif.) and sequenced.
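Each parenthesized group in the degenerate primers above denotes a position synthesized as a mixture of bases, so the number of distinct oligonucleotides in the pool is the product of the number of choices at each position. A minimal sketch of that count is given below; the parser and its function names are illustrative and assume the parenthesized slash notation used in this text.

```python
import re


def parse_degenerate(primer):
    """Split a primer such as 'GT(G/T)C' into per-position base choices."""
    tokens = re.findall(r"\(([ACGT/]+)\)|([ACGT])", primer)
    return [alt.split("/") if alt else [base] for alt, base in tokens]


def degeneracy(primer):
    """Number of distinct sequences represented by a degenerate primer."""
    count = 1
    for choices in parse_degenerate(primer):
        count *= len(choices)
    return count


# Primer sequences copied from the text (SEQ ID NO: 7 and SEQ ID NO: 9).
SENSE = "GT(G/T)(C/G)T(C/G/T)(A/T)C(A/T)GCTGC(C/T)CACTG"
ANTISENSE = "(A/T)GGGCC(A/T)CC(A/T/G)GAGTC(A/T)CC"

for name, primer in (("sense", SENSE), ("antisense", ANTISENSE)):
    length = len(parse_degenerate(primer))
    print(f"{name}: {length} nt, {degeneracy(primer)}-fold degenerate")
```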
A 5′ portion of Medaka trypsinogen was obtained by the 5′-RACE method (Frohman et al., Proc. Natl. Acad. Sci. USA, 85:8998-9002 (1988)) using the 5′-RACE system, Version 2.0 (Invitrogen, Carlsbad, Calif.). The antisense primers used were 5′-AGGAGGTGATGAACTG-3′ (SEQ ID NO: 11) (GSP-1; nucleotides 273 to 288, AB272106), 5′-CTCGGTTCCGTCATTGTTCCGGGAT-3′ (SEQ ID NO: 12) (GSP-2; nucleotides 249 to 272, AB272106) and 5′-CCAGACGCACCTCCACTCGGGACT-3′ (SEQ ID NO: 13) (nested GSP; nucleotides 214 to 237, AB272106). The two rounds of PCR reactions were performed under the conditions of 35 cycles of 0.5 min at 94° C., 0.5 min at 55° C., and 1 min at 72° C. for the first PCR and 35 cycles of 0.5 min at 94° C., 0.5 min at 60° C., and 1 min at 72° C. for the second PCR. The amplified products were then subcloned into pBluescript II plasmid (Stratagene) and sequenced.
A 3′ portion of Medaka trypsinogen was obtained by the 3′-RACE method (Frohman et al., Proc. Natl. Acad. Sci. USA, 85:8998-9002 (1988)) using the 3′-Full RACE Core Set (Takara, Tokyo, Japan). The sense primers used were 5′-CATGATCACCAACTCCATGTTCTG-3′ (SEQ ID NO: 14) (RACE1; nucleotides 545 to 568, AB272106) and 5′-TGGATACCTGGAGGGAGG-3′ (SEQ ID NO: 15) (RACE2; nucleotides 572 to 589, AB272106). The two rounds of PCR reactions were performed under the conditions of 35 cycles of 0.5 min at 94° C., 0.5 min at 55° C., and 1 min at 72° C. for the first PCR and 35 cycles of 0.5 min at 94° C., 0.5 min at 57° C., and 1 min at 72° C. for the second PCR. The amplified products were then subcloned into pBluescript II plasmid (Stratagene) and sequenced.
To produce Medaka recombinant trypsinogen, a trypsinogen cDNA fragment (nucleotides 72-755, AB272106) containing its coding sequence, but without the putative signal sequence, was amplified by PCR using the following primers: 5′-CCGGAATTCCTTGACGATGACAAG-3′ (SEQ ID NO: 26) and 5′-CCCAAGCTTTCAGTTGCTAGCCATGGT-3′ (SEQ ID NO: 27). The PCR product was digested with EcoR I and Hind III, gel-purified and ligated into the pET30a expression vector. The expression of recombinant Medaka trypsinogen in the Escherichia coli expression system and its purification with an Ni2+-Sepharose column were the same as for the wild-type EP protein described above. The purified recombinant protein was renatured by dialysis against 50 mM Tris.HCl (pH 8.0) and further purified with a column of Resource Q.
These procedures yielded a fusion protein of Medaka trypsinogen that had a vector-derived 52-residue peptide at its N-terminus in addition to the 227-residue sequence of the fish trypsinogen. Thus, this recombinant fusion protein contained two EP-cleavage sites: one from the vector used and the other from trypsinogen itself.
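The residue counts quoted above can be cross-checked against the nucleotide coordinates of the amplified fragment. The sketch below assumes that the fragment (nucleotides 72-755) ends with the stop codon, which is an inference from the antisense primer and is not stated explicitly in the text; the function and variable names are illustrative.

```python
def codons_in_range(first_nt, last_nt):
    """Number of whole codons in an inclusive nucleotide range."""
    length = last_nt - first_nt + 1
    if length % 3 != 0:
        raise ValueError("range does not contain a whole number of codons")
    return length // 3


# Amplified trypsinogen fragment: nucleotides 72-755 of AB272106.
fragment_codons = codons_in_range(72, 755)

# Assumption: the final codon of the fragment is the stop codon.
trypsinogen_residues = fragment_codons - 1

VECTOR_PEPTIDE_RESIDUES = 52   # vector-derived N-terminal peptide
PRECURSOR_RESIDUES = 242       # full protein encoded by the cDNA

fusion_residues = VECTOR_PEPTIDE_RESIDUES + trypsinogen_residues
signal_residues = PRECURSOR_RESIDUES - trypsinogen_residues

print(f"trypsinogen portion: {trypsinogen_residues} residues")
print(f"fusion protein: {fusion_residues} residues")
print(f"implied signal sequence: {signal_residues} residues")
```

Under that assumption the fragment accounts for the 227-residue trypsinogen sequence, and the fusion protein would comprise 279 residues in total.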
To produce the insertional mutant of the human tissue-type plasminogen activator (tPA), a cDNA (BC007231) coding for human tPA (Pie et al., J. Biol. Chem., 275, 33988-33997 (2000)) was first obtained by RT-PCR from a human ovary total RNA (Stratagene) using the primers
- 5′-CCCAAGCTTATGAAGAGAGGGCTCTGCTGT-3′ (SEQ ID NO: 28) (sense-1) and
- 5′-CTTATCGTCATCATGATGATGATGATGGTGTCTGGCTCCTCTTCT-3′ (SEQ ID NO: 29) (antisense-1).
Using the cDNA as a template, two PCR products were amplified with the following primer combinations: sense-1 and antisense-1; and 5′-CACCATCATCATCATCATGATGACGACGATAAGTCTTACCAAGTGATC-3′ (SEQ ID NO: 30) (sense-2) and 5′-CCGCTCGAGTCACGGTCGCATGTTGTCACGAAT-3′ (SEQ ID NO: 31) (antisense-2). Using a mixture of these amplified DNAs as templates, the second PCR was performed with the sense-1 and antisense-2 primers. The PCR products were digested with HindIII and XhoI, then gel-purified and ligated into the pCMV tag4 mammalian expression vector (Stratagene). The resulting mutant was confirmed by DNA sequencing and transfected into CHO cells cultured in F-12 medium (Invitrogen) containing 10% fetal bovine serum (Biological Industries, Beit Haemek, Israel). Transfection was performed using Lipofectamine 2000 (GE Healthcare Biosciences, Uppsala, Sweden). The above procedure produced a fusion protein of human tPA having 11 extra amino acid residues (His-His-His-His-His-His-Asp-Asp-Asp-Asp-Lys (SEQ ID NO: 32): a His-tag sequence followed by an EP-cleavage site) at the N-terminus of mature tPA. This fusion protein secreted from transfected CHO cells was collected from the culture media using an Ni2+-Sepharose column. Treatment of the fusion protein with EP proteases generated mature tPA without the 11-residue N-terminal peptide.
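The 11-residue insertion can be read directly from the sense-2 primer. The sketch below translates that primer with the standard genetic code; the first 11 codons give the His-tag and EP-cleavage site of SEQ ID NO: 32, and the remaining codons belong to the tPA coding sequence. The codon table construction and the function name `translate` are illustrative, and reading the primer in frame from its first base is an assumption.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA sequence in frame 1, ignoring a trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Sense-2 primer (SEQ ID NO: 30), copied from the text.
SENSE_2 = "CACCATCATCATCATCATGATGACGACGATAAGTCTTACCAAGTGATC"

peptide = translate(SENSE_2)
tag, tpa_start = peptide[:11], peptide[11:]
print(f"inserted tag: {tag}")
print(f"following tPA residues: {tpa_start}")
```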
Recombinant human kallikrein 8 was prepared as described previously (Rajapakse et al., FEBS Lett., 579:6879-6884 (2005)). Recombinant Medaka gelatinase A was prepared as described previously (Ogiwara et al., Proc. Natl. Acad. Sci. USA, 102:8442-8447 (2005)).
Medaka gelatinase A (Ogiwara et al., Proc. Natl. Acad. Sci. USA, 102:8442-8447 (2005)) was synthesized as a fusion protein containing a His-tag and D4K sequence (SEQ ID NO: 1) at the N-terminus in the E. coli expression system using the pET30 expression vector. A 60-kDa fusion protein was converted by wild-type or mutant proteases to a 55-kDa protein (FIG. 4A). When incubated at the same substrate/protease ratio, the fusion protein was extensively digested by mammalian EP serine proteases.
Next, a 35.5-kDa protein of human kallikrein 8 (hK8) in the same E. coli expression system was synthesized. Digestion with Medaka wild-type and mutant EP proteases generated 31.5-kDa active hK8 by cleaving the D4K sequence (SEQ ID NO: 1) (FIG. 4B , Top). Under these conditions, the porcine protease extensively degraded the substrate. The EP protease-treated samples were directly assayed for hK8 activity using Pro-Phe-Arg-MCA, a good synthetic peptide substrate of hK8 (Rajapakse et al., FEBS Lett., 579:6879-6884 (2005)). All the samples treated with the Medaka or mammalian EP proteases exhibited Pro-Phe-Arg-MCA-hydrolyzing activity (FIG. 4B , Middle). As expected, none of the Medaka EP proteases (wild-type EP-1, K63R, E173A, or E193A) showed any significant enzyme activity. In contrast, considerable enzyme activities were detected with porcine and bovine (Neb) EP proteases. The fusion protein, which had been digested with the bovine (Nvg) protease, had very low activity, presumably due to inactivation of the EP protease itself during incubation. The substrate Boc-Glu(OBzl)-Ala-Arg-MCA, which is slightly cleaved by active hK8, was rapidly hydrolyzed with the samples treated with mammalian, but not Medaka, EP proteases (FIG. 4B , Bottom).
Enzyme activities were also detected individually with the EP proteases of mammalian origin at a comparable level, indicating that the activities were due to the action of mammalian EP proteases included in the samples. These results demonstrate that the Medaka EP protease used for cleaving the fusion protein has no serious effect on hK8 activity.
Finally, a human single-chain tPA fusion protein containing an 11-residue sequence of a His-tag/EP-susceptible site at the N-terminus of mature tPA was generated by CHO cells, and used as a substrate for Medaka and mammalian EP proteases. The protein samples treated with the Medaka wild-type or mutant EP proteases, but not with mammalian ones, showed two polypeptides (53- and 55-kDa) detectable with anti-human tPA antibodies (FIG. 4C , Upper). However, the specific antibody for the His-tag sequence did not recognize the polypeptides (FIG. 4C , Lower).
These results indicate that the Medaka proteases properly cleaved the fusion protein at the EP-cleavage site to produce single-chain tPA. These results suggest that the Medaka proteases are more effective than their mammalian counterparts as fusion protein cleavage enzymes for the preparation of desired recombinant proteins.
Taken together, with the exception of medaka EP protease residue position 105 (bovine #98), the residues that were mutated were located at a considerable distance from the enzyme active site. Although mutagenesis had different effects on each of the enzyme activities, one of the mutants, E173A, was interesting in that it showed significantly lower activities than the wild-type enzyme toward all the synthetic substrates tested. In addition, this mutant enzyme still retained a low nonspecific proteolytic activity for protein substrates (HMW kininogen and fibrinogen), with no serious reduction of the D4K (SEQ ID NO: 1) cleaving activity for fusion proteins (gelatinase A, hK8, and tPA). As demonstrated in the present study, the serine protease domain of medaka EP itself has a stricter specificity for almost all of the substrates tested when compared with the mammalian EP protease. Medaka wild-type EP protease would be adequate for the recombinant protein preparation of non-proteolytic enzymes. However, in view of the efficient cleavage at the D4K site (SEQ ID NO: 1) and the minimum nonspecific hydrolysis at the peptide and amide bonds, use of the mutant enzymes, in particular the E173A mutant enzyme, is preferred. The medaka wild-type EP protease and its mutant can be prepared in large quantity in the E. coli expression system. Using the medaka EP serine proteases as fusion protein cleavage enzymes, the desired recombinant proteins can be easily and effectively produced.
A number of embodiments of the invention have been described. Nevertheless, it will be understood that various modifications may be made without departing from the spirit and scope of the invention. Accordingly, other embodiments are within the scope of the following claims.
- Light, A. & Janska, H. (1989) Trends Biochem. Sci. 14, 110-112.
- Ghishan, F. K., Lee, P. C., Lebenthal, E., Johnson, P., Bradley, C. A. & Greene, H. L. (1983) Gastroenterology 85, 727-731.
- LaVallie, E. R., Rehemtulla, A., Racie, L. A., DiBlasio, E. A., Ferenz, C., Grant, K. L., Light, A. & McCoy, J. M. (1993) J. Biol. Chem. 268, 23311-23317.
- Kitamoto, Y., Yuan, X., Wu, Q., McCourt, D. W. & Sadler, J. E. (1994) Proc. Natl. Acad. Sci. USA 91, 7588-7592.
- Kitamoto, Y., Veile, R. A., Donis-Keller, H. & Sadler, J. E. (1995) Biochemistry 34, 4562-4568.
- Matsushima, M., Ichinose, M., Yahagi, N., Kakei, K., Tsukada, S., Miki, K., Kurokawa, K., Tashiro, K., Shiokawa, K., Shinomiya, K., et al. (1994) J. Biol. Chem. 269, 19976-19982.
- Yahagi, N., Ichinose, M., Matsushima, M., Matsubara, Y., Miki, K., Kurokawa, K., Fukamachi, H., Tashiro, K., Shiokawa, K., Kageyama, T., et al. (1996) Biochem. Biophys. Res. Commun. 219, 806-812.
- Yuan, X., Zheng, X., Lu, D., Rubin, D. C., Pung, C. Y. & Sadler, J. E. (1998) Am. J. Physiol. 274, 342-349.
- Lu, D., Yuan, X., Zheng, X. & Sadler, J. E. (1997) J. Biol. Chem. 272, 31293-31300.
- Mikhailova, A. G. & Rumsh, L. D. (1999) FEBS Lett. 442, 226-230.
- Lu, D., Fütterer, K., Korolev, S., Xinglong, Z., Tan, K., Waksman, G. & Sadler, J. E. (1999) J. Mol. Biol. 292, 361-373.
- Zheng, X. & Sadler, J. E. (2002) J. Biol. Chem. 277, 6858-6863.
- Collins-Racie, L. A., McColgan, J. M., Grant, K. L., DiBlasio-Smith, E. A., McCoy, J. M. & LaVallie, E. R. (1995) Biotechnology 13, 982-987.
- Bricteux-Gregoire, S., Schyns, R. & Florkin, M. (1972) Comp. Biochem. Physiol. 42B, 23-39.
- Frohman, M. A., Dush, M. K. & Martin, G. R. (1988) Proc. Natl. Acad. Sci. USA 85, 8998-9002.
- Ogiwara, K., Takano, N., Shinohara, M., Murakami, M. & Takahashi, T. (2005) Proc. Natl. Acad. Sci. USA 102, 8442-8447.
- Chase, T., Jr. & Shaw, E. (1976) Biochem. Biophys. Res. Commun. 29, 508-514.
- Barrett, A. J. (1980) Biochem. J. 187, 909-912.
- Lineweaver, H. & Burk, D. (1934) J. Am. Chem. Soc. 56, 658-663.
- Rajapakse, S., Ogiwara, K., Takano, N., Moriyama, A. & Takahashi, T. (2005) FEBS Lett. 579, 6879-6884.
- Costa F. F. (2005) Gene 357, 83-94.
- Rombout, J. H., Stroband, H. W. & Taverne-Thiele, J. J. (1984) Cell Tissue Res. 236, 207-216.
- Frohman, M. A., Dush, M. K. & Martin, G. R. (1988) Proc. Natl. Acad. Sci. USA 85, 8998-9002.
- Kusakabe, R., Kusakabe, T. & Suzuki, N. (1999) Int. J. Dev. Biol. 43, 541-554.
- Kimura, A., Yoshida, I., Takagi, N. & Takahashi, T. (1999) J. Biol. Chem. 274, 24047-24053.
- Ogiwara, K., Takano, N., Shinohara, M., Murakami, M. & Takahashi, T. (2005) Proc. Natl. Acad. Sci. USA 102, 8442-8447.
- Pennica, D., Holmes, W. E., Kohr, W. J., Harkins, R. N., Vehar, G. A., Ward, C. A., Bennett, W. F., Yelverton, E., Seeburg, P. H., Heyneker, H. L., et al. (1983) Nature 301, 214-221.
- Rajapakse, S., Ogiwara, K., Takano, N., Moriyama, A. & Takahashi, T. (2005) FEBS Lett. 579, 6879-6884.
- Pie, D., Kang, T. & Qi, H. (2000) J. Biol. Chem. 275, 33988-33997.
Claims (48)
1. An isolated nucleic acid molecule comprising
(a) a nucleic acid molecule encoding the amino acid sequence of SEQ ID NO:4 or a proteolytically active fragment thereof; or
(b) a nucleic acid molecule encoding a variant of the amino acid sequence of SEQ ID NO:2 wherein the amino acid at an amino acid residue position selected from the group consisting of position 63, position 105, position 144, position 173, and position 193 is mutated, and wherein the amino acid positions are numbered according to the amino acid sequence set forth in SEQ ID NO:65, or a proteolytically active fragment thereof having enteropeptidase activity; or
(c) a nucleic acid molecule which is the complement of (a) or (b).
2. The isolated nucleic acid molecule of claim 1 which is
(a) a nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO:5,
(b) a nucleic acid molecule comprising a variant of the nucleic acid sequence of SEQ ID NO:3, wherein the codon at a codon position corresponding to an amino acid residue position selected from the group consisting of position 63, position 105, position 144, position 173, or position 193, of SEQ ID NO:65 is mutated; or
(c) a nucleic acid molecule which is the complement of (a) or (b).
3. The isolated nucleic acid molecule of claim 1 or 2 further comprising vector nucleic acid sequences.
4. The nucleic acid molecule of claim 1 or 2 operably linked to a surrogate promoter.
5. The nucleic acid molecule of claim 1 or 2 further comprising nucleic acid sequences encoding a heterologous polypeptide.
6. An isolated host cell which is transformed with the nucleic acid molecule of claim 1 or 2.
7. The isolated host cell of claim 6, wherein the host cell is selected from the group of bacterial cells, fungal cells, and animal cells.
8. The isolated host cell of claim 7, wherein the bacterial cell is an Escherichia coli host cell.
9. A method for producing an enteropeptidase selected from the group consisting of:
a) an enteropeptidase comprising the amino acid sequence of SEQ ID NO:4 or a variant of the amino acid sequence of SEQ ID NO:2 wherein an amino acid at an amino acid residue position selected from the group consisting of position 63, position 105, position 144, position 173, and position 193 is mutated;
b) an enteropeptidase comprising a proteolytically active fragment of the amino acid sequence of SEQ ID NO:4 or a proteolytically active fragment of a variant of the amino acid sequence of SEQ ID NO:2 wherein an amino acid at an amino acid residue position selected from the group consisting of position 63, position 105, position 144, position 173, and position 193 is mutated; and
c) an enteropeptidase encoded by the nucleic acid sequence of SEQ ID NO:5 or by a nucleic acid molecule that is a variant of the nucleic acid sequence of SEQ ID NO:3, wherein a codon at a codon position corresponding to an amino acid residue position selected from the group consisting of position 63, position 105, position 144, position 173, or position 193 is mutated;
said method comprising culturing the host cell of claim 6 under conditions in which the nucleic acid molecule is expressed and the enteropeptidase is produced, and wherein the amino acid residue positions are numbered according to the amino acid sequence set forth in SEQ ID NO:65.
10. The method of claim 9, wherein the enteropeptidase is produced in an Escherichia coli host cell.
11. An isolated enteropeptidase selected from the group consisting of:
(a) an enteropeptidase comprising the amino acid sequence of SEQ ID NO:4;
(b) a variant of the amino acid sequence of SEQ ID NO:2 wherein the amino acid at an amino acid residue position selected from the group consisting of position 63, position 105, position 144, position 173, and position 193 is mutated; and
(c) an enteropeptidase comprising a proteolytically active fragment of the amino acid sequence of (a) or (b);
wherein said isolated enteropeptidase has a proteolytic activity of cleaving the 4-methylcoumaryl-7-amide (MCA)-substrate Boc-Glu(OBzl)-Ala-Arg-MCA and wherein the amino acid residue positions are numbered according to the amino acid sequence set forth in SEQ ID NO:65.
12. The isolated enteropeptidase of claim 11, wherein the mutation is selected from the group consisting of a substitution, deletion, and addition.
13. The isolated enteropeptidase of claim 12, wherein the mutation is a substitution.
14. The isolated enteropeptidase of claim 13, wherein the substitution is at residue position 63.
15. The isolated enteropeptidase of claim 14, wherein the substitution at residue position 63 is K63R, K63A, or K63E.
16. The isolated enteropeptidase of claim 13, wherein the substitution is at residue position 105.
17. The isolated enteropeptidase of claim 16, wherein the substitution at residue position 105 is T105A, T105R, or T105E.
18. The isolated enteropeptidase of claim 13, wherein the substitution is at residue position 144.
19. The isolated enteropeptidase of claim 18, wherein the substitution at residue position 144 is F144S.
20. The isolated enteropeptidase of claim 13, wherein the substitution is at residue position 173.
21. The isolated enteropeptidase of claim 20, wherein the substitution at residue position 173 is E173A.
22. The isolated enteropeptidase of claim 13, wherein the substitution is at residue position 193.
23. The isolated enteropeptidase of claim 22, wherein the substitution at residue position 193 is P193E or P193A.
24. The isolated enteropeptidase of claim 11, comprising the amino acid sequence of SEQ ID NO:4.
25. The isolated enteropeptidase according to claim 11, wherein the enteropeptidase is a recombinant enteropeptidase.
26. The isolated enteropeptidase according to claim 11, wherein the enzyme activity of the enteropeptidase in cleaving a GD4K-βNA substrate has an enhanced stability at 37° C. when incubated at pH 7.4 in 0.2 M NaCl and 2 mM CaCl2 by comparison with the enzyme activity of bovine enteropeptidase in cleaving a GD4K-βNA substrate when incubated at pH 7.4 in 0.2 M NaCl and 2 mM CaCl2.
27. The isolated enteropeptidase according to claim 11, wherein the enteropeptidase is cleavage specific for Asp-Asp-Asp-Asp-Lys (SEQ ID NO:1).
28. The isolated enteropeptidase according to claim 11, wherein the proteolytic activity of said isolated enteropeptidase for a peptide substrate other than SEQ ID NO:1 is less specific than that for the peptide sequence of SEQ ID NO:1.
29. The isolated enteropeptidase of claim 28, wherein the peptide substrate is selected from the group consisting of kininogen, fibrinogen, fibronectin, gelatin and laminin.
30. The isolated enteropeptidase of claim 28, wherein the peptide substrate is a synthetic peptide substrate comprising 4-methylcoumaryl-7-amide (MCA).
31. The isolated enteropeptidase of claim 30, wherein the synthetic peptide substrate is selected from the group consisting of Boc-Glu(OBzl)-Ala-Arg-MCA, Z-Phe-Arg-MCA, and Pro-Phe-Arg-MCA.
32. The isolated enteropeptidase of claim 28, wherein the peptide substrate consists of a fusion protein.
33. The isolated enteropeptidase of claim 32, wherein the fusion protein comprises SEQ ID NO:1 fused to a protein selected from the group consisting of gelatinase A, human kallikrein 8 and tissue type plasminogen activator (tPA).
34. An isolated enteropeptidase comprising a variant of the amino acid sequence of SEQ ID NO:2, wherein the amino acid at an amino acid residue position selected from the group consisting of position 63, position 105, position 144, position 173 and position 193, is mutated, wherein the isolated enteropeptidase variant is cleavage specific for Asp-Asp-Asp-Asp-Lys (SEQ ID NO:1), and has less proteolytic activity for peptide sequences other than that for SEQ ID NO:1, and wherein the amino acid positions are numbered according to the amino acid sequence set forth in SEQ ID NO:65.
35. The isolated enteropeptidase variant according to claim 34, wherein the mutation is a substitution selected from the group consisting of K63R, K63A, K63E, T105A, T105R, T105E, F144S, E173A, P193E, and P193A.
36. The isolated enteropeptidase variant according to claim 35, wherein the mutation is E173A.
37. An isolated enteropeptidase comprising the amino acid sequence of SEQ ID NO:4, wherein the isolated enteropeptidase is cleavage specific for Asp-Asp-Asp-Asp-Lys (SEQ ID NO:1), and has less proteolytic activity for peptide sequences other than that for SEQ ID NO:1.
38. A method for cleaving a protein containing an Asp-Asp-Asp-Asp-Lys cleavage site (SEQ ID NO:1) with the enteropeptidase of claim 11, the method comprising:
contacting the protein with the enteropeptidase; wherein the contacting of the protein with the enteropeptidase results in specific cleavage at the cleavage site of Asp-Asp-Asp-Asp-Lys (SEQ ID NO:1).
39. The method of claim 38, wherein the protein is a fusion protein.
40. The method of claim 39, wherein the fusion protein is recombinantly produced by an isolated host cell.
41. The method of claim 38, wherein the protein is recombinantly produced by a bacterial host cell.
42. A method for preparing a recombinant protein by cleavage with an enteropeptidase of claim 11, the method comprising:
providing a recombinant fusion protein comprising an Asp-Asp-Asp-Asp-Lys cleavage site (SEQ ID NO:1) fused to said recombinant fusion protein; and
contacting the fusion protein with the enteropeptidase;
wherein contacting the recombinant fusion protein with the enteropeptidase results in a specific cleavage at the Asp-Asp-Asp-Asp-Lys cleavage site (SEQ ID NO:1) and the preparation of the recombinant protein.
43. A kit comprising the enteropeptidase of claim 11 and instructions for use in cleaving a protein comprising an Asp-Asp-Asp-Asp-Lys cleavage site (SEQ ID NO:1).
44. The kit of claim 43, wherein the protein is a fusion protein.
45. The kit of claim 44, wherein the fusion protein is recombinantly produced by an isolated host cell.
46. The kit of claim 43, wherein the protein is recombinantly produced by a bacterial host cell.
47. The kit of claim 43, wherein the protein is a synthetic protein.
48. The isolated enteropeptidase of claim 11, wherein the enteropeptidase activity of said isolated enteropeptidase, expressed as kcat/Km, is less than 34 mM⁻¹ min⁻¹ using Boc-Glu(OBzl)-Ala-Arg-MCA as a substrate.
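The substitution notation used in the claims (for example K63R: wild-type residue, position, replacement residue) and the cleavage after the Lys of Asp-Asp-Asp-Asp-Lys (SEQ ID NO:1) can be illustrated with a minimal Python sketch. It is an illustration only and not part of the patent. The function names and the toy sequences are hypothetical; the actual sequences of SEQ ID NO:2, SEQ ID NO:4 and SEQ ID NO:65 are not reproduced here, and real work would need positions mapped to the SEQ ID NO:65 numbering rather than simple 1-based indexing into the string.

```python
import re

# Substitutions named in claims 15 to 23, written as
# <wild-type residue><position><replacement residue> in one-letter code.
CLAIMED_SUBSTITUTIONS = [
    "K63R", "K63A", "K63E",
    "T105A", "T105R", "T105E",
    "F144S", "E173A", "P193E", "P193A",
]

# Asp-Asp-Asp-Asp-Lys (SEQ ID NO:1) in one-letter code.
CLEAVAGE_SITE = "DDDDK"


def parse_substitution(code):
    """Split a code such as 'K63R' into (wild_type, position, replacement)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", code)
    if match is None:
        raise ValueError(f"not a substitution code: {code!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def apply_substitution(sequence, code):
    """Return a copy of `sequence` with the substitution applied.

    Assumption: positions are plain 1-based indices into `sequence`.
    The claims number positions according to SEQ ID NO:65 instead.
    """
    wild_type, position, replacement = parse_substitution(code)
    if not 1 <= position <= len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    if sequence[position - 1] != wild_type:
        raise ValueError(
            f"expected {wild_type} at position {position}, "
            f"found {sequence[position - 1]}"
        )
    return sequence[:position - 1] + replacement + sequence[position:]


def cleave_at_site(fusion, site=CLEAVAGE_SITE):
    """Split `fusion` after the first occurrence of `site`.

    The cut falls after the C-terminal Lys of the site. If the site is
    absent, the sequence is returned uncut as a one-element list.
    """
    index = fusion.find(site)
    if index < 0:
        return [fusion]
    cut = index + len(site)
    return [fusion[:cut], fusion[cut:]]


if __name__ == "__main__":
    # Hypothetical placeholder sequence with Lys at position 63.
    toy = "A" * 62 + "K" + "G" * 10
    print(apply_substitution(toy, "K63R"))
    # Hypothetical tagged fusion protein: tag, cleavage site, target.
    print(cleave_at_site("MHHHHHHDDDDKTARGET"))
```

The check on the wild-type residue is a deliberate choice: it catches a numbering mismatch early, which matters here because the claimed positions follow the SEQ ID NO:65 numbering rather than the index of the sequence being edited.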
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US11/973,157 US8013137B2 (en) | 2006-10-17 | 2007-10-04 | Modified enteropeptidase protein |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US85245406P | 2006-10-17 | 2006-10-17 | |
US11/973,157 US8013137B2 (en) | 2006-10-17 | 2007-10-04 | Modified enteropeptidase protein |
Publications (2)
Publication Number | Publication Date |
---|---|
US20080213836A1 (en) | 2008-09-04 |
US8013137B2 (en) | 2011-09-06 |
Family
ID=39733354
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US11/973,157 US8013137B2 (en), Expired - Fee Related | 2006-10-17 | 2007-10-04 | Modified enteropeptidase protein |
Country Status (1)
Country | Link |
---|---|
US (1) | US8013137B2 (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2012071257A1 (en) | 2010-11-23 | 2012-05-31 | Allergan, Inc. | Compositions and methods of producing enterokinase in yeast |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6444644B1 (en) * | 1996-05-09 | 2002-09-03 | University College London | Anticoagulant peptide fragments derived from apolipoprotein B-100 |
US20020164588A1 (en) * | 1999-01-29 | 2002-11-07 | The Regents Of The University Of California | Determining the functions and interactions of proteins by comparative analysis |
US20030167477A1 (en) * | 1999-06-23 | 2003-09-04 | Ppl Therapeutics (Scotland) Ltd. | Fusion proteins incorporating lysozyme |
JP2005253325A (en) | 2004-03-10 | 2005-09-22 | Hokkaido Univ | Enteropeptidase from fish |
Non-Patent Citations (18)
Also Published As
Publication number | Publication date |
---|---|
US20080213836A1 (en) | 2008-09-04 |
Legal Events
Code | Title | Description |
---|---|---|
REMI | Maintenance fee reminder mailed | |
LAPS | Lapse for failure to pay maintenance fees | |
STCH | Information on status: patent discontinuation | Free format text: PATENT EXPIRED DUE TO NONPAYMENT OF MAINTENANCE FEES UNDER 37 CFR 1.362 |
FP | Lapsed due to failure to pay maintenance fee | Effective date: 20150906 |