WO1990012033A1 - Construction and use of synthetic constructs encoding syndecan - Google Patents
- Publication number
- WO1990012033A1 (PCT/US1990/001496)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- sequence
- peptide
- dna
- syndecan
- oligonucleotide
- Prior art date
Links
- 108050006774 Syndecan Proteins 0.000 title claims description 102
- 102000019361 Syndecan Human genes 0.000 title claims description 98
- DIOQZVSQGTUSAI-UHFFFAOYSA-N n-butylhexane Natural products CCCCCCCCCC DIOQZVSQGTUSAI-UHFFFAOYSA-N 0.000 title claims description 98
- 238000010276 construction Methods 0.000 title description 4
- 108090000765 processed proteins & peptides Proteins 0.000 claims abstract description 80
- 150000001413 amino acids Chemical class 0.000 claims abstract description 57
- 230000001086 cytosolic effect Effects 0.000 claims abstract description 14
- 125000002924 primary amino group Chemical group [H]N([H])* 0.000 claims abstract description 11
- 230000013595 glycosylation Effects 0.000 claims abstract description 10
- 238000006206 glycosylation reaction Methods 0.000 claims abstract description 10
- 210000004899 c-terminal region Anatomy 0.000 claims abstract description 9
- 230000002068 genetic effect Effects 0.000 claims abstract description 9
- 230000002378 acidificating effect Effects 0.000 claims abstract description 7
- 210000004027 cell Anatomy 0.000 claims description 83
- 108020004414 DNA Proteins 0.000 claims description 76
- 229940024606 amino acid Drugs 0.000 claims description 56
- 235000001014 amino acid Nutrition 0.000 claims description 56
- 108091032973 (ribonucleotides)n+m Proteins 0.000 claims description 42
- 102000016611 Proteoglycans Human genes 0.000 claims description 36
- 125000003729 nucleotide group Chemical group 0.000 claims description 36
- 239000002773 nucleotide Substances 0.000 claims description 35
- 108091028043 Nucleic acid sequence Proteins 0.000 claims description 34
- 108091034117 Oligonucleotide Proteins 0.000 claims description 32
- 239000013598 vector Substances 0.000 claims description 32
- DHMQDGOQFOQNFH-UHFFFAOYSA-N Glycine Chemical compound NCC(O)=O DHMQDGOQFOQNFH-UHFFFAOYSA-N 0.000 claims description 14
- 108020004511 Recombinant DNA Proteins 0.000 claims description 13
- 210000003527 eukaryotic cell Anatomy 0.000 claims description 13
- 244000005700 microbiome Species 0.000 claims description 12
- 230000002209 hydrophobic effect Effects 0.000 claims description 11
- 150000003839 salts Chemical class 0.000 claims description 10
- 235000004400 serine Nutrition 0.000 claims description 9
- 235000002374 tyrosine Nutrition 0.000 claims description 9
- MTCFGRXMJLQNBG-UHFFFAOYSA-N Serine Natural products OCC(N)C(O)=O MTCFGRXMJLQNBG-UHFFFAOYSA-N 0.000 claims description 8
- 230000000295 complement effect Effects 0.000 claims description 8
- 239000004471 Glycine Substances 0.000 claims description 7
- CKLJMWTZIZZHCS-REOHCLBHSA-N L-aspartic acid Chemical compound OC(=O)[C@@H](N)CC(O)=O CKLJMWTZIZZHCS-REOHCLBHSA-N 0.000 claims description 7
- AGPKZVBTJJNPAG-WHFBIAKZSA-N L-isoleucine Chemical compound CC[C@H](C)[C@H](N)C(O)=O AGPKZVBTJJNPAG-WHFBIAKZSA-N 0.000 claims description 7
- ROHFNLRQFUQHCH-YFKPBYRVSA-N L-leucine Chemical compound CC(C)C[C@H](N)C(O)=O ROHFNLRQFUQHCH-YFKPBYRVSA-N 0.000 claims description 7
- ROHFNLRQFUQHCH-UHFFFAOYSA-N Leucine Natural products CC(C)CC(N)C(O)=O ROHFNLRQFUQHCH-UHFFFAOYSA-N 0.000 claims description 7
- KDXKERNSBIXSRK-UHFFFAOYSA-N Lysine Natural products NCCCCC(N)C(O)=O KDXKERNSBIXSRK-UHFFFAOYSA-N 0.000 claims description 7
- ONIBWKKTOPOVIA-UHFFFAOYSA-N Proline Natural products OC(=O)C1CCCN1 ONIBWKKTOPOVIA-UHFFFAOYSA-N 0.000 claims description 7
- 235000004279 alanine Nutrition 0.000 claims description 7
- 235000018417 cysteine Nutrition 0.000 claims description 7
- XUJNEKJLAYXESH-UHFFFAOYSA-N cysteine Natural products SCC(N)C(O)=O XUJNEKJLAYXESH-UHFFFAOYSA-N 0.000 claims description 7
- 229960000310 isoleucine Drugs 0.000 claims description 7
- AGPKZVBTJJNPAG-UHFFFAOYSA-N isoleucine Natural products CCC(C)C(N)C(O)=O AGPKZVBTJJNPAG-UHFFFAOYSA-N 0.000 claims description 7
- OUYCCCASQSFEME-UHFFFAOYSA-N tyrosine Natural products OC(=O)C(N)CC1=CC=C(O)C=C1 OUYCCCASQSFEME-UHFFFAOYSA-N 0.000 claims description 7
- QNAYBMKLOCPYGJ-REOHCLBHSA-N L-alanine Chemical compound C[C@H](N)C(O)=O QNAYBMKLOCPYGJ-REOHCLBHSA-N 0.000 claims description 6
- DCXYFEDJOCDNAF-REOHCLBHSA-N L-asparagine Chemical compound OC(=O)[C@@H](N)CC(N)=O DCXYFEDJOCDNAF-REOHCLBHSA-N 0.000 claims description 6
- COLNVLDHVKWLRT-QMMMGPOBSA-N L-phenylalanine Chemical compound OC(=O)[C@@H](N)CC1=CC=CC=C1 COLNVLDHVKWLRT-QMMMGPOBSA-N 0.000 claims description 6
- 239000004472 Lysine Substances 0.000 claims description 6
- AYFVYJQAPQTCCC-UHFFFAOYSA-N Threonine Natural products CC(O)C(N)C(O)=O AYFVYJQAPQTCCC-UHFFFAOYSA-N 0.000 claims description 6
- 239000004473 Threonine Substances 0.000 claims description 6
- QIVBCDIJIAJPQS-UHFFFAOYSA-N Tryptophan Natural products C1=CC=C2C(CC(N)C(O)=O)=CNC2=C1 QIVBCDIJIAJPQS-UHFFFAOYSA-N 0.000 claims description 6
- 229940009098 aspartate Drugs 0.000 claims description 6
- 229930195712 glutamate Natural products 0.000 claims description 6
- ZDXPYRJPNDTMRX-UHFFFAOYSA-N glutamine Natural products OC(=O)C(N)CCC(N)=O ZDXPYRJPNDTMRX-UHFFFAOYSA-N 0.000 claims description 6
- 229930182817 methionine Natural products 0.000 claims description 6
- COLNVLDHVKWLRT-UHFFFAOYSA-N phenylalanine Natural products OC(=O)C(N)CC1=CC=CC=C1 COLNVLDHVKWLRT-UHFFFAOYSA-N 0.000 claims description 6
- 235000008521 threonine Nutrition 0.000 claims description 6
- 125000001493 tyrosinyl group Chemical group [H]OC1=C([H])C([H])=C(C([H])=C1[H])C([H])([H])C([H])(N([H])[H])C(*)=O 0.000 claims description 6
- 239000004475 Arginine Substances 0.000 claims description 5
- DCXYFEDJOCDNAF-UHFFFAOYSA-N Asparagine Natural products OC(=O)C(N)CC(N)=O DCXYFEDJOCDNAF-UHFFFAOYSA-N 0.000 claims description 5
- KZSNJWFQEVHDMF-UHFFFAOYSA-N Valine Natural products CC(C)C(N)C(O)=O KZSNJWFQEVHDMF-UHFFFAOYSA-N 0.000 claims description 5
- ODKSFYDXXFIFQN-UHFFFAOYSA-N arginine Natural products OC(=O)C(N)CCCNC(N)=N ODKSFYDXXFIFQN-UHFFFAOYSA-N 0.000 claims description 5
- 235000009582 asparagine Nutrition 0.000 claims description 5
- 229960001230 asparagine Drugs 0.000 claims description 5
- HNDVDQJCIGZPNO-UHFFFAOYSA-N histidine Natural products OC(=O)C(N)CC1=CN=CN1 HNDVDQJCIGZPNO-UHFFFAOYSA-N 0.000 claims description 5
- 125000003588 lysine group Chemical group [H]N([H])C([H])([H])C([H])([H])C([H])([H])C([H])([H])C([H])(N([H])[H])C(*)=O 0.000 claims description 5
- 239000004474 valine Substances 0.000 claims description 5
- 125000003178 carboxy group Chemical group [H]OC(*)=O 0.000 claims description 4
- 150000001875 compounds Chemical class 0.000 claims description 4
- 125000000404 glutamine group Chemical group N[C@@H](CCC(N)=O)C(=O)* 0.000 claims description 4
- 241000124008 Mammalia Species 0.000 claims description 3
- 108020005038 Terminator Codon Proteins 0.000 claims description 3
- 125000000637 arginyl group Chemical group N[C@@H](CCCNC(N)=N)C(=O)* 0.000 claims description 3
- WHUUTDBJXJRKMK-VKHMYHEASA-L glutamate group Chemical group N[C@@H](CCC(=O)[O-])C(=O)[O-] WHUUTDBJXJRKMK-VKHMYHEASA-L 0.000 claims description 3
- 125000001360 methionine group Chemical group N[C@@H](CCSC)C(=O)* 0.000 claims description 3
- 125000000341 threoninyl group Chemical group [H]OC([H])(C([H])([H])[H])C([H])(N([H])[H])C(*)=O 0.000 claims description 3
- 125000000430 tryptophan group Chemical group [H]N([H])C(C(=O)O*)C([H])([H])C1=C([H])N([H])C2=C([H])C([H])=C([H])C([H])=C12 0.000 claims description 3
- 125000002987 valine group Chemical group [H]N([H])C([H])(C(*)=O)C([H])(C([H])([H])[H])C([H])([H])[H] 0.000 claims description 3
- 241000699800 Cricetinae Species 0.000 claims description 2
- 241000282412 Homo Species 0.000 claims description 2
- 241000699666 Mus <mouse, genus> Species 0.000 claims 1
- 241000699670 Mus sp. Species 0.000 claims 1
- 241000700159 Rattus Species 0.000 claims 1
- 230000003362 replicative effect Effects 0.000 claims 1
- 102000004196 processed proteins & peptides Human genes 0.000 abstract description 29
- 108090000623 proteins and genes Proteins 0.000 description 77
- 239000002299 complementary DNA Substances 0.000 description 76
- 238000000034 method Methods 0.000 description 61
- 102000004169 proteins and genes Human genes 0.000 description 47
- 108020004999 messenger RNA Proteins 0.000 description 46
- 239000012634 fragment Substances 0.000 description 43
- 235000018102 proteins Nutrition 0.000 description 43
- 108010067787 Proteoglycans Proteins 0.000 description 34
- 229920002971 Heparan sulfate Polymers 0.000 description 30
- 108091008146 restriction endonucleases Proteins 0.000 description 26
- 125000003275 alpha amino acid group Chemical group 0.000 description 19
- 101710132601 Capsid protein Proteins 0.000 description 18
- 102000004190 Enzymes Human genes 0.000 description 18
- 108090000790 Enzymes Proteins 0.000 description 18
- 230000015572 biosynthetic process Effects 0.000 description 18
- 238000003776 cleavage reaction Methods 0.000 description 18
- 230000007017 scission Effects 0.000 description 17
- 238000003786 synthesis reaction Methods 0.000 description 17
- 229920002683 Glycosaminoglycan Polymers 0.000 description 16
- 102000053602 DNA Human genes 0.000 description 15
- 238000006243 chemical reaction Methods 0.000 description 15
- 238000004519 manufacturing process Methods 0.000 description 15
- 239000013612 plasmid Substances 0.000 description 15
- 239000000047 product Substances 0.000 description 15
- QAOWNCQODCNURD-UHFFFAOYSA-L sulfate group Chemical group S(=O)(=O)([O-])[O-] QAOWNCQODCNURD-UHFFFAOYSA-L 0.000 description 15
- 108020004705 Codon Proteins 0.000 description 14
- 239000013615 primer Substances 0.000 description 14
- 238000011282 treatment Methods 0.000 description 13
- 238000010353 genetic engineering Methods 0.000 description 12
- 239000011159 matrix material Substances 0.000 description 12
- 239000000523 sample Substances 0.000 description 12
- 210000002919 epithelial cell Anatomy 0.000 description 11
- 239000000463 material Substances 0.000 description 11
- SQDAZGGFXASXDW-UHFFFAOYSA-N 5-bromo-2-(trifluoromethoxy)pyridine Chemical compound FC(F)(F)OC1=CC=C(Br)C=N1 SQDAZGGFXASXDW-UHFFFAOYSA-N 0.000 description 10
- FAPWRFPIFSIZLT-UHFFFAOYSA-M Sodium chloride Chemical compound [Na+].[Cl-] FAPWRFPIFSIZLT-UHFFFAOYSA-M 0.000 description 10
- 210000001519 tissue Anatomy 0.000 description 10
- 102100034343 Integrase Human genes 0.000 description 9
- 241001529936 Murinae Species 0.000 description 9
- 108091034057 RNA (poly(A)) Proteins 0.000 description 9
- 102000040430 polynucleotide Human genes 0.000 description 9
- 108091033319 polynucleotide Proteins 0.000 description 9
- 239000002157 polynucleotide Substances 0.000 description 9
- MTCFGRXMJLQNBG-REOHCLBHSA-N (2S)-2-amino-3-hydroxypropanoic acid Chemical compound OC[C@H](N)C(O)=O MTCFGRXMJLQNBG-REOHCLBHSA-N 0.000 description 8
- 229920001287 Chondroitin sulfate Polymers 0.000 description 8
- 230000009471 action Effects 0.000 description 8
- 238000004458 analytical method Methods 0.000 description 8
- 230000027455 binding Effects 0.000 description 8
- 229940059329 chondroitin sulfate Drugs 0.000 description 8
- 230000000694 effects Effects 0.000 description 8
- 238000009396 hybridization Methods 0.000 description 8
- 229920002521 macromolecule Polymers 0.000 description 8
- 239000012528 membrane Substances 0.000 description 8
- 230000008569 process Effects 0.000 description 8
- 210000002966 serum Anatomy 0.000 description 8
- 241000894007 species Species 0.000 description 8
- 108010092799 RNA-directed DNA polymerase Proteins 0.000 description 7
- JLCPHMBAVCMARE-UHFFFAOYSA-N [3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-[[3-[[3-[[3-[[3-[[3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-hydroxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methyl [5-(6-aminopurin-9-yl)-2-(hydroxymethyl)oxolan-3-yl] hydrogen phosphate Polymers 
Cc1cn(C2CC(OP(O)(=O)OCC3OC(CC3OP(O)(=O)OCC3OC(CC3O)n3cnc4c3nc(N)[nH]c4=O)n3cnc4c3nc(N)[nH]c4=O)C(COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3CO)n3cnc4c(N)ncnc34)n3ccc(N)nc3=O)n3cnc4c(N)ncnc34)n3ccc(N)nc3=O)n3ccc(N)nc3=O)n3ccc(N)nc3=O)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cc(C)c(=O)[nH]c3=O)n3cc(C)c(=O)[nH]c3=O)n3ccc(N)nc3=O)n3cc(C)c(=O)[nH]c3=O)n3cnc4c3nc(N)[nH]c4=O)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)O2)c(=O)[nH]c1=O JLCPHMBAVCMARE-UHFFFAOYSA-N 0.000 description 7
- 210000000170 cell membrane Anatomy 0.000 description 7
- 238000001502 gel electrophoresis Methods 0.000 description 7
- 238000002955 isolation Methods 0.000 description 7
- 210000005075 mammary gland Anatomy 0.000 description 7
- 210000004379 membrane Anatomy 0.000 description 7
- 229920001184 polypeptide Polymers 0.000 description 7
- HEDRZPFGACZZDS-UHFFFAOYSA-N Chloroform Chemical compound ClC(Cl)Cl HEDRZPFGACZZDS-UHFFFAOYSA-N 0.000 description 6
- 102000008055 Heparan Sulfate Proteoglycans Human genes 0.000 description 6
- 101000852815 Homo sapiens Insulin receptor Proteins 0.000 description 6
- WHUUTDBJXJRKMK-VKHMYHEASA-N L-glutamic acid Chemical compound OC(=O)[C@@H](N)CCC(O)=O WHUUTDBJXJRKMK-VKHMYHEASA-N 0.000 description 6
- WOUIMBGNEUWXQG-VKHMYHEASA-N Ser-Gly Chemical group OC[C@H](N)C(=O)NCC(O)=O WOUIMBGNEUWXQG-VKHMYHEASA-N 0.000 description 6
- 108020004682 Single-Stranded DNA Proteins 0.000 description 6
- 108090000054 Syndecan-2 Proteins 0.000 description 6
- 239000007983 Tris buffer Substances 0.000 description 6
- 238000001962 electrophoresis Methods 0.000 description 6
- 102000047882 human INSR Human genes 0.000 description 6
- 230000000977 initiatory effect Effects 0.000 description 6
- 210000004185 liver Anatomy 0.000 description 6
- 102000005962 receptors Human genes 0.000 description 6
- 108020003175 receptors Proteins 0.000 description 6
- 238000002415 sodium dodecyl sulfate polyacrylamide gel electrophoresis Methods 0.000 description 6
- LENZDBCJOHFCAS-UHFFFAOYSA-N tris Chemical compound OCC(N)(CO)CO LENZDBCJOHFCAS-UHFFFAOYSA-N 0.000 description 6
- 102000007469 Actins Human genes 0.000 description 5
- 108010085238 Actins Proteins 0.000 description 5
- 102000012410 DNA Ligases Human genes 0.000 description 5
- 108010061982 DNA Ligases Proteins 0.000 description 5
- 102000010834 Extracellular Matrix Proteins Human genes 0.000 description 5
- 108010037362 Extracellular Matrix Proteins Proteins 0.000 description 5
- 108091026898 Leader sequence (mRNA) Proteins 0.000 description 5
- 238000000636 Northern blotting Methods 0.000 description 5
- 239000000872 buffer Substances 0.000 description 5
- 230000015556 catabolic process Effects 0.000 description 5
- 230000001413 cellular effect Effects 0.000 description 5
- 238000005119 centrifugation Methods 0.000 description 5
- 210000004720 cerebrum Anatomy 0.000 description 5
- 230000008859 change Effects 0.000 description 5
- 238000006731 degradation reaction Methods 0.000 description 5
- 210000002744 extracellular matrix Anatomy 0.000 description 5
- 239000003102 growth factor Substances 0.000 description 5
- 229920000669 heparin Polymers 0.000 description 5
- 229960002897 heparin Drugs 0.000 description 5
- 238000000338 in vitro Methods 0.000 description 5
- 238000003780 insertion Methods 0.000 description 5
- 230000037431 insertion Effects 0.000 description 5
- 238000002360 preparation method Methods 0.000 description 5
- 210000003491 skin Anatomy 0.000 description 5
- 239000011780 sodium chloride Substances 0.000 description 5
- 238000013518 transcription Methods 0.000 description 5
- 230000035897 transcription Effects 0.000 description 5
- KDCGOANMDULRCW-UHFFFAOYSA-N 7H-purine Chemical compound N1=CNC2=NC=NC2=C1 KDCGOANMDULRCW-UHFFFAOYSA-N 0.000 description 4
- 102000002260 Alkaline Phosphatase Human genes 0.000 description 4
- 108020004774 Alkaline Phosphatase Proteins 0.000 description 4
- 108091026890 Coding region Proteins 0.000 description 4
- 108020004635 Complementary DNA Proteins 0.000 description 4
- KCXVZYZYPLLWCC-UHFFFAOYSA-N EDTA Chemical compound OC(=O)CN(CC(O)=O)CCN(CC(O)=O)CC(O)=O KCXVZYZYPLLWCC-UHFFFAOYSA-N 0.000 description 4
- ZHNUHDYFZUAESO-UHFFFAOYSA-N Formamide Chemical compound NC=O ZHNUHDYFZUAESO-UHFFFAOYSA-N 0.000 description 4
- HTTJABKRGRZYRN-UHFFFAOYSA-N Heparin Chemical compound OC1C(NC(=O)C)C(O)OC(COS(O)(=O)=O)C1OC1C(OS(O)(=O)=O)C(O)C(OC2C(C(OS(O)(=O)=O)C(OC3C(C(O)C(O)C(O3)C(O)=O)OS(O)(=O)=O)C(CO)O2)NS(O)(=O)=O)C(C(O)=O)O1 HTTJABKRGRZYRN-UHFFFAOYSA-N 0.000 description 4
- HNDVDQJCIGZPNO-YFKPBYRVSA-N L-histidine Chemical compound OC(=O)[C@@H](N)CC1=CN=CN1 HNDVDQJCIGZPNO-YFKPBYRVSA-N 0.000 description 4
- AYFVYJQAPQTCCC-GBXIJSLDSA-N L-threonine Chemical compound C[C@@H](O)[C@H](N)C(O)=O AYFVYJQAPQTCCC-GBXIJSLDSA-N 0.000 description 4
- OUYCCCASQSFEME-QMMMGPOBSA-N L-tyrosine Chemical compound OC(=O)[C@@H](N)CC1=CC=C(O)C=C1 OUYCCCASQSFEME-QMMMGPOBSA-N 0.000 description 4
- 108020005187 Oligonucleotide Probes Proteins 0.000 description 4
- 241000283973 Oryctolagus cuniculus Species 0.000 description 4
- 229910019142 PO4 Inorganic materials 0.000 description 4
- ISWSIDIOOBJBQZ-UHFFFAOYSA-N Phenol Chemical compound OC1=CC=CC=C1 ISWSIDIOOBJBQZ-UHFFFAOYSA-N 0.000 description 4
- 102000010780 Platelet-Derived Growth Factor Human genes 0.000 description 4
- 108010038512 Platelet-Derived Growth Factor Proteins 0.000 description 4
- 239000006180 TBST buffer Substances 0.000 description 4
- 239000002585 base Substances 0.000 description 4
- 230000004071 biological effect Effects 0.000 description 4
- AIYUHDOJVYHVIT-UHFFFAOYSA-M caesium chloride Chemical compound [Cl-].[Cs+] AIYUHDOJVYHVIT-UHFFFAOYSA-M 0.000 description 4
- 238000004587 chromatography analysis Methods 0.000 description 4
- 238000010367 cloning Methods 0.000 description 4
- 210000004292 cytoskeleton Anatomy 0.000 description 4
- 238000005516 engineering process Methods 0.000 description 4
- 239000000284 extract Substances 0.000 description 4
- 230000006870 function Effects 0.000 description 4
- 239000000499 gel Substances 0.000 description 4
- 230000003993 interaction Effects 0.000 description 4
- PHTQWCKDNZKARW-UHFFFAOYSA-N isoamylol Chemical compound CC(C)CCO PHTQWCKDNZKARW-UHFFFAOYSA-N 0.000 description 4
- 210000004962 mammalian cell Anatomy 0.000 description 4
- 239000000203 mixture Substances 0.000 description 4
- 239000002751 oligonucleotide probe Substances 0.000 description 4
- 239000010452 phosphate Substances 0.000 description 4
- 239000002953 phosphate buffered saline Substances 0.000 description 4
- 238000001556 precipitation Methods 0.000 description 4
- 239000002243 precursor Substances 0.000 description 4
- 230000035484 reaction time Effects 0.000 description 4
- 238000012216 screening Methods 0.000 description 4
- 239000000243 solution Substances 0.000 description 4
- 238000001890 transfection Methods 0.000 description 4
- 239000001226 triphosphate Substances 0.000 description 4
- 235000011178 triphosphate Nutrition 0.000 description 4
- 229920000936 Agarose Polymers 0.000 description 3
- 102000014914 Carrier Proteins Human genes 0.000 description 3
- SRBFZHDQGSBBOR-IOVATXLUSA-N D-xylopyranose Chemical compound O[C@@H]1COC(O)[C@H](O)[C@H]1O SRBFZHDQGSBBOR-IOVATXLUSA-N 0.000 description 3
- 241000588724 Escherichia coli Species 0.000 description 3
- LFQSCWFLJHTTHZ-UHFFFAOYSA-N Ethanol Chemical compound CCO LFQSCWFLJHTTHZ-UHFFFAOYSA-N 0.000 description 3
- SXRSQZLOMIGNAQ-UHFFFAOYSA-N Glutaraldehyde Chemical compound O=CCCCC=O SXRSQZLOMIGNAQ-UHFFFAOYSA-N 0.000 description 3
- KFZMGEQAYNKOFK-UHFFFAOYSA-N Isopropanol Chemical compound CC(C)O KFZMGEQAYNKOFK-UHFFFAOYSA-N 0.000 description 3
- ONIBWKKTOPOVIA-BYPYZUCNSA-N L-Proline Chemical compound OC(=O)[C@@H]1CCCN1 ONIBWKKTOPOVIA-BYPYZUCNSA-N 0.000 description 3
- KDXKERNSBIXSRK-YFKPBYRVSA-N L-lysine Chemical compound NCCCC[C@H](N)C(O)=O KDXKERNSBIXSRK-YFKPBYRVSA-N 0.000 description 3
- FFEARJCKVFRZRR-BYPYZUCNSA-N L-methionine Chemical compound CSCC[C@H](N)C(O)=O FFEARJCKVFRZRR-BYPYZUCNSA-N 0.000 description 3
- QIVBCDIJIAJPQS-VIFPVBQESA-N L-tryptophane Chemical compound C1=CC=C2C(C[C@H](N)C(O)=O)=CNC2=C1 QIVBCDIJIAJPQS-VIFPVBQESA-N 0.000 description 3
- 239000000020 Nitrocellulose Substances 0.000 description 3
- 108010076504 Protein Sorting Signals Proteins 0.000 description 3
- 102000006382 Ribonucleases Human genes 0.000 description 3
- 108010083644 Ribonucleases Proteins 0.000 description 3
- 238000012300 Sequence Analysis Methods 0.000 description 3
- 208000037065 Subacute sclerosing leukoencephalitis Diseases 0.000 description 3
- 206010042297 Subacute sclerosing panencephalitis Diseases 0.000 description 3
- 108091036066 Three prime untranslated region Proteins 0.000 description 3
- 108010065282 UDP xylose-protein xylosyltransferase Proteins 0.000 description 3
- 102000010199 Xylosyltransferases Human genes 0.000 description 3
- 238000013459 approach Methods 0.000 description 3
- 230000008901 benefit Effects 0.000 description 3
- 230000000903 blocking effect Effects 0.000 description 3
- 239000000356 contaminant Substances 0.000 description 3
- 238000011161 development Methods 0.000 description 3
- 230000029087 digestion Effects 0.000 description 3
- 238000010790 dilution Methods 0.000 description 3
- 239000012895 dilution Substances 0.000 description 3
- 210000002889 endothelial cell Anatomy 0.000 description 3
- 239000013604 expression vector Substances 0.000 description 3
- 238000002649 immunization Methods 0.000 description 3
- 230000003053 immunization Effects 0.000 description 3
- 238000011534 incubation Methods 0.000 description 3
- 108010028930 invariant chain Proteins 0.000 description 3
- 239000003446 ligand Substances 0.000 description 3
- 230000037230 mobility Effects 0.000 description 3
- 229920001220 nitrocellulos Polymers 0.000 description 3
- 230000036961 partial effect Effects 0.000 description 3
- 125000002467 phosphate group Chemical group [H]OP(=O)(O[H])O[*] 0.000 description 3
- 238000000746 purification Methods 0.000 description 3
- 230000002285 radioactive effect Effects 0.000 description 3
- 230000009257 reactivity Effects 0.000 description 3
- 238000011084 recovery Methods 0.000 description 3
- 230000010076 replication Effects 0.000 description 3
- 238000011160 research Methods 0.000 description 3
- 230000004044 response Effects 0.000 description 3
- 238000000926 separation method Methods 0.000 description 3
- 238000012163 sequencing technique Methods 0.000 description 3
- 125000003607 serino group Chemical group [H]N([H])[C@]([H])(C(=O)[*])C(O[H])([H])[H] 0.000 description 3
- 238000010561 standard procedure Methods 0.000 description 3
- 238000003756 stirring Methods 0.000 description 3
- 238000012546 transfer Methods 0.000 description 3
- 230000009466 transformation Effects 0.000 description 3
- 238000005406 washing Methods 0.000 description 3
- YBJHBAHKTGYVGT-ZKWXMUAHSA-N (+)-Biotin Chemical compound N1C(=O)N[C@@H]2[C@H](CCCCC(=O)O)SC[C@@H]21 YBJHBAHKTGYVGT-ZKWXMUAHSA-N 0.000 description 2
- QRXMUCSWCMTJGU-UHFFFAOYSA-L (5-bromo-4-chloro-1h-indol-3-yl) phosphate Chemical compound C1=C(Br)C(Cl)=C2C(OP([O-])(=O)[O-])=CNC2=C1 QRXMUCSWCMTJGU-UHFFFAOYSA-L 0.000 description 2
- QKNYBSVHEMOAJP-UHFFFAOYSA-N 2-amino-2-(hydroxymethyl)propane-1,3-diol;hydron;chloride Chemical compound Cl.OCC(N)(CO)CO QKNYBSVHEMOAJP-UHFFFAOYSA-N 0.000 description 2
- IJGRMHOSHXDMSA-UHFFFAOYSA-N Atomic nitrogen Chemical compound N#N IJGRMHOSHXDMSA-UHFFFAOYSA-N 0.000 description 2
- 241000894006 Bacteria Species 0.000 description 2
- 108091016585 CD44 antigen Proteins 0.000 description 2
- 108010078791 Carrier Proteins Proteins 0.000 description 2
- 102000005598 Chondroitin Sulfate Proteoglycans Human genes 0.000 description 2
- 108010059480 Chondroitin Sulfate Proteoglycans Proteins 0.000 description 2
- 108091035707 Consensus sequence Proteins 0.000 description 2
- 238000001712 DNA sequencing Methods 0.000 description 2
- 108010042407 Endonucleases Proteins 0.000 description 2
- 102000018233 Fibroblast Growth Factor Human genes 0.000 description 2
- 108050007372 Fibroblast Growth Factor Proteins 0.000 description 2
- WHUUTDBJXJRKMK-UHFFFAOYSA-N Glutamic acid Natural products OC(=O)C(N)CCC(O)=O WHUUTDBJXJRKMK-UHFFFAOYSA-N 0.000 description 2
- 102000003886 Glycoproteins Human genes 0.000 description 2
- 108090000288 Glycoproteins Proteins 0.000 description 2
- 241000725303 Human immunodeficiency virus Species 0.000 description 2
- 101710203526 Integrase Proteins 0.000 description 2
- ODKSFYDXXFIFQN-BYPYZUCNSA-P L-argininium(2+) Chemical compound NC(=[NH2+])NCCC[C@H]([NH3+])C(O)=O ODKSFYDXXFIFQN-BYPYZUCNSA-P 0.000 description 2
- ZDXPYRJPNDTMRX-VKHMYHEASA-N L-glutamine Chemical compound OC(=O)[C@@H](N)CCC(N)=O ZDXPYRJPNDTMRX-VKHMYHEASA-N 0.000 description 2
- KZSNJWFQEVHDMF-BYPYZUCNSA-N L-valine Chemical compound CC(C)[C@H](N)C(O)=O KZSNJWFQEVHDMF-BYPYZUCNSA-N 0.000 description 2
- 102000003960 Ligases Human genes 0.000 description 2
- 108090000364 Ligases Proteins 0.000 description 2
- TWRXJAOTZQYOKJ-UHFFFAOYSA-L Magnesium chloride Chemical compound [Mg+2].[Cl-].[Cl-] TWRXJAOTZQYOKJ-UHFFFAOYSA-L 0.000 description 2
- 229920001213 Polysorbate 20 Polymers 0.000 description 2
- CZPWVGJYEJSRLH-UHFFFAOYSA-N Pyrimidine Chemical compound C1=CN=CN=C1 CZPWVGJYEJSRLH-UHFFFAOYSA-N 0.000 description 2
- 238000002123 RNA extraction Methods 0.000 description 2
- 239000003391 RNA probe Substances 0.000 description 2
- 108090000631 Trypsin Proteins 0.000 description 2
- 102000004142 Trypsin Human genes 0.000 description 2
- 239000002253 acid Chemical class 0.000 description 2
- 239000003513 alkali Substances 0.000 description 2
- -1 aromatic amino acids Chemical class 0.000 description 2
- 230000001580 bacterial effect Effects 0.000 description 2
- 230000006399 behavior Effects 0.000 description 2
- 238000010804 cDNA synthesis Methods 0.000 description 2
- 239000001506 calcium phosphate Substances 0.000 description 2
- 229910000389 calcium phosphate Inorganic materials 0.000 description 2
- 235000011010 calcium phosphates Nutrition 0.000 description 2
- 150000001720 carbohydrates Chemical class 0.000 description 2
- 150000001732 carboxylic acid derivatives Chemical class 0.000 description 2
- 230000000747 cardiac effect Effects 0.000 description 2
- 238000004113 cell culture Methods 0.000 description 2
- 239000001913 cellulose Substances 0.000 description 2
- 229920002678 cellulose Polymers 0.000 description 2
- 239000003153 chemical reaction reagent Substances 0.000 description 2
- 210000004978 chinese hamster ovary cell Anatomy 0.000 description 2
- OPTASPLRGRRNAP-UHFFFAOYSA-N cytosine Chemical compound NC=1C=CNC(=O)N=1 OPTASPLRGRRNAP-UHFFFAOYSA-N 0.000 description 2
- HAAZLUGHYHWQIW-KVQBGUIXSA-N dGTP Chemical compound C1=NC=2C(=O)NC(N)=NC=2N1[C@H]1C[C@H](O)[C@@H](COP(O)(=O)OP(O)(=O)OP(O)(O)=O)O1 HAAZLUGHYHWQIW-KVQBGUIXSA-N 0.000 description 2
- 238000001514 detection method Methods 0.000 description 2
- 229960000633 dextran sulfate Drugs 0.000 description 2
- 230000002708 enhancing effect Effects 0.000 description 2
- 238000000605 extraction Methods 0.000 description 2
- PJMPHNIQZUBGLI-UHFFFAOYSA-N fentanyl Chemical compound C=1C=CC=CC=1N(C(=O)CC)C(CC1)CCN1CCC1=CC=CC=C1 PJMPHNIQZUBGLI-UHFFFAOYSA-N 0.000 description 2
- 229940126864 fibroblast growth factor Drugs 0.000 description 2
- UYTPUPDQBNUYGX-UHFFFAOYSA-N guanine Chemical compound O=C1NC(N)=NC2=C1N=CN2 UYTPUPDQBNUYGX-UHFFFAOYSA-N 0.000 description 2
- 230000003301 hydrolyzing effect Effects 0.000 description 2
- 239000012535 impurity Substances 0.000 description 2
- 239000003112 inhibitor Substances 0.000 description 2
- 108010044426 integrins Proteins 0.000 description 2
- 102000006495 integrins Human genes 0.000 description 2
- 150000002500 ions Chemical class 0.000 description 2
- 238000005304 joining Methods 0.000 description 2
- KWGKDLIKAYFUFQ-UHFFFAOYSA-M lithium chloride Chemical compound [Li+].[Cl-] KWGKDLIKAYFUFQ-UHFFFAOYSA-M 0.000 description 2
- 230000001404 mediated effect Effects 0.000 description 2
- 238000013508 migration Methods 0.000 description 2
- 230000005012 migration Effects 0.000 description 2
- 238000012986 modification Methods 0.000 description 2
- 230000004048 modification Effects 0.000 description 2
- 238000010369 molecular cloning Methods 0.000 description 2
- 230000004660 morphological change Effects 0.000 description 2
- 210000004165 myocardium Anatomy 0.000 description 2
- JPXMTWWFLBLUCD-UHFFFAOYSA-N nitro blue tetrazolium(2+) Chemical compound COC1=CC(C=2C=C(OC)C(=CC=2)[N+]=2N(N=C(N=2)C=2C=CC=CC=2)C=2C=CC(=CC=2)[N+]([O-])=O)=CC=C1[N+]1=NC(C=2C=CC=CC=2)=NN1C1=CC=C([N+]([O-])=O)C=C1 JPXMTWWFLBLUCD-UHFFFAOYSA-N 0.000 description 2
- JMANVNJQNLATNU-UHFFFAOYSA-N oxalonitrile Chemical compound N#CC#N JMANVNJQNLATNU-UHFFFAOYSA-N 0.000 description 2
- 239000008188 pellet Substances 0.000 description 2
- NBIIXXVUZAFLBC-UHFFFAOYSA-K phosphate Chemical compound [O-]P([O-])([O-])=O NBIIXXVUZAFLBC-UHFFFAOYSA-K 0.000 description 2
- 239000013600 plasmid vector Substances 0.000 description 2
- 238000003752 polymerase chain reaction Methods 0.000 description 2
- 239000000256 polyoxyethylene sorbitan monolaurate Substances 0.000 description 2
- 235000010486 polyoxyethylene sorbitan monolaurate Nutrition 0.000 description 2
- 230000037452 priming Effects 0.000 description 2
- 230000002829 reductive effect Effects 0.000 description 2
- 230000001105 regulatory effect Effects 0.000 description 2
- 210000002027 skeletal muscle Anatomy 0.000 description 2
- 230000001225 therapeutic effect Effects 0.000 description 2
- RWQNBRDOKXIBIV-UHFFFAOYSA-N thymine Chemical group CC1=CNC(=O)NC1=O RWQNBRDOKXIBIV-UHFFFAOYSA-N 0.000 description 2
- 238000013519 translation Methods 0.000 description 2
- QORWJWZARLRLPR-UHFFFAOYSA-H tricalcium bis(phosphate) Chemical compound [Ca+2].[Ca+2].[Ca+2].[O-]P([O-])([O-])=O.[O-]P([O-])([O-])=O QORWJWZARLRLPR-UHFFFAOYSA-H 0.000 description 2
- 125000002264 triphosphate group Chemical class [H]OP(=O)(O[H])OP(=O)(O[H])OP(=O)(O[H])O* 0.000 description 2
- UNXRWKVEANCORM-UHFFFAOYSA-N triphosphoric acid Chemical compound OP(O)(=O)OP(O)(=O)OP(O)(O)=O UNXRWKVEANCORM-UHFFFAOYSA-N 0.000 description 2
- 239000012588 trypsin Substances 0.000 description 2
- XLYOFNOQVPJJNP-UHFFFAOYSA-N water Substances O XLYOFNOQVPJJNP-UHFFFAOYSA-N 0.000 description 2
- 238000001262 western blot Methods 0.000 description 2
- DGVVWUTYPXICAM-UHFFFAOYSA-N β‐Mercaptoethanol Chemical compound OCCS DGVVWUTYPXICAM-UHFFFAOYSA-N 0.000 description 2
- BRZYSWJRSDMWLG-DJWUNRQOSA-N (2r,3r,4r,5r)-2-[(1s,2s,3r,4s,6r)-4,6-diamino-3-[(2s,3r,4r,5s,6r)-3-amino-4,5-dihydroxy-6-[(1r)-1-hydroxyethyl]oxan-2-yl]oxy-2-hydroxycyclohexyl]oxy-5-methyl-4-(methylamino)oxane-3,5-diol Chemical compound O1C[C@@](O)(C)[C@H](NC)[C@@H](O)[C@H]1O[C@@H]1[C@@H](O)[C@H](O[C@@H]2[C@@H]([C@@H](O)[C@H](O)[C@@H]([C@@H](C)O)O2)N)[C@@H](N)C[C@H]1N BRZYSWJRSDMWLG-DJWUNRQOSA-N 0.000 description 1
- 102000040650 (ribonucleotides)n+m Human genes 0.000 description 1
- 229930024421 Adenine Natural products 0.000 description 1
- GFFGJBXGBJISGV-UHFFFAOYSA-N Adenine Chemical compound NC1=NC=NC2=C1N=CN2 GFFGJBXGBJISGV-UHFFFAOYSA-N 0.000 description 1
- 108020005544 Antisense RNA Proteins 0.000 description 1
- JQFZHHSQMKZLRU-IUCAKERBSA-N Arg-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](N)CCCN=C(N)N JQFZHHSQMKZLRU-IUCAKERBSA-N 0.000 description 1
- YXVAESUIQFDBHN-SRVKXCTJSA-N Asn-Phe-Ser Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(O)=O YXVAESUIQFDBHN-SRVKXCTJSA-N 0.000 description 1
- 241000972773 Aulopiformes Species 0.000 description 1
- 241000713838 Avian myeloblastosis virus Species 0.000 description 1
- 108090001008 Avidin Proteins 0.000 description 1
- BVKZGUZCCUSVTD-UHFFFAOYSA-M Bicarbonate Chemical compound OC([O-])=O BVKZGUZCCUSVTD-UHFFFAOYSA-M 0.000 description 1
- 108091003079 Bovine Serum Albumin Proteins 0.000 description 1
- 102000037716 Chondroitin-sulfate-ABC endolyases Human genes 0.000 description 1
- 108090000819 Chondroitin-sulfate-ABC endolyases Proteins 0.000 description 1
- 102000008186 Collagen Human genes 0.000 description 1
- 108010035532 Collagen Proteins 0.000 description 1
- 102000004427 Collagen Type IX Human genes 0.000 description 1
- 108010042106 Collagen Type IX Proteins 0.000 description 1
- 241000699802 Cricetulus griseus Species 0.000 description 1
- LEVWYRKDKASIDU-QWWZWVQMSA-N D-cystine Chemical compound OC(=O)[C@H](N)CSSC[C@@H](N)C(O)=O LEVWYRKDKASIDU-QWWZWVQMSA-N 0.000 description 1
- 108010017826 DNA Polymerase I Proteins 0.000 description 1
- 102000004594 DNA Polymerase I Human genes 0.000 description 1
- 108010066072 DNA modification methylase EcoRI Proteins 0.000 description 1
- 239000003155 DNA primer Substances 0.000 description 1
- 239000003298 DNA probe Substances 0.000 description 1
- 108010014303 DNA-directed DNA polymerase Proteins 0.000 description 1
- 102000016928 DNA-directed DNA polymerase Human genes 0.000 description 1
- 240000006497 Dianthus caryophyllus Species 0.000 description 1
- 235000009355 Dianthus caryophyllus Nutrition 0.000 description 1
- BWGNESOTFCXPMA-UHFFFAOYSA-N Dihydrogen disulfide Chemical compound SS BWGNESOTFCXPMA-UHFFFAOYSA-N 0.000 description 1
- 239000006144 Dulbecco’s modified Eagle's medium Substances 0.000 description 1
- 102100021238 Dynamin-2 Human genes 0.000 description 1
- 102100031780 Endonuclease Human genes 0.000 description 1
- 102000004533 Endonucleases Human genes 0.000 description 1
- 241000206602 Eukaryota Species 0.000 description 1
- 102000008857 Ferritin Human genes 0.000 description 1
- 108050000784 Ferritin Proteins 0.000 description 1
- 238000008416 Ferritin Methods 0.000 description 1
- 102000003971 Fibroblast Growth Factor 1 Human genes 0.000 description 1
- 108090000386 Fibroblast Growth Factor 1 Proteins 0.000 description 1
- 102000016359 Fibronectins Human genes 0.000 description 1
- 108010067306 Fibronectins Proteins 0.000 description 1
- 229920001917 Ficoll Polymers 0.000 description 1
- 108700039691 Genetic Promoter Regions Proteins 0.000 description 1
- 108010092364 Glucuronosyltransferase Proteins 0.000 description 1
- 102000016354 Glucuronosyltransferase Human genes 0.000 description 1
- BCCRXDTUTZHDEU-VKHMYHEASA-N Gly-Ser Chemical compound NCC(=O)N[C@@H](CO)C(O)=O BCCRXDTUTZHDEU-VKHMYHEASA-N 0.000 description 1
- 108700023372 Glycosyltransferases Proteins 0.000 description 1
- 101000756632 Homo sapiens Actin, cytoplasmic 1 Proteins 0.000 description 1
- 101000817607 Homo sapiens Dynamin-2 Proteins 0.000 description 1
- 108010042918 Integrin alpha5beta1 Proteins 0.000 description 1
- 102100027612 Kallikrein-11 Human genes 0.000 description 1
- 101710180643 Leishmanolysin Proteins 0.000 description 1
- 108010013563 Lipoprotein Lipase Proteins 0.000 description 1
- 102100022119 Lipoprotein lipase Human genes 0.000 description 1
- 239000007993 MOPS buffer Substances 0.000 description 1
- 108010052285 Membrane Proteins Proteins 0.000 description 1
- 102000018697 Membrane Proteins Human genes 0.000 description 1
- 102000007524 N-Acetylgalactosaminyltransferases Human genes 0.000 description 1
- 108010046220 N-Acetylgalactosaminyltransferases Proteins 0.000 description 1
- BACYUWVYYTXETD-UHFFFAOYSA-N N-Lauroylsarcosine Chemical compound CCCCCCCCCCCC(=O)N(C)CC(O)=O BACYUWVYYTXETD-UHFFFAOYSA-N 0.000 description 1
- 102100023315 N-acetyllactosaminide beta-1,6-N-acetylglucosaminyl-transferase Human genes 0.000 description 1
- 108010056664 N-acetyllactosaminide beta-1,6-N-acetylglucosaminyltransferase Proteins 0.000 description 1
- 230000004988 N-glycosylation Effects 0.000 description 1
- UIQMVEYFGZJHCZ-SSTWWWIQSA-N Nalorphine Chemical compound C([C@@H](N(CC1)CC=C)[C@@H]2C=C[C@@H]3O)C4=CC=C(O)C5=C4[C@@]21[C@H]3O5 UIQMVEYFGZJHCZ-SSTWWWIQSA-N 0.000 description 1
- 101710163270 Nuclease Proteins 0.000 description 1
- 239000004677 Nylon Substances 0.000 description 1
- 108700026244 Open Reading Frames Proteins 0.000 description 1
- 102000000447 Peptide-N4-(N-acetyl-beta-glucosaminyl) Asparagine Amidase Human genes 0.000 description 1
- 108010055817 Peptide-N4-(N-acetyl-beta-glucosaminyl) Asparagine Amidase Proteins 0.000 description 1
- 108010021757 Polynucleotide 5'-Hydroxyl-Kinase Proteins 0.000 description 1
- 102000008422 Polynucleotide 5'-hydroxyl-kinase Human genes 0.000 description 1
- 241000288906 Primates Species 0.000 description 1
- 108020004518 RNA Probes Proteins 0.000 description 1
- 239000013614 RNA sample Substances 0.000 description 1
- 229920002684 Sepharose Polymers 0.000 description 1
- 238000002105 Southern blotting Methods 0.000 description 1
- 229930006000 Sucrose Natural products 0.000 description 1
- CZMRCDWAGMRECN-UGDNZRGBSA-N Sucrose Chemical compound O[C@H]1[C@H](O)[C@@H](CO)O[C@@]1(CO)O[C@@H]1[C@H](O)[C@@H](O)[C@H](O)[C@@H](CO)O1 CZMRCDWAGMRECN-UGDNZRGBSA-N 0.000 description 1
- 102000006463 Talin Human genes 0.000 description 1
- 108010083809 Talin Proteins 0.000 description 1
- ZMZDMBWJUHKJPS-UHFFFAOYSA-M Thiocyanate anion Chemical compound [S-]C#N ZMZDMBWJUHKJPS-UHFFFAOYSA-M 0.000 description 1
- 108060008245 Thrombospondin Proteins 0.000 description 1
- 102000002938 Thrombospondin Human genes 0.000 description 1
- 102000004887 Transforming Growth Factor beta Human genes 0.000 description 1
- 108090001012 Transforming Growth Factor beta Proteins 0.000 description 1
- 239000013504 Triton X-100 Substances 0.000 description 1
- 229920004890 Triton X-100 Polymers 0.000 description 1
- 101710152431 Trypsin-like protease Proteins 0.000 description 1
- 229910052770 Uranium Inorganic materials 0.000 description 1
- 241000700605 Viruses Species 0.000 description 1
- 239000008351 acetate buffer Substances 0.000 description 1
- 150000007513 acids Chemical class 0.000 description 1
- 229960000643 adenine Drugs 0.000 description 1
- 230000001464 adherent effect Effects 0.000 description 1
- 239000002671 adjuvant Substances 0.000 description 1
- 238000001042 affinity chromatography Methods 0.000 description 1
- 238000000246 agarose gel electrophoresis Methods 0.000 description 1
- 125000003295 alanine group Chemical group N[C@@H](C)C(=O)* 0.000 description 1
- 125000000539 amino acid group Chemical group 0.000 description 1
- AVKUERGKIZMTKX-NJBDSQKTSA-N ampicillin Chemical compound C1([C@@H](N)C(=O)N[C@H]2[C@H]3SC([C@@H](N3C2=O)C(O)=O)(C)C)=CC=CC=C1 AVKUERGKIZMTKX-NJBDSQKTSA-N 0.000 description 1
- 229960000723 ampicillin Drugs 0.000 description 1
- 230000002547 anomalous effect Effects 0.000 description 1
- 230000002429 anti-coagulating effect Effects 0.000 description 1
- 230000000692 anti-sense effect Effects 0.000 description 1
- 239000003146 anticoagulant agent Substances 0.000 description 1
- 229940127219 anticoagulant drug Drugs 0.000 description 1
- 229940019748 antifibrinolytic proteinase inhibitors Drugs 0.000 description 1
- 239000002246 antineoplastic agent Substances 0.000 description 1
- 229940041181 antineoplastic drug Drugs 0.000 description 1
- 239000008346 aqueous phase Substances 0.000 description 1
- 239000007864 aqueous solution Substances 0.000 description 1
- PYMYPHUHKUWMLA-UHFFFAOYSA-N arabinose Natural products OCC(O)C(O)C(O)C=O PYMYPHUHKUWMLA-UHFFFAOYSA-N 0.000 description 1
- 108010062796 arginyllysine Proteins 0.000 description 1
- 210000001367 artery Anatomy 0.000 description 1
- 235000003704 aspartic acid Nutrition 0.000 description 1
- 108010058966 bacteriophage T7 induced DNA polymerase Proteins 0.000 description 1
- 210000002469 basement membrane Anatomy 0.000 description 1
- 235000015278 beef Nutrition 0.000 description 1
- SRBFZHDQGSBBOR-UHFFFAOYSA-N beta-D-Pyranose-Lyxose Natural products OC1COC(O)C(O)C1O SRBFZHDQGSBBOR-UHFFFAOYSA-N 0.000 description 1
- AGSPXMVUFBBBMO-UHFFFAOYSA-N beta-aminopropionitrile Chemical compound NCCC#N AGSPXMVUFBBBMO-UHFFFAOYSA-N 0.000 description 1
- OQFSQFPPLPISGP-UHFFFAOYSA-N beta-carboxyaspartic acid Natural products OC(=O)C(N)C(C(O)=O)C(O)=O OQFSQFPPLPISGP-UHFFFAOYSA-N 0.000 description 1
- 108091008324 binding proteins Proteins 0.000 description 1
- 230000033228 biological regulation Effects 0.000 description 1
- 229960002685 biotin Drugs 0.000 description 1
- 235000020958 biotin Nutrition 0.000 description 1
- 239000011616 biotin Substances 0.000 description 1
- 210000004369 blood Anatomy 0.000 description 1
- 239000008280 blood Substances 0.000 description 1
- 229940098773 bovine serum albumin Drugs 0.000 description 1
- 235000014633 carbohydrates Nutrition 0.000 description 1
- 125000002091 cationic group Chemical group 0.000 description 1
- 230000032823 cell division Effects 0.000 description 1
- 239000006285 cell suspension Substances 0.000 description 1
- 230000002490 cerebral effect Effects 0.000 description 1
- 239000007795 chemical reaction product Substances 0.000 description 1
- 239000003795 chemical substances by application Substances 0.000 description 1
- 239000013599 cloning vector Substances 0.000 description 1
- 239000000701 coagulant Substances 0.000 description 1
- 229920001436 collagen Polymers 0.000 description 1
- 238000010959 commercial synthesis reaction Methods 0.000 description 1
- 230000002860 competitive effect Effects 0.000 description 1
- 239000003184 complementary RNA Substances 0.000 description 1
- 230000006835 compression Effects 0.000 description 1
- 238000007906 compression Methods 0.000 description 1
- 238000007796 conventional method Methods 0.000 description 1
- 230000008878 coupling Effects 0.000 description 1
- 238000010168 coupling process Methods 0.000 description 1
- 238000005859 coupling reaction Methods 0.000 description 1
- 238000004132 cross linking Methods 0.000 description 1
- 229960003067 cystine Drugs 0.000 description 1
- 229940104302 cytosine Drugs 0.000 description 1
- 230000003436 cytoskeletal effect Effects 0.000 description 1
- SUYVUBYJARFZHO-RRKCRQDMSA-N dATP Chemical compound C1=NC=2C(N)=NC=NC=2N1[C@H]1C[C@H](O)[C@@H](COP(O)(=O)OP(O)(=O)OP(O)(O)=O)O1 SUYVUBYJARFZHO-RRKCRQDMSA-N 0.000 description 1
- SUYVUBYJARFZHO-UHFFFAOYSA-N dATP Natural products C1=NC=2C(N)=NC=NC=2N1C1CC(O)C(COP(O)(=O)OP(O)(=O)OP(O)(O)=O)O1 SUYVUBYJARFZHO-UHFFFAOYSA-N 0.000 description 1
- RGWHQCVHVJXOKC-SHYZEUOFSA-J dCTP(4-) Chemical compound O=C1N=C(N)C=CN1[C@@H]1O[C@H](COP([O-])(=O)OP([O-])(=O)OP([O-])([O-])=O)[C@@H](O)C1 RGWHQCVHVJXOKC-SHYZEUOFSA-J 0.000 description 1
- UFJPAQSLHAGEBL-RRKCRQDMSA-N dITP Chemical compound O1[C@H](COP(O)(=O)OP(O)(=O)OP(O)(O)=O)[C@@H](O)C[C@@H]1N1C(N=CNC2=O)=C2N=C1 UFJPAQSLHAGEBL-RRKCRQDMSA-N 0.000 description 1
- NHVNXKFIZYSCEB-XLPZGREQSA-N dTTP Chemical compound O=C1NC(=O)C(C)=CN1[C@@H]1O[C@H](COP(O)(=O)OP(O)(=O)OP(O)(O)=O)[C@@H](O)C1 NHVNXKFIZYSCEB-XLPZGREQSA-N 0.000 description 1
- 238000000326 densiometry Methods 0.000 description 1
- 238000002405 diagnostic procedure Methods 0.000 description 1
- 238000010586 diagram Methods 0.000 description 1
- 239000000539 dimer Substances 0.000 description 1
- LOKCTEFSRHRXRJ-UHFFFAOYSA-I dipotassium trisodium dihydrogen phosphate hydrogen phosphate dichloride Chemical compound P(=O)(O)(O)[O-].[K+].P(=O)(O)([O-])[O-].[Na+].[Na+].[Cl-].[K+].[Cl-].[Na+] LOKCTEFSRHRXRJ-UHFFFAOYSA-I 0.000 description 1
- 201000010099 disease Diseases 0.000 description 1
- 208000037265 diseases, disorders, signs and symptoms Diseases 0.000 description 1
- 238000009826 distribution Methods 0.000 description 1
- 210000002257 embryonic structure Anatomy 0.000 description 1
- 239000000839 emulsion Substances 0.000 description 1
- 230000002616 endonucleolytic effect Effects 0.000 description 1
- 210000002472 endoplasmic reticulum Anatomy 0.000 description 1
- 230000007613 environmental effect Effects 0.000 description 1
- 230000002255 enzymatic effect Effects 0.000 description 1
- 210000002615 epidermis Anatomy 0.000 description 1
- ZMMJGEGLRURXTF-UHFFFAOYSA-N ethidium bromide Chemical compound [Br-].C12=CC(N)=CC=C2C2=CC=C(N)C=C2[N+](CC)=C1C1=CC=CC=C1 ZMMJGEGLRURXTF-UHFFFAOYSA-N 0.000 description 1
- 229960005542 ethidium bromide Drugs 0.000 description 1
- DNJIEGIFACGWOD-UHFFFAOYSA-N ethyl mercaptane Natural products CCS DNJIEGIFACGWOD-UHFFFAOYSA-N 0.000 description 1
- 238000002474 experimental method Methods 0.000 description 1
- 239000002360 explosive Substances 0.000 description 1
- 239000011536 extraction buffer Substances 0.000 description 1
- 235000013861 fat-free Nutrition 0.000 description 1
- 210000002950 fibroblast Anatomy 0.000 description 1
- 230000003328 fibroblastic effect Effects 0.000 description 1
- WSFSSNUMVMOOMR-UHFFFAOYSA-N formaldehyde Substances O=C WSFSSNUMVMOOMR-UHFFFAOYSA-N 0.000 description 1
- 102000037865 fusion proteins Human genes 0.000 description 1
- 108020001507 fusion proteins Proteins 0.000 description 1
- 238000001641 gel filtration chromatography Methods 0.000 description 1
- 210000004907 gland Anatomy 0.000 description 1
- 235000013922 glutamic acid Nutrition 0.000 description 1
- 239000004220 glutamic acid Substances 0.000 description 1
- 150000004676 glycans Chemical class 0.000 description 1
- 102000045442 glycosyltransferase activity proteins Human genes 0.000 description 1
- 108700014210 glycosyltransferase activity proteins Proteins 0.000 description 1
- 125000003630 glycyl group Chemical group [H]N([H])C([H])([H])C(*)=O 0.000 description 1
- PCHJSUWPFVWCPO-UHFFFAOYSA-N gold Chemical compound [Au] PCHJSUWPFVWCPO-UHFFFAOYSA-N 0.000 description 1
- 239000010931 gold Substances 0.000 description 1
- 229910052737 gold Inorganic materials 0.000 description 1
- 239000001963 growth medium Substances 0.000 description 1
- YQOKLYTXVFAUCW-UHFFFAOYSA-N guanidine;isothiocyanic acid Chemical compound N=C=S.NC(N)=N YQOKLYTXVFAUCW-UHFFFAOYSA-N 0.000 description 1
- 238000003306 harvesting Methods 0.000 description 1
- 230000036541 health Effects 0.000 description 1
- 239000001257 hydrogen Substances 0.000 description 1
- 229910052739 hydrogen Inorganic materials 0.000 description 1
- ZMZDMBWJUHKJPS-UHFFFAOYSA-N hydrogen thiocyanate Natural products SC#N ZMZDMBWJUHKJPS-UHFFFAOYSA-N 0.000 description 1
- 210000004201 immune sera Anatomy 0.000 description 1
- 229940042743 immune sera Drugs 0.000 description 1
- 230000002163 immunogen Effects 0.000 description 1
- 238000001114 immunoprecipitation Methods 0.000 description 1
- 238000010348 incorporation Methods 0.000 description 1
- 208000015181 infectious disease Diseases 0.000 description 1
- 230000002401 inhibitory effect Effects 0.000 description 1
- 210000000936 intestine Anatomy 0.000 description 1
- 230000003834 intracellular effect Effects 0.000 description 1
- 239000007927 intramuscular injection Substances 0.000 description 1
- 238000010255 intramuscular injection Methods 0.000 description 1
- 238000011835 investigation Methods 0.000 description 1
- 229960004592 isopropanol Drugs 0.000 description 1
- 108010045069 keyhole-limpet hemocyanin Proteins 0.000 description 1
- 238000007169 ligase reaction Methods 0.000 description 1
- 230000000670 limiting effect Effects 0.000 description 1
- 150000002632 lipids Chemical class 0.000 description 1
- 239000007788 liquid Substances 0.000 description 1
- 238000005567 liquid scintillation counting Methods 0.000 description 1
- 210000005228 liver tissue Anatomy 0.000 description 1
- 238000011068 loading method Methods 0.000 description 1
- 210000004072 lung Anatomy 0.000 description 1
- 229910001629 magnesium chloride Inorganic materials 0.000 description 1
- 238000012423 maintenance Methods 0.000 description 1
- 238000005259 measurement Methods 0.000 description 1
- 239000002609 medium Substances 0.000 description 1
- 229910052751 metal Inorganic materials 0.000 description 1
- 239000002184 metal Substances 0.000 description 1
- 230000011987 methylation Effects 0.000 description 1
- 238000007069 methylation reaction Methods 0.000 description 1
- 235000013336 milk Nutrition 0.000 description 1
- 239000008267 milk Substances 0.000 description 1
- 210000004080 milk Anatomy 0.000 description 1
- 230000003278 mimic effect Effects 0.000 description 1
- 108091005601 modified peptides Proteins 0.000 description 1
- MLEBFEHOJICQQS-UHFFFAOYSA-N monodansylcadaverine Chemical compound C1=CC=C2C(N(C)C)=CC=CC2=C1S(=O)(=O)NCCCCCN MLEBFEHOJICQQS-UHFFFAOYSA-N 0.000 description 1
- 230000001002 morphogenetic effect Effects 0.000 description 1
- 230000035772 mutation Effects 0.000 description 1
- 230000001613 neoplastic effect Effects 0.000 description 1
- 238000006386 neutralization reaction Methods 0.000 description 1
- 238000011587 new zealand white rabbit Methods 0.000 description 1
- 229910052757 nitrogen Inorganic materials 0.000 description 1
- 231100000065 noncytotoxic Toxicity 0.000 description 1
- 230000002020 noncytotoxic effect Effects 0.000 description 1
- 230000001453 nonthrombogenic effect Effects 0.000 description 1
- 102000039446 nucleic acids Human genes 0.000 description 1
- 108020004707 nucleic acids Proteins 0.000 description 1
- 150000007523 nucleic acids Chemical class 0.000 description 1
- 229920001778 nylon Polymers 0.000 description 1
- 229920001542 oligosaccharide Polymers 0.000 description 1
- 150000002482 oligosaccharides Chemical class 0.000 description 1
- 230000008520 organization Effects 0.000 description 1
- 210000001672 ovary Anatomy 0.000 description 1
- 239000000137 peptide hydrolase inhibitor Substances 0.000 description 1
- 150000002989 phenols Chemical class 0.000 description 1
- 230000026731 phosphorylation Effects 0.000 description 1
- 238000006366 phosphorylation reaction Methods 0.000 description 1
- 230000000704 physical effect Effects 0.000 description 1
- 238000013492 plasmid preparation Methods 0.000 description 1
- 238000002264 polyacrylamide gel electrophoresis Methods 0.000 description 1
- 229920001282 polysaccharide Polymers 0.000 description 1
- 239000005017 polysaccharide Substances 0.000 description 1
- 239000001267 polyvinylpyrrolidone Substances 0.000 description 1
- 235000013855 polyvinylpyrrolidone Nutrition 0.000 description 1
- 229920000036 polyvinylpyrrolidone Polymers 0.000 description 1
- 230000004481 post-translational protein modification Effects 0.000 description 1
- 230000001124 posttranscriptional effect Effects 0.000 description 1
- 230000003389 potentiating effect Effects 0.000 description 1
- 239000000843 powder Substances 0.000 description 1
- 230000000063 preceeding effect Effects 0.000 description 1
- 230000002265 prevention Effects 0.000 description 1
- 230000006920 protein precipitation Effects 0.000 description 1
- 230000017854 proteolysis Effects 0.000 description 1
- HNJBEVLQSNELDL-UHFFFAOYSA-N pyrrolidin-2-one Chemical compound O=C1CCCN1 HNJBEVLQSNELDL-UHFFFAOYSA-N 0.000 description 1
- 238000003127 radioimmunoassay Methods 0.000 description 1
- 230000009467 reduction Effects 0.000 description 1
- 230000008844 regulatory mechanism Effects 0.000 description 1
- 230000008439 repair process Effects 0.000 description 1
- 230000000717 retained effect Effects 0.000 description 1
- 238000010839 reverse transcription Methods 0.000 description 1
- 235000019515 salmon Nutrition 0.000 description 1
- 239000013049 sediment Substances 0.000 description 1
- 238000004062 sedimentation Methods 0.000 description 1
- 238000007560 sedimentation technique Methods 0.000 description 1
- 238000002864 sequence alignment Methods 0.000 description 1
- 238000011451 sequencing strategy Methods 0.000 description 1
- 239000002356 single layer Substances 0.000 description 1
- 150000003384 small molecules Chemical class 0.000 description 1
- 239000001509 sodium citrate Substances 0.000 description 1
- NLJMYIDDQXHKNR-UHFFFAOYSA-K sodium citrate Chemical compound O.O.[Na+].[Na+].[Na+].[O-]C(=O)CC(O)(CC([O-])=O)C([O-])=O NLJMYIDDQXHKNR-UHFFFAOYSA-K 0.000 description 1
- 239000001488 sodium phosphate Substances 0.000 description 1
- 229910000162 sodium phosphate Inorganic materials 0.000 description 1
- 239000007787 solid Substances 0.000 description 1
- 239000002904 solvent Substances 0.000 description 1
- 210000002341 stratified epithelial cell Anatomy 0.000 description 1
- 210000003699 striated muscle Anatomy 0.000 description 1
- 239000000126 substance Substances 0.000 description 1
- 239000005720 sucrose Substances 0.000 description 1
- 239000013589 supplement Substances 0.000 description 1
- 239000000725 suspension Substances 0.000 description 1
- 230000002194 synthesizing effect Effects 0.000 description 1
- ZRKFYGHZFMAOKI-QMGMOQQFSA-N tgfbeta Chemical compound C([C@H](NC(=O)[C@H](C(C)C)NC(=O)CNC(=O)[C@H](CCC(O)=O)NC(=O)[C@H](CCCNC(N)=N)NC(=O)[C@H](CC(N)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@H]([C@@H](C)O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@H]([C@@H](C)O)NC(=O)[C@H](CC(C)C)NC(=O)CNC(=O)[C@H](C)NC(=O)[C@H](CO)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@@H](NC(=O)[C@H](C)NC(=O)[C@H](C)NC(=O)[C@@H](NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CCSC)C(C)C)[C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC=1C=CC=CC=1)C(=O)N[C@@H](C)C(=O)N1[C@@H](CCC1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CC=1C=CC=CC=1)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(=O)N1[C@@H](CCC1)C(=O)N1[C@@H](CCC1)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O)C1=CC=C(O)C=C1 ZRKFYGHZFMAOKI-QMGMOQQFSA-N 0.000 description 1
- 238000002560 therapeutic procedure Methods 0.000 description 1
- 229940113082 thymine Drugs 0.000 description 1
- 230000001131 transforming effect Effects 0.000 description 1
- 239000003656 tris buffered saline Substances 0.000 description 1
- RYFMWSXOAZQYPI-UHFFFAOYSA-K trisodium phosphate Chemical compound [Na+].[Na+].[Na+].[O-]P([O-])([O-])=O RYFMWSXOAZQYPI-UHFFFAOYSA-K 0.000 description 1
- GPRLSGONYQIRFK-MNYXATJNSA-N triton Chemical compound [3H+] GPRLSGONYQIRFK-MNYXATJNSA-N 0.000 description 1
- 108010087967 type I signal peptidase Proteins 0.000 description 1
- 241001430294 unidentified retrovirus Species 0.000 description 1
- 238000011144 upstream manufacturing Methods 0.000 description 1
- 210000001215 vagina Anatomy 0.000 description 1
- 230000003612 virological effect Effects 0.000 description 1
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/11—DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
- C12N15/113—Non-coding nucleic acids modulating the expression of genes, e.g. antisense oligonucleotides; Antisense DNA or RNA; Triplex- forming oligonucleotides; Catalytic nucleic acids, e.g. ribozymes; Nucleic acids used in co-suppression or gene silencing
- C12N15/1138—Non-coding nucleic acids modulating the expression of genes, e.g. antisense oligonucleotides; Antisense DNA or RNA; Triplex- forming oligonucleotides; Catalytic nucleic acids, e.g. ribozymes; Nucleic acids used in co-suppression or gene silencing against receptors or cell surface proteins
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K14/00—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
- C07K14/435—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans
- C07K14/705—Receptors; Cell surface antigens; Cell surface determinants
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K16/00—Immunoglobulins [IGs], e.g. monoclonal or polyclonal antibodies
- C07K16/18—Immunoglobulins [IGs], e.g. monoclonal or polyclonal antibodies against material from animals or humans
- C07K16/28—Immunoglobulins [IGs], e.g. monoclonal or polyclonal antibodies against material from animals or humans against receptors, cell surface antigens or cell surface determinants
Definitions
- This invention relates generally to the field of genetic engineering and more particularly to genes for proteoglycans, their insertion into recombinant DNA vectors, and the production of the resulting core proteins in recipient strains of microorganisms and the proteoglycan in recipient eukaryotic cells.
- The cellular behavior responsible for the development, repair and maintenance of tissues is regulated, in large part, by interactions between cells and their extracellular matrix. These interactions are mediated by cell surface molecules acting as receptors that bind the large insoluble matrix molecules and induce responses that result in changes of cellular phenotype.
- Several proteins associated with the cell surface can bind matrix components. These proteins differ in their specificity and affinity and in their mode of association with the cell surface. Some bind cells to single matrix ligands while others, such as some members of the integrin superfamily, appear to have multiple matrix ligands. Of the various matrix-binding proteins at the cell surface, only the integrins are known to be integral membrane proteins. The integrin fibronectin receptor codistributes both with extracellular fibronectin and with intracellular cytoskeletal components, apparently via an association of the receptor's cytoplasmic domain with the cytoskeletal protein talin.
- The present inventors have studied a lipophilic proteoglycan containing both heparan sulfate and chondroitin sulfate that is found at the surface of mouse mammary epithelial cells and that behaves as a high affinity receptor specific for multiple components of the interstitial matrix.
- This proteoglycan has been given the name syndecan in the mouse.
- The proteoglycan binds the epithelial cells via its heparan sulfate chains to collagen types I, III, and V (Koda, J.E., Rapraeger, A., and Bernfield, M., J. Biol. Chem. (1985) 260: 8157-8162), fibronectin (Saunders, S. and Bernfield, M.
- Cultured epithelial cells shed the ectodomain from their apical surfaces as a non-lipophilic proteoglycan that contains all of the glycosaminoglycan of the intact molecule and polarize the proteoglycan exclusively to their basolateral surfaces, a location consistent with its matrix receptor function. Upon suspension of these cells, the ectodomain is cleaved from the cell surface; the proteoglycan is not replaced while the cells are suspended (Jalkanen, M., Rapraeger, A., Saunders, S., and Bernfield, M., J. Cell Biol. (1987) 105: 3087-3096).
- The proteoglycan is mainly on epithelia in mature tissues (Hayashi, K., Hayashi, M., Jalkanen, M., Firestone, J.H., Trelstad, R.L., and Bernfield, M., J. Histochem. Cytochem. (1987) 35: 1079-1088), and some of the present inventors have previously proposed that it is a matrix anchor that stabilizes the morphology of epithelial sheets by linking the cytoskeleton to the extracellular matrix (Bernfield, M., Rapraeger, A., Jalkanen, M., and Banerjee, S.D., Basement Membranes (1985) 343-352).
- Syndecan undergoes substantial regulation; its size, glycosaminoglycan composition and location at the cell surface vary between epithelial types, and its expression changes during development.
- The proteoglycan is located exclusively at the basolateral cell surface of simple epithelia but surrounds stratified epithelial cells. At basolateral cell surfaces, it appears to contain two heparan sulfate and two chondroitin sulfate chains, but where it surrounds cells, it contains only a single heparan sulfate chain and a single small chondroitin sulfate chain (Sanderson, R.D., and Bernfield, M., Proc. Natl. Acad. Sci. USA (1987) 23J3: 491-497).
- The proteoglycan is lost when the cells terminally differentiate (Hayashi, K., Hayashi, M., Boutin, E., Cunha, G.R., Bernfield, M., and Trelstad, R.L., J. Lab. Invest. (1988) 58: 68-76).
- The proteoglycan is transiently lost when epithelia change their shape and is transiently expressed by mesenchymal cells undergoing morphogenetic tissue interaction.
- Heparan sulfate proteoglycans are ubiquitous on the surfaces of adherent cells and bind various ligands including extracellular matrix, growth factors, proteinase inhibitors, and lipoprotein lipase; see Fransson, L., Trends Biochem. Sci. (1987) 12: 406-
- An isolated peptide having a molecular weight of from about 31 kD to about 35 kD and comprising a hydrophilic amino terminus extracellular region, a hydrophilic carboxy terminus cytoplasmic region, and a hydrophobic transmembrane region between said cytoplasmic and extracellular regions, a dibasic sequence extracellularly adjacent the transmembrane region of the peptide, and at least one glycosylation site in the extracellular region including an Xac-Xaa-Ser-Gly-Xac sequence, wherein Xac is an acidic amino acid and Xaa is any amino acid and wherein said peptide is capable of functioning as a core protein for attachment of a heparan sulfate chain at said Ser.
- A is alanine
- C is cysteine
- D is aspartate
- E is glutamate
- F is phenylalanine
- G is glycine
- H is histidine
- I is isoleucine
- K is lysine
- L is leucine
- M is methionine
- N is asparagine
- P is proline
- Q is glutamine
- R is arginine
- S is serine
- T is threonine
- V is valine
- W is tryptophan
- Y is tyrosine.
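- Using the one-letter codes listed above, the Xac-Xaa-Ser-Gly-Xac glycosylation motif can be written as a simple pattern and searched for in a peptide sequence. The following is a minimal sketch only: the function name `find_glycosylation_motifs` and the example peptide are hypothetical and are not taken from the disclosure, and a pattern match merely flags a candidate site, not a demonstrated heparan sulfate attachment.

```python
import re

# Xac = acidic residue (D or E); Xaa = any of the 20 residues listed above.
ACIDIC = "DE"
ANY_RESIDUE = "ACDEFGHIKLMNPQRSTVWY"

# A lookahead is used so that overlapping candidate sites are all reported.
MOTIF = re.compile(rf"(?=([{ACIDIC}][{ANY_RESIDUE}]SG[{ACIDIC}]))")


def find_glycosylation_motifs(peptide):
    """Return (serine_position, motif) pairs; positions are 1-based."""
    hits = []
    for match in MOTIF.finditer(peptide.upper()):
        # The Ser is the third residue of the five-residue motif.
        hits.append((match.start() + 3, match.group(1)))
    return hits


if __name__ == "__main__":
    # Hypothetical peptide used only to exercise the function.
    print(find_glycosylation_motifs("MKDASGEAAEGSGDLL"))
```

- The reported position is that of the Ser residue, since that is the residue to which the heparan sulfate chain would be attached.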
- DNA and RNA molecules, recombinant DNA vectors, and modified microorganisms or eukaryotic cells comprising a nucleotide sequence that encodes any of the peptides indicated above are also part of the present invention.
- Sequences comprising all or part of the following DNA sequence, a complementary DNA or RNA sequence, or a corresponding RNA sequence are especially preferred:
- DNA and RNA molecules containing segments of the larger sequence are also provided for use in carrying out preferred aspects of the invention relating to the production of such peptides by the techniques of genetic engineering and the production of oligonucleotide probes.
- Figure 1 is a formula showing the cDNA sequence for syndecan and the corresponding amino acid sequence.
- Figure 2 is a restriction map showing sequencing strategy of syndecan cDNA clones.
- Figure 3 is a table showing potential glycosylation sites of the syndecan core protein and homology of these regions to the glycosylation site of other proteins.
- Figure 4 is a schematic diagram showing different regions of the syndecan core protein.
- Figure 5 is a table showing DNA sequence similarities between murine syndecan and human insulin receptor.
- the 311 amino acid core protein has a unique sequence that contains several structural features consistent with its role as a matrix anchor and as an acceptor of two distinct types of glycosaminoglycan chains.
- the expression of its mRNA is tissue-type specific, and both the 5' and 3' untranslated regions of its cDNA show substantial sequence homology to those of the human insulin receptor cDNA.
- This core protein cDNA defines a new class of matrix receptor, an integral membrane proteoglycan, for which we propose the name syndecan (from the Greek, syndein, to bind together).
- Nucleotide sequence of one strand of syndecan cDNA. The numbers refer to the amino acid sequence and corresponding DNA codon sequence beginning at the amino terminus of the protein. The stop codon is marked "end."
- the trinucleotides of Table 1, termed codons, are presented as DNA trinucleotides, as they exist in the genetic material of a living organism.
- Complementary trinucleotide DNA sequences having opposite strand polarity are functionally equivalent to the codons of Table 1, as is understood in the art.
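- The relationship between a codon and the complementary trinucleotide of opposite polarity can be illustrated with a minimal sketch. The function name `reverse_complement` is an illustrative assumption, not something defined in this specification; only the four standard DNA bases are handled.

```python
# Watson-Crick pairing for the four DNA bases (A-T, C-G).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the complementary strand, written 5' to 3'.

    Complementing each base and then reversing the string gives the
    opposite-polarity strand in the usual left-to-right orientation.
    """
    return dna.upper().translate(COMPLEMENT)[::-1]


if __name__ == "__main__":
    codon = "ATG"  # methionine codon, written 5' to 3'
    print(codon, "pairs with", reverse_complement(codon))
```

- Applying the function twice returns the original sequence, which is one way to see that the two strands carry the same information.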
- An important and well known feature of the genetic code is its redundancy, whereby, for most of the amino acids used to make proteins, more than one coding nucleotide triplet may be employed. Therefore, a number of different nucleotide sequences may code for a given amino acid sequence.
- nucleotide sequences are considered functionally equivalent since they can result in the production of the same amino acid sequence in all organisms, although certain strains may translate some sequences more efficiently than they do others. Occasionally, a methylated variant of a purine or pyrimidine may be found in a given nucleotide sequence. Such methylations do not affect the coding relationship in any way.
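- The redundancy described above can be made concrete by counting how many distinct DNA sequences encode a given peptide. The sketch below builds the standard genetic code in the conventional T, C, A, G ordering; the names `CODON_TABLE`, `CODONS_FOR`, and `count_encodings` are illustrative assumptions and do not come from Table 2.

```python
from itertools import product
from math import prod

BASES = "TCAG"
# Standard genetic code; codons are ordered TTT, TTC, TTA, TTG, TCT, ...
# and "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each DNA codon (5' to 3') to its one-letter amino acid.
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Invert the table: amino acid -> list of equivalent codons.
CODONS_FOR = {}
for codon, aa in CODON_TABLE.items():
    CODONS_FOR.setdefault(aa, []).append(codon)


def count_encodings(peptide: str) -> int:
    """Number of distinct DNA sequences that encode the peptide."""
    return prod(len(CODONS_FOR[aa]) for aa in peptide.upper())


if __name__ == "__main__":
    for pep in ("MW", "SG", "DESGE"):
        print(pep, count_encodings(pep))
```

- Methionine and tryptophan each have a single codon, whereas leucine, serine, and arginine each have six, so the count grows quickly with peptide length.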
- the equivalent codons are shown in Table 2 below. TABLE 2
- Each 3-letter triplet represents a trinucleotide of DNA having a 5' end on the left and a 3' end on the right.
- the letters stand for the purine or pyrimidine bases forming the nucleotide sequence.
- T is thymine.
- Since the DNA sequence of the gene has been fully identified, it is possible to produce a DNA gene entirely by synthetic chemistry, after which the gene can be inserted into any of the many available DNA vectors using known techniques of recombinant DNA technology. Thus the present invention can be carried out using reagents, plasmids, and microorganisms that are freely available and in the public domain at the time of filing of this patent application. For example, nucleotide sequences greater than
- oligonucleotides can readily be spliced using, among others, the techniques described later in this application to produce any nucleotide sequence described herein. For example, relatively short complementary oligonucleotide sequences with 3' or 5' segments that extend beyond the complementary sequences can be synthesized.
- proteins that lack the amino terminus first 17 amino acids are preferred since the first 17 amino acids appear to represent a signal sequence.
- additional amino acids can be absent from either or both terminals of the sequence given without losing ability to act as a core protein for synthesis of proteoglycans.
- up to 10 additional amino acids can be present at either or both terminals.
- preferred compounds are those which more closely approach the specific formulas given (or the corresponding sequence that lacks a signal sequence) with 10 or fewer, more preferably 5 or fewer, absent amino acids being preferred for either terminal and 7 or fewer, more preferably 4 or fewer, additional amino acids being preferred for either terminal.
- Whether a change results in a functioning peptide can readily be determined by assessing the ability of the corresponding DNA coding for this peptide to produce this peptide in glycosylated form when introduced into eukaryotic cells. Examples of this process are described later in detail. If attachment of glycosaminoglycan chains occurs, the replacement is immaterial, and the molecule being tested is equivalent to those specifically described above. Peptides in which more than one replacement has taken place can readily be tested in the same manner. The number of replacements is not strictly limited, but 10 or fewer are preferred.
- DNA molecules that code for such peptides can readily be determined from the list of codons in Table 2 and are likewise contemplated as being equivalent to the DNA sequence of Table 1.
- any discussion in this application of a replacement or other change in a peptide is equally applicable to the corresponding DNA sequence or to the DNA molecule, recombinant vector, transformed microorganism, or transfected eukaryotic cells in which the sequence is located (and vice versa). Codons can be chosen for use in a particular host organism in accordance with the frequency with which a particular codon is utilized by that host, if desired, to increase the rate at which expression of the peptide occurs.
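- Choosing codons according to host usage can be sketched as a simple back-translation that picks the most frequently used codon for each amino acid. The usage table below is HYPOTHETICAL (invented numbers for an imaginary host, covering only four amino acids), and the names `HOST_USAGE` and `back_translate` are assumptions made for illustration.

```python
# HYPOTHETICAL relative codon usage for an imaginary host organism.
# These fractions are illustrative only and are not measured data.
HOST_USAGE = {
    "S": {"TCT": 0.10, "TCC": 0.15, "TCA": 0.10,
          "TCG": 0.15, "AGT": 0.15, "AGC": 0.35},
    "G": {"GGT": 0.30, "GGC": 0.40, "GGA": 0.10, "GGG": 0.20},
    "D": {"GAT": 0.60, "GAC": 0.40},
    "E": {"GAA": 0.70, "GAG": 0.30},
}


def back_translate(peptide: str, usage: dict) -> str:
    """Encode a peptide using the host's most frequent codon per residue."""
    codons = []
    for aa in peptide.upper():
        table = usage[aa]                       # KeyError if residue unknown
        codons.append(max(table, key=table.get))
    return "".join(codons)


if __name__ == "__main__":
    print(back_translate("DESGE", HOST_USAGE))
```

- A real application would substitute a measured usage table for the intended host; always taking the single most frequent codon is only one possible selection rule.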
- DNA (or corresponding RNA) molecules of the invention can have additional nucleotides preceding or following those that are specifically listed.
- poly A can be added to the 3'-terminal
- a short (e.g., fewer than 20 nucleotides) sequence can be added to either terminal to provide a terminal sequence corresponding to a restriction endonuclease site, stop codons can follow the peptide sequence to terminate translation, and the like.
- DNA molecules containing a promoter region or other control region upstream from the gene can be produced.
- RNA molecules are said to correspond to DNA molecules if they encode the same amino acids and/or control sequences.
- Peptides of the invention can be prepared for the first time as purified preparations, either by direct synthesis or by using a cloned gene as described herein.
- purified is meant, when referring to a peptide or DNA or RNA sequence, that the indicated molecule is present in the substantial absence of other biological macromolecules of the same type.
- purified as used herein preferably means at least 95% by weight, more preferably at least 99% by weight, and most preferably at least 99.8% by weight, of biological macromolecules of the same type present (but water, buffers, and other small molecules, especially molecules having a molecular weight of less than 1000, can be present).
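- The numerical limits above amount to a weight fraction computed over macromolecules of the same type, with water, buffers, and other small molecules left out of the denominator. A minimal sketch follows; the function names and the example masses are assumptions chosen for illustration.

```python
def purity_percent(target_mg: float, contaminant_mg: list) -> float:
    """Percent by weight of the target among macromolecules of the same type.

    `contaminant_mg` lists the masses of other macromolecules of the same
    type only; water, buffer salts and other small molecules are excluded.
    """
    total = target_mg + sum(contaminant_mg)
    if total <= 0:
        raise ValueError("total macromolecule mass must be positive")
    return 100.0 * target_mg / total


def purity_grade(percent: float) -> str:
    """Classify a purity value against the limits stated in the text."""
    if percent >= 99.8:
        return "most preferred (>= 99.8%)"
    if percent >= 99.0:
        return "more preferred (>= 99%)"
    if percent >= 95.0:
        return "preferred (>= 95%)"
    return "below the preferred limits"


if __name__ == "__main__":
    p = purity_percent(99.0, [0.5, 0.5])  # hypothetical masses in mg
    print(round(p, 2), purity_grade(p))
```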
- the term “pure” as used herein preferably has the same numerical limits as “purified” immediately above.
- isolated refers to a peptide, DNA, or RNA molecule separated not only from other peptides, DNAs, or RNAs, respectively, that are present in the natural source of the macromolecule but also from other macromolecules and preferably refers to a macromolecule found in the presence of (if anything) only a solvent, buffer, ion or other low molecular weight component normally present in a solution of the same. "Isolated" and "purified" do not encompass either natural materials in their native state or natural materials that have been separated into components (e.g., in an acrylamide gel) but not obtained either as pure substances or as solutions.
- Two protein sequences are homologous (as this term is preferably used in this specification) if they have an alignment score of >5 (in standard deviation units) using the program ALIGN with the mutation data matrix and a gap penalty of 6 (or greater). See Dayhoff, M.O., in Atlas of Protein Sequence and Structure, 1972, volume 5, National Biomedical Research Foundation, pp. 101-110, and Supplement 2 to this volume, pp. 1-10.
- the two sequences (or parts thereof, probably at least 30 amino acids in length) are more preferably homologous if their amino acids are greater than or equal to 50% identical when optimally aligned using the ALIGN program mentioned above.
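- The 50% identity criterion can be illustrated once an alignment is in hand. The sketch below does not reimplement the ALIGN program or its scoring; it assumes two sequences that have already been aligned (gaps written as "-") and, as a further assumption, counts identity only over columns where both sequences have a residue. The function names are illustrative.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identical residues over columns where neither row is a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b)
             if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)


def meets_preferred_homology(aligned_a: str, aligned_b: str,
                             min_length: int = 30,
                             min_identity: float = 50.0) -> bool:
    """Apply the >= 50% identity test over at least 30 compared residues."""
    compared = sum(1 for a, b in zip(aligned_a, aligned_b)
                   if a != "-" and b != "-")
    return (compared >= min_length
            and percent_identity(aligned_a, aligned_b) >= min_identity)


if __name__ == "__main__":
    a = "A" * 30
    b = "A" * 15 + "C" * 15  # hypothetical aligned partner, 50% identical
    print(percent_identity(a, b), meets_preferred_homology(a, b))
```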
- Two DNA sequences are homologous if they hybridize to one another using nitrocellulose filter hybridization (one sequence bound to the filter, the other as a ³²P-labeled probe) using hybridization conditions of 40-50% formamide, 37°-42° C, 4x SSC and wash conditions (after several room temperature washes with 2x SSC, 0.05% SDS) of stringency equivalent to 37° C with 1x SSC, 0.05% SDS.
- replacement by or replacement does not necessarily refer to any action that must take place but to the peptide that exists when an indicated “replacement” amino acid is present in the same position as the amino acid indicated to be present in a different formula (e.g., when leucine is present at position 5 instead of isoleucine). Salts of any of the macromolecules described herein will naturally occur when such molecules are present in (or isolated from) aqueous solutions of various pHs. All salts of peptides and other macromolecules having the indicated biological activity are considered to be within the scope of the present invention.
- Examples include alkali, alkaline earth, and other metal salts of carboxylic acid residues, acid addition salts (e.g., HCl) of amino residues, and zwitterions formed by reactions between carboxylic acid and amino residues within the same molecule.
- Hydrophobic and hydrophilic regions can be determined by standard procedures from amino acid sequences, for example by plotting hydrophobicity according to the procedure of Kyte and Doolittle, J. Mol. Biol. (1982) 157: 105-132. Plotted values averaged over groups of seven contiguous residues that are positive indicate hydrophobic regions, while negative values indicate hydrophilic regions.
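- The seven-residue averaging described above can be sketched as a sliding-window mean over the Kyte and Doolittle hydropathy values. The scale values are those of the published index; the function name `hydropathy_profile` and the test peptide are assumptions made for illustration.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence: str, window: int = 7) -> list:
    """Mean hydropathy of each run of `window` contiguous residues.

    Positive means suggest a hydrophobic stretch, negative means a
    hydrophilic one. The list has len(sequence) - window + 1 entries.
    """
    scores = [KYTE_DOOLITTLE[aa] for aa in sequence.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


if __name__ == "__main__":
    # HYPOTHETICAL peptide: a charged stretch followed by an apolar stretch.
    peptide = "DEKRDEKLLIVALLIV"
    for start, value in enumerate(hydropathy_profile(peptide), start=1):
        label = "hydrophobic" if value > 0 else "hydrophilic"
        print(f"{start:2d}-{start + 6:2d}  {value:6.2f}  {label}")
```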
- the invention has specifically contemplated each and every possible variation of peptide or nucleotide that could be made by selecting combinations based on the possible amino acid and codon choices listed in Table 1 and Table 2, and all such variations are to be considered as being specifically disclosed.
- genetic information encoded as mRNA is obtained from cultured epithelial cells, preferably from mammalian sources, and used in the construction of a DNA gene, which is in turn used to produce a peptide of the invention.
- An initial crude cell suspension is sonicated or otherwise treated to disrupt cell membranes so that a crude cell extract is obtained.
- Known techniques of biochemistry e.g., preferential precipitation of proteins
- the crude cell extract, or a partially purified RNA portion therefrom, is then treated to further separate the RNA.
- crude cell extract can be layered on top of a 5 ml cushion of 5.7 M CsCl, 10 mM Tris-HCl, pH 7.5, 1 mM EDTA in a 1 in. _ 3
- SW27 rotor Beckman Instruments Corp., Fullerton, Calif.
- RNA is precipitated from the aqueous phase with ethanol in the presence of 0.2 M Na-acetate pH 5.5 and collected by centrifugation. Any other method of isolating RNA from a cellular source may be used instead of this method.
- RNA may be employed such as polyadenylated, crude or partially purified messenger RNA, which may be heterogeneous in sequence and in molecular size.
- the selectivity of the RNA isolation procedure is enhanced by any method which results in an enrichment of the desired mRNA in the heterodisperse population of mRNA isolated. Any such prepurification method may be employed in preparing a gene of the present invention, provided that the method does not introduce endonucleolytic cleavage of the mRNA.
- Prepurification to enrich for desired mRNA sequences may also be carried out using conventional methods for fractionating RNA, after its isolation from the cell. Any technique which does not result in degradation of the RNA may be employed. The techniques of preparative sedimentation in a sucrose gradient and gel electrophoresis are especially suitable.
- the mRNA must be isolated from the source cells under conditions which preclude degradation of the mRNA.
- the action of RNase enzymes is particularly to be avoided because these enzymes are capable of hydrolytic cleavage of the RNA nucleotide sequence.
- a suitable method for inhibiting RNase during extraction from cells involves the use of 4 M guanidinium thiocyanate and 1 M mercaptoethanol during the cell disruption step.
- a low temperature and a pH near 5.0 are helpful in further reducing RNase degradation of the isolated RNA.
- mRNA is prepared essentially free of contaminating protein, DNA, polysaccharides and lipids. Standard methods are well known in the art for accomplishing such purification. RNA thus isolated contains non-messenger as well as messenger RNA.
- a convenient method for separating the mRNA of eukaryotes is chromatography on columns of oligo-dT cellulose, or other oligonucleotide-substituted column material such as poly-U or poly-T Sepharose, taking advantage of the hydrogen bonding specificity conferred by the presence of polyadenylic acid on the 3' end of eukaryotic mRNA. Hybridization with oligonucleotide probes prepared from DNA sequences set forth in this specification can then be used to isolate the particularly desired mRNA.
- the next step in most methods is the formation of DNA complementary to the isolated heterogeneous sequences of mRNA.
- the enzyme of choice for this reaction is reverse transcriptase, although in principle any enzyme capable of forming a faithful complementary DNA copy of the mRNA template could be used.
- the reaction may be carried out under conditions described in the prior art, using mRNA as a template and a mixture of the four deoxynucleoside triphosphates, dATP, dGTP, dCTP, and dTTP, as precursors for the DNA strand.
- one of the deoxynucleoside triphosphates is preferably labeled with a radioisotope, for example ³²P in the alpha position, in order to monitor the course of the reaction, to provide a tag for recovering the product after separation procedures such as chromatography and electrophoresis, and for the purpose of making quantitative estimates of recovery.
- the cDNA transcripts produced by the reverse transcriptase reaction are somewhat heterogeneous with respect to sequences at the 5' end and the 3' end due to variations in the initiation and termination points of individual transcripts, relative to the mRNA template.
- the variability at the 5' end is thought to be due to the fact that the oligo-dT primer used to initiate synthesis is capable of binding at a variety of loci along the polyadenylated region of the mRNA.
- Synthesis of the cDNA transcript begins at an indeterminate point in the poly-A region, and a variable length of the poly-A region is transcribed depending on the initial binding site of the oligo-dT primer. It is possible to avoid this indeterminacy by the use of a primer containing, in addition to an oligo-dT tract, one or two nucleotides of the RNA sequence itself, thereby producing a primer which will have a preferred and defined binding site for initiating the transcription reaction.
- the indeterminacy at the 3'-end of the cDNA transcript is due to a variety of factors affecting the reverse transcriptase reaction, and to the possibility of partial degradation of the RNA template.
- the isolation of specific cDNA transcripts of maximal length is greatly facilitated if conditions for the reverse transcriptase reaction are chosen which not only favor full length synthesis but also repress the synthesis of small DNA chains.
- Preferred reaction conditions for avian myeloblastosis virus reverse transcriptase are given in the examples section of U.S. Patent 4,363,877 and are herein incorporated by reference.
- the specific parameters which may be varied to provide maximal production of long-chain DNA transcripts of high fidelity are reaction temperature, salt concentration, amount of enzyme, concentration of primer relative to template, and reaction time.
- the conditions of temperature and salt concentration are chosen so as to optimize specific base-pairing between the oligo-dT primer and the polyadenylated portion of the RNA template. Under properly chosen conditions, the primer will be able to bind at the polyadenylated region of the RNA template, but non-specific initiation due to primer binding at other locations on the template, such as short, A-rich sequences, will be substantially prevented.
- the effects of temperature and salt are interdependent. Higher temperatures and low salt concentrations decrease the stability of specific base-pairing interactions.
- reaction time is kept as short as possible, in order to prevent non-specific initiations and to minimize the opportunity for degradation. Reaction times are interrelated with temperature, lower temperatures requiring longer reaction times. At 42°C, reactions ranging from 1 minute to 10 minutes are suitable.
- the primer should be present in 50- to 500-fold molar excess over the RNA template and the enzyme should be present in similar molar excess over the RNA template. The use of excess enzyme and primer enhances initiation and cDNA chain growth so that long-chain cDNA transcripts are produced efficiently within the confines of the short incubation times.
- the cDNA prepared as described above may be used as a template for the synthesis of double-stranded DNA, using a DNA polymerase such as reverse transcriptase and a nuclease capable of hydrolyzing single-stranded DNA.
- the cDNA can be purified further by the process of U.S. Patent 4,363,877, although this is not essential.
- heterogeneous cDNA prepared by transcription of heterogeneous mRNA sequences, is treated with one or two restriction endonucleases.
- the choice of endonuclease to be used depends in the first instance upon a prior determination that recognition sites for the enzyme exist in the sequence of the cDNA to be isolated. The method depends upon the existence of two such sites. If the sites are identical, a single enzyme will be sufficient.
- the desired sequence will be cleaved at both sites, eliminating size heterogeneity as far as the desired cDNA sequence is concerned, and creating a population of molecules, termed fragments, containing the desired sequence and homogeneous in length. If the restriction sites are different, two enzymes will be required in order to produce the desired homogeneous length fragments.
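- The way two restriction sites remove size heterogeneity can be illustrated in silico. The sketch below assumes the HaeIII recognition sequence GGCC, cut between GG and CC, and uses HYPOTHETICAL transcripts that share one internal sequence but differ in the length of their flanking ends; none of these sequences is taken from the syndecan cDNA, and the function name `digest` is an illustrative assumption.

```python
def digest(sequence: str, site: str, cut_offset: int) -> list:
    """Cut `sequence` at every occurrence of `site`.

    `cut_offset` is the number of bases of the site that stay on the
    left-hand fragment (2 for a GG^CC cut). Returns fragments in order.
    """
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    bounds = [0] + cuts + [len(sequence)]
    return [sequence[a:b] for a, b in zip(bounds, bounds[1:])]


# HYPOTHETICAL desired sequence flanked by two identical sites.
MIDDLE = "ATGACTGATGAATCAGGAGAA"
CORE = "GGCC" + MIDDLE + "GGCC"

# Transcripts that are heterogeneous in length at both ends.
transcripts = ["TA" * n + CORE + "AT" * m for n, m in [(3, 5), (7, 2), (1, 9)]]

if __name__ == "__main__":
    for t in transcripts:
        left, internal, right = digest(t, "GGCC", 2)
        print(len(t), "->", len(left), len(internal), len(right))
```

- The flanking fragments differ in length from transcript to transcript, while the internal fragment has the same length in every case, which is why it appears as a discrete band.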
- the choice of restriction enzyme(s) capable of producing an optimal length nucleotide sequence fragment coding for all or part of the desired protein must be made empirically. If the amino acid sequence of the desired protein is known, it is possible to compare the nucleotide sequence of uniform length nucleotide fragments produced by restriction endonuclease cleavage with the amino acid sequence for which it codes, using the known relationship of the genetic code common to all forms of life. A complete amino acid sequence for the desired protein is not necessary, however, since a reasonably accurate identification may be made on the basis of a partial sequence.
- the uniform length polynucleotides produced by restriction endonuclease cleavage may be used as probes capable of identifying the synthesis of the desired protein in an appropriate in vitro protein synthesizing system.
- the mRNA may be purified by affinity chromatography. Other techniques which may be suggested to those skilled in the art will be appropriate for this purpose.
- the choice of restriction enzymes suitable for use depends upon whether single-stranded or double-stranded cDNA is used.
- the preferred enzymes are those capable of acting on single-stranded DNA, which is the immediate reaction product of mRNA reverse transcription.
- the number of restriction enzymes now known to be capable of acting on single-stranded DNA is limited.
- the enzymes HaeIII, HhaI and Hin(f)I are presently known to be suitable.
- the enzyme MboII may act on single-stranded DNA.
- additional suitable enzymes include those specified for double-stranded cDNA.
- double-stranded cDNA presents the additional technical disadvantages that subsequent sequence analysis is more complex and laborious. For these reasons, single-stranded cDNA is preferred, but the use of double-stranded DNA is feasible. In fact, the present invention was initially reduced to practice using double-stranded cDNA.
- the cDNA prepared for restriction endonuclease treatment may be radioactively labeled so that it may be detected after subsequent separation steps.
- a preferred technique is to incorporate a radioactive label such as ³²P in the alpha position of one of the four deoxynucleoside triphosphate precursors. Highest activity is obtained when the concentration of radioactive precursor is high relative to the concentration of the non-radioactive form. However, the total concentration of any deoxynucleoside triphosphate should be greater than 30 µM, in order to maximize the length of cDNA obtained in the reverse transcriptase reaction.
- Fragments which have been produced by the action of a restriction enzyme or combination of two restriction enzymes may be separated from each other and from heterodisperse sequences lacking recognition sites by any appropriate technique capable of separating polynucleotides on the basis of differences in length.
- Such methods include a variety of electrophoretic techniques and sedimentation techniques using an ultracentrifuge.
- Gel electrophoresis is preferred because it provides the best resolution on the basis of polynucleotide length.
- the method readily permits quantitative recovery of separated materials. Convenient gel electrophoresis methods have been described by Dingman, C.W., and Peacock, A.C., Biochemistry (1968) 1_: 659 , and by Maniatis, T., Jeffrey, A. and van de Sande, H., Biochemistry (1975) 1 :3787.
- cDNA transcripts obtained from most sources will be found to be heterodisperse in length.
- polynucleotide chains containing the desired sequence will be cleaved at the respective restriction sites to yield polynucleotide fragments of uniform length.
- Upon gel electrophoresis, these polynucleotide fragments of uniform length will be observed to form a distinct band.
- other discrete bands may be formed as well, which will most likely be of different length than that of the desired sequence.
- the gel electrophoresis pattern will reveal the appearance of one or more discrete bands, while the remainder of the cDNA will continue to be heterodisperse.
- the electrophoresis pattern will reveal that most of the cDNA is present in the discrete band.
- Sequence analysis of the electrophoresis band may be used to detect impurities representing 10% or more of the material in the band.
- a method for detecting lower levels of impurities has been developed founded upon the same general principles applied in the initial isolation method. The method requires that the desired nucleotide sequence fragment contain a recognition site for a restriction endonuclease not employed in the initial isolation.
- the amount of material present in any band of radioactively labeled polynucleotide can be determined by quantitative measurement of the amount of radioactivity present in each band, or by any other appropriate method.
- a quantitative measure of the purity of the fragments of desired sequence can be obtained by comparing the relative amounts of material present in those bands representing sub-fragments of the desired sequence with the total amount of material.
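- The purity estimate described above reduces to a ratio of radioactivity in the sub-fragment bands to total radioactivity in the lane. A minimal sketch follows, with HYPOTHETICAL counts and an illustrative function name; it assumes the label is distributed uniformly enough that counts are proportional to the amount of material.

```python
def band_purity(counts_by_band: dict, subfragment_bands: list) -> float:
    """Fraction of total radioactivity found in the expected sub-fragments.

    `counts_by_band` maps a band label to its measured counts (e.g. cpm);
    `subfragment_bands` names the bands expected from the desired sequence.
    """
    total = sum(counts_by_band.values())
    if total <= 0:
        raise ValueError("no radioactivity measured")
    expected = sum(counts_by_band[name] for name in subfragment_bands)
    return expected / total


if __name__ == "__main__":
    # HYPOTHETICAL counts after the second restriction digest.
    counts = {"sub-fragment A": 600, "sub-fragment B": 300, "uncut": 100}
    purity = band_purity(counts, ["sub-fragment A", "sub-fragment B"])
    print(f"{purity:.0%} of the material carries the second site")
```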
- DNA ligase which catalyzes the end-to-end joining of DNA fragments
- the gel electrophoresis bands representing the sub-fragments of the desired sequence may be separately eluted and combined in the presence of DNA ligase, under the appropriate conditions. See Sgaramella, V., Van de Sande, J.H., and Khorana, H.G., Proc. Natl. Acad. Sci. USA (1970) 67:1468. Where the sequences to be joined are not blunt-ended, the ligase obtained from E. coli may be used; Modrich, P., and Lehman, I.R., J. Biol. Chem. (1970) 245:3626.
- the efficiency of reconstituting the original sequence from sub-fragments produced by restriction endonuclease treatment will be greatly enhanced by the use of a method for preventing reconstitution in improper sequence.
- This unwanted result is prevented by treatment of the homogeneous length cDNA fragment of desired sequence with an agent capable of removing the 5'-terminal phosphate groups on the cDNA prior to cleavage of the homogeneous cDNA with a restriction endonuclease.
- the enzyme alkaline phosphatase is preferred.
- the 5'-terminal phosphate groups are a structural prerequisite for the subsequent joining action of DNA ligase used for reconstituting the cleaved sub-fragments.
- ends which lack a 5'-terminal phosphate cannot be covalently joined.
- the DNA sub-fragments can only be joined at the ends containing a 5'-phosphate generated by the restriction endonuclease cleavage performed on the isolated DNA fragment.
- cDNA transcripts under the conditions described above, are derived from the mRNA region containing the 5'-end of the mRNA template by specifically priming on the same template with a fragment obtained by restriction endonuclease cleavage.
- the above-described method may be used to obtain not only fragments of specific nucleotide sequence related to a desired protein, but also the entire nucleotide sequence coding for the protein of interest.
- Double-stranded, chemically synthesized oligonucleotide linkers, containing the recognition sequence for a restriction endonuclease may be attached to the ends of the isolated cDNA, to facilitate subsequent enzymatic removal of the gene portion from the vector DNA.
- the vector DNA is converted from a continuous loop to a linear form by treatment with an appropriate restriction endonuclease.
- the ends thereby formed are treated with alkaline phosphatase to remove 5'-phosphate end groups so that the vector DNA may not reform a continuous loop in a DNA ligase reaction without first incorporating a segment of the syndecan DNA.
- the cDNA, with attached linker oligonucleotides, and the treated vector DNA are mixed together with a DNA ligase enzyme, to join the cDNA to the vector DNA, forming a continuous loop of recombinant vector DNA, having the cDNA incorporated therein.
- the closed loop will be the only form able to transform a bacterium. Transformation, as is understood in the art and used herein, is the term used to denote the process whereby a microorganism incorporates extracellular DNA and reproduces it stably from generation to generation. Plasmid DNA in the form of a closed loop may be so incorporated under appropriate environmental conditions. The incorporated closed loop plasmid undergoes replication in the transformed cell, and the replicated copies are distributed to progeny cells when cell division occurs. As a result, a new cell line is established, containing the plasmid and carrying the genetic determinants thereof.
- Transformation by a plasmid in this manner occurs at high frequency when the transforming plasmid DNA is in closed loop form, and does not or rarely occurs if linear plasmid DNA is used.
- cDNA clones encoding the syndecan polypeptide from a normal mouse mammary gland epithelial cell line as well as mouse liver tissue.
- the cDNA derived protein sequence of syndecan is unique; comparisons with the National Biomedical Research Foundation and the translated NIH-Genebank databases detected no statistically significant similarities.
- the nascent polypeptide sequence is 311 amino acids and has a molecular mass of 32,868 daltons.
- Treatment of syndecan with heparatinase and chondroitinase ABC generates a protein with relative mobility of ca. 69k daltons versus globular molecular weight markers on a gradient SDS-PAGE system.
- This anomaly appears to be a charge effect and has been seen in other proteins rich in proline, alanine, and highly charged amino acids.
- Syndecan is not a disulfide cross-linked dimer. Its migration on SDS-PAGE is unchanged following DTT treatment; its CNBr-cleavage product produces a single signal during amino acid sequencing; and its single cysteine in the predicted mature protein is located in the putative transmembrane domain. It also does not appear to be cross-linked by lysyl oxidase- or transglutaminase-mediated reactions because β-aminopropionitrile and monodansylcadaverine treatments of NMuMG cells do not change its mobility on SDS-PAGE.
- Proteins with regions rich in proline, alanine and highly charged amino acids have highly extended conformations and anomalously slow mobilities in SDS-PAGE; see Guest, J.R., Lewis, H.M., Graham, L.D., Packman, L.C., and Perham, R.N., J. Mol. Biol. (1985) 185: 743-754. These amino acids are abundant in syndecan, and a Chou and Fasman secondary structure prediction is consistent with large regions of extended conformation.
- In vitro translation of synthetic mRNA corresponding to the coding region of syndecan (SacI-HindIII fragment of clone 4-19b) produces a nascent polypeptide of ca. 45k daltons.
- the amino acid sequence derived from the syndecan cDNA shows three functional domains; an extracellular domain and, by inference, transmembrane and cytoplasmic domains.
- the transmembrane domain was inferred from the physical properties of syndecan.
- the derived C-terminal sequence of syndecan contains both a characteristic transmembrane domain (amino acids 253 to 277 in Table 1) and a 34 amino acid putative cytoplasmic domain.
- the cytoplasmic domain was inferred from properties already known for purified syndecan indicating that syndecan associates with the actin cytoskeleton.
- An immune serum generated against a synthetic peptide from the C-terminus of the derived protein sequence reacts with native syndecan extracted from NMuMG cells but not with the ectodomain, providing direct evidence for the cytoplasmic domain.
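- The residue numbers given above are mutually consistent, which can be checked with a short calculation. The sketch below uses only figures stated in this description (311 residues, a 17-residue apparent signal sequence, transmembrane residues 253 to 277); treating everything between the signal sequence and the transmembrane domain as extracellular is an assumption made here for illustration, as are the function names.

```python
TOTAL_LENGTH = 311            # nascent polypeptide length
SIGNAL_LENGTH = 17            # apparent signal sequence
TM_START, TM_END = 253, 277   # transmembrane residues in Table 1


def domain_map(total=TOTAL_LENGTH, signal=SIGNAL_LENGTH,
               tm_start=TM_START, tm_end=TM_END) -> dict:
    """Return 1-based inclusive residue ranges for each region."""
    return {
        "signal sequence": (1, signal),
        # Assumption: the whole span up to the membrane is extracellular.
        "extracellular domain": (signal + 1, tm_start - 1),
        "transmembrane domain": (tm_start, tm_end),
        "cytoplasmic domain": (tm_end + 1, total),
    }


def domain_lengths(domains: dict) -> dict:
    """Length in residues of each region."""
    return {name: end - start + 1 for name, (start, end) in domains.items()}


if __name__ == "__main__":
    for name, length in domain_lengths(domain_map()).items():
        print(f"{name:22s} {length:3d} residues")
```

- With these inputs the cytoplasmic region spans residues 278 to 311, which is the 34 amino acids stated in the text.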
- the ectodomain of syndecan is released from NMuMG cell surfaces during cell culture, rapidly in response to cell rounding, or by mild trypsin treatment.
- the putative extracellular domain of syndecan contains a single dibasic site near the plasma membrane at which cleavage of syndecan from the cell surface undoubtedly occurs. Because the endogenously shed ectodomain of syndecan is indistinguishable from the trypsin-released form, a cell surface trypsin-like protease has been proposed. Shedding during cell culture is from the apical surface. However, when these cells are released from the substratum, destroying their polarity, the ectodomain is rapidly shed. These previously known results suggest that a cell surface protease is involved, but the structure of the site was not known. Identification of the putative cleavage site by the present invention will now allow more detailed investigation of this activity and will allow production of modified proteoglycans and other proteins that can be readily cleaved to release their extracellular regions for ready purification.
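- Locating a dibasic site next to the membrane can be sketched as a search for two adjacent basic residues (lysine or arginine) upstream of the transmembrane domain. The peptide below is HYPOTHETICAL and much shorter than syndecan; the function names and the choice to report the site nearest the membrane are assumptions made for illustration.

```python
import re

# Two adjacent basic residues; the lookahead lets overlapping pairs
# (e.g. the two pairs inside "KKK") each be reported.
DIBASIC = re.compile(r"(?=[KR][KR])")


def dibasic_sites(sequence: str) -> list:
    """1-based positions of the first residue of every dibasic pair."""
    return [m.start() + 1 for m in DIBASIC.finditer(sequence.upper())]


def extracellular_dibasic_site(sequence: str, tm_start: int):
    """Dibasic pair lying wholly before `tm_start`, nearest the membrane.

    Returns the 1-based position of the pair's first residue, or None.
    """
    candidates = [p for p in dibasic_sites(sequence) if p + 1 < tm_start]
    return max(candidates) if candidates else None


# HYPOTHETICAL toy protein: ectodomain, transmembrane stretch, cytoplasmic tail.
ECTO = "ADESGEPTARKE"
TM = "VLLGGVIAGLLVGLIFAVCL"
CYTO = "RMKKKDEG"
TOY = ECTO + TM + CYTO

if __name__ == "__main__":
    tm_start = len(ECTO) + 1
    print("all dibasic pairs:", dibasic_sites(TOY))
    print("pair adjacent to membrane:", extracellular_dibasic_site(TOY, tm_start))
```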
- Syndecan isolated from several sources is a hybrid proteoglycan, containing both chondroitin sulfate and heparan sulfate. These chains are known to be linked via a xyloside to serine residues in proteins, Roden, L., The Biochemistry of Glycoproteins and Proteoglycans (1980) 267-371 and Dorfman, A., Cell Biology of Extracellular Matrix (1981) 115-138. Regulating the elaboration of both chondroitin sulfate and heparan sulfate chains on the same core protein is a significant problem because the initial four saccharides are identical.
- Specific chain elongation subsequently involves the sequential action of an N-acetylgalactosaminyltransfer- ase and a glucuronosyltransferse for chondroitin sulfate, and an N-acetylglucosaminyltransferase and a glucuronosyltransferase for heparan sulfate.
- This specific chain elongation must involve recognition of unique structural features of the core protein, indicating that distinct peptide sequences might exist at chondroitin sulfate versus heparan sulfate attachment sites.
- chondroitin sulfate and heparan sulfate on syndecan provides the opportunity to assess the relationship between these attachment sites.
- Syndecan contains three potential Ser-Gly glycosaminoglycan attachment sites that contain some features of this consensus acceptor sequence but also contain unique features (Figure 3B). Though each of these three sequences retains an acidic amino acid two residues N-terminal to the acceptor Ser-Gly, they lack the consensus glycine that is two residues C-terminal to the Ser-Gly. This omission does not preclude this sequence from serving as a xylosyltransferase acceptor because it is also omitted from the Gly-Ser site of type IX collagen, Huber, S., Winterhalter, K.H., and Vaughan, L., J. Biol. Chem. (1988) 263: 752-756.
- An artificial peptide containing a heparan sulfate elongation site of the formula Xac-Xaa-Ser-Gly-Xac, where Xac is an acidic amino acid (aspartate or glutamate) and Xaa is any amino acid, can be prepared and used to produce heparan sulfate in eukaryotic cells as described herein.
- the artificial peptide need not contain any of the remaining structure of the molecules described herein as long as it provides the indicated sequence at a location in the peptide that is available for glycosylation.
- Such locations can be predicted, such as by using the algorithms developed by Chou and Fasman, or by empirically inserting a DNA sequence encoding this amino acid sequence into a gene and determining that the product functions as a recognition sequence for the elongation of heparan sulfate chains.
- a simple artificial peptide might contain multiple copies of the recognition sequence either located directly adjacent to each other or being joined by from one to ten, preferably one to five, amino acids.
- Another preferred embodiment involves producing a known polypeptide by genetic engineering that has been engineered to contain the attachment site of the invention at a location known to reside on an external surface of the polypeptide.
- Although sequences from the natural syndecan amino acid sequences adjacent the Xac-Xaa-Ser-Gly-Xac sequences are not required, they may be retained if desired in order to produce a protein that more closely resembles syndecan. Accordingly, artificial peptides containing from 1 to 10, 20, 30, or even more naturally adjacent amino acids as shown in Table 1, located either C terminal or N terminal or both to the Xac-Xaa-Ser-Gly-Xac sequence, represent other viable embodiments of the invention. Proteins containing such longer sequences can be prepared in the same manner discussed above using corresponding longer DNA sequences encoding the desired region.
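The Xac-Xaa-Ser-Gly-Xac attachment motif and the multi-copy artificial peptides described above can be illustrated with a short sketch. It assumes one-letter amino acid codes (Xac = D or E); the function names and the example peptide are hypothetical and are not taken from the patent.

```python
import re

# Xac-Xaa-Ser-Gly-Xac in one-letter code: Xac = D or E, Xaa = any residue.
# The lookahead allows overlapping occurrences to be reported.
ATTACHMENT_MOTIF = re.compile(r"(?=([DE][A-Z]SG[DE]))")


def find_attachment_sites(peptide: str) -> list[tuple[int, str]]:
    """Return (1-based position of the acceptor Ser, matched 5-mer) pairs."""
    peptide = peptide.upper()
    return [(m.start() + 3, m.group(1)) for m in ATTACHMENT_MOTIF.finditer(peptide)]


def tandem_repeat_peptide(unit: str, copies: int, spacer: str = "") -> str:
    """Join copies of a recognition unit, optionally separated by 1-10 residues."""
    if spacer and not 1 <= len(spacer) <= 10:
        raise ValueError("spacer must be 1 to 10 residues long")
    return spacer.join([unit] * copies)


# Hypothetical peptide: three copies of D-G-S-G-E joined by two alanines.
demo = tandem_repeat_peptide("DGSGE", copies=3, spacer="AA")
print(demo)
print(find_attachment_sites(demo))
```

A scan of this kind only locates candidate sequences; whether a given site is actually available for glycosylation still has to be predicted or tested empirically, as the text notes.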
- the number of chondroitin sulfate chains on syndecan apparently differs in cells of distinct cellular organization and changes in response to TGF-β, implying that each potential glycosaminoglycan attachment site is not always utilized.
- a possible novel regulatory mechanism for this variation is suggested by the location in syndecan of its single potential N-linked glycosylation site, Asn-Phe-Ser, at residues 43-45. This site is located within the putative chondroitin sulfate attachment sequence, and the attachment of an N-linked sugar at this site would likely prevent subsequent recognition by the xylosyltransferase.
- syndecan is expressed mainly in epithelia.
- Northern blot analysis of mRNA revealed two mRNA species at 2.6 and 3.4 kb (constant ratio 3:1 respectively) in NMuMG cells as well as skin, liver, and midpregnant mammary gland, all containing immunoreactive syndecan.
- these two mRNAs were undetectable in cardiac and skeletal muscle, tissues of mesenchymal origin that do not stain with 281-2.
- primitive and embryonic mesenchymal cells also show the 2.6 and 3.4kb mRNA species.
- the first hydrophobic stretch consists of 12 amino acids beginning shortly after the presumptive start methionine. Because syndecan is oriented with its N-terminus outside of the plasma membrane, this appears to be a signal sequence. The N-terminus of mature syndecan is blocked, and, therefore, it has not been possible to determine the N-terminus directly. A likely site for signal peptidase cleavage is following amino acid residue 17 (Figure 1) in the predicted sequence. Cleavage at this site would generate an N-terminal glutamine which could readily cyclize forming a pyrrolidone carboxylyl residue and thus a blocked N-terminus, as exists in a number of eukaryotic proteins.
- the second hydrophobic stretch is a sequence near the C-terminus which has characteristics of a transmembrane domain (thick underline, Figure 1).
- This sequence is a highly hydrophobic stretch of 25 residues followed immediately by a series of highly charged residues, consistent with the stop transfer signals found following most membrane spanning domains.
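Hydrophobic stretches such as the signal sequence and the 25-residue transmembrane domain are commonly located with a sliding-window hydropathy profile. The sketch below assumes the Kyte-Doolittle scale, a 19-residue window, and a cutoff of 1.6; the source does not state which method was used, and the toy sequence is hypothetical.

```python
# Kyte-Doolittle hydropathy values (assumed scale; not named in the source).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq: str, window: int = 19) -> list[float]:
    """Mean hydropathy of every window of the given length."""
    scores = [KD[aa] for aa in seq.upper()]
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]


def hydrophobic_stretches(seq: str, window: int = 19,
                          threshold: float = 1.6) -> list[tuple[int, int]]:
    """Merge windows scoring at or above the threshold into 1-based spans."""
    spans: list[tuple[int, int]] = []
    for i, mean in enumerate(hydropathy_profile(seq, window)):
        if mean < threshold:
            continue
        start, end = i + 1, i + window  # 1-based, inclusive
        if spans and start <= spans[-1][1] + 1:
            spans[-1] = (spans[-1][0], end)  # extend the previous span
        else:
            spans.append((start, end))
    return spans


# Hypothetical toy protein: charged flank, 25 hydrophobic residues, charged tail.
toy = "DEKRDEKRDE" + "LIVALIVALIVALIVALIVALIVAL" + "RKKRDEDE"
print(hydrophobic_stretches(toy))
```

Because a window that only partly overlaps the hydrophobic core can still pass the cutoff, the reported span is typically somewhat wider than the true stretch; the boundaries should be read as approximate.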
- This domain also contains the only cysteine and one of the four tyrosines in the apparent mature protein sequence.
- the putative transmembrane domain defines two hydrophilic domains of the syndecan core protein, a putative extracellular domain consisting of approximately 235 amino acids, and a smaller putative cytoplasmic domain consisting of 34 amino acids.
- the putative cytoplasmic domain contains three tyrosine residues, but the sequences adjacent to these tyrosines are not similar to the presently identified consensus sequences for tyrosine phosphorylation, Hunter, T., and Cooper, J.A., Ann. Rev. Biochem. (1985) 54: 879-930.
- This domain presumably has protein binding activity because the intact proteoglycan but not the ectodomain co-sediments with F-actin, Rapraeger, A., and Bernfield, M., Extracellular Matrix (1982) 265-269, and because syndecan associates with the actin-containing cytoskeleton when cross-linked at the cell surface, Rapraeger, A., Jalkanen, M., and Bernfield, M., J. Cell Biol. (1986) 103: 2683-2696.
- the putative extracellular domain has several sequence characteristics that correspond with the known properties of this proteoglycan.
- the ectodomain of syndecan is shed by cleavage from its membrane anchor, Jalkanen, M., Rapraeger, A., Saunders, S., and Bernfield, M., J. Cell Biol. (1987) 105: 3087-3096, and an indistinguishable molecule is released from the cell surface by mild trypsin treatment, Jalkanen, M., Rapraeger, A., Saunders, S., and Bernfield, M., J. Cell Biol. (1987) 105: 3087-3096.
- the only dibasic sequence (Arg-Lys) in this extracellular domain is located adjacent to the putative transmembrane domain at residues 250-251 (identified in Figure 1 by arrows). This location places the cleavage site adjacent to the plasma membrane.
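Locating dibasic pairs relative to a transmembrane domain is a simple sequence scan. The following sketch assumes one-letter codes and treats any pair of Lys/Arg residues as dibasic; the helper names and the toy sequence are hypothetical and do not reproduce the syndecan sequence.

```python
import re

# Any two adjacent basic residues (Lys or Arg); lookahead reports overlaps.
DIBASIC = re.compile(r"(?=([KR][KR]))")


def dibasic_sites(seq: str) -> list[tuple[int, str]]:
    """Return (1-based position of the first basic residue, pair) tuples."""
    return [(m.start() + 1, m.group(1)) for m in DIBASIC.finditer(seq.upper())]


def nearest_site_before(seq: str, tm_start: int):
    """Dibasic pair closest to, and entirely N-terminal of, residue tm_start."""
    upstream = [site for site in dibasic_sites(seq) if site[0] + 1 < tm_start]
    return upstream[-1] if upstream else None


# Hypothetical toy protein: ectodomain, 10-residue membrane span, charged tail.
demo = "ASDGSGEATPRKEV" + "L" * 10 + "RMKKK"
print(dibasic_sites(demo))
print(nearest_site_before(demo, tm_start=15))
```

In this toy example the only extracellular pair sits three residues before the membrane span, mirroring the membrane-proximal placement described for Arg-Lys at residues 250-251.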
- the putative extracellular domain lacks cysteine, thus eliminating disulfide bridges as a means of generating secondary structure in this molecule.
- the ectodomain contains both heparan sulfate and chondroitin sulfate chains, Rapraeger, A., Jalkanen, M., Endo, E., Koda, J., and Bernfield, M., J. Biol. Chem. (1985b) 260: 11046-11052.
- the serine hydroxyl groups of Ser-Gly sequences are the attachment sites for these glycosaminoglycan chains, Roden, L., The Biochemistry of Glycoproteins and Proteoglycans 267-371 and Dorfman, A., Cell Biology of Extracellular Matrix 115-138.
- Syndecan possesses five such potential glycosaminoglycan attachment sites, all within the putative extracellular domain; three such serines are clustered near the N-terminus at residues 37, 45, and 47, and the remaining two are clustered near the membrane at residues 207 and 217 (open circles).
- the ectodomain from NMuMG cells is insensitive to digestion by N-glycosidase F, as assessed by PAGE, Weitzhandler, M., Streeter, H.B., Henzel, W.J., and Bernfield, M., J. Biol. Chem. (1988) 263: 6949-6952.
- the putative extracellular domain contains a single canonical sequence for the attachment of N-linked oligosaccharide (solid circle, Figure 1).
- the serine in this Asn-Xaa-Ser sequence is a putative glycosaminoglycan attachment site.
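The overlap between an N-linked sequon and a Ser-Gly acceptor can be checked mechanically. This sketch assumes the usual Asn-Xaa-Ser/Thr sequon definition with Xaa other than proline (an assumption beyond the source, which only mentions Asn-Xaa-Ser); the toy peptide is hypothetical and merely imitates the Asn-Phe-Ser-Gly-Ser-Gly arrangement.

```python
import re

SEQUON = re.compile(r"(?=(N[^P][ST]))")  # Asn-Xaa-Ser/Thr, Xaa != Pro (assumed)
SER_GLY = re.compile(r"(?=SG)")          # candidate glycosaminoglycan acceptor


def shared_serines(seq: str) -> list[int]:
    """1-based positions of serines that belong to both a sequon and a Ser-Gly."""
    seq = seq.upper()
    sequon_ser = {m.start() + 3 for m in SEQUON.finditer(seq)
                  if seq[m.start() + 2] == "S"}
    ser_gly = {m.start() + 1 for m in SER_GLY.finditer(seq)}
    return sorted(sequon_ser & ser_gly)


# Hypothetical peptide: N4-F5-S6 is a sequon, and S6-G7 and S8-G9 are Ser-Gly pairs.
print(shared_serines("AEDNFSGSGDA"))
```

A serine reported here is one where N-linked glycosylation of the neighbouring asparagine and glycosaminoglycan attachment could compete, which is the regulatory possibility the text raises.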
- syndecan or a molecule related to syndecan will be expressed when the DNA sequence encoding it is functionally inserted into a vector that is expressed in a eukaryotic cell containing an enzyme system capable of producing glycosaminoglycan chains.
- functionally inserted is meant in proper reading frame and orientation, as is well understood by those skilled in the art.
- Expression of syndecan can be enhanced by including multiple copies of the syndecan gene in a transformed or transfected host, by selecting a vector known to reproduce in the host, thereby producing large quantities of protein from exogenous inserted DNA, or by any other known means of enhancing peptide expression.
- U.S. Patent 4,419,450 discloses a plasmid useful as a cloning vehicle in recombinant DNA work.
- U.S. Patent 4,362,867 discloses recombinant cDNA construction methods and hybrid nucleotides produced thereby which are useful in cloning processes.
- U.S. Patent 4,403,036 discloses genetic reagents for generating plasmids containing multiple copies of DNA segments.
- U.S. Patent 4,363,877 discloses recombinant DNA transfer vectors.
- U.S. Patent 4,356,270 discloses a recombinant DNA cloning vehicle and is a particularly useful disclosure for those with limited experience in the area of genetic engineering since it defines many of the terms used in genetic engineering and the basic processes used therein.
- U.S. Patent 4,336,336 discloses a fused gene and a method of making the same.
- U.S. Patent 4,349,629 discloses plasmid vectors and the production and use thereof.
- U.S. Patent 4,332,901 discloses a cloning vector useful in recombinant DNA.
- Manipulation of the expression vectors will in some cases produce constructs which improve the expression of the polypeptide in eukaryotic cells or express syndecan in other hosts. Furthermore, by using the syndecan cDNA or a fragment thereof as a hybridization probe, structurally related genes found in other organisms can be easily cloned. These genes include those that code for related core proteins of proteoglycans from other species, especially mammals such as humans and other primates.
- oligonucleotide probes based on the principal and variant nucleotide sequences disclosed herein.
- Such probes can be considerably shorter than the entire sequence but should be at least 14, preferably at least 20, nucleotides in length. Longer oligonucleotides are also useful, up to 30, 40, 50, 75, or 100 nucleotides and further up to the full length of the gene. Both RNA and DNA probes can be used.
- Such probes can also be used in diagnostic tests that detect the presence of genetic material of a predetermined sequence in samples, e.g., as in a polymerase chain reaction (PCR).
- the probes are typically labelled in a detectable manner (e.g., with 32P, 3H, biotin, or avidin) and are incubated with single-stranded DNA or RNA from the organism in which a gene is being sought.
- Hybridization is detected by means of the label after single-stranded and double-stranded (hybridized) DNA (or DNA/RNA) have been separated (typically using nitrocellulose paper).
- Hybridization techniques suitable for use with oligonucleotides are well known.
- oligonucleotide refers to both labeled and unlabeled forms and not just to labeled probes.
- oligonucleotides corresponding to the segments of the gene that code for glycosaminoglycan attachment sites.
- an oligonucleotide with high probability of success in the identification of other gene products is the 64-fold degenerate oligonucleotide of the form GANGGNTCTGGNGA, where N represents presence of all four nucleotides in degenerate sequences.
- the complementary oligonucleotide having the degenerate sequence TCNCCAGANCCNTC is also particularly useful and has the added advantage of ability to identify messenger RNA of these gene products in Northern analysis.
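The 64-fold degeneracy and the complementary probe sequence can be verified with a few lines of code. This is an illustrative sketch: the function names are hypothetical, and the small codon table covers only the codons that this particular oligonucleotide can contain.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGTN", "TGCAN")

# Only the codons reachable from GAN, GGN and TCT (standard genetic code).
CODONS = {
    "GAT": "D", "GAC": "D", "GAA": "E", "GAG": "E",
    "GGT": "G", "GGC": "G", "GGA": "G", "GGG": "G",
    "TCT": "S",
}


def degeneracy(oligo: str) -> int:
    """Number of distinct sequences represented (each N is any of four bases)."""
    return 4 ** oligo.upper().count("N")


def expand(oligo: str) -> list[str]:
    """Enumerate every concrete sequence encoded by the degenerate oligo."""
    choices = ["ACGT" if base == "N" else base for base in oligo.upper()]
    return ["".join(p) for p in product(*choices)]


def reverse_complement(oligo: str) -> str:
    """Reverse complement, keeping N as N."""
    return oligo.upper().translate(COMPLEMENT)[::-1]


def encoded_peptides(oligo: str) -> list[str]:
    """Peptides encoded by the complete codons of every expanded sequence."""
    n = len(oligo) - len(oligo) % 3
    return sorted({"".join(CODONS[s[i:i + 3]] for i in range(0, n, 3))
                   for s in expand(oligo)})


sense = "GANGGNTCTGGNGA"
print(degeneracy(sense))          # three N positions
print(reverse_complement(sense))  # antisense probe
print(encoded_peptides(sense))    # first four codons
```

The sense oligonucleotide therefore covers an acidic residue followed by Gly-Ser-Gly, with the trailing GA being the first two bases of a second acidic codon, which is an instance of the Xac-Xaa-Ser-Gly-Xac motif with Xaa = Gly.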
- the invention allows the production in large amounts of highly pure heparan sulfate proteoglycans that contain heparan sulfate chains that are characteristic of specific cell types.
- the surface of endothelial cells is non-thrombogenic because of the anti-coagulant properties of the heparan sulfate chains in a proteoglycan on their surfaces.
- Preparation of this highly anti-coagulant heparan sulfate proteoglycan in soluble form is now possible by transfection of cultured endothelial cells with a DNA construct defined by this invention. Expression of the construct would produce syndecan containing endothelial cell-derived heparan sulfate chains.
- Syndecan contains a unique protease-susceptible site adjacent to the plasma membrane, allowing the harvesting of this modified syndecan as a soluble product in high yield and purity.
- This approach would produce an anti-coagulant proteoglycan with very high potency, potentially several thousand times more potent than commercially available heparin.
- the soluble proteins or peptides containing cell-type-specific heparan sulfate chains made possible by this invention can be used in the prevention and therapy of certain viral diseases. Dextran sulfate and heparin have been shown to reduce infection and replication of certain retroviruses, including human immunodeficiency virus (HIV). However, these molecules are highly heterogeneous and are probably non-specific.
- a more specific inhibitor would be a soluble heparan sulfate peptide or proteoglycan derived from a cell type that interacts with the virus.
- Peptides derived from this invention can also be used as highly specific competitive inhibitors of heparan sulfate (or chondroitin sulfate) chain initiation. Because mutant transformed cells with reduced cell-surface heparan sulfate are substantially less tumorigenic, this invention has the potential of producing anti-tumor drugs that are non-cytotoxic.
- a DNA construct derived from this invention can be used in fibroblasts that contain surface proteoglycans that bind various growth factors, including acidic fibroblast growth factor (FGF) and basic FGF. This binding potentiates the action and prevents the proteolytic degradation of these growth factors.
- PDGF Platelet-derived growth factor
- the peptide sequences involved in heparan sulfate chain attachment identified by the present invention will allow production of large amounts of cell-type-specific heparan sulfate proteoglycans and enable this attachment site to be placed into other biological macromolecules that do not normally contain it, thereby providing products that are not otherwise available. These products will represent a singular molecular species, whereas the heparins and all other heparan sulfate proteoglycans heretofore described represent many molecular species.
- the greater uniformity afforded by the present invention leads to greater potency and potentially to greater specificity of the materials being purified, thereby enhancing their therapeutic applications.
- heparin from pig intestine or beef lung or dextran sulfate, a synthetic product, that are polydisperse, of low potency, and of little specificity
- Cell lines containing the genetic material necessary for the practice of the present invention can be obtained from a number of public sources, some of which are specifically identified in the following examples.
- normal mouse mammary epithelial cells can be prepared from normal mouse tissue using the procedure described in the examples below. The same procedure can be used to obtain genetic material from other species.
- NMuMG mouse mammary epithelial cells (passages 13-22) were maintained in bicarbonate-buffered Dulbecco's modified Eagle medium (Gibco) as described previously, David, G., and Bernfield, M., Proc. Natl. Acad. Sci. USA (1979) 76: 786-790.
- cells were plated on 245 x 245 mm tissue culture plates (Nunc) at approximately one-fifth confluent density and grown to 80-90 percent confluency (3-4 days).
- RNA extraction buffer (4 M guanidine isothiocyanate in 5 mM sodium citrate pH 7.0, 0.1 M β-mercaptoethanol and 0.5% N-lauryl sarcosine) and total RNA prepared by CsCl density centrifugation, Chirgwin, J.M., Przybyla, A.E., MacDonald, R.J., and Rutter, W.J., Biochemistry (1979) 18: 5294-5299.
- RNA was purified by chromatography on oligo(dT)-cellulose (type 3; Collaborative Research) and utilized in the commercial synthesis (Stratagene) of cDNA by the S1 method, Huynh, T.V., Young, R.A., and Davis, R.W., DNA Cloning: A
- a primer extension cDNA library was prepared using the RNase H method, Gubler, U., and Hoffman,
- First strand cDNA was synthesized from 10 μg of an 18-bp oligonucleotide containing sequence derived from near the 5' end of PM-4 (see Example 2).
- the second strand was synthesized using RNase H (BRL) and DNA polymerase Klenow fragment (Boehringer-Mannheim).
- the cDNA was methylated with EcoRI methylase and then ligated with synthetic EcoRI linkers (New England Biolabs). Excess linkers were removed by EcoRI digestion and the cDNA was purified on agarose gel electrophoresis and recovered by electroelution. The resulting cDNA was inserted into λgt-10 (Promega) and packaged using Gigapack Gold (Stratagene).
- EXAMPLE 2
- the cells were resuspended in 50 ml TBST (Tris buffered saline triton: 10 mM Tris pH 7, NaCl 150 mM, Triton X-100 0.3%), sonicated, and following addition of 100 μl immunoserum (1:500 dilution), incubated overnight at 4°C. This mixture was centrifuged for 10 min at 4000 rpm and used to screen expressed λgt-11 cDNA clones, Young, R.A., and Davis, R.W., Science (1983) 222: 778-782, by detection with alkaline phosphatase-conjugated goat anti-rabbit IgG (Promega).
- syndecan purified from NMuMG cells reacted with an immunoserum prepared against a synthetic peptide containing the C-terminal 7 amino acids (Lys-Gln-Gln-Glu-Glu-Phe-Tyr-Ala) of the PM-4 derived protein sequence.
- This immunoserum failed to react with the ectodomain, which lacks the putative cytoplasmic domain.
- this serum does not cross react with any other cellular proteins as assessed by Western blotting of total cell extracts.
- Purified lambda DNA was prepared from positively selected clones by Lambdasorb immunoprecipitation (Promega). Fragments released by restriction endonuclease digestions were isolated by electrophoresis followed by excision from SeaPlaque agarose (FMC BioProducts). These isolated fragments were subcloned directly, in the presence of agarose, Struhl, K., BioTechniques (1985) 3: 452-453, to either pGEM 3 and 4 for in vitro transcription, or M13 mp18 and mp19, Messing, J., Methods Enzymol. (1983) 101: 20-78, for sequence analysis.
- DNA sequencing was performed by the dideoxy chain termination method, Sanger, F., Nicklen, S., and Coulson, A.R., Proc. Natl. Acad. Sci. USA (1977) 74: 5463-5467, using a modified T7 DNA polymerase (Sequenase™, U.S. Biochemical).
- the strategy is summarized in Figure 2. Sequence was generated from both ends of subcloned restriction fragments using universal M13 sequencing primers. The internal sequence of large fragments as well as the complementary strands of all fragments were determined using oligonucleotide primers synthesized in accordance with preceding sequences.
- the cDNA (Figure 1) has the following features: The first AUG is at position 240. This putative initiation codon is preceded by two in-frame termination codons (TAA and TGA at positions 39 and 72 respectively) and followed by a 930 base open reading frame that ends at position 1173 with a TGA termination codon. Following the putative coding region are 1,243 bases of 3'-untranslated sequence that ends with the poly(A) stretch.
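Locating the first AUG and its in-frame termination codon, as done for the cDNA above, can be sketched as a short scan. The function name and the toy sequence are hypothetical; the only figure taken from the text is the 930-base open reading frame, which corresponds to 930 / 3 = 310 codons.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_orf(cdna: str):
    """Scan from the first ATG to the first in-frame stop codon.

    Returns (position of the A of ATG, position of the first base of the
    stop codon, number of codons before the stop), positions 1-based,
    or None if no ATG or no in-frame stop is found.
    """
    cdna = cdna.upper()
    start = cdna.find("ATG")
    if start < 0:
        return None
    for i in range(start, len(cdna) - 2, 3):
        if cdna[i:i + 3] in STOP_CODONS:
            return start + 1, i + 1, (i - start) // 3
    return None


# Hypothetical toy cDNA: 5' leader, ATG GCT GAA GGT, TGA stop, short 3' tail.
# It also contains an out-of-frame TGA that the in-frame scan must skip.
toy_cdna = "CCTAACC" + "ATG" + "GCTGAAGGT" + "TGA" + "AAAA"
print(first_orf(toy_cdna))

# Codon count implied by the 930-base open reading frame in the text.
print(930 // 3)
```

The scan stops at the first in-frame stop only; upstream in-frame stops, such as the two noted before the initiation codon, would be checked separately by scanning backwards in the same frame.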
- RNA for Northern analysis was prepared from the following: NMuMG cells, adult liver, newborn skin, mid-pregnant mammary gland, adult cerebrum, skeletal and cardiac muscle. Excised tissues were ground to a fine powder in the presence of liquid nitrogen and transferred directly to RNA extraction buffer (see above); the NMuMG cells were extracted after washing with PBS as described above. The samples were vigorously vortexed, an equal volume of 10 mM Tris pH 8.0, 1 mM EDTA, and 1% SDS added, and subsequently extracted exhaustively with 24:24:1 Tris-saturated phenol:chloroform:isoamyl alcohol followed by a single extraction with 24:1 chloroform:isoamyl alcohol.
- RNA was precipitated by addition of 1/3 volume of 10 M LiCl.
- Poly(A) RNA was prepared by oligo d(T) chromatography as described above.
- Hybridization probes were prepared by in vitro transcription of the 5' EcoRI-SacI fragment of PM-4 subcloned into pGEM3, Melton, D.A., Krieg, P.A., Rebagliati, M.R., Maniatis, T., Zinn, K., and Green, M.R., Nucl. Acids Res. (1984) 12: 7035-7056. Blots were prehybridized at 61°C in 50% formamide, 1% SDS, 5X SSPE, 0.1% ficoll, 0.1% polyvinylpyrrolidone and 100 μg/ml denatured salmon sperm DNA.
- Hybridization was for 16 hrs at 61°C in the same buffer containing 5 x 10^6 cpm/ml of RNA probe. Filters were washed 2 x 15 min at room temperature in 5% SDS/1X SSPE and 6 x 30 min at 67°C in 1% SDS/0.1X SSPE. Molecular sizes were determined relative to ethidium bromide stained molecular weight markers (BRL) and 18S and 28S ribosomal RNA.
- Northern blot analysis of the poly(A) RNA preparations reveals two mRNA bands in NMuMG cells as well as in skin, liver and mammary gland tissues; one band is at 2.6 and the other at 3.4kb.
- Longer exposures of the Northern blot discussed above, as well as others containing larger quantities of poly(A) RNA verify that the mammary gland expresses both the 2.6 and the 3.4 kb messages (data not shown).
- a seven amino acid (14C-labeled) synthetic peptide, corresponding to the predicted C-terminus of syndecan (Figure 1), was prepared by direct synthesis.
- the N-terminal lysine of this peptide was cross-linked by glutaraldehyde to keyhole limpet hemocyanin (KLH, Calbiochem) for immunization and bovine serum albumin (BSA, Fraction V, Sigma) for screening as described by Doolittle, R.F., Of URFS and ORFS: A Primer on How to Analyze Derived Amino Acid Sequences (1986) 85.
- carrier protein was dissolved in 0.5 ml of 0.4 M phosphate, pH 7.5, mixed with 7.5 μmoles of peptide in 1.5 ml water and 1.0 ml of 20 mM glutaraldehyde was added dropwise with stirring over the course of 5 min. After continuous stirring at room temperature for 30 min, 0.25 ml of 1 M glycine was added to block unreacted glutaraldehyde and the stirring resumed for an additional 30 min. The product was dialyzed exhaustively against phosphate-buffered saline and incorporation determined by TCA precipitation and liquid scintillation counting. This procedure resulted in the attachment of 17 moles of synthetic peptide per mole of carrier protein.
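The reagent amounts in this coupling step follow from simple volume-times-concentration arithmetic. The sketch below only restates the quantities given in the text; the variable names are illustrative, and the amount of carrier protein is not computed because it is not stated in this passage.

```python
def micromoles(volume_ml: float, conc_mM: float) -> float:
    """Amount in micromoles: ml x mmol/L = umol."""
    return volume_ml * conc_mM


peptide_umol = 7.5                              # stated directly in the text
glutaraldehyde_umol = micromoles(1.0, 20.0)     # 1.0 ml of 20 mM
glycine_umol = micromoles(0.25, 1000.0)         # 0.25 ml of 1 M
phosphate_umol = micromoles(0.5, 400.0)         # 0.5 ml of 0.4 M

# Volume during coupling, before glycine is added (buffer + peptide + crosslinker).
coupling_volume_ml = 0.5 + 1.5 + 1.0

glutaraldehyde_mM = glutaraldehyde_umol / coupling_volume_ml
phosphate_mM = phosphate_umol / coupling_volume_ml
crosslinker_per_peptide = glutaraldehyde_umol / peptide_umol
glycine_excess = glycine_umol / glutaraldehyde_umol

print(f"glutaraldehyde: {glutaraldehyde_umol:.1f} umol ({glutaraldehyde_mM:.2f} mM)")
print(f"phosphate during coupling: {phosphate_mM:.1f} mM")
print(f"glutaraldehyde per peptide: {crosslinker_per_peptide:.2f}")
print(f"glycine excess over glutaraldehyde: {glycine_excess:.1f}x")
```

On these numbers glutaraldehyde is in modest molar excess over peptide, and the glycine quench is in large excess over the total glutaraldehyde added.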
- the native lipophilic form of syndecan and the nonlipophilic medium ectodomain form, Jalkanen, M., Rapraeger, A., Saunders, S., and Bernfield, M., J. Cell Biol. (1987) 105: 3087-3096, were isolated and purified as described elsewhere and assessed for their reactivity to the immune sera.
- a cationic nylon membrane, Gene-Trans (Plasco Inc., Woburn, MA) was placed into an immunodot apparatus (V&P Scientific, San Diego, CA) and samples of intact syndecan and the ectodomain (0.5, 5, 50 and 500 ng) were loaded on the membrane using mild vacuum.
- the membrane was washed for 60 min at room temperature with ten changes of TBST and then incubated for 30 min with 1:7500 dilution of alkaline phosphatase-conjugated goat anti-rabbit IgG (Promega, Madison WI). Following washing for 60 min with ten changes of TBST, the immobilized alkaline phosphatase was visualized with nitro blue tetrazolium (NBT) 330 μg/ml and 5-bromo-4-chloro-3-indolyl phosphate (BCIP) 165 μg/ml in 100 mM Tris pH 9.5, 100 mM NaCl, and 5 mM MgCl2.
- Syndecan can be expressed within mammalian cells by transfection of a DNA construct containing the syndecan core protein cDNA linked to a eukaryotic promoter that has the properties of both high-level expression and activity in a wide range of cell types.
- the expression vector pHβAPr-1-neo has been described (Gunning et al., PNAS 84:4831-4835) which utilizes the human β-actin promoter and fulfills both of the above requirements.
- This vector also contains the neomycin-resistance gene which allows selection of transfected cells with the antibiotic G-418.
- nucleotides 214-1379 of the sequence shown in Figure 1, which encompasses all of the coding region, was inserted directionally between the SalI-BamHI sites of the pHβAPr-1-neo vector and thus named pβ-SSyn-neo.
- this fragment was passed sequentially through pGEM 3Z (Promega), pGEM 7Zf (Promega), and Bluescript (Stratagene).
- This DNA construct was transformed into the bacterial strain TG-1 and prepared in large scale using routine plasmid preparation techniques including CsCl density centrifugation.
- the purified circularized plasmid DNA was transfected into Chinese Hamster Ovary (CHO) cells by standard calcium phosphate precipitation technique, and transfected clones were selected with G418.
- Although the parental CHO (hamster) cells express mRNA which is cross-reactive with the murine syndecan cDNA, neither whole cells nor proteoglycan purified from these cells is reactive with the monoclonal antibody 281-2, a rat monoclonal antibody generated against murine syndecan. Therefore it has been possible to assess the function of the transfected murine syndecan gene using this antibody.
- Anti-sense RNA produced from vectors of this type, if expressed in sufficiently high levels, is capable of binding to endogenous message intracellularly and blocking its subsequent translation.
- For this vector, the same coding region SacI-HindII fragment of syndecan described above was inserted into the BamHI-HindIII site of the pHβAPr-1-neo vector to produce the vector pβ-ASyn-neo.
- the cDNA was inserted into the vector in the opposite orientation so as to produce mRNA from the transfected gene that is complementary to endogenous syndecan mRNA.
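The effect of inserting the cDNA in the opposite orientation can be illustrated by comparing the transcripts produced from the two orientations. This is a minimal sketch with a hypothetical nine-base coding fragment; the function names are illustrative only.

```python
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")
RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def transcribe_sense(coding_strand: str) -> str:
    """mRNA from the normal orientation: same sequence as the coding strand."""
    return coding_strand.upper().replace("T", "U")


def transcribe_antisense(coding_strand: str) -> str:
    """RNA from the inverted insert: reverse complement of the coding strand."""
    inverted = coding_strand.upper().translate(DNA_COMPLEMENT)[::-1]
    return inverted.replace("T", "U")


def is_complementary(rna_a: str, rna_b: str) -> bool:
    """True if the two RNAs can pair along their full length (antiparallel)."""
    return rna_a.translate(RNA_COMPLEMENT)[::-1] == rna_b


coding = "ATGGCTGAA"  # hypothetical fragment, not syndecan sequence
mrna = transcribe_sense(coding)
antisense = transcribe_antisense(coding)
print(mrna, antisense, is_complementary(mrna, antisense))
```

Full-length complementarity of this kind is what would allow the anti-sense transcript to bind the endogenous message inside the cell.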
- this fragment was sequentially passed through pGEM 3Z (Promega) and Bluescript (Stratagene).
- the 64-fold degenerate oligonucleotide of the form GAN GGN TCT GGN GA should statistically have the highest probability of success in the identification of other gene products which contain this putative signal for glycosaminoglycan attachment.
- the complementary oligonucleotide of the form TCN CCA GAN CCN TC should have similar utility, with the added advantage of its ability to identify the messenger RNA of these gene products in Northern analysis.
Abstract
A purified mammalian peptide, and genetic information encoding such peptides, having a molecular weight of from about 31 kD to about 35 kD and comprising an amino terminus extracellular region, a carboxy terminus cytoplasmic region, and a transmembrane region between said cytoplasmic and extracellular regions, a dibasic sequence extracellularly adjacent the transmembrane region of the peptide, and at least one glycosylation site in the extracellular region including an Xac-Xaa-Ser-Gly-Xac sequence, wherein Xac is an acidic amino acid and Xaa is any amino acid. Additional peptides having this glycosylation site and genetic information useful for preparing a number of variations based on this peptide are also provided.
Description
CONSTRUCTION AND USE OF SYNTHETIC CONSTRUCTS ENCODING SYNDECAN
Work leading to the present invention was supported in part by a National Institutes of Health grant. The government has rights in this invention as a result of this support.
BACKGROUND OF THE INVENTION
Field of the Invention
This invention relates generally to the field of genetic engineering and more particularly to genes for proteoglycans, their insertion into recombinant DNA vectors, and the production of the resulting core proteins in recipient strains of microorganisms and the proteoglycan in recipient eukaryotic cells.
Description of the Background
The cellular behavior responsible for the development, repair and maintenance of tissues is regulated, in large part, by interactions between cells and their extracellular matrix. These interactions are mediated by cell surface molecules acting as receptors that bind the large insoluble matrix molecules and induce responses that result in changes of cellular phenotype. Several proteins associated with the cell surface can bind matrix components. These proteins differ in their specificity and affinity and in their mode of association with the cell surface. Some bind cells to single matrix ligands while others, such as some members of the integrin superfamily, appear to have multiple matrix ligands. Of the various matrix-binding proteins at the cell surface, only the integrins are known to be integral membrane proteins.
The integrin fibronectin receptor codistributes both with extracellular fibronectin and with intracellular cytoskeletal components, apparently via an association of the receptor's cytoplasmic domain with the cytoskeletal protein talin.
The present inventors have studied a lipophilic proteoglycan containing both heparan sulfate and chondroitin sulfate that is found at the surface of mouse mammary epithelial cells and that behaves as a high affinity receptor specific for multiple components of the interstitial matrix. This proteoglycan has been given the name syndecan in the mouse. The proteoglycan binds the epithelial cells via its heparan sulfate chains to collagen types I, III, and V (Koda, J.E., Rapraeger, A., and Bernfield, M., J. Biol. Chem. (1985) 260: 8157-8162), fibronectin (Saunders, S. and Bernfield, M., J. Cell Biol. (1988) 106: 423-430), and thrombospondin. When its extracellular domain (ectodomain) is cross-linked at the cell surface, it associates intracellularly with the actin cytoskeleton (Rapraeger, A., Jalkanen, M., and Bernfield, M., J. Cell Biol. (1986) 103: 2683-2696), and the isolated proteoglycan binds directly or indirectly to F-actin (Rapraeger, A., and Bernfield, M., J. Biol. Chem. (1985) 260: 4103-4109). Cultured epithelial cells shed the ectodomain from their apical surfaces as a non-lipophilic proteoglycan that contains all of the glycosaminoglycan of the intact molecule and polarize the proteoglycan exclusively to their basolateral surfaces, a location consistent with its matrix receptor function. Upon suspension of these cells, the ectodomain is cleaved from the cell surface; the proteoglycan is not replaced while the cells are suspended (Jalkanen, M., Rapraeger, A., Saunders, S., and Bernfield, M., J. Cell Biol. (1987) 105: 3087-3096). The proteoglycan is mainly on epithelia in mature tissues (Hayashi, K., Hayashi, M., Jalkanen, M., Firestone, J.H., Trelstad, R.L., and Bernfield, M., J. Histochem. Cytochem. (1987) 35: 1079-1088), and some of the present inventors have previously proposed that it is a matrix anchor that stabilizes the morphology of epithelial sheets by linking the cytoskeleton to the extracellular matrix (Bernfield, M., Rapraeger, A., Jalkanen, M., and Banerjee, S.D., Basement Membranes (1985) 343-352).
Syndecan undergoes substantial regulation; its size, glycosaminoglycan composition and location at the cell surface vary between epithelial types, and its expression changes during development. The proteoglycan is located exclusively at the basolateral cell surface of simple epithelia but surrounds stratified epithelial cells. At basolateral cell surfaces, it appears to contain two heparan sulfate and two chondroitin sulfate chains, but where it surrounds cells, it contains only a single heparan sulfate chain and a single small chondroitin sulfate chain (Sanderson, R.D., and Bernfield, M., Proc. Natl. Acad. Sci. USA (1987) 23J3: 491-497). In self-renewing epithelial cell populations, such as the epidermis or vagina, the proteoglycan is lost when the cells terminally differentiate (Hayashi, K., Hayashi, M., Boutin, E., Cunha, G.R., Bernfield, M., and Trelstad, R.L., J. Lab. Invest. (1988) 58: 68-76). In embryos, the proteoglycan is transiently lost when epithelia change their shape and is transiently expressed by mesenchymal cells undergoing morphogenetic tissue interaction.
Heparan sulfate proteoglycans are ubiquitous on the surfaces of adherent cells and bind various ligands including extracellular matrix, growth factors, proteinase inhibitors, and lipoprotein lipase; see Fransson, L., Trends Biochem. Sci. (1987) 12: 406-411. However, despite much study of these molecules, no structure is known for the core protein of any such cell surface proteoglycan.
For general background on genetic engineering, see Watson, J.D., The Molecular Biology of the Gene, 4th Ed., Benjamin, Menlo Park, Calif., (1988).
SUMMARY OF THE INVENTION
Accordingly, it is an object of this invention to provide eukaryotic cells capable of providing useful quantities of syndecan and proteins of similar function from multiple species.
It is a further object of this invention to provide a recombinant DNA vector containing a heterologous segment encoding syndecan or a related protein that is capable of being inserted into a microorganism or eukaryotic cell and expressing the encoded protein.
It is still another object of this invention to provide a DNA or RNA segment of defined structure that can be produced synthetically or isolated from natural sources and that can be used in the production of the desired recombinant DNA vectors or that can be used to recover related genes from other sources.
It is yet another object of this invention to provide a peptide that can be produced synthetically in a laboratory or by a microorganism which will mimic the activity of natural syndecan core protein and which can be used to produce proteoglycans and glycosaminoglycans in eukaryotic cells in a reproducible and standardized manner.
These and other objects of the invention, as will hereinafter become more readily apparent, have been accomplished by providing an isolated peptide having a molecular weight of from about 31 kD to about 35 kD and comprising a hydrophilic amino terminus extracellular region, a hydrophilic carboxy terminus cytoplasmic region, and a hydrophobic transmembrane region between said cytoplasmic and extracellular regions, a dibasic sequence extracellularly adjacent the transmembrane region of the peptide, and at least one glycosylation site in the extracellular region including an Xac-Xaa-Ser-Gly-Xac sequence, wherein Xac is an acidic amino acid and Xaa is any amino acid and wherein said peptide is capable of functioning as a core protein for attachment of a heparan sulfate chain at said Ser.
Particularly preferred are peptides of: (a) a first formula
M-R-R-A-A-L-W-L-W-L-C-A-L-A-L-R-L-Q-P-A-
L-P-Q-I-V-A-V-N-V-P-P-E-D-Q-D-G-S-G-D-D-
S-D-N-F-S-G-S-G-T-G-A-L-P-D-T-L-S-R-Q-T-
P-S-T-W-K-D-V-W-L-L-T-A-T-P-T-A-P-E-P-T-
S-S-N-T-E-T-A-F-T-S-V-L-P-A-G-E-K-P-E-E-
G-E-P-V-L-H-V-E-A-E-P-G-F-T-A-R-D-K-E-K-
E-V-T-T-R-P-R-E-T-V-Q-L-P-I-T-Q-R-A-S-T-
V-R-V-T-T-A-Q-A-A-V-T-S-H-P-H-G-G-M-Q-P-
G-L-H-E-T-S-A-P-T-A-P-G-Q-P-D-H-Q-P-P-R-
V-E-G-G-G-T-S-V-I-K-E-V-V-E-D-G-T-A-N-Q-
L-P-A-G-E-G-S-G-E-Q-D-F-T-F-E-T-S-G-E-N-
T-A-V-A-A-V-E-P-G-L-R-N-Q-P-P-V-D-E-G-A-
T-G-A-S-Q-S-L-L-D-R-K-E-V-L-G-G-V-I-A-G-
G-L-V-G-L-I-F-A-V-C-L-V-A-F-M-L-Y-R-M-K-
K-K-D-E-G-S-Y-S-L-E-E-P-K-Q-A-N-G-G-A-Y-
Q-K-P-T-K-Q-E-E-F-Y-A
wherein A is alanine, C is cysteine, D is aspartate, E is glutamate, F is phenylalanine, G is glycine, H is histidine, I is isoleucine, K is lysine, L is leucine, M is methionine, N is asparagine, P is proline, Q is glutamine, R is arginine, S is serine, T is threonine, V is valine, W is tryptophan, and Y is tyrosine;
(b) a second formula in which 1 to 10 amino acids in said first formula are replaced by different amino acids,
(c) a third formula in which from 1 to 20 amino acids are absent from either the amino terminal, the carboxy terminal, or both terminals of said first formula or said second formula, or
(d) a fourth formula in which from 1 to 10 additional amino acids are attached sequentially to the amino terminal, carboxy terminal, or both terminals of said first formula or said second formula and salts of compounds having said formulas, wherein said peptide retains an Xac-Xaa-Ser-Gly-Xac sequence capable of acting as an attachment site for heparan sulfate chain synthesis.
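By way of illustration only, the Xac-Xaa-Ser-Gly-Xac attachment motif recited above can be located in the first formula with a simple pattern search. The following is a minimal sketch in Python, which is not part of the original disclosure; the names SYNDECAN, MOTIF and attachment_serines are hypothetical, and the sequence is the 311-residue first formula written in rows of 20 residues.

```python
import re

# First formula (311 residues), one-letter code, 20 residues per row.
SYNDECAN = (
    "MRRAALWLWLCALALRLQPA"
    "LPQIVAVNVPPEDQDGSGDD"
    "SDNFSGSGTGALPDTLSRQT"
    "PSTWKDVWLLTATPTAPEPT"
    "SSNTETAFTSVLPAGEKPEE"
    "GEPVLHVEAEPGFTARDKEK"
    "EVTTRPRETVQLPITQRAST"
    "VRVTTAQAAVTSHPHGGMQP"
    "GLHETSAPTAPGQPDHQPPR"
    "VEGGGTSVIKEVVEDGTANQ"
    "LPAGEGSGEQDFTFETSGEN"
    "TAVAAVEPGLRNQPPVDEGA"
    "TGASQSLLDRKEVLGGVIAG"
    "GLVGLIFAVCLVAFMLYRMK"
    "KKDEGSYSLEEPKQANGGAY"
    "QKPTKQEEFYA"
)

# Xac = acidic residue (D or E); Xaa = any residue.
# A zero-width lookahead is used so overlapping candidates are also reported.
MOTIF = re.compile(r"(?=[DE].SG[DE])")


def attachment_serines(peptide):
    """Return the 1-based position of the Ser in each Xac-Xaa-Ser-Gly-Xac match."""
    # m.start() is the 0-based index of the first Xac; the Ser is two residues on.
    return [m.start() + 3 for m in MOTIF.finditer(peptide)]


if __name__ == "__main__":
    for pos in attachment_serines(SYNDECAN):
        # Print the Ser position together with the five-residue motif around it.
        print(pos, SYNDECAN[pos - 3:pos + 2])
```

The search reports only sequences that fit the strict five-residue pattern; it does not indicate whether a given site is actually substituted with a heparan sulfate chain, which must be determined experimentally.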
DNA and RNA molecules, recombinant DNA vectors, and modified microorganisms or eukaryotic cells comprising a nucleotide sequence that encodes any of the peptides indicated above are also part of the present invention. In particular, sequences comprising all or part of the following DNA sequence, a complementary DNA or RNA sequence, or a corresponding RNA sequence are especially preferred:
ATGAGACGCGCGGCGCTCTGGCTCTGGCTCTGCGCGCTGGCGCTGCGCCTGCAGCCTGCC CTCCCGCAAATTGTGGCTGTAAATGTTCCTCCTGAAGATCAGGATGGCTCTGGGGATGAC TCTGACAACTTCTCTGGCTCTGGCACAGGTGCTTTGCCAGATACTTTGTCACGGCAGACA CCTTCCACTTGGAAGGACGTGTGGCTGTTGACAGCCACGCCCACAGCTCCAGAGCCCACC AGCAGCAACACCGAGACTGCTTTTACCTCTGTCCTGCCAGCCGGAGAGAAGCCCGAGGAG GGAGAGCCTGTGCTCCATGTAGAAGCAGAGCCTGGCTTCACTGCTCGGGACAAGGAAAAG GAGGTCACCACCAGGCCCAGGGAGACCGTGCAGCTCCCCATCACCCAACGGGCCTCAACA GTCAGAGTCACCACAGCCCAGGCAGCTGTCACATCTCATCCGCACGGGGGCATGCAACCT GGCCTCCATGAGACCTCGGCTCCCACAGCACCTGGTCAACCTGACCATCAGCCTCCACGT GTGGAGGGTGGCGGCACTTCTGTCATCAAAGAGGTTGTCGAGGATGGAACTGCCAATCAG CTTCCCGCAGGAGAGGGCTCTGGAGAACAAGACTTCACCTTTGAAACATCTGGGGAGAAC ACAGCTGTGGCTGCCGTAGAGCCCGGCCTGCGGAATCAGCCCCCGGTGGACGAAGGAGCC
ACAGGTGCTTCTCAGAGCCTTTTGGACAGGAAGGAAGTGCTGGGAGGTGTCATTGCCGGA GGCCTAGTGGGCCTCATCTTTGCTGTGTGCCTGGTGGCTTTCATGCTGTACCGGATGAAG AAGAAGGACGAAGGCAGCTACTCCTTGGAGGAGCCCAAACAAGCCAATGGCGGTGCCTAC CAGAAACCCACCAAGCAGGAGGAGTTCTACGCC.
DNA and RNA molecules containing segments of the larger sequence are also provided for use in carrying out preferred aspects of the invention relating to the production of such peptides by the techniques of genetic engineering and the production of oligonucleotide probes.
BRIEF DESCRIPTION OF THE FIGURES
The accompanying Figures are provided to illustrate the invention but are not considered to be limiting thereof unless so specified.
Figure 1 is a formula showing the cDNA sequence for syndecan and the corresponding amino acid sequence.
Figure 2 is a restriction map showing sequencing strategy of syndecan cDNA clones.
Figure 3 is a table showing potential glycosylation sites of the syndecan core protein and homology of these regions to the glycosylation sites of other proteins.
Figure 4 is a schematic diagram showing different regions of the syndecan core protein.
Figure 5 is a table showing DNA sequence similarities between murine syndecan and human insulin receptor.
DESCRIPTION OF THE SPECIFIC EMBODIMENTS
Using a library from mouse mammary epithelial cells, we have molecularly cloned and sequenced full length cDNAs for a cell surface proteoglycan matrix receptor and have assessed the expression of its mRNA in various tissues. The 311 amino acid core protein has a unique sequence that contains several structural features consistent with its role as a matrix anchor and as an acceptor of two distinct types of glycosaminoglycan chains. The expression of its mRNA is tissue-type specific, and both the 5' and 3' untranslated regions of its cDNA show substantial sequence homology to those of the human insulin receptor cDNA. This core protein cDNA defines a new class of matrix receptor, an integral membrane proteoglycan, for which we propose the name syndecan (from the Greek, syndein, to bind together).
Using this information, a variety of recombinant DNA vectors capable of providing syndecan in reasonable quantities are provided. Additional recombinant DNA vectors of related structure that code for synthetic proteins having the key structural features identified herein as well as for proteins of the same family from other sources can be produced from the syndecan DNA using standard techniques of recombinant DNA technology. A transformant expressing syndecan has been produced as an example of this technology. The newly discovered sequence and structure information can be used, through transfection of eukaryotic cells, to prepare proteoglycans having cleavage sequences and attachment sites that allow ready production of pure proteoglycans and glycosaminoglycans.
Since there is a known and definite correspondence between amino acids in a peptide and the DNA sequence that codes for the peptide, the sequence of a DNA or RNA molecule coding for syndecan (or any of the modified peptides later discussed) can be used to derive the amino acid sequence (and vice versa, at least to the extent that the degeneracy of the genetic code permits). Such a sequence of nucleotides is shown in Table 1 along with the corresponding amino acid sequence.
TABLE 1
1
ATGAGACGCGCGGCGCTCTGGCTCTGGCTCTGCGCGCTGGCGCTGCGCCTGCAGCCTGCC
M R R A A L W L W L C A L A L R L Q P A
21
CTCCCGCAAATTGTGGCTGTAAATGTTCCTCCTGAAGATCAGGATGGCTCTGGGGATGAC
L P Q I V A V N V P P E D Q D G S G D D
41
TCTGACAACTTCTCTGGCTCTGGCACAGGTGCTTTGCCAGATACTTTGTCACGGCAGACA
S D N F S G S G T G A L P D T L S R Q T
61
CCTTCCACTTGGAAGGACGTGTGGCTGTTGACAGCCACGCCCACAGCTCCAGAGCCCACC
P S T W K D V W L L T A T P T A P E P T
81
AGCAGCAACACCGAGACTGCTTTTACCTCTGTCCTGCCAGCCGGAGAGAAGCCCGAGGAG
S S N T E T A F T S V L P A G E K P E E
101
GGAGAGCCTGTGCTCCATGTAGAAGCAGAGCCTGGCTTCACTGCTCGGGACAAGGAAAAG
G E P V L H V E A E P G F T A R D K E K
121
GAGGTCACCACCAGGCCCAGGGAGACCGTGCAGCTCCCCATCACCCAACGGGCCTCAACA
E V T T R P R E T V Q L P I T Q R A S T
141
GTCAGAGTCACCACAGCCCAGGCAGCTGTCACATCTCATCCGCACGGGGGCATGCAACCT
V R V T T A Q A A V T S H P H G G M Q P
161
GGCCTCCATGAGACCTCGGCTCCCACAGCACCTGGTCAACCTGACCATCAGCCTCCACGT
G L H E T S A P T A P G Q P D H Q P P R
181
GTGGAGGGTGGCGGCACTTCTGTCATCAAAGAGGTTGTCGAGGATGGAACTGCCAATCAG
V E G G G T S V I K E V V E D G T A N Q
201
CTTCCCGCAGGAGAGGGCTCTGGAGAACAAGACTTCACCTTTGAAACATCTGGGGAGAAC
L P A G E G S G E Q D F T F E T S G E N
221
ACAGCTGTGGCTGCCGTAGAGCCCGGCCTGCGGAATCAGCCCCCGGTGGACGAAGGAGCC
T A V A A V E P G L R N Q P P V D E G A
241
ACAGGTGCTTCTCAGAGCCTTTTGGACAGGAAGGAAGTGCTGGGAGGTGTCATTGCCGGA
T G A S Q S L L D R K E V L G G V I A G
261
GGCCTAGTGGGCCTCATCTTTGCTGTGTGCCTGGTGGCTTTCATGCTGTACCGGATGAAG
G L V G L I F A V C L V A F M L Y R M K
281
AAGAAGGACGAAGGCAGCTACTCCTTGGAGGAGCCCAAACAAGCCAATGGCGGTGCCTAC
K K D E G S Y S L E E P K Q A N G G A Y
301
CAGAAACCCACCAAGCAGGAGGAGTTCTACGCCTGA
Q K P T K Q E E F Y A end
Nucleotide sequence of one strand of syndecan cDNA. The numbers refer to the amino acid sequence and corresponding DNA codon sequence beginning at the amino terminus of the protein. The stop codon is marked "end."
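The correspondence between the codons and the amino acids of Table 1 can be checked mechanically. The sketch below, offered only as an illustration and written in Python (not a tool used in the original work), builds the standard genetic code and translates the first and last rows of Table 1; the names CODON_TABLE and translate are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO = (
    "FFLLSSSSYY**CC*W"   # first base T
    "LLLLPPPPHHQQRRRR"   # first base C
    "IIIMTTTTNNKKSSRR"   # first base A
    "VVVVAAAADDEEGGGG"   # first base G
)
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a DNA coding strand (5' to 3') until the first stop codon."""
    dna = "".join(dna.split()).upper()
    peptide = []
    for i in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[i:i + 3]]
        if residue == "*":      # TAA, TAG or TGA: termination signal
            break
        peptide.append(residue)
    return "".join(peptide)


if __name__ == "__main__":
    first_row = "ATGAGACGCGCGGCGCTCTGGCTCTGGCTCTGCGCGCTGGCGCTGCGCCTGCAGCCTGCC"
    last_row = "CAGAAACCCACCAAGCAGGAGGAGTTCTACGCCTGA"
    print(translate(first_row))
    print(translate(last_row))
```

Applied to the rows shown, the function returns the residues printed beneath them in Table 1 (residues 1 to 20 and 301 to 311, respectively).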
The trinucleotides of Table 1, termed codons, are presented as DNA trinucleotides, as they exist in the genetic material of a living organism. Complementary trinucleotide DNA sequences having opposite strand polarity are functionally equivalent to the codons of Table 1, as is understood in the art. An important and well known feature of the genetic code is its redundancy, whereby, for most of the amino acids used to make proteins, more than one coding nucleotide triplet may be employed. Therefore, a number of different nucleotide sequences may code for a given amino acid sequence. Such nucleotide sequences are considered functionally equivalent since they can result in the production of the same amino acid sequence in all organisms, although certain strains may translate some sequences more efficiently than they do others. Occasionally, a methylated variant of a purine or pyrimidine may be found in a given nucleotide sequence. Such methylations do not affect the coding relationship in any way. The equivalent codons are shown in Table 2 below.
TABLE 2
GENETIC CODE
Alanine (Ala, A)            GCA, GCC, GCG, GCT
Arginine (Arg, R)           AGA, AGG, CGA, CGC, CGG, CGT
Asparagine (Asn, N)         AAC, AAT
Aspartic acid (Asp, D)      GAC, GAT
Cysteine (Cys, C)           TGC, TGT
Glutamic acid (Glu, E)      GAA, GAG
Glutamine (Gln, Q)          CAA, CAG
Glycine (Gly, G)            GGA, GGC, GGG, GGT
Histidine (His, H)          CAC, CAT
Isoleucine (Ile, I)         ATA, ATC, ATT
Leucine (Leu, L)            CTA, CTC, CTG, CTT, TTA, TTG
Lysine (Lys, K)             AAA, AAG
Methionine (Met, M)         ATG
Phenylalanine (Phe, F)      TTC, TTT
Proline (Pro, P)            CCA, CCC, CCG, CCT
Serine (Ser, S)             AGC, AGT, TCA, TCC, TCG, TCT
Threonine (Thr, T)          ACA, ACC, ACG, ACT
Tryptophan (Trp, W)         TGG
Tyrosine (Tyr, Y)           TAC, TAT
Valine (Val, V)             GTA, GTC, GTG, GTT
Termination signal (end)    TAA, TAG, TGA
Key: Each 3-letter triplet represents a trinucleotide of DNA having a 5' end on the left and a 3' end on the right. The letters stand for the purine or pyrimidine bases forming the nucleotide sequence.
A = adenine
G = guanine
C = cytosine
T = thymine
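The practical consequence of the redundancy tabulated in Table 2 is that a short peptide can already be encoded by many different nucleotide sequences. As a hedged illustration (Python, with hypothetical names CODONS_PER_RESIDUE and coding_sequence_count), the number of functionally equivalent coding sequences is the product of the number of codons available for each residue:

```python
from collections import Counter
from itertools import product
from math import prod

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Number of codons listed in Table 2 for each amino acid ("*" = termination).
CODONS_PER_RESIDUE = Counter(CODON_TABLE.values())


def coding_sequence_count(peptide):
    """Number of distinct DNA sequences that encode the given peptide."""
    return prod(CODONS_PER_RESIDUE[residue] for residue in peptide)


if __name__ == "__main__":
    # Residues 35-39 of Table 1, one instance of Xac-Xaa-Ser-Gly-Xac.
    print(coding_sequence_count("DGSGD"))
```

For the five-residue segment D-G-S-G-D the count is 2 x 4 x 6 x 4 x 2 = 384, which is why the nucleotide sequence of Table 1 is only one of a very large family of equivalent sequences.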
Since the DNA sequence of the gene has been fully identified, it is possible to produce a DNA gene entirely by synthetic chemistry, after which the gene can be inserted into any of the many available DNA vectors using known techniques of recombinant DNA technology. Thus the present invention can be carried out using reagents, plasmids, and microorganisms that are freely available and in the public domain at the time of filing of this patent application. For example, nucleotide sequences greater than 100 bases long could be readily synthesized in 1984 on an Applied Biosystems Model 380A DNA Synthesizer as evidenced by commercial advertising of the same (e.g., Genetic Engineering News, November/December 1984, p. 3). Such oligonucleotides can readily be spliced using, among others, the techniques described later in this application to produce any nucleotide sequence described herein. For example, relatively short complementary oligonucleotide sequences with 3' or 5' segments that extend beyond the complementary sequences can be synthesized. By producing a series of such short segments with "sticky" ends that hybridize with the next short oligonucleotide, sequential oligonucleotides can be joined together by the use of ligases to produce a longer oligonucleotide that is beyond the reach of direct synthesis.
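The staggered arrangement of short complementary oligonucleotides described above can be sketched as follows. This is an illustrative layout only, in Python; the oligonucleotide length of 20, the stagger of 10 and the function names are assumptions chosen for the example and are not taken from the procedures of this application.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the complementary strand, written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def staggered_oligos(target, oligo_len=20, stagger=10):
    """Split a duplex into top- and bottom-strand oligonucleotides.

    Bottom-strand pieces start `stagger` bases later than top-strand pieces,
    so each bottom oligo bridges the junction between two top oligos and
    leaves single-stranded ("sticky") ends for hybridization and ligation.
    """
    top = [target[i:i + oligo_len] for i in range(0, len(target), oligo_len)]
    bottom = [reverse_complement(target[i:i + oligo_len])
              for i in range(stagger, len(target), oligo_len)]
    return top, bottom


if __name__ == "__main__":
    # First 60 nucleotides of Table 1.
    target = "ATGAGACGCGCGGCGCTCTGGCTCTGGCTCTGCGCGCTGGCGCTGCGCCTGCAGCCTGCC"
    top, bottom = staggered_oligos(target)
    for oligo in top:
        print("top   ", oligo)
    for oligo in bottom:
        print("bottom", oligo)
```

Joining the top-strand pieces in order regenerates the target, and each bottom-strand piece is complementary to the region spanning a top-strand junction, which is what holds adjacent pieces in register for the ligase.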
Furthermore, automated equipment is also available that makes direct synthesis of any of the peptides disclosed herein readily available. In the same issue of Genetic Engineering News mentioned above, a commercially available automated peptide synthesizer having a coupling efficiency exceeding 99% is advertised (page 34). Such equipment provides ready access to the peptides of the invention, either by direct synthesis or by synthesis of a series of fragments that can be coupled using other known techniques. Recent advances in technology make synthesis of nucleotide sequences and peptides even more readily accessible.
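The choice between direct synthesis and fragment coupling can be put in rough numerical terms. The sketch below assumes, purely for illustration, that every coupling step succeeds independently with the same efficiency, so that the fraction of full-length product is the efficiency raised to the number of couplings; this simple model and the function name are assumptions, not statements from the cited advertisement.

```python
def full_length_yield(n_residues, coupling_efficiency):
    """Fraction of chains that reach full length under a uniform-step model.

    A chain of n residues requires n - 1 coupling steps.
    """
    couplings = n_residues - 1
    return coupling_efficiency ** couplings


if __name__ == "__main__":
    for length in (30, 100, 311):
        print(length, round(full_length_yield(length, 0.99), 3))
```

Under this model a 311-residue chain made at 99% efficiency per step would be obtained at only a few percent of theoretical yield, whereas fragments of a few tens of residues are obtained in good yield, which is consistent with the option of assembling the peptide from a series of fragments.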
In addition to the specific peptide sequence shown in Table 1, other peptides based on this sequence and representing minor variations thereof will have the biological activity of syndecan. In particular, proteins that lack the amino terminus first 17 amino acids are preferred since the first 17 amino acids appear to represent a signal sequence. Other variations can also be present. For example, up to 20 additional (i.e., not counting the 17-amino-acid leader sequence) amino acids can be absent from either or both terminals of the sequence given without losing ability to act as a core protein for synthesis of proteoglycans. Likewise, up to 10 additional amino acids can be present at either or both terminals. These variations are possible because the sites of glycosylation are located in more central regions of the molecule and the transmembrane region at the carboxy terminus does not need to be the full indicated length in order to be effective. Nevertheless, preferred compounds are those which more closely approach the specific formulas given (or the corresponding sequence that lacks a signal sequence) with 10 or fewer, more preferably 5 or fewer, absent amino acids being preferred for either terminal and 7 or fewer, more preferably 4 or fewer, additional amino acids being preferred for either terminal.
Within the central portion of the molecule, replacement of amino acids is more restricted in order that biological activity can be maintained. However, minor variations of the previously mentioned peptides and DNA molecules are also contemplated as being equivalent to those peptides and DNA molecules that are set forth in more detail, as will be appreciated by those skilled in the art. For example, it is reasonable to expect that an isolated replacement of a leucine with an isoleucine or valine, an aspartate with a glutamate, a threonine with a serine, or a similar replacement of an amino acid with a structurally related amino acid will not have a major effect on the biological activity of the resulting molecule, especially if the replacement does not involve an amino acid at one of the glycosylation or cleavage sites. Conservative replacements are those that take place within a family of amino acids that are related in their side chains. Genetically encoded amino acids are generally divided into four families: (1) acidic = aspartate, glutamate; (2) basic = lysine, arginine, histidine; (3) nonpolar = alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine, tryptophan; and (4) uncharged polar = glycine, asparagine, glutamine, cysteine, serine, threonine, tyrosine. Phenylalanine, tryptophan, and tyrosine are sometimes classified jointly as aromatic amino acids. Whether a change results in a functioning peptide can readily be determined by assessing the ability of the corresponding DNA coding for this peptide to produce this peptide in glycosylated form when introduced into eukaryotic cells. Examples of this process are described later in detail. If attachment of glycosaminoglycan chains occurs, the replacement is immaterial, and the molecule being tested is equivalent to those specifically described above. Peptides in which more than one replacement has taken place can readily be tested in the same manner. The number of replacements is not strictly limited, but 10 or fewer are preferred.
DNA molecules that code for such peptides can readily be determined from the list of codons in Table 2 and are likewise contemplated as being equivalent to the DNA sequence of Table 1. In fact, since there is a fixed relationship between DNA codons and amino acids in a peptide, any discussion in this application of a replacement or other change in a peptide is equally applicable to the corresponding DNA sequence or to the DNA molecule, recombinant vector, transformed microorganism, or transfected eukaryotic cells in which the sequence is located (and vice versa). Codons can be chosen for use in a particular host organism in accordance with the frequency with which a particular codon is utilized by that host, if desired, to increase the rate at which expression of the peptide occurs. In addition to the specific nucleotides listed in Table 1, DNA (or corresponding RNA) molecules of the invention can have additional nucleotides preceding or following those that are specifically listed. For example, poly A can be added to the 3'-terminal, a short (e.g., fewer than 20 nucleotides) sequence can be added to either terminal to provide a terminal sequence corresponding to a restriction endonuclease site, stop codons can follow the peptide sequence to terminate translation, and the like. Additionally, DNA molecules containing a promoter region or other control region upstream from the gene can be produced. All DNA molecules containing the sequences of the invention will be useful for at least one purpose since all can minimally be fragmented to produce oligonucleotide probes and be used in the isolation of additional DNA from biological sources. RNA molecules are said to correspond to DNA molecules if they encode the same amino acids and/or control sequences.
Peptides of the invention can be prepared for the first time as purified preparations, either by direct synthesis or by using a cloned gene as described herein. By "purified" is meant, when referring to a peptide or DNA or RNA sequence, that the indicated molecule is present in the substantial absence of other biological macromolecules of the same type. The term "purified" as used herein preferably means at least 95% by weight, more preferably at least 99% by weight, and most preferably at least 99.8% by weight, of biological macromolecules of the same type present (but water, buffers, and other small molecules, especially molecules having a molecular weight of less than 1000, can be present). The term "pure" as used herein preferably has the same numerical limits as "purified" immediately above. The term "isolated" as used herein refers to a peptide, DNA, or RNA molecule separated not only from other peptides, DNAs, or RNAs, respectively, that are present in the natural source of the macromolecule but also from other macromolecules and preferably refers to a macromolecule found in the presence of (if anything) only a solvent, buffer, ion or other low molecular weight component normally present in a solution of the same. "Isolated" and "purified" do not encompass either natural materials in their native state or natural materials that have been separated into components (e.g., in an acrylamide gel) but not obtained either as pure substances or as solutions.
Two protein sequences (or peptides derived from them of at least 30 amino acids in length) are homologous (as this term is preferably used in this specification) if they have an alignment score of >5 (in standard deviation units) using the program ALIGN with the mutation data matrix and a gap penalty of 6 (or greater). See Dayhoff, M.O., in Atlas of Protein Sequence and Structure, 1972, volume 5, National Biomedical Research Foundation, pp. 101-110, and Supplement 2 to this volume, pp. 1-10. The two sequences (or parts thereof, probably at least 30 amino acids in length) are more preferably homologous if their amino acids are greater than or equal to 50% identical when optimally aligned using the ALIGN program mentioned above. Two DNA sequences (or a DNA and RNA sequence) are homologous if they hybridize to one another using nitrocellulose filter hybridization (one sequence bound to the filter, the other as a 32P-labeled probe) using hybridization conditions of 40-50% formamide, 37°-42° C, 4x SSC and wash conditions (after several room temperature washes with 2x SSC, 0.05% SDS) of stringency equivalent to 37° C with 1x SSC, 0.05% SDS. A number of preferred hybridization conditions are set forth in the examples that follow.
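Once two sequences have been aligned, the 50% identity criterion reduces to a simple count. The following sketch is illustrative only: it does not reimplement the ALIGN program, it assumes the two sequences are supplied already aligned with "-" marking gaps, and it assumes that gapped columns are excluded from the denominator, a convention chosen here for the example rather than one stated in this specification.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical residues over the ungapped columns of an alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Keep only columns in which both sequences have a residue.
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b)
             if a != gap and b != gap]
    if not pairs:
        return 0.0
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)


if __name__ == "__main__":
    # Two motif instances from Table 1 (residues 35-39 and 205-209).
    print(percent_identity("DGSGD", "EGSGE"))
```

A different treatment of gaps changes the percentage, so the convention should be stated whenever a figure is reported.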
The phrase "replaced by" or "replacement" as used herein does not necessarily refer to any action that must take place but to the peptide that exists when an indicated "replacement" amino acid is present in the same position as the amino acid indicated to be present in a different formula (e.g., when leucine is present at position 5 instead of isoleucine). Salts of any of the macromolecules described herein will naturally occur when such molecules are present in (or isolated from) aqueous solutions of various pHs. All salts of peptides and other macromolecules having the indicated biological activity are considered to be within the scope of the present invention. Examples include alkali, alkaline earth, and other metal salts of carboxylic acid residues, acid addition salts (e.g., HCl) of amino residues, and zwitterions formed by reactions between carboxylic acid and amino residues within the same molecule.
Hydrophobic and hydrophilic regions can be determined by standard procedures from amino acid sequences, for example by plotting hydrophobicity according to the procedure of Kyte and Doolittle, J. Mol. Biol. (1982) 157: 105-132. Plotted values averaged over groups of seven contiguous residues that are positive indicate hydrophobic regions, while negative values indicate hydrophilic regions.
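The seven-residue averaging described above can be sketched as follows, using the published Kyte and Doolittle hydropathy values. The code is an illustration in Python with hypothetical names (KD, window_means); the segment analyzed is residues 251 to 288 of Table 1, chosen here only because it spans the junction between a hydrophobic stretch and the charged residues that follow it.

```python
# Kyte-Doolittle hydropathy index for the 20 genetically encoded amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_means(peptide, window=7):
    """Mean hydropathy of every run of `window` contiguous residues."""
    scores = [KD[residue] for residue in peptide]
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]


if __name__ == "__main__":
    segment = "KEVLGGVIAGGLVGLIFAVCLVAFMLYRMKKKDEGSYS"  # residues 251-288
    for offset, mean in enumerate(window_means(segment)):
        centre = 251 + offset + 3          # residue at the middle of the window
        label = "hydrophobic" if mean > 0 else "hydrophilic"
        print(centre, round(mean, 2), label)
```

Windows lying within the stretch L-I-F-A-V-C-L give strongly positive means, while windows over R-M-K-K-K-D-E give negative means, in keeping with a hydrophobic transmembrane region followed by a hydrophilic cytoplasmic region.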
The invention has specifically contemplated each and every possible variation of peptide or nucleotide that could be made by selecting combinations based on the possible amino acid and codon choices listed in Table 1 and Table 2, and all such variations are to be considered as being specifically disclosed.
In a preferred embodiment of the invention, genetic information encoded as mRNA is obtained from cultured epithelial cells, preferably from mammalian sources, and used in the construction of a DNA gene, which is in turn used to produce a peptide of the invention. An initial crude cell suspension is sonicated or otherwise treated to disrupt cell membranes so that a crude cell extract is obtained. Known techniques of biochemistry (e.g., preferential precipitation of proteins) can be used for initial purification if desired. The crude cell extract, or a partially purified RNA portion therefrom, is then treated to further separate the RNA. For example, crude cell extract can be layered on top of a 5 ml cushion of 5.7 M CsCl, 10 mM Tris-HCl, pH 7.5, 1 mM EDTA in a 1 in. x 3 1/2 in. nitrocellulose tube and centrifuged in an SW27 rotor (Beckman Instruments Corp., Fullerton, Calif.) at 27,000 rpm for 16 hrs at 15°C. After centrifugation, the tube contents are decanted, the tube is drained, and the bottom 1/2 cm containing the clear RNA pellet is cut off with a razor blade. The pellets are transferred to a flask and dissolved in 20 ml 10 mM Tris-HCl, pH 7.5, 1 mM EDTA, 5% sarcosyl and 5% phenol. The solution is then made 0.1 M in NaCl and shaken with 40 ml of a 1:1 phenol:chloroform mixture. RNA is precipitated from the aqueous phase with ethanol in the presence of 0.2 M Na-acetate pH 5.5 and collected by centrifugation. Any other method of isolating RNA from a cellular source may be used instead of this method.
Various forms of RNA may be employed such as polyadenylated, crude or partially purified messenger RNA, which may be heterogeneous in sequence and in molecular size. The selectivity of the RNA isolation procedure is enhanced by any method which results in an enrichment of the desired mRNA in the heterodisperse population of mRNA isolated. Any such prepurification method may be employed in preparing a gene of the present invention, provided that the method does not introduce endonucleolytic cleavage of the mRNA.
Prepurification to enrich for desired mRNA sequences may also be carried out using conventional methods for fractionating RNA, after its isolation from the cell. Any technique which does not result in degradation of the RNA may be employed. The techniques of preparative sedimentation in a sucrose gradient and gel electrophoresis are especially suitable.
The mRNA must be isolated from the source cells under conditions which preclude degradation of the mRNA. The action of RNase enzymes is particularly to be avoided because these enzymes are capable of hydrolytic cleavage of the RNA nucleotide sequence. A suitable method for inhibiting RNase during extraction from cells involves the use of 4 M guanidinium thiocyanate and 1 M mercaptoethanol during the cell disruption step. In addition, a low temperature and a pH near 5.0 are helpful in further reducing RNase degradation of the isolated RNA.
Generally, mRNA is prepared essentially free of contaminating protein, DNA, polysaccharides and lipids. Standard methods are well known in the art for accomplishing such purification. RNA thus isolated contains non-messenger as well as messenger RNA. A convenient method for separating the mRNA of eukaryotes is chromatography on columns of oligo-dT cellulose, or other oligonucleotide-substituted column material such as poly-U or poly-T Sepharose, taking advantage of the hydrogen bonding specificity conferred by the presence of polyadenylic acid on the 3' end of eukaryotic mRNA. Hybridization with oligonucleotide probes prepared from DNA sequences set forth in this specification can then be used to isolate the particularly desired mRNA.
The next step in most methods is the formation of DNA complementary to the isolated heterogeneous sequences of mRNA. The enzyme of choice for this reaction is reverse transcriptase, although in principle any enzyme capable of forming a faithful complementary DNA copy of the mRNA template could be used. The reaction may be carried out under conditions described in the prior art, using mRNA as a template and a mixture of the four deoxynucleoside triphosphates, dATP, dGTP, dCTP, and dTTP, as precursors for the DNA strand. It is convenient to provide that one of the deoxynucleoside triphosphates be labeled with a radioisotope, for example 32P in the alpha position, in order to monitor the course of the reaction, to provide a tag for recovering the product after separation procedures such as chromatography and electrophoresis, and for the purpose of making quantitative estimates of recovery. The cDNA transcripts produced by the reverse transcriptase reaction are somewhat heterogeneous with respect to sequences at the 5' end and the 3' end due to variations in the initiation and termination points of individual transcripts, relative to the mRNA template. The variability at the 5' end is thought to be due to the fact that the oligo-dT primer used to initiate synthesis is capable of binding at a variety of loci along the polyadenylated region of the mRNA. Synthesis of the cDNA transcript begins at an indeterminate point in the poly-A region, and a variable length of the poly-A region is transcribed depending on the initial binding site of the oligo-dT primer. It is possible to avoid this indeterminacy by the use of a primer containing, in addition to an oligo-dT tract, one or two nucleotides of the RNA sequence itself, thereby producing a primer which will have a preferred and defined binding site for initiating the transcription reaction.
The indeterminacy at the 3'-end of the cDNA transcript is due to a variety of factors affecting the reverse transcriptase reaction, and to the possibility of partial degradation of the RNA template. The isolation of specific cDNA transcripts of maximal length is greatly facilitated if conditions for the reverse transcriptase reaction are chosen which not only favor full length synthesis but also repress the synthesis of small DNA chains. Preferred reaction conditions for avian myeloblastosis virus reverse transcriptase are given in the examples section of U.S. Patent 4,363,877 and are herein incorporated by reference. The specific parameters which may be varied to provide maximal production of long-chain DNA transcripts of high fidelity are reaction temperature, salt concentration, amount of enzyme, concentration of primer relative to template, and reaction time. The conditions of temperature and salt concentration are chosen so as to optimize specific base-pairing between the oligo-dT primer and the polyadenylated portion of the RNA template. Under properly chosen conditions, the primer will be able to bind at the polyadenylated region of the RNA template, but non-specific initiation due to primer binding at other locations on the template, such as short, A-rich sequences, will be substantially prevented. The effects of temperature and salt are interdependent. Higher temperatures and low salt concentrations decrease the stability of specific base-pairing interactions. The reaction time is kept as short as possible, in order to prevent non-specific initiations and to minimize the opportunity for degradation. Reaction times are interrelated with temperature, lower temperatures requiring longer reaction times. At 42°C, reactions ranging from 1 minute to 10 minutes are suitable. The primer should be present in 50 to 500-fold molar excess over the RNA template and the enzyme should be present in similar molar excess over the RNA template. The use of excess enzyme and primer enhances initiation and cDNA chain growth so that long-chain cDNA transcripts are produced efficiently within the confines of the short incubation times.
In many cases, it will be possible to further purify the cDNA using single-stranded cDNA sequences transcribed from mRNA. However, as discussed below, there may be instances in which the desired restriction enzyme is one which acts only on double-stranded DNA. In these cases, the cDNA prepared as described above may be used as a template for the synthesis of double-stranded DNA, using a DNA polymerase such as reverse transcriptase and a nuclease capable of hydrolyzing single-stranded DNA. Methods for preparing double-stranded DNA in this manner have been described in the prior art. See, for example, Ullrich, A., Shine, J., Chirgwin, J., Pictet, R., Tischer, E., Rutter, W.J. and Goodman, H.M., Science (1977) 196:1313. If desired, the cDNA can be purified further by the process of U.S. Patent 4,363,877, although this is not essential. In this method, heterogeneous cDNA, prepared by transcription of heterogeneous mRNA sequences, is treated with one or two restriction endonucleases. The choice of endonuclease to be used depends in the first instance upon a prior determination that recognition sites for the enzyme exist in the sequence of the cDNA to be isolated. The method depends upon the existence of two such sites. If the sites are identical, a single enzyme will be sufficient. The desired sequence will be cleaved at both sites, eliminating size heterogeneity as far as the desired cDNA sequence is concerned, and creating a population of molecules, termed fragments, containing the desired sequence and homogeneous in length. If the restriction sites are different, two enzymes will be required in order to
produce the desired homogeneous length fragments.
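The way two flanking recognition sites remove size heterogeneity can be illustrated with a minimal sketch. The transcript sequences below are invented for illustration and are not taken from this specification; the recognition sequence GGCC with cleavage between GG and CC corresponds to HaeIII, and the function names `digest` and `internal_fragments` are hypothetical.

```python
import re


def digest(seq, site, cut_offset):
    """Cut seq at every occurrence of site.

    cut_offset is the position inside the recognition site at which
    the strand is cleaved (2 for GG^CC).
    """
    pattern = "(?=" + re.escape(site) + ")"
    cuts = [m.start() + cut_offset for m in re.finditer(pattern, seq)]
    bounds = [0] + cuts + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


def internal_fragments(seq, site, cut_offset):
    """Return only fragments bounded on both sides by a cut site."""
    return digest(seq, site, cut_offset)[1:-1]


# Hypothetical transcripts: same site-flanked core, heterogeneous ends.
core = "GGCC" + "ATGAAATTTCACTGA" + "GGCC"
transcripts = ["AAAA" + core + "TTTT", "A" + core + "TTTTTTTTT", core + "TT"]

fragments = [internal_fragments(t, "GGCC", 2) for t in transcripts]
lengths = {len(f) for frags in fragments for f in frags}
```

Although the three hypothetical transcripts differ in total length, each yields the same internal fragment, which is why the desired sequence is expected to migrate as a single discrete band.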
The choice of restriction enzyme(s) capable of producing an optimal length nucleotide sequence fragment coding for all or part of the desired protein must be made empirically. If the amino acid sequence of the desired protein is known, it is possible to compare the nucleotide sequence of uniform length nucleotide fragments produced by restriction endonuclease cleavage with the amino acid sequence for which it codes, using the known relationship of the genetic code common to all forms of life. A complete amino acid sequence for the desired protein is not necessary, however, since a reasonably accurate identification may be made on the basis of a partial sequence. Where the amino acid sequence of the desired protein is not known, the uniform length polynucleotides produced by restriction endonuclease cleavage may be used as probes capable of identifying the synthesis of the desired protein in an appropriate in vitro protein synthesizing system. Alternatively, the mRNA may be purified by affinity chromatography. Other techniques which may be suggested to those skilled in the art will be appropriate for this purpose.
The number of restriction enzymes suitable for use depends upon whether single-stranded or double-stranded cDNA is used. The preferred enzymes are those capable of acting on single-stranded DNA, which is the immediate reaction product of mRNA reverse transcription. The number of restriction enzymes now known to be capable of acting on single-stranded DNA is limited. The enzymes HaeIII, HhaI and Hin(f)I are presently known to be suitable. In addition, the enzyme MboII may act on single-stranded DNA. Where further study reveals that other restriction enzymes can act on single-stranded DNA, such other enzymes may appropriately be included in the list of preferred enzymes. Additional suitable enzymes include those specified for double-stranded cDNA. Such enzymes are not preferred since additional reactions are required in order to produce double-stranded cDNA, providing increased opportunities for the loss of longer sequences and for other losses due to incomplete recovery. The use of double-stranded cDNA presents the additional technical disadvantage that subsequent sequence analysis is more complex and laborious. For these reasons, single-stranded cDNA is preferred, but the use of double-stranded DNA is feasible. In fact, the present invention was initially reduced to practice using double-stranded cDNA.
The cDNA prepared for restriction endonuclease treatment may be radioactively labeled so that it may be detected after subsequent separation steps. A preferred technique is to incorporate a radioactive label such as 32P in the alpha position of one of the four deoxynucleoside triphosphate precursors. Highest activity is obtained when the concentration of radioactive precursor is high relative to the concentration of the non-radioactive form. However, the total concentration of any deoxynucleoside triphosphate should be greater than 30 μM, in order to maximize the length of cDNA obtained in the reverse transcriptase reaction. See Efstratiadis, A., Maniatis, T., Kafatos, F.C., Jeffrey, A., and Vournakis, J.N., Cell, (1975) :367. For the purpose of determining the nucleotide sequence of cDNA, the 5' ends may be conveniently labeled with 32P in a reaction catalyzed by the enzyme polynucleotide kinase. See
Maxam, A.M. and Gilbert, W., Proc. Natl. Acad. Sci. USA (1977) 74:560.
Fragments which have been produced by the action of a restriction enzyme or combination of two restriction enzymes may be separated from each other and from heterodisperse sequences lacking recognition sites by any appropriate technique capable of
separating polynucleotides on the basis of differences in length. Such methods include a variety of electrophoretic techniques and sedimentation techniques using an ultracentrifuge. Gel electrophoresis is preferred because it provides the best resolution on the basis of polynucleotide length. In addition, the method readily permits quantitative recovery of separated materials. Convenient gel electrophoresis methods have been described by Dingman, C.W., and Peacock, A.C., Biochemistry (1968) 1_: 659 , and by Maniatis, T., Jeffrey, A. and van de Sande, H., Biochemistry (1975) 1 :3787.
Prior to restriction endonuclease treatment, cDNA transcripts obtained from most sources will be found to be heterodisperse in length. By the action of a properly chosen restriction endonuclease, or pair of endonucleases, polynucleotide chains containing the desired sequence will be cleaved at the respective restriction sites to yield polynucleotide fragments of uniform length. Upon gel electrophoresis, these will be observed to form a distinct band. Depending on the presence or absence of restriction sites on other sequences, other discrete bands may be formed as well, which will most likely be of different length than that of the desired sequence. Therefore, as a consequence of restriction endonuclease action, the gel electrophoresis pattern will reveal the appearance of one or more discrete bands, while the remainder of the cDNA will continue to be heterodisperse. In the case where the desired cDNA sequence comprises the major polynucleotide species present, the electrophoresis pattern will reveal that most of the cDNA is present in the discrete band.
Although it is unlikely that two different sequences will be cleaved by restriction enzymes to yield fragments of essentially similar length, a method for determining the purity of the defined length
fragments is desirable. Sequence analysis of the electrophoresis band may be used to detect impurities representing 10% or more of the material in the band. A method for detecting lower levels of impurities has been developed founded upon the same general principles applied in the initial isolation method. The method requires that the desired nucleotide sequence fragment contain a recognition site for a restriction endonuclease not employed in the initial isolation. Treatment of polynucleotide material, eluted from a gel electrophoresis band, with a restriction endonuclease capable of acting internally upon the desired sequence will result in cleavage of the desired sequence into two sub-fragments, most probably of unequal length. These sub-fragments upon electrophoresis will form two discrete bands at positions corresponding to their respective lengths, the sum of which will equal the length of the polynucleotide prior to cleavage. Contaminants in the original band that are not susceptible to the restriction enzyme may be expected to migrate to the original position. Contaminants containing one or more recognition sites for the enzyme may be expected to yield two or more sub-fragments. Since the distribution of recognition sites is believed to be essentially random, the probability that a contaminant will also yield sub-fragments of the same size as those of the fragment of desired sequence is extremely low. The amount of material present in any band of radioactively labeled polynucleotide can be determined by quantitative measurement of the amount of radioactivity present in each band, or by any other appropriate method. A quantitative measure of the purity of the fragments of desired sequence can be obtained by comparing the relative amounts of material present in those bands representing sub-fragments of the desired sequence with the total amount of material.
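The quantitative purity measure described above reduces to a simple ratio of radioactivity in the sub-fragment bands to total radioactivity recovered. The following is a minimal sketch under the assumption that label is incorporated uniformly, so that counts are proportional to the amount of material; the band names and counts are invented for illustration, and `fragment_purity` is a hypothetical helper.

```python
def fragment_purity(band_counts, subfragment_bands):
    """Fraction of total radioactivity found in the expected sub-fragment bands.

    band_counts: mapping of band name -> measured counts (e.g. cpm).
    subfragment_bands: names of the bands that correspond to the
    sub-fragments of the desired sequence.
    """
    total = sum(band_counts.values())
    if total <= 0:
        raise ValueError("total radioactivity must be positive")
    desired = sum(band_counts[name] for name in subfragment_bands)
    return desired / total


# Hypothetical counts after the second, internal restriction digest.
counts = {"sub_a": 6000, "sub_b": 3500, "uncut_position": 400, "other": 100}
purity = fragment_purity(counts, ["sub_a", "sub_b"])
```

With these made-up counts, material remaining at the original position or appearing in unexpected bands is treated as contaminant, and the remainder is attributed to the desired sequence.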
Following the foregoing separation or any other technique that isolates the desired gene, the sequence may be reconstituted. The enzyme DNA ligase, which catalyzes the end-to-end joining of DNA fragments, may be employed for this purpose. The gel electrophoresis bands representing the sub-fragments of the desired sequence may be separately eluted and combined in the presence of DNA ligase, under the appropriate conditions. See Sgaramella, V., Van de Sande, J.H., and Khorana, H.G., Proc. Natl. Acad. Sci. USA (1970) £7:1468. Where the sequences to be joined are not blunt-ended, the ligase obtained from E. coli may be used; Modrich, P., and Lehman, I.R., J. Biol. Chem. (1970) 245:3626. The efficiency of reconstituting the original sequence from sub-fragments produced by restriction endonuclease treatment will be greatly enhanced by the use of a method for preventing reconstitution in improper sequence. This unwanted result is prevented by treatment of the homogeneous length cDNA fragment of desired sequence with an agent capable of removing the 5'-terminal phosphate groups on the cDNA prior to cleavage of the homogeneous cDNA with a restriction endonuclease. The enzyme alkaline phosphatase is preferred. The 5'-terminal phosphate groups are a structural prerequisite for the subsequent joining action of DNA ligase used for reconstituting the cleaved sub-fragments. Therefore, ends which lack a 5'-terminal phosphate cannot be covalently joined. The DNA sub-fragments can only be joined at the ends containing a 5'-phosphate generated by the restriction endonuclease cleavage performed on the isolated DNA fragment.
The majority of cDNA transcripts, under the conditions described above, are derived from the mRNA region containing the 5'-end of the mRNA template by specifically priming on the same template with a
fragment obtained by restriction endonuclease cleavage. In this way, the above-described method may be used to obtain not only fragments of specific nucleotide sequence related to a desired protein, but also the entire nucleotide sequence coding for the protein of interest. Double-stranded, chemically synthesized oligonucleotide linkers, containing the recognition sequence for a restriction endonuclease, may be attached to the ends of the isolated cDNA, to facilitate subsequent enzymatic removal of the gene portion from the vector DNA. See Scheller et al., Science (1977) 196:177. The vector DNA is converted from a continuous loop to a linear form by treatment with an appropriate restriction endonuclease. The ends thereby formed are treated with alkaline phosphatase to remove 5'-phosphate end groups so that the vector DNA may not reform a continuous loop in a DNA ligase reaction without first incorporating a segment of the syndecan DNA. The cDNA, with attached linker oligonucleotides, and the treated vector DNA are mixed together with a DNA ligase enzyme, to join the cDNA to the vector DNA, forming a continuous loop of recombinant vector DNA, having the cDNA incorporated therein. Where a plasmid vector is used, usually the closed loop will be the only form able to transform a bacterium. Transformation, as is understood in the art and used herein, is the term used to denote the process whereby a microorganism incorporates extracellular DNA and reproduces it stably from generation to generation. Plasmid DNA in the form of a closed loop may be so incorporated under appropriate environmental conditions. The incorporated closed loop plasmid undergoes replication in the transformed cell, and the replicated copies are distributed to progeny cells when cell division occurs. As a result, a new cell line is established, containing the plasmid and carrying the genetic determinants thereof. Transformation by a
plasmid in this manner, where the plasmid genes are maintained in the cell line by plasmid replication, occurs at high frequency when the transforming plasmid DNA is in closed loop form, and does not or rarely occurs if linear plasmid DNA is used. Once a recombinant vector has been made, transformation of a suitable microorganism is a straightforward process, and novel microorganism strains containing the syndecan gene or a related gene may readily be isolated, using appropriate selection techniques as is understood in the art.
Using these general techniques specifically as set forth in the following examples, we have isolated cDNA clones encoding the syndecan polypeptide from a normal mouse mammary gland epithelial cell line as well as mouse liver tissue. The cDNA derived protein sequence of syndecan is unique; comparisons with the National Biomedical Research Foundation and the translated NIH-GenBank databases detected no statistically significant similarities. The nascent polypeptide sequence is 311 amino acids and has a molecular mass of 32,868 daltons. Treatment of syndecan with heparatinase and chondroitinase ABC generates a protein with relative mobility of ca. 69k daltons versus globular molecular weight markers on a gradient SDS-PAGE system. Treatment of the ectodomain with anhydrous HF for 1.5 hrs at 0°C, Mort, A.J. and Lamport, D.T.A., Anal. Biochem. (1977) 82: 289-309, yields a protein that migrates as a broad band at ca. 46k daltons, Weitzhandler, M., Streeter, H.B., Henzel, W.J., and Bernfield, M., J. Biol. Chem. (1988) 263: 6949-6952. These core protein sizes as measured by SDS-PAGE are larger than would be predicted based on the cDNA and any incompletely removed carbohydrate. This anomaly appears to be a charge effect and has been seen in other proteins rich in proline, alanine, and highly charged amino acids. Syndecan is not a disulfide cross-linked dimer. Its migration on SDS-PAGE is unchanged following DTT treatment; its CNBr-cleavage product produces a single signal during amino acid sequencing; and its single cysteine in the predicted mature protein is located in the putative transmembrane domain. It also does not appear to be cross-linked by lysyl oxidase- or transglutaminase-mediated reactions because β-aminopropionitrile and monodansylcadaverine treatments of NMuMG cells do not change its mobility on SDS-PAGE. Proteins with regions rich in proline, alanine and highly charged amino acids have highly extended conformations and anomalously slow mobilities in SDS-PAGE, Guest, J.R., Lewis, H.M., Graham, L.D., Packman, L.C., and Perham, R.N., J. Mol. Biol. (1985) 185: 743-754. These amino acids are abundant in syndecan, and a Chou and Fasman secondary structure prediction is consistent with large regions of extended conformation. In vitro translation of synthetic mRNA corresponding to the coding region of syndecan (SacI-HindIII fragment of clone 4-19b) produces a nascent polypeptide of ca. 45k daltons. Therefore, while we have not excluded the possibility of other post-translational modifications, the bulk of the size difference probably reflects anomalous gel migration on SDS-PAGE. The amino acid sequence derived from the syndecan cDNA shows three functional domains: an extracellular domain and, by inference, transmembrane and cytoplasmic domains.
The transmembrane domain was inferred from the physical properties of syndecan. The derived C-terminal sequence of syndecan contains both a characteristic transmembrane domain (amino acids 253 to 277 in Table 1) and a 34 amino acid putative cytoplasmic domain. The cytoplasmic domain was inferred from properties already known for purified syndecan indicating that syndecan associates with the actin cytoskeleton. An immune serum generated against
a synthetic peptide from the C-terminus of the derived protein sequence reacts with native syndecan extracted from NMuMG cells but not with the ectodomain, providing direct evidence for the cytoplasmic domain. The ectodomain of syndecan is released from
NMuMG cell surfaces during cell culture, rapidly in response to cell rounding, or by mild trypsin treatment. The putative extracellular domain of syndecan contains a single dibasic site near the plasma membrane at which cleavage of syndecan from the cell surface undoubtedly occurs. Because the endogenously shed ectodomain of syndecan is indistinguishable from the trypsin-released form, a cell surface trypsin-like protease has been proposed. Shedding during cell culture is from the apical surface. However, when these cells are released from the substratum, destroying their polarity, the ectodomain is rapidly shed. These previously known results suggest that a cell surface protease is involved, but the structure of the site was not known. Identification of the putative cleavage site by the present invention will now allow more detailed investigation of this activity and will allow production of modified proteoglycans and other proteins that can be readily cleaved to release their extracellular regions for ready purification.
Syndecan isolated from several sources is a hybrid proteoglycan, containing both chondroitin sulfate and heparan sulfate. These chains are known to be linked via a xyloside to serine residues in proteins, Roden, L., The Biochemistry of Glycoproteins and Proteoglycans (1980) 267-371 and Dorfman, A., Cell Biology of Extracellular Matrix (1981) 115-138. Regulating the elaboration of both chondroitin sulfate and heparan sulfate chains on the same core protein is a significant problem because the initial four saccharides are identical. The synthesis of both types of chains is initiated by a xylosyltransferase that resides in either the endoplasmic reticulum or the Golgi, see Farquhar, M.G., Ann. Rev. Cell Biol. (1985) 1: 447-488, and by three Golgi-localized glycosyltransferases, Geetha-Habib, M., Campbell, S.C., Schwartz, N.B., J. Biol. Chem. (1984) 259: 7300-7310. Specific chain elongation subsequently involves the sequential action of an N-acetylgalactosaminyltransferase and a glucuronosyltransferase for chondroitin sulfate, and an N-acetylglucosaminyltransferase and a glucuronosyltransferase for heparan sulfate. This specific chain elongation must involve recognition of unique structural features of the core protein, indicating that distinct peptide sequences might exist at chondroitin sulfate versus heparan sulfate attachment sites. The presence of both chondroitin sulfate and heparan sulfate on syndecan provides the opportunity to assess the relationship between these attachment sites. Based on the core protein sequence of three chondroitin sulfate proteoglycans, PG-19, PG-40, and invariant chain, and the reactivity of a xylosyltransferase with synthetic peptides, Bourdon, M.A., Krusius, T., Campbell, S., Schwartz, N.B., and Ruoslahti, E., Proc. Natl. Acad. Sci. USA (1987) 84: 3194-3198, proposed that the xylose acceptor sequence for chondroitin sulfate in these proteoglycans is acidic-acidic-Xaa-Ser-Gly-Xaa-Gly. Syndecan contains five Ser-Gly sequences; the two in its single Ser-Gly-Ser-Gly repeat closely match this previously proposed acceptor sequence (Figure 3A). Interestingly, although this consensus acceptor sequence is located near the N-terminus of syndecan and near the C-terminus of invariant chain, it is distant from the plasma membrane on both proteins.
Syndecan contains three potential ser-gly glycosaminoglycan attachment sites that contain some features of this consensus acceptor sequence but also contain unique features (Figure 3B). Though each of
these three sequences retains an acidic amino acid two residues N-terminal to the acceptor Ser-Gly, they lack the consensus glycine that is two residues C-terminal to the Ser-Gly. This omission does not preclude this sequence from serving as a xylosyltransferase acceptor because it is also omitted from the Gly-Ser site of type IX collagen, Huber, S., Winterhalter, K.H., and Vaughan, L., J. Biol. Chem. (1988) 263: 752-756. The unique feature of these three sequences is the consistent finding of an acidic amino acid C-terminal to the Ser-Gly (Figure 3B). In contrast, the analogous amino acids in the chondroitin sulfate proteoglycans PG-19, PG-40, and invariant chain are either uncharged or hydrophobic. These three Xac-Xaa-Ser-Gly-Xac sites in syndecan appear to represent unique recognition sequences for the elongation of glycosaminoglycan chains, especially heparan sulfate chains. An artificial peptide containing a heparan sulfate elongation site of the formula Xac-Xaa-Ser-Gly-Xac, where Xac is an acidic amino acid (aspartate or glutamate) and Xaa is any amino acid, can be prepared and used to produce heparan sulfate in eukaryotic cells as described herein. The artificial peptide need not contain any of the remaining structure of the molecules described herein as long as it provides the indicated sequence at a location in the peptide that is available for glycosylation. Such locations can be predicted, such as by using the algorithms developed by Chou and Fasman, or determined empirically by inserting a DNA sequence encoding this amino acid sequence into a gene and determining that the product functions as a recognition sequence for the elongation of heparan sulfate chains. A simple artificial peptide, for example, might contain multiple copies of the recognition sequence either located directly adjacent to each other or joined by from one to ten, preferably one to five, amino acids. Another preferred embodiment involves producing a known polypeptide by genetic engineering that has been engineered to contain the attachment site of the invention at a location known to reside on an external surface of the polypeptide. On the other hand, although sequences from the natural syndecan amino acid sequences adjacent to the Xac-Xaa-Ser-Gly-Xac sequences are not required, they may be retained if desired in order to produce a protein that more closely resembles syndecan. Accordingly, artificial peptides containing from 1 to 10, 20, 30, or even more naturally adjacent amino acids as shown in Table 1, located either C-terminal or N-terminal or both to the Xac-Xaa-Ser-Gly-Xac sequence, represent other viable embodiments of the invention. Proteins containing such longer sequences can be prepared in the same manner discussed above using corresponding longer DNA sequences encoding the desired region.
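The Xac-Xaa-Ser-Gly-Xac formula can be expressed as a simple pattern over one-letter amino acid codes, with Xac as D or E and Xaa as any residue. The sketch below is illustrative only: the peptide sequences are invented rather than taken from Table 1, and `find_hs_sites` and `build_tandem_peptide` are hypothetical helper names.

```python
import re

# Xac = D or E; Xaa = any residue. A lookahead is used so that
# overlapping candidate sites are all reported.
HS_SITE = re.compile(r"(?=([DE][A-Z]SG[DE]))")


def find_hs_sites(protein):
    """Return (1-based start, matched pentapeptide) for each candidate site."""
    return [(m.start() + 1, m.group(1)) for m in HS_SITE.finditer(protein.upper())]


def build_tandem_peptide(site, copies, spacer=""):
    """Join copies of a recognition sequence, optionally separated by a spacer.

    The spacer is limited to ten residues, following the range given in the text.
    """
    if len(find_hs_sites(site)) != 1 or len(site) != 5:
        raise ValueError("site must be a single Xac-Xaa-Ser-Gly-Xac pentapeptide")
    if len(spacer) > 10:
        raise ValueError("spacer longer than ten residues")
    return spacer.join([site] * copies)


# Hypothetical examples, not sequences from this specification.
sites = find_hs_sites("MKDASGELLEPSGDAA")
tandem = build_tandem_peptide("DASGE", 3, spacer="AG")
```

A pattern match only identifies candidate sites; whether a given site is actually available for glycosylation still has to be predicted or tested as described above.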
The number of chondroitin sulfate chains on syndecan apparently differs in cells of distinct cellular organization and changes in response to TGF-β, implying that each potential glycosaminoglycan attachment site is not always utilized. A possible novel regulatory mechanism for this variation is suggested by the location in syndecan of its single potential N-linked glycosylation site, Asn-Phe-Ser, at residues 43-45. This site is located within the putative chondroitin sulfate attachment sequence, and the attachment of an N-linked sugar at this site would likely prevent subsequent recognition by the xylosyltransferase.
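The overlap described above, in which the serine of an N-linked glycosylation sequon is also the serine of a Ser-Gly attachment site, can be located by a short scan. This is a sketch under stated assumptions: it uses the commonly cited Asn-Xaa-Ser/Thr sequon with Xaa other than proline, the test sequences are invented, and `sequons_overlapping_ser_gly` is a hypothetical name.

```python
import re

# N-linked sequon: Asn, any residue except Pro, then Ser or Thr.
SEQUON = re.compile(r"(?=(N[^P][ST]))")


def sequons_overlapping_ser_gly(protein):
    """Return (sequon start, shared serine position), both 1-based,
    for sequons whose third residue is a Ser followed by Gly."""
    hits = []
    for m in SEQUON.finditer(protein):
        ser_index = m.start() + 2
        is_ser = protein[ser_index] == "S"
        followed_by_gly = protein[ser_index + 1:ser_index + 2] == "G"
        if is_ser and followed_by_gly:
            hits.append((m.start() + 1, ser_index + 1))
    return hits


# Hypothetical fragment containing Asn-Phe-Ser-Gly.
overlaps = sequons_overlapping_ser_gly("AAEDNFSGSGAA")
```

Any position reported by this scan is a site where the two modifications would compete for the same serine-containing region, which is the situation the paragraph above proposes as a regulatory mechanism.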
Of a wide variety of mature tissues examined with antibody 281-2, syndecan is expressed mainly in epithelia. Northern blot analysis of mRNA revealed two mRNA species at 2.6 and 3.4 kb (constant ratio 3:1, respectively) in NMuMG cells as well as skin, liver, and midpregnant mammary gland, all containing immunoreactive syndecan. In contrast, these two mRNAs were undetectable in cardiac and skeletal muscle, tissues of mesenchymal origin that do not stain with 281-2. However, primitive and embryonic mesenchymal cells also show the 2.6 and 3.4 kb mRNA species. A 4.5 kb mRNA was detected in adult cerebrum, which does not react in fixed tissue sections with the antibody, Hayashi, K., Hayashi, M., Jalkanen, M., Firestone, J.H., Trelstad, R.L., and Bernfield, M., J. Histochem. Cytochem. (1987) 35: 1079-1088. The cDNA sequence reported here corresponds to the smaller (2.6 kb) and more abundant of the two mRNAs. Though the relationship between the 2.6 and 3.4 kb mRNAs is unknown, they are likely generated by usage of alternative polyadenylation sites. Probes from both 5' and 3' regions of the syndecan cDNA hybridized identically to these two mRNAs in Northern blot analysis. Moreover, the primer-extended library contained clones identical to the 5' end of clone 4-19B. The relationship of the 4.5 kb mRNA identified in cerebrum to the others is unknown because clones have not yet been characterized from cerebral cDNA libraries.
Sequence alignments demonstrate similarity at the nucleotide level between the mouse syndecan and human insulin receptor cDNA sequences. The insulin receptor sequence is set forth in Ebina, Y., Ellis, L., Jarnagin, K., Edery, M., Graf, L., Clauser, E., Ou, J., Masiarz, F., Kan, Y.W., Goldfine, I.D., Roth, R.A., and Rutter, W.J., Cell (1985) 40: 747-758. Alignment of these sequences (University of Wisconsin GCG Bestfit program) places the putative start ATGs near the middle of a region of similarity; a 99 bp region of syndecan which spans its 5'-untranslated and initial coding sequences is 67% identical, with four small gaps, to the analogous region of the human insulin receptor (Figure 5A). The location of this similarity and the large size of the 5'-untranslated regions suggest that these sequences are shared translational control elements, as has been described for the 5'-untranslated region of the mRNAs for ferritin, Aziz, N., and Munro, H.N., Proc. Natl. Acad. Sci. USA (1987) 84: 8478-8482 and Casey, J.L., Hentze, M.W., Koeller, D.M., Caughman, S.W., Rouault, T.A., Klausner, R.D., and Harford, J.B., Science (1988) 240: 924-928, and the B polypeptide of platelet-derived growth factor, Ratner, L., Theilan, B., and Collins, T., Nucleic Acids Res. (1987) 15: 6017-6036. There is also a second region of similarity between these cDNAs in their 3'-untranslated regions; a 35 bp T-rich sequence of syndecan is 80% identical (no gaps) with a sequence of the human insulin receptor (Figure 5B). These similar sequences in both the 5'- and 3'-untranslated regions of the mouse syndecan and human insulin receptor mRNAs suggest that post-transcriptional controls are shared by these two molecules.
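Figures such as "67% identical, with four small gaps" are summary statistics of a pairwise alignment. The sketch below shows one common way to compute them from two already-aligned strings; it is not the GCG Bestfit implementation, the convention of scoring identity over ungapped columns is an assumption, and the short aligned sequences are invented.

```python
def alignment_stats(aligned_a, aligned_b, gap="-"):
    """Return (percent identity over ungapped columns, number of gap runs).

    Both inputs must be the same length, with gap characters marking
    insertions or deletions.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    gap_runs = 0
    in_gap = False
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            if not in_gap:
                gap_runs += 1  # count each contiguous gap once
            in_gap = True
            continue
        in_gap = False
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared, gap_runs


# Hypothetical aligned fragments, not the sequences of Figure 5.
identity, gaps = alignment_stats("ACGT-ACGTT", "ACGTAAC-TA")
```

Other conventions, for example dividing by the full alignment length including gapped columns, give somewhat lower percentages for the same alignment, so the convention should be stated whenever such values are compared.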
A number of fine-structure aspects of syndecan can be seen by reference to the DNA and amino acid sequences. Starting at the indicated AUG (Figure 1), the syndecan cDNA codes for a protein of 311 amino acids containing two hydrophobic stretches. The derived sequence suggests several domains and structural features; their presumed arrangement is summarized in Figure 4.
The first hydrophobic stretch consists of 12 amino acids beginning shortly after the presumptive start methionine. Because syndecan is oriented with its N-terminus outside of the plasma membrane, this appears to be a signal sequence. The N-terminus of mature syndecan is blocked, and, therefore, it has not been possible to determine the N-terminus directly. A likely site for signal peptidase cleavage is following amino acid residue 17 (Figure 1) in the predicted sequence. Cleavage at this site would generate an N-terminal glutamine which could readily cyclize, forming a pyrrolidone carboxylyl residue and thus a blocked N-terminus, as exists in a number of eukaryotic proteins.
The second hydrophobic stretch is a sequence near the C-terminus which has characteristics of a transmembrane domain (thick underline, Figure 1). This sequence is a highly hydrophobic stretch of 25 residues followed immediately by a series of highly charged residues, consistent with the stop transfer signals found following most membrane spanning domains. This domain also contains the only cysteine and one of the four tyrosines in the apparent mature protein sequence. The putative transmembrane domain defines two hydrophilic domains of the syndecan core protein, a putative extracellular domain consisting of approximately 235 amino acids, and a smaller putative cytoplasmic domain consisting of 34 amino acids. This orientation with respect to the plasma membrane is confirmed by the reactivity of immune serum directed either against a peptide containing the C-terminal seven amino acids or against the ectodomain of syndecan. The anti-C-terminus immune serum recognizes the hydrophobic native form of syndecan, but is unreactive with the non-hydrophobic ectodomain. In contrast, the anti-ectodomain immune serum recognizes both forms of the molecule.
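Hydrophobic stretches of the kind described above are commonly located with a sliding-window hydropathy average. The sketch below uses the Kyte-Doolittle scale; the window of 19 residues and the threshold of 1.6 are conventional choices for that scale and are assumptions here, not values from this specification. The test sequence is invented and is not the syndecan sequence.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=19):
    """Mean hydropathy of every full window along the sequence."""
    scores = [KD[aa] for aa in seq]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


def hydrophobic_windows(seq, window=19, threshold=1.6):
    """1-based start positions of windows at or above the threshold."""
    profile = hydropathy_profile(seq, window)
    return [i + 1 for i, score in enumerate(profile) if score >= threshold]


# Hypothetical protein: charged flank, 25 hydrophobic residues, charged tail.
toy = "DEKRDEKRDEKR" + "L" * 25 + "RKRKRKDE"
profile = hydropathy_profile(toy)
starts = hydrophobic_windows(toy)
```

In this toy case the windows that lie entirely within the 25-residue hydrophobic run reach the maximum score, and the immediately following charged residues pull the average down sharply, which mirrors the pattern described for the putative transmembrane domain.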
The putative cytoplasmic domain contains three tyrosine residues, but the sequences adjacent to these tyrosines are not similar to the presently identified consensus sequences for tyrosine phosphorylation, Hunter, T., and Cooper, J.A., Ann. Rev. Biochem. (1985) 54: 879-930. This domain presumably has protein binding activity because the intact proteoglycan but not the ectodomain co-sediments with F-actin, Rapraeger, A., and Bernfield, M., Extracellular Matrix (1982) 265-269, and because syndecan associates with the actin-containing cytoskeleton when cross-linked at the cell surface, Rapraeger, A., Jalkanen, M., and Bernfield, M., J. Cell Biol. (1986) 103: 2683-2696.
The putative extracellular domain has several sequence characteristics that correspond with the known properties of this proteoglycan. The ectodomain of syndecan is shed by cleavage from its membrane anchor, Jalkanen, M., Rapraeger, A., Saunders, S., and Bernfield, M., J. Cell Biol. (1987) 105: 3087-3096, and an indistinguishable molecule is released from the cell surface by mild trypsin treatment, Jalkanen, M., Rapraeger, A., Saunders, S., and Bernfield, M., J. Cell Biol. (1987) 105: 3087-3096. The only dibasic sequence (Arg-Lys) in this extracellular domain is located adjacent to the putative transmembrane domain at residues 250-251 (identified in Figure 1 by arrows). This location places the cleavage site adjacent to the plasma membrane. The putative extracellular domain lacks cysteine, thus eliminating disulfide bridges as a means of generating secondary structure in this molecule. The ectodomain contains both heparan sulfate and chondroitin sulfate chains, Rapraeger, A., Jalkanen, M., Endo, E., Koda, J., and Bernfield, M., J. Biol. Chem. (1985b) 260: 11046-11052. The serine hydroxyl groups of Ser-Gly sequences are the attachment sites for these glycosaminoglycan chains, Roden, L., The Biochemistry of Glycoproteins and Proteoglycans, 267-371, and Dorfman, A., Cell Biology of Extracellular Matrix, 115-138. Syndecan possesses five such potential glycosaminoglycan attachment sites, all within the putative extracellular domain; three such serines are clustered near the N-terminus at residues 37, 45, and 47, and the remaining two are clustered near the membrane at residues 207 and 217 (open circles, Figure 1). The ectodomain from NMuMG cells is insensitive to digestion by N-glycosidase F, as assessed by PAGE, Weitzhandler, M., Streeter, H.B., Henzel, W.J., and Bernfield, M., J. Biol. Chem. (1988) 263: 6949-6952. The putative extracellular domain contains a single canonical sequence for the attachment of N-linked oligosaccharide (solid circle, Figure 1). The serine in this Asn-Xaa-Ser sequence is a putative glycosaminoglycan attachment site.
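The sequence features described above can be checked directly against the amino acid sequence recited as the first formula in the claims. The following Python sketch is illustrative only and forms no part of the disclosed methods; the helper name `positions` is hypothetical, residue numbering is assumed to begin at the initiator methionine, and the N-linked site is searched with the conventional Asn-Xaa-Ser/Thr pattern (Xaa not proline).

```python
import re

# First formula of the claims in one-letter code (311 residues),
# written in rows of 20 residues numbered from the initiator Met.
SYNDECAN = (
    "MRRAALWLWLCALALRLQPA"
    "LPQIVAVNVPPEDQDGSGDD"
    "SDNFSGSGTGALPDTLSRQT"
    "PSTWKDVWLLTATPTAPEPT"
    "SSNTETAFTSVLPAGEKPEE"
    "GEPVLHVEAEPGFTARDKEK"
    "EVTTRPRETVQLPITQRAST"
    "VRVTTAQAAVTSHPHGGMQP"
    "GLHETSAPTAPGQPDHQPPR"
    "VEGGGTSVIKEVVEDGTANQ"
    "LPAGEGSGEQDFTFETSGEN"
    "TAVAAVEPGLRNQPPVDEGA"
    "TGASQSLLDRKEVLGGVIAG"
    "GLVGLIFAVCLVAFMLYRMK"
    "KKDEGSYSLEEPKQANGGAY"
    "QKPTKQEEFYA"
)


def positions(pattern, seq):
    """Return 1-based start positions of every match of a regex in seq."""
    return [m.start() + 1 for m in re.finditer(pattern, seq)]


ser_gly = positions("SG", SYNDECAN)         # Ser of each Ser-Gly dipeptide
arg_lys = positions("RK", SYNDECAN)         # Arg of each Arg-Lys pair
n_linked = positions("N[^P][ST]", SYNDECAN) # Asn of each Asn-Xaa-Ser/Thr

print("Ser-Gly serines:", ser_gly)
print("Arg-Lys:", arg_lys)
print("N-linked sequon Asn:", n_linked)
```

Under these assumptions the Ser-Gly serines fall at residues 37, 45, 47, 207 and 217, the Arg-Lys pair begins at residue 250, and the single Asn-Xaa-Ser sequon begins at Asn-43, so that its serine is Ser-45.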
In all cases, syndecan or a molecule related to syndecan will be expressed when the DNA sequence encoding it is functionally inserted into a vector that is expressed in a eukaryotic cell containing an enzyme system capable of producing glycosaminoglycan chains. By "functionally inserted" is meant in proper reading frame and orientation, as is well understood by those skilled in the art. Expression of syndecan can be enhanced by including multiple copies of the syndecan gene in a transformed or transfected host, by selecting a vector known to reproduce in the host, thereby producing large quantities of protein from exogenous inserted DNA, or by any other known means of enhancing peptide expression.
In addition to the above general procedures which can be used for preparing recombinant DNA molecules and transformed unicellular organisms in accordance with the practices of this invention, other known techniques and modifications thereof can be used in carrying out the practice of the invention. In particular, techniques relating to genetic engineering have recently undergone explosive growth and development. Many recent U.S. patents disclose plasmids, genetically engineered microorganisms, and methods of conducting genetic engineering which can be used in the practice of the present invention. For example, U.S. Patent 4,273,875 discloses a plasmid and a process of isolating the same. U.S. Patent 4,304,863 discloses a process for producing bacteria by genetic engineering in which a hybrid plasmid is constructed and used to transform a bacterial host. U.S. Patent 4,419,450 discloses a plasmid useful as a cloning vehicle in recombinant DNA work. U.S. Patent 4,362,867 discloses recombinant cDNA construction methods and hybrid nucleotides produced thereby which are useful in cloning processes. U.S. Patent 4,403,036 discloses genetic reagents for generating plasmids containing multiple copies of DNA segments. U.S. Patent 4,363,877 discloses recombinant DNA transfer vectors. U.S. Patent 4,356,270 discloses a recombinant DNA cloning vehicle and is a particularly useful disclosure for those with limited experience in the area of genetic engineering since it defines many of the terms used in genetic engineering and the basic processes used therein. U.S. Patent 4,336,336 discloses a fused gene and a method of making the same. U.S. Patent 4,349,629 discloses plasmid vectors and the production and use thereof. U.S. Patent 4,332,901 discloses a cloning vector useful in recombinant DNA. Although some of these patents are directed to the production of a particular gene product that is not within the scope of the present invention, the procedures described therein can easily be adapted to the practice of the invention described in this specification by those skilled in the art of genetic engineering.
All of these patents as well as all other patents and other publications cited in this disclosure are herein individually incorporated by reference.
Manipulation of the expression vectors will in some cases produce constructs which improve the expression of the polypeptide in eukaryotic cells or express syndecan in other hosts. Furthermore, by using the syndecan cDNA or a fragment thereof as a hybridization probe, structurally related genes found in other organisms can be easily cloned. These genes include those that code for related core proteins of proteoglycans from other species, especially mammals such as humans and other primates.
Particularly contemplated is the isolation of related genes from these and other organisms that express proteoglycans on their surfaces by using oligonucleotide probes based on the principal and variant nucleotide sequences disclosed herein. Such probes can be considerably shorter than the entire sequence but should be at least 14, preferably at least 20, nucleotides in length. Longer oligonucleotides are also useful, up to 30, 40, 50, 75, or 100 nucleotides and further up to the full length of the gene. Both RNA and DNA probes can be used. Such probes can also be used in diagnostic tests that detect the presence of genetic material of a predetermined sequence in samples, e.g., as in a polymerase chain reaction (PCR). In use, the probes are typically labeled in a detectable manner (e.g., with 32P, 3H, biotin, or avidin) and are incubated with single-stranded DNA or RNA from the organism in which a gene is being sought. Hybridization is detected by means of the label after single-stranded and double-stranded (hybridized) DNA (or DNA/RNA) have been separated (typically using nitrocellulose paper). Hybridization techniques suitable for use with oligonucleotides are well known.
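The preference for probes of at least 14, and preferably at least 20, nucleotides can be illustrated with a simple estimate of how often one specific sequence is expected to occur by chance. The Python sketch below is illustrative only: it assumes random target DNA with equal base frequencies, exact matching on a single strand, and a target of 3 x 10^9 nucleotides chosen purely for illustration; the function name `expected_chance_sites` is hypothetical. Actual hybridization tolerates mismatches and depends on stringency, so these numbers are a rough guide only.

```python
def expected_chance_sites(probe_length, target_length):
    """Expected number of exact chance matches of one specific sequence
    of probe_length in random DNA of target_length (one strand,
    equal base frequencies)."""
    return target_length / 4 ** probe_length


ASSUMED_TARGET = 3e9  # illustrative genome size in nucleotides (an assumption)

for n in (14, 20, 30):
    print(n, 4 ** n, expected_chance_sites(n, ASSUMED_TARGET))
```

Under these assumptions a single 14-nucleotide sequence is expected to occur by chance several times in a target of this size, whereas a 20-nucleotide sequence is expected less than once.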
Although probes are normally used with a detectable label that allows easy identification, unlabeled oligonucleotides are also useful, both as precursors of labeled probes and for use in methods that provide for direct detection of double-stranded DNA (or DNA/RNA). Accordingly, the term "oligonucleotide" refers to both labeled and unlabeled forms and not just to labeled probes.
Particularly preferred are oligonucleotides corresponding to the segments of the gene that code for glycosaminoglycan attachment sites. As discussed in the examples that follow, an oligonucleotide with a high probability of success in the identification of other gene products is the 64-fold degenerate oligonucleotide of the form GANGGNTCTGGNGA, where N represents the presence of all four nucleotides at that position. The complementary oligonucleotide having the degenerate sequence TCNCCAGANCCNTC is also particularly useful and has the added advantage of being able to identify the messenger RNA of these gene products in Northern analysis.
The invention allows the production in large amounts of highly pure heparan sulfate proteoglycans that contain heparan sulfate chains that are characteristic of specific cell types. For example, the surface of endothelial cells is non-thrombogenic because of the anti-coagulant properties of the heparan sulfate chains in a proteoglycan on their surfaces. Preparation of this highly anti-coagulant heparan sulfate proteoglycan in soluble form is now possible by transfection of cultured endothelial cells with a DNA construct defined by this invention. Expression of the construct would produce syndecan containing endothelial cell-derived heparan sulfate chains. Syndecan contains a unique protease-susceptible site adjacent to the plasma membrane, allowing the harvesting of this modified syndecan as a soluble product in high yield and purity. This approach would produce an anti-coagulant proteoglycan with very high potency, potentially several thousand times more potent than commercially available heparin.
The soluble proteins or peptides containing cell-type-specific heparan sulfate chains, made possible by this invention, can be used in the prevention and therapy of certain viral diseases. Dextran sulfate and heparin have been shown to reduce infection and replication of certain retroviruses, including human immunodeficiency virus (HIV). However, these molecules are highly heterogeneous and are probably non-specific. A more specific inhibitor would be a soluble heparan sulfate peptide or proteoglycan derived from a cell type that interacts with the virus.
Peptides derived from this invention can also be used as highly specific competitive inhibitors of heparan sulfate (or chondroitin sulfate) chain initiation. Because mutant transformed cells with reduced cell-surface heparan sulfate are substantially less tumorigenic, this invention has the potential of producing anti-tumor drugs that are non-cytotoxic.
Production of the heparan sulfate proteoglycan defined by this invention will allow the manufacture of molecules that bind growth factors, especially those involved in angiogenesis. These proteoglycans are of significant therapeutic value in those instances where local growth factor effects would be useful. A DNA construct derived from this invention can be used in fibroblasts that contain surface proteoglycans that bind various growth factors, including acidic fibroblast growth factor (FGF) and basic FGF. This binding potentiates the action and prevents the proteolytic degradation of these growth factors. Platelet-derived growth factor (PDGF) binds to heparin in vitro, and the syndecan DNA construct could be used to prepare large amounts of soluble PDGF-binding proteoglycan.
The peptide sequences involved in heparan sulfate chain attachment identified by the present invention will allow production of large amounts of cell-type-specific heparan sulfate proteoglycans and enable this attachment site to be placed into other biological macromolecules that do not normally contain it, thereby providing products that are not otherwise available. These products will represent a singular molecular species, whereas the heparins and all other heparan sulfate proteoglycans heretofore described represent many molecular species. The greater uniformity afforded by the present invention leads to greater potency and potentially to greater specificity of the materials being purified, thereby enhancing their therapeutic applications. Accordingly, existing materials such as heparin from pig intestine or beef lung or dextran sulfate, a synthetic product, that are polydisperse, of low potency, and of little specificity, can be replaced by genetically engineered products of the present invention.
Cell lines containing the genetic material necessary for the practice of the present invention can be obtained from a number of public sources, some of which are specifically identified in the following examples. For example, normal mouse mammary epithelial cells can be prepared from normal mouse tissue using the procedure described in the examples below. The same procedure can be used to obtain genetic material from other species.
The invention now being generally described, it will be more readily understood by reference to the following examples which are included for purposes of illustration only and are not intended to limit the invention unless so stated.
EXAMPLE 1
cDNA Libraries
NMuMG mouse mammary epithelial cells (passages 13-22) were maintained in bicarbonate-buffered Dulbecco's modified Eagle medium (Gibco) as described previously, David, G., and Bernfield, M., Proc. Natl. Acad. Sci. USA (1979) 76: 786-790. For preparation of poly(A) RNA, cells were plated on 245 x 245 mm tissue culture plates (Nunc) at approximately one-fifth confluent density and grown to 80-90 percent confluency (3-4 days). Following brief washing with ice-cold PBS the cells were solubilized in RNA extraction buffer (4 M guanidine isothiocyanate in 5 mM sodium citrate pH 7.0, 0.1 M β-mercaptoethanol and 0.5% N-lauryl sarcosine) and total RNA prepared by CsCl density centrifugation, Chirgwin, J.M., Przybyla, A.E., MacDonald, R.J., and Rutter, W.J., Biochemistry (1979) 18: 5294-5299. Poly(A) RNA was purified by chromatography on oligo(dT)-cellulose (type 3; Collaborative Research) and utilized in the commercial synthesis (Stratagene) of cDNA by the S1 method, Huynh, T.V., Young, R.A., and Davis, R.W., DNA Cloning: A Practical Approach (1985) 49-78. Following addition of EcoRI linkers, those cDNAs greater than 1 kb in length were isolated by gel filtration chromatography, inserted into the EcoRI sites of λ gt-10 and the expression vector λ gt-11, and packaged. A portion of the λ gt-11 library was amplified for later study, while the remainder was screened immediately without expansion.
A primer extension cDNA library was prepared using the RNase H method, Gubler, U., and Hoffman, B.J., Gene (1983) 25: 263-269. First strand cDNA was synthesized from 10 µg of an 18-bp oligonucleotide containing sequence derived from near the 5' end of PM-4 (see Example 2). The second strand was synthesized using RNase H (BRL) and DNA polymerase Klenow fragment (Boehringer-Mannheim). The cDNA was methylated with EcoRI methylase and then ligated with synthetic EcoRI linkers (New England Biolabs). Excess linkers were removed by EcoRI digestion and the cDNA was purified by agarose gel electrophoresis and recovered by electroelution. The resulting cDNA was inserted into λ gt-10 (Promega) and packaged using Gigapack Gold (Stratagene).
EXAMPLE 2
Isolation of Syndecan cDNA Clones
The preparation of a rabbit serum antibody to the ectodomain of NMuMG syndecan has been described elsewhere, Jalkanen, M., Rapraeger, A., and Bernfield, M., J. Cell Biol. (1988) 106: 953-962. For screening clones in λ gt-11, the immunoserum was first absorbed against E. coli proteins to reduce background. Briefly, a 500 ml culture of E. coli strain Y1090 was grown to saturation in the presence of 50 µg/ml ampicillin. Following centrifugation, the cells were resuspended in 50 ml TBST (Tris-buffered saline Triton: 10 mM Tris pH 7, 150 mM NaCl, 0.3% Triton X-100), sonicated, and following addition of 100 µl immunoserum (1:500 dilution), incubated overnight at 4°C. This mixture was centrifuged for 10 min at 4000 rpm and used to screen expressed λ gt-11 cDNA clones, Young, R.A., and Davis, R.W., Science (1983) 222: 778-782, by detection with alkaline phosphatase-conjugated goat anti-rabbit IgG (Promega). Four antibody-reactive clones were identified from 7.5 x 10^5 recombinants and were plaque-purified. Northern and Southern hybridization experiments allowed grouping of these clones into three distinct sets of related clones. Two of these sets produced fusion proteins that reacted with immunoserum affinity-purified against the ectodomain of syndecan. A 2.1-kb clone from one of these sets, PM-4, was found to contain a sequence that exactly matched the partial amino acid sequence of a cyanogen bromide-cleaved fragment of the ectodomain of syndecan. Additionally, syndecan purified from NMuMG cells reacted with an immunoserum prepared against a synthetic peptide containing the C-terminal 7 amino acids (Lys-Gln-Glu-Glu-Phe-Tyr-Ala) of the PM-4 derived protein sequence. This immunoserum failed to react with the ectodomain, which lacks the putative cytoplasmic domain. Furthermore, this serum does not cross-react with any other cellular proteins as assessed by Western blotting of total cell extracts.
Additional screening of the NMuMG λ gt-10 libraries was performed using radiolabeled fragments from the 5' end of PM-4 (250 bp EcoRI-HincII fragment). cDNA fragments isolated from SeaPlaque agarose (FMC BioProducts) were labeled with 32P by random oligonucleotide priming, Feinberg, A.P., and Vogelstein, B., Addendum. Anal. Biochem. (1984) 137: 266-267, and used as described by Maniatis, T., Fritsch, E.F., and Sambrook, J., Molecular Cloning: A Laboratory Manual (1982). This screening yielded two clones, 4-19B and 4-15 (Figure 2). A primer-extended λ gt-10 cDNA library, prepared with liver poly(A) RNA and a synthetic oligonucleotide complementary to a site near the 5' end of PM-4 (positions 848-865 in Table 1), was also screened with the same 250 bp probe. Several independent clones were characterized from this library; each contained a 5' sequence identical with that of clone 4-19B.
EXAMPLE 3
Subcloning and DNA Sequencing
Purified lambda DNA was prepared from positively selected clones by Lambdasorb immunoprecipitation (Promega). Fragments released by restriction endonuclease digestions were isolated by electrophoresis followed by excision from SeaPlaque agarose (FMC BioProducts). These isolated fragments were subcloned directly, in the presence of agarose, Struhl, K., BioTechniques (1985) 3: 452-453, to either pGEM 3 and 4 for in vitro transcription, or M13 mp18 and mp19, Messing, J., Methods Enzymol. (1983) 101: 20-78, for sequence analysis.
DNA sequencing was performed by the dideoxy chain termination method, Sanger, F., Nicklen, S., and Coulson, A.R., Proc. Natl. Acad. Sci. USA (1977) 74: 5463-5467, using a modified T7 DNA polymerase (Sequenase™, U.S. Biochemical). The strategy is summarized in Figure 2. Sequence was generated from both ends of subcloned restriction fragments using universal M13 sequencing primers. The internal sequence of large fragments as well as the complementary strands of all fragments were determined using oligonucleotide primers synthesized in accordance with preceding sequences. Sequencing artifacts generated as the result of G-C compression were avoided by determining all sequences using both dGTP and the nucleotide analogue dITP.
The cDNA (Figure 1) has the following features: The first AUG is at position 240. This putative initiation codon is preceded by two in-frame termination codons (TAA and TGA at positions 39 and 72 respectively) and followed by a 930 base open reading frame that ends at position 1173 with a TGA termination codon. Following the putative coding region are 1,243 bases of 3'-untranslated sequence that ends with the poly(A) stretch. Because each of the primer-extended clones has the same 5' end as the largest cDNA clone from the NMuMG library, M-4-19B, this sequence appears to include the complete 5'-untranslated region of syndecan. Other features have been previously discussed.
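The coordinates given for the cDNA can be cross-checked with simple arithmetic. The Python sketch below is illustrative only and rests on stated assumptions: positions are taken as 1-based, "position 240" is read as the first base of the initiation codon, the 930-base open reading frame is read as following that codon, and "position 1173" is read as the first base of the TGA codon. The variable names are hypothetical. The SacI-HindIII coordinates are those given in Example 6.

```python
# Coordinates as stated in the text (1-based positions assumed).
ATG_START = 240           # first base of the putative initiation codon
ORF_AFTER_ATG = 930       # bases of open reading frame following the ATG
STOP_START = 1173         # first base of the TGA codon (assumed reading)
FRAGMENT = (214, 1379)    # SacI-HindIII fragment used in Example 6

# The stop codon should begin immediately after ATG + open reading frame.
predicted_stop_start = ATG_START + 3 + ORF_AFTER_ATG

# Number of codons, and therefore residues, including the initiator Met.
residues = (3 + ORF_AFTER_ATG) // 3

# Last base of the stop codon, used to test coverage by the fragment.
stop_end = STOP_START + 2
fragment_covers_coding_region = (
    FRAGMENT[0] <= ATG_START and stop_end <= FRAGMENT[1]
)

print(predicted_stop_start, residues, fragment_covers_coding_region)
```

On this reading the stated coordinates are mutually consistent: the predicted stop position equals 1173, the coding region specifies 311 residues (the length of the first formula in the claims), and the fragment spanning nucleotides 214-1379 contains the entire coding region.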
EXAMPLE 4
Northern Blots
RNA for Northern analysis was prepared from the following: NMuMG cells, adult liver, newborn skin, mid-pregnant mammary gland, adult cerebrum, skeletal and cardiac muscle. Excised tissues were ground to a fine powder in the presence of liquid nitrogen and transferred directly to RNA extraction buffer (see above); the NMuMG cells were extracted after washing with PBS as described above. The samples were vigorously vortexed, an equal volume of 10 mM Tris pH 8.0, 1 mM EDTA, and 1% SDS added, and subsequently extracted exhaustively with 24:24:1 Tris-saturated phenol:chloroform:isoamyl alcohol followed by a single extraction with 24:1 chloroform:isoamyl alcohol. Following precipitation with an equal volume of 2-propanol, and resuspension in 10 mM Tris pH 7.5, 1 mM EDTA, RNA was precipitated by addition of 1/3 volume of 10 M LiCl. Poly(A) RNA was prepared by oligo(dT) chromatography as described above.
For Northern analysis, 2 µg of each poly(A) RNA sample was separated by electrophoresis in 1.2% agarose-formaldehyde gels in the presence of MOPS (Sigma)-acetate buffer pH 7.0, Maniatis, T., Fritsch, E.F., and Sambrook, J., Molecular Cloning: A Laboratory Manual (1982). Following alkali treatment, Danielsen, M., Northrop, J.P., and Ringold, G.M., EMBO J. (1986) 5: 2513-2522, and neutralization in transfer buffer (0.025 M sodium phosphate pH 6.5), the gel was blotted to Gene Screen and the RNA immobilized by UV cross-linking, Church, G.M., and Gilbert, W., Proc. Natl. Acad. Sci. USA (1984) 81: 1991-1995. Hybridization probes were prepared by in vitro transcription of the 5' EcoRI-SacI fragment of PM-4 subcloned into pGEM3, Melton, D.A., Krieg, P.A., Rebagliati, M.R., Maniatis, T., Zinn, K., and Green, M.R., Nucl. Acids Res. (1984) 12: 7035-7056. Blots were prehybridized at 61°C in 50% formamide, 1% SDS, 5X SSPE, 0.1% ficoll, 0.1% polyvinylpyrrolidone and 100 µg/ml denatured salmon sperm DNA. Hybridization was for 16 hrs at 61°C in the same buffer containing 5 x 10^6 cpm/ml of RNA probe. Filters were washed 2 x 15 min at room temperature in 5% SDS/1X SSPE and 6 x 30 min at 67°C in 1% SDS/0.1X SSPE. Molecular sizes were determined relative to ethidium bromide-stained molecular weight markers (BRL) and 18S and 28S ribosomal RNA.
Northern blot analysis of the poly(A) RNA preparations reveals two mRNA bands in NMuMG cells as well as in skin, liver and mammary gland tissues; one band is at 2.6 kb and the other at 3.4 kb. The apparent lower level of expression found in mid-pregnant mammary gland, as compared with skin and liver, is consistent with the relative paucity of epithelial cells in the mammary gland. Longer exposures of the Northern blot discussed above, as well as others containing larger quantities of poly(A) RNA, verify that the mammary gland expresses both the 2.6 and the 3.4 kb messages (data not shown). Scanning densitometry shows that these two messages are present at a nearly constant relative abundance of 3:1 (2.6 kb:3.4 kb) in NMuMG cells and in skin, liver, and mammary gland tissues (data not shown). As expected from the immunohistology, neither of these mRNAs was present in detectable amounts in cerebrum and striated muscle tissues (skeletal and cardiac). However, Northern analysis consistently detected a distinct 4.5 kb mRNA in the cerebrum. The relationship of this message to that of syndecan is currently not known.
EXAMPLE 5
Preparation and Use of Antibodies to Synthetic Peptides
A seven amino acid (14C-labeled) synthetic peptide, corresponding to the predicted C-terminus of syndecan (Figure 1), was prepared by direct synthesis. The N-terminal lysine of this peptide was cross-linked by glutaraldehyde to keyhole limpet hemocyanin (KLH, Calbiochem) for immunization and bovine serum albumin (BSA, Fraction V, Sigma) for screening as described by Doolittle, R.F., Of URFS and ORFS: A Primer on How to Analyze Derived Amino Acid Sequences (1986) 85.
Briefly, 10 mg carrier protein was dissolved in 0.5 ml of 0.4 M phosphate, pH 7.5, mixed with 7.5 µmoles of peptide in 1.5 ml water, and 1.0 ml of 20 mM glutaraldehyde was added dropwise with stirring over the course of 5 min. After continuous stirring at room temperature for 30 min, 0.25 ml of 1 M glycine was added to block unreacted glutaraldehyde and the stirring resumed for an additional 30 min. The product was dialyzed exhaustively against phosphate-buffered saline and incorporation determined by TCA precipitation and liquid scintillation counting. This procedure resulted in the attachment of 17 moles of synthetic peptide per mole of carrier protein.
For immunization, 1.25 mg of synthetic peptide-KLH conjugate in 0.5 ml PBS pH 7.5 was mixed with 0.5 ml complete Freund's adjuvant. The emulsion was delivered by intramuscular injections, 0.1 ml in each of ten sites, into a 3-month-old New Zealand white rabbit. After 2 weeks, the immunization was repeated with an identical quantity of immunogen. Ten days later, the rabbit was injected with Innovar 0.125 ml/kg subcutaneously and was bled from the central auricular artery. Innovar was reversed with Nalline 0.2 ml/kg, and serum was prepared from the collected blood.
The native lipophilic form of syndecan and the nonlipophilic medium ectodomain form, Jalkanen, M., Rapraeger, A., Saunders, S., and Bernfield, M., J. Cell Biol. (1987) 105: 3087-3096, were isolated and purified as described elsewhere and assessed for their reactivity to the immune sera. A cationic nylon membrane, Gene-Trans (Plasco Inc., Woburn, MA), was placed into an immunodot apparatus (V&P Scientific, San Diego, CA), and samples of intact syndecan and the ectodomain (0.5, 5, 50 and 500 ng) were loaded on the membrane using mild vacuum. After loading, remaining binding sites on the membrane were blocked by 1 hr incubation in a solution containing 0.5% BSA, 3% Carnation instant nonfat dry milk, 10 mM Tris (Sigma) pH 8.0, 0.15 M NaCl and 0.3% Tween-20. Incubation with immune serum was performed at dilutions of 1:200 for the anti-cytoplasmic domain, and 1:500 for the anti-ectodomain, in 10 mM Tris pH 7.4, 0.15 M NaCl, and 0.3% Tween-20 (TBST) for 30 min at room temperature. The membrane was washed for 60 min at room temperature with ten changes of TBST and then incubated for 30 min with a 1:7500 dilution of alkaline phosphatase goat anti-rabbit IgG (Promega, Madison WI). Following washing for 60 min with ten changes of TBST, the immobilized alkaline phosphatase was visualized with nitro blue tetrazolium (NBT) 330 µg/ml and 5-bromo-4-chloro-3-indolyl phosphate (BCIP) 165 µg/ml in 100 mM Tris pH 9.5, 100 mM NaCl, and 5 mM MgCl2.
EXAMPLE 6
DNA construct for the expression of syndecan core protein in mammalian cells
Syndecan can be expressed within mammalian cells by transfection of a DNA construct containing the syndecan core protein cDNA linked to a eukaryotic promoter that has the properties of both high-level expression and activity in a wide range of cell types. For example, the expression vector pHβ APr-1-neo has been described (Gunning et al., PNAS 84:4831-4835), which utilizes the human β-actin promoter and fulfills both of the above requirements. This vector also contains the neomycin-resistance gene, which allows selection of transfected cells with the antibiotic G-418. A SacI-HindIII fragment of the syndecan cDNA (nucleotides 214-1379 of the sequence shown in Figure 1), which encompasses all of the coding region, was inserted directionally between the SalI-BamHI sites of the pHβ APr-1-neo vector and thus named pβ-SSyn-neo. In order to generate the necessary restriction sites on the 5' and 3' ends of the syndecan cDNA fragment for insertion into this vector, this fragment was passed sequentially through pGEM 3Z (Promega), pGEM 7Zf (Promega), and Bluescript (Stratagene). Thus the resulting configuration of restriction sites at the point of insertion in pHβ APr-1-neo is as follows: SalI-ClaI-HindIII-EcoRV-EcoRI-SacI-syndecan cDNA fragment-HindIII-BamHI.
This DNA construct was transformed into the bacterial strain TG-1 and prepared in large scale using routine plasmid preparation techniques including CsCl density centrifugation. The purified circularized plasmid DNA was transfected into Chinese Hamster Ovary (CHO) cells by the standard calcium phosphate precipitation technique, and transfected clones were selected with G418. Although the parental CHO (hamster) cells express mRNA which is cross-reactive with the murine syndecan cDNA, neither whole cells nor proteoglycan purified from these cells is reactive with the monoclonal antibody 281-2, a rat monoclonal antibody generated against murine syndecan. Therefore it has been possible to assess the function of the transfected murine syndecan gene using this antibody. By both quantitative radioimmunoassay and Western blotting, we have confirmed that clones of the transfected CHO cells express murine syndecan at levels about 1/3 that expressed endogenously by NMuMG mouse mammary epithelial cells, the murine cell line which to date has demonstrated the highest natural levels of expression. Furthermore, a quantitatively higher level of murine syndecan is actually accumulated in the culture media of these CHO cells versus the NMuMG cells, suggesting that the absolute rate of synthesis from the transfected gene is probably in excess of even the highest natural levels in murine cells.
EXAMPLE 7
DNA construct for blocking expression of syndecan core protein in mammalian cells
We have constructed anti-sense cDNA vectors analogous to the sense constructs described above for the purposes of blocking syndecan expression in mammalian cells. Anti-sense RNA produced from vectors of this type, if expressed in sufficiently high levels, is capable of binding to endogenous message intracellularly and blocking its subsequent translation.
To construct this vector, the same coding region SacI-HindIII fragment of syndecan described above was inserted into the BamHI-HindIII site of the pHβ APr-1-neo vector to produce the vector pβ-ASyn-neo. In this application, however, the cDNA was inserted into the vector in the opposite orientation so as to produce mRNA from the transfected gene that is complementary to endogenous syndecan mRNA. To generate the appropriate restriction sites on the 5' and 3' ends of the syndecan cDNA for insertion into this site, this fragment was sequentially passed through pGEM 3Z (Promega) and Bluescript (Stratagene). Thus, the resulting configuration of restriction sites at the point of insertion in the pHβ APr-1-neo vector is as follows: HindIII-syndecan cDNA fragment-ScaI-EcoRI-PstI-SmaI-BamHI.
Upon transfection of this construct into NMuMG cells by calcium phosphate precipitation and selection with G418, we have observed two distinct morphological changes in these cells which appear to correlate with a reduction in the level of syndecan expression. These are changes from the normal cobblestone appearance of the epithelial monolayer to a fibroblastic morphology and to a neoplastic morphology and cell behavior.
EXAMPLE 8
Identification of related molecules with degenerate oligonucleotides
While in principle any degenerate oligonucleotide corresponding to the murine syndecan gene product is potentially useful in the identification of related biological molecules, some oligonucleotide sequences have higher value. In studying the three putative glycosaminoglycan attachment sites in syndecan that fit the consensus sequence D/E-X-S-G-D/E, we have observed that two of these sites have a conserved G in the X position, and that furthermore all five glycosaminoglycan attachment sites in syndecan utilize a single codon, TCT, of the six possible codons for the serine residue. Therefore, we expect that the 64-fold degenerate oligonucleotide of the form GAN GGN TCT GGN GA (where N is all four nucleotides) should statistically have the highest probability of success in the identification of other gene products which contain this putative signal for glycosaminoglycan attachment. Similarly, the complementary oligonucleotide of the form TCN CCA GAN CCN TC should have similar utility, with the added advantage of its ability to identify the messenger RNA of these gene products in Northern analysis.
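The degeneracy of this oligonucleotide and the form of its complement can be verified computationally. The Python sketch below is illustrative only; the function names are hypothetical, and N is treated as any of the four nucleotides as defined above. The two 14-nucleotide test segments are taken from the coding sequence recited in the claims (the segments encoding Asp-Gly-Ser-Gly-Asp and Glu-Gly-Ser-Gly-Glu).

```python
# Bases allowed at each symbol; N stands for any of the four nucleotides.
ALLOWED = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "ACGT"}
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def degeneracy(oligo):
    """Number of distinct sequences represented by a degenerate oligo."""
    count = 1
    for symbol in oligo:
        count *= len(ALLOWED[symbol])
    return count


def reverse_complement(oligo):
    """Complementary strand written 5' to 3'."""
    return "".join(COMPLEMENT[s] for s in reversed(oligo))


def matches(oligo, target):
    """True if target is one of the sequences the degenerate oligo covers."""
    return len(oligo) == len(target) and all(
        base in ALLOWED[symbol] for symbol, base in zip(oligo, target)
    )


probe = "GANGGNTCTGGNGA"
print(degeneracy(probe))          # three N positions, 4 * 4 * 4 sequences
print(reverse_complement(probe))  # the complementary degenerate oligo
print(matches(probe, "GATGGCTCTGGGGA"), matches(probe, "GAGGGCTCTGGAGA"))
```

The three N positions give 4 x 4 x 4 = 64 sequences, the reverse complement is TCNCCAGANCCNTC, and both test segments fall within the set of sequences covered by the probe.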
All publications and patent applications cited in this specification are herein incorporated by reference as if each individual publication or patent application were specifically and individually indicated to be incorporated by reference.
Although the foregoing invention has been described in some detail by way of illustration and example for purposes of clarity of understanding, it will be readily apparent to those of ordinary skill in the art in light of the teachings of this invention that certain changes and modifications may be made thereto without departing from the spirit or scope of the appended claims.
Claims
1. A purified mammalian peptide having a molecular weight of from about 31 kD to about 35 kD and comprising an amino terminus hydrophilic extracellular region, a carboxy terminus hydrophilic cytoplasmic region, and a transmembrane hydrophobic region between said cytoplasmic and extracellular regions, a dibasic sequence extracellularly adjacent the transmembrane region of the peptide, and at least one glycosylation site in the extracellular region including an Xac-Xaa-Ser-Gly-Xac sequence, wherein Xac is an acidic amino acid and Xaa is any amino acid.
2. The peptide of Claim 1, wherein said proteoglycan is obtainable from a mammal selected from the group consisting of humans, mice, rats, and hamsters.
3. The peptide of Claim 2, wherein said mammal is a mouse.
4. The peptide of Claim 3, wherein said proteoglycan is syndecan.
5. The peptide of Claim 1, wherein said peptide is glycosylated at said glycosylation site.
6. The peptide of Claim 1, wherein said peptide is selected from
(1) compounds of
(a) a first formula:
M-R-R-A-A-L-W-L-W-L-C-A-L-A-L-R-L-Q-P-A-
L-P-Q-I-V-A-V-N-V-P-P-E-D-Q-D-G-S-G-D-D-
S-D-N-F-S-G-S-G-T-G-A-L-P-D-T-L-S-R-Q-T-
P-S-T-W-K-D-V-W-L-L-T-A-T-P-T-A-P-E-P-T-
S-S-N-T-E-T-A-F-T-S-V-L-P-A-G-E-K-P-E-E-
G-E-P-V-L-H-V-E-A-E-P-G-F-T-A-R-D-K-E-K-
E-V-T-T-R-P-R-E-T-V-Q-L-P-I-T-Q-R-A-S-T-
V-R-V-T-T-A-Q-A-A-V-T-S-H-P-H-G-G-M-Q-P-
G-L-H-E-T-S-A-P-T-A-P-G-Q-P-D-H-Q-P-P-R-
V-E-G-G-G-T-S-V-I-K-E-V-V-E-D-G-T-A-N-Q-
L-P-A-G-E-G-S-G-E-Q-D-F-T-F-E-T-S-G-E-N-
T-A-V-A-A-V-E-P-G-L-R-N-Q-P-P-V-D-E-G-A-
T-G-A-S-Q-S-L-L-D-R-K-E-V-L-G-G-V-I-A-G-
G-L-V-G-L-I-F-A-V-C-L-V-A-F-M-L-Y-R-M-K-
K-K-D-E-G-S-Y-S-L-E-E-P-K-Q-A-N-G-G-A-Y-
Q-K-P-T-K-Q-E-E-F-Y-A
wherein A is alanine, C is cysteine, D is aspartate, E is glutamate, F is phenylalanine, G is glycine, H is histidine, I is isoleucine, K is lysine, L is leucine, M is methionine, N is asparagine, P is proline, Q is glutamine, R is arginine, S is serine, T is threonine, V is valine, W is tryptophan, and Y is tyrosine,
(b) a second formula in which at least one amino acid in said first formula is replaced by a different amino acid, with the proviso that no more than 10 replacements take place,
(c) a third formula in which from 1 to 15 amino acids are absent from either the amino terminal, the carboxy terminal, or both terminals of said first formula or said second formula, or
(d) a fourth formula in which from 1 to 10 additional amino acids are attached sequentially to the amino terminal, carboxy terminal, or both terminals of said first formula or said second formula and
(2) salts of compounds having said formulas.
7. A purified DNA or RNA molecule, which comprises a nucleotide sequence coding for a peptide of Claim 1.
8. The molecule of Claim 7, wherein said sequence comprises a segment at least 14 nucleotides in length that is homologous to a segment of approximately said length in a DNA sequence
ATGAGACGCGCGGCGCTCTGGCTCTGGCTCTGCGCGCTGGCGCTGCGCCTGCAGCCTGCC CTCCCGCAAATTGTGGCTGTAAATGTTCCTCCTGAAGATCAGGATGGCTCTGGGGATGAC TCTGACAACTTCTCTGGCTCTGGCACAGGTGCTTTGCCAGATACTTTGTCACGGCAGACA CCTTCCACTTGGAAGGACGTGTGGCTGTTGACAGCCACGCCCACAGCTCCAGAGCCCACC AGCAGCAACACCGAGACTGCTTTTACCTCTGTCCTGCCAGCCGGAGAGAAGCCCGAGGAG GGAGAGCCTGTGCTCCATGTAGAAGCAGAGCCTGGCTTCACTGCTCGGGACAAGGAAAAG GAGGTCACCACCAGGCCCAGGGAGACCGTGCAGCTCCCCATCACCCAACGGGCCTCAACA GTCAGAGTCACCACAGCCCAGGCAGCTGTCACATCTCATCCGCACGGGGGCATGCAACCT GGCCTCCATGAGACCTCGGCTCCCACAGCACCTGGTCAACCTGACCATCAGCCTCCACGT GTGGAGGGTGGCGGCACTTCTGTCATCAAAGAGGTTGTCGAGGATGGAACTGCCAATCAG CTTCCCGCAGGAGAGGGCTCTGGAGAACAAGACTTCACCTTTGAAACATCTGGGGAGAAC ACAGCTGTGGCTGCCGTAGAGCCCGGCCTGCGGAATCAGCCCCCGGTGGACGAAGGAGCC ACAGGTGCTTCTCAGAGCCTTTTGGACAGGAAGGAAGTGCTGGGAGGTGTCATTGCCGGA GGCCTAGTGGGCCTCATCTTTGCTGTGTGCCTGGTGGCTTTCATGCTGTACCGGATGAAG AAGAAGGACGAAGGCAGCTACTCCTTGGAGGAGCCCAAACAAGCCAATGGCGGTGCCTAC CAGAAACCCACCAAGCAGGAGGAGTTCTACGCC.
9. The molecule of Claim 7, wherein said sequence is followed by a termination codon.
10. The molecule of Claim 7, wherein said sequence is preceded by a promoter.
11. A recombinant DNA vector, wherein said vector is capable of replicating in a microorganism or being expressed in a eukaryotic cell and said vector comprises a nucleotide sequence coding for a peptide of Claim 1.
12. The vector of Claim 11, wherein said sequence comprises a segment at least 14 nucleotides in length that is homologous to a segment of approximately said length in a DNA sequence
ATGAGACGCGCGGCGCTCTGGCTCTGGCTCTGCGCGCTGGCGCTGCGCCTGCAGCCTGCC CTCCCGCAAATTGTGGCTGTAAATGTTCCTCCTGAAGATCAGGATGGCTCTGGGGATGAC TCTGACAACTTCTCTGGCTCTGGCACAGGTGCTTTGCCAGATACTTTGTCACGGCAGACA CCTTCCACTTGGAAGGACGTGTGGCTGTTGACAGCCACGCCCACAGCTCCAGAGCCCACC AGCAGCAACACCGAGACTGCTTTTACCTCTGTCCTGCCAGCCGGAGAGAAGCCCGAGGAG GGAGAGCCTGTGCTCCATGTAGAAGCAGAGCCTGGCTTCACTGCTCGGGACAAGGAAAAG GAGGTCACCACCAGGCCCAGGGAGACCGTGCAGCTCCCCATCACCCAACGGGCCTCAACA GTCAGAGTCACCACAGCCCAGGCAGCTGTCACATCTCATCCGCACGGGGGCATGCAACCT GGCCTCCATGAGACCTCGGCTCCCACAGCACCTGGTCAACCTGACCATCAGCCTCCACGT GTGGAGGGTGGCGGCACTTCTGTCATCAAAGAGGTTGTCGAGGATGGAACTGCCAATCAG CTTCCCGCAGGAGAGGGCTCTGGAGAACAAGACTTCACCTTTGAAACATCTGGGGAGAAC ACAGCTGTGGCTGCCGTAGAGCCCGGCCTGCGGAATCAGCCCCCGGTGGACGAAGGAGCC ACAGGTGCTTCTCAGAGCCTTTTGGACAGGAAGGAAGTGCTGGGAGGTGTCATTGCCGGA GGCCTAGTGGGCCTCATCTTTGCTGTGTGCCTGGTGGCTTTCATGCTGTACCGGATGAAG AAGAAGGACGAAGGCAGCTACTCCTTGGAGGAGCCCAAACAAGCCAATGGCGGTGCCTAC CAGAAACCCACCAAGCAGGAGGAGTTCTACGCC.
13. A genetically engineered cell, wherein said cell comprises a microorganism or eukaryotic cell containing exogenous genetic information encoding a peptide of Claim 1.
14. An isolated oligonucleotide, comprising at least 14 sequential nucleotides selected from nucleotide sequences that code for an amino acid sequence
M-R-R-A-A-L-W-L-W-L-C-A-L-A-L-R-L-Q-P-A-
L-P-Q-I-V-A-V-N-V-P-P-E-D-Q-D-G-S-G-D-D-
S-D-N-F-S-G-S-G-T-G-A-L-P-D-T-L-S-R-Q-T-
P-S-T-W-K-D-V-W-L-L-T-A-T-P-T-A-P-E-P-T-
S-S-N-T-E-T-A-F-T-S-V-L-P-A-G-E-K-P-E-E-
G-E-P-V-L-H-V-E-A-E-P-G-F-T-A-R-D-K-E-K-
E-V-T-T-R-P-R-E-T-V-Q-L-P-I-T-Q-R-A-S-T-
V-R-V-T-T-A-Q-A-A-V-T-S-H-P-H-G-G-M-Q-P-
G-L-H-E-T-S-A-P-T-A-P-G-Q-P-D-H-Q-P-P-R-
V-E-G-G-G-T-S-V-I-K-E-V-V-E-D-G-T-A-N-Q-
L-P-A-G-E-G-S-G-E-Q-D-F-T-F-E-T-S-G-E-N-
T-A-V-A-A-V-E-P-G-L-R-N-Q-P-P-V-D-E-G-A-
T-G-A-S-Q-S-L-L-D-R-K-E-V-L-G-G-V-I-A-G-
G-L-V-G-L-I-F-A-V-C-L-V-A-F-M-L-Y-R-M-K-
K-K-D-E-G-S-Y-S-L-E-E-P-K-Q-A-N-G-G-A-Y-
Q-K-P-T-K-Q-E-E-F-Y-A
wherein A is alanine, C is cysteine, D is aspartate, E is glutamate, F is phenylalanine, G is glycine, H is histidine, I is isoleucine, K is lysine, L is leucine, M is methionine, N is asparagine, P is proline, Q is glutamine, R is arginine, S is serine, T is threonine, V is valine, W is tryptophan, and Y is tyrosine.
15. The oligonucleotide of Claim 14, wherein said oligonucleotide is DNA.
16. The oligonucleotide of Claim 14, wherein said oligonucleotide is RNA.
17. The oligonucleotide of Claim 14, wherein said oligonucleotide is radioactively labeled.
18. The oligonucleotide of Claim 14, wherein said oligonucleotide comprises at least 20 sequential nucleotides.
19. The oligonucleotide of Claim 14, wherein said sequential nucleotides include a sequence GAXGGXTCTGGXGA or TCXCCAGAXCCXTC, where X is any nucleotide.
20. The oligonucleotide of Claim 14, wherein said nucleotide sequences comprise a first DNA sequence of formula
ATGAGACGCGCGGCGCTCTGGCTCTGGCTCTGCGCGCTGGCGCTGCGCCTGCAGCCTGCC CTCCCGCAAATTGTGGCTGTAAATGTTCCTCCTGAAGATCAGGATGGCTCTGGGGATGAC TCTGACAACTTCTCTGGCTCTGGCACAGGTGCTTTGCCAGATACTTTGTCACGGCAGACA CCTTCCACTTGGAAGGACGTGTGGCTGTTGACAGCCACGCCCACAGCTCCAGAGCCCACC AGCAGCAACACCGAGACTGCTTTTACCTCTGTCCTGCCAGCCGGAGAGAAGCCCGAGGAG GGAGAGCCTGTGCTCCATGTAGAAGCAGAGCCTGGCTTCACTGCTCGGGACAAGGAAAAG GAGGTCACCACCAGGCCCAGGGAGACCGTGCAGCTCCCCATCACCCAACGGGCCTCAACA GTCAGAGTCACCACAGCCCAGGCAGCTGTCACATCTCATCCGCACGGGGGCATGCAACCT GGCCTCCATGAGACCTCGGCTCCCACAGCACCTGGTCAACCTGACCATCAGCCTCCACGT GTGGAGGGTGGCGGCACTTCTGTCATCAAAGAGGTTGTCGAGGATGGAACTGCCAATCAG CTTCCCGCAGGAGAGGGCTCTGGAGAACAAGACTTCACCTTTGAAACATCTGGGGAGAAC ACAGCTGTGGCTGCCGTAGAGCCCGGCCTGCGGAATCAGCCCCCGGTGGACGAAGGAGCC ACAGGTGCTTCTCAGAGCCTTTTGGACAGGAAGGAAGTGCTGGGAGGTGTCATTGCCGGA GGCCTAGTGGGCCTCATCTTTGCTGTGTGCCTGGTGGCTTTCATGCTGTACCGGATGAAG AAGAAGGACGAAGGCAGCTACTCCTTGGAGGAGCCCAAACAAGCCAATGGCGGTGCCTAC CAGAAACCCACCAAGCAGGAGGAGTTCTACGCC
a second DNA sequence complementary to said first DNA sequence, or an RNA sequence corresponding to said first or second DNA sequence.
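The sequences recited in Claims 1, 6, 19 and 20 can be cross-checked mechanically. The following is a minimal Python sketch, not part of the claims, offered under stated assumptions: it assumes the standard genetic code, and for brevity it uses only nucleotides 61-120 of the first DNA sequence of Claim 20. It translates that segment, locates the Xac-Xaa-Ser-Gly-Xac motif of Claim 1, and checks that the two probe sequences of Claim 19 are reverse complements of one another. The names `SEGMENT`, `translate`, `reverse_complement` and `probe_to_regex` are illustrative only and do not appear in the specification.

```python
import re
from itertools import product

# Assumption: standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Nucleotides 61-120 of the first DNA sequence recited in Claim 20.
SEGMENT = "CTCCCGCAAATTGTGGCTGTAAATGTTCCTCCTGAAGATCAGGATGGCTCTGGGGATGAC"

# X stands for "any nucleotide", as in Claim 19, and is its own complement.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "X": "X"}

# Xac-Xaa-Ser-Gly-Xac: acidic residue (D or E), any residue, S, G, acidic residue.
MOTIF = re.compile(r"[DE].SG[DE]")


def translate(dna: str) -> str:
    """Translate an in-frame DNA string into one-letter amino acid codes."""
    usable = len(dna) - len(dna) % 3  # ignore any incomplete trailing codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string, keeping X wildcards."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def probe_to_regex(probe: str) -> re.Pattern:
    """Turn a probe containing X wildcards into a compiled regular expression."""
    return re.compile(probe.replace("X", "[ACGT]"))


peptide = translate(SEGMENT)
motif_hit = MOTIF.search(peptide)
probe_hit = probe_to_regex("GAXGGXTCTGGXGA").search(SEGMENT)

print("peptide:", "-".join(peptide))
print("motif:", motif_hit.group(), "at residue index", motif_hit.start())
print("probe match:", probe_hit.group(), "at nucleotide index", probe_hit.start())
print("antisense probe:", reverse_complement("GAXGGXTCTGGXGA"))
```

For the segment shown, the translation is L-P-Q-I-V-A-V-N-V-P-P-E-D-Q-D-G-S-G-D-D, which agrees with the second row of the first formula of Claim 6. The motif found is D-G-S-G-D, and the sense probe of Claim 19 matches the nucleotides that encode it.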
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US33158589A | 1989-03-29 | 1989-03-29 | |
US331,585 | 1989-03-29 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO1990012033A1 (en) | 1990-10-18 |
Family
ID=23294566
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US1990/001496 WO1990012033A1 (en) | 1989-03-29 | 1990-03-22 | Construction and use of synthetic constructs encoding syndecan |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO1990012033A1 (en) |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5422243A (en) * | 1991-01-15 | 1995-06-06 | Jalkanen; Markku T. | Detection of syndecan content in biological materials such as tissues and body fluids for indications of malignant transformations of cells |
US5486599A (en) * | 1989-03-29 | 1996-01-23 | The Board Of Trustees Of The Leland Stanford Junior University | Construction and use of synthetic constructs encoding syndecan |
US5610148A (en) * | 1991-01-18 | 1997-03-11 | University College London | Macroscopically oriented cell adhesion protein for wound treatment |
US5629287A (en) * | 1991-01-18 | 1997-05-13 | University College London | Depot formulations |
US5726058A (en) * | 1992-12-01 | 1998-03-10 | Jalkanen; Markku | Syndecan stimulation of cellular differentiation |
US5851993A (en) * | 1994-06-13 | 1998-12-22 | Biotie Therapies Ltd. | Suppression of tumor cell growth by syndecan-1 ectodomain |
US6017727A (en) * | 1994-03-07 | 2000-01-25 | Biotie Therapies Ltd. | Syndecan enhancer element and syndecan stimulation of cellular differentiation |
US6699968B1 (en) | 1989-03-29 | 2004-03-02 | Children's Medical Center Corporation | Construction and use of synthetic constructs encoding syndecan |
Non-Patent Citations (4)
Title |
---|
JOURNAL OF CELL BIOLOGY, "Cell Surface Proteoglycan of Mouse Mammary Epithelial Cells is Shed by cleavage of its Matrix-binding Ectodomain from its membrane-associated Domain", Vol. 105, pp. 3087-3096, December 1987, JALKANEN et al., See Abstract. * |
JOURNAL OF CELL BIOLOGY, "Molecular cloning of Syndecan, an integral membrane proteoglycan", Vol. 108, pp. 1547-1556, April 1989, SAUNDERS et al., See Fig. 1. * |
JOURNAL OF CELL BIOLOGY, "Mouse Mammary Epithelial Cells Produce Basement Membrane and Cell Surface Heparan Sulfate Proteoglycans Containing Distinct Core Proteins", Vol. 106, pp. 953-962, March 1988, JALKANEN et al., See abstract. * |
PROMEGA 1987/1988, Catalogue and Reference Guide, issued 1987. "Cloning Systems and Vectors", see pages 5 and 6. * |
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5486599A (en) * | 1989-03-29 | 1996-01-23 | The Board Of Trustees Of The Leland Stanford Junior University | Construction and use of synthetic constructs encoding syndecan |
US6531295B1 (en) | 1989-03-29 | 2003-03-11 | Children's Medical Center Corporation | Synthetic constructs encoding syndecan |
US6699968B1 (en) | 1989-03-29 | 2004-03-02 | Children's Medical Center Corporation | Construction and use of synthetic constructs encoding syndecan |
US7183393B2 (en) | 1989-03-29 | 2007-02-27 | Children's Medical Center Corporation | Construction and use of synthetic constructs encoding syndecan |
US5422243A (en) * | 1991-01-15 | 1995-06-06 | Jalkanen; Markku T. | Detection of syndecan content in biological materials such as tissues and body fluids for indications of malignant transformations of cells |
US5610148A (en) * | 1991-01-18 | 1997-03-11 | University College London | Macroscopically oriented cell adhesion protein for wound treatment |
US5629287A (en) * | 1991-01-18 | 1997-05-13 | University College London | Depot formulations |
US5726058A (en) * | 1992-12-01 | 1998-03-10 | Jalkanen; Markku | Syndecan stimulation of cellular differentiation |
US6017727A (en) * | 1994-03-07 | 2000-01-25 | Biotie Therapies Ltd. | Syndecan enhancer element and syndecan stimulation of cellular differentiation |
US5851993A (en) * | 1994-06-13 | 1998-12-22 | Biotie Therapies Ltd. | Suppression of tumor cell growth by syndecan-1 ectodomain |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AK | Designated states |
Kind code of ref document: A1 Designated state(s): CA JP |
|
AL | Designated countries for regional patents |
Kind code of ref document: A1 Designated state(s): AT BE CH DE DK ES FR GB IT LU NL SE |