WO2001036980A2 - A process for identifying the active site in a biological target - Google Patents
- Publication number
- WO2001036980A2 (application PCT/GB2000/004420)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- type
- ligand
- target
- ligands
- targets
Classifications
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B15/00—ICT specially adapted for analysing two-dimensional or three-dimensional molecular structures, e.g. structural or functional relations or structure alignment
- G16B15/30—Drug targeting using structural data; Docking or binding prediction
Definitions
- the present invention relates to processes for the identification of binding site(s) in biological targets such as mRNA, rRNA, tRNA, DNA, proteins, peptides, endogenous ligands, receptors and enzymes.
- the invention is based on the use of multivariate methods such as experimental design, Principal Component Analysis (PCA), Soft Independent Modelling of Class Analogies (SIMCA), Principal Component Regression (PCR), Projections to Latent Structures (PLS), Multivariate Design (MVD), Statistical Molecular Design (SMD), Informative Chemical Libraries, Multivariate Quantitative Structure Activity Relationships (MQSAR) and Multivariate Characterisation (MNC).
- ligands referred to as "ligands" herein
- biological testing of these ligands.
- the interaction(s) with the macromolecule or macromolecules of interest, i.e. the "targets"
- a lead compound is identified and optimised to meet the requirements of a candidate drug (CD).
- CD: candidate drug
- the currently known methods involve the synthesis of a huge number of compounds before a lead can be identified.
- a typical criterion for a ligand to be of interest is that it shows a desired activity, affinity or selectivity for a particular target.
- the ligands are most often tested for their affinity to macromolecular targets that are proteins such as enzymes, hormone receptors or G-protein coupled receptors. However, the testing of these ligands with other macromolecular targets such as specific sequences of DNA is also common.
- Typical examples of the ligands that it is desired to test, and to design improved variants of, include organic compounds, peptides (linear or cyclic sequences of amino acids), mixtures of peptides and organic compounds, and sequences of DNA.
- a common procedure in the development of active compounds is to randomly synthesise a library containing a few hundred up to millions of different compounds. An HTS-assay is then used for measuring the biological activity of the compounds or the interaction with the macromolecule(s) of interest. The most promising compounds are then selected for further refinement.
- a macromolecule of interest can, for example, be crystallized and its 3D structure determined by crystallographic methods. It is also possible to use NMR to elucidate the 3D structure of proteins. Once the 3D structure of the macromolecule is known, a chemical entity can be designed to fit into a suitable region of the macromolecule. A major problem, however, is that determining the 3D structure of macromolecules is difficult, expensive and not always possible (see Branden and Tooze, 1991).
- QSAR: Quantitative Structure Activity Relationships
- Such methods analyse the relation between the structures of test compounds and their affinity to the macromolecule. The information is then used to deduce better structures.
- a well-known example of a QSAR method is CoMFA (Cramer et al., J. Amer. Chem. Soc., 1988, 110, 5959-5967).
- So called pharmacophore models are also used in drug design (see e.g. Daveu et al. 1999; de Groot et al. 1999; McGregor et al. 1999).
- Proteins are built from amino acid chains (generally termed primary structures) that form structural elements (motifs) such as α-helices, β-sheets, loop regions, hairpin β motifs, and the like (generally termed secondary structures), which then are used in the building of larger structures (generally termed tertiary structures or domains); these domains in turn form the overall protein structure (generally called quaternary structures) (for extensive examples and discussion on this topic, reference is given to Branden and Tooze, 1991).
- Integral membrane proteins often exist in a large number of homologous variants.
- Well known examples include tyrosine kinases, serine/threonine kinases, ion channels, G-protein coupled receptors and the steroid/thyroid hormone receptor family.
- For example, about 1000 different variants of the G-protein coupled receptors have been cloned and sequenced.
- the large group of G-protein coupled receptors constitutes a good example of homologous proteins with similar structural organization.
- the G-protein coupled receptors are known to be built from one single amino acid chain forming seven transmembrane α-helices, one extracellular N-terminal amino acid chain, one intracellularly located C-terminal amino acid chain, three extracellular loops and three intracellular loops (Baldwin, 1993).
- the methods that are known in the art make use of information on new or known ligands, and in some cases variants thereof, and the affinity to the target. In most of the cases, information on a number of variants of the ligands in question is correlated with information derived from the binding of these variants with a single target molecule. In a large number of cases, the information of the 3D-structure of the target is used. However, use of the combination of chemical physical descriptors for both the target and the ligands simultaneously for the identification of the active site of the target by applying quantitative methods has never been done.
- the present invention also takes advantage of the fact that available technology allows the construction and production of modified macromolecules. This is a technology that in the case of proteins is generally referred to as protein engineering (Branden and Tooze, 1991).
- one or several amino acids in a protein may be exchanged for other amino acid(s), removed or new amino acids are added. This is generally done by so-called directed mutagenesis techniques.
- one specific amino acid in a protein can be exchanged for another amino acid (see e.g. Frandberg et al, 1994).
- chimeric proteins have incorporated or exchanged parts of the amino acid sequence(s) from another protein.
- see Schioth et al. (1998).
- the analogous procedure for the construction of chimeric DNA's can of course also be undertaken.
- each amino acid in a protein or peptide sequence may be quantitatively characterised, i.e. be translated to three latent variables containing physical and chemical information. This means that instead of comparing sequences with a one-letter code, a quantitative description of each sequence can be generated.
- This method has been used for obtaining descriptors of the ligands (the active compounds) and then used in MQSAR to relate chemical structures to properties or biological activities (BA).
- a further improvement is to combine the ACC with OSC (Orthogonal Scatter Correction) to reduce the noise in the models (Andersson et al. 1998).
- peptides may be applied for DNA, RNA and other polymers or oligomers.
- the current invention provides a novel method for identifying the interaction site, binding site or active site in a macromolecule such as mRNA, rRNA, tRNA, DNA, peptides, proteins, carbohydrates or any kind of oligomers or polymers, whether natural or synthetic.
- the invention relates to the use of "informative combinatorial chemistry", "informative peptide libraries", MQSAR and a chemical/physical description of the target either based on the principal properties for the building blocks of the target (i.e. amino acids or similar) or handled as chimeric target proteins or handled as mutated target proteins.
- Other macromolecules such as mRNA, rRNA, tRNA, DNA, peptides or enzymes can be handled in the same way.
- the invention relates to a process for characterising the interaction between a Ligand Y and a Target X comprising:
- Step 1 Obtaining information representing one or more chemical and/or physical properties of at least two ligands of the type Y;
- Step 2 Obtaining information representing one or more chemical and/or physical properties of at least two targets of the type X;
- Step 3 Obtaining information representing one or more chemical and/or physical properties of the interaction between at least two of the ligands of type Y and at least two of the targets of the type X;
- processing the information from Steps 1, 2 and 3 in order to produce a model of the interaction between the Ligand Y and Target X from which one or more of the properties of the interaction between the Ligand Y and the Target X may be characterised.
- the term "characterising the interaction" includes obtaining information on, determining, predicting or estimating at least one chemical and/or physical property of the interaction or of the sites of interaction; estimating or predicting the position of the site of interaction within the Target X; estimating or predicting the position of the site of interaction within the Ligand Y; estimating or predicting the binding affinity, selectivity, activity, biological activity or avidity of the Ligand Y or Y' for Target X; estimating or predicting which subsequences, regions or parts of the Ligand Y interact with the Target X; or estimating or predicting which subsequences, regions or parts of the Target X interact with the Ligand Y.
- the invention also provides a process for estimating the position of the active site in a Target X in an interaction between a Ligand Y and a Target X, or estimating one or more physical and/or chemical properties of the active site, comprising the above Steps 1, 2, and 3, and correlating the information from Steps 1, 2 and 3 in order to produce a model of the interaction between the Ligand Y and the Target X from which the position of the active site or one or more physical and/or chemical properties of the active site in the Target X may be estimated.
- the invention further provides a process for predicting the position of the active site in an interaction between a Ligand Y and a Target X, or predicting one or more physical and/or chemical properties of the active site, comprising: the above Steps 1, 2, and 3; Step 4, which comprises correlating the information from Steps 1, 2 and 3 in order to produce a model of the interaction between the Ligand Y and the Target X; and using the model to predict the position of the active site or one or more physical and/or chemical properties of the active site.
- a further embodiment of the invention provides a process performed with the aid of a programmed computer for the estimation of the position of the active site in a Target X, in an interaction between a Ligand Y and a Target X, or one or more physical and/or chemical properties of the active site, comprising the steps of:
- Step 1 Inputting information representing one or more chemical and/or physical properties of at least two ligands of the type Y;
- Step 2 Inputting information representing one or more chemical and/or physical properties of at least two targets of the type X;
- Step 3 Inputting information representing one or more chemical and/or physical properties of the interaction between at least two of the ligands of type Y and at least two of the targets of the type X;
- Step 4 Computing a model from the inputted information which describes the interaction between the Ligand Y and the Target X; and then using the model to estimate the position of the active site, or to estimate one or more physical and/or chemical properties of the active site.
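The four steps above can be sketched as a data-assembly routine. This is a minimal illustration only, assuming that each measured target/ligand pair becomes one row holding the target descriptors, the ligand descriptors and their pairwise products; the cross-terms, the function names and the toy numbers are assumptions made for this sketch and are not taken from the text. The resulting rows and responses would then be passed to whatever correlation method is chosen for Step 4.

```python
from itertools import product


def pair_row(x_desc, y_desc):
    """Concatenate target descriptors, ligand descriptors and their cross-terms."""
    cross = [xv * yv for xv, yv in product(x_desc, y_desc)]
    return list(x_desc) + list(y_desc) + cross


def build_design_matrix(targets, ligands, activity):
    """Return (rows, responses) with one row per measured target/ligand pair."""
    rows, responses = [], []
    for (t_name, l_name), value in sorted(activity.items()):
        rows.append(pair_row(targets[t_name], ligands[l_name]))
        responses.append(value)
    return rows, responses


# Hypothetical toy data: two targets and two ligands, two descriptors each.
targets = {"X1": [0.0, 1.0], "X2": [1.0, 0.0]}          # Step 2
ligands = {"Y1": [0.5, -0.5], "Y2": [-1.0, 2.0]}        # Step 1
activity = {("X1", "Y1"): 6.2, ("X1", "Y2"): 7.9,       # Step 3
            ("X2", "Y1"): 5.1, ("X2", "Y2"): 8.4}

rows, responses = build_design_matrix(targets, ligands, activity)
```

With at least two targets and two ligands, every row carries information about both interaction partners, which is what allows a fitted model to attribute the measured interaction to particular target descriptors.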
- the invention also provides a process for assisting in the design of a Ligand Y' which binds to a Target X, the Ligand Y' having an increased or decreased binding affinity, selectivity or avidity for the Target X compared to that of a Ligand Y, comprising the Steps 1, 2 and 3 of the invention; and then correlating the information from Steps 1, 2 and 3 in order to produce a model of the interaction between the Ligand Y and the Target X from which the structure and/or one or more chemical and/or physical properties of the Ligand Y' may be estimated or predicted.
- a further embodiment provides a process for estimating or predicting the binding affinity, selectivity or avidity of a Ligand Y' with a Target X, comprising Steps 1, 2 and 3 of the invention; and then correlating the information from Steps 1, 2 and 3 in order to produce a model of the interaction between the Ligand Y and the Target X from which the binding affinity, selectivity or avidity of the Ligand Y' with the Target X may be estimated or predicted.
- the Ligand Y' is a ligand of the type Y, as herein defined.
- the first steps are to describe the target X (and ligand Y) by numbers which give a good representation of the chemical/physical properties of the target (and/or ligand, respectively). Examples of different ways to describe the targets and ligands are given below:
- the processes of the invention will make use of information which directly represents the chemical/physical properties of the targets and/or ligands. In most cases, however, the processes of the invention will make use of the information (i.e. descriptors) which indirectly represents the information on the chemical/physical properties of the targets and/or ligands, i.e. the latter information is subjected to a conversion, operation, transformation or translation process such as those described herein (e.g. principal properties, bit-vectors, PCA, ACC, etc.) before being correlated.
- the targets X may be of any chemical nature but it is preferred that X is represented by proteins, peptides, large or small peptides, protein subunits, receptors, ion-channels, transporters, carriers, enzymes, drug binding proteins, proteins participating in cell signaling, polymers, structures (including linear, cyclic and branched structures and combinations thereof) which at least in part are being composed of building blocks, DNA's, part of DNA's, DNA sequences, RNA's, part of RNA's, RNA sequences, transfer RNA's, messenger RNA's, carbohydrate, with proteins being most preferred.
- X may be selected so as to contain one peptide chain (i.e.
- X may equally well be selected to contain several peptide chains held together by molecular interactions (e.g. the macromolecule being a multimeric protein, in other words being composed of several subunits).
- any chemical modification of the amino acid chain of X is allowed.
- modification(s) include (but are not limited to) glycosylation, palmitoylation, phosphorylation, proteolytic degradation, peptide chain breaks, nicking, oxidations, or any other chemical, biochemical or biological modification(s).
- X can also contain non-protein moieties such as co-factors, prosthetic groups, metal atoms, and the like. It is also allowed that natural amino acids of a protein are exchanged for non-natural amino acids.
- the molecular weight of the target X is preferably larger than 5000 g/mole, more preferably larger than 7000 g/mole, even more preferably larger than 10000 g/mole, even more preferably larger than 12000 g/mole, still even more preferably larger than 14000 g/mole, even still even more preferably larger than 17000 g/mole and most preferably larger than 20000 g/mole.
- the molecular weight of target X is larger than 25000 g/mole or even larger than 30000 g/mole and more.
- the molecular weight of X can be as low as 3000 g/mole or even as low as 2000 g/mole or even as low as 1000 g/mole or smaller, or even lower, or of any other molecular weight suited for the problem to be investigated.
- the ligands included in Y are of any chemical nature.
- Y includes (but is not limited to) organic compounds, chemical libraries, peptides, peptide libraries, protein subunits, proteins, receptors, ion-channels, transporters, carriers, enzymes, drug binding proteins, proteins participating in cell signaling, polymers and structures (including linear, cyclic and branched structures and combinations thereof) which at least in part are composed of building blocks, non-peptides, organic chemical compounds, DNA's, part of DNA's, DNA sequences, RNA's, RNA sequences, part of RNA's, transfer RNA's, messenger RNA's, carbohydrates, hybrids of any of the aforementioned and the like.
- Y is preferably an informative organic library or an informative peptide library. Also Y can be a set of substances taken from nature, e.g., natural substance libraries.
- Y is selected to be a ligand which has the properties that are listed above for the properties of target X.
- the molecular weight of Y is preferably not restricted to any particular size; it may be small or it may be large.
- Target X and Ligand Y are essentially interchangeable, i.e. the invention is not limited to interactions between "targets" and "ligands"; it applies to any entities which are capable of interacting with one another.
- the molecular weight of Y is within the range 100 - 5000 g/mole.
- Y is of a macromolecular nature.
- Y is preferably larger than 5000 g/mole, more preferably larger than 7000 g/mole, even more preferably larger than 10000 g/mole, even more preferably larger than 12000 g/mole, still even more preferably larger than 14000 g/mole, even still even more preferably larger than 17000 g/mole and most preferably larger than 20000 g/mole.
- the molecular weight of molecule Y is larger than 25000 g/mole or even larger than 30000 g/mole and more.
- the ligand Y is a small peptide or a low molecular weight organic compound within the range of 100-5000 g/mole, preferably below 2000 g/mole, or even more preferably below 1000 g/mole or most preferably below 850 g/mole.
- the information on the properties of the targets of type X and/or the information on the properties of the ligands of type Y may be derived, inter alia, from atom counts, measured or calculated thin layer liquid chromatography (TLC), retention times on HPLC, refractive index, isoelectric point, melting point, boiling point, molecular weight, hydrophobicity, hydrophilicity, chromatographic mobility, van der Waals volume, octanol/water partition coefficient (logP), energy of molecular orbital, heat of formation, polarizability, electronegativity, hardness, total accessible molecular surface area, polar accessible molecular surface area, nonpolar accessible molecular surface area, number of hydrogen bond donors, number of hydrogen bond acceptors, charge, IR-spectra, NMR-spectra or other spectra, HOMO, LUMO, semi-empirical calculations, ab initio calculations or 3D quantum mechanical calculations.
- the descriptors of X and descriptors of Y may be calculated from already known facts about X or Y (e.g. the structural formula of X or Y), rather than obtaining them by chemical or physical measurements. In some cases, however, the information on the properties may be determined experimentally.
- at least some of the "targets of type X" should be capable of interacting with at least some of the "ligands of type Y". However, it is not a prerequisite that all targets of type X are capable of interacting with all of the ligands of type Y, since a non-interacting X/Y pair might also provide useful information.
- the majority (e.g. 50%, 60%, 70%, 80%, 90% or even 100%) of the "targets of type X" are capable of interacting with the majority (e.g. 50%, 60%, 70%, 80%, 90% or even 100%) of the "ligands of type Y".
- the "targets of type X" all have similar physical, chemical, biological and/or pharmacological properties. In other embodiments of the invention, the "targets of type X" share similar structural, compositional or organisational features.
- the targets of the type X show high diversity.
- the "ligands of type Y" all have similar physical, chemical, biological and/or pharmacological properties. In other embodiments of the invention, the "ligands of type Y" share similar structural, compositional or organisational features.
- Preferred targets of type X are macromolecules having a polymeric structure that show sequence (or other building block or composite building block) homologies of at least 10 %, more preferably of at least 20 % and most preferably of at least 30 %. Even more preferred are macromolecules of type X whose sequence(s) or subsequence(s) (or other building block(s) or composite building block(s)) included in the region(s) included in the analysis according to the procedures of the invention show homologies of at least 10 %, more preferably of at least 20 % and most preferably of at least 30 %.
- macromolecules of type X where the subsequences (or other building blocks or composite building blocks) comprising parts included in the analysis according to the procedures of the invention show homologies of at least 10 %, more preferably of at least 20 % and most preferably of at least 30 %.
- Preferred methods for calculating homologies are by using the BLAST algorithm (http:/www.bioactivesite.com/darwin2000/blast/; November 15 2000), using standard settings.
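To make the homology thresholds above concrete, the following sketch computes a simple percent identity for two sequences that have already been aligned. It is not the BLAST algorithm; the gap handling and the function name are assumptions chosen for illustration, and a BLAST-derived figure may differ from this count.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent of aligned, non-gap positions at which the two sequences agree."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # positions with a gap in either sequence are not compared
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Hypothetical aligned fragments from two variants of a target.
identity = percent_identity("ACDEFGHIKL", "ACDQFGHI-L")
meets_most_preferred = identity >= 30.0
```

Here eight of the nine compared positions agree, so the pair would clear the 30 % level named above under this simplified measure.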
- a "target of the type X" is, in most cases, what would generally be termed a variant of the Target X, i.e. one which shares a property or function in common with the Target X.
- each of the targets of the type X independently shares a structure or a DNA, RNA or amino acid sequence in common with the Target X; or a building block (as defined herein) or combination of building blocks in common with the Target X.
- a building block as defined herein
- ligand of the type Y when the ligands of type Y are macromolecules.
- targets of the type X which are chimeric variants of the Target X, i.e. which differ from the Target X through the exchange or addition of one or more amino acids or nucleotides, or amino acid or nucleotide sequences.
- the present invention recognizes that biological macromolecules have polymeric structures, as they are composed of smaller building blocks linked together. Thus, proteins are composed of chains of amino acids linked together, while DNA is composed of chains of nucleotides linked together.
- the present invention takes advantage of the polymeric nature of macromolecules in the analysis of the chemical and/or physical properties of X. Using such an approach it is not necessary to have any information on the positions of all atoms in X in three dimensional space. This distinguishes the present method from e.g. molecular modelling (or any similar method), which aims to determine the position of all (or at least most of) the molecule's atoms in X in three dimensional space.
- most embodiments of the present invention exclude the use of numeric information which refers to co-ordinates of all (or at least most of) the atoms in X in three-dimensional space. Moreover, for the same reason, some embodiments of the invention exclude the use of numeric information that directly refers to co-ordinates of all (or at least most of) atoms in Y in three-dimensional space. Accordingly, some embodiments of the invention exclude the use of numeric information which directly refers to co-ordinates of all (or at least most of) atoms in both X and Y in three-dimensional space. Moreover, the present invention recognizes that the three dimensional structure of a molecule can be described by the angles between its atoms.
- numeric information which refers to angles between all (or at least most of) atoms in X.
- numeric information that directly refers to angles between all (or at least most of) atoms in Y.
- numeric information which directly refers to angles between all (or at least most of) atoms in both X and Y.
- the term angle in this context includes bond angles, torsion angles and dihedral angles.
- building block is defined as a chemical residue that can be linked together with other chemical residues so as to create a chain.
- Building blocks usually come in sets, where each member contains variable region(s) that bring different chemical properties to the different building blocks, and chemical groups which are used to link the building blocks together.
- a set of building blocks could contain the eight members ("residues") a, b, c, d, e, f, g and h.
- a molecule of desired size and composition could then be created by linking the building blocks together, e.g.:
- polymers used in the invention can also exist in cyclic variants, such as:
- L is a chemical group comprising a linker so as to create a cycle
- T and E are start and end groups.
- polymeric structures can exist in branched variants, such as e.g.
- T-a-c-g-h-a-c-c-E where L is a chemical group comprising a linker so as to create a branch in the molecule, and T and E are start and end groups.
- a structure used in conjunction with the present invention can have zero, one or more cycles.
- a structure used in conjunction with the present invention can have zero, one or more branches.
- a structure used in conjunction with the present invention can be modified chemically by removing, adding or exchanging atom(s) within a building block.
- a structure used in conjunction with the present invention can be modified chemically by removing, adding or exchanging chemical groups within a building block.
- start-groups can be the same or different.
- end-groups can be the same or different.
- the molecular weight of a building block is preferably less than 10000 g/mole, more preferably less than 5000 g/mole, even more preferably less than 3000 g/mole, even somewhat more preferably less than 2000 g/mole and most preferably less than 1500 g/mole.
- the molecular weight of the building blocks is quite small, such as less than 1000 g/mole, more preferably less than 600 g/mole, even more preferably less than 400 g/mole, even less than 300 g/mole, even less than 200 g/mole, even less than 100 g/mole.
- the building blocks are sets of amino acid residues.
- amino acid residue is defined as a residue of glycine, alanine, valine, leucine, isoleucine, serine, cysteine, threonine, methionine, phenylalanine, tyrosine, tryptophan, proline, histidine, lysine, arginine, aspartic acid, glutamic acid, asparagine, glutamine, and any other naturally occurring amino acid.
- the building block may also include a residue having the following general structure
- Z is hydrogen, X or -CH2X where X is a chemical moiety of any structure with molecular weight preferably less than 2000 g/mole, more preferably less than 1000 g/mole, even more preferably less than 600 g/mole, and most preferably less than 400 g/mole, T is a start-group of any desired structure, or a bond to another amino acid residue, and E is an end-group of any desired structure, or a bond to another amino acid residue.
- nucleotide, in other words e.g. deoxyadenosine 5'-phosphoric acid, deoxyguanosine 5'-phosphoric acid, deoxycytidine 5'-phosphoric acid, deoxythymidine 5'-phosphoric acid, deoxyuridine 5'-phosphoric acid.
- artificial nucleotides may be used, such as deoxyinosine 5'-phosphoric acid, and the like. The invention of course recognizes that DNA and RNA generally occur in double stranded form with matching (or eventually mismatching) base pairs.
- Compounds of building blocks may include common atoms of organic compounds such as hydrogen, carbon, nitrogen, oxygen, sulphur and phosphorus. However, other atoms may also be used, e.g. silicon. Polymers used in the present invention include, for both X and Y, silicon-containing compounds (as well as other types of organo-metallic compounds) which by use of the procedures of the invention can be optimized for desired properties, e.g. for use as catalysts that can withstand harsh conditions (e.g. high temperature, high or low pH, etc.).
- Both X and Y for use with the procedures of the invention include both natural and synthetic compounds.
- chimeric proteins and in an analogous fashion to the use of chimeric DNAs.
- regions of the amino-acid chains of two or more homologous proteins or DNA's can be exchanged so as to create new proteins (or DNA's) inheriting properties of the original proteins (or DNA's).
- Creating such a set of chimeric proteins (or DNA's) and using them as X's in conjunction with the procedures of the invention will create a case where the used proteins (or DNA's) are likely to show gross similarities in their three dimensional organization.
- the invention also includes the use of informative peptide libraries and the use thereof for identification of the active site in any kind of biological target.
- Examples of such libraries are given in Examples 3 to 8 and can be used separately or in combination with each other or with any kind of peptides. Since the number of amino acids (AA) in the peptides varies, a pre-treatment with ACC of the matrix describing the properties of the different peptides is preferably made in order to obtain a uniform matrix. This is generally necessary for making the required calculations in order to identify the "active site" in the target.
- PP's Principal Properties
- Table 1 The z-scale used to characterise each amino acid.
- a 330 AA long peptide (e.g. a receptor) will be described by 990 numbers reflecting its chemical/physical properties. Principal properties may also be used to describe the ligand. In an analogous manner, any polymeric structure may be described by numbers representing chemical and/or physical properties of its building blocks.
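The arithmetic above (three numbers per residue, so 330 residues give 990 numbers) can be sketched as follows. The numeric values in `PLACEHOLDER_SCALE` are invented placeholders and are not the z-scale values of Table 1; the function name is likewise only illustrative.

```python
def encode_sequence(sequence, scale):
    """Flatten a sequence into one descriptor vector, three numbers per residue."""
    vector = []
    for residue in sequence:
        vector.extend(scale[residue])
    return vector


# Placeholder values only: NOT the z-scale values of Table 1.
PLACEHOLDER_SCALE = {
    "A": (0.1, -1.0, 0.2),
    "K": (2.0, 1.5, -3.0),
    "F": (-4.0, 1.0, 0.5),
}

short_peptide = encode_sequence("AKFA", PLACEHOLDER_SCALE)
receptor_like = encode_sequence("AKF" * 110, PLACEHOLDER_SCALE)  # 330 residues
```

The same function works for any polymer once a table of per-building-block properties is supplied in place of the placeholder scale.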
- Step 1b Binary coding bit vectors
- a second approach for the assignment of descriptors to the target X and/or to the ligand Y is to use a binary coding and create a "bit-vector".
- a binary assignment may be performed as follows:
- a zero is used for one of the variations and a one for the other.
- A1B1C1 is described by 000
- A2B1C1 is described by 100
- A1B2C1 is described by 010, etc.
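The binary assignment listed above can be expressed as a small function. This is a sketch assuming that each part has exactly two variants written as a letter followed by 1 or 2; the function name and the error handling are illustrative choices.

```python
def chimera_bits(chimera, parts=("A", "B", "C")):
    """Map e.g. 'A2B1C1' to '100': 0 for variant 1 of a part, 1 for variant 2."""
    bits = []
    for part in parts:
        position = chimera.index(part)      # where this part's letter sits
        variant = chimera[position + 1]     # the digit that follows the letter
        if variant not in ("1", "2"):
            raise ValueError(f"unknown variant for part {part}: {variant}")
        bits.append("0" if variant == "1" else "1")
    return "".join(bits)


codes = [chimera_bits(name) for name in ("A1B1C1", "A2B1C1", "A1B2C1")]
```

Each bit position then acts as one descriptor column, so a model coefficient attached to that column indicates how much the exchange of that part affects the interaction.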
- DNA is composed of building blocks (also termed bases or nucleotides) termed adenine (A), thymine (T), cytosine (C) and guanine (G).
- bases of DNA are sometimes allowed to be exchanged with other bases without affecting the functionality of the DNA.
- phenylalanine is coded for by TTT and TTC while serine is coded for by TCT, TCC, TCA and TCG.
- A or T or C or G: 1111, etc.
- when an artificial base such as inosine (I) is used, hybridization does not occur. Accordingly, inosine could be described as 0000. (However, the numbers 1111 may be used if more appropriate for the problem under investigation, as inosine would not have any negative effect on hybridization thus allowing any base to match).
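The four-bit coding of exchangeable bases can be sketched as below. The text only preserves the code 1111 for "A or T or C or G" and the codes 0000 or 1111 for inosine, so the bit order assigned to the individual bases here is an assumption, as is the function name.

```python
# Assumed bit order (one bit per base); the original assignment table is not reproduced.
BASE_BITS = {"A": 0b1000, "T": 0b0100, "C": 0b0010, "G": 0b0001}


def encode_position(allowed_bases):
    """Return a 4-character bit string marking every base allowed at a position."""
    code = 0
    for base in allowed_bases:
        code |= BASE_BITS[base]
    return format(code, "04b")


any_base = encode_position("ATCG")     # position where any base is tolerated
phe_third = encode_position("TC")      # third codon position of TTT / TTC
no_match = encode_position("")         # the 0000 option mentioned for inosine
```

Under this assumed ordering the phenylalanine codons TTT and TTC would be described by 0100, 0100 and 0110 for their three positions.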
- Part of type A2 0 1 0
- assignments may be done based on the above, e.g.:
- An additional building block A4 having "no effect" may be described by 0 0 0.
- An additional building block A5 combining the effects of A1 and A2 may be described by 1 1 0.
- An additional building block A7 combining the effects of A1, A2 and A3 may be described by 1 1 1, etc.
- effect may be interpreted as changes in the chemical and/or physical properties and for identification of interactions between molecules of type X with molecules of type Y.
- Step 1c Bit vectors for the description/characterisation of structures for use as descriptors of X or descriptors of Y.
- Another way of assigning descriptors to X and/or Y is to create bit vectors, counting how many times a defined structural feature occurs in a structure.
- the concept is illustrated for a small set of structures in Table 2.1 using the bit string variables defined in Table 2.2.
- the structural features defined in the bit strings are used to identify how many times they occur in the investigated structures, resulting in Table 2.3, which is a description of the structures in Table 2.1, using the descriptors in Table 2.2.
- the bit strings used here serve only as an example to illustrate the method. All structural functionalities that occur in the structures under investigation may be added as bit string variables.
- the bit strings may also be used as only indicating the presence or absence of the feature defined by the bit string, which would result in Table 2.4.
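The difference between counting a structural feature and merely flagging its presence can be sketched as follows. Tables 2.1 to 2.4 are not reproduced here, so the feature names and the example structure are hypothetical, and each structure is assumed to be given as a list of already identified functional groups.

```python
FEATURES = ("hydroxyl", "carbonyl", "phenyl")  # hypothetical bit-string variables


def count_vector(groups, features=FEATURES):
    """How many times each defined feature occurs in the structure."""
    return [groups.count(feature) for feature in features]


def presence_vector(groups, features=FEATURES):
    """1 if the feature occurs at least once, otherwise 0."""
    return [int(feature in groups) for feature in features]


structure = ["phenyl", "hydroxyl", "hydroxyl"]  # hypothetical structure
counts = count_vector(structure)
flags = presence_vector(structure)
```

The count form keeps more information, while the presence form gives a strictly binary matrix; which is preferable depends on the problem under investigation.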
- Additional operations may also be carried out on the descriptors of X and/or Y, for example, translation, PCA and ACC, as exemplified further below.
- Crossed auto covariances (CC) between two different scales, j and k, are calculated according to the following equation:
- the method may be used both for obtaining Descriptors of X and Descriptors of Y, to be used in conjunction with the procedures of the invention.
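Because the equation itself is not reproduced above, the following sketch uses one commonly quoted form of the auto and cross covariance transform, in which the products of scale j at position i and scale k at position i + lag are summed and divided by (n - lag). That normalisation, the absence of mean-centring and the function names are assumptions of this sketch rather than the document's definition.

```python
def cross_covariance(scale_j, scale_k, lag):
    """Covariance between scale j at position i and scale k at position i + lag."""
    n = len(scale_j)
    if len(scale_k) != n:
        raise ValueError("both scales must cover the same sequence")
    if not 0 < lag < n:
        raise ValueError("lag must be between 1 and sequence length - 1")
    total = sum(scale_j[i] * scale_k[i + lag] for i in range(n - lag))
    return total / (n - lag)


def acc_transform(descriptors, max_lag):
    """Turn per-residue descriptor tuples into a vector of fixed length."""
    n_scales = len(descriptors[0])
    columns = [[row[s] for row in descriptors] for s in range(n_scales)]
    terms = []
    for lag in range(1, max_lag + 1):
        for j in range(n_scales):
            for k in range(n_scales):
                # j == k gives an auto covariance, j != k a crossed one
                terms.append(cross_covariance(columns[j], columns[k], lag))
    return terms


# Hypothetical per-residue descriptors for peptides of different length.
short = [(0.1, -1.0, 0.2), (2.0, 1.5, -3.0), (-4.0, 1.0, 0.5)]
longer = short + [(0.1, -1.0, 0.2), (2.0, 1.5, -3.0)]
acc_short = acc_transform(short, max_lag=2)
acc_longer = acc_transform(longer, max_lag=2)
```

Both outputs contain max_lag times the square of the number of scales, independent of peptide length, which is what makes the transform useful for obtaining a uniform matrix.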
- a further improvement is to combine ACC with OSC (Orthogonal Scatter Correction) to reduce the noise in the models (Andersson et al., 1998).
- By combining parts (or building blocks or composite building blocks) from two macromolecules of type X, chimeric macromolecules may be manufactured.
- two molecules of type Y that are divided into parts (or building blocks or composite building blocks) may be exchanged so as to create chimeric variants of Y.
- the approach is of course not limited to mixtures of two original variants of X (or Y), but may be extended to any number of original variants.
- experimental design (Lundstedt et al. 1998; Box et al. 1978) will enhance the analysis when used in conjunction with the procedures of the invention.
- Experimental design is used to allow the extraction of the maximum information from the selected subset of chimeras.
- the method is exemplified by a molecule that contains four parts A, B, C and D, with two possible variations each (c.f. the case shown in Example 1:1 for chimeric MC1 and MC3 receptors).
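For four parts with two variations each, the complete set of chimeras and one possible designed subset can be enumerated as below. The choice of a half fraction that keeps chimeras with an even number of variant-2 parts is an assumption made for illustration; the text does not state which subset is selected.

```python
from itertools import product


def full_factorial(parts=("A", "B", "C", "D"), variants=("1", "2")):
    """All chimeras obtainable by choosing one variant for every part."""
    return ["".join(part + variant for part, variant in zip(parts, combo))
            for combo in product(variants, repeat=len(parts))]


def half_fraction(chimeras):
    """Keep chimeras with an even number of variant-2 parts (a 2^(4-1) design)."""
    return [c for c in chimeras if c.count("2") % 2 == 0]


all_chimeras = full_factorial()
designed_subset = half_fraction(all_chimeras)
```

The full design has 16 members and the half fraction 8, so the number of chimeras that must be constructed and tested is halved while every part still appears in both variants equally often.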
- the second step in this procedure is to describe the ligands Y in a relevant way.
- the approaches include those described above and also those which are used to design and describe informative chemical libraries or informative peptide libraries (Lundstedt et al. 1997 and Andersson et al. 1999).
- Any other description of the Ligand Y may be used, such as any of the conventional descriptions used in QSAR and MQSAR for description of physicochemical properties of organic molecules. Examples of such useful descriptions include GRID (Goodford, 1985) and GRIND descriptors (http/www.miasrl.com/software/amanual/backgr.html of November 15, 2000).
- a preferred method for the description of the Ligand Y is through the use of an informative peptide library.
- the twenty natural aminoacids were characterised using the z-scale developed by Hellberg et al., Table 1, resulting in a description of each amino acid with three numerical variables.
- the twenty amino acids were thereafter sorted into nine different groups according to a 2^3 full factorial design, as in Table 2.
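One way to picture a 2^3 factorial grouping is to assign each amino acid to the corner of the design given by the signs of its three descriptor values. The sketch below covers only those eight sign combinations; how the ninth group of Table 2 is formed is not reconstructed here, and the scale values are invented placeholders rather than the values of Table 1.

```python
def octant(z, threshold=0.0):
    """Sign pattern of the three descriptor values, e.g. '+-+'."""
    return "".join("+" if value > threshold else "-" for value in z)


def group_by_octant(scale):
    """Collect residues that share the same sign pattern."""
    groups = {}
    for residue, z in scale.items():
        groups.setdefault(octant(z), []).append(residue)
    return groups


# Placeholder values only: NOT the z-scale values of Table 1.
PLACEHOLDER_SCALE = {
    "A": (0.1, -1.0, 0.2),
    "K": (2.0, 1.5, -3.0),
    "F": (-4.0, 1.0, 0.5),
    "G": (0.3, -2.0, 0.4),
}
groups = group_by_octant(PLACEHOLDER_SCALE)
```

Picking one residue from each group at each varied position spreads a peptide library across the descriptor space instead of sampling it at random.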
- the library may also consist of non-peptidic compounds, e.g. low molecular weight organic or inorganic compounds.
- the third step in the procedure is to measure the interaction between the ligands of the type Y and targets of the type X. This may be measured by any means known per se.
- the interaction may be quantitated, for example, on the basis of binding affinity, selectivity, activity, biological activity, avidity, Km of enzyme, hybridisation or any other means which directly or indirectly provides a measure of the interaction.
- the affinity or activity of the different Ligands Y (most preferably from an informative compound library) for a target X or a number of targets X is measured.
- binding affinity or biological activity may, for example, be determined using methods described by Lunec et al. (1992), Szardenings et al. (1992), Schioth et al. (1992) or other similar methods.
- a very specific example of how to measure the biological activity is the use of ligand binding methods.
- concentrations of X and Y are usually incubated together.
- concentrations of Y are varied systematically across different assays containing the same concentration of X; the amount of Y bound to X is then measured and related to the activity for the interaction of variants of Y with variants of X.
- a third labelled molecule (the "labelled ligand") is added which also binds to X, the binding of the labelled ligand being prevented by Y.
- the degree to which, and the concentrations at which, variants of Y prevent the binding of the labelled ligand to variants of X are related to the activity of the interactions of the Y's with the X's.
- Such ligand binding methods are well known in the art, specific examples are found in Uhlen and Wikberg
- the binding approach may also be useful when X is a non-protein macromolecule, such as DNA.
- Methods for hybridization measurements for DNA are well known in the art.
- the capacity of variants of X to convert a substrate to a product may be measured.
- the influence of different concentrations of variants of Y to either inhibit or promote the conversion of the substrate to the product may then be measured and used as a measure of BA.
- measures of BA include quantifying second messenger elements (cAMP, cGMP, intracellular calcium concentrations, inositol triphosphate, diacylglycerol, and the like), and quantifying protein phosphorylation (including phosphorylations of tyrosine, serine and threonine). Such measurements can typically be done in organs, isolated cells, cells in culture, cell free systems, membrane preparations, and the like.
- other measures of BA include measurements of ion-channel opening and closure, single ion channel currents, membrane potentials, voltage clamping and other electrophysiological measurements.
- any suitable biochemical, biophysical or pharmacological response related to the interaction of X and Y can be used as a measure of BA.
- a very specific example is quantifying the dimerization of tyrosine kinase receptors.
- X could be one variant of a subunit of the tyrosine kinase and Y another variant of a subunit of the tyrosine kinase, with the capacity of X to interact with Y quantitated and used as a measure of BA.
- another measurement for use as BA is the avidity.
- X could include variants of antibodies and Y could include variants of antigens. The degree of interaction of X with Y can then be measured by using methods well known in immunology, such as by quantifying avidity.
- the X may also be included in a multicellular organism.
- the production of transgenic animals is well known in the art.
- Obtaining the BA may, e.g., involve the administration of a Y to transgenic animals containing different variants of X and observing any desired physiological response in the animal.
- X may also be included in a viral particle or a phage.
- E.g. X may be included within the amino acid sequence of a capsid protein of a virus or phage, e.g. M13-phages.
- the method for obtaining BA according to the procedure of the invention may be used in several ways for the analysis of the interactions of X and Y, for the design of improved macromolecules X and/or for the design of improved molecules Y.
- Steps 1, 2 and 3 do not have to be carried out in this order. They may be carried out in any order or even simultaneously.
- the fourth step is to establish a mathematical model describing the observed interaction between the Ligand Y and Target X, as a function of the properties of the ligands Y and the properties of the targets X.
- a preferred procedure for identifying an active site in a macromolecule is based on the chimeric approach exemplified by receptors. This is the fastest and simplest route for finding the region of the target wherein the active site is located.
- the chimeric receptors are preferably combined in accordance with a multivariate design, factorial or fractional factorial design, in order to obtain a well balanced and informative set of combined chimeric receptors. This is the first real step towards informative combinatorial biology.
- the use of naturally-occurring variants of the Target X has proven to be surprisingly effective and useful in conjunction with the present invention.
- Identification of the active site is done by describing the biological activity (BA) as a function of the properties of the ligands Y, the properties of the targets X, and the interaction between the ligands Y and the targets X.
- the interaction is defined by multiplying the descriptors capturing the properties of the ligand with the descriptors of the target where the descriptors may be principal properties or other chemical and/or physical descriptors.
- new descriptors may be generated by any function of the descriptors of the target X and ligand Y.
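The multiplication of ligand descriptors with target descriptors can be illustrated with a short sketch. It assumes each target and each ligand is already described by a numeric vector; the function names and the example values are hypothetical, and only the simple product is shown here, although the text allows any function of the descriptors.

```python
def interaction_terms(target_desc, ligand_desc):
    """Return every product x_i * y_j of one target descriptor and one ligand descriptor."""
    return [x * y for x in target_desc for y in ligand_desc]


def model_row(target_desc, ligand_desc):
    """Concatenate target descriptors, ligand descriptors and their interaction terms."""
    return list(target_desc) + list(ligand_desc) + interaction_terms(target_desc, ligand_desc)


# Hypothetical coded descriptors: four target parts and two ligand parts.
target = [1.0, -1.0, 1.0, -1.0]
ligand = [1.0, -1.0]

row = model_row(target, ligand)
print(len(row), "columns:", row)
```

One such row would be built for every measured target-ligand pair, and the rows stacked into the descriptor matrix that is regressed against the measured BA values.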
- One example of a model of the biological activity is given by the equation :
- $$BA = BA_{\text{average}} + b_1 \cdot (X) + b_2 \cdot (Y) + b_{12} \cdot (X) \cdot (Y)$$
- the coefficients in the equation above may be determined by PLS but may also, if the number of measurements is large enough, be determined by PCR, MLR, NN (neural net), stepwise regression or another similar method.
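As a hedged illustration of how the four coefficients of the model equation could be estimated, the sketch below handles the simplest possible case: one descriptor for X and one for Y, each coded as -1 or +1, measured in a balanced full 2x2 design. In that special case the multiple linear regression (MLR) estimates reduce to simple contrasts, so no PLS software is needed. The function names and the BA values are invented for the example and are not data from the source.

```python
def fit_two_factor_model(runs):
    """Estimate (ba_average, b1, b2, b12) from runs of the form (x, y, ba).

    Assumes x and y are coded -1/+1 and the design is a balanced full
    factorial, so the model columns are orthogonal and least squares
    reduces to averaging signed responses.
    """
    n = len(runs)
    ba_average = sum(ba for _, _, ba in runs) / n
    b1 = sum(x * ba for x, _, ba in runs) / n        # main effect of the target
    b2 = sum(y * ba for _, y, ba in runs) / n        # main effect of the ligand
    b12 = sum(x * y * ba for x, y, ba in runs) / n   # target-ligand interaction
    return ba_average, b1, b2, b12


def predict(coeffs, x, y):
    """Evaluate BA = BA_average + b1*X + b2*Y + b12*X*Y."""
    ba_average, b1, b2, b12 = coeffs
    return ba_average + b1 * x + b2 * y + b12 * x * y


# Invented measurements for two target variants and two ligand variants.
runs = [(-1, -1, 5.0), (1, -1, 6.0), (-1, 1, 5.5), (1, 1, 8.5)]
coeffs = fit_two_factor_model(runs)
print(coeffs)
```

A non-zero b12 in such a fit is what signals that the effect of the ligand property depends on the target property, which is the kind of term the text uses to locate the interaction.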
- the coefficients in the equation provide the information needed to locate the binding site in the target, as well as important information about the chemical/physical properties needed for a very active ligand.
- the coefficients provide information about which features are important in the ligands, in the targets and in the interaction between them.
- the binding site and/or active site is identified from the interaction terms between the chemical/physical descriptors or "principal properties" of the ligands and the chemical/physical descriptors or principal properties of the target.
- BA Biological activity
- the estimated correlation coefficient of the "target-ligand-interaction" provides information about the position of the active site in the macromolecule as well as a description of the chemical/physical properties of the active site. This information is of outstanding value in the design of new leads as well as in lead optimisation. Mathematically this is a simple procedure which surprisingly provides information regarding the "active site".
- the model is preferably produced using one or more of multivariate methods, partial least squares methods, neural networks, multiple linear regression, non-linear regression, curve fitting, model fitting, stepwise regression and maximum likelihood methods.
- a new description of the interesting regions of the target may optionally be made with higher resolution, i.e. each AA in the interesting region is replaced by its principal properties (see Table 1). If further information regarding the target is needed, then exchange of specific AA's can be made by mutations in the interesting region or regions of the target. This should preferably be done by an informative design in order to ensure diversity in properties.
- the model derived by the use of the present invention may be directly useful for predicting the properties of novel targets of type X as well as novel ligands of type Y.
- the invention is particularly useful in drug design as well as in the engineering of new molecules of type X or Y (e.g. in protein engineering).
- the processes in accordance with the present invention may be implemented at least partially using software e.g. computer programs. It will thus be seen that when viewed from further aspects, the present invention provides computer software specifically adapted to carry out the processes hereinabove described when installed on data processing means, and a computer program element comprising computer software code portions for performing the processes hereinabove described when the program element is run on data processing means.
- the invention also extends to a computer software carrier comprising such software, particularly when used to operate a process of the invention.
- a computer software carrier may be a physical storage medium such as a ROM chip, CD ROM or disk, or may be a signal such as an electronic signal over wires, an optical signal or a radio signal such as to a satellite or the like.
- Figure 3 Calculation of auto covariances and auto cross covariances.
- Figure 4 Full factorial and fractional factorial design.
- Figure 5 Generalised template for X of Example 1 comprising aligned MC1 and MC3 receptor amino acid sequences and division of template into parts A, B, C and D.
- Figure 6 Generalised template for Y of Example 1 comprising aligned MSH and
- Figure 7 Molecules of type Y in Example 1.
- Figure 8 Permutation testing in Example 1:1.
- Figure 9 Observed versus calculated K* for model of Example 1 :1.
- Figure 16 Alignment of three subtypes of human wild-type alpha 1 adrenoceptors.
- Figure 17 52 positions with sequence variation, extracted from TM regions of human alpha- 1 adrenoceptor subtypes and chimeras of alpha- 1 adrenoceptors.
- Black characters on white background denote variation between 2 amino acids; white characters on black background denote variation between 3 amino acids.
- Figure 18 X data set from Example 2.
- Figure 19 Molecular template and details of the compounds used in Example 2.
- Figure 21 Summary of pKi values (BA) reported for alpha- 1 adrenergic receptor interactions with 4-piperidyl oxazole antagonists.
- Figure 22 Graph showing observed v calculated pKi for Example 2.
- Figure 23 Normalised PLS regression coefficients from Example 2.1.
- Figure 25 The 16 peptides selected according to a 2 fractional factorial design + 3 cp
- Figure 27 32 peptides selected according to a 2 " fractional factorial design + 3 cp (or
- Figure 28 32 peptides selected according to a 2^(15-10) fractional factorial design + 3 cp (or 1 cp 2 random).
- Figure 29 32 peptides selected according to a 2 " fractional factorial design + 3 cp (or 1 cp 2 random).
- Figure 30 51 peptides selected according to a 2^(21-16) fractional factorial design with additional experiments added from a half a fold over + 3 cp.
- a macromolecular template X was made by using a generalised structure of the melanocortin receptor 1 (MC1) (FEBS Lett. 1992, 309, 417-420) and melanocortin receptor 3 (MC3) (J. Biol. Chem. 1993, 268, 8246-8250).
- The thus formed template was then divided into 4 parts termed A, B, C and D, as illustrated in Fig. 5.
- Figure 5 shows the aligned amino acid sequences of the MC1 and the MC3 receptors with the parts A, B, C and D of template X indicated.
- A1B1C1D1, i.e. native MC1 receptor
- A1B1C1D2
- A1B2C2D1, A1B2C2D2, A2B2C⁇D⁇
- A2B2C2D2, i.e. native MC3 receptor
- the template Y was made by using a generalised structure derived from two known peptides, MSH and MS04 (J. Biol. Chem. 1997, 272, 27943-27948) (Figure 6). As shown in the figure, both peptides have a common sequence in the middle, but their C- and N-termini differ. Using this central common part, both peptides could be aligned to each other, creating the template Y (Figure 6). We then divided Y into three parts: the N-terminal part, the middle part and the C-terminal part (see Figure 6). Because both peptides have exactly the same sequence in their middle part, we neglected it in the further analysis, leaving two selected parts in Y: ⁇ (i.e. N-terminal part) and ⁇ (i.e. C-terminal part).
- the model BA of Example 1:1 was improved by adding cross-terms. This was done by calculating new descriptor signals from the original descriptor signals given in Table 6 of Example 1:1 by performing all possible multiplications of two different original descriptors.
- the new descriptor signals thus obtained are generally referred to as cross-terms (see SIMCA 7.0 manual, 1998).
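The generation of cross-terms by "all possible multiplications of two different original descriptors" can be sketched as follows. This is an illustrative reading of that sentence, not the SIMCA implementation. The descriptor names are hypothetical: A to D stand for the parts of X, and Y1 and Y2 stand in for the two parts of Y whose symbols are not legible in this text.

```python
from itertools import combinations


def add_cross_terms(names, values):
    """Append the product of every pair of two different descriptors.

    Returns the extended name list and the extended value list; each
    unordered pair is used once, so n descriptors give n*(n-1)/2 cross-terms.
    """
    new_names = list(names)
    new_values = list(values)
    for (name1, value1), (name2, value2) in combinations(list(zip(names, values)), 2):
        new_names.append(f"{name1} x {name2}")
        new_values.append(value1 * value2)
    return new_names, new_values


# Hypothetical coded descriptors for one receptor-peptide pair.
names = ["A", "B", "C", "D", "Y1", "Y2"]
values = [-1.0, 1.0, 1.0, -1.0, 1.0, -1.0]

all_names, all_values = add_cross_terms(names, values)
print(len(all_names), "descriptors, for example:", all_names[6:10])
```

Six original descriptors give fifteen cross-terms in this sketch, including within-target terms such as "B x D" as well as target-ligand terms such as "B x Y1".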
- the improved PLS model (i.e. improved model BA; in the following termed model BA of Example 1:2) was obtained using SIMCA autofit, had 2 significant components (see SIMCA 7.0 manual, 1998) and yielded R² and Q² values of 0.95 and 0.66, respectively.
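For readers unfamiliar with the two statistics, the sketch below shows how a fitted R² and a cross-validated Q² differ. It is only an illustration under assumptions: a one-descriptor straight-line fit stands in for the PLS model, leave-one-out is assumed as the cross-validation scheme (the scheme used by SIMCA is not stated in this text), and the data are invented.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def total_sum_of_squares(ys):
    mean_y = sum(ys) / len(ys)
    return sum((y - mean_y) ** 2 for y in ys)


def r2(xs, ys):
    """Fraction of the variation in y explained by the model fitted to all data."""
    intercept, slope = fit_line(xs, ys)
    residual = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return 1.0 - residual / total_sum_of_squares(ys)


def q2_leave_one_out(xs, ys):
    """Fraction of the variation in y predicted when each point is left out in turn."""
    press = 0.0
    for i in range(len(xs)):
        intercept, slope = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (intercept + slope * xs[i])) ** 2
    return 1.0 - press / total_sum_of_squares(ys)


# Invented data: one descriptor and one measured response.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.1, 1.9, 3.2, 3.9, 5.1]
print("R2 =", round(r2(xs, ys), 3), "Q2 =", round(q2_leave_one_out(xs, ys), 3))
```

Q² is computed from predictions of points the model has not seen, so for a least-squares fit it cannot exceed R²; a large gap between the two is a common warning sign of overfitting.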
- the permutations of the new model are shown in the output abstraction of Figure 10. Figure 11 shows an output abstraction representing the comparison of the calculated BAs, derived by use of the model BA of Example 1:2, and the measured BAs. As seen, the correlation is excellent.
- a new model BA was created from the model of Example 1:2 by removing descriptor signals which had variable influence values lower than 0.3 (see SIMCA 7.0 manual, 1998, for the meaning of variable influence and how this is performed) and performing PLS calculations essentially as described above for Examples 1:1 and 1:2.
- the permutations of the new model BA (in the following termed model BA of Example 1:3) are shown by the output abstraction represented in Figure 12.
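The general idea behind a permutation test can be sketched briefly. This is a generic illustration and not the SIMCA procedure: the measured responses are shuffled many times, the fit statistic is recomputed for each shuffle, and the fraction of shuffles that fit at least as well as the real data is reported. A one-descriptor R² stands in for the PLS model, and the data, the number of permutations and the random seed are all invented for the example.

```python
import random


def r_squared(xs, ys):
    """Squared Pearson correlation, equal to R^2 of a one-descriptor linear fit."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return (sxy * sxy) / (sxx * syy)


def permutation_p_value(xs, ys, n_permutations=200, seed=0):
    """Return the observed R^2 and the share of shuffles that reach it."""
    rng = random.Random(seed)
    observed = r_squared(xs, ys)
    shuffled = list(ys)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # break any real link between descriptor and response
        if r_squared(xs, shuffled) >= observed:
            hits += 1
    # The +1 terms count the unshuffled data as one of the permutations.
    return observed, (hits + 1) / (n_permutations + 1)


# Invented data: one descriptor and one measured response.
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
measured_ba = [5.1, 5.9, 7.2, 7.8, 9.1, 9.9, 11.2, 11.8]
observed_r2, p_value = permutation_p_value(descriptor, measured_ba)
print("observed R2:", round(observed_r2, 3), "permutation p-value:", round(p_value, 3))
```

If shuffled responses rarely or never fit as well as the real ones, the fitted relationship is unlikely to be a chance result of having many descriptors and few observations.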
- Figure 13 shows the output abstraction representing a comparison of the calculated BAs derived by the use of model BA of Example 1:3 and the measured BAs. As seen, the correlation for the values is excellent.
- Model BA of Example 1:3 was used to analyze the influence and interactions of parts in X and Y. This was done by calculating the variable importance in the projection (VIP) for each descriptor of Example 1:3 (including the cross-terms retained in Example 1:3) using SIMCA 7.0 (see SIMCA 7.0 manual, 1998, p. 15-11). An output abstraction representing these influences is shown in Figure 14. As can be seen from the abstraction, the highest influence is exerted by part ⁇ of Y and part B of X. Part A of X and part ⁇ of Y are also important, while parts D and C of X are unimportant. Although part D is not important, the interaction of this part with part B (i.e. the B x D column) shows a significant effect on the responses (Figure 14).
- The model BA created in Example 1:3 was used to predict the ability of new variants of X to bind MSH peptides. According to Example 1:1, step ii), only the signals derived from 8 MC1/MC3 receptor chimeras were used out of 14 possible chimeras. The interaction of the remaining 6 with the MSH peptides was predicted using the model BA of Example 1:3, an output abstraction for the prediction being shown in Table 7.
- For this purpose we created a new model BA (in the following termed model BA of Example 1:6) using only those cross-terms containing signals derived from parts from both X and Y, together with the signals derived from measured BAs and the same PLS procedure as above.
- the new model showed R² and Q² values of 0.64 and 0.61, respectively.
- the variable importance in the projection (VIP) for each cross-term descriptor was calculated as in Example 1:4, an output abstraction for which is shown in Figure 15. As can be seen from Figure 15, the most important interactions are between part B and part ⁇ , and between part B and part ⁇ .
- every amino acid was assigned 5 numbers selected from the 5 z-scale descriptors for amino acids derived by Sandberg (Sandberg et al., J. Med. Chem. 41 (1998) 2481-2491). However, for positions differing by only 2 amino acids, the 5 z-scale descriptor numbers were in an additional step merged into one number by calculating physico-chemical distances based on the two differing amino acids, as follows:
- AB is the physicochemical distance between amino acids A and B, Z_A the z-scale of amino acid A and Z_B the z-scale of amino acid B.
- the numbers of positions with two amino acids differing were, for parts A-G respectively, 9, 2, 5, 9, 9, 4 and 3 (41 in total).
- the numbers of positions with three amino acids differing were, respectively, 2, 3, 1, 2, 0, 2 and 1 (11 in total).
- A graphical representation of the derived relationships is shown in Fig. 22, the figure showing observed and predicted pKi values.
- MIP_a is the modelling importance of primary term a, σ_a the standard deviation, and coeff_a the regression coefficient of variable a in the data set.
- TM2 and TM5 show clearly higher importance than the other TM regions for the binding of 4-piperidyl oxazoles.
- MIC_a is the modelling importance, σ_a the standard deviation, and n the regression coefficient of cross-terms in the data set.
- the AD_n corresponds to the average deviation from the means of the cross-term partners of a, and was approximated by 0.8 · σ_n, where σ_n is the standard deviation of the cross-term partners of a.
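The factor 0.8 used to approximate the average deviation is consistent with a standard result for normally distributed values, although this text does not say that normality was assumed. For a normal variable x with mean μ and standard deviation σ, the mean absolute deviation from the mean is:

```latex
\mathrm{E}\,\lvert x - \mu \rvert \;=\; \sigma \sqrt{\frac{2}{\pi}} \;\approx\; 0.798\,\sigma \;\approx\; 0.8\,\sigma
```

For descriptor values that are far from normal, the ratio between the average deviation and the standard deviation can differ noticeably from 0.8.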
- TM2 and TM5 show clearly higher importance for the specificity of 4-piperidyl oxazoles binding to the alpha-1 adrenoceptors, compared with the other TM regions.
- the 16 peptides were selected according to a 2 fractional factorial design + 3 cp (or 1 cp 2 random).
- the 16 peptides were selected according to a 2^(9-5) fractional factorial design + 3 cp (or 1 cp 2 random).
- the 32 peptides were selected according to a 2 " fractional factorial design + 3 cp (or 1 cp 2 random).
- the 32 peptides were selected according to a 2^(15-10) fractional factorial design + 3 cp (or 1 cp 2 random).
- Example 7 Reference is made to the hexapeptides disclosed in Figure 29.
- the 51 peptides were selected according to a 2^(21-16) fractional factorial design with additional experiments added from a half a fold over + 3 cp.
- SIMCA 7.0 A new standard in multivariate data analysis, Manual, Edition August 21, 1998, Umetri AB, Box 7960, SE907 19 Umea, Sweden.
- Zaliani, A and Gancia, E MS-WHIM scores for amino acids: A new 3D-description for peptide QSAR and QSPR studies. J. Chem. Inf. Comput. Sci. 1999, 39, 525-533.
Landscapes
- Life Sciences & Earth Sciences (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Physics & Mathematics (AREA)
- Chemical & Material Sciences (AREA)
- Engineering & Computer Science (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Health & Medical Sciences (AREA)
- Medical Informatics (AREA)
- Biophysics (AREA)
- Crystallography & Structural Chemistry (AREA)
- Bioinformatics & Computational Biology (AREA)
- Biotechnology (AREA)
- Evolutionary Biology (AREA)
- General Health & Medical Sciences (AREA)
- Theoretical Computer Science (AREA)
- Pharmacology & Pharmacy (AREA)
- Medicinal Chemistry (AREA)
- Investigating Or Analysing Biological Materials (AREA)
- Peptides Or Proteins (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Priority Applications (5)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
AU15305/01A AU1530501A (en) | 1999-11-18 | 2000-11-20 | A process for identifying the active site in a biological target |
EP00977666A EP1232466A2 (en) | 1999-11-18 | 2000-11-20 | Method for identifying the active site in a biological target |
NZ518980A NZ518980A (en) | 1999-11-18 | 2000-11-20 | A process for identifying the active site in a biological target by producing a model of the interaction between a ligand and a target |
CA002392086A CA2392086A1 (en) | 1999-11-18 | 2000-11-20 | A process for identifying the active site in a biological target |
HK03101309.1A HK1049218A1 (en) | 1999-11-18 | 2003-02-20 | Method for identifying the active site in a biological target |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
GB9927346.8 | 1999-11-18 | ||
GBGB9927346.8A GB9927346D0 (en) | 1999-11-18 | 1999-11-18 | Method for analysis and design of entities of a chemical or biochemical nature |
Publications (2)
Publication Number | Publication Date |
---|---|
WO2001036980A2 (en) | 2001-05-25 |
WO2001036980A3 (en) | 2002-03-14 |
Family
ID=10864777
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/GB2000/004420 WO2001036980A2 (en) | 1999-11-18 | 2000-11-20 | A process for identifying the active site in a biological target |
Country Status (8)
Country | Link |
---|---|
EP (1) | EP1232466A2 (en) |
AU (1) | AU1530501A (en) |
CA (1) | CA2392086A1 (en) |
GB (1) | GB9927346D0 (en) |
HK (1) | HK1049218A1 (en) |
NZ (1) | NZ518980A (en) |
WO (1) | WO2001036980A2 (en) |
ZA (1) | ZA200203963B (en) |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP1209610A1 (en) * | 2000-11-28 | 2002-05-29 | Valentin Capital Management | Method and apparatus for determining affinity data between a ligand and a target |
WO2003008551A3 (en) * | 2001-07-18 | 2003-10-16 | Structural Genomix Inc | Systems and methods for predicting active site residues in a protein |
DE10241793A1 (en) * | 2002-09-06 | 2004-06-17 | Roos, Gudrun, Dr. | Analysis apparatus for predicting the pharmaceutical activity of plant extracts comprises a nuclear magnetic resonance spectroscope producing a spectrum compared with a database of spectra of known active materials |
WO2003040994A3 (en) * | 2001-11-02 | 2005-01-13 | Arqule Inc | Cyp2c9 binding models |
WO2002044990A3 (en) * | 2000-11-28 | 2005-04-21 | Valentin Capital Man | Apparatus and method for determining affinity data between a target and a ligand |
US7662782B2 (en) | 1998-05-05 | 2010-02-16 | Action Pharma A/S | Melanocortin 1 receptor selective compounds |
US7935786B2 (en) | 1998-03-09 | 2011-05-03 | Zealand Pharma A/S | Pharmacologically active peptide conjugates having a reduced tendency towards enzymatic hydrolysis |
US8466104B2 (en) | 2005-08-26 | 2013-06-18 | Abbvie Inc. | Therapeutically active alpha MSH analogues |
WO2013163348A1 (en) * | 2012-04-24 | 2013-10-31 | Laboratory Corporation Of America Holdings | Methods and systems for identification of a protein binding site |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5587293A (en) * | 1994-01-06 | 1996-12-24 | Terrapin Technologies, Inc. | Method to identify binding partners |
CA2298629A1 (en) * | 1997-08-01 | 1999-02-11 | Novalon Pharmaceutical Corporation | Method of identifying and developing drug leads |
EP1049796A1 (en) * | 1997-12-18 | 2000-11-08 | Sepracor, Inc. | Methods for the simultaneous identification of novel biological targets and lead structures for drug development |
- 1999
  - 1999-11-18 GB GBGB9927346.8A patent/GB9927346D0/en not_active Ceased
- 2000
  - 2000-11-20 EP EP00977666A patent/EP1232466A2/en not_active Withdrawn
  - 2000-11-20 AU AU15305/01A patent/AU1530501A/en not_active Abandoned
  - 2000-11-20 WO PCT/GB2000/004420 patent/WO2001036980A2/en active IP Right Grant
  - 2000-11-20 NZ NZ518980A patent/NZ518980A/en unknown
  - 2000-11-20 CA CA002392086A patent/CA2392086A1/en not_active Abandoned
- 2002
  - 2002-05-17 ZA ZA200203963A patent/ZA200203963B/en unknown
- 2003
  - 2003-02-20 HK HK03101309.1A patent/HK1049218A1/en unknown
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7935786B2 (en) | 1998-03-09 | 2011-05-03 | Zealand Pharma A/S | Pharmacologically active peptide conjugates having a reduced tendency towards enzymatic hydrolysis |
US7662782B2 (en) | 1998-05-05 | 2010-02-16 | Action Pharma A/S | Melanocortin 1 receptor selective compounds |
EP1209610A1 (en) * | 2000-11-28 | 2002-05-29 | Valentin Capital Management | Method and apparatus for determining affinity data between a ligand and a target |
WO2002044990A3 (en) * | 2000-11-28 | 2005-04-21 | Valentin Capital Man | Apparatus and method for determining affinity data between a target and a ligand |
WO2003008551A3 (en) * | 2001-07-18 | 2003-10-16 | Structural Genomix Inc | Systems and methods for predicting active site residues in a protein |
WO2003040994A3 (en) * | 2001-11-02 | 2005-01-13 | Arqule Inc | Cyp2c9 binding models |
DE10241793A1 (en) * | 2002-09-06 | 2004-06-17 | Roos, Gudrun, Dr. | Analysis apparatus for predicting the pharmaceutical activity of plant extracts comprises a nuclear magnetic resonance spectroscope producing a spectrum compared with a database of spectra of known active materials |
US8466104B2 (en) | 2005-08-26 | 2013-06-18 | Abbvie Inc. | Therapeutically active alpha MSH analogues |
US8563508B2 (en) | 2005-08-26 | 2013-10-22 | Abbvie Inc. | Method for preventing or reducing acute renal failure by administration of therapeutically active α-MSH analogues |
US8703702B2 (en) | 2005-08-26 | 2014-04-22 | Abbvie Inc. | Therapeutically active α-MSH analogues |
WO2013163348A1 (en) * | 2012-04-24 | 2013-10-31 | Laboratory Corporation Of America Holdings | Methods and systems for identification of a protein binding site |
EP4050609A1 (en) * | 2012-04-24 | 2022-08-31 | Laboratory Corporation of America Holdings | Methods and systems for identification of a protein binding site |
Also Published As
Publication number | Publication date |
---|---|
HK1049218A1 (en) | 2003-05-02 |
NZ518980A (en) | 2005-05-27 |
GB9927346D0 (en) | 2000-01-12 |
ZA200203963B (en) | 2003-05-19 |
WO2001036980A3 (en) | 2002-03-14 |
CA2392086A1 (en) | 2001-05-25 |
EP1232466A2 (en) | 2002-08-21 |
AU1530501A (en) | 2001-05-30 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AK | Designated states | Kind code of ref document: A2 Designated state(s): AE AG AL AM AT AT AU AZ BA BB BG BR BY BZ CA CH CN CR CU CZ CZ DE DE DK DK DM DZ EE EE ES FI FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW |
AL | Designated countries for regional patents | Kind code of ref document: A2 Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG |
121 | Ep: the epo has been informed by wipo that ep was designated in this application | ||
DFPE | Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101) | ||
AK | Designated states | Kind code of ref document: A3 Designated state(s): AE AG AL AM AT AT AU AZ BA BB BG BR BY BZ CA CH CN CR CU CZ CZ DE DE DK DK DM DZ EE EE ES FI FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW |
AL | Designated countries for regional patents | Kind code of ref document: A3 Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG |
WWE | Wipo information: entry into national phase | Ref document number: 518980 Country of ref document: NZ |
WWE | Wipo information: entry into national phase | Ref document number: 15305/01 Country of ref document: AU |
WWE | Wipo information: entry into national phase | Ref document number: 2002/03963 Country of ref document: ZA Ref document number: 2392086 Country of ref document: CA Ref document number: 200203963 Country of ref document: ZA |
WWE | Wipo information: entry into national phase | Ref document number: 2000977666 Country of ref document: EP |
WWP | Wipo information: published in national office | Ref document number: 2000977666 Country of ref document: EP |
REG | Reference to national code | Ref country code: DE Ref legal event code: 8642 |
WWP | Wipo information: published in national office | Ref document number: 518980 Country of ref document: NZ |
WWG | Wipo information: grant in national office | Ref document number: 518980 Country of ref document: NZ |