WO2002020564A2 - Method for identifying peptide sequences having a specific functionality - Google Patents
- Publication number
- WO2002020564A2 (application PCT/EP2001/010195)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- peptide
- sequence
- sequences
- binding
- artificial neural
Classifications
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K1/00—General methods for the preparation of peptides, i.e. processes for the organic chemical preparation of peptides or proteins of any length
- C07K1/04—General methods for the preparation of peptides, i.e. processes for the organic chemical preparation of peptides or proteins of any length on carriers
- C07K1/047—Simultaneous synthesis of different peptide species; Peptide libraries
Definitions
- the present invention relates to a method for creating a sequence-function relationship, a method for identifying or generating peptide sequences having a specific functionality using the created sequence-function relationship and a method for generating a focussed synthetic peptide library.
- Peptides having a known functionality and molecules derived therefrom having a modified or improved functionality can be used for nanotechnology, bioelectronic devices, biosensors and, in particular, for drug or vaccine design. Once a useful peptide is found, it may serve as a starting point for further screening. However, most screening procedures, such as peptide libraries or phage display techniques, are largely based on pure random search. Since the number of possible natural sequence variants is 20^n for a peptide consisting of n amino acids, huge amounts of data have to be screened. Therefore, despite the enormous potential of modern high-throughput screening techniques, pure random search yields only a very small fraction of peptides with the desired function.
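- The growth of the sequence space mentioned above can be made concrete with a short calculation. This is only an illustrative sketch; the function name `sequence_space_size` is not taken from the patent.

```python
def sequence_space_size(n: int, alphabet_size: int = 20) -> int:
    """Number of distinct peptides of length n over the natural amino acids."""
    return alphabet_size ** n

# Peptide lengths chosen for illustration only.
for n in (6, 9, 15):
    print(n, sequence_space_size(n))
```

Even for a nonamer such as AAGIGILTV the space already holds 20^9 sequences, which is why an exhaustive random screen is impractical.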
- Still another object of the present invention was to provide a method for generating new peptide sequences having a desired functionality.
- Still another object of the present invention was to provide lead structures for drug design.
- Another object of the present invention was to provide focussed synthetic peptide libraries which contain an enriched amount of peptides having a desired functionality.
- the invention relates to the prediction of a specific function of a peptide or protein, e.g. a binding affinity of peptides or proteins to receptors or proteins, respectively, solely from their peptide sequences.
- the invention in particular, relates to a method comprising one or more of the following features:
- biologically active peptides which can be used as seed peptides.
- the biologically active peptides having a desired specific functionality may be derived e.g. from a compound library, from literature or from experiments
- peptide sequence variants, e.g. in the form of a peptide library, which are adjacent to the seed peptide within the sequence space with regard to the physical-chemical features.
- the creation of the peptide variant sequences can be performed by computer-based calculation methods without any need of chemical or biological experiments
- ANN: artificial neural network
- an evolution strategy is used as training algorithm
- peptide lead structures being functionally analogous but having different structures can be obtained. These peptide lead structures can be used directly as drug or active agent or can be used for development of new drugs.
- the invention relates to a method for creating a sequence-function relationship comprising the steps:
- a sequence-function relationship, in particular a quantitative sequence-function relationship, can be created by generating a set of synthetic sequences constituting a focussed peptide library in a first step. This set of sequences is then employed as the training data set for an artificial neural network for optimization of a model of the sequence-function relationship.
- sequence-function relationship denotes a correlation between the primary sequence of a molecule, in particular, an amino acid sequence or a nucleic acid sequence with a desired activity or functionality.
- a first step at least one seed peptide sequence of a peptide having a specific desired functionality is provided.
- the method according to the invention is also applicable if only a few or even only a single peptide sequence having the desired specific functionality is known.
- the specific functionality of the peptide can be any desired functionality which is attributable to a peptide.
- Preferred examples of the function are binding to a receptor, protease cleavage sites of protein sequences, domains which affect the transport performance of a peptide or protein.
- the function is selected from peptide binding to MHC molecules, binding to ACE, binding to HIV proteins like protease, transcriptase etc., binding to HSP proteins.
- the synthetic peptide sequences preferably have the same number of amino acids as the seed peptide sequence, i.e. they are selected from the sequence space of the seed peptide sequence.
- the amino acid (aa) length of the seed peptide sequence and of the synthetic peptide sequences is preferably from 4 to 500 aa, more preferably from 5 to 100 aa.
- an algorithm is used which generates variants stemming from sequence space regions around the seed peptide with an essentially bell-shaped distribution.
- the set of sequences generated shows a unimodal bell-shaped distribution and, in particular, an essentially Gaussian distribution.
- This approach is based on the assumption that molecules with a similar or improved function can be identified among the peptides located close to the seed peptide in sequence space.
- the assumption underlying the algorithm used in the present invention is that, in natural evolutionary processes, large alterations of a protein may occur within a generation, but these extremely different mutants rarely survive; most observed mutations leading to a slightly improved function are single-site substitutions that keep the vast majority of the sequence unchanged, and conservative replacements tend to prefer substitutions of amino acids which are similar in their intrinsic physicochemical properties.
- a localized, bell-shaped distribution of variants for construction of a useful peptide library is selected to approximately reflect these aspects of natural protein evolution.
- due to the bell-shaped distribution even peptides spaced far apart from the seed peptide in sequence are included. By using this algorithm even large sequence alterations leading to improved function, e.g. if several optima exist in sequence space, can be detected.
- a suitable distance measure for use in the present algorithm is e.g. the Euclidean distance.
- the Euclidean distance between two peptides A and A' of length n is defined as: $d(A, A') = \sqrt{\sum_{i=1}^{n} \delta_i^2}$
- the distance δ_i between two amino acids at sequence position i can be taken from an amino acid distance matrix.
- Suitable amino acid distance matrices are e.g. the matrix of Feng et al. (J. Mol. Evol. 21 (1985) 112) or the matrix of Niefind and Schomburg (J. Mol. Biol. 219 (1991) 481).
- Peptides with a similar biological function usually have low pairwise distance values, wherein in this simple model similarity is determined solely by the amino acid sequence and all sequence positions are assumed to contribute to a functionality of a peptide.
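- A minimal sketch of such a peptide distance follows. It takes the Euclidean distance as the square root of the summed squared per-position amino acid distances. The matrices of Feng et al. or Niefind and Schomburg are not reproduced here; as a clearly labeled stand-in, the pairwise amino acid distance is assumed to be the absolute difference of Kyte-Doolittle hydropathy values, and the names `residue_distance` and `peptide_distance` are illustrative.

```python
import math

# Kyte-Doolittle hydropathy values, used only as a stand-in property scale.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def residue_distance(a: str, b: str) -> float:
    """Toy amino acid distance (assumption): absolute hydropathy difference."""
    return abs(KD[a] - KD[b])

def peptide_distance(p: str, q: str) -> float:
    """Euclidean distance over the per-position amino acid distances."""
    if len(p) != len(q):
        raise ValueError("peptides must have the same length")
    return math.sqrt(sum(residue_distance(a, b) ** 2 for a, b in zip(p, q)))

seed = "AAGIGILTV"
print(peptide_distance(seed, "AAGIGILTL"))  # conservative V -> L exchange
print(peptide_distance(seed, "AAGIGILTD"))  # non-conservative V -> D exchange
```

With a real distance matrix only `residue_distance` would change; a conservative exchange would still give a smaller distance than a non-conservative one.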
- Based on the distance measure applied, a mutation rate matrix can be calculated.
- Such a mutation rate matrix contains the substitution probabilities for the individual amino acids contained within the sequence.
- an amino acid distance matrix, as described above, can be converted into a rate matrix by a formula in which the exchange rate r is a monotonically decaying function of the distance and σ is a position-specific parameter defining the shape of the bell-shaped distribution, in particular of a Gaussian distribution; σ may be subjected to time-dependent alterations, for example, in simulation experiments.
- Small σ values lead to narrow distributions of the exchange rates, which reflect strong selection pressure.
- the rate matrix is calculated from a distance matrix with a σ between 0.05 and 1, in particular between 0.1 and 0.5.
- σ values of e.g. 0.1, 0.2, 0.3 and 0.4
- σ values depending on the sequence position, if there is any information about position-specific mutation rates.
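- The conversion of a distance matrix into a rate matrix can be sketched as follows. The exact formula is not reproduced in this text, so the sketch assumes a Gaussian kernel, exp(-d^2 / (2 * sigma^2)), normalized per row into substitution probabilities. The distances are again a stand-in (hydropathy differences scaled to the range 0 to 1), and the name `rate_matrix` is illustrative.

```python
import math

# Kyte-Doolittle hydropathy values, used only as a stand-in property scale.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}
AMINO_ACIDS = sorted(KD)
MAX_DIFF = max(KD.values()) - min(KD.values())

def scaled_distance(a: str, b: str) -> float:
    """Stand-in amino acid distance, scaled to the range 0..1."""
    return abs(KD[a] - KD[b]) / MAX_DIFF

def rate_matrix(sigma: float) -> dict:
    """Row-normalized exchange rates under an assumed Gaussian kernel."""
    matrix = {}
    for a in AMINO_ACIDS:
        weights = {
            b: math.exp(-scaled_distance(a, b) ** 2 / (2 * sigma ** 2))
            for b in AMINO_ACIDS
        }
        total = sum(weights.values())
        matrix[a] = {b: w / total for b, w in weights.items()}
    return matrix

# Probability that valine is kept unchanged, for the sigma values named above.
for sigma in (0.1, 0.2, 0.3, 0.4):
    print(sigma, round(rate_matrix(sigma)["V"]["V"], 3))
```

Under this assumption a small sigma concentrates the probability on the original residue and its nearest neighbors, which matches the description of narrow distributions reflecting strong selection pressure.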
- the algorithm used therefore can be applied to produce peptides in multidimensional sequence space.
- the distances of these sequences to the original peptide (seed peptide) satisfy a bell-shaped distribution.
- a peptide selection scheme for systematic evolutionary design and construction of synthetic peptide libraries for generating a set of sequences having an essentially bell-shaped distribution is described e.g. by G. Schneider et al., Minimal Invasive Medicine 6(3) (1995) 106-115, the disclosure of which is hereby incorporated by reference.
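- As a rough illustration of how variants around a seed peptide might be generated, the sketch below resamples every sequence position with Gaussian-weighted substitution probabilities. It does not reproduce the published selection scheme or the PepHarvester program; the Gaussian kernel, the hydropathy-based distance and the names `substitution_weights` and `make_variants` are assumptions.

```python
import math
import random

# Kyte-Doolittle hydropathy values, used only as a stand-in property scale.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}
AMINO_ACIDS = sorted(KD)
MAX_DIFF = max(KD.values()) - min(KD.values())

def substitution_weights(residue: str, sigma: float) -> list:
    """Gaussian weights (assumed kernel) for replacing `residue`."""
    return [
        math.exp(-((KD[residue] - KD[b]) / MAX_DIFF) ** 2 / (2 * sigma ** 2))
        for b in AMINO_ACIDS
    ]

def make_variants(seed: str, sigma: float, count: int, rng: random.Random) -> list:
    """Draw `count` variants; each position is resampled near the seed residue."""
    variants = []
    for _ in range(count):
        variant = "".join(
            rng.choices(AMINO_ACIDS, weights=substitution_weights(r, sigma))[0]
            for r in seed
        )
        variants.append(variant)
    return variants

rng = random.Random(0)  # fixed seed so the run is repeatable
variants = make_variants("AAGIGILTV", sigma=0.1, count=20, rng=rng)
for v in variants:
    print(v)
```

Most variants drawn this way stay close to the seed, while occasional distant sequences are still possible, which is the qualitative behavior the text attributes to a bell-shaped library.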
- the sequences obtained are classified as positive hits or negative hits according to a predetermined limit.
- the predetermined limit can be a distance measure or similarity index of the synthetic peptides with regard to the seed peptide or it can be a simple numerical value classifying the sequences found into two groups.
- the predetermined limit is set as a value between 10 and 50% closest distance, which means, e.g., that the synthetic peptide sequences created are classified into two groups: the 10 to 50%, in particular 20%, having the closest distance are classified as positive hits, and the remaining 50 to 90%, in particular 80%, having a larger distance are classified as negative hits.
- This classifying of the data set obtained with the algorithm applied allows for the use of only a few or even only a single seed peptide with known functionality, since starting out from the seed peptide sequence sufficiently large amounts of data representing positive peptides, peptides with low activity and peptides having no activity can be generated.
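- The classification step can be sketched in a few lines. The function name `classify_by_distance` and the example distances are hypothetical; the default of 20% positives follows the preferred value given above.

```python
from typing import Dict

def classify_by_distance(distances: Dict[str, float],
                         positive_fraction: float = 0.2) -> Dict[str, bool]:
    """Label the closest fraction of sequences as positive hits (True)."""
    ranked = sorted(distances, key=distances.get)  # closest to the seed first
    n_positive = max(1, round(len(ranked) * positive_fraction))
    return {seq: index < n_positive for index, seq in enumerate(ranked)}

# Hypothetical distances of ten synthetic peptides to the seed peptide.
example = {f"peptide_{i}": float(i) for i in range(10)}
labels = classify_by_distance(example)
print([name for name, positive in labels.items() if positive])
```

The resulting labels can serve directly as the two-class training targets for the network.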
- the classified sequence set is then used for training an artificial neural network.
- a suitable procedure for training an artificial neural network with the synthetic peptide sequence set is described e.g. by G. Schneider and P. Wrede, J. Mol. Evol. 36 (1993) 586-595, the disclosure of which is incorporated by reference.
- an evolution strategy is employed for training an artificial neural network (I. Rechenberg (1973), Evolutionsstrategie - Optimierung technischer Systeme nach Prinzipien der biologischen Evolution, Frommann-Holzboog, Stuttgart).
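- To illustrate training by an evolution strategy, the sketch below fits the three weights of a single sigmoid unit with a simple (1+λ) strategy: offspring are created by Gaussian mutation of the parent weights and the best offspring replaces the parent if it is no worse. The variant of the strategy, the toy data and all names are assumptions; the text does not specify them.

```python
import math
import random

def sigmoid(z: float) -> float:
    z = max(-60.0, min(60.0, z))  # clip to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical training set: two features per example, label 1 = positive hit.
DATA = [((0.1, 0.2), 1), ((0.2, 0.1), 1), ((0.9, 0.8), 0), ((0.8, 0.9), 0)]

def loss(weights) -> float:
    """Mean squared error of a single sigmoid unit on the toy data."""
    w0, w1, bias = weights
    return sum(
        (sigmoid(w0 * x0 + w1 * x1 + bias) - y) ** 2 for (x0, x1), y in DATA
    ) / len(DATA)

def evolve(loss_fn, start, step=0.3, offspring=10, generations=50, seed=0):
    """(1+lambda) evolution strategy with a fixed mutation step size."""
    rng = random.Random(seed)
    parent, parent_loss = list(start), loss_fn(start)
    for _ in range(generations):
        children = [
            [w + rng.gauss(0.0, step) for w in parent] for _ in range(offspring)
        ]
        best = min(children, key=loss_fn)
        best_loss = loss_fn(best)
        if best_loss <= parent_loss:  # keep the parent unless a child is as good
            parent, parent_loss = best, best_loss
    return parent, parent_loss

initial = [0.0, 0.0, 0.0]
weights, final_loss = evolve(loss, initial)
print(loss(initial), final_loss)
```

Because the parent is only replaced by an offspring that is at least as good, the loss never increases from one generation to the next in this sketch.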
- a trained artificial neural network comprising a model sequence-function relationship.
- a preferred artificial neural network consists of an input layer which employs at least four physicochemical amino acid properties for the sequence description, e.g. hydrophobicity, hydrophilicity, polarity and volume, one to four hidden layers for the feature extraction and a single output layer for classification. While it is possible to generate a trained artificial neural network using a single set of training sequences, better results often can be obtained by using at least two different training sequence sets, e.g. sequence sets derived from different seed peptide sequences or training sequence sets derived from the same seed peptide using different distance measures or different mutation rate matrices. Different mutation rate matrices can be obtained e.g. by using different position-specific parameters σ as described above.
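- The input encoding and a forward pass through one hidden layer can be sketched as follows. Only one property scale (Kyte-Doolittle hydropathy) is supplied here to avoid inventing values; further scales such as polarity or volume would be passed as additional dictionaries. The weights are random placeholders, biases are omitted for brevity, and the names `encode` and `forward` are illustrative.

```python
import math
import random

# Kyte-Doolittle hydropathy values as one example property scale.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def encode(peptide: str, scales: list) -> list:
    """One input value per sequence position and per property scale."""
    return [scale[residue] for residue in peptide for scale in scales]

def forward(x: list, hidden_weights: list, output_weights: list) -> float:
    """Single hidden layer (tanh) followed by one sigmoid output unit."""
    hidden = [
        math.tanh(sum(w * v for w, v in zip(row, x))) for row in hidden_weights
    ]
    z = sum(w * h for w, h in zip(output_weights, hidden))
    return 1.0 / (1.0 + math.exp(-z))

rng = random.Random(0)
x = encode("AAGIGILTV", [KD])
hidden_weights = [[rng.uniform(-0.1, 0.1) for _ in x] for _ in range(3)]
output_weights = [rng.uniform(-1.0, 1.0) for _ in range(3)]
score = forward(x, hidden_weights, output_weights)
print(len(x), score)
```

With four property scales the same nonamer would yield 36 input values, and the untrained score printed here carries no meaning until the weights have been fitted.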
- the trained artificial neural network is evaluated in a test phase with a test set of sequences which is distinct from the training set of sequences.
- a virtually generated sequence set can be used.
- the invention relates to a trained artificial neural network which is obtainable by the method described above.
- a proportion of > 50%, more preferably > 80% and most preferably > 90% virtual sequences have been used for training the artificial neural network.
- the invention relates to a method for identifying peptide sequences having a specific functionality within a protein sequence comprising the steps:
- the algorithm described above can be used to predict a specific functionality within a protein sequence, e.g. a peptide fragment sequence of said protein having an affinity for a receptor or protein, respectively.
- a protein sequence which is to be evaluated is provided in a first step.
- the protein sequence may be derived from a protein library, from literature or from experiments. This protein sequence is divided into individual peptide fragments having a predetermined length, e.g. from 5 to 500, preferably from 10 to 100 and most preferably from 10 to 50 amino acids in length.
- all possible peptide fragments having a predetermined length are generated from said protein sequence by a sliding window technique (frame shift).
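- The sliding window step is straightforward to sketch. The protein sequence below is a made-up example and the function name `sliding_windows` is illustrative.

```python
def sliding_windows(protein: str, length: int) -> list:
    """All overlapping fragments of the given length, shifted by one residue."""
    if length <= 0:
        raise ValueError("length must be positive")
    return [protein[i:i + length] for i in range(len(protein) - length + 1)]

protein = "MKTAYIAKQRLHIYTTGS"  # hypothetical sequence for illustration
fragments = sliding_windows(protein, 6)
print(len(fragments), fragments[:3])
```

A protein of N residues yields N - L + 1 fragments of length L, each of which can then be scored by the trained network.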
- the peptide fragments obtained are then evaluated numerically with the method described above.
- using a trained artificial neural network, peptide sequences having the desired specific functionality can be identified.
- the peptide fragment sequences evaluated are sorted according to their quality.
- a suitable quality criterion is e.g. their distance from a peptide sequence which is known to show the desired functionality.
- the peptide fragments having the highest biochemical activity are then grouped at the top of the list and can then be tested biologically.
- the peptide sequences identified as positive or the best hits can be tested in biological assays, thus considerably decreasing the number of molecules which are subjected to this time-consuming and laborious step.
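- Sorting the evaluated fragments so that the best candidates appear at the top of the list can be sketched as below. The scoring function is a placeholder: a Hamming distance to a known binder stands in for the network output or the amino acid distance measure, and the fragment list is hypothetical.

```python
def hamming(a: str, b: str) -> int:
    """Number of differing positions (stand-in for a real distance measure)."""
    return sum(x != y for x, y in zip(a, b))

def rank_fragments(fragments: list, score) -> list:
    """Return fragments sorted by descending score (best candidates first)."""
    return sorted(fragments, key=score, reverse=True)

reference = "LHIYTT"  # seed peptide named in the text as binding HSP70
fragments = ["LHIYTA", "AAAAAA", "LHIYTT", "LHAATT"]
ranked = rank_fragments(fragments, score=lambda f: -hamming(f, reference))
print(ranked)
```

Only the top of such a list would then be passed on to the biological assay.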
- the invention further relates to a method for altering peptides and peptidomimetics and their uses, based on a method for generating a focused synthetic peptide library and using the discovered AA-sequences for immunotherapy.
- the invention comprises a method for identifying peptide sequences having a specific functionality, comprising the steps:
- An object of this embodiment was to provide a method for generating new peptide sequences having a desired functionality.
- Another object of this embodiment was to provide peptidic lead structures which can be used for immunotherapy.
- T-cell epitope for any MHC haplotype derived from a compound library, from literature or from experiments
- Epitopes are important for alerting the immune system and stimulating an attack on the tumor cells.
- with these peptides, which are analogous to the natural tumor or viral antigen, different T-cell clones that would not be stimulated with the natural epitope alone can be activated. Together with the natural tumor or viral antigens, the altered peptides can induce an elevated immune response against a tumor-specific or viral antigen.
- this embodiment provides an activation of different T-cell clones recognizing tumor-specific or viral antigens due to various peptide sequences which are quite similar in sequence space or fitness space, respectively. Therefore, the immune system can be boosted, and the potency of these natural antigens combined with PepHarvester®-based altered peptide sequences as vaccines against cancer and infectious diseases increases.
- for the tyrosinase 2 (Tyr2) antigen FVWLHYYSV, the following variants for vaccination can be generated:
- These peptides can be used as a pool for vaccination against melanoma.
- the invention relates to a method for generating peptide sequences having a specific functionality, comprising the steps:
- the above-described optimized trained artificial neural network is employed as the fitness function for a computer-based systematic search within the sequence space.
- the sequence space is the set of all possible peptides having a given length.
- new variants of a parent peptide are created, subsequently evaluated with the artificial neural network and then, in a selection step, a peptide variant, e.g. the best peptide variant, is selected as parent peptide for the next cycle.
- An artificial neural network trained as described above is used as the fitness function in a protein design cycle. Initially, a random sequence of predetermined length, e.g. from 5 to 500 amino acids, preferably from 10 to 100 amino acids in length is provided as parent sequence.
- new sequences are derived, e.g. with a gradient search, a gradient search with momentum or a diffuse search.
- an evolution strategy is employed.
- the different strategies as well as the procedure for a simulated molecular evolution are described e.g. by G. Schneider, J. Schuchhardt and P. Wrede, Comput. Appl. Biosci. 10(6) (1994) 635-645, the disclosure of which is incorporated by reference.
- the evolution strategy allows for the detection of optimal sequences, even if there is a plurality of local maxima within the sequence space.
- the number of cycles used for generating new peptides is preferably from 10 to 10,000, more preferably from 10 to 1,000.
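- The design cycle described above (create variants of a parent, evaluate them, select the best as the next parent) can be sketched as follows. A trained network is not available here, so `toy_fitness` is a hypothetical stand-in that counts positions matching a target sequence; the single-point mutation operator and all parameter values are likewise assumptions.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def toy_fitness(peptide: str, target: str = "AAGIGILTV") -> float:
    """Stand-in for the trained network: fraction of positions matching a target."""
    return sum(a == b for a, b in zip(peptide, target)) / len(target)

def mutate(peptide: str, rng: random.Random) -> str:
    """Replace one randomly chosen position by a random amino acid."""
    i = rng.randrange(len(peptide))
    return peptide[:i] + rng.choice(AMINO_ACIDS) + peptide[i + 1:]

def design(fitness, length: int, cycles: int = 200, offspring: int = 20, seed: int = 0):
    """Evolutionary search: the best variant of each cycle becomes the next parent."""
    rng = random.Random(seed)
    parent = "".join(rng.choice(AMINO_ACIDS) for _ in range(length))
    start, best = parent, parent
    for _ in range(cycles):
        variants = [mutate(parent, rng) for _ in range(offspring)]
        parent = max(variants, key=fitness)
        if fitness(parent) > fitness(best):  # remember the best sequence seen
            best = parent
    return start, best

start, best = design(toy_fitness, length=9)
print(start, toy_fitness(start))
print(best, toy_fitness(best))
```

Selecting the parent only among the offspring lets the search leave a local maximum, while the separately tracked `best` sequence guarantees that the reported result is never worse than the random starting sequence.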
- the invention relates to a method for generating a focussed synthetic peptide library comprising the steps:
- the seed peptide sequence is selected from peptides binding to MHC receptors, HSP proteins, HIV receptors, ACE receptors and GPCR receptors.
- the above-described algorithm for creating a sequence set having an essentially bell-shaped distribution can be applied for generating a focussed synthetic peptide library having an enriched amount of members showing a desired functionality.
- depending on σ, libraries having different diversity can be created. Choosing a small σ, e.g. σ ≤ 0.05, a library having a low diversity is obtained, while using a large σ, e.g. σ > 0.5, a high diversity is obtained.
- For defining distances between the peptides, preferably an amino acid distance matrix and the Euclidean distance are used (G. Schneider et al., Minimal Invasive Medicine 6(3) (1995) 106-115).
- Preferred seed peptide sequences are the MART-1 sequence AAGIGILTV, which binds to MHC molecules, and LHIYTT, which binds to HSP70 proteins.
- the focussed synthetic peptide libraries contain sequences whose distances from the seed peptide follow an essentially bell-shaped distribution.
- the invention relates to a focussed synthetic peptide library obtainable by the method above.
- Focussed peptide libraries are preferably generated with the program PepHarvester® (very similar to Pep Maker, cf. G. Schneider et al., Minimal Invasive Medicine 6(3) (1995) 106-115).
- the focussed bell-shaped library obtainable by said program provides the training data set for the second software tool, namely an artificial neural network (ANN) for the optimization of a model for a sequence-activity relationship.
- ANN: artificial neural network
- Artificial neural networks usable are described e.g. by G. Schneider and P. Wrede, J. Mol. Evol. 36 (1993) 586-595.
- SME: simulated molecular evolution
- G. Schneider and P. Wrede, Biophys. J. 66 (1994) 335-344
- G. Schneider, J. Schuchhardt and P. Wrede, Comput. Appl. Biosci. 10(6) (1994) 635-645
- novel peptide sequences having a predetermined length can be generated by combining the artificial neural network as fitness function and a mutation generator.
- SME® is a computer-based evolutionary search for structurally highly diverse isofunctional peptide variants (Wrede et al. (1998) Biochemistry 37, 3588-3593; Schneider et al. (1998) Proc. Natl. Acad. Sci. 95, 12179-12184).
- peptides such as MHC-I-binding peptides
- the predicted peptide sequences as well as de novo designed peptides can be employed for biotechnological, medical, diagnostic or therapeutic applications.
- the invention is particularly applicable to drugs selected from tumor vaccination, antibiotics, drugs related to HIV, influenza, viral infection and ACE inhibitors.
- the invention is applicable, in particular, for predicting peptide sequences with regard to the affinity for protein binding sites or receptors, respectively.
- Peptidic lead structures are also used as immunotherapeutics, as peptide ligands for certain receptors (e.g. ACE inhibitors: angiotensin receptor), as antibiotically active peptides, as active agents against HIV (e.g. blocking of the pathway into the cell; HIV-RT inhibitors, HIV protease inhibitors etc.) or as peptides binding to different MHC molecules (so far about 600 different haplotypes are known) with high affinity as immune modulators against cancer and virus infections.
- cleavage sites in particular, proteasome cleavage sites of protein sequences with the methods according to the invention and to predict the - 1 9 - transport of proteins, e.g . the transport of proteins into the ER by the TAP transporter (MHC-I processing pathway) .
- binding affinities of peptides for MHC class I and MHC class II can be predicted, with different artificial neural networks being trained for binding to different HLA haplotypes and for predicting the processability of the peptides (sequence specificity of different proteases and proteasomes) in human cells. Since only few experimental data are available for proteasome restriction sites and rare MHC haplotypes, these are ideal targets for the methods according to the invention, because these few data are sufficient for the novel methods provided hereby. In particular, starting out from the few known data, sufficiently large amounts of data can be generated for training an artificial neural network following the procedure described above.
- the invention describes a tool which allows the prediction of peptide sequences from a selected long protein sequence which have similar or even improved biochemical properties compared to a seed peptide. It is also possible to create novel peptide variants de novo which show similar or even improved biochemical or pharmacological properties compared to a seed peptide.
- An essential feature of the invention is that only a few known data are sufficient to train an artificial neural network.
- the bioinformatic tool provided hereby is applicable to predict tumor-specific peptide sequences from proteins for which only very few biological data are available in the literature.
- the invention is particularly applicable for predicting peptides which are presented on MHC I and MHC II molecules which occur rarely in nature or which have not yet been evaluated sufficiently by experiments.
- the invention comprises a computer-readable storage medium such as a diskette or CD containing any of the above-described tools, libraries and/or algorithms.
- Fig. 1 shows a flow diagram of the prediction tool according to the invention.
- Fig. 2 elucidates the PepHarvester ® algorithm for the seed peptide AAGIGILTV. By using a diversity index of 0.10, 20 variants were created.
- Fig. 3 shows the PepHarvester ® algorithm applied to the HSP70 binding peptide LHIYTT.
- Fig. 4 shows a mathematical correlation between the peptides and their assumed activity.
- Fig. 5a shows calculated active epitopes from a selected protein sequence.
- Fig. 5b shows calculated HSP70-binding peptides within the p53 protein.
- Activated MHC class I-restricted CD8+ T cells efficiently eliminate tumor cells.
- tumor-specific peptides having high affinity to the MHC-I receptor were designed de novo according to the invention. By exchanging peptides bound in the binding groove of MHC-I for these high-affinity tumor-specific peptides, a specific T cell response can be initiated. Immunization with these peptides evokes a cytotoxic immune response which effectively results in tumor regression.
- Peptides which bind to the HSP70 protein were predicted or generated, respectively, starting out from the sequence LHIYTT.
- HSP70 proteins are used in tumor biology, since they serve as tumor markers and molecular chaperones and are also readily taken up by antigen-presenting cells (APC) of the immune system.
- Peptides bound to HSP70 are presented to the T cells also on MHC-I and MHC-II, thus specifically activating the immune response.
- Example 3: From the predicted peptides which bind to a specific MHC haplotype, a subsequent algorithm selects those peptides which additionally bind to HSP70. It is further possible to hybridize tumor-specific or viral peptides predicted as negative hits according to the invention with a peptide identified as binding to HSP70 according to the invention, and to use the hybrid peptide for vaccination. These peptides can be used as effective vaccines against cancer, autoimmune diseases, HIV and other virus diseases, since they stimulate the immune system epitope-specifically.
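The two-stage selection of Example 3 can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names, the cutoffs, and in particular the two toy scoring functions are hypothetical placeholders standing in for trained artificial neural networks, and their outputs have no biological meaning. The sketch only shows the control flow: cut a protein sequence into overlapping peptides of fixed length, keep those the first predictor accepts, then keep those the second predictor also accepts.

```python
from typing import Callable, List

# A scorer maps a peptide string to a number; in the patent this role is
# played by a trained artificial neural network (assumption: higher = better).
Scorer = Callable[[str], float]


def windows(protein: str, length: int) -> List[str]:
    """Return every contiguous peptide of the given length, in order."""
    return [protein[i:i + length] for i in range(len(protein) - length + 1)]


def select_dual_binders(
    protein: str,
    length: int,
    mhc_score: Scorer,
    hsp70_score: Scorer,
    mhc_cutoff: float = 0.5,
    hsp70_cutoff: float = 0.5,
) -> List[str]:
    """Keep peptides predicted to bind the MHC haplotype, then HSP70."""
    # Stage 1: peptides accepted by the (hypothetical) MHC predictor.
    mhc_hits = [p for p in windows(protein, length) if mhc_score(p) >= mhc_cutoff]
    # Stage 2: of those, peptides also accepted by the HSP70 predictor.
    return [p for p in mhc_hits if hsp70_score(p) >= hsp70_cutoff]


def toy_mhc_score(peptide: str) -> float:
    """Placeholder only: fraction of residues in an arbitrary letter set."""
    return sum(residue in "AILV" for residue in peptide) / len(peptide)


def toy_hsp70_score(peptide: str) -> float:
    """Placeholder only: 1.0 if the peptide contains 'L', else 0.0."""
    return 1.0 if "L" in peptide else 0.0


if __name__ == "__main__":
    # Arbitrary demonstration string, not a real protein sequence.
    print(select_dual_binders("AAAKLA", 3, toy_mhc_score, toy_hsp70_score, 0.6, 0.5))
```

Passing the predictors in as plain callables keeps the selection logic independent of how each network is trained, so a different haplotype-specific predictor can be substituted without changing the filter.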
Landscapes
- Chemical & Material Sciences (AREA)
- Organic Chemistry (AREA)
- General Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Biochemistry (AREA)
- Biophysics (AREA)
- Health & Medical Sciences (AREA)
- Genetics & Genomics (AREA)
- Medicinal Chemistry (AREA)
- Molecular Biology (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- Analytical Chemistry (AREA)
- Peptides Or Proteins (AREA)
- Investigating Or Analysing Biological Materials (AREA)
- Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
Abstract
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
AU2001293794A AU2001293794A1 (en) | 2000-09-05 | 2001-09-05 | A method for identifying peptide sequences having a specific functionality |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US65505600A | 2000-09-05 | 2000-09-05 | |
US09/655,056 | 2000-09-05 |
Publications (2)
Publication Number | Publication Date |
---|---|
WO2002020564A2 (fr) | 2002-03-14 |
WO2002020564A3 (fr) | 2003-10-02 |
Family
ID=24627318
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/EP2001/010195 WO2002020564A2 (fr) | 2000-09-05 | 2001-09-05 | A method for identifying peptide sequences having a specific functionality |
Country Status (2)
Country | Link |
---|---|
AU (1) | AU2001293794A1 (fr) |
WO (1) | WO2002020564A2 (fr) |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
DE10343690A1 (de) * | 2003-09-18 | 2005-04-21 | Caesar Stiftung | Method for determining optimized oligomers |
US7894995B2 (en) | 2006-02-16 | 2011-02-22 | Microsoft Corporation | Molecular interaction predictors |
US8121797B2 (en) | 2007-01-12 | 2012-02-21 | Microsoft Corporation | T-cell epitope prediction |
US8396671B2 (en) | 2006-02-16 | 2013-03-12 | Microsoft Corporation | Cluster modeling, and learning cluster specific parameters of an adaptive double threading model |
US8706421B2 (en) | 2006-02-16 | 2014-04-22 | Microsoft Corporation | Shift-invariant predictions |
CN117037902A (zh) * | 2023-07-18 | 2023-11-10 | Harbin Institute of Technology | Method for predicting binding motifs of peptides and MHC class I proteins based on protein physicochemical feature embedding |
WO2023230077A1 (fr) * | 2022-05-23 | 2023-11-30 | Palepu Kalyan | Contrastive learning for peptide-based degrader design and uses thereof |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
AU5150499A (en) * | 1998-06-02 | 1999-12-20 | Imtox Gmbh | Rationally designed peptides, production and use thereof |
2001
- 2001-09-05 WO PCT/EP2001/010195 patent/WO2002020564A2/fr active Application Filing
- 2001-09-05 AU AU2001293794A patent/AU2001293794A1/en not_active Abandoned
Also Published As
Publication number | Publication date |
---|---|
AU2001293794A1 (en) | 2002-03-22 |
WO2002020564A3 (fr) | 2003-10-02 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Micheletti et al. | Recurrent oligomers in proteins: an optimal scheme reconciling accurate and concise backbone representations in automated folding and design studies | |
Yang et al. | An introduction to epitope prediction methods and software | |
CN114649054B (zh) | Deep learning-based antigen affinity prediction method and system | |
Pintro et al. | Optimized virtual screening workflow: Towards target-based polynomial scoring functions for HIV-1 protease | |
CN102479295B (zh) | A method for computer prediction of protein function | |
AU2001245011B2 (en) | System and method for systematic prediction of ligand/receptor activity | |
WO2002020564A2 (fr) | A method for identifying peptide sequences having a specific functionality | |
AU2001245011A1 (en) | System and method for systematic prediction of ligand/receptor activity | |
He et al. | Development and application of computational methods in phage display technology | |
Chang et al. | Estimation and extraction of B‐cell linear epitopes predicted by mathematical morphology approaches | |
Handoko et al. | Extreme learning machine for predicting HLA-peptide binding | |
HUP0400698A2 (hu) | Modified interleukin-1 receptor antagonist (IL-1RA) with reduced immunogenicity | |
CN108932400B (zh) | An effective protein-RNA complex structure prediction method that takes interface information into account | |
Hu et al. | Conservation of hot regions in protein–protein interaction in evolution | |
JP2004069417A (ja) | Method for determining node coordinates, network display method and screening method | |
WO2002073193A1 (fr) | Computer-based strategy for enumerating conformational ensembles of peptides and proteins and for analyzing ligand affinities | |
CA2294771A1 (fr) | Method for inferring protein functions by means of a ligand database | |
Mann et al. | Classifying proteinlike sequences in arbitrary lattice protein models using LatPack | |
Evensen et al. | Ligand design by a combinatorial approach based on modeling and experiment: application to HLA-DR4 | |
Krenn et al. | Array technology and proteomics in autoimmune diseases | |
EP1230615A2 (fr) | Method for manipulating protein or DNA sequence data in order to generate complementary peptide ligands | |
Staquicini et al. | Combinatorial vascular targeting in translational medicine | |
Manzoor et al. | Evolution of machine learning methods in linear B-cell epitope prediction | |
Minkiewicz et al. | Online programs and databases of peptides and proteolytic enzymes–a brief update for 2007–2008 | |
WO2002048713A2 (fr) | Identification of lead structures |
Legal Events
Code | Title | Description |
---|---|---|
AK | Designated states | Kind code of ref document: A2. Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PH PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW |
AL | Designated countries for regional patents | Kind code of ref document: A2. Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG |
121 | Ep: the epo has been informed by wipo that ep was designated in this application | |
DFPE | Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101) | |
REG | Reference to national code | Ref country code: DE. Ref legal event code: 8642 |
32PN | Ep: public notification in the ep bulletin as address of the addressee cannot be established | Free format text: COMMUNICATION UNDER RULE 69 EPC (EPO FORM 1205A OF 11.07.2003) |
122 | Ep: pct application non-entry in european phase | |
NENP | Non-entry into the national phase | Ref country code: JP |