WO2001096861A1 - System for molecule identification - Google Patents

System for molecule identification

Info

Publication number
WO2001096861A1
Authority
WO
WIPO (PCT)
Prior art keywords
molecules
mass
masses
molecule
stored
Prior art date
Application number
PCT/SE2001/001322
Other languages
English (en)
Other versions
WO2001096861A8 (fr)
Inventor
Jan Eriksson
Original Assignee
Jan Eriksson
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jan Eriksson
Priority to AU2001264517A
Publication of WO2001096861A1
Publication of WO2001096861A8

Classifications

    • H ELECTRICITY
    • H01 ELECTRIC ELEMENTS
    • H01J ELECTRIC DISCHARGE TUBES OR DISCHARGE LAMPS
    • H01J49/00 Particle spectrometers or separator tubes
    • H01J49/26 Mass spectrometers or separator tubes
    • H01J49/34 Dynamic spectrometers
    • H01J49/40 Time-of-flight spectrometers
    • C CHEMISTRY; METALLURGY
    • C40 COMBINATORIAL TECHNOLOGY
    • C40B COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
    • C40B40/00 Libraries per se, e.g. arrays, mixtures

Definitions

  • the present invention relates to a method and tools for the identification of unknown molecules, and, particularly, a method and tools for molecule identification that provide a solution to the problem of random mass matching.
  • Molecule identification problems can concern e.g. the tracing of unwanted substances in the environment and the studies of metabolic pathways and disease-state markers in drug development projects. Molecule identification problems can sometimes be solved by the appropriate application of instruments and methods for the acquisition and processing of data from a sample containing the molecules to be identified.
  • One kind of data obtained from a sample is mass data.
  • Molecular or molecular constituent mass data can be obtained by a variety of techniques including techniques such as ultra-centrifugation, electrophoresis, and mass spectrometry.
  • Experimental mass data from the sample analyzed is often compared with database-information about known or hypothetical molecules.
  • MS: mass spectrometry
  • MS of protein-digests combined with searching in protein and DNA sequence databases is a method of choice for the identification of proteins in proteomics projects.
  • The field of proteomics, which includes the elucidation of protein function under various cell conditions, is believed to form a future basis for drug design.
  • MS-protein identification involves cleavage of proteins with an enzyme having high digestion specificity (usually trypsin), whereupon the resulting proteolytic products are subjected to mass analysis by either matrix-assisted laser desorption/ionization mass spectrometry (MALDI-MS) or electrospray ionization mass spectrometry (ESI-MS).
  • MALDI-MS: matrix-assisted laser desorption/ionization mass spectrometry
  • ESI-MS: electrospray ionization mass spectrometry
  • the experimentally determined masses are then compared with masses of peptides that individual proteins in a database would yield if they were cleaved by the same enzyme as was used in the experiment.
  • individual proteolytic peptide ions are isolated and subjected to fragmentation and fragment mass analysis in the mass spectrometer.
  • the resulting fragment masses are then compared with hypothetical proteolytic peptide fragment masses of the proteins in a database.
  • the protein is identified based on an evaluation of either or both of these comparisons.
  • Mass spectrometry determines a peptide mass $m_i$ to an accuracy $\pm\Delta m_i$, with $\Delta m_i/m_i$ typically >30 ppm. Within the mass range $m_i \pm \Delta m_i$, proteolytic peptide masses of several proteins in a genome database can match.
  • an unmodified peptide will match randomly with several proteins in the database, in addition to the true match with the actual protein present in the sample, and a modified peptide will yield only random matches. Consequently, a database search using mass spectrometry information will not always identify a protein unambiguously. Therefore, in order to perform accurate and reliable molecule identification, instruments for obtaining mass data must be appropriately linked with the use of other technical resources for the comparison of mass data and mass information obtained from a database.
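  • To make the random-matching problem concrete, the following Python sketch counts how many measured masses have at least one theoretical mass inside a relative tolerance window. The function name `count_matches`, the ppm-based window, and the sample masses are illustrative assumptions, not part of the disclosed method.

```python
from bisect import bisect_left, bisect_right

def count_matches(measured, theoretical, ppm):
    """Count measured masses with at least one theoretical mass within +/- ppm."""
    theo = sorted(theoretical)
    hits = 0
    for m in measured:
        tol = m * ppm * 1e-6               # absolute window for this mass
        lo = bisect_left(theo, m - tol)    # first theoretical mass >= m - tol
        hi = bisect_right(theo, m + tol)   # first theoretical mass > m + tol
        if hi > lo:                        # something lies inside the window
            hits += 1
    return hits

measured = [1000.00, 1500.00, 2000.00]
theoretical = [1000.02, 1500.10, 2500.00]
print(count_matches(measured, theoretical, ppm=30))
print(count_matches(measured, theoretical, ppm=100))
```

  • A wider window admits more matches for the same data, which is why the number of matches alone can be misleading when many database molecules are compared.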
  • the link can be a system that makes use of a method including means for comparison of data and database information, preferably operated via a computer.
  • Identification of proteins by the above-described approach requires a scheme for determining the best match between the experimental data and a sequence in the database.
  • Existing schemes for determining the best match include ranking by number of matches (W.J. Henzel et al., Proc. Natl. Acad. Sci. U S A 90, 5011, 1993), a scoring system based on the observed frequency of peptides from all proteins in a database in a given molecular weight range (the so-called "MOWSE score"; D.J.C. Pappin et al., Current Biology 6, 327, 1993), and a scheme based on Bayesian probabilities (W. Zhang et al., Anal. Chem. 72, 2482, 2000).
  • The object of the present invention is to overcome the shortcomings of the above-mentioned schemes, i.e., to provide a method that solves the problem of random mass matching.
  • This and other objects have been met by providing a system including methods of determining the probability for a particular score due to random mass matching of a molecule, and utilizing the computed probability to rank molecules.
  • The method comprises: a) determining the number of matches between a database molecule and mass data; b) computing the probability that a database molecule would yield a particular number of matches by chance; c) computing a score based on one or several probabilities computed in step (b); d) comparing the scores of molecules in a molecule database; and e) identifying the molecule or molecules that yield(s) the best score(s).
  • The invention further provides a method of generating a frequency function of the number of matches for random (false) molecule identification for any experimental condition.
  • The method comprises: a) defining a sub-population of the molecules contained in a database; b) computing the probability that a molecule in this sub-population would yield a particular number of matches by chance; c) computing a probability that all molecules in the sub-population would yield at most a particular number of matches by chance; d) computing the probability that at least one molecule in the sub-population would yield at least a particular number of matches by chance; and e) determining the relative frequency of each number of matches by using the probability computed in step (d) for each number of matches and generating therefrom a frequency function of the number of matches for random protein identification.
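  • The frequency-function steps (b) to (e) can be sketched in Python under stated assumptions. The binomial model used for step (b), the independence of molecules in the sub-population, and the reading of step (e) as differences between successive step (d) probabilities are assumptions made for illustration; the names `binom_pmf` and `random_match_frequency` are hypothetical.

```python
from math import comb

def binom_pmf(k, n, p):
    """Assumed stand-in for step (b): chance of exactly k random matches out of n."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def random_match_frequency(single_mass_probs, n):
    """Return freq[k]: probability that the best random hit has exactly k matches.

    single_mass_probs holds, per molecule, an assumed probability of matching
    one measured mass by chance; n is the number of measured masses.
    """
    # Step (c): probability that every molecule yields at most k matches.
    all_at_most = []
    for k in range(n + 1):
        prod = 1.0
        for p in single_mass_probs:
            prod *= sum(binom_pmf(j, n, p) for j in range(k + 1))
        all_at_most.append(prod)
    # Step (d): probability that at least one molecule yields at least k matches.
    at_least = [1.0] + [1.0 - all_at_most[k - 1] for k in range(1, n + 1)]
    at_least.append(0.0)  # no molecule can exceed n matches
    # Step (e): relative frequency of k as the top number of matches.
    return [at_least[k] - at_least[k + 1] for k in range(n + 1)]

freq = random_match_frequency([0.05] * 1000, n=10)
print([round(f, 3) for f in freq])
```

  • With a single molecule the result reduces to the per-molecule distribution; with many molecules the most frequent top score shifts toward higher numbers of matches, which is the effect a significance test has to account for.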
  • Fig. 1 shows frequencies (i.e., number of matching proteins) of various tryptic peptide masses in a database.
  • Fig. 2 shows mass distribution peaks for tryptic peptides.
  • Fig. 3 shows the performance of an implementation of one embodiment of the invention in comparison with state of the art systems for protein identification.
  • the graph displays results from simulations employing the invention (denoted Probity), a Bayesian method, and a method based on the number of matches.
  • Fig. 4 shows score frequency functions generated by the invention in comparison with score frequency functions generated by simulation.
  • Examples of large-scale molecule identification can be found in proteomics projects, where thousands of proteins from cells are to be identified, or cells are screened for molecular markers of states of disease.
  • the ultimate goal of molecule identification procedures is to rely on simple, rapid and automated procedures and instrumentation.
  • the technical solutions of the system that links and compares mass data with database information are of key importance to the design of instruments for automated molecule identification, since the system used will influence strongly the capability of obtaining a high relative frequency of true identification results, which is particularly critical when the quality of the data is poor.
  • Automated identification instrumentation demands that the quality of identification results is assessed automatically by the use of a significance test (J. Eriksson et al., Anal. Chem. 72, 999, 2000).
  • One object of the present invention is to provide a system that utilizes methods that allow more accurate molecule identification and more accurate and rapid significance testing of identification results.
  • the method according to the invention appropriately takes into account the phenomenon of random matching, and is therefore well suited for implementation in an automated molecule identification system.
  • a particular concern regarding large-scale molecular identification is the time required to obtain the identification result together with a quality assessment of this result.
  • a quality assessment can be accomplished by significance test, which requires knowledge of functions describing scores for false results.
  • Such frequency functions are currently obtained by simulation of random molecular identification.
  • an analytical expression for the derivation of a frequency function is provided.
  • the methods according to the invention are well suited for, but not limited to, applications, in which the molecules are biological molecules that can exist in cells of organisms.
  • Biological molecules include any biological polymer that can be degraded into constituent parts. The degradation is preferably into constituent parts at predictable positions to form predictable masses.
  • Biological molecules include proteins, nucleic acid molecules, polysaccharides and carbohydrates.
  • An experimental biological molecule is a biological molecule that is to be identified; the experimental biological molecule can also be referred to as an unknown biological molecule.
  • A theoretical biological molecule is a known biological molecule described in a database.
  • Proteins are polymers of amino acids. Constituent parts of proteins comprise amino acids.
  • A protein typically contains at least about ten amino acids, preferably at least 50 amino acids and more preferably at least 100 amino acids.
  • Nucleic acids are polymers of nucleotides. Constituent parts of nucleic acids comprise nucleotides. Typically, a nucleic acid contains at least 100 nucleotides, preferably at least 500 nucleotides.
  • Polysaccharides are polymers of monosaccharides. Constituent parts of polysaccharides comprise one or more monosaccharides. Typically, a polysaccharide contains at least five monosaccharides, preferably at least ten monosaccharides.
  • Mass data of biological molecules are quantifiable information about the masses of the constituent parts of the biological molecule.
  • Mass data include individual mass spectra and groups of mass spectra.
  • The mass spectra can be in the form of peptide maps, oligonucleotide maps or oligosaccharide maps.
  • The method of the present invention includes generating experimental mass data for the experimental molecule within a certain mass range. Mass data include the measured masses. The method also includes generating theoretical mass data in the same mass range. In one embodiment, the experimental mass data is a subset of the experimental mass data.
  • Mass data for molecules can be generated in any manner that provides mass data within a certain accuracy. Examples include matrix-assisted laser desorption/ionization mass spectrometry, electrospray ionization mass spectrometry, chromatography and electrophoresis. Mass data can also be generated by a general-purpose computer configured by software or otherwise. For the purposes of the present invention, the mass data, for example a peptide mass $m_i$, is determined to an accuracy $\pm\Delta m_i$, with $\Delta m_i/m_i$ preferably <10,000 ppm, more preferably <100 ppm, and most preferably <30 ppm.
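  • As a worked example of how a relative accuracy in ppm translates into an absolute mass window (the peptide mass of 2000 Da is an illustrative assumption, not a value from the text):

```latex
\Delta m_i = m_i \times \frac{\text{accuracy in ppm}}{10^{6}},
\qquad
m_i = 2000\ \text{Da},\ 30\ \text{ppm}
\;\Rightarrow\;
\Delta m_i = 2000 \times 30 \times 10^{-6} = 0.06\ \text{Da}
```

  • At the loosest accuracy mentioned, 10,000 ppm, the same peptide would have a window of 20 Da, so far more database masses would fall inside it.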
  • a step in generating mass data of a molecule may include first cleaving the molecule into constituent parts.
  • Biological molecules may be cleaved by methods known in the art.
  • the biological molecules are cleaved into constituent parts at predictable positions to form predictable masses.
  • Methods of cleaving include chemical degradation of the biological molecules.
  • Biological molecules may be degraded by contacting the biological molecule with any chemical substance.
  • proteins may be predictably degraded into peptides by means of cyanogen bromide and enzymes, such as trypsin, endoproteinase Asp-N, V8 protease, endoproteinase Arg-C, etc.
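  • A minimal Python sketch of predictable degradation, here a theoretical tryptic digest. The cleavage rule used (after K or R, but not before P) is a commonly used simplification and is an assumption here, since the text only names the enzyme; complete digestion is also assumed. The residue masses are standard monoisotopic values, and the function names are hypothetical.

```python
# Monoisotopic residue masses in Da (standard values); WATER is added once per peptide.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056

def tryptic_peptides(seq):
    """Split a sequence after K or R unless the next residue is P (assumed rule)."""
    peptides, start = [], 0
    for i, aa in enumerate(seq):
        nxt = seq[i + 1] if i + 1 < len(seq) else ""
        if aa in "KR" and nxt != "P":
            peptides.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        peptides.append(seq[start:])  # C-terminal remainder
    return peptides

def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(MONO[aa] for aa in peptide) + WATER

for pep in tryptic_peptides("AKRPGKA"):
    print(pep, round(peptide_mass(pep), 4))
```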
  • Nucleic acids may be predictably degraded into constituent parts by means of restriction endonucleases, such as Eco RI, Sma I, BamH I, Hinc II, etc.
  • Polysaccharides may be degraded into constituent parts by means of enzymes, such as maltase, amylase, alpha-mannosidase, etc.
  • A mass range ($m_{\min}$, $m_{\max}$) is determined for the experimental mass data.
  • the mass range can be any mass range of the mass data.
  • the mass range is the minimum and maximum measured masses of the experimental mass data for a molecule.
  • a molecule database is any compilation of information about characteristics of molecules.
  • a molecule database can be a biological molecule database.
  • Databases are the preferred method for storing both polypeptide amino acid sequences and the nucleic acid sequences that code for these polypeptides.
  • the databases come in a variety of different types that have advantages and disadvantages when viewed as the hypothesis for a polypeptide identification experiment.
  • While a database entry for an amino acid sequence may appear to be a simple text file for a user browsing for a particular polypeptide, many databases are organized into very flexible, complicated structures.
  • The detailed implementation of the database on a particular system may be based on a collection of simple text files (a "flat-file" database), a collection of tables (a "relational" database), or it may be organized around concepts that stem from the idea of a protein, gene, or organism (an "object-oriented" database). Protein mass data may be predicted from nucleic acid sequence databases.
  • Protein mass data may be obtained directly from protein sequence databases that contain a collection of amino acid sequences represented by a string of single-letter or three-letter codes for the residues in a polypeptide, starting at the N-terminus of the sequence. These codes may contain nonstandard characters to indicate ambiguity at a particular site (such as "B" indicating that the residue may be "D" (aspartic acid) or "N" (asparagine)).
  • the sequences typically have a unique number-letter combination associated with them that is used internally by the database to identify the sequence, usually referred to as the accession number for the sequence.
  • Databases may contain a combination of amino acid sequences, comments, literature references, and notes on known posttranslational modifications to the sequence.
  • A database that contains these elements is referred to as "annotated".
  • Annotated databases are used if some functional or structural information is known about the mature protein, as opposed to a sequence that is known only from the translation of a stretch of nucleic acid sequence.
  • Non-annotated databases only contain the sequence, an accession number, and a descriptive title.
  • The background information known about an experimental molecule by which the database search can be constrained can include any information.
  • Some examples of background information include information about the species of an experimental biological molecule, knowledge or an assumption about the mass of the experimental biological molecule and the isoelectric point of the experimental biological molecule.
  • the observed molecular mass or the observed isoelectric point of a protein can be used in combination with the measured masses of peptides generated by proteolysis to constrain the search for a polypeptide.
  • the comparison between the theoretical mass data of the database proteins and the mass data of the unknown protein may be constrained to only those proteins of the database which are within a chosen mass range.
  • the chosen mass range is preferably within 50% of the mass of the unknown protein, more preferably within 35%, most preferably within 25%.
  • the comparison between the theoretical mass data of the database proteins and the mass data of the unknown protein may be constrained to only those proteins of the database which are within a chosen isoelectric point range.
  • The isoelectric point (pI) of a protein is the pH at which its net charge is zero.
  • the chosen isoelectric point range is preferably within 50% of the isoelectric point of the unknown protein, more preferably within 35%, most preferably within 25%.
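  • The mass and isoelectric point constraints can be sketched as a simple pre-filter on database entries. The record layout (`id`, `mass`, `pI` keys), the function names, and the example values are assumptions for illustration; the default fraction of 0.25 corresponds to the "most preferably within 25%" range in the text.

```python
def within_fraction(value, reference, fraction):
    """True if value lies within +/- fraction of the reference value."""
    return abs(value - reference) <= fraction * reference

def constrain_candidates(candidates, mass, iso_point, fraction=0.25):
    """Keep database entries whose mass and pI are both close to the observed values."""
    return [
        c["id"]
        for c in candidates
        if within_fraction(c["mass"], mass, fraction)
        and within_fraction(c["pI"], iso_point, fraction)
    ]

database = [
    {"id": "P1", "mass": 25000.0, "pI": 5.0},   # mass too far from 40 kDa
    {"id": "P2", "mass": 41000.0, "pI": 6.5},   # passes both constraints
    {"id": "P3", "mass": 39000.0, "pI": 9.5},   # pI too far from 6.0
]
print(constrain_candidates(database, mass=40000.0, iso_point=6.0))
```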
  • fragment mass data for a peptide can be generated in any manner which provides fragment mass data within a certain accuracy.
  • Experimental conditions include the type of energy used to generate the fragment mass data.
  • Vibrational excitation energy can be used.
  • the vibrational excitation may be generated by collisions of the peptide with electrons, photons, gas molecules or a surface.
  • Electronic excitation can be used.
  • the electronic excitation may be generated by collisions of the peptide with electrons, photons, gas molecules (e.g. argon) or a surface.
  • the experimental fragment mass spectrum of a peptide from an enzymatically digested unknown protein is compared with the theoretical masses calculated by applying the rules for the specificity of the enzyme, and the rules for the fragmentation as known to those of ordinary skill in the art, to the amino acid sequence of a database protein.
  • Fragment mass data for the purposes of this invention can be generated by using multidimensional mass spectrometry (MS/MS), also known as tandem mass spectrometry.
  • MS/MS multidimensional mass spectrometry
  • A number of types of mass spectrometers can be used, including a triple-quadrupole mass spectrometer, a Fourier-transform cyclotron resonance mass spectrometer, a tandem time-of-flight mass spectrometer, and a quadrupole ion trap mass spectrometer.
  • a single peptide from a protein digest is subjected to MS/MS measurement and the observed pattern of fragment ions is compared to the patterns of fragment ions predicted from database sequences.
  • the invention provides a method to determine the probabilities for the scores that a particular molecule in a database can yield by chance when compared with mass data.
  • the method can operate under a variety of experimental and database search constraints.
  • the score can be the number of matches between masses derived from known or hypothetical molecules or molecular constituents in a database and masses in mass data from one or several known or unknown molecules, or molecular constituents.
  • the score can also result from a computation that utilizes the number of matches.
  • the invention provides a method to extract information about the molecules in a database.
  • Examples of information that can be extracted from a database are total molecular mass, charge, isoelectric point, hydrophobicity and known or hypothetical chemical modification, and mass, charge, isoelectric point, hydrophobicity and known or hypothetical chemical modification of molecular constituents.
  • The invention provides a method to perform actions on molecules in the database that are supposed to mimic actions occurring in an experiment.
  • Examples of such actions are degradation of molecules into molecular constituents by hydrolysis, where hydrolysis can result from the activity of chemicals or enzymes.
  • The method can also perform actions that mimic experimental actions on molecular constituents, for example the fragmentation of an excited molecular constituent into smaller pieces.
  • The invention provides a method to derive a number of molecular pieces, $k_u$, resulting from an action assumed to mimic an experimental situation.
  • the pieces can be molecular constituents, such as proteolytic peptides resulting from enzymatic digestion of a protein, where different assumptions can be made concerning the degree of completeness of the enzymatic digestion.
  • the pieces can be molecular constituents in the form of fragments of molecular constituents, e.g. fragments of proteolytic peptides.
  • The invention provides a method to organize the masses of molecules or molecular constituents or fragments thereof. Examples of such organization are given in Figs. 1 and 2, where Fig. 1 displays the number of proteins in a database that match a given proteolytic peptide mass and Fig. 2 displays the clustered distribution of proteolytic peptide masses. Masses clustering in this or similar fashions will be referred to as a mass distribution peak. Mass distribution peaks can be found for all molecules that contain a limited number of different atoms (e.g. C, H, N, O, S).
  • The invention provides a method for defining mass regions wherein the frequency of various masses can be determined. The method defines $f_i$ as the fraction of masses of molecular constituents or fragments that falls into a mass region i.
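  • One way to picture the fractions $f_i$ is as a normalized histogram over mass regions. The sketch below assumes equal-width regions between a lower and an upper mass limit and normalizes over the masses that fall inside those limits; both choices, and the name `mass_region_fractions`, are assumptions rather than details given in the text.

```python
def mass_region_fractions(masses, m_min, m_max, n_regions):
    """Return f[i], the fraction of in-range masses that fall into region i."""
    width = (m_max - m_min) / n_regions
    counts = [0] * n_regions
    for m in masses:
        if m_min <= m <= m_max:
            # A mass exactly at m_max is assigned to the last region.
            i = min(int((m - m_min) / width), n_regions - 1)
            counts[i] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * n_regions

masses = [800.0, 900.0, 1200.0, 1900.0, 2500.0]   # the last mass is out of range
print(mass_region_fractions(masses, 800.0, 2000.0, n_regions=3))
```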
  • The invention provides a method that determines a probability $p_i$ that a particular molecule in a database will be found in a randomly chosen mass distribution peak in the mass region i:
  • $p_i = F(k_u, m_i, c)$, where F is a function, $m_i$ is a mass region, and c denotes experimental and database search constraints.
  • $p_i$ is given by an expression (not reproduced in this text) that describes the probability that a molecular constituent from a particular molecule characterized by $k_u$ will be found in a single randomly chosen mass distribution peak.
  • The denominator of the expression for $p_i$ represents the number of mass distribution peaks within the mass region i.
  • ⁇($m_i$, $\Delta m$) can be interpreted as a statistical measure of the number of molecular constituent masses that can be found within $\pm\Delta m$ from a randomly chosen molecular constituent mass.
  • The mass accuracy $\Delta m$ can be different for different mass regions, in which case it is denoted by $\Delta m_i$.
  • The invention provides a method to determine ⁇($m_i$, $\Delta m$) by simulation of the relative frequency of masses around a randomly chosen mass in a mass distribution.
  • ⁇($m_i$, $\Delta m$) is determined by integration of a function describing molecular constituent mass distributions and normalization to the total number of molecular constituent masses in a mass distribution peak.
  • ⁇($m_i$, $\Delta m$) is determined by direct counting followed by normalization.
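  • The direct-counting route can be illustrated as follows: for every mass in a mass distribution peak, count the masses lying within the accuracy window around it, then normalize by the number of masses in the peak and average over all choices of center mass. Treating every mass as an equally likely "randomly chosen" center, counting the center mass itself, and the name `mean_neighbor_fraction` are assumptions of this sketch.

```python
from bisect import bisect_left, bisect_right

def mean_neighbor_fraction(peak_masses, delta_m):
    """Average fraction of peak masses within +/- delta_m of a chosen peak mass."""
    s = sorted(peak_masses)
    total = 0
    for m in s:
        # Direct counting: masses in [m - delta_m, m + delta_m], center included.
        total += bisect_right(s, m + delta_m) - bisect_left(s, m - delta_m)
    # Normalize by peak size, then average over all candidate center masses.
    return total / (len(s) * len(s))

peak = [1000.40, 1000.45, 1000.50, 1000.55, 1000.90]
print(mean_neighbor_fraction(peak, delta_m=0.06))
```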
  • A finite number of mass regions between $m_{\min}$ and $m_{\max}$ is employed, each having an individually defined $p_i'$.
  • The probabilities $p_i'$ are employed to compute a total probability, p(k), for an individual molecule in the database to match randomly k out of n masses, where n is the number of masses in the mass data.
  • $p(k) = G(p_i', k, n, c')$, where G is a function and c' denotes experimental and database search constraints.
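  • The text does not spell out the function G. As one possible illustration only, the sketch below assumes that each of the n measured masses matches the molecule independently with its own region-specific probability, so that the number of random matches follows a Poisson-binomial distribution computed by dynamic programming. The independence assumption and the name `match_count_distribution` are not taken from the source.

```python
def match_count_distribution(probs):
    """Return dist[k]: probability of exactly k random matches.

    probs[j] is the assumed chance-match probability for measured mass j,
    taken from the mass region that mass falls into.
    """
    dist = [1.0]                      # with zero masses, zero matches is certain
    for p in probs:
        new = [0.0] * (len(dist) + 1)
        for k, q in enumerate(dist):
            new[k] += q * (1 - p)     # this mass does not match
            new[k + 1] += q * p       # this mass matches by chance
        dist = new
    return dist

# Three measured masses in regions with different chance-match probabilities.
print(match_count_distribution([0.02, 0.05, 0.10]))
```

  • A small p(k) for the observed number of matches k means that the matches are unlikely to be due to chance alone.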
  • a score related to random matching is employed in the process of ranking molecules in a database.
  • the probability p(k) is employed in the process of ranking molecules in a database.
  • a whole database or a fraction of a database is processed and organized to allow the computation of p(k) for molecules in the database, k denotes the number of matches between the masses of molecular constituents of each database molecule investigated and masses in the mass data.
  • the molecules in the database can be known or hypothetical.
  • the molecule or molecules producing the mass data can be known or unknown.
  • The ranking of the molecules in a database is based on the score S(p(k)), where S is a function.
  • the molecule in the database that yields the lowest S(p(k)) for k matches with the mass data is given the highest rank.
  • the molecule in the database yielding the second lowest S(p(k)) for k matches is given the second highest rank and so on.
  • the identification of a molecule or molecules is among the molecules having the highest ranks.
  • the highest ranks can be the top ranked molecule only, but it can also be more molecules than the top ranked, e.g. the top two, top three, top four, top five, top ten, or top 100.
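  • The ranking rule (lowest score first) can be sketched as below. The scores are taken as given, for instance S equal to p(k) itself; that choice, the molecule names, and the function name `rank_molecules` are illustrative assumptions.

```python
def rank_molecules(scores, top=1):
    """Return the names of the `top` molecules with the lowest score S(p(k))."""
    ranked = sorted(scores.items(), key=lambda item: item[1])
    return [name for name, _ in ranked[:top]]

# Hypothetical scores: the chance probability of the observed number of matches.
scores = {"protein_A": 1e-3, "protein_B": 1e-7, "protein_C": 0.2}
print(rank_molecules(scores))          # top-ranked molecule only
print(rank_molecules(scores, top=2))   # top two
```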
  • the number of ranked molecules that are considered as identification results can also be determined by the use of a significance test.
  • the invention provides a method of generating a frequency distribution of scores for a particular experimental condition, wherein the scores relate to random identifications of proteins.
  • a frequency distribution is any compilation of the observed values of the variable being studied and how many times each value is observed.
  • Frequency distributions can be in the form of a table of listings, a bar graph, a histogram, a frequency polygon, or a continuous curve.
  • Functions derived from frequency distributions can be continuous (probability density function) or discrete (probability mass functions). Cumulative distribution functions of each type of function can also be derived.
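  • For discrete scores such as the number of matches, a frequency distribution can be turned into a probability mass function and a cumulative distribution function as sketched here. The input scores are made-up values standing in for scores of random identifications, and `pmf_and_cdf` is a hypothetical helper name.

```python
from collections import Counter
from itertools import accumulate

def pmf_and_cdf(scores):
    """Return (values, pmf, cdf) for a list of discrete scores."""
    counts = Counter(scores)                    # frequency distribution
    values = sorted(counts)
    total = len(scores)
    pmf = [counts[v] / total for v in values]   # relative frequency of each value
    cdf = list(accumulate(pmf))                 # running sum of the pmf
    return values, pmf, cdf

values, pmf, cdf = pmf_and_cdf([2, 3, 3, 4, 3, 2, 5, 3])
for v, p, c in zip(values, pmf, cdf):
    print(v, p, c)
```

  • In a significance test, the complement of the cumulative function gives the chance that a random identification scores above a given value.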
  • the frequency function is generated for a sub-population with H members from a database.
  • The sub-population is selected based upon values of $k_u$.
  • the frequency function is generated for molecules ranked upon their number of matches.

Landscapes

  • Chemical & Material Sciences (AREA)
  • Analytical Chemistry (AREA)
  • Other Investigation Or Analysis Of Materials By Electrical Means (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

In general, mass data are not unique; in particular, each experimentally determined mass can be matched randomly to at least one molecule in a database. Random matching between mass data and database molecules can yield erroneous identification results. To reduce such erroneous results, random matching must be properly accounted for in a molecule identification method. The invention relates to a method for determining, for any molecule in the database and any experimental and database search constraints, the probability that random matching produces a particular number of matches between the mass data and the masses of molecular constituents. The method uses the determined random-matching probability to assign scores and rank the molecules in a database. The invention also relates to a method for generating a frequency function of scores for any experimental condition or any database search constraints, where the scores relate to random identifications of molecules. Frequency functions are necessary and sufficient tools for testing the significance of a score associated with the identification of an unknown biological molecule.
PCT/SE2001/001322 2000-06-14 2001-06-12 System for molecule identification WO2001096861A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU2001264517A AU2001264517A1 (en) 2000-06-14 2001-06-12 System for molecule identification

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
SE0002214A SE517259C2 (sv) 2000-06-14 2000-06-14 System for molecule identification
SE0002214-5 2000-06-14

Publications (2)

Publication Number Publication Date
WO2001096861A1 (fr) 2001-12-20
WO2001096861A8 (fr) 2002-08-01

Family

ID=20280077

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/SE2001/001322 WO2001096861A1 (fr) 2000-06-14 2001-06-12 System for molecule identification

Country Status (3)

Country Link
AU (1) AU2001264517A1 (fr)
SE (1) SE517259C2 (fr)
WO (1) WO2001096861A1 (fr)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2005031343A1 (fr) * 2003-10-01 2005-04-07 Proteome Systems Intellectual Property Pty Ltd Method for determining the biological likelihood of candidate compositions or structures
US7349809B2 (en) 2000-02-02 2008-03-25 Yol Bolsum Canada Inc. Method of non-targeted complex sample analysis
US8478762B2 (en) 2009-05-01 2013-07-02 Microsoft Corporation Ranking system
EP1376651B1 (fr) * 2002-06-25 2014-06-11 Hitachi, Ltd. Method and apparatus for analyzing mass spectrometry data
US10697969B2 (en) 2005-09-12 2020-06-30 Med-Life Discoveries Lp Methods for diagnosing a colorectal cancer (CRC) health state or change in CRC health state, or for diagnosing risk of developing CRC or the presence of CRC in a subject

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5538897A (en) * 1994-03-14 1996-07-23 University Of Washington Use of mass spectrometry fragmentation patterns of peptides to identify amino acid sequences in databases
JP2000048765A (ja) * 1998-07-24 2000-02-18 Jeol Ltd Time-of-flight mass spectrometer
EP1047107A2 (fr) * 1999-04-06 2000-10-25 Micromass Limited Method for identifying peptides and proteins by mass spectrometry
WO2000073787A1 (fr) * 1999-05-27 2000-12-07 Rockefeller University Expert system for protein identification using mass spectrometry information combined with database searching

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
PATENT ABSTRACTS OF JAPAN *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7349809B2 (en) 2000-02-02 2008-03-25 Yol Bolsum Canada Inc. Method of non-targeted complex sample analysis
US7865312B2 (en) 2000-02-02 2011-01-04 Phenomenome Discoveries Inc. Method of non-targeted complex sample analysis
EP1376651B1 (fr) * 2002-06-25 2014-06-11 Hitachi, Ltd. Method and apparatus for analyzing mass spectrometry data
WO2005031343A1 (fr) * 2003-10-01 2005-04-07 Proteome Systems Intellectual Property Pty Ltd Method for determining the biological likelihood of candidate compositions or structures
US10697969B2 (en) 2005-09-12 2020-06-30 Med-Life Discoveries Lp Methods for diagnosing a colorectal cancer (CRC) health state or change in CRC health state, or for diagnosing risk of developing CRC or the presence of CRC in a subject
US8478762B2 (en) 2009-05-01 2013-07-02 Microsoft Corporation Ranking system

Also Published As

Publication number Publication date
AU2001264517A1 (en) 2001-12-24
SE0002214D0 (sv) 2000-06-14
WO2001096861A8 (fr) 2002-08-01
SE517259C2 (sv) 2002-05-14
SE0002214L (sv) 2001-12-15

Similar Documents

Publication Publication Date Title
US6393367B1 (en) Method for evaluating the quality of comparisons between experimental and theoretical mass data
Henzel et al. Protein identification: the origins of peptide mass fingerprinting
US8639447B2 (en) Method for identifying peptides using tandem mass spectra by dynamically determining the number of peptide reconstructions required
Blueggel et al. Bioinformatics in proteomics
US6446010B1 (en) Method for assessing significance of protein identification
Liska et al. Combining mass spectrometry with database interrogation strategies in proteomics
Lu et al. A suffix tree approach to the interpretation of tandem mass spectra: applications to peptides of non-specific digestion and post-translational modifications
US20020046002A1 (en) Method to evaluate the quality of database search results and the performance of database search algorithms
WO2001096861A1 (en) System for molecule identification
WO2004083233A2 (en) Peptide identification
EP1820133B1 (en) Method and system for identification of polypeptides
CA2477151A1 (en) Method for protein identification using mass spectrometry data
US20040044481A1 (en) Method for protein identification using mass spectrometry data
US20020152033A1 (en) Method for evaluating the quality of database search results by means of expectation value
Hubbard Computational approaches to peptide identification via tandem MS
Fridman et al. The probability distribution for a random match between an experimental-theoretical spectral pair in tandem mass spectrometry
WO2002101355A9 (en) Improved proteomic analysis
Liu et al. PRIMA: peptide robust identification from MS/MS spectra
US7603240B2 (en) Peptide identification
Fang et al. Feature selection in validating mass spectrometry database search results
WO2003087805A2 (en) Method for efficiently computing the mass of modified peptides for identification by database searching and mass spectrometry
WO2004070643A2 (en) Method for predicting protein function
Phanse et al. Proteomics and Protein Identification by Mass Spectrometry
Yan et al. Separation of ion types in tandem mass spectrometry data interpretation-a graph-theoretic approach
Wu et al. Peptide identification via tandem mass spectrometry

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
AK Designated states

Kind code of ref document: C1

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: C1

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

WR Later publication of a revised version of an international search report
REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

122 Ep: pct application non-entry in european phase
NENP Non-entry into the national phase

Ref country code: JP
