
WO2013173795A1 - Real-time sequence-based biosurveillance system - Google Patents

Real-time sequence-based biosurveillance system

Info

Publication number
WO2013173795A1
WO2013173795A1 PCT/US2013/041709 US2013041709W WO2013173795A1 WO 2013173795 A1 WO2013173795 A1 WO 2013173795A1 US 2013041709 W US2013041709 W US 2013041709W WO 2013173795 A1 WO2013173795 A1 WO 2013173795A1
Authority
WO
WIPO (PCT)
Prior art keywords
probe
database
sequence
probes
nucleotide sequence
Prior art date
Application number
PCT/US2013/041709
Other languages
English (en)
Inventor
Philip Rolfe
Original Assignee
Pathogenica, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Pathogenica, Inc.
Publication of WO2013173795A1

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6809 Methods for determination or identification of nucleic acids involving differential detection
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B30/00 ICT specially adapted for sequence analysis involving nucleotides or amino acids
    • G16B30/20 Sequence assembly
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B30/00 ICT specially adapted for sequence analysis involving nucleotides or amino acids
    • G16B30/10 Sequence alignment; Homology search
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B50/00 ICT programming tools or database systems specially adapted for bioinformatics
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B50/00 ICT programming tools or database systems specially adapted for bioinformatics
    • G16B50/30 Data warehousing; Computing architectures
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6888 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms
    • C12Q1/689 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms for bacteria
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B30/00 ICT specially adapted for sequence analysis involving nucleotides or amino acids

Definitions

  • the invention provides a method of treating a subject suspected of being infected with a pathogen, comprising detecting at least one target organism (e.g., a pathogen) by the methods of the invention and administering a suitable therapeutic treatment based on the at least one organism detected.
  • the algorithm reports the surveillance data in real time over the network.
  • processing comprises comparing the stored nucleotide sequence data to the user inputted nucleotide sequence data.
  • the comparing comprises identifying a match between the stored nucleotide sequence data and the user inputted nucleotide sequence data, wherein if a match is present, the database optionally communicates through the network the identity of the pathogenic organism, or strain thereof, and wherein if a match is not present, the unmatched user inputted nucleotide sequence data is stored to the database.
  • the unmatched user inputted nucleotide sequence data is stored in real time.
  • Probes in a mixture will typically have similar bulk properties (such as homologous probe sequence length, homologous probe sequence Tm, length of the captured region of interest, and lack of secondary structure) or fall in ranges of similar values.
  • the Tm of the homologous probe sequences in a mixture of probes will be within 10, 9, 8, 7, 6, 5, 4, 3, 2, or 1 °C of each other, or in particular embodiments have the same Tm.
  • the homologous probe sequences in a mixture of probes will all be within 10, 9, 8, 7, 6, 5, 4, 3, 2, or 1 nucleotide in length of each other, and in particular embodiments they are the same length.
  • the length of the region of interest between the target sequences of a probe may be common to all probes in the mixture, or vary over a range of values, such as 2-20, 20-100, 20-200, 40-300, 100-300 nucleotides.
  • the regions of interest are within 100, 90, 80, 70, 60, 50, 40, 30, 20, or 10 nucleotides in length of each other.
  • the regions of interest are the same length.
  • Barcode lengths may also vary, but are generally within 25, 20, 15, 10, or 5 nucleotides of each other. In particular embodiments, the barcodes are the same length.
  • SICs may comprise a region of interest as defined above, where the region of interest is modified to further comprise a sequence heterologous to the region of interest.
  • the sequence heterologous to the region of interest in the SICs is at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 40 contiguous bases, or more.
  • the panel is a subject panel for genotyping a subject.
  • the subject panel comprises probes for at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 40, 80, 100, 200, 400, 800, 1000, 5000, or 10000 subject loci.
  • the panel is for a mammalian subject.
  • the mammal is a human.
  • the panel is a prenatal or neonatal panel for detecting heritable genetic abnormalities and/or genotypes associated with increased risk for disease.
  • the panel comprises probes directed to one or more of the genes described in paragraphs 25, 57, and 58 of U.S. Patent Application Publication No. 2010/0137426, paragraphs 6 and 7 of U.S. Patent Application Publication No. 2009/0305284, paragraph 27 of U.S. Patent Application Publication No.
  • the probes and mixture provided by the invention can be produced by the skilled artisan by following the examples and the general teachings of the application.
  • the probe design process (also referred to as the probe design "pipeline") may take as input a set of genomic DNA sequences against which probes may be designed and the sets of particular strains of target organisms.
  • the genomic DNA sequences may be entire genomes, particular genes, or genomic coordinates in one or more strains.
  • the pipeline may take as input a set of genomes, genes, or coordinates and will select a set of regions to target based on some criteria.
  • the pipeline may use criteria such as regions that vary between the input genomes, genes, or coordinates of the targeted regions in the homologous probe sequence set and a larger set of known genomes.
  • the pipeline may filter n-mers to remove those of substantially the same or exactly the same sequence (i.e., a "duplicate screen").
  • n-mers with the same suffix of length x, where x is the minimum n used in enumerating genomic segments of length n (as described above), are considered and the ones with the highest scores may be kept, where the scores are based on the n-mer's suitability as a ligation-side homer, as described above.
  • To generate a set of candidate extension-side homers, n-mers with the same prefix of length x are considered and the ones with the highest scores may be kept.
  • a score for the candidate probes may be generated by 1) computing the number of SNPs or indels (insertions or deletions or combinations thereof), up to a selected maximum value, which are observed between each pair of strains to which the probe is expected to bind; 2) generating a sum of the values from (1) to yield the total number of SNPs or indels that the probe may reveal; and 3) multiplying the sum from (2) by an estimate of the probability that the probe will work. This product is the probe's final score.
  • Methods of probe design may include a method for scoring homers and for scoring complete probes, wherein the score corresponds to the probability that the probe will work.
  • the invention provides methods of detecting the presence of one or more organisms of interest in a test sample.
  • the methods comprise the step of contacting a mixture comprising probes described above with any of the test samples described above in a capture reaction, as defined above.
  • a mixture comprising probes is contacted with nucleic acids extracted from a test sample, along with a polymerase enzyme and nucleotide triphosphates (NTPs), and at least one region of interest is captured by polymerase-dependent extension of at least one homologous probe sequence in the mixture.
  • hybridization of a probe to the target sequences in the organism of interest is followed by polymerase mediated, target-sequence directed addition of nucleotides to the 3' homologous probe sequence, terminating due to obstruction at the 5' homologous probe sequence of the probe.
  • a ligation reaction joins the terminal 3' nucleotide to the 5' nucleotide of arm H2.
  • the methods of the invention further comprise the step of amplifying capture reaction products in an amplification reaction.
  • methods of amplifying nucleic acids include the polymerase chain reaction (see, e.g., U.S. Patent Nos. 4,683,195 and 4,683,202 and McPherson and Moller, PCR (The Basics), Taylor & Francis; 2nd edition (March 30, 2006)), OLA (oligonucleotide ligation amplification) (see, e.g., U.S. Patent Nos. 5,185,243, 5,679,524, and 5,573,907), rolling-circle amplification ("RCA," described in Baner et al., Nuc.
  • the amplification is linear amplification, such as RCA.
  • capture reaction products (e.g., circularized probes)
  • the methods provided by the invention may comprise the step of contacting sample nucleic acids, capture reaction products or amplification reaction products with a secondary-capture oligonucleotide capture probe which comprises a moiety designed to be captured, such as a biotin molecule, and a nucleic acid sequence, which is able to hybridize to the sample nucleic acids, capture reaction products, or amplification reaction products.
  • an oligonucleotide, such as a biotinylated oligonucleotide, may be used to enrich its target nucleic acids using affinity purification.
  • the capture probes contain sequences that facilitate processing for sequencing by a certain sequencing technology, such as sequences that can serve as anchor sites for sequencing by synthesis, primer sites for sequencing reaction initiation, or restriction enzyme sites that allow cleavage for improved ligation of oligonucleotide adaptors for sequencing of the particular amplicon.
  • circularized capture probes are contacted by oligonucleotides which prime polymerase-mediated extension of the capture probes to generate sequences complementary to that of the circularized probe, including from at least one to one million or more concatemerized copies of the original circular probe.
  • Some reads may map to regions common between one or more strains. In this schematic illustration, most reads align to strains A, B, C and D and are common. In contrast, other reads may be unique to specific strains (e.g., the subset of reads aligning only to strain D). In some embodiments, quantitative models are used to predict the distribution of common reads and unique reads in order to provide a quantitative estimate of the proportion of each unique pathogen present in the sample.
  • Output of results can occur in parallel (1) to a company server, (2) to XML and HL7 formats, e.g., for deposit in a hospital system, in an electronic medical record (EMR) system, or in other HL7- or XML-capable storage systems, for use in existing health record frameworks, and/or (3) to physician-friendly graphical and text formats, e.g., graphs, tables, summary text, and possibly annotated web formats linking to reference information.
  • Output formats are arbitrary, e.g., simple text, spreadsheet data, binary data objects, encrypted and/or compressed files.
  • a complete record may involve all or some of these linked to a diagnostic test via unique identifiers. They may be assembled into a coherent object or may be accessible via a search for the unique identifier.
  • kits for detection of targeted regions in the genomes of pathogenic organisms using the compositions described herein with networks of next-generation sequencing machines can allow for the development of a digital biorepository based on real-time aggregation of genetic sequences of pathogenic strains. This can also be used for real-time global biosurveillance of emerging infections.
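
The three-step candidate-probe score described in the list above (per-pair SNP/indel counts capped at a maximum, summed over strain pairs, then multiplied by the estimated probability that the probe works) can be sketched as follows. This is only an illustrative sketch: the function name `probe_score`, the dictionary layout of `variant_counts`, and the default cap `max_per_pair` are assumptions and do not come from the application.

```python
from itertools import combinations


def probe_score(variant_counts, strains, p_work, max_per_pair=5):
    """Hypothetical scoring helper, following the three steps in the text.

    variant_counts: maps frozenset({strain_a, strain_b}) to the number of
        SNPs/indels observed between that pair in the captured region.
    strains: strains the probe is expected to bind.
    p_work: estimated probability (0..1) that the probe will work.
    max_per_pair: the "selected maximum value" (an assumed default).
    """
    total = 0
    for a, b in combinations(sorted(strains), 2):
        observed = variant_counts.get(frozenset((a, b)), 0)
        # Step 1: per-pair count, capped at the selected maximum.
        total += min(observed, max_per_pair)
    # Step 2 is the running sum; step 3 multiplies by the success estimate.
    return total * p_work


# Made-up example data for three strains.
counts = {
    frozenset(("A", "B")): 2,
    frozenset(("A", "C")): 7,  # capped to 5
    frozenset(("B", "C")): 0,
}
score = probe_score(counts, ["A", "B", "C"], p_work=0.5, max_per_pair=5)
print(score)  # (2 + 5 + 0) * 0.5 = 3.5
```

Capping each pair keeps a single highly divergent pair of strains from dominating the score, so probes that discriminate among many pairs rank higher.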

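The match-or-store behaviour described in the Definitions (compare user-submitted sequence data with stored data, report the organism identity on a match, otherwise store the unmatched sequence) can be illustrated with a small in-memory sketch. The class `SequenceDatabase`, its `submit` method, and the use of exact string matching are assumptions made for illustration; the application does not specify the matching procedure, and a practical system would more likely rely on sequence alignment.

```python
from dataclasses import dataclass, field


@dataclass
class SequenceDatabase:
    """Hypothetical in-memory stand-in for the networked database."""

    known: dict = field(default_factory=dict)      # sequence -> organism/strain
    unmatched: list = field(default_factory=list)  # sequences with no match

    def submit(self, sequence):
        """Return the matching identity, or store the sequence and return None."""
        seq = sequence.strip().upper()
        identity = self.known.get(seq)  # exact match only (an assumption)
        if identity is not None:
            return identity
        # No match: keep the unmatched sequence for later curation.
        self.unmatched.append(seq)
        return None


# Made-up sequences and labels, for illustration only.
db = SequenceDatabase(known={"ACGTACGT": "Example organism, strain 1"})
hit = db.submit("acgtacgt")
miss = db.submit("TTTTGGGG")
print(hit, miss, db.unmatched)
```

Storing unmatched submissions at the moment of the query is what lets the repository grow as new strains are sequenced.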
Landscapes

  • Life Sciences & Earth Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Chemical & Material Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • Biotechnology (AREA)
  • Biophysics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Health & Medical Sciences (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Medical Informatics (AREA)
  • Evolutionary Biology (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Analytical Chemistry (AREA)
  • Organic Chemistry (AREA)
  • Bioethics (AREA)
  • Databases & Information Systems (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Molecular Biology (AREA)
  • Microbiology (AREA)
  • Immunology (AREA)
  • Biochemistry (AREA)
  • General Engineering & Computer Science (AREA)
  • Genetics & Genomics (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The invention relates to a method comprising forming a molecular inversion probe (MIP; a linear single-stranded molecule containing two target-binding arms, which may be separated by a backbone sequence) to selectively target a nucleotide template from any source (viral, prokaryotic, or eukaryotic) in order to obtain information, including transcript quantity and sequence data.
PCT/US2013/041709 2012-05-18 2013-05-17 Real-time sequence-based biosurveillance system WO2013173795A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201261649068P 2012-05-18 2012-05-18
US61/649,068 2012-05-18

Publications (1)

Publication Number Publication Date
WO2013173795A1 true WO2013173795A1 (fr) 2013-11-21

Family

ID=49584361

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2013/041709 WO2013173795A1 (fr) 2012-05-18 2013-05-17 Real-time sequence-based biosurveillance system

Country Status (1)

Country Link
WO (1) WO2013173795A1 (fr)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113470742A (zh) * 2020-03-31 2021-10-01 Alibaba Group Holding Limited Data processing method, apparatus, storage medium, and computer device
WO2022272085A1 (fr) * 2021-06-25 2022-12-29 Atossa Therapeutics, Inc. Remote detection of pathogens

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050267971A1 (en) * 2004-04-01 2005-12-01 Fritz Charles W System and method of using DNA for linking to network resources
US20070264642A1 (en) * 2000-10-24 2007-11-15 Willis Thomas D Direct multiplex characterization of genomic DNA
WO2011156795A2 (fr) * 2010-06-11 2011-12-15 Pathogenica, Inc. Nucleic acids for multiplex organism detection and methods of use and making the same
US20120004111A1 (en) * 2007-11-21 2012-01-05 Cosmosid Inc. Direct identification and measurement of relative populations of microorganisms with direct dna sequencing and probabilistic methods

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113470742A (zh) * 2020-03-31 2021-10-01 Alibaba Group Holding Limited Data processing method, apparatus, storage medium, and computer device
WO2021202066A1 (fr) * 2020-03-31 2021-10-07 Alibaba Group Holding Limited Method and system for processing genetic data
JP2023514789A (ja) * 2020-03-31 2023-04-10 Alibaba Group Holding Limited Method and system for processing genetic data
WO2022272085A1 (fr) * 2021-06-25 2022-12-29 Atossa Therapeutics, Inc. Remote detection of pathogens

Similar Documents

Publication Publication Date Title
US20130261196A1 (en) Nucleic Acids For Multiplex Organism Detection and Methods Of Use And Making The Same
US10138519B2 (en) Universal sanger sequencing from next-gen sequencing amplicons
AU2018331434A1 (en) Universal short adapters with variable length non-random unique molecular identifiers
KR20180020137A (ko) Error suppression in sequenced DNA fragments using redundant reads with unique molecular indices (UMIs)
WO2013173774A2 (fr) Molecular inversion probes
CN115176032B (zh) Compositions and methods for assessing microbial populations
US20160115544A1 (en) Molecular barcoding for multiplex sequencing
WO2013067167A2 (fr) Method and system for detecting an organism
US20160177374A1 (en) Integrated Capture and Amplification of Target Nucleic Acid for Sequencing
US20240279751A1 (en) A rapid multiplex rpa based nanopore sequencing method for real-time detection and sequencing of multiple viral pathogens
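JP2023519919A (ja) Assays for detecting pathogens
KR20220035482A (ko) Detection of genomic sequences using combinations of probes, probe molecules, and arrays comprising the probes for the specific detection of organisms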
US20230326553A1 (en) Identifying a target nucleic acid
WO2013173795A1 (fr) Real-time sequence-based biosurveillance system
JP2023520590A (ja) Pathogen diagnostic test
JP5210634B2 (ja) Detection, identification, and differentiation of Serratia species using the spacer region
Park et al. Detection strategies for foodborne salmonella and prospects for utilization of whole genome sequencing approaches
Ong'era et al. High-throughput sequencing approaches applied to SARS-CoV-2
US20240203528A1 (en) Improved detection of genomic sequences and probe molecules therefor
Bajaj et al. MICROBIAL GENOMICS
WO2024030342A1 (fr) Methods and compositions for nucleic acid analysis
EP4298242A1 (fr) Improved detection of genomic sequences and probe molecules therefor
KR20230025222A (ko) Primer set for quantification of Bacteroides strains and use thereof

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 13790831

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 02/04/2015)

122 Ep: pct application non-entry in european phase

Ref document number: 13790831

Country of ref document: EP

Kind code of ref document: A1
