US20090305902A1 - Double-Tiled and Multi-Tiled Arrays and Methods Thereof
- Publication number
- US20090305902A1 US12/086,142
- Authority
- US
- United States
- Prior art keywords
- array
- tiled
- probes
- nucleic acid
- features
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
- 238000013459 approach Methods 0.000 description 1
- 239000012620 biological material Substances 0.000 description 1
- 238000010170 biological method Methods 0.000 description 1
- 239000012472 biological sample Substances 0.000 description 1
- 229960002685 biotin Drugs 0.000 description 1
- 235000020958 biotin Nutrition 0.000 description 1
- 239000011616 biotin Substances 0.000 description 1
- 201000011510 cancer Diseases 0.000 description 1
- 229940041514 candida albicans extract Drugs 0.000 description 1
- 229910052799 carbon Inorganic materials 0.000 description 1
- 235000011089 carbon dioxide Nutrition 0.000 description 1
- 238000005119 centrifugation Methods 0.000 description 1
- 125000003636 chemical group Chemical group 0.000 description 1
- 239000003153 chemical reaction reagent Substances 0.000 description 1
- 239000003795 chemical substances by application Substances 0.000 description 1
- CYDMQBQPVICBEU-UHFFFAOYSA-N chlorotetracycline Natural products C1=CC(Cl)=C2C(O)(C)C3CC4C(N(C)C)C(O)=C(C(N)=O)C(=O)C4(O)C(O)=C3C(=O)C2=C1O CYDMQBQPVICBEU-UHFFFAOYSA-N 0.000 description 1
- CYDMQBQPVICBEU-XRNKAMNCSA-N chlortetracycline Chemical compound C1=CC(Cl)=C2[C@](O)(C)[C@H]3C[C@H]4[C@H](N(C)C)C(O)=C(C(N)=O)C(=O)[C@@]4(O)C(O)=C3C(=O)C2=C1O CYDMQBQPVICBEU-XRNKAMNCSA-N 0.000 description 1
- 238000004891 communication Methods 0.000 description 1
- 239000002131 composite material Substances 0.000 description 1
- 238000000205 computational method Methods 0.000 description 1
- 238000004590 computer program Methods 0.000 description 1
- 239000000356 contaminant Substances 0.000 description 1
- 229910052802 copper Inorganic materials 0.000 description 1
- 239000010949 copper Substances 0.000 description 1
- ZYGHJZDHTFUPRJ-UHFFFAOYSA-N coumarin Chemical compound C1=CC=C2OC(=O)C=CC2=C1 ZYGHJZDHTFUPRJ-UHFFFAOYSA-N 0.000 description 1
- 238000005520 cutting process Methods 0.000 description 1
- 229940104302 cytosine Drugs 0.000 description 1
- SUYVUBYJARFZHO-RRKCRQDMSA-N dATP Chemical compound C1=NC=2C(N)=NC=NC=2N1[C@H]1C[C@H](O)[C@@H](COP(O)(=O)OP(O)(=O)OP(O)(O)=O)O1 SUYVUBYJARFZHO-RRKCRQDMSA-N 0.000 description 1
- SUYVUBYJARFZHO-UHFFFAOYSA-N dATP Natural products C1=NC=2C(N)=NC=NC=2N1C1CC(O)C(COP(O)(=O)OP(O)(=O)OP(O)(O)=O)O1 SUYVUBYJARFZHO-UHFFFAOYSA-N 0.000 description 1
- RGWHQCVHVJXOKC-SHYZEUOFSA-J dCTP(4-) Chemical compound O=C1N=C(N)C=CN1[C@@H]1O[C@H](COP([O-])(=O)OP([O-])(=O)OP([O-])([O-])=O)[C@@H](O)C1 RGWHQCVHVJXOKC-SHYZEUOFSA-J 0.000 description 1
- HAAZLUGHYHWQIW-KVQBGUIXSA-N dGTP Chemical compound C1=NC=2C(=O)NC(N)=NC=2N1[C@H]1C[C@H](O)[C@@H](COP(O)(=O)OP(O)(=O)OP(O)(O)=O)O1 HAAZLUGHYHWQIW-KVQBGUIXSA-N 0.000 description 1
- 230000027832 depurination Effects 0.000 description 1
- 239000008121 dextrose Substances 0.000 description 1
- FFYPMLJYZAEMQB-UHFFFAOYSA-N diethyl pyrocarbonate Chemical compound CCOC(=O)OC(=O)OCC FFYPMLJYZAEMQB-UHFFFAOYSA-N 0.000 description 1
- 239000000539 dimer Substances 0.000 description 1
- 201000010099 disease Diseases 0.000 description 1
- 208000037265 diseases, disorders, signs and symptoms Diseases 0.000 description 1
- 230000008030 elimination Effects 0.000 description 1
- 238000003379 elimination reaction Methods 0.000 description 1
- ZMMJGEGLRURXTF-UHFFFAOYSA-N ethidium bromide Chemical compound [Br-].C12=CC(N)=CC=C2C2=CC=C(N)C=C2[N+](CC)=C1C1=CC=CC=C1 ZMMJGEGLRURXTF-UHFFFAOYSA-N 0.000 description 1
- 229960005542 ethidium bromide Drugs 0.000 description 1
- 230000001747 exhibiting effect Effects 0.000 description 1
- 210000002744 extracellular matrix Anatomy 0.000 description 1
- 108010068213 factor XIIa inhibitor Proteins 0.000 description 1
- 108091006047 fluorescent proteins Proteins 0.000 description 1
- 102000034287 fluorescent proteins Human genes 0.000 description 1
- 230000005021 gait Effects 0.000 description 1
- 230000002068 genetic effect Effects 0.000 description 1
- 239000003102 growth factor Substances 0.000 description 1
- IPCSVZSSVZVIGE-UHFFFAOYSA-M hexadecanoate Chemical compound CCCCCCCCCCCCCCCC([O-])=O IPCSVZSSVZVIGE-UHFFFAOYSA-M 0.000 description 1
- 239000005556 hormone Substances 0.000 description 1
- 229940088597 hormone Drugs 0.000 description 1
- 210000003917 human chromosome Anatomy 0.000 description 1
- 230000007062 hydrolysis Effects 0.000 description 1
- 238000006460 hydrolysis reaction Methods 0.000 description 1
- 238000005286 illumination Methods 0.000 description 1
- 210000000987 immune system Anatomy 0.000 description 1
- 230000000984 immunochemical effect Effects 0.000 description 1
- 238000000126 in silico method Methods 0.000 description 1
- 238000011065 in-situ storage Methods 0.000 description 1
- 238000011534 incubation Methods 0.000 description 1
- PZOUSPYUWWUPPK-UHFFFAOYSA-N indole Natural products CC1=CC=CC2=C1C=CN2 PZOUSPYUWWUPPK-UHFFFAOYSA-N 0.000 description 1
- RKJUIXBNRJVNHR-UHFFFAOYSA-N indolenine Natural products C1=CC=C2CC=NC2=C1 RKJUIXBNRJVNHR-UHFFFAOYSA-N 0.000 description 1
- JPMOKRWIYQGMJL-UHFFFAOYSA-N inh1 Chemical compound CC1=CC(C)=CC=C1C1=CSC(NC(=O)C=2C=CC=CC=2)=N1 JPMOKRWIYQGMJL-UHFFFAOYSA-N 0.000 description 1
- 238000007689 inspection Methods 0.000 description 1
- 230000003993 interaction Effects 0.000 description 1
- 238000011835 investigation Methods 0.000 description 1
- 150000002500 ions Chemical class 0.000 description 1
- 229910052742 iron Inorganic materials 0.000 description 1
- 238000011005 laboratory method Methods 0.000 description 1
- 239000004816 latex Substances 0.000 description 1
- 229920000126 latex Polymers 0.000 description 1
- 230000000670 limiting effect Effects 0.000 description 1
- 125000003473 lipid group Chemical group 0.000 description 1
- 150000002632 lipids Chemical class 0.000 description 1
- HWYHZTIRURJOHG-UHFFFAOYSA-N luminol Chemical compound O=C1NNC(=O)C2=C1C(N)=CC=C2 HWYHZTIRURJOHG-UHFFFAOYSA-N 0.000 description 1
- 229920002521 macromolecule Polymers 0.000 description 1
- 239000006249 magnetic particle Substances 0.000 description 1
- 229910052748 manganese Inorganic materials 0.000 description 1
- 239000011572 manganese Substances 0.000 description 1
- 238000004519 manufacturing process Methods 0.000 description 1
- 239000003550 marker Substances 0.000 description 1
- 230000007246 mechanism Effects 0.000 description 1
- 230000001404 mediated effect Effects 0.000 description 1
- DZVCFNFOPIZQKX-LTHRDKTGSA-M merocyanine Chemical compound [Na+].O=C1N(CCCC)C(=O)N(CCCC)C(=O)C1=C\C=C\C=C/1N(CCCS([O-])(=O)=O)C2=CC=CC=C2O\1 DZVCFNFOPIZQKX-LTHRDKTGSA-M 0.000 description 1
- 230000004060 metabolic process Effects 0.000 description 1
- 238000012775 microarray technology Methods 0.000 description 1
- 239000004005 microsphere Substances 0.000 description 1
- OUAAURDVPDKVAK-UHFFFAOYSA-N n-phenyl-1h-benzimidazol-2-amine Chemical compound N=1C2=CC=CC=C2NC=1NC1=CC=CC=C1 OUAAURDVPDKVAK-UHFFFAOYSA-N 0.000 description 1
- IHRUNHAGYIHWNV-UHFFFAOYSA-N naphtho[2,3-h]cinnoline Chemical compound C1=NN=C2C3=CC4=CC=CC=C4C=C3C=CC2=C1 IHRUNHAGYIHWNV-UHFFFAOYSA-N 0.000 description 1
- 238000002663 nebulization Methods 0.000 description 1
- 230000009871 nonspecific binding Effects 0.000 description 1
- 125000003835 nucleoside group Chemical group 0.000 description 1
- QIQXTHQIDYTFRH-UHFFFAOYSA-N octadecanoic acid Chemical compound CCCCCCCCCCCCCCCCCC(O)=O QIQXTHQIDYTFRH-UHFFFAOYSA-N 0.000 description 1
- 238000002515 oligonucleotide synthesis Methods 0.000 description 1
- 239000003399 opiate peptide Substances 0.000 description 1
- 230000003287 optical effect Effects 0.000 description 1
- 210000003463 organelle Anatomy 0.000 description 1
- 239000003960 organic solvent Substances 0.000 description 1
- LCCNCVORNKJIRZ-UHFFFAOYSA-N parathion Chemical compound CCOP(=S)(OCC)OC1=CC=C([N+]([O-])=O)C=C1 LCCNCVORNKJIRZ-UHFFFAOYSA-N 0.000 description 1
- 239000002245 particle Substances 0.000 description 1
- 244000052769 pathogen Species 0.000 description 1
- 230000001717 pathogenic effect Effects 0.000 description 1
- 108010011903 peptide receptors Proteins 0.000 description 1
- 102000014187 peptide receptors Human genes 0.000 description 1
- 238000010647 peptide synthesis reaction Methods 0.000 description 1
- 235000019319 peptone Nutrition 0.000 description 1
- 125000002080 perylenyl group Chemical group C1(=CC=C2C=CC=C3C4=CC=CC5=CC=CC(C1=C23)=C45)* 0.000 description 1
- CSHWQDPOILHKBI-UHFFFAOYSA-N peryrene Natural products C1=CC(C2=CC=CC=3C2=C2C=CC=3)=C3C2=CC=CC3=C1 CSHWQDPOILHKBI-UHFFFAOYSA-N 0.000 description 1
- RDOWQLZANAYVLL-UHFFFAOYSA-N phenanthridine Chemical group C1=CC=C2C3=CC=CC=C3C=NC2=C1 RDOWQLZANAYVLL-UHFFFAOYSA-N 0.000 description 1
- 150000004713 phosphodiesters Chemical group 0.000 description 1
- 108060006184 phycobiliprotein Proteins 0.000 description 1
- 238000000053 physical method Methods 0.000 description 1
- 229920001155 polypropylene Polymers 0.000 description 1
- 229920002223 polystyrene Polymers 0.000 description 1
- 150000004032 porphyrins Chemical class 0.000 description 1
- 150000003141 primary amines Chemical class 0.000 description 1
- 238000000159 protein binding assay Methods 0.000 description 1
- DLOBKMWCBFOUHP-UHFFFAOYSA-N pyrene-1-sulfonic acid Chemical compound C1=C2C(S(=O)(=O)O)=CC=C(C=C3)C2=C2C3=CC=CC2=C1 DLOBKMWCBFOUHP-UHFFFAOYSA-N 0.000 description 1
- 150000003220 pyrenes Chemical class 0.000 description 1
- 150000003254 radicals Chemical class 0.000 description 1
- MUPFEKGTMRGPLJ-ZQSKZDJDSA-N raffinose Chemical compound O[C@H]1[C@H](O)[C@@H](CO)O[C@@]1(CO)O[C@@H]1[C@H](O)[C@@H](O)[C@H](O)[C@@H](CO[C@@H]2[C@@H]([C@@H](O)[C@@H](O)[C@@H](CO)O2)O)O1 MUPFEKGTMRGPLJ-ZQSKZDJDSA-N 0.000 description 1
- 239000011541 reaction mixture Substances 0.000 description 1
- 238000003753 real-time PCR Methods 0.000 description 1
- 108020003175 receptors Proteins 0.000 description 1
- 102000005962 receptors Human genes 0.000 description 1
- 238000010188 recombinant method Methods 0.000 description 1
- 230000006798 recombination Effects 0.000 description 1
- 238000005215 recombination Methods 0.000 description 1
- 230000009467 reduction Effects 0.000 description 1
- 230000002829 reductive effect Effects 0.000 description 1
- 238000011160 research Methods 0.000 description 1
- 239000011347 resin Substances 0.000 description 1
- 229920005989 resin Polymers 0.000 description 1
- 239000012508 resin bead Substances 0.000 description 1
- HSSLDCABUXLXKM-UHFFFAOYSA-N resorufin Chemical compound C1=CC(=O)C=C2OC3=CC(O)=CC=C3N=C21 HSSLDCABUXLXKM-UHFFFAOYSA-N 0.000 description 1
- 108091008146 restriction endonucleases Proteins 0.000 description 1
- 238000007894 restriction fragment length polymorphism technique Methods 0.000 description 1
- PYWVYCXTNDRMGF-UHFFFAOYSA-N rhodamine B Chemical compound [Cl-].C=12C=CC(=[N+](CC)CC)C=C2OC2=CC(N(CC)CC)=CC=C2C=1C1=CC=CC=C1C(O)=O PYWVYCXTNDRMGF-UHFFFAOYSA-N 0.000 description 1
- 229930187593 rose bengal Natural products 0.000 description 1
- AZJPTIGZZTZIDR-UHFFFAOYSA-L rose bengal Chemical compound [K+].[K+].[O-]C(=O)C1=C(Cl)C(Cl)=C(Cl)C(Cl)=C1C1=C2C=C(I)C(=O)C(I)=C2OC2=C(I)C([O-])=C(I)C=C21 AZJPTIGZZTZIDR-UHFFFAOYSA-L 0.000 description 1
- 229940081623 rose bengal Drugs 0.000 description 1
- STRXNPAVPKGJQR-UHFFFAOYSA-N rose bengal A Natural products O1C(=O)C(C(=CC=C2Cl)Cl)=C2C21C1=CC(I)=C(O)C(I)=C1OC1=C(I)C(O)=C(I)C=C21 STRXNPAVPKGJQR-UHFFFAOYSA-N 0.000 description 1
- YGSDEFSMJLZEOE-UHFFFAOYSA-M salicylate Chemical compound OC1=CC=CC=C1C([O-])=O YGSDEFSMJLZEOE-UHFFFAOYSA-M 0.000 description 1
- 229960001860 salicylate Drugs 0.000 description 1
- 235000019515 salmon Nutrition 0.000 description 1
- 238000005464 sample preparation method Methods 0.000 description 1
- 238000012216 screening Methods 0.000 description 1
- 238000013515 script Methods 0.000 description 1
- 239000004065 semiconductor Substances 0.000 description 1
- 229910052710 silicon Inorganic materials 0.000 description 1
- 239000010703 silicon Substances 0.000 description 1
- 239000000377 silicon dioxide Substances 0.000 description 1
- 229910052709 silver Inorganic materials 0.000 description 1
- 239000004332 silver Substances 0.000 description 1
- 238000004088 simulation Methods 0.000 description 1
- 239000011780 sodium chloride Substances 0.000 description 1
- OSQUFVVXNRMSHL-LTHRDKTGSA-M sodium;3-[(2z)-2-[(e)-4-(1,3-dibutyl-4,6-dioxo-2-sulfanylidene-1,3-diazinan-5-ylidene)but-2-enylidene]-1,3-benzoxazol-3-yl]propane-1-sulfonate Chemical compound [Na+].O=C1N(CCCC)C(=S)N(CCCC)C(=O)C1=C\C=C\C=C/1N(CCCS([O-])(=O)=O)C2=CC=CC=C2O\1 OSQUFVVXNRMSHL-LTHRDKTGSA-M 0.000 description 1
- 238000010532 solid phase synthesis reaction Methods 0.000 description 1
- 238000000527 sonication Methods 0.000 description 1
- 230000009870 specific binding Effects 0.000 description 1
- 238000004611 spectroscopical analysis Methods 0.000 description 1
- 150000003431 steroids Chemical class 0.000 description 1
- ODJLBQGVINUMMR-HZXDTFASSA-N strophanthidin Chemical compound C1([C@H]2CC[C@]3(O)[C@H]4[C@@H]([C@]5(CC[C@H](O)C[C@@]5(O)CC4)C=O)CC[C@@]32C)=CC(=O)OC1 ODJLBQGVINUMMR-HZXDTFASSA-N 0.000 description 1
- 239000006228 supernatant Substances 0.000 description 1
- 229960002180 tetracycline Drugs 0.000 description 1
- 229930101283 tetracycline Natural products 0.000 description 1
- 235000019364 tetracycline Nutrition 0.000 description 1
- 150000003522 tetracyclines Chemical class 0.000 description 1
- MPLHNVLQVRSVEE-UHFFFAOYSA-N texas red Chemical compound [O-]S(=O)(=O)C1=CC(S(Cl)(=O)=O)=CC=C1C(C1=CC=2CCCN3CCCC(C=23)=C1O1)=C2C1=C(CCC1)C3=[N+]1CCCC3=C2 MPLHNVLQVRSVEE-UHFFFAOYSA-N 0.000 description 1
- 229940113082 thymine Drugs 0.000 description 1
- 239000003053 toxin Substances 0.000 description 1
- 231100000765 toxin Toxicity 0.000 description 1
- 108700012359 toxins Proteins 0.000 description 1
- 230000002103 transcriptional effect Effects 0.000 description 1
- 239000001003 triarylmethane dye Substances 0.000 description 1
- ORHBXUUXSCNDEV-UHFFFAOYSA-N umbelliferone Chemical compound C1=CC(=O)OC2=CC(O)=CC=C21 ORHBXUUXSCNDEV-UHFFFAOYSA-N 0.000 description 1
- 229940035893 uracil Drugs 0.000 description 1
- 229910052720 vanadium Inorganic materials 0.000 description 1
- GPPXJZIENCGNKB-UHFFFAOYSA-N vanadium Chemical compound [V]#[V] GPPXJZIENCGNKB-UHFFFAOYSA-N 0.000 description 1
- 231100000611 venom Toxicity 0.000 description 1
- 239000002435 venom Substances 0.000 description 1
- 210000001048 venom Anatomy 0.000 description 1
- 230000003612 virological effect Effects 0.000 description 1
- 238000012800 visualization Methods 0.000 description 1
- 239000001018 xanthene dye Substances 0.000 description 1
- 239000012138 yeast extract Substances 0.000 description 1
Classifications
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6813—Hybridisation assays
- C12Q1/6834—Enzymatic or biochemical coupling of nucleic acids to a solid phase
- C12Q1/6837—Enzymatic or biochemical coupling of nucleic acids to a solid phase using probe arrays or probe chips
Definitions
- Microarrays, high-throughput platforms for analyzing gene expression and features of total genomic DNA, among other things, are gaining in popularity among researchers, who discover ever more applications for their unbiased and broad feature sets, and in the diagnostic industry, which uses them for transcriptional profiling and polymorphism analysis. Microarray analyses are currently limited by the number of individual features that can be placed on each array, making the use of microarrays expensive and time consuming.
- Microarrays, including genome tiling microarrays, are exceptionally powerful tools for querying diverse genomic features, including mapping gene expression and structure, analyzing polymorphisms, determining protein binding targets, and examining genome architecture 1-4.
- The utility of genome tiling microarrays lies in the unbiased selection of densely spaced features. Current microarrays and studies using them are restricted both by expense (the number of arrays or slides purchased) and by spatial limitations of microarray technology (the number of features on each array). Thus, there is a need in the art to increase the number of sequences present on an array to provide cost and time savings.
- Described herein is a multi-tiling method that significantly increases the number of features (e.g., sequences) present on an array and methods of making and using the multi-tiled array.
- Also described, for the first time, is successful transcriptional profiling using the multi-tiled array format.
- The described arrays and methods save cost and time and preserve precious samples. Using this method, we and others can now save money and precious samples by using fewer arrays to cover a region, or can perform investigations at significantly higher resolution without increasing costs or the amount of sample required for the experiment.
- the double-tiling array is useful for complex, two-color, whole-genome hybridizations.
- multi-tiled nucleic acid arrays comprising an immobilized array of nucleic acid features, wherein each feature comprises an inner probe and an outer probe, wherein the inner and outer probes are unrelated in genomic coordinates.
- one of the inner or the outer probe is arranged horizontally and the other is arranged vertically.
- the features of the array further comprise middle probes between the inner and the outer probes, wherein the probes are unrelated in genomic coordinates.
- the features of the array further comprise second middle probes between the inner and the middle probes, wherein the probes are unrelated in genomic coordinates.
- the array may further comprise at least one positive control feature.
- the array may further comprise at least one negative control feature.
- the multi-tiled array comprises from about 100 to about 3 billion features. In a related embodiment, the multi-tiled array comprises from about 10,000 to about 10 million features. In a related embodiment, the multi-tiled array comprises from about 1000 to about 5 million features.
- the arrays described herein may have any number of features as determined appropriate by one of skill in the art for a particular purpose.
- multi-tiled nucleic acid arrays comprising an immobilized array of nucleic acid features, wherein the features comprise an inner probe, a middle probe, and an outer probe, wherein the probes are unrelated in genomic coordinates.
- the probes are from between about 10 nucleotides to about 50 nucleotides in length. In a related embodiment, the probes are from between about 15 nucleotides to about 40 nucleotides in length. In another related embodiment, the probes are from between about 20 nucleotides to about 35 nucleotides in length. In a related embodiment, the probes are 30 nucleotides in length.
- the inner, middle, and outer probes are arranged horizontally, vertically, and diagonally, respectively, or in any order.
- the probes of one layer of a multi-tiled array are arranged in a manner different from those of another layer. It does not matter which layer is arranged in which manner. Layers of probes may also be arranged in non-linear or random patterns.
- the features further comprise spacers between the inner and the middle probe and between the middle and the outer probe.
- multi-tiled nucleic acid arrays comprising an immobilized array of nucleic acid features, wherein the features comprise four probes, an inner probe, a middle probe, and an outer probe, wherein the probes are unrelated in genomic coordinates.
- the probes are from between about 10 nucleotides to about 50 nucleotides in length.
- the probes are arranged horizontally, vertically, diagonally upper left to lower right and diagonally lower left to upper right.
- the features further comprise spacers between the inner and the middle probe and between the middle and the outer probe.
- multi-tiled nucleic acid arrays comprising an immobilized array of nucleic acid features, wherein the features comprise at least two probes unrelated in genomic coordinates.
- the features comprise three probes.
- the features comprise four probes.
- methods of expression (transcriptional) profiling comprising providing a multi-tiled array; hybridizing a labeled sample to the array; and analyzing the array.
- the array comprises portions of at least one genome.
- Exemplary genomes include, for example, those of mammals, yeast, bacteria, plants, and the like.
- the profiling further comprises comparing the expression profile of a sample to an expression profile reference.
- the sample is a clinical sample.
- analyzing the array comprises deconvolution of a signal.
- the analyzing determines an expression profile of a sample.
- the method of expression profiling evaluates a subject for a condition.
- the condition is a disease condition.
- the method of expression profiling diagnoses a subject for a condition. In a related embodiment, the method of expression profiling monitors a subject for a condition. In another related embodiment, the subject is a human.
- methods of making a multi-tiled array comprising selecting probe sequences; arranging inner probe sequences in sequence order; and appending outer probe sequences in sequence order to the inner probe sequences.
- the methods may further comprise masking a genome of an organism prior to selecting probe sequences.
- one of the inner or the outer probe sequences are arranged horizontally and the other are arranged vertically.
- the methods may further comprise appending third probe sequences in sequence order to the outer probe sequences.
- the third probe sequences are arranged diagonally.
- selecting the probe sequences comprises selecting one or more of random sequence or sequences with low probability of conformational problems.
- the methods may further comprise randomizing the positions of the sequences. In one embodiment, the methods may further comprise adding a spacer between the inner and the outer probe.
- the masking comprises masking repetitive genomic sequences.
- the selecting of the probes comprises separating each probe by at least a distance of 1 to 500 nucleotides. In a related embodiment, the selecting of the probes comprises separating each probe by a distance of between about 1 to about 1,000 nucleotides.
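The masking and spacing steps described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the selection procedure of the application: the function name `select_probes`, the use of `N` to mark masked (repetitive) bases, and the defaults `k=30` and `gap=10` are all hypothetical choices made for the example.

```python
def select_probes(masked_seq, k=30, gap=10):
    """Pick non-overlapping k-mers from a masked sequence.

    Assumptions (not from the source): masked bases are written as 'N',
    and `gap` is the number of nucleotides left between the end of one
    probe and the start of the next.
    """
    probes = []
    i = 0
    while i + k <= len(masked_seq):
        window = masked_seq[i:i + k]
        if "N" in window:
            # Skip past the last masked base inside this window.
            i += window.rindex("N") + 1
            continue
        probes.append((i, window))
        i += k + gap
    return probes
```

Each returned tuple holds the start coordinate and the probe sequence, so the genomic position of every probe stays known when the probes are later arranged on the array.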
- methods of array based evaluation of a sample comprising providing a multi-tiled array; hybridizing a sample to the array; and deconvoluting signal intensities.
- the methods may further comprise analyzing the signal intensities.
- the methods may further comprise examining fluorescent feature adjacency to determine whether the inner or outer probe was hybridized.
- the signal is a fluorescent or color signal.
- the methods may further comprise preparing a sample.
- preparing the sample comprises one or more of digesting a sample, labeling a digested sample, and purifying sample.
- deconvoluting comprises visualizing the microarray and examining the data obtained from the microarray.
- digesting a sample for cDNA synthesis may be performed using MMLV-RT, DTT, 10 mM dNTP and RNaseOUT (Agilent Technologies Kit) or the Agilent Low RNA Input Linear Amplification Kit.
- labeling a digested sample is by in vitro transcription.
- purifying sample is, for example, by QIAGEN's QIAquick spin columns as described in the RNeasy Mini Kit (QIAGEN).
- deconvoluting comprises visualizing the microarray.
- the visualizing is, for example, by Axon GenePix 4000B scanner (Axon Instruments).
- the data generated from the deconvolution and the visualization is examined, for example, by using GenePix Pro 6.0.
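The adjacency-based deconvolution mentioned above (deciding whether a lit feature reflects its inner or its outer probe) can be sketched as follows. This is an illustrative sketch only: it assumes inner probes are tiled horizontally and outer probes vertically, that the scan has already been thresholded into a grid of lit and unlit features, and that a run of at least `min_run` neighbouring lit features counts as a line. The names `run_length`, `assign_layers`, and `min_run` are hypothetical.

```python
def run_length(lit, r, c, dr, dc):
    """Length of the contiguous lit run through (r, c) along direction (dr, dc)."""
    rows, cols = len(lit), len(lit[0])
    n = 1
    for sign in (1, -1):
        rr, cc = r + sign * dr, c + sign * dc
        while 0 <= rr < rows and 0 <= cc < cols and lit[rr][cc]:
            n += 1
            rr += sign * dr
            cc += sign * dc
    return n


def assign_layers(lit, min_run=3):
    """Label each lit feature 'inner', 'outer', or 'ambiguous'.

    Assumption: inner probes run along rows, outer probes along columns.
    """
    calls = {}
    for r, row in enumerate(lit):
        for c, on in enumerate(row):
            if not on:
                continue
            h = run_length(lit, r, c, 0, 1)  # horizontal neighbours
            v = run_length(lit, r, c, 1, 0)  # vertical neighbours
            if h >= min_run and h > v:
                calls[(r, c)] = "inner"
            elif v >= min_run and v > h:
                calls[(r, c)] = "outer"
            else:
                calls[(r, c)] = "ambiguous"
    return calls
```

A feature where a horizontal and a vertical line cross is reported as ambiguous, because adjacency alone cannot say which of its two probes hybridized.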
- methods of polymorphism analysis comprising providing a multi-tiled nucleic acid array of probes comprising a first set of probes spanning each of a collection of polymorphic sites in known sequences of unknown function and complementary to first allelic forms of the sites, and a second set of probes spanning each of the polymorphic sites in the collection and complementary to second allelic forms of the sites, wherein the collection of polymorphic sites includes at least 10 unlinked polymorphic sites; hybridizing a nucleic acid sample from a subject to the array of probes; and analyzing the hybridization intensities of probes in the first and second probe sets to determine a profile of polymorphic forms present in the subject.
- methods of constructing a multi-tiled chemical array comprising a plurality of features of bioorganic molecules in a predetermined arrangement, comprising providing a substantially planar solid material having an attachment surface; and attaching the features of bioorganic molecules onto the attachment surface, wherein the features comprise an inner probe and an outer probe, wherein the inner and outer probes are unrelated in genomic coordinates.
- the array comprises from about 50 to about 3 billion (3 × 10^9) different features of the bioorganic molecules, and the bioorganic molecules are attached to the surface of each tile at a density of about 1000 to 100,000 bioorganic molecules per square micron of the attachment surface.
- the material comprises a solid nonporous material selected from the group consisting of a glass, a silicon, and a plastic.
- the methods may further comprise bringing the constructed array into contact with a same sample.
- the methods may further comprise performing a quality test on the attachment surface after the attaching.
- the methods may further comprise verifying the fidelity of the bioorganic molecules on the attachment surface.
- the methods may further comprise verifying the density of attachment of the bioorganic molecules on the attachment surface.
- the bioorganic molecules are presynthesized before attachment onto the surface.
- kits for use in expression profiling of a nucleic acid comprising a multi-tiled nucleic acid array, and instructions for use.
- FIG. 1 depicts a sample 4 × 4 array for didactic purposes.
- the sequence to be tiled is split into two equal-length segments, represented here as first half, A-P; second half, 1-16. 30-mers from each half-sequence are tiled separately, A-P (inner stack) horizontally and 1-16 (outer stack) vertically.
- Outer stack tiles are overlaid on inner stack tiles and the 32 30-mers are concatenated to form 16 60-mers.
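The FIG. 1 construction can be sketched in Python. This is a minimal sketch under stated assumptions rather than the procedure used to design the actual arrays: "horizontally" is read as row-major placement and "vertically" as column-major placement, the inner probe is written first in each concatenated feature, and the helper names `tile_kmers` and `double_tile` are hypothetical.

```python
def tile_kmers(seq, k, n):
    """Return n evenly spaced k-mers from seq (a simple stand-in for probe selection)."""
    step = (len(seq) - k) // (n - 1) if n > 1 else 0
    return [seq[i * step:i * step + k] for i in range(n)]


def double_tile(inner, outer, rows, cols):
    """Overlay two probe stacks on one rows x cols grid of features.

    inner is laid out row by row (horizontally), outer column by column
    (vertically); each feature is the concatenation inner + outer.
    """
    if len(inner) != rows * cols or len(outer) != rows * cols:
        raise ValueError("each stack needs exactly rows * cols probes")
    grid = []
    for r in range(rows):
        row = []
        for c in range(cols):
            inner_probe = inner[r * cols + c]  # row-major index
            outer_probe = outer[c * rows + r]  # column-major index
            row.append(inner_probe + outer_probe)
        grid.append(row)
    return grid


# Didactic labels in the style of FIG. 1: A-P for the inner stack, 1-16 for the outer.
labels_inner = list("ABCDEFGHIJKLMNOP")
labels_outer = [str(i) for i in range(1, 17)]
layout = double_tile(labels_inner, labels_outer, rows=4, cols=4)
```

With real sequences, `tile_kmers(first_half, 30, 16)` and `tile_kmers(second_half, 30, 16)` would supply the two stacks, and each of the 16 features would be a 60-mer. Because neighbouring inner probes sit in the same row while neighbouring outer probes sit in the same column, a hybridizing region shows up as a horizontal or a vertical line depending on which half of the sequence it came from.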
- FIG. 2 depicts a plasmid experiment—results agree well with predictions.
- FIG. 3 depicts a two-color double-tiled array clearly demonstrating galactose induction.
- a section of the two-color double-tiled array showing red signal in lines resulting from hybridization of Cy5-labeled RNA from galactose-induced cultures along with Cy3-labeled RNA from glucose-induced cultures. Most lines are yellow, indicating that as expected, most genes are expressed at similar levels in the glucose- and galactose-grown cultures.
- the features illuminated in a horizontal red line are derived from GAL1; the vertical red line is signal from GAL2.
- native Ty1 sequences were found to be downregulated approximately 2.5 fold by galactose induction; this conclusion was confirmed by real-time RT-PCR.
- FIG. 4 shows that double-tiled arrays exhibit low between-array variation. Box plots show the distribution of differences between estimates of relative expression obtained from replicate RNA samples. Ideally, these differences should be 0; thus, tighter box plots are associated with better precision.
- the first box plot (green) represents the data from double-tiled arrays and the second plot represents data from conventional single-tiled arrays.
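The quantity summarized by each box plot can be sketched as follows. This is an illustrative sketch only: the function name `replicate_spread` is hypothetical, the inputs are assumed to be per-gene log-ratio estimates from two replicate arrays keyed by gene name, and the quartiles are computed with Python's default `statistics.quantiles` method, which may differ from the one used for the figure.

```python
import statistics


def replicate_spread(rep1, rep2):
    """Summarize per-gene differences between two replicate estimates.

    rep1, rep2: dicts mapping gene name -> estimated relative expression.
    Returns (median, interquartile range) of rep1 - rep2 over shared genes.
    """
    shared = sorted(set(rep1) & set(rep2))
    diffs = [rep1[g] - rep2[g] for g in shared]
    q1, _, q3 = statistics.quantiles(diffs, n=4)
    return statistics.median(diffs), q3 - q1
```

A median near 0 with a small interquartile range corresponds to the tight box plot that indicates good precision.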
- FIG. 5 shows correspondence at the top (CAT) plots.
- Correspondence, shown on the y-axis, is defined as the number of genes in common between lists formed by ranking genes by their log-ratios and keeping the top N. The list size N is varied and shown on the x-axis.
- the blue line shows correspondence between two replicate single-tiled arrays
- the red represents correspondence between two replicate double-tiled arrays
- the green line shows the average correspondence between single-tiled and double-tiled arrays (there are 4 possible comparisons, all shown in thinner lines).
- the yellow area represents a 99.9% critical region for the null hypothesis of no correspondence, i.e. anything outside this region attains a p-value of less than 0.001.
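The correspondence statistic plotted in FIG. 5 can be computed as in the sketch below. It is a minimal illustration: the names `top_n` and `cat_curve` are hypothetical, genes are assumed to be ranked from highest to lowest log-ratio (the source does not say whether signed or absolute values were used), and the log-ratio values in the example are invented for demonstration only.

```python
def top_n(log_ratios, n):
    """Set of the n genes with the highest log-ratio."""
    ranked = sorted(log_ratios, key=log_ratios.get, reverse=True)
    return set(ranked[:n])


def cat_curve(array_a, array_b, sizes):
    """Correspondence at the top: genes shared by the two top-N lists, for each N."""
    return [(n, len(top_n(array_a, n) & top_n(array_b, n))) for n in sizes]


# Invented example values; not data from the application.
array_a = {"GAL1": 3.0, "GAL2": 2.5, "GAL7": 2.0, "ACT1": 0.0, "TY1": -1.3}
array_b = {"GAL1": 2.8, "GAL7": 2.6, "GAL2": 1.9, "ACT1": 0.1, "TY1": -1.2}
curve = cat_curve(array_a, array_b, sizes=[1, 2, 3])
```

Plotting the second element of each pair against N gives one line of a CAT plot; two arrays that agree perfectly would share all N genes at every list size.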
- the practice of the present invention may employ, unless otherwise indicated, conventional techniques and descriptions of organic chemistry, polymer technology, molecular biology (including recombinant techniques), cell biology, biochemistry, and immunology, which are within the skill of the art.
- Such conventional techniques include polymer array synthesis, hybridization, ligation, and detection of hybridization using a label. Specific illustrations of suitable techniques can be had by reference to the example herein below. However, other equivalent conventional procedures can, of course, also be used.
- Such conventional techniques and descriptions can be found in standard laboratory manuals such as Genome Analysis: A Laboratory Manual Series (Vols.
- the present invention can employ solid substrates, including arrays in some preferred embodiments.
- Methods and techniques applicable to polymer (including protein) array synthesis have been described in U.S. Ser. No. 09/536,841, WO 00/58516, U.S. Pat. Nos.
- An “array” is an arrangement of objects in space in which each object occupies a separate predetermined spatial position.
- Each of the objects in the array of this invention comprises one or more species of chemical moiety attached to a “discrete physical entity”, such that the physical location of each species is known or ascertainable.
- a “discrete physical entity” is a unit of substantially planar material (e.g., a solid material, a membrane, a gel or a combination of materials) that can be handled and still maintain its identity, and can be subdivided into “tiles” for recombining in various ways to form a physical array.
- the tiles will have regular geometric shapes, e.g., a sector of a circle, a rectangle, and the like, with radial or linear dimensions of about 100 mm to about 10 mm, most preferably about 1 µm to about 1000 µm.
- the subdivision of the entity into tiles can be made either before or after attachment of the chemical moiety, and by any suitable method for cutting the entity, e.g., with a dicing saw. These methods are well known in the art of semiconductor chip manufacture and can be optimized by one skilled in the art for the particular material selected for use in this invention.
- a “support” is a surface or structure for the attachment of tiles.
- the “support” may be of any desired shape and size and can be fabricated from a variety of materials.
- the support material can be treated for biocompatibility (i.e., to protect biological samples and probes from undesired structure or activity changes upon contact with the support surface) and to reduce non-specific binding of biological materials to the support. These procedures are well known in the art (see, e.g., Schoneich et al, Anal. Chem. 65: 67-84R (1993)).
- the tiles can be attached to the support by means of an adhesive, by insertion into a pocket or channel formed in the support, or by any other means that will provide a stable and secure spatial arrangement.
- Tiling is the process of forming an array by picking and placing individual tiles comprising single or multiple species of chemical moieties (referred to as “features”) on a support in a fixed spatial pattern.
- Multi-tiling refers, for example, to an array in which the individual features contain two or more non-contiguous sequences directly or indirectly associated or bound to form the feature.
- the multi-tiled arrays are useful, for example, for complex, two-color, whole-genome hybridizations, transcriptional profiling, mapping gene expression and structure, analyzing polymorphisms, determining protein binding targets, and examining genome architecture.
- the genome tiling microarrays allow for the unbiased selection of densely spaced features. As an example, double-tiling effectively doubles the number of sequences fitting on any given array as each feature has an inner and an outer probe.
- for example, each 60-mer feature of a DNA oligonucleotide microarray may comprise two concatenated 30-mers.
- the features may be, for example, in the context of a double-tiled array, from between about 10 to about 200-mers.
- the features may be made of two 5, 10, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 85, 90, 95, or 100-mers.
- the oligonucleotide features in a double-tiled array may be concatenated, spaced by a linker to which they are both bound or associated, or otherwise attached or associated to form a feature of the array.
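As a minimal sketch of the double-tiling idea, the Python below joins two 30-mers into one 60-mer feature and splits it again. The function names, the example sequences, and the convention of writing the inner (substrate-proximal) probe first are assumptions made for illustration; the optional `linker` argument is likewise only a placeholder string.

```python
def make_double_tiled_feature(inner_probe, outer_probe, linker=""):
    """Join two probes into one feature string.

    Writing the inner (substrate-proximal) probe first is a convention of
    this sketch, not something the text prescribes.
    """
    return inner_probe + linker + outer_probe


def split_double_tiled_feature(feature, inner_length, linker_length=0):
    """Recover the inner and outer probes from a feature string."""
    inner = feature[:inner_length]
    outer = feature[inner_length + linker_length:]
    return inner, outer


# Two hypothetical, unrelated 30-mers (illustrative sequences only).
inner = "ACGT" * 7 + "AC"
outer = "TTGCA" * 6

feature = make_double_tiled_feature(inner, outer)
print(len(feature))  # 60
```

Because each feature carries two probes, an array with a fixed number of features can represent twice as many sequences as a single-tiled design.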
- the features of a multi-tiled array may be arranged in linear, non-linear, or random patterns.
- the inner probe of the feature, which is directly or indirectly bound or associated with the substrate, may be in a horizontal arrangement while the outer probe of the feature is in a vertical arrangement, or vice versa.
- One of the features may also be in, for example, a diagonal arrangement.
- the inner probe is in a diagonal arrangement
- the middle probe is in a horizontal arrangement
- the outer probe is in a vertical arrangement.
- the probes of a feature are unrelated in genomic coordinate or sequence arrangement from the other probes of a feature.
- the positions of the sequences of the features may be randomized to reduce potential spatial artifacts.
- probes in one arrangement will span contiguous sequences or may be separated by some distance.
- the inner probes of a feature may be separated by from about 10 to about 500 nucleotides.
- the probes may be separated by about 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 120, 130, 140, 150, 160, 170, 180, 190, or about 200 nucleotides.
- the probes may be separated by any number of nucleotides determined to give the optimal sequence coverage as determined by one of skill in the art depending on the purpose of the array or the experiment or diagnostic the array is being used for.
- the fluorescent polynucleotides will span a contiguous set of sequences or probes on an array illuminating a line of features.
- By examining fluorescent feature adjacency, one can easily determine whether the inner or the outer probe is bound, as a fluorescent molecule binding one outer probe will bind several adjacent outer probes, illuminating a horizontal vs. vertical line of features. If the features are randomized, they can be computationally “derandomized” and the adjacency patterns will be apparent.
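The adjacency reasoning above can be sketched as follows. The sketch assumes, purely for illustration, that inner probes are tiled along rows and outer probes along columns, that lit features are given as a set of (row, column) grid positions after any derandomization, and that the longer run of lit neighbors decides the call. The function names and the example grid are hypothetical.

```python
def run_length(lit, row, col, d_row, d_col):
    """Count consecutive lit features through (row, col) along one axis."""
    length = 1
    for sign in (1, -1):
        r, c = row + sign * d_row, col + sign * d_col
        while (r, c) in lit:
            length += 1
            r, c = r + sign * d_row, c + sign * d_col
    return length


def assign_probe(lit, row, col):
    """Guess which probe of a lit feature is bound from its lit neighbors.

    Assumes inner probes follow rows (horizontal) and outer probes follow
    columns (vertical); the opposite layout would swap the labels.
    """
    horizontal = run_length(lit, row, col, 0, 1)
    vertical = run_length(lit, row, col, 1, 0)
    if horizontal > vertical:
        return "inner"
    if vertical > horizontal:
        return "outer"
    return "ambiguous"


# Hypothetical lit features: one horizontal run and one vertical run.
lit_features = {(2, 1), (2, 2), (2, 3), (2, 4), (0, 7), (1, 7), (2, 7)}

print(assign_probe(lit_features, 2, 2))  # inner
print(assign_probe(lit_features, 1, 7))  # outer
```

An isolated lit feature has equal run lengths in both directions and is reported as ambiguous, so it would need additional evidence to deconvolute.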
- An array may be made of any number of features as known in the art. For example, a 44,000-feature (60-mer) array (Agilent Technologies Inc.) spanning the entire Saccharomyces cerevisiae genome may be used. Arrays for other genomes, e.g., those of vertebrates, mammals, plants, etc., may be designed as described herein or by other methods known to those of skill in the art. To adequately cover a genome, repetitive sequences (e.g., retrotransposons and long terminal repeats (LTRs), telomeres, and X and Y′ elements) may be masked at the feature selection stage. An array may also contain positive and/or negative controls.
- Positive controls may be made of sequences that are known to be present in a sample of interest, or of sequences that are added to a sample, with features for those sequences added to the array.
- Exemplary positive controls include the Ty1 sequences for a yeast array.
- programs such as Primer3 9 and the like may be used to choose oligonucleotides with the lowest likelihood of conformational problems. Sequences may also be selected randomly or by any other method suitable for a particular purpose.
- Deconvolution refers to computationally or otherwise analyzing which probe in a feature is bound by sample. None of the probes, one or more of the probes, or each probe of a feature may be bound by sample.
- a “chemical moiety” is an organic or inorganic molecule that is preformed at the time of attachment to a discrete physical moiety, in distinction to an organic molecule that is synthesized in situ on an array surface.
- the preferred mode of attachment is by covalent bonding, although noncovalent means of attachment or immobilization might be appropriate depending on the particular type of chemical moiety that is used.
- a “chemical moiety” can be covalently modified by the addition or removal of groups after the moiety is attached to a physically distinct entity.
- the chemical moieties of this invention are preferably “bioorganic molecules” of natural or synthetic origin, are capable of synthesis or replication by chemical, biochemical or molecular biological methods, and are capable of interacting with biological systems, e.g., cell receptors, immune system components, growth factors, components of the extracellular matrix, DNA and RNA, and the like.
- the preferred bioorganic molecules for use in the arrays of this invention are “molecular probes” selected from nucleic acids (or portions thereof), proteins (or portions thereof), polysaccharides (or portions thereof), and lipids (or portions thereof), for example, oligonucleotides, peptides, oligosaccharides or lipid groups that are capable of use in molecular recognition and affinity-based binding assays (e.g., antigen-antibody, receptor-ligand, nucleic acid-protein, nucleic acid-nucleic acid, and the like).
- An array may contain different families of bioorganic molecule, e.g., proteins and nucleic acids, but typically will contain two or more species of the same family of molecule, e.g., two or more sequences of oligonucleotide, two or more protein antigens, two or more chemically distinct small organic molecules, and the like.
- An array can be formed from two species of molecule, although it is preferred that the array contain several tens to thousands of species of molecule, preferably from about 50 to about 1000 species. Each species of course can be present in multiple copies if desired.
- an “analyte” is a molecule whose detection is desired and which selectively or specifically binds to a molecular probe.
- An analyte can be the same or different type of molecule as the molecular probe to which it binds.
- complementary refers to the hybridization or base pairing between nucleotides or nucleic acids, such as, for instance, between the two strands of a double stranded DNA molecule or between an oligonucleotide primer and a primer binding site on a single stranded nucleic acid to be sequenced or amplified.
- Complementary nucleotides are, generally, A and T (or A and U), or C and G.
- Two single stranded RNA or DNA molecules are said to be complementary when the nucleotides of one strand, optimally aligned and compared and with appropriate nucleotide insertions or deletions, pair with at least about 80% of the nucleotides of the other strand, usually at least about 90% to 95%, and more preferably from about 98 to 100%.
- complementarity exists when an RNA or DNA strand will hybridize under selective hybridization conditions to its complement.
- selective hybridization will occur when there is at least about 65% complementarity over a stretch of at least 14 to 25 nucleotides, preferably at least about 75%, more preferably at least about 90% complementarity. See M. Kanehisa, Nucleic Acids Res. 12:203 (1984), incorporated herein by reference.
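The percentage thresholds above can be illustrated with a small ungapped comparison. This sketch is a simplification: it assumes both strands are written 5′ to 3′ and have equal length, and it ignores the insertions and deletions that the definition allows. The function names and example sequences are hypothetical; the 65% and 14-nucleotide defaults are taken from the text.

```python
# Watson-Crick pairs named in the text: A with T (or U), and C with G.
PAIRS = {("A", "T"), ("T", "A"), ("A", "U"), ("U", "A"), ("C", "G"), ("G", "C")}


def percent_complementarity(strand, other):
    """Percent of positions in `strand` that pair with antiparallel `other`.

    Both strands are written 5'->3'; no gaps are considered in this sketch.
    """
    if len(strand) != len(other):
        raise ValueError("this ungapped sketch needs equal-length strands")
    paired = sum((a, b) in PAIRS for a, b in zip(strand, reversed(other)))
    return 100.0 * paired / len(strand)


def may_hybridize_selectively(strand, other, min_percent=65.0, min_length=14):
    """Apply the 'at least about 65% over at least 14 nucleotides' rule of thumb."""
    return (len(strand) >= min_length
            and percent_complementarity(strand, other) >= min_percent)


probe = "ACGTACGTACGTACGT"   # hypothetical 16-mer
target = "ACGTACGTACGTACGA"  # one mismatch against the probe

print(percent_complementarity(probe, target))    # 93.75
print(may_hybridize_selectively(probe, target))  # True
```

A real design tool would also weigh base composition, temperature, and salt concentration, which the text notes all affect stringency.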
- detectable moiety means a chemical group that provides a signal.
- the signal is detectable by any suitable means, including spectroscopic, photochemical, biochemical, immunochemical, electrical, optical or chemical means. In certain cases, the signal is detectable by 2 or more means.
- the detectable moiety provides the signal either directly or indirectly.
- a direct signal is produced where the labeling group spontaneously emits a signal, or generates a signal upon the introduction of a suitable stimulus.
- Radiolabels such as ³H, ¹²⁵I, ³⁵S, ¹⁴C or ³²P, and magnetic particles, such as Dynabeads™, are nonlimiting examples of groups that directly and spontaneously provide a signal.
- Labeling groups that directly provide a signal in the presence of a stimulus include the following nonlimiting examples: colloidal gold (40-80 nm diameter), which scatters green light with high efficiency; fluorescent labels, such as fluorescein, Texas red, rhodamine, and green fluorescent protein (Molecular Probes, Eugene, Oreg.), which absorb and subsequently emit light; chemiluminescent or bioluminescent labels, such as luminol, lophine, acridine salts and luciferins, which are electronically excited as the result of a chemical or biological reaction and subsequently emit light; spin labels, such as van
- a detectable moiety provides an indirect signal where it interacts with a second compound that spontaneously emits a signal, or generates a signal upon the introduction of a suitable stimulus.
- Biotin, for example, produces a signal by forming a conjugate with streptavidin, which is then detected. See Hybridization With Nucleic Acid Probes. In Laboratory Techniques in Biochemistry and Molecular Biology; Tijssen, P., Ed.; Elsevier: New York, 1993; Vol. 24.
- An enzyme such as horseradish peroxidase or alkaline phosphatase, that is attached to an antibody in a label-antibody-antibody as in an ELISA assay, also produces an indirect signal.
- a preferred detectable moiety is a fluorescent group.
- Fluorescent groups typically produce a high signal to noise ratio, thereby providing increased resolution and sensitivity in a detection procedure.
- the fluorescent group absorbs light with a wavelength above about 300 nm, more preferably above about 350 nm, and most preferably above about 400 nm.
- the wavelength of the light emitted by the fluorescent group is preferably above about 310 nm, more preferably above about 360 nm, and most preferably above about 410 nm.
- the fluorescent detectable moiety is selected from a variety of structural classes, including the following nonlimiting examples: 1- and 2-aminonaphthalene, p,p′diaminostilbenes, pyrenes, quaternary phenanthridine salts, 9-aminoacridines, p,p′-diaminobenzophenone imines, anthracenes, oxacarbocyanine, marocyanine, 3-aminoequilenin, perylene, bisbenzoxazole, bis-p-oxazolyl benzene, 1,2-benzophenazin, retinol, bis-3-aminopridinium salts, hellebrigenin, tetracycline, sterophenol, benzimidazolyl phenylamine, 2-oxo-3-chromen, indole, xanthen, 7-hydroxycoumarin, phenoxazine, salicylate, strophanthidin, porphyrins
- fluorescent compounds are suitable for incorporation into the present invention.
- Nonlimiting examples of such compounds include the following: dansyl chloride; fluoresceins, such as 3,6-dihydroxy-9-phenylxanthhydrol; rhodamineisothiocyanate; N-phenyl-1-amino-8-sulfonatonaphthalene; N-phenyl-2-amino-6-sulfonatonaphthanlene; 4-acetamido-4-isothiocyanatostilbene-2,2′-disulfonic acid; pyrene-3-sulfonic acid; 2-toluidinonapththalene-6-sulfonate; N-phenyl, N-methyl 2-aminonaphthalene-6-sulfonate; ethidium bromide; stebrine; auroniine-0,2-(9′-anthroyl)palmitate; dansyl phosphatidylethanolamin; N,N′-dio
- Another preferred detectable moiety is colloidal gold.
- the colloidal gold particle is typically 40 to 80 nm in diameter.
- the colloidal gold may be attached to a labeling compound in a variety of ways.
- the linker moiety of the nucleic acid labeling compound terminates in a thiol group (—SH), and the thiol group is directly bound to colloidal gold through a dative bond.
- it is attached indirectly, for instance through the interaction between colloidal gold conjugates of antibiotin and a biotinylated labeling compound.
- the detection of the gold labeled compound may be enhanced through the use of a silver enhancement method. See Danscher et al. J. Histotech 1993, 16, 201-207.
- fragmentation refers to the breaking of nucleic acid molecules into smaller nucleic acid fragments.
- size of the fragments generated during fragmentation can be controlled such that the size of fragments is distributed about a certain predetermined nucleic acid length.
- genome is all the genetic material in the chromosomes of an organism.
- DNA derived from the genetic material in the chromosomes of a particular organism is genomic DNA.
- a genomic library is a collection of clones made from a set of randomly generated overlapping DNA fragments representing the entire genome of an organism.
- hybridization refers to the process in which two single-stranded polynucleotides bind non-covalently to form a stable double-helix polynucleotide; triple-stranded hybridization is also theoretically possible.
- the resulting (usually) double-stranded polynucleotide is a “hybrid.”
- the proportion of the population of polynucleotides that forms stable hybrids is referred to herein as the “degree of hybridization.”
- Hybridizations are usually performed under stringent conditions, for example, at a salt concentration of no more than 1 M and a temperature of at least 25° C.
- conditions of 5×SSPE (750 mM NaCl, 50 mM sodium phosphate, 5 mM EDTA, pH 7.4) and a temperature of 25-30° C. are suitable for allele-specific probe hybridizations.
- for stringent conditions see, for example, Sambrook, Fritsch and Maniatis, “Molecular Cloning: A Laboratory Manual,” 2nd Ed., Cold Spring Harbor Press (1989), which is hereby incorporated by reference in its entirety for all purposes.
- hybridization conditions will typically include salt concentrations of less than about 1 M, more usually less than about 500 mM and preferably less than about 200 mM.
- Hybridization temperatures can be as low as 5° C., but are typically greater than 22° C., more typically greater than about 30° C., and preferably in excess of about 37° C. Longer fragments may require higher hybridization temperatures for specific hybridization. As other factors may affect the stringency of hybridization, including base composition and length of the complementary strands, presence of organic solvents and extent of base mismatching, the combination of parameters is more important than the absolute measure of any one alone.
- hybridization probes are oligonucleotides capable of binding in a base-specific manner to a complementary strand of nucleic acid. Such probes include peptide nucleic acids, as described in Nielsen et al., Science 254, 1497-1500 (1991), and other nucleic acid analogs and nucleic acid mimetics.
- hybridizing specifically to refers to the binding, duplexing, or hybridizing of a molecule only to a particular nucleotide sequence or sequences under stringent conditions when that sequence is present in a complex mixture (for example, total cellular) DNA or RNA.
- isolated nucleic acid as used herein means an object species that is the predominant species present (i.e., on a molar basis it is more abundant than any other individual species in the composition).
- an isolated nucleic acid comprises at least about 50, 80 or 90% (on a molar basis) of all macromolecular species present.
- the object species is purified to essential homogeneity (contaminant species cannot be detected in the composition by conventional detection methods).
- linker group means a group that provides a linking function and that, either alone or in conjunction with appropriate connecting groups, provides appropriate spacing of the Q group from the primary amine (Q-L-NH₂) at such a length and in such a configuration as to allow appropriate reaction with the abasic DNA.
- the term “monomer” as used herein refers to any member of the set of molecules that can be joined together to form an oligomer or polymer.
- the set of monomers useful in the present invention includes, but is not restricted to, for the example of (poly)peptide synthesis, the set of L-amino acids, D-amino acids, or synthetic amino acids.
- “monomer” refers to any member of a basis set for synthesis of an oligomer. For example, dimers of L-amino acids form a basis set of 400 “monomers” for synthesis of polypeptides. Different basis sets of monomers may be used at successive steps in the synthesis of a polymer.
- the term “monomer” also refers to a chemical subunit that can be combined with a different chemical subunit to form a compound larger than either subunit alone.
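The figure of 400 dimer "monomers" mentioned above follows from pairing each of 20 amino acids with each of 20 amino acids (20 × 20 = 400). The short Python sketch below enumerates them; it assumes the 20 standard amino acids in one-letter code, and the variable names are illustrative.

```python
from itertools import product

# One-letter codes of the 20 standard amino acids (an assumption of this sketch).
L_AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Every ordered pair of amino acids forms one dimer of the basis set.
dimer_basis_set = ["".join(pair) for pair in product(L_AMINO_ACIDS, repeat=2)]

print(len(dimer_basis_set))  # 400
print(dimer_basis_set[:3])   # ['AA', 'AC', 'AD']
```

The same enumeration with `repeat=3` would give a basis set of 8000 trimers, which shows how quickly basis sets grow with subunit length.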
- mRNA includes, but is not limited to pre-mRNA transcript(s), transcript processing intermediates, mature mRNA(s) ready for translation and transcripts of the gene or genes, or nucleic acids derived from the mRNA transcript(s). Transcript processing may include splicing, editing and degradation.
- a nucleic acid derived from a mRNA transcript refers to a nucleic acid for whose synthesis the mRNA transcript or a subsequence thereof has ultimately served as a template.
- a cDNA reverse transcribed from a mRNA, an RNA transcribed from that cDNA, a DNA amplified from the cDNA, an RNA transcribed from the amplified DNA, etc. are all derived from the mRNA transcript and detection of such derived products is indicative of the presence and/or abundance of the original transcript in a sample.
- mRNA derived samples include, but are not limited to, mRNA transcripts of a gene or genes, cDNA reverse transcribed from the mRNA, cRNA transcribed from the cDNA, DNA amplified from the genes; RNA transcribed from amplified DNA, and the like.
- nucleic acid library, sometimes referred to as an “array,” as used herein refers to a synthetically or biosynthetically prepared collection of nucleic acids. Arrays may be used, inter alia, to screen for the presence or absence of a nucleic acid in a sample. Arrays of nucleic acids are available in a wide variety of different formats (for example, libraries of cDNAs or libraries of oligos tethered to resin beads, silica chips, or other solid supports). Additionally, the term “array” is meant to include those libraries of nucleic acids which can be prepared by spotting nucleic acids of essentially any length (for example, from 1 to about 1000 nucleotide monomers in length) onto a substrate.
- nucleic acid refers to a polymeric form of nucleotides of any length, either ribonucleotides, deoxyribonucleotides or peptide nucleic acids (PNAs), that comprise purine and pyrimidine bases, or other natural, chemically or biochemically modified, non-natural, or derivatized nucleotide bases.
- the backbone of the polynucleotide can comprise sugars and phosphate groups, as may typically be found in RNA or DNA, or modified or substituted sugar or phosphate groups.
- a polynucleotide may comprise modified nucleotides, such as methylated nucleotides and nucleotide analogs.
- nucleoside, nucleotide, deoxynucleoside and deoxynucleotide generally include analogs such as those described herein. These analogs are those molecules having some structural features in common with a naturally occurring nucleoside or nucleotide such that when incorporated into a nucleic acid or oligonucleoside sequence, they allow hybridization with a naturally occurring nucleic acid sequence in solution.
- these analogs are derived from naturally occurring nucleosides and nucleotides by replacing and/or modifying the base, the ribose or the phosphodiester moiety.
- the changes can be tailor made to stabilize or destabilize hybrid formation or enhance the specificity of hybridization with a complementary nucleic acid sequence as desired.
- nucleic acids may include any polymer or oligomer of pyrimidine and purine bases, preferably cytosine, thymine, and uracil, and adenine and guanine, respectively. See Albert L. Lehninger, PRINCIPLES OF BIOCHEMISTRY, at 793-800 (Worth Pub. 1982). Indeed, the present invention contemplates any deoxyribonucleotide, ribonucleotide or peptide nucleic acid component, and any chemical variants thereof, such as methylated, hydroxymethylated or glucosylated forms of these bases, and the like.
- the polymers or oligomers may be heterogeneous or homogeneous in composition, and may be isolated from naturally-occurring sources or may be artificially or synthetically produced.
- the nucleic acids may be DNA or RNA, or a mixture thereof, and may exist permanently or transitionally in single-stranded or double-stranded form, including homoduplex, heteroduplex, and hybrid states.
- oligonucleotide, sometimes referred to as “polynucleotide,” as used herein refers to a nucleic acid ranging from at least 2, preferably at least 8, and more preferably at least 20 nucleotides in length or a compound that specifically hybridizes to a polynucleotide.
- Polynucleotides of the present invention include sequences of deoxyribonucleic acid (DNA) or ribonucleic acid (RNA) which may be isolated from natural sources, produced by recombination or artificially synthesized and mimetics thereof.
- a further example of a polynucleotide of the present invention may be peptide nucleic acid (PNA).
- the invention also encompasses situations in which there is a nontraditional base pairing such as Hoogsteen base pairing which has been identified in certain tRNA molecules and postulated to exist in a triple helix.
- Polynucleotide and oligonucleotide are used interchangeably in this application.
- polymorphism refers to the occurrence of two or more genetically determined alternative sequences or alleles in a population.
- a polymorphic marker or site is the locus at which divergence occurs. Preferred markers have at least two alleles, each occurring at frequency of greater than 1%, and more preferably greater than 10% or 20% of a selected population.
- a polymorphism may comprise one or more base changes, an insertion, a repeat, or a deletion.
- a polymorphic locus may be as small as one base pair.
- Polymorphic markers include restriction fragment length polymorphisms, variable number of tandem repeats (VNTR's), hypervariable regions, minisatellites, dinucleotide repeats, trinucleotide repeats, tetranucleotide repeats, simple sequence repeats, and insertion elements such as Alu.
- multi-tiled arrays, e.g., double-tiled arrays, are useful for detection of deletion, duplication or insertion polymorphisms.
- probe refers to a surface-immobilized molecule that can be recognized by a particular target. See U.S. Pat. No. 6,582,908 for an example of arrays having all possible combinations of probes with 10, 12, and more bases.
- probes that can be investigated by this invention include, but are not restricted to, agonists and antagonists for cell membrane receptors, toxins and venoms, viral epitopes, hormones (for example, opioid peptides, steroids, etc.), hormone receptors, peptides, enzymes, enzyme substrates, cofactors, drugs, lectins, sugars, oligonucleotides, nucleic acids, oligosaccharides, proteins, and monoclonal antibodies.
- the probes are oligonucleotide analogues which are capable of hybridizing with a target nucleic acid sequence by complementary base-pairing.
- Complementary base pairing includes sequence-specific base pairing, which comprises, e.g., Watson-Crick base pairing or other forms of base pairing such as Hoogsteen base pairing.
- the probes are attached by any appropriate linkage to a support. 3′ attachment is more usual as this orientation is compatible with the preferred chemistry used in solid phase synthesis of oligonucleotides and oligonucleotide analogues (with the exception of, e.g., analogues which do not have a phosphate backbone, such as peptide nucleic acids).
- solid support refers to a material or group of materials having a rigid or semi-rigid surface or surfaces.
- at least one surface of the solid support will be substantially flat, although in some embodiments it may be desirable to physically separate synthesis regions for different compounds with, for example, wells, raised regions, pins, etched trenches, or the like.
- the solid support(s) will take the form of beads, resins, gels, microspheres, or other geometric configurations. See U.S. Pat. No. 5,744,305 for exemplary substrates.
- Target refers to a molecule that has an affinity for a given probe.
- Targets may be naturally-occurring or man-made molecules. Also, they can be employed in their unaltered state or as aggregates with other species. Targets may be attached, covalently or noncovalently, to a binding member, either directly or via a specific binding substance.
- targets which can be employed by this invention include, but are not restricted to, antibodies, cell membrane receptors, monoclonal antibodies and antisera reactive with specific antigenic determinants (such as on viruses, cells or other materials), drugs, oligonucleotides, nucleic acids, peptides, cofactors, lectins, sugars, polysaccharides, cells, cellular membranes, and organelles.
- Targets are sometimes referred to in the art as anti-probes.
- a “Probe Target Pair” is formed when two macromolecules have combined through molecular recognition to form a complex.
- although the methods of the invention have broad applications and are not limited to any particular detection methods, they are particularly suitable for detecting a large number of different transcript features, such as more than 1000, 5000, 10,000 or 50,000.
- Fragmentation of nucleic acids comprises breaking nucleic acid molecules into smaller fragments. Fragmentation of nucleic acid may be desirable to optimize the size of nucleic acid molecules for certain reactions and destroy their three dimensional structure. For example, fragmented nucleic acids may be used for more efficient hybridization of target DNA to nucleic acid probes than non-fragmented DNA. According to a preferred embodiment, before hybridization to a microarray, target nucleic acid should be fragmented to sizes ranging from 50 to 200 bases long to improve target specificity and sensitivity. In a more preferred embodiment, the average size of the fragments obtained is at least 10, 20, 30, 40, 50, 60, 70, 80, 100 or 200 nucleotides. The components of the assay cocktail must also be considered.
- molar ratios of cold to hot nucleotides in the reaction mixture must be considered, as well as the affinity constant, Kₘ, of the enzyme at issue for the analogs in question and for the substrate.
- mRNA or mRNA transcripts include, but are not limited to, pre-mRNA transcript(s), transcript processing intermediates, mature mRNA(s) ready for translation and transcripts of the gene or genes, or nucleic acids derived from the mRNA transcript(s). Transcript processing may include splicing, editing and degradation.
- a nucleic acid derived from an mRNA transcript refers to a nucleic acid for whose synthesis the mRNA transcript or a subsequence thereof has ultimately served as a template.
- a cDNA reverse transcribed from an mRNA, a cRNA transcribed from that cDNA, a DNA amplified from the cDNA, an RNA transcribed from the amplified DNA, etc. are all derived from the mRNA transcript and detection of such derived products is indicative of the presence and/or abundance of the original transcript in a sample.
- mRNA derived samples include, but are not limited to, mRNA transcripts of the gene or genes, cDNA reverse transcribed from the mRNA, cRNA transcribed from the cDNA, DNA amplified from the genes, RNA transcribed from amplified DNA, and the like.
- a fragment, segment, or DNA segment refers to a portion of a larger DNA polynucleotide or DNA.
- a polynucleotide for example, can be broken up, or fragmented into, a plurality of segments.
- Various methods of fragmenting nucleic acid are well known in the art. These methods may be, for example, either chemical or physical in nature.
- Chemical fragmentation may include partial degradation with a DNase; partial depurination with acid; the use of restriction enzymes; intron-encoded endonucleases; DNA-based cleavage methods, such as triplex and hybrid formation methods, that rely on the specific hybridization of a nucleic acid segment to localize a cleavage agent to a specific location in the nucleic acid molecule; or other enzymes or compounds which cleave DNA at known or unknown locations.
- Physical fragmentation methods may involve subjecting the DNA to a high shear rate.
- High shear rates may be produced, for example, by moving DNA through a chamber or channel with pits or spikes, or forcing the DNA sample through a restricted size flow passage, e.g., an aperture having a cross sectional dimension in the micron or submicron scale.
- Other physical methods include sonication and nebulization.
- Combinations of physical and chemical fragmentation methods may likewise be employed, such as fragmentation by heat and ion-mediated hydrolysis. See, for example, Sambrook et al., “Molecular Cloning: A Laboratory Manual,” 3rd Ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (2001) (“Sambrook et al.”), which is incorporated herein by reference for all purposes.
- Useful size ranges may be from 100, 200, 400, 700 or 1000 to 500, 800, 1500, 2000, 4000 or 10,000 base pairs. However, larger size ranges such as 4000, 10,000 or 20,000 to 10,000, 20,000 or 500,000 base pairs may also be useful.
- Patents that describe synthesis techniques in specific embodiments include U.S. Pat. Nos. 5,412,087, 6,147,205, 6,262,216, 6,310,189, 5,889,165, and 5,959,098. Nucleic acid arrays are described in many of the above patents, but the same techniques are applied to polypeptide arrays.
- the present invention also contemplates many uses for polymers attached to solid substrates. These uses include gene expression monitoring, profiling, library screening, genotyping and diagnostics. Gene expression monitoring and profiling methods are shown in U.S. Pat. Nos. 5,800,992, 6,013,449, 6,020,135, 6,033,860, 6,040,138, 6,177,248 and 6,309,822. Genotyping and uses therefor are shown in U.S. Ser. Nos. 60/319,253 and 10/013,598, and U.S. Pat. Nos. 5,856,092, 6,300,063, 5,858,659, 6,284,460, 6,361,947, 6,368,799 and 6,333,179. Other uses are embodied in U.S. Pat. Nos. 5,871,928, 5,902,723, 6,045,996, 5,541,061, and 6,197,506.
- the present invention also contemplates sample preparation methods in certain preferred embodiments.
- the genomic sample may be amplified by a variety of mechanisms, some of which may employ PCR. See, e.g., PCR Technology: Principles and Applications for DNA Amplification (Ed. H. A. Erlich, Freeman Press, NY, N.Y., 1992); PCR Protocols: A Guide to Methods and Applications (Eds. Innis, et al., Academic Press, San Diego, Calif., 1990); Mattila et al., Nucleic Acids Res. 19,4967 (1991); Eckert et al., PCR Methods and Applications 1, 17 (1991); PCR (Eds.
- LCR ligase chain reaction
- Landegren et al. Science 241, 1077 (1988) and Barringer et al. Gene 89:117 (1990)
- transcription amplification Kwoh et al., Proc. Natl. Acad. Sci. USA 86, 1173 (1989) and WO88/10315
- self-sustained sequence replication Guatelli et al., Proc. Nat. Acad. Sci. USA, 87, 1874 (1990) and WO90/06995
- selective amplification of target polynucleotide sequences U.S. Pat. No.
- CP-PCR consensus sequence primed polymerase chain reaction
- AP-PCR arbitrarily primed polymerase chain reaction
- NABSA nucleic acid based sequence amplification
- Other amplification methods that may be used are described in, U.S. Pat. Nos. 5,242,794, 5,494,810, 4,988,617 and in U.S. Ser. No. 09/854,317, each of which is incorporated herein by reference.
- Hybridization assay procedures and conditions will vary depending on the application and are selected in accordance with the general binding methods known, including those referred to in: Maniatis et al., Molecular Cloning: A Laboratory Manual (2nd Ed., Cold Spring Harbor, N.Y., 1989); Berger and Kimmel, Methods in Enzymology, Vol. 152, Guide to Molecular Cloning Techniques (Academic Press, Inc., San Diego, Calif., 1987); Young and Davis, P.N.A.S., 80:1194 (1983). Methods and apparatus for carrying out repeated and controlled hybridization reactions have been described in U.S. Pat. Nos.
- Computer software products of the invention typically include computer readable medium having computer-executable instructions for performing the logic steps of the method of the invention.
- Suitable computer readable media include floppy disk, CD-ROM/DVD/DVD-ROM, hard-disk drive, flash memory, ROM/RAM, magnetic tapes, etc.
- the computer executable instructions may be written in a suitable computer language or combination of several languages. Basic computational biology methods are described in, e.g.
- the present invention may also make use of various computer program products and software for a variety of purposes, such as probe design, management of data, analysis, and instrument operation. See, U.S. Pat. Nos. 5,593,839, 5,795,716, 5,733,729, 5,974,164, 6,066,454, 6,090,555, 6,185,561, 6,188,783, 6,223,127, 6,229,911 and 6,308,170. Additionally, the present invention may have preferred embodiments that include methods for providing genetic information over networks such as the Internet as shown in U.S. patent application Ser. Nos. 10/197,621, 10/063,559 (U.S. Publication No. 20020183936), Ser. Nos. 10/065,868, 10/328,818, 10/328,872, 10/423,403 60/349,546, and 60/482,389.
- the double- (or multiple-) tiling technique can dramatically increase the depth and the breadth of coverage of a wide range of microarray experiments.
- oligonucleotide analogue arrays are used to determine whether there are any differences between a reference sequence and a target oligonucleotide, e.g., whether an individual has a mutation or polymorphism in a known gene.
- the oligonucleotide target is optionally a nucleic acid such as a PCR amplicon, which comprises one or more nucleotide analogues.
- arrays are designed to contain probes exhibiting complementarity to one or more selected reference sequence whose sequence is known. The arrays are used to read a target sequence comprising either the reference sequence itself or variants of that sequence.
- Reference sequences of interest include sequences known to include mutations or polymorphisms associated with phenotypic changes having clinical significance in human patients. For example, the CFTR gene and P53 gene in humans have been identified as the location of several mutations resulting in cystic fibrosis or cancer respectively.
- Other reference sequences of interest include those that serve to identify pathogen microorganisms and/or are the site of mutations by which such microorganisms acquire drug resistance (e.g., the HIV reverse transcriptase gene for HIV resistance).
- Other reference sequences of interest include regions where polymorphic variations are known to occur (e.g., the D-loop region of mitochondrial DNA). These reference sequences also have utility for, e.g., forensic, cladistic, or epidemiological studies.
- although an array of oligonucleotide analogue probes is usually laid down in rows and columns for simplified data processing, such a physical arrangement of probes on the solid substrate is not essential.
- the data from the probes is collected and processed to yield the sequence of a target irrespective of the actual physical arrangement of the probes on, e.g., a chip.
- the hybridization signals from the respective probes are assembled into any conceptual array desired for subsequent data reduction, whatever the physical arrangement of probes on the substrate.
- 60-mer features e.g., probes
- the “inner” 30-mers e.g., the 30 nt bound to the slide
- An “outer stack” of 30-mers which was computationally grafted onto the inner stack, produces 30-mer pairs concatenated into 60-mers (e.g., the probes) ( FIG. 1 a, b ).
- the positions of the sequences can be randomized to reduce potential spatial artifacts.
- bound (e.g., hybridized or associated) fluorescent polynucleotides can span a contiguous set of sequences, illuminating a line of features.
- bound fluorescent polynucleotides e.g., sample
- the illuminated features, depending on which stack is bound, will lie in, for example, a horizontal, vertical, or diagonal line, or in another arranged or shaped design.
- a spacer e.g., chemical
- a 44,000 feature (60-mer) array (Agilent Technologies Inc.) spanning the entire Saccharomyces cerevisiae genome. Repetitive sequences were masked at the feature selection stage (described below). The 30-mers were separated by an average spacing of 123 nucleotides (this spacing is based on the unmasked i.e. nonrepetitive component of the genome). Positive controls included Ty1 sequences, arranged to read “TY” in the center of the array when bound to labeled Ty1 DNA (two other sets of Ty1 controls are present, in both horizontal and vertical arrangements).
- a few yeast sequences were chosen as the sample to be hybridized to the array (see below). Some of the sequences were predicted to bind inner 30-mers, illuminating horizontal lines, and others to bind outer 30-mers, illuminating vertical lines.
- RNA from galactose-grown cells was labeled with Cy5 (red) and RNA from glucose-grown cells with Cy3 (green). Most of the lines were yellow, as expected, indicating that most genes are expressed at comparable levels in the two cultures; however, there were clearly visible red lines present on the array, indicating successful detection of genes upregulated in the galactose-induced culture.
- the errors were assumed to be independent and identically distributed with mean 0, and the least squares method was used.
- the 44,290 × 6,606 design matrix, X, was created with rows representing features and columns representing the open reading frames (ORFs) in the Saccharomyces Gene Database annotation file, with a 1 placed at position x_jk if ORF j is represented on feature k. The 6,606 × 1 vector of true relative gene expression for each gene was then denoted β, and the 44,290 × 1 vectors of log ratios and errors were denoted y and ε, respectively.
- the model could then be written as:
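Given the definitions above, the model would presumably take the standard linear form y = Xβ + ε; that form is an assumption here, since the equation itself is not reproduced in this text. The following is a minimal sketch of a least squares fit under that assumption, using a toy 6-feature, 3-ORF design matrix in place of the 44,290 × 6,606 matrix. The function name `least_squares` and all toy values are hypothetical, and a matrix of the real size would call for a sparse solver rather than this dense routine.

```python
def least_squares(X, y):
    """Ordinary least squares via the normal equations (X^T X) b = X^T y."""
    n, p = len(X), len(X[0])
    # Build the augmented system [X^T X | X^T y].
    A = [[sum(X[k][i] * X[k][j] for k in range(n)) for j in range(p)]
         + [sum(X[k][i] * y[k] for k in range(n))]
         for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("design matrix is rank deficient")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]


# Toy design matrix: rows are features, columns are ORFs (hypothetical data).
# A feature whose inner and outer 30-mers fall in two different ORFs has two 1s.
X = [
    [1, 1, 0],
    [1, 0, 0],
    [0, 1, 0],
    [0, 1, 1],
    [0, 0, 1],
    [1, 0, 1],
]
beta_true = [2.0, 0.0, -1.0]  # assumed per-ORF log ratios
# Noise-free log ratios per feature, so the fit should recover beta_true.
y = [sum(x * b for x, b in zip(row, beta_true)) for row in X]
beta_hat = least_squares(X, y)
```

With measured log ratios the recovered vector would only approximate the true values, and ORFs that never appear alone on a feature can make the system poorly conditioned.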
- 80,897 30-bp features were chosen from the yeast genome in three steps.
- the remaining oligonucleotides (9.7% of the total) were evenly spaced across the gaps without regard to sequence properties.
- the 30-mer sequences were arranged in sequence order, first from left to right and then from top to bottom along the microarray, until the inner stack was filled; the final 60-mers were then created by appending the remaining 30-mers, in order from top to bottom and then left to right, forming the outer stack.
- These double-tiled 44K arrays were synthesized by Agilent Technologies (AMADID #13371).
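The arrangement just described can be sketched as follows, using the didactic 4×4 labels of FIG. 1 (A-P for the inner stack, 1-16 for the outer stack) rather than real 30-mers. The function name `double_tile` and the choice to represent each feature as an (inner, outer) pair are illustrative assumptions; the sketch does not model control features or the orientation of the synthesized 60-mer.

```python
def double_tile(probes, n_rows, n_cols):
    """Pair probes into double-tiled features on an n_rows x n_cols grid.

    The first half of `probes` fills the inner stack left to right, then
    top to bottom; the second half fills the outer stack top to bottom,
    then left to right.
    """
    n = n_rows * n_cols
    if len(probes) != 2 * n:
        raise ValueError("need exactly two probes per feature")
    inner, outer = probes[:n], probes[n:]
    grid = [[None] * n_cols for _ in range(n_rows)]
    for i, seq in enumerate(inner):
        r, c = divmod(i, n_cols)  # row-major position
        grid[r][c] = seq
    for i, seq in enumerate(outer):
        c, r = divmod(i, n_rows)  # column-major position
        grid[r][c] = (grid[r][c], seq)
    return grid


# Stand-in labels from FIG. 1 instead of real 30-mer sequences.
labels = list("ABCDEFGHIJKLMNOP") + [str(i) for i in range(1, 17)]
grid = double_tile(labels, 4, 4)
for row in grid:
    print(" ".join(inner + outer for inner, outer in row))
```

In this layout, consecutive inner probes share a row and consecutive outer probes share a column, which is what lets a contiguous target light up a horizontal or a vertical line.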
- features were chosen from the masked yeast genome; these 60-mer features were, as above, first chosen by Primer3 and then chosen randomly to create enough features at the required density to tile the yeast genome and are described in detail elsewhere (Wheelan S J, Scheifele L Z, Martinez-Murillo F, Irizarry R A, Boeke J D, “Eukaryotic Transposable Elements and Genome Evolution Special Feature: Transposon insertion site profiling chip (CIP-chip),” Proc Natl Acad Sci USA. 2006 103(47):17632-7.).
- the single-tiled 44K arrays were synthesized by Agilent Technologies (AMADID #13306).
- a mixture of plasmids B154 (HIS4 and flanking YCL sequences), YIp1 (HIS3), and pEDB9c (Ty1, URA3, and GAL1 promoter) was used to query the array.
- Each plasmid was digested in three parallel reactions with AluI, MspI, and HpyCH4V.
- the resulting fragments were heat-inactivated, pooled and labeled for hybridization to the microarray as follows: 200 ng DNA was incubated with 36 µg random hexamer in a 23 µl reaction at 100° C. for 2 minutes, then 4° C. for 4 minutes.
- the labeling reaction then proceeded with the addition of 5 µl 10× dNTP (8 mM dATP, dCTP, dGTP, 4 mM dUTP), 5 µl 10× Klenow buffer, 7 µl Klenow (exo-) fragment (5 U/µl), 7 µl H2O, and 2 µl Cy5 dUTP, and was incubated at 37° C. for 2 hours.
- the reaction was stopped with 5 µl 10.5 M EDTA pH 8.0.
- the products were mixed with 450 µl TE and concentrated on a Microcon YM-30 (Amicon catalog #42410) column.
- the products were washed again with 450 µl TE and 10 µl sheared salmon sperm DNA (10 mg/ml), and concentrated again on a Microcon column. The resulting volume was adjusted to 26 µl with the addition of H2O, and SDS and SSC were added to final concentrations of 3× SSC and 0.3% SDS, in a total volume of 32.5 µl. After incubation at 100° C. for 90 seconds and then 37° C. for 30 minutes, the products were spotted onto microarrays and covered with 22 × 60 mm cover slips (VWR catalog #48393 070).
- microarrays were hybridized overnight in a humid chamber at 55° C. In the morning, the arrays were washed in 2× SSC, 0.03% SDS for 5 minutes at 55° C., then in 1× SSC for 5 minutes at room temperature, and finally in 0.2× SSC for 5 minutes at room temperature. Microarrays were allowed to air dry and then scanned in a GenePix 4000B scanner (Axon Instruments), using GenePix Pro 5.1 software.
- TES Tris-HCl, pH 7.5, mM ethylenediaminetetraacetic acid (EDTA), and 0.5% SDS
- 400 µl acid phenol/chloroform was added, and after vortexing briefly, the extracts were incubated at 65° C. for 60 minutes with brief, occasional vortexing.
- the extracts were placed on ice for 5 minutes, then spun at top speed in a microcentrifuge at 4° C. for 5 minutes. The aqueous layer was transferred to a new tube and extracted once more with acid phenol/chloroform.
- RNA was treated with DNase I by incubating 50 µl RNA with 10 µl 10× DNase I buffer, 1 µl DNase I, 2 µl RNasin, and 37 µl water at 37° C. for 30 minutes. 10 µl 25 mM EDTA was added before heat inactivation at 65° C. for 15 minutes. After 1 minute on ice, the RNA was cleaned up with 100 µl phenol/chloroform/isoamyl alcohol, vortexed, and centrifuged for 5 minutes in a microcentrifuge at 13,000 rpm at 4° C.
- RNA concentration was adjusted to 500 ng/µl.
- Yeast RNA was processed using a modification of the Agilent Low RNA Input Fluorescent Linear Amplification protocol (Agilent Technologies Kit, Protocol version 3.3, July 2005; Maitreya Dunham, personal communication).
- RNA samples (400 ng) were denatured for 10 minutes at 65° C. in the presence of T7 promoter primer and nuclease-free water in a total volume of 11.5 µl, and snap cooled for 5 minutes on ice.
- the cDNA synthesis was done using MMLV-RT, DTT, 10 mM dNTP and RNaseOUT (Agilent Technologies Kit) at 40° C. for 2 hours, followed by an enzyme inactivation step for 15 minutes at 65° C.
- 2.4 µl of either cyanine 3-CTP (10 mM) or cyanine 5-CTP (10 mM) were added and incorporated in an in vitro transcription step at 40° C.
- cRNA labeled RNA
- concentrations and sources are proprietary.
- Amplified cRNA was then purified using QIAGEN's QIAquick spin columns as described in the RNeasy Mini Kit (QIAGEN).
- a total of 850 ng labeled cRNA from each sample (Cy3-Cy5 labeled) were mixed and fragmented using the Gene Expression Hybridization Kit (Agilent Technologies) and hybridized to the array for 17 hours at 45° C. (for the double-tiled array) or 55° C. (for the conventional 60-mer array) in the dark.
- the arrays were then washed in solution A (700 ml dH2O, 300 ml 20× SSPE, 20% N-lauroylsarcosine) for 1 minute at RT, followed by 1 minute in wash B (997 ml dH2O, 3 ml 20× SSPE, 0.25 ml 20% N-lauroylsarcosine) at RT, and by a 30 second wash in acetonitrile (100%, anhydrous).
- the arrays were scanned using the Axon GenePix 4000B scanner (Axon Instruments) and the images were analyzed using GenePix Pro 6.0.
- Microarray platform and sample data have been deposited in GEO (accession GSE5721).
Abstract
Described herein are multi-tiling methods that increase the number of features present on an array and methods of making and using the multi-tiled arrays. The arrays are useful, for example, for transcriptional profiling and genomic studies.
Description
- This application claims the benefit of U.S. Provisional Application No. 60/749,484, filed Dec. 12, 2005, the entire contents of which are incorporated herein by reference.
- Microarrays, high-throughput platforms for analyzing gene expression and features of total genomic DNA, among other things, are gaining in popularity, both among researchers, who are discovering ever more applications for their unbiased and broad feature sets, and in the diagnostic industry, where they are used for transcriptional profiling and polymorphism analysis. Microarray analyses are currently limited by the number of individual features that can be placed on each array, making the use of microarrays expensive and time consuming.
- Microarrays, including genome tiling microarrays, are exceptionally powerful tools for querying diverse genomic features, including mapping gene expression and structure, analyzing polymorphisms, determining protein binding targets, and examining genome architecture (refs. 1-4). The utility of genome tiling microarrays lies in the unbiased selection of densely spaced features. Current microarrays and studies using them are restricted both by expense (the number of arrays or slides purchased) and by spatial limitations of microarray technology (the number of features on each array). Thus, there is a need in the art to increase the number of sequences present on an array to provide cost and time savings.
- Described herein is a multi-tiling method that significantly increases the number of features (e.g., sequences) present on an array, together with methods of making and using the multi-tiled array. For example, described herein for the first time is successful transcriptional profiling using the multi-tiled array format. The described arrays and methods provide cost and time savings and preserve precious samples. Using this method, we and others can now save money and precious samples by using fewer arrays to cover a region, or can perform investigations at significantly higher resolution without increasing costs or the amount of sample required for the experiment.
- One aspect describes a double-tiling technique that effectively doubles the number of features (e.g., sequences) fitting on any given array. For example, the double-tiling array is useful for complex, two-color, whole-genome hybridizations.
- Provided herein, according to one aspect are multi-tiled nucleic acid arrays comprising an immobilized array of nucleic acid features, wherein each feature comprises an inner probe and an outer probe, wherein the inner and outer probes are unrelated in genomic coordinates.
- In one embodiment, one of the inner or the outer probe is arranged horizontally and the other is arranged vertically. In a related embodiment, the features of the array further comprise middle probes between the inner and the outer probes, wherein the probes are unrelated in genomic coordinates. In another related embodiment, the features of the array further comprise second middle probes between the inner and the middle probes, wherein the probes are unrelated in genomic coordinates.
- In one embodiment, the array may further comprise at least one positive control feature.
- In one embodiment, the array may further comprise at least one negative control feature.
- In one embodiment, the multi-tiled array comprises from between about 100 to about 3 billion features. In a related embodiment, the multi-tiled array comprises from between about 10,000 to about 10 million features. In a related embodiment, the multi-tiled array comprises from between about 1000 to about 5 million features. The arrays described herein may have any number of features as determined appropriate by one of skill in the art for a particular purpose.
- Provided herein, according to one aspect are multi-tiled nucleic acid arrays comprising an immobilized array of nucleic acid features, wherein the features comprise an inner probe, a middle probe, and an outer probe, wherein the probes are unrelated in genomic coordinates.
- In one embodiment, the probes are from between about 10 nucleotides to about 50 nucleotides in length. In a related embodiment, the probes are from between about 15 nucleotides to about 40 nucleotides in length. In another related embodiment, the probes are from between about 20 nucleotides to about 35 nucleotides in length. In a related embodiment, the probes are 30 nucleotides in length.
- In one embodiment, the inner, middle, and outer probes are arranged horizontally, vertically and diagonally, respectively or in any order. The probes on a multi-tiled array of a certain layer are arranged in one manner different from those in another layer. It does not matter which layer is arranged in which manner. Layers of probes may also be arranged in non-linear or random patterns.
- In one embodiment, the features further comprise spacers between the inner and the middle probe and between the middle and the outer probe.
- Provided herein, according to one aspect are multi-tiled nucleic acid arrays comprising an immobilized array of nucleic acid features, wherein the features comprise four probes, an inner probe, a middle probe, a second middle probe, and an outer probe, wherein the probes are unrelated in genomic coordinates.
- In one embodiment the probes are from between about 10 nucleotides to about 50 nucleotides in length.
- In one embodiment, the probes are arranged horizontally, vertically, diagonally upper left to lower right and diagonally lower left to upper right. In a related embodiment, the features further comprise spacers between the inner and the middle probe and between the middle and the outer probe.
- Provided herein, according to one aspect are multi-tiled nucleic acid arrays comprising an immobilized array of nucleic acid features, wherein the features comprise at least two probes unrelated in genomic coordinates. In a related embodiment, the features comprise three probes. In another related embodiment, the features comprise four probes.
- Provided herein, according to one aspect are methods of expression (transcriptional) profiling, comprising providing a multi-tiled array, hybridizing a labeled sample to the array; and analyzing the array.
- In one embodiment, the array comprises portions of at least one genome. Exemplary genomes include, for example, mammals, yeast, bacteria, plants, and the like.
- In one embodiment, the profiling further comprises comparing the expression profile of a sample to an expression profile reference.
- In one embodiment, the sample is a clinical sample.
- In one embodiment, analyzing the array comprises deconvolution of a signal.
- In one embodiment, the analyzing determines an expression profile of a sample.
- In one embodiment, the method of expression profiling evaluates a subject for a condition.
- In one embodiment, the condition is a disease condition.
- In one embodiment, the method of expression profiling diagnoses a subject for a condition. In a related embodiment, the method of expression profiling monitors a subject for a condition. In another related embodiment, the subject is a human.
- Provided herein, according to one aspect are methods of constructing a multi-tiled array (of increasing features of an array), comprising selecting probe sequences; arranging inner probe sequences in sequence order, and appending outer probe sequences in sequence order to the inner probe sequences.
- In one embodiment, the methods may further comprise masking a genome of an organism prior to selecting probe sequences.
- In one embodiment, one of the inner or the outer probe sequences are arranged horizontally and the other are arranged vertically.
- In one embodiment, the methods may further comprise appending third probe sequences in sequence order to the outer probe sequences.
- In one embodiment, the third probe sequences are arranged diagonally.
- In one embodiment, selecting the probe sequences comprises selecting one or more of random sequence or sequences with low probability of conformational problems.
- In one embodiment, the methods may further comprise randomizing the positions of the sequences. In one embodiment, the methods may further comprise adding a spacer between the inner and the outer probe.
- In one embodiment, the masking comprises masking repetitive genomic sequences.
- In one embodiment, the selecting of the probes comprises separating each probe by at least a distance of 1 to 500 nucleotides. In a related embodiment, the selecting of the probes comprises separating each probe by a distance of between about 1 to about 1,000 nucleotides.
- Provided herein, according to one aspect are methods of array based evaluation of a sample, comprising providing a multi-tiled array; hybridizing a sample to the array; and deconvoluting signal intensities.
- In one embodiment, the methods may further comprise analyzing the signal intensities.
- In one embodiment, the methods may further comprise examining fluorescent feature adjacency to determine whether the inner or outer probe was hybridized.
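As a rough illustration of the adjacency idea, the sketch below labels each illuminated feature by whether its lit neighbors lie in the same row or the same column. It assumes the inner stack is tiled horizontally and the outer stack vertically (as in FIG. 1), and that signal intensities have already been thresholded into a lit/unlit grid. The function name `classify_by_adjacency` and the toy grid are hypothetical and not part of the described method.

```python
def classify_by_adjacency(lit):
    """Label each lit feature 'inner', 'outer', or 'ambiguous'.

    `lit` is a 2D grid of 0/1 values (1 = feature above a signal
    threshold). Lit neighbors in the same row suggest inner-probe
    binding; lit neighbors in the same column suggest outer-probe
    binding (an assumption matching the FIG. 1 layout).
    """
    n_rows, n_cols = len(lit), len(lit[0])
    calls = {}
    for r in range(n_rows):
        for c in range(n_cols):
            if not lit[r][c]:
                continue
            horiz = sum(lit[r][cc] for cc in (c - 1, c + 1) if 0 <= cc < n_cols)
            vert = sum(lit[rr][c] for rr in (r - 1, r + 1) if 0 <= rr < n_rows)
            if horiz > vert:
                calls[(r, c)] = "inner"
            elif vert > horiz:
                calls[(r, c)] = "outer"
            else:
                calls[(r, c)] = "ambiguous"
    return calls


# Toy grid: a horizontal run in row 1, a vertical run in column 4,
# and one isolated lit feature at (4, 0).
lit = [
    [0, 0, 0, 0, 0],
    [1, 1, 1, 1, 0],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 0, 1],
    [1, 0, 0, 0, 1],
]
calls = classify_by_adjacency(lit)
```

Features at the intersection of a horizontal and a vertical line, or isolated lit features, come out as ambiguous here and would need a model-based treatment instead.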
- In one embodiment, the signal is a fluorescent or color signal. In one embodiment, the methods may further comprise preparing a sample. In a related embodiment, preparing the sample comprises one or more of digesting a sample, labeling a digested sample, and purifying sample. In a related embodiment, deconvoluting comprises visualizing the microarray and examining the data obtained from the microarray.
- In one embodiment, digesting a sample for cDNA synthesis may be by using MMLV-RT, DTT, 10 mM dNTP and RNaseOUT (Agilent Technologies Kit) or Agilent Low RNA Input Linear Amplification Kit. In one embodiment, labeling a digested sample is by in vitro transcription. In another embodiment, purifying the sample is, for example, by QIAGEN's QIAquick spin columns as described in the RNeasy Mini Kit (QIAGEN).
- In another embodiment, deconvoluting comprises visualizing the microarray. In a related embodiment, the visualizing is, for example, by Axon GenePix 4000B scanner (Axon Instruments). In another embodiment, the data generated from the deconvolution and the visualization is examined, for example, by using GenePix Pro 6.0.
- Provided herein, according to one aspect are methods of polymorphism analysis comprising providing a multi-tiled nucleic acid array of probes comprising a first set of probes spanning each of a collection of polymorphic sites in known sequences of unknown function and complementary to first allelic forms of the sites, and a second set of probes spanning each of the polymorphic sites in the collection and complementary to second allelic forms of the sites, wherein the collection of polymorphic sites includes at least 10 unlinked polymorphic sites; and hybridizing a nucleic acid sample from a subject to the array of probes and analyzing the hybridization intensities of probes in the first and second probe sets to determine a profile of polymorphic forms present in the individual.
- Provided herein, according to one aspect are methods for constructing a multi-tiled chemical array comprising a plurality of features of bioorganic molecules in a predetermined arrangement, comprising providing a substantially planar solid material having an attachment surface; and attaching the features of bioorganic molecules onto the attachment surface, wherein the features comprise an inner probe and an outer probe, wherein the inner and outer probes are unrelated in genomic coordinates.
- In one embodiment, the array comprises from about 50 to about 3 billion (3×10^9) different features of the bioorganic molecules and wherein the bioorganic molecules are attached to the surface of each tile at a density of about 1000 to 100,000 bioorganic molecules per square micron of the attachment surface.
- In one embodiment, the material comprises a solid nonporous material selected from the group consisting of a glass, a silicon, and a plastic.
- In one embodiment, the methods may further comprise bringing the constructed array into contact with a sample.
- In one embodiment, the methods may further comprise performing a quality test on the attachment surface after the attaching.
- In one embodiment, the methods may further comprise verifying the fidelity of the bioorganic molecules on the attachment surface.
- In one embodiment, the methods may further comprise verifying the density of attachment of the bioorganic molecules on the attachment surface.
- In one embodiment, the bioorganic molecules are presynthesized before attachment onto the surface.
- Provided herein, according to one aspect are kits for use in expression profiling of a nucleic acid comprising a multi-tiled nucleic acid array, and instructions for use.
- FIG. 1 depicts a sample 4×4 array for didactic purposes. (a) The sequence to be tiled is split into two equal-length segments, represented here as first half, A-P; second half, 1-16. 30-mers from each half-sequence are tiled separately, A-P (inner stack) horizontally and 1-16 (outer stack) vertically. (b) Outer stack tiles are overlaid on inner stack tiles and the 32 30-mers are concatenated to form 16 60-mers.
- FIG. 2 depicts a plasmid experiment in which results agree well with predictions. (a) Virtual array, produced in HTML by Perl scripts, showing the idealized hybridization of the plasmid mixture to the features. The signal from HIS4 and adjacent sequences (YCL plasmid) is discontinuous due to disruptions by mandatory Agilent control features. (b) Actual experimental results, showing illumination of features by binding to the fluorescent extract. Inset: detail of intersection of horizontal and vertical lines. (c) Overlay of virtual and experimental results. Red indicates features expected to be bound that are actually bound in (b). Yellow dots (5.6% of total features shown) are predicted to hybridize but do not actually hybridize at high levels. Blue dots (also 5.6% of total) indicate features that are bound experimentally but not expected to hybridize, given the pattern in (a).
- FIG. 3 depicts a two-color double-tiled array clearly demonstrating galactose induction. A section of the two-color double-tiled array, showing red signal in lines resulting from hybridization of Cy5-labeled RNA from galactose-induced cultures along with Cy3-labeled RNA from glucose-grown cultures. Most lines are yellow, indicating that, as expected, most genes are expressed at similar levels in the glucose- and galactose-grown cultures. The features illuminated in a horizontal red line are derived from GAL1; the vertical red line is signal from GAL2. Unexpectedly, native Ty1 sequences were found to be downregulated approximately 2.5 fold by galactose induction; this conclusion was confirmed by real-time RT-PCR.
- FIG. 4 depicts double-tiled arrays showing low between-array variation. Box plots show the distribution of differences between estimated relative expression obtained from replicate RNA samples. Ideally, these differences should be 0; thus, tighter box plots are associated with better precision. The first box plot (green) represents the data from double-tiled arrays and the second plot represents data from conventional single-tiled arrays.
- FIG. 5 shows correspondence at the top (CAT) plots. Correspondence, shown on the y-axis, is defined as the number of genes in common in lists formed by ranking genes by their log-ratios and keeping the top N. The size of the list N is varied and shown on the x-axis. In this plot we show correspondence between arrays hybridized to replicate samples. The blue line shows correspondence between two replicate single-tiled arrays, the red line represents correspondence between two replicate double-tiled arrays, and the green line shows the average correspondence between single-tiled and double-tiled arrays (there are 4 possible comparisons, all shown in thinner lines). The yellow area represents a 99.9% critical region for the null hypothesis of no correspondence, i.e., anything outside this region attains a p-value of less than 0.001.
- Before the invention is described in detail, it is to be understood that this invention is not limited to the particular component parts or process steps of the methods described, as such parts and methods may vary. It is also to be understood that the terminology used herein is for purposes of describing particular embodiments only, and is not intended to be limiting. As used in the specification and the appended claims, the singular forms “a”, “an” and “the” include plural referents unless the context clearly indicates otherwise.
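The correspondence-at-the-top measure described for FIG. 5 can be illustrated with a short sketch. It assumes genes are ranked from highest to lowest log-ratio (the text does not state the direction or whether absolute values are used), and the function name `cat_curve`, the gene labels, and the values are all made up for illustration.

```python
def cat_curve(log_ratios_a, log_ratios_b, sizes):
    """Number of genes shared by the top-N lists of two arrays, for each N.

    Each argument maps gene name -> log ratio. Genes are ranked from the
    highest log ratio down (an assumed ranking direction).
    """
    rank_a = sorted(log_ratios_a, key=log_ratios_a.get, reverse=True)
    rank_b = sorted(log_ratios_b, key=log_ratios_b.get, reverse=True)
    return [len(set(rank_a[:n]) & set(rank_b[:n])) for n in sizes]


# Hypothetical log ratios from two replicate arrays.
array_a = {"g1": 2.0, "g2": 1.5, "g3": 0.1, "g4": -0.3}
array_b = {"g1": 1.8, "g3": 1.2, "g2": 0.9, "g4": -0.5}
curve = cat_curve(array_a, array_b, [1, 2, 3])
print(curve)  # → [1, 1, 3]
```

Plotting these counts against N for each pair of arrays gives curves of the kind the figure describes; the critical region for the null hypothesis is not computed here.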
- Throughout this disclosure, various aspects of this invention can be presented in a range format. It should be understood that the description in range format is merely for convenience and brevity and should not be construed as an inflexible limitation on the scope of the invention. Accordingly, the description of a range should be considered to have specifically disclosed all the possible subranges as well as individual numerical values within that range. For example, description of a range such as from 1 to 6 should be considered to have specifically disclosed subranges such as from 1 to 3, from 1 to 4, from 1 to 5, from 2 to 4, from 2 to 6, from 3 to 6 etc., as well as individual numbers within that range, for example, 1, 2, 3, 4, 5, and 6. This applies regardless of the breadth of the range.
- The practice of the present invention may employ, unless otherwise indicated, conventional techniques and descriptions of organic chemistry, polymer technology, molecular biology (including recombinant techniques), cell biology, biochemistry, and immunology, which are within the skill of the art. Such conventional techniques include polymer array synthesis, hybridization, ligation, and detection of hybridization using a label. Specific illustrations of suitable techniques can be had by reference to the example herein below. However, other equivalent conventional procedures can, of course, also be used. Such conventional techniques and descriptions can be found in standard laboratory manuals such as Genome Analysis: A Laboratory Manual Series (Vols. I-IV), Using Antibodies: A Laboratory Manual, Cells: A Laboratory Manual, PCR Primer: A Laboratory Manual, and Molecular Cloning: A Laboratory Manual (all from Cold Spring Harbor Laboratory Press), Stryer, L. (1995) Biochemistry (4th Ed.) Freeman, N.Y., Gait, “Oligonucleotide Synthesis: A Practical Approach” 1984, IRL Press, London, Nelson and Cox (2000), Lehninger, Principles of Biochemistry 3rd Ed., W.H. Freeman Pub., New York, N.Y. and Berg et al. (2002) Biochemistry, 5th Ed., W.H. Freeman Pub., New York, N.Y., all of which are herein incorporated in their entirety by reference for all purposes. The present invention can employ solid substrates, including arrays in some preferred embodiments. Methods and techniques applicable to polymer (including protein) array synthesis have been described in U.S. Ser. No. 09/536,841, WO 00/58516, U.S. Pat. Nos. 
5,143,854, 5,242,974, 5,252,743, 5,324,633, 5,384,261, 5,405,783, 5,424,186, 5,451,683, 5,482,867, 5,491,074, 5,527,681, 5,550,215, 5,571,639, 5,578,832, 5,593,839, 5,599,695, 5,624,711, 5,631,734, 5,795,716, 5,831,070, 5,837,832, 5,856,101, 5,858,659, 5,936,324, 5,968,740, 5,974,164, 5,981,185, 5,981,956, 6,025,601, 6,033,860, 6,040,193, 6,090,555, 6,136,269, 6,269,846 and 6,428,752, in PCT Applications Nos. PCT/US99/00730 (International Publication Number WO 99/36760) and PCT/US01/04285, which are all incorporated herein by reference in their entirety for all purposes.
- In this specification and in the claims that follow, reference will be made to a number of terms which are used as defined below.
- An “array” is an arrangement of objects in space in which each object occupies a separate predetermined spatial position. Each of the objects in the array of this invention comprises one or more species of chemical moiety attached to a “discrete physical entity”, such that the physical location of each species is known or ascertainable. A “discrete physical entity” is a unit of substantially planar material (e.g., a solid material, a membrane, a gel or a combination of materials) that can be handled and still maintain its identity, and can be subdivided into “tiles” for recombining in various ways to form a physical array. Preferably, the tiles will have regular geometric shapes, e.g., a sector of a circle, a rectangle, and the like, with radial or linear dimensions of about 100 mm to about 10 mm, most preferably about 1 μm to about 1000 μm. The subdivision of the entity into tiles can be made either before or after attachment of the chemical moiety, and by any suitable method for cutting the entity, e.g., with a dicing saw. These methods are well known in the art of semiconductor chip manufacture and can be optimized by one skilled in the art for the particular material selected for use in this invention.
- A “support” is a surface or structure for the attachment of tiles. The “support” may be of any desired shape and size and can be fabricated from a variety of materials. The support material can be treated for biocompatibility (i.e., to protect biological samples and probes from undesired structure or activity changes upon contact with the support surface) and to reduce non-specific binding of biological materials to the support. These procedures are well known in the art (see, e.g., Schoneich et al, Anal. Chem. 65: 67-84R (1993)). The tiles can be attached to the support by means of an adhesive, by insertion into a pocket or channel formed in the support, or by any other means that will provide a stable and secure spatial arrangement.
- “Tiling” is the process of forming an array by picking and placing individual tiles comprising single or multiple species of chemical moieties (referred to as “features”) on a support in a fixed spatial pattern.
- “Multi-tiling,” as used herein, refers, for example, to an array in which the individual features contain two or more non-contiguous sequences directly or indirectly associated or bound to form the feature. Multi-tiled arrays are useful, for example, for complex, two-color, whole-genome hybridizations, transcriptional profiling, mapping gene expression and structure, analyzing polymorphisms, determining protein binding targets, and examining genome architecture. Genome tiling microarrays allow for the unbiased selection of densely spaced features. As an example, double-tiling effectively doubles the number of sequences fitting on any given array, as each feature has an inner and an outer probe. In one embodiment of a double-tiled array, each 60-mer feature of a DNA oligonucleotide microarray comprises two concatenated 30-mers. The features of a double-tiled array may be, for example, from about 10-mers to about 200-mers. For example, the features may be made of two 5, 10, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 85, 90, 95, or 100-mers. The oligonucleotides of a feature in a double-tiled array may be concatenated, spaced by a linker to which they are both bound or associated, or otherwise attached or associated to form a feature of the array.
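- As a minimal sketch of the 60-mer embodiment, the following Python fragment cuts two unrelated regions into 30-mers and concatenates one probe from each region into every feature. The helper names `tile` and `double_tile`, the placeholder sequences, and the choice to write the inner probe first are all assumptions for illustration; they are not taken from the disclosure.

```python
def tile(sequence, probe_len=30, step=30):
    """Cut a sequence into probes of probe_len, starting every `step` bases."""
    last_start = len(sequence) - probe_len
    return [sequence[i:i + probe_len] for i in range(0, last_start + 1, step)]


def double_tile(inner_probes, outer_probes):
    """Concatenate one inner and one outer probe per feature (inner first)."""
    return [inner + outer for inner, outer in zip(inner_probes, outer_probes)]


# Two made-up, unrelated 90-base regions (placeholders, not real genome data).
region_a = "ACGTTGCA" * 11 + "AC"
region_b = "GGATCCTA" * 11 + "GG"

inner = tile(region_a)                # three 30-mers
outer = tile(region_b)                # three 30-mers
features = double_tile(inner, outer)  # three 60-mer features

print(len(features), len(features[0]))  # 3 60
```

Three features here carry six probes, which is the doubling of sequence capacity described above.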
- The features of a multi-tiled array may be arranged in linear, non-linear, or random patterns. For example, in the context of a double-tiled array, the inner probes of the features, which are directly or indirectly bound or associated with the substrate, may be in a horizontal arrangement while the outer probes of the features are in a vertical arrangement, or vice versa. One set of probes may also be in, for example, a diagonal arrangement. In a triple-tiled arrangement, for example, the inner probes are in a diagonal arrangement, the middle probes are in a horizontal arrangement and the outer probes are in a vertical arrangement. Each probe of a feature is unrelated in genomic coordinate or sequence arrangement to the other probes of that feature.
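- One way to picture the horizontal and vertical arrangement in a double-tiled array is the following Python sketch. The function name `double_tiled_layout` and the specific indexing scheme (inner probes numbered along rows, outer probes numbered down columns) are assumptions chosen for illustration; other assignments would satisfy the description equally well.

```python
def double_tiled_layout(n_rows, n_cols):
    """Map each grid position (row, col) to (inner_index, outer_index).

    Inner probes are numbered consecutively along rows and outer probes
    consecutively down columns, so probes that are neighbours in the
    genome form horizontal (inner) or vertical (outer) runs of features.
    """
    layout = {}
    for r in range(n_rows):
        for c in range(n_cols):
            inner_index = r * n_cols + c  # advances left to right
            outer_index = c * n_rows + r  # advances top to bottom
            layout[(r, c)] = (inner_index, outer_index)
    return layout


layout = double_tiled_layout(3, 4)
print(layout[(0, 0)], layout[(0, 1)], layout[(1, 0)])  # (0, 0) (1, 3) (4, 1)
```

Moving one step to the right increases the inner index by one, and moving one step down increases the outer index by one, so a transcript covering consecutive inner probes lights a horizontal line while one covering consecutive outer probes lights a vertical line.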
- The positions of the sequences of the features may be randomized to reduce potential spatial artifacts.
- In one embodiment, probes in one arrangement (e.g., the inner probes of the features) will span contiguous sequences or may be separated by some distance. For example, the inner probes may be separated by from about 10 to about 500 nucleotides. The probes may be separated by about 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 120, 130, 140, 150, 160, 170, 180, 190, or about 200 nucleotides. The probes may be separated by any number of nucleotides determined by one of skill in the art to give the optimal sequence coverage, depending on the purpose of the array or the experiment or diagnostic for which the array is being used. For example, in a sample, the fluorescent polynucleotides will span a contiguous set of sequences or probes on an array, illuminating a line of features. By examining fluorescent feature adjacency, one can easily determine whether the inner or the outer probe is bound, because a fluorescent molecule binding one outer probe will also bind several adjacent outer probes, illuminating a horizontal versus a vertical line of features. If the features are randomized, they can be computationally “derandomized” and the adjacency patterns will be apparent.
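- The adjacency reasoning can be sketched in Python as follows. The names `derandomize` and `classify_run`, the position-mapping dictionary, and the convention that inner probes run horizontally and outer probes vertically are assumptions for illustration. The sketch checks only the orientation of the lit features, not whether they are strictly contiguous.

```python
def derandomize(lit_physical, physical_to_design):
    """Translate lit physical positions back to design-grid coordinates."""
    return {physical_to_design[p] for p in lit_physical}


def classify_run(lit_design):
    """Guess which probe tier is bound from the orientation of a lit line.

    Assumes inner probes run horizontally and outer probes vertically.
    """
    if len(lit_design) < 2:
        return "undetermined"  # a single feature has no orientation
    rows = {r for r, _ in lit_design}
    cols = {c for _, c in lit_design}
    if len(rows) == 1:
        return "inner"  # one row, several columns: horizontal line
    if len(cols) == 1:
        return "outer"  # one column, several rows: vertical line
    return "undetermined"


# Hypothetical scrambled placement of a 2x3 design grid.
physical_to_design = {
    (0, 0): (1, 2), (0, 1): (0, 0), (0, 2): (1, 0),
    (1, 0): (0, 2), (1, 1): (1, 1), (1, 2): (0, 1),
}
lit = {(0, 1), (1, 2), (1, 0)}  # scattered on the physical array
design = derandomize(lit, physical_to_design)
tier = classify_run(design)
print(sorted(design), tier)  # [(0, 0), (0, 1), (0, 2)] inner
```

The lit features look unrelated on the physical array, but after mapping back to design coordinates they fall in a single row, which under the stated convention points to the inner probes.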
- An array may be made of any number of features as known in the art. One example is a 44,000-feature (60-mer) array (Agilent Technologies Inc.) spanning the entire Saccharomyces cerevisiae genome. Arrays for other genomes, e.g., those of vertebrates, mammals, plants, etc., may be designed as described herein or by other methods known to those of skill in the art. To adequately cover a genome, repetitive sequences (e.g., retrotransposons and long terminal repeats (LTRs), telomeres, and X and Y′ elements) may be masked at the feature selection stage. An array may also contain positive and/or negative controls. Positive controls may be made of sequences that are known to be in a sample of interest or that are added to a sample, with features for those sequences added to the array. Exemplary positive controls include the Ty1 sequences for a yeast array. In selecting sequences of a genome to be probes, programs such as Primer3 and the like may be used to choose oligonucleotides with the lowest likelihood of conformational problems. Sequences may also be selected randomly or by any other method suitable for a particular purpose.
- “Deconvolution,” as used herein, refers to computationally or otherwise analyzing which probe in a feature is bound by sample. None, one or more, or each of the probes of a feature may be bound by sample. One method of deconvolution is to define $y_i$ as the normalized log ratio of the red versus green intensity for feature $i$, to assume that the contribution of each component is additive, and to use the following linear model: $y_i = \theta_{g_{i1}} + \theta_{g_{i2}} + \varepsilon_i$, where $g_{i1}$ is the index of the inside gene and $g_{i2}$ is the index of the outside gene of feature $i$, $\theta_g$ is the relative expression of gene $g$, and $\varepsilon_i$ represents measurement error. The goal is to estimate $\theta_g$ for all genes $g$. The errors are assumed to be independent and identically distributed with mean 0, and the least squares method is used. In one embodiment, for example with an array having 44,290 features, a 44,290×6,606 design matrix, $X$, is created with rows representing features and columns representing the open reading frames (ORFs) in the Saccharomyces Genome Database annotation file, with a 1 placed at position $x_{jk}$ if ORF $k$ is represented on feature $j$. The 6,606×1 vector of true relative gene expression is denoted $\Theta$, and the 44,290×1 vectors of log ratios and errors are denoted $y$ and $\varepsilon$, respectively. The model can then be written as $y = X\Theta + \varepsilon$, and the least squares solution is $\hat{\Theta} = (X^{T}X)^{-1}X^{T}y$. This is the matrix form of the multiple regression equations. Solving this equation involves inverting a 6,606×6,606 matrix. Because $X$ is extremely sparse, the equation can be solved using the Matrix package in R (http://cran.r-project.org/src/contrib/Descriptions/Matrix.html).
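- The least squares deconvolution can be illustrated on a toy scale with the following Python sketch. It is not the R Matrix implementation mentioned above: it substitutes plain Gaussian elimination on the normal equations, which is practical only for small examples. The function name `deconvolve`, the three-gene design, and the noise-free log ratios are assumptions made for illustration.

```python
def deconvolve(features, log_ratios, n_genes):
    """Least-squares estimate of per-gene relative expression.

    features   : list of (inner_gene, outer_gene) index pairs, one per feature
    log_ratios : normalized log ratio y_i for each feature
    Solves (X^T X) theta = X^T y by Gauss-Jordan elimination.
    """
    # Accumulate X^T X and X^T y without storing the sparse X itself.
    xtx = [[0.0] * n_genes for _ in range(n_genes)]
    xty = [0.0] * n_genes
    for (g1, g2), y in zip(features, log_ratios):
        for a in (g1, g2):
            xty[a] += y
            for b in (g1, g2):
                xtx[a][b] += 1.0

    # Elimination with partial pivoting on the augmented system.
    aug = [row + [rhs] for row, rhs in zip(xtx, xty)]
    for col in range(n_genes):
        pivot = max(range(col, n_genes), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("X^T X is singular; design cannot be deconvolved")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n_genes):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][n_genes] / aug[i][i] for i in range(n_genes)]


# Toy design: true theta = (1, 2, 3), so each y_i is the sum of two thetas.
features = [(0, 1), (1, 2), (0, 2), (0, 1)]
log_ratios = [3.0, 5.0, 4.0, 3.0]
theta = deconvolve(features, log_ratios, n_genes=3)
print([round(t, 6) for t in theta])  # [1.0, 2.0, 3.0]
```

In this sketch a gene must be paired with enough different partners for $X^{T}X$ to be invertible. A single feature pairing two genes, for instance, fixes only their sum, and the function raises an error rather than returning an arbitrary split.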
- A “chemical moiety” is an organic or inorganic molecule that is preformed at the time of attachment to a discrete physical moiety, in distinction to an organic molecule that is synthesized in situ on an array surface. The preferred mode of attachment is by covalent bonding, although noncovalent means of attachment or immobilization might be appropriate depending on the particular type of chemical moiety that is used. If desired, a “chemical moiety” can be covalently modified by the addition or removal of groups after the moiety is attached to a physically distinct entity.
- The chemical moieties of this invention are preferably “bioorganic molecules” of natural or synthetic origin, are capable of synthesis or replication by chemical, biochemical or molecular biological methods, and are capable of interacting with biological systems, e.g., cell receptors, immune system components, growth factors, components of the extracellular matrix, DNA and RNA, and the like. The preferred bioorganic molecules for use in the arrays of this invention are “molecular probes” selected from nucleic acids (or portions thereof), proteins (or portions thereof), polysaccharides (or portions thereof), and lipids (or portions thereof), for example, oligonucleotides, peptides, oligosaccharides or lipid groups that are capable of use in molecular recognition and affinity-based binding assays (e.g., antigen-antibody, receptor-ligand, nucleic acid-protein, nucleic acid-nucleic acid, and the like). An array may contain different families of bioorganic molecule, e.g., proteins and nucleic acids, but typically will contain two or more species of the same family of molecule, e.g., two or more sequences of oligonucleotide, two or more protein antigens, two or more chemically distinct small organic molecules, and the like. An array can be formed from two species of molecule, although it is preferred that the array contain several tens to thousands of species of molecule, preferably from about 50 to about 1000 species. Each species of course can be present in multiple copies if desired.
- An “analyte” is a molecule whose detection is desired and which selectively or specifically binds to a molecular probe. An analyte can be the same or different type of molecule as the molecular probe to which it binds.
- The term “complementary” as used herein refers to the hybridization or base pairing between nucleotides or nucleic acids, such as, for instance, between the two strands of a double stranded DNA molecule or between an oligonucleotide primer and a primer binding site on a single stranded nucleic acid to be sequenced or amplified. Complementary nucleotides are, generally, A and T (or A and U), or C and G. Two single stranded RNA or DNA molecules are said to be complementary when the nucleotides of one strand, optimally aligned and compared and with appropriate nucleotide insertions or deletions, pair with at least about 80% of the nucleotides of the other strand, usually at least about 90% to 95%, and more preferably from about 98 to 100%. Alternatively, complementarity exists when an RNA or DNA strand will hybridize under selective hybridization conditions to its complement. Typically, selective hybridization will occur when there is at least about 65% complementary over a stretch of at least 14 to 25 nucleotides, preferably at least about 75%, more preferably at least about 90% complementary. See, M. Kanehisa Nucleic Acids Res. 12:203 (1984), incorporated herein by reference.
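- The percentage thresholds in this definition can be made concrete with a short Python sketch. The function name `percent_complementary` and the example sequences are assumptions for illustration, and the comparison is deliberately simplified: it pairs two equal-length strands antiparallel without gaps, whereas the definition above allows optimal alignment with insertions or deletions.

```python
PAIRS = {("A", "T"), ("T", "A"), ("A", "U"), ("U", "A"), ("C", "G"), ("G", "C")}


def percent_complementary(strand, other):
    """Percentage of positions that pair when `other` is read antiparallel.

    Both strands are written 5'->3' and must have equal, non-zero length;
    no gaps are introduced.
    """
    if not strand or len(strand) != len(other):
        raise ValueError("ungapped comparison needs equal-length strands")
    paired = sum((a, b) in PAIRS for a, b in zip(strand, reversed(other)))
    return 100.0 * paired / len(strand)


probe = "ACGTACGTAC"
target = "GTACGTACGA"  # reverse complement of probe with one mismatch
score = percent_complementary(probe, target)
print(score, score >= 80.0)  # 90.0 True
```

With nine of ten positions pairing, this toy duplex clears the "at least about 80%" level but falls short of the "about 98 to 100%" range.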
- The term “detectable moiety” (Q) means a chemical group that provides a signal. The signal is detectable by any suitable means, including spectroscopic, photochemical, biochemical, immunochemical, electrical, optical or chemical means. In certain cases, the signal is detectable by 2 or more means.
- The detectable moiety provides the signal either directly or indirectly. A direct signal is produced where the labeling group spontaneously emits a signal, or generates a signal upon the introduction of a suitable stimulus. Radiolabels, such as ³H, ¹²⁵I, ³⁵S, ¹⁴C or ³²P, and magnetic particles, such as Dynabeads™, are nonlimiting examples of groups that directly and spontaneously provide a signal. Labeling groups that directly provide a signal in the presence of a stimulus include the following nonlimiting examples: colloidal gold (40-80 nm diameter), which scatters green light with high efficiency; fluorescent labels, such as fluorescein, Texas red, rhodamine, and green fluorescent protein (Molecular Probes, Eugene, Oreg.), which absorb and subsequently emit light; chemiluminescent or bioluminescent labels, such as luminol, lophine, acridine salts and luciferins, which are electronically excited as the result of a chemical or biological reaction and subsequently emit light; spin labels, such as vanadium, copper, iron, manganese and nitroxide free radicals, which are detected by electron spin resonance (ESR) spectroscopy; dyes, such as quinoline dyes, triarylmethane dyes and acridine dyes, which absorb specific wavelengths of light; and colored glass or plastic (e.g., polystyrene, polypropylene, latex, etc.) beads. See U.S. Pat. Nos. 3,817,837; 3,850,752; 3,939,350; 3,996,345; 4,277,437; 4,275,149 and 4,366,241.
- A detectable moiety provides an indirect signal where it interacts with a second compound that spontaneously emits a signal, or generates a signal upon the introduction of a suitable stimulus. Biotin, for example, produces a signal by forming a conjugate with streptavidin, which is then detected. See Hybridization With Nucleic Acid Probes. In Laboratory Techniques in Biochemistry and Molecular Biology; Tijssen, P., Ed.; Elsevier. New York, 1993; Vol. 24. An enzyme, such as horseradish peroxidase or alkaline phosphatase, that is attached to an antibody in a label-antibody-antibody as in an ELISA assay, also produces an indirect signal.
- A preferred detectable moiety is a fluorescent group. Fluorescent groups typically produce a high signal to noise ratio, thereby providing increased resolution and sensitivity in a detection procedure. Preferably, the fluorescent group absorbs light with a wavelength above about 300 nm, more preferably above about 350 nm, and most preferably above about 400 nm. The wavelength of the light emitted by the fluorescent group is preferably above about 310 nm, more preferably above about 360 nm, and most preferably above about 410 nm.
- The fluorescent detectable moiety is selected from a variety of structural classes, including the following nonlimiting examples: 1- and 2-aminonaphthalene, p,p′-diaminostilbenes, pyrenes, quaternary phenanthridine salts, 9-aminoacridines, p,p′-diaminobenzophenone imines, anthracenes, oxacarbocyanine, merocyanine, 3-aminoequilenin, perylene, bisbenzoxazole, bis-p-oxazolyl benzene, 1,2-benzophenazin, retinol, bis-3-aminopyridinium salts, hellebrigenin, tetracycline, sterophenol, benzimidazolyl phenylamine, 2-oxo-3-chromen, indole, xanthen, 7-hydroxycoumarin, phenoxazine, salicylate, strophanthidin, porphyrins, triarylmethanes, flavin, xanthene dyes (e.g., fluorescein and rhodamine dyes); cyanine dyes; 4,4-difluoro-4-bora-3a,4a-diaza-s-indacene dyes and fluorescent proteins (e.g., green fluorescent protein, phycobiliprotein).
- A number of fluorescent compounds are suitable for incorporation into the present invention. Nonlimiting examples of such compounds include the following: dansyl chloride; fluoresceins, such as 3,6-dihydroxy-9-phenylxanthhydrol; rhodamine isothiocyanate; N-phenyl-1-amino-8-sulfonatonaphthalene; N-phenyl-2-amino-6-sulfonatonaphthalene; 4-acetamido-4-isothiocyanatostilbene-2,2′-disulfonic acid; pyrene-3-sulfonic acid; 2-toluidinonaphthalene-6-sulfonate; N-phenyl, N-methyl 2-aminonaphthalene-6-sulfonate; ethidium bromide; stebrine; auroniine-0,2-(9′-anthroyl)palmitate; dansyl phosphatidylethanolamine; N,N′-dioctadecyl oxacarbocyanine; N,N′-dihexyl oxacarbocyanine; merocyanine, 4-(3′-pyrenyl)butyrate; d-3-aminodesoxy-equilenin; 12-(9′-anthroyl)stearate; 2-methylanthracene; 9-vinylanthracene; 2,2′-(vinylene-p-phenylene)bisbenzoxazole; β-bis[2-(4-methyl-5-phenyl oxazolyl)]benzene; 6-dimethylamino-1,2-benzophenazin; retinol; bis(3′-aminopyridinium)-1,10-decandiyl diiodide; sulfonaphthylhydrazone of hellebrigenin; chlorotetracycline; N-(7-dimethylaminomethyl-2-oxo-3-chromenyl)maleimide; N-[p-(2-benzimidazolyl)phenyl]maleimide; N-(4-fluoranthyl)maleimide; bis(homovanillic acid); resazurin; 4-chloro-7-nitro-2,1,3-benzooxadiazole; merocyanine 540; resorufin; rose bengal and 2,4-diphenyl-3(2H)-furanone. Preferably, the fluorescent detectable moiety is a fluorescein or rhodamine dye.
- Another preferred detectable moiety is colloidal gold. The colloidal gold particle is typically 40 to 80 nm in diameter. The colloidal gold may be attached to a labeling compound in a variety of ways. In one embodiment, the linker moiety of the nucleic acid labeling compound terminates in a thiol group (—SH), and the thiol group is directly bound to colloidal gold through a dative bond. See Mirkin et al. Nature 1996, 382, 607-609. In another embodiment, it is attached indirectly, for instance through the interaction between colloidal gold conjugates of antibiotin and a biotinylated labeling compound. The detection of the gold labeled compound may be enhanced through the use of a silver enhancement method. See Danscher et al. J. Histotech 1993, 16, 201-207.
- The term “effective amount” as used herein refers to an amount sufficient to induce a desired result.
- The term “fragmentation” refers to the breaking of nucleic acid molecules into smaller nucleic acid fragments. In certain embodiments, the size of the fragments generated during fragmentation can be controlled such that the size of fragments is distributed about a certain predetermined nucleic acid length.
- The term “genome” as used herein is all the genetic material in the chromosomes of an organism. DNA derived from the genetic material in the chromosomes of a particular organism is genomic DNA. A genomic library is a collection of clones made from a set of randomly generated overlapping DNA fragments representing the entire genome of an organism.
- The term “hybridization” as used herein refers to the process in which two single-stranded polynucleotides bind non-covalently to form a stable double-helix polynucleotide; triple-stranded hybridization is also theoretically possible. The resulting (usually) double-stranded polynucleotide is a “hybrid.” The proportion of the population of polynucleotides that forms stable hybrids is referred to herein as the “degree of hybridization.” Hybridizations are usually performed under stringent conditions, for example, at a salt concentration of no more than 1 M and a temperature of at least 25° C. For example, conditions of 5×SSPE (750 mM NaCl, 50 mM NaPhosphate, 5 mM EDTA, pH 7.4) and a temperature of 25-30° C. are suitable for allele-specific probe hybridizations. For stringent conditions, see, for example, Sambrook, Fritsch and Maniatis, “Molecular Cloning: A Laboratory Manual,” 2nd Ed., Cold Spring Harbor Press (1989), which is hereby incorporated by reference in its entirety for all purposes.
- The term “hybridization conditions” as used herein will typically include salt concentrations of less than about 1 M, more usually less than about 500 mM and preferably less than about 200 mM. Hybridization temperatures can be as low as 5° C., but are typically greater than 22° C., more typically greater than about 30° C., and preferably in excess of about 37° C. Longer fragments may require higher hybridization temperatures for specific hybridization. Because other factors may affect the stringency of hybridization, including base composition and length of the complementary strands, presence of organic solvents and extent of base mismatching, the combination of parameters is more important than the absolute measure of any one alone.
- The term “hybridization probes” as used herein refers to oligonucleotides capable of binding in a base-specific manner to a complementary strand of nucleic acid. Such probes include peptide nucleic acids, as described in Nielsen et al., Science 254, 1497-1500 (1991), and other nucleic acid analogs and nucleic acid mimetics.
- The term “hybridizing specifically to” as used herein refers to the binding, duplexing, or hybridizing of a molecule only to a particular nucleotide sequence or sequences under stringent conditions when that sequence is present in a complex mixture (for example, total cellular DNA or RNA).
- The term “isolated nucleic acid” as used herein means an object species that is the predominant species present (i.e., on a molar basis it is more abundant than any other individual species in the composition). Preferably, an isolated nucleic acid comprises at least about 50, 80 or 90% (on a molar basis) of all macromolecular species present. Most preferably, the object species is purified to essential homogeneity (contaminant species cannot be detected in the composition by conventional detection methods).
- The term “linker group” (L) as used in connection with the present invention means a group that provides a linking function and that, either alone or in conjunction with appropriate connecting groups, provides appropriate spacing of the Q group from the primary amine (Q-L-NH₂) at such a length and in such a configuration as to allow appropriate reaction with the abasic DNA.
- The term “monomer” as used herein refers to any member of the set of molecules that can be joined together to form an oligomer or polymer. The set of monomers useful in the present invention includes, but is not restricted to, for the example of (poly)peptide synthesis, the set of L-amino acids, D-amino acids, or synthetic amino acids. As used herein, “monomer” refers to any member of a basis set for synthesis of an oligomer. For example, dimers of L-amino acids form a basis set of 400 “monomers” for synthesis of polypeptides. Different basis sets of monomers may be used at successive steps in the synthesis of a polymer. The term “monomer” also refers to a chemical subunit that can be combined with a different chemical subunit to form a compound larger than either subunit alone.
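- The figure of 400 dimer “monomers” follows from counting ordered pairs of amino acids. The short Python sketch below assumes the 20 standard L-amino acids, written in one-letter code; the variable names are illustrative only.

```python
from itertools import product

# One-letter codes for the 20 standard L-amino acids (assumed residue set).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Every ordered pair of residues is one "monomer" of the dimer basis set.
dimer_basis = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]

print(len(dimer_basis))  # 400
```

Order matters in this count, so the dimers "AC" and "CA" are separate members of the basis set.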
- The term “mRNA,” sometimes referred to as “mRNA transcripts,” as used herein, includes, but is not limited to, pre-mRNA transcript(s), transcript processing intermediates, mature mRNA(s) ready for translation and transcripts of the gene or genes, or nucleic acids derived from the mRNA transcript(s). Transcript processing may include splicing, editing and degradation. As used herein, a nucleic acid derived from an mRNA transcript refers to a nucleic acid for whose synthesis the mRNA transcript or a subsequence thereof has ultimately served as a template. Thus, a cDNA reverse transcribed from an mRNA, an RNA transcribed from that cDNA, a DNA amplified from the cDNA, an RNA transcribed from the amplified DNA, etc., are all derived from the mRNA transcript and detection of such derived products is indicative of the presence and/or abundance of the original transcript in a sample. Thus, mRNA derived samples include, but are not limited to, mRNA transcripts of a gene or genes, cDNA reverse transcribed from the mRNA, cRNA transcribed from the cDNA, DNA amplified from the genes, RNA transcribed from amplified DNA, and the like.
- The term “nucleic acid library,” sometimes referred to as an “array,” as used herein refers to a synthetically or biosynthetically prepared collection of nucleic acids. Arrays may be used, inter alia, to screen for the presence or absence of a nucleic acid in a sample. Arrays of nucleic acids are available in a wide variety of different formats (for example, libraries of cDNAs or libraries of oligos tethered to resin beads, silica chips, or other solid supports). Additionally, the term “array” is meant to include those libraries of nucleic acids which can be prepared by spotting nucleic acids of essentially any length (for example, from 1 to about 1000 nucleotide monomers in length) onto a substrate. The term “nucleic acid” as used herein refers to a polymeric form of nucleotides of any length, either ribonucleotides, deoxyribonucleotides or peptide nucleic acids (PNAs), that comprise purine and pyrimidine bases, or other natural, chemically or biochemically modified, non-natural, or derivatized nucleotide bases. The backbone of the polynucleotide can comprise sugars and phosphate groups, as may typically be found in RNA or DNA, or modified or substituted sugar or phosphate groups. A polynucleotide may comprise modified nucleotides, such as methylated nucleotides and nucleotide analogs. The sequence of nucleotides may be interrupted by non-nucleotide components, for example by nucleotide analogs that undergo non-traditional hybridization. Thus the terms nucleoside, nucleotide, deoxynucleoside and deoxynucleotide generally include analogs such as those described herein. These analogs are those molecules having some structural features in common with a naturally occurring nucleoside or nucleotide such that when incorporated into a nucleic acid or oligonucleoside sequence, they allow hybridization with a naturally occurring nucleic acid sequence in solution. 
Typically, these analogs are derived from naturally occurring nucleosides and nucleotides by replacing and/or modifying the base, the ribose or the phosphodiester moiety. The changes can be tailor made to stabilize or destabilize hybrid formation or enhance the specificity of hybridization with a complementary nucleic acid sequence as desired.
- The term “nucleic acids” as used herein may include any polymer or oligomer of pyrimidine and purine bases, preferably cytosine, thymine, and uracil, and adenine and guanine, respectively. See Albert L. Lehninger, PRINCIPLES OF BIOCHEMISTRY, at 793-800 (Worth Pub. 1982). Indeed, the present invention contemplates any deoxyribonucleotide, ribonucleotide or peptide nucleic acid component, and any chemical variants thereof, such as methylated, hydroxymethylated or glucosylated forms of these bases, and the like. The polymers or oligomers may be heterogeneous or homogeneous in composition, and may be isolated from naturally-occurring sources or may be artificially or synthetically produced. In addition, the nucleic acids may be DNA or RNA, or a mixture thereof, and may exist permanently or transitionally in single-stranded or double-stranded form, including homoduplex, heteroduplex, and hybrid states.
- The term “oligonucleotide,” sometimes referred to as “polynucleotide,” as used herein refers to a nucleic acid ranging from at least 2, preferably at least 8, and more preferably at least 20 nucleotides in length or a compound that specifically hybridizes to a polynucleotide. Polynucleotides of the present invention include sequences of deoxyribonucleic acid (DNA) or ribonucleic acid (RNA) which may be isolated from natural sources, produced by recombination or artificially synthesized and mimetics thereof. A further example of a polynucleotide of the present invention may be peptide nucleic acid (PNA). The invention also encompasses situations in which there is a nontraditional base pairing such as Hoogsteen base pairing which has been identified in certain tRNA molecules and postulated to exist in a triple helix. “Polynucleotide” and “oligonucleotide” are used interchangeably in this application.
- The term “polymorphism” as used herein refers to the occurrence of two or more genetically determined alternative sequences or alleles in a population. A polymorphic marker or site is the locus at which divergence occurs. Preferred markers have at least two alleles, each occurring at a frequency of greater than 1%, and more preferably greater than 10% or 20% of a selected population. A polymorphism may comprise one or more base changes, an insertion, a repeat, or a deletion. A polymorphic locus may be as small as one base pair. Polymorphic markers include restriction fragment length polymorphisms, variable number of tandem repeats (VNTR's), hypervariable regions, minisatellites, dinucleotide repeats, trinucleotide repeats, tetranucleotide repeats, simple sequence repeats, and insertion elements such as Alu. For example, multi-tiled arrays (e.g., double-tiled arrays) are useful for detection of deletion, duplication or insertion polymorphisms.
- The term “probe” as used herein refers to a surface-immobilized molecule that can be recognized by a particular target. See U.S. Pat. No. 6,582,908 for an example of arrays having all possible combinations of probes with 10, 12, and more bases. Examples of probes that can be investigated by this invention include, but are not restricted to, agonists and antagonists for cell membrane receptors, toxins and venoms, viral epitopes, hormones (for example, opioid peptides, steroids, etc.), hormone receptors, peptides, enzymes, enzyme substrates, cofactors, drugs, lectins, sugars, oligonucleotides, nucleic acids, oligosaccharides, proteins, and monoclonal antibodies.
- The probes are oligonucleotide analogues which are capable of hybridizing with a target nucleic sequence by complementary base-pairing. Complementary base pairing includes sequence-specific base pairing, which comprises, e.g., Watson-Crick base pairing or other forms of base pairing such as Hoogsteen base pairing. The probes are attached by any appropriate linkage to a support. 3′ attachment is more usual as this orientation is compatible with the preferred chemistry used in solid phase synthesis of oligonucleotides and oligonucleotide analogues (with the exception of, e.g., analogues which do not have a phosphate backbone, such as peptide nucleic acids).
- The terms “solid support”, “support”, and “substrate” are used interchangeably herein and refer to a material or group of materials having a rigid or semi-rigid surface or surfaces. In many embodiments, at least one surface of the solid support will be substantially flat, although in some embodiments it may be desirable to physically separate synthesis regions for different compounds with, for example, wells, raised regions, pins, etched trenches, or the like. According to other embodiments, the solid support(s) will take the form of beads, resins, gels, microspheres, or other geometric configurations. See U.S. Pat. No. 5,744,305 for exemplary substrates.
- The term “target” as used herein refers to a molecule that has an affinity for a given probe. Targets may be naturally-occurring or man-made molecules. Also, they can be employed in their unaltered state or as aggregates with other species. Targets may be attached, covalently or noncovalently, to a binding member, either directly or via a specific binding substance. Examples of targets which can be employed by this invention include, but are not restricted to, antibodies, cell membrane receptors, monoclonal antibodies and antisera reactive with specific antigenic determinants (such as on viruses, cells or other materials), drugs, oligonucleotides, nucleic acids, peptides, cofactors, lectins, sugars, polysaccharides, cells, cellular membranes, and organelles. Targets are sometimes referred to in the art as anti-probes. As the term targets is used herein, no difference in meaning is intended. A “Probe Target Pair” is formed when two macromolecules have combined through molecular recognition to form a complex.
- While the methods of the invention have broad applications and are not limited to any particular detection method, they are particularly suitable for detecting a large number of different transcript features, such as more than 1000, 5000, 10,000, or 50,000.
- Fragmentation of nucleic acids comprises breaking nucleic acid molecules into smaller fragments. Fragmentation of nucleic acid may be desirable to optimize the size of nucleic acid molecules for certain reactions and to destroy their three-dimensional structure. For example, fragmented nucleic acids may be used for more efficient hybridization of target DNA to nucleic acid probes than non-fragmented DNA. According to a preferred embodiment, before hybridization to a microarray, target nucleic acid should be fragmented to sizes ranging from 50 to 200 bases long to improve target specificity and sensitivity. In a more preferred embodiment, the average size of the fragments obtained is at least 10, 20, 30, 40, 50, 60, 70, 80, 100 or 200 nucleotides. To obtain fragments of such size, one must consider the components of the assay cocktail: the molar ratios of cold to hot nucleotides in the reaction mixture must be considered, as well as the affinity constant, K_m, of the enzyme at issue for the analogs in question and for the substrate. The greater the ratio of hot nucleotide to cold, the greater the level of incorporation that may be expected. The greater the level of incorporation of photoactive nucleotides, the smaller the size of the resulting fragments.
- mRNA or mRNA transcripts, as used herein, include, but are not limited to, pre-mRNA transcript(s), transcript processing intermediates, mature mRNA(s) ready for translation and transcripts of the gene or genes, or nucleic acids derived from the mRNA transcript(s). Transcript processing may include splicing, editing and degradation. As used herein, a nucleic acid derived from an mRNA transcript refers to a nucleic acid for whose synthesis the mRNA transcript or a subsequence thereof has ultimately served as a template. Thus, a cDNA reverse transcribed from an mRNA, a cRNA transcribed from that cDNA, a DNA amplified from the cDNA, an RNA transcribed from the amplified DNA, etc., are all derived from the mRNA transcript, and detection of such derived products is indicative of the presence and/or abundance of the original transcript in a sample. Thus, mRNA derived samples include, but are not limited to, mRNA transcripts of the gene or genes, cDNA reverse transcribed from the mRNA, cRNA transcribed from the cDNA, DNA amplified from the genes, RNA transcribed from amplified DNA, and the like.
- A fragment, segment, or DNA segment refers to a portion of a larger DNA polynucleotide or DNA. A polynucleotide, for example, can be broken up, or fragmented into, a plurality of segments. Various methods of fragmenting nucleic acid are well known in the art. These methods may be, for example, either chemical or physical in nature. Chemical fragmentation may include partial degradation with a DNase; partial depurination with acid; the use of restriction enzymes; intron-encoded endonucleases; DNA-based cleavage methods, such as triplex and hybrid formation methods, that rely on the specific hybridization of a nucleic acid segment to localize a cleavage agent to a specific location in the nucleic acid molecule; or other enzymes or compounds which cleave DNA at known or unknown locations. Physical fragmentation methods may involve subjecting the DNA to a high shear rate. High shear rates may be produced, for example, by moving DNA through a chamber or channel with pits or spikes, or forcing the DNA sample through a restricted size flow passage, e.g., an aperture having a cross sectional dimension in the micron or submicron scale. Other physical methods include sonication and nebulization. Combinations of physical and chemical fragmentation methods may likewise be employed, such as fragmentation by heat and ion-mediated hydrolysis. See, for example, Sambrook et al., “Molecular Cloning: A Laboratory Manual,” 3rd Ed. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (2001) (“Sambrook et al.”), which is incorporated herein by reference for all purposes. These methods can be optimized to digest a nucleic acid into fragments of a selected size range. Useful size ranges may be from 100, 200, 400, 700 or 1000 to 500, 800, 1500, 2000, 4000 or 10,000 base pairs. However, larger size ranges such as 4000, 10,000 or 20,000 to 10,000, 20,000 or 500,000 base pairs may also be useful.
- Patents that describe synthesis techniques in specific embodiments include U.S. Pat. Nos. 5,412,087, 6,147,205, 6,262,216, 6,310,189, 5,889,165, and 5,959,098. Nucleic acid arrays are described in many of the above patents, but the same techniques are applied to polypeptide arrays.
- The present invention also contemplates many uses for polymers attached to solid substrates. These uses include gene expression monitoring, profiling, library screening, genotyping and diagnostics. Gene expression monitoring and profiling methods are shown in U.S. Pat. Nos. 5,800,992, 6,013,449, 6,020,135, 6,033,860, 6,040,138, 6,177,248 and 6,309,822. Genotyping and uses therefor are shown in U.S. Ser. Nos. 60/319,253 and 10/013,598, and U.S. Pat. Nos. 5,856,092, 6,300,063, 5,858,659, 6,284,460, 6,361,947, 6,368,799 and 6,333,179. Other uses are embodied in U.S. Pat. Nos. 5,871,928, 5,902,723, 6,045,996, 5,541,061, and 6,197,506.
- The present invention also contemplates sample preparation methods in certain preferred embodiments. Prior to or concurrent with genotyping, the genomic sample may be amplified by a variety of mechanisms, some of which may employ PCR. See, e.g., PCR Technology: Principles and Applications for DNA Amplification (Ed. H. A. Erlich, Freeman Press, NY, N.Y., 1992); PCR Protocols: A Guide to Methods and Applications (Eds. Innis, et al., Academic Press, San Diego, Calif., 1990); Mattila et al., Nucleic Acids Res. 19, 4967 (1991); Eckert et al., PCR Methods and Applications 1, 17 (1991); PCR (Eds. McPherson et al., IRL Press, Oxford); and U.S. Pat. Nos. 4,683,202, 4,683,195, 4,800,159, 4,965,188, and 5,333,675, each of which is incorporated herein by reference in its entirety for all purposes. The sample may be amplified on the array. See, for example, U.S. Pat. No. 6,300,070 and U.S. patent application Ser. No. 09/513,300, which are incorporated herein by reference.
- Other suitable amplification methods include the ligase chain reaction (LCR) (e.g., Wu and Wallace, Genomics 4, 560 (1989), Landegren et al., Science 241, 1077 (1988) and Barringer et al. Gene 89:117 (1990)), transcription amplification (Kwoh et al., Proc. Natl. Acad. Sci. USA 86, 1173 (1989) and WO88/10315), self-sustained sequence replication (Guatelli et al., Proc. Nat. Acad. Sci. USA, 87, 1874 (1990) and WO90/06995), selective amplification of target polynucleotide sequences (U.S. Pat. No. 6,410,276), consensus sequence primed polymerase chain reaction (CP-PCR) (U.S. Pat. No. 4,437,975), arbitrarily primed polymerase chain reaction (AP-PCR) (U.S. Pat. Nos. 5,413,909, 5,861,245) and nucleic acid based sequence amplification (NABSA). (See, U.S. Pat. Nos. 5,409,818, 5,554,517, and 6,063,603, each of which is incorporated herein by reference). Other amplification methods that may be used are described in U.S. Pat. Nos. 5,242,794, 5,494,810, 4,988,617 and in U.S. Ser. No. 09/854,317, each of which is incorporated herein by reference.
- Additional methods of sample preparation and techniques for reducing the complexity of a nucleic sample are described in Dong et al., Genome Research 11, 1418 (2001), in U.S. Pat. Nos. 6,361,947, 6,391,592 and U.S. patent application Ser. Nos. 09/916,135, 09/920,491, 09/910,292, and 10/013,598.
- Methods for conducting polynucleotide hybridization assays have been well developed in the art. Hybridization assay procedures and conditions will vary depending on the application and are selected in accordance with the general binding methods known, including those referred to in: Maniatis et al. Molecular Cloning: A Laboratory Manual (2nd Ed. Cold Spring Harbor, N.Y., 1989); Berger and Kimmel Methods in Enzymology, Vol. 152, Guide to Molecular Cloning Techniques (Academic Press, Inc., San Diego, Calif., 1987); Young and Davis, P.N.A.S., 80: 1194 (1983). Methods and apparatus for carrying out repeated and controlled hybridization reactions have been described in U.S. Pat. Nos. 5,871,928, 5,874,219, 6,045,996, 6,386,749 and 6,391,623, each of which is incorporated herein by reference. The present invention also contemplates signal detection of hybridization between ligands in certain preferred embodiments. See U.S. Pat. Nos. 5,143,854, 5,578,832; 5,631,734; 5,834,758; 5,936,324; 5,981,956; 6,025,601; 6,141,096; 6,185,030; 6,201,639; 6,218,803; and 6,225,625, in U.S. Patent application 60/364,731 and in PCT Application PCT/US99/06097 (published as WO99/47964), each of which also is hereby incorporated by reference in its entirety for all purposes.
- Methods and apparatus for signal detection and processing of intensity data are disclosed in, for example, U.S. Pat. Nos. 5,143,854, 5,547,839, 5,578,832, 5,631,734, 5,800,992, 5,834,758; 5,856,092, 5,902,723, 5,936,324, 5,981,956, 6,025,601, 6,090,555, 6,141,096, 6,185,030, 6,201,639; 6,218,803; and 6,225,625, in U.S. Patent application 60/364,731 and in PCT Application PCT/US99/06097 (published as WO99/47964), each of which also is hereby incorporated by reference in its entirety for all purposes.
- The practice of the present invention may also employ conventional biology methods, software and systems. Computer software products of the invention typically include a computer readable medium having computer-executable instructions for performing the logic steps of the method of the invention. Suitable computer readable media include floppy disk, CD-ROM/DVD/DVD-ROM, hard-disk drive, flash memory, ROM/RAM, magnetic tapes, etc. The computer executable instructions may be written in a suitable computer language or combination of several languages. Basic computational biology methods are described in, e.g., Setubal and Meidanis et al., Introduction to Computational Biology Methods (PWS Publishing Company, Boston, 1997); Salzberg, Searles, Kasif, (Ed.), Computational Methods in Molecular Biology, (Elsevier, Amsterdam, 1998); Rashidi and Buehler, Bioinformatics Basics: Application in Biological Science and Medicine (CRC Press, London, 2000) and Ouellette and Baxevanis Bioinformatics: A Practical Guide for Analysis of Gene and Proteins (Wiley & Sons, Inc., 2nd ed., 2001).
- The present invention may also make use of various computer program products and software for a variety of purposes, such as probe design, management of data, analysis, and instrument operation. See, U.S. Pat. Nos. 5,593,839, 5,795,716, 5,733,729, 5,974,164, 6,066,454, 6,090,555, 6,185,561, 6,188,783, 6,223,127, 6,229,911 and 6,308,170. Additionally, the present invention may have preferred embodiments that include methods for providing genetic information over networks such as the Internet as shown in U.S. patent application Ser. Nos. 10/197,621, 10/063,559 (U.S. Publication No. 20020183936), Ser. Nos. 10/065,868, 10/328,818, 10/328,872, 10/423,403, 60/349,546, and 60/482,389.
- In any application in which multiple tiles of a double-tiled array will be bound by each fluorescent polynucleotide, it is straightforward to determine by inspection whether inner or outer 30-mers are bound. The technique is not limited to only two nonadjacent oligonucleotides per feature; higher orders of tiling are also possible. Each feature can be split into multiple smaller sub-features, e.g., 100-mer features could readily be subdivided into four 25-mers, forming diagonals or non-linear designs. Whole genome tiling arrays in particular are in need of methods to increase array feature density; for example, Cheng et al. recently reported analysis of 10 human chromosomes at 5 bp resolution, requiring 98 arrays per sample (ref. 8). Using similar arrays with triple-tiled 25-mers, the number of arrays required per sample would be reduced 3-fold. Thus, the double- (or multiple-) tiling technique can dramatically increase the depth and the breadth of coverage of a wide range of microarray experiments.
- In diagnostic applications, oligonucleotide analogue arrays (e.g., arrays on chips, slides or beads) are used to determine whether there are any differences between a reference sequence and a target oligonucleotide, e.g., whether an individual has a mutation or polymorphism in a known gene. As discussed supra, the oligonucleotide target is optionally a nucleic acid such as a PCR amplicon, which comprises one or more nucleotide analogues. In one embodiment, arrays are designed to contain probes exhibiting complementarity to one or more selected reference sequences whose sequence is known. The arrays are used to read a target sequence comprising either the reference sequence itself or variants of that sequence. Any polynucleotide of known sequence may be selected as a reference sequence. Reference sequences of interest include sequences known to include mutations or polymorphisms associated with phenotypic changes having clinical significance in human patients. For example, the CFTR gene and P53 gene in humans have been identified as the location of several mutations resulting in cystic fibrosis or cancer, respectively. Other reference sequences of interest include those that serve to identify pathogenic microorganisms and/or are the site of mutations by which such microorganisms acquire drug resistance (e.g., the HIV reverse transcriptase gene for HIV resistance). Other reference sequences of interest include regions where polymorphic variations are known to occur (e.g., the D-loop region of mitochondrial DNA). These reference sequences also have utility for, e.g., forensic, cladistic, or epidemiological studies.
- Although an array of oligonucleotide analogue probes is usually laid down in rows and columns for simplified data processing, such a physical arrangement of probes on the solid substrate is not essential. Provided that the spatial location of each probe in an array is known, the data from the probes is collected and processed to yield the sequence of a target irrespective of the actual physical arrangement of the probes on, e.g., a chip. In processing the data, the hybridization signals from the respective probes are assembled into any conceptual array desired for subsequent data reduction, whatever the physical arrangement of probes on the substrate.
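- As a rough illustration of the arithmetic above, the following Python sketch estimates how many arrays a multi-tiled design would need and splits a long feature into equal sub-features. The function names `arrays_needed` and `split_feature` are hypothetical, and the sketch assumes that the tiling order translates directly into a proportional reduction in array count and that the feature length divides evenly.

```python
import math


def arrays_needed(single_tiled_arrays, tiling_order):
    """Arrays required when each feature carries `tiling_order` probes.

    Assumes (hypothetically) that array count scales inversely with
    tiling order; rounds up to a whole array.
    """
    return math.ceil(single_tiled_arrays / tiling_order)


def split_feature(sequence, n_sub):
    """Split one feature sequence into n_sub equal-length sub-features."""
    if len(sequence) % n_sub:
        raise ValueError("feature length must be divisible by n_sub")
    size = len(sequence) // n_sub
    return [sequence[i:i + size] for i in range(0, len(sequence), size)]


# 98 single-tiled arrays per sample, re-designed with triple tiling
print(arrays_needed(98, 3))

# A placeholder 100-mer subdivided into four 25-mers
print([len(s) for s in split_feature("ACGT" * 25, 4)])
```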
- In one aspect, described are 60-mer features (e.g., probes) for DNA oligonucleotide microarrays that each comprise two concatenated 30-mers. The “inner” 30-mers (e.g., the 30 nt bound to the slide) form an “inner stack” and are unrelated in genomic coordinates to the “outer” 30-mers. An “outer stack” of 30-mers, which was computationally grafted onto the inner stack, produces 30-mer pairs concatenated into 60-mers (e.g., the probes) (FIG. 1 a, b). The positions of the sequences can be randomized to reduce potential spatial artifacts. For example, bound (e.g., hybridized or associated) fluorescent polynucleotides (e.g., sample) can span a contiguous set of sequences, illuminating a line of features. By examining fluorescent feature adjacency, it can be determined whether the inner or outer 30-mer hybridized to the sample; for example, a fluorescent molecule binding one outer 30-mer will bind several adjacent outer 30-mers, illuminating a line of features. The features, depending on which stack is illuminated, will lie in, for example, a horizontal, vertical, or diagonal line, or in other arranged or shaped designs. There is, of course, the possibility of a spurious match across the junction of the 30-mers, but simulations and practical experiments revealed no instances of this. In one embodiment, to reduce even further the possibility of a spurious match across a junction of the probes, a spacer (e.g., chemical) could be linked at the junction between the probes to prevent cross-hybridization.
- In one aspect, described is a 44,000 feature (60-mer) array (Agilent Technologies Inc.) spanning the entire Saccharomyces cerevisiae genome. Repetitive sequences were masked at the feature selection stage (described below). The 30-mers were separated by an average spacing of 123 nucleotides (this spacing is based on the unmasked, i.e., nonrepetitive, component of the genome). Positive controls included Ty1 sequences, arranged to read “TY” in the center of the array when bound to labeled Ty1 DNA (two other sets of Ty1 controls are present, in both horizontal and vertical arrangements).
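- The adjacency reasoning described above can be sketched in Python. This is only an illustrative sketch, not the analysis actually used: it assumes that the inner stack is laid out along rows (so inner hits light horizontal lines) and the outer stack along columns (vertical lines), and the names `run_length`, `classify_features` and the `min_run` threshold are hypothetical.

```python
def run_length(lit, cell, d_row, d_col):
    """Count contiguous lit features passing through `cell` along one axis."""
    row, col = cell
    length = 1
    for sign in (1, -1):  # walk in both directions from the cell
        step = 1
        while (row + sign * step * d_row, col + sign * step * d_col) in lit:
            length += 1
            step += 1
    return length


def classify_features(lit, min_run=3):
    """Assign each lit (row, col) feature to the stack its line implies.

    lit:     set of (row, col) coordinates with signal above background.
    min_run: hypothetical minimum line length that counts as a real line.
    """
    calls = {}
    for cell in lit:
        horizontal = run_length(lit, cell, 0, 1) >= min_run
        vertical = run_length(lit, cell, 1, 0) >= min_run
        if horizontal and vertical:
            calls[cell] = "both"
        elif horizontal:
            calls[cell] = "inner"   # assumed: inner stack runs along rows
        elif vertical:
            calls[cell] = "outer"   # assumed: outer stack runs along columns
        else:
            calls[cell] = "unresolved"
    return calls


# Toy grid: one horizontal line, one vertical line, one isolated feature
lit = {(2, c) for c in range(4)} | {(r, 5) for r in range(4, 8)} | {(0, 9)}
calls = classify_features(lit)
print(calls[(2, 1)], calls[(5, 5)], calls[(0, 9)])
```

An isolated bright feature is left unresolved rather than guessed, since a single composite feature carries no orientation information.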
- A few yeast sequences were chosen as the sample to be hybridized to the array (see below). Some of the sequences were predicted to bind to inner 30-mers, illuminating horizontal lines, and others to bind outer 30-mers, illuminating vertical lines. A “virtual array,” an in silico model of the ideal hybridization of the test DNA (FIG. 2 a), included both horizontal and vertical lines and illustrated the layout of the central Ty1 control features. The technique was experimentally confirmed (FIG. 2 b), demonstrating that the inner and outer 30-mers of each 60-mer can be separately and specifically bound. The signal intensity for inner and outer 30-mers was similar, suggesting binding to each half of the 60-mer. In a virtual overlay (FIG. 2 c) it was seen that the actual array was, qualitatively, in agreement with the predicted array.
- One yeast culture was grown in galactose and another in glucose (as the sole carbon source), and the expressed sequences of the cultures were examined in a cyanine 3-cyanine 5 (Cy3-Cy5) two-color labeling experiment using a double-tiled microarray, attempting to reproduce the steady-state galactose vs. glucose results of Lashkari et al. (ref. 5) (FIG. 3). The RNA from galactose-grown cells was labeled with Cy5 (red) and the RNA from glucose-grown cells with Cy3 (green). Most of the lines were yellow, as expected, indicating that most genes are expressed at comparable levels in the two cultures; however, there were clearly visible red lines present on the array, indicating successful detection of genes upregulated in the galactose-induced culture.
- Analyzing the double-tiled, two-color array provided a computational challenge, as the final fluorescence seen for any one composite feature represents the sum of the fluorescence of the two conjoined 30-mer features, which could in principle bind to two separate molecules in the fluorescent extract. To deconvolute the fluorescence intensities, $y_i$ was first defined as the normalized log ratio of the red versus green intensity for feature i. It was then assumed that the contribution of each component was additive, and the following linear model was used: $y_i = \theta_{g_{i1}} + \theta_{g_{i2}} + \varepsilon$, where $g_{i1}$ is the index of the inside gene and $g_{i2}$ is the index of the outside gene, $\theta_{g_i}$ is the relative expression for each gene, and $\varepsilon$ represents measurement error. The goal was to estimate $\theta_{g_i}$ for all $g_i$. The errors were assumed to be independent and identically distributed with mean 0, and the least squares method was used. Specifically, the 44,290×6,606 design matrix, X, was created with rows representing features and columns representing the open reading frames (ORFs) in the Saccharomyces Gene Database annotation file, with a 1 placed at position $x_{kj}$ (row k, column j) if ORF j is represented on feature k. The 6,606×1 vector of true relative gene expression for each gene was then denoted Θ, and the 44,290×1 vectors of log ratios and errors were denoted y and ε, respectively. The model could then be written as:

$$y = X\Theta + \vec{\varepsilon}$$

- and the least squares solution is:

$$\hat{\Theta} = (X^{T}X)^{-1}X^{T}y$$

- This is the matrix form of the multiple regression equations. Notice that solving this equation involves inverting a 6,606×6,606 matrix, which is not a trivial task even with today's computer power, as it requires at least 216 billion operations in R (if done using Gaussian elimination). However, as X is an extremely sparse matrix, the equation may be solved in a few seconds using the Matrix package in R, for example shown on the world wide web at http://cran.r-project.org/src/contrib/Descriptions/Matrix.html.
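- The least squares deconvolution above can be illustrated with a small, self-contained Python sketch. It is not the R Matrix-package code referred to in the text; instead it swaps in a plain conjugate-gradient solve of the normal equations, which exploits the same sparsity by never forming X explicitly. The function name `fit_expression`, the toy gene pairings and the log ratios are hypothetical.

```python
def fit_expression(features, y, n_genes, max_iter=200, tol=1e-12):
    """Least-squares estimate of per-gene log ratios (theta).

    features: one tuple of gene indices per composite feature, e.g.
              (inner_gene, outer_gene); this is the sparse design matrix X,
              with a 1 in each listed column of that feature's row.
    y:        observed normalized log ratio for each feature.
    Solves (X^T X) theta = X^T y by conjugate gradient.
    """
    def x_mul(v):                      # X @ v, one entry per feature
        return [sum(v[g] for g in genes) for genes in features]

    def xt_mul(r):                     # X^T @ r, one entry per gene
        out = [0.0] * n_genes
        for genes, r_i in zip(features, r):
            for g in genes:
                out[g] += r_i
        return out

    theta = [0.0] * n_genes
    resid = xt_mul(y)                  # residual of normal equations at theta = 0
    direction = resid[:]
    rs_old = sum(r * r for r in resid)
    for _ in range(max_iter):
        if rs_old < tol:               # converged
            break
        a_dir = xt_mul(x_mul(direction))
        alpha = rs_old / sum(d * a for d, a in zip(direction, a_dir))
        theta = [t + alpha * d for t, d in zip(theta, direction)]
        resid = [r - alpha * a for r, a in zip(resid, a_dir)]
        rs_new = sum(r * r for r in resid)
        direction = [r + (rs_new / rs_old) * d
                     for r, d in zip(resid, direction)]
        rs_old = rs_new
    return theta


# Toy example: 3 genes, 3 composite features, noise-free log ratios
features = [(0, 1), (1, 2), (0, 2)]    # hypothetical (inner, outer) pairings
y = [1.0, -0.5, 2.5]                   # generated from theta = [2.0, -1.0, 0.5]
print(fit_expression(features, y, n_genes=3))
```

Each gene must appear on enough features, in enough different pairings, for X to have full column rank; otherwise the individual contributions cannot be separated.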
- To evaluate the concordance and reproducibility of data collected using the double-tiled and conventional single-tiled 60-mer arrays, the same galactose- and glucose-grown, labeled RNA extracts were hybridized to Agilent custom 60-mer (conventional) whole genome yeast arrays. Box plots were created (FIG. 4) showing the distribution of the difference between estimated relative expression obtained from replicate RNA samples for the conventional and double-tiled arrays. It can readily be seen from the box plots that the quality of the double-tiled array signals was very comparable to that of the single-tiled array. Once analyzed in this way, the genes were ranked first by their signal-to-noise ratio, defined as the moderated t-statistic (ref. 6), and then, for the top 150 consistent genes, by rank order of average log ratio. This second ranking was done because many genes with very small and possibly insignificant effects were consistent across all of the arrays. The results (Table 1) are consistent with those of Lashkari et al. (ref. 5); for example, it was found that genes involved in galactose metabolism and transport, as well as ATP synthase subunits, were the highest up-regulated transcripts in the galactose-grown cells, while a glucose transporter, among other genes, was down-regulated.

TABLE 1 Gene expression in the galactose- and glucose-grown samples.

Rank   SGD ID      Gene name   M       P value
1      YBR020W     GAL1        2.5     0.00017
2      YLR081W     GAL2        2.0     0.00075
3      YKL085W     MDH1        1.3     0.0027
4      YDL181W     INH1        1.2     0.00073
5      YOR120W     GCY1        1.0     0.010
6      YJR121W     ATP2        1.0     0.0012
7      YDL004W     ATP16       1.0     0.0056
8      YBR039W     ATP3        0.94    0.0069
9      YBL099W     ATP1        0.92    0.0023
10     YJL166W     QCR8        0.89    0.0022
11     YHR033W                 0.84    0.011
12     YBR118W     TEF2        0.81    0.00073
13     YCL040W     GLK1        0.75    0.0037
14     YFR049W     YMR31       0.71    0.012
15     YDR178W     SDH4        0.68    0.013
16     YHR051W     COX6        0.67    0.0013
17     YDR010C                 0.64    0.0072
18     YDR007W     TRP1        0.60    0.0027
19     YDR009W     GAL3        0.59    0.0060
20     YPL273W     SAM4        −0.45   0.0070
−20    YCR051W                 −0.46   0.020
−19    YHR179W     OYE2        −0.47   0.025
−18    YDR037W     KRS1        −0.49   0.014
−17    YGL209W     MIG2        −0.49   0.010
−16    YNL067W     RPL9B       −0.50   0.00073
−15    YLR367W     RPS22B      −0.51   0.012
−14    YBR106W     PHO88       −0.52   0.0041
−13    YMR186W     HSC82       −0.52   0.0041
−12    YLR175W     CBF5        −0.52   0.014
−10    YGL255W     ZRT1        −0.55   0.0072
−9     YLR134W     PDC5        −0.55   0.0048
−8     YDR033W     MRH1        −0.56   0.0034
−7     YHR072W-A   NOP10       −0.60   0.0062
−6                 Ty1         −0.62   0.020
−5     YAL038W     CDC19       −0.69   0.00069
−4     YHL015W     RPS20       −0.73   0.00045
−3     YMR011W     HXT2        −0.77   0.014
−2     YOL109W     ZEO1        −0.95   0.00069
−1     YLR109W     AHP1        −1.2    0.00073

The top 20 and bottom 20 expressed genes in the double-tiled and the single-tiled arrays, rank-ordered by log ratio (all of these are also in the top 150 when ranked by consistency between the arrays). M is the mean log ratio of expression across all four arrays.

- As a more extensive test of statistical concordance between the double-tiled and single-tiled arrays, the differential expression data was evaluated in the form of a CAT plot (ref. 7) (correspondence at the top, FIG. 5). Correspondence is a simple and highly informative way of comparing lists of data and is defined here as the number of genes in common in the lists made by ranking genes by their log-ratio and keeping the top N members of the lists.
- It can readily be seen that concordance at the top between replicates of both the single- and double-tiled arrays was good, as the curves were well above the height of the yellow line, which demarcates the 99.9th percentile under the null hypothesis (no concordance). The concordance at the top between the double- and single-tiled array data was also nearly indistinguishable from the intraplatform data, which is remarkable given that the two array platforms include completely independent sets of sequence features. This provided a direct demonstration that, statistically, double-tiled arrays perform as well as single-tiled arrays in this yeast whole genome transcript profiling experiment.
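- The correspondence-at-the-top measure defined above can be computed with a few lines of Python. The sketch below follows the definition given in the text (genes in common among the top N of two log-ratio rankings); the function name `cat_curve` and the second platform's log ratios in the example are made up for illustration.

```python
def cat_curve(log_ratios_a, log_ratios_b, list_sizes):
    """Correspondence at the top for two platforms or replicates.

    log_ratios_a, log_ratios_b: dicts mapping gene name -> log ratio.
    list_sizes: the values of N at which to evaluate correspondence.
    Returns, for each N, the number of genes shared by the two top-N lists.
    """
    ranked_a = sorted(log_ratios_a, key=log_ratios_a.get, reverse=True)
    ranked_b = sorted(log_ratios_b, key=log_ratios_b.get, reverse=True)
    return [len(set(ranked_a[:n]) & set(ranked_b[:n])) for n in list_sizes]


# Platform A uses four M values from Table 1; platform B values are invented
platform_a = {"GAL1": 2.5, "GAL2": 2.0, "MDH1": 1.3, "AHP1": -1.2}
platform_b = {"GAL1": 2.2, "MDH1": 1.9, "GAL2": 1.0, "AHP1": -0.9}
print(cat_curve(platform_a, platform_b, [1, 2, 3]))
```

Dividing each count by N would give the proportion in common, which is a common way to draw such a curve; the text itself defines correspondence as the raw count.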
- In one exemplary array, 80,897 30-bp features were chosen from the yeast genome in three steps. First, the yeast genome was masked; retrotransposons and long terminal repeats (LTRs), telomeres, and X and Y′ elements were not included in the sequences used for feature selection. Second, Primer3 (ref. 9) was used to choose oligonucleotides with the lowest likelihood of conformational problems; this process did not yield enough oligonucleotides spaced at the required high density. Finally, the remaining oligonucleotides (9.7% of the total) were evenly spaced across the gaps without regard to sequence properties. The 30-mer sequences were arranged in sequence order, first from left to right, then top to bottom along the microarray, until the inner stack was filled; the final 60-mers were then created by appending the remaining 30-mers, in order from top to bottom, then left to right, forming the outer stack. These double-tiled 44K arrays were synthesized by Agilent Technologies (AMADID #13371).
- As above, features were chosen from the masked yeast genome. These 60-mer features were first chosen by Primer3 and then chosen randomly to create enough features at the required density to tile the yeast genome; they are described in detail elsewhere (Wheelan S J, Scheifele L Z, Martinez-Murillo F, Irizarry R A, Boeke J D, “Eukaryotic Transposable Elements and Genome Evolution Special Feature: Transposon insertion site profiling chip (TIP-chip),” Proc Natl Acad Sci USA. 2006 103(47):17632-7.). The single-tiled 44K arrays were synthesized by Agilent Technologies (AMADID #13306).
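- The stacking order described above (inner stack filled left to right then top to bottom, outer stack filled top to bottom then left to right) can be sketched in Python as follows. The function name `double_tile` and the tiny grid are hypothetical, and the sketch returns (inner, outer) pairs rather than concatenated strings so that it makes no assumption about which end of the 60-mer is attached to the slide.

```python
def double_tile(thirty_mers, n_rows, n_cols):
    """Assign genome-ordered 30-mers to an n_rows x n_cols grid of features.

    Returns grid[row][col] = (inner_30mer, outer_30mer); a slot is None if
    there were not enough 30-mers to fill it.
    """
    n_cells = n_rows * n_cols
    inner = thirty_mers[:n_cells]
    outer = thirty_mers[n_cells:2 * n_cells]
    grid = [[[None, None] for _ in range(n_cols)] for _ in range(n_rows)]

    # Inner stack: row-major, so genomic neighbours sit side by side in a row
    for i, seq in enumerate(inner):
        grid[i // n_cols][i % n_cols][0] = seq

    # Outer stack: column-major, so genomic neighbours sit in a column
    for j, seq in enumerate(outer):
        grid[j % n_rows][j // n_rows][1] = seq

    return [[tuple(cell) for cell in row] for row in grid]


# Placeholder labels stand in for 30-mer sequences in genome order
labels = ["s%d" % k for k in range(12)]
for row in double_tile(labels, n_rows=2, n_cols=3):
    print(row)
```

With this layout, a labeled fragment spanning consecutive inner 30-mers lights a horizontal run, while one spanning consecutive outer 30-mers lights a vertical run, which is what makes the two stacks distinguishable by inspection.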
- A mixture of plasmids B154 (HIS4 and flanking YCL sequences), YIp1 (HIS3), and pEDB9c (Ty1, URA3, and GAL1 promoter) was used to query the array. Each plasmid was digested in three parallel reactions with AluI, MspI, and HpyCH4V. The digests were heat-inactivated, and the resulting fragments were pooled and labeled for hybridization to the microarray as follows: 200 ng DNA was incubated with 36 μg random hexamer in a 23 μl reaction at 100° C. for 2 minutes, then 4° C. for 4 minutes. The labeling reaction then proceeded with the addition of 5 μl 10× dNTP (8 mM dATP, dCTP, dGTP, 4 mM dUTP), 5 μl 10× Klenow buffer, 7 μl Klenow (exo-) fragment (5 U/μl), 7 μl H2O, and 2 μl Cy5 dUTP, and was incubated at 37° C. for 2 hours. The reaction was stopped with 5 μl 0.5 M EDTA pH 8.0. The products were mixed with 450 μl TE and concentrated on a Microcon YM-30 (Amicon catalog #42410) column. The products were washed again with 450 μl TE and 10 μl sheared salmon sperm DNA (10 mg/ml), and concentrated again on a Microcon column. The resulting volume was adjusted to 26 μl with the addition of H2O, and SDS and SSC were added to final concentrations of 3×SSC and 0.3% SDS, in a total volume of 32.5 μl. After incubation at 100° C. for 90 seconds and then 37° C. for 30 minutes, the products were spotted onto microarrays and covered with 22×60 mm cover slips (VWR catalog #48393 070).
- The microarrays were hybridized overnight in a humid chamber at 55° C. In the morning, the arrays were washed in 2×SSC, 0.03% SDS for 5 minutes at 55° C., then in 1×SSC for 5 minutes at room temperature, and finally in 0.2×SSC for 5 minutes at room temperature. Microarrays were allowed to air dry and then scanned in a GenePix 4000B scanner (Axon Instruments), using GenePix Pro 5.1 software.
- To examine expression levels in galactose-grown versus glucose-grown yeast, we first grew an overnight culture of BY4743 yeast in yeast extract/peptone (YEP)+2% raffinose, to an OD600 of 5.5. YEP+2% galactose and YEP+2% dextrose cultures were then inoculated with the overnight culture to a starting OD600 of 0.25 or 0.125, and the cultures were grown at 30° C. to OD600 0.6. Cells were pelleted by centrifugation in 50 ml conical tubes at 1300 rcf for 5 minutes at 4° C., resuspended in 1 ml ice-cold water and pelleted again in a microcentrifuge at 13,000 rpm at 4° C., and then the supernatant was decanted and the cells were frozen on dry ice. RNA was prepared as follows, after the method of Schmitt et al. ° with modifications.
- Cells were thawed on ice and resuspended in 400 μl TES (10 mM Tris-HCl, pH 7.5, mM ethylenediaminetetraacetic acid (EDTA), and 0.5% SDS); 400 μl acid phenol/chloroform was added, and after vortexing briefly, the extracts were incubated at 65° C. for 60 minutes with brief, occasional vortexing. The extracts were placed on ice for 5 minutes, then spun at top speed in a microcentrifuge at 4° C. for 5 minutes. The aqueous layer was transferred to a new tube and extracted once more with acid phenol/chloroform. RNA was precipitated out of the aqueous layer: the aqueous layer was transferred to a new tube and 40 μl 3 M sodium acetate, pH 5.3 and 1 ml ice cold 100% ethanol were added, and the tube was placed at 80° C. overnight. After a 5-minute spin at 4° C., the pellet was washed in ice-cold 70% ethanol and spun again for 5 minutes at 4° C. The pellet was resuspended in 50 μl DEPC-treated water and further purified using a Qiagen RNeasy kit. Finally, the RNA was treated with DNase I by incubating 50 μl RNA with 10
μl 10× DNase I buffer, 1 μl DNase I, 2 μl RNasin, and 37 μl water at 37° C. for 30 minutes. 10 μl 25 mM EDTA was added before heat inactivation at 65° C. for 15 minutes. After 1 minute on ice, the RNA was cleaned up with 100 μl phenol/chloroform/isoamyl alcohol, vortexed, and centrifuged for 5 minutes in a microcentrifuge at 13,000 rpm at 4° C. The aqueous layer was taken to a new tube and 400 μl ice-cold 100% ethanol and 10 μl M sodium acetate pH 5.3 were added, and the RNA was precipitated overnight at −80° C., then washed with 70% ethanol and resuspended in 30 μl diethyl pyrocarbonate-treated (DEPC) water. Finally, the RNA concentration was adjusted to 500 ng/μl. - Yeast RNA was processed using a modification of the Agilent Low RNA Input Fluorescent Linear Amplification protocol (Agilent Technologies Kit, Protocol version 3.3, July 2005; Maitreya Dunham, personal communication).
- 400 ng of total RNA was denatured for 10 minutes at 65° C. in the presence of T7 promoter primer and nuclease-free water in a total volume of 11.5 μl, and snap cooled for 5 minutes on ice. The cDNA synthesis was done using MMLV-RT, DTT, 10 mM dNTP and RNaseOUT (Agilent Technologies Kit) at 40° C. for 2 hours, followed by an enzyme inactivation step for 15 minutes at 65° C. To each sample, 2.4 μl of either cyanine 3-CTP (10 mM) or cyanine 5-CTP (10 mM) was added and incorporated in an in vitro transcription step at 40° C. for 2 hours using PEG, RNaseOUT, T7 RNA polymerase and inorganic pyrophosphatase to generate labeled cRNA (reagents are included in the Agilent Low RNA Input Linear Amplification Kit; concentrations and sources are proprietary). Amplified cRNA was then purified using QIAGEN's QIAquick spin columns as described in the RNeasy Mini Kit (QIAGEN). After confirming that the specific activity of the labeled cRNA was between 10 and 20 pmol per μg of cRNA, a total of 850 ng labeled cRNA from each sample (Cy3-Cy5 labeled) was mixed and fragmented using the Gene Expression Hybridization Kit (Agilent Technologies) and hybridized to the array for 17 hours at 45° C. (for the double-tiled array) or 55° C. (for the conventional 60-mer array) in the dark. The arrays were then washed in solution A (700 ml dH2O, 300 ml 20×SSPE, 20% N-lauroylsarcosine) for 1 minute at RT, followed by 1 minute in wash B (997 ml dH2O, 3 ml 20×SSPE, 0.25 ml 20% N-lauroylsarcosine) at RT, and by a 30-second wash in acetonitrile (100%, anhydrous). The arrays were scanned using the Axon GenePix 4000B scanner (Axon Instruments) and the images were analyzed using GenePix Pro 6.0.
- Microarray platform and sample data have been deposited in GEO (accession GSE5721).
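The volumes and thresholds quoted in the RNA preparation and labeling steps above lend themselves to a quick arithmetic check. The following Python sketch is illustrative only: the function names are hypothetical, and the measured concentrations used in the usage example (750 ng/μl RNA, 6 pmol/μl dye, 400 and 425 ng/μl cRNA) are made-up inputs, not values from this work. Only the DNase I reaction volumes, the 500 ng/μl target, the 10 to 20 pmol/μg window, and the 850 ng per sample come from the text.

```python
# Illustrative arithmetic for the RNA preparation and labeling steps.
# All function names are hypothetical helpers, not part of any kit or library.

# DNase I reaction components in microliters, as listed in the protocol text.
DNASE_MIX_UL = {
    "RNA": 50.0,
    "10x DNase I buffer": 10.0,
    "DNase I": 1.0,
    "RNasin": 2.0,
    "water": 37.0,
}


def final_buffer_fold(mix_ul, buffer_key="10x DNase I buffer", stock_fold=10.0):
    """Fold concentration of the buffer once the whole reaction is assembled."""
    total_ul = sum(mix_ul.values())
    return stock_fold * mix_ul[buffer_key] / total_ul


def water_to_add_ul(measured_ng_per_ul, volume_ul, target_ng_per_ul=500.0):
    """Water needed to dilute an RNA stock to the target concentration."""
    if measured_ng_per_ul < target_ng_per_ul:
        # Dilution cannot raise a concentration.
        raise ValueError("stock is more dilute than the target concentration")
    final_volume_ul = measured_ng_per_ul * volume_ul / target_ng_per_ul
    return final_volume_ul - volume_ul


def specific_activity(dye_pmol_per_ul, crna_ng_per_ul):
    """Specific activity in pmol of dye per microgram of cRNA."""
    return 1000.0 * dye_pmol_per_ul / crna_ng_per_ul


def passes_labeling_qc(activity, low=10.0, high=20.0):
    """True if the activity lies in the 10 to 20 pmol/ug window from the text."""
    return low <= activity <= high


def volume_for_mass_ul(mass_ng, crna_ng_per_ul):
    """Volume of labeled cRNA that contains the requested mass."""
    return mass_ng / crna_ng_per_ul


if __name__ == "__main__":
    print("DNase reaction volume (ul):", sum(DNASE_MIX_UL.values()))
    print("Buffer fold in reaction:", final_buffer_fold(DNASE_MIX_UL))
    # Hypothetical measurements, chosen only to exercise the helpers.
    print("Water to add (ul):", water_to_add_ul(750.0, 30.0))
    activity = specific_activity(6.0, 400.0)
    print("Specific activity (pmol/ug):", activity, passes_labeling_qc(activity))
    print("Volume for 850 ng (ul):", volume_for_mass_ul(850.0, 425.0))
```

The check on the DNase I mix confirms that the listed components sum to 100 μl, so the 10× buffer ends up at 1×. The other helpers simply restate the unit conversions implied by the text.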
- 1. Bertone, P., Gerstein, M. & Snyder, M. Applications of DNA tiling arrays to experimental genome annotation and regulatory pathway discovery. Chromosome Res. 13, 259-274 (2005).
- 2. Bertone, P. et al. Global identification of human transcribed sequences with genome tiling arrays. Science 306, 2242-2246 (2004).
- 3. Mockler, T. C. et al. Applications of DNA tiling arrays for whole-genome analysis. Genomics 85, 1-15 (2005).
- 4. Shoemaker, D. D. et al. Experimental annotation of the human genome using microarray technology. Nature 409, 922-927 (2001).
- 5. Lashkari, D. A. et al. Yeast microarrays for genome wide parallel genetic and gene expression analysis. Proc. Natl. Acad. Sci. U.S.A. 94, 13057-13062 (1997).
- 6. Smyth, G. K. Linear models and empirical Bayes methods for assessing differential expression in microarray experiments. Stat. Appl. Genet. Mol. Biol. 3, Article3 (2004).
- 7. Irizarry, R. A. et al. Multiple-laboratory comparison of microarray platforms. Nat. Methods 2, 345-350 (2005).
- 8. Cheng, J. et al. Transcriptional maps of 10 human chromosomes at 5-nucleotide resolution. Science 308, 1149-1154 (2005).
- 9. Rozen, S. & Skaletsky, H. Primer3 on the WWW for general users and for biologist programmers. Methods Mol. Biol. 132, 365-386 (2000).
- 10. Schmitt, M. E., Brown, T. A. & Trumpower, B. L. A rapid and simple method for preparation of RNA from Saccharomyces cerevisiae. Nucleic Acids Res. 18, 3091-3092 (1990).
Claims (23)
1. A multi-tiled nucleic acid array, comprising an immobilized array of nucleic acid features, wherein each feature comprises an inner probe and an outer probe, wherein the inner and outer probes are unrelated in genomic coordinates.
2. The multi-tiled nucleic acid array of claim 1 , wherein one of the inner or the outer probe is arranged horizontally and the other is arranged vertically.
3. The multi-tiled nucleic acid array of claim 1 , wherein the features of the array further comprise middle probes between the inner and the outer probes, wherein the probes are unrelated in genomic coordinates.
4. The multi-tiled nucleic acid array of claim 3 , wherein the features of the array further comprise second middle probes between the inner and the middle probes, wherein the probes are unrelated in genomic coordinates.
5. The multi-tiled nucleic acid array of claim 1 , further comprising at least one positive control feature.
6. The multi-tiled nucleic acid array of claim 1 , further comprising at least one negative control feature.
7. The multi-tiled nucleic acid array of claim 1 , wherein the multi-tiled array comprises from between about 100 to about 3 billion features.
8. The multi-tiled nucleic acid array of claim 1 , wherein the multi-tiled array comprises from between about 10,000 to 10 million features. 10 million to 3 billion.
9. A multi-tiled nucleic acid array, comprising an immobilized array of nucleic acid features, wherein the features comprise an inner probe, a middle probe, and an outer probe, wherein the probes are unrelated in genomic coordinates.
10. The multi-tiled array of claim 9 , wherein the probes are from between about 10 nucleotides to about 50 nucleotides in length.
11-12. (canceled)
13. A multi-tiled nucleic acid array, comprising an immobilized array of nucleic acid features, wherein the features comprise four probes, an inner probe, a middle probe, and an outer probe, wherein the probes are unrelated in genomic coordinates.
14. The multi-tiled array of claim 13 , wherein the probes are from between about 10 nucleotides to about 50 nucleotides in length.
15-16. (canceled)
17. A multi-tiled nucleic acid array, comprising an immobilized array of nucleic acid features, wherein the features comprise at least two probes unrelated in genomic coordinates.
18. A method of expression profiling, comprising:
providing a multi-tiled array,
hybridizing a labeled sample to the array; and
analyzing the array.
19-28. (canceled)
29. A method of constructing a multi-tiled array, comprising:
selecting probe sequences;
arranging inner probe sequences in sequence order, and
appending outer probe sequences in sequence order to the inner probe sequences.
30-39. (canceled)
40. A method of array based evaluation of a sample, comprising:
providing a multi-tiled array;
hybridizing a sample to the array; and
deconvoluting signal intensities.
41-46. (canceled)
47. A method of polymorphism analysis comprising providing a multi-tiled nucleic acid array of probes comprising a first set of probes spanning each of a collection of polymorphic sites in known sequences of unknown function and complementary to first allelic forms of the sites, and a second set of probes spanning each of the polymorphic sites in the collection and complementary to second allelic forms of the sites, wherein the collection of polymorphic sites includes at least 10 unlinked polymorphic sites; and hybridizing a nucleic acid sample from a subject to the array of probes and analyzing the hybridization intensities of probes in the first and second probe sets to determine a profile of polymorphic forms present in the individual.
48-56. (canceled)
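One possible reading of the construction steps of claim 29, combined with the horizontal and vertical arrangement of claim 2, can be sketched in Python as follows. This is an illustrative interpretation under stated assumptions, not the applicants' implementation: the function name `double_tile`, the grid dimensions, the use of row-major order for inner probes and column-major order for outer probes, and the representation of a feature as the inner sequence followed by the outer sequence are all choices made for the example. The sketch also does not verify that paired probes are unrelated in genomic coordinates.

```python
def double_tile(inner_probes, outer_probes, n_rows, n_cols):
    """Build a grid of features, each pairing one inner and one outer probe.

    Assumptions for illustration only:
      * both lists are already in sequence (genomic) order;
      * inner probes fill the grid along rows (horizontally);
      * outer probes fill the grid along columns (vertically);
      * a feature is written as inner sequence + outer sequence.
    """
    n_features = n_rows * n_cols
    if len(inner_probes) != n_features or len(outer_probes) != n_features:
        raise ValueError("need exactly one inner and one outer probe per feature")

    grid = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            inner = inner_probes[r * n_cols + c]  # row-major (horizontal) order
            outer = outer_probes[c * n_rows + r]  # column-major (vertical) order
            row.append(inner + outer)             # outer appended to inner
        grid.append(row)
    return grid


if __name__ == "__main__":
    # Toy 3-mers stand in for real probe sequences.
    inner = ["AAA", "CCC", "GGG", "TTT", "ACG", "TGC"]
    outer = ["AAT", "CCT", "GGT", "TTA", "ACA", "TGA"]
    for row in double_tile(inner, outer, n_rows=2, n_cols=3):
        print(row)
```

Because the two orderings run along different axes, probes that are neighbors in sequence order in one tier are generally paired with non-neighboring probes in the other tier. A real design would additionally need to check each pairing against the genomic-coordinate requirement stated in the claims.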
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US12/086,142 US20090305902A1 (en) | 2005-12-12 | 2006-12-12 | Double-Tiled and Multi-Tiled Arrays and Methods Thereof |
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US74948405P | 2005-12-12 | 2005-12-12 | |
US12/086,142 US20090305902A1 (en) | 2005-12-12 | 2006-12-12 | Double-Tiled and Multi-Tiled Arrays and Methods Thereof |
PCT/US2006/047497 WO2007070553A2 (en) | 2005-12-12 | 2006-12-12 | Double-tiled and multi-tiled arrays and methods thereof |
Publications (1)
Publication Number | Publication Date |
---|---|
US20090305902A1 (en) | 2009-12-10 |
Family
ID=37964631
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US12/086,142 Abandoned US20090305902A1 (en) | 2005-12-12 | 2006-12-12 | Double-Tiled and Multi-Tiled Arrays and Methods Thereof |
Country Status (2)
Country | Link |
---|---|
US (1) | US20090305902A1 (en) |
WO (1) | WO2007070553A2 (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4731325A (en) * | 1984-02-17 | 1988-03-15 | Orion-Yhtyma | Arrays of alternating nucleic acid fragments for hybridization arrays |
US6054270A (en) * | 1988-05-03 | 2000-04-25 | Oxford Gene Technology Limited | Analying polynucleotide sequences |
US6238869B1 (en) * | 1997-12-19 | 2001-05-29 | High Throughput Genomics, Inc. | High throughput assay system |
US20020137031A1 (en) * | 1999-07-01 | 2002-09-26 | Paul K. Wolber | Multidentate arrays |
US20040234963A1 (en) * | 2003-05-19 | 2004-11-25 | Sampas Nicholas M. | Method and system for analysis of variable splicing of mRNAs by array hybridization |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
FR2832731A1 (en) * | 2002-08-30 | 2003-05-30 | Commissariat Energie Atomique | Preparing genetic fingerprint, useful e.g. for identifying an individual animal, plant or microorganism, using a universal array of hybridization probes |
Also Published As
Publication number | Publication date |
---|---|
WO2007070553A2 (en) | 2007-06-21 |
WO2007070553A3 (en) | 2007-08-16 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US7361468B2 (en) | Methods for genotyping polymorphisms in humans | |
US7250289B2 (en) | Methods of genetic analysis of mouse | |
US7300788B2 (en) | Method for genotyping polymorphisms in humans | |
US7323308B2 (en) | Methods of genetic analysis of E. coli | |
US20050106591A1 (en) | Methods and kits for preparing nucleic acid samples | |
US20040191810A1 (en) | Immersed microarrays in conical wells | |
US20040146910A1 (en) | Methods of genetic analysis of rat | |
US20030186280A1 (en) | Methods for detecting genomic regions of biological significance | |
US7312035B2 (en) | Methods of genetic analysis of yeast | |
US20030186279A1 (en) | Large scale genotyping methods | |
US20040161779A1 (en) | Methods, compositions and computer software products for interrogating sequence variations in functional genomic regions | |
US20050208555A1 (en) | Methods of genotyping | |
US7354720B2 (en) | Label free analysis of nucleic acids | |
US7629164B2 (en) | Methods for genotyping polymorphisms in humans | |
US20110160092A1 (en) | Methods for Selecting a Collection of Single Nucleotide Polymorphisms | |
US20040185475A1 (en) | Methods for genotyping ultra-high complexity DNA | |
US20090305902A1 (en) | Double-Tiled and Multi-Tiled Arrays and Methods Thereof | |
US9845494B2 (en) | Enzymatic methods for genotyping on arrays | |
US20050074799A1 (en) | Use of guanine analogs in high-complexity genotyping | |
US20040191807A1 (en) | Automated high-throughput microarray system | |
US20040096837A1 (en) | Non-contiguous oligonucleotide probe arrays | |
US20040171167A1 (en) | Chip-in-a-well scanning | |
US20060147940A1 (en) | Combinatorial affinity selection | |
US20040110132A1 (en) | Method for concentrate nucleic acids | |
US20050136452A1 (en) | Methods for monitoring expression of polymorphic alleles |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AS | Assignment |
Owner name: JOHNS HOPKINS UNIVERSITY, THE, MARYLAND Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:BOEKE, JEF D.;WHEELAN, SARAH J.;REEL/FRAME:023076/0452 Effective date: 20090730 |
|
STCB | Information on status: application discontinuation |
Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |
|
AS | Assignment |
Owner name: NATIONAL INSTITUTES OF HEALTH - DIRECTOR DEITR, MA Free format text: CONFIRMATORY LICENSE;ASSIGNOR:THE JOHNS HOPKINS UNIVERSITY;REEL/FRAME:048342/0163 Effective date: 20190204 |