
Inferring gene expression from ribosomal promoter sequences, a crowdsourcing approach

  1. Gustavo Stolovitzky1
  1. IBM T.J. Watson Research Center, Yorktown Heights, New York 10598, USA;
  2. Eck Institute for Global Health, Department of Biological Sciences, University of Notre Dame, Notre Dame, Indiana 46556, USA;
  3. Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, New Jersey 08540, USA;
  4. Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, Rehovot 76100, Israel;
  7. Eck Institute for Global Health, Department of Biological Sciences, University of Notre Dame, Notre Dame, Indiana 46556, USA;
  8. Interdisciplinary Center for Network Science and Applications (iCeNSA), University of Notre Dame, Notre Dame, Indiana 46556, USA;
  9. Genome and Systems Biology Degree Program, National Taiwan University and Academia Sinica, Taipei 106, Taiwan;
  10. Department of Bio-Industrial Mechatronics Engineering, National Taiwan University, Taipei 106, Taiwan;
  11. Electrical Engineering Department, Texas A&M University, College Station, Texas 77843, USA;
  12. Swiss Institute for Experimental Cancer Research (ISREC), School of Life Sciences, Swiss Federal Institute of Technology (EPFL), CH-1015 Lausanne, Switzerland;
  13. Center for Genome Sciences and Systems Biology and Department of Computer Science, Washington University, St. Louis, Missouri 63110, USA;
  14. Laboratory of Immunoregulation and Mucosal Immunology, Department for Molecular Biomedical Research, VIB, Ghent University, 9052 Gent, Belgium;
  15. Department of Plant Systems Biology, VIB, Ghent University, 9052 Gent, Belgium;
  16. Department of Plant Biotechnology and Bioinformatics, Ghent University, 9052 Gent, Belgium;
  17. Max Planck Institute for Molecular Genetics, 14195 Berlin, Germany;
  18. Institute of Molecular Biology, 55128 Mainz, Germany;
  19. Institute for Medical Genetics, Universitätsklinikum Charité, 13353 Berlin, Germany;
  20. The Institute for Quantitative Biology, East Tennessee State University, Johnson City, Tennessee 37614-0663, USA;
  21. Interdisciplinary Centre for Mathematical and Computational Modeling, University of Warsaw, 00-927 Warsaw, Poland;
  22. School of Mathematics and Statistics, University of Hyderabad, Hyderabad-500046, India;
  23. Department of Mathematical Sciences, Department of Biology, Biocenter Oulu, University of Oulu, FIN-90014 Finland;
  24. Department of Mathematics, MIT, Cambridge, Massachusetts 02139, USA;
  25. Department of Microbial and Molecular Systems, KU Leuven, Leuven, Belgium;
  26. Department of Mathematics & Computer Sciences, University of Antwerp, B2020 Antwerp, Belgium;
  27. Biomedical Informatics Research Center Antwerp (biomina), University of Antwerp, B2020 Antwerp, Belgium;
  28. Fondazione Edmund Mach, Research and Innovation Centre, 38010 Trento, Italy;
  29. Department of Computer Science, Duke University, Durham, North Carolina 27708, USA;
  30. Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Department of Biostatistics, Harvard School of Public Health, Boston, Massachusetts 02115, USA

      Abstract

      The Gene Promoter Expression Prediction challenge consisted of predicting gene expression from promoter sequences in a previously unknown, experimentally generated data set. The challenge was presented to the community in the framework of the sixth Dialogue for Reverse Engineering Assessments and Methods (DREAM6), a community effort to evaluate the status of systems biology modeling methodologies. Nucleotide-specific promoter activity was obtained by measuring fluorescence from promoter sequences fused upstream of a gene for yellow fluorescent protein and inserted in the same genomic site of the yeast Saccharomyces cerevisiae. Twenty-one teams submitted results predicting the expression levels of 53 different promoters from yeast ribosomal protein genes. Analysis of participant predictions shows that accurate values were difficult to obtain for low-expressed promoters and for mutated promoters, in the latter case only when the mutation induced a large change in promoter activity compared to the wild-type sequence. As in previous DREAM challenges, we found that aggregation of participant predictions provided robust results, but did not fare better than the three best algorithms. Finally, this study not only provides a benchmark for the assessment of methods predicting activity of a specific set of promoters from their sequence, but it also shows that the top-performing algorithm, which used machine-learning approaches, can be improved by the addition of biological features such as transcription factor binding sites.
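To make the idea of aggregating participant predictions concrete, the following is a minimal sketch and not the challenge's actual procedure. It assumes that aggregation means averaging each promoter's predicted activity across teams and that predictions are scored by Pearson correlation against the measured values; the abstract specifies neither. The team names and all numbers are made up for illustration and are not data from the study.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def aggregate(predictions):
    """Average each promoter's predicted activity across all teams
    (an assumed aggregation rule, chosen for simplicity)."""
    return [mean(values) for values in zip(*predictions.values())]


# Toy, made-up activities for four hypothetical promoters.
measured = [1.0, 2.0, 3.0, 4.0]
predictions = {
    "team_a": [1.2, 1.8, 3.5, 3.9],
    "team_b": [0.5, 2.5, 2.0, 4.5],
    "team_c": [2.0, 1.0, 3.0, 3.0],
}

consensus = aggregate(predictions)
scores = {team: pearson(pred, measured) for team, pred in predictions.items()}
consensus_score = pearson(consensus, measured)

for team, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{team}: {score:.3f}")
print(f"consensus: {consensus_score:.3f}")
```

Averaging tends to cancel errors that differ from team to team, which is one way to read the reported robustness. It cannot remove errors that the teams share, so a consensus need not beat the best individual predictors.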

      Footnotes

      • 5 A complete list of consortium authors appears at the end of this manuscript.

      • 6 Corresponding author

        E-mail pmeyerr{at}us.ibm.com

      • [Supplemental material is available for this article.]

      • Article published online before print. Article, supplemental material, and publication date are at http://www.genome.org/cgi/doi/10.1101/gr.157420.113.

      • Received March 12, 2013.
      • Accepted August 14, 2013.

      This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see http://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 3.0 Unported), as described at http://creativecommons.org/licenses/by-nc/3.0/.
