Scalable, functional bioinformatics on the JVM — written in Scala, usable from Scala/Java and friends.
This is a community-driven fork focused on code clarity, small safe refactors, and contributor experience.
This is a functional bioinformatics library.
- Strongly-typed DNA/RNA/Protein sequences (with IUPAC ambiguity & gapped variants)
- Transcription (DNA → RNA)
- Translation (RNA → amino acids) with BioJava interop
- Parsers/Writers:
  - Iterator-based FASTA reader/writer
  - Iterator-based PAML (PHY) reader
  - Phylip reader/writer (via BioJava)
- Early alignment utilities and attribute system (immutable, WIP)
Some APIs reflect “classic BioScala” semantics; modernization happens behind shims first to keep PRs tiny.
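To illustrate what "iterator-based" means for the FASTA reader listed above, here is a minimal, self-contained sketch in plain Scala. It does not use BioScala's own reader: `FastaIteratorSketch`, `FastaRecord`, and `records` are hypothetical names chosen for this example, and the real API may differ.

```scala
object FastaIteratorSketch {
  // Hypothetical record type for this sketch (not a BioScala class).
  final case class FastaRecord(header: String, sequence: String)

  // Lazily groups lines into records: a line starting with '>' opens a record,
  // and the following lines up to the next '>' are concatenated into its sequence.
  // Only one record is held in memory at a time.
  def records(lines: Iterator[String]): Iterator[FastaRecord] = {
    val in = lines.map(_.trim).filter(_.nonEmpty).buffered
    new Iterator[FastaRecord] {
      def hasNext: Boolean = in.hasNext
      def next(): FastaRecord = {
        val header = in.next()
        require(header.startsWith(">"), s"expected a FASTA header, got: $header")
        val body = new StringBuilder
        while (in.hasNext && !in.head.startsWith(">")) body.append(in.next())
        FastaRecord(header.drop(1), body.toString)
      }
    }
  }

  def main(args: Array[String]): Unit = {
    val lines = Iterator(">seq1 demo", "ATGG", "CCAT", ">seq2", "GGGC")
    records(lines).foreach(r => println(s"${r.header}: ${r.sequence}"))
    // seq1 demo: ATGGCCAT
    // seq2: GGGC
  }
}
```

For a file on disk, the same function could be fed from `scala.io.Source.fromFile(path).getLines()`, so large inputs are streamed rather than loaded whole.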
🧬 Functional Design: Immutable data structures and pure functions for reliable, reproducible analysis.
🧩 Modular Architecture: Plug-and-play modules for sequences, alignments, and attributes.
📊 Extensible: Easily add custom functionality or integrate with other tools.
🔬 Research-Ready: Designed with bioinformatics workflows in mind.
- DNA/RNA/Protein Sequences: Immutable, type-safe representations with support for gaps and IUPAC symbols.
- Sequence Alignment: Basic pairwise alignment and sparse alignment tools.
- Transcription & Translation: Convert DNA to RNA and RNA to protein sequences.
- Attributes: Attach metadata (e.g., IDs, descriptions) to sequences and alignments.
Since BioScala is a work in progress and not yet published on Maven Central, you’ll need to clone the repository and publish it locally:
- Clone the Repository:
git clone https://github.com/bioscala/bioscala.git
cd bioscala
- Publish Locally: Use sbt to publish the library to your local Ivy repository:
sbt publishLocal
- Add to Your Project:
Add the dependency to your build.sbt:
libraryDependencies += "org.bioscala" %% "bioscala-core" % "0.2.0"
// Transcription: DNA → RNA
val dnaSequence = new DNASequence("ATGGCCATTGTAATGGGCCGCTGAA")
val rnaSequence = dnaSequence.transcribe()
println(rnaSequence) // Output: AUGGCCAUUGUAAUGGGCCGCUGAA

// Translation: RNA → protein (a separate snippet; rnaSequence is redefined here)
val rnaSequence = new RNASequence("AUGGCCAUUGUAAUGGGCCGCUGAA")
val proteinSequence = SequenceTranslation.translate(rnaSequence.seq)
println(proteinSequence) // Output: MAIVMGR*
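For readers who want to see the logic behind the two snippets above without the library, the following is a rough standalone sketch. It is not how BioScala implements these steps (the library delegates translation to BioJava); `CentralDogmaSketch` and its members are hypothetical names, and the codon table is deliberately limited to the codons that occur in the example sequence.

```scala
object CentralDogmaSketch {
  // Transcription of the coding strand: every T becomes U.
  def transcribe(dna: String): String =
    dna.toUpperCase.replace('T', 'U')

  // Partial codon table (assumption: only the codons used in the example above).
  val codonTable: Map[String, Char] = Map(
    "AUG" -> 'M', "GCC" -> 'A', "AUU" -> 'I', "GUA" -> 'V',
    "GGC" -> 'G', "CGC" -> 'R', "UGA" -> '*'
  )

  // Translates complete codons in reading frame 0. A trailing partial codon
  // is dropped, and codons missing from the table are rendered as 'X'.
  def translate(rna: String): String =
    rna.grouped(3)
      .filter(_.length == 3)
      .map(codon => codonTable.getOrElse(codon, 'X'))
      .mkString

  def main(args: Array[String]): Unit = {
    val rna = transcribe("ATGGCCATTGTAATGGGCCGCTGAA")
    println(rna)            // AUGGCCAUUGUAAUGGGCCGCUGAA
    println(translate(rna)) // MAIVMGR*
  }
}
```

The example sequence has 25 bases, so the final `A` does not form a codon and is ignored, which is why the output ends at the stop symbol `*`.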
val alignment: List[List[NTSymbol]] = List(
List(A, C, G, T, Gap),
List(A, Gap, G, T, C)
)
val (filteredAlignment, removedColumns) = SparseAlignment.removeSparseRows(alignment, minSymbols = 2)
println(filteredAlignment) // Output: List(List(A, C, G, T, Gap), List(A, Gap, G, T, C))
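The idea behind sparse filtering can be sketched independently of the library. The code below is an illustration under assumptions, not BioScala's implementation: `Sym` is a hypothetical stand-in for `NTSymbol`, and `dropSparseRows` and `dropSparseColumns` are names invented for this example. It shows the same threshold applied per row and per column, since the two give different results on the alignment above.

```scala
object SparseFilterSketch {
  // Hypothetical stand-in for NTSymbol: a small ADT with an explicit gap.
  sealed trait Sym
  case object A extends Sym
  case object C extends Sym
  case object G extends Sym
  case object T extends Sym
  case object Gap extends Sym

  type Alignment = List[List[Sym]]

  private def symbolCount(cells: List[Sym]): Int = cells.count(_ != Gap)

  // Keeps rows with at least minSymbols non-gap cells.
  // Also returns the indices of the rows that were dropped.
  def dropSparseRows(aln: Alignment, minSymbols: Int): (Alignment, List[Int]) = {
    val (kept, dropped) =
      aln.zipWithIndex.partition { case (row, _) => symbolCount(row) >= minSymbols }
    (kept.map(_._1), dropped.map(_._2))
  }

  // Same threshold per column: transpose, filter, transpose back.
  // Assumes a rectangular alignment (all rows the same length).
  def dropSparseColumns(aln: Alignment, minSymbols: Int): (Alignment, List[Int]) = {
    val (kept, dropped) =
      aln.transpose.zipWithIndex.partition { case (col, _) => symbolCount(col) >= minSymbols }
    (kept.map(_._1).transpose, dropped.map(_._2))
  }

  def main(args: Array[String]): Unit = {
    val aln: Alignment = List(
      List(A, C, G, T, Gap),
      List(A, Gap, G, T, C)
    )
    val (rows, droppedRows) = dropSparseRows(aln, minSymbols = 2)
    println(rows == aln)  // true: each row has 4 non-gap symbols
    println(droppedRows)  // List()

    val (cols, droppedCols) = dropSparseColumns(aln, minSymbols = 2)
    println(cols)         // List(List(A, G, T), List(A, G, T))
    println(droppedCols)  // List(1, 4)
  }
}
```

Returning the dropped indices alongside the filtered alignment keeps the operation pure while still letting callers map results back to the original coordinates.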
- Sequences: Immutable lists of nucleotides or amino acids.
- Alignments: Lists of sequences with gap support.
- Attributes: Metadata attached to sequences or alignments.
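As a rough illustration of how these three concepts can fit together in an immutable design, here is a small sketch. The types `Attribute`, `Id`, `Description`, and `AnnotatedSeq` are hypothetical and exist only in this example; BioScala's attribute system is described as work in progress and may look quite different.

```scala
object AttributeSketch {
  // Hypothetical metadata types for this sketch.
  sealed trait Attribute
  final case class Id(value: String) extends Attribute
  final case class Description(value: String) extends Attribute

  // An immutable sequence carrying a list of attributes.
  final case class AnnotatedSeq(symbols: List[Char], attributes: List[Attribute] = Nil) {
    // Returns a new value; the receiver is left unchanged.
    def withAttribute(a: Attribute): AnnotatedSeq = copy(attributes = a :: attributes)

    // First Id attribute, if any.
    def id: Option[String] = attributes.collectFirst { case Id(v) => v }
  }

  def main(args: Array[String]): Unit = {
    val plain = AnnotatedSeq("ATG-CC".toList) // '-' marks a gap
    val annotated = plain
      .withAttribute(Id("seq1"))
      .withAttribute(Description("demo sequence"))

    println(annotated.id)             // Some(seq1)
    println(plain.attributes.isEmpty) // true: the original value was not modified
  }
}
```

Because every update produces a new value, a sequence can be shared between alignments without one caller's annotations leaking into another's.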
- Alignment: Tools for working with sequence alignments.
- Attribute: Managing metadata and properties.
- Chemistry: Representing nucleotides, amino acids, and codons.
- Nucleotide: Core DNA and RNA sequence handling.
- Sequence: High-level sequence abstractions.
Please refer to these links for more detailed documentation:
BioScala is an open-source project, and we welcome contributions from the community! Here’s how you can get involved:
- 🐛 Report Bugs: Issue Tracker
- 💡 Suggest Features: I will share the Medium post here.
- 👩‍💻 Contribute Code: Contributing Guide
- 💬 Join our Discord to participate in discussions.
Original author/maintainer: Pjotr Prins
Interop: BioJava for translation and IO helpers
License: BSD (see LICENSE).
If you use BioScala in your research, please cite:
@software{bioscala,
author = {BioScala Team},
title = {BioScala: A Functional Bioinformatics Library},
year = {2023},
publisher = {GitHub},
journal = {GitHub repository},
howpublished = {\url{https://github.com/bioscala/bioscala}}
}
BioScala is released under the BSD 3-Clause License, which permits academic, commercial, and personal use.
BioScala is made possible by the contributions of developers like you. Special thanks to:
- The Scala community for building a powerful and expressive language.
- The open-source bioinformatics community for inspiring this project.
BioScala is more than a library—it’s a community-driven effort to make bioinformatics more accessible and functional. Whether you’re a seasoned bioinformatician or a curious beginner, we welcome you to the BioScala community.
🌟 Star this repo to show your support. 🚀 Fork and contribute to shape the future of BioScala.
Let’s build the future of bioinformatics, together.
BioScala Team