+
Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
### `Fixed`

- [#405] Fix database to tool mismatching in KAIJU2KRONA input (❤️ to @MajoroMask for reporting, fix by @jfy133)
- [#406] Fixed overwriting of bracken-derived kraken2 outputs when the database name is shared between Bracken/Kraken2. (❤️ to @MajoroMask for reporting, fix by @jfy133)

### `Dependencies`

Expand Down
2 changes: 1 addition & 1 deletion conf/modules.config
Original file line number Diff line number Diff line change
Expand Up @@ -485,7 +485,7 @@ process {
}

withName: KRAKENTOOLS_COMBINEKREPORTS_KRAKEN {
ext.prefix = { "kraken2_${meta.id}_combined_reports" }
ext.prefix = { "kraken2_${meta.db_name}_combined_reports" }
publishDir = [
path: { "${params.outdir}/kraken2/" },
mode: params.publish_dir_mode,
Expand Down
2 changes: 2 additions & 0 deletions docs/output.md
Original file line number Diff line number Diff line change
Expand Up @@ -360,6 +360,7 @@ The main taxonomic profiling file from Bracken is the `*.tsv` file. This provide

- `kraken2/`
- `<db_name>_combined_reports.txt`: A combined profile of all samples aligned to a given database (as generated by `krakentools`)
- If you have also run Bracken, the original Kraken report (i.e., _before_ read re-assignment) will also be included in this directory with `-bracken` suffixed to your Bracken database name. For example: `kraken2-<mydatabase>-bracken.tsv`. However in most cases you want to use the actual Bracken file (i.e., `bracken_<mydatabase>.tsv`).
- `<db_name>/`
- `<sample_id>_<db_name>.classified.fastq.gz`: FASTQ file containing all reads that had a hit against a reference in the database for a given sample
- `<sample_id>_<db_name>.unclassified.fastq.gz`: FASTQ file containing all reads that did not have a hit in the database for a given sample
Expand Down Expand Up @@ -582,6 +583,7 @@ The resulting HTML files can be loaded into your web browser for exploration. Ea
- `<tool>_<database>*.{tsv,csv,arrow,parquet,biom}`: Standardised taxon table containing multiple samples. The standard format is the `tsv`.
- The first column describes the taxonomy ID and the rest of the columns describe the read counts for each sample.
- Note that the file naming scheme will apply regardless of whether `TAXPASTA_MERGE` (multiple sample run) or `TAXPASTA_STANDARDISE` (single sample run) are executed.
- If you have also run Bracken, the initial Kraken report (i.e., _before_ read re-assignment) will also be included in this directory with `-bracken` suffixed to your Bracken database name. For example: `kraken2-<mydatabase>-bracken.tsv`. However in most cases you want to use the actual Bracken file (i.e., `bracken_<mydatabase>.tsv`).

</details>

Expand Down
10 changes: 7 additions & 3 deletions subworkflows/local/profiling.nf
Original file line number Diff line number Diff line change
Expand Up @@ -172,9 +172,13 @@ workflow PROFILING {
ch_raw_classifications = ch_raw_classifications.mix( KRAKEN2_KRAKEN2.out.classified_reads_assignment )
ch_raw_profiles = ch_raw_profiles.mix(
KRAKEN2_KRAKEN2.out.report
// Set the tool to be strictly 'kraken2' instead of potentially 'bracken' for downstream use.
// Will remain distinct from 'pure' Kraken2 results due to distinct database names in file names.
.map { meta, report -> [meta + [tool: 'kraken2'], report]}
// Rename tool in the meta for the for-bracken files to disambiguate from only-kraken2 results in downstream steps.
// Note may need to rename back to to just bracken in those downstream steps depending on context.
.map {
meta, report ->
def new_tool =
[meta + [tool: meta.tool == 'bracken' ? 'kraken2-bracken' : meta.tool], report]
}
)

}
Expand Down
23 changes: 17 additions & 6 deletions subworkflows/local/standardisation_profiles.nf
Original file line number Diff line number Diff line change
Expand Up @@ -52,12 +52,19 @@ workflow STANDARDISATION_PROFILES {
.map {
meta, profile ->
def meta_new = [:]
meta_new.id = meta.db_name
meta_new.tool = meta.tool == 'malt' ? 'megan6' : meta.tool
meta_new.db_name = meta.db_name
[meta_new, profile]
}
.groupTuple ()
.map { [ it[0], it[1].flatten() ] }
.map {
meta, profiles ->
meta = meta + [
tool: meta.tool == 'kraken2-bracken' ? 'kraken2' : meta.tool, // replace to get the right output-format description
id: meta.tool == 'kraken2-bracken' ? "${meta.db_name}-bracken" : "${meta.db_name}" // append so to disambiguate when we have same databases for kraken2 step of bracken, with normal bracken
]
[ meta, profiles.flatten() ]
}

ch_taxpasta_tax_dir = params.taxpasta_taxonomy_dir ? Channel.fromPath(params.taxpasta_taxonomy_dir, checkIfExists: true).collect() : []

Expand Down Expand Up @@ -85,7 +92,7 @@ workflow STANDARDISATION_PROFILES {
centrifuge: it[0]['tool'] == 'centrifuge'
ganon: it[0]['tool'] == 'ganon'
kmcp: it [0]['tool'] == 'kmcp'
kraken2: it[0]['tool'] == 'kraken2'
kraken2: it[0]['tool'] == 'kraken2' || it[0]['tool'] == 'kraken2-bracken'
metaphlan: it[0]['tool'] == 'metaphlan'
motus: it[0]['tool'] == 'motus'
unknown: true
Expand Down Expand Up @@ -158,11 +165,15 @@ workflow STANDARDISATION_PROFILES {
// Have to sort by size to ensure first file actually has hits otherwise
// the script fails
ch_profiles_for_kraken2 = ch_input_profiles.kraken2
.map { [it[0]['db_name'], it[1]] }
.groupTuple(sort: {-it.size()} )
.map {
[[id:it[0]], it[1]]
meta, profiles ->
def new_meta = [:]
new_meta.tool = meta.tool == 'kraken2-bracken' ? 'kraken2' : meta.tool // replace to get the right output-format description
new_meta.id = meta.tool // append so to disambiguate when we have same databases for kraken2 step of bracken, with normal bracken
new_meta.db_name = meta.tool == 'kraken2-bracken' ? "${meta.db_name}-bracken" : "${meta.db_name}" // append so to disambiguate when we have same databases for kraken2 step of bracken, with normal bracken
[ new_meta, profiles ]
}
.groupTuple(sort: {-it.size()})

KRAKENTOOLS_COMBINEKREPORTS_KRAKEN ( ch_profiles_for_kraken2 )
ch_multiqc_files = ch_multiqc_files.mix( KRAKENTOOLS_COMBINEKREPORTS_KRAKEN.out.txt )
Expand Down
8 changes: 6 additions & 2 deletions subworkflows/local/visualization_krona.nf
Original file line number Diff line number Diff line change
Expand Up @@ -27,7 +27,7 @@ workflow VISUALIZATION_KRONA {
ch_input_profiles = profiles
.branch {
centrifuge: it[0]['tool'] == 'centrifuge'
kraken2: it[0]['tool'] == 'kraken2'
kraken2: it[0]['tool'] == 'kraken2' || it[0]['tool'] == 'kraken2-bracken'
unknown: true
}
ch_input_classifications = classifications
Expand All @@ -41,7 +41,11 @@ workflow VISUALIZATION_KRONA {
Convert Kraken2 formatted reports into Krona text files
*/
ch_kraken_reports = ch_input_profiles.kraken2
.mix( ch_input_profiles.centrifuge )
.map {
meta, report ->
[meta + [tool: meta.tool == 'bracken' ? 'kraken2-bracken' : meta.tool], report]
}
.mix( ch_input_profiles.centrifuge )
KRAKENTOOLS_KREPORT2KRONA ( ch_kraken_reports )
ch_krona_text = ch_krona_text.mix( KRAKENTOOLS_KREPORT2KRONA.out.txt )
ch_versions = ch_versions.mix( KRAKENTOOLS_KREPORT2KRONA.out.versions.first() )
Expand Down
点击 这是indexloc提供的php浏览器服务,不要输入任何密码和下载